Abstract
Graphs and networks are prevalent in modeling relational datasets from many fields of research. By using iterative solvers to approximate graph measures (specifically Katz Centrality), we can obtain a ranking vector consisting of a number for each vertex in the graph identifying its relative importance. We use the residual to accurately estimate how much of the ranking from an approximate solution matches the ranking given by the exact solution. Using probabilistic matrix norms and applying numerical analysis to the computation of Katz Centrality, we obtain bounds on the accuracy of the approximation compared to the exact solution with respect to the highly ranked nodes. This relates the numerical accuracy of the linear solver to the data analysis accuracy of finding the correct ranking. In particular, we answer the question of which pairwise rankings are reliable given an approximate solution to the linear system. Experiments on many real-world networks up to several million vertices and several hundred million edges validate our theory and show that we are able to accurately estimate large portions of the approximation. By analyzing convergence error, we develop confidence in the ranking schemes of data mining.
Original language | English (US) |
---|---|
Pages (from-to) | 68-78 |
Number of pages | 11 |
Journal | Procedia Computer Science |
Volume | 108 |
DOIs | |
State | Published - 2017 |
Externally published | Yes |
Event | International Conference on Computational Science ICCS 2017 - Zurich, Switzerland Duration: Jun 12 2017 → Jun 14 2017 |
All Science Journal Classification (ASJC) codes
- General Computer Science
Keywords
- data analysis
- graphs
- katz centrality
- numerical accuracy
- ranking