TY - GEN
T1 - A statistical framework for streaming graph analysis
AU - Fairbanks, James
AU - Ediger, David
AU - McColl, Rob
AU - Bader, David A.
AU - Gilbert, Eric
PY - 2013
Y1 - 2013
N2 - In this paper we propose a new methodology for gaining insight into the temporal aspects of social networks. In order to develop higher-level, large-scale data analysis methods for classification, prediction, and anomaly detection, a solid foundation of analytical techniques is required. We present a novel approach to the analysis of these networks that leverages time series and statistical techniques to quantitatively describe the temporal nature of a social network. We report on the application of our approach toward a real data set and successfully visualize high-level changes to the network as well as discover outlying vertices. The real-time prediction of new connections given the previous connections in a graph is a notoriously difficult task. The proposed technique avoids this difficulty by modeling statistics computed from the graph over time. Vertex statistics summarize topological information as real numbers, which allows us to leverage the existing fields of computational statistics and machine learning. This creates a modular approach to analysis in which methods can be developed that are agnostic to the metrics and algorithms used to process the graph. We demonstrate these techniques using a collection of Twitter posts related to Hurricane Sandy. We study the temporal nature of betweenness centrality and clustering coefficients while producing multiple visualizations of a social network dataset with 1.2 million edges. We successfully detect vertices whose triangle-forming behavior is anomalous.
UR - http://www.scopus.com/inward/record.url?scp=84893339077&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893339077&partnerID=8YFLogxK
U2 - 10.1145/2492517.2492620
DO - 10.1145/2492517.2492620
M3 - Conference contribution
AN - SCOPUS:84893339077
SN - 9781450322409
T3 - Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013
SP - 341
EP - 347
BT - Proceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013
PB - Association for Computing Machinery
T2 - 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2013
Y2 - 25 August 2013 through 28 August 2013
ER -