TY - GEN
T1 - Massive social network analysis
T2 - 39th International Conference on Parallel Processing, ICPP 2010
AU - Ediger, David
AU - Jiang, Karl
AU - Riedy, Jason
AU - Bader, David A.
AU - Corley, Courtney
AU - Farber, Rob
AU - Reynolds, William N.
PY - 2010
Y1 - 2010
N2 - Social networks produce an enormous quantity of data. Facebook consists of over 400 million active users sharing over 5 billion pieces of information each month. Analyzing this vast quantity of unstructured data presents challenges for software and hardware. We present GraphCT, a Graph Characterization Toolkit for massive graphs representing social network data. On a 128-processor Cray XMT, GraphCT estimates the betweenness centrality of an artificially generated (R-MAT) 537 million vertex, 8.6 billion edge graph in 55 minutes and a real-world graph (Kwak et al.) with 61.6 million vertices and 1.47 billion edges in 105 minutes. We use GraphCT to analyze public data from Twitter, a microblogging network. Twitter's message connections appear primarily tree-structured as a news dissemination system. Within the public data, however, are clusters of conversations. Using GraphCT, we can rank actors within these conversations and help analysts focus attention on a much smaller data subset.
AB - Social networks produce an enormous quantity of data. Facebook consists of over 400 million active users sharing over 5 billion pieces of information each month. Analyzing this vast quantity of unstructured data presents challenges for software and hardware. We present GraphCT, a Graph Characterization Toolkit for massive graphs representing social network data. On a 128-processor Cray XMT, GraphCT estimates the betweenness centrality of an artificially generated (R-MAT) 537 million vertex, 8.6 billion edge graph in 55 minutes and a real-world graph (Kwak et al.) with 61.6 million vertices and 1.47 billion edges in 105 minutes. We use GraphCT to analyze public data from Twitter, a microblogging network. Twitter's message connections appear primarily tree-structured as a news dissemination system. Within the public data, however, are clusters of conversations. Using GraphCT, we can rank actors within these conversations and help analysts focus attention on a much smaller data subset.
UR - http://www.scopus.com/inward/record.url?scp=77954764743&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954764743&partnerID=8YFLogxK
U2 - 10.1109/ICPP.2010.66
DO - 10.1109/ICPP.2010.66
M3 - Conference contribution
AN - SCOPUS:77954764743
SN - 9780769541563
T3 - Proceedings of the International Conference on Parallel Processing
SP - 583
EP - 593
BT - Proceedings - 39th International Conference on Parallel Processing, ICPP 2010
Y2 - 13 September 2010 through 16 September 2010
ER -