TY - GEN
T1 - Tracking structure of streaming social networks
AU - Ediger, David
AU - Riedy, Jason
AU - Bader, David A.
AU - Meyerhenke, Henning
PY - 2011
Y1 - 2011
N2 - Current online social networks are massive and still growing. For example, Facebook has over 500 million active users sharing over 30 billion items per month. The scale within these data streams has outstripped traditional graph analysis methods. Real-time monitoring for anomalies may require dynamic analysis rather than repeated static analysis. The massive state behind multiple persistent queries requires shared data structures and flexible representations. We present a framework based on the STINGER data structure that can monitor a global property, connected components, on a graph of 16 million vertices at rates of up to 240 000 updates per second on 32 processors of a Cray XMT. For very large scale-free graphs, our implementation uses novel batching techniques that exploit the scale-free nature of the data and run over three times faster than prior methods. Our framework handles, for the first time, real-world data rates, opening the door to higher-level analytics such as community and anomaly detection.
AB - Current online social networks are massive and still growing. For example, Facebook has over 500 million active users sharing over 30 billion items per month. The scale within these data streams has outstripped traditional graph analysis methods. Real-time monitoring for anomalies may require dynamic analysis rather than repeated static analysis. The massive state behind multiple persistent queries requires shared data structures and flexible representations. We present a framework based on the STINGER data structure that can monitor a global property, connected components, on a graph of 16 million vertices at rates of up to 240 000 updates per second on 32 processors of a Cray XMT. For very large scale-free graphs, our implementation uses novel batching techniques that exploit the scale-free nature of the data and run over three times faster than prior methods. Our framework handles, for the first time, real-world data rates, opening the door to higher-level analytics such as community and anomaly detection.
UR - http://www.scopus.com/inward/record.url?scp=83455246442&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=83455246442&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2011.326
DO - 10.1109/IPDPS.2011.326
M3 - Conference contribution
AN - SCOPUS:83455246442
SN - 9780769543857
T3 - IEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum
SP - 1691
EP - 1699
BT - 2011 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2011
T2 - 25th IEEE International Parallel and Distributed Processing Symposium, Workshops and Phd Forum, IPDPSW 2011
Y2 - 16 May 2011 through 20 May 2011
ER -