Abstract
Data from emerging applications, such as cybersecurity and social networking, can be abstracted as graphs whose edges are updated sequentially in the form of a stream. The challenging problem of interactive graph stream analytics is the quick response of the queries on terabyte and beyond graph stream data from end users. In this paper, a succinct and efficient double index data structure is designed to build the sketch of a graph stream to meet general queries. A single pass stream model, which includes general sketch building, distributed sketch based analysis algorithms and regression based approximation solution generation, is developed, and a typical graph algorithm-triangle counting-is implemented to evaluate the proposed method. Experimental results on power law and normal distribution graph streams show that our method can generate accurate results (mean relative error less than 4%) with a high performance. All our methods and code have been implemented in an open source framework, Arkouda, and are available from our GitHub repository, Bader-Research. This work provides the large and rapidly growing Python community with a powerful way to handle terabyte and beyond graph stream data using their laptops.
Original language | English (US) |
---|---|
Article number | 221 |
Journal | Algorithms |
Volume | 14 |
Issue number | 8 |
DOIs | |
State | Published - Aug 2021 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Numerical Analysis
- Computational Theory and Mathematics
- Computational Mathematics
Keywords
- Arkouda framework
- Graph stream sketch
- Streaming graph data
- Triangle counting