Abstract
Data from emerging applications, such as cybersecurity and social networking, can be abstracted as graphs whose edges are updated sequentially in the form of a stream. The challenging problem of interactive graph stream analytics is the quick response of the queries on terabyte and beyond graph stream data from end users. In this paper, a succinct and efficient double index data structure is designed to build the sketch of a graph stream to meet general queries. A single pass stream model, which includes general sketch building, distributed sketch based analysis algorithms and regression based approximation solution generation, is developed, and a typical graph algorithm-triangle counting-is implemented to evaluate the proposed method. Experimental results on power law and normal distribution graph streams show that our method can generate accurate results (mean relative error less than 4%) with a high performance. All our methods and code have been implemented in an open source framework, Arkouda, and are available from our GitHub repository, Bader-Research. This work provides the large and rapidly growing Python community with a powerful way to handle terabyte and beyond graph stream data using their laptops.
| Original language | English (US) |
|---|---|
| Article number | 221 |
| Journal | Algorithms |
| Volume | 14 |
| Issue number | 8 |
| DOIs | |
| State | Published - Aug 2021 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Theoretical Computer Science
- Numerical Analysis
- Computational Theory and Mathematics
- Computational Mathematics
Keywords
- Arkouda framework
- Graph stream sketch
- Streaming graph data
- Triangle counting