Streaming graph sampling with size restrictions

Anita Zakrzewska, David A. Bader

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Scopus citations

Abstract

Many graph datasets originating from online social network, financial or biological sources are too large to store or analyze. The analysis of such networks may be made more tractable if they are reduced to smaller subgraphs via sampling. While most of the known graph sampling methods are designed with static graphs in mind, many real datasets are massive and rapidly growing, making streaming methods necessary. We present two new techniques, Randomly Induced Edge Sampling (RIES) and Weighted Edge Sampling (WES). Both methods sample a stream of edges in a single pass, without the need to know future properties of the stream. In contrast to previous work that focused on limiting only the number of vertices, our methods restrict the number of edges, thus truly limiting the size of the sampled subgraph. We compare the performance of RIES and WES against the previously known streaming Random Edge (RE) method on eight social network datasets. Using four structural graph properties, we find that both RIES and WES produce subgraphs that are more structurally similar to the original graph than are the subgraphs produced by streaming RE. We also examine the sensitivity of the two algorithms with respect to their parameters. The parameters of WES affect its performance in a more predictable manner and are easier to set. Both new algorithms represent an improvement in the available streaming graph analysis toolkit.

Original languageEnglish (US)
Title of host publicationProceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2017
EditorsJana Diesner, Elena Ferrari, Guandong Xu
PublisherAssociation for Computing Machinery, Inc
Pages282-290
Number of pages9
ISBN (Electronic)9781450349932
DOIs
StatePublished - Jul 31 2017
Externally publishedYes
Event9th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2017 - Sydney, Australia
Duration: Jul 31 2017Aug 3 2017

Publication series

NameProceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2017

Conference

Conference9th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2017
Country/TerritoryAustralia
CitySydney
Period7/31/178/3/17

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems

Fingerprint

Dive into the research topics of 'Streaming graph sampling with size restrictions'. Together they form a unique fingerprint.

Cite this