TY - GEN
T1 - Evaluating mixed patterns on large data graphs using bitmap views
AU - Wu, Xiaoying
AU - Theodoratos, Dimitri
AU - Skoutas, Dimitrios
AU - Lan, Michael
N1 - Publisher Copyright:
© Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
N2 - Developing efficient and scalable techniques for pattern queries over large graphs is crucial for modern applications such as social networks, Web analysis, and bioinformatics. In this paper, we address the problem of efficiently finding the homomorphic matches for tree pattern queries with child and descendant edges (mixed pattern queries) over a large data graph. We propose a novel type of materialized views to accelerate the evaluation. Our materialized views are the sets of occurrence lists of the nodes of the pattern in the data graph. They are stored as compressed bitmaps on the inverted lists of the node labels in the data graph. Reachability information between occurrence list nodes is provided by a node reachability index. This technique not only minimizes the materialization space but also reduces CPU and I/O costs by translating view materialization processing into bitwise operations. We provide conditions for view usability using the concept of pattern node coverage. We design a holistic bottom-up algorithm which efficiently computes pattern query matches in the data graph using bitmap views. An extensive experimental evaluation shows that our method evaluates mixed patterns up to several orders of magnitude faster than existing algorithms.
AB - Developing efficient and scalable techniques for pattern queries over large graphs is crucial for modern applications such as social networks, Web analysis, and bioinformatics. In this paper, we address the problem of efficiently finding the homomorphic matches for tree pattern queries with child and descendant edges (mixed pattern queries) over a large data graph. We propose a novel type of materialized views to accelerate the evaluation. Our materialized views are the sets of occurrence lists of the nodes of the pattern in the data graph. They are stored as compressed bitmaps on the inverted lists of the node labels in the data graph. Reachability information between occurrence list nodes is provided by a node reachability index. This technique not only minimizes the materialization space but also reduces CPU and I/O costs by translating view materialization processing into bitwise operations. We provide conditions for view usability using the concept of pattern node coverage. We design a holistic bottom-up algorithm which efficiently computes pattern query matches in the data graph using bitmap views. An extensive experimental evaluation shows that our method evaluates mixed patterns up to several orders of magnitude faster than existing algorithms.
UR - http://www.scopus.com/inward/record.url?scp=85065541557&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065541557&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-18576-3_33
DO - 10.1007/978-3-030-18576-3_33
M3 - Conference contribution
AN - SCOPUS:85065541557
SN - 9783030185756
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 553
EP - 570
BT - Database Systems for Advanced Applications - 24th International Conference, DASFAA 2019, Proceedings
A2 - Yang, Jun
A2 - Gama, Joao
A2 - Li, Guoliang
A2 - Natwichai, Juggapong
A2 - Tong, Yongxin
PB - Springer Verlag
T2 - 24th International Conference on Database Systems for Advanced Applications, DASFAA 2019
Y2 - 22 April 2019 through 25 April 2019
ER -