TY - GEN
T1 - Efficient discovery of embedded patterns from large attributed trees
AU - Wu, Xiaoying
AU - Theodoratos, Dimitri
N1 - Publisher Copyright: © Springer International Publishing AG, part of Springer Nature 2018.
PY - 2018
Y1 - 2018
AB - Discovering informative patterns deeply hidden in large tree datasets is an important research area that has many practical applications. Many modern applications and systems represent, export and exchange data in the form of trees whose nodes are associated with attributes. In this paper, we address the problem of mining frequent embedded attributed patterns from large attributed data trees. Attributed pattern mining requires combining tree mining and itemset mining. This results in exploring a larger pattern search space compared to addressing each problem separately. We first design an interleaved pattern mining approach which extends the equivalence-class based tree pattern enumeration technique with attribute sets enumeration. Further, we propose a novel layered approach to discover all frequent attributed patterns in stages. This approach seamlessly integrates an itemset mining technique with a recent unordered embedded tree pattern mining algorithm to greatly reduce the pattern search space. Our extensive experimental results on real and synthetic large-tree datasets show that the layered approach displays, in most cases, orders of magnitude performance improvements over both the interleaved mining method and the attribute-as-node embedded tree pattern mining method and has good scaleup properties.
UR - http://www.scopus.com/inward/record.url?scp=85048971810&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048971810&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-91458-9_34
DO - 10.1007/978-3-319-91458-9_34
M3 - Conference contribution
AN - SCOPUS:85048971810
SN - 9783319914572
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 558
EP - 576
BT - Database Systems for Advanced Applications - 23rd International Conference, DASFAA 2018, Proceedings
A2 - Pei, Jian
A2 - Sadiq, Shazia
A2 - Li, Jianxin
A2 - Manolopoulos, Yannis
PB - Springer Verlag
T2 - 23rd International Conference on Database Systems for Advanced Applications, DASFAA 2018
Y2 - 21 May 2018 through 24 May 2018
ER -