Efficient discovery of embedded patterns from large attributed trees

Xiaoying Wu, Dimitri Theodoratos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Discovering informative patterns deeply hidden in large tree datasets is an important research area that has many practical applications. Many modern applications and systems represent, export and exchange data in the form of trees whose nodes are associated with attributes. In this paper, we address the problem of mining frequent embedded attributed patterns from large attributed data trees. Attributed pattern mining requires combining tree mining and itemset mining. This results in exploring a larger pattern search space compared to addressing each problem separately. We first design an interleaved pattern mining approach which extends the equivalence-class based tree pattern enumeration technique with attribute sets enumeration. Further, we propose a novel layered approach to discover all frequent attributed patterns in stages. This approach seamlessly integrates an itemset mining technique with a recent unordered embedded tree pattern mining algorithm to greatly reduce the pattern search space. Our extensive experimental results on real and synthetic large-tree datasets show that the layered approach displays, in most cases, orders of magnitude performance improvements over both the interleaved mining method and the attribute-as-node embedded tree pattern mining method and has good scaleup properties.

Original language: English (US)
Title of host publication: Database Systems for Advanced Applications - 23rd International Conference, DASFAA 2018, Proceedings
Editors: Jian Pei, Shazia Sadiq, Jianxin Li, Yannis Manolopoulos
Publisher: Springer Verlag
Pages: 558-576
Number of pages: 19
ISBN (Print): 9783319914572
State: Published - 2018
Event: 23rd International Conference on Database Systems for Advanced Applications, DASFAA 2018 - Gold Coast, Australia
Duration: May 21 2018 to May 24 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10828 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 23rd International Conference on Database Systems for Advanced Applications, DASFAA 2018
Country: Australia
City: Gold Coast
Period: 5/21/18 to 5/24/18

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)
