Efficient evaluation of generalized path pattern queries on XML data

Xiaoying Wu, Stefanos Souldatos, Dimitri Theodoratos, Theodore Dalamagas, Timos Sellis

Research output: Chapter in Book/Report/Conference proceedingConference contribution

15 Scopus citations

Abstract

Finding the occurrences of structural patterns in XML data is a key operation in XML query processing. Existing algorithms for this operation focus almost exclusively on path-patterns or tree-patterns. Requirements inflexible querying of XML data have motivated recently the introduction of query languages that allow a partial specification of path-patterns in a query. In this paper, we focus on the efficient evaluation of partial path queries, a generalization of path pattern queries. Our approach explicitly deals with repeated labels (that is, multiple occurrences of the same label in a query). We show that partial path queries can be represented as rooted dags for which a topological ordering of the nodes exists. We present three algorithms for the efficient evaluation of these queries under the indexed streaming evaluation model. The first one exploits a structural summary of data to generate a set of path-patterns that together are equivalent to a partial path query. To evaluate these path-patterns, we extend PathStack so that it can work on path-patterns with repeated labels. The second one extracts a spanning tree from the query dag, uses a stack-based algorithm to find the matches of the root-to-leaf paths in the tree, and merge-joins the matches to compute the answer. Finally, the third one exploits multiple pointers of stack entries and a topological ordering of the query dag to apply a stack-based holistic technique. An analysis of the algorithms and extensive experimental evaluation shows that the holistic algorithm outperforms the other ones.

Original languageEnglish (US)
Title of host publicationProceeding of the 17th International Conference on World Wide Web 2008, WWW'08
Pages835-844
Number of pages10
DOIs
StatePublished - 2008
Event17th International Conference on World Wide Web 2008, WWW'08 - Beijing, China
Duration: Apr 21 2008Apr 25 2008

Publication series

NameProceeding of the 17th International Conference on World Wide Web 2008, WWW'08

Other

Other17th International Conference on World Wide Web 2008, WWW'08
Country/TerritoryChina
CityBeijing
Period4/21/084/25/08

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications

Keywords

  • XML
  • XPath query evaluation

Fingerprint

Dive into the research topics of 'Efficient evaluation of generalized path pattern queries on XML data'. Together they form a unique fingerprint.

Cite this