An original semantics to keyword queries for XML using structural patterns

Dimitri Theodoratos, Xiaoying Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

XML is by now the de facto standard for exporting and exchanging data on the web. Two needs have recently motivated keyword-based techniques for querying XML documents: querying XML data sources whose structure is not fully known to the user, and integrating multiple data sources with different tree structures. The semantics adopted by these approaches aim to restrict the answers to meaningful ones. However, the earlier approaches suffer from low precision, while recent ones with improved precision suffer from low recall. In this paper, we introduce an original approach for assigning semantics to keyword queries for XML documents. We exploit index graphs (a structural summary of the data) to extract tree patterns that return meaningful answers. In contrast to previous approaches, which operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect and exploit meaningful tree patterns. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. A comparison with previous approaches shows that ours finds all the meaningful answers where the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Since our approach is based on tree-pattern query evaluation, it can be easily implemented on top of an XQuery engine.
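
As a rough illustration of the two ideas the abstract contrasts, the following Python sketch computes (a) a very simple structural summary of a document and (b) the lowest-common-ancestor baseline for a two-keyword query. It is not the paper's algorithm: the sample document, the function names (`label_paths`, `lca_tags`), and the choice of "set of distinct root-to-node label paths" as a stand-in for an index graph are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical DBLP-like sample document (not taken from the paper).
DOC = """<dblp>
  <article>
    <author>Smith</author>
    <title>XML Search</title>
  </article>
  <article>
    <author>Jones</author>
    <title>Keyword Queries</title>
  </article>
</dblp>"""


def label_paths(root):
    """Simplified structural summary: distinct root-to-node label paths."""
    paths = set()

    def walk(node, prefix):
        path = prefix + (node.tag,)
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, ())
    return paths


def dewey_index(root):
    """Map each element's Dewey id (tuple of child positions) to the element."""
    index = {}

    def walk(node, dewey):
        index[dewey] = node
        for i, child in enumerate(node):
            walk(child, dewey + (i,))

    walk(root, ())
    return index


def matches(index, keyword):
    """Dewey ids of elements whose own text contains the keyword."""
    kw = keyword.lower()
    return [d for d, el in index.items() if el.text and kw in el.text.lower()]


def lca(a, b):
    """Lowest common ancestor of two Dewey ids: their longest common prefix."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def lca_tags(root, kw1, kw2):
    """Tags of the LCAs over all pairs of matches for the two keywords."""
    index = dewey_index(root)
    return sorted({index[lca(a, b)].tag
                   for a in matches(index, kw1)
                   for b in matches(index, kw2)})


root = ET.fromstring(DOC)
summary = label_paths(root)
same_article = lca_tags(root, "Smith", "XML")
different_articles = lca_tags(root, "Smith", "Keyword")
```

In this sample, the keywords "Smith" and "XML" meet at an `article` element, whereas "Smith" and "Keyword" meet only at the `dblp` root, which is the kind of answer a keyword-query semantics would typically want to exclude as meaningless. The summary, by contrast, is computed once over the whole document and is independent of any particular pair of matching nodes.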

Original language: English (US)
Title of host publication: Advances in Databases
Subtitle of host publication: Concepts, Systems and Applications - 12th International Conference on Database Systems for Advanced Applications, DASFAA 2007, Proceedings
Publisher: Springer Verlag
Pages: 727-739
Number of pages: 13
ISBN (Print): 9783540717027
DOIs
State: Published - 2007
Externally published: Yes
Event: 12th International Conference on Database Systems for Advanced Applications, DASFAA 2007 - Bangkok, Thailand
Duration: Apr 9 2007 to Apr 12 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4443 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th International Conference on Database Systems for Advanced Applications, DASFAA 2007
Country/Territory: Thailand
City: Bangkok
Period: 4/9/07 to 4/12/07

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science
