TY - GEN
T1 - An original semantics to keyword queries for XML using structural patterns
AU - Theodoratos, Dimitri
AU - Wu, Xiaoying
PY - 2007
Y1 - 2007
N2 - XML is by now the de facto standard for exporting and exchanging data on the web. The need to query XML data sources whose structure is not fully known to the user, and the need to integrate multiple data sources with different tree structures, have recently motivated the proposal of keyword-based techniques for querying XML documents. The semantics adopted by these approaches aims to restrict the answers to meaningful ones. However, these approaches suffer from low precision, while recent ones with improved precision suffer from low recall. In this paper, we introduce an original approach for assigning semantics to keyword queries for XML documents. We exploit index graphs (structural summaries of the data) to extract tree patterns that return meaningful answers. In contrast to previous approaches, which operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect and exploit meaningful tree patterns. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. A comparison with previous approaches shows that it finds all the meaningful answers where the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Since our approach is based on tree-pattern query evaluation, it can be easily implemented on top of an XQuery engine.
AB - XML is by now the de facto standard for exporting and exchanging data on the web. The need to query XML data sources whose structure is not fully known to the user, and the need to integrate multiple data sources with different tree structures, have recently motivated the proposal of keyword-based techniques for querying XML documents. The semantics adopted by these approaches aims to restrict the answers to meaningful ones. However, these approaches suffer from low precision, while recent ones with improved precision suffer from low recall. In this paper, we introduce an original approach for assigning semantics to keyword queries for XML documents. We exploit index graphs (structural summaries of the data) to extract tree patterns that return meaningful answers. In contrast to previous approaches, which operate locally on the data to compute meaningful answers (usually by computing lowest common ancestors), our approach operates globally on index graphs to detect and exploit meaningful tree patterns. We implemented and experimentally evaluated our approach on DBLP-based data sets with irregularities. A comparison with previous approaches shows that it finds all the meaningful answers where the others fail (perfect recall). Further, it outperforms approaches with similar recall in excluding meaningless answers (better precision). Since our approach is based on tree-pattern query evaluation, it can be easily implemented on top of an XQuery engine.
UR - http://www.scopus.com/inward/record.url?scp=38049098513&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=38049098513&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-71703-4_61
DO - 10.1007/978-3-540-71703-4_61
M3 - Conference contribution
AN - SCOPUS:38049098513
SN - 9783540717027
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 727
EP - 739
BT - Advances in Databases
PB - Springer Verlag
T2 - 12th International Conference on Database Systems for Advanced Applications, DASFAA 2007
Y2 - 9 April 2007 through 12 April 2007
ER -