Return specification inference and result clustering for keyword search on XML

Ziyang Liu, Yi Chen

Research output: Contribution to journal › Article › peer-review

23 Scopus citations


Keyword search enables Web users to easily access XML data without the need to learn a structured query language and to study possibly complex data schemas. Existing work has addressed the problem of selecting qualified data nodes that match keywords and connecting them in a meaningful way, in the spirit of inferring the where clause in XQuery. However, how to infer the return clause for keyword searches is an open problem. To address this challenge, we present a keyword search engine for data-centric XML, XSeek, to infer the semantics of the search and identify return nodes effectively. XSeek recognizes possible entities and attributes inherently represented in the data. It also distinguishes between predicates and return specifications in query keywords. Then based on the analysis of both XML data structures and keyword patterns, XSeek generates return nodes. Furthermore, when the query is ambiguous and it is hard or impossible to determine the desirable return information, XSeek clusters the query results according to their semantics based on the user-specified granularity, and enables the user to easily browse and select the desired ones. Extensive experimental studies show the effectiveness and efficiency of XSeek.
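The abstract does not spell out XSeek's inference rules, so the following is only a toy sketch of the general idea, not the authors' algorithm. It assumes three simplifications that are not taken from the paper: a keyword matching only a tag name is a return specification, a keyword matching a text value is a predicate, and an element whose tag repeats among its siblings is an entity. The sample document and the names `split_keywords`, `is_entity`, and `search` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, not from the paper.
DOC = """
<league>
  <team><name>Rockets</name><city>Houston</city></team>
  <team><name>Mavericks</name><city>Dallas</city></team>
</league>
"""


def split_keywords(root, keywords):
    """Assumed heuristic: tag-name matches are return specifications,
    text-value matches are predicates."""
    tags = {el.tag.lower() for el in root.iter()}
    values = {el.text.strip().lower()
              for el in root.iter() if el.text and el.text.strip()}
    return_specs, predicates = [], []
    for kw in keywords:
        k = kw.lower()
        if k in tags and k not in values:
            return_specs.append(k)
        elif k in values:
            predicates.append(k)
    return return_specs, predicates


def is_entity(parent, el):
    """Assumed heuristic: an element is an entity if a sibling shares its tag."""
    return sum(1 for sib in parent if sib.tag == el.tag) > 1


def search(root, keywords):
    """Return (tag, text) pairs for the requested nodes inside every entity
    whose text values satisfy all predicates."""
    return_specs, predicates = split_keywords(root, keywords)
    results = []
    for parent in root.iter():
        for el in parent:
            if not is_entity(parent, el):
                continue
            texts = {d.text.strip().lower()
                     for d in el.iter() if d.text and d.text.strip()}
            if all(p in texts for p in predicates):
                for d in el.iter():
                    if d.tag.lower() in return_specs:
                        results.append((d.tag, (d.text or "").strip()))
    return results


root = ET.fromstring(DOC)
print(search(root, ["Houston", "name"]))
```

In this sketch the query `Houston name` treats `Houston` as a predicate that selects a `team` entity and `name` as the requested return node. A real system would also need to handle ambiguous keywords, which the abstract says XSeek addresses by clustering results.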

Original language: English (US)
Article number: 10
Journal: ACM Transactions on Database Systems
Issue number: 2
State: Published - Apr 1 2010
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Information Systems


Keywords

  • Keyword search
  • Result clustering
  • XML

