The goal of this research project is to provide high-quality keyword search results on semi-structured data in XML format. To address the inherent ambiguity of keyword search, the project develops fundamental techniques and an effective search engine that exploit the meta-information in the data to infer user search intention and achieve high search quality. The project includes novel research in the following key areas: (1) Query Result Generation: identifying relevant nodes in XML data and composing atomic and intact query results, each of which represents an object of the inferred user search goal; (2) Query Result Presentation: developing techniques for result ranking, snippet generation, and result clustering, in order to help users quickly find the most relevant results; (3) Advanced Queries and Data Models: supporting expressive search options and handling XML data with rich constraints; and (4) Efficiency: developing techniques for performance optimization, including indexes, materialized views, and top-k query processing. Furthermore, an axiomatic evaluation framework is initiated for formally reasoning about XML keyword search strategies. The success of the project will advance the state of the art in keyword search on XML data, enhance the research and education infrastructure in this area, and have broader impacts on information discovery for both the general public and scientific communities. This research is integrated with education through curriculum enhancement, student advising, workshops, and outreach programs. Publications, software, and course materials resulting from this project will be disseminated via the project website (http://www.public.asu.edu/~ychen127/xseek/).
Effective start/end date: 3/1/09 → 3/31/13
- National Science Foundation: $384,342.00