Query expansion is a functionality of search engines that suggests a set of related queries for a user-issued keyword query. Typical corpus-driven keyword query expansion approaches return popular words in the results as expanded queries. Using these approaches, the expanded queries may correspond to a subset of possible query semantics, and thus miss relevant results. To handle ambiguous queries and exploratory queries, whose result relevance is difficult to judge, we propose a new framework for keyword query expan-sion: we start with clustering the results according to user specified granularity, and then generate expanded queries, such that one ex-panded query is generated for each cluster whose result set should ideally be the corresponding cluster. We formalize this problem and show its APX-hardness. Then we propose two efficient algorithms named iterative single-keyword refinement and partial elimination based convergence, respectively, which effectively generate a set of expanded queries from clustered results that provides a classifica-tion of the original query results. We believe our study of generat-ing an optimal query based on the ground truth of the query results not only has applications in query expansion, but has significance for studying keyword search quality in general.
|Original language||English (US)|
|Number of pages||12|
|Journal||Proceedings of the VLDB Endowment|
|State||Published - Mar 2011|
All Science Journal Classification (ASJC) codes
- Computer Science (miscellaneous)
- Computer Science(all)