Abstract
Query expansion is a functionality of search engines that suggests a set of related queries for a user-issued keyword query. Typical corpus-driven keyword query expansion approaches return popular words in the results as expanded queries. Using these approaches, the expanded queries may correspond to a subset of possible query semantics, and thus miss relevant results. To handle ambiguous queries and exploratory queries, whose result relevance is difficult to judge, we propose a new framework for keyword query expan-sion: we start with clustering the results according to user specified granularity, and then generate expanded queries, such that one ex-panded query is generated for each cluster whose result set should ideally be the corresponding cluster. We formalize this problem and show its APX-hardness. Then we propose two efficient algorithms named iterative single-keyword refinement and partial elimination based convergence, respectively, which effectively generate a set of expanded queries from clustered results that provides a classifica-tion of the original query results. We believe our study of generat-ing an optimal query based on the ground truth of the query results not only has applications in query expansion, but has significance for studying keyword search quality in general.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 350-361 |
| Number of pages | 12 |
| Journal | Proceedings of the VLDB Endowment |
| Volume | 4 |
| Issue number | 6 |
| DOIs | |
| State | Published - Mar 2011 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Computer Science (miscellaneous)
- General Computer Science