Abstract
Studies show that about 50% of Web search is for information exploration purposes, where a user would like to investigate, compare, evaluate, and synthesize multiple relevant results. Due to the absence of general tools that can effectively analyze and differentiate multiple results, a user has to manually read and comprehend potentially large results in an exploratory search. Such a process is time-consuming, labor-intensive, and error-prone. Interestingly, we find that the metadata information embedded in structured data provides a potential for automating or semi-automating the comparison of multiple results. In this article we present an approach for structured data search result differentiation. We define the differentiability of query results and quantify the degree of difference. Then we define the problem of identifying a limited number of valid features in a result that can maximally differentiate this result from the others, which we prove to be NP-hard. We propose two local optimality conditions, namely single-swap and multi-swap, and design efficient algorithms to achieve local optimality. We then present a feature type-based approach, which further improves the quality of the features identified for result differentiation. To show the usefulness of our approach, we implemented a system, CompareIt, which can be used to compare structured search results as well as any objects. Our empirical evaluation verifies the effectiveness and efficiency of the proposed approach.
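As a rough illustration of the single-swap idea, the following Python sketch selects at most `limit` features per result and keeps exchanging one selected feature for one unselected feature while the exchange raises a differentiation score. It is not the article's algorithm: the scoring function, the names `differentiation`, `neighbors`, and `single_swap`, and the representation of a result as a dictionary from feature type to value are all assumptions made for this example, and the article's notion of feature validity is not modeled.

```python
from itertools import combinations


def differentiation(selected):
    """Toy score (an assumption, not the article's definition): count the
    (result pair, feature type) combinations where both results show the
    type and their values differ."""
    total = 0
    for a, b in combinations(selected, 2):
        for ftype in a.keys() & b.keys():
            if a[ftype] != b[ftype]:
                total += 1
    return total


def neighbors(selected, results):
    """Yield every selection reachable by swapping one selected feature of
    one result for one currently unselected feature of the same result."""
    for i, full in enumerate(results):
        for out_type in selected[i]:
            for in_type, value in full.items():
                if in_type in selected[i]:
                    continue
                candidate = {t: v for t, v in selected[i].items() if t != out_type}
                candidate[in_type] = value
                yield selected[:i] + [candidate] + selected[i + 1:]


def single_swap(results, limit):
    """Hill-climb until no single swap improves the toy score.

    The start point is simply the first `limit` features of each result.
    The loop terminates because the score is a bounded integer that
    strictly increases with every accepted swap."""
    selected = [dict(list(r.items())[:limit]) for r in results]
    best = differentiation(selected)
    improved = True
    while improved:
        improved = False
        for trial in neighbors(selected, results):
            score = differentiation(trial)
            if score > best:
                selected, best, improved = trial, score, True
                break  # restart the neighborhood scan from the new selection
    return selected, best
```

With two hypothetical results `{"brand": "A", "price": 100}` and `{"price": 200, "brand": "A"}` and `limit=1`, the search moves from showing unrelated features to showing `price` for both results. A selection that no single swap can improve may still be suboptimal, which is presumably why a stronger multi-swap condition is also considered.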
Original language | English (US) |
---|---|
Article number | 4 |
Journal | ACM Transactions on Database Systems |
Volume | 37 |
Issue number | 1 |
State | Published - Feb 2012 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Information Systems
Keywords
- Comparison
- Databases
- Differentiation
- Keyword search
- Result analysis
- Structured data
- XML data