Differentiating search results on structured data

Ziyang Liu, Yi Chen

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Studies show that about 50% of Web search is for information exploration purposes, where a user would like to investigate, compare, evaluate, and synthesize multiple relevant results. Due to the absence of general tools that can effectively analyze and differentiate multiple results, a user has to manually read and comprehend potentially large results in an exploratory search. Such a process is time-consuming, labor-intensive, and error-prone. Interestingly, we find that the metadata information embedded in structured data offers the potential for automating or semi-automating the comparison of multiple results. In this article we present an approach for structured data search result differentiation. We define the differentiability of query results and quantify the degree of difference. Then we define the problem of identifying a limited number of valid features in a result that can maximally differentiate this result from the others, which we prove to be NP-hard. We propose two local optimality conditions, namely single-swap and multi-swap, and design efficient algorithms to achieve local optimality. We then present a feature type-based approach, which further improves the quality of the features identified for result differentiation. To show the usefulness of our approach, we implemented a system, CompareIt, which can be used to compare structured search results as well as any objects. Our empirical evaluation verifies the effectiveness and efficiency of the proposed approach.
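To make the idea of single-swap local optimality concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a simplified model that the abstract does not spell out: each result is a flat mapping from feature type to value, the degree of difference between two results counts the feature types selected for both whose values differ, and the validity constraints on features are ignored. The names `degree_of_difference`, `find_improving_swap`, and `single_swap_search` are hypothetical.

```python
from itertools import combinations


def degree_of_difference(results, selected):
    """Assumed measure: over all result pairs, count the feature types
    selected for both results whose values differ."""
    total = 0
    for i, j in combinations(range(len(results)), 2):
        shared = selected[i] & selected[j]
        total += sum(1 for t in shared if results[i][t] != results[j][t])
    return total


def find_improving_swap(results, selected, best):
    """Return (new_selection, new_score) for the first single swap that
    strictly raises the score, or None if no such swap exists."""
    for i, result in enumerate(results):
        for out in sorted(selected[i]):
            for new in sorted(set(result) - selected[i]):
                candidate = (selected[i] - {out}) | {new}
                trial = selected[:i] + [candidate] + selected[i + 1:]
                score = degree_of_difference(results, trial)
                if score > best:
                    return trial, score
    return None


def single_swap_search(results, limit):
    """Local search: start from an arbitrary selection of at most `limit`
    feature types per result, then apply improving swaps until none is left."""
    selected = [set(sorted(r)[:limit]) for r in results]
    best = degree_of_difference(results, selected)
    while True:
        step = find_improving_swap(results, selected, best)
        if step is None:
            return selected, best  # no single swap improves: local optimum
        selected, best = step


# Hypothetical search results for illustration only.
results = [
    {"brand": "A", "color": "red", "price": "low"},
    {"color": "red", "price": "high", "size": "M"},
]
selected, best = single_swap_search(results, limit=2)
```

In this toy run, the first result swaps `brand` for `price`, so both results expose `price`, the one feature type on which they differ. Because only strictly improving swaps are accepted, the search terminates, but it may stop at a selection that is locally rather than globally optimal.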

Original language: English (US)
Article number: 4
Journal: ACM Transactions on Database Systems
Volume: 37
Issue number: 1
DOIs
State: Published - Feb 1 2012
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Information Systems

Keywords

  • Comparison
  • Databases
  • Differentiation
  • Keyword search
  • Result analysis
  • Structured data
  • XML data
