Diversification of keyword query result patterns

Cem Aksoy, Ananya Dass, Dimitri Theodoratos, Xiaoying Wu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation


Keyword search allows users to search for information on tree data without using a complex query language and without knowing the schema of the data sources. However, keyword queries are usually ambiguous in expressing the user intent. Most current keyword search approaches either filter the candidate result set or rank it with a scoring function. These techniques do not differentiate the results and might return to the user a result set that is not the intended one. To address this problem, we introduce in this paper an original approach for diversification of keyword search results on tree data, which aims at returning a subset of the candidate result set that trades off relevance for diversity. We formally define the problem of diversification of patterns of keyword search results on tree data as an optimization problem. We introduce relevance and diversity measures on result pattern sets. We design a greedy heuristic algorithm that chooses the top-k most relevant and diverse result patterns for a given keyword query. Our experimental results show that the introduced relevance and diversity measures can be used effectively and that our algorithm can efficiently compute a set of result patterns for keyword queries that is both relevant and diverse.
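
The abstract does not spell out the relevance and diversity measures or the exact greedy procedure, so the following is only a generic illustration of greedy top-k selection that trades relevance against diversity, not the authors' algorithm. The function names, the `trade_off` weight, the Jaccard distance over label sets, and the sample patterns and scores are all assumptions made for this sketch.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def greedy_diversify(
    candidates: Sequence[T],
    relevance: Callable[[T], float],
    distance: Callable[[T, T], float],
    k: int,
    trade_off: float = 0.5,
) -> List[T]:
    """Greedily pick up to k candidates, balancing relevance and diversity.

    trade_off = 0.0 ranks by relevance only; higher values reward
    candidates that are far from everything already selected.
    """
    selected: List[T] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def gain(c: T) -> float:
            # Diversity of c: distance to its closest already selected item
            # (0.0 while nothing has been selected yet).
            diversity = min((distance(c, s) for s in selected), default=0.0)
            return (1.0 - trade_off) * relevance(c) + trade_off * diversity

        best = max(remaining, key=gain)
        selected.append(best)
        remaining.remove(best)
    return selected


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """1 - |a ∩ b| / |a ∪ b|, a stand-in for a pattern dissimilarity measure."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Hypothetical result patterns, each reduced to the set of node labels it uses.
patterns = {
    "p1": frozenset({"paper", "author", "title"}),
    "p2": frozenset({"paper", "author", "year"}),
    "p3": frozenset({"book", "editor", "publisher"}),
}
# Hypothetical relevance scores for the same patterns.
scores = {"p1": 0.9, "p2": 0.8, "p3": 0.5}


def pattern_distance(x: str, y: str) -> float:
    return jaccard_distance(patterns[x], patterns[y])


diverse_top2 = greedy_diversify(list(patterns), scores.get, pattern_distance, k=2)
relevant_top2 = greedy_diversify(
    list(patterns), scores.get, pattern_distance, k=2, trade_off=0.0
)
print(diverse_top2)   # ['p1', 'p3']
print(relevant_top2)  # ['p1', 'p2']
```

With equal weights, the second pick is `p3`: it is less relevant than `p2` but shares no labels with `p1`, so its diversity term outweighs the relevance gap. Setting `trade_off` to 0 falls back to plain relevance ranking.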

Original language: English (US)
Title of host publication: Web-Age Information Management - 17th International Conference, WAIM 2016, Proceedings
Editors: Bin Cui, Xiang Lian, Dexi Liu, Nan Zhang, Jianliang Xu
Publisher: Springer Verlag
Number of pages: 13
ISBN (Print): 9783319399577
State: Published - 2016
Event: 17th International Conference on Web-Age Information Management, WAIM 2016 - Nanchang, China
Duration: Jun 3 2016 to Jun 5 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Other: 17th International Conference on Web-Age Information Management, WAIM 2016

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

