TY - GEN
T1 - Trading off popularity for diversity in the results sets of keyword queries on linked data
AU - Dass, Ananya
AU - Theodoratos, Dimitri
N1 - Publisher Copyright: © Springer International Publishing AG 2017.
PY - 2017
Y1 - 2017
AB - Keyword search is the most popular technique for querying the ever-growing repositories of RDF graph data on the Web. However, keyword queries are ambiguous. As a consequence, on linked data they typically produce a huge number of candidate results corresponding to a plethora of alternative query interpretations. Current approaches ignore the diversity of the result interpretations and might fail to satisfy users who are looking for less popular results. In this paper, we propose a novel approach for keyword search result diversification on RDF graphs. Our approach, instead of diversifying the query results per se, diversifies the interpretations of the query (i.e., pattern graphs). We model the problem as an optimization problem aiming at selecting k pattern graphs which maximize an objective function balancing relevance and diversity. We devise metrics to assess the relevance and diversity of a set of pattern graphs, and we design a greedy heuristic algorithm to generate a relevant and diverse list of k pattern graphs for a given keyword query. The experimental results show the effectiveness of our approach and proposed metrics, as well as the efficiency of our algorithm.
UR - http://www.scopus.com/inward/record.url?scp=85020532523&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85020532523&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-60131-1_9
DO - 10.1007/978-3-319-60131-1_9
M3 - Conference contribution
AN - SCOPUS:85020532523
SN - 9783319601304
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 151
EP - 170
BT - Web Engineering - 17th International Conference, ICWE 2017, Proceedings
A2 - Cabot, Jordi
A2 - De Virgilio, Roberto
A2 - Torlone, Riccardo
PB - Springer Verlag
T2 - 17th International Conference on Web Engineering, ICWE 2017
Y2 - 5 June 2017 through 8 June 2017
ER -