TY - GEN
T1 - Cohesive keyword search on tree data
AU - Dimitriou, Aggeliki
AU - Theodoratos, Dimitri
AU - Dass, Ananya
AU - Vassiliou, Yannis
N1 - Publisher Copyright: © 2016, Copyright is with the authors.
PY - 2016
Y1 - 2016
AB - Keyword search is the most popular querying technique on semistructured data. Keyword queries are simple and convenient. However, as a consequence of their imprecision, there is usually a huge number of candidate results, of which only very few match the user's intent. Unfortunately, the existing semantics for keyword queries are ad hoc and generally fail to "guess" the user's intent. Therefore, the quality of their answers is poor and the existing algorithms do not scale satisfactorily. In this paper, we introduce the novel concept of cohesive keyword queries for tree data. Intuitively, a cohesiveness relationship on keywords indicates that they should form a cohesive whole in a query result. Cohesive keyword queries allow term nesting and keyword repetition. Cohesive keyword queries bridge the gap between flat keyword queries and structured queries. Although more expressive, they are as simple as flat keyword queries and do not require any schema knowledge. We provide formal semantics for cohesive keyword queries and rank query results by the proximity of the keyword instances. We design a stack-based algorithm which efficiently evaluates cohesive keyword queries. Our experiments demonstrate that our approach outperforms previous filtering semantics in quality and that our algorithm scales smoothly on queries of even 20 keywords on large datasets.
UR - http://www.scopus.com/inward/record.url?scp=85046689352&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85046689352&partnerID=8YFLogxK
U2 - 10.5441/002/edbt.2016.15
DO - 10.5441/002/edbt.2016.15
M3 - Conference contribution
AN - SCOPUS:85046689352
T3 - Advances in Database Technology - EDBT
SP - 137
EP - 148
BT - Advances in Database Technology - EDBT 2016
A2 - Manolescu, Ioana
A2 - Pitoura, Evaggelia
A2 - Marian, Amelie
A2 - Maabout, Sofian
A2 - Tanca, Letizia
A2 - Koutrika, Georgia
A2 - Stefanidis, Kostas
PB - OpenProceedings.org
T2 - 19th International Conference on Extending Database Technology, EDBT 2016
Y2 - 15 March 2016 through 18 March 2016
ER -