TY - GEN
T1 - Heuristic containment check of partial tree-pattern queries in the presence of index graphs
AU - Theodoratos, Dimitri
AU - Souldatos, Stefanos
AU - Dalamagas, Theodore
AU - Placek, Pawel
AU - Sellis, Timos
PY - 2006
Y1 - 2006
N2 - The wide adoption of XML has increased the interest of the database community on tree-structured data management techniques. Querying capabilities are provided through tree-pattern queries. The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures have driven, recently, the suggestion of query languages that relax the complete specification of a tree pattern. In this paper, we use a query language which allows partial tree-pattern queries (PTPQs). The structure in a PTPQ can be flexibly specified fully, partially or not at all. To evaluate a PTPQ, we exploit index graphs which generate an equivalent set of "complete" tree-pattern queries.In order to process PTPQs, we need to efficiently solve the PTPQ satisfiability and containment problems. These problems become more complex in the context of PTPQs because the partial specification of the structure allows new, non-trivial, structural expressions to be derived from those explicitly specified in a PTPQ. We address the problem of PTPQ satisfiability and containment in the absence and in the presence of index graphs, and we provide necessary and sufficient conditions for each case. To cope with the high complexity of PTPQ containment in the presence of index graphs,we study a family of heuristic approaches for PTPQ containment based on structural information extracted from the index graph in advance and on-the-fly. We implement our approaches and we report on their extensive experimental evaluation and comparison.
AB - The wide adoption of XML has increased the interest of the database community on tree-structured data management techniques. Querying capabilities are provided through tree-pattern queries. The need for querying tree-structured data sources when their structure is not fully known, and the need to integrate multiple data sources with different tree structures have driven, recently, the suggestion of query languages that relax the complete specification of a tree pattern. In this paper, we use a query language which allows partial tree-pattern queries (PTPQs). The structure in a PTPQ can be flexibly specified fully, partially or not at all. To evaluate a PTPQ, we exploit index graphs which generate an equivalent set of "complete" tree-pattern queries.In order to process PTPQs, we need to efficiently solve the PTPQ satisfiability and containment problems. These problems become more complex in the context of PTPQs because the partial specification of the structure allows new, non-trivial, structural expressions to be derived from those explicitly specified in a PTPQ. We address the problem of PTPQ satisfiability and containment in the absence and in the presence of index graphs, and we provide necessary and sufficient conditions for each case. To cope with the high complexity of PTPQ containment in the presence of index graphs,we study a family of heuristic approaches for PTPQ containment based on structural information extracted from the index graph in advance and on-the-fly. We implement our approaches and we report on their extensive experimental evaluation and comparison.
KW - Partial tree-pattern query
KW - Query containment
KW - Tree-structured data
UR - http://www.scopus.com/inward/record.url?scp=34547641021&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34547641021&partnerID=8YFLogxK
U2 - 10.1145/1183614.1183679
DO - 10.1145/1183614.1183679
M3 - Conference contribution
AN - SCOPUS:34547641021
SN - 1595934332
SN - 9781595934338
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 445
EP - 454
BT - Proceedings of the 15th ACM Conference on Information and Knowledge Management, CIKM 2006
T2 - 15th ACM Conference on Information and Knowledge Management, CIKM 2006
Y2 - 6 November 2006 through 11 November 2006
ER -