TY - GEN
T1 - QPIAD
T2 - 23rd International Conference on Data Engineering, ICDE 2007
AU - Khatri, Hemal
AU - Fan, Jianchun
AU - Chen, Yi
AU - Kambhampati, Subbarao
PY - 2007
Y1 - 2007
N2 - Incompleteness due to missing attribute values (aka "null values") is very common in autonomous web databases, on which user accesses are usually supported through mediators. Traditional query processing techniques that focus on the strict soundness of answer tuples often ignore tuples with critical missing attributes, even if they wind up being relevant to a user query. Ideally we would like the mediator to retrieve such relevant uncertain answers and gauge their relevance by accessing their likelihood of being relevant answers to the query. However, the autonomous nature of the databases poses several challenges, such as the restricted access privileges, limited query patterns, and sensitivity of database and network resource consumption in the web environment. We introduce a novel query rewriting and optimization framework QPIAD that tackles these challenges to retrieve relevant uncertain answers. Our technique involves reformulating the user query based on approximate functional dependencies (AFDs) among the database attributes and ranking these queries using value distributions learned from Naïve Bayes Classifiers. Empirical studies demonstrate the effectiveness of our approach in retrieving relevant uncertain answers with high precision, high recall and manageable cost.
AB - Incompleteness due to missing attribute values (aka "null values") is very common in autonomous web databases, on which user accesses are usually supported through mediators. Traditional query processing techniques that focus on the strict soundness of answer tuples often ignore tuples with critical missing attributes, even if they wind up being relevant to a user query. Ideally we would like the mediator to retrieve such relevant uncertain answers and gauge their relevance by accessing their likelihood of being relevant answers to the query. However, the autonomous nature of the databases poses several challenges, such as the restricted access privileges, limited query patterns, and sensitivity of database and network resource consumption in the web environment. We introduce a novel query rewriting and optimization framework QPIAD that tackles these challenges to retrieve relevant uncertain answers. Our technique involves reformulating the user query based on approximate functional dependencies (AFDs) among the database attributes and ranking these queries using value distributions learned from Naïve Bayes Classifiers. Empirical studies demonstrate the effectiveness of our approach in retrieving relevant uncertain answers with high precision, high recall and manageable cost.
KW - Autonomous databases
KW - Incomplete databases
KW - Query rewriting
KW - Querying hidden web
UR - http://www.scopus.com/inward/record.url?scp=34548755508&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34548755508&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2007.369028
DO - 10.1109/ICDE.2007.369028
M3 - Conference contribution
AN - SCOPUS:34548755508
SN - 1424408032
SN - 9781424408030
T3 - Proceedings - International Conference on Data Engineering
SP - 1430
EP - 1432
BT - 23rd International Conference on Data Engineering, ICDE 2007
Y2 - 15 April 2007 through 20 April 2007
ER -