TY - GEN
T1 - AUDIO
T2 - 2012 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2012
AU - Liu, Ruilin
AU - Wang, Hui
AU - Monreale, Anna
AU - Pedreschi, Dino
AU - Giannotti, Fosca
AU - Guo, Wenge
PY - 2012
Y1 - 2012
N2 - Spurred by developments such as cloud computing, there has been considerable recent interest in the data-mining-as-a-service paradigm. Users lacking in expertise or computational resources can outsource their data and mining needs to a third-party service provider (server). Outsourcing, however, raises issues about result integrity: how can the data owner verify that the mining results returned by the server are correct? In this paper, we present AUDIO, an integrity auditing framework for the specific task of distance-based outlier mining outsourcing. It provides efficient and practical verification approaches to check both completeness and correctness of the mining results. The key idea of our approach is to insert a small number of artificial tuples into the outsourced data; the artificial tuples will produce artificial outliers and non-outliers that do not exist in the original dataset. The server's answer is verified by analyzing the presence of artificial outliers/non-outliers, obtaining a probabilistic guarantee of correctness and completeness of the mining result. Our empirical results show the effectiveness and efficiency of our method.
UR - http://www.scopus.com/inward/record.url?scp=84866862402&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84866862402&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-33486-3_1
DO - 10.1007/978-3-642-33486-3_1
M3 - Conference contribution
AN - SCOPUS:84866862402
SN - 9783642334856
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 1
EP - 18
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2012, Proceedings
Y2 - 24 September 2012 through 28 September 2012
ER -