TY - GEN
T1 - DeAnomalyzer
T2 - 5th IEEE International Conference on Artificial Intelligence Testing, AITest 2023
AU - Ahmed, Muyeed
AU - Neamtiu, Iulian
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Anomaly Detection (AD) is a popular unsupervised learning technique, but AD implementations are difficult to test, understand, and ultimately improve. Contributing factors for these difficulties include the lack of a specification to test against, output differences (on the same input) between toolkits that supposedly implement the same AD algorithm, and no linkage between learning parameters and undesirable outcomes. We have implemented DeAnomalyzer, a black-box tool that improves AD reliability by addressing two issues: nondeterminism (wide output variations across repeated runs of the same implementation on the same dataset) and inconsistency (wide output variations between toolkits on the same dataset). Specifically, DeAnomalyzer uses a feedback-directed, gradient descent-like approach to search for toolkit parameter settings that maximize determinism and consistency. DeAnomalyzer can operate in two modes: univariate, without ground truth, targeted to general users, and bivariate, with ground truth, targeted to algorithm designers and developers. We evaluated DeAnomalyzer on 54 AD datasets and the implementations of four AD algorithms in three popular ML toolkits: MATLAB, R, and Scikit-learn. The evaluation has revealed that DeAnomalyzer is effective at increasing determinism and consistency without sacrificing performance, and can even improve performance.
AB - Anomaly Detection (AD) is a popular unsupervised learning technique, but AD implementations are difficult to test, understand, and ultimately improve. Contributing factors for these difficulties include the lack of a specification to test against, output differences (on the same input) between toolkits that supposedly implement the same AD algorithm, and no linkage between learning parameters and undesirable outcomes. We have implemented DeAnomalyzer, a black-box tool that improves AD reliability by addressing two issues: nondeterminism (wide output variations across repeated runs of the same implementation on the same dataset) and inconsistency (wide output variations between toolkits on the same dataset). Specifically, DeAnomalyzer uses a feedback-directed, gradient descent-like approach to search for toolkit parameter settings that maximize determinism and consistency. DeAnomalyzer can operate in two modes: univariate, without ground truth, targeted to general users, and bivariate, with ground truth, targeted to algorithm designers and developers. We evaluated DeAnomalyzer on 54 AD datasets and the implementations of four AD algorithms in three popular ML toolkits: MATLAB, R, and Scikit-learn. The evaluation has revealed that DeAnomalyzer is effective at increasing determinism and consistency without sacrificing performance, and can even improve performance.
KW - AI reliability
KW - AI testing
KW - Anomaly Detection
KW - Machine Learning
KW - Nondeterminism
KW - Verification
UR - http://www.scopus.com/inward/record.url?scp=85172272695&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85172272695&partnerID=8YFLogxK
U2 - 10.1109/AITest58265.2023.00012
DO - 10.1109/AITest58265.2023.00012
M3 - Conference contribution
AN - SCOPUS:85172272695
T3 - Proceedings - 5th IEEE International Conference on Artificial Intelligence Testing, AITest 2023
SP - 17
EP - 25
BT - Proceedings - 5th IEEE International Conference on Artificial Intelligence Testing, AITest 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 17 July 2023 through 20 July 2023
ER -
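
Note on the approach described in the abstract above: DeAnomalyzer searches toolkit parameter settings with a feedback-directed, gradient descent-like procedure that maximizes determinism and consistency. The following is a minimal illustrative sketch of that general idea only, not the authors' DeAnomalyzer implementation. It assumes scikit-learn's IsolationForest as the anomaly detector, uses mean pairwise label agreement across repeated runs as a stand-in determinism score, and performs a simple greedy neighbor search over a single hypothetical parameter (n_estimators); all of these specifics are assumptions made for illustration.

    # Illustrative sketch (hypothetical, not DeAnomalyzer itself): greedy,
    # feedback-directed search over one detector parameter, scoring each
    # candidate by how deterministic the detector's output is across
    # repeated runs on the same dataset (univariate mode: no ground truth).
    import numpy as np
    from itertools import combinations
    from sklearn.ensemble import IsolationForest

    def determinism_score(n_estimators, X, n_runs=5):
        """Mean pairwise agreement of predicted labels over repeated runs."""
        runs = [IsolationForest(n_estimators=n_estimators).fit_predict(X)
                for _ in range(n_runs)]
        return float(np.mean([np.mean(a == b) for a, b in combinations(runs, 2)]))

    def search_parameter(X, start=50, step=50, max_iters=10):
        """Move the parameter toward the neighbor that improves the
        determinism score; stop when no neighboring setting helps."""
        current = start
        best_score = determinism_score(current, X)
        for _ in range(max_iters):
            candidates = [current + step, max(1, current - step)]
            scores = [determinism_score(c, X) for c in candidates]
            i = int(np.argmax(scores))
            if scores[i] <= best_score:
                break  # no improvement from neighboring settings
            current, best_score = candidates[i], scores[i]
        return current, best_score

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 4))  # stand-in dataset, no ground truth
        p, s = search_parameter(X)
        print(f"chosen n_estimators={p}, determinism={s:.3f}")

The paper's bivariate mode additionally uses ground-truth labels, and its consistency objective compares outputs across toolkits (MATLAB, R, Scikit-learn); neither is modeled in this sketch.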