Abstract
Adversarial attacks can fool convolutional networks and leave systems vulnerable to fraud and deception, so defending against such malicious attacks is a critical challenge in practice. Adversarial attacks are often conducted by adding tiny perturbations to images that cause network misclassification. Noise reduction can defend against these attacks; however, it is not suited to all cases. Considering that different models tolerate adversarial perturbations to different degrees, we develop a novel detection module that removes noise through an adaptive process and detects adversarial attacks without modifying the models. Experimental results show that, by comparing classification results on adversarial samples from MNIST and two subclasses of ImageNet, our models remove most of the noise and obtain detection accuracies of 97.71% and 92.96%, respectively. Furthermore, our adaptive module can be assembled into different networks, achieving detection accuracies of 70.83% and 71.96% on white-box adversarial attacks against ResNet18 and SCD01MLP, respectively. Under black-box attacks, the best accuracy obtained for both networks is 62.5%.
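The detection idea sketched in the abstract, removing noise from the input and comparing the classification result before and after, can be illustrated as follows. This is a minimal sketch and not the paper's implementation: the fixed mean-filter `denoise` function, the `is_adversarial` helper, and the `model` argument are illustrative assumptions standing in for the adaptive noise-reduction module described above.

```python
# Minimal sketch (assumed, not the authors' code): flag an input as adversarial
# when the model's prediction changes after noise reduction.
import torch
import torch.nn.functional as F


def denoise(x: torch.Tensor, kernel_size: int = 3) -> torch.Tensor:
    """Placeholder denoiser: a simple per-channel mean filter.

    The paper's module adapts the noise-reduction process to the model;
    here a fixed filter is used purely for illustration.
    """
    padding = kernel_size // 2
    weight = torch.ones(x.shape[1], 1, kernel_size, kernel_size, device=x.device)
    weight /= kernel_size * kernel_size
    return F.conv2d(x, weight, padding=padding, groups=x.shape[1])


@torch.no_grad()
def is_adversarial(model: torch.nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Return a boolean mask: True where the prediction flips after denoising."""
    model.eval()
    pred_raw = model(x).argmax(dim=1)
    pred_clean = model(denoise(x)).argmax(dim=1)
    return pred_raw != pred_clean
```

In this sketch, a prediction that changes after denoising is treated as evidence of an adversarial perturbation; the detection module described in the abstract instead adapts the noise-reduction step to the tolerance of the attacked network rather than using a fixed filter.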
| Field | Value |
|---|---|
| Original language | English (US) |
| Article number | 2252022 |
| Journal | International Journal of Pattern Recognition and Artificial Intelligence |
| Volume | 36 |
| Issue number | 12 |
| DOIs | |
| State | Published - Sep 30 2022 |
All Science Journal Classification (ASJC) codes
- Software
- Computer Vision and Pattern Recognition
- Artificial Intelligence
Keywords
- Robust machine learning
- adversarial attack
- image reconstruction
- noise reduction