A Synergetic Attack against Neural Network Classifiers combining Backdoor and Adversarial Examples

Guanxiong Liu, Issa Khalil, Abdallah Khreishah, Nhat Hai Phan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

The pervasiveness of neural networks (NNs) in critical computer vision and image processing applications makes them very attractive for adversarial manipulation. A large body of existing research thoroughly investigates two broad categories of attacks targeting the integrity of NN models. The first category of attacks, commonly called Adversarial Examples, perturbs the model's inference by carefully adding noise into input examples. In the second category of attacks, adversaries try to manipulate the model during the training process by implanting Trojan backdoors. Researchers show that such attacks pose severe threats to the growing applications of NNs and propose several defenses against each attack type individually. However, such one-sided defense approaches leave potentially unknown risks in real-world scenarios when an adversary can unify different attacks to create new and more lethal ones bypassing existing defenses.In this work, we show how to jointly exploit adversarial perturbation and model poisoning vulnerabilities to practically launch a new stealthy attack, dubbed AdvTrojan. AdvTrojan is stealthy because it can be activated only when: 1) a carefully crafted adversarial perturbation is injected into the input examples during inference, and 2) a Trojan backdoor is implanted during the training process of the model. We leverage adversarial noise in the input space to move Trojan-infected examples across the model decision boundary, making it difficult to detect. The stealthiness behavior of AdvTrojan fools the users into accidentally trusting the infected model as a robust classifier against adversarial examples. AdvTrojan can be implemented by only poisoning the training data similar to conventional Trojan backdoor attacks. Our thorough analysis and extensive experiments on several benchmark datasets show that AdvTrojan can bypass existing defenses with a success rate close to 100% in most of our experimental scenarios and can be extended to attack federated learning as well as high-resolution images.

Original languageEnglish (US)
Title of host publicationProceedings - 2021 IEEE International Conference on Big Data, Big Data 2021
EditorsYixin Chen, Heiko Ludwig, Yicheng Tu, Usama Fayyad, Xingquan Zhu, Xiaohua Tony Hu, Suren Byna, Xiong Liu, Jianping Zhang, Shirui Pan, Vagelis Papalexakis, Jianwu Wang, Alfredo Cuzzocrea, Carlos Ordonez
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages834-846
Number of pages13
ISBN (Electronic)9781665439022
DOIs
StatePublished - 2021
Event2021 IEEE International Conference on Big Data, Big Data 2021 - Virtual, Online, United States
Duration: Dec 15 2021Dec 18 2021

Publication series

NameProceedings - 2021 IEEE International Conference on Big Data, Big Data 2021

Conference

Conference2021 IEEE International Conference on Big Data, Big Data 2021
Country/TerritoryUnited States
CityVirtual, Online
Period12/15/2112/18/21

All Science Journal Classification (ASJC) codes

  • Information Systems and Management
  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Information Systems

Keywords

  • Neural networks
  • Trojan attack
  • adversarial attack

Fingerprint

Dive into the research topics of 'A Synergetic Attack against Neural Network Classifiers combining Backdoor and Adversarial Examples'. Together they form a unique fingerprint.

Cite this