WICKED ODDITIES: SELECTIVELY POISONING FOR EFFECTIVE CLEAN-LABEL BACKDOOR ATTACKS

  • Quang H. Nguyen
  • Nguyen Ngoc-Hieu
  • The Anh Ta
  • Thanh Nguyen-Tang
  • Kok Seng Wong
  • Hoang Thanh-Tung
  • Khoa D. Doan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Deep neural networks are vulnerable to backdoor attacks, which poison the training data to manipulate the behavior of models trained on such data. Clean-label backdoor attacks are a stealthier form of attack, as they do not change the labels of the poisoned data. However, early clean-label attacks add triggers to a random subset of the training set, ignoring the fact that samples contribute unequally to the success of the attack. Consequently, they either require high poisoning rates or fail to achieve high attack success rates. To alleviate the problem, several supervised learning-based sample selection strategies have been proposed; these methods assume access to the entire labeled training set and require training, which can be expensive and may not always be practical. This work studies a new and more practical (but also more challenging) threat model where the attacker only provides data for the target class (e.g., in face recognition systems) and has no knowledge of the victim model or any other classes in the training set. We study different strategies for selectively poisoning a small set of training samples in the target class to boost the attack success rate in this setting. This threat model poses a serious risk when training machine learning models with third-party datasets, since the attack can be performed effectively with limited information. Extensive experiments on multiple benchmark datasets illustrate the effectiveness of our strategies in improving clean-label backdoor attacks. Our implementation is available here.

Original language: English (US)
Title of host publication: 13th International Conference on Learning Representations, ICLR 2025
Publisher: International Conference on Learning Representations, ICLR
Pages: 3033-3055
Number of pages: 23
ISBN (Electronic): 9798331320850
State: Published - 2025
Externally published: Yes
Event: 13th International Conference on Learning Representations, ICLR 2025 - Singapore, Singapore
Duration: Apr 24 2025 → Apr 28 2025

Publication series

Name: 13th International Conference on Learning Representations, ICLR 2025

Conference

Conference: 13th International Conference on Learning Representations, ICLR 2025
Country/Territory: Singapore
City: Singapore
Period: 4/24/25 → 4/28/25

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language
