How Vulnerable are Large Language Models (LLMs) against Adversarial Bit-Flip Attacks?

Abeer Matar A. Almalky, Ranyang Zhou, Shaahin Angizi, Adnan Siraj Rakin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

The recent popularity of Large Language Models (LLMs) has transformed the capabilities of modern Artificial Intelligence (AI) across a wide range of safety-critical applications. Hence, it is necessary to explore and understand their vulnerability across different attack surfaces. Previous works have explored the vulnerability of LLMs through software-level input manipulation attacks, such as jailbreaking or backdoor attacks. However, their vulnerability against adversarial weight perturbation through memory fault injection mechanisms, such as the Bit-Flip Attack (BFA), remains underexplored. To make matters worse, the enormous size of LLMs makes it challenging to fit an entire model into on-chip memory. Hence, a large portion of the model must be stored in off-chip memory (DRAM), making it potentially vulnerable to memory fault injection attacks. In this work, for the first time, we explore the vulnerability of LLMs against adversarial weight perturbation attacks that leverage bit-flips in memory. For our analysis, we consider two representative attacks with distinct characteristics: first, the Bit-Flip Attack (BFA), which flips bits in the model weights to degrade performance, and second, Deep-TROJ, which flips bits in the page frame number to insert backdoor behavior into the model. Although both of these attacks were developed for vision applications, our exploration in this work seeks to answer the following question: how vulnerable are LLMs against such adversarial bit-flip attacks? We perform extensive experiments on benchmark LLMs and datasets to answer this question and provide key insights, which will serve as a foundational basis for future attack development on LLMs.
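To make the bit-flip threat model concrete, the minimal sketch below simulates the core primitive behind a BFA: toggling a single bit of an INT8-quantized weight as it would sit in DRAM. This is an illustrative, assumption-laden example rather than the attack procedure from the paper; the flip_bit helper, the example weight values, and the chosen bit position are hypothetical, and a real BFA additionally searches for the most damaging (weight, bit) pairs before injecting the faults (e.g., via RowHammer).

    import numpy as np

    def flip_bit(weights_int8, flat_index, bit_position):
        """Toggle one bit of one INT8 weight (simulated memory fault).

        Hypothetical helper for illustration only; it does not implement the
        bit-search or fault-injection steps of an actual attack.
        """
        perturbed = weights_int8.copy()
        raw = perturbed.view(np.uint8)  # reinterpret two's-complement bytes
        raw.reshape(-1)[flat_index] ^= np.uint8(1 << bit_position)  # XOR toggles the chosen bit
        return perturbed

    # Flipping the most significant bit turns 90 (0x5A) into -38 (0xDA),
    # illustrating why a handful of targeted flips can sharply degrade accuracy.
    w = np.array([[90, -3], [17, 64]], dtype=np.int8)
    print(flip_bit(w, flat_index=0, bit_position=7))

Deep-TROJ, by contrast, leaves the weight values themselves untouched and instead flips bits in the page frame number of a page-table entry, in effect remapping which physical page the weights are fetched from; that remapping behavior cannot be captured by a value-level sketch like the one above.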

Original language: English (US)
Title of host publication: GLSVLSI 2025 - Proceedings of the Great Lakes Symposium on VLSI 2025
Publisher: Association for Computing Machinery
Pages: 534-539
Number of pages: 6
ISBN (Electronic): 9798400714962
DOIs
State: Published - Jun 29, 2025
Event: 35th Edition of the Great Lakes Symposium on VLSI 2025, GLSVLSI 2025 - New Orleans, United States
Duration: Jun 30, 2025 – Jul 2, 2025

Publication series

Name: Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI

Conference

Conference: 35th Edition of the Great Lakes Symposium on VLSI 2025, GLSVLSI 2025
Country/Territory: United States
City: New Orleans
Period: 6/30/25 – 7/2/25

All Science Journal Classification (ASJC) codes

  • General Engineering

Keywords

  • Attack
  • Bit-Flip
  • LLM
  • Security

