TY - GEN
T1 - How Vulnerable are Large Language Models (LLMs) against Adversarial Bit-Flip Attacks?
AU - Almalky, Abeer Matar A.
AU - Zhou, Ranyang
AU - Angizi, Shaahin
AU - Rakin, Adnan Siraj
N1 - Publisher Copyright:
© 2025 Copyright held by the owner/author(s).
PY - 2025/6/29
Y1 - 2025/6/29
AB - The recent popularity of Large Language Models (LLMs) has transformed the capabilities of modern Artificial Intelligence (AI) across a wide range of safety-critical applications. Hence, it is necessary to explore and understand their vulnerability across different attack surfaces. Previous works have explored the vulnerability of LLMs through software-level input manipulation attacks, such as jailbreaking or backdoor attacks. However, their vulnerability to adversarial weight perturbation through memory fault injection mechanisms such as the Bit-Flip Attack (BFA) remains underexplored. To make matters worse, the enormous size of LLMs makes it challenging to fit the entire model into on-chip memory. Hence, a large portion of the model must be stored in off-chip memory (DRAM), making it potentially vulnerable to memory fault injection attacks. In this work, for the first time, we explore the vulnerability of LLMs to adversarial weight perturbation attacks that leverage bit-flips in memory. For our analysis, we select two representative attacks with distinct characteristics: first, the Bit-Flip Attack (BFA), which injects bit-flips into the model weights to degrade performance, and second, Deep-TROJ, which injects bit-flips into page frame numbers to insert backdoor behavior into the model. Although both attacks were developed for vision applications, our exploration in this work aims to answer the following question: How vulnerable are LLMs against such adversarial bit-flip attacks? We perform extensive experiments on benchmark LLMs and datasets to answer this question and provide key insights, which will serve as a foundational basis for future attack development on LLMs.
KW - Attack
KW - Bit-Flip
KW - LLM
KW - Security
UR - https://www.scopus.com/pages/publications/105017655586
UR - https://www.scopus.com/inward/citedby.url?scp=105017655586&partnerID=8YFLogxK
DO - 10.1145/3716368.3735278
M3 - Conference contribution
AN - SCOPUS:105017655586
T3 - Proceedings of the ACM Great Lakes Symposium on VLSI, GLSVLSI
SP - 534
EP - 539
BT - GLSVLSI 2025 - Proceedings of the Great Lakes Symposium on VLSI 2025
PB - Association for Computing Machinery
T2 - 35th Edition of the Great Lakes Symposium on VLSI 2025, GLSVLSI 2025
Y2 - 30 June 2025 through 2 July 2025
ER -