
Casper: Prompt Sanitization for Protecting User Privacy in Web-Based Large Language Models

  • Chun Jie Chong
  • Chenxi Hou
  • Zhihao Yao
  • Seyed Mohammadjavad Seyed Talebi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Web-based Large Language Model (LLM) services have been widely adopted and have become an integral part of our Internet experience. Third-party plugins extend the functionality of LLMs by enabling access to real-world data and services. However, the privacy consequences of these services and their third-party plugins are not well understood: sensitive prompt data are stored, processed, and shared by cloud-based LLM providers and third-party plugins. In this paper, we propose Casper, a prompt sanitization technique that aims to protect user privacy by detecting and pseudonymizing sensitive information in user inputs before they are sent to LLM services. Casper runs entirely on the user's device as a browser extension and does not require any changes to the online LLM services. At the core of Casper is a three-layered sanitization mechanism consisting of a rule-based filter, a Machine Learning (ML)-based named entity recognizer, and a browser-based local LLM topic identifier. We evaluate Casper on a dataset of 4,500 synthesized prompts and 2,000 real-world prompts. The results show that Casper can effectively filter out Personally Identifiable Information (PII) with an accuracy of 92.6%, and detect privacy-sensitive topics with an accuracy ranging from 92.5% to 94.0%. Furthermore, Casper successfully pseudonymized 92.0% of the sensitive information in the prompts while ensuring that the LLM's responses to the sanitized prompts remained moderately similar to those for the original prompts, with a cosine similarity of 0.538.
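To make the first of the three layers concrete, the following is a minimal sketch of a rule-based filter that pseudonymizes matches and keeps a local mapping so placeholders can be restored in the response. It is an illustration only, not Casper's implementation: the regular expressions, the placeholder format, and the names `PATTERNS`, `sanitize`, and `restore` are all assumptions, since the abstract does not list the actual rules. It is also written in Python for brevity, whereas Casper itself runs as a browser extension.

```python
import re

# Hypothetical patterns for illustration; the abstract does not specify
# which rules Casper's rule-based filter actually applies.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def sanitize(prompt):
    """Replace each rule match with a placeholder.

    Returns the sanitized prompt and a placeholder -> original mapping
    that never needs to leave the user's device.
    """
    mapping = {}   # placeholder -> original value
    seen = {}      # (label, value) -> placeholder, so repeats reuse one name
    counters = {}  # label -> number of distinct values seen so far

    for label, pattern in PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            key = (label, value)
            if key not in seen:
                counters[label] = counters.get(label, 0) + 1
                placeholder = f"[{label}_{counters[label]}]"
                seen[key] = placeholder
                mapping[placeholder] = value
            return seen[key]

        prompt = pattern.sub(substitute, prompt)
    return prompt, mapping


def restore(text, mapping):
    """Put the original values back into text returned by the LLM."""
    for placeholder, value in mapping.items():
        text = text.replace(placeholder, value)
    return text


if __name__ == "__main__":
    original = "Email alice@example.com or call 555-123-4567."
    sanitized, mapping = sanitize(original)
    print(sanitized)
    print(restore(sanitized, mapping))
```

Reusing the same placeholder for a repeated value keeps the sanitized prompt internally consistent, which is one plausible way to limit how far the LLM's response drifts from the response to the original prompt.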

Original language: English (US)
Title of host publication: Proceedings - 2025 IEEE 12th International Conference on Cyber Security and Cloud Computing, CSCloud 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 122-131
Number of pages: 10
ISBN (Electronic): 9798331587819
State: Published - 2025
Event: 12th IEEE International Conference on Cyber Security and Cloud Computing, CSCloud 2025 - New York City, United States
Duration: Nov 7 2025 to Nov 9 2025

Publication series

Name: Proceedings - 2025 IEEE 12th International Conference on Cyber Security and Cloud Computing, CSCloud 2025

Conference

Conference: 12th IEEE International Conference on Cyber Security and Cloud Computing, CSCloud 2025
Country/Territory: United States
City: New York City
Period: 11/7/25 to 11/9/25

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Software
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality

Keywords

  • Large Language Model
  • Web Privacy
