Privacy Leakage via Speech-induced Vibrations on Room Objects through Remote Sensing based on Phased-MIMO

Cong Shi, Tianfang Zhang, Zhaoyi Xu, Shuping Li, Donglin Gao, Changming Li, Athina Petropulu, Chung Tse Michael Wu, Yingying Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Speech eavesdropping has long been an important threat to the privacy of individuals and enterprises. Recent research has shown the possibility of deriving private speech information from sound-induced vibrations. Acoustic signals transmitted through a solid medium or air may induce vibrations upon solid surfaces, which can be picked up by various sensors (e.g., motion sensors, high-speed cameras and lasers), without using a microphone. To date, these threats are limited to scenarios where the sensor is in contact with the vibration surface or at least in the visual line-of-sight. In this paper, we revisit this important line of research and show that a remote, long-distance, and even thru-the-wall speech eavesdropping attack is possible. We discover a new form of speech eavesdropping attack that remotely elicits speech from minute surface vibrations upon common room objects (e.g., paper bags, plastic storage bin) via mmWave sensing, signal processing, and advanced deep learning techniques. While mmWave signals have high sensitivity for vibrations, they have limited sensing distance and normally do not penetrate through walls. We overcome this key challenge through designing and implementing a high-resolution software-defined phased-MIMO radar that integrates transmit beamforming, virtual array, and receive beamforming. The proposed system enhances sensing directivity by focusing all the mmWave beams toward a target room object, allowing mmWave signals to pick up minute speech-induced vibrations from a long distance and even through walls. To realize the attack, we design an object identification technique that scans objects in a room and identifies a prominent object that is most sensitive to speech vibrations for vibration feature extraction. We successfully demonstrate speech privacy leakage using speech-induced vibrations via the development of a deep learning framework. 
Our framework can leverage domain adaptation techniques to infer speech content based only on the unlabeled vibration data of a victim. We validate the proof-of-concept attack on digit recognition through extensive experiments involving 40 speakers, five common room objects, and attack scenarios with mmWave devices inside and outside the room. Our phased-MIMO-based attack achieves success rates of 88%–98% with speech labels for training and 64%–86% without them. For thru-the-wall attacks, the corresponding success rates are 81%–94% and 58%–74%. Furthermore, we discuss possible defense methods to mitigate this unprecedented security threat.

Original language: English (US)
Title of host publication: CCS 2023 - Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
Publisher: Association for Computing Machinery, Inc
Pages: 75-89
Number of pages: 15
ISBN (Electronic): 9798400700507
State: Published - Nov 15 2023
Externally published: Yes
Event: 30th ACM SIGSAC Conference on Computer and Communications Security, CCS 2023 - Copenhagen, Denmark
Duration: Nov 26 2023 → Nov 30 2023

Publication series

Name: CCS 2023 - Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security

Conference

Conference: 30th ACM SIGSAC Conference on Computer and Communications Security, CCS 2023
Country/Territory: Denmark
City: Copenhagen
Period: 11/26/23 → 11/30/23

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Software

Keywords

  • Speech privacy attack
  • mmWave sensing
  • phased-MIMO
