Voice Anonymization in Urban Sound Recordings

Alice Cohen-Hadria, Mark Cartwright, Brian McFee, Juan Pablo Bello

Research output: Chapter in Book/Report/Conference proceedingConference contribution

19 Scopus citations

Abstract

Monitoring health and noise pollution in urban environments often entails deploying acoustic sensor networks to passively collect data in public spaces. Although spaces are technically public, people in the environment may not fully realize the degree to which they may be recorded by the sensor network, which may be perceived as a violation of expected privacy. Therefore, we propose a method to anonymize and blur the voices of people recorded in public spaces-a novel, yet increasingly important task as acoustic sensing becomes ubiquitous in sensor-equipped smart cities. This method is analogous to Google's face blurring in its Street View photographs, which arose from similar concerns in the visual domain. The proposed blurring method aims to anonymize voices by removing both the linguistic content and personal identity from voices, while preserving the rest of the acoustic scene.The method consists of a three-step process. First, voices are separated from non-voice content by a deep U-Net source separation model. Second, we evaluate two approaches to obscure the identity and intelligibility of the extracted voices: A low pass filter to remove most of the formants in the voices, and an inversion of Mel-Frequency Cepstral Coefficients (MFCC). Finally, the blurred vocal content is mixed with the separated non-vocal signal to reconstruct the acoustic scene. Using background recordings from a real urban acoustic sensor network in New York City, we present a complete evaluation of our method, with automatic speech recognition, speaker identification, sound event detection, and human perceptual evaluation.

Original languageEnglish (US)
Title of host publication2019 IEEE 29th International Workshop on Machine Learning for Signal Processing, MLSP 2019
PublisherIEEE Computer Society
ISBN (Electronic)9781728108247
DOIs
StatePublished - Oct 2019
Externally publishedYes
Event29th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2019 - Pittsburgh, United States
Duration: Oct 13 2019Oct 16 2019

Publication series

NameIEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume2019-October
ISSN (Print)2161-0363
ISSN (Electronic)2161-0371

Conference

Conference29th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2019
Country/TerritoryUnited States
CityPittsburgh
Period10/13/1910/16/19

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Signal Processing

Keywords

  • privacy
  • source separation
  • urban recordings
  • voice anonymization

Fingerprint

Dive into the research topics of 'Voice Anonymization in Urban Sound Recordings'. Together they form a unique fingerprint.

Cite this