TY - GEN
T1 - EmoLeak: Smartphone Motions Reveal Emotions
T2 - 43rd IEEE International Conference on Distributed Computing Systems, ICDCS 2023
AU - Mahdad, Ahmed Tanvir
AU - Shi, Cong
AU - Ye, Zhengkun
AU - Zhao, Tianming
AU - Wang, Yan
AU - Chen, Yingying
AU - Saxena, Nitesh
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Emotional state leakage has attracted increasing concern, as it reveals rich sensitive information, such as intent, demographics, personality, and health information. Existing emotion recognition techniques rely on vision and audio data, which pose a limited threat because they require access to restricted sensors (e.g., cameras and microphones). In this work, we investigate, for the first time, the feasibility of detecting people's emotional state in the vibration domain via zero-permission motion sensors. We find that when voice is played through a smartphone's loudspeaker or ear speaker, it generates vibration signals on the smartphone surface that encode rich emotional information. As the smartphone is the go-to device for almost everyone nowadays, our attack, which relies only on motion sensors, raises severe concerns about emotional state leakage. We comprehensively study the relationship between vibration data and human emotion based on several publicly available emotion datasets (e.g., SAVEE, TESS). We develop time-frequency features and machine learning techniques to determine the victim's emotion from speech-induced vibrations. We evaluate our attack on both the ear speakers and loudspeakers of a diverse set of smartphones. The results demonstrate that our attack achieves high accuracy: around 95.3% (random guess 14.3%) for the loudspeaker setting and 60.52% (random guess 14.3%) for the ear speaker setting.
AB - Emotional state leakage has attracted increasing concern, as it reveals rich sensitive information, such as intent, demographics, personality, and health information. Existing emotion recognition techniques rely on vision and audio data, which pose a limited threat because they require access to restricted sensors (e.g., cameras and microphones). In this work, we investigate, for the first time, the feasibility of detecting people's emotional state in the vibration domain via zero-permission motion sensors. We find that when voice is played through a smartphone's loudspeaker or ear speaker, it generates vibration signals on the smartphone surface that encode rich emotional information. As the smartphone is the go-to device for almost everyone nowadays, our attack, which relies only on motion sensors, raises severe concerns about emotional state leakage. We comprehensively study the relationship between vibration data and human emotion based on several publicly available emotion datasets (e.g., SAVEE, TESS). We develop time-frequency features and machine learning techniques to determine the victim's emotion from speech-induced vibrations. We evaluate our attack on both the ear speakers and loudspeakers of a diverse set of smartphones. The results demonstrate that our attack achieves high accuracy: around 95.3% (random guess 14.3%) for the loudspeaker setting and 60.52% (random guess 14.3%) for the ear speaker setting.
KW - emotion recognition
KW - motion sensor
KW - side channel
KW - speech privacy
UR - http://www.scopus.com/inward/record.url?scp=85175079284&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85175079284&partnerID=8YFLogxK
U2 - 10.1109/ICDCS57875.2023.00052
DO - 10.1109/ICDCS57875.2023.00052
M3 - Conference contribution
AN - SCOPUS:85175079284
T3 - Proceedings - International Conference on Distributed Computing Systems
SP - 316
EP - 326
BT - Proceedings - 2023 IEEE 43rd International Conference on Distributed Computing Systems, ICDCS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 July 2023 through 21 July 2023
ER -