TY - GEN
T1 - Towards Precise Detection of Personal Information Leaks in Mobile Health Apps
AU - Ardalani, Alireza
AU - Antonucci, Joseph
AU - Neamtiu, Iulian
N1 - Publisher Copyright: © 2024 Big Data Analytics, Data Mining and Computational Intelligence. All Rights Reserved.
PY - 2024
Y1 - 2024
N2 - Mobile apps are used in a variety of health settings, from apps that help providers, to apps designed for patients, to health and fitness apps designed for the general public. These apps ask the user for, and then collect and “leak” a wealth of Personal Information (PI). We analyze the PI that apps collect via their user interface, whether the app or third-party code is processing this information, and finally where the data is sent or stored. Prior work on leak detection in Android has focused on detecting leaks of (hardware) device-identifying information, or policy violations; however, no work has looked at processing and leaking of PI in the context of health apps. The first challenge we tackle is extracting the semantic information contained in app UIs to discern the extent, and nature, of personal information. The second challenge we tackle is disambiguating between first-party, legitimate leaks (e.g., the app storing data in its database) and third-party, problematic leaks, e.g., processing this information by, or sending it to, advertisers and analytics. We conducted a study on 1,243 Android apps: 623 medical apps and 621 Health&Fitness apps. We categorize PI into 16 types, grouped in 3 main categories: identity, medical, anthropometric. We found that the typical app has one first-party leak and five third-party leaks, though 221 apps had 20 or more leaks. Next, we show that third-party leaks (e.g., advertisers, analytics) are 5x more frequent than first-party leaks. Then, we show that 71% of leaks are to local storage (i.e., the phone, where data could be accessed by unauthorized apps) whereas 29% of leaks are to the network (e.g., Cloud). Finally, medical apps have 20% more PI leaks than Health&Fitness apps, due to collecting additional medical PI.
KW - Android
KW - Health&Fitness Apps
KW - Information Flow Analysis
KW - Medical Apps
KW - Personal Information Leaks
UR - http://www.scopus.com/inward/record.url?scp=85207096511&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85207096511&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85207096511
T3 - Proceedings of the International Conferences on Big Data Analytics, Data Mining and Computational Intelligence 2024, BigDaCI 2024; Connected Smart Cities 2024, CSC 2024; and e-Health 2024, EH 2024
SP - 118
EP - 125
BT - Proceedings of the International Conferences on Big Data Analytics, Data Mining and Computational Intelligence 2024, BigDaCI 2024; Connected Smart Cities 2024, CSC 2024; and e-Health 2024, EH 2024
A2 - Abraham, Ajith
A2 - Peng, Guo Chao
A2 - Isaias, Pedro
PB - IADIS
T2 - 9th International Conference on Big Data Analytics, Data Mining and Computational Intelligence, BigDaCI 2024, the 10th International Conference on Connected Smart Cities, CSC 2024 and the 16th International Conference on e-Health, EH 2024, Part of the 18th Multi Conference on Computer Science and Information Systems 2024, MCCSIS 2024
Y2 - 13 July 2024 through 15 July 2024
ER -