TY - GEN
T1 - Scraping Sticky Leftovers
T2 - 43rd IEEE Symposium on Security and Privacy, SP 2022
AU - Santhanam, Preethi
AU - Dang, Hoang
AU - Shan, Zhiyong
AU - Neamtiu, Iulian
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Sixty-five percent of mobile apps require user accounts for offering full-fledged functionality. Account information includes private data, e.g., address, phone number, credit card. Our concern is 'leftover' account data kept on the server after account deletion, which can be a significant privacy violation. Specifically, we analyzed 1,435 popular apps from Google Play (and 771 associated websites), of which 678 have their own sign-up process, to answer questions such as: Can accounts be deleted at all? Following account deletion, will user data remain on the app's servers? If so, for how long? Do apps keep their promise to remove data? Answering these questions, and more generally, understanding and tackling the leftover account problem, is challenging. A fundamental obstacle is that leftover data is manipulated and retained in a private space, on the app's backend servers; we devised a novel, reverse-engineering approach to infer leftover data from app-server communication. Another obstacle is the distributed nature of this data: program analysis as well as information retrieval are required on both the app and its website. We have developed an end-to-end solution (static analysis, dynamic analysis, natural language processing) to the leftover account problem. First, our toolchain checks whether an app, or its website, support account deletion; next, it checks whether the app/website have a data retention policy, and whether the account is left on servers after deletion, or after the specified retention period; finally, it automatically cleans up leftover accounts. We found that 64.45% of apps do not offer any means for users to delete accounts; 2.5% of apps still keep account data on app servers even after accounts are deleted by users. Only 5% of apps specify a retention period; some of these apps violate their own policy by still retaining data months after the period has ended. Experiments show that our approach is effective, with an F-measure > 88%, and efficient, with a typical analysis time of 279 seconds per app/website.
KW - Android
KW - Dynamic-Analysis
KW - Leftover-Account-Information
KW - Static-Analysis
UR - http://www.scopus.com/inward/record.url?scp=85135919554&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85135919554&partnerID=8YFLogxK
U2 - 10.1109/SP46214.2022.9833720
DO - 10.1109/SP46214.2022.9833720
M3 - Conference contribution
AN - SCOPUS:85135919554
T3 - Proceedings - IEEE Symposium on Security and Privacy
SP - 2145
EP - 2160
BT - Proceedings - 43rd IEEE Symposium on Security and Privacy, SP 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 May 2022 through 26 May 2022
ER -