TY - GEN
T1 - Preventing unwanted social inferences with classification tree analysis
AU - Motahari, Sara
AU - Ziavras, Sotirios
AU - Jones, Quentin
PY - 2009
Y1 - 2009
AB - A serious threat to user privacy in new mobile and Web 2.0 applications stems from 'social inferences'. These unwanted inferences concern users' identity, current location, and other personal information. We have previously introduced 'inference functions' to estimate the social inference risk based on information entropy. In this paper, after analyzing the problem and reviewing our risk estimation method, we create a decision tree to distinguish between high-risk and normal situations. To evaluate our methodology, we collected test and training datasets during a large mobile-phone field study for a location-aware application. The classification tree employs our two inference functions, for the current and past situations, as internal nodes. Our results show that the achieved true classification rates are significantly better than those of approaches that employ other available features for the internal nodes of the trees. The results also suggest that common classification tools cannot accurately capture the information entropy for social applications. This is mostly due to the lack of sufficient training data for high-risk, low-entropy situations and outliers. Thus, we conclude that estimating the information entropy and the relevant inference risk using a pre-processor can yield a simpler and more accurate classification tree.
KW - Knowledge representation and reasoning
KW - Reasoning under fuzziness or uncertainty
UR - http://www.scopus.com/inward/record.url?scp=77949505502&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77949505502&partnerID=8YFLogxK
U2 - 10.1109/ICTAI.2009.15
DO - 10.1109/ICTAI.2009.15
M3 - Conference contribution
AN - SCOPUS:77949505502
SN - 9781424456192
T3 - Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
SP - 500
EP - 507
BT - ICTAI 2009 - 21st IEEE International Conference on Tools with Artificial Intelligence
T2 - 21st IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2009
Y2 - 2 November 2009 through 5 November 2009
ER -