TY - GEN
T1 - Moving from descriptive to causal analytics
T2 - 2012 ACM International Workshop on Smart Health and Wellbeing, SHB 2012 - Co-located with CIKM 2012
AU - Schryver, Jack
AU - Shankar, Mallikarjun
AU - Xu, Songhua
PY - 2012
Y1 - 2012
AB - The knowledge management community has introduced a multitude of methods for knowledge discovery on large datasets. In the context of public health intelligence, we integrated some of these methods into an analyst's workflow that proceeds from the data-centric descriptive level of analysis to the model-centric causal level of reasoning. We show several case studies of the proposed analyst's workflow as applied to the US Health Indicators Warehouse (HIW), which is a medium-scale public dataset of community health information collected by the US federal government. In our case studies, we demonstrate a series of visual analytics efforts targeted at the HIW, including visual analysis according to correlation matrices, multivariate outlier analysis, multiple linear regression of Medicare costs, confirmatory factor analysis, and hybrid scatterplot and heatmap visualization for distributions of a group of health indicators. We conclude by sketching a preliminary framework for examining causal dependence hypotheses for future data science research in public health.
KW - Community health indicators
KW - Machine learning
KW - Multivariate statistics
KW - Visual analytics
UR - http://www.scopus.com/inward/record.url?scp=84870397188&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84870397188&partnerID=8YFLogxK
U2 - 10.1145/2389707.2389709
DO - 10.1145/2389707.2389709
M3 - Conference contribution
AN - SCOPUS:84870397188
SN - 9781450317122
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1
EP - 8
BT - SHB'12 - Proceedings of the 2012 ACM International Workshop on Smart Health and Wellbeing, Co-located with CIKM 2012
Y2 - 29 October 2012
ER -