TY - GEN
T1 - Predicting lung cancer incidence from air pollution exposures using shapelet-based time series analysis
AU - Yoon, Hong Jun
AU - Xu, Songhua
AU - Tourassi, Georgia
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/4/18
Y1 - 2016/4/18
AB - In this paper we investigated whether the geographical variation of lung cancer incidence can be predicted through examining the spatiotemporal trend of particulate matter air pollution levels. Regional trends of air pollution levels were analyzed by a novel shapelet-based time series analysis technique. First, we identified U.S. counties with reportedly high and low lung cancer incidence between 2008 and 2012 via the State Cancer Profiles provided by the National Cancer Institute. Then, we collected particulate matter exposure levels (PM2.5 and PM10) of the counties for the previous decade (1998-2007) via the AirData dataset provided by the Environmental Protection Agency. Using shapelet-based time series pattern mining, regional environmental exposure profiles were examined to identify frequently occurring sequential exposure patterns. Finally, a binary classifier was designed to predict whether a U.S. region is expected to experience high lung cancer incidence based on the region's PM2.5 and PM10 exposure the decade prior. The study confirmed the association between prolonged PM exposure and lung cancer risk. In addition, the study findings suggest that not only cumulative exposure levels but also the temporal variability of PM exposure influence lung cancer risk.
UR - http://www.scopus.com/inward/record.url?scp=84968610584&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84968610584&partnerID=8YFLogxK
U2 - 10.1109/BHI.2016.7455960
DO - 10.1109/BHI.2016.7455960
M3 - Conference contribution
AN - SCOPUS:84968610584
T3 - 3rd IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2016
SP - 565
EP - 568
BT - 3rd IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2016
Y2 - 24 February 2016 through 27 February 2016
ER -