TY - GEN
T1 - Data analytics approaches for breast cancer survivability
T2 - 67th Annual Conference and Expo of the Institute of Industrial Engineers 2017
AU - Kibis, Eyyub Y.
AU - Büyüktahtakin, I. Esra
AU - Dag, Ali
N1 - Funding Information: We gratefully acknowledge the support of the National Science Foundation CAREER Award under Grant # CBET-1554018.
PY - 2017
Y1 - 2017
AB - In the early stages of breast cancer, surgery, chemotherapy, and radiotherapy are considered effective methods for removing a cancerous tumor detected in the breast area and the lymph nodes. However, undetected cancer cell remnants in the breast tissue and lymph nodes, inefficient treatment methods, and the patient's health condition may affect the patient's life expectancy. In this study, given a set of explanatory variables that include the patient's demographics, health condition, and cancer treatment regimen, our objective is to investigate the performance of four machine learning methods: an artificial neural network (ANN), a classification and regression tree (C&RT), logistic regression, and a Bayesian belief network (BBN). We apply these four methods with ten-fold cross-validation to predict the ten-year survivability of a breast cancer patient after initial diagnosis. The results of each method are compared with respect to accuracy, sensitivity, specificity, and area under the curve (AUC). We observe that logistic regression outperforms the other methods with respect to AUC. In all prediction models, the stage of the cancer is the most important predictor of breast cancer survivability.
KW - Artificial neural network
KW - Bayesian belief network
KW - Breast cancer
KW - Data mining
KW - Decision tree
UR - http://www.scopus.com/inward/record.url?scp=85030977304&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85030977304&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85030977304
T3 - 67th Annual Conference and Expo of the Institute of Industrial Engineers 2017
SP - 591
EP - 596
BT - 67th Annual Conference and Expo of the Institute of Industrial Engineers 2017
A2 - Nembhard, Harriet B.
A2 - Coperich, Katie
A2 - Cudney, Elizabeth
PB - Institute of Industrial Engineers
Y2 - 20 May 2017 through 23 May 2017
ER -