TY - JOUR
T1 - Fair and explainable Myocardial Infarction (MI) prediction
T2 - Novel strategies for feature selection and class imbalance correction
AU - Akter, Simon Bin
AU - Akter, Sumya
AU - Tuli, Moon Das
AU - Eisenberg, David
AU - Lotvola, Aaron
AU - Islam, Humayera
AU - Fernandez, Jorge Fresneda
AU - Hüttemann, Maik
AU - Pias, Tanmoy Sarkar
N1 - Publisher Copyright: © 2024
PY - 2025/1
Y1 - 2025/1
N2 - The rising incidence of myocardial infarction (MI), which often affects individuals without traditional risk factors, highlights the urgent need for improved early detection using personal health data. However, health surveys and electronic health records (EHRs) frequently suffer from class imbalance, which leads to prediction bias and gaps between specificity and sensitivity and hinders reliable model development despite the valuable insights these datasets contain. To address this, we introduce a novel approach to enhance MI risk prediction using self-reported attributes from the Behavioral Risk Factor Surveillance System (BRFSS) and the National Health Interview Survey (NHIS) datasets. Our approach incorporates three innovative techniques: the Dual-Path Artificial Neural Network (DP-ANN) to mitigate biased decision-making across imbalanced datasets, Triple Criteria Selection (TCS) for unbiased feature selection, and Minority Weighted Sampling (MWS) to address the challenges of uncontrolled minority class sampling. Together, these methods enhance MI prediction and feature relevance. The DP-ANN model achieved balanced performance, with an average specificity of 80%, sensitivity of 82%, and AUC–ROC of 89.5%, improving imbalance variance by approximately 14.96% compared to prior studies. By outperforming other models across four heavily imbalanced datasets, our approach demonstrates robustness and generalizability. Additionally, SHapley Additive exPlanations (SHAP) analysis revealed key predictors and risk factors for MI, such as coronary heart disease and bronchitis in females, and stroke among individuals aged 35–54. In conclusion, our study provides a robust model that healthcare professionals can use to assess MI risk through targeted factors, promoting early detection and potentially improving patient outcomes.
KW - Behavioral Risk Factor Surveillance System (BRFSS)
KW - Explainable AI (XAI)
KW - Imbalance correction
KW - Myocardial Infarction (MI)
KW - National Health Interview Survey (NHIS)
UR - https://www.scopus.com/pages/publications/85210537247
UR - https://www.scopus.com/pages/publications/85210537247#tab=citedBy
U2 - 10.1016/j.compbiomed.2024.109413
DO - 10.1016/j.compbiomed.2024.109413
M3 - Article
C2 - 39615231
AN - SCOPUS:85210537247
SN - 0010-4825
VL - 184
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
M1 - 109413
ER -