TY - JOUR
T1 - DI++
T2 - A deep learning system for patient condition identification in clinical notes
AU - Shi, Jinhe
AU - Gao, Xiangyu
AU - Kinsman, William C.
AU - Ha, Chenyu
AU - Gao, Guodong Gordon
AU - Chen, Yi
N1 - Publisher Copyright: © 2021
PY - 2022/1
Y1 - 2022/1
N2 - Accurately recording a patient's medical conditions in an EHR system is the basis for effectively documenting patient health status, coding for billing, and supporting data-driven clinical decision making. However, patient conditions are often not fully captured in structured EHR systems and may instead be documented in unstructured clinical notes. The challenge is that not all disease mentions in clinical notes actually refer to a patient's conditions. We developed a two-step workflow for identifying a patient's conditions from clinical notes: disease mention extraction and disease mention classification. We implemented this workflow in a prototype system, DI++, for Disease Identification. An advanced deep learning model, the CLSTM-Attention model, was developed for disease mention classification in DI++. Extensive empirical evaluation on about one million pages of de-identified clinical notes demonstrates that DI++ has a significant performance advantage over existing systems in F1 score, Area Under the Curve, and efficiency. The proposed CLSTM-Attention model outperforms existing deep learning models for disease mention classification.
KW - Clinical notes
KW - Concept extraction
KW - Deep learning
KW - Deep neural network
KW - Disease mention extraction
KW - Natural language processing (NLP)
KW - Patient condition classification
UR - http://www.scopus.com/inward/record.url?scp=85121759610&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85121759610&partnerID=8YFLogxK
U2 - 10.1016/j.artmed.2021.102224
DO - 10.1016/j.artmed.2021.102224
M3 - Article
C2 - 34998515
AN - SCOPUS:85121759610
SN - 0933-3657
VL - 123
JO - Artificial Intelligence in Medicine
JF - Artificial Intelligence in Medicine
M1 - 102224
ER -