TY - GEN
T1 - Algorithmic detection of inconsistent modeling among SNOMED CT concepts by combining lexical and structural indicators
AU - Agrawal, Ankur
AU - Perl, Yehoshua
AU - Ochs, Chris
AU - Elhanan, Gai
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/12/16
Y1 - 2015/12/16
N2 - SNOMED CT is important for clinical applications, such as Electronic Health Record (EHR) encoding. However, inconsistency in modeling its concepts may prevent SNOMED CT from providing proper support for clinical use. This study provides an effective methodology for locating inconsistently modeled SNOMED CT concepts. One can expect lexically similar concepts to be modeled similarly. Positional similarity sets, sets of lexically similar concepts having only one different word at the same position of their names, are introduced. Concepts in such sets have a higher likelihood of being unjustifiably inconsistently modeled. A technique to incorporate three structural indicators into the selected sets is provided to further improve the likelihood of finding inconsistently modeled concepts. An analysis of a sample of 50 such sets and for each of these three indicators is performed. The sample of positional similarity sets is found to have 18.6% inconsistent concepts. The use of structural indicators is shown to further improve the likelihood of finding inconsistently modeled concepts up to 41.6% with high statistical significance when compared to the previous sample of positional similarity sets. Positional similarity sets with different structural indicators are shown to help identify inconsistencies in concept modeling with high likelihood. Furthermore, such sets enable the comparison of concept modeling in the context of other lexically similar concepts, which enhances the effectiveness of corrections by auditors. Such quality assurance methods can be used to supplement IHTSDO's own efforts in order to improve the quality of SNOMED CT.
AB - SNOMED CT is important for clinical applications, such as Electronic Health Record (EHR) encoding. However, inconsistency in modeling its concepts may prevent SNOMED CT from providing proper support for clinical use. This study provides an effective methodology for locating inconsistently modeled SNOMED CT concepts. One can expect lexically similar concepts to be modeled similarly. Positional similarity sets, sets of lexically similar concepts having only one different word at the same position of their names, are introduced. Concepts in such sets have a higher likelihood of being unjustifiably inconsistently modeled. A technique to incorporate three structural indicators into the selected sets is provided to further improve the likelihood of finding inconsistently modeled concepts. An analysis of a sample of 50 such sets and for each of these three indicators is performed. The sample of positional similarity sets is found to have 18.6% inconsistent concepts. The use of structural indicators is shown to further improve the likelihood of finding inconsistently modeled concepts up to 41.6% with high statistical significance when compared to the previous sample of positional similarity sets. Positional similarity sets with different structural indicators are shown to help identify inconsistencies in concept modeling with high likelihood. Furthermore, such sets enable the comparison of concept modeling in the context of other lexically similar concepts, which enhances the effectiveness of corrections by auditors. Such quality assurance methods can be used to supplement IHTSDO's own efforts in order to improve the quality of SNOMED CT.
KW - Lexical analysis
KW - Modeling inconsistency
KW - SNOMED CT
KW - Terminology auditing
KW - Terminology quality assurance
UR - http://www.scopus.com/inward/record.url?scp=84962381497&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962381497&partnerID=8YFLogxK
U2 - 10.1109/BIBM.2015.7359731
DO - 10.1109/BIBM.2015.7359731
M3 - Conference contribution
AN - SCOPUS:84962381497
T3 - Proceedings - 2015 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2015
SP - 476
EP - 483
BT - Proceedings - 2015 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2015
A2 - Schapranow, Ing. Matthieu
A2 - Zhou, Jiayu
A2 - Hu, Xiaohua Tony
A2 - Ma, Bin
A2 - Rajasekaran, Sanguthevar
A2 - Miyano, Satoru
A2 - Yoo, Illhoi
A2 - Pierce, Brian
A2 - Shehu, Amarda
A2 - Gombar, Vijay K.
A2 - Chen, Brian
A2 - Pai, Vinay
A2 - Huan, Jun
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2015
Y2 - 9 November 2015 through 12 November 2015
ER -