TY - JOUR
T1 - Expanding the Extent of a UMLS Semantic Type via Group Neighborhood Auditing
AU - Chen, Yan
AU - Gu, Huanying
AU - Perl, Yehoshua
AU - Halper, Michael
AU - Xu, Junchuan
PY - 2009/9
Y1 - 2009/9
N2 - Objective: Each Unified Medical Language System (UMLS) concept is assigned one or more semantic types (ST). A dynamic methodology for aiding an auditor in finding concepts that are missing the assignment of a given ST, S, is presented. Design: The first part of the methodology exploits the previously introduced Refined Semantic Network and accompanying refined semantic types (RST) to help narrow the search space for offending concepts. The auditing is focused in a neighborhood surrounding the extent of an RST, T (of S), called an envelope, consisting of parents and children of concepts in the extent. The audit moves outward as long as missing assignments are discovered. In the second part, concepts not reached previously are processed and reassigned T as needed during the processing of S's other RSTs. The set of such concepts is expanded in a similar way to that in the first part. Measurements: The number of errors discovered is reported. To measure the methodology's efficiency, "error hit rates" (i.e., errors found in concepts examined) are computed. Results: The methodology was applied to three STs: Experimental Model of Disease (EMD), Environmental Effect of Humans, and Governmental or Regulatory Activity. The EMD experienced the most drastic change. For its RST "EMD ∩ Neoplastic Process" (RST "EMD") with only 33 (31) original concepts, 915 (134) concepts were found by the first (second) part to be missing the EMD assignment. Changes to the other two STs were smaller. Conclusion: The results show that the proposed auditing methodology can help to effectively and efficiently identify concepts lacking the assignment of a particular semantic type.
AB - Objective: Each Unified Medical Language System (UMLS) concept is assigned one or more semantic types (ST). A dynamic methodology for aiding an auditor in finding concepts that are missing the assignment of a given ST, S, is presented. Design: The first part of the methodology exploits the previously introduced Refined Semantic Network and accompanying refined semantic types (RST) to help narrow the search space for offending concepts. The auditing is focused in a neighborhood surrounding the extent of an RST, T (of S), called an envelope, consisting of parents and children of concepts in the extent. The audit moves outward as long as missing assignments are discovered. In the second part, concepts not reached previously are processed and reassigned T as needed during the processing of S's other RSTs. The set of such concepts is expanded in a similar way to that in the first part. Measurements: The number of errors discovered is reported. To measure the methodology's efficiency, "error hit rates" (i.e., errors found in concepts examined) are computed. Results: The methodology was applied to three STs: Experimental Model of Disease (EMD), Environmental Effect of Humans, and Governmental or Regulatory Activity. The EMD experienced the most drastic change. For its RST "EMD ∩ Neoplastic Process" (RST "EMD") with only 33 (31) original concepts, 915 (134) concepts were found by the first (second) part to be missing the EMD assignment. Changes to the other two STs were smaller. Conclusion: The results show that the proposed auditing methodology can help to effectively and efficiently identify concepts lacking the assignment of a particular semantic type.
UR - http://www.scopus.com/inward/record.url?scp=69549090022&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=69549090022&partnerID=8YFLogxK
U2 - 10.1197/jamia.M2951
DO - 10.1197/jamia.M2951
M3 - Article
C2 - 19567802
AN - SCOPUS:69549090022
SN - 1067-5027
VL - 16
SP - 746
EP - 757
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 5
ER -