A study of terminology auditors' performance for UMLS semantic type assignments

Huanying Helen Gu, Gai Elhanan, Yehoshua Perl, George Hripcsak, James J. Cimino, Julia Xu, Yan Chen, James Geller, C. Paul Morrey

Research output: Contribution to journal, Article, peer-review

12 Scopus citations


Auditing healthcare terminologies for errors requires human experts. In this paper, we present a study of the performance of auditors looking for errors in the semantic type assignments of complex UMLS concepts. In this study, concepts are considered complex whenever they are assigned combinations of semantic types. Past research has shown that complex concepts have a higher likelihood of errors. The results of this study indicate that individual auditors are not reliable when auditing such concepts and their performance is low, according to various metrics. These results confirm the outcomes of an earlier pilot study. They imply that to achieve an acceptable level of reliability and performance, when auditing such concepts of the UMLS, several auditors need to be assigned the same task. A mechanism is then needed to combine the possibly differing opinions of the different auditors into a final determination. In the current study, in contrast to our previous work, we used a majority mechanism for this purpose. For a sample of 232 complex UMLS concepts, the majority opinion was found reliable and its performance for accuracy, recall, precision and the F-measure was found statistically significantly higher than the average performance of individual auditors.
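The majority mechanism and the four reported metrics can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say how ties were handled, how the reference standard was built, or how many auditors reviewed each concept. The function names `majority_vote` and `metrics` are hypothetical, and the toy data below is invented, not taken from the study.

```python
def majority_vote(flags):
    """Return True if strictly more than half of the auditors flagged an error.

    Assumption: ties count as "no error"; the abstract does not specify this.
    """
    return sum(flags) * 2 > len(flags)


def metrics(predicted, reference):
    """Accuracy, precision, recall and F-measure for boolean error judgments."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # error correctly flagged
    fp = sum(1 for p, r in pairs if p and not r)      # flagged, but no error
    fn = sum(1 for p, r in pairs if not p and r)      # error missed
    tn = sum(1 for p, r in pairs if not p and not r)  # correctly left alone
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Invented toy data: one row per concept, one boolean per auditor
# (True means "this semantic type assignment is erroneous").
auditor_flags = [
    [True, True, False],
    [False, False, True],
    [True, False, True],
    [False, False, False],
    [True, True, True],
    [False, True, False],
]
# Invented reference standard for the same six concepts.
reference = [True, False, True, False, False, True]

majority = [majority_vote(row) for row in auditor_flags]
result = metrics(majority, reference)
print(majority)
print(result)
```

Scoring each auditor's column against the same reference and averaging the results would give the individual baseline that the majority opinion is compared with.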

Original language: English (US)
Pages (from-to): 1042-1048
Number of pages: 7
Journal: Journal of Biomedical Informatics
Issue number: 6
State: Published - Dec 2012

All Science Journal Classification (ASJC) codes

  • Health Informatics
  • Computer Science Applications

Keywords


  • Auditing of terminologies
  • Auditor performance
  • Auditor reliability
  • Quality Assurance
  • Semantic type assignments
  • UMLS auditing

