Structural group-based auditing of missing hierarchical relationships in UMLS

Yan Chen, Huanying (Helen) Gu, Yehoshua Perl, James Geller

Research output: Contribution to journal › Article › peer-review

24 Scopus citations


The Metathesaurus of the UMLS was created by integrating various source terminologies. The inter-concept relationships were either integrated into the UMLS from the source terminologies or specially generated. Due to the extensive size and inherent complexity of the Metathesaurus, the accidental omission of some hierarchical relationships was inevitable. We present a recursive procedure which allows a human expert, with the support of an algorithm, to locate missing hierarchical relationships. The procedure starts with a group of concepts with exactly the same (correct) semantic type assignments. It then partitions the concepts, based on child-of hierarchical relationships, into smaller, singly rooted, hierarchically connected subgroups. The auditor only needs to focus on the subgroups with very few concepts and their concepts with semantic type reassignments. The procedure was evaluated by comparing it with a comprehensive manual audit and it exhibits a perfect error recall.
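The partitioning step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `partition_by_roots` and `small_subgroups`, the `max_size` threshold, and the toy concept identifiers are all hypothetical. The sketch assumes the child-of relationships are acyclic, treats a concept as a root when none of its parents lie inside the group, and lets a concept with several roots appear in more than one subgroup, a case the abstract does not address.

```python
from collections import defaultdict


def partition_by_roots(concepts, child_of):
    """Split a group of concepts into singly rooted subgroups.

    concepts: concept ids that share the same semantic type assignment.
    child_of: (child, parent) pairs; pairs that leave the group are ignored.
    Assumes the hierarchy is acyclic (an assumption of this sketch).
    """
    group = set(concepts)
    children = defaultdict(set)
    has_parent_in_group = set()
    for child, parent in child_of:
        if child in group and parent in group and child != parent:
            children[parent].add(child)
            has_parent_in_group.add(child)

    # A root is a concept with no parent inside the group.
    roots = sorted(group - has_parent_in_group)

    subgroups = {}
    for root in roots:
        # Depth-first walk collecting the root and its in-group descendants.
        seen = {root}
        stack = [root]
        while stack:
            node = stack.pop()
            for child in children[node]:
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        subgroups[root] = seen
    return subgroups


def small_subgroups(subgroups, max_size=2):
    """Return the subgroups an auditor might review first (hypothetical cutoff)."""
    return {root: members for root, members in subgroups.items()
            if len(members) <= max_size}


# Toy data: "X" lies outside the group, so "E" counts as a root here.
concepts = ["A", "B", "C", "D", "E"]
child_of = [("B", "A"), ("C", "B"), ("E", "X")]
subgroups = partition_by_roots(concepts, child_of)
to_review = small_subgroups(subgroups)
```

In this toy data the isolated concepts `D` and `E` form one-concept subgroups. Subgroups of that kind are the ones the abstract says the auditor should focus on, since a concept with no parent inside its group may be missing a hierarchical relationship.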

Original language: English (US)
Pages (from-to): 452-467
Number of pages: 16
Journal: Journal of Biomedical Informatics
Issue number: 3
State: Published - Jun 2009

All Science Journal Classification (ASJC) codes

  • Health Informatics
  • Computer Science Applications

Keywords


  • Auditing
  • Hierarchical relationships
  • Partition
  • Refined semantic network
  • Refined semantic type
  • Semantic refinement
  • Semantic type assignment
  • UMLS

