Auditing SNOMED Integration into the UMLS for Duplicate Concepts

Kuo Chuan Huang, James Geller, Gai Elhanan, Yehoshua Perl, Michael Halper

Research output: Contribution to journal › Article › peer-review



The UMLS contains terms from many sources. Every update of a source requires reintegration. Each new term needs to be assigned to a preexisting UMLS concept, or a new concept must be created. Whenever the integration process unnecessarily creates a new concept, this is undesirable. We report on a method to detect such undesirable duplicate concepts. Terms are removed from the UMLS and reintegrated using "piecewise synonym generation." The concept of the reintegrated term is programmatically compared to the initial concept of the term (before removal). If they are different, this indicates an error, either in the integration process or in the initial concept. Thus, such a term-concept pair is deemed suspicious. A study of five hierarchies of the SNOMED found 7.7% suspicious matches. A human expert needs to evaluate the correctness of suspicious concepts. In a sample of 149 of those, 19% of concepts were found to be duplicates.
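The remove-and-reintegrate audit described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define "piecewise synonym generation," so the sketch assumes a simplified reading in which each word of a term may be swapped for a known word-level synonym. The function names, the toy terms, the synonym table, and the concept identifiers (`C1` to `C4`) are all hypothetical. The sketch also assumes that a term with no match after removal is skipped rather than flagged, a choice the abstract does not address.

```python
from itertools import product


def piecewise_variants(term, word_synonyms):
    """Build term variants by keeping or swapping each word for a synonym.

    Assumption: this word-by-word substitution is only a simplified
    stand-in for the paper's "piecewise synonym generation".
    """
    words = term.lower().split()
    choices = [[w] + sorted(word_synonyms.get(w, ())) for w in words]
    return {" ".join(combo) for combo in product(*choices)}


def reintegrate(term, term_to_concept, word_synonyms):
    """Return the concept of the first variant found in the terminology, or None."""
    for variant in sorted(piecewise_variants(term, word_synonyms)):
        if variant in term_to_concept:
            return term_to_concept[variant]
    return None


def audit(term_to_concept, word_synonyms):
    """Remove each term, reintegrate it, and collect suspicious term-concept pairs."""
    suspicious = []
    for term, original in term_to_concept.items():
        # Remove the term under audit before trying to place it again.
        remaining = {t: c for t, c in term_to_concept.items() if t != term}
        found = reintegrate(term, remaining, word_synonyms)
        # Assumption: unmatched terms (found is None) are skipped, not flagged.
        if found is not None and found != original:
            suspicious.append((term, original, found))
    return suspicious


# Hypothetical toy data; the concept identifiers are made up for illustration.
EXAMPLE_TERMS = {
    "kidney failure": "C1",
    "renal failure": "C1",
    "renal insufficiency": "C2",
    "heart attack": "C3",
    "cardiac attack": "C4",
}
EXAMPLE_SYNONYMS = {
    "kidney": {"renal"},
    "renal": {"kidney"},
    "heart": {"cardiac"},
    "cardiac": {"heart"},
}

if __name__ == "__main__":
    for term, original, found in audit(EXAMPLE_TERMS, EXAMPLE_SYNONYMS):
        print(f"{term!r}: originally {original}, reintegrated into {found}")
```

On this toy data the two "attack" terms are flagged because each reintegrates into the other's concept, which is the kind of pair a human expert would then review as a possible duplicate.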

Original language: English (US)
Pages (from-to): 321-325
Number of pages: 5
Journal: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
State: Published - 2010

All Science Journal Classification (ASJC) codes

  • General Medicine

