Piecewise synonyms for enhanced UMLS source terminology integration.

Kuo Chuan Huang, James Geller, Michael Halper, James J. Cimino

Research output: Contribution to journal, Article, peer-review

13 Scopus citations


The UMLS contains more than 100 source vocabularies and is growing via the integration of others. When integrating a new source, the source terms already in the UMLS must first be found. The easiest approach to this is simple string matching. However, string matching usually does not find all concepts that should be found. A new methodology, based on the notion of piecewise synonyms, for enhancing the process of concept discovery in the UMLS is presented. This methodology is supported by first creating a general synonym dictionary based on the UMLS. Each multi-word source term is decomposed into its component words, allowing for the generation of separate synonyms for each word from the general synonym dictionary. The recombination of these synonyms into new terms creates an expanded pool of matching candidates for terms from the source. The methodology is demonstrated with respect to an existing UMLS source. It shows a 34% improvement over simple string matching.
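The decompose, substitute, and recombine steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `SYNONYMS`, `KNOWN_TERMS`, `piecewise_candidates`, and `find_matches` are hypothetical, the dictionary entries are toy examples rather than UMLS data, and matching is reduced to exact comparison of lowercased strings.

```python
from itertools import product

# Toy stand-in for the general synonym dictionary (illustrative entries, not UMLS data).
SYNONYMS = {
    "kidney": {"renal"},
    "failure": {"insufficiency"},
}

# Toy stand-in for the normalized term strings already present in the target terminology.
KNOWN_TERMS = {"renal insufficiency", "heart attack"}


def piecewise_candidates(term, synonyms):
    """Return every term formed by replacing each word with itself or one of its synonyms."""
    words = term.lower().split()
    # Each word keeps itself as an option, so partial substitutions are generated too.
    options = [sorted({w} | synonyms.get(w, set())) for w in words]
    # The Cartesian product recombines the per-word options into whole candidate terms.
    return {" ".join(combo) for combo in product(*options)}


def find_matches(term, synonyms, known_terms):
    """Return the candidates that exactly match a known term string."""
    return piecewise_candidates(term, synonyms) & known_terms


candidates = piecewise_candidates("Kidney failure", SYNONYMS)
matches = find_matches("Kidney failure", SYNONYMS, KNOWN_TERMS)
```

In this toy setup, the literal string "kidney failure" is not among the known terms, so plain string matching finds nothing, while the recombined candidate "renal insufficiency" does match. The candidate pool grows as the product of the per-word option counts, so a practical version would likely need to cap or filter it for long terms.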

Original language: English (US)
Pages (from-to): 339-343
Number of pages: 5
Journal: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
State: Published - 2007

All Science Journal Classification (ASJC) codes

  • General Medicine

