Abstract
The UMLS contains more than 100 source vocabularies and continues to grow as others are integrated. When integrating a new source, the source terms already in the UMLS must first be found. The easiest approach is simple string matching, but string matching usually fails to find all the concepts that should be found. A new methodology for enhancing concept discovery in the UMLS, based on the notion of piecewise synonyms, is presented. The methodology is supported by first creating a general synonym dictionary based on the UMLS. Each multi-word source term is decomposed into its component words, and synonyms for each word are generated separately from the general synonym dictionary. Recombining these synonyms into new terms creates an expanded pool of matching candidates for terms from the source. The methodology is demonstrated with respect to an existing UMLS source, where it shows a 34% improvement over simple string matching.
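The decompose, substitute, and recombine steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names `piecewise_candidates` and `find_matches`, the toy `SYNONYMS` dictionary, and the `UMLS_TERMS` set are all hypothetical stand-ins for the general synonym dictionary and the UMLS term index, and whitespace tokenization with lowercasing is an assumption.

```python
from itertools import product

# Hypothetical toy data: a stand-in for the general synonym dictionary
# and for the set of terms already present in the UMLS.
SYNONYMS = {
    "kidney": {"renal"},
    "cancer": {"carcinoma", "neoplasm"},
}
UMLS_TERMS = {"renal carcinoma", "heart attack"}


def piecewise_candidates(term, synonyms):
    """Expand a multi-word term by substituting synonyms word by word."""
    # Assumption: terms are lowercased and split on whitespace.
    words = term.lower().split()
    # For each word, keep the word itself followed by its known synonyms.
    options = [[w] + sorted(synonyms.get(w, set()) - {w}) for w in words]
    # Recombine every choice per position into a candidate term.
    return [" ".join(combo) for combo in product(*options)]


def find_matches(term, synonyms, known_terms):
    """Return the candidates that match a known term exactly."""
    return [c for c in piecewise_candidates(term, synonyms) if c in known_terms]


if __name__ == "__main__":
    term = "kidney cancer"
    print(term in UMLS_TERMS)                          # simple string matching
    print(find_matches(term, SYNONYMS, UMLS_TERMS))    # piecewise expansion
```

In this toy setup, simple string matching misses "kidney cancer", while the expanded candidate pool reaches "renal carcinoma". The number of candidates grows as the product of the per-word option counts, so a real system would likely need to bound or prune the expansion.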
Original language | English (US) |
---|---|
Pages (from-to) | 339-343 |
Number of pages | 5 |
Journal | AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium |
State | Published - 2007 |
All Science Journal Classification (ASJC) codes
- General Medicine