Abstract
Objective: The Unified Medical Language System (UMLS) integrates terms from different sources into concepts and supplements these with the assignment of one or more high-level semantic types (STs) from its Semantic Network (SN). For a composite organic chemical concept, multiple assignments of organic chemical STs often serve to enumerate the types of the composite's underlying chemical constituents. This practice sometimes leads to the introduction of a forbidden redundant ST assignment, where both an ST and one of its descendants are assigned to the same concept. A methodology for resolving redundant ST assignments for organic chemicals, better capturing the essence of such composite chemicals than the typical omission of the more general ST, is presented.

Materials and methods: The typical SN resolution of a redundant ST assignment is to retain only the more specific ST assignment and omit the more general one. However, with organic chemicals, that is not always the correct strategy. A methodology for properly dealing with the redundancy based on the relative sizes of the chemical components is presented. It is more accurate to use the ST of the larger chemical component for capturing the category of the concept, even if that means using the more general ST.

Results: A sample of 254 chemical concepts having redundant ST assignments in older UMLS releases was audited to analyze the accuracy of current ST assignments. For 81 (32%) of them, our chemical analysis-based approach yielded a different recommendation from the UMLS (2009AA). New UMLS usage notes capturing rules of this methodology are proffered.

Conclusions: Redundant ST assignments have typically arisen for organic composite chemical concepts. A methodology for dealing with this kind of erroneous configuration, capturing the proper category for a composite chemical, is presented and demonstrated.
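
The redundancy described in the abstract (an ST assigned together with one of its descendants) and the two resolution strategies can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `PARENT` mapping is a small hand-written fragment of the SN hierarchy rather than data loaded from a UMLS release, and the function names and the `larger_component_st` parameter are hypothetical, standing in for the outcome of a manual chemical analysis of component sizes.

```python
# Illustrative fragment of the IS-A hierarchy: child ST -> parent ST.
# Assumption: hand-written for this example, not read from a UMLS release.
PARENT = {
    "Organic Chemical": "Chemical Viewed Structurally",
    "Carbohydrate": "Organic Chemical",
    "Lipid": "Organic Chemical",
    "Steroid": "Lipid",
}


def ancestors(st):
    """Return the set of all proper ancestors of a semantic type."""
    found = set()
    while st in PARENT:
        st = PARENT[st]
        found.add(st)
    return found


def redundant_pairs(assigned):
    """Return sorted (general, specific) pairs that are both assigned."""
    assigned = set(assigned)
    return sorted(
        (general, specific)
        for specific in assigned
        for general in ancestors(specific) & assigned
    )


def resolve(assigned, larger_component_st=None):
    """Drop one ST from each redundant pair.

    Default: the typical resolution, keeping the more specific ST.
    If larger_component_st (hypothetical input from a manual chemical
    analysis) names the general ST of a pair, keep the general ST instead.
    """
    kept = set(assigned)
    for general, specific in redundant_pairs(assigned):
        if larger_component_st == general:
            kept.discard(specific)
        else:
            kept.discard(general)
    return sorted(kept)


if __name__ == "__main__":
    sts = ["Organic Chemical", "Carbohydrate"]
    print(redundant_pairs(sts))
    print(resolve(sts))
    print(resolve(sts, larger_component_st="Organic Chemical"))
```

In this sketch the size comparison itself is left to the caller, since the abstract does not specify how component sizes are measured.
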
Original language | English (US) |
---|---|
Pages (from-to) | 141-151 |
Number of pages | 11 |
Journal | Artificial Intelligence in Medicine |
Volume | 52 |
Issue number | 3 |
State | Published - Jul 2011 |
All Science Journal Classification (ASJC) codes
- Medicine (miscellaneous)
- Artificial Intelligence
Keywords
- Categorization
- Complex chemical
- Composite chemical
- Conjugate chemical
- Metathesaurus
- Redundant semantic type assignment
- Semantic Network
- Semantic type assignment
- Unified Medical Language System