In previous research, we studied concepts that occur in pairs of medical terminologies and are known to be identical because they have the same ID number in the Unified Medical Language System (UMLS). We observed that such concepts rarely have exactly the same sets of children (i.e., subconcepts) in the two terminologies, and that the number of common children varies widely. We identified a special situation in which the children in one terminology relate to the common parent in a very different way than the children in the other terminology. For example, the children might subdivide a parent concept by anatomical location in one terminology and by kind of disease in the other. We coined the term “alternative classification” (of the same parent concept) for such situations. In previous work, only human experts could recognize alternative classifications. In this paper, we present a mathematically expressed criterion for likely cases of alternative classification. We compare the recommendations of this criterion, which is expressed as a mathematical quantity called “EFI” becoming zero, with the decisions of a human expert. The human expert agreed with the criterion in 72% of all cases, a substantial improvement over having no computable criterion at all. Besides alternative classifications, common parent concepts in a pair of terminologies might also indicate a possible import of a child concept missing in one terminology, different granularities, or errors in either of the two terminologies. We further investigate different kinds of alternative classifications.
All Science Journal Classification (ASJC) codes
- Computer Science Applications
- Health Informatics
Keywords
- Alternative classification
- Concept import
- Ontologies and terminologies