Abstract
Objective: To develop and test an auditing methodology for detecting errors in medical terminologies satisfying systematic inheritance. This methodology is based on various abstraction taxonomies that provide high-level views of a terminology and highlight potentially erroneous concepts. Design: Our auditing methodology is based on dividing concepts of a terminology into smaller, more manageable units. First, we divide the terminology's concepts into areas according to their relationships/roles. Then each multi-rooted area is further divided into partial-areas (p-areas) that are singly-rooted. Each p-area contains a set of structurally and semantically uniform concepts. Two kinds of abstraction networks, called the area taxonomy and p-area taxonomy, are derived. These taxonomies form the basis for the auditing approach. Taxonomies tend to highlight potentially erroneous concepts in areas and p-areas. Human reviewers can focus their auditing efforts on the limited number of problematic concepts following two hypotheses on the probable concentration of errors. Results: A sample of the area taxonomy and p-area taxonomy for the Biological Process (BP) hierarchy of the National Cancer Institute Thesaurus (NCIT) was derived from the application of our methodology to its concepts. These views led to the detection of a number of different kinds of errors that are reported, and to confirmation of the hypotheses on error concentration in this hierarchy. Conclusion: Our auditing methodology based on area and p-area taxonomies is an efficient tool for detecting errors in terminologies satisfying systematic inheritance of roles, and thus facilitates their maintenance. This methodology concentrates a domain expert's manual review on portions of the concepts with a high likelihood of errors.
Original language | English (US) |
---|---|
Pages (from-to) | 676-690 |
Number of pages | 15 |
Journal | Journal of the American Medical Informatics Association |
Volume | 13 |
Issue number | 6 |
DOIs | |
State | Published - Nov 2006 |
All Science Journal Classification (ASJC) codes
- Health Informatics