Auditing as Part of the Terminology Design Life Cycle

Hua Min, Yehoshua Perl, Yan Chen, Michael Halper, James Geller, Yue Wang

Research output: Contribution to journal › Article › peer-review

72 Scopus citations


Objective: To develop and test an auditing methodology for detecting errors in medical terminologies satisfying systematic inheritance. This methodology is based on various abstraction taxonomies that provide high-level views of a terminology and highlight potentially erroneous concepts.

Design: Our auditing methodology is based on dividing concepts of a terminology into smaller, more manageable units. First, we divide the terminology's concepts into areas according to their relationships/roles. Then each multi-rooted area is further divided into partial-areas (p-areas) that are singly-rooted. Each p-area contains a set of structurally and semantically uniform concepts. Two kinds of abstraction networks, called the area taxonomy and p-area taxonomy, are derived. These taxonomies form the basis for the auditing approach. Taxonomies tend to highlight potentially erroneous concepts in areas and p-areas. Human reviewers can focus their auditing efforts on the limited number of problematic concepts following two hypotheses on the probable concentration of errors.

Results: A sample of the area taxonomy and p-area taxonomy for the Biological Process (BP) hierarchy of the National Cancer Institute Thesaurus (NCIT) was derived from the application of our methodology to its concepts. These views led to the detection of a number of different kinds of errors that are reported, and to confirmation of the hypotheses on error concentration in this hierarchy.

Conclusion: Our auditing methodology based on area and p-area taxonomies is an efficient tool for detecting errors in terminologies satisfying systematic inheritance of roles, and thus facilitates their maintenance. This methodology concentrates a domain expert's manual review on portions of the concepts with a high likelihood of errors.
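
The partitioning described under Design can be illustrated with a small sketch. It is not the authors' implementation. It assumes that an area is the set of concepts sharing exactly the same set of roles, that a root of an area is a concept with no parent inside that area, and that a p-area consists of one root together with its descendants inside the area. The concept names, role names, and function names below are hypothetical and are not taken from the NCIT.

```python
from collections import defaultdict


def build_areas(roles):
    """Group concepts into areas keyed by their exact set of roles (assumed definition)."""
    areas = defaultdict(set)
    for concept, role_set in roles.items():
        areas[frozenset(role_set)].add(concept)
    return dict(areas)


def area_roots(area, parents):
    """Return concepts of an area that have no parent inside the same area."""
    return {c for c in area if not (parents.get(c, set()) & area)}


def build_p_areas(area, parents):
    """Split an area into p-areas: each root plus its descendants within the area."""
    # Invert the parent links, keeping only links that stay inside the area.
    children = defaultdict(set)
    for concept in area:
        for parent in parents.get(concept, set()):
            if parent in area:
                children[parent].add(concept)

    p_areas = {}
    for root in area_roots(area, parents):
        members, stack = set(), [root]
        while stack:  # depth-first walk restricted to the area
            node = stack.pop()
            if node not in members:
                members.add(node)
                stack.extend(children[node])
        p_areas[root] = members
    return p_areas


# Hypothetical toy hierarchy: concept -> roles, and concept -> parents.
roles = {
    "Process": set(),
    "Cell Growth": {"has_location"},
    "Cell Division": {"has_location"},
    "Mitosis": {"has_location"},
    "Apoptosis": {"has_location", "has_result"},
}
parents = {
    "Process": set(),
    "Cell Growth": {"Process"},
    "Cell Division": {"Process"},
    "Mitosis": {"Cell Division"},
    "Apoptosis": {"Cell Growth"},
}

areas = build_areas(roles)
location_area = areas[frozenset({"has_location"})]
location_p_areas = build_p_areas(location_area, parents)
```

In this toy data the {has_location} area has two roots, so it splits into two p-areas. With these assumed definitions, a concept that descends from more than one root would appear in more than one p-area.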

Original language: English (US)
Pages (from-to): 676-690
Number of pages: 15
Journal: Journal of the American Medical Informatics Association
Issue number: 6
State: Published - Nov 2006

All Science Journal Classification (ASJC) codes

  • Health Informatics

