Quality assurance of the gene ontology using abstraction networks

Christopher Ochs, Yehoshua Perl, Michael Halper, James Geller, Jane Lomax

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

The gene ontology (GO) is used extensively in the field of genomics. As with other large and complex ontologies, quality assurance (QA) of GO's content can be laborious and time-consuming. Abstraction networks (AbNs) are summarization networks that reveal and highlight high-level structural and hierarchical aggregation patterns in an ontology. They have been shown to successfully support QA work in the context of various ontologies. Two kinds of AbNs, called the area taxonomy and the partial-area taxonomy, are developed for GO hierarchies and derived specifically for the biological process (BP) hierarchy. Within this framework, several QA heuristics, based on the identification of groups of anomalous terms that exhibit certain taxonomy-defined characteristics, are introduced. Such groups are expected to have higher error rates when compared to other terms. Thus, by focusing QA efforts on anomalous terms, one would expect to find relatively more erroneous content. By automatically identifying these potential problem areas within an ontology, time and effort will be saved during manual reviews of GO's content. BP is used as a testbed, with samples of three kinds of anomalous BP terms chosen for a taxonomy-based QA review. Additional heuristics for QA are demonstrated. From the results of this QA effort, it is observed that different kinds of inconsistencies in the modeling of GO can be exposed with the use of the proposed heuristics. For comparison, the results of QA work on a sample of terms chosen from GO's general population are presented.

Original language: English (US)
Article number: 1642001
Journal: Journal of Bioinformatics and Computational Biology
Volume: 14
Issue number: 3
State: Published - Jun 1 2016

All Science Journal Classification (ASJC) codes

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications

Keywords

  • Gene ontology
  • Abstraction network
  • OBO ontology
  • Ontology quality assurance
