Abstract
The Chemical Entities of Biological Interest (ChEBI) ontology is an important knowledge source of chemical entities in a biological context. ChEBI is large and complex, so keeping it error-free is almost impossible given the scarce resources available for quality assurance (QA). We present a methodology to locate concepts in ChEBI that have a high probability of being erroneous. The methodology is supported by an Abstraction Network, which provides a compact summarization of an ontology. By investigating a sample of ChEBI concepts, we show that uncommonly modeled concepts residing in small units of the ChEBI Abstraction Network are statistically significantly more likely to have errors than other concepts. This finding may guide ChEBI curators to focus their limited QA resources on such concepts to achieve a better QA yield. Furthermore, combined with previous work, this study advances the effort to show that the methodology can be applied to a whole family of similar ontologies.
Original language | English (US) |
---|---|
Journal | CEUR Workshop Proceedings |
Volume | 2285 |
State | Published - 2018 |
Event | 9th International Conference on Biological Ontology, ICBO 2018 - Corvallis, United States; Duration: Aug 7 2018 → Aug 10 2018 |
All Science Journal Classification (ASJC) codes
- General Computer Science
Keywords
- ChEBI
- Chemical concept
- Chemical ontology
- Modeling error
- Quality assurance