TY - JOUR
T1 - A quality assurance methodology for ChEBI ontology focusing on uncommonly modeled concepts
AU - Liu, Hao
AU - Chen, Ling
AU - Zheng, Ling
AU - Perl, Yehoshua
AU - Geller, James
N1 - Funding Information:
Research reported in this publication was partially supported by the National Cancer Institute of the National Institutes of Health under Award Number R01CA190779. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Publisher Copyright:
© 2018 CEUR-WS.
PY - 2018
Y1 - 2018
AB - The Chemical Entities of Biological Interest (ChEBI) ontology is an important knowledge source of chemical entities in a biological context. ChEBI is large and complex, making it almost impossible to be error-free, given the scarce resources for quality assurance (QA). We present a methodology to locate concepts in ChEBI with a high probability of being erroneous. An Abstraction Network, which provides a compact summarization of an ontology, supports the methodology. By investigating a sample of ChEBI concepts, we show that uncommonly modeled concepts residing in small units of the Abstraction Network of ChEBI are statistically significantly more likely to have errors than other concepts. The finding may guide ChEBI ontology curators to focus their limited QA resources on such concepts to achieve a better QA yield. Furthermore, this study, combined with previous work, contributes to progress in showing that this methodology can be applied to a whole family of similar ontologies.
KW - ChEBI
KW - Chemical concept
KW - Chemical ontology
KW - Modeling error
KW - Quality assurance
UR - http://www.scopus.com/inward/record.url?scp=85059841943&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059841943&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85059841943
SN - 1613-0073
VL - 2285
JO - CEUR Workshop Proceedings
JF - CEUR Workshop Proceedings
T2 - 9th International Conference on Biological Ontology, ICBO 2018
Y2 - 7 August 2018 through 10 August 2018
ER -