Generating Training Data for Concept-Mining for an 'Interface Terminology' Annotating Cardiology EHRs

Vipina K. Keloth, Shuxin Zhou, Andrew J. Einstein, Gai Elhanan, Yan Chen, James Geller, Yehoshua Perl

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Clinical data stored in EHRs could provide valuable knowledge for research if it were annotated properly. However, almost no EHR notes are currently annotated, because the performance of off-the-shelf annotation tools is unsatisfactory. Concentrating on the cardiology specialty, we propose to design a Cardiology Interface Terminology dedicated to the annotation of EHR notes in cardiology. This interface terminology will be developed by adding high-granularity concepts, mined from cardiology EHR notes, to an initial version that reuses SNOMED CT cardiology subhierarchies. Extending this interface terminology with text-mining NLP tools based on machine learning requires proper training data. In this paper, we discuss concept-mining of EHR notes, iteratively applying concatenation and anchoring operations to create such training data. This approach can be applied to other medical specialties.

Original language: English (US)
Title of host publication: Proceedings - 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020
Editors: Taesung Park, Young-Rae Cho, Xiaohua Tony Hu, Illhoi Yoo, Hyun Goo Woo, Jianxin Wang, Julio Facelli, Seungyoon Nam, Mingon Kang
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1728-1735
Number of pages: 8
ISBN (Electronic): 9781728162157
State: Published - Dec 16 2020
Event: 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020 - Virtual, Seoul, Korea, Republic of
Duration: Dec 16 2020 to Dec 19 2020

Publication series

Name: Proceedings - 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020

Conference

Conference: 2020 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2020
Country: Korea, Republic of
City: Virtual, Seoul
Period: 12/16/20 to 12/19/20

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Information Systems and Management
  • Medicine (miscellaneous)
  • Health Informatics

Keywords

  • Cardiology EHR
  • EHR annotation
  • enriching interface terminology
  • interface terminologies
  • training data
