Pre-miRNA classification via combinatorial feature mining and boosting

Ling Zhong, Jason T.L. Wang, Dongrong Wen, Bruce A. Shapiro

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Scopus citations

Abstract

MicroRNAs (miRNAs) are non-coding RNAs of approximately 22 nucleotides (nt) that are derived from precursor molecules. These precursor molecules, or pre-miRNAs, often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs from other hairpin sequences with similar stem-loops (referred to as pseudo pre-miRNAs). Several computational methods have been developed to tackle this challenge. In this paper we propose a new method, called MirID, for identifying and classifying microRNA precursors. We collect 74 features from the sequences and secondary structures of pre-miRNAs; some of these features are taken from our previous studies on non-coding RNA prediction, while others were suggested in the literature. We develop a combinatorial feature mining algorithm to identify suitable feature sets. These feature sets are then used to train support vector machines to obtain classification models, from which a classifier ensemble is constructed. Finally, we use a boosting algorithm to further enhance the accuracy of the classifier ensemble. Experimental results on a variety of species demonstrate the good performance of the proposed method and its superiority over existing tools.
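
The pipeline outlined in the abstract (candidate feature sets, one base classifier per set, then boosting) can be illustrated with a minimal sketch. This is not the authors' MirID implementation, and the abstract does not specify the mining algorithm. The sketch assumes that feature sets are enumerated exhaustively with `itertools.combinations`, swaps the support vector machines for a weighted nearest-centroid classifier so that it runs on the standard library alone, and uses a made-up toy dataset with three anonymous features. All function names (`train_centroid`, `adaboost`, `predict`) are hypothetical.

```python
import math
from itertools import combinations

def train_centroid(X, y, w, feats):
    """Weighted nearest-centroid classifier on the feature indices in `feats`.
    Stand-in for an SVM; an assumption made to keep the sketch dependency-free."""
    cent = {}
    for label in (+1, -1):
        total = sum(wi for wi, yi in zip(w, y) if yi == label)
        cent[label] = [
            sum(wi * x[f] for x, wi, yi in zip(X, w, y) if yi == label) / total
            for f in feats
        ]

    def classify(x):
        dist = {lab: sum((x[f] - c) ** 2 for f, c in zip(feats, cent[lab]))
                for lab in cent}
        return +1 if dist[+1] <= dist[-1] else -1

    return classify

def adaboost(X, y, feature_sets, rounds=5, eps=1e-10):
    """AdaBoost in which each round keeps the feature set whose base
    classifier has the lowest weighted training error."""
    n = len(X)
    w = [1.0 / n] * n                      # uniform example weights to start
    ensemble = []                          # list of (alpha, feats, classifier)
    for _ in range(rounds):
        best = None
        for feats in feature_sets:
            clf = train_centroid(X, y, w, feats)
            err = sum(wi for x, yi, wi in zip(X, y, w) if clf(x) != yi)
            if best is None or err < best[0]:
                best = (err, feats, clf)
        err, feats, clf = best
        if err >= 0.5:                     # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err + eps) / (err + eps))
        ensemble.append((alpha, feats, clf))
        if err == 0:                       # already perfect on training data
            break
        # Increase the weight of misclassified examples, then renormalize.
        w = [wi * math.exp(-alpha * yi * clf(x)) for x, yi, wi in zip(X, y, w)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of the ensemble members."""
    score = sum(alpha * clf(x) for alpha, _, clf in ensemble)
    return +1 if score >= 0 else -1

# Toy data (made up): +1 = real pre-miRNA, -1 = pseudo pre-miRNA.
# Features 0 and 1 separate the classes only jointly; feature 2 is noise.
X = [(1.0, 0.4, 0.5), (0.4, 1.0, 0.2), (0.9, 0.9, 0.8),
     (0.6, 0.0, 0.5), (0.0, 0.6, 0.8), (0.1, 0.1, 0.2)]
y = [+1, +1, +1, -1, -1, -1]

# Candidate feature sets: every subset of size 1 or 2 (exhaustive enumeration).
feature_sets = [c for k in (1, 2) for c in combinations(range(3), k)]

ensemble = adaboost(X, y, feature_sets)
print([feats for _, feats, _ in ensemble])
print([predict(ensemble, x) for x in X])
```

On this toy data a single feature pair already separates the two classes, so the loop stops after one round. On harder data, later rounds would reweight the misclassified examples and could select different feature sets. Exhaustive enumeration is only feasible for a handful of features; with 74 features the number of subsets grows combinatorially, which is presumably why a dedicated mining algorithm is needed.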

Original language: English (US)
Title of host publication: Proceedings - 2012 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2012
Pages: 369-372
Number of pages: 4
State: Published - 2012
Externally published: Yes
Event: 2012 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2012 - Philadelphia, PA, United States
Duration: Oct 4 2012 to Oct 7 2012

Publication series

Name: Proceedings - 2012 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2012

Other

Other: 2012 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2012
Country/Territory: United States
City: Philadelphia, PA
Period: 10/4/12 to 10/7/12

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
  • Health Informatics

Keywords

  • AdaBoost
  • ensemble method
  • miRNA precursor
  • support vector machine
