TY - JOUR
T1 - Effective classification of microRNA precursors using feature mining and AdaBoost algorithms
AU - Zhong, Ling
AU - Wang, Jason T.L.
AU - Wen, Dongrong
AU - Aris, Virginie
AU - Soteropoulos, Patricia
AU - Shapiro, Bruce A.
PY - 2013
Y1 - 2013
N2 - MicroRNAs play important roles in most biological processes, including cell proliferation, tissue differentiation, and embryonic development, among others. They originate from precursor transcripts (pre-miRNAs), which contain phylogenetically conserved stem-loop structures. An important bioinformatics problem is to distinguish the pre-miRNAs from pseudo pre-miRNAs that have similar stem-loop structures. We present here a novel method for tackling this bioinformatics problem. Our method, named MirID, accepts an RNA sequence as input, and classifies the RNA sequence either as positive (i.e., a real pre-miRNA) or as negative (i.e., a pseudo pre-miRNA). MirID employs a feature mining algorithm for finding combinations of features suitable for building pre-miRNA classification models. These models are implemented using support vector machines, which are combined to construct a classifier ensemble. The accuracy of the classifier ensemble is further enhanced by the utilization of an AdaBoost algorithm. When compared with two closely related tools on twelve species analyzed with these tools, MirID outperforms the existing tools on the majority of the twelve species. MirID was also tested on nine additional species, and the results showed high accuracies on the nine species. The MirID web server is fully operational and freely accessible at http://bioinformatics.njit.edu/MirID/. Potential applications of this software in genomics and medicine are also discussed.
AB - MicroRNAs play important roles in most biological processes, including cell proliferation, tissue differentiation, and embryonic development, among others. They originate from precursor transcripts (pre-miRNAs), which contain phylogenetically conserved stem-loop structures. An important bioinformatics problem is to distinguish the pre-miRNAs from pseudo pre-miRNAs that have similar stem-loop structures. We present here a novel method for tackling this bioinformatics problem. Our method, named MirID, accepts an RNA sequence as input, and classifies the RNA sequence either as positive (i.e., a real pre-miRNA) or as negative (i.e., a pseudo pre-miRNA). MirID employs a feature mining algorithm for finding combinations of features suitable for building pre-miRNA classification models. These models are implemented using support vector machines, which are combined to construct a classifier ensemble. The accuracy of the classifier ensemble is further enhanced by the utilization of an AdaBoost algorithm. When compared with two closely related tools on twelve species analyzed with these tools, MirID outperforms the existing tools on the majority of the twelve species. MirID was also tested on nine additional species, and the results showed high accuracies on the nine species. The MirID web server is fully operational and freely accessible at http://bioinformatics.njit.edu/MirID/. Potential applications of this software in genomics and medicine are also discussed.
UR - http://www.scopus.com/inward/record.url?scp=84883703071&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883703071&partnerID=8YFLogxK
U2 - 10.1089/omi.2013.0011
DO - 10.1089/omi.2013.0011
M3 - Article
C2 - 23808606
AN - SCOPUS:84883703071
SN - 1536-2310
VL - 17
SP - 486
EP - 493
JO - OMICS A Journal of Integrative Biology
JF - OMICS A Journal of Integrative Biology
IS - 9
ER -