TY - JOUR
T1 - Predicting coaxial helical stacking in RNA junctions
AU - Laing, Christian
AU - Wen, Dongrong
AU - Wang, Jason T.L.
AU - Schlick, Tamar
N1 - Funding Information:
National Science Foundation (EMT award # CCF-0727001 to T.S., grant # IIS-0707571 to J.W.); and the National Institutes of Health (grant # R01-GM081410 to T.S.). Funding for open access charge: National Science Foundation, National Institutes of Health.
PY - 2012/1
Y1 - 2012/1
N2 - RNA junctions are important structural elements that form when three or more helices come together in space in the tertiary structures of RNA molecules. Determining their structural configuration is important for predicting RNA 3D structure. We introduce a computational method to predict, at the secondary structure level, the coaxial helical stacking arrangement in junctions, as well as to classify the junction topology. Our approach uses a data mining technique known as random forests, which relies on a set of decision trees trained using length, sequence and other variables specified for any given junction. The resulting protocol predicts coaxial stacking within three- and four-way junctions with an accuracy of 81% and 77%, respectively; the accuracy increases to 83% and 87%, respectively, when knowledge of the junction family type is included. Coaxial stacking predictions for five- to ten-way junctions are less accurate (60%) owing to the sparse data available for training. Additionally, our application predicts the junction family with an accuracy of 85% for three-way junctions and 74% for four-way junctions. Comparisons with other methods, as well as applications to unsolved RNAs, are also presented. The web server Junction-Explorer to predict junction topologies is freely available at: http://bioinformatics.njit.edu/junction.
AB - RNA junctions are important structural elements that form when three or more helices come together in space in the tertiary structures of RNA molecules. Determining their structural configuration is important for predicting RNA 3D structure. We introduce a computational method to predict, at the secondary structure level, the coaxial helical stacking arrangement in junctions, as well as to classify the junction topology. Our approach uses a data mining technique known as random forests, which relies on a set of decision trees trained using length, sequence and other variables specified for any given junction. The resulting protocol predicts coaxial stacking within three- and four-way junctions with an accuracy of 81% and 77%, respectively; the accuracy increases to 83% and 87%, respectively, when knowledge of the junction family type is included. Coaxial stacking predictions for five- to ten-way junctions are less accurate (60%) owing to the sparse data available for training. Additionally, our application predicts the junction family with an accuracy of 85% for three-way junctions and 74% for four-way junctions. Comparisons with other methods, as well as applications to unsolved RNAs, are also presented. The web server Junction-Explorer to predict junction topologies is freely available at: http://bioinformatics.njit.edu/junction.
UR - http://www.scopus.com/inward/record.url?scp=84862969708&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862969708&partnerID=8YFLogxK
U2 - 10.1093/nar/gkr629
DO - 10.1093/nar/gkr629
M3 - Article
C2 - 21917853
AN - SCOPUS:84862969708
SN - 0305-1048
VL - 40
SP - 487
EP - 498
JO - Nucleic Acids Research
JF - Nucleic Acids Research
IS - 2
ER -