TY - GEN
T1 - GTfold
T2 - 24th Annual ACM Symposium on Applied Computing, SAC 2009
AU - Mathuriya, Amrita
AU - Bader, David A.
AU - Heitsch, Christine E.
AU - Harvey, Stephen C.
PY - 2009
Y1 - 2009
N2 - The prediction of the correct secondary structures of large RNAs is one of the unsolved challenges of computational molecular biology. Among the major obstacles is the fact that accurate calculations scale as O(n^4), so the computational requirements become prohibitive as the length increases. Existing folding programs implement heuristics and approximations to overcome these limitations. We present a new parallel multicore and scalable program called GTfold, which is one to two orders of magnitude faster than the de facto standard programs and achieves comparable accuracy of prediction. Development of GTfold opens up a new path for the algorithmic improvements and application of an improved thermodynamic model to increase the prediction accuracy. In this paper we analyze the algorithm's concurrency and describe the parallelism for a shared memory environment such as a symmetric multiprocessor or multicore chip. In a remarkable demonstration, GTfold now optimally folds 11 picornaviral RNA sequences ranging from 7100 to 8200 nucleotides in 8 minutes, compared with the two months it took in a previous study. We are seeing a paradigm shift to multicore chips and parallelism must be explicitly addressed to continue gaining performance with each new generation of systems. We also show that the exact algorithms like internal loop speedup can be implemented with our method in an affordable amount of time. GTfold is freely available as open source from our website.
AB - The prediction of the correct secondary structures of large RNAs is one of the unsolved challenges of computational molecular biology. Among the major obstacles is the fact that accurate calculations scale as O(n^4), so the computational requirements become prohibitive as the length increases. Existing folding programs implement heuristics and approximations to overcome these limitations. We present a new parallel multicore and scalable program called GTfold, which is one to two orders of magnitude faster than the de facto standard programs and achieves comparable accuracy of prediction. Development of GTfold opens up a new path for the algorithmic improvements and application of an improved thermodynamic model to increase the prediction accuracy. In this paper we analyze the algorithm's concurrency and describe the parallelism for a shared memory environment such as a symmetric multiprocessor or multicore chip. In a remarkable demonstration, GTfold now optimally folds 11 picornaviral RNA sequences ranging from 7100 to 8200 nucleotides in 8 minutes, compared with the two months it took in a previous study. We are seeing a paradigm shift to multicore chips and parallelism must be explicitly addressed to continue gaining performance with each new generation of systems. We also show that the exact algorithms like internal loop speedup can be implemented with our method in an affordable amount of time. GTfold is freely available as open source from our website.
KW - Computational biology
KW - Parallel algorithms
KW - Ribosomal and viral RNA
UR - http://www.scopus.com/inward/record.url?scp=70349285940&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70349285940&partnerID=8YFLogxK
U2 - 10.1145/1529282.1529497
DO - 10.1145/1529282.1529497
M3 - Conference contribution
AN - SCOPUS:70349285940
SN - 9781605581668
T3 - Proceedings of the ACM Symposium on Applied Computing
SP - 981
EP - 988
BT - 24th Annual ACM Symposium on Applied Computing, SAC 2009
Y2 - 8 March 2009 through 12 March 2009
ER -