Abstract
Phylogenetic analysis investigates the evolution of species and the relationships among them, and is widely used in systems biology and comparative genomics. It is also computationally intensive, because the number of possible tree topologies grows factorially with the number of species involved. Since real-world studies involve large numbers of species, this computational burden has largely thwarted phylogenetic reconstruction. In this paper, we describe in detail the design and implementation of a GPU-based, multi-threaded Markov Chain Monte Carlo (MCMC) maximum likelihood algorithm for phylogenetic analysis of a set of aligned nucleotide sequences. The implementation is based on the framework of MrBayes, the most widely used phylogenetic analysis tool. The proposed approach achieved a 6x-8x speed-up on an NVIDIA GeForce GTX 460 GPU compared to an optimized software implementation for a general-purpose processor (GPP), running on a desktop computer with a single Intel Xeon 2.53 GHz CPU and 6.0 GB of RAM.
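To make the growth in tree topologies concrete, the following minimal sketch counts the unrooted, strictly bifurcating topologies on n labeled species using the standard combinatorial result (2n-5)!! for n >= 3. The function name `num_unrooted_topologies` is hypothetical and chosen for illustration only; it is not part of MrBayes or of the implementation described in this paper.

```python
def num_unrooted_topologies(n_taxa: int) -> int:
    """Count unrooted, strictly bifurcating tree topologies on n_taxa labeled leaves.

    Uses the double-factorial formula (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5),
    which holds for n >= 3. The name is illustrative, not from any library.
    """
    if n_taxa < 3:
        raise ValueError("formula applies to 3 or more taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product when n == 3).
    for k in range(3, 2 * n_taxa - 5 + 1, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_topologies(n))
```

Ten species already allow more than two million topologies, which is why exhaustive enumeration is impractical and sampling methods such as MCMC are used instead.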
Index Terms
High performance phylogenetic analysis on CUDA-compatible GPUs