Abstract
This paper evaluates how the choice of GPU programming framework affects the performance of a GPU-accelerated striped Smith-Waterman algorithm for biological sequence alignment. Six GPU implementations of the algorithm, targeting the NVIDIA GT200b and the AMD RV870 with the CUDA and OpenCL frameworks, are compared to analyze the pros and cons of explicitly describing architecture-specific hardware mechanisms in the code. The evaluation results show that primitive descriptions in CUDA remain efficient, especially for small data sizes, whereas the OpenCL compiler performs better instruction scheduling and optimization. The combination of OpenCL and the RV870, which presents a relatively simple view of the architecture, is in turn efficient for large data sizes.
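For readers unfamiliar with the underlying computation, the following is a minimal scalar sketch of Smith-Waterman local alignment scoring with affine gap penalties. It is not the striped SIMD layout and not any of the six GPU implementations compared in the paper. The function name `sw_score`, the simple match/mismatch scoring (standing in for a substitution matrix), and the default penalty values are assumptions chosen for illustration only.

```python
def sw_score(query, target, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Best local alignment score with affine gaps (scalar reference sketch).

    Assumed gap convention: a gap of length k costs
    gap_open + (k - 1) * gap_extend.
    """
    n = len(target)
    neg_inf = float("-inf")
    h_prev = [0] * (n + 1)        # H values of the previous row
    f = [neg_inf] * (n + 1)       # best score ending in a vertical gap, per column
    best = 0
    for i in range(1, len(query) + 1):
        h_cur = [0] * (n + 1)     # H values of the current row
        e = neg_inf               # best score ending in a horizontal gap
        for j in range(1, n + 1):
            # Extend an existing gap or open a new one, in each direction.
            e = max(e - gap_extend, h_cur[j - 1] - gap_open)
            f[j] = max(f[j] - gap_extend, h_prev[j] - gap_open)
            s = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: scores are clamped at zero.
            h = max(0, h_prev[j - 1] + s, e, f[j])
            h_cur[j] = h
            best = max(best, h)
        h_prev = h_cur
    return best
```

Each cell depends on its left, upper, and upper-left neighbours, which is the data dependency that SIMD and GPU formulations of the algorithm have to work around. A database search would call a function like this once per database sequence and rank the sequences by the returned score.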
Index Terms
Performance comparison of GPU programming frameworks with the striped Smith-Waterman algorithm