research-article

SuperDragon: A Heterogeneous Parallel System for Accelerating 3D Reconstruction of Cryo-Electron Microscopy Images

Published: 13 September 2015

Abstract

The data deluge in medical image processing demands faster and more efficient systems. Recent advances in heterogeneous architectures have driven a resurgence of research on domain-specific accelerators. In this article, we develop an experimental system, SuperDragon, to evaluate the acceleration of EMAN, a single-particle Cryo-electron microscopy (Cryo-EM) 3D reconstruction package, on a hybrid parallel architecture of CPUs, GPUs, and FPGAs. Based on a comprehensive workload characterization, we exploit multigrained parallelism in the Cryo-EM 3D reconstruction algorithm and investigate a suitable mapping of the computation onto the underlying heterogeneous architecture. The package is restructured with task-level (MPI), thread-level (OpenMP), and data-level (GPU and FPGA) parallelism. In particular, the proposed FPGA accelerator is a stream architecture that emphasizes optimizing computing-dominated data access patterns. Its configurable computing streams are constructed by arranging hardware modules and bypass channels into a deep linear pipeline. Compared to the multicore (six-core) program, the GPU and FPGA implementations achieve speedups of 8.4 and 2.25 times in execution time while improving power efficiency by factors of 7.2 and 14.2, respectively.
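The reported figures can be combined into a rough back-of-the-envelope comparison. The sketch below assumes that "power efficiency" means work completed per unit of energy (equivalently, throughput per watt); the abstract does not state the definition, so this is an interpretation rather than the authors' method. Under that assumption, an accelerator with speedup S and efficiency gain E draws S/E times the baseline's average power and uses 1/E times its energy for the same job. The function name `implied_ratios` is hypothetical.

```python
# Assumption (not stated in the abstract): power efficiency = work per joule.
# Then for the same job, relative to the six-core baseline:
#   relative_power  = speedup / efficiency_gain
#   relative_energy = 1 / efficiency_gain


def implied_ratios(speedup, efficiency_gain):
    """Return (relative_power, relative_energy) versus the baseline."""
    relative_power = speedup / efficiency_gain
    relative_energy = 1.0 / efficiency_gain
    return relative_power, relative_energy


# (speedup, power-efficiency gain) as reported in the abstract.
reported = {"GPU": (8.4, 7.2), "FPGA": (2.25, 14.2)}

for name, (speedup, gain) in reported.items():
    power, energy = implied_ratios(speedup, gain)
    print(f"{name}: average power x{power:.2f}, energy per job x{energy:.3f}")
```

Read this way, the GPU finishes much sooner while drawing somewhat more power than the six-core baseline, whereas the FPGA is slower than the GPU but draws only a fraction of the baseline power. These ratios follow only from the assumed definition and are not measurements from the article.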


    • Published in

      ACM Transactions on Reconfigurable Technology and Systems  Volume 8, Issue 4
      October 2015
      134 pages
      ISSN:1936-7406
      EISSN:1936-7414
      DOI:10.1145/2822909
      • Editor: Steve Wilton

      Copyright © 2015 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 13 September 2015
      • Accepted: 1 February 2015
      • Revised: 1 January 2015
      • Received: 1 July 2014

      Qualifiers

      • research-article
      • Research
      • Refereed
