
Performance Analysis of the Multi-GPU System with ExpEther

Published: 03 December 2014

Abstract

A GPU cluster in which each node provides a few GPUs connected by PCIe (PCI Express) is commonly used to accelerate large application programs that require performance beyond that of a single GPU. However, in such a system, programmers must write two levels of parallel code: inter-node communication in MPI or another message-passing library, and fine-grained parallel programming within each GPU. As a cost-effective alternative to such clusters, we propose a novel multi-GPU system with ExpEther, a virtualization technique that extends the PCIe of a host CPU to Ethernet. All devices connected by ExpEther can be treated as if they were directly connected to the host. Evaluation with two application programs, with and without GPU-GPU communication, revealed that the proposed system with four GPUs achieved 3.88 and 3.29 times performance improvement, respectively, compared with a single-GPU system. Compared with a GPU cluster system in which each node provides a GPU, the proposed system achieved about 7% and 30% performance improvement, respectively.
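
To put the reported four-GPU speedups in context, they can be converted to parallel efficiency (speedup divided by GPU count). The short Python sketch below is illustrative only: the function name and the labels "first application" and "second application" are assumptions introduced here, and only the figures 3.88, 3.29, and the GPU count of four come from the abstract.

```python
def parallel_efficiency(speedup: float, num_gpus: int) -> float:
    """Return the fraction of ideal linear speedup that was achieved."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return speedup / num_gpus


# Speedups over a single GPU, as reported in the abstract for four GPUs.
# The labels are placeholders; the abstract does not name the applications.
NUM_GPUS = 4
reported_speedups = {
    "first application": 3.88,
    "second application": 3.29,
}

efficiencies = {
    name: parallel_efficiency(speedup, NUM_GPUS)
    for name, speedup in reported_speedups.items()
}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.2%} of ideal linear scaling")
```

Under this definition the two figures correspond to roughly 97% and 82% of ideal linear scaling on four GPUs.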

