poster

Automatic problem size sensitive task partitioning on heterogeneous parallel systems

Published: 23 February 2013

Abstract

In this paper we propose a novel approach that automates task partitioning in heterogeneous systems. Our framework is based on the Insieme Compiler and Runtime infrastructure. The compiler translates a single-device OpenCL program into a multi-device OpenCL program. The runtime system then performs dynamic task partitioning based on an offline-generated prediction model. To derive the prediction model, we use a machine learning approach that incorporates static program features as well as dynamic, input-sensitive features. Our approach has been evaluated on a suite of 23 programs and achieves performance improvements compared with executing the benchmarks on a single CPU only or on a single GPU only.
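The runtime step described above, splitting one kernel's work across devices according to a predicted ratio, can be illustrated with a minimal sketch. This is not the Insieme implementation: the function names, the device labels, and the size threshold inside `predict_shares` are all hypothetical stand-ins, and the real prediction model is a trained machine-learning model rather than a hand-written rule.

```python
from typing import Dict, List, Tuple


def partition_range(global_size: int, shares: Dict[str, float],
                    granularity: int = 1) -> List[Tuple[str, int, int]]:
    """Split [0, global_size) into contiguous (device, offset, size) chunks.

    `shares` maps a hypothetical device name to its fraction of the work.
    Chunk boundaries are rounded down to a multiple of `granularity`
    (for example a work-group size); the last device takes the remainder.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    devices = list(shares)
    chunks = []
    offset = 0
    cumulative = 0.0
    for i, dev in enumerate(devices):
        if i == len(devices) - 1:
            end = global_size  # last device absorbs rounding leftovers
        else:
            cumulative += shares[dev] / total
            end = int(cumulative * global_size) // granularity * granularity
            end = max(end, offset)
        chunks.append((dev, offset, end - offset))
        offset = end
    return chunks


def predict_shares(problem_size: int) -> Dict[str, float]:
    """Assumed stand-in for the offline-trained model (made-up threshold)."""
    if problem_size < 4096:
        return {"cpu": 1.0, "gpu": 0.0}  # small inputs: stay on the CPU
    return {"cpu": 0.25, "gpu": 0.75}    # large inputs: favour the GPU


if __name__ == "__main__":
    for n in (1024, 1_000_000):
        print(n, partition_range(n, predict_shares(n), granularity=64))
```

In an OpenCL setting, each `(offset, size)` pair would correspond to the global work offset and global work size of one per-device kernel launch. The point of making the ratio depend on the problem size is that the best split for a small input can differ from the best split for a large one.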


