poster

An optimizing compiler for GPGPU programs with input-data sharing

Published: 09 January 2010

Abstract

Developing high-performance GPGPU programs is challenging for application developers because performance depends on how well the code leverages the hardware features of specific graphics processors. To solve this problem and relieve application developers of low-level, hardware-specific optimizations, we introduce a novel compiler to optimize GPGPU programs. Our compiler takes as input a naive GPU kernel function, one that is functionally correct but written without any consideration for performance. The compiler then analyzes the code, identifies memory access patterns, and generates optimized code. The proposed compiler optimizations target one category of scientific and media-processing algorithms, characterized by input-data sharing when computing neighboring output pixels/elements. Many commonly used algorithms, such as matrix multiplication and convolution, share this characteristic. For these algorithms, novel approaches are proposed to enforce memory coalescing and achieve effective data reuse. Data prefetching and hardware-specific tuning are also performed automatically within our compiler framework. Experimental results on a set of applications show that our compiler achieves very high performance, either superior or very close to that of the highly fine-tuned library NVIDIA CUBLAS 2.1.
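To make the idea of input-data sharing concrete, the following is a minimal CPU-side sketch in Python, not the compiler's actual transformation or output. It counts modeled "global memory" loads for a naive matrix multiplication and for a tiled variant in which each block of outputs stages the input tiles it shares once, loosely analogous to staging data in GPU shared memory. The function names `naive_matmul` and `tiled_matmul`, the `tile` parameter, and the load-counting model are all assumptions introduced for illustration.

```python
def naive_matmul(a, b, n):
    """Naive version: every output element re-reads its whole row of a and column of b."""
    loads = 0
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0
            for k in range(n):
                acc += a[i][k] * b[k][j]
                loads += 2  # one modeled global load from a, one from b
            c[i][j] = acc
    return c, loads


def tiled_matmul(a, b, n, tile):
    """Tiled version: neighboring outputs in a tile x tile block share staged input tiles."""
    assert n % tile == 0, "this sketch assumes the tile size divides n"
    loads = 0
    c = [[0] * n for _ in range(n)]
    for bi in range(0, n, tile):          # block row of the output
        for bj in range(0, n, tile):      # block column of the output
            for bk in range(0, n, tile):  # step along the shared dimension
                # Stage one tile of a and one tile of b; each element is loaded once.
                a_tile = [[a[bi + i][bk + k] for k in range(tile)] for i in range(tile)]
                b_tile = [[b[bk + k][bj + j] for j in range(tile)] for k in range(tile)]
                loads += 2 * tile * tile
                # All outputs in the block reuse the staged tiles.
                for i in range(tile):
                    for j in range(tile):
                        acc = c[bi + i][bj + j]
                        for k in range(tile):
                            acc += a_tile[i][k] * b_tile[k][j]
                        c[bi + i][bj + j] = acc
    return c, loads


if __name__ == "__main__":
    n, tile = 8, 4
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i * j + 1 for j in range(n)] for i in range(n)]
    c_naive, loads_naive = naive_matmul(a, b, n)
    c_tiled, loads_tiled = tiled_matmul(a, b, n, tile)
    print(c_naive == c_tiled, loads_naive, loads_tiled)
```

Under this simplified model the naive version issues 2n³ loads and the tiled version 2n³/tile, so the reduction factor equals the tile size. On a real GPU the benefit also depends on coalescing, shared-memory capacity, and occupancy, which this sketch does not model.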



    • Published in

      ACM SIGPLAN Notices, Volume 45, Issue 5
      PPoPP '10
      May 2010
      346 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/1837853
      • PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
        January 2010
        372 pages
        ISBN: 9781605588773
        DOI: 10.1145/1693453

      Copyright © 2010 Copyright held by author(s).

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • poster
