DOI: 10.1145/1964179.1964196
research-article

Automatically generating and tuning GPU code for sparse matrix-vector multiplication from a high-level representation

Published: 5 March 2011

ABSTRACT

We propose a system-independent representation of sparse matrix formats that allows a compiler to generate efficient, system-specific code for sparse matrix operations. To show the viability of such a representation, we have developed a compiler that generates and tunes code for sparse matrix-vector multiplication (SpMV) on GPUs. We evaluate our framework on six state-of-the-art matrix formats and show that the generated code performs comparably to or better than hand-optimized code.
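
As a rough illustration of why the storage format matters for SpMV, the sketch below computes the same product y = Ax from two common layouts, compressed sparse row (CSR) and ELL. The abstract does not name the six formats evaluated, so the choice of CSR and ELL is an assumption, and the function names `spmv_csr` and `spmv_ell` are hypothetical. This is plain sequential Python, not the GPU code or the high-level representation described in the paper.

```python
# Illustrative sketch only: CSR and ELL are assumed example formats,
# and these helpers are hypothetical, not taken from the paper.

def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for A stored in compressed sparse row (CSR) form."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        # Nonzeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y


def spmv_ell(ell_cols, ell_vals, x):
    """Compute y = A @ x for A stored in ELL form (rows padded to equal width)."""
    y = [0.0] * len(ell_vals)
    for row, (cols, vals) in enumerate(zip(ell_cols, ell_vals)):
        for c, v in zip(cols, vals):
            if c >= 0:  # a column index of -1 marks a padding slot
                y[row] += v * x[c]
    return y


# A = [[4, 0, 1],
#      [0, 2, 0],
#      [3, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]

ell_cols = [[0, 2], [1, -1], [0, 2]]
ell_vals = [[4.0, 1.0], [2.0, 0.0], [3.0, 5.0]]

x = [1.0, 2.0, 3.0]

y_csr = spmv_csr(row_ptr, col_idx, values, x)
y_ell = spmv_ell(ell_cols, ell_vals, x)
```

Both calls produce the same vector. The two loops differ only in how they walk the stored nonzeros: CSR uses a variable-length range per row, while ELL uses a fixed-width row with padding.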


Published in

GPGPU-4: Proceedings of the Fourth Workshop on General Purpose Processing on Graphics Processing Units
March 2011
101 pages
ISBN: 9781450305693
DOI: 10.1145/1964179

Copyright © 2011 ACM

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall acceptance rate: 57 of 129 submissions (44%)
