poster

Data transformations enabling loop vectorization on multithreaded data parallel architectures

Published: 09 January 2010

Abstract

Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memory access patterns in the data stream. This paper describes data transformations that allow us to vectorize loops targeting massively multithreaded data parallel architectures. We present a mathematical model that captures loop-based memory access patterns and computes the most appropriate data transformations in order to enable vectorization. Our experimental results show that the proposed data transformations can significantly increase the number of loops that can be vectorized and enhance the data-level parallelism of applications. Our results also show that the overhead associated with our data transformations can be easily amortized as the size of the input data set increases. For the set of high performance benchmark kernels studied, we achieve consistent and significant performance improvements (up to 11.4X) by applying vectorization using our data transformation approach.
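As an illustration only, here is a minimal sketch of one kind of data transformation that turns an irregular (strided) access pattern into a unit-stride one. The abstract does not spell out the paper's mathematical model, so this is not the authors' method; the function names `to_unit_stride`, `strided_loop`, and `contiguous_loop`, and the fixed-stride layout, are assumptions made for this example. The regrouping is essentially an array-of-structures to structure-of-arrays conversion.

```python
def to_unit_stride(data, stride):
    """Regroup data so that elements read by consecutive loop iterations are adjacent.

    The element at original index i*stride + k moves to index k*n + i,
    where n = len(data) // stride. Assumes len(data) is a multiple of stride.
    """
    if len(data) % stride != 0:
        raise ValueError("length must be a multiple of stride")
    n = len(data) // stride
    out = [None] * len(data)
    for i in range(n):
        for k in range(stride):
            out[k * n + i] = data[i * stride + k]
    return out


def strided_loop(data, stride, offset):
    """Original loop: iteration i reads data[i*stride + offset] (non-adjacent elements)."""
    n = len(data) // stride
    return [2 * data[i * stride + offset] for i in range(n)]


def contiguous_loop(transformed, stride, offset):
    """Same computation on the transformed layout: one contiguous block of n elements."""
    n = len(transformed) // stride
    base = offset * n
    return [2 * x for x in transformed[base:base + n]]


if __name__ == "__main__":
    data = list(range(12))
    transformed = to_unit_stride(data, 3)
    # Both loops compute the same values; only the memory layout differs.
    print(strided_loop(data, 3, 1))
    print(contiguous_loop(transformed, 3, 1))
```

In the transformed layout each loop reads a contiguous block, which is the access shape that vector loads need. The regrouping pass in `to_unit_stride` is a one-time cost over the data, the kind of overhead the abstract reports as amortized when the input grows.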

    • Published in

      ACM SIGPLAN Notices, Volume 45, Issue 5
      PPoPP '10
      May 2010
      346 pages
      ISSN: 0362-1340
      EISSN: 1558-1160
      DOI: 10.1145/1837853
      • PPoPP '10: Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
        January 2010
        372 pages
        ISBN: 9781605588773
        DOI: 10.1145/1693453

      Copyright © 2010 held by author(s).

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • poster
