research-article

SIMD code generation for stencils on brick decompositions

Published: 10 February 2018

Abstract

We present a stencil library and associated compiler code generation framework designed to maximize performance on higher-order stencil computations through the use of two main technologies: a fine-grained brick data layout designed to exploit the inherent multidimensional spatial locality endemic to stencil computations, and a vector scatter associative reordering transformation that reduces vector loads and alignment operations and exposes opportunities for the backend compiler to reduce computation. For a range of stencil computations, we compare the generated code expressed in the brick library to the standard tiled code. We attain up to a 7.2X speedup on the most complex stencils when running on an Intel Knights Landing (Xeon Phi) processor.
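The brick layout described above can be illustrated with a small sketch. This is not the paper's library or its generated code: the brick edge length `B`, the helper names (`to_bricks`, `neighbor_table`, `load`, `stencil_bricks`), the periodic boundaries, and the unweighted 7-point stencil are all assumptions chosen for brevity. The sketch stores each small 3D brick contiguously, reaches neighboring bricks through an adjacency table, and checks the result against a conventional array traversal. It does not attempt SIMD code generation or the vector scatter associative reordering.

```python
# Illustrative sketch only: all names, the brick size, and the periodic
# boundaries are assumptions, not the paper's library API.
from itertools import product

B = 4  # brick edge length (assumed)
OFFSETS = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0),
           (0, -1, 0), (0, 0, 1), (0, 0, -1)]  # unweighted 7-point stencil


def to_bricks(grid, n):
    """Copy an n*n*n nested-list grid into bricks of B*B*B contiguous values."""
    nb = n // B  # bricks per dimension
    bricks = []
    for bz, by, bx in product(range(nb), repeat=3):
        bricks.append([grid[bz * B + z][by * B + y][bx * B + x]
                       for z, y, x in product(range(B), repeat=3)])
    return bricks, nb


def neighbor_table(nb):
    """adj[b][(dz, dy, dx)] is the index of the brick offset by (dz, dy, dx)."""
    adj = []
    for bz, by, bx in product(range(nb), repeat=3):
        entry = {}
        for dz, dy, dx in product((-1, 0, 1), repeat=3):
            nz, ny, nx = (bz + dz) % nb, (by + dy) % nb, (bx + dx) % nb
            entry[(dz, dy, dx)] = (nz * nb + ny) * nb + nx  # periodic wrap
        adj.append(entry)
    return adj


def load(bricks, adj, b, z, y, x):
    """Read (z, y, x) relative to brick b; each index may lie in [-B, 2B)."""
    dz, z = divmod(z, B)  # dz in {-1, 0, 1} selects the neighboring brick
    dy, y = divmod(y, B)
    dx, x = divmod(x, B)
    return bricks[adj[b][(dz, dy, dx)]][(z * B + y) * B + x]


def stencil_bricks(bricks, adj):
    """Apply the stencil brick by brick, using only the adjacency table."""
    return [[sum(load(bricks, adj, b, z + oz, y + oy, x + ox)
                 for oz, oy, ox in OFFSETS)
             for z, y, x in product(range(B), repeat=3)]
            for b in range(len(bricks))]


def stencil_reference(grid, n):
    """Same stencil on the conventional nested-list layout, periodic."""
    return [[[sum(grid[(z + oz) % n][(y + oy) % n][(x + ox) % n]
                  for oz, oy, ox in OFFSETS)
              for x in range(n)] for y in range(n)] for z in range(n)]


# Usage: an 8x8x8 integer grid, so the two layouts can be compared exactly.
n = 8
grid = [[[(z * n + y) * n + x for x in range(n)] for y in range(n)]
        for z in range(n)]
bricks, nb = to_bricks(grid, n)
adj = neighbor_table(nb)
out = stencil_bricks(bricks, adj)
expected, _ = to_bricks(stencil_reference(grid, n), n)
assert out == expected
```

Because every access goes through the adjacency table, bricks do not need to be stored in any particular order in memory, which is one plausible reason a layout like this suits stencils with wide neighborhoods.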

Published in

ACM SIGPLAN Notices, Volume 53, Issue 1
PPoPP '18
January 2018
426 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/3200691

PPoPP '18: Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2018
442 pages
ISBN: 9781450349826
DOI: 10.1145/3178487

Copyright © 2018 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
