skip to main content
article
Public Access

Verified lifting of stencil computations

Published:02 June 2016Publication History
Skip Abstract Section

Abstract

This paper demonstrates a novel combination of program synthesis and verification to lift stencil computations from low-level Fortran code to a high-level summary expressed using a predicate language. The technique is sound and mostly automated, and leverages counter-example guided inductive synthesis (CEGIS) to find provably correct translations. Lifting existing code to a high-performance description language has a number of benefits, including maintainability and performance portability. For example, our experiments show that the lifted summaries can enable domain specific compilers to do a better job of parallelization as compared to an off-the-shelf compiler working on the original code, and can even support fully automatic migration to hardware accelerators such as GPUs. We have implemented verified lifting in a system called STNG and have evaluated it using microbenchmarks, mini-apps, and real-world applications. We demonstrate the benefits of verified lifting by first automatically summarizing Fortran source code into a high-level predicate language, and subsequently translating the lifted summaries into Halide, with the translated code achieving median performance speedups of 4.1X and up to 24X for non-trivial stencils as compared to the original implementation.

References

  1. Apache Hive. http://hive.apache.org.Google ScholarGoogle Scholar
  2. Syntax-Guided Synthesis Competition. http://www.sygus. org.Google ScholarGoogle Scholar
  3. Rajeev Alur, Rastislav Bod´ık, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman, Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, and Abhishek Udupa. Syntax-guided synthesis. In Formal Methods in Computer-Aided Design, FMCAD 2013, Portland, OR, USA, October 20-23, 2013, pages 1–17, 2013.Google ScholarGoogle Scholar
  4. Saman P Amarasinghe, Jennifer-Ann M Anderson, Monica S Lam, and Chau-Wen Tseng. An overview of the SUIF compiler for scalable parallel machines. In PPSC, pages 662–667, 1995. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O’Reilly, and Saman Amarasinghe. OpenTuner: An extensible framework for program autotuning. In Proceedings of the 23rd International Conference on Parallel Architectures and Compilation, PACT ’14, pages 303–316, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter, L. Dagum, R. A. Fatoohi, P. O. Frederickson, T. A. Lasinski, R. S. Schreiber, H. D. Simon, V. Venkatakrishnan, and S. K. Weeratunga. The NAS parallel benchmarks— summary and preliminary results. In Proceedings of the 1991 ACM/IEEE Conference on Supercomputing, Supercomputing ’91, pages 158–165, New York, NY, USA, 1991. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Gilles Barthe, Juan Manuel Crespo, Sumit Gulwani, Cesar Kunz, and Mark Marron. From relational verification to SIMD loop synthesis. In Proceedings of the 18th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP ’13, pages 123–134, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. John R. Baumgardner. Three-dimensional treatment of convective flow in the earth’s mantle. Journal of Statistical Physics, 39(5-6):501–511, 1985.Google ScholarGoogle ScholarCross RefCross Ref
  9. Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, and Jim Demmel. Optimizing matrix multiply using PHiPAC: A portable, high-performance, ANSI C coding methodology. In Proceedings of the 11th International Conference on Supercomputing, ICS ’97, pages 340–347, New York, NY, USA, 1997. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Aaron R. Bradley. SAT-based model checking without unrolling. In Proceedings of the 12th International Conference on Verification, Model Checking, and Abstract Interpretation, VMCAI’11, pages 70–87, 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  11. Aaron R. Bradley, Zohar Manna, and Henny B. Sipma. What’s decidable about arrays? In Proceedings of the 7th International Conference on Verification, Model Checking, and Abstract Interpretation, VMCAI’06, pages 427–442, Berlin, Heidelberg, 2006. Springer-Verlag. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Peter E Bulychev, Egor V Kostylev, and Vladimir A Zakharov. Anti-unification algorithms and their applications in program analysis. In Perspectives of Systems Informatics, pages 413– 423. Springer, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Bryan Catanzaro, Shoaib Kamil, Yunsup Lee, Krste Asanovic, James Demmel, Kurt Keutzer, John Shalf, Kathy Yelick, and Armando Fox. SEJITS: Getting productivity and performance with selective embedded JIT specialization. In Workshop on Programmable Models for Emerging Architecture (PMEA), 2009.Google ScholarGoogle Scholar
  14. B.L. Chamberlain, D. Callahan, and H.P. Zima. Parallel programmability and the Chapel language. International Journal High Performance Computing Applications, 21(3):291– 312, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Alvin Cheung, Armando Solar-Lezama, and Samuel Madden. Optimizing database-backed applications with query synthesis. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, pages 3–14, New York, NY, USA, 2013. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. M. Christen, O. Schenk, and H. Burkhart. PATUS: A code generation and autotuning framework for parallel iterative stencil computations on modern microarchitectures. In International Parallel Distributed Processing Symposium (IPDPS), pages 676–687, May 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Alessandro Cimatti and Alberto Griggio. Software model checking via IC3. In Proceedings of the 24th International Conference on Computer Aided Verification, CAV’12, pages 277–293, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Kaushik Datta, Mark Murphy, Vasily Volkov, Samuel Williams, Jonathan Carter, Leonid Oliker, David Patterson, John Shalf, and Katherine Yelick. Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. In Proceedings of the 2008 ACM/IEEE Conference on Supercomputing, SC ’08, pages 4:1–4:12, Piscataway, NJ, USA, 2008. IEEE Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Kei Davis and Daniel J. Quinlan. ROSE: An optimizing transformation system for C++ array-class libraries. In Workshop on Object-Oriented Technology, ECOOP ’98, pages 452–453, London, UK, UK, 1998. Springer-Verlag. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient SMT solver. In Proceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS’08/ETAPS’08, pages 337–340, Berlin, Heidelberg, 2008. Springer-Verlag. Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Michael D. Ernst, Jake Cockrell, William G. Griswold, and David Notkin. Dynamically discovering likely program invariants to support program evolution. In Proceedings of the 21st International Conference on Software Engineering, ICSE ’99, pages 213–224, New York, NY, USA, 1999. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Paul Feautrier. Dataflow analysis of array and scalar references. International Journal of Parallel Programming, 20(1):23–53, 1991.Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Cormac Flanagan and Shaz Qadeer. Predicate abstraction for software verification. In Proceedings of the 29th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’02, pages 191–202, New York, NY, USA, 2002. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Matteo Frigo and Volker Strumpen. Cache oblivious stencil computations. In Proceedings of the 19th Annual International Conference on Supercomputing, ICS ’05, pages 361–366, New York, NY, USA, 2005. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Pranav Garg, Christof Löding, P. Madhusudan, and Daniel Neider. ICE: A robust framework for learning invariants. In Armin Biere and Roderick Bloem, editors, Computer Aided Verification, volume 8559 of Lecture Notes in Computer Science, pages 69–87. Springer International Publishing, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Tobias Grosser, Albert Cohen, Justin Holewinski, P. Sadayappan, and Sven Verdoolaege. Hybrid hexagonal/classical tiling for gpus. In Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization, CGO ’14, pages 66:66–66:75, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Tobias Grosser, Sven Verdoolaege, Albert Cohen, and P. Sadayappan. The relation between diamond tiling and hexagonal tiling. Parallel Processing Letters, 24(03):1441002, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  28. Michael A. Heroux, Douglas W. Doerfler, Paul S. Crozier, James M. Willenbring, H. Carter Edwards, Alan Williams, Mahesh Rajan, Eric R. Keiter, Heidi K. Thornquist, and Robert W. Numrich. Improving performance via mini-applications. Technical report, Sandia National Laboratory, 2009.Google ScholarGoogle Scholar
  29. C. A. R. Hoare. An axiomatic basis for computer programming. Communications of the ACM, 12(10):576–580, October 1969. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Intel. The X10 parallel programming language. http:// x10-lang.org.Google ScholarGoogle Scholar
  31. Francois Irigoin and Remi Triolet. Supernode partitioning. In Symposium on Principles of Programming Languages (POPL’88), pages 319–328, San Diego, CA, January 1988. Google ScholarGoogle ScholarDigital LibraryDigital Library
  32. Ming-Yee Iu and Willy Zwaenepoel. HadoopToSQL: A MapReduce query optimizer. In Proceedings of the 5th European Conference on Computer Systems, EuroSys ’10, pages 251–264, New York, NY, USA, 2010. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Jinseong Jeon, Xiaokang Qiu, Armando Solar-Lezama, and Jeffrey S. Foster. Adaptive concretization for parallel program synthesis. In Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part II, pages 377–394, 2015.Google ScholarGoogle Scholar
  34. Rajeev Joshi, Greg Nelson, and Keith Randall. Denali: A goal-directed superoptimizer. pages 304–314, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. David Joyner, Ondˇrej ˇ Cert´ık, Aaron Meurer, and Brian E. Granger. Open source computer algebra systems: SymPy. ACM Communications in Computer Algebra, 45(3/4):225–234, January 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Shoaib Kamil. A cross-domain stencil benchmark suite. In Workshop on Optimizing Stencil Computations, 2013.Google ScholarGoogle Scholar
  37. Shoaib Kamil, Cy Chan, Leonid Oliker, John Shalf, and Samuel Williams. An auto-tuning framework for parallel multicore stencil computations. In International Parallel and Distributed Processing Symposium (IPDPS), pages 1–12, 2010.Google ScholarGoogle ScholarCross RefCross Ref
  38. Aleksandr Karbyshev, Nikolaj Bjørner, Shachar Itzhaky, Noam Rinetzky, and Sharon Shoham. Property-directed inference of universal invariants or proving their absence. In Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part I, pages 583–602, 2015.Google ScholarGoogle Scholar
  39. Martin Kong, Richard Veras, Kevin Stock, Franz Franchetti, Louis-No¨el Pouchet, and P. Sadayappan. When polyhedral transformations meet SIMD code generation. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, pages 127– 138, New York, NY, USA, 2013. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. A.C. Mallinson, D.A. Beckingsale, W.P. Gaudin, J.A. Herdman, J.M. Levesque, and S.A. Jarvis. Cloverleaf: Preparing hydrodynamics codes for exascale. In Cray User Group, 2013.Google ScholarGoogle Scholar
  41. Henry Massalin. Superoptimizer: A look at the smallest program. In Proceedings of the Second International Conference on Architectual Support for Programming Languages and Operating Systems, ASPLOS II, pages 122–126, Los Alamitos, CA, USA, 1987. IEEE Computer Society Press. Google ScholarGoogle ScholarDigital LibraryDigital Library
  42. John Matthews, J. Strother Moore, Sandip Ray, and Daron Vroon. Verification condition generation via theorem proving. In Proceedings of the 13th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, LPAR’06, pages 362–376, 2006. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Charith Mendis, Jeffrey Bosboom, Kevin Wu, Shoaib Kamil, Jonathan Ragan-Kelley, Sylvain Paris, Qin Zhao, and Saman Amarasinghe. Helium: Lifting high-performance stencil kernels from stripped x86 binaries to Halide DSL code. In ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, June 2015. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Ravi Teja Mullapudi, Vinay Vasista, and Uday Bondhugula. PolyMage: Automatic optimization for image processing pipelines. In Proceedings of the Twentieth International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS ’15, pages 429–443, New York, NY, USA, 2015. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Phitchaya Mangpo Phothilimthana, Tikhon Jelvis, Rohin Shah, Nishant Totla, Sarah Chasins, and Rastislav Bodik. Chlorophyll: Synthesis-aided compiler for low-power spatial architectures. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, pages 396–407, New York, NY, USA, 2014. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Gordon D Plotkin. A note on inductive generalization. Machine Intelligence, 5(1):153–163, 1970.Google ScholarGoogle Scholar
  47. Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe. Halide: A language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines. In Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  48. John C. Reynolds. Transformational systems and the algebraic structure of atomic formulas. In Bernard Meltzer and Donald Michie, editors, Machine Intelligence 5, pages 135–151. Edinburgh University Press, Edinburgh, Scotland, 1969.Google ScholarGoogle Scholar
  49. Gabriel Rivera and Chau-Wen Tseng. Tiling optimizations for 3D scientific computations. In Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, SC ’00, Washington, DC, USA, 2000. IEEE Computer Society. Google ScholarGoogle ScholarDigital LibraryDigital Library
  50. Eric Schkufza, Rahul Sharma, and Alex Aiken. Stochastic superoptimization. SIGPLAN Notices, 48(4):305–316, March 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  51. Eric Schkufza, Rahul Sharma, and Alex Aiken. Stochastic optimization of floating-point programs with tunable precision. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’14, Edinburgh, United Kingdom - June 09 - 11, 2014, page 9, 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  52. Armando Solar Lezama. Program Synthesis By Sketching. PhD thesis, EECS Department, University of California, Berkeley, Dec 2008.Google ScholarGoogle Scholar
  53. Armando Solar-Lezama, Gilad Arnold, Liviu Tancau, Rastislav Bodk, Vijay A. Saraswat, and Sanjit A. Seshia. Sketching stencils. In Jeanne Ferrante and Kathryn S. McKinley, editors, ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 167–178. ACM, 2007. Google ScholarGoogle ScholarDigital LibraryDigital Library
  54. Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Sanjit Seshia, and Vijay Saraswat. Combinatorial sketching for finite programs. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XII, pages 404–415, New York, NY, USA, 2006. ACM. Google ScholarGoogle ScholarDigital LibraryDigital Library
  55. Arvind K. Sujeeth, Kevin J. Brown, Hyoukjoong Lee, Tiark Rompf, Hassan Chafi, Martin Odersky, and Kunle Olukotun. Delite: A compiler architecture for performance-oriented embedded domain-specific languages. ACM Transactions in Embedded Computer Systems, 13(4s):134:1–134:25, April 2014. Google ScholarGoogle ScholarDigital LibraryDigital Library
  56. Yuan Tang, Rezaul Chowdhury, Bradley C. Kuszmaul, Chi keung Luk, and Charles E. Leiserson. The Pochoir stencil compiler. In ACM Symposium on Parallelism in Algorithms and Architectures, (SPAA), 2011. Google ScholarGoogle ScholarDigital LibraryDigital Library
  57. Abhishek Udupa, Arun Raghavan, Jyotirmoy V. Deshmukh, Sela Mador-Haim, Milo M. K. Martin, and Rajeev Alur. TRANSIT: specifying protocols with concolic snippets. In ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’13, Seattle, WA, USA, June 16-19, 2013, pages 287–296, 2013. Google ScholarGoogle ScholarDigital LibraryDigital Library
  58. Diego Vasco, Nelson Moraga, and Gundolf Haase. Parallel finite volume method simulation of three-dimensional fluid flow and convective heat transfer for viscoplastic non-newtonian fluids. J. Numerical Heat Transfer, Part A: Applications, 66(2):990–1019, 2014.Google ScholarGoogle ScholarCross RefCross Ref
  59. Diego Vasco, Nelson Moraga, Aurel Neic, and Gundolf Haase. OpenMP parallel acceleration of a 3D finite volume method based code for simulation of non-Newtonian fluid flows. In Sergio Gutiérrez, Daniel Hurtado, and Esteban Sáez, editors, Cuadernos de Mecánica Computacional, pages 108–117, Concepci´on, Chile, 2011. Sociedad Chilena de Mecánica Computacional.Google ScholarGoogle Scholar
  60. Ben Wiedermann, Ali Ibrahim, and William R. Cook. Interprocedural query extraction for transparent persistence. In Proceedings of the 23rd ACM SIGPLAN Conference on Objectoriented Programming Systems Languages and Applications, pages 19–36, 2008. Google ScholarGoogle ScholarDigital LibraryDigital Library
  61. Glynn Winskel. The Formal Semantics of Programming Languages: An Introduction. MIT Press, Cambridge, MA, USA, 1993. Google ScholarGoogle ScholarDigital LibraryDigital Library
  62. Christoph M. Wintersteiger, Youssef Hamadi, and Leonardo Mendonc¸a de Moura. Efficiently solving quantified bit-vector formulas. In Proceedings of 10th International Conference on Formal Methods in Computer-Aided Design, FMCAD 2010, Lugano, Switzerland, October 20-23, pages 239–246, 2010. Google ScholarGoogle ScholarDigital LibraryDigital Library
  63. David Wonnacott. Achieving scalable locality with time skewing. International Journal of Parallel Programming, 30(3):181–221, 2002. Google ScholarGoogle ScholarDigital LibraryDigital Library
  64. Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael J. Franklin, Scott Shenker, and Ion Stoica. Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In Proceedings of the 9th USENIX Conference on Networked Systems Design and Implementation, NSDI’12, 2012. Google ScholarGoogle ScholarDigital LibraryDigital Library
  65. Fubo Zhang and Erik H. D’Hollander. Using hammock graphs to structure programs. IEEE Transactions on Software Engineering, 30(4):231–245, 2004. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Verified lifting of stencil computations

      Recommendations

      Comments

      Login options

      Check if you have access through your login credentials or your institution to get full access on this article.

      Sign in

      Full Access

      • Published in

        cover image ACM SIGPLAN Notices
        ACM SIGPLAN Notices  Volume 51, Issue 6
        PLDI '16
        June 2016
        726 pages
        ISSN:0362-1340
        EISSN:1558-1160
        DOI:10.1145/2980983
        • Editor:
        • Andy Gill
        Issue’s Table of Contents
        • cover image ACM Conferences
          PLDI '16: Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation
          June 2016
          726 pages
          ISBN:9781450342612
          DOI:10.1145/2908080
          • General Chair:
          • Chandra Krintz,
          • Program Chair:
          • Emery Berger

        Copyright © 2016 ACM

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 2 June 2016

        Check for updates

        Qualifiers

        • article

      PDF Format

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader
      About Cookies On This Site

      We use cookies to ensure that we give you the best experience on our website.

      Learn more

      Got it!