research-article
Open Access

Coarsening optimization for differentiable programming

Published: 15 October 2021

Abstract

This paper presents a novel optimization for differentiable programming named coarsening optimization. It offers a systematic way to synergize symbolic differentiation and algorithmic differentiation (AD). With it, each step in AD can differentiate a computation much larger than a single operation, which greatly reduces the runtime computations and data allocations in AD. To circumvent the difficulties that control flow creates for symbolic differentiation in coarsening, this work introduces phi-calculus, a novel method that allows symbolic reasoning about, and differentiation of, computations that involve branches and loops. It further avoids "expression swell" in symbolic differentiation and balances reuse and coarsening through the design of reuse-centric segment-of-interest identification. Experiments on a collection of real-world applications show that coarsening optimization is effective in speeding up AD, producing speedups ranging from several times to two orders of magnitude.
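The granularity idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's implementation: `Tape`, `Var`, `fine_grained`, and `coarsened` are hypothetical names, and the closed-form derivative inside `coarsened` is written by hand, where a coarsening optimizer would presumably derive it symbolically. It only shows that a reverse-mode tape can record one entry per operation or one entry for a whole segment, and that both yield the same gradient.

```python
import math


class Tape:
    """Records one entry per differentiated step for the reverse sweep."""

    def __init__(self):
        # Each entry: (output Var, [(input Var, local partial derivative), ...])
        self.entries = []


class Var:
    """A value tracked by a toy tape-based reverse-mode AD (illustrative only)."""

    def __init__(self, value, tape):
        self.value = value
        self.grad = 0.0
        self.tape = tape

    def _record(self, value, partials):
        out = Var(value, self.tape)
        self.tape.entries.append((out, partials))
        return out

    def __add__(self, other):
        return self._record(self.value + other.value,
                            [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return self._record(self.value * other.value,
                            [(self, other.value), (other, self.value)])


def sin(v):
    return v._record(math.sin(v.value), [(v, math.cos(v.value))])


def backward(out):
    """Reverse sweep: propagate adjoints through the tape, last entry first."""
    out.grad = 1.0
    for node, partials in reversed(out.tape.entries):
        for inp, d in partials:
            inp.grad += node.grad * d


def fine_grained(x):
    # y = sin(x) * x + x * x, differentiated one operation at a time:
    # sin, mul, mul, add -> four tape entries.
    return sin(x) * x + x * x


def coarsened(x):
    # Same y, but the whole segment is one tape entry whose derivative
    # is a closed form (hand-derived here as a stand-in for symbolic
    # differentiation): dy/dx = cos(x) * x + sin(x) + 2 * x.
    v = x.value
    value = math.sin(v) * v + v * v
    dydx = math.cos(v) * v + math.sin(v) + 2.0 * v
    return x._record(value, [(x, dydx)])


t1 = Tape()
x1 = Var(1.5, t1)
y1 = fine_grained(x1)
backward(y1)

t2 = Tape()
x2 = Var(1.5, t2)
y2 = coarsened(x2)
backward(y2)

print("tape entries:", len(t1.entries), "vs", len(t2.entries))
print("gradients:", x1.grad, "vs", x2.grad)
```

In this toy setting the coarsened version allocates fewer intermediate `Var` objects and performs a shorter reverse sweep, which is the kind of saving the abstract attributes to larger differentiation granularity. The example has no branches or loops, so it does not touch the control-flow handling the paper addresses.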


Supplemental Material

Auxiliary Presentation Video

This is the presentation video for our OOPSLA 2021 paper, "Coarsening Optimization for Differentiable Programming". The paper offers a systematic way to synergize symbolic differentiation and algorithmic differentiation (AD). With it, each step in AD can differentiate a computation much larger than a single operation, leading to speedups of up to two orders of magnitude. To circumvent the difficulties that control flow creates for symbolic differentiation in coarsening, this work introduces 𝜙-calculus, a novel method that allows symbolic reasoning about, and differentiation of, computations that involve branches and loops. It further avoids "expression swell" in symbolic differentiation and balances reuse and coarsening through the design of reuse-centric segment-of-interest identification.

