Coarsening optimization for differentiable programming

Abstract
This paper presents coarsening optimization, a novel optimization for differentiable programming. It offers a systematic way to synergize symbolic differentiation and algorithmic differentiation (AD). With coarsening, the granularity of the computations differentiated at each step of AD can grow far beyond a single operation, substantially reducing the runtime computation and data allocation that AD incurs. To circumvent the difficulties that control flow poses for symbolic differentiation in coarsening, this work introduces phi-calculus, a novel method that enables symbolic reasoning about, and differentiation of, computations involving branches and loops. The work further avoids "expression swell" in symbolic differentiation and balances reuse against coarsening through the design of reuse-centric segment-of-interest identification. Experiments on a collection of real-world applications show that coarsening optimization is effective in speeding up AD, producing speedups ranging from several times to two orders of magnitude.
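To make the core idea concrete, the sketch below contrasts operation-level AD with a coarsened alternative on a toy function. It is a minimal illustration of the granularity difference, not the paper's implementation: the `Dual` class, the function `f(x) = exp(sin(x)) * x`, and its hand-derived closed-form gradient are all hypothetical examples chosen here for exposition. Standard forward-mode AD tracks a derivative through every elementary operation; the coarsened version replaces the whole segment's derivative with one symbolically precomputed expression, eliminating the per-operation bookkeeping.

```python
import math

class Dual:
    """Forward-mode AD via dual numbers: every elementary op propagates a derivative."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        return Dual(self.val + other.val, self.dot + other.dot)
    def __mul__(self, other):
        # Product rule, applied at the granularity of a single multiplication.
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

def dsin(d):
    return Dual(math.sin(d.val), math.cos(d.val) * d.dot)

def dexp(d):
    return Dual(math.exp(d.val), math.exp(d.val) * d.dot)

def f_ad(x):
    """Operation-level AD: each of sin, exp, * is differentiated separately."""
    d = Dual(x, 1.0)          # seed dx/dx = 1
    return dexp(dsin(d)) * d  # f(x) = exp(sin(x)) * x

def f_coarse_grad(x):
    """Coarsened: the segment's derivative precomputed symbolically as one
    closed-form expression, f'(x) = exp(sin(x)) * (x*cos(x) + 1)."""
    return math.exp(math.sin(x)) * (x * math.cos(x) + 1.0)
```

Both routes compute the same gradient (`f_ad(x).dot == f_coarse_grad(x)` up to floating-point error), but the coarsened version performs no runtime derivative propagation or intermediate allocation, which is the source of the speedups the paper reports.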