Abstract
Where dual-numbers forward-mode automatic differentiation (AD) pairs each scalar value with its tangent value, dual-numbers reverse-mode AD attempts to achieve reverse AD using a similarly simple idea: by pairing each scalar value with a backpropagator function. Its correctness and efficiency on higher-order input languages have been analysed by Brunel, Mazza and Pagani, but this analysis used a custom operational semantics for which it is unclear whether it can be implemented efficiently. We take inspiration from their use of linear factoring to optimise dual-numbers reverse-mode AD to an algorithm that has the correct complexity and enjoys an efficient implementation in a standard functional language with support for mutable arrays, such as Haskell. Aside from the linear factoring ingredient, our optimisation steps consist of well-known ideas from the functional programming community. We demonstrate the use of our technique by providing a practical implementation that differentiates most of Haskell98.
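The contrast between the two pairings can be illustrated with a minimal sketch. This is the naive formulation only, not the optimised algorithm the abstract describes, and every name in it (`FwdDual`, `RevDual`, `merge`, `rev_input`, `f`) is hypothetical, chosen for this illustration. It assumes gradients are represented as dictionaries from input index to accumulated derivative.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumption for this sketch: a gradient is a sparse map
# from input index to accumulated derivative.
Grad = Dict[int, float]


def merge(a: Grad, b: Grad) -> Grad:
    """Add two sparse gradients pointwise."""
    out = dict(a)
    for k, v in b.items():
        out[k] = out.get(k, 0.0) + v
    return out


@dataclass(frozen=True)
class FwdDual:
    """Forward mode: a scalar paired with its tangent."""
    primal: float
    tangent: float

    def __add__(self, other):
        return FwdDual(self.primal + other.primal,
                       self.tangent + other.tangent)

    def __mul__(self, other):
        # Product rule on tangents.
        return FwdDual(self.primal * other.primal,
                       self.primal * other.tangent + self.tangent * other.primal)


@dataclass(frozen=True)
class RevDual:
    """Reverse mode: a scalar paired with a backpropagator.

    backprop(d) takes the cotangent d of this value and returns its
    contribution to the gradient with respect to the inputs.
    """
    primal: float
    backprop: Callable[[float], Grad]

    def __add__(self, other):
        return RevDual(self.primal + other.primal,
                       lambda d: merge(self.backprop(d), other.backprop(d)))

    def __mul__(self, other):
        # Each operand receives the cotangent scaled by the other operand.
        return RevDual(self.primal * other.primal,
                       lambda d: merge(self.backprop(d * other.primal),
                                       other.backprop(d * self.primal)))


def rev_input(index: int, value: float) -> RevDual:
    """Wrap the input at position `index`; its backpropagator is a one-hot gradient."""
    return RevDual(value, lambda d: {index: d})


def f(x, y):
    """Example function x*y + x*x, written once and usable in both modes."""
    return x * y + x * x


# Forward mode: one pass per input direction (here d/dx at x=3, y=4).
t = f(FwdDual(3.0, 1.0), FwdDual(4.0, 0.0))
print(t.primal, t.tangent)          # 21.0 10.0

# Reverse mode: one pass yields the whole gradient.
r = f(rev_input(0, 3.0), rev_input(1, 4.0))
print(r.primal, r.backprop(1.0))    # 21.0 {0: 10.0, 1: 3.0}
```

In this naive form the backpropagator of a shared value such as `x` is invoked once per use, so repeated sharing can multiply the work. That is the kind of inefficiency the abstract's optimisation steps are meant to remove.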
- Martín Abadi and Gordon D. Plotkin. 2020. A simple differentiable programming language. Proc. ACM Program. Lang., 4, POPL (2020), 38:1–38:28. https://doi.org/10.1145/3371106
- Atilim Gunes Baydin, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. 2017. Automatic Differentiation in Machine Learning: a Survey. J. Mach. Learn. Res., 18 (2017), 153:1–153:43. http://jmlr.org/papers/v18/17-468.html
- Jean-Philippe Bernardy, Mathieu Boespflug, Ryan R. Newton, Simon Peyton Jones, and Arnaud Spiwack. 2018. Linear Haskell: practical linearity in a higher-order polymorphic language. Proc. ACM Program. Lang., 2, POPL (2018), 5:1–5:29. https://doi.org/10.1145/3158093
- Aloïs Brunel, Damiano Mazza, and Michele Pagani. 2020. Backpropagation in the simply typed lambda-calculus with linear negation. Proc. ACM Program. Lang., 4, POPL (2020), 64:1–64:27. https://doi.org/10.1145/3371132
- Paulo Emílio de Vilhena and François Pottier. 2021. Verifying a Minimalist Reverse-Mode AD Library. arXiv preprint arXiv:2112.07292.
- Conal Elliott. 2018. The simple essence of automatic differentiation. Proc. ACM Program. Lang., 2, ICFP (2018), 70:1–70:29. https://doi.org/10.1145/3236765
- Andreas Griewank and Andrea Walther. 2008. Evaluating derivatives - principles and techniques of algorithmic differentiation, Second Edition. SIAM. isbn:978-0-89871-659-7 https://doi.org/10.1137/1.9780898717761
- R. John M. Hughes. 1986. A Novel Representation of Lists and its Application to the Function "reverse". Inf. Process. Lett., 22, 3 (1986), 141–144. https://doi.org/10.1016/0020-0190(86)90059-1
- Mathieu Huot, Sam Staton, and Matthijs Vákár. 2020. Correctness of Automatic Differentiation via Diffeologies and Categorical Gluing. In Foundations of Software Science and Computation Structures - 23rd International Conference, FOSSACS 2020, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020, Dublin, Ireland, April 25-30, 2020, Proceedings, Jean Goubault-Larrecq and Barbara König (Eds.) (Lecture Notes in Computer Science, Vol. 12077). Springer, 319–338. https://doi.org/10.1007/978-3-030-45231-5_17
- Edward Kmett and contributors. 2021. ad: Automatic Differentiation. https://hackage.haskell.org/package/ad
- Faustyna Krawiec, Simon Peyton Jones, Neel Krishnaswami, Tom Ellis, Richard A. Eisenberg, and Andrew W. Fitzgibbon. 2022. Provably correct, asymptotically efficient, higher-order reverse-mode automatic differentiation. Proc. ACM Program. Lang., 6, POPL (2022), 1–30. https://doi.org/10.1145/3498710
- John Launchbury and Simon L. Peyton Jones. 1994. Lazy Functional State Threads. In Proceedings of the ACM SIGPLAN’94 Conference on Programming Language Design and Implementation (PLDI), Orlando, Florida, USA, June 20-24, 1994, Vivek Sarkar, Barbara G. Ryder, and Mary Lou Soffa (Eds.). ACM, 24–35. https://doi.org/10.1145/178243.178246
- Seppo Linnainmaa. 1970. The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master’s Thesis (in Finnish), Univ. Helsinki.
- Charles C. Margossian. 2019. A review of automatic differentiation and its efficient implementation. Wiley Interdiscip. Rev. Data Min. Knowl. Discov., 9, 4 (2019), https://doi.org/10.1002/widm.1305
- Damiano Mazza and Michele Pagani. 2021. Automatic differentiation in PCF. Proc. ACM Program. Lang., 5, POPL (2021), 1–27. https://doi.org/10.1145/3434309
- Fernando Lucatelli Nunes and Matthijs Vákár. 2021. CHAD for Expressive Total Languages. CoRR, abs/2110.00446 (2021), arXiv:2110.00446.
- Fernando Lucatelli Nunes and Matthijs Vákár. 2022. Automatic Differentiation for ML-family languages: correctness via logical relations. CoRR, abs/2210.07724 (2022), arXiv:2210.07724.
- Fernando Lucatelli Nunes and Matthijs Vákár. 2022. Logical Relations for Partial Features and Automatic Differentiation Correctness. CoRR, abs/2210.08530 (2022), arXiv:2210.08530.
- Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. 2017. Automatic differentiation in PyTorch. In NIPS 2017 Autodiff Workshop: The future of gradient-based machine learning software and techniques. Curran Associates, Inc., Red Hook, NY, USA.
- Adam Paszke, Daniel Johnson, David Duvenaud, Dimitrios Vytiniotis, Alexey Radul, Matthew Johnson, Jonathan Ragan-Kelley, and Dougal Maclaurin. 2021. Getting to the Point. Index Sets and Parallelism-Preserving Autodiff for Pointful Array Programming. CoRR, abs/2104.05372 (2021), arXiv:2104.05372.
- Barak A. Pearlmutter and Jeffrey Mark Siskind. 2008. Reverse-mode AD in a functional framework: Lambda the ultimate backpropagator. ACM Trans. Program. Lang. Syst., 30, 2 (2008), 7:1–7:36. https://doi.org/10.1145/1330017.1330018
- John C. Reynolds. 1998. Definitional Interpreters for Higher-Order Programming Languages. High. Order Symb. Comput., 11, 4 (1998), 363–397. https://doi.org/10.1023/A:1010027404223
- Robert Schenck, Ola Rønning, Troels Henriksen, and Cosmin E. Oancea. 2022. AD for an Array Language with Nested Parallelism. CoRR, abs/2202.10297 (2022), arXiv:2202.10297.
- Amir Shaikhha, Andrew Fitzgibbon, Dimitrios Vytiniotis, and Simon Peyton Jones. 2019. Efficient differentiable programming in a functional array-processing language. Proc. ACM Program. Lang., 3, ICFP (2019), 97:1–97:30. https://doi.org/10.1145/3341701
- Tim Sheard and Simon L. Peyton Jones. 2002. Template meta-programming for Haskell. ACM SIGPLAN Notices, 37, 12 (2002), 60–75. https://doi.org/10.1145/636517.636528
- Jesse Sigal. 2021. Automatic differentiation via effects and handlers: An implementation in Frank. arXiv preprint arXiv:2101.08095.
- Tom Smeding and Matthijs Vákár. 2022. Artifact for Efficient Dual-Numbers Reverse AD via Well-Known Program Transformations. https://doi.org/10.5281/zenodo.7130343
- Tom Smeding and Matthijs Vákár. 2022. Efficient Dual-Numbers Reverse AD via Well-Known Program Transformations. CoRR, abs/2207.03418v2 (2022), https://doi.org/10.48550/arXiv.2207.03418 arXiv:2207.03418v2.
- B. Speelpenning. 1980. Compiling fast partial derivatives of functions given by algorithms. Illinois University. https://doi.org/10.2172/5254402
- Matthijs Vákár. 2021. Reverse AD at Higher Types: Pure, Principled and Denotationally Correct. In Programming Languages and Systems, Nobuko Yoshida (Ed.) (Lecture Notes in Computer Science, Vol. 12648). Springer, 607–634. https://doi.org/10.1007/978-3-030-72019-3_22
- Matthijs Vákár and Tom Smeding. 2022. CHAD: Combinatory Homomorphic Automatic Differentiation. ACM Trans. Program. Lang. Syst., 44, 3 (2022), 20:1–20:49. https://doi.org/10.1145/3527634
- Dimitrios Vytiniotis, Dan Belov, Richard Wei, Gordon Plotkin, and Martin Abadi. 2019. The differentiable curry. NeurIPS Workshop on Program Transformations.
- Fei Wang and Tiark Rompf. 2018. A Language and Compiler View on Differentiable Programming. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Workshop Track Proceedings. OpenReview.net. https://openreview.net/forum?id=SJxJtYkPG
- R. E. Wengert. 1964. A simple automatic derivative evaluation program. Commun. ACM, 7, 8 (1964), 463–464. https://doi.org/10.1145/355586.364791