Abstract
We show that reverse-mode automatic differentiation (AD), a generalized gradient-calculation operator, can be incorporated as a first-class function in an augmented lambda calculus, and therefore into a functional programming language. Closure is achieved: the new operator can be applied to any expression in the augmented language, and the result is again an expression in that language. This requires resolving two major technical issues: (a) how to transform nested lambda expressions, including those with free-variable references, and (b) how to support self-application of the AD machinery. AD transformations preserve certain complexity properties, among them that the reverse phase of the reverse-mode AD transformation of a function has the same temporal complexity as the original untransformed function. First-class unrestricted AD operators increase the expressive power available to the numeric programmer, and may have significant practical implications for the construction of numeric software that is robust, modular, concise, correct, and efficient.
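To make the idea of a gradient operator as an ordinary higher-order function concrete, here is a minimal Python sketch. It is not the transformation developed in the paper: it uses operator overloading and a recorded computation graph rather than a source-level transformation of lambda expressions, it handles only scalar functions of one argument, and it does not support self-application (issue (b) above). The names `Var`, `lift`, `sin`, and `grad` are hypothetical and chosen for illustration only.

```python
import math


class Var:
    """A number together with a record of how it was computed (illustrative helper)."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each entry is (parent Var, partial derivative of self w.r.t. that parent).
        self.parents = parents
        # Sensitivity of the final output w.r.t. this node, filled in by the reverse phase.
        self.sens = 0.0

    def __add__(self, other):
        other = lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def lift(x):
    """Wrap a plain number as a constant Var; leave Vars unchanged."""
    return x if isinstance(x, Var) else Var(float(x))


def sin(x):
    """Sine primitive that records its local derivative, cos(x)."""
    x = lift(x)
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def grad(f):
    """Return a function computing df/dx for a scalar function f of one argument."""

    def df(x):
        inp = Var(float(x))
        out = lift(f(inp))  # forward phase: run f and record the graph

        # Order the nodes so that every node comes after all of its parents.
        order, seen = [], set()

        def visit(v):
            if id(v) in seen:
                return
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            order.append(v)

        visit(out)

        # Reverse phase: push sensitivities from the output back to the input,
        # visiting each recorded node exactly once.
        out.sens = 1.0
        for v in reversed(order):
            for parent, partial in v.parents:
                parent.sens += v.sens * partial
        return inp.sens

    return df


# A nested lambda whose inner body refers to the free variable c.
scale = lambda c: (lambda x: c * x * x + sin(x))
slope = grad(scale(3.0))(2.0)  # analytically: 6 * 2 + cos(2)
```

Because `grad` takes a function and returns a function, it can be passed around and applied to closures such as `scale(3.0)` like any other value. The reverse sweep touches each recorded node once, so in this sketch its cost is proportional to the number of operations performed in the forward phase, which is the flavor of complexity property the abstract describes.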
Reverse-mode AD in a functional framework: Lambda the ultimate backpropagator