Reverse-mode AD in a functional framework: Lambda the ultimate backpropagator

Published: 14 March 2008

Abstract

We show that reverse-mode automatic differentiation (AD), a generalized gradient-calculation operator, can be incorporated as a first-class function in an augmented lambda calculus, and therefore into a functional-programming language. Closure is achieved, in that the new operator can be applied to any expression in the augmented language, yielding an expression in that language. This requires the resolution of two major technical issues: (a) how to transform nested lambda expressions, including those with free-variable references, and (b) how to support self-application of the AD machinery. AD transformations preserve certain complexity properties, among them that the reverse phase of the reverse-mode AD transformation of a function has the same temporal complexity as the original untransformed function. First-class unrestricted AD operators increase the expressive power available to the numeric programmer, and may have significant practical implications for the construction of numeric software that is robust, modular, concise, correct, and efficient.
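
To make the idea of a first-class gradient operator concrete, here is a minimal Python sketch. It is not the source transformation on lambda calculus that the article describes; it swaps in a simpler, tape-style technique based on operator overloading, and supports only addition and multiplication. The names `Var`, `grad`, `add`, `mul`, and the integer level tags are all hypothetical, introduced for illustration. The level tags are one assumed way to keep nested applications apart, so that a free variable from an enclosing `grad` is treated as a constant by the inner one while remaining differentiable by the outer one.

```python
import itertools

_fresh_level = itertools.count(1)   # each grad call gets a new, higher level (assumption)


class Var:
    """A node in the computation graph recorded for one grad level."""

    def __init__(self, value, level, parents=()):
        self.value = value      # a float, or a Var belonging to a lower level
        self.level = level
        self.parents = parents  # (input node, local partial derivative) pairs

    def __add__(self, other):
        return add(self, other)

    def __radd__(self, other):
        return add(other, self)

    def __mul__(self, other):
        return mul(self, other)

    def __rmul__(self, other):
        return mul(other, self)


def _level(v):
    return v.level if isinstance(v, Var) else 0


def _below(v, level):
    # Unwrap nodes of this level; anything lower is a constant at this level.
    return v.value if _level(v) == level else v


def add(a, b):
    level = max(_level(a), _level(b))
    if level == 0:
        return a + b
    parents = [(v, 1.0) for v in (a, b) if _level(v) == level]
    return Var(add(_below(a, level), _below(b, level)), level, parents)


def mul(a, b):
    level = max(_level(a), _level(b))
    if level == 0:
        return a * b
    av, bv = _below(a, level), _below(b, level)
    parents = []
    if _level(a) == level:
        parents.append((a, bv))   # d(a*b)/da = b
    if _level(b) == level:
        parents.append((b, av))   # d(a*b)/db = a
    return Var(mul(av, bv), level, parents)


def _outputs_first(root):
    # Order nodes so that each one appears before every node it depends on.
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(root)
    return reversed(order)


def grad(f):
    """Return a function computing df/dx for scalar f by reverse accumulation."""
    def df(x):
        level = next(_fresh_level)
        x_in = Var(x, level)
        y = f(x_in)
        if _level(y) != level:    # the result does not depend on x
            return 0.0
        adjoint = {y: 1.0}
        # Reverse phase: each recorded node is visited once. Adjoints are built
        # with add/mul, so an enclosing grad can differentiate them again.
        for node in _outputs_first(y):
            for parent, partial in node.parents:
                adjoint[parent] = add(adjoint.get(parent, 0.0),
                                      mul(adjoint[node], partial))
        return adjoint.get(x_in, 0.0)
    return df


cube = lambda x: x * x * x
print(grad(cube)(2.0))          # 3 * x**2 at x = 2
print(grad(grad(cube))(2.0))    # grad applied to its own result
print(grad(lambda x: x * grad(lambda y: x * y)(2.0))(3.0))  # x is free in the inner lambda
```

In this sketch `grad` is an ordinary higher-order function, so it can be passed around, applied to closures, and applied to its own output. The last line is the kind of case that point (a) of the abstract is about: the inner lambda refers to the free variable `x`, and the outer derivative is only correct if that dependence survives the inner reverse phase.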
