research-article
Open Access

Backpropagation in the simply typed lambda-calculus with linear negation

Published: 20 December 2019

Abstract

Backpropagation is a classic automatic differentiation algorithm that computes the gradient of functions specified by a certain class of simple, first-order programs, called computational graphs. It is a fundamental tool in several fields, most notably machine learning, where it is the key to efficiently training (deep) neural networks. Recent years have witnessed the rapid growth of a research field called differentiable programming, which aims to express computational graphs more synthetically and modularly by resorting to actual programming languages endowed with control flow operators and higher-order combinators, such as map and fold. In this paper, we extend the backpropagation algorithm to a paradigmatic example of such a programming language: we define a compositional program transformation from the simply typed lambda-calculus to itself, augmented with a notion of linear negation, and prove that it computes the gradient of the source program with the same efficiency as first-order backpropagation. The transformation is completely effect-free and thus provides a purely logical understanding of the dynamics of backpropagation.
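To fix intuitions, first-order backpropagation on a computational graph can be sketched as follows. This is only an illustrative reverse-mode AD implementation under assumed names (`Node`, `backprop`), not the paper's effect-free lambda-calculus transformation — unlike the paper's approach, this sketch relies on mutable state (the `grad` field) to accumulate adjoints.

```python
# Minimal reverse-mode AD (backpropagation) sketch on a computational graph.
# Hypothetical names throughout; NOT the paper's linear-negation transformation.
import math

class Node:
    def __init__(self, value, parents=()):
        self.value = value        # forward value
        self.parents = parents    # pairs (parent node, local partial derivative)
        self.grad = 0.0           # accumulated adjoint (mutable state)

def add(a, b):
    return Node(a.value + b.value, [(a, 1.0), (b, 1.0)])

def mul(a, b):
    return Node(a.value * b.value, [(a, b.value), (b, a.value)])

def sin(a):
    return Node(math.sin(a.value), [(a, math.cos(a.value))])

def backprop(output):
    # Topologically order the graph, then sweep it in reverse: each node
    # contributes (its adjoint) * (local derivative) to each parent's adjoint.
    order, seen = [], set()
    def visit(n):
        if id(n) not in seen:
            seen.add(id(n))
            for p, _ in n.parents:
                visit(p)
            order.append(n)
    visit(output)
    output.grad = 1.0
    for n in reversed(order):
        for p, d in n.parents:
            p.grad += n.grad * d

# Example: f(x, y) = x*y + sin(x), evaluated at (x, y) = (2, 3).
x, y = Node(2.0), Node(3.0)
f = add(mul(x, y), sin(x))
backprop(f)
# df/dx = y + cos(x) = 3 + cos(2);  df/dy = x = 2
```

The cost of the reverse sweep is proportional to the size of the graph, which is the efficiency property the paper's compositional transformation preserves at higher order.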


Supplemental Material

a64-brunel.webm



Published in

Proceedings of the ACM on Programming Languages, Volume 4, Issue POPL
January 2020, 1984 pages
EISSN: 2475-1421
DOI: 10.1145/3377388
Copyright © 2019 Owner/Author

Publisher

Association for Computing Machinery, New York, NY, United States
