Abstract
Deep learning is moving towards increasingly sophisticated optimization objectives that employ higher-order functions, such as integration, continuous optimization, and root-finding. Since differentiable programming frameworks such as PyTorch and TensorFlow do not have first-class representations of these functions, developers must reason about the semantics of such objectives and manually translate them to differentiable code.
We present a differentiable programming language, λS, that is the first to deliver a semantics for higher-order functions, higher-order derivatives, and Lipschitz but nondifferentiable functions. Together, these features enable λS to expose differentiable, higher-order functions for integration, optimization, and root-finding as first-class functions with automatically computed derivatives. λS's semantics is computable, meaning that values can be computed to arbitrary precision, and we implement λS as an embedded language in Haskell.
We use λS to construct novel differentiable libraries for representing probability distributions, implicit surfaces, and generalized parametric surfaces – all as instances of higher-order datatypes – and present case studies that rely on computing the derivatives of these higher-order functions and datatypes. In addition to modeling existing differentiable algorithms, such as a differentiable ray tracer for implicit surfaces, without requiring any user-level differentiation code, we demonstrate new differentiable algorithms, such as the Hausdorff distance of generalized parametric surfaces.
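As a rough illustration of differentiating through a higher-order function (this is not the paper's implementation: λS is embedded in Haskell and uses exact real arithmetic, whereas this sketch uses floats and a finite midpoint Riemann sum), forward-mode dual numbers can push a derivative through an integration operator that takes a function as its argument:

```python
# Hypothetical sketch, not the lambda_S semantics: forward-mode AD via
# dual numbers, differentiating through a higher-order integration
# operator approximated by a midpoint Riemann sum.
from dataclasses import dataclass

@dataclass
class Dual:
    val: float  # primal value
    tan: float  # tangent (derivative) component

    def __add__(self, other):
        return Dual(self.val + other.val, self.tan + other.tan)

    def __mul__(self, other):
        # product rule: (fg)' = f'g + fg'
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

def integrate01(f, n=1000):
    """Approximate the higher-order operator f |-> integral_0^1 f(t) dt."""
    h = 1.0 / n
    total = Dual(0.0, 0.0)
    for i in range(n):
        t = Dual((i + 0.5) * h, 0.0)  # integration variable carries no tangent
        total = total + f(t) * Dual(h, 0.0)
    return total

def deriv(f, x):
    """d/dx f(x), read off a dual number seeded with tangent 1."""
    return f(Dual(x, 1.0)).tan

# d/dtheta of integral_0^1 (theta*t)^2 dt = 2*theta/3, so ~1.333 at theta = 2
g = lambda theta: integrate01(lambda t: (theta * t) * (theta * t))
print(round(deriv(g, 2.0), 3))  # prints 1.333
```

λS generalizes this idea: integration, optimization, and root-finding are first-class functions whose results can be computed to arbitrary precision, rather than fixed-step float approximations like the sum above.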
λS: computable semantics for differentiable programming with higher-order functions and datatypes