research-article
Open Access

Symbolic and automatic differentiation of languages

Published: 19 August 2021

Abstract

Formal languages are usually defined in terms of set theory. Choosing type theory instead gives us languages as type-level predicates over strings. Applying a language to a string yields a type whose elements are language membership proofs describing how a string parses in the language. The usual building blocks of languages (including union, concatenation, and Kleene closure) have precise and compelling specifications uncomplicated by operational strategies and are easily generalized to a few general domain-transforming and codomain-transforming operations on predicates.
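
The predicate view can be sketched in ordinary code. The following is a minimal illustration, not the paper's formulation: it uses Python, replaces type-level predicates with boolean-valued functions (so the membership proofs are lost and only yes/no answers remain), and all names (`Lang`, `eps`, `sym`, `union`, `concat`, `star`) are hypothetical. Each definition is a direct specification with no parsing strategy built in: concatenation asks whether some split of the string works, and closure asks whether the string divides into pieces that each belong to the language.

```python
from typing import Callable

# A language is a predicate over strings. Assumption: bool stands in for
# the type of membership proofs, so no parse information is retained.
Lang = Callable[[str], bool]

def empty(w: str) -> bool:
    """The empty language: no string is a member."""
    return False

def eps(w: str) -> bool:
    """The language containing only the empty string."""
    return w == ""

def sym(c: str) -> Lang:
    """The language containing only the one-character string c."""
    return lambda w: w == c

def union(p: Lang, q: Lang) -> Lang:
    """w is in the union when it is in p or in q."""
    return lambda w: p(w) or q(w)

def concat(p: Lang, q: Lang) -> Lang:
    """w is in the concatenation when some split w = u + v has u in p, v in q."""
    return lambda w: any(p(w[:i]) and q(w[i:]) for i in range(len(w) + 1))

def star(p: Lang) -> Lang:
    """w is in the closure when it splits into pieces that are all in p."""
    def s(w: str) -> bool:
        if w == "":
            return True
        # Only non-empty first pieces are tried, which guarantees termination
        # and loses nothing: empty pieces can always be dropped from a split.
        return any(p(w[:i]) and s(w[i:]) for i in range(1, len(w) + 1))
    return s
```

For example, `star(concat(sym("a"), sym("b")))` accepts `"abab"` and rejects `"aba"`. These definitions are specifications rather than efficient recognizers: they search over all splits of the input.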

A simple characterization of languages (and indeed functions from lists to any type) captures the essential idea behind language “differentiation” as used for recognizing languages, leading to a collection of lemmas about type-level predicates. These lemmas are the heart of two dual parsing implementations—using (inductive) regular expressions and (coinductive) tries—each containing the same code but in dual arrangements (with representation and primitive operations trading places). The regular expression version corresponds to symbolic differentiation, while the trie version corresponds to automatic differentiation.
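
The duality described above can be sketched as follows. This is an illustrative approximation in Python rather than the paper's development: booleans replace membership proofs, lazily evaluated closures stand in for coinductive tries, and every name (`nu`, `delta`, `Trie`, `t_cat`, and so on) is an assumption made for this example. The characterization used is that a function on strings is determined by its value on the empty string (`nu`) and, for each leading character, the function on the remaining string (`delta`). In the first half, the language operations are data constructors and `nu`/`delta` are functions. In the second half, `nu`/`delta` are fields of the representation and the language operations are functions.

```python
from dataclasses import dataclass
from typing import Callable

# Symbolic style: regular expressions are data; nu and delta are functions.
@dataclass(frozen=True)
class Empty: pass              # no strings

@dataclass(frozen=True)
class Eps: pass                # only the empty string

@dataclass(frozen=True)
class Sym:
    c: str                     # the single one-character string c

@dataclass(frozen=True)
class Alt:
    p: object                  # union of p and q
    q: object

@dataclass(frozen=True)
class Cat:
    p: object                  # concatenation of p and q
    q: object

@dataclass(frozen=True)
class Star:
    p: object                  # Kleene closure of p

def nu(r) -> bool:
    """Does the language of r contain the empty string?"""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, Alt):
        return nu(r.p) or nu(r.q)
    if isinstance(r, Cat):
        return nu(r.p) and nu(r.q)
    return False               # Empty, Sym

def delta(a: str, r):
    """Derivative of r with respect to a leading character a."""
    if isinstance(r, Sym):
        return Eps() if r.c == a else Empty()
    if isinstance(r, Alt):
        return Alt(delta(a, r.p), delta(a, r.q))
    if isinstance(r, Cat):
        rest = delta(a, r.q) if nu(r.p) else Empty()
        return Alt(Cat(delta(a, r.p), r.q), rest)
    if isinstance(r, Star):
        return Cat(delta(a, r.p), r)
    return Empty()             # Empty, Eps

def matches(r, w: str) -> bool:
    for a in w:
        r = delta(a, r)
    return nu(r)

# Trie style: nu and delta are fields; the language operations are functions.
@dataclass(frozen=True)
class Trie:
    nu: bool
    delta: Callable[[str], "Trie"]   # children are built on demand

def t_empty() -> Trie:
    return Trie(False, lambda a: t_empty())

def t_eps() -> Trie:
    return Trie(True, lambda a: t_empty())

def t_sym(c: str) -> Trie:
    return Trie(False, lambda a: t_eps() if a == c else t_empty())

def t_alt(p: Trie, q: Trie) -> Trie:
    return Trie(p.nu or q.nu, lambda a: t_alt(p.delta(a), q.delta(a)))

def t_cat(p: Trie, q: Trie) -> Trie:
    return Trie(p.nu and q.nu,
                lambda a: t_alt(t_cat(p.delta(a), q),
                                q.delta(a) if p.nu else t_empty()))

def t_star(p: Trie) -> Trie:
    return Trie(True, lambda a: t_cat(p.delta(a), t_star(p)))

def t_matches(t: Trie, w: str) -> bool:
    for a in w:
        t = t.delta(a)
    return t.nu
```

The same equations appear in both halves, only arranged differently: each case of `nu` and `delta` in the symbolic version reappears as the `nu` and `delta` fields of the corresponding trie constructor function. With `r = Star(Cat(Sym("a"), Sym("b")))` and `t = t_star(t_cat(t_sym("a"), t_sym("b")))`, both `matches(r, w)` and `t_matches(t, w)` accept `"abab"` and reject `"aba"`.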

The relatively easy-to-prove properties of type-level languages transfer almost effortlessly to the decidable implementations. In particular, despite the inductive and coinductive nature of regular expressions and tries respectively, we need neither inductive nor coinductive/bisimulation arguments to prove algebraic properties.

Published in

Proceedings of the ACM on Programming Languages, Volume 5, Issue ICFP
August 2021
1011 pages
EISSN: 2475-1421
DOI: 10.1145/3482883

Copyright © 2021 Owner/Author

Publisher

Association for Computing Machinery, New York, NY, United States
