Parsing with zippers (functional pearl)

Published: 03 August 2020

Abstract

Parsing with Derivatives (PwD) is an elegant approach to parsing context-free grammars (CFGs). It takes the equational theory behind Brzozowski's derivative for regular expressions and augments that theory with laziness, memoization, and fixed points. The result is a simple parser for arbitrary CFGs. Although recent work improved the performance of PwD, it remains inefficient due to the algorithm repeatedly traversing some parts of the grammar.
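As background, the following is a minimal sketch of Brzozowski's derivative for plain regular expressions, the starting point that PwD extends. It is not the paper's code and covers regular expressions only: the laziness, memoization, and fixed points that PwD needs for CFGs are deliberately left out. The names `re`, `nullable`, `derive`, `matches`, and `ab_star` are illustrative assumptions.

```ocaml
(* Regular expressions over characters (illustrative type, not from the paper). *)
type re =
  | Empty              (* matches no string at all *)
  | Eps                (* matches only the empty string *)
  | Chr of char        (* matches exactly one given character *)
  | Alt of re * re     (* choice between two expressions *)
  | Cat of re * re     (* concatenation *)
  | Star of re         (* Kleene star *)

(* [nullable r] is true when [r] accepts the empty string. *)
let rec nullable = function
  | Empty | Chr _ -> false
  | Eps | Star _ -> true
  | Alt (a, b) -> nullable a || nullable b
  | Cat (a, b) -> nullable a && nullable b

(* [derive c r] accepts exactly the strings [w] such that [r] accepts
   the character [c] followed by [w]. *)
let rec derive c = function
  | Empty | Eps -> Empty
  | Chr d -> if c = d then Eps else Empty
  | Alt (a, b) -> Alt (derive c a, derive c b)
  | Cat (a, b) ->
      let left = Cat (derive c a, b) in
      if nullable a then Alt (left, derive c b) else left
  | Star a as r -> Cat (derive c a, r)

(* Match by taking one derivative per input character, then testing
   whether the residual expression accepts the empty string. *)
let matches r s =
  let n = String.length s in
  let rec go r i =
    if i = n then nullable r else go (derive s.[i] r) (i + 1)
  in
  go r 0

(* Example expression: 'a' followed by any number of 'b'. *)
let ab_star = Cat (Chr 'a', Star (Chr 'b'))
```

With these definitions, `matches ab_star "abbb"` holds and `matches ab_star "ba"` does not. Each call to `derive` walks the expression from its root again, which gives a small-scale picture of the repeated traversal described above.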

In this functional pearl, we show how to avoid this inefficiency by suspending the state of the traversal in a zipper. When subsequent derivatives are taken, we can resume the traversal from where we left off without retraversing already traversed parts of the grammar.

However, the original zipper is designed for use with trees, and we want to parse CFGs. CFGs can include shared regions, cycles, and choices between alternates, which makes them incompatible with the traditional tree model for zippers. This paper develops a generalization of zippers to properly handle these additional features. Just as PwD generalized Brzozowski's derivatives from regular expressions to CFGs, we generalize Huet's zippers from trees to CFGs.
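For readers unfamiliar with the traditional tree model, here is a minimal sketch of a Huet-style zipper over binary trees. It illustrates only the classic tree case, not the generalization to CFGs that the paper develops, and all names (`tree`, `step`, `zipper`, `down_left`, `down_right`, `up`, `replace`, `to_tree`, `example`) are illustrative assumptions.

```ocaml
(* A binary tree with integers at the leaves. *)
type tree =
  | Leaf of int
  | Node of tree * tree

(* One step on the path from the focus back to the root: the direction
   taken, together with the sibling subtree that was left behind. *)
type step =
  | WentLeft of tree    (* descended left; payload is the right sibling *)
  | WentRight of tree   (* descended right; payload is the left sibling *)

(* A zipper pairs the focused subtree with the path back to the root. *)
type zipper = tree * step list

let of_tree (t : tree) : zipper = (t, [])

(* Move the focus to the left child, if there is one. *)
let down_left ((t, path) : zipper) : zipper option =
  match t with
  | Node (l, r) -> Some (l, WentLeft r :: path)
  | Leaf _ -> None

(* Move the focus to the right child, if there is one. *)
let down_right ((t, path) : zipper) : zipper option =
  match t with
  | Node (l, r) -> Some (r, WentRight l :: path)
  | Leaf _ -> None

(* Move the focus to the parent, rebuilding it from the saved sibling. *)
let up ((t, path) : zipper) : zipper option =
  match path with
  | WentLeft r :: rest -> Some (Node (t, r), rest)
  | WentRight l :: rest -> Some (Node (l, t), rest)
  | [] -> None

(* Replace the focused subtree without touching the rest of the tree. *)
let replace (t' : tree) ((_, path) : zipper) : zipper = (t', path)

(* Rebuild the whole tree by walking up until the root is reached. *)
let rec to_tree (z : zipper) : tree =
  match up z with
  | Some z' -> to_tree z'
  | None -> fst z

(* Usage: focus on the leaf 2 and replace it with 20. *)
let example =
  let t = Node (Node (Leaf 1, Leaf 2), Leaf 3) in
  match down_left (of_tree t) with
  | None -> t
  | Some z ->
      (match down_right z with
       | None -> t
       | Some z' -> to_tree (replace (Leaf 20) z'))
```

The path stored in the zipper is what lets a traversal stop at a position and later resume from it. Because each step records a single parent and a single sibling, this representation assumes a tree, which is why sharing, cycles, and alternates require a more general structure.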

The resulting parsing algorithm is concise and efficient: it takes only 31 lines of OCaml code to implement the derivative function but performs 6,500 times faster than the original PwD and 3.24 times faster than the optimized implementation of PwD.

Supplemental Material

Presentation at ICFP '20

References

  1. Michael D. Adams, Celeste Hollenbeck, and Matthew Might. 2016. On the complexity and performance of parsing with derivatives. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (Santa Barbara, CA, USA) (PLDI '16). ACM, New York, NY, USA, 224-236. https://doi.org/10.1145/2908080.2908128
  2. Janusz A. Brzozowski. 1964. Derivatives of Regular Expressions. Journal of the ACM (JACM) 11, 4 (Oct. 1964), 481-494. https://doi.org/10.1145/321239.321249
  3. Nils Anders Danielsson. 2010. Total parser combinators. In Proceedings of the 15th ACM SIGPLAN international conference on Functional programming (Baltimore, Maryland, USA) (ICFP '10). ACM, New York, NY, USA, 285-296. https://doi.org/10.1145/1863543.1863585
  4. Jay Earley. 1970. An efficient context-free parsing algorithm. Communications of the ACM (CACM) 13, 2 (Feb. 1970), 94-102. https://doi.org/10.1145/362007.362035
  5. Romain Edelmann, Jad Hamza, and Viktor Kunčak. 2020. Zippy LL(1) Parsing with Derivatives. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (London, UK) (PLDI '20). ACM, New York, NY, USA, 1036-1051. https://doi.org/10.1145/3385412.3385992
  6. Gérard Huet. 1997. The Zipper. Journal of Functional Programming 7, 05 (Sept. 1997), 549-554. https://doi.org/10.1017/S0956796897002864
  7. Jane Street. 2014. core_bench. https://github.com/janestreet/core_bench version 109.58.01.
  8. Mark Johnson. 1995. Memoization in top-down parsing. Computational Linguistics 21, 3 (Sept. 1995), 405-417. http://dl.acm.org/citation.cfm?id=216261.216269
  9. Xavier Leroy, Damien Doligez, Alain Frisch, Jacques Garrigue, Didier Rémy, and Jérôme Vouillon. 2020. The OCaml system: release 4.10. https://ocaml.org/releases/4.10/htmlman/
  10. Conor McBride. 2001. The Derivative of a Regular Type is its Type of One-Hole Contexts. strictlypositive.org/diff.pdf
  11. Conor McBride. 2008. Clowns to the left of me, jokers to the right (pearl): dissecting data structures. In Proceedings of the 35th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (San Francisco, California, USA) (POPL '08). ACM, New York, NY, USA, 287-295. https://doi.org/10.1145/1328438.1328474
  12. Matthew Might, David Darais, and Daniel Spiewak. 2011. Parsing with derivatives: a functional pearl. In Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming (Tokyo, Japan) (ICFP '11). ACM, New York, NY, USA, 189-195. https://doi.org/10.1145/2034773.2034801
  13. Emmanuel Onzon. 2012. dypgen: Self-extensible parsers and lexers for OCaml. http://dypgen.free.fr/ version 20120619.
  14. Scott Owens, John Reppy, and Aaron Turon. 2009. Regular-expression derivatives re-examined. Journal of Functional Programming 19, 02 (March 2009), 173-190. https://doi.org/10.1017/S0956796808007090
  15. François Pottier and Yann Régis-Gianas. 2019. Menhir. http://gallium.inria.fr/~fpottier/menhir/ version 20190626.
  16. Python Software Foundation. 2015a. Python 3.4.3. https://www.python.org/downloads/release/python-343/
  17. Python Software Foundation. 2015b. The Python Language Reference: Full Grammar specification. https://docs.python.org/3/reference/grammar.html
  18. Elizabeth Scott and Adrian Johnstone. 2010. GLL Parsing. Electronic Notes in Theoretical Computer Science 253, 7 (Sept. 2010), 177-189. https://doi.org/10.1016/j.entcs.2010.08.041
  19. Elizabeth Scott and Adrian Johnstone. 2013. GLL parse-tree generation. Science of Computer Programming 78, 10 (Oct. 2013), 1828-1844. https://doi.org/10.1016/j.scico.2012.03.005
