Parsing with derivatives: a functional pearl

Published: 19 September 2011

Abstract

We present a functional approach to parsing unrestricted context-free grammars based on Brzozowski's derivative of regular expressions. If we consider context-free grammars as recursive regular expressions, Brzozowski's equational theory extends without modification to context-free grammars (and it generalizes to parser combinators). The supporting actors in this story are three concepts familiar to functional programmers: laziness, memoization, and fixed points. These allow Brzozowski's original equations to be transliterated into purely functional code in about 30 lines spread over three functions.
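As a point of reference, the derivative that the abstract builds on can be sketched for plain regular expressions. This is a minimal illustration, not the paper's code: it covers only the non-recursive case, so it needs none of the laziness, memoization, or fixed points that the context-free extension relies on. The tuple encoding and the names `nullable`, `derive`, and `matches` are assumptions made for this sketch.

```python
# Regular expressions as tagged tuples (an encoding assumed for this sketch).
EMPTY = ("empty",)   # the empty language: matches nothing
EPS = ("eps",)       # the language containing only the empty string

def char(c): return ("chr", c)
def alt(left, right): return ("alt", left, right)
def cat(first, second): return ("cat", first, second)
def star(inner): return ("star", inner)

def nullable(r):
    """True if the language of r contains the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # cat

def derive(r, c):
    """Derivative of r with respect to c: the strings w such that c + w is in r."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return alt(derive(r[1], c), derive(r[2], c))
    if tag == "star":
        return cat(derive(r[1], c), r)
    # cat: always derive the first part; if it is nullable, c may also
    # start the second part.
    head = cat(derive(r[1], c), r[2])
    return alt(head, derive(r[2], c)) if nullable(r[1]) else head

def matches(r, text):
    """Derive by each character in turn, then test for the empty string."""
    for c in text:
        r = derive(r, c)
    return nullable(r)

ab_star = star(cat(char("a"), char("b")))  # (ab)*
```

Here `matches(ab_star, "abab")` accepts and `matches(ab_star, "aba")` rejects. A recursive grammar would make `nullable` and `derive` loop forever as written, which is the gap that laziness, memoization, and fixed points fill.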

Yet, this almost impossibly brief implementation has a drawback: its performance is sour, in both theory and practice. The culprit? Each derivative can double the size of a grammar, and with it, the cost of the next derivative.
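The growth can be observed even on a small regular expression. The sketch below is an illustration under assumptions, not the paper's measurement: it uses a tuple encoding and a naive derivative with no simplification, and it counts tree nodes of `(a*)*` after each derivative by `a`. The names `size` and `sizes_after` are hypothetical helpers.

```python
# Regular expressions as tagged tuples (an encoding assumed for this sketch).
EMPTY, EPS = ("empty",), ("eps",)

def nullable(r):
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # cat

def derive(r, c):
    """Naive derivative: builds new nodes and never simplifies them."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return ("alt", derive(r[1], c), derive(r[2], c))
    if tag == "star":
        return ("cat", derive(r[1], c), r)
    head = ("cat", derive(r[1], c), r[2])
    if nullable(r[1]):
        return ("alt", head, derive(r[2], c))
    return head

def size(r):
    """Number of nodes in the expression tree."""
    return 1 + sum(size(child) for child in r[1:] if isinstance(child, tuple))

def sizes_after(r, text):
    """Tree size before any derivative and after each successive one."""
    out = [size(r)]
    for c in text:
        r = derive(r, c)
        out.append(size(r))
    return out

nested = ("star", ("star", ("chr", "a")))  # (a*)*
growth = sizes_after(nested, "aaa")
```

The list `growth` increases at every step, roughly doubling each time. Many of the new nodes are concatenations with the empty language, which can never match anything.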

Fortunately, much of the new structure inflicted by the derivative is either dead on arrival, or it dies after the very next derivative. To eliminate it, we once again exploit laziness and memoization to transliterate an equational theory that prunes such debris into working code. Thanks to this compaction, parsing times become reasonable in practice.
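A rough analogue of this pruning for regular expressions is to simplify while constructing. The sketch below is an assumption-laden illustration rather than the paper's compaction: it applies a few standard identities (the empty language annihilates concatenation, the empty string is its unit, and the empty language is the unit of alternation) inside hypothetical smart constructors `alt` and `cat`. The paper's version works on recursive grammars and additionally needs laziness and memoization.

```python
# Regular expressions as tagged tuples (an encoding assumed for this sketch).
EMPTY, EPS = ("empty",), ("eps",)

def alt(left, right):
    """Alternation that drops dead branches and exact duplicates."""
    if left == EMPTY:
        return right
    if right == EMPTY or left == right:
        return left
    return ("alt", left, right)

def cat(first, second):
    """Concatenation that collapses dead or trivial parts."""
    if first == EMPTY or second == EMPTY:
        return EMPTY
    if first == EPS:
        return second
    if second == EPS:
        return first
    return ("cat", first, second)

def nullable(r):
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # cat

def derive(r, c):
    """Derivative that builds its result through the smart constructors."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == c else EMPTY
    if tag == "alt":
        return alt(derive(r[1], c), derive(r[2], c))
    if tag == "star":
        return cat(derive(r[1], c), r)
    head = cat(derive(r[1], c), r[2])
    return alt(head, derive(r[2], c)) if nullable(r[1]) else head

def size(r):
    """Number of nodes in the expression tree."""
    return 1 + sum(size(child) for child in r[1:] if isinstance(child, tuple))

nested = ("star", ("star", ("chr", "a")))  # (a*)*
once = derive(nested, "a")
twice = derive(once, "a")
```

With these constructors, `twice` is structurally equal to `once`, so the expression stops growing after the first derivative, and `derive(nested, "b")` collapses directly to `EMPTY`.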

We equip the functional programmer with two equational theories that, when combined, make for an abbreviated understanding and implementation of a system for parsing context-free languages.

References

  1. Brzozowski, J. A. Derivatives of regular expressions. Journal of the ACM 11, 4 (Oct. 1964), 481--494.
  2. Cocke, J., and Schwartz, J. T. Programming languages and their compilers: Preliminary notes. Tech. rep., Courant Institute of Mathematical Sciences, New York University, New York, NY, 1970.
  3. Cousot, P., and Cousot, R. Parsing as abstract interpretation of grammar semantics. Theoretical Computer Science 290 (2003), 531--544.
  4. Cousot, P., and Cousot, R. Grammar analysis and parsing by abstract interpretation, invited chapter. In Program Analysis and Compilation, Theory and Practice: Essays dedicated to Reinhard Wilhelm, T. Reps, M. Sagiv, and J. Bauer, Eds., LNCS 4444. Springer-Verlag, Dec. 2006, pp. 178--203.
  5. Danielsson, N. A. Total parser combinators. In Proceedings of the 15th ACM SIGPLAN international conference on Functional programming (New York, NY, USA, 2010), ICFP '10, ACM, pp. 285--296.
  6. DeRemer, F. L. Practical translators for LR(k) languages. Tech. rep., Cambridge, MA, USA, 1969.
  7. Dijkstra, E. W. Selected Writings on Computing: A Personal Perspective. Springer, Oct. 1982.
  8. Earley, J. An efficient context-free parsing algorithm. Communications of the ACM 13, 2 (Feb. 1970), 94--102.
  9. Floyd, R. W. Syntactic analysis and operator precedence. Journal of the ACM 10, 3 (July 1963), 316--333.
  10. Ford, B. Packrat parsing: Simple, powerful, lazy, linear time. In Proceedings of the 2002 International Conference on Functional Programming (Oct. 2002).
  11. Kasami, T. An efficient recognition and syntax-analysis algorithm for context-free languages. Tech. rep., Air Force Cambridge Research Lab, Bedford, MA, 1965.
  12. Knuth, D. On the translation of languages from left to right. Information and Control 8 (1965), 607--639.
  13. Owens, S., Reppy, J., and Turon, A. Regular-expression derivatives re-examined. Journal of Functional Programming 19, 02 (2009), 173--190.
  14. Pratt, V. R. Top down operator precedence. In POPL '73: Proceedings of the 1st annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (New York, NY, USA, 1973), POPL '73, ACM, pp. 41--51.
  15. Swierstra, D. S., Pablo, and Saraiva, J. Designing and implementing combinator languages. In Advanced Functional Programming (1998), pp. 150--206.
  16. Swierstra, S. Combinator parsing: A short tutorial. In Language Engineering and Rigorous Software Development, A. Bove, L. Barbosa, A. Pardo, and J. Pinto, Eds., vol. 5520 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg, Berlin, Heidelberg, 2009, ch. 6, pp. 252--300.
  17. Tomita, M. LR parsers for natural languages. In ACL-22: Proceedings of the 10th International Conference on Computational Linguistics and 22nd annual meeting on Association for Computational Linguistics (Morristown, NJ, USA, 1984), Association for Computational Linguistics, pp. 354--357.
  18. Warth, A., Douglass, J. R., and Millstein, T. Packrat parsers can support left recursion. In PEPM '08: Proceedings of the 2008 ACM SIGPLAN Symposium on Partial Evaluation and Semantics-based Program Manipulation (New York, NY, USA, 2008), ACM, pp. 103--110.
  19. Wirth, N. Compiler Construction (International Computer Science Series), pap/dsk ed. Addison-Wesley Pub (Sd).
  20. Younger, D. H. Recognition and parsing of context-free languages in time n^3. Information and Control 10, 2 (1967), 189--208.
