Abstract
Parser combinators are a middle ground between the fine control of hand-rolled parsers and the high-level, almost grammar-like appearance of parsers created via parser generators. They also promote a cleaner, compositional design for parsers. Historically, however, they have not matched the performance of their counterparts.
This paper describes how to compile parser combinators into parsers of hand-written quality. The approach represents the grammar as a tree, which exposes the static information it contains. To exploit this information, however, support for monadic computation must be dropped, since monadic binds generate dynamic structure. Selective functors can help recover lost functionality in the absence of monads, and the parser tree can be partially evaluated with staging. This is implemented in a library called Parsley.
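
The difference between static and dynamic structure can be illustrated with a small sketch. The Python below is not Parsley's API. Every name in it (`Pure`, `Char`, `Seq`, `Alt`, `Branch`, `Bind`, `run`, `alphabet`) is hypothetical and chosen only for illustration. `Branch` is a simplified, boolean-valued stand-in for a selective-style conditional, and `Alt` always backtracks, which is a simplification. The sketch covers only the tree representation and one static analysis, not staging.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


# Hypothetical parser tree: each combinator is a plain data node.
@dataclass(frozen=True)
class Pure:            # succeed without consuming input
    value: object

@dataclass(frozen=True)
class Char:            # match exactly one character
    c: str

@dataclass(frozen=True)
class Seq:             # run left, then right; keep right's result
    left: "Parser"
    right: "Parser"

@dataclass(frozen=True)
class Alt:             # try left; on failure, backtrack and try right
    left: "Parser"
    right: "Parser"

@dataclass(frozen=True)
class Branch:          # selective-style: both continuations are visible in the tree
    cond: "Parser"     # assumed to yield a bool
    if_true: "Parser"
    if_false: "Parser"

@dataclass(frozen=True)
class Bind:            # monadic: the next parser is computed from a runtime value
    first: "Parser"
    k: Callable[[object], "Parser"]

Parser = Union[Pure, Char, Seq, Alt, Branch, Bind]


def run(p: Parser, s: str, i: int = 0):
    """Interpret the tree directly; returns (value, next_index) or None on failure."""
    if isinstance(p, Pure):
        return (p.value, i)
    if isinstance(p, Char):
        return (p.c, i + 1) if s[i:i + 1] == p.c else None
    if isinstance(p, Seq):
        r = run(p.left, s, i)
        return None if r is None else run(p.right, s, r[1])
    if isinstance(p, Alt):
        r = run(p.left, s, i)
        return r if r is not None else run(p.right, s, i)
    if isinstance(p, Branch):
        r = run(p.cond, s, i)
        if r is None:
            return None
        return run(p.if_true if r[0] else p.if_false, s, r[1])
    if isinstance(p, Bind):
        r = run(p.first, s, i)
        return None if r is None else run(p.k(r[0]), s, r[1])
    raise TypeError(f"not a parser node: {p!r}")


def alphabet(p: Parser) -> Optional[frozenset]:
    """Static analysis: every character the grammar can match, or None if unknowable."""
    if isinstance(p, Pure):
        return frozenset()
    if isinstance(p, Char):
        return frozenset({p.c})
    if isinstance(p, (Seq, Alt)):
        parts = [alphabet(p.left), alphabet(p.right)]
    elif isinstance(p, Branch):
        parts = [alphabet(p.cond), alphabet(p.if_true), alphabet(p.if_false)]
    elif isinstance(p, Bind):
        return None    # the continuation is an opaque function; nothing to inspect
    else:
        raise TypeError(f"not a parser node: {p!r}")
    return None if any(x is None for x in parts) else frozenset().union(*parts)


# The same grammar ("-1" or "0") written selectively and monadically.
sign = Alt(Seq(Char("-"), Pure(True)), Pure(False))
selective = Branch(sign, Char("1"), Char("0"))
monadic = Bind(sign, lambda neg: Char("1") if neg else Char("0"))
```

Both parsers accept the same inputs when interpreted with `run`, but only the selective version can be analysed ahead of time: `alphabet(selective)` sees all three characters, while `alphabet(monadic)` has to give up at the `Bind` node because the rest of the grammar exists only once a value has been parsed.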
Index Terms
Staged selective parser combinators