Abstract
Several popular languages including Haskell and Python use the indentation and layout of code as an essential part of their syntax. In the past, implementations of these languages used ad hoc techniques to implement layout. Recent work has shown that a simple extension to context-free grammars can replace these ad hoc techniques and provide both formal foundations and efficient parsing algorithms for indentation sensitivity.
However, that previous work is limited to bottom-up, LR($k$) parsing, and many combinator-based parsing frameworks including Parsec use top-down algorithms that are outside its scope. This paper remedies this by showing how to add indentation sensitivity to parsing frameworks like Parsec. It explores both the formal semantics of and efficient algorithms for indentation sensitivity. It derives a Parsec-based library for indentation-sensitive parsing and presents benchmarks on a real-world language that show its efficiency and practicality.
Indentation-sensitive parsing for Parsec
Haskell '14: Proceedings of the 2014 ACM SIGPLAN symposium on Haskell






