research-article

Principled parsing for indentation-sensitive languages: revisiting Landin's offside rule

Published: 23 January 2013

Abstract

Several popular languages, such as Haskell, Python, and F#, use the indentation and layout of code as part of their syntax. Because context-free grammars cannot express the rules of indentation, parsers for these languages currently use ad hoc techniques to handle layout. These techniques tend to be low-level and operational in nature and forgo the advantages of more declarative specifications like context-free grammars. For example, they are often coded by hand instead of being generated by a parser generator.

This paper presents a simple extension to context-free grammars that can express these layout rules, and derives GLR and LR(k) algorithms for parsing these grammars. These grammars are easy to write and can be parsed efficiently. Examples for several languages are presented, as are benchmarks showing the practical efficiency of these algorithms.
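To make concrete the kind of low-level, operational layout handling the abstract contrasts with declarative grammars, here is a minimal sketch of one common hand-coded approach: a pre-pass that keeps a stack of indentation columns and emits block-open and block-close tokens, similar in spirit to the INDENT/DEDENT scheme described in the Python language reference. It is not the technique proposed in this paper. The function name `layout_tokens`, the token names, and the spaces-only indentation rule are assumptions made for illustration.

```python
def layout_tokens(source):
    """Turn indentation into explicit INDENT/DEDENT tokens (hypothetical helper).

    Assumes indentation is written with spaces only; blank lines are ignored.
    """
    stack = [0]   # indentation columns of the enclosing blocks, innermost last
    tokens = []
    for raw in source.splitlines():
        if not raw.strip():
            continue  # blank lines do not affect layout
        col = len(raw) - len(raw.lstrip(" "))
        if col > stack[-1]:
            # Deeper than the current block: open a new block.
            stack.append(col)
            tokens.append(("INDENT", None))
        while col < stack[-1]:
            # Shallower than the current block: close blocks until we match.
            stack.pop()
            tokens.append(("DEDENT", None))
        if col != stack[-1]:
            raise SyntaxError(f"inconsistent dedent to column {col}")
        tokens.append(("LINE", raw.strip()))
    # Close any blocks still open at end of input.
    while len(stack) > 1:
        stack.pop()
        tokens.append(("DEDENT", None))
    return tokens


if __name__ == "__main__":
    for token in layout_tokens("a\n  b\n  c\nd\n"):
        print(token)
```

The layout rule in this sketch lives in imperative stack manipulation outside the grammar, which is the style the abstract calls ad hoc; a context-free grammar then only ever sees the synthesized tokens.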

Supplemental Material

r1d3_talk11.mp4

Published in

ACM SIGPLAN Notices, Volume 48, Issue 1 (POPL '13)
January 2013
561 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2480359

POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
January 2013
586 pages
ISBN: 9781450318327
DOI: 10.1145/2429069

Copyright © 2013 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
