research-article

Adaptive LL(*) parsing: the power of dynamic analysis

Published: 15 October 2014

Abstract

Despite the advances made by modern parsing strategies such as PEG, LL(*), GLR, and GLL, parsing is not a solved problem. Existing approaches suffer from a number of weaknesses, including difficulties supporting side-effecting embedded actions, slow and/or unpredictable performance, and counter-intuitive matching strategies. This paper introduces the ALL(*) parsing strategy that combines the simplicity, efficiency, and predictability of conventional top-down LL(k) parsers with the power of a GLR-like mechanism to make parsing decisions. The critical innovation is to move grammar analysis to parse-time, which lets ALL(*) handle any non-left-recursive context-free grammar. ALL(*) is O(n^4) in theory but consistently performs linearly on grammars used in practice, outperforming general strategies such as GLL and GLR by orders of magnitude. ANTLR 4 generates ALL(*) parsers and supports direct left-recursion through grammar rewriting. Widespread ANTLR 4 use (5000 downloads/month in 2013) provides evidence that ALL(*) is effective for a wide variety of applications.
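
The abstract mentions that direct left-recursion is handled through grammar rewriting but does not say how. As a rough illustration only, the sketch below applies the classic textbook transformation (A -> A a | b becomes A -> b A', A' -> a A' | epsilon). This is not claimed to be the rewrite ANTLR 4 performs. The grammar encoding, the function name, and the `_tail` suffix are assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical encoding: each nonterminal maps to a list of alternatives,
# and each alternative is a list of symbols. An empty list means epsilon.
Grammar = Dict[str, List[List[str]]]


def eliminate_direct_left_recursion(grammar: Grammar) -> Grammar:
    """Textbook rewrite: A -> A a | b  becomes  A -> b A',  A' -> a A' | epsilon."""
    result: Grammar = {}
    for head, alternatives in grammar.items():
        # Alternatives of the form "head rest..." with a non-empty rest.
        # A useless alternative consisting of only the head itself is dropped.
        recursive = [alt[1:] for alt in alternatives
                     if len(alt) > 1 and alt[0] == head]
        # Alternatives that do not start with the head (including epsilon).
        base = [alt for alt in alternatives if not alt or alt[0] != head]
        if not recursive:
            result[head] = [list(alt) for alt in alternatives]
            continue
        # Assumed-fresh helper nonterminal; assumes no existing rule has this name.
        tail = head + "_tail"
        result[head] = [alt + [tail] for alt in base]
        result[tail] = [alt + [tail] for alt in recursive] + [[]]
    return result


grammar = {"expr": [["expr", "+", "term"], ["expr", "-", "term"], ["term"]]}
rewritten = eliminate_direct_left_recursion(grammar)
```

Here `rewritten["expr"]` holds the single alternative `term expr_tail`, and `rewritten["expr_tail"]` holds the two operator alternatives followed by epsilon. The result accepts the same strings, but the parse tree becomes right-nested, so left associativity has to be restored separately.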

Published in

  ACM SIGPLAN Notices, Volume 49, Issue 10
  OOPSLA '14
  October 2014
  907 pages
  ISSN: 0362-1340
  EISSN: 1558-1160
  DOI: 10.1145/2714064
  Editor: Andy Gill

  OOPSLA '14: Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications
  October 2014
  946 pages
  ISBN: 9781450325851
  DOI: 10.1145/2660193

Copyright © 2014 ACM

Publisher

  Association for Computing Machinery
  New York, NY, United States
