Abstract
The grammar that defines the syntax of a programming language is traditionally both context-free and unambiguous. However, recent work suggests that an attractive alternative is to use ambiguous grammars, thus postponing the task of resolving the ambiguity to the end user. If all programs accepted by an ambiguous grammar can be rewritten unambiguously, then the parser for the grammar is said to be resolvably ambiguous. Guaranteeing resolvable ambiguity statically (that is, for all programs) is hard, and previous work solves the problem only partially, using techniques based on property-based testing. In this paper, we present the first efficient, practical, and proven-correct solution to the statically resolvable ambiguity problem. Our approach introduces several key ideas, including splittable productions, operator sequences, and the concept of a grouper that works in tandem with a standard parser. We prove static resolvability using a Coq mechanization and demonstrate the approach's efficiency and practical applicability by implementing and integrating resolvable ambiguity into an essential part of the standard OCaml parser.
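To make the notion of resolvable ambiguity concrete, the following minimal sketch enumerates every parse tree of a token sequence under the ambiguous grammar `E ::= E op E | num` and prints each tree in a fully parenthesized form. It illustrates only the definition above, not the static analysis of the paper. The function names `parses` and `show` are hypothetical, and the sketch assumes that parentheses are available in the language as the means of disambiguation.

```python
from itertools import product


def parses(tokens):
    """Return every parse tree of `tokens` under the ambiguous grammar
    E ::= E op E | num, where tokens alternate num, op, num, ..."""
    if len(tokens) == 1:
        return [tokens[0]]
    trees = []
    # Try each operator position as the root of the tree.
    for i in range(1, len(tokens), 2):
        lefts = parses(tokens[:i])
        rights = parses(tokens[i + 1:])
        for left, right in product(lefts, rights):
            trees.append((left, tokens[i], right))
    return trees


def show(tree):
    """Render a tree with explicit parentheses (assumed to be part of the
    language), so that the rendered program has exactly one reading."""
    if isinstance(tree, str):
        return tree
    left, op, right = tree
    return f"({show(left)} {op} {show(right)})"


trees = parses(["1", "+", "2", "*", "3"])
resolved = [show(t) for t in trees]
print(resolved)  # ['(1 + (2 * 3))', '((1 + 2) * 3)']
```

The input `1 + 2 * 3` has two parse trees, and each one can be written as a distinct, unambiguous program. In this toy setting every ambiguity is therefore resolvable by the end user. Enumerating trees program by program does not scale to all programs, which is why a static guarantee is the hard part.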