ABSTRACT
There is increasing interest in extensible languages, (domain-specific) language extensions, and mechanisms for their specification and implementation. One challenge is to develop tools that allow non-expert programmers to add an eclectic set of language extensions to a host language. We describe mechanisms for composing and analyzing concrete syntax specifications of a host language and extensions to it. These specifications consist of context-free grammars with each terminal symbol mapped to a regular expression, from which a slightly modified LR parser and context-aware scanner are generated. Traditionally, conflicts are detected when a parser is generated from the composed grammar, but this comes too late, since it is the non-expert programmer who directs the composition of independently developed extensions with the host language.
The primary contribution of this paper is a modular analysis that is performed independently by each extension designer on her extension (composed alone with the host language). If each extension passes this modular analysis, then the language composed later by the programmer will compile with no conflicts or lexical ambiguities. Thus, extension writers can verify that their extension will safely compose with others and, if not, fix the specification so that it will. This is possible due to the context-aware scanner's lexical disambiguation and a set of reasonable restrictions limiting the constructs that can be introduced by an extension. The restrictions ensure that the parse table states can be partitioned so that each state can be attributed to the host language or a single extension.
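To illustrate the lexical disambiguation mentioned above, here is a minimal sketch of context-aware scanning. It is not the paper's implementation. It assumes that the parser passes the scanner the set of terminals valid in its current state, and that the scanner picks the longest match among only those terminals. The terminal names (`ID`, `TABLE_KW`, `NUM`), the `scan` function, and the `LexicalAmbiguity` exception are hypothetical and chosen only for this example.

```python
import re

# Hypothetical terminals, each mapped to a regular expression.
TERMINALS = {
    "ID": r"[A-Za-z_][A-Za-z_0-9]*",  # host-language identifier
    "TABLE_KW": r"table",             # keyword added by a hypothetical extension
    "NUM": r"[0-9]+",                 # host-language integer literal
}


class LexicalAmbiguity(Exception):
    """Raised when two valid terminals match the same longest lexeme."""


def scan(text, pos, valid_terminals):
    """Return (terminal, lexeme) for the longest match at pos.

    Only terminals in valid_terminals are tried. In this sketch that set
    stands in for the terminals the parser can accept in its current state.
    """
    best, best_len = [], 0
    for name in sorted(valid_terminals):
        match = re.compile(TERMINALS[name]).match(text, pos)
        if match is None or len(match.group()) == 0:
            continue
        length = len(match.group())
        if length > best_len:
            best, best_len = [name], length
        elif length == best_len:
            best.append(name)
    if not best:
        raise SyntaxError("no valid terminal matches at position %d" % pos)
    if len(best) > 1:
        raise LexicalAmbiguity("terminals %s match the same lexeme" % best)
    return best[0], text[pos:pos + best_len]
```

In this sketch, the string `table` scans as `ID` when the parser state only allows host-language terminals, and as `TABLE_KW` in a state where only the extension keyword is valid. An ambiguity is reported only if both terminals are valid in the same state, which is the kind of situation the restrictions on extensions are meant to rule out.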
Verifiable composition of deterministic grammars
PLDI '09