research-article

Automating grammar comparison

Published: 23 October 2015
Abstract

We consider from a practical perspective the problem of checking equivalence of context-free grammars. We present techniques for proving equivalence, as well as techniques for finding counter-examples that establish non-equivalence. Among the key building blocks of our approach is a novel algorithm for efficiently enumerating and sampling words and parse trees from arbitrary context-free grammars; the algorithm supports polynomial-time random access to words belonging to the grammar. Furthermore, we propose an algorithm for proving equivalence of context-free grammars that is complete for LL grammars, yet can be invoked on any context-free grammar, including ambiguous grammars. Our techniques successfully find discrepancies between different syntax specifications of several real-world languages, and are capable of detecting fine-grained incremental modifications performed on grammars. Our evaluation shows that our tool improves significantly on the available state-of-the-art tools. In addition, we used these algorithms to develop an online tutoring system for grammars that we then used in an undergraduate course on computer language processing. On questions involving grammar constructions, our system was able to automatically evaluate the correctness of 95% of the solutions submitted by students: it disproved 74% of the solutions and proved 21% of them correct.
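The idea of establishing non-equivalence through a counter-example can be illustrated with a small sketch. This is not the algorithm described in the abstract, which offers polynomial-time random access to words; it is a plain exhaustive enumeration by word length, swapped in for illustration. The grammar encoding (a dict from nonterminal to a list of tuples), the function names, and the sample grammars `G1`, `G2`, `G3` are all assumptions made here, and the sketch further assumes grammars without empty or unit productions.

```python
from functools import lru_cache


def words_of_length(grammar, start, n):
    """Return all words of length exactly n derivable from `start`.

    `grammar` maps each nonterminal to a list of right-hand sides (tuples of
    symbols); any symbol that is not a key is treated as a terminal.
    Assumption: no empty and no unit (A -> B) productions, so every
    recursive call works on a strictly shorter length.
    """

    @lru_cache(maxsize=None)
    def from_symbol(symbol, length):
        if symbol not in grammar:  # terminal: matches only itself
            return frozenset({(symbol,)}) if length == 1 else frozenset()
        words = set()
        for rhs in grammar[symbol]:
            if not rhs or (len(rhs) == 1 and rhs[0] in grammar):
                raise ValueError("empty and unit productions are not supported")
            words |= from_sequence(rhs, length)
        return frozenset(words)

    @lru_cache(maxsize=None)
    def from_sequence(rhs, length):
        if len(rhs) == 1:
            return from_symbol(rhs[0], length)
        words = set()
        # The first symbol takes i letters; each remaining symbol needs >= 1.
        for i in range(1, length - len(rhs) + 2):
            heads = from_symbol(rhs[0], i)
            if not heads:
                continue
            for tail in from_sequence(rhs[1:], length - i):
                words.update(head + tail for head in heads)
        return frozenset(words)

    return from_symbol(start, n)


def find_counterexample(g1, s1, g2, s2, max_len):
    """Return a shortest word accepted by exactly one grammar, or None."""
    for n in range(1, max_len + 1):
        diff = words_of_length(g1, s1, n) ^ words_of_length(g2, s2, n)
        if diff:
            return min(diff)  # deterministic choice among shortest words
    return None


# Hypothetical sample grammars: G1 and G3 both describe a^n b^n (n >= 1);
# G2 additionally allows the production S -> a a b.
G1 = {"S": [("a", "S", "b"), ("a", "b")]}
G2 = {"S": [("a", "S", "b"), ("a", "b"), ("a", "a", "b")]}
G3 = {"S": [("a", "T")], "T": [("S", "b"), ("b",)]}

print(find_counterexample(G1, "S", G2, "S", 6))
print(find_counterexample(G1, "S", G3, "S", 8))
```

A result of `None` only means that no difference exists up to the chosen length bound; it is not a proof of equivalence, which is why the abstract treats proving equivalence as a separate technique.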

Published in

ACM SIGPLAN Notices, Volume 50, Issue 10
OOPSLA '15
October 2015
953 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2858965
Editor: Andy Gill

OOPSLA 2015: Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications
October 2015
953 pages
ISBN: 9781450336895
DOI: 10.1145/2814270

Copyright © 2015 ACM

Publisher

Association for Computing Machinery
New York, NY, United States
