Abstract
Despite decades of research on parsing, the construction of parsers remains a painstaking, manual process prone to subtle bugs and pitfalls. We present a programming-by-example framework called Parsify that is able to synthesize a parser from input/output examples. The user does not write a single line of code. To achieve this, Parsify provides: (a) an iterative algorithm for synthesizing and refining a grammar one example at a time, (b) an interface that provides immediate visual feedback in response to changes in the grammar being refined, and (c) a graphical mechanism for specifying example parse trees using only textual selections. We empirically demonstrate the viability of our approach by using Parsify to construct parsers for source code drawn from Verilog, SQL, Apache, and Tiger.
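As a rough illustration of point (a), the toy sketch below folds labeled examples into a grammar one at a time and re-checks a probe input after each step. It is not Parsify's algorithm, which the abstract does not spell out. The example format (a label plus a list of child symbols), the function names `refine` and `recognizes`, and the arithmetic mini-language are all assumptions made for illustration.

```python
from functools import lru_cache


def refine(grammar, example):
    """Fold one labeled example into the grammar and return a new grammar.

    An example is a hypothetical (label, children) pair: the label names a
    nonterminal, the children are terminals or existing nonterminals.
    """
    label, children = example
    rhs = tuple(children)
    if not rhs or rhs == (label,):
        raise ValueError("empty and self-referential unit rules are not supported")
    updated = {nt: list(prods) for nt, prods in grammar.items()}
    prods = updated.setdefault(label, [])
    if rhs not in prods:  # only grow the grammar when the example adds something
        prods.append(rhs)
    return updated


def recognizes(grammar, start, tokens):
    """Check whether `start` derives exactly `tokens`.

    Assumes no empty productions and no unit cycles (A -> B -> A), so every
    recursive call works on a strictly smaller problem.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        if symbol not in grammar:  # terminal: must match exactly one token
            return j == i + 1 and tokens[i] == symbol
        return any(derives_seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def derives_seq(rhs, i, j):
        if len(rhs) == 1:
            return derives(rhs[0], i, j)
        # The first symbol takes tokens[i:k]; each remaining symbol
        # needs at least one token, which bounds k from above.
        return any(
            derives(rhs[0], i, k) and derives_seq(rhs[1:], k, j)
            for k in range(i + 1, j - len(rhs) + 2)
        )

    return derives(start, 0, len(tokens))


# Hypothetical session: three examples, one probe input re-checked each time.
examples = [
    ("expr", ["num"]),
    ("expr", ["expr", "+", "expr"]),
    ("expr", ["(", "expr", ")"]),
]
probe = ["(", "num", "+", "num", ")", "+", "num"]

grammar = {}
history = []
for example in examples:
    grammar = refine(grammar, example)
    history.append(recognizes(grammar, "expr", probe))
```

Re-running the check after every example is a stand-in for the immediate feedback described in (b): the probe input is rejected until the grammar has enough productions to cover it. The resulting grammar is ambiguous, and this sketch makes no attempt at disambiguation.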
Interactive parser synthesis by example
PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation