Abstract
We describe the design of a string programming/expression language that supports restricted forms of regular expressions, conditionals, and loops. The language is expressive enough to represent a wide variety of string manipulation tasks that end-users struggle with. We describe an algorithm, based on several novel concepts, for synthesizing a desired program in this language from input-output examples. The synthesis algorithm is very efficient, taking a fraction of a second on various benchmark examples. It is also interactive and has several desirable features: it can rank multiple solutions and converges quickly, it can detect noise in the user input, and it supports an active interaction model wherein the user is prompted to provide outputs on inputs that may have multiple computational interpretations.
The algorithm has been implemented as an interactive add-in for the Microsoft Excel spreadsheet system. The prototype tool has met the golden test: it has synthesized part of itself, and it has been used to solve problems beyond the author's imagination.
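To make the idea of synthesis from input-output examples concrete, the following is a minimal sketch and not the algorithm described in this paper. It assumes a toy language in which a program is a concatenation of constant strings and "k-th match of a token class" atoms, enumerates every such program consistent with the first example, filters by the remaining examples, and ranks the survivors with a simple heuristic. The token classes in `TOKENS`, the functions `run`, `atoms_for`, `candidates`, and `synthesize`, and the cost function are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical token classes for this sketch (not the paper's token set).
TOKENS = {"digits": r"\d+", "word": r"[A-Za-z]+", "upper": r"[A-Z]"}

def run(program, s):
    """Evaluate a program (a list of atoms) on input s; return None on failure."""
    out = []
    for atom in program:
        if atom[0] == "const":
            out.append(atom[1])
        else:
            # ("match", token_name, k): the k-th match; negative k counts from the end.
            _, name, k = atom
            matches = re.findall(TOKENS[name], s)
            if not -len(matches) <= k < len(matches):
                return None
            out.append(matches[k])
    return "".join(out)

def atoms_for(piece, s):
    """All atoms that produce `piece` when evaluated on input s."""
    found = [("const", piece)]
    for name, pattern in TOKENS.items():
        matches = re.findall(pattern, s)
        for i, m in enumerate(matches):
            if m == piece:
                found.append(("match", name, i))                 # index from the start
                found.append(("match", name, i - len(matches)))  # index from the end
    return found

def candidates(s, out, max_atoms):
    """Yield every program with at most max_atoms atoms that maps s to out."""
    if out == "":
        yield []
        return
    if max_atoms == 0:
        return
    for end in range(1, len(out) + 1):
        for atom in atoms_for(out[:end], s):
            for rest in candidates(s, out[end:], max_atoms - 1):
                yield [atom] + rest

def cost(program):
    # Ranking heuristic (an assumption): fewer constant characters, then fewer atoms.
    return (sum(len(a[1]) for a in program if a[0] == "const"), len(program))

def synthesize(examples, max_atoms=4):
    """Return the lowest-cost program consistent with all examples, or None."""
    (s0, out0), rest = examples[0], examples[1:]
    consistent = [p for p in candidates(s0, out0, max_atoms)
                  if all(run(p, s) == o for s, o in rest)]
    return min(consistent, key=cost, default=None)

prog = synthesize([("John Smith", "J.S."), ("Alice Brown", "A.B.")])
print(run(prog, "Grace Hopper"))  # G.H.
```

Several programs tie under this cost function (for example, "first uppercase letter" and "second-to-last uppercase letter" agree on both examples), and they would disagree on an input with three capital letters. That is the kind of input with multiple computational interpretations on which an interactive tool could ask the user for the intended output.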
Automating string processing in spreadsheets using input-output examples
POPL '11: Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages