Automating string processing in spreadsheets using input-output examples

Published: 26 January 2011
Abstract

We describe the design of a string programming/expression language that supports restricted forms of regular expressions, conditionals, and loops. The language is expressive enough to represent a wide variety of string manipulation tasks that end-users struggle with. We describe an algorithm based on several novel concepts for synthesizing a desired program in this language from input-output examples. The synthesis algorithm is very efficient, taking a fraction of a second on various benchmark examples. The synthesis algorithm is interactive and has several desirable features: it can rank multiple solutions and converges quickly, it can detect noise in the user input, and it supports an active interaction model in which the user is prompted to provide outputs for inputs that may have multiple computational interpretations.

The algorithm has been implemented as an interactive add-in for the Microsoft Excel spreadsheet system. The prototype tool has met the golden test: it has synthesized part of itself, and it has been used to solve problems beyond the author's imagination.
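To make the idea of synthesis from input-output examples concrete, here is a minimal Python sketch. It is not the paper's algorithm or language: the program space (pick the k-th word, then apply one of four transforms), the brute-force enumeration, and the names `run`, `synthesize`, and `ambiguous_inputs` are all assumptions made for illustration. It only shows the shape of the interaction the abstract describes: keep every program consistent with the examples, then flag inputs on which those programs disagree so the user can be asked for the intended output.

```python
import re
from itertools import product

# Hypothetical, tiny program space: pick the k-th word, then apply a transform.
# (Illustrative only; the paper's language is far richer.)
TRANSFORMS = {
    "identity": lambda s: s,
    "upper": str.upper,
    "lower": str.lower,
    "initial": lambda s: s[:1],
}

def run(program, text):
    """Run a (word_index, transform_name) program; return None if it does not apply."""
    index, name = program
    words = re.findall(r"[A-Za-z0-9]+", text)
    try:
        word = words[index]
    except IndexError:
        return None
    return TRANSFORMS[name](word)

def synthesize(examples, max_words=4):
    """Return every program in the space that is consistent with all examples."""
    candidates = product(range(-max_words, max_words), TRANSFORMS)
    return [p for p in candidates
            if all(run(p, inp) == out for inp, out in examples)]

def ambiguous_inputs(programs, inputs):
    """Inputs on which the consistent programs disagree (worth asking the user about)."""
    return [i for i in inputs if len({run(p, i) for p in programs}) > 1]

# One example leaves two interpretations: "second word" and "last word".
programs = synthesize([("Alan Turing", "Turing")])
print(programs)
# The three-word name separates them, so it is the input to ask about.
print(ambiguous_inputs(programs, ["Grace Hopper", "John von Neumann"]))
```

Supplying the output for the flagged input, for example `("John von Neumann", "Neumann")`, leaves only the "last word" program in this toy space. A realistic synthesizer would need a ranking scheme and a compact representation of the candidate set rather than explicit enumeration.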

Supplemental Material

30-mpeg-4.mp4

