skip to main content
research-article
Open Access

Gauss: program synthesis by reasoning over graphs

Published:15 October 2021Publication History
Skip Abstract Section

Abstract

While input-output examples are a natural form of specification for program synthesis engines, they can be imprecise for domains such as table transformations. In this paper, we investigate how extracting readily-available information about the user intent behind these input-output examples helps speed up synthesis and reduce overfitting. We present Gauss, a synthesis algorithm for table transformations that accepts partial input-output examples, along with user intent graphs. Gauss includes a novel conflict-resolution reasoning algorithm over graphs that enables it to learn from mistakes made during the search and use that knowledge to explore the space of programs even faster. It also ensures the final program is consistent with the user intent specification, reducing overfitting. We implement Gauss for the domain of table transformations (supporting Pandas and R), and compare it to three state-of-the-art synthesizers accepting only input-output examples. We find that it is able to reduce the search space by 56×, 73× and 664× on average, resulting in 7×, 26× and 7× speedups in synthesis times on average, respectively.

Skip Supplemental Material Section

Supplemental Material

Auxiliary Presentation Video

This is a video recording for our talk on our paper titled "Gauss: Program Synthesis by Reasoning over Graphs" published at OOPSLA 2021.

References

  1. Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, and Daniel Tarlow. 2016. DeepCoder: Learning to Write Programs. CoRR, abs/1611.01989 (2016), arxiv:1611.01989. arxiv:1611.01989Google ScholarGoogle Scholar
  2. Osbert Bastani, Xin Zhang, and Armando Solar-Lezama. 2019. Synthesizing Queries via Interactive Sketching. CoRR, abs/1912.12659 (2019), arXiv:1912.12659. arxiv:1912.12659Google ScholarGoogle Scholar
  3. Rohan Bavishi, Caroline Lemieux, Roy Fox, Koushik Sen, and Ion Stoica. 2019. AutoPandas: Neural-Backed Generators for Program Synthesis. Proc. ACM Program. Lang., 3, OOPSLA (2019), Article 168, Oct., 27 pages. https://doi.org/10.1145/3360594 Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Rohan Bavishi, Hiroaki Yoshida, and Mukul R. Prasad. 2019. Phoenix: Automated Data-Driven Synthesis of Repairs for Static Analysis Violations. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). ACM, New York, NY, USA. 613–624. isbn:9781450355728 https://doi.org/10.1145/3338906.3338952 Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Yanju Chen, Ruben Martins, and Yu Feng. 2019. Maximal Multi-Layer Specification Synthesis. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). ACM, New York, NY, USA. 602–612. isbn:9781450355728 https://doi.org/10.1145/3338906.3338951 Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Ian Drosos, Titus Barik, Philip J. Guo, Robert DeLine, and Sumit Gulwani. 2020. Wrex: A Unified Programming-by-Example Interaction for Synthesizing Readable Code for Data Scientists. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI ’20). ACM, New York, NY, USA. 1–12. isbn:9781450367080 https://doi.org/10.1145/3313831.3376442 Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Yu Feng, Ruben Martins, Osbert Bastani, and Isil Dillig. 2018. Program Synthesis Using Conflict-Driven Learning. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2018). ACM, New York, NY, USA. 420–435. isbn:9781450356985 https://doi.org/10.1145/3192366.3192382 Google ScholarGoogle ScholarDigital LibraryDigital Library
  8. Yu Feng, Ruben Martins, Jacob Van Geffen, Isil Dillig, and Swarat Chaudhuri. 2017. Component-Based Synthesis of Table Consolidation and Transformation Tasks from Examples. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA. 422–436. isbn:9781450349888 https://doi.org/10.1145/3062341.3062351 Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. John K. Feser, Swarat Chaudhuri, and Isil Dillig. 2015. Synthesizing Data Structure Transformations from Input-Output Examples. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA. 229–239. isbn:9781450334686 https://doi.org/10.1145/2737924.2737977 Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Adrià Gascón, Ashish Tiwari, Brent Carmer, and Umang Mathur. 2017. Look for the Proof to Find the Program: Decorated-Component-Based Program Synthesis. In Proceedings of the 29th International Conference in Computer Aided Verification. p, a. 86–103. isbn:978-3-319-63389-3 https://doi.org/10.1007/978-3-319-63390-9_5 Google ScholarGoogle ScholarCross RefCross Ref
  11. Sumit Gulwani. 2011. Automating String Processing in Spreadsheets Using Input-Output Examples. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA. 317–330. isbn:9781450304900 https://doi.org/10.1145/1926385.1926423 Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Susmit Jha, Sumit Gulwani, Sanjit A. Seshia, and Ashish Tiwari. 2010. Oracle-guided Component-based Program Synthesis. In Proceedings of the 32Nd ACM/IEEE International Conference on Software Engineering - Volume 1 (ICSE ’10). ACM, New York, NY, USA. 215–224. isbn:978-1-60558-719-6 https://doi.org/10.1145/1806799.1806833 Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Vu Le and Sumit Gulwani. 2014. FlashExtract: A Framework for Data Extraction by Examples. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA. 542–553. isbn:978-1-4503-2784-8 https://doi.org/10.1145/2594291.2594333 Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Hila Peleg, Shachar Itzhaky, Sharon Shoham, and Eran Yahav. 2019. Programming by predicates: a formal model for interactive synthesis. Acta Informatica, 1, 57 (2019), 08, https://doi.org/10.1007/s00236-019-00340-y Google ScholarGoogle ScholarCross RefCross Ref
  15. Hila Peleg, Sharon Shoham, and Eran Yahav. 2018. Programming Not Only by Example. In Proceedings of the 40th International Conference on Software Engineering (ICSE ’18). ACM, New York, NY, USA. 1114–1124. isbn:9781450356381 https://doi.org/10.1145/3180155.3180189 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Nadia Polikarpova, Ivan Kuraj, and Armando Solar-Lezama. 2016. Program Synthesis from Polymorphic Refinement Types. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16). ACM, New York, NY, USA. 522–538. isbn:9781450342612 https://doi.org/10.1145/2908080.2908093 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: A Framework for Inductive Program Synthesis. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2015). ACM, New York, NY, USA. 107–126. isbn:9781450336895 https://doi.org/10.1145/2814270.2814310 Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Tye Rattenbury, Joseph M. Hellerstein, Jeffrey Heer, Sean Kandel, and Connor Carreras. 2017. Principles of Data Wrangling: Practical Techniques for Data Preparation (1st ed.). O’Reilly Media, Inc.. isbn:1491938927 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. 2015. Compositional Program Synthesis from Natural Language and Examples. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, Qiang Yang and Michael J. Wooldridge (Eds.). AAAI Press, 792–800. http://ijcai.org/Abstract/15/117 Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Rishabh Singh and Armando Solar-Lezama. 2011. Synthesizing Data Structure Manipulations from Storyboards. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering (ESEC/FSE ’11). ACM, New York, NY, USA. 289–299. isbn:9781450304436 https://doi.org/10.1145/2025113.2025153 Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Sanjit Seshia, and Vijay Saraswat. 2006. Combinatorial Sketching for Finite Programs. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XII). ACM, New York, NY, USA. 404–415. isbn:1-59593-451-0 https://doi.org/10.1145/1168857.1168907 Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Chenglong Wang, Alvin Cheung, and Rastislav Bodik. 2017. Synthesizing Highly Expressive SQL Queries from Input-Output Examples. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA. 452–466. isbn:9781450349888 https://doi.org/10.1145/3062341.3062365 Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Chenglong Wang, Yu Feng, Rastislav Bodik, Alvin Cheung, and Isil Dillig. 2019. Visualization by Example. Proc. ACM Program. Lang., 4, POPL (2019), Article 49, Dec., 28 pages. https://doi.org/10.1145/3371117 Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Xinyu Wang, Greg Anderson, Isil Dillig, and K. L. McMillan. 2018. Learning Abstractions for Program Synthesis. In Computer Aided Verification, Hana Chockler and Georg Weissenbacher (Eds.). Springer International Publishing, Cham. 407–426. isbn:978-3-319-96145-3 https://doi.org/10.1007/978-3-319-96145-3_22 Google ScholarGoogle ScholarCross RefCross Ref
  25. Xinyu Wang, Isil Dillig, and Rishabh Singh. 2017. Program Synthesis Using Abstraction Refinement. Proc. ACM Program. Lang., 2, POPL (2017), Article 63, Dec., 30 pages. https://doi.org/10.1145/3158151 Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, and Dragomir Radev. 2018. Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium. 3911–3921. https://doi.org/10.18653/v1/D18-1425 Google ScholarGoogle ScholarCross RefCross Ref

Index Terms

  1. Gauss: program synthesis by reasoning over graphs

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image Proceedings of the ACM on Programming Languages
      Proceedings of the ACM on Programming Languages  Volume 5, Issue OOPSLA
      October 2021
      2001 pages
      EISSN:2475-1421
      DOI:10.1145/3492349
      Issue’s Table of Contents

      Copyright © 2021 Owner/Author

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 15 October 2021
      Published in pacmpl Volume 5, Issue OOPSLA

      Check for updates

      Qualifiers

      • research-article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader
    About Cookies On This Site

    We use cookies to ensure that we give you the best experience on our website.

    Learn more

    Got it!