Abstract
While input-output examples are a natural form of specification for program synthesis engines, they can be imprecise for domains such as table transformations. In this paper, we investigate how extracting readily-available information about the user intent behind these input-output examples helps speed up synthesis and reduce overfitting. We present Gauss, a synthesis algorithm for table transformations that accepts partial input-output examples, along with user intent graphs. Gauss includes a novel conflict-resolution reasoning algorithm over graphs that enables it to learn from mistakes made during the search and use that knowledge to explore the space of programs even faster. It also ensures the final program is consistent with the user intent specification, reducing overfitting. We implement Gauss for the domain of table transformations (supporting Pandas and R), and compare it to three state-of-the-art synthesizers accepting only input-output examples. We find that it is able to reduce the search space by 56×, 73× and 664× on average, resulting in 7×, 26× and 7× speedups in synthesis times on average, respectively.
Supplemental Material
- Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, and Daniel Tarlow. 2016. DeepCoder: Learning to Write Programs. CoRR, abs/1611.01989 (2016), arxiv:1611.01989. arxiv:1611.01989Google Scholar
- Osbert Bastani, Xin Zhang, and Armando Solar-Lezama. 2019. Synthesizing Queries via Interactive Sketching. CoRR, abs/1912.12659 (2019), arXiv:1912.12659. arxiv:1912.12659Google Scholar
- Rohan Bavishi, Caroline Lemieux, Roy Fox, Koushik Sen, and Ion Stoica. 2019. AutoPandas: Neural-Backed Generators for Program Synthesis. Proc. ACM Program. Lang., 3, OOPSLA (2019), Article 168, Oct., 27 pages. https://doi.org/10.1145/3360594 Google Scholar
Digital Library
- Rohan Bavishi, Hiroaki Yoshida, and Mukul R. Prasad. 2019. Phoenix: Automated Data-Driven Synthesis of Repairs for Static Analysis Violations. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). ACM, New York, NY, USA. 613–624. isbn:9781450355728 https://doi.org/10.1145/3338906.3338952 Google Scholar
Digital Library
- Yanju Chen, Ruben Martins, and Yu Feng. 2019. Maximal Multi-Layer Specification Synthesis. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019). ACM, New York, NY, USA. 602–612. isbn:9781450355728 https://doi.org/10.1145/3338906.3338951 Google Scholar
Digital Library
- Ian Drosos, Titus Barik, Philip J. Guo, Robert DeLine, and Sumit Gulwani. 2020. Wrex: A Unified Programming-by-Example Interaction for Synthesizing Readable Code for Data Scientists. In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI ’20). ACM, New York, NY, USA. 1–12. isbn:9781450367080 https://doi.org/10.1145/3313831.3376442 Google Scholar
Digital Library
- Yu Feng, Ruben Martins, Osbert Bastani, and Isil Dillig. 2018. Program Synthesis Using Conflict-Driven Learning. In Proceedings of the 39th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2018). ACM, New York, NY, USA. 420–435. isbn:9781450356985 https://doi.org/10.1145/3192366.3192382 Google Scholar
Digital Library
- Yu Feng, Ruben Martins, Jacob Van Geffen, Isil Dillig, and Swarat Chaudhuri. 2017. Component-Based Synthesis of Table Consolidation and Transformation Tasks from Examples. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA. 422–436. isbn:9781450349888 https://doi.org/10.1145/3062341.3062351 Google Scholar
Digital Library
- John K. Feser, Swarat Chaudhuri, and Isil Dillig. 2015. Synthesizing Data Structure Transformations from Input-Output Examples. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA. 229–239. isbn:9781450334686 https://doi.org/10.1145/2737924.2737977 Google Scholar
Digital Library
- Adrià Gascón, Ashish Tiwari, Brent Carmer, and Umang Mathur. 2017. Look for the Proof to Find the Program: Decorated-Component-Based Program Synthesis. In Proceedings of the 29th International Conference in Computer Aided Verification. p, a. 86–103. isbn:978-3-319-63389-3 https://doi.org/10.1007/978-3-319-63390-9_5 Google Scholar
Cross Ref
- Sumit Gulwani. 2011. Automating String Processing in Spreadsheets Using Input-Output Examples. In Proceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’11). ACM, New York, NY, USA. 317–330. isbn:9781450304900 https://doi.org/10.1145/1926385.1926423 Google Scholar
Digital Library
- Susmit Jha, Sumit Gulwani, Sanjit A. Seshia, and Ashish Tiwari. 2010. Oracle-guided Component-based Program Synthesis. In Proceedings of the 32Nd ACM/IEEE International Conference on Software Engineering - Volume 1 (ICSE ’10). ACM, New York, NY, USA. 215–224. isbn:978-1-60558-719-6 https://doi.org/10.1145/1806799.1806833 Google Scholar
Digital Library
- Vu Le and Sumit Gulwani. 2014. FlashExtract: A Framework for Data Extraction by Examples. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). ACM, New York, NY, USA. 542–553. isbn:978-1-4503-2784-8 https://doi.org/10.1145/2594291.2594333 Google Scholar
Digital Library
- Hila Peleg, Shachar Itzhaky, Sharon Shoham, and Eran Yahav. 2019. Programming by predicates: a formal model for interactive synthesis. Acta Informatica, 1, 57 (2019), 08, https://doi.org/10.1007/s00236-019-00340-y Google Scholar
Cross Ref
- Hila Peleg, Sharon Shoham, and Eran Yahav. 2018. Programming Not Only by Example. In Proceedings of the 40th International Conference on Software Engineering (ICSE ’18). ACM, New York, NY, USA. 1114–1124. isbn:9781450356381 https://doi.org/10.1145/3180155.3180189 Google Scholar
Digital Library
- Nadia Polikarpova, Ivan Kuraj, and Armando Solar-Lezama. 2016. Program Synthesis from Polymorphic Refinement Types. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16). ACM, New York, NY, USA. 522–538. isbn:9781450342612 https://doi.org/10.1145/2908080.2908093 Google Scholar
Digital Library
- Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: A Framework for Inductive Program Synthesis. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2015). ACM, New York, NY, USA. 107–126. isbn:9781450336895 https://doi.org/10.1145/2814270.2814310 Google Scholar
Digital Library
- Tye Rattenbury, Joseph M. Hellerstein, Jeffrey Heer, Sean Kandel, and Connor Carreras. 2017. Principles of Data Wrangling: Practical Techniques for Data Preparation (1st ed.). O’Reilly Media, Inc.. isbn:1491938927 Google Scholar
Digital Library
- Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. 2015. Compositional Program Synthesis from Natural Language and Examples. In Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, Qiang Yang and Michael J. Wooldridge (Eds.). AAAI Press, 792–800. http://ijcai.org/Abstract/15/117 Google Scholar
Digital Library
- Rishabh Singh and Armando Solar-Lezama. 2011. Synthesizing Data Structure Manipulations from Storyboards. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering (ESEC/FSE ’11). ACM, New York, NY, USA. 289–299. isbn:9781450304436 https://doi.org/10.1145/2025113.2025153 Google Scholar
Digital Library
- Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Sanjit Seshia, and Vijay Saraswat. 2006. Combinatorial Sketching for Finite Programs. In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS XII). ACM, New York, NY, USA. 404–415. isbn:1-59593-451-0 https://doi.org/10.1145/1168857.1168907 Google Scholar
Digital Library
- Chenglong Wang, Alvin Cheung, and Rastislav Bodik. 2017. Synthesizing Highly Expressive SQL Queries from Input-Output Examples. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). ACM, New York, NY, USA. 452–466. isbn:9781450349888 https://doi.org/10.1145/3062341.3062365 Google Scholar
Digital Library
- Chenglong Wang, Yu Feng, Rastislav Bodik, Alvin Cheung, and Isil Dillig. 2019. Visualization by Example. Proc. ACM Program. Lang., 4, POPL (2019), Article 49, Dec., 28 pages. https://doi.org/10.1145/3371117 Google Scholar
Digital Library
- Xinyu Wang, Greg Anderson, Isil Dillig, and K. L. McMillan. 2018. Learning Abstractions for Program Synthesis. In Computer Aided Verification, Hana Chockler and Georg Weissenbacher (Eds.). Springer International Publishing, Cham. 407–426. isbn:978-3-319-96145-3 https://doi.org/10.1007/978-3-319-96145-3_22 Google Scholar
Cross Ref
- Xinyu Wang, Isil Dillig, and Rishabh Singh. 2017. Program Synthesis Using Abstraction Refinement. Proc. ACM Program. Lang., 2, POPL (2017), Article 63, Dec., 30 pages. https://doi.org/10.1145/3158151 Google Scholar
Digital Library
- Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, Zilin Zhang, and Dragomir Radev. 2018. Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Brussels, Belgium. 3911–3921. https://doi.org/10.18653/v1/D18-1425 Google Scholar
Cross Ref
Index Terms
Gauss: program synthesis by reasoning over graphs
Recommendations
Algorithmic program synthesis: introduction
Program synthesis is a process of producing an executable program from a specification. Algorithmic synthesis produces the program automatically, without an intervention from an expert. While classical compilation falls under the definition of ...
Optimizing synthesis with metasketches
POPL '16Many advanced programming tools---for both end-users and expert developers---rely on program synthesis to automatically generate implementations from high-level specifications. These tools often need to employ tricky, custom-built synthesis algorithms ...
Can reactive synthesis and syntax-guided synthesis be friends?
PLDI 2022: Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and ImplementationWhile reactive synthesis and syntax-guided synthesis (SyGuS) have seen enormous progress in recent years, combining the two approaches has remained a challenge. In this work, we present the synthesis of reactive programs from Temporal Stream Logic ...






Comments