research-article
Open Access

Guiding dynamic programing via structural probability for accelerating programming by example

Published: 13 November 2020

Abstract

Programming by example (PBE) is an important subproblem of program synthesis, and PBE techniques have been applied to many domains. Although many techniques for accelerating PBE systems have been explored, scalability remains one of the main challenges: there is still a gap between the performance of state-of-the-art synthesizers and industrial requirements. To further speed up solving PBE tasks, we propose MaxFlash, a novel PBE framework. MaxFlash uses a model based on structural probability, named topdown prediction models, to guide a search based on dynamic programming, so that the search focuses on subproblems that form probable programs and avoids improbable ones. Our evaluation shows that MaxFlash achieves speed-ups of 4.107× to 2080× against state-of-the-art solvers on 244 real-world tasks.
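The idea of letting a probability model steer a dynamic-programming search can be illustrated with a toy example. The sketch below is not the MaxFlash algorithm or its model: it assumes a hypothetical three-operator string DSL (`Input`, `Upper(Input)`, `Const`, joined by `Concat`), a made-up context-free rule-probability table `LOG_P` standing in for a learned model, and a simple driver that relaxes a probability lower bound tenfold per round. Subproblems are memoized by their input-output specification, and any branch whose probability cannot reach the current bound is skipped.

```python
import itertools
import math

# Hypothetical rule probabilities (an assumption standing in for a learned model).
LOG_P = {
    ("S", "atom"): math.log(0.6),    # S -> A
    ("S", "concat"): math.log(0.4),  # S -> Concat(A, S)
    ("A", "input"): math.log(0.5),
    ("A", "upper"): math.log(0.2),
    ("A", "const"): math.log(0.3),
}


def solve_atom(spec, bound):
    """Most probable atom matching every example with log-prob >= bound, or None."""
    candidates = []
    if all(out == inp for inp, out in spec):
        candidates.append((LOG_P["A", "input"], "Input"))
    if all(out == inp.upper() for inp, out in spec):
        candidates.append((LOG_P["A", "upper"], "Upper(Input)"))
    outputs = {out for _, out in spec}
    if len(outputs) == 1:
        candidates.append((LOG_P["A", "const"], f"Const({next(iter(outputs))!r})"))
    return max((c for c in candidates if c[0] >= bound), default=None)


def solve(spec, bound, memo):
    """Most probable S-program for spec with log-prob >= bound, or None."""
    cached = memo.get(spec)
    if cached is not None:
        cached_bound, result = cached
        # A found program is optimal; a failure only holds for bounds at least as tight.
        if result is not None or bound >= cached_bound:
            return result
    entry_bound, best = bound, None

    # Rule S -> A
    atom = solve_atom(spec, bound - LOG_P["S", "atom"])
    if atom is not None:
        best = (LOG_P["S", "atom"] + atom[0], atom[1])
        bound = max(bound, best[0])  # only strictly better programs matter now

    # Rule S -> Concat(A, S): split every output into a non-empty prefix and suffix.
    rule = LOG_P["S", "concat"]
    if rule >= bound:
        ranges = [range(1, len(out)) for _, out in spec]
        for cuts in itertools.product(*ranges):
            left = tuple((i, o[:c]) for (i, o), c in zip(spec, cuts))
            right = tuple((i, o[c:]) for (i, o), c in zip(spec, cuts))
            head = solve_atom(left, bound - rule)
            if head is None:
                continue  # pruned: the prefix alone is already too improbable
            tail = solve(right, bound - rule - head[0], memo)
            if tail is None:
                continue
            score = rule + head[0] + tail[0]
            if best is None or score > best[0]:
                best = (score, f"Concat({head[1]}, {tail[1]})")
                bound = max(bound, score)

    memo[spec] = (entry_bound, best)
    return best


def synthesize(examples, start=math.log(0.1), floor=math.log(1e-9)):
    """Relax the probability bound step by step until some program is found."""
    spec, memo, bound = tuple(examples), {}, start
    while bound >= floor:
        result = solve(spec, bound, memo)
        if result is not None:
            return result
        bound += math.log(0.1)  # allow programs ten times less probable
    return None


if __name__ == "__main__":
    print(synthesize([("ab", "ABab"), ("cd", "CDcd")]))
```

With the two examples in the `__main__` guard, the first round (bound 0.1) finds nothing, and the second round (bound 0.01) returns `Concat(Upper(Input), Input)` with probability 0.4 × 0.2 × 0.6 × 0.5 = 0.024. Keying the memo table on the specification alone works here because a program found under any bound is the most probable one for that subproblem, while a failure is reused only for bounds at least as strict as the one it was recorded under.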


Supplemental Material

Auxiliary Presentation Video

This is a presentation video of my talk at OOPSLA 2020 on our paper accepted in the research track.

