research-article
Open Access

Automated transpilation of imperative to functional code using neural-guided program synthesis

Published: 29 April 2022

Abstract

While many mainstream languages such as Java, Python, and C# increasingly incorporate functional APIs to simplify programming and improve parallelization/performance, there are no effective techniques that can be used to automatically translate existing imperative code to functional variants using these APIs. Motivated by this problem, this paper presents a transpilation approach based on inductive program synthesis for modernizing existing code. Our method is based on the observation that the overwhelming majority of source/target programs in this setting satisfy an assumption that we call trace-compatibility: not only do the programs share syntactically identical low-level expressions, but these expressions also take the same values in corresponding execution traces. Our method leverages this observation to design a new neural-guided synthesis algorithm that (1) uses a novel neural architecture called cognate grammar network (CGN) and (2) leverages a form of concolic execution to prune partial programs based on intermediate values that arise during a computation. We have implemented our approach in a tool called NGST2 and use it to translate imperative Java and Python code to functional variants that use the Stream and functools APIs respectively. Our experiments show that NGST2 significantly outperforms several baselines and that our proposed neural architecture and pruning techniques are vital for achieving good results.
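
To make the trace-compatibility assumption concrete, the following is a minimal illustrative sketch in Python. It is not output of NGST2 and not taken from the paper: the function names, the example computation, and the `trace` lists are all hypothetical, chosen only to show an imperative loop and a `functools`-based variant that share the low-level expression `x * x` and produce the same sequence of values for it during execution.

```python
from functools import reduce


def sum_even_squares_imperative(xs, trace):
    """Imperative version: loop, branch, and accumulate in place."""
    total = 0
    for x in xs:
        if x % 2 == 0:
            sq = x * x          # shared low-level expression
            trace.append(sq)    # record its value (hypothetical instrumentation)
            total += sq
    return total


def sum_even_squares_functional(xs, trace):
    """Functional version built from filter, map, and functools.reduce."""
    def square(x):
        sq = x * x              # the same expression, syntactically identical
        trace.append(sq)        # record its value for comparison
        return sq

    evens = filter(lambda x: x % 2 == 0, xs)
    return reduce(lambda acc, v: acc + v, map(square, evens), 0)


data = [1, 2, 3, 4, 5, 6]
imp_trace, fun_trace = [], []
imp_result = sum_even_squares_imperative(data, imp_trace)
fun_result = sum_even_squares_functional(data, fun_trace)

# Both versions compute the same result, and the shared expression
# takes the same values in the same order in both executions.
print(imp_result == fun_result)
print(imp_trace == fun_trace)
```

In this sketch the two recorded traces coincide, which is the kind of agreement on intermediate values that the abstract describes as the basis for pruning partial programs; how NGST2 actually collects and compares such values is not specified here.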

        Published in

        Proceedings of the ACM on Programming Languages  Volume 6, Issue OOPSLA1
        April 2022
        687 pages
        EISSN: 2475-1421
        DOI: 10.1145/3534679

        Copyright © 2022 Owner/Author

        Publisher

        Association for Computing Machinery

        New York, NY, United States
