skip to main content
research-article
Open Access
Artifacts Available / v1.1

One down, 699 to go: or, synthesising compositional desugarings

Published:15 October 2021Publication History
Skip Abstract Section

Abstract

Programming or scripting languages used in real-world systems are seldom designed with a formal semantics in mind from the outset. Therefore, developing well-founded analysis tools for these systems requires reverse-engineering a formal semantics as a first step. This can take months or years of effort.

Can we (at least partially) automate this process? Though desirable, automatically reverse-engineering semantics rules from an implementation is very challenging, as found by Krishnamurthi, Lerner and Elberty. In this paper, we highlight that scaling methods with the size of the language is very difficult due to state space explosion, so we propose to learn semantics incrementally. We give a formalisation of Krishnamurthi et al.'s desugaring learning framework in order to clarify the assumptions necessary for an incremental learning algorithm to be feasible.

We show that this reformulation allows us to extend the search space and express rules that Krishnamurthi et al. described as challenging, while still retaining feasibility. We evaluate enumerative synthesis as a baseline algorithm, and demonstrate that, with our reformulation of the problem, it is possible to learn correct desugaring rules for the example source and core languages proposed by Krishnamurthi et al., in most cases identical to the intended rules. In addition, with user guidance, our system was able to synthesize rules for desugaring list comprehensions and try/catch/finally constructs.

Skip Supplemental Material Section

Supplemental Material

Auxiliary Presentation Video

This is a presentation video of my talk at OOPSLA 2020 on our paper accepted in the research track: "One Down, 699 to Go: or, Synthesising Compositional Desugarings". Reverse-engineering formal semantics for programming languages based on an opaque implementation is necessary, but hard. In this paper, we tackle the challenge of providing machine assistance for this task. The main contributions of the work are pinning down a solvable task in the open-ended problem domain, based on the idea of incremental learning, and a user-extensible hypothesis space that narrows down the problem space.

References

  1. Maaz Bin Safeer Ahmad, Jonathan Ragan-Kelley, Alvin Cheung, and Shoaib Kamil. 2019. Automatically Translating Image Processing Libraries to Halide. ACM Trans. Graph., 38, 6 (2019), Article 204, Nov., 13 pages. issn:0730-0301 https://doi.org/10.1145/3355089.3356549 Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. R. Alur, R. Bodik, G. Juniwal, M. M. K. Martin, M. Raghothaman, S. A. Seshia, R. Singh, A. Solar-Lezama, E. Torlak, and A. Udupa. 2013. Syntax-guided synthesis. In 2013 Formal Methods in Computer-Aided Design. IEEE, 1–8. https://doi.org/10.1109/FMCAD.2013.6679385 Google ScholarGoogle ScholarCross RefCross Ref
  3. Rajeev Alur, Arjun Radhakrishna, and Abhishek Udupa. 2017. Scaling Enumerative Program Synthesis via Divide and Conquer. In Tools and Algorithms for the Construction and Analysis of Systems, Axel Legay and Tiziana Margaria (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 319–336. isbn:978-3-662-54577-5Google ScholarGoogle Scholar
  4. Nada Amin and Ross Tate. 2016. Java and Scala’s Type Systems Are Unsound: The Existential Crisis of Null Pointers. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016). Association for Computing Machinery, New York, NY, USA. 838–848. isbn:9781450344449 https://doi.org/10.1145/2983990.2984004 Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Sándor Bartha and James Cheney. 2020. Towards Meta-interpretive Learning of Programming Language Semantics. In Proceedings of the 29th International Conference on Inductive Logic Programming (ILP 2019) (LNCS, 11770). 16–25. isbn:978-3-030-49209-0 https://doi.org/10.1007/978-3-030-49210-6_2 Google ScholarGoogle ScholarCross RefCross Ref
  6. Sándor Bartha, James Cheney, and Vaishak Belle. 2021. One Down, 699 to Go: or, synthesising compositional desugarings (extended version). arxiv:2109.06114.Google ScholarGoogle Scholar
  7. H. Comon, M. Dauchet, R. Gilleron, C. Löding, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. 2007. Tree Automata Techniques and Applications. Available on:. http://www.grappa.univ-lille3.fr/tata release October, 12th 2007.Google ScholarGoogle Scholar
  8. Jonas Duregård, Patrik Jansson, and Meng Wang. 2012. Feat: Functional Enumeration of Algebraic Types. SIGPLAN Not., 47, 12 (2012), Sept., 61–72. issn:0362-1340 https://doi.org/10.1145/2430532.2364515 Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Matthias Felleisen, Robert Bruce Findler, Matthew Flatt, and Shriram Krishnamurthi. 2001. How to Design Programs. MIT Press.Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Daniele Filaretti and Sergio Maffeis. 2014. An Executable Formal Semantics of PHP. In ECOOP, Richard Jones (Ed.). Springer, Berlin, Heidelberg. 567–592. isbn:978-3-662-44202-9Google ScholarGoogle Scholar
  11. Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic. 2016. Example-directed Synthesis: A Type-theoretic Interpretation. SIGPLAN Not., 51, 1 (2016), Jan., 802–815. issn:0362-1340 https://doi.org/10.1145/2914770.2837629 Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Arjun Guha, Claudiu Saftoiu, and Shriram Krishnamurthi. 2010. The Essence of JavaScript. In Proceedings of the 24th European Conference on Object Oriented Programming (ECOOP 2010). 126–150. https://doi.org/10.1007/978-3-642-14107-2_7 Google ScholarGoogle ScholarCross RefCross Ref
  13. Sumit Gulwani, Oleksandr Polozov, and Rishabh Singh. 2017. Program Synthesis. Foundations and Trends® in Programming Languages, 4, 1-2 (2017), 1–119. issn:2325-1107 https://doi.org/10.1561/2500000010 Google ScholarGoogle Scholar
  14. Jeevana Priya Inala, Nadia Polikarpova, Xiaokang Qiu, Benjamin S. Lerner, and Armando Solar-Lezama. 2017. Synthesis of Recursive ADT Transformations from Reusable Templates. In Tools and Algorithms for the Construction and Analysis of Systems, Axel Legay and Tiziana Margaria (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 247–263. isbn:978-3-662-54577-5Google ScholarGoogle Scholar
  15. Susmit Jha, Sumit Gulwani, Sanjit A. Seshia, and Ashish Tiwari. 2010. Oracle-guided component-based program synthesis. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE 2010, Cape Town, South Africa, 1-8 May 2010. 215–224. https://doi.org/10.1145/1806799.1806833 Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, and Derek Dreyer. 2017. RustBelt: Securing the Foundations of the Rust Programming Language. Proc. ACM Program. Lang., 2, POPL (2017), Article 66, Dec., 34 pages. https://doi.org/10.1145/3158154 Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Shoaib Kamil, Alvin Cheung, Shachar Itzhaky, and Armando Solar-Lezama. 2016. Verified Lifting of Stencil Computations. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16). Association for Computing Machinery, New York, NY, USA. 711–726. isbn:9781450342612 https://doi.org/10.1145/2908080.2908117 Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Christoph Kern and Mark R. Greenstreet. 1999. Formal Verification in Hardware Design: A Survey. ACM Trans. Des. Autom. Electron. Syst., 4, 2 (1999), April, 123–193. issn:1084-4309 https://doi.org/10.1145/307988.307989 Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Shriram Krishnamurthi, Benjamin S. Lerner, and Liam Elberty. 2019. The Next 700 Semantics: A Research Challenge. In SNAPL.Google ScholarGoogle Scholar
  20. P. J. Landin. 1966. The next 700 Programming Languages. Commun. ACM, 9, 3 (1966), March, 157–166. issn:0001-0782 https://doi.org/10.1145/365230.365257 Google ScholarGoogle ScholarDigital LibraryDigital Library
  21. Junsong Li, Justin Pombrio, Joe Gibbs Politz, and Shriram Krishnamurthi. 2015. Slimming Languages by Reducing Sugar: A Case for Semantics-Altering Transformations. In 2015 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward!) (Onward! 2015). Association for Computing Machinery, New York, NY, USA. 90–106. isbn:9781450336888 https://doi.org/10.1145/2814228.2814240 Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Sergio Maffeis, John C. Mitchell, and Ankur Taly. 2008. An Operational Semantics for JavaScript. In ESOP, G. Ramalingam (Ed.). Springer, Berlin, Heidelberg. 307–325. isbn:978-3-540-89330-1Google ScholarGoogle Scholar
  23. Simon Marlow, Simon Peyton Jones, and Satnam Singh. 2009. Runtime Support for Multicore Haskell. In Proceedings of the 14th ACM SIGPLAN International Conference on Functional Programming (ICFP ’09). Association for Computing Machinery, New York, NY, USA. 65–78. isbn:9781605583327 https://doi.org/10.1145/1596550.1596563 Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. K. Meinke and J. V. Tucker. 1993. Universal Algebra. In Handbook of Logic in Computer Science (Vol. 1): Background: Mathematical Structures. Oxford University Press, Inc., USA. 189–368. isbn:0198537352Google ScholarGoogle Scholar
  25. Eugenio Moggi. 1991. Notions of Computation and Monads. Inf. Comput., 93, 1 (1991), 55–92. https://doi.org/10.1016/0890-5401(91)90052-4 Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Floréal Morandat, Brandon Hill, Leo Osvald, and Jan Vitek. 2012. Evaluating the Design of the R Language: Objects and Functions for Data Analysis. In ECOOP. Springer-Verlag, Berlin, Heidelberg. 104–131. isbn:978-3-642-31056-0Google ScholarGoogle Scholar
  27. Stephen H. Muggleton, Dianhuan Lin, Niels Pahlavi, and Alireza Tamaddoni-Nezhad. 2014. Meta-interpretive Learning: Application to Grammatical Inference. Mach. Learn., 94, 1 (2014), Jan., 25–49. issn:0885-6125Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Kyndylan Nienhuis, Kayvan Memarian, and Peter Sewell. 2016. An Operational Semantics for C/C++11 Concurrency. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016). Association for Computing Machinery, New York, NY, USA. 111–128. isbn:9781450344449 https://doi.org/10.1145/2983990.2983997 Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Peter-Michael Osera and Steve Zdancewic. 2015. Type-and-example-directed program synthesis. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015. 619–630. https://doi.org/10.1145/2737924.2738007 Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Benjamin C. Pierce, Alessandro Romanel, and Daniel Wagner. 2010. The Spider Calculus: Computing in Active Graphs. Manuscript, available from http://www.cis.upenn.edu/ bcpierce/papers/spider_calculus.pdfGoogle ScholarGoogle Scholar
  31. Joe Gibbs Politz, Alejandro Martinez, Matthew Milano, Sumner Warren, Daniel Patterson, Junsong Li, Anand Chitipothu, and Shriram Krishnamurthi. 2013. Python: The Full Monty. In OOPSLA. ACM, New York, NY, USA. 217–232. isbn:978-1-4503-2374-1Google ScholarGoogle Scholar
  32. Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: A Framework for Inductive Program Synthesis. In OOPSLA. ACM SIGPLAN Notices, 50, https://doi.org/10.1145/2858965.2814310 Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark Barrett, and Cesare Tinelli. 2019. cvc4sy: Smart and Fast Term Enumeration for Syntax-Guided Synthesis. In Computer Aided Verification, Isil Dillig and Serdar Tasiran (Eds.). Springer International Publishing, Cham. 74–83. isbn:978-3-030-25543-5Google ScholarGoogle Scholar
  34. Armando Solar-Lezama. 2013. Program sketching. Int. J. Softw. Tools Technol. Transf., 15, 5-6 (2013), 475–495. https://doi.org/10.1007/s10009-012-0249-7 Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Armando Solar-Lezama, Rodric M. Rabbah, Rastislav Bodík, and Kemal Ebcioglu. 2005. Programming by sketching for bit-streaming programs. In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, Chicago, IL, USA, June 12-15, 2005. 281–294. https://doi.org/10.1145/1065010.1065045 Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Michael Sperber, R. Kent Dybvig, Matthew Flatt, Anton van Straaten, Robby Findler, and Jacob Matthews. 2010. Revised [6] Report on the Algorithmic Language Scheme (1st ed.). Cambridge University Press, USA. isbn:0521193990Google ScholarGoogle Scholar
  37. Philip Wadler. 1992. Comprehending monads. Mathematical Structures in Computer Science, 2, 4 (1992), 461–493. https://doi.org/10.1017/S0960129500001560 Google ScholarGoogle ScholarCross RefCross Ref
  38. Philip Wadler. 1992. The Essence of Functional Programming. In Conference Record of the Nineteenth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Albuquerque, New Mexico, USA, January 19-22, 1992. 1–14. https://doi.org/10.1145/143165.143169 Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. One down, 699 to go: or, synthesising compositional desugarings

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Article Metrics

      • Downloads (Last 12 months)62
      • Downloads (Last 6 weeks)5

      Other Metrics

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader
    About Cookies On This Site

    We use cookies to ensure that we give you the best experience on our website.

    Learn more

    Got it!