Abstract
Programming or scripting languages used in real-world systems are seldom designed with a formal semantics in mind from the outset. Therefore, developing well-founded analysis tools for these systems requires reverse-engineering a formal semantics as a first step. This can take months or years of effort.
Can we (at least partially) automate this process? Though desirable, automatically reverse-engineering semantics rules from an implementation is very challenging, as found by Krishnamurthi, Lerner and Elberty. In this paper, we highlight that scaling such methods with the size of the language is very difficult due to state-space explosion, so we propose to learn semantics incrementally. We give a formalisation of Krishnamurthi et al.'s desugaring learning framework in order to clarify the assumptions necessary for an incremental learning algorithm to be feasible.
We show that this reformulation allows us to extend the search space and express rules that Krishnamurthi et al. described as challenging, while still retaining feasibility. We evaluate enumerative synthesis as a baseline algorithm, and demonstrate that, with our reformulation of the problem, it is possible to learn correct desugaring rules for the example source and core languages proposed by Krishnamurthi et al., in most cases identical to the intended rules. In addition, with user guidance, our system was able to synthesize rules for desugaring list comprehensions and try/catch/finally constructs.
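To make the idea of enumerative synthesis of a desugaring rule concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical three-constructor core language (`true`, `false`, `if`), represents terms as nested tuples, and treats integers as pattern variables standing for the already-desugared subterms of a source construct. The names `CORE`, `templates`, `compositions`, `instantiate` and `synthesize` are all illustrative. Candidate templates are enumerated in order of increasing size, and the first one consistent with every input/output example is returned.

```python
from itertools import product

# Hypothetical core language (an assumption for illustration): name -> arity.
CORE = {"true": 0, "false": 0, "if": 3}


def compositions(total, parts):
    """Yield every way to split `total` into `parts` positive integers."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first, *rest)


def templates(size, nvars):
    """Yield all core templates with exactly `size` nodes.

    An int i is a pattern variable referring to the i-th subterm.
    """
    if size == 1:
        yield from range(nvars)
        for name, arity in CORE.items():
            if arity == 0:
                yield (name,)
        return
    for name, arity in CORE.items():
        if arity == 0:
            continue
        for sizes in compositions(size - 1, arity):
            child_choices = [list(templates(s, nvars)) for s in sizes]
            for kids in product(*child_choices):
                yield (name, *kids)


def instantiate(template, args):
    """Substitute the subterms `args` for the pattern variables."""
    if isinstance(template, int):
        return args[template]
    return (template[0], *(instantiate(k, args) for k in template[1:]))


def synthesize(examples, nvars, max_size=6):
    """Return the smallest template consistent with all examples, or None."""
    for size in range(1, max_size + 1):
        for t in templates(size, nvars):
            if all(instantiate(t, args) == out for args, out in examples):
                return t
    return None


# Examples for a hypothetical source construct or(a, b): each pair gives the
# desugared subterms and the core term a black-box desugarer produced.
T, F = ("true",), ("false",)
examples = [
    ((T, F), ("if", T, T, F)),
    ((F, T), ("if", F, T, T)),
]
rule = synthesize(examples, nvars=2)
print(rule)
```

In this sketch the search space grows quickly with the number of core constructors and the template size, which illustrates why learning one rule at a time against a small core language is more tractable than searching for a whole semantics at once.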
References
- Maaz Bin Safeer Ahmad, Jonathan Ragan-Kelley, Alvin Cheung, and Shoaib Kamil. 2019. Automatically Translating Image Processing Libraries to Halide. ACM Trans. Graph., 38, 6 (2019), Article 204, Nov., 13 pages. issn:0730-0301 https://doi.org/10.1145/3355089.3356549
- R. Alur, R. Bodik, G. Juniwal, M. M. K. Martin, M. Raghothaman, S. A. Seshia, R. Singh, A. Solar-Lezama, E. Torlak, and A. Udupa. 2013. Syntax-guided synthesis. In 2013 Formal Methods in Computer-Aided Design. IEEE, 1–8. https://doi.org/10.1109/FMCAD.2013.6679385
- Rajeev Alur, Arjun Radhakrishna, and Abhishek Udupa. 2017. Scaling Enumerative Program Synthesis via Divide and Conquer. In Tools and Algorithms for the Construction and Analysis of Systems, Axel Legay and Tiziana Margaria (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 319–336. isbn:978-3-662-54577-5
- Nada Amin and Ross Tate. 2016. Java and Scala’s Type Systems Are Unsound: The Existential Crisis of Null Pointers. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016). Association for Computing Machinery, New York, NY, USA. 838–848. isbn:9781450344449 https://doi.org/10.1145/2983990.2984004
- Sándor Bartha and James Cheney. 2020. Towards Meta-interpretive Learning of Programming Language Semantics. In Proceedings of the 29th International Conference on Inductive Logic Programming (ILP 2019) (LNCS, 11770). 16–25. isbn:978-3-030-49209-0 https://doi.org/10.1007/978-3-030-49210-6_2
- Sándor Bartha, James Cheney, and Vaishak Belle. 2021. One Down, 699 to Go: or, synthesising compositional desugarings (extended version). arxiv:2109.06114.
- H. Comon, M. Dauchet, R. Gilleron, C. Löding, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. 2007. Tree Automata Techniques and Applications. Available at http://www.grappa.univ-lille3.fr/tata. Release of October 12th, 2007.
- Jonas Duregård, Patrik Jansson, and Meng Wang. 2012. Feat: Functional Enumeration of Algebraic Types. SIGPLAN Not., 47, 12 (2012), Sept., 61–72. issn:0362-1340 https://doi.org/10.1145/2430532.2364515
- Matthias Felleisen, Robert Bruce Findler, Matthew Flatt, and Shriram Krishnamurthi. 2001. How to Design Programs. MIT Press.
- Daniele Filaretti and Sergio Maffeis. 2014. An Executable Formal Semantics of PHP. In ECOOP, Richard Jones (Ed.). Springer, Berlin, Heidelberg. 567–592. isbn:978-3-662-44202-9
- Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic. 2016. Example-directed Synthesis: A Type-theoretic Interpretation. SIGPLAN Not., 51, 1 (2016), Jan., 802–815. issn:0362-1340 https://doi.org/10.1145/2914770.2837629
- Arjun Guha, Claudiu Saftoiu, and Shriram Krishnamurthi. 2010. The Essence of JavaScript. In Proceedings of the 24th European Conference on Object Oriented Programming (ECOOP 2010). 126–150. https://doi.org/10.1007/978-3-642-14107-2_7
- Sumit Gulwani, Oleksandr Polozov, and Rishabh Singh. 2017. Program Synthesis. Foundations and Trends® in Programming Languages, 4, 1-2 (2017), 1–119. issn:2325-1107 https://doi.org/10.1561/2500000010
- Jeevana Priya Inala, Nadia Polikarpova, Xiaokang Qiu, Benjamin S. Lerner, and Armando Solar-Lezama. 2017. Synthesis of Recursive ADT Transformations from Reusable Templates. In Tools and Algorithms for the Construction and Analysis of Systems, Axel Legay and Tiziana Margaria (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 247–263. isbn:978-3-662-54577-5
- Susmit Jha, Sumit Gulwani, Sanjit A. Seshia, and Ashish Tiwari. 2010. Oracle-guided component-based program synthesis. In Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1, ICSE 2010, Cape Town, South Africa, 1-8 May 2010. 215–224. https://doi.org/10.1145/1806799.1806833
- Ralf Jung, Jacques-Henri Jourdan, Robbert Krebbers, and Derek Dreyer. 2017. RustBelt: Securing the Foundations of the Rust Programming Language. Proc. ACM Program. Lang., 2, POPL (2017), Article 66, Dec., 34 pages. https://doi.org/10.1145/3158154
- Shoaib Kamil, Alvin Cheung, Shachar Itzhaky, and Armando Solar-Lezama. 2016. Verified Lifting of Stencil Computations. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16). Association for Computing Machinery, New York, NY, USA. 711–726. isbn:9781450342612 https://doi.org/10.1145/2908080.2908117
- Christoph Kern and Mark R. Greenstreet. 1999. Formal Verification in Hardware Design: A Survey. ACM Trans. Des. Autom. Electron. Syst., 4, 2 (1999), April, 123–193. issn:1084-4309 https://doi.org/10.1145/307988.307989
- Shriram Krishnamurthi, Benjamin S. Lerner, and Liam Elberty. 2019. The Next 700 Semantics: A Research Challenge. In SNAPL.
- P. J. Landin. 1966. The next 700 Programming Languages. Commun. ACM, 9, 3 (1966), March, 157–166. issn:0001-0782 https://doi.org/10.1145/365230.365257
- Junsong Li, Justin Pombrio, Joe Gibbs Politz, and Shriram Krishnamurthi. 2015. Slimming Languages by Reducing Sugar: A Case for Semantics-Altering Transformations. In 2015 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Onward!) (Onward! 2015). Association for Computing Machinery, New York, NY, USA. 90–106. isbn:9781450336888 https://doi.org/10.1145/2814228.2814240
- Sergio Maffeis, John C. Mitchell, and Ankur Taly. 2008. An Operational Semantics for JavaScript. In ESOP, G. Ramalingam (Ed.). Springer, Berlin, Heidelberg. 307–325. isbn:978-3-540-89330-1
- Simon Marlow, Simon Peyton Jones, and Satnam Singh. 2009. Runtime Support for Multicore Haskell. In Proceedings of the 14th ACM SIGPLAN International Conference on Functional Programming (ICFP ’09). Association for Computing Machinery, New York, NY, USA. 65–78. isbn:9781605583327 https://doi.org/10.1145/1596550.1596563
- K. Meinke and J. V. Tucker. 1993. Universal Algebra. In Handbook of Logic in Computer Science (Vol. 1): Background: Mathematical Structures. Oxford University Press, Inc., USA. 189–368. isbn:0198537352
- Eugenio Moggi. 1991. Notions of Computation and Monads. Inf. Comput., 93, 1 (1991), 55–92. https://doi.org/10.1016/0890-5401(91)90052-4
- Floréal Morandat, Brandon Hill, Leo Osvald, and Jan Vitek. 2012. Evaluating the Design of the R Language: Objects and Functions for Data Analysis. In ECOOP. Springer-Verlag, Berlin, Heidelberg. 104–131. isbn:978-3-642-31056-0
- Stephen H. Muggleton, Dianhuan Lin, Niels Pahlavi, and Alireza Tamaddoni-Nezhad. 2014. Meta-interpretive Learning: Application to Grammatical Inference. Mach. Learn., 94, 1 (2014), Jan., 25–49. issn:0885-6125
- Kyndylan Nienhuis, Kayvan Memarian, and Peter Sewell. 2016. An Operational Semantics for C/C++11 Concurrency. In Proceedings of the 2016 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2016). Association for Computing Machinery, New York, NY, USA. 111–128. isbn:9781450344449 https://doi.org/10.1145/2983990.2983997
- Peter-Michael Osera and Steve Zdancewic. 2015. Type-and-example-directed program synthesis. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015. 619–630. https://doi.org/10.1145/2737924.2738007
- Benjamin C. Pierce, Alessandro Romanel, and Daniel Wagner. 2010. The Spider Calculus: Computing in Active Graphs. Manuscript, available from http://www.cis.upenn.edu/~bcpierce/papers/spider_calculus.pdf
- Joe Gibbs Politz, Alejandro Martinez, Matthew Milano, Sumner Warren, Daniel Patterson, Junsong Li, Anand Chitipothu, and Shriram Krishnamurthi. 2013. Python: The Full Monty. In OOPSLA. ACM, New York, NY, USA. 217–232. isbn:978-1-4503-2374-1
- Oleksandr Polozov and Sumit Gulwani. 2015. FlashMeta: A Framework for Inductive Program Synthesis. In OOPSLA. ACM SIGPLAN Notices, 50, https://doi.org/10.1145/2858965.2814310
- Andrew Reynolds, Haniel Barbosa, Andres Nötzli, Clark Barrett, and Cesare Tinelli. 2019. cvc4sy: Smart and Fast Term Enumeration for Syntax-Guided Synthesis. In Computer Aided Verification, Isil Dillig and Serdar Tasiran (Eds.). Springer International Publishing, Cham. 74–83. isbn:978-3-030-25543-5
- Armando Solar-Lezama. 2013. Program sketching. Int. J. Softw. Tools Technol. Transf., 15, 5-6 (2013), 475–495. https://doi.org/10.1007/s10009-012-0249-7
- Armando Solar-Lezama, Rodric M. Rabbah, Rastislav Bodík, and Kemal Ebcioglu. 2005. Programming by sketching for bit-streaming programs. In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, Chicago, IL, USA, June 12-15, 2005. 281–294. https://doi.org/10.1145/1065010.1065045
- Michael Sperber, R. Kent Dybvig, Matthew Flatt, Anton van Straaten, Robby Findler, and Jacob Matthews. 2010. Revised [6] Report on the Algorithmic Language Scheme (1st ed.). Cambridge University Press, USA. isbn:0521193990
- Philip Wadler. 1992. Comprehending monads. Mathematical Structures in Computer Science, 2, 4 (1992), 461–493. https://doi.org/10.1017/S0960129500001560
- Philip Wadler. 1992. The Essence of Functional Programming. In Conference Record of the Nineteenth Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Albuquerque, New Mexico, USA, January 19-22, 1992. 1–14. https://doi.org/10.1145/143165.143169
Index Terms
One down, 699 to go: or, synthesising compositional desugarings