Abstract
The R programming language is very popular for developing statistical software and performing data analysis, thanks to its rich libraries, concise and expressive syntax, and support for interactive programming. Yet the semantics of R is fairly complex, contains many subtle corner cases, and is not formally specified, which makes it difficult to reason about R programs. In this work, we develop a big-step operational semantics for R in the form of an interpreter written in the Coq proof assistant. We ensure the trustworthiness of the formalization by introducing a monadic encoding that allows the Coq interpreter, CoqR, to be in direct visual correspondence with the reference R interpreter, GNU R. Additionally, we provide a testing framework that supports systematic comparison of CoqR and GNU R. In its current state, CoqR covers the nucleus of the R language as well as numerous additional features, which allows it to pass a significant number of realistic test cases from the GNU R and FastR projects. To exercise the formal specification, we prove in Coq the preservation of memory invariants in selected parts of the interpreter. This work is an important first step towards a robust environment for formal verification of R programs.
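To make the idea of a monadic encoding more concrete, here is a minimal sketch of a state-and-error monad, written in Python rather than Coq. It is not CoqR's actual encoding: the names `Ok`, `Err`, `unit`, `bind`, `alloc`, `read`, and `add_cells`, as well as the dictionary-based heap, are all hypothetical and chosen only for illustration. The point is that a pure function threading a heap and possible failure can be laid out so that it reads step by step like imperative code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Union

# Hypothetical heap model: integer addresses mapped to stored values.
Heap = Dict[int, Any]


@dataclass(frozen=True)
class Ok:
    """Successful step: the resulting heap and the computed value."""
    heap: Heap
    value: Any


@dataclass(frozen=True)
class Err:
    """Failed step: an error message; later steps are skipped."""
    message: str


# A computation takes the current heap and returns Ok or Err.
Comp = Callable[[Heap], Union[Ok, Err]]


def unit(value: Any) -> Comp:
    """Return a value without touching the heap."""
    return lambda heap: Ok(heap, value)


def bind(comp: Comp, cont: Callable[[Any], Comp]) -> Comp:
    """Run comp, then feed its value and heap to cont; stop on error."""
    def run(heap: Heap) -> Union[Ok, Err]:
        result = comp(heap)
        if isinstance(result, Err):
            return result
        return cont(result.value)(result.heap)
    return run


def alloc(value: Any) -> Comp:
    """Store value at a fresh address and return that address."""
    def run(heap: Heap) -> Union[Ok, Err]:
        addr = len(heap)  # assumption: addresses are 0..len(heap)-1
        return Ok({**heap, addr: value}, addr)
    return run


def read(addr: int) -> Comp:
    """Read the value at addr, failing on an invalid address."""
    def run(heap: Heap) -> Union[Ok, Err]:
        if addr not in heap:
            return Err(f"invalid pointer {addr}")
        return Ok(heap, heap[addr])
    return run


def add_cells(p: int, q: int) -> Comp:
    """Mirrors imperative code: x = *p; y = *q; return alloc(x + y);"""
    return bind(read(p), lambda x:
           bind(read(q), lambda y:
           alloc(x + y)))
```

Running `add_cells(0, 1)` on the heap `{0: 2, 1: 3}` yields an `Ok` whose heap has a new cell holding `5`, while `add_cells(0, 7)` on the same heap yields an `Err` without evaluating the remaining steps. Each line of `add_cells` corresponds to one statement of the imperative version, which is the kind of line-by-line correspondence the abstract refers to.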
A trustworthy mechanized formalization of R
DLS 2018: Proceedings of the 14th ACM SIGPLAN International Symposium on Dynamic Languages