Abstract
Systems programmers need fine-grained control over the memory layout of data structures, both to produce performant code and to comply with well-defined interfaces imposed by existing code, standardised protocols or hardware. Code that manipulates these low-level representations in memory is hard to get right. Traditionally, this problem is addressed by writing tedious marshalling code that converts between compiler-selected data representations and the desired compact data formats. Such marshalling code is error-prone and can incur significant runtime overhead due to excessive copying. Many languages and systems address the correctness issue by automating the generation and, in some cases, the verification of the marshalling code, but the performance overhead that this code introduces remains. For systems code in particular, this overhead can be prohibitive. In this work, we address both the correctness and the performance problems.
We present a data layout description language and data refinement framework, called Dargent, which allows programmers to declaratively specify how algebraic data types are laid out in memory. We apply our solution to the Cogent language, but the general ideas behind it are applicable to other settings. The Dargent framework generates C code that manipulates data directly in the desired memory layout, while retaining the formal proof that this generated C code is correct with respect to the functional semantics. This added expressivity removes the need to implement and verify marshalling code, which eliminates copying, smooths interoperability with surrounding systems, and increases the trustworthiness of the overall system.
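To make the idea of manipulating data directly in a prescribed layout more concrete, the following is a small hand-written C sketch. It is not Dargent output and does not use Dargent's layout syntax; the 4-byte record, its field names (`kind`, `urgent`, `len`) and all `record_*` accessor names are hypothetical and chosen only for illustration. The accessors read and write each field in place in the byte buffer, so no separate unpacked copy of the record is ever built.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-byte record layout (an assumption for illustration,
 * not taken from the paper):
 *   byte 0, bits 0-3 : kind   (4-bit unsigned)
 *   byte 0, bit  4   : urgent (1-bit flag)
 *   byte 1           : unused padding
 *   bytes 2-3        : len    (16-bit, little-endian)
 */
enum { RECORD_SIZE = 4 };

/* Read the 4-bit kind field from the low nibble of byte 0. */
static inline uint8_t record_get_kind(const uint8_t *r) {
    return (uint8_t)(r[0] & 0x0Fu);
}

/* Write the kind field, leaving the other bits of byte 0 untouched. */
static inline void record_set_kind(uint8_t *r, uint8_t kind) {
    assert(kind <= 0x0Fu); /* value must fit in 4 bits */
    r[0] = (uint8_t)((r[0] & 0xF0u) | kind);
}

/* Read the urgent flag stored in bit 4 of byte 0. */
static inline int record_get_urgent(const uint8_t *r) {
    return (r[0] >> 4) & 1;
}

/* Set or clear the urgent flag without disturbing the kind field. */
static inline void record_set_urgent(uint8_t *r, int urgent) {
    r[0] = (uint8_t)((r[0] & ~0x10u) | (urgent ? 0x10u : 0u));
}

/* Read the 16-bit little-endian len field from bytes 2 and 3. */
static inline uint16_t record_get_len(const uint8_t *r) {
    return (uint16_t)(r[2] | ((uint16_t)r[3] << 8));
}

/* Write the len field byte by byte, independent of host endianness. */
static inline void record_set_len(uint8_t *r, uint16_t len) {
    r[2] = (uint8_t)(len & 0xFFu);
    r[3] = (uint8_t)(len >> 8);
}
```

A caller would hold a `uint8_t buf[RECORD_SIZE]` that already has the external format, for example a buffer received from a device, and use only these accessors on it. Writing such accessors by hand is exactly the kind of error-prone code the abstract describes; the point of the sketch is only to show what "no marshalling, no copying" looks like at the C level.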
Dargent: A Silver Bullet for Verified Data Layout Refinement