Dargent: A Silver Bullet for Verified Data Layout Refinement

Published: 11 January 2023

Abstract

Systems programmers need fine-grained control over the memory layout of data structures, both to produce performant code and to comply with well-defined interfaces imposed by existing code, standardised protocols or hardware. Code that manipulates these low-level representations in memory is hard to get right. Traditionally, this problem is addressed by the implementation of tedious marshalling code to convert between compiler-selected data representations and the desired compact data formats. Such marshalling code is error-prone and can lead to a significant runtime overhead due to excessive copying. While there are many languages and systems that address the correctness issue, by automating the generation and, in some cases, the verification of the marshalling code, the performance overhead introduced by the marshalling code remains. In particular for systems code, this overhead can be prohibitive. In this work, we address both the correctness and the performance problems.

We present a data layout description language and data refinement framework, called Dargent, which allows programmers to declaratively specify how algebraic data types are laid out in memory. Our solution is applied to the Cogent language, but the general ideas behind it are applicable to other settings. The Dargent framework generates C code that manipulates data directly in the desired memory layout, while retaining the formal proof that this generated C code is correct with respect to the functional semantics. This added expressivity removes the need to implement and verify marshalling code, which eliminates copying, smooths interoperability with surrounding systems, and increases the trustworthiness of the overall system.
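To make the idea of manipulating data directly in its target layout concrete, the following is a minimal hand-written sketch in C. It is not Dargent's generated output: the two-byte header format, the type `header_t`, and the accessor names are all hypothetical, chosen only for illustration. The point is that each field is read and written in place through a getter and setter, so no separate unpacked copy of the record is ever built.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical packed layout (an assumption for illustration only):
 *   byte 0          : low 8 bits of `len`
 *   byte 1, bits 0-3: high 4 bits of `len`
 *   byte 1, bits 4-7: `tag`
 * The record lives in these two bytes; there is no unpacked struct.
 */
typedef struct {
    uint8_t bytes[2];
} header_t;

/* Read the 12-bit `len` field straight out of the packed bytes. */
static inline uint16_t header_get_len(const header_t *h) {
    return (uint16_t)(h->bytes[0] | ((h->bytes[1] & 0x0Fu) << 8));
}

/* Write the 12-bit `len` field in place, leaving `tag` untouched. */
static inline void header_set_len(header_t *h, uint16_t len) {
    h->bytes[0] = (uint8_t)(len & 0xFFu);
    h->bytes[1] = (uint8_t)((h->bytes[1] & 0xF0u) | ((len >> 8) & 0x0Fu));
}

/* Read the 4-bit `tag` field from the upper nibble of byte 1. */
static inline uint8_t header_get_tag(const header_t *h) {
    return (uint8_t)(h->bytes[1] >> 4);
}

/* Write the 4-bit `tag` field in place, leaving `len` untouched. */
static inline void header_set_tag(header_t *h, uint8_t tag) {
    h->bytes[1] = (uint8_t)((h->bytes[1] & 0x0Fu) | ((tag & 0x0Fu) << 4));
}
```

A caller would overlay `header_t` on a buffer received from a device or the network and use only these accessors. A marshalling approach would instead copy both fields into a compiler-laid-out struct and copy them back before writing. Accessors of this kind are easy to get subtly wrong by hand, which is the correctness problem the abstract describes.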
