Modular, compositional, and executable formal semantics for LLVM IR

Published: 19 August 2021

Abstract

This paper presents a novel formal semantics, mechanized in Coq, for a large, sequential subset of the LLVM IR. In contrast to previous approaches, which use relationally-specified operational semantics, this new semantics is based on monadic interpretation of interaction trees, a structure that provides a more compositional approach to defining language semantics while retaining the ability to extract an executable interpreter. Our semantics handles many of the LLVM IR's non-trivial language features and is constructed modularly in terms of event handlers, including those that deal with nondeterminism in the specification. We show how this semantics admits compositional reasoning principles derived from the interaction trees equational theory of weak bisimulation, which we extend here to better deal with nondeterminism, and we use them to prove that the extracted reference interpreter faithfully refines the semantic model. We validate the correctness of the semantics by evaluating it on unit tests and LLVM IR programs generated by HELIX.
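
The general idea of interpreting a tree of events through modular handlers can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the paper's development is in Coq and uses coinductive interaction trees, whereas the sketch below uses finite trees, and every name in it (`Ret`, `Vis`, `Load`, `Store`, `Pick`, `interp_memory`, `run_first`, `run_all`) is hypothetical rather than taken from the paper. Memory events are handled first and leave nondeterministic `Pick` events in the tree; those are then resolved either by an executable handler that makes one choice or by a model handler that collects all outcomes.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical, finite approximation of an interaction tree.
@dataclass
class Ret:
    value: Any                      # finished computation

@dataclass
class Vis:
    event: Any                      # uninterpreted event
    k: Callable[[Any], Any]         # continuation awaiting the event's answer

def bind(t, f):
    """Sequence a tree with a continuation (monadic bind)."""
    if isinstance(t, Ret):
        return f(t.value)
    return Vis(t.event, lambda x: bind(t.k(x), f))

def trigger(event):
    """Emit a single event and return its answer."""
    return Vis(event, Ret)

# Hypothetical event signatures.
@dataclass
class Load:
    name: str

@dataclass
class Store:
    name: str
    value: int

@dataclass
class Pick:
    choices: tuple                  # nondeterministic choice among values

def interp_memory(t, mem):
    """Handle Load/Store against `mem`; re-emit every other event."""
    if isinstance(t, Ret):
        return Ret((mem, t.value))
    e = t.event
    if isinstance(e, Load):
        return interp_memory(t.k(mem[e.name]), mem)
    if isinstance(e, Store):
        return interp_memory(t.k(None), {**mem, e.name: e.value})
    return Vis(e, lambda x: interp_memory(t.k(x), mem))

def run_first(t):
    """Executable handler: resolve each Pick with its first choice."""
    while isinstance(t, Vis):
        t = t.k(t.event.choices[0])
    return t.value

def run_all(t):
    """Model handler: collect every result reachable through Pick."""
    if isinstance(t, Ret):
        return [t.value]
    return [r for c in t.event.choices for r in run_all(t.k(c))]

# x <- pick {0, 1}; store a x; y <- load a; return y + 1
program = bind(trigger(Pick((0, 1))),
    lambda x: bind(trigger(Store("a", x)),
    lambda _: bind(trigger(Load("a")),
    lambda y: Ret(y + 1))))

tree = interp_memory(program, {})
```

In this toy setting, `run_first(tree)` yields one of the results listed by `run_all(tree)`, which loosely mirrors the abstract's claim that the extracted interpreter refines the semantic model; the paper's actual statement is a mechanized proof over its own definitions, not a membership check like this one.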

Supplemental Material

Auxiliary Presentation Video

This is a presentation video of our talk at ICFP'21 for the paper titled "Modular, Compositional, and Executable Formal Semantics for LLVM IR".

3473572.mp4
