Melocoton: A Program Logic for Verified Interoperability Between OCaml and C


INTRODUCTION
In recent years, there has been tremendous progress on developing verification systems based on program logics (in particular, separation logics), which support the modular verification of complex programs in a rich and diverse array of languages. Notable examples include VST-Floyd [Cao et al. 2018] and RefinedC [Sammler et al. 2021] for C, RustBelt [Jung et al. 2017] and RustHornBelt [Matsushita et al. 2022] for Rust, and Cosmo [Mével et al. 2020] for multi-core OCaml, all of which are fully mechanized in Coq.
The above-cited systems all employ a common recipe: (1) Design an operational semantics for the language, which serves as the "ground truth" about program behavior.
(2) Build a (Hoare-style) program logic for the language, which supports higher-level proof rules for compositionally verifying program correctness.
(3) Establish soundness of the program logic by giving an interpretation of the logical judgments (e.g., Hoare triples) in terms of the operational semantics, and verifying the proof rules as lemmas about that interpretation.
This recipe works well for verifying programs written entirely in a single language. However, in practice, programs are not typically written all in one language, but rather linked together from a patchwork of components written in different languages. For instance, they include calls to kernel and runtime primitives implemented in low-level languages such as C and assembly; leverage low-level efficient code implemented in languages like C, C++, and Rust; and reuse large, existing libraries implemented in a different language (e.g., numeric computing libraries written in C). In fact, almost all widely-used programming languages include some kind of foreign function interface (FFI) to interact with code written in other languages (often using C as an intermediary). For these so-called "multi-language programs", there is not yet a recipe for building (provably sound) program logics. For starters, step one (the question of how to define a semantics for multi-language programs) is still an active topic of research [Neis et al. 2015; Patterson et al. 2022; Sammler et al. 2023; Stewart et al. 2015]. Moreover, for steps two and three, there does not yet exist any program logic for reasoning about multi-language programs in which the constituent languages have (as is often the case) very different memory models. In short, the problem of building program logics for multi-language program verification is still wide open.
In this paper, we take the first steps towards filling this gap by proposing a recipe for building multi-language program logics. To keep matters concrete, we focus on a specific instance of the problem: verifying multi-language programs written in a combination of OCaml and C. OCaml has structured values (e.g., sums, pairs, lists, and references), a garbage-collected memory, and a type system providing strong guarantees about its programs. C has integer values and pointers, manually managed memory, and a type system providing only weak guarantees about its programs. Nevertheless, there exists a bridge between the two, the OCaml FFI, which exposes enough of the OCaml runtime to C to convert values, execute callbacks, and share memory.

Key Challenge: Bridging the Language Differences
In fact, in multi-language verification, one of the main challenges (if not the main challenge) is bridging the gap between the languages under consideration: the more the languages differ, the more the single-language reasoning principles that apply to them will differ. In this regard, OCaml and C are truly an "odd couple": they differ in their values (i.e., abstract values in OCaml vs. concrete values in C), memory models (i.e., mutable records and arrays vs. pointers with pointer arithmetic), memory management (i.e., garbage collection vs. manual memory management), type systems (i.e., a strong, polymorphic type system vs. a weak type system), and runtime (i.e., large runtime vs. bare-bones binaries). To illustrate these differences and how they are bridged by the OCaml FFI, let us consider a concrete example: the runtime representation of OCaml values in C. Consider the OCaml type:

    type buf = { cap : int; used : int ref; data : raw_bytes }

This type is used as part of the running example of this paper (in §2), where it exposes buffers implemented in C (i.e., raw chunks of bytes) to OCaml through the OCaml FFI. The data-field stores the underlying bytes, the cap-field the capacity of the buffer, and the used-field how much of the buffer is currently in use (i.e., how many bytes are readable). The type raw_bytes is, as far as OCaml is concerned, abstract (i.e., declared with type raw_bytes without a definition), meaning it does not reveal anything about its contents to OCaml.
From the perspective of OCaml, values of type buf are records {cap, used, data} where the cap-field is an immutable integer, the used-field is a mutable reference, and the data-field stores some kind of immutable value (of an unknown shape). This is not, however, what the OCaml FFI exposes to C as the runtime representation of these values. Instead, nested values (e.g., records and pairs) are exposed as blocks of memory, runtime blocks, which are nested using pointers. A value of type buf is a pointer to a block with three elements, one for each field of the record (as depicted in Fig. 1). The first field stores the runtime representation of the cap-integer; the second field stores the used-reference, a pointer to a block that stores the number of used bytes; and the third field stores a pointer to a so-called "custom" block, which embeds C data into OCaml (e.g., raw C pointers). In this case, the custom block stores a pointer to the underlying bytes of the buffer.
Clearly, the OCaml view of buf values and their runtime representation in C are different. For one, in C, they reside in memory (with a concrete address where they are stored) whereas, in OCaml, they are conceptually "just values" (i.e., they are copied when passed around). However, that is not the only difference. The fields of buf are immutable (e.g., cap cannot change), whereas the used-reference is mutable. From the perspective of C, however, both records and references have the same representation, runtime blocks. It is up to the programmer to remember that one of them is allowed to be mutated whereas the other is not. Finally, C code can embed its own data into OCaml through custom blocks (here the data-field) and it can access and modify that data as needed, whereas the data-field of the record is completely opaque to OCaml.
In program code, the differences between OCaml and C are bridged through so-called "glue code", code which uses the OCaml FFI to link up program parts written solely in either OCaml or C. For program verification, the story is not so clear yet. For instance, when we reason about glue code, how do we reconcile the two very different views that OCaml and C have on values of type buf? To design a multi-language program verification analogue of the single-language recipe, we have to bridge the language gap at two different levels:
The operational semantics. At the operational semantics level, the challenge is that no existing multi-language semantics explains how to reconcile the key differences between both languages (i.e., interacting with the OCaml garbage collector and runtime, registering "roots", executing callbacks, etc.). However, there are promising starting points. Patterson et al. [2022] combine a garbage collected language and a language with manual memory management. Unfortunately, they do so using an elaboration semantics to a shared target language with garbage collection, which does not exist in the case of OCaml and C. Sammler et al. [2023] propose an approach to multi-language semantics that uses modular combinators to connect languages by composing their operational semantics. As such, their approach fits well with step one of the single-language recipe. Unfortunately, while they consider languages with significant differences (e.g., different calling conventions and memory models), they do not consider garbage collection or a runtime.
The program logic. At the program logic level, the main challenge is preserving language-local reasoning. That is, chunks of code that are solely written in C or solely written in OCaml should enjoy reasoning principles that are fine-tuned to their respective language (without being impacted by the existence of the other language). However, preserving language-local reasoning is easier said than done. For one, the two program logics will have different views on the same piece of state (e.g., a buf-value in OCaml will be viewed as nested runtime blocks on the C side). Changes to runtime blocks on the C side (e.g., updating the block storing the number of used bytes) correspond to changes in the state on the OCaml side (e.g., changing the used-reference in the buffer). Another, more subtle issue with preserving language-local reasoning is that code written in one language can potentially violate language-specific invariants of the other. For example, OCaml, as a functional language, makes pervasive use of immutable values. When linked with C, the C code can observe the runtime representation of those values and, in principle, mutate their contents, albeit with unknown consequences as far as the OCaml semantics is concerned.

Melocoton
In this paper, we present Melocoton, a multi-language program verification system for reasoning about OCaml, C, and their interactions through the OCaml FFI. It extends the single-language recipe for program verification to a multi-language setting as follows:
(1) As a starting point, we take simplified versions of C and OCaml called λ_C and λ_ML (see §2.2) and, following the single-language recipe, we define their canonical operational semantics, →_C and →_ML. Moreover, again following the recipe, we derive language-specific program logics for them. Since both languages have non-trivial state, we derive two separation logics, called Iris_C and Iris_ML, in the separation logic framework Iris [Jung et al. 2015, 2018].
(2) We extend the language-local program logics Iris_C and Iris_ML with reasoning principles for "external calls". That is, functions potentially implemented in another language (e.g., in C) are exposed to the language-local logic (e.g., to Iris_ML) through an interface. Conceptually, for each external call, the interface gives a precondition and a postcondition surrounding the call. (We do not add any rules to →_C and →_ML.)
(3) Finally, for the combined language, λ_ML+C, we develop an operational semantics −↠_ML+C, which embeds the operational semantics of λ_C and λ_ML and adds rules to bridge between them. Moreover, we develop a separation logic, Iris_ML+C, which embeds Iris_C and Iris_ML and extends their reasoning principles to bridge between the two logics. We then prove that Iris_ML+C is sound with respect to the joint semantics −↠_ML+C.
In Melocoton, the verification of a mixed OCaml and C program can be done almost entirely in the language-local logics Iris_C and Iris_ML. The notion of "external calls" allows us to abstract over which language is "on the other end" of the call, and the interfaces of external calls (i.e., their pre- and postconditions) are written in the language-local logics. Hence, even for external calls, we do not have to leave the language-local logics (e.g., functions declared external in OCaml, although actually implemented in C, still have an Iris_ML specification). In fact, even code interacting with the OCaml FFI on the C side can stay in the language-local logic Iris_C (see §2.4). The only purpose of the "umbrella logic" Iris_ML+C is to tie together language-local verifications and ensure that the assumptions of one side match up with the guarantees of the other. The key to matching up both sides is providing reasoning rules for what we call the "view reconciliation problem": reconciling different logical views on the same shared state (see §2.5).

Contributions.
The main contribution of this paper is Melocoton, a multi-language program verification system for programs written in OCaml and C. Melocoton consists of the first formal semantics of (a large subset of) the OCaml FFI, previously only described in prose in the OCaml manual [oca 2023b], and the first program logic to reason about the interactions of OCaml and C.
Melocoton's operational semantics (§3) is the first multi-language semantics that models interactions with an OCaml-style garbage collector. To define it, we take inspiration from Sammler et al. [2023]: we define an operational semantics that modularly embeds the semantics of λ_ML and λ_C, and we use angelic and demonic non-determinism to bridge the gap between the two semantics. As mentioned above, the work of Sammler et al. does not cover the kind of language interactions that are possible through the OCaml FFI, which are what makes interoperability between OCaml and C interesting. To capture them, we use a carefully designed model of the core aspects of the OCaml runtime (e.g., runtime blocks, garbage collection, "roots", callbacks, etc.).
Melocoton's program logic (§4) is the first separation logic (and program logic) for a multi-language setting with different memory models (including garbage collection). We use separation logic to reason about the state of OCaml and C, and we use step-indexing, inherited from Iris, to handle the recursive and higher-order features of OCaml. To soundly combine (1) angelic non-determinism and (2) the higher-order features of OCaml, as it turns out, we have to use a richer form of step-indexing than what is provided out of the box by Iris: we have to use transfinite step-indexing and, thus, define Iris_ML+C in Transfinite Iris [Spies et al. 2021] (see §4.3).
We explain the key ideas behind Melocoton (in §2) with our running example: importing a compression library from C into OCaml. Besides the compression library, we have applied Melocoton to several interesting examples (in §5): a polymorphic equality function that cannot be implemented natively in OCaml, a list implementation that alternates between OCaml and C memory blocks, a version of Landin's knot [Landin 1964] whose mutual recursion goes through the FFI, and an abstract data type that stores OCaml callbacks in C memory. To show that the last two examples, even though they are implemented in C, are type safe (i.e., they can be safely used from arbitrary, well-typed OCaml code), we define a standard logical relation in Iris_ML+C, and extend it with reasoning principles for external calls. Melocoton is fully mechanized in Coq, and the Coq development can be found in the supplementary material [Guéneau et al. 2023]. The current development version of Melocoton and further information is available from the project webpage at https://melocoton-project.github.io.

MELOCOTON BY EXAMPLE
We explain the key ideas underlying Melocoton using a motivating example: importing a C compression library into OCaml. We focus on the program logic level of Melocoton and follow the spirit of the multi-language recipe: first develop language-local reasoning principles, then focus on language interoperability. Concretely, we discuss the C library and an OCaml client (§2.1), we explain how to reason about them locally (§2.2), we discuss the OCaml FFI "glue code" that ties them together (§2.3), we explain how to reason about the glue code (§2.4), and, finally, we address the view reconciliation problem (§2.5). Throughout this paper, we use colors to distinguish different languages: magenta for OCaml, dark blue for C, and light blue for primitives of the OCaml FFI.

The C Library and OCaml Client
In this example, depicted in Fig. 2, we want to import a C compression library into OCaml. One can imagine that the implementation of the compression library is particularly efficient in C, or that it already exists as a separate, standalone library and we want to avoid rewriting it. In this particular example, we take inspiration from the interface of Google's Snappy compression library [sna 2023].
For the purposes of the example, the implementation of the compression algorithm is not relevant.
The C library. The function snappy_compress can be used to compress an input buffer inp (i.e., a raw C pointer to a chunk of bytes) of length insz into an output buffer outp of capacity outsz. After compression, the pointer outsz stores the actual length of the compressed data. To make sure that the capacity of the output buffer outp suffices, the library provides the function snappy_max_compressed_length, which computes an upper bound on the compressed size.
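The calling convention described above (query an upper bound, allocate the output buffer, then let the callee overwrite the capacity with the actual length) can be sketched as follows. This is only a sketch: toy_max_compressed_length and toy_compress are hypothetical stand-ins that use a trivial run-length encoding, not the Snappy functions, and shrinks_when_compressed is an illustrative C-side client that does not appear in the paper.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for snappy_max_compressed_length: the toy
 * run-length encoding below emits at most 2 bytes per input byte. */
size_t toy_max_compressed_length(size_t insz) { return 2 * insz; }

/* Hypothetical stand-in for snappy_compress. On entry, *outsz holds the
 * capacity of outp; on success (return value 0), *outsz holds the actual
 * compressed length. Returns 1 if the output buffer is too small. */
int toy_compress(const char *inp, size_t insz, char *outp, size_t *outsz) {
    size_t cap = *outsz, o = 0, i = 0;
    while (i < insz) {
        size_t run = 1;
        while (i + run < insz && inp[i + run] == inp[i] && run < 127) run++;
        if (o + 2 > cap) return 1;   /* capacity exhausted */
        outp[o++] = (char)run;       /* run length (at most 127) */
        outp[o++] = inp[i];          /* repeated byte */
        i += run;
    }
    *outsz = o;
    return 0;
}

/* Illustrative client: 1 if the input shrinks, 0 if not, -1 on failure. */
int shrinks_when_compressed(const char *inp, size_t insz) {
    size_t outsz = toy_max_compressed_length(insz);
    char *outp = malloc(outsz ? outsz : 1);
    if (outp == NULL) return -1;
    int status = toy_compress(inp, insz, outp, &outsz);
    free(outp);                      /* manual memory management in C */
    if (status != 0) return -1;
    return outsz < insz;
}
```

The in/out use of outsz is the part that matters for the specifications discussed later: the caller must own both the buffer and the size cell before the call and gets both back afterwards.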
The OCaml client. The OCaml client is_compressible wants to use the compression library to check whether an array of characters will shrink in size when compressed. When we implement it in OCaml, the first stumbling block that we encounter is that is_compressible takes in an OCaml array of characters (i.e., a built-in array managed by the OCaml garbage collector), but the C library function snappy_compress expects two char pointers (i.e., raw pointers to C buffers of bytes). The issue is that the two types, char array and char *, are not the same; they are not even part of the same language.
To circumvent this problem, we introduce glue code: C code that uses the OCaml FFI to mediate between is_compressible and snappy_compress. For now, we delay a discussion of the C implementation of the glue code (to §2.3) and focus on its OCaml interface (Lines 3-9). The interface consists of (1) a small buffer library (loosely inspired by OCaml's Bigarray library [oca 2023a]) and (2) wrappers of the compression functions operating on buffers. The interface declares an abstract type raw_bytes, a record for buffers buf, and four functions to operate on buffers: buf_alloc allocates an uninitialized buffer of a given capacity (initially using 0 bytes); buf_free frees a buffer (since memory management is manual in C, we have to free allocated buffers); buf_upd updates a range of the buffer using a callback (additionally increasing the number of used bytes if necessary); and buf_get reads a byte of the buffer. The compression functions are wrapped as wrap_max_len (for snappy_max_compressed_length) and wrap_compress (for snappy_compress).
The OCaml client is_compressible uses the glue code to check whether an array of characters can shrink in size as follows: it allocates an input buffer inp and an output buffer outp that is large enough to store the compressed input; it then fills the input buffer with the contents of the array; it compresses the input into the output buffer; it checks whether the output size is smaller than the input size; and, finally, it frees the two buffers and returns the result. (The function is_compressible only illustrates how one can use the external compression function from OCaml. A more realistic client would not "throw away" the compressed buffer like this.)

Language-Local Reasoning
Before we turn to the implementation of the glue code that connects the OCaml client and the C library, let us first illustrate one of the central ideas of Melocoton: preserving language-local reasoning. Even without knowing the implementation of the glue code (or the OCaml FFI), we can already verify the C compression library and the OCaml client in language-specific program logics. As mentioned in §1, we consider rather idealized versions of C and OCaml in this paper, called λ_C and λ_ML, which are depicted in Fig. 3. We first discuss the language λ_C and verify the compression library in Iris_C. Then, we contrast λ_C with λ_ML and verify the client in Iris_ML.
The language λ_C. The essential features of λ_C are its very simple form of values (i.e., integers n or addresses a) and its flat memory model. To be precise, memory in λ_C is a finite map from addresses to memory cells, either values or special tokens indicating that an address has been freshly allocated and is uninitialized (★) or has been freed already (†). Memory is allocated with malloc(n) and has to be manually deallocated again with free(a, n). Executing malloc(n) returns an address a which points to the first of n consecutive heap cells. To access heap cells other than a, pointer arithmetic can be used (i.e., "a + n"). Programs p, in λ_C, are lists of functions (where no function is defined twice).
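To make the flat memory model concrete, here is a small executable model of it, written as a sketch under stated assumptions. All names (struct heap, model_malloc, model_free, model_store, model_load) are hypothetical; the bump allocator, the fixed heap size, and the choice to report an error code where the semantics would get stuck (for instance when loading from an uninitialized or freed cell) are modelling decisions made here, not details taken from the paper.

```c
#include <stddef.h>

#define HEAP_CELLS 64

/* A cell is unallocated, uninitialized (★), a value, or freed (†). */
enum cell_kind { CELL_UNALLOCATED = 0, CELL_UNINIT, CELL_VALUE, CELL_FREED };

struct cell { enum cell_kind kind; long value; };
struct heap { struct cell cells[HEAP_CELLS]; size_t next; };

/* Allocate n consecutive uninitialized cells; returns the first address
 * (an index into the model heap) or -1 if n is 0 or space runs out. */
long model_malloc(struct heap *h, size_t n) {
    if (n == 0 || n > HEAP_CELLS - h->next) return -1;
    size_t a = h->next;
    for (size_t i = 0; i < n; i++) h->cells[a + i].kind = CELL_UNINIT;
    h->next += n;
    return (long)a;
}

/* A cell is live if it was allocated and has not been freed. */
static int is_live(const struct heap *h, long a) {
    return a >= 0 && (size_t)a < h->next &&
           (h->cells[a].kind == CELL_UNINIT || h->cells[a].kind == CELL_VALUE);
}

/* Free n cells starting at a; fails (-1) if any of them is not live. */
int model_free(struct heap *h, long a, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (!is_live(h, a + (long)i)) return -1;
    for (size_t i = 0; i < n; i++) h->cells[a + (long)i].kind = CELL_FREED;
    return 0;
}

/* Store v at address a; only live cells may be written. */
int model_store(struct heap *h, long a, long v) {
    if (!is_live(h, a)) return -1;
    h->cells[a].kind = CELL_VALUE;
    h->cells[a].value = v;
    return 0;
}

/* Load from address a; in this model only initialized cells are readable. */
int model_load(const struct heap *h, long a, long *out) {
    if (!is_live(h, a) || h->cells[a].kind != CELL_VALUE) return -1;
    *out = h->cells[a].value;
    return 0;
}
```

Pointer arithmetic shows up as plain index arithmetic: after `long a = model_malloc(&h, 3)`, the cells `a`, `a + 1`, and `a + 2` are individually addressable, and a double free of the same range is rejected.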
Verifying the C library. The language λ_C gives rise to Iris_C, a simple, language-specific separation logic, depicted in Fig. 4. In the context of our running example, we use Iris_C to prove correctness of the two compression library functions. Since Iris_C is a standard separation logic (e.g., see write-c, read-c, alloc-c, free-c, and call-internal), we only discuss the function specifications as the proofs themselves are routine. To verify a function in Iris_C, one proves Hoare triples of the form "$\{P\}\ \mathsf{call}\ \mathit{fn}\ \vec{w}\ @\ p, \Psi\ \{Q\}_{C}$" where $\mathsf{call}\ \mathit{fn}\ \vec{w}$ is the function call construct of λ_C, p is the surrounding program, P and Q are the pre- and postconditions, and Ψ is an "interface". (Interfaces are only used for external calls to other languages; we will ignore them for now and come back to them shortly.) For the function snappy_max_compressed_length, we show a triple of this form in which lib is the surrounding compression library (and the interface is the empty interface ∅). Since we are not concerned with the details of the compression algorithm, we keep the upper bound (and other details of compression) abstract in the form of predicates (here "maxlen").
For snappy_compress, we want to prove that if we pass in two buffers (one containing a sequence of integers to compress and one large enough to store the output), the compressed version of the input is stored in the output buffer. To make this intuition formal, we can use the points-to assertion a ↦_C cl of Iris_C. It conveys ownership over the memory address a and asserts that it currently stores the memory cell cl. The specification that we show for snappy_compress is phrased in terms of a buffer assertion, which states that an address stores a buffer of capacity n with the first k cells storing the integers v_0, ..., v_{k−1}, and we write a ↦_C _ ≜ ∃cl ∈ Val ⊎ {★}. a ↦_C cl whenever the contents of a are irrelevant and a has not been freed yet. In other words, we prove that if we pass in a sufficiently large output buffer "out", then compression is successful (i.e., return value 0), "out" will afterwards store the compressed version of the input "in" (captured by $\mathsf{cpr}(\overrightarrow{\mathit{in}}, \overrightarrow{\mathit{out}})$), and outsz will store the length of the compressed buffer.
The language λ_ML. Let us now turn to λ_ML. In contrast to λ_C, the language has a very rich notion of values: integers, locations, booleans, unit, pairs, sums, foreign values (discussed shortly), and closures. Moreover, the memory model of λ_ML is very different. The heap of λ_ML is a map from (abstract) locations to lists of values. Superficially, this may seem similar to the memory of λ_C. However, there are several crucial differences: First, memory management in λ_ML relies on garbage collection, meaning memory is allocated on the heap with alloc V, but never has to be freed manually. Instead, conceptually, memory is garbage collected once no part of the program can access it anymore. Second, the memory of λ_ML is more complex, because it can store (lists of) arbitrary values, regardless of their shape or size (e.g., ⟨⟨n, m⟩, rec f x. e⟩, a pair of another pair and a closure). Third, locations in λ_ML store entire lists of values. Each location ℓ models a mutable array, whose length can be determined with length ℓ, whose elements can be retrieved with ℓ.(i), and whose elements can be updated with ℓ.(i) ← V. In contrast to λ_C, there is no "address arithmetic" (i.e., the expression ℓ + i is stuck). A fourth, minor difference to λ_C is that there are no top-level function declarations in λ_ML (i.e., p = ∅) and, instead, in λ_ML we execute single expressions e (which internally can contain let-bindings and mutual recursion).
There are three aspects of λ_ML that enable language interoperability. They are non-intrusive and minimal such that λ_ML does not even (need to) know which languages it is interacting with. First, there are foreign values, each of which carries an abstract identifier. They can be used by other languages such as λ_C to "embed" their own data into λ_ML. We will use this feature to embed the underlying C buffer into OCaml in our running example (in §2.5). Foreign values are abstract: they can be passed around via function calls, but there is no language construct in λ_ML to inspect their contents. Second, after executing an external function, locations in the heap of λ_ML can store data that is temporarily inaccessible from λ_ML (see §3.1). Whenever this is the case, the λ_ML heap stores a special marker at the respective location and accesses in λ_ML will get stuck. Third, and most interestingly, λ_ML contains a language construct for external function calls, $\mathsf{call}\ \mathit{fn}\ \vec{V}$. This syntactic construct has no meaning in the language-local semantics of λ_ML (i.e., it is stuck in the semantics of λ_ML). As such, λ_ML does not have to include any rules in its semantics for executing code of other languages (e.g., λ_C). Instead, we will assign meaning to external calls in the combined multi-language λ_ML+C (see §3), where external calls in λ_ML are linked with their implementation in λ_C.
Verifying the OCaml client. Let us turn to the verification of is_compressible in Iris_ML. We prove that given ownership over an array of integers, the function returns an unspecified boolean, meaning $\{\ell \mapsto_{ML} \vec{V}\}\ \texttt{is\_compressible}\ \ell\ @\ \emptyset, \Psi_{buf}\ \{V'.\ V' \in \{\mathsf{true}, \mathsf{false}\} * \ell \mapsto_{ML} \vec{V}\}_{ML}$. This specification is not particularly exciting, but it suffices to illustrate the interaction with the external functions implemented in C. The resources of Iris_ML (e.g., "$\ell \mapsto_{ML} \vec{V}$") and its reasoning rules (e.g., read-ml, write-ml, and alloc-ml) are standard. Thus, we focus mainly on the treatment of external function calls. External calls allow us to slice the verification of a program into the language-local parts and the parts that are implemented in another language. In is_compressible, the first time that we call an external function is when we allocate the initial buffers inp and outp with buf_alloc (Line 12). Suppose for a moment, for the sake of explanation, that buf_alloc was implemented in OCaml. In this hypothetical scenario, the standard way to proceed would be to prove (once) and afterwards apply (multiple times) a Hoare triple for buf_alloc with precondition $n > 0$ and postcondition $\mathsf{buffer}_{ML}(V, n, [])$, where $\mathsf{buffer}_{ML}(V, n, \vec{m})$ asserts that V is a tuple containing the buffer capacity n, a λ_ML reference ℓ for the used field, and a value V′ containing the underlying integers $\vec{m}$. How V′ stores the integers $\vec{m}$ is irrelevant for the verification of is_compressible, meaning we can stay at the abstraction of "$\mathsf{raw\_bytes}(V', n, \vec{m})$". (As we will see in §2.5, the definition of raw_bytes uses λ_ML's "foreign values" and knowledge about the OCaml FFI.) After applying the Hoare triple, we can then use the buffer that buf_alloc returns to resume the verification of is_compressible.
Of course, buf_alloc is not actually implemented in OCaml. Nevertheless, in our proof of is_compressible, we want to stay as close as possible to the language-local reasoning sketched out above. To do so, we introduce reasoning principles to "skip" external function calls by drawing a boundary at their specification. For example, in the verification of is_compressible, we want to prove the precondition $n > 0$ of buf_alloc to call it and then resume afterwards with the postcondition $\mathsf{buffer}_{ML}(V, n, [])$. To make this reasoning sound, we take inspiration from the work on open simulations [Hur et al. 2012] and of de Vilhena and Pottier [2021] on a program logic for effect handlers. Concretely, we parameterize Hoare triples by an interface Ψ ∈ Intf(Val) ≜ FnName → list(Val) → (Val → iProp) → iProp that maps each external call to its specification. Formally, an interface Ψ is a "predicate transformer" that takes in a function name fn, a list of argument values $\vec{v}$, and a postcondition Φ and then produces the precondition "$\Psi\ \mathit{fn}\ \vec{v}\ \Phi$" required to call fn with arguments $\vec{v}$. The way that "skipping" function calls works with predicate transformers (see call-external) is that we show that the current precondition implies $\Psi\ \mathit{fn}\ \vec{v}\ \Phi$, where Φ is our desired postcondition.
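For example, in the case of the buffer library, the interface specifies all the library functions as a disjunction of single-function interfaces, where $(\Psi_1 \sqcup \Psi_2)\ \mathit{fn}\ \vec{v}\ \Phi \triangleq \Psi_1\ \mathit{fn}\ \vec{v}\ \Phi \lor \Psi_2\ \mathit{fn}\ \vec{v}\ \Phi$. For buf_alloc specifically, the interface is a single-function predicate transformer built from the precondition $n > 0$ and the postcondition $\mathsf{buffer}_{ML}(V, n, [])$, which we, more idiomatically, write with angle brackets in place of the curly braces of a Hoare triple (note the angle brackets!) to give single-function interfaces their familiar "Hoare triple reading". With the interface Ψ_buf for the library in hand, the verification of is_compressible in Iris_ML proceeds smoothly as if the functions were implemented in λ_ML (using call-external).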
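As an illustration of how a single-function interface can be read as a predicate transformer, the following shows one plausible unfolding for buf_alloc. It assumes the usual weakest-precondition style encoding and reuses the precondition $n > 0$ and the postcondition $\mathsf{buffer}_{ML}(V, n, [])$ mentioned in the text; the names $\Psi_{\mathtt{buf\_alloc}}$, $\vec{v}$, and $\Phi$ are chosen here for illustration, and the exact definition in the paper may differ.

```latex
% Hypothetical unfolding of the single-function interface for buf_alloc.
\Psi_{\mathtt{buf\_alloc}}\ \mathit{fn}\ \vec{v}\ \Phi \;\triangleq\;
  \exists n.\ \mathit{fn} = \mathtt{buf\_alloc} \;*\; \vec{v} = [n] \;*\; n > 0 \;*\;
  \big(\forall V.\ \mathsf{buffer}_{ML}(V, n, []) \mathrel{-\!\!*} \Phi(V)\big)

% The same interface in angle-bracket ("Hoare triple") notation.
\langle\, n > 0 \,\rangle\ \ \mathtt{buf\_alloc}\ n\ \
  \langle\, V.\ \mathsf{buffer}_{ML}(V, n, []) \,\rangle
```

Read this way, establishing $\Psi_{\mathtt{buf\_alloc}}\ \mathit{fn}\ \vec{v}\ \Phi$ amounts to proving the precondition of the call and showing that the chosen postcondition $\Phi$ follows from whatever the callee guarantees for every possible return value $V$.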

A Primer of the OCaml FFI
Having verified the OCaml client and the C compression library (from Fig. 2), we now turn to the "glue code" that connects them. Before we can verify any glue code (in §2.4), we first have to understand the central concepts of the OCaml FFI and how they are used in our example. To explain them, we focus on the implementation of buf_alloc (in Fig. 5); the implementation of the remaining glue code functions can be found in the supplementary material [Guéneau et al. 2023].
The representation of OCaml values. In C, all OCaml values (regardless of their type, including buf, raw_bytes, int, and char array) are exposed by the OCaml runtime as runtime values of type value. They are either integers or pointers to runtime blocks (i.e., chunks of memory in the heap of the OCaml runtime). Integers are used to encode OCaml's integers (i.e., int) and other simple types such as bool and unit. Pointers are used to encode OCaml's structured values (e.g., pairs, data types with arguments, arrays, etc.). For example, as depicted in Fig. 1, a value of the buffer record is represented as a pointer to a runtime block with three values, one storing an integer for the capacity, one storing a reference for the used field, and another storing the underlying C buffer. To embed the C buffer, a raw C pointer which is not of type value, into OCaml, the runtime offers so-called "custom blocks". Custom blocks are runtime blocks that embed native C data (e.g., pointers) into OCaml as foreign values.
Manipulating OCaml values in C. As a function using the OCaml FFI, buf_alloc has to interact correctly with the OCaml runtime primitives (in light blue) and the OCaml garbage collector. We will ignore the garbage collector for now (along with the primitives for working with it in Line 17 and CAMLreturn in Line 24) and come back to it below. Instead, we focus on how to operate on runtime values. For integers, converting between an integer n and the representation of n as a runtime value is simple: the runtime provides the primitives Val_int (read "integer to value", see Line 19) and Int_val (read "value to integer", see Line 21). These primitives actually do something (e.g., Val_int converts n to 2n + 1 and Int_val right-shifts it back) because the OCaml runtime uses the least significant bit as a tag to distinguish integers from pointers.
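The tagging scheme can be sketched in a few lines of standalone C. The functions below are a model of the encoding described above, not the runtime's own macros: the names model_val_int, model_int_val, and model_is_int are hypothetical, and the sketch assumes that right-shifting a negative signed integer is an arithmetic shift, as it is on mainstream compilers.

```c
#include <stdint.h>

/* Model of a runtime value: a machine word whose least significant bit
 * distinguishes integers (1) from word-aligned pointers (0). */
typedef intptr_t model_value;

/* "Integer to value": encode n as 2n + 1. The shift is done on the
 * unsigned type to avoid undefined behavior for negative n. */
model_value model_val_int(intptr_t n) {
    return (model_value)(((uintptr_t)n << 1) | (uintptr_t)1);
}

/* "Value to integer": drop the tag bit again (assumes an arithmetic
 * right shift for negative values). */
intptr_t model_int_val(model_value v) {
    return v >> 1;
}

/* An encoded integer always has its lowest bit set. */
int model_is_int(model_value v) {
    return (int)(v & 1);
}
```

For instance, the integer 3 is encoded as 7, and decoding 7 yields 3 again; a pointer returned by an allocator with at least two-byte alignment has its lowest bit clear, so it can never be confused with an encoded integer.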
Interacting with structured values is more subtle. Recall the runtime representation of the buffer record {cap, used, data} depicted in Fig. 1. To create it, buf_alloc proceeds as follows: it allocates a runtime block r of size one for the reference used using the primitive caml_alloc (in Line 18) and initializes it to zero using the primitive Store_field (in Line 19); it allocates a "custom" runtime block bk (in Line 20) using caml_alloc_custom; it allocates a C buffer of the right length and stores it in the custom block (in Line 21); it allocates a block for the buf record and stores the capacity (here len), the used reference, and the custom block in it (in Lines 22-23); and, finally, it returns the newly created runtime block (in Line 24).
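The construction steps above can be mirrored in a self-contained model. This is a sketch, not the glue code of Fig. 5: plain calloc and malloc stand in for caml_alloc and caml_alloc_custom, there is no garbage collector (so no roots are registered), and every name prefixed with mv_ or model_ is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef intptr_t mvalue;   /* model of the runtime's value type */

mvalue mv_int(intptr_t n) { return (mvalue)(((uintptr_t)n << 1) | (uintptr_t)1); }
intptr_t mv_to_int(mvalue v) { return v >> 1; }
int mv_is_block(mvalue v) { return (v & 1) == 0; }

/* Stand-in for caml_alloc: a zero-filled block of `fields` words. */
mvalue mv_alloc(size_t fields) {
    mvalue *blk = calloc(fields, sizeof *blk);
    assert(blk != NULL);
    return (mvalue)blk;    /* malloc-family alignment keeps the low bit 0 */
}

mvalue mv_field(mvalue b, size_t i) { return ((mvalue *)b)[i]; }
void mv_store_field(mvalue b, size_t i, mvalue v) { ((mvalue *)b)[i] = v; }

/* Stand-in for caml_alloc_custom: a block holding one raw C pointer. */
mvalue mv_alloc_custom(void) {
    unsigned char **slot = malloc(sizeof *slot);
    assert(slot != NULL);
    *slot = NULL;
    return (mvalue)slot;
}

unsigned char **mv_custom_contents(mvalue c) { return (unsigned char **)c; }

/* Model of buf_alloc: builds the three-block structure for
 * { cap; used; data }. `len` is an encoded (tagged) integer. */
mvalue model_buf_alloc(mvalue len) {
    mvalue r = mv_alloc(1);                    /* block for the used reference */
    mv_store_field(r, 0, mv_int(0));           /* initially 0 bytes in use */
    mvalue bk = mv_alloc_custom();             /* custom block for C data */
    *mv_custom_contents(bk) = malloc((size_t)mv_to_int(len));
    mvalue b = mv_alloc(3);                    /* block for the buf record */
    mv_store_field(b, 0, len);                 /* cap */
    mv_store_field(b, 1, r);                   /* used */
    mv_store_field(b, 2, bk);                  /* data */
    return b;
}
```

Note that the capacity is stored in its encoded form, while the C buffer is allocated using the decoded length; mixing up the two views is exactly the kind of mistake that a specification of the FFI primitives is meant to rule out.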
Tiptoeing around the OCaml garbage collector. One fundamental difference between OCaml and C reveals itself only implicitly in the code in Fig. 2: the presence of the OCaml garbage collector (GC). That is, memory management in OCaml is based on garbage collection, meaning once a runtime allocated object (i.e., one that is represented as a runtime block) becomes unreachable (i.e., no part of the OCaml code can access it anymore), the GC can deallocate it to reduce the amount of memory consumed. This is in stark contrast to the manual memory management of C, where memory has to be explicitly allocated and eventually deallocated.
One immediate consequence of this difference is that the buffer library needs to provide a buf_free function to OCaml (Line 5), because otherwise the buffer will stay in memory forever. A more burdensome and subtle consequence is that we have to make sure that the GC does not invalidate references to runtime blocks that we still want to use. That is, when we define an external function in C, the GC may execute whenever we make calls to (certain) OCaml FFI runtime functions (e.g., caml_alloc). To prevent it from invalidating our local references (by deallocating or moving the runtime blocks that they point to), we have to "register" our local references with the GC as so-called "roots" (using CAMLlocal and CAMLparam in Line 17). Eventually, when we no longer need them, we can unregister our roots with the GC again (using CAMLreturn in Line 24).

An Interface for the OCaml FFI
How should we verify glue code such as buf_alloc in Fig. 5? On the one hand, the code is written in C, so it would be natural to use the language-local logic Iris C to reason about it. On the other hand, semantically, the code is more concerned with OCaml values (and the runtime representation thereof) than with C data structures, C pointers, and C-specific features. We answer this question with our next key idea: importing a logic for the OCaml FFI into the language-local logic Iris C.
Conceptually, "the OCaml FFI" is (1) a lower-level model of OCaml's values and heap that is exposed to C and (2) a set of "runtime primitives" available to C to operate on this lower-level representation. Together, these two parts of the OCaml FFI form a clean abstraction over the actual, underlying C data representation (and the implementation of the garbage collector). We use them as a "middle ground" between C and OCaml in this work. Concretely, (1) we define a notion of runtime values, runtime heaps, and runtime separation logic resources and (2) we specify the runtime primitives as external functions in Iris C using an interface Ψ FFI. In doing so, we get access to abstract reasoning principles about the OCaml runtime while retaining the language-local reasoning principles of Iris C.
The resources of the OCaml runtime. To define the runtime interface, we take the view that memory exposed by the OCaml FFI is an abstract heap of runtime blocks and then relate this heap to both its C and OCaml representations (i.e., map it to pointers and integers in C and map it to structured values and references in OCaml). We will describe the interface of the OCaml runtime in more detail in §4.2. For now, we just take a glimpse at its values, resources, and rules.
The values of the OCaml runtime, $v \in \mathit{Val} ::= n \; (n \in \mathbb{Z}) \mid \gamma \; (\gamma \in \mathit{Loc})$, are either integers $n$ or abstract runtime locations $\gamma$. Runtime locations "store" runtime blocks, which we track through separation logic resources. For standard blocks, we have $\gamma \mapsto_{\mathrm{blk}[t|m]} \vec{v}$, which says that $\gamma$ currently stores a block of values $\vec{v}$. It additionally asserts that the tag of the block is $t$ and that the mutability of the block is $m \in \{\mathrm{imm}, \mathrm{mut}, \mathrm{fresh}\}$: blocks can be immutable (imm, e.g., for OCaml pairs), mutable (mut, e.g., for OCaml references), or fresh (fresh, i.e., it has not been decided yet whether $\gamma$ will be a mutable or immutable value). For "custom" blocks, which embed C data into OCaml, we have $\gamma \mapsto_{\mathrm{cstm}} w$, which says that $\gamma$ currently stores the C-value w. Custom blocks are always mutable. To use these abstract runtime values from C, we relate them to concrete C-values w. This correspondence is the relation $v \sim_C^{\theta} w$: it takes in a finite map $\theta$ from runtime locations to C-addresses, and then relates runtime integers $n$ with the C-value $\hat{n}$ representing them (where $\hat{\cdot}$ translates integers to C-values using the encoding also used by Val_int) and runtime locations $\gamma$ with their C-addresses $\theta(\gamma)$. The map $\theta$ allows us to model the behavior of the garbage collector (i.e., moving and deallocating runtime blocks). It is not constant. Instead, the GC may (1) move blocks in memory, which changes their physical address in C (i.e., $\theta(\gamma)$ changes) but not their "identity" in the runtime (i.e., $\gamma$ remains unchanged) and (2) deallocate blocks that have become unreachable, which keeps the block in the abstract runtime (i.e., $\gamma$ is not removed from the runtime heap), but its physical address $\theta(\gamma)$ vanishes. To keep track of the current map $\theta$, we introduce an additional separation logic resource $\mathrm{GC}(\theta)$, which asserts that the current map from runtime locations to C addresses is $\theta$.
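To make the role of the address map concrete, here is a minimal executable sketch of the correspondence between runtime values and C-values. Everything in it is an illustrative assumption rather than part of the formal development: the types rt_value and addr_map, the function repr_c, the bound MODEL_MAX_LOCS, and the choice to represent the finite map as two fixed-size arrays.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MODEL_MAX_LOCS = 8 };

/* A runtime value: either an integer n or an abstract location gamma. */
typedef struct {
    int is_loc;        /* 0: integer, 1: runtime location */
    intptr_t n;        /* payload when is_loc == 0 */
    size_t gamma;      /* payload when is_loc == 1 */
} rt_value;

/* Address map: addr[gamma] is the C address of gamma; live[gamma] == 0
   means that gamma currently has no C address. */
typedef struct {
    uintptr_t addr[MODEL_MAX_LOCS];
    int live[MODEL_MAX_LOCS];
} addr_map;

/* Decide whether the runtime value v is represented by the C word w
   under the address map theta. */
int repr_c(const addr_map *theta, rt_value v, uintptr_t w) {
    if (!v.is_loc)
        return w == (((uintptr_t)v.n << 1) | 1u);   /* 2n + 1 encoding */
    if (v.gamma >= MODEL_MAX_LOCS || !theta->live[v.gamma])
        return 0;                                    /* no physical address */
    return w == theta->addr[v.gamma];
}
```

In this model, a GC move is an update to addr[gamma] and a deallocation clears live[gamma]; in both cases the abstract location gamma itself is unchanged, which mirrors the separation of identity from physical address.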
The interface of the OCaml runtime. To understand how these resources are used, let us take a look at the interface for the caml_alloc runtime primitive (which is a disjunct of Ψ FFI). The interface enables us to allocate a block of (non-negative) length $n$ with tag $t$. To do so, we have to provide the GC resource $\mathrm{GC}(\theta)$ with some address map $\theta$ initially. Since the allocation primitive caml_alloc calls the garbage collector internally, this map is potentially changed during the allocation, and we get back a new address map $\theta'$ after the call (and the GC resource). Moreover, for the newly allocated block, we get a fresh runtime location $\gamma$, which "stores" $n$ consecutive zeros as runtime values. To enable us to use the new block from C, the return value w is related by $\gamma \sim_C^{\theta'} w$ to the freshly allocated runtime location $\gamma$ in the postcondition. (We will see more runtime primitives and how to maintain references to ML values across GC calls in §4.2.) Using the interface Ψ FFI, we can verify glue code in Iris C. In particular, for buf_alloc, we prove a specification whose postcondition uses a predicate buf RT, which encodes the runtime representation of a buffer, as illustrated in Fig. 1 (albeit without tags). It contains ownership of (1) the "buffer record" in runtime representation (an immutable block with tag 0, i.e., a $\mapsto_{\mathrm{blk}[0|\mathrm{imm}]}$ points-to, with three fields, one of which is the capacity), (2) the reference for the used field (a mutable block of the form $\mapsto_{\mathrm{blk}[0|\mathrm{mut}]} [0]$), (3) the "custom block" (a $\mapsto_{\mathrm{cstm}}$ points-to) underneath data, which stores the address of the underlying C buffer, and (4) the underlying buffer itself (described by the predicate bf).
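Read as a specification, the prose description of caml_alloc above suggests roughly the following shape. This is a reconstruction from that description, not the definition from the paper. It writes $\theta$ and $\theta'$ for the address maps before and after the call, $\gamma$ for the fresh runtime location, $n$ for the length, and $t$ for the tag. The argument list $[n, t]$ and the mutability annotation fresh on the new block are assumptions.

```latex
\Psi_{\mathrm{alloc}} \;\triangleq\;
  \langle\, \mathrm{GC}(\theta) \ast 0 \le n \,\rangle\;
  \texttt{caml\_alloc}\ [n, t]\;
  \langle\, w.\ \exists\, \theta'\, \gamma.\;
     \mathrm{GC}(\theta') \ast
     \gamma \mapsto_{\mathrm{blk}[t|\mathrm{fresh}]} [\underbrace{0, \ldots, 0}_{n}] \ast
     \gamma \sim_C^{\theta'} w \,\rangle
```

The key point is that the postcondition relates the result to $\gamma$ under the new map $\theta'$ rather than the old map $\theta$, since the call may have run the GC.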

View Reconciliation
We have sketched how to verify a mixed OCaml-and-C program from the OCaml side using Iris ML and from the C side using Iris C (including glue code using the OCaml FFI via Ψ FFI). The last remaining piece of the puzzle is connecting the different parts, which brings us to the view reconciliation problem. Take buf_alloc again for example. We have discussed how to assume a specification about it in Iris ML (in §2.2) and how to prove a specification for it in Iris C (in §2.4). However, so far, the two specifications do not match up: the Iris C specification uses the resources for the runtime, and the Iris ML specification uses the resources for ML.
There is a fundamental challenge here. The two logics, Iris C (extended with Ψ FFI) and Iris ML, are about entirely different languages, and yet they express views on the same underlying data. More specifically, they are language-local logics following the single-language recipe (from §1), which means they intentionally do not model how values and memory of their language are represented in other languages. Each language-local logic models the language's own local account of the physical state, which is different on each side. Yet, whenever code on one side changes its underlying physical state, the changes become observable on the other side. For example, after buf_alloc allocates the block r for the used-reference (obtaining $\gamma \mapsto_{\mathrm{blk}[0|\mathrm{mut}]} [0]$), the block becomes observable in OCaml as a reference (in the form of $\ell \mapsto_{\mathrm{ML}} [0]$) when buf_alloc returns. Subsequently, whenever the OCaml or C side mutates it (e.g., in wrap_compress), the new value also becomes observable on the other side (e.g., in is_compressible).

Fig. 6. The view reconciliation rules (ml-to-ffi and ffi-to-ml).
The fact that there are two ways of describing the same piece of data means we have to make sure that they stay "in sync" (e.g., $\gamma \mapsto_{\mathrm{blk}[0|\mathrm{mut}]} [0]$ and $\ell \mapsto_{\mathrm{ML}} [1]$ would be inconsistent). Here, we can reap the benefits of working in separation logic. Using the notion of exclusive ownership, we can enforce that, at any given point, there is only a single view on any piece of data: through the lens of either OCaml ($\ell \mapsto_{\mathrm{ML}} [0]$) or the runtime ($\gamma \mapsto_{\mathrm{blk}[0|\mathrm{mut}]} [0]$), but not both. In the logic, we can transition between the two perspectives using the view reconciliation principles, depicted in Fig. 6. They allow us to use Iris ML resources when we are verifying C glue code in Iris C. Concretely, the rule ml-to-ffi allows us to turn $\ell \mapsto_{\mathrm{ML}} \vec{V}$ (whenever we own the GC resource $\mathrm{GC}(\theta)$) into the runtime block $\gamma \mapsto_{\mathrm{blk}[0|\mathrm{mut}]} \vec{v}$ using Iris's resource updates [Jung et al. 2018, §5.4]. The runtime block location $\gamma$ is tied to the ML location $\ell$ through a new assertion $\ell \sim_{\mathrm{ML}} \gamma$, which makes sure that $\gamma$ always (uniquely) corresponds to $\ell$. To relate the values of the runtime block $\vec{v}$ and the reference $\vec{V}$, the relation "$\sim_{\mathrm{ML}}$" is lifted to values as $\vec{V} \sim_{\mathrm{ML}} \vec{v}$ (e.g., by relating false $\sim_{\mathrm{ML}}$ 0). The rule ffi-to-ml reverses ml-to-ffi: we can take ownership of a runtime block and turn it (back) into ownership of a ML reference. The assertion $\ell \sim_{\mathrm{ML}} \gamma$ appears after the update, because we can also use this rule to expose blocks that were freshly created in C (e.g., in buf_alloc) to OCaml.
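The "one view at a time" discipline can be mimicked at run time with a small state machine. The sketch below is only an analogy: in the logic, exclusivity is enforced statically by ownership, whereas here it is checked dynamically. The names shared_cell, cell_view, model_ml_to_ffi, model_ffi_to_ml, and model_write are hypothetical.

```c
#include <assert.h>

/* Which of the two views currently holds the (exclusive) ownership. */
typedef enum { VIEW_ML, VIEW_RUNTIME } cell_view;

typedef struct {
    cell_view held;    /* the single view that is currently valid */
    long contents;     /* the one underlying piece of data */
} shared_cell;

/* Analogue of ml-to-ffi: trade the ML view for the runtime view.
   Returns 0 and changes nothing if the ML view is not held. */
int model_ml_to_ffi(shared_cell *c) {
    if (c->held != VIEW_ML) return 0;
    c->held = VIEW_RUNTIME;
    return 1;
}

/* Analogue of ffi-to-ml: trade the runtime view back for the ML view. */
int model_ffi_to_ml(shared_cell *c) {
    if (c->held != VIEW_RUNTIME) return 0;
    c->held = VIEW_ML;
    return 1;
}

/* A write succeeds only through the view that is currently held. */
int model_write(shared_cell *c, cell_view via, long v) {
    if (c->held != via) return 0;
    c->contents = v;
    return 1;
}
```

Since there is a single contents field, a write made through one view is automatically what the other view sees after the next conversion, so the two descriptions cannot drift out of sync.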
With the view reconciliation rules in hand, we can finally tie the knot. We will discuss the formal details of connecting Iris C and Iris ML in §4. For now, let us use buf_alloc to illustrate the key steps for our running example. Concretely, we need to match up the two buffer representations in the postconditions of buf_alloc: buffer ML (V, n, $\vec{m}$) on the ML side and buf RT on the runtime side. To do so, we prove a view reconciliation rule that we can use to turn buf RT into buffer ML. The proof of this rule consists of three parts. First, we turn the mutable runtime block for the used field (a $\mapsto_{\mathrm{blk}[0|\mathrm{mut}]}$ points-to) inside buf RT into the mutable used-reference (an $\ell \mapsto_{\mathrm{ML}}$ points-to) inside buffer ML (using ffi-to-ml). Second, we turn the custom block (a $\mapsto_{\mathrm{cstm}}$ points-to) together with the underlying buffer bf inside buf RT into raw_bytes(V′, n, $\vec{m}$) inside buffer ML. This is the place where we define raw_bytes, which was treated axiomatically while verifying is_compressible. Concretely, we define raw_bytes such that V′ is a foreign value that corresponds to the custom runtime block, which stores the address of the underlying bytes. (We obtain the foreign identifier using additional rules given in the supplementary material [Guéneau et al. 2023].) Third, we prove that the immutable runtime block for the record (a $\mapsto_{\mathrm{blk}[0|\mathrm{imm}]}$ points-to with three fields) represents the ML buffer V, whose components include the capacity n and the used-reference $\ell$, meaning that V is related to this block by $\sim_{\mathrm{ML}}$. With the above view reconciliation rule and its reverse (i.e., turning buffer ML into buf RT), we can verify all the glue code of the buffer library (see [Guéneau et al. 2023]). We can then finally connect all the puzzle pieces (as described in §4) to conclude that is_compressible is correct, which in particular means that (1) the glue code correctly interacts with the FFI and maintains all the language invariants of ML, and (2) the client is_compressible uses the function snappy_compress correctly without triggering any unsafe behavior in C (e.g., no out-of-bounds accesses).

OPERATIONAL SEMANTICS
Zooming out, let us return to the multi-language recipe from §1. In this section, we focus on the operational semantics side of the recipe. As a starting point, we take the canonical small-step operational semantics for the languages C and ML (defined in the supplementary material [Guéneau et al. 2023]). We write $(e, \sigma) \to_{\mathrm{ML}} (e', \sigma')$ for a step in ML and use an analogous judgment $\to_{\mathrm{C}}$ for a step in C (which additionally mentions the surrounding C-program). Since these semantics operate on different values and have different memory models, they cannot be simply "plugged together." Thus, the big question is: how can we connect the two semantics to a multi-language semantics "$\twoheadrightarrow_{\mathrm{ML+C}}$"?
The language ML+C and the semantics $\twoheadrightarrow_{\mathrm{ML+C}}$. We define the semantics $\twoheadrightarrow_{\mathrm{ML+C}}$ as a composition of smaller building blocks. Each building block is a language with an associated notion of expressions (Expr), values (Val), state (State), functions f ∈ Func, programs (finite partial maps $\mathit{FnName} \rightharpoonup_{\mathrm{fin}} \mathit{Func}$), and a small-step operational semantics. Two languages interact through external calls of the form "call fn" applied to a list of argument values. Our basic building blocks are the semantics of C and ML. Inspired by the approach of Sammler et al. [2023] (i.e., defining multi-language semantics using modular combinators), we combine these building blocks using combinators.
In our case, to connect two languages through their external calls, we introduce a language-generic linking combinator "⊕" with its associated linked language (and semantics). The linking combinator composes two languages with the same values and memory model: an outgoing call "call fn" with some arguments on one side results in the execution of fn on those arguments on the other side.
Of course, ML and C differ in their values and memory model (see Fig. 3). Thus, we cannot directly link a ML-expression e and a C-program $p$. To bridge the gap between them, we introduce a wrapping combinator "[•] FFI" that embeds the language ML into its own language [ML] FFI. Putting everything together, we obtain the combined language ML+C ≜ [ML] FFI ⊕ C with its operational semantics $\twoheadrightarrow_{\mathrm{ML+C}} \;\triangleq\; \twoheadrightarrow_{[\mathrm{ML}]_{\mathrm{FFI}} \oplus \mathrm{C}}$ and its programs [e] FFI ⊕ $p$, where e is a ML-expression and $p$ a C-program. In the semantics $\twoheadrightarrow_{\mathrm{ML+C}}$, external calls of e are made in terms of ML's values and memory model. They get translated by the wrapper to external calls of C, which are then resolved to the actual function implementations within $p$ through linking. In the other direction, $p$ can make external calls to FFI functions, which will be resolved by the wrapper in its operational semantics.
Modeling garbage collection. In the semantics of ML+C, we need to model the observable actions of the garbage collector (GC): when viewed from the C side, the OCaml GC can move objects in memory, as well as free memory that was previously storing OCaml values. To account for this behavior, our model of garbage collection is built around the following three key principles. First, we decouple the identity of runtime values from their physical address in memory. In the semantics, we distinguish between blocks in the "runtime heap" and their concrete physical addresses. This allows us to disentangle the concept of a "runtime value" (stable across runs of the GC) from its physical representation (which can change across GC runs). This principle was used previously by Hur and Dreyer [2011], albeit in proving compiler correctness in the presence of a GC, not in defining a multi-language semantics. Second, garbage collection is modeled as a non-deterministic choice subject to reachability constraints: the GC is modeled as non-deterministically changing or invalidating the physical addresses of blocks. This allows us to abstract over the concrete implementation of the OCaml GC. Non-determinism was also used by Moine et al. [2023], Wang et al. [2019], and Hur and Dreyer [2011] in their respective models of a GC. Third, we track the registered FFI roots in the operational semantics. These roots are needed to make C accesses to runtime-managed memory sound: the OCaml FFI requires that C pointers to OCaml values are registered as "roots" with the GC (using FFI primitives). This registration ensures that the GC is aware of the values being used in C and, hence, does not erroneously deallocate them. Moreover, it allows the GC to update the C pointers when it decides to move the value they store in memory. In the semantics, we keep track of which pointers have been registered as roots and update them when appropriate.
Bridging language barriers with angelic non-determinism. The wrapper [•] FFI bridges the gap between ML and C at the level of the operational semantics. Doing so is not straightforward for a number of reasons (see also §3.1). In particular, when going from C to ML (e.g., when C invokes a ML-callback or returns to ML), the wrapper needs to translate values from their low-level runtime representation to their high-level ML representation. The issue is that this translation is, unfortunately, not unique! Different ML-values V can have the same runtime representation (e.g., integers and booleans are both runtime integers; pairs and arrays are both runtime blocks, etc.). When we go from C to ML in the wrapper, we have to choose "the right" high-level representation. Which one is the right one is only known to the programmer who wrote the glue code. To make "the right" choice, we follow the lead of Song et al. [2023] and Sammler et al. [2023]: we use angelic non-determinism (together with the usual demonic non-determinism). Instead of forcing the wrapper to make the right choice (as would be the case for demonic non-determinism), angelic non-determinism rules out all the wrong choices. Concretely, we use multirelations [Martin et al. 2007; Rewitzky 2003] as steps in the operational semantics: a step relates a configuration of the operational semantics (i.e., a pair of an expression and a state) to a set of configurations from which to continue the execution after the step. We interpret the choice of the set as angelic non-determinism, and the choice of the configuration within that set to resume from as demonic non-determinism.
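The "exists a set, for all of its elements" reading of such a step can be made concrete with a toy evaluator. This is a sketch under simplifying assumptions: configurations are plain integers, one step offers finitely many candidate sets, and the names conf_set, angelic_demonic_ok, and is_even are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A toy "configuration" is just an int; a step offers several candidate
   sets of successor configurations. */
typedef struct {
    const int *confs;   /* successor configurations in this set */
    size_t len;
} conf_set;

/* One multirelation-style step: the angel picks one of `sets`, then the
   demon picks any configuration inside the chosen set. The step is
   acceptable if SOME set exists in which EVERY configuration is ok. */
int angelic_demonic_ok(const conf_set *sets, size_t nsets, int (*ok)(int)) {
    for (size_t i = 0; i < nsets; i++) {           /* angelic: exists a set */
        int all_ok = 1;
        for (size_t j = 0; j < sets[i].len; j++) { /* demonic: for all elements */
            if (!ok(sets[i].confs[j])) { all_ok = 0; break; }
        }
        if (all_ok)
            return 1;
    }
    return 0;
}

/* Example predicate on toy configurations. */
int is_even(int c) { return c % 2 == 0; }
```

With candidate sets {1, 2} and {4, 6} and the predicate is_even, the step is acceptable because the angel can pick {4, 6}. With only {1, 2} available it is not, since the demon may resume from 1.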
The operational semantics $\to_{\mathrm{C}}$ and $\to_{\mathrm{ML}}$ use, as usual, only demonic non-determinism. Their semantics can be lifted in a generic way to multirelations $\twoheadrightarrow_{\mathrm{C}}$ and $\twoheadrightarrow_{\mathrm{ML}}$ (note the double arrow!). We define the semantics of the wrapper "[•] FFI" and the linking combinator "⊕" in terms of multirelations. (See the Coq development [Guéneau et al. 2023] for the full definition of the combinators.)

View Reconciliation in the Wrapper Semantics
We focus on the most interesting part of the semantics: the wrapper semantics and how it deals with the view reconciliation problem (see §2.5) at the level of the operational semantics. To explain view reconciliation in the wrapper, we need to understand, at a high level, how the wrapper is set up. The purpose of the wrapper is to produce a program [e] FFI that (1) faithfully executes the code of e in the semantics of ML and (2) provides a model of the runtime primitives such that they can be called from C. The program [e] FFI does not have to be a syntactic C-program. Instead, since the linking combinator "⊕" connects languages through external calls, it only needs to provide "functions" that can be called using C-values in the C-memory model. We use this freedom in the semantics of [ML] FFI by implementing the "functions" of [e] FFI (i.e., the primitives of the FFI and a main function that triggers the execution of e) as operations on an internal notion of wrapper state.
The wrapper state. The wrapper state (in Fig. 7) has two execution modes: ML and C. While it is executing the wrapped expression e, its state is in mode ML and consists of the current ML-state $\sigma$ together with a remaining part (which includes the roots map rm) that belongs to the runtime state (discussed shortly). The wrapper transitions to the C execution mode whenever we are at the boundary between C and ML (e.g., we have called an external C-function, or we are executing an FFI primitive called from C). In the C execution mode, the state of the wrapper consists of the current C-state together with a remaining part (which includes the root set rs) that is again part of the runtime state. In this mode, the ML-heap has been dissolved completely into the runtime state. Throughout the execution, the wrapper state thus switches between the ML and C views of the state. These two views represent the same data, e.g., a reference on one side and a block on the other. When switching sides, it is the task of the wrapper to propagate changes made on one side into corresponding changes on the other side.
The runtime state (i.e., $\zeta$, $\chi$, $\theta$, the residual C-memory, rm, and rs) is how the wrapper reconciles the views of C and ML. It is an abstract model of the runtime heap of blocks (alluded to in §2.4) and the GC roots, together with additional state to relate them to their C and ML representation. The map $\zeta$ stores the current heap of runtime blocks. It is the central piece of state of the runtime: many primitives only operate at the level of the block heap (e.g., allocating new blocks, accessing or modifying existing ones). In particular, it is stable under the action of the GC, as it describes an abstract view of blocks, independently from their physical representation. A block can be either a standard block $\mathrm{Vals}(t, m, \vec{v})$ (with tag $t$, mutability $m \in \{\mathrm{Mut}, \mathrm{Immut}\}$, storing values $\vec{v}$), a "custom" block Custom(w) (storing a C value w), or a "closure" block (Closure) representing a ML function.
The map $\theta$ maps runtime locations to their C-addresses (see also §2.4) if they are materialized in the C-memory (i.e., they have not been deallocated by the GC). The map $\chi$ maps runtime locations to their ML counterpart. A runtime location $\gamma$ is mapped (1) to a ML-location $\ell$ if $\gamma$ is a standard block backing an array, (2) to a "foreign value identifier" if $\gamma$ is a custom block, or (3) to a special token (•) if $\gamma$ is freshly allocated (from C) or a block that backs a pure ML value (e.g., pairs or closures). A runtime location $\gamma$ can be in all three maps at the same time: it can be in $\chi$ to relate it to a corresponding ML-reference $\ell$, in $\zeta$ to track the runtime values $\vec{v}$ that are stored in $\gamma$, and in $\theta$ to assign it an address in the C-heap. The map rm and the set rs are used to keep track of GC "roots", and the residual C-memory is explained below.
The wrapper state has been carefully designed to move between the two views of the state with "just the right" amount of non-determinism: the wrapper needs to expose enough information about the runtime, but must not overspecify its behavior. For instance, we expose and track the mutability of standard blocks, encoding the runtime's expectation that one must not modify a block that is supposed to be immutable. (While the concrete implementation of the runtime will give some behavior to this operation, performing an update to an "immutable" block violates an implicit OCaml assumption that the compiler can depend on for optimizations.) But we are also careful to leave some flexibility in how blocks are used to allow compiler optimizations (e.g., representationally identical, immutable values such as the ML pairs ⟨42,true⟩ and ⟨42,1⟩ can reuse the same runtime blocks).
The garbage collector and registered roots. In ML+C, we do not fix a specific garbage collector implementation. Instead, the semantics of ML+C should be sound with respect to all reasonable garbage collectors. Thus, as outlined above, we model the GC using non-determinism. That is, the address map $\theta$ is only part of the state while we are at the boundary to C (i.e., in the mode C). It is picked anew whenever we move from an execution in ML to an external function in C (and for some runtime primitives such as caml_alloc). This choice over $\theta$ is made demonically (if we verify a program, we have to reason about all possible choices for $\theta$). To avoid "over-eager" deallocation, the choice is subject to two constraints that ensure that reachable locations remain alive. The first constraint (GC1) requires the address map to be transitively closed: for any runtime location $\gamma$ in $\theta$, any reachable location $\gamma'$ must also be part of the map (where locs(·) is the set of runtime locations contained in its argument). The second constraint (GC2) ensures that $\theta$ contains at least all the "registered roots". Roots, in general, are how the OCaml GC keeps track of which runtime blocks to keep around.
In ML+C, it suffices to track those roots that have been explicitly registered with the GC through an FFI primitive (explained below). How the other GC roots (which we do not model in ML+C) are determined is left as an implementation choice to the GC so long as it does not deallocate reachable blocks (GC1). In ML+C, registered roots are C-addresses that store (the C-addresses of) runtime blocks $\gamma$, given by $\theta(\gamma)$. They are tracked in the roots map rm (and in the set rs when in C). Their intended behavior is that (1) the block is not deallocated by the GC and (2) whenever the runtime moves the block in memory, it will also update its address stored in the root. The first part is addressed by (GC2). The second part happens whenever we move from ML to C: the ML wrapper state contains a C-memory where all the registered roots have been removed, the "residual" C-memory. When we return to C, the wrapper uses the freshly picked address map $\theta$ to "add in" the C-representation of the roots.
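The two constraints (GC1) and (GC2) can be phrased as simple checks over a finite model. The sketch below is an illustration under several assumptions: locations are small integers, each block has at most two fields, the address map is reduced to its domain (a liveness flag per location), and roots are given directly as the locations they refer to rather than as C-addresses. The names block_heap, addr_map_dom, gc1_closed, and gc2_roots_kept are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { N_LOCS = 4, MAX_FIELDS = 2, NO_LOC = -1 };

/* Block heap: fields[g][i] is the runtime location stored in field i of
   block g, or NO_LOC if that field holds an integer. */
typedef struct { int fields[N_LOCS][MAX_FIELDS]; } block_heap;

/* live[g] != 0 means that the address map assigns g a C address. */
typedef struct { int live[N_LOCS]; } addr_map_dom;

/* (GC1): every location stored in a mapped block is itself mapped.
   Closure under one step implies closure under reachability. */
int gc1_closed(const block_heap *zeta, const addr_map_dom *theta) {
    for (int g = 0; g < N_LOCS; g++) {
        if (!theta->live[g]) continue;
        for (int i = 0; i < MAX_FIELDS; i++) {
            int succ = zeta->fields[g][i];
            if (succ != NO_LOC && !theta->live[succ]) return 0;
        }
    }
    return 1;
}

/* (GC2): every registered root refers to a mapped location. */
int gc2_roots_kept(const int *roots, size_t nroots, const addr_map_dom *theta) {
    for (size_t r = 0; r < nroots; r++)
        if (!theta->live[roots[r]]) return 0;
    return 1;
}
```

A demonically chosen address map is admissible in this model exactly when both checks return 1. Any map passing them is allowed, which is how the model stays independent of a concrete collector.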
Angelic non-determinism. As mentioned above, the angelic non-determinism comes into play when we move from C to ML. Concretely, when we transition from the C execution mode to the ML execution mode, we need to choose ML-values V for runtime values $v$ in their block representation. However, the block representation is not unique, since, for example, true and 1 have the same runtime representation 1. To pick "the right" value, the semantics uses angelic non-determinism.
There is a second use of angelic non-determinism in the semantics: in transitioning from C to ML, the wrapper angelically chooses a subset of the heap, possibly empty, that it temporarily disables (i.e., it marks the contents as disabled). This use of angelic non-determinism simplifies reasoning about the semantics: locations that are not accessed on the ML-side can be disabled, which avoids (unnecessarily) committing to a concrete ML-value when transitioning from C to ML. Instead, by staying uncommitted, the choice of the value is deferred to a later point (e.g., to the next external call from ML). In contrast, locations that are subsequently accessed on the ML-side have to remain enabled in this choice. They cannot be disabled, because an access to a disabled location would be undefined behavior.
The runtime primitives. The runtime provides the following primitives, a large subset of the OCaml FFI [oca 2023b], operating on the runtime state. The primitive alloc (for caml_alloc) allocates a new runtime block by extending $\zeta$ (with the block) and $\chi$ (with •), "calls the GC" on $\theta$, and then extends the new map with the C address. Similarly, alloc_custom (for caml_alloc_custom) extends these maps with a new custom block. The primitives Field (for Field) and Store_field (for Store_field) access and update a field of a block in $\zeta$. The primitives read_custom (for reading from Custom_contents) and write_custom (for writing to Custom_contents) access and update

PROGRAM LOGIC
As we have seen in §2, reasoning about programs is done at the level of the language-local logics Iris C and Iris ML in Melocoton. However, eventually, we have to ensure that this reasoning is sound (i.e., that assumptions of one side are properly connected to proofs in the other). For this purpose, we introduce the "umbrella logic" Iris ML+C, which embeds Iris C and Iris ML. We discuss how Iris ML+C composes proofs (in §4.1), how it justifies the interface Ψ FFI (in §4.2), and how it is modelled (in §4.3).

Composing Proofs in Iris ML+C
In Iris ML+C, we primarily reason at the level of interfaces Ψ, Π ∈ Intf(Val). We write $\Psi \models p : \Pi$ to mean that the program $p$ implements interface Π under the "assumption" of interface Ψ. We can prove (also in Iris C and Iris ML) that $p$ implements an interface by showing that its functions satisfy the specification given by Π against the interface Ψ (IntfImplement). As one would expect, there is also an analogue to the Hoare rule of consequence (IntfConseq). Using interfaces, we can succinctly state how proofs in the language-local logics (e.g., from §2) can be connected through Iris ML+C.
Theorem 4.1 (Connecting Iris C and Iris ML). Let e be a ML-expression, $p$ a C-program (where dom $p$ does not contain FFI primitives), Π a ML-interface, and let a pure predicate on integers be given. If e satisfies a language-local triple in Iris ML against Π whose postcondition is that predicate, and $p$ implements $[\Pi]_{\mathrm{FFI}}$ under the assumption $\Psi^{\Pi}_{\mathrm{FFI}}$, then the combined program [e] FFI ⊕ $p$ implements the interface main for that predicate.
Let us break this theorem down. It allows linking a ML-expression e with a C-program $p$. For e, we need to prove a language-local triple (in Iris ML) with a pure postcondition on integers (e.g., True for a proof of safety). We can do so against an arbitrary ML-interface Π. For $p$, we need to prove that its functions implement the interface Π as we have claimed they do. Since Π is an interface over ML-values, but we want to use it in Iris C, we convert it into an interface on C-values using the interface wrapper [•] FFI (discussed below). To use runtime primitives, $p$ is proven against the runtime interface $\Psi^{\Pi}_{\mathrm{FFI}}$. Here, Ψ FFI gets the ML-interface Π as an additional argument (which we omit in §2 for simplicity) to be able to execute ML-callbacks from C (see §4.2). Under these conditions, the theorem allows us to deduce that the combined program [e] FFI ⊕ $p$ implements the interface main, which specifies the primitive main (called with no arguments) with precondition atInit and a postcondition stating that the returned C-value w encodes an integer $n$ (i.e., $w = \hat{n}$) satisfying the pure predicate. Here, atInit is a resource that signals program start and main is a "primitive" of the wrapper that will trigger the execution of e.
The proof of Theorem 4.1 is straightforward using the key properties of Iris ML+C (in Fig. 8). We can embed proofs into Iris ML+C from Iris C (EmbedC) and Iris ML (EmbedML). To embed an Iris ML proof, it suffices to prove a Hoare triple about a ML-expression e against the interface Π. We then obtain that the wrapped program [e] FFI implements the runtime interface $\Psi^{\Pi}_{\mathrm{FFI}}$ and the main interface, assuming the wrapped interface [Π] FFI. After embedding proofs into Iris ML+C, we can then "link" them together (Link): if one side assumes Π and implements Ψ and the other assumes Ψ and implements Π, then they cancel out (i.e., the remaining assumption is ∅) and the linked program implements Ψ ⊔ Π. In the case of Theorem 4.1, the wrapped program [e] FFI assumes [Π] FFI and implements $\Psi^{\Pi}_{\mathrm{FFI}}$, whereas $p$ assumes $\Psi^{\Pi}_{\mathrm{FFI}}$ and implements [Π] FFI. Thus, they cancel out. What allows us to use the ML-interface Π as a C-interface in this proof is the wrapper [Π] FFI. While its formal definition is quite a mouthful, its high-level intuition is comparatively simple: the wrapper [Π] FFI relates the C-values to ML-values through the runtime representation. Concretely, it (1) gives us access to the GC resource $\mathrm{GC}(\theta)$ when entering the function on the C-side (and demands it back with an updated map $\theta'$ when exiting the function), (2) connects the C-arguments $\vec{w}$ to ML-arguments $\vec{V}$ through their runtime representation $\vec{v}$, and (3) connects the ML-return value V′ through its runtime representation $v'$ to the C-return value w′.

The Runtime Interface
One thing that we have not discussed yet is who is on the other side of the runtime primitives in the interface Ψ FFI. Operationally, these primitives are given semantics by $\twoheadrightarrow_{\mathrm{ML+C}}$. At the program logic level, we prove their specifications in Iris ML+C (against the underlying semantics $\twoheadrightarrow_{\mathrm{ML+C}}$). A key selection of these primitives is depicted in Fig. 9. The rule AllocCustom allows allocating a custom block; it is analogous to the interface Ψ alloc (from §2.4). The rule RegRoot allows registering a C-address as a root with the runtime. If we do so, we obtain a new resource of the form $\mapsto_{\mathrm{root}}$, a root points-to, which asserts that the C-address is registered as a root for a runtime block. The resource is stable across calls to the GC (i.e., it does not depend on $\theta$) and thus we can use it to access that block even after triggering the GC (e.g., through alloc).
Proc. ACM Program. Lang., Vol. 7, No. OOPSLA2, Article 247. Publication date: October 2023.
Unregistering the root (UnregRoot) gives back ownership of the underlying C points-to. While the root is registered, one is prevented from freeing the underlying memory. (Note that the program logic permits roots to remain registered indefinitely, and thus does not guarantee the absence of memory leaks.) The rule ExecCallback allows executing a ML callback (a recursive function with body e′) in the runtime. It uses a special points-to assertion "$\mapsto_{\mathrm{clos}}$" for closures. The rule says that if we call the closure (with the C-representation w of the argument V) under its precondition, then we get (the C-representation w′ of) the value V′ that results from executing it, together with the postcondition on V′. The "▷" modality in this rule is Iris's modality for step-indexing; we will discuss the effects of step-indexing on Iris ML+C shortly (in §4.3).
View reconciliation. One of the main challenges, also at the level of the program logic, is view reconciliation. That is, we have to soundly combine the resources $\ell \mapsto_{\mathrm{ML}} \vec{V}$ and $\gamma \mapsto_{\mathrm{blk}[t|m]} \vec{v}$, which can in fact overlap (see §2.5), as they can describe different views on the same piece of data. For example, if we mutate a runtime block in C through the FFI, then this change will be observable in ML after an external call.
As far as the user of Iris ML+C is concerned, these two views are mediated by the view reconciliation rules (Fig. 6). While these rules are in some sense very natural, justifying their soundness requires care. The crux is that there is an inherent disconnect between the physical state of the operational semantics (i.e., the runtime heap and the ML-heap) and the logical state in terms of points-to assertions (i.e., ℓ ↦ → ML ì V and ↦ → blk[ | ] ì ). In the operational semantics, there is only one view of the ML-heap at any given time: either as a runtime heap or as an ML-heap. (This is essential to reuse the language-local semantics, which are only defined in terms of the language-specific heap.) The logic, however, allows more fine-grained views, where ℓ ′ ↦ → ML ì V and ′ ↦ → blk [ | ] ì can exist at the same time for different locations ℓ ′ and ′ (both during the execution in ML and in C). The coexistence of runtime points-to assertions and ML points-to assertions is not only convenient (e.g., it enables the view reconciliation rules in Fig. 6), it is essential for core reasoning principles of separation logic such as framing (e.g., the language-local logic Iris ML can frame its resources ℓ ↦ → ML ì V around external calls).
Our solution to the view reconciliation challenge is to adjust the connection between the physical representation and the logical representation when we cross language boundaries. Concretely, whenever we are in ML, the resource ℓ ↦ → ML ì V is connected directly to the physical memory (i.e., (ℓ) = V ). When we transition to C, we connect it instead to a runtime block in the runtime memory using ghost state. When ℓ ↦ → ML ì V is connected to such a block, we can obtain the runtime resource ↦ → blk[ | ] ì . To ensure that there are never two overlapping views exposed, we maintain as an invariant (not in the Iris sense) that the runtime blocks backing ℓ ↦ → ML ì V and those described by ↦ → blk[ | ] ì are disjoint.
The physical view that the operational semantics has at this point on the ML-state is the runtime heap phys (the map in execution mode C in Fig. 7). Logically, we split this heap into two disjoint parts, virt and ml: the part virt that backs up the resource ↦ → blk[ | ] ì (through •( virt ) ) and the part ml that backs up the resource ℓ ↦ → ML ì V . We cannot directly back ℓ ↦ → ML ì V with a runtime heap, since "↦ → ML " is a resource from Iris ML which, as usual, is backed by an ML-heap (through SI ML (•), the "state interpretation" in the terminology of Iris). However, what we can do is maintain a separate, virtual ML-heap virt (not present in the underlying physical memory) which (1) backs the resource ↦ → ML in the form of SI ML ( virt ) and (2) is faithfully represented by the runtime heap (denoted repr( virt , ml )). The way the view reconciliation rules (see Fig. 6) are proven sound is by moving locations between the virtual ML-heap virt and the runtime part virt as needed. Concretely, ml-to-ffi removes ℓ from the virtual ML-heap and adds a corresponding runtime identifier (previously stored in ml ) into the runtime part virt . ffi-to-ml does the opposite. When we start verifying a glue code function, the original ML-heap is turned into the virtual ML-heap, and then we can gradually convert to and from the runtime representation as needed.
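To make the bookkeeping behind ml-to-ffi and ffi-to-ml more concrete, the following is a minimal executable sketch in OCaml. It is only an illustration under simplifying assumptions and not the model used in the mechanization: values are plain integers, the ML and runtime representations coincide, and all names (state, chi, sigma_virt, zeta_ml, zeta_virt, alloc_ml, views_disjoint) are hypothetical.

```ocaml
(* Hypothetical toy model of view reconciliation; all names are assumptions. *)
type loc = int        (* ML location *)
type block_id = int   (* runtime block identifier *)

type state = {
  chi : (loc, block_id) Hashtbl.t;              (* pairing of locations and blocks *)
  sigma_virt : (loc, int list) Hashtbl.t;       (* virtual ML-heap: backs ML points-tos *)
  zeta_ml : (block_id, int list) Hashtbl.t;     (* runtime blocks currently viewed from ML *)
  zeta_virt : (block_id, int list) Hashtbl.t;   (* runtime blocks exposed as block points-tos *)
}

let create () = {
  chi = Hashtbl.create 8;
  sigma_virt = Hashtbl.create 8;
  zeta_ml = Hashtbl.create 8;
  zeta_virt = Hashtbl.create 8;
}

(* Allocate an ML array at location [l], represented by runtime block [g]. *)
let alloc_ml st (l : loc) (g : block_id) (vs : int list) =
  Hashtbl.replace st.chi l g;
  Hashtbl.replace st.sigma_virt l vs;
  Hashtbl.replace st.zeta_ml g vs

(* ml-to-ffi: give up the ML view of [l] and expose its block to the runtime view.
   Raises [Not_found] if [l] is not currently viewed from ML. *)
let ml_to_ffi st (l : loc) : block_id =
  let g = Hashtbl.find st.chi l in
  let blk = Hashtbl.find st.zeta_ml g in
  Hashtbl.remove st.sigma_virt l;
  Hashtbl.remove st.zeta_ml g;
  Hashtbl.replace st.zeta_virt g blk;
  g

(* ffi-to-ml: the opposite direction; the ML view picks up the current contents. *)
let ffi_to_ml st (l : loc) : unit =
  let g = Hashtbl.find st.chi l in
  let blk = Hashtbl.find st.zeta_virt g in
  Hashtbl.remove st.zeta_virt g;
  Hashtbl.replace st.zeta_ml g blk;
  Hashtbl.replace st.sigma_virt l blk

(* The invariant from the text: no block is exposed through both views at once. *)
let views_disjoint st : bool =
  Hashtbl.fold (fun g _ ok -> ok && not (Hashtbl.mem st.zeta_virt g)) st.zeta_ml true
```

In this sketch, a block that is mutated while it is in the runtime view shows its new contents in the ML view after ffi_to_ml, which mirrors the observation that changes made in C become observable in ML.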

The Model of Iris ML+C
At first glance, this definition may seem odd, because it may seem like we are proving the existence of an execution. However, this thought is misleading. In a semantics with demonic and angelic non-determinism, if we want to prove something for all possible executions of a program, we have to resolve angelic choices and accept demonic choices. In our multi-relations ; −↠ , the outer choice over the set of states is angelic and, thus, when we reason about all executions, we need to resolve this choice (hence "∃ ′ "). The inner choice over the next state ∈ is demonic and, hence, when we reason about all executions, we need to accept this choice (hence "∀ ′ ∈ ′ ").
From the language-generic adequacy statement, we additionally derive standard language-specific adequacy statements for Iris C and Iris ML, provided the program under consideration does not contain any external function calls (i.e., it can be verified against the empty interface ∅). (We do so using a generic lifting from standard relations ; → ′ to multi-relations ; −↠ .) For example, for Iris ML, we derive the following adequacy statement (and analogously for Iris C):
Corollary 4.3 (Adequacy of Iris ML). Let e be a ML expression and a pure predicate on integers.
Here, the predicate safe(e, ) guarantees that e is never stuck and that, if e terminates, the resulting value is an integer satisfying .
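The interplay of angelic and demonic choice can be illustrated with a small executable sketch in OCaml. It is an approximation under stated assumptions rather than the definition used in the paper: the multi-relation is finite, the coinductive reading is approximated by a fuel bound (running out of fuel counts as success, so divergence is accepted), and the names multi_rel, exec and example_step are hypothetical.

```ocaml
(* A finite multi-relation: each state offers a list of candidate successor sets.
   Choosing one of the sets is angelic; the element picked from it is demonic. *)
type 's multi_rel = 's -> 's list list

(* [exec fuel step target s] checks, up to [fuel] steps, that every execution
   from [s] either reaches [target] or is still running. Exhausting the fuel
   counts as success, approximating a coinductive definition that allows
   divergence. A state outside [target] with no candidate set is stuck and fails. *)
let rec exec (fuel : int) (step : 's multi_rel) (target : 's -> bool) (s : 's) : bool =
  fuel = 0
  || target s
  || List.exists
       (fun next_set -> List.for_all (exec (fuel - 1) step target) next_set)
       (step s)

(* Hypothetical example: from state 0, one candidate set contains the stuck
   state 2, while the other leads to the target state 4. *)
let example_step : int multi_rel = function
  | 0 -> [ [ 1; 2 ]; [ 3 ] ]
  | 1 | 3 -> [ [ 4 ] ]
  | 5 -> [ [ 5 ] ]   (* a diverging loop *)
  | _ -> []          (* no candidate set: the state is stuck *)
```

Here exec 10 example_step (fun s -> s = 4) 0 holds because the angelic choice can be resolved to the set containing only 3. If only the set containing 1 and 2 were available, the check would fail, since the demonic choice may pick the stuck state 2.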

CASE STUDIES
We have applied Melocoton to several interesting case studies. In §5.1 we start with three examples that illustrate how Melocoton can be used for program verification. Afterwards, in §5.2, we show how Melocoton can, additionally, be used to prove type safety of external functions implemented in C.

Program verification
Buffer library & compression. In §2, we have primarily focused on the function buf_alloc as a running example. Most of the other external functions (in Fig. 2) are straightforward to verify. The most interesting one is buf_upd, because it uses callbacks. We prove a specification for it in which a predicate describes the resources used by the callback depending on the current index, and a mathematical function describes the integers that the callback computes. The specification asserts that buf_upd modifies the contents of the buffer between the two given indices according to this function. Using this specification, we verify is_compressible by picking the predicate to be the points-to ℓ ↦ → ML ì for the input ML-array of characters, and the function to return the element of the array at the given index. Unlike well-typed functions in OCaml, the function buf_upd is not safe in general: if it is not used according to its specification, then it can exhibit undefined behavior (e.g., out-of-bounds accesses and iterator invalidation). Thus, OCaml code that uses the buffer library must carefully respect its specification. The function is_compressible, as we prove by verifying it, satisfies these requirements.
Polymorphic equality. As another example, we have verified a polymorphic equality function that (deeply) compares the runtime representation of two OCaml values. What makes this function interesting is that it is not possible to implement it natively in OCaml (or in our case ML ), because there is no way to determine the shape of values (e.g., whether they are sums or pairs). For example, a ML function designed to compare pairs would get stuck when we attempt to pass it a sum value. (OCaml exposes an Obj module providing some escape hatches that expose the runtime representation of values, but it is undocumented and breaks type safety, so we do not consider it here.) As such, this example demonstrates that the FFI can be used to add new functionality to ML .
Zig-Zag Lists. As another example, we have verified a "zig-zag list": a linked list implementation whose cons-operation allocates cells on the C heap to store the head and tail of the list (OCaml values), and then exposes this "cons-cell" to OCaml by embedding it into a custom block. To make this work, the fields of the cons-cell are registered with the GC as global roots, ensuring that they are not accidentally deallocated by the GC. We provide C implementations of external functions to work with such lists in OCaml (e.g., head, cons). We use the term "zig-zag list" because, when traversing such a list, one follows pointer chains that alternate between the OCaml and C heaps. This example demonstrates that Melocoton supports data structures spanning both heaps, and that the values of each language can be stored in the other's heap.

Type Safe Interfaces
Besides program verification, we can additionally use Melocoton to prove type safety of external functions (and their clients). To this end, we equip ML with a logical relation. The logical relation (and its associated type system) are standard constructions from the literature [Timany et al. 2022, Figures 2 and 5], which we extend with support for external functions implemented in C. First, we add a new context Σ, which assigns a type fn : τ1 → ⋯ → τn → τ to every external function fn in it. Then, we interpret the contexts using the interfaces Ψ of Iris ML such that each external function fn ∈ Σ is assigned (the semantic interpretation of) its type as a specification in Ψ. Finally, we use Iris C to validate the assumed types of external functions, which means we prove in Iris C that the C-implementation satisfies the interpretation of the given type. Besides proving type safety of individual external functions fn, we can additionally use the logical relation to prove type safety of OCaml clients that wrap them in a safe abstraction (e.g., is_compressible).
Landin's knot. Our next example, a simple modification of Landin's knot [Landin 1964], illustrates that Melocoton and its logical relation are powerful enough to reason about higher-order functions, callbacks to ML , and mutual recursion through the FFI (and the heap). Our version of Landin's knot implements recursion through backpatching by combining ML and C code. We proved functional correctness of knot (i.e., it is a recursion combinator), and that it is semantically safe (i.e., in the logical relation) at type ∀α, β. ((α → β) → (α → β)) → (α → β).
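For readers unfamiliar with the technique, recursion through backpatching can be sketched in a few lines of plain OCaml. This is a single-language analogue given only for illustration: in the version verified in the paper, part of the code lives in C behind the FFI, whereas here the reference cell is an ordinary OCaml reference. The example function factorial is an assumption of this sketch.

```ocaml
(* Landin's knot in plain OCaml: recursion without [let rec], obtained by
   storing the function in a mutable cell and patching the cell afterwards. *)
let knot (f : ('a -> 'b) -> 'a -> 'b) : 'a -> 'b =
  (* Placeholder closure; it is overwritten below before any call can reach it. *)
  let cell = ref (fun _ -> failwith "knot: not yet backpatched") in
  (* Recursive calls go through the cell, so they see the patched function. *)
  let self x = f (fun y -> !cell y) x in
  cell := self;
  self

(* Illustrative client: factorial defined through the combinator. *)
let factorial : int -> int =
  knot (fun self n -> if n <= 1 then 1 else n * self (n - 1))
```

The inferred type of knot is (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b, which corresponds to the type of a recursion combinator stated above.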
Event listeners. As our last example, we have verified a small library of "event listeners". It demonstrates a tricky use of higher-order state in C (i.e., memory storing closures), which we handle using step-indexing. The ML-interface for this library is below. It captures a programming pattern commonly found in event-based GUI libraries to mediate between "event consumers" (clients of the GUI library) and "event producers" (the backend of the GUI). Consumers are handed an abstract value (of type 'a listener) to which they can attach callbacks (using listen) to react to future events. When an event happens, the backend can notify (using notify) a listener, which triggers the consumer's callback. The library is implemented in C and exposed using the FFI.
type 'a listener
external create : unit -> 'a listener = "listener_create"
external listen : ('a -> unit) -> 'a listener -> unit = "listener_listen"
external notify : 'a -> 'a listener -> unit = "listener_notify"
The most interesting thing about this library is that the implementation of listen stores an arbitrary ML-callback in a mutable data structure managed on the C-side. The callback, by the nature of OCaml's type system, can capture the listener to which it is attached. This makes proving type safety of the library tricky, because we can easily run into circularity issues when naively attempting to define an interpretation of 'a listener (e.g., a listener is safe if its callbacks are safe, and callbacks are safe if the listeners they capture are safe). We resolve these circularity issues as usual in logical relations, by using step-indexing. Concretely, analogous to our standard interpretation of reference types [Timany et al. 2022, Figure 5], we make use of Iris's impredicative invariants [Jung et al. 2018, §7.1], which internally are modelled using step-indexing. As mentioned in §4.3, for the semantics −↠ ML+C , this is only sound because we use a transfinitely step-indexed version of Iris. Ultimately, we prove the functions (create, listen, notify) safe at type: ∀α. ∃listener. (unit → listener) × ((α → unit) → listener → unit) × (α → listener → unit).
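The circularity mentioned above can be observed in a small pure-OCaml model of the interface. The sketch below is an assumption made for illustration only: the verified library stores callbacks in C memory, whereas this model uses a mutable OCaml record, and the names log and l as well as the delivery order of callbacks are choices of the sketch.

```ocaml
(* Pure-OCaml stand-in for the C-backed library; same signatures as the interface. *)
type 'a listener = { mutable callbacks : ('a -> unit) list }

let create () : 'a listener = { callbacks = [] }

let listen (cb : 'a -> unit) (l : 'a listener) : unit =
  l.callbacks <- cb :: l.callbacks

(* Deliver the event to every registered callback, oldest first. *)
let notify (x : 'a) (l : 'a listener) : unit =
  List.iter (fun cb -> cb x) (List.rev l.callbacks)

(* A callback that captures the listener it is attached to: each event n > 0
   triggers a further notification with n - 1 on the same listener. *)
let log : int list ref = ref []
let l : int listener = create ()

let () =
  listen (fun n -> log := n :: !log; if n > 0 then notify (n - 1) l) l

let () = notify 2 l
```

After notify 2 l, the callback has run for 2, 1 and 0, so the listener's stored closure refers back to the listener itself. This is the kind of cycle through higher-order state that the step-indexed interpretation has to accommodate.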

RELATED WORK
Melocoton is, to our knowledge, the first program logic for programs spanning multiple languages with different memory models. (The only other multi-language program logic that we are aware of is Iris-Wasm, discussed below.) Here, we also compare with work that tackles the related problem of compiler verification in a multi-language setting (but is not based on a program logic).
Iris-Wasm. Iris-Wasm [Rao et al. 2023] provides an Iris-based program logic for reasoning about WebAssembly and its interaction with its host language, which in Iris-Wasm is a tiny subset of JavaScript. The memory model of the host is a minor extension of WebAssembly's memory model and hence, there is no view reconciliation challenge. In contrast, Melocoton shows how to scale verification to more complex FFIs that require integrating program logics with very different memory models and how to deal with the problems that arise, such as view reconciliation (§2.5).
Cito. Cito [Pit-Claudel et al. 2020; Wang et al. 2014] is a C-like language with a verified compiler formalized in Coq, which supports linking with functions from other languages via axiomatic specifications built into its operational semantics. While these axiomatic specifications follow the style of open simulations [Hur et al. 2012] like the interfaces Ψ presented in this paper, they are not phrased using a program logic, but stated using abstract data types and pure pre- and postconditions. Cito uses program logics to reason about individual languages, but it does not tackle the problem of building a multi-language program logic like Iris ML+C .
Cogent. Cheung et al. [2022] show how to extend the compiler correctness theorem of Cogent [O'Connor et al. 2016, 2021] with manually verified external C functions. Cheung et al. require the C code to uphold the invariants guaranteed by Cogent's linear type system and focus on extending the correctness proof of the Cogent compiler using these invariants. In contrast, we assume correctness of the OCaml compiler and runtime and instead focus on building a program logic for verifying OCaml-and-C programs that maintain the (more complex) OCaml runtime invariants.
Semantic soundness for language interoperability. Patterson et al. [2022] prove type safety of the interaction between (among others) a MiniML-style garbage collected language and an L3-style language with manual memory management. Instead of Melocoton's source-level reasoning, they compile both languages to a common target language and build logical relations that relate source-level types with target-level terms. This approach shifts reasoning to the target language. For example, they relate values from different languages using target-level conversion functions instead of Melocoton's source-level relations like V ∼ ML and ∼ C w.
Verified compilers. Mates et al. [2019] (building on the work of Patterson et al. [2017]) verify a compiler from a stateful language with closures to one without closures in a syntactic multi-language [Matthews and Findler 2007; Perconti and Ahmed 2014] using a logical relation for contextual equivalence. Hur and Dreyer [2011] verify a one-pass compiler from an ML-like language to an assembly-like language, via a cross-language logical relation where garbage collection is axiomatized. Their approach supports linking with manually verified assembly-level code, but only if that code can be proven behaviorally equivalent to some ML-level module. Compositional CompCert [Stewart et al. 2015] and the line of work it inspired [Gu et al. 2015; Koenig and Shao 2021; Song et al. 2020] show how to extend the CompCert compiler with cross-language linking. The composition of languages based on external calls in Melocoton is inspired by Compositional CompCert's interaction semantics, but extends it with DimSum-style wrappers [Sammler et al. 2023] to handle linking of languages with different memory models. DimSum [Sammler et al. 2023] provides a "decentralized" approach for reasoning about multi-language programs via combinators for linking and language translation, which inspired the definition of the operational semantics in Melocoton (see §3). DimSum only considers the interaction of relatively low-level languages, none of which have such a rich FFI and runtime as OCaml.
Formal models of garbage collection using nondeterminism. Our operational model of garbage collection (§3) uses nondeterminism, which has also been used before in the literature. Hur and Dreyer [2011] similarly decouple GC-managed values from their physical representation, albeit in a program equivalence setting without interactions between the different languages. They model GC-managed values as stored in logical memories in which pointers are never moved or deallocated. Logical memories are related to physical memory using a lookup table; running the GC is modeled as non-deterministically changing the lookup table. These closely match the block store and address map of our runtime semantics, respectively (Fig. 7). The idea of modeling the effect of the GC via non-determinism (subject to constraints) can also be found in other, more recent work [Moine et al. 2023; Wang et al. 2019]. None of the above work studies the combination of a GC with an FFI and, hence, they do not consider the ability to register user-declared roots.
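The separation between a stable block store and a changeable address map can be sketched as follows in OCaml. This is a toy illustration, not the runtime semantics of Fig. 7: the concrete constraints checked by gc_step_allowed (live blocks stay mapped, and the address map stays injective) are assumptions of the sketch, as are all names.

```ocaml
module IntMap = Map.Make (Int)

type block_id = int                    (* stable logical identifier of a block *)
type addr = int                        (* physical address, may change at a GC *)
type block_store = int list IntMap.t   (* contents, indexed by logical identifier *)
type addr_map = addr IntMap.t          (* where each block currently lives *)

(* Small helper to build maps from association lists. *)
let map_of_list (xs : (int * 'a) list) : 'a IntMap.t =
  List.fold_left (fun m (k, v) -> IntMap.add k v m) IntMap.empty xs

(* No two blocks may share an address. *)
let injective (m : addr_map) : bool =
  let addrs = List.map snd (IntMap.bindings m) in
  List.length (List.sort_uniq compare addrs) = List.length addrs

(* Assumed constraints on a GC step: any new address map is acceptable as long
   as it is injective and every live block still has an address. The GC is
   otherwise free to move blocks or drop dead ones. *)
let gc_step_allowed ~(live : block_id list) (after : addr_map) : bool =
  injective after && List.for_all (fun g -> IntMap.mem g after) live

(* Reads go through logical identifiers, so they are unaffected by a GC step. *)
let read (store : block_store) (g : block_id) : int list = IntMap.find g store
```

In this reading, a registered root would contribute its block to the live list, which is what keeps the block mapped across any admissible GC step.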
Formal reasoning about the OCaml FFI. The model of the OCaml FFI in this paper is based on the informal description given in the OCaml manual [oca 2023b]. We model the core features of the FFI, but omit some more advanced features. For instance, we do not model direct pointer-accesses to the contents of runtime blocks as if they were normal C memory. Instead, all modifications go through runtime primitives provided by the FFI. Direct accesses are primarily used for efficiency, to enable exchanging data such as strings or byte arrays without making copies. We also do not currently model features like exceptions or multithreading. Furthermore, we model the primitives for registering roots with a more elementary API than the one provided by the OCaml FFI through CAMLlocal, CAMLparam and CAMLreturn. Alternative APIs for rooting have also been proposed by Munch-Maccagnoni and Scherer [2022], and we believe that these APIs would be easy to specify on top of our primitives.
Furr and Foster [2005] build a type system for C glue code that uses the OCaml FFI to detect common misuses of the FFI. This type system is proven sound using a formal model of the C-side of the FFI. However, because their focus is on finding bugs in C glue code, they do not model the OCaml side of the FFI and do not target verification of mixed OCaml and C code like Melocoton. This also means that they do not handle advanced features such as callbacks and closures.
Fig. 1. C memory layout of the buf-record {cap, used, data} as exposed by the OCaml FFI
Fig. 3. Syntax, state, and resources of C and ML

Fig. 4. A selection of the reasoning rules of Iris C and Iris ML . The program and the interface Ψ are omitted in rules that do not mention them.
[ML] FFI (with semantics −↠ [ML] FFI ). The wrapper takes a ML-expression e and produces a program [e] FFI with C-values and the C-memory model. However, the program [e] FFI is not a syntactic C-program; it is a program in the language [ML] FFI . The language [ML] FFI uses C-values and the C memory model, but has a very different notion of expressions and semantics (see §3.1).

Fig. 7. Runtime state of the wrapper.

Fig. 8. The interface rules of Iris ML+C
and ↦ → blk[ | ] ì of the runtime. Fortunately, the resources ↦ → C cl and ↦ → blk[ | ] ì never overlap, because the operational semantics separates the runtime-managed memory (whose resources are of the form ↦ → blk[ | ] ì ) from the manually-managed C memory (whose resources are of the form ↦ → C cl). Thus, we focus on the interaction of ℓ ↦ → ML ì V and ↦ → blk[ | ] ì . The resources ℓ ↦ → ML ì

We have discussed how to verify programs [e] FFI ⊕ in Iris ML+C (see Theorem 4.1). What is still missing is what we obtain from verifying a program in Iris ML+C . The answer is adequacy:
Theorem 4.2 (Adequacy of Iris ML+C ). Let be a ML+C-program and a pure predicate on integers. If ∅ |= : main( ), then ; (call main [], ∅) −↠ * ML+C {( ˆ , ℎ) | ( )}.
Intuitively, this theorem says that call main [] (which will trigger the execution of e) can only terminate in integers satisfying the predicate, or diverge. In particular, call main [] is safe to execute. Formally, we define the program executions on multi-relations coinductively as