DisLog: A Separation Logic for Disentanglement

Disentanglement is a run-time property of parallel programs that facilitates task-local reasoning about the memory footprint of parallel tasks. In particular, it ensures that a task does not access any memory locations allocated by another concurrently executing task. Disentanglement can be exploited, for example, to implement a high-performance parallel memory manager, such as in the MPL (MaPLe) compiler for Parallel ML. Prior research on disentanglement has focused on the design of optimizations, either trusting the programmer to provide a disentangled program or relying on runtime instrumentation for detecting and managing entanglement. This paper provides the first static approach to verify that a program is disentangled: it contributes DisLog, a concurrent separation logic for disentanglement. DisLog enriches concurrent separation logic with the notions necessary for reasoning about the fork-join structure of parallel programs, allowing the verification that memory accesses are effectively disentangled. A large class of programs, including race-free programs, exhibit memory access patterns that are disentangled "by construction". To reason about these patterns, the paper distills from DisLog an almost standard concurrent separation logic, called DisLog+. In this high-level logic, no specific reasoning about memory accesses is needed: functional correctness proofs entail disentanglement. The paper illustrates the use of DisLog and DisLog+ on a range of case studies, including two different implementations of parallel deduplication via concurrent hashing. All our results are mechanized in the Coq proof assistant using Iris.


INTRODUCTION
Recent work has shown that parallel functional programming can deliver the same efficiency and scalability as imperative and procedural approaches. The key to this line of work is a memory property known as disentanglement [Arora et al. 2021, 2023; Guatto et al. 2018; Raghunathan et al. 2016; Westrick et al. 2022, 2020], which restricts parallel tasks to accessing only data that was allocated "before" the task executed. This restriction enables tasks to allocate and garbage-collect memory locally and independently, that is, without synchronizing with other parallel tasks. Utilizing disentanglement, Arora et al. [2023] developed a provably efficient memory manager for functional programs which also provides full support for effects. All of this work is implemented in MPL ("maple"), an open-source compiler for Parallel ML. In practice, MPL has been shown to be fast, scalable, and competitive with lower-level and imperative language implementations. This line of work relies on disentanglement to ensure efficiency and scalability, and leaves it up to the programmer to reason about disentanglement and its performance impact. For purely functional code, reasoning about disentanglement is not an issue: purely functional programs are guaranteed to be disentangled, by construction, due to the lack of mutation (in-place updates). Programmers can also use pure libraries that are implemented under the hood with in-place updates for efficiency. For example, common parallel operations (such as map, reduce, scan, etc.) can be implemented efficiently with mutable arrays, hidden behind a pure interface, and can then be used to write disentangled code. However, in this example, the developers of high-performance libraries still need to reason carefully about disentanglement. More generally, whenever in-place updates and other low-level optimizations are necessary for efficiency, disentanglement needs to be taken into account.
In the context of in-place updates and other memory effects, reasoning about disentanglement is subtle. Programmers may wish to use concurrent data structures (for example, lock-free hash tables) to improve efficiency. Such data structures can be made disentangled [Westrick 2022], but reasoning about their correctness is challenging, even for experts. If disentanglement is violated, there can be significant consequences for performance, in terms of increased time and space usage [Arora et al. 2023]. In this sense, disentanglement can be considered a "safety" condition for performance-oriented code.
Therefore, we shift our attention to static verification of disentanglement. Our goal is to support reasoning about both high-level and low-level code, including atomic in-place updates and concurrent data structures, which can require identifying intricate invariants. In this setting, concurrent separation logic [Brookes 2007; O'Hearn 2007] and its modern variants [Jung et al. 2018; Nanevski et al. 2014] have proven to be successful vehicles for verifying safety and correctness properties of programs in the presence of challenging concurrent features. An intriguing question is whether or not separation logic can be used to prove disentanglement, which we address in this paper.
To verify disentanglement statically, we develop DisLog, the first program logic for proving disentanglement, and formally prove its soundness. At a high level, DisLog is a concurrent separation logic built on Iris [Jung et al. 2018] endowed with assertions which describe dependencies between parallel tasks and permissions to make disentangled loads from the heap. This approach makes the logic powerful enough to verify disentanglement even in complex and subtle situations, such as programs with lock-free data structures and algorithms using atomic in-place reads and writes.
Going further, on top of DisLog, we develop DisLog+, a standard concurrent separation logic which hides the details of disentanglement, allowing for standard separation logic proofs while also getting proofs of disentanglement for free. DisLog+ applies to a wide variety of programs, including purely functional programs, race-free programs, and even programs that have "benign" memory races (for example, write-write races). Importantly, DisLog and DisLog+ work seamlessly together, allowing DisLog+ proofs to drop into the more powerful DisLog where needed (for example, for verifying non-pure segments of mostly pure programs), and otherwise stay at a high level of abstraction.
To evaluate DisLog and DisLog+, we consider several case studies, including key parallel primitives, as well as sophisticated parallel algorithms involving concurrent data structures. In all cases, we prove that the programs are disentangled. Our experience has shown that, using the logics developed in this paper, the effort of proving disentanglement is typically small. Furthermore, when a formal proof of functional correctness is desired, using DisLog+ often yields a proof of disentanglement for free.

Our contributions include:
• DisLog, the first program logic to verify that a program is disentangled (§4). It employs the notion of timestamps to reason about the nested fork-join parallelism of a program and introduces a novel clock assertion to prove that memory accesses do not cause entanglement.
• DisLog+, a high-level logic built on top of DisLog that shields the user from timestamp management (§5). As a result, race-free programs can be verified in DisLog+ with the standard reasoning rules of concurrent separation logic.
• Two mechanisms for reasoning about benign races in DisLog+, including fractional write-only assertions for write-write races (§5.4), and a set of rules for read-write races on pre-allocated data (§5.5).
• A range of case studies (§6), including multiple parallel primitives, parallel lookup in a lazy collection, and two examples of deduplication via concurrent hashing.
• A formalization in the Coq proof assistant using Iris [Jung et al. 2018]. All our results are mechanized (§7) in Coq, including: the two program logics, their soundness theorems, and the case studies [Moine et al. 2023b].

KEY IDEAS

Background
Nested Fork-Join Parallelism. We consider programs written using a single parallel primitive: the parallel tuple e₁ || e₂. It spawns two tasks that execute the expressions e₁ and e₂ in parallel, and returns their results as a pair. Parallel tuples may be arbitrarily nested. For example, the expression e₁ might itself execute another parallel tuple. This leads to a dynamic nesting structure of tasks called the task tree, where the leaves are tasks that may take steps in parallel. In an operational semantics, the task tree is maintained by two distinguished reductions: a fork, where two tasks are spawned to execute e₁ and e₂ in parallel, and a join, when the tasks complete and return their results as a pair. Parallel tasks can be understood as concurrent threads with a structure: if they terminate, the two tasks forked by a parallel pair ultimately join.
This style of programming is known as nested fork-join parallelism, or sometimes nested task parallelism. The arbitrary nesting of parallel tasks allows programmers to write parallel recursive divide-and-conquer style algorithms. For example, a "parallel for-loop" can be implemented by splitting the index range in half and then recursively executing the two halves in parallel (§6.2).
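The divide-and-conquer "parallel for-loop" mentioned above can be sketched as follows. This is an illustrative Python model, not the paper's DisLang or Parallel ML code: the names `par` and `parallel_for` and the `grain` cutoff are assumptions introduced here, and the parallel tuple is emulated with one extra thread per fork.

```python
import threading


def par(f, g):
    """Emulate the parallel tuple: run f and g in parallel, return a pair."""
    results = [None, None]

    def run(slot, task):
        results[slot] = task()

    left = threading.Thread(target=run, args=(0, f))
    left.start()        # fork: the left branch runs in a new thread
    run(1, g)           # the current thread executes the right branch
    left.join()         # join: wait for the left branch to complete
    return (results[0], results[1])


def parallel_for(lo, hi, body, grain=1024):
    """Apply body(i) for every i in [lo, hi) by recursive halving."""
    if hi - lo <= grain:
        # Small ranges are processed sequentially (hypothetical cutoff).
        for i in range(lo, hi):
            body(i)
        return
    mid = lo + (hi - lo) // 2
    par(lambda: parallel_for(lo, mid, body, grain),
        lambda: parallel_for(mid, hi, body, grain))
```

Each recursive call to `par` corresponds to one fork followed by one join, so the nesting of calls mirrors the task tree described above.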
Acquiring locations. During execution, each task may allocate locations in memory and perform memory effects such as reads and writes on these locations, including atomic compare-and-swaps (CAS). The reads in particular are important for the disentanglement property we consider. In our operational semantics (§3.3), a read occurs in three distinct cases: (1) a memory load inside an array reads the indexed value, (2) a closure call reads the environment of the closure, and (3) a CAS reads the scrutinized value. When a task performs a read, if the result of that read is a location, we say that the task acquires the resulting location.
Disentanglement. Disentanglement limits communication between concurrent tasks by restricting which locations may be acquired: each task may always acquire its own allocations, and additionally, each task may acquire any location allocated "before" the task began. The notion of "before" relates to the dependencies induced by forks and joins. A forking task comes before the two tasks it forks, and conversely, two joining tasks come before their join point. If a task ever acquires a location allocated by some other task that is executing concurrently, this constitutes entanglement. The logics developed in this paper allow proving that a program is disentangled, i.e., that in every possible execution of the program, entanglement will never occur.
Atomic operations and determinacy races. A key feature of disentanglement is that it allows for non-deterministic interleaving of atomic in-place operations such as atomic loads, stores, and CASes. Such operations are commonly used under the hood in the implementation of high-performance libraries (for example, in the implementation of a high-throughput lock-free hash table). Proving disentanglement in this setting requires reasoning carefully about atomic operations that can acquire locations.
In other words, one of the challenges is to prove that entanglement is impossible whenever there is a determinacy race [Feng and Leiserson 1999]. Concretely, a determinacy race occurs whenever two atomic in-place operations are performed concurrently at the same location, and at least one of the operations modifies the location. For example, an atomic load could race with an atomic store, or an atomic load could race with a CAS, or two CASes could race with each other, etc. As the name suggests, determinacy races can lead to non-deterministic execution, which we allow.
In this paper, we permit only atomic (i.e., properly synchronized) in-place operations, and therefore avoid all data races [Adve 2010; Boehm 2011; Dolan et al. 2018] by construction. For simplicity, we assume a sequentially consistent memory model. Because of the lack of data races, throughout the paper, we use the term race to refer only to determinacy races.

Running Example
To illustrate the ideas in the paper, we use a running example, called scratch, shown in Fig. 1. This function is non-deterministic due to a determinacy race, yet is disentangled. At a high level, scratch calls the function doWork two times in parallel. Each call to doWork uses an array (called a "scratchpad") as temporary space. Note that it would be safe to allocate a fresh scratchpad for every call to doWork. The goal of the example is to optimize performance by reducing the number of scratchpads that are allocated. (The example only calls doWork twice, but this could be generalized to any number of calls in parallel, which would make the optimization more significant.) To reduce the number of allocated scratchpads, scratch implements a simple strategy. First, a shared scratchpad is allocated together with a lock to protect it (Fig. 1, lines 3-5). Then, before each call to doWork, scratch will attempt to claim access to the shared scratchpad by calling tryLock (line 7). If this succeeds, then doWork may use the shared scratchpad (line 8); otherwise, scratch falls back on allocating a fresh scratchpad (line 12). Whenever a call to doWork is finished using the shared scratchpad, the shared scratchpad is cleared (line 9), before finally releasing the lock (line 10). In this way, scratch reduces the number of allocations by reusing the shared scratchpad as much as possible. In particular, if scratch is executed using only a single processor, then every call to doWork will be able to use the shared scratchpad, and no additional scratchpads will be allocated.
Fig. 2. One possible execution of the example of Fig. 1.
The auxiliary code for the example is shown in Fig. 1b, which defines scratchpads and locks. The details of doWork are not important as long as it performs reads and writes on the scratchpad. Locks are implemented by a pair of closures with a mutable boolean, indicating whether the lock has been locked. The closure tryLock is implemented using an atomic compare-and-swap (CAS) which returns true if the CAS succeeds, and false otherwise.
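Since Fig. 1 is not reproduced in this text, the following Python sketch reconstructs the scratch strategy from the prose description only. It is an assumption-laden model, not the paper's code: the snake_case names, the `par` parameter (any function that runs two thunks and returns a pair), the `fresh_pads` instrumentation, and the emulation of CAS with a guard lock are all introduced here for illustration.

```python
import threading


def new_lock():
    """Return (try_lock, unlock): two closures sharing one mutable boolean."""
    locked = [False]
    guard = threading.Lock()  # stands in for the atomicity of a hardware CAS

    def cas(expected, new):
        # Atomically replace the flag if it currently equals `expected`.
        with guard:
            if locked[0] == expected:
                locked[0] = new
                return True
            return False

    def try_lock():
        return cas(False, True)   # True iff the CAS succeeds

    def unlock():
        locked[0] = False

    return try_lock, unlock


def clear_scratchpad(pad, default_elem):
    for i in range(len(pad)):
        pad[i] = default_elem


def scratch(do_work, par, size, default_elem):
    """Call do_work twice via `par`, reusing one shared scratchpad when possible."""
    shared = [default_elem] * size
    try_lock, unlock = new_lock()
    fresh_pads = []  # instrumentation only: one entry per fallback allocation

    def task():
        if try_lock():
            result = do_work(shared)
            clear_scratchpad(shared, default_elem)  # restore the invariant
            unlock()
            return result
        pad = [default_elem] * size                 # fallback: fresh scratchpad
        fresh_pads.append(pad)
        return do_work(pad)

    return par(task, task), len(fresh_pads)
```

When `par` runs its two thunks one after the other, both calls win the lock and no fresh scratchpad is allocated, which matches the single-processor behaviour described above.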
Disentanglement in scratch. Proving that the scratch example is disentangled is subtle. In particular, doWork may read or write to the scratchpad, and we need to show that reading from the scratchpad will never return a value allocated by a concurrent task. Thankfully, the scratch function guarantees a strong precondition: when doWork begins, the argument scratchpad will contain only the value defaultElem, which is allocated before every call to doWork, and therefore is safe with respect to disentanglement.
The precondition on doWork is easily satisfied on line 13, because the scratchpad is freshly allocated. Showing that the precondition is also satisfied on line 8 is more subtle, because there is an invariant on the shared scratchpad which is determined by the state of the lock. Informally, the invariant is: "while the lock is not held, for every i, shared[i] = defaultElem." This invariant is re-established by calling clearScratchpad (line 9) before releasing the lock. Note that removing this call to clearScratchpad may lead to an entangled state. Indeed, after a call to doWork, the scratchpad may contain locally-allocated data, which would then be visible to the other task.

Disentanglement: Timestamps, Reads, and How to Reason about Them
Partial orders on tasks through timestamps. In Sec. 3, we present a semantics that facilitates reasoning about disentanglement. To this end, we enrich the semantics with the notion of a timestamp, a unique identifier for each parallel task. We then assign every heap-allocated location a timestamp, marking when (i.e., by which task) the location was allocated, and restrict every task to only depend on locations allocated at timestamps that come before their timestamp. Timestamps form a partial order which respects the dependencies induced by forks and joins. When a task forks, the semantics generates two timestamps (one for each task) that are preceded by the timestamp of the forking task. Conversely, when two tasks join, the semantics generates a timestamp that is preceded by both timestamps of the two joining tasks. Forks and joins are the only operations extending the partial order of timestamps. Tasks otherwise step independently.
Fig. 2 visualizes one possible execution of the running example. Each shaded oval represents a task, and is labeled by its timestamp t. The framed content is not relevant yet. Execution begins on a task t₁, which allocates the scratchpad and the lock. Then, a fork occurs, generating two tasks t₂ and t₃. The task t₂ fails to win the lock, whereas the task t₃ succeeds. After completing, the two tasks join, forming a new task t₄.
A program logic for timestamp orders. In Sec. 4, we develop a program logic, DisLog, that incorporates timestamps. Every expression in DisLog is associated with a current timestamp and an end timestamp. The program logic uses a weakest precondition (WP) modality which takes the form wp ⟨t, e⟩ {λ t′ v. Φ}, asserting that the expression e is currently evaluated by a task at timestamp t, that e is disentangled and can reduce, and that if its reduction terminates, then it does so at end timestamp t′, yielding a value v such that Φ holds. The partial order of timestamps is encoded into the assertions of the logic, via the precedence assertion t ≼ t′. The precedence assertion is persistent and hence duplicable at will. This assertion describes the parallel structure of the program being verified.
Examples of the precedence assertion t ≼ t′ appear in the framed boxes of Fig. 2. To reason about the task t₂, the user gets an assertion t₁ ≼ t₂. Dually, the user gets an assertion t₁ ≼ t₃ to reason about the task t₃. At the join point t₄, the user gets the assertions t₂ ≼ t₄ and t₃ ≼ t₄. Making use of the fact that ≼ is a pre-order, the user can use transitivity and deduce for example that t₁ ≼ t₄.
Preserving disentanglement. Acquiring a memory location ℓ from the heap puts disentanglement at risk. Disentanglement is preserved if and only if ℓ was allocated at some timestamp preceding the timestamp t of the acquiring task; in that case, we say that ℓ was allocated before t. To represent such a requirement, we introduce the clock assertion on a location ℓ and a timestamp t, which asserts precisely that ℓ was allocated before t. This assertion appears for example in the precondition of DisLog's Load rule (§4.3). To illustrate this rule, we show a specialized instantiation, SpecializedLoad, that loads the element at index 0 in the shared scratchpad, here named ℓ. The premises are implicitly separated by the separating conjunction *.
The SpecializedLoad rule first requires, as in standard separation logic, ownership of the shared location via a points-to assertion. Crucially, this rule also requires that the loaded value ℓ was allocated before the current timestamp t, via the clock assertion on ℓ and t. In the postcondition of the WP, the rule asserts that the end timestamp t′ is equal to the previous timestamp t, the returned value is ℓ, and the user still has the points-to ownership.
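The rule itself did not survive in this text. From the description above, its shape can be sketched as below. This is a reconstruction under stated assumptions, not the paper's exact rule: $\mathsf{clock}(\ell, t)$ is a stand-in for the clock assertion, whose concrete notation is not recoverable here, and $\vec{w}$ is an assumed name for the contents of the shared scratchpad.

```latex
\[
\frac{\mathit{shared} \mapsto \vec{w}
      \qquad \vec{w}(0) = \ell
      \qquad \mathsf{clock}(\ell, t)}
     {\mathsf{wp}\ \langle t,\ \mathit{shared}[0] \rangle\
      \{\lambda t'\, v.\ \ulcorner t' = t \wedge v = \ell \urcorner
        * \mathit{shared} \mapsto \vec{w}\}}
\ \textsc{SpecializedLoad}
\]
```

The premises are read as joined by the separating conjunction, and the points-to assertion reappears unchanged in the postcondition because a load does not modify the heap.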
The clock assertion is persistent, giving the user great flexibility. Moreover, it is monotonic with respect to the precedence pre-order. Hence, if the user knows that the location ℓ was allocated before t and that t precedes t′, they can then deduce that ℓ was allocated before t′. This mechanism is illustrated in Fig. 2. Indeed, the user can produce a clock assertion on defaultElem and t₁ upon the allocation of defaultElem. Then, while reasoning about t₃, the user can use the assertion t₁ ≼ t₃ to generate a new clock assertion on defaultElem and t₃. As explained next, our high-level logic DisLog+ takes full advantage of the monotonicity of the clock assertion.

Going High-Level: Simple Programs Should Have Simple Proofs
Readers familiar with proofs of realistic programs may be worried by timestamps polluting the logic, and the additional proof burden imposed on a common rule such as Load. In practice, many programs "don't poke the bear" and subtle reasoning about timestamps should not be needed. For example, Westrick et al. [2020] show that race-free programs are always disentangled, as they prevent the communication of concurrently-allocated locations. Verifying such programs should be as cheap as a standard separation logic proof.
In Sec. 5, we present DisLog+, a high-level separation logic where timestamps, clocks, and precedence are confined to very few occurrences. DisLog+ allows reasoning on race-free programs with the standard reasoning rules of concurrent separation logic. The sole difference is a restriction on ghost state, effectively preventing races (§5.3). What is the secret of the DisLog+ logic? The key observation we make on race-free programs is that (1) the content of a freshly allocated location is always safe to acquire for the allocating task and (2) "being safe to acquire" is a monotonic property: if a location is safe to acquire for a given task, it is safe to acquire for every subsequent task. As long as a program does not write carelessly to a shared location and break monotonicity, locations are always safe to acquire and no reasoning about timestamps is needed.
Technically, we define assertions of DisLog+ as monotonic predicates of DisLog over an ambient timestamp, the latter being implicitly threaded through during the proof. Points-to assertions of DisLog+ store not only ownership information but also the proof that all pointed-to locations are safe to acquire at the ambient timestamp. Hence, DisLog+ provides a standard Load reasoning rule. We stress that DisLog+ is a light abstraction over DisLog. At any moment during the proof, the user of DisLog+ can fall back to DisLog for fine timestamp-related reasoning.
Our definition of DisLog+ is directly inspired by separation logics for weak-memory models: iGPS [Kaiser et al. 2017], iRC11 [Dang et al. 2020] and Cosmo [Mével et al. 2020]. These high-level logics are defined in terms of low-level logics by implicitly threading through a monotonic view of the memory.
Beyond race freedom. It turns out that DisLog+ is not confined to reasoning about race-free programs, but additionally provides two new sets of high-level reasoning rules to accommodate the most elementary disentangled races. The first one consists of fractional write-only assertions (§5.4) allowing the user to reason about write-write races within DisLog+. As a write-write race does not acquire any location, such a race is always disentangled. The second one consists of a set of rules unveiling just enough timestamps to reason about races on "obviously safe" data (§5.5). These data include data that was allocated before the beginning of the parallel phase, and unboxed data, that is, data that is not allocated in the heap.
The language we model supports the atomic operation compare-and-swap (CAS). A CAS is an entanglement hazard. Indeed, a CAS reads the scrutinized value, which must be safe to acquire. In the scratch running example (Fig. 1b), we use CAS to implement a spin-lock. Here, we exploit unboxed data to allow parallel tasks to safely communicate via a race on a shared reference. A "race on unboxed data" should ring a bell: it perfectly fits in the realm of DisLog+ and its extensions. We show in Sec. 6 how to reason about our locks and scratch entirely within the high-level DisLog+.

LANGUAGE AND SEMANTICS
Our language, DisLang, is an imperative lambda-calculus with fork-join parallelism. We equip DisLang with a small-step, substitution-based, call-by-value semantics, guaranteeing disentanglement.

Syntax
The syntax of DisLang appears in Fig. 3. A value v ∈ V can be the unit value (), a boolean b ∈ {true, false}, an idealized integer i ∈ Z, a memory location ℓ ∈ L, where L is an infinite set of locations, or a top-level function μ̂ f. λ x⃗. e. A top-level function is closed in the sense that the only variables available in the function body are the function's name f and the formal parameters x⃗.
A block describes the contents of a heap cell, amounting to either an array of values, written w⃗, or a λ-abstraction μ f. λ x⃗. e. Lambdas, as opposed to top-level functions μ̂ f. λ x⃗. e, are not values. Instead, they are compiled to heap-allocated closures [Appel 1992; Landin 1964]. Hence, acquiring a lambda can create entanglement. Top-level functions can be seen as closures that are pre-allocated outside the heap, which thus cannot create entanglement. In DisLang, fork-join parallelism is available via the parallel tuple e₁ || e₂, representing the expressions e₁ and e₂ to be computed in parallel. DisLang supports a compare-and-swap instruction CAS, which targets an array and takes four arguments: the location of the array, the index in the array, the old value, and the new value. An evaluation context K describes a term with a hole, written □. The syntax of evaluation contexts dictates a left-to-right call-by-value evaluation.

Computation Graphs and Disentanglement
The dynamics of DisLang, presented in the next section, makes use of a computation graph, capturing the nested fork-join parallel structure of a program. A computation graph is a directed acyclic graph where vertices, or tasks, represent sequential computations, and edges represent the dependencies between them [Acar et al. 2016]. We label each task with a unique timestamp t, from an infinite set 𝒯. When a task t₀ forks two fresh tasks t₁ and t₂, the computation graph is extended with edges (t₀, t₁) and (t₀, t₂). Conversely, when two completed tasks t₁ and t₂ join to form a fresh task t₃, the computation graph is extended with edges (t₁, t₃) and (t₂, t₃). As discussed earlier, an example computation graph for the scratch example (§2) is shown in Fig. 2.
In a computation graph G, we say that t precedes t′ and write t ≼ t′ when there exists a sequence of edges in G from t to t′. In particular, we say that two tasks are concurrent when neither precedes the other. Entanglement occurs when a task acquires a location that was allocated by a concurrent task. Recall our running example (§2, Fig. 1): a particular implementation of doWork can store a locally-allocated location in the shared scratchpad. If it were possible for two concurrent tasks to both win the lock, without proper cleaning of the scratchpad, this locally-allocated location could be acquired by the concurrent task, generating entanglement.
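The precedence relation and the notion of concurrent tasks can be made concrete with a small model. The following Python sketch is illustrative only: the class and method names are assumptions, timestamps are modelled as integers, and precedence is taken to be reflexive (an empty sequence of edges is allowed), in line with the pre-order reading used earlier.

```python
class ComputationGraph:
    """A DAG of tasks; vertices are integer timestamps."""

    def __init__(self):
        self.succ = {}      # timestamp -> set of direct successors
        self.next_id = 1

    def fresh(self):
        t = self.next_id
        self.next_id += 1
        self.succ[t] = set()
        return t

    def fork(self, t0):
        """Add edges (t0, t1) and (t0, t2) for two fresh tasks."""
        t1, t2 = self.fresh(), self.fresh()
        self.succ[t0] |= {t1, t2}
        return t1, t2

    def join(self, t1, t2):
        """Add edges (t1, t3) and (t2, t3) for a fresh task t3."""
        t3 = self.fresh()
        self.succ[t1].add(t3)
        self.succ[t2].add(t3)
        return t3

    def precedes(self, t, u):
        """True iff there is a (possibly empty) path of edges from t to u."""
        stack, seen = [t], set()
        while stack:
            x = stack.pop()
            if x == u:
                return True
            if x not in seen:
                seen.add(x)
                stack.extend(self.succ[x])
        return False

    def concurrent(self, t, u):
        return not self.precedes(t, u) and not self.precedes(u, t)


# The execution of Fig. 2: t1 forks t2 and t3, which then join into t4.
g = ComputationGraph()
t1 = g.fresh()
t2, t3 = g.fork(t1)
t4 = g.join(t2, t3)
```

In this model, `t2` and `t3` are concurrent, while `t1` precedes `t4` by transitivity through either branch.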

Operational Semantics
Head Reduction. Fig. 4 defines the head reduction relation G, t ⊢ e \ σ \ α −→ e′ \ σ′ \ α′ between two head configurations e \ σ \ α and e′ \ σ′ \ α′, where G is the (global) computation graph and t the timestamp of the (local) task at which the reduction takes place. A head configuration consists of the expression e being evaluated, the store σ, and an allocation map α. A store is a finite map from locations to blocks, representing the heap, and an allocation map is a finite map from locations to timestamps, recording the timestamps at which locations were allocated.
We write σ(ℓ) to denote the block stored at the location ℓ in the store σ. To insert a block b into the store or update the store, we write [ℓ := b]σ. Note that only arrays can be updated; closures are immutable. To refer to the index i of an array w⃗, we write w⃗(i), and to update an array, we write [i := v]w⃗. We similarly write [ℓ := t]α for an insertion in the allocation map. We write vⁿ for an array of length n, where each element of the array is initialized with the value v.
The HeadAlloc and HeadClosure reductions allocate heap blocks, arrays and closures, respectively, extending the store with the desired block and the allocation map with the current timestamp. The HeadCallPrim reduction relies on a pure reduction to compute a primitive operation. The HeadStore reduction updates the field of an array, and the HeadLength reduction returns the length of an array. The HeadLetVal reduction substitutes a variable by its value. The HeadIfTrue and HeadIfFalse reductions reduce an if-then-else construct whose condition has been evaluated.
Entanglement may only occur when a task acquires a location. Locations are acquired during the reductions HeadLoad, HeadCall, HeadCASSucc and HeadCASFail. A load reads the indexed value, a call the environment of the closure, and a CAS the scrutinized value. The HeadCall reduction distinguishes between invoking a top-level function and a closure. Calling a closure loads the values of its environment, which may contain locations. As we use a substitution-based semantics, these locations are the location literals occurring in the function body e, which are computed by an auxiliary function. To prevent entanglement, all the mentioned rules include the same kind of precondition (highlighted in Fig. 4): if ℓ is acquired, then its allocation timestamp α(ℓ) must precede the timestamp t of the task at which the reduction takes place. These preconditions amount to a proof obligation during the verification of a program. Verified programs will satisfy the obligation and will thus never get stuck. As we will see in Sec. 4.4 and in Sec. 5.3, soundness of both of our logics entails the invariant that the physical program states are always disentangled.
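The side condition on acquiring reductions can be illustrated on the load case. The sketch below is a simplified Python model, not the rule of Fig. 4: the names `head_load`, `Loc`, and `Entangled` are assumptions, the precedence relation is passed in as a function, and a reduction that would be stuck in the semantics is modelled by raising an exception.

```python
class Entangled(Exception):
    """Models a stuck load: the acquired location was not allocated before t."""


class Loc:
    """A heap location; only its identity matters."""

    def __init__(self, name):
        self.name = name


def head_load(store, alloc, precedes, t, loc, i):
    """Load index i of the array at loc, on behalf of the task at timestamp t.

    store maps locations to arrays, alloc maps locations to the timestamp
    at which they were allocated, and precedes(a, b) decides a ≼ b.
    """
    value = store[loc][i]
    # Only locations can be acquired; unboxed values are always safe.
    if isinstance(value, Loc) and not precedes(alloc[value], t):
        raise Entangled(value.name)
    return value


# Precedence of Fig. 2 restricted to t1, t2, t3 (reflexive pairs included).
order = {(1, 1), (1, 2), (1, 3), (2, 2), (3, 3)}
precedes = lambda a, b: (a, b) in order

shared, default_elem, local = Loc("shared"), Loc("defaultElem"), Loc("local")
store = {shared: [default_elem], default_elem: [], local: []}
alloc = {shared: 1, default_elem: 1, local: 2}   # local is allocated by t2
```

Loading `default_elem` from task 3 succeeds because it was allocated at timestamp 1; if task 2 stored `local` in the shared array without clearing it, a load from task 3 would violate the precondition.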
Parallelism and Reduction under a Context. To keep track of the currently active and suspended tasks of an executing parallel program, we follow Westrick et al. [2020] and enrich the semantics with an auxiliary structure called a task tree, written T, defined by the following grammar: T ≜ t ∈ 𝒯 | T ⊗ T. A leaf represents an active task and is denoted by its timestamp t. A node T₁ ⊗ T₂ represents a suspended task that has forked two parallel computations, recursively described by the task trees T₁ and T₂.
Taking advantage of task trees, we define the semantics of parallel reductions and reductions under a context in Fig. 5. We define a scheduling reduction σ / α / G / T / e −sched→ σ′ / α′ / G′ / T′ / e′ as either a head step, a fork, or a join. In this reduction relation, σ is a store, α an allocation map, G a computation graph, T a task tree, and e an expression. The SchedHead reduction describes a head reduction. The SchedFork reduction describes a fork: the task tree is at a leaf t₀ and faces a parallel tuple. The reduction generates two fresh timestamps t₁ and t₂, adds the corresponding edges to the computation graph, and updates the task tree to the node with two leaves t₁ ⊗ t₂. The SchedJoin reduction describes a join: the task tree is at a node with two leaves t₁ ⊗ t₂, and both leaves have reached a value. The reduction generates a fresh timestamp t₃, updates the computation graph, and allocates a memory cell to store the result of the parallel tuple. It then updates the task tree to the leaf t₃.
The main reduction relation S / T / e −step→ S′ / T′ / e′ describes a scheduling reduction inside the whole parallel program. A tuple S / T / e consists of the program state S, the task tree T, and an expression e. A state S consists of the tuple (σ, α, G), denoting a store σ, an allocation map α, and a computation graph G. The StepSched reduction describes a scheduling step. The other reductions describe where the scheduling reduction takes place. The StepBind reduction describes a reduction under an evaluation context. The StepParL and StepParR reductions unveil the non-determinism of the parallel reduction. If a node of the task tree is encountered facing a parallel tuple, the left side or the right side can reduce.
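The effect of forks and joins on the task tree and the computation graph can be sketched as follows. This is a deliberately partial Python model, not the rules of Fig. 5: it ignores the store, the allocation map, and the expression being reduced, and the function names and the encoding (an integer for a leaf, a pair for a node, a set of pairs for the edges) are assumptions.

```python
import itertools


def sched_fork(graph, fresh, tree):
    """Fork at a leaf t0: return the node t1 ⊗ t2 and record both edges."""
    assert isinstance(tree, int), "a fork happens at a leaf"
    t1, t2 = next(fresh), next(fresh)
    graph.update({(tree, t1), (tree, t2)})
    return (t1, t2)


def sched_join(graph, fresh, tree):
    """Join at a node with two leaves t1 ⊗ t2: return the fresh leaf t3."""
    t1, t2 = tree
    assert isinstance(t1, int) and isinstance(t2, int), "both sides are leaves"
    t3 = next(fresh)
    graph.update({(t1, t3), (t2, t3)})
    return t3


def step_left(step, graph, fresh, tree):
    """Apply a scheduling step to the left subtree of a node (cf. StepParL)."""
    left, right = tree
    return (step(graph, fresh, left), right)
```

For example, starting from the leaf 1 with `fresh = itertools.count(2)`, a fork yields the node `(2, 3)` and a subsequent join yields the leaf `4`, reproducing the shape of Fig. 2.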

DISLOG, A PROGRAM LOGIC FOR DISENTANGLEMENT
In this section, we present the details of DisLog. We first give an Iris primer and explain our notations (§4.1). Then, we showcase how timestamps appear in the program logic (§4.2), and present other reasoning rules (§4.3). Finally, we discuss the soundness theorem of DisLog (§4.4).

Assertions and Weakest Preconditions
We build DisLog on top of Iris [Jung et al. 2018], adopting Iris' syntax. In particular, we write Φ for an Iris assertion (of type iProp), Φ * Φ′ for a separating conjunction, and Φ −* Φ′ for a separating implication. If P is a proposition of the meta logic, we call P pure and write ⌜P⌝. We write Φ ⊣⊢ Φ′ for the equivalence of assertions.
Our program logic features a weakest precondition (WP) modality which takes the form wp ⟨t, e⟩ {λ t′ v. Φ}. This modality adapts a standard Iris WP to the semantics of DisLang, and in particular, enriches it to account for timestamps. In the above assertion, t is the timestamp of the task which symbolically executes the expression e. We call this timestamp the current timestamp of the expression. A postcondition takes the form λ t′ v. Φ where the variables t′ and v are bound in Φ. The variable v denotes the resulting value and the variable t′ the end timestamp, the timestamp of the returning task. We write wp ⟨t, e⟩ {λ t′ ℓ. Φ}, where the variable ℓ denotes a location, as syntactic sugar for wp ⟨t, e⟩ {λ t′ v. ∃ℓ. ⌜v = ℓ⌝ * Φ}. We similarly do so for booleans b and integers i. If we want to abstract over the details of the postcondition, we write Ψ instead of λ t′ v. Φ.
Our WP is subject to the standard structural rules of separation logic.DisLog supports in particular the Frame rule that we present below, as a warm-up to our notations.We write reasoning rules as inference rules, where premises are separated by the separating conjunction * and entail the conclusion.In particular, if the conclusion is a WP, premises amount to preconditions.
Iris features ghost state, which is hence available in DisLog.We write Φ ⇛ Φ ′ for a ghost update (or fancy update) that updates the ghost state.We omit the so-called masks for the sake of readability.Thanks to the ghost state, DisLog supports Iris invariants [Jung et al. 2018, §2.2], with a standard interface.Our WP allows the user to assume (or open) an invariant before reasoning about an atomic expression and generates an obligation to restore (or close) the invariant in the postcondition.An atomic expression is an expression that can reduce to a value in a single head step of computation.We syntactically characterize such assertions with the Atomic pure predicate.
DisLog makes use of fractional [Bornat et al. 2005;Boyland 2003] and discardable [Vindum and Birkedal 2021] points-to assertions of the form ℓ ↦ → ì , where denotes either a positive fraction less than or equal to 1, or a discarded fraction written .The latter makes the points-to assertion persistent.When = 1 we write ℓ ↦ → ì .Points-to assertions of DisLog do not carry information about timestamps: this role is devoted to two new assertions described in the next Section.

Timestamps Management
A central aspect of our disentanglement logic is the management of timestamps. To this end, DisLog features two new assertions.
• The clock assertion ℓ ⧖ t, indicating that the location ℓ was allocated before the timestamp t in the underlying computation graph.
• The precedence assertion t ≼ t′, witnessing that the timestamp t precedes the timestamp t′ in the underlying computation graph.
Both assertions are persistent and work hand-in-hand. Given the assertion ℓ ⧖ t, a task at timestamp t can safely acquire the location ℓ. Moreover, given both the assertions ℓ ⧖ t and t ≼ t′, a task at timestamp t′ can safely acquire the location ℓ as well. A benefit of phrasing a location's allocation timestamp relative to another timestamp, rather than absolutely, is that the user never needs to know precisely at which timestamp a location was allocated: disentanglement is ensured as soon as the acquired location was allocated by a preceding task. Similarly, the user never needs to know the whole computation graph: precedence information suffices for proving disentanglement. We overload the clock assertion to arbitrary values and introduce assertions of the form v ⧖ t. If v is a location ℓ, then this assertion is defined as ℓ ⧖ t. Otherwise, it is defined as ⌜True⌝. We overload this assertion again to a collection of values, and write v⃗ ⧖ t for the iterated separating conjunction of v ⧖ t over all v ∈ v⃗.
Fig. 6 summarizes the rules governing the clock and the precedence assertions. The ClockMono rule illustrates the monotonicity of the clock predicate with respect to the precedence pre-order: if the location ℓ was allocated before t and t precedes t′, then it is safe to conclude that ℓ was allocated before t′. We emphasize that the precedence assertion forms a pre-order: this assertion is reflexive (PrecRefl) and transitive (PrecTrans). The MementoPre and MementoPost rules are the only rules generating a clock predicate. The MementoPre rule asserts that if the location ℓ occurs in the expression at current timestamp t, then the user can gain a witness ℓ ⧖ t that ℓ was allocated at a timestamp preceding t. The MementoPost rule asserts that the value returned by a task was allocated before the end timestamp of this task.
The TempusFugit rule distills the semantics of DisLang: it is safe to suppose that the current timestamp precedes the end timestamp. The TempusAtomic rule asserts that the current timestamp and the end timestamp of an atomic expression are the same. The TempusAtomic rule is more precise than needed: the clock predicate and the precedence predicate are both monotonic with respect to the precedence pre-order, via the ClockMono rule and the PrecTrans rule, respectively. However, the TempusAtomic rule relieves the user from the burden of always applying the ClockMono and PrecTrans rules by hand when the timestamp is effectively preserved.
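To make the reading of the two assertions concrete, the following is a minimal executable sketch, not part of DisLog or its Coq development. It models the computation graph as a set of edges between timestamps, the precedence assertion as reachability (reflexive-transitive closure), and the clock assertion as "the allocating timestamp precedes t". The class `ComputationGraph` and its method names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ComputationGraph:
    """Toy model: timestamps are hashable labels, edges go from t to t'."""

    def __init__(self):
        self.succ = defaultdict(set)   # t -> direct successors of t
        self.alloc_time = {}           # location -> timestamp that allocated it

    def add_edge(self, t, t2):
        self.succ[t].add(t2)

    def allocate(self, loc, t):
        self.alloc_time[loc] = t

    def precedes(self, t, t2):
        """Model of t ≼ t2: reflexive-transitive closure of the edges."""
        seen, stack = set(), [t]
        while stack:
            cur = stack.pop()
            if cur == t2:
                return True
            if cur not in seen:
                seen.add(cur)
                stack.extend(self.succ[cur])
        return False

    def clock(self, loc, t):
        """Model of the clock assertion: loc was allocated at some t0 with t0 ≼ t."""
        return loc in self.alloc_time and self.precedes(self.alloc_time[loc], t)


g = ComputationGraph()
g.add_edge("t0", "t1")
g.add_edge("t1", "t2")
g.allocate("l", "t0")
```

In this model, ClockMono is just transitivity of reachability: from `g.clock("l", "t1")` and `g.precedes("t1", "t2")` one gets `g.clock("l", "t2")`. Unlike this sketch, the logic never inspects the whole graph; it only manipulates persistent witnesses of individual facts.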

Reasoning Rules for Expressions
Fig. 7 gives the syntax-directed reasoning rules of DisLog (we hide "later" modalities for brevity). The rules Alloc, Length, CallPrim, LetVal, IfTrue, IfFalse, and Store are standard, apart from their mention of timestamps. In particular, the Alloc rule does not generate a clock assertion. If desired, such an assertion can be obtained by applying the MementoPost rule.
The Value rule asserts that if the symbolic evaluation of an expression ended at timestamp t, yielding a value v, then the postcondition Ψ should hold for t and v. The Load rule extends the standard separation logic rule to prevent entanglement. Indeed, the assertion v ⧖ t in the precondition witnesses that if the loaded value v is a location, then it must have been allocated before the current timestamp t. The CASSucc and CASFail rules are similarly extended: they prevent entanglement by requiring that if the scrutinized value is a location, then it was allocated before the current timestamp.
The Closure and TopLevel rules produce a Func assertion certifying that calling the value it describes as a function will not cause entanglement. Obtaining this assertion for closures may be surprising at first, but is warranted by the following facts: (i) all the timestamps of locations captured by the closure are guaranteed to precede the closure's allocation timestamp (by rule MementoPre), and (ii) closures are immutable objects and, as such, cannot themselves create entanglement [Westrick et al. 2022]. Phrased differently, the locations of the environment are allocated before the closure itself, and thanks to immutability, this fact never changes. The Func predicate is persistent. The Call rule allows calling a function, given the Func predicate. Proving that the environment was allocated before the current timestamp amounts to proving that the closure's location itself was allocated before the current timestamp, which is true since the closure's location is already part of the expression (§4.4).
The Bind rule gives meaning to the notion of the "current timestamp" of an expression. Operationally, the evaluation of a term K[e] at timestamp t reduces the sub-expression e until it reaches a value v and an end timestamp t′. Then, the whole term K[v] starts reducing at the new timestamp t′. The Bind rule paraphrases this operational behavior. The rule asserts that the user first has to reason about the sub-expression e at the same current timestamp t. The user then has to reason about the filled term K[v] at current timestamp t′, under the precondition that the sub-expression reduced to the value v at end timestamp t′.
Reasoning About a Parallel Tuple. A pivotal rule of DisLog is Par. Let us first derive a naive version of this rule, ParWeak, before focusing on the ultimate rule given in Fig. 7.
The ParWeak rule allows reasoning about a parallel tuple e₁ || e₂ at current timestamp t. The premise universally quantifies over two fresh timestamps t₁ and t₂, which are used for e₁ and e₂, respectively. We focus on the left-hand side of the tuple, as the right-hand side is handled similarly. The user should verify e₁ with the postcondition Ψ₁, under the hypothesis t ≼ t₁ witnessing that t precedes t₁. This information allows the user to safely acquire any location that was safe to acquire from t. Indeed, if the user has an assertion ℓ ⧖ t, they can use the ClockMono rule to obtain an assertion ℓ ⧖ t₁. We emphasize that the above rule ensures that the two fresh timestamps t₁ and t₂ are unrelated. Hence, an assertion ℓ ⧖ t₁ cannot be converted to an assertion ℓ ⧖ t₂. This would indeed be unsafe, as a location allocated by the left task could be acquired by the right one, creating entanglement.
After the join point, the postcondition of the ParWeak rule asserts that e₁ reduced to a value v₁ at end timestamp t₁′, and that e₂ reduced to a value v₂ at end timestamp t₂′. The postcondition also produces witnesses that t₁′ and t₂′ precede the (new) current timestamp t′. Thanks to these two assertions, any location allocated by either of the two tasks is now accessible by any task at a timestamp t″ such that t′ ≼ t″. Finally, the postcondition asserts that the parallel tuple itself reduced to a location ℓ, pointing to the two resulting values v₁ and v₂.
Unfortunately, the ParWeak rule is tedious to use in practice. It fails to support a common pattern underlying our proof rules, which would allow the postconditions Ψ₁ and Ψ₂ of the newly forked tasks to depend on the tasks' timestamps t₁ and t₂. This dependence is rendered impossible by the universal quantification of the timestamps t₁ and t₂. Our final rule Par presented in Fig. 7 facilitates the wished-for pattern. The premise of the Par rule quantifies universally over the two timestamps and, after the quantification, allows the user to choose two existentially quantified postconditions Ψ₁ and Ψ₂ that can depend on the two timestamps. Moreover, the user is free to choose the postconditions after a potential ghost update; for example, to allocate an invariant that depends on both t₁ and t₂. The user should then verify the two parts of the parallel tuple with their respective timestamps, as in the ParWeak rule. Finally, the user has to show that the resulting values and timestamps entail the postcondition Ψ. This is formally expressed in the second line of the precondition of the rule.
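The timestamp discipline behind the parallel rules can be illustrated with a small executable sketch. It is a toy model under our own assumptions, not the semantics of DisLang: timestamps are integers, `fork` creates two fresh timestamps that both succeed the current one, and `join` creates a fresh timestamp that succeeds both end timestamps. The names `fork`, `join`, `precedes`, and `clock` are hypothetical.

```python
import itertools

_fresh = itertools.count(1)   # source of fresh timestamps
edges = set()                 # computation graph as a set of (t, t') edges
alloc_time = {}               # location -> allocating timestamp


def precedes(t, t2):
    """Reflexive-transitive closure of `edges`."""
    frontier, seen = [t], set()
    while frontier:
        cur = frontier.pop()
        if cur == t2:
            return True
        if cur in seen:
            continue
        seen.add(cur)
        frontier.extend(b for (a, b) in edges if a == cur)
    return False


def clock(loc, t):
    """loc was allocated at a timestamp preceding t."""
    return precedes(alloc_time[loc], t)


def fork(t):
    """Two fresh, mutually unrelated timestamps that both succeed t."""
    t1, t2 = next(_fresh), next(_fresh)
    edges.update({(t, t1), (t, t2)})
    return t1, t2


def join(t1_end, t2_end):
    """A fresh timestamp that succeeds both end timestamps."""
    t_new = next(_fresh)
    edges.update({(t1_end, t_new), (t2_end, t_new)})
    return t_new


t = 0
alloc_time["shared"] = t        # allocated before the fork
t1, t2 = fork(t)
alloc_time["left_local"] = t1   # allocated by the left task
t_after = join(t1, t2)
```

Here `shared` is acquirable by both tasks, `left_local` is not acquirable at `t2` because `t1` and `t2` are unrelated, and it becomes acquirable again at `t_after`, mirroring the precedence witnesses produced after the join point.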

Soundness
Finally, we devote our attention to stating and proving soundness of DisLog. For our disentanglement logic to be sound, it has to hold that the reduction of a program verified using the rules of DisLog leads to a disentangled program state. Our semantics is phrased in terms of a transition system that gets stuck if entanglement is encountered, as ensured by the highlighted premises for head reductions in Fig. 4. Soundness of our logic thus must entail that verified programs cannot get stuck.
Fig. 9. Definition of the weakest precondition modalities.
Since we use a small-step semantics in a parallel world, the definition of "not getting stuck" needs careful wording. In particular, it is not enough to say that "the configuration can take a step". Indeed, one step of DisLang corresponds to a step of one task, whereas we want to ensure that every task can take a proper step. The adequacy theorem of the Iris WP makes use of the notion of reducibility to capture the fact that a thread can take a proper step. The judgment red presented in Fig. 8 adapts this notion, ensuring that every task of the task tree can take a step. The RedSched rule asserts that a configuration that can make a scheduling step (either a head step, a fork, or a join) is reducible. The RedCtx rule asserts that the reducibility of a configuration facing an expression under an evaluation context amounts to the reducibility of this very expression. The RedPar rule asserts that a configuration facing a node of the task tree and a parallel tuple is reducible if at least one side of the pair is not a value (otherwise, a join should be possible), and each side that is not a value is reducible.
An expression e is safe if every configuration S′ / T′ / e′ reachable from e in zero or more steps, starting from the empty state (∅, ∅, ∅), is either reducible or such that e′ is a value and T′ a single leaf. Our soundness theorem asserts that if an expression can be verified using DisLog, then it is safe. As the semantics of DisLang cannot progress when entanglement is detected, the soundness theorem implies that a verified expression cannot reach an entangled state. The formal proof of Theorem 4.1 can be found in our Coq formalization [Moine et al. 2023b]. We detail below the main definitions and invariants.
Definition of the Weakest Precondition. The formal definition of the wp ⟨t, e⟩ {Ψ} assertion can be found in Fig. 9. The key to our approach is to define the wp modality with respect to a more general weakest precondition modality, hidden from the user, that we refer to as the wpg modality. The wpg modality is parameterized not with a single timestamp, but with a whole task tree: we found this generalization crucial for the various proofs to succeed. Nevertheless, reasoning always takes place at the leaves. Hence, we can hide the details of the task tree from the user.
The definition of the wpg modality also appears in Fig. 9 and follows the traditional Iris recipe [Jung et al. 2018, §6]. As usual, the WP is defined as a guarded fixpoint, and makes use of a state interpretation predicate (or central invariant), written interp, which relates the ghost state and the physical state. The definition of wpg proceeds by case analysis on whether the expression is a value or not. If the expression is a value, we can access the state interpretation, and deduce that the task tree consists of a single leaf and that the postcondition holds. Otherwise, if the expression is not a value, the wpg modality asserts that the configuration is reducible, and that for any possible step, the state interpretation must continue to hold, as well as the WP of the reduced program. Apart from its mention of timestamps, our WP distinguishes itself from the standard Iris WP by making the state interpretation available in the value case, and by making use of our custom red judgment.
Fig. 10. Definition of the state interpretation predicate and of base assertions.
The State Interpretation Predicate. Our WP maintains a state interpretation predicate between each reduction step, which is defined in Fig. 10. We review its definition next.
We first focus on the roots disentanglement judgment rootsde. This judgment asserts that each task of the expression only uses locations allocated before its associated timestamp. This judgment allows stating the MementoPre rule. If the task tree consists of a single leaf t, the RDeLeaf rule requires that the locations of the expression were allocated before t. The RDePar rule requires that both sides of a parallel tuple satisfy the rootsde judgment. In the case of an evaluation context, RDeCtx requires that the judgment holds for the expression under the context, and that the locations occurring in the evaluation context itself are allocated before all the leaves of the task tree.
The state interpretation predicate also gives meaning to the ghost state from the physical state. We first briefly explain the construction of ghost state in Iris abstractly, before detailing the part of interp that concerns ghost state. In Iris, ghost state is defined in terms of so-called cameras (CMRA), which can be thought of as "step-indexed partial commutative monoids" [Jung et al. 2018], detailing a resource algebra. Iris provides predefined notions of resource algebras. For example, the resource algebra Auth(M) describes the authoritative resource algebra over the resources M. This resource algebra gives access to ●a, the authoritative ownership of a, and ◦b, the fragmentary ownership of b. Together, these two assertions entail that there exists an element c of the algebra such that a = b · c. The resource algebra Set(X) describes the set resource algebra, where the composition of resources is described by set union. For our state interpretation predicate, we define a ghost cell which we equip with the resource algebra Auth(Set(𝒯 × 𝒯)), where 𝒯 is the set of timestamps. The ghost cell stores the computation graph and gives meaning to the precedence assertion.
Iris moreover provides a generic construction to define points-to assertions via the gen_heap library [Iris Development Team 2023]. This library defines a certain piece of ghost state, defines an assertion Heap that ties a store to this ghost state, and defines the points-to assertion ℓ ↦_q v⃗ in terms of this ghost state. Moreover, the gen_heap library allows associating persistent information to locations via a mechanism of meta assertions. In our case, we associate to each location ℓ the timestamp t of the task that allocated it, and write meta ℓ t. The main property of this assertion is that, from the knowledge meta ℓ t and meta ℓ t′, we can deduce that t = t′.
We are now able to review the details of the definition of our state interpretation predicate shown in the lower part of Fig. 10. First, it asserts that the domain of the store σ is the same as that of the allocation map α, and that the roots disentanglement judgment holds. The state interpretation also asserts the ghost authoritative ownership ●G of the computation graph G and the ownership of the store via the Heap assertion. Moreover, the state interpretation asserts, for every mapping from a location ℓ to a timestamp t in the allocation map α, that the persistent knowledge meta ℓ t was set.
We define an edge between t and t′ as a ghost fragmentary ownership of the singleton ◦{(t, t′)}. The conjunction of ●G and ◦{(t, t′)} allows deducing that (t, t′) ∈ G. We define the precedence assertion t ≼ t′ as the reflexive-transitive closure (rtc) of the edge predicate. The clock assertion ℓ ⧖ t is defined as a paraphrase of its informal definition: the location ℓ was allocated before timestamp t if there exists a timestamp t₀ such that ℓ was allocated at t₀, and t₀ precedes t.
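The definitions just described can be summarized in symbols. The following is a sketch reconstructed from the prose, not a copy of Fig. 10: the ghost name γ, the use of Iris' `own` connective, and the prefix notations `edge` and `clock` (standing for the edge predicate and the clock assertion) are our own assumptions.

```latex
\begin{align*}
\mathsf{edge}(t, t') &\;\triangleq\; \mathsf{own}\;\gamma\;\bigl(\circ\,\{(t, t')\}\bigr) \\
t \preccurlyeq t' &\;\triangleq\; \mathrm{rtc}\;\mathsf{edge}\;\,t\;\,t' \\
\mathsf{clock}(\ell, t) &\;\triangleq\; \exists t_0.\;\; \mathsf{meta}\;\ell\;t_0 \;\ast\; t_0 \preccurlyeq t \\[4pt]
\mathsf{own}\;\gamma\;(\bullet\, G) \;\ast\; \mathsf{edge}(t, t') &\;\vdash\; (t, t') \in G
\end{align*}
```

The last line is the standard agreement property of the authoritative construction, specialized to sets ordered by inclusion: a fragment is always included in the authoritative element.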
The representation predicate Func of a λ-abstraction is a disjunction: the value is either a top-level function or a heap-allocated closure. In the latter case, we use a discarded fraction for the points-to assertion, as the closure is immutable. The predicate also asserts the existence of a timestamp t at which the closure was allocated, and that every location of its environment (the locations occurring in the closure) was allocated before t. We make use of this knowledge to verify the Call rule of Fig. 7. The closure's location is allocated before the current timestamp (thanks to the rootsde judgment), and since the locations of the environment were allocated before the allocation time of the closure itself, they are also allocated before the current timestamp, and hence safe to acquire.

A HIGH-LEVEL LOGIC: DISLOG+
In this section, we introduce DisLog+, an almost standard concurrent separation logic allowing proof of disentanglement for a large class of programs. DisLog+ is defined in terms of DisLog using monotonicity arguments, a technique that first appeared in program logics targeting weak-memory models [Dang et al. 2020; Kaiser et al. 2017; Mével et al. 2020].

Don't Poke the Bear
Disentanglement is preserved by restricting reads: when a task acquires a location, the programmer must ensure that this location was allocated by a preceding task. However, numerous programs "don't poke the bear", that is, are disentangled because they do not comprise reads of shared, and hence potentially hazardous, locations.
Determinacy-race-free programs are an example of such cautious programs. A determinacy race [Feng and Leiserson 1999] occurs when two concurrent tasks access the same location, and at least one of these accesses is a write. As noticed by Westrick et al. [2020], such race-free programs are trivially disentangled: shared locations cannot be written to from different tasks, which prevents the communication of freshly allocated data between tasks. Moreover, they noticed that there exist some races that are also trivially disentangled. Races that fall into this category are: (i) write-write races, because a write does not acquire a location, and (ii) read-write races on data that was allocated before the beginning of the parallel phase, because tasks are allowed to communicate previously-allocated data.
What is the common denominator of all these cautious programs? Rather than categorically restricting reads, they demand a more nuanced consideration of writes. More precisely, these programs ensure that when a task writes a value to a location, this value is safe to read for any task that can access the said location. Race-free programs prevent concurrent reads, because a task must have unique ownership of a location to write to it. Write-write races do not restrict writes, as there is no concurrent read, and read-write races are permitted as long as the written value was allocated before the beginning of the parallel phase.
Fig. 11. DisLog+ separation logic and assertions.
For this large class of programs that don't poke the bear, we provide an almost traditional separation logic called DisLog+. By "traditional", we mean that the weakest precondition of DisLog+ takes a form which does not mention timestamps (§5.2). Moreover, its syntax-directed reasoning rules mention neither clock nor precedence assertions: they are the standard reasoning rules of concurrent separation logic (§5.3). By "almost", we stress that DisLog+ restricts the use of ghost state to prevent races, but is otherwise a standard separation logic. DisLog+ is hence ideally suited to reason about race-free programs. To cater to the benign races identified above, we extend DisLog+ with (i) write-only assertions to reason about write-write races (§5.4) and (ii) three rules to reason about read-write races on previously allocated data, which we refer to as the objectivity lemmas (§5.5).

Monotonicity to the Rescue
Our development of DisLog+ was triggered by two observations about race-free programs: (i) race-free programs ensure that, when a task accesses a location, any value referenced by the location is safe for the task to acquire, and (ii) this property is monotonic with respect to the precedence pre-order. Indeed, if all the pointed-to values are safe to read for a given task, then these values are also safe to read for any of the task's descendants in the computation graph.
We define a new separation logic, in which every assertion is parameterized by a timestamp, called the ambient timestamp, and is monotonic with respect to the precedence pre-order. Fig. 11 presents the formal definitions of these assertions, written P and of type vProp. The user can always project a vProp assertion P to a particular timestamp t in iProp via the construction P @ t. Conversely, the lifting construction ⌈Φ⌉ allows lifting an iProp assertion Φ into vProp. Hence, the whole ghost-state theory of iProp is available in vProp. When the context allows it, we write Φ instead of ⌈Φ⌉ for the lifting of an iProp assertion into vProp. The entailment of vProp (written ⊢_vProp) is defined using the entailment of iProp (written ⊢_iProp). The definition ensures that an entailment P ⊢_vProp P′ is valid if and only if, for any timestamp t, the projection P @ t of the premise entails the projection P′ @ t of the conclusion.
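The structure of these timestamp-indexed assertions can be sketched on a finite example. The following toy model is an illustration under strong simplifying assumptions, not the definition of Fig. 11: propositions are plain booleans (there is no separation or ghost state), the computation graph is fixed, and the names `at`, `lift`, `is_monotone`, `entails`, and `now` are hypothetical.

```python
from itertools import product

TIMESTAMPS = ["t0", "t1", "t2"]
EDGES = {("t0", "t1"), ("t0", "t2")}   # t0 forks into t1 and t2 (acyclic)
ALLOC_TIME = {"l": "t1"}               # location l was allocated by task t1


def precedes(a, b):
    """Reflexive-transitive closure of EDGES."""
    return a == b or any(x == a and precedes(y, b) for (x, y) in EDGES)


def at(p, t):
    """Projection P @ t: evaluate the indexed assertion at timestamp t."""
    return p(t)


def lift(phi):
    """Lifting: a timestamp-independent fact ignores the ambient timestamp."""
    return lambda _t: phi


def is_monotone(p):
    """P is monotone if it is preserved along the precedence pre-order."""
    return all(p(b) for a, b in product(TIMESTAMPS, repeat=2)
               if precedes(a, b) and p(a))


def entails(p, q):
    """Entailment of indexed assertions is pointwise entailment."""
    return all(q(t) for t in TIMESTAMPS if p(t))


def now(loc):
    """loc was allocated before the ambient timestamp."""
    return lambda t: precedes(ALLOC_TIME[loc], t)
```

In this model `now("l")` holds at `t1` but not at its sibling `t2`, and a predicate such as `lambda t: t == "t0"` is rejected by `is_monotone`, which is the kind of assertion the monotonicity requirement excludes.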
Fig. 11 also defines the assertions relative to DisLog+. The assertion ℓ ⧖ now asserts that ℓ was allocated before the ambient timestamp. This is a persistent assertion, whose monotonicity is ensured by the ClockMono rule. Again, we overload this assertion to arbitrary values and collections of values. The key idea of DisLog+ is its definition of the points-to assertion. The vProp assertion ℓ ↦ v⃗ is defined as the conjunction of the points-to assertion in iProp, written ⌈ℓ ↦ v⃗⌉, and the knowledge that every value pointed to by the location was allocated before the ambient timestamp, using the assertion v⃗ ⧖ now. Hence, the points-to assertion of vProp asserts that every load for this location is safe for this particular task, and for any subsequent task.
The WP of DisLog+ takes the form wpm e {λv. P}, where the variable v denotes the resulting value of e and is bound in P. To abstract over the details of the postcondition, we write Q instead of λv. P. The assertion wpm e {Q} asserts that e is safe to execute at any timestamp succeeding the ambient one, and that if e reaches a value v, then Q v holds at the end timestamp (or at any subsequent timestamp, since Q v is monotonic).
Fig. 12. Selected rules of DisLog+.
The user can freely go between DisLog+ and DisLog using the following Conversion rule.
This rule needs a careful reading. It asserts that the precondition P entails wpm e {Q} in vProp if and only if (in the meta-logic), for any timestamp, the projection of the precondition at this timestamp entails the WP of DisLog with the postcondition projected at the end timestamp. Notice that the Conversion rule is an equivalence. While it is not surprising that a specification in DisLog+ is valid in DisLog (the former being more restrictive than the latter), the converse is also true: the user can use the full power of DisLog rules to verify a DisLog+ interface.
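Read literally, the description above corresponds to the following statement. This is a reconstruction from the prose, not the rule as typeset in the paper; the metavariables P (a vProp precondition) and Q (a vProp postcondition over values) are our own naming assumptions.

```latex
\[
\bigl(\, P \;\vdash_{\mathit{vProp}}\; \mathsf{wpm}\;e\;\{\lambda v.\; Q\,v\} \,\bigr)
\quad\Longleftrightarrow\quad
\bigl(\, \forall t.\;\; P \mathbin{@} t \;\vdash_{\mathit{iProp}}\;
  \mathsf{wp}\,\langle t, e\rangle\,\{\lambda t'\,v.\; (Q\,v) \mathbin{@} t'\} \,\bigr)
\]
```

The left-to-right direction lets a DisLog+ specification be used from DisLog, and the right-to-left direction lets a DisLog proof establish a DisLog+ specification.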
The soundness of DisLog+ is a direct corollary of the soundness of DisLog (Theorem 4.1), thanks to the Conversion rule.

Reasoning Rules of DisLog+
We showcase the most important reasoning rules of DisLog+ in Fig. 12, which are similar to the reasoning rules of the original concurrent separation logic with fractional permissions [Bornat et al. 2005]. These rules are expressed at the vProp level, where the horizontal bar stands for vProp entailment. In particular, the points-to assertion occurring in the rules is the one defined in Fig. 11, guaranteeing that any load will be safe. Nevertheless, all the rules of Fig. 12 are derived from the rules of DisLog (§4.3) using the Conversion rule.
The rules we present in Fig. 12 prevent races. Indeed, the only way to allow a race is by sharing a points-to assertion between tasks, which is only possible via invariants [Jung et al. 2018, §2.2]. Because invariants can only be installed for assertions of type iProp, but points-to assertions in DisLog+ are of type vProp, DisLog+ rules out races by construction. We alluded to this restricted use of Iris ghost state by referring to DisLog+ as an "almost" standard separation logic (§5.1).
The Alloc+ rule produces a valid points-to assertion, that is, both the ownership information and the proof that the default value is safe to read at the ambient timestamp. As the default value occurs in the alloc expression itself, if it is a location, then it was already acquired, and hence is already safe. This is reminiscent of the MementoPre rule (Fig. 6). The Store+ rule is also standard and preserves the fact that any subsequent load will be safe. The stored value v occurs in the expression ℓ[i] ← v, and is hence safe to read. The Load+ rule is the standard rule of separation logic, as it internally rests on the fact that all the values pointed to by the location are safe to read. Finally, the Par+ rule heavily makes use of the monotonicity of vProp assertions. Indeed, the two postconditions Q₁ and Q₂ are valid for the end timestamps of the two forked tasks. Hence, they are also valid for the end timestamp of the parallel tuple, which succeeds them.
A write-write race occurs when two or more tasks race to write to a shared location, but neither of them (or any other task) ever reads from the said location. Write-write races are always disentangled, as a write does not acquire any location. However, from the point of view of functional correctness, write-write races are subtle: once the tasks join and the program reads the shared location, all the outcomes of the race should be taken into account.
To verify such races in more standard Iris settings, the user typically installs an invariant containing the points-to assertion, quantifies existentially over the pointed-to value, and constrains it using ghost state. This existential quantification allows the user to change the pointed-to value while preserving the invariant. In DisLog+, invariants are restricted to iProp assertions, and thus the user cannot store a (vProp) points-to assertion inside an invariant. To allow the verification of write-write races in DisLog+ without the need of an invariant, we introduce the notion of a fractional write-only assertion presented in Fig. 13.
A write-only assertion on a location ℓ, written with the arrow ⇒, is indexed by a list γ⃗ of ghost names, a positive fraction q less than or equal to 1, and a set S of possible values. When q = 1, the set S contains all the values possibly written to ℓ. The write-only assertion comes with a companion assertion orig, which is persistent and describes the original value of the points-to assertion. The WOStart rule consumes a points-to assertion and produces an orig assertion and an empty write-only assertion. The WOFrac rule asserts that the write-only assertion is fractional: the user can always arbitrarily split and join it. The WOStore rule allows executing a store operation, overwriting the set of possible values. This rule only requires a fraction of the write-only assertion: a concurrent task could have another fraction and race to write. Fig. 13 also includes two rules for getting back a points-to assertion from a write-only assertion. In both cases, the full fraction 1 must be given back. The WOCancel rule can be used if no write occurred, as witnessed by the empty write-only assertion. The rule produces the original points-to assertion. If at least one write has occurred, the WOEnd rule can be used. The rule produces a points-to assertion and a proof that the pointed-to value is in the set of possible values.
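The protocol behind these five rules can be mimicked by run-time tokens. The sketch below is a loose operational analogy, not the ghost-state definition of the Coq development. In particular, it assumes that joining two fractions unions their sets of possible values and that splitting leaves the set with one of the halves; the paper's exact WOFrac rule is not reproduced here. All names (`WriteOnly`, `wo_start`, `wo_split`, and so on) are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class WriteOnly:
    loc: str
    frac: Fraction
    values: frozenset


def wo_start(heap, loc):
    """WOStart: trade the points-to for the original value and an empty full token."""
    return heap[loc], WriteOnly(loc, Fraction(1), frozenset())


def wo_split(wo):
    """WOFrac, split: halve the fraction; the set stays with the first half (assumption)."""
    half = wo.frac / 2
    return WriteOnly(wo.loc, half, wo.values), WriteOnly(wo.loc, half, frozenset())


def wo_join(a, b):
    """WOFrac, join: fractions add up, sets of possible values are unioned (assumption)."""
    assert a.loc == b.loc and a.frac + b.frac <= 1
    return WriteOnly(a.loc, a.frac + b.frac, a.values | b.values)


def wo_store(heap, wo, v):
    """WOStore: any positive fraction may write; the holder's set becomes {v}."""
    assert wo.frac > 0
    heap[wo.loc] = v
    return WriteOnly(wo.loc, wo.frac, frozenset({v}))


def wo_cancel(wo, orig):
    """WOCancel: full fraction and no write, so the original value is still there."""
    assert wo.frac == 1 and not wo.values
    return orig


def wo_end(heap, wo):
    """WOEnd: full fraction and at least one write; the content is a possible value."""
    assert wo.frac == 1 and wo.values
    v = heap[wo.loc]
    assert v in wo.values
    return v


heap = {"l": 0}
orig, full = wo_start(heap, "l")
left, right = wo_split(full)
right = wo_store(heap, right, 2)   # the two stores may happen in either order
left = wo_store(heap, left, 1)
final = wo_end(heap, wo_join(left, right))
```

Whatever the interleaving of the two stores, `wo_end` only learns that the final content belongs to `{1, 2}`, which is the information the WOEnd rule is described as providing.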
The definition of write-only assertions appears in our Coq formalization [Moine et al. 2023b]. This definition makes use of standard ghost state and in particular a cancellable invariant [Jung et al. 2018, §7.1] in which a points-to assertion of DisLog is stored. Notably, the write-only assertion carries not only information on the contents of the cancellable invariant, but also a witness S ⧖ now that the set S of possible values was allocated before the ambient timestamp. Hence, after an application of the WOEnd rule, we can reconstruct a vProp points-to assertion by canceling the invariant and exhibiting a witness that the pointed-to value was allocated before the ambient timestamp.
While write-only assertions fit well in the context of vProp, we stress that they are not specific to it. Similar definitions can be proposed for regular points-to assertions at the iProp level, dropping assertions related to timestamps. In our discussion of write-only assertions, we focus on references, which in DisLang are represented by arrays of size 1. To target arbitrary arrays, we assume that our approach can be generalized to a more detailed interface with a per-index write-only points-to assertion. This generalization should be purely mechanical.
Read-write races can create entanglement: a task could communicate a memory location it allocated to a concurrent task. Verifying that such races are safe often requires the full expressiveness of DisLog. However, some read-write races are trivially disentangled: this is the case if written values are unboxed (that is, not allocated in the heap), or were allocated before the beginning of the race. This section explains how to reason about such races within DisLog+.
In the extreme case where a location points to a block of unboxed values, read-write races are tolerated without additional work. Recall (Fig. 11) that ℓ ↦ v⃗ ≜ ⌈ℓ ↦ v⃗⌉ ∗ v⃗ ⧖ now. If v⃗ contains only unboxed values, the assertion v⃗ ⧖ now holds trivially, and we thus have that ℓ ↦ v⃗ ⊣⊢ ⌈ℓ ↦ v⃗⌉. Therefore, the points-to assertion of ℓ can be stored directly inside an invariant, and read-write races on ℓ can be verified, as long as writes concern unboxed values.
In a more general case, Fig. 14 presents an interface of "objectivity lemmas" to reason about trivially disentangled read-write races. Objectivity lemmas offer two mechanisms. First, they allow witnessing that a set of locations was allocated before a program point (rules GetClock and MementoPre+). Second, they allow sharing a points-to assertion through an invariant, and allow updating this points-to assertion as long as new values are unboxed or were allocated before the installation of the invariant (rules SplitSubjObj and Objectivize).
To witness that a set of locations was allocated before a given program point, the user can use two rules and combine their results using the equivalence (L₁ ∪ L₂) ⧖ now ⊣⊢ L₁ ⧖ now ∗ L₂ ⧖ now. First, the GetClock rule allows extracting from a points-to assertion that every pointed-to value was allocated before the ambient timestamp. Second, the MementoPre+ rule asserts that every location occurring in the current expression was allocated before the ambient timestamp.
The SplitSubjObj rule allows in particular sharing a points-to assertion through an invariant, while fixing the set of possibilities for newly written values. We present the general form of the rule, adapted from Cosmo [Mével et al. 2020, §4.1]. The SplitSubjObj rule asserts that owning a vProp assertion P is equivalent to owning its projection to some timestamp t, written ⌈P @ t⌉ (the objective part of P), and the information that this timestamp precedes the ambient timestamp, written ↑t (the subjective part of P). The assertion ↑t is persistent and defined as λt′. t ≼ t′. The objective part P @ t is an iProp assertion and can hence be shared through an invariant. The subjective part ↑t cannot be shared through an invariant, but it can be given to subsequent tasks and used to convert the objective part back into the original assertion, by using the SplitSubjObj rule again.
We now focus on a contrived example to illustrate how the objectivity lemmas are intended to be used, and why an additional rule, Objectivize, is needed. Sec. 6.1 and 6.3 provide more realistic examples, based on the same idea. Our contrived example allocates a reference ℓ (that is, an array of size 1), initialized to (). It then forks two tasks, one reading from ℓ and the other storing ℓ inside itself. This is a trivially disentangled read-write race: the only written value is allocated before the beginning of the race.
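For readers who prefer code to prose, the contrived example can be transliterated as follows. This is a Python sketch of the program's shape only, not DisLang syntax: the helper `par` is a hypothetical stand-in for the parallel tuple, built on threads, and a one-element list plays the role of the reference.

```python
import threading


def par(f, g):
    """Run f and g in parallel and return both results (stand-in for a parallel tuple)."""
    results = [None, None]

    def run(i, h):
        results[i] = h()

    threads = [threading.Thread(target=run, args=(0, f)),
               threading.Thread(target=run, args=(1, g))]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return tuple(results)


def contrived():
    l = [()]              # a reference: an array of size 1 holding unit

    def reader():
        return l[0]       # may observe () or l, depending on the schedule

    def writer():
        l[0] = l          # store the reference inside itself
        return ()

    read, _ = par(reader, writer)
    return l, read


l, read = contrived()
```

The reader can only ever observe `()` or `l` itself, and `l` was allocated before the two tasks were forked, which is why the race is harmless with respect to disentanglement.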
To verify this example, we first use the Alloc+ rule (Fig. 12) and obtain the assertion ℓ ↦ [()]. Then, we use the MementoPre+ rule to generate a witness {ℓ} now that ℓ was allocated before the execution of the par. Then, we use the SplitSubjObj rule on the assertion (ℓ ↦ [()] * {ℓ} now). We obtain the assertion ↑t * ⌈(ℓ ↦ [()] * {ℓ} now)@t⌉ for some timestamp t. Unfolding definitions, the objective part is equivalent to (ℓ ↦ [()])@t together with the fact that ℓ was allocated before t. We then allocate an invariant asserting (ℓ ↦ [v])@t for some value v that is either () or ℓ.
Next, we apply the Par+ rule (Fig. 12), and give each task a copy of the invariant, of the assertion ↑t, and of the fact that ℓ was allocated before t. For the two tasks, the proof is similar. First, we open the invariant. Then, we use the SplitSubjObj rule to convert the assertion (ℓ ↦ [v])@t back into ℓ ↦ [v], instantiating the existential with t. We then execute the load or the store.
Finally comes the time to close the invariant. We cannot use the SplitSubjObj rule again, because the existential quantification of this rule would generate a fresh timestamp t′, distinct from t. Thankfully, the Objectivize rule comes to save the day. This rule allows us to generate an assertion ⌈(ℓ ↦ v⃗)@t⌉ as long as we can show that every location occurring in the list of values v⃗ was allocated before t. We apply the Objectivize rule, instantiating v⃗ with [v], do a case analysis on v, and end the proof, given that we have at hand the fact that ℓ was allocated before t. Notice that, instead of writing ℓ into itself, we could have written any unboxed value, or any location that was allocated before ℓ itself.
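The intuition behind "allocated before" can be illustrated with a toy dynamic model, sketched here in Python rather than in DisLang. This is not the semantics or the logic of the paper: the names `Graph`, `Loc`, `load`, `store`, and `Entangled` are hypothetical, timestamps are modeled as integer vertices of a fork-join graph, and the run is simulated sequentially. The sketch only shows why reading ℓ after ℓ was stored into itself is harmless, while reading a location allocated by a concurrent task is not.

```python
class Graph:
    """Fork-join computation graph; vertices are integer timestamps."""

    def __init__(self):
        self.preds = {0: set()}  # timestamp -> direct predecessors
        self._next = 1

    def _fresh(self, preds):
        t = self._next
        self._next += 1
        self.preds[t] = set(preds)
        return t

    def fork(self, t):
        """Return two child timestamps, each preceded by t."""
        return self._fresh([t]), self._fresh([t])

    def join(self, t1, t2):
        """Return a timestamp preceded by both children."""
        return self._fresh([t1, t2])

    def precedes(self, a, b):
        """Reflexive-transitive reachability from a to b."""
        stack, seen = [b], set()
        while stack:
            t = stack.pop()
            if t == a:
                return True
            if t not in seen:
                seen.add(t)
                stack.extend(self.preds[t])
        return False


class Loc:
    """A one-cell heap location tagged with its allocation timestamp."""

    def __init__(self, born, content):
        self.born = born
        self.content = content


class Entangled(Exception):
    pass


def load(graph, t, loc):
    """Read loc at timestamp t; acquiring a location whose allocation
    timestamp does not precede t counts as entanglement in this model."""
    v = loc.content
    if isinstance(v, Loc) and not graph.precedes(v.born, t):
        raise Entangled(f"location born at {v.born} acquired at {t}")
    return v


def store(loc, v):
    """A write acquires nothing, so this model never rejects it."""
    loc.content = v


def contrived_example():
    g = Graph()
    l = Loc(0, ())             # allocated before the fork
    t1, t2 = g.fork(0)
    store(l, l)                # task at t2 stores l inside itself
    return load(g, t1, l) is l  # task at t1 reads l: allowed
```

In this model, replacing `store(l, l)` by `store(l, Loc(t2, ()))` makes the load at `t1` raise `Entangled`, whereas the same load performed after `g.join(t1, t2)` succeeds.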

EVALUATION
We showcase DisLog+ and DisLog via a range of case studies. We first focus on the scratch example of Sec. 2, and prove it correct in DisLog+ (§6.1). Then, we illustrate how to reason about a write-write race using write-only assertions (§6.2), based on a parallel lookup in a lazy collection. We conclude with two case studies on concurrent hashing and deduplication (§6.3 and §6.4).
Deduplication refers to the process of removing duplicates from a collection. This task can be efficiently done in parallel using concurrent hashing: each task tries to insert elements into a shared concurrent hash set, which by construction does not store duplicates. If the collection is fully allocated beforehand, we use a folklore hash set [VerifyThis 2022], and verify both the hash set interface and the deduplication itself entirely within DisLog+ (§6.3), thanks to the objectivity lemmas. In the case of a lazy collection, where elements may not be already allocated prior to the parallel phase, the previous approach cannot be directly used: naïvely applying the previous deduplication algorithm would result in entanglement, and so a different deduplication algorithm is needed. We address this issue by first partially removing duplicates in parallel with a more subtle hash set, then getting rid of the remaining duplicates by calling the previous deduplication function. Interestingly, the proof requires the full power of DisLog (§6.4).
In the case studies, we write a non-recursive function as λx⃗. e, which is sugar for μ_. λx⃗. e, where _ denotes an anonymous binding. We add a hat to the binder to distinguish top-level functions. We write e1; e2 for a sequence, which is encoded as let _ = e1 in e2.

The scratch Example
We first give an interface to a spin-lock. The verified code is a direct translation of the spin-lock presented earlier (§2.2), which is implemented as a pair of closures sharing a reference to a boolean, initialized to false. The first closure attempts to acquire the lock by doing a CAS on this reference from false to true. The second closure releases the lock by setting the reference to false.
Our specification of locks in DisLog+ appears in Fig. 15 and is very similar to the standard specification of locks in higher-order separation logic [Gotsman et al. 2007; Svendsen and Birkedal 2014]. In DisLog+, the assertion Φ protected by a lock is subject to the same restriction as the content of an invariant, as a lock protects an invariant. The precondition of the specification of new_lock consumes Φ. The postcondition produces a location ℓ pointing to a pair of closures, for locking and unlocking, respectively, as well as a lock assertion, which is persistent and asserts that the two closures describe a valid lock protecting Φ. The specification of a call to the locking closure requires a valid lock, and returns a boolean. If this boolean is true, then the lock was successfully acquired: the user gains the protected assertion Φ as well as an exclusive token locked, witnessing that the lock is now held. This token is required to call the unlocking closure, as well as Φ, which has to be given back when the user wants to release the lock.
We conduct the proofs of the interface of Fig. 15 entirely within DisLog+. Indeed, we are in an extreme case: the shared reference points to a boolean, which is unboxed. Hence, the points-to of DisLog+ is equivalent to the points-to of DisLog (§5.5). Thus, we are able to define the lock assertion using an invariant that directly stores the points-to assertion of the shared reference.
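The pair-of-closures structure of this spin-lock can be sketched as follows. The sketch is written in Python, not in DisLang, and makes several assumptions: Python has no hardware CAS on references, so the hypothetical `AtomicRef` class simulates one with an internal mutex, and the names `new_lock`, `try_lock`, and `unlock` are illustrative.

```python
import threading


class AtomicRef:
    """A one-cell reference with compare-and-swap.

    Assumption: the CAS is simulated with a mutex, since Python offers
    no hardware CAS on object references."""

    def __init__(self, value):
        self._value = value
        self._guard = threading.Lock()

    def store(self, value):
        with self._guard:
            self._value = value

    def cas(self, expected, new):
        """Atomically replace expected by new; report success."""
        with self._guard:
            if self._value == expected:
                self._value = new
                return True
            return False


def new_lock():
    """Return a pair of closures sharing one boolean reference."""
    flag = AtomicRef(False)  # False: free, True: held

    def try_lock():
        # A single attempt: succeeds only if the flag goes False -> True.
        return flag.cas(False, True)

    def unlock():
        flag.store(False)

    return try_lock, unlock
```

A caller that needs to block can spin on `try_lock()` until it returns `True`. Only booleans are ever written to the shared reference, which mirrors the observation above that the races on it involve unboxed values only.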
We make use of the interface of locks we just presented, as well as the objectivity lemmas, and verify an interface for the scratch example.
Recall that the scratch example allocates a shared reference, which is protected by a lock while two parallel tasks attempt to access it. After allocating the shared reference, we have a points-to assertion of the form shared ↦ defaultElem^N. We use the GetClock rule to extract an assertion {defaultElem} now. Then, we use the SplitSubjObj rule and obtain the assertion ↑t as well as the assertion that will be protected by the lock, (shared ↦ defaultElem^N)@t. Each task gets the assertion ↑t together with the fact that defaultElem was allocated before t. Then, if a task wins the lock, we reconstruct a points-to assertion with the SplitSubjObj rule, call the doWork function, call the cleanScratchPad function, and restore the invariant protected by the lock using the Objectivize rule.

Parallel Lookup in a Lazy Collection
The left part of Fig. 16 presents the code of a parallel loop parfor [a; b; h], calling the function h for each index between a and b. The presented code is a direct translation of the implementation used in the standard library of MPL [MPL Development Team 2022]. The specification of parfor should be unsurprising. The precondition requires, for every index i between a and b, that h [i] is safe and satisfies a postcondition Q i. The postcondition of parfor [a; b; h] produces the iterated conjunction of the postconditions.
The right part of Fig. 16 presents the code of the lookup [n; f] function, which searches for a non-unit value in the lazy collection f up to index n. To do so, the function uses a reference r, and a closure h that takes an index i, produces the i-th element of the lazy collection, and writes it to r if it is non-unit. The closure h is then called in parallel for every index between 0 and n. This is a typical example of a write-write race: each call to h may write to r, but never reads from it.
The specification of lookup [n; f] appears at the bottom of Fig. 16. Its precondition requires that f is a valid lazy collection: between indices 0 and n, f produces a value satisfying a given predicate. The postcondition of the specification produces a value v and asserts the existence of v⃗, the collection itself. The found judgment asserts that v⃗ has size n and that either v is not the unit value and occurs in v⃗, or v is the unit value and every value in v⃗ is the unit value.
The proof of our specification makes use of a write-only assertion (§5.4). Indeed, just after the allocation of r, we convert its points-to assertion into a write-only assertion with fraction 1 and an empty set of possible values, and split it into fractions. Then, we call the parfor specification and instantiate its postcondition with an assertion stating that the lazy collection produced a value and that, if this value is not the unit value, then it was written to r, else nothing was written to r. After applying the specification of parfor, we gather all the fractions of the write-only assertion. We then do a case analysis on whether a non-unit value is in the lazy collection f, and convert the write-only assertion back to a normal points-to assertion accordingly.
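The shape of parfor and lookup described above can be sketched as follows. This is a Python illustration, not the DisLang code of Fig. 16 nor MPL's implementation. It assumes a half-open index range, uses one thread per fork, represents the unit value by the empty tuple `UNIT`, and represents the reference r by a one-element list.

```python
import threading

UNIT = ()  # stand-in for the unit value


def parfor(a, b, h):
    """Call h(i) for every a <= i < b, forking the two halves in parallel.

    Assumption: the range is half-open and split in the middle; a real
    implementation would use a grain size and a scheduler."""
    if b <= a:
        return
    if b - a == 1:
        h(a)
        return
    mid = (a + b) // 2
    left = threading.Thread(target=parfor, args=(a, mid, h))
    left.start()          # fork the left half
    parfor(mid, b, h)     # run the right half in the current task
    left.join()           # join before returning


def lookup(n, f):
    """Search the lazy collection f for a non-unit value among indices 0..n-1."""
    r = [UNIT]  # a reference: an array of size 1

    def h(i):
        v = f(i)
        if v != UNIT:
            r[0] = v  # write-write race: tasks only write r, never read it

    parfor(0, n, h)
    return r[0]  # r is read only after every task has joined
```

When several indices hold non-unit values, the result is whichever write landed last, which is why the specification only promises that the returned value occurs in the collection.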

Deduplication via Concurrent Hashing
In the next two sections, we suppose a user-chosen capacity C, which bounds the number of elements within hash sets. We also suppose a hash function from values to integers. We present a folklore [VerifyThis 2022] concurrent, lock-free, fixed-capacity hash set using open addressing and linear probing to handle collisions [Knuth 1998]. The code of our hash set appears in the upper part of Fig. 17. The hash set consists of an array of size C. When created, the array is filled with a dummy element d, which cannot be inserted in the hash set as it denotes an empty slot. Inserting an element is done by the add [s; d; x] function, where s is the hash set, d the dummy element, and x the element being inserted. The function calls an auxiliary closure, which tries to insert x at a given index, originally the hash of x, using a CAS. If this index is already taken by a distinct value, potentially due to a collision, the closure tries the next index. (We do not resize: the function loops if the table is full.) The elems [s; d] function returns the elements of s: that is, all the elements distinct from the dummy element d. Occurrences of d are removed via a call to a dedicated filter_compact function, which filters the array, and returns a compacted array where d does not appear anymore. For brevity here, we omit the implementation of filter_compact, as it can be implemented entirely race-free and therefore disentangled. We instead focus on the nuance of disentanglement for concurrent operations on the hash set.
The hash set can be used to insert values concurrently and in parallel. However, in order to preserve disentanglement, the user should only insert values that were allocated before the beginning of the parallel phase [Westrick 2022]. Indeed, the auxiliary function does a CAS operation on an a priori arbitrary index, which may have been filled by a concurrent task. Our interface hence restricts insertions to a set of values that were allocated before the hash set itself. The representation predicate of a hash set is written hset s d V X q, where s is the hash set, d is the dummy element, V a set of values that were witnessed as allocated before the hash set, X a set of values that were inserted, and q a fraction in (0; 1]. This predicate can be split and joined, allowing for parallel use.

Such a predicate is created by the init [d] function. Its precondition requires a witness that a set of values V was allocated before the current timestamp, via the V now assertion. Such an assertion can be obtained via the GetClock and SplitSubjObj rules. The precondition also requires that the dummy element d is not an element of V. The postcondition produces a valid empty hash set with fraction 1. The specification of add [s; d; x] requires a valid hash set with an arbitrary fraction, and that the element x being inserted is in the authorized set of values V. The specification of elems [s; d] consumes a hash set with fraction 1 with content X and produces an array ℓ with content w⃗. The deduped assertion asserts that w⃗ contains no duplicate and has the same elements as X.
The proofs of the hash set interface of Fig. 17 rest on the objectivity lemmas. Indeed, parallel insertion may imply read-write races, but only on data allocated before the parallel phase. Intuitively, the hset predicate involves a cancellable invariant, storing an assertion (s ↦ v⃗)@t, as well as the fact that the values of V were allocated before t, where t comes from an application of the SplitSubjObj rule. The invariant also records an inclusion bounding the contents of v⃗, allowing the use of the Objectivize rule during the proofs. In addition to this cancellable invariant, the hset predicate also involves the (persistent) assertion ↑t, which allows retrieving and updating the points-to assertion of s using the SplitSubjObj rule.
Fig. 17 also presents the code and specification of the dedup [d; ℓ] function. This function deduplicates the array ℓ using our concurrent hash set. The function first creates a hash set s. Then, the function allocates a closure which, given an index i, inserts the element ℓ[i] inside s. Next, the function calls the closure in parallel for every index of the array. Finally, the function returns the elements of the hash set. The precondition requires that ℓ points to an array v⃗ and the existence of a dummy element d that is not in the array. The postcondition returns a fresh location ℓ′ pointing to an array w⃗ that is a deduplicated version of v⃗. Making use of our hash set, the proof is straightforward: each task gets an assertion hset s d v⃗ ∅ (1/|v⃗|), enabling it to insert its element. At the end, the fractions of the hset predicate are joined, and the specification of elems concludes the proof.
Fig. 18. Case study: deduplication of a lazy collection by concurrent hashing
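The folklore hash set and the dedup function described above can be sketched as follows. This is a Python illustration under stated assumptions, not the code of Fig. 17: the hypothetical `AtomicArray` class simulates CAS with a mutex, the CAS returns the value it observed (a choice made for this sketch), `dedup` spawns one thread per element instead of calling parfor, and the capacity is chosen large enough that insertion always terminates.

```python
import threading


class AtomicArray:
    """Fixed-size array with compare-and-swap (simulated with one mutex)."""

    def __init__(self, size, fill):
        self._cells = [fill] * size
        self._guard = threading.Lock()

    def __len__(self):
        return len(self._cells)

    def cas(self, i, expected, new):
        """Write new at index i if it holds expected; return the value seen."""
        with self._guard:
            seen = self._cells[i]
            if seen == expected:
                self._cells[i] = new
            return seen

    def snapshot(self):
        with self._guard:
            return list(self._cells)


def init(capacity, dummy):
    """Create a hash set whose slots all hold the dummy element."""
    return AtomicArray(capacity, dummy)


def add(s, dummy, x):
    """Insert x by open addressing with linear probing."""
    i = hash(x) % len(s)
    while True:
        seen = s.cas(i, dummy, x)
        if seen == dummy or seen == x:
            return  # inserted by this call, or already present
        i = (i + 1) % len(s)  # slot holds a distinct value: probe on


def elems(s, dummy):
    """Return every stored element, dropping the dummy slots."""
    return [v for v in s.snapshot() if v != dummy]


def dedup(dummy, xs):
    """Deduplicate xs; dummy must not occur in xs."""
    s = init(2 * len(xs) + 1, dummy)  # capacity choice is an assumption
    tasks = [threading.Thread(target=add, args=(s, dummy, x)) for x in xs]
    for t in tasks:
        t.start()
    for t in tasks:
        t.join()
    return elems(s, dummy)
```

Every element of `xs` exists before the threads start, which corresponds to the restriction that only values allocated before the parallel phase may be inserted. The order of the result depends on the hash function and on probing, so callers should treat it as a set.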

Deduplication of a Lazy Collection via Concurrent Hashing
We cannot reuse the hash set of the previous section to deduplicate a lazy collection in parallel: its elements might not be allocated before the parallel phase. To address this issue, we implement and verify a more subtle hash set that can store elements allocated by concurrent tasks, while having a small number of duplicates. After the parallel phase, we use the previous dedup function to get rid of the remaining duplicates.
The hash set consists of a pair of arrays of the same size C, and takes inspiration from the lock striping technique [Herlihy and Shavit 2012]. The first array is similar to the hash set of the previous section and contains the inserted elements. The second array stores task identifiers, represented as (unboxed) integers. Each task is given a distinct identifier. Intuitively, before loading or writing an index of the first array, a task must ensure with a CAS that its identifier is written inside the second array at the same index. Otherwise, the task is not allowed to load or write the desired index in the first array. The races on the second array are disentangled: it stores only unboxed integers. This design has two consequences. First, duplicates may occur in the hash set: two tasks could insert the same element at distinct indices. The number of tasks bounds the number of duplicates. Second, disentanglement of loads on the first array relies on a subtle invariant: if a task has written its identifier at an index of the second array, this task uniquely owns the same index of the first array.
Fig. 18 presents the implementation of our new hash set. The init [d] function initializes the first array with the dummy element d and the second array with a dummy identifier −1. The add [s; d; x; k] function inserts the element x inside the hash set s with dummy element d and the task identifier k. This function differs from the add function of the previous section in two points. First, it is parameterized by the identifier of the task inserting the element. Second, instead of inserting an element at index i using a direct CAS, it uses the auxiliary function tryput.
The tryput function first loads the first array and the second array of the hash set. Then, the function tests if it has already written its identifier at index i inside the second array. If yes, or if a CAS succeeds in writing the identifier k, the task has unique ownership of the index i in the first array. The task can then load the content of the first array at index i, and tries to insert the desired element.
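The ownership protocol of tryput can be sketched as follows, again in Python and under assumptions: the hypothetical `AtomicInts` class simulates CAS on the identifier array with a mutex, the hash set is represented as a pair (elements, owners), and the behavior on failure (probing the next index) is a guess for illustration rather than the code of Fig. 18.

```python
import threading

NO_OWNER = -1  # dummy identifier stored in unclaimed slots


class AtomicInts:
    """Array of task identifiers with compare-and-swap (mutex-simulated)."""

    def __init__(self, size, fill):
        self._cells = [fill] * size
        self._guard = threading.Lock()

    def load(self, i):
        with self._guard:
            return self._cells[i]

    def cas(self, i, expected, new):
        with self._guard:
            if self._cells[i] == expected:
                self._cells[i] = new
                return True
            return False


def init(capacity, dummy):
    """Return the pair (elements, owners) of arrays of the same size."""
    return [dummy] * capacity, AtomicInts(capacity, NO_OWNER)


def tryput(s, dummy, x, tid, i):
    """Try to place x at index i on behalf of task tid; report success."""
    elements, owners = s
    if owners.load(i) == tid or owners.cas(i, NO_OWNER, tid):
        # tid owns index i: no other task loads or writes elements[i].
        if elements[i] == dummy:
            elements[i] = x
            return True
        return elements[i] == x  # x was already put here by this task
    return False  # index i is owned by another task


def add(s, dummy, x, tid):
    """Insert x for task tid, probing linearly until tryput succeeds."""
    elements, _ = s
    i = hash(x) % len(elements)
    while not tryput(s, dummy, x, tid, i):
        i = (i + 1) % len(elements)
```

If two tasks insert the same element, the second one finds the first index owned by the other task and stores a copy at the next index it manages to claim. This is the source of the bounded duplicates mentioned above.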
To prove this data structure correct, the full power of DisLog is needed. The lower part of Fig. 18 shows the specifications of init and add, where p is the number of tasks that will use the hash set. They involve a representation predicate lhset, which asserts that the hash set s with dummy element d can be used with the task identifier k, contains elements in X, and is valid at timestamp t. The specification of init generates p pieces of the lhset predicate, one for each task. The add function must be called with the correct identifier. The key idea is that the lhset representation predicate is monotonic with respect to the precedence pre-order. We derive similar specifications in DisLog+ using the Conversion rule, confining the timestamp-related reasoning.
Given a number of tasks p, our deduplication function dedup_lazy allocates a hash set and generates p parallel tasks using parfor. For each element of its designated range, each task sequentially generates the element on-the-fly before trying to insert it inside the hash set. When all tasks end, the dedup_lazy function extracts the elements of the hash set, and removes the remaining duplicates using the previous dedup function.
The last part of Fig. 18 presents the specification of our deduplication function dedup_lazy. The precondition requires that the lazy collection is safe between index 0 and n, that it returns a value that is not the dummy element, and that this value satisfies a given postcondition Q. A call to dedup_lazy returns a location ℓ and guarantees the existence of v⃗ and w⃗ such that v⃗ contains n elements and w⃗ is a deduplicated version of v⃗. The postcondition then asserts that the returned location ℓ points to w⃗, and that for every value v in w⃗, the assertion Q v holds.

MECHANIZATION
All our results are mechanized in the Coq proof assistant [Moine et al. 2023b] using Iris [Jung et al. 2018] and its dedicated Proof Mode [Krebbers et al. 2018]. Rounding and excluding comments, the definition of the language takes 1200 LOC, the proofs of the two logics and their soundness theorems 4600 LOC, and the verification of case studies 3700 LOC. We provide tactics to be used while reasoning with the two logics, and automation for DisLog+ thanks to the Diaframe library [Mulder et al. 2022].

RELATED WORK
Disentanglement. There has been a variety of work on disentanglement [Arora et al. 2021, 2023; Guatto et al. 2018; Raghunathan et al. 2016; Westrick 2022; Westrick et al. 2020, 2022]. This work focuses on dynamic techniques that exploit disentanglement for improved efficiency, especially for parallel memory management. In particular, Arora et al. [2021] developed a provably efficient memory manager for functional programs based on disentanglement, and Arora et al. [2023] extended this approach to support unrestricted effects by accounting for the cost of entanglement. These works rely on disentanglement for efficiency and scalability, and leave the task of reasoning about disentanglement to the programmer. The first formal definition for disentanglement was given by Westrick et al. [2020] using traces of memory operations, and Westrick et al. [2022] developed a semantics which detects entanglement during execution. Our semantics for disentanglement is similar in the sense that it becomes stuck when entanglement occurs. In this context, the logics developed in this paper statically verify that execution never becomes stuck.
Linearity and Concurrency. Our "plain vanilla" DisLog+ (i.e., without fractional write-only assertions and objectivity lemmas) is related to reasoning approaches establishing race freedom by a linear treatment of resources. These approaches comprise type systems for the π-calculus [Igarashi and Kobayashi 2001, 2004] as well as session type systems [Balzer and Pfenning 2017; Caires et al. 2016; Jacobs et al. 2022; Lindley and Morris 2015; Toninho et al. 2013]. The latter are based on a Curry-Howard correspondence established between linear logic and the session-typed π-calculus [Caires and Pfenning 2010; Wadler 2012]. Most closely related to our work in terms of employed techniques is the work by Jacobs et al. [2022], which mechanizes safety of a session-typed language in Coq, where safety encompasses freedom from memory leaks and deadlocks. The authors introduce the notion of a connectivity graph, which is acyclic by construction due to linearity, and use concurrent separation logic to prove that graph transformations preserve acyclicity. Our work, in contrast, is not confined to a linear setting.
Separation Logics. Multiple Iris-based concurrent separation logics have been developed. Among them, logics targeting weak-memory models inspired DisLog+, namely iGPS [Kaiser et al. 2017], iRC11 [Dang et al. 2020] and Cosmo [Mével et al. 2020]. Indeed, they all build a high-level logic on top of a low-level logic using monotonicity arguments. In their case, assertions are monotonic predicates over the view of the memory: assertions remain valid even after observing additional memory events. In their case, the view ordering of the memory is a pure assertion. We generalize their approach to a pre-order expressed as an assertion of the logic rather than as a pure fact.
Moine et al. [2023a] present a separation logic to reason about heap space for a sequential language with garbage collection. Their language is similar to the sequential subset of DisLang. In particular, they also distinguish top-level functions from heap-allocated closures. Disentanglement is closely related to garbage collection: disentanglement ensures in particular that locations occurring in the program are always safe to read for a task-local garbage collector, which is reminiscent of the free variable rule [Felleisen and Hieb 1992].
Outside the Iris world, Fu et al. [2010] present a concurrent separation logic with temporal reasoning. Contrary to them, our notion of time only relates to the pre-order induced by the fork-join structure of the program.

CONCLUSION AND FUTURE WORK
Disentanglement is an important property for parallel performance, but prior work leaves the challenge of reasoning about disentanglement to the programmer. We address this challenge by presenting DisLog, the first program logic to formally verify that a program is disentangled. Additionally, we present DisLog+, which allows for mostly standard separation logic proofs and offers proofs of disentanglement "for free" for many programs. Using these logics, we prove disentanglement for a number of examples, including several lock-free data structures. Our experience with DisLog and DisLog+ is that the effort required to prove disentanglement is often small and can be confined to the daring parts of the program. In future work, we plan to develop a type system to automatically infer disentanglement where possible. We hope that a semantic type soundness approach making use of DisLog could be used to prove such a system sound.

Fig. 13. Fractional write-only points-to assertions
5.4 Write-Write Races are Disentangled: Fractional Write-Only Assertions
A write-write race occurs when two or more tasks race to write to a shared location, but none of them (nor any other task) ever reads from the said location. Write-write races are always disentangled, as a write does not acquire any location. However, from the point of view of functional correctness, write-write races are subtle: once the tasks join and the program reads the shared location, all the outcomes of the race should be taken into account. To verify such races in more standard Iris settings, the user typically installs an invariant containing the points-to assertion, quantifies existentially over the pointed-to value, and constrains it using ghost state. This existential quantification allows the user to change the pointed-to value while preserving the invariant. In DisLog+, the content of invariants is restricted, and thus the user cannot store a DisLog+ points-to assertion inside an invariant.
To allow the verification of write-write races in DisLog+ without the need of an invariant, we introduce the notion of a fractional write-only assertion, presented in Fig. 13. A write-only assertion takes the form ℓ ⇒ q V, indexed by a list of ghost names γ, where q is a positive fraction less than or equal to 1 and V is a set of possible values. When q = 1, the set V contains all the values possibly written to ℓ. The write-only assertion comes with a companion assertion orig, which is persistent and describes the original value of the points-to assertion. The WOStart rule consumes a points-to assertion and produces an orig assertion and an empty write-only assertion. The WOFrac rule asserts that the write-only assertion is fractional: the user can always arbitrarily split and join it. The WOStore rule allows executing a store operation, overwriting the set of possible values. This rule only requires a fraction of the write-only assertion: a concurrent task could have another fraction and race to write.
Fig. 13 also includes two rules for getting back a points-to assertion from a write-only assertion. In both cases, the full fraction 1 must be given back. The WOCancel rule can be used if no write occurred, as witnessed by the empty write-only assertion. The rule produces the original points-to assertion. If at least one write has occurred, the WOEnd rule can be used. The rule produces a points-to assertion and a proof that the pointed-to value is in the set of possible values.
The definition of write-only assertions appears in our Coq formalization [Moine et al. 2023b]. This definition makes use of standard ghost state and in particular a cancellable invariant [Jung et al. 2018, §7.1] in which a points-to assertion of DisLog is stored. Notably, the write-only assertion carries not only information on the contents of the cancellable invariant, but also a witness V now that the set of possible values was allocated before the ambient timestamp. Hence, after an application of the WOEnd rule, we can reconstruct a points-to assertion by canceling the invariant and exhibiting a witness that the pointed-to value was allocated before the ambient timestamp.
While write-only assertions fit well in the context of DisLog+, we stress that they are not specific to it. Similar definitions can be proposed for regular points-to assertions at the level of the underlying logic, dropping assertions related to timestamps. In our discussion of write-only assertions, we focus on references, which in DisLang are represented by arrays of size 1. To target arbitrary arrays, we assume that our approach can be generalized to a more detailed interface with a per-index write-only points-to assertion. This generalization should be purely mechanical.

Fig. 14. Objectivity lemmas
5.5 Many Read-Write Races are Disentangled: Objectivity Lemmas
Read-write races can create entanglement: a task could communicate a memory location it allocated to a concurrent task. Verifying that such races are safe often requires the full expressiveness of DisLog. However, some read-write races are trivially disentangled: if written values are unboxed (that is, not allocated in the heap), or were allocated before the beginning of the race. This section explains how to reason about such races within DisLog+.
In the extreme case where a location points to a block of unboxed values, read-write races are tolerated without additional work. Recall (Fig. 11) that ℓ ↦ v⃗ ≜ ⌈ℓ ↦ v⃗⌉ * v⃗ now. If v⃗ contains only unboxed values, the assertion v⃗ now holds trivially, and we thus have that ℓ ↦ v⃗ ⊣⊢ ⌈ℓ ↦ v⃗⌉. Therefore, the points-to assertion of ℓ can be stored directly inside an invariant, and read-write races on ℓ can be verified, as long as writes concern unboxed values.
Fig. 15. Case study: specification of a spin-lock
Fig. 16. Case study: parallel lookup in a lazy collection
Proc. ACM Program. Lang., Vol. 8, No. POPL, Article 11. Publication date: January 2024.