Hopping Proofs of Expectation-Based Properties: Applications to Skiplists and Security Proofs

We propose, implement, and evaluate a hopping proof approach for proving expectation-based properties of probabilistic programs. Our approach combines eHL, a syntax-directed proof system for reducing proof goals of a program to proof goals of simpler programs, with a "hopping" proof rule for reducing proof goals of an original program to proof goals of a different program that is suitably related to the original program by means of pRHL, a relational program logic for probabilistic programs. We prove that eHL is sound for a core language with procedure calls and adversarial computations, and complete for the adversary-free fragment of the language. We also provide an implementation of eHL in EasyCrypt, a proof assistant tailored for reasoning about relational properties of probabilistic programs. We provide a tight integration of eHL with the other program logics supported by EasyCrypt, and in particular probabilistic Relational Hoare Logic (pRHL). Using this tight integration, we give mechanized proofs of the expected complexity of in-place implementations of randomized quickselect and skip lists. We also sketch applications of our approach to cryptographic proofs and discuss the broader impact of eHL in the EasyCrypt proof assistant.


INTRODUCTION
There is a long line of work that develops rigorous approaches for proving properties of probabilistic programs. These approaches generalize to the probabilistic setting the classic notions of pre- and post-conditions and of invariants. A fundamental difference is that in the probabilistic setting these notions are quantitative. Assertions are expectations, i.e. functions that map states to extended positive reals. The use of expectations was pioneered by Kozen [Kozen 1985], systematized by Morgan, McIver and Seidel [Morgan et al. 1996], and still prevails to date.
Unfortunately, these approaches are often difficult to use. One main reason is that proofs of probabilistic programs do not always follow their control flow. Another reason is that once the target program property is fixed, it is often very convenient to reason about more abstract or refactored programs. From the theoretical perspective, none of these concerns is an issue, since in general these approaches are complete. However, more flexible approaches are desirable when verifying concrete examples, in particular when building mechanized proofs.
Problem statement and contributions. The main goal of this paper is to support flexible computer-aided verification of probabilistic programs, and in particular to develop an approach that allows breaking away from the control flow of programs and changing the program representation during verification. Our target is to use our approach on relatively small but challenging probabilistic programs drawn from the theory of randomized algorithms and from cryptography. The choice of application domains naturally delineates the choice of the pWhile language, a core probabilistic language with sampling from discrete distributions, (non-recursive) procedures, and adversaries. Informally, an adversary is an unspecified quantified procedure with constraints on the variables it can read and write, and on the procedures it can call. Thus the main challenge with adversaries is to devise proof principles that are sound w.r.t. all possible instantiations of the adversary. We note that, in contrast with many other works in this realm, pWhile explicitly (and purposely) does not support conditioning, concurrency, and non-determinism, which do not have a central role in our applications.
We achieve our goals in three steps. First, we define a program logic, called eHL, to reason about expectation-based properties of pWhile programs. Judgments of eHL are of the form {Φ} C {Ψ}, where C is a statement and Φ and Ψ are maps from program states to extended positive reals. Informally, a judgment is valid if the expected value of Ψ on the output memory is upper bounded by the value of Φ on the initial memory. The proof system for eHL closely matches the pGCL pre-expectation calculus [Morgan et al. 1996], except for loops, procedure and adversary calls:
- our rule for loops uses the notion of upper invariant from the literature;
- our rule for procedures uses auxiliary variables. It is folklore that complete proof rules for procedures, even in the deterministic setting, require the use of auxiliary variables, cf. [Kleymann 1998, 1999; Nipkow 2002a,b]. We show that auxiliary variables also allow to recover completeness in the probabilistic setting;
- our proof rule for adversaries is new. The main challenge is to devise useful and sound proof rules based exclusively on the aforementioned adversary constraints.
In addition, our program logic features a "hopping" proof rule to reduce the proof of a probabilistic program C′ to a proof of a probabilistic program C. Hopping proofs subsume the "abstract and verify" or "refactor and verify" paradigms that are commonly used in verification, by allowing arbitrarily long interleavings of verification steps with abstraction/refactoring steps. They have been previously used in interactive and automated program verification, including [Lammich and Tuerk 2012; Magill et al. 2010; Nipkow et al. 2020; Tassarotti and Harper 2019]. In our case, programs are probabilistic, so we use the relational program logic pRHL [Barthe et al. 2009] (we defer to subsequent sections for the definition of the pRHL judgment ⊢ {Φ} C′ ∼ C {Ψ}) in the following way: given a pRHL equivalence between C′ and C, a proof goal for C′ can be reduced to a corresponding proof goal for C. This rule brings hopping proofs to the realm of expectation-based properties. Our case studies use the rule to switch to a more abstract representation of probabilistic programs, and to switch to a different probabilistic program, e.g. one whose control flow follows the reasoning.
Our logic also features a proof rule inspired by the frame rule, also known as the rule of constancy, from classical Hoare Logic. Indispensable in practice, the rule improves the modularity and compositionality of the calculus by allowing one to focus only on those parts of assertions that are potentially affected during evaluation. Leveraging the reverse Jensen's inequality, the rule permits the extension of a judgment ⊢ {Φ} C {Ψ} to ⊢ {κ[Φ]} C {κ[Ψ]} for any context κ that (i) depends only on the memory not modified by the statement C and that is (ii) concave (e.g. linear or sublinear) and monotone, when seen as a function. For instance, the rule allows to deduce ⊢ {log(Φ) + z} C {log(Ψ) + z} from ⊢ {Φ} C {Ψ} by taking κ[□] = log □ + z, whenever C leaves z unchanged.
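The interplay of concavity and monotonicity in this rule can be illustrated numerically. The following sketch (plain Python with a toy distribution; all names are ours, not from the paper) checks the chain E[κ∘Ψ] ≤ κ(E[Ψ]) ≤ κ(Φ) for the concave, monotone context κ[□] = √□ + 3:

```python
import math

def expect(d, f):
    """Expected value of f over a distribution d (dict: outcome -> prob)."""
    return sum(p * f(x) for x, p in d.items())

# A toy output distribution and post-expectation psi, with E[psi] <= phi.
d = {0: 0.25, 1: 0.5, 2: 0.25}
psi = lambda x: float(x * x)            # E[psi] = 0.5 + 1.0 = 1.5
phi = 2.0                               # a valid pre-expectation bound

# A concave, monotone context kappa[box] = sqrt(box) + 3.
kappa = lambda t: math.sqrt(t) + 3.0

lhs = expect(d, lambda x: kappa(psi(x)))  # E[kappa . psi]
mid = kappa(expect(d, psi))               # kappa(E[psi]), by (reverse) Jensen
rhs = kappa(phi)                          # kappa(phi), by monotonicity
```

Concavity gives the first inequality (Jensen), monotonicity the second; both are needed for the frame-style rule to be sound.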
Second, we implement our program logic in the EasyCrypt proof assistant [Barthe et al. 2013], an existing tool for the verification of probabilistic programs and cryptographic proofs. Our implementation is carefully crafted to leverage some key features of EasyCrypt, including some weak forms of weakest precondition and SMT-based support. Concretely, we define and implement another set of proof rules that make deductive verification more practical. This set of proof rules is obtained by adapting classic approaches to turn Hoare logics into deductive verification tools, e.g. chaining applications of construct-specific rules with applications of sequential composition and non-structural rules. In order to reason effectively about pre-expectations in the ambient logic of EasyCrypt, we have also developed a library of mathematical definitions and facts about extended positive reals. This library is used critically in our case studies.
Finally, we use our framework to mechanize proofs of several examples. Our main examples are proofs of expected cost for in-place implementations of randomized quickselect and skip lists. Both examples leverage the full power of the framework and go beyond the reach of previous approaches. In particular, our proof is the first to establish a logarithmic bound for skip list implementations; prior works either establish a logarithmic bound for an abstract description of skip lists [Haslbeck and Eberl 2020] or a linear bound for a (concurrent) implementation of 2-level skip lists [Tassarotti and Harper 2019]. In addition, we illustrate how our framework can be used beneficially in the context of cryptographic proofs. In contrast to the expected cost examples, which target real examples, we consider a synthetic example of cryptographic proofs, inspired by concurrent work [Barbosa et al. 2023] that uses our implementation of eHL to prove security of Dilithium, a post-quantum signature scheme recently standardized by NIST (the National Institute of Standards and Technology). The goal of our example is to illustrate how eHL can be used to obtain simpler proofs with tighter security bounds. However, potential uses of eHL are not limited to such use cases. We also discuss informally how eHL can be used to verify previously axiomatized techniques for reasoning about failure events, and to prove probability bounds in place of the existing logic implemented in EasyCrypt.
In summary, our main contributions are:
- the design, theoretical study and implementation of eHL;
- the application of eHL to expected cost analysis of randomized quickselect and skip lists;
- an illustration of the benefits of eHL in cryptographic proofs.
Artifact. The implementation of eHL, the library of expectations, and the formally verified case studies will be submitted as an artifact. The case studies themselves are also available in source form as supplementary material. As an indication, the implementation of the eHL proof system and associated libraries represents about 3,000 lines of OCaml code and 1,000 lines of EasyCrypt code. The proof of the quickselect example represents 70 lines for the programs, 300 lines for a library on partitions, 110 lines for the equivalence proof in pRHL (concrete version versus the abstract one), and 70 lines for the proof bounding the expected cost in eHL. For skip lists, the proof consists of 2,600 lines of EasyCrypt code, about 500 lines of which bound the expectation; the remaining part is mostly concerned with the equivalence proofs and functional correctness. The implementation and case studies will also be made publicly available on GitHub.
Outline. This paper is structured as follows. In Section 3, we provide a bird's eye view of the contributions of this work. Sections 4 and 5 formally establish the expectation logic eHL and its integration with pRHL, while in Section 6 we employ our framework to obtain a fully formalized average case complexity analysis of (a natural and realistic implementation of) skip lists, a randomized data structure of interest for practitioners. In Section 7, we extend eHL to a setting permitting adversarial code and demonstrate the usefulness of the logic for carrying out cryptographic proofs. In Section 8, we describe the integration of eHL into EasyCrypt and provide further details on the formal verification of the case studies. In Section 2, we consider related work, and we finally conclude in Section 9.

RELATED WORK
There is a large body of work on formal verification of probabilistic programs and resource analysis. For space reasons, we mention only closely related work.
Verification of probabilistic programs. Expectation-based reasoning can be traced back to the seminal work of Kozen [Kozen 1985], who developed a sound and complete propositional dynamic logic for a core probabilistic programming language. It was further developed by Morgan, McIver and Seidel [Morgan et al. 1996], who introduced and studied extensively probabilistic predicate transformers for a core probabilistic language with non-determinism. These approaches were recently extended to recursive procedures [Olmedo et al. 2016] and conditioning [Olmedo et al. 2018]. eHL inherits many technical tools from this line of work, in particular the use of upper invariants. However, eHL makes several (minor but practically important) technical contributions: it embeds pRHL into expectation-based reasoning; it supports adversary calls; it features a non-structural rule to simplify expectations (to the best of our knowledge, no such rule has been considered before); and it recasts in the setting of probabilistic programs existing approaches to achieve completeness of Hoare logic in the presence of procedures. For the latter, we follow the approach of [Kleymann 1998, 1999; Nipkow 2002a,b].

Complexity analysis of probabilistic programs.
There is also a huge body of work related to complexity analysis of probabilistic programs. Related to probabilistic predicate transformers, Kaminski et al. [2018] define an expected runtime transformer ert for a core probabilistic programming language with non-determinism. Subsequent works extend the expected runtime transformer to recursive procedures [Olmedo et al. 2016], amortized reasoning [Batz et al. 2023], or higher-order functions [Avanzini et al. 2021]. Related to this line of work, several automated tools have emerged [Avanzini et al. 2020b, 2023; Ngo et al. 2018]. Also, martingale theory has been successfully tailored towards the analysis of complexity-related properties of imperative programs [Agrawal et al. 2018; Barthe et al. 2016; Chakarov and Sankaranarayanan 2013; Chatterjee et al. 2017; Takisaka et al. 2018; Wang et al. 2019]. These notions correspond closely to that of Lyapunov ranking functions for proving (positive almost-sure) termination, and for deriving bounds on the runtime [Avanzini et al. 2020a; Bournez and Garnier 2005]. For functional languages, type-based approaches to complexity analysis have turned out useful [Avanzini et al. 2019; Leutgeb et al. 2022; Wang et al. 2020].

Mechanized analyses of probabilistic programs. Haslbeck [2021] implements a Hoare-style calculus related to the ert calculus of Kaminski et al. within the Isabelle/HOL proof assistant. Interestingly, his work contains a frame rule which can be interpreted as a special case of the one we give. Relatedly, Hurd et al. [2004] formalized a weakest pre-condition calculus for probabilistic programs within HOL, and proved several interesting meta-theoretical properties of the calculus. Program verification is aided through the extraction of recurrence relations to Prolog. Both works include proofs of soundness and completeness of the transformer. In contrast, our core logical rules are part of the trusted computing base. This is in line with the approach in EasyCrypt, where proof rules for program verification are not verified; in other words, EasyCrypt does not use a shallow nor a deep embedding of programs, but rather a hardwired embedding. Van der Weegen and McKinna [2008] were probably the first to formalize quicksort in a proof assistant, more precisely in Coq. They used a shallow embedding and analyzed the average case complexity of a more high-level, functional version of quicksort. In a similar spirit, Eberl et al. [2020] use the Isabelle/HOL proof assistant to reason, via a shallow embedding, about the average case complexity of algorithms on binary tree structures. Notably, their analysis covers (the functional variant of) quicksort. Tassarotti and Harper [2018] study quantitative properties of concrete randomized algorithms, focusing on the formal verification of tail bounds. For example, they handle (a functional version of) quicksort, again using a monadic embedding. Their analysis is formalized in Coq.
The average complexity analysis of skip lists is rather intricate, rendering skip lists a prime example to evaluate the expressivity and usability of proof assistants. Haslbeck and Eberl [2020] formalise the relationship between the expected height and expected length of search paths within the proof assistant Isabelle/HOL, leading also to the formalisation of a considerable amount of results of probability theory. Whereas the starting point of Haslbeck and Eberl is a formal but abstract specification, here, we study a concrete algorithm resembling the reference implementation given by Pugh [1990b]. This explains our focus on program logics, rather than the formalisation of mathematical results. Relying on the extensive library underlying EasyCrypt, our formalisation effort is mostly concerned with laws on expectations (such as linearity or Jensen's inequality). Strongly related to our formal complexity analysis of skip lists is the work by Tassarotti and Harper [2019] on concurrent skip lists. Their Coq formalization extends Iris [Jung et al. 2015] with probabilistic coupling, conceptually in line with our use of eHL in conjunction with pRHL. Their very impressive formalization is orthogonal to our results. On the one hand, their focus is on the verification of quantitative program behaviour in the context of concurrency, while our analysis only concerns sequential evaluation. On the other hand, their notion of skip lists is restricted to two levels and the obtained upper bound on the expected search length is linear, while we consider skip lists in their original definition and re-obtain the original logarithmic bound, in expectation. This latter aspect requires a more involved encoding of our non-concurrent version and consequently a more sophisticated verification.
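To give a feel for the quantity at stake, the expected number of levels of a skip list over n elements, i.e. the maximum of n independent Geometric(1/2) heights, can be computed numerically. The following back-of-the-envelope check (plain Python, not part of our formalisation) confirms the logarithmic growth:

```python
def expected_height(n, kmax=200):
    """E[max of n i.i.d. Geometric(1/2) heights]: each skip-list element
    has height > k with probability 2**-k, so by the tail-sum formula
    E[H] = sum_{k>=0} P[H > k] = sum_{k>=0} (1 - (1 - 2**-k)**n).
    The sum is truncated at kmax; the tail is negligible."""
    return sum(1.0 - (1.0 - 2.0 ** -k) ** n for k in range(kmax))
```

For n = 1024 this yields a little over 11 expected levels, between log2 n and the log2 n + 2 bound used in textbook analyses.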
Comparison with EasyCrypt. EasyCrypt is an interactive proof assistant targeted at the formal verification of cryptographic proofs. Its main component, pRHL, is used to support game-hopping proofs. In addition, EasyCrypt features a program logic called phoare for reasoning about the probability of events. In contrast to eHL, phoare judgments are of the form ⊢ {Φ} C {Ψ} ⋄ p, where Φ and Ψ are boolean-valued assertions, p is a probability bound, and ⋄ is either ≤, ≥, or =; unfortunately, it is difficult to build sound, complete, and practical proof systems for such judgments. Moreover, the proof rules of phoare, and in particular the rule for loops, require programs to be certainly terminating. In general, it would seem beneficial to deprecate phoare and use eHL instead.
Fig. 1. Implementation qselect (left) of quickselect and its eHL-annotated abstraction qselect_abs (right). The term g(i, k, l, h) […]

EasyCrypt also provides a cost logic for adversarial programs [Barbosa et al. 2021]. The purpose of the cost logic is to upper bound the complexity of constructed adversaries, i.e. programs with adversary calls that formalize the security reduction from the security of a cryptographic scheme to hardness assumptions or assumptions about primitives. One main rule of the logic is an instantiation rule, which allows to reason about the cost of a program where the adversary is instantiated by another program; the latter can be a concrete program, but also a so-called constructed adversary. The instantiation rule is required to upper bound constructed adversaries for complex cryptographic systems that are built from several components. The logic is focused on worst-case cost. An interesting direction for future work is to adapt this logic to expected cost.

A BIRD'S-EYE VIEW ON OUR METHODOLOGY
In what follows, we introduce our methodology on Tony Hoare's quickselect [Hoare 1961]: a non-trivial, (possibly) non-recursive, randomized algorithm.
Quickselect. Sorting and searching are arguably the most studied algorithmic problems in computer science. Quickselect is a selection algorithm to find the k-th smallest element in a given (unordered) array. Quickselect operates similarly to quicksort, by partitioning the array around a chosen pivot. However, the recursive call is performed just on the partition actually containing the element one is looking for. This observation allows one to perform a tail-call optimization of recursive quickselect, which produces an iterative algorithm. As for quicksort, performance degrades if bad pivots are consistently chosen. By choosing a pivot uniformly at random at each stage, it can be shown that quickselect's expected runtime, often more interesting than worst-case complexity when randomness plays a role, is in O(n). The code of quickselect with random pivot selection is given in Figure 1a. Arrays are indexed from 0; for instance, qselect([4, 6, 2, 8], 1) = 4 since 4 occurs at index 1 in the sorted input array [2, 4, 6, 8]. Randomized partitioning of an array a (within indices l and h) is implemented with rpartition(a, l, h). The instruction unif(l, h), used to choose the pivot index p, samples an integer uniformly at random between l and h. It is the only point of the code where randomness actually plays a role. Partitioning is then carried out via partition following the Lomuto partition scheme, expecting the pivot at the final index.
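For concreteness, the following is a plain Python rendering of the algorithm just described (Lomuto partition, uniformly random pivot). It mirrors the structure of Figure 1a, minus the cost instrumentation:

```python
import random

def partition(a, l, h):
    """Lomuto partition within a[l..h]: a[h] is the pivot; returns its
    final index. Performs exactly h - l comparisons."""
    pivot, i = a[h], l
    for j in range(l, h):
        if a[j] <= pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[h] = a[h], a[i]
    return i

def rpartition(a, l, h):
    """Randomized partition: pivot index chosen uniformly in [l, h]."""
    p = random.randint(l, h)        # the only source of randomness
    a[p], a[h] = a[h], a[p]         # move pivot to the end, as Lomuto expects
    return partition(a, l, h)

def qselect(a, k):
    """Iterative quickselect: the k-th smallest (0-indexed) element of a."""
    a, l, h = list(a), 0, len(a) - 1
    while l < h:
        i = rpartition(a, l, h)
        if i < k:
            l = i + 1               # answer lies in the right partition
        elif i > k:
            h = i - 1               # answer lies in the left partition
        else:
            break                   # pivot has rank k: found
    return a[k]
```

Whatever random pivots are drawn, qselect([4, 6, 2, 8], 1) returns 4, matching the example above.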
Informal Complexity Analysis. The classic textbook proof of the average case complexity of quickselect can be found in [Cormen et al. 2009]. It is based on a sequence of lemmas that are proved by looking at the source code in a quite abstract way, through some high-level reasoning.
An important observation is that, for each input, if the pivot is chosen uniformly at random from the interval [l, h], then so is its rank (the position of an element in the sorted array) i. Thus, partitioning with a pivot of rank i has probability 1/(h − l + 1) and, depending on i, the resulting parts of the partition have sizes i − l and h − i, respectively. The procedure qselect loops over just one of the parts, the one actually containing the element we are looking for. In particular, if i < k, the right partition of size h − i is explored; likewise, if i > k, then the left partition of size i − l is explored. In the remaining case i = k, the k-th element has been found. Averaging over all the h − l + 1 possible partitions and noting that the number of comparisons performed inside partition is h − l, the average number of comparisons can be estimated accurately by solving a recurrence relation of the shape

    C(l, h) = (h − l) + 1/(h − l + 1) · ( Σ_{i=l..k−1} C(i + 1, h) + Σ_{i=k+1..h} C(l, i − 1) ).

Then, it is not difficult to prove that C(l, h) ≤ 4(h − l). Since l is initialized to 0 and h to size(a) − 1, we obtain the well-known linear bound of O(size(a)), in expectation.
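The recurrence can be checked mechanically. The sketch below (plain Python, with the target rank k made an explicit parameter) evaluates the expected number of comparisons exactly, by averaging over the h − l + 1 pivot ranks as in the recurrence, and confirms the bound C(l, h) ≤ 4(h − l) on small instances:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def C(l, h, k):
    """Expected number of comparisons of quickselect on [l, h] when
    searching for rank k, averaging over the uniform pivot rank i."""
    if l >= h:
        return 0.0
    total = 0.0
    for i in range(l, h + 1):
        if i < k:
            total += C(i + 1, h, k)      # continue in the right partition
        elif i > k:
            total += C(l, i - 1, k)      # continue in the left partition
        # i == k contributes 0: the element has been found
    return (h - l) + total / (h - l + 1)
```

For instance, C(0, 1, 0) = 1.0 (one comparison on a two-element interval), well below the 4(h − l) = 4 bound.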
Towards a Formal Analysis. The complexity analysis sketched above is still informal. In particular, the recurrence relation is obtained by a high-level analysis of the code, and through informal reasoning involving probabilities, sizes of partitions, etc. How can we be sure that all of this is correct?
In this paper, we propose a formal end-to-end methodology that is able to provide upper bounds on the complexity of randomized programs, based on the general methodology of Hoare logic [Hoare 1969]. Towards this formalization, we first have to fix a cost model, i.e., be precise about exactly what to measure. A generic way to do so is to simply instrument the program with a cost counter, as we have already done in Figure 1a. Notice how the global variable ct takes account of the total number of comparisons, the usual cost metric for sorting and selection algorithms, performed by qselect. Our objective now turns into bounding the value that ct takes on average after execution, in terms of the size of the input. From here, a fully formalized complexity analysis of qselect is certainly possible, however, unnecessarily complicated. As we have already seen in the informal proof, a priori we do not really have to reason about the full program. Some parts of it can be abstracted, so that the complexity analysis becomes easier. This is exactly what we have done when we claimed that partition does h − l comparisons. Indeed, program abstraction is a useful tool in program analysis (see e.g. [Magill et al. 2010]). Consider the procedure qselect_abs, depicted in Figure 1b, giving a complexity-preserving skeleton of qselect. Ignoring the gray annotations for now, in essence arrays a are abstracted by their size n. While the skeleton of quickselect remains identical, partitioning of the array becomes superfluous. In rpartition_abs, a cost of h − l is incurred directly and the rank i, rather than the pivot p, is sampled.
Naturally, the claim about the complexity equivalence of the two programs has to be made formal. To this end, relational program logics such as probabilistic relational Hoare logic (pRHL) provide a suitable solution [Barthe et al. 2015, 2012, 2017]. Moreover, support for pRHL is readily available in the proof assistant EasyCrypt. In pRHL, judgments take the form of (relational) Hoare triples ⊢ {Φ} C ∼ D {Ψ}, where Φ and Ψ are predicates over the joint program states of C and D, with the informal meaning that on inputs related by Φ, the programs C and D produce output (distributions) related by Ψ. Referring with (•)⟨1⟩ and (•)⟨2⟩ to the states of the left and right program, the triple (equiv_qselect) asserts that if the inputs are related in the obvious way, then the (distributions of) cost counters ct are identical after execution. The main crux of the proof lies in proving a related statement on partitioning, where res refers to the return value of the procedure. Comparing the two procedures, in effect this statement formalizes that (i) partitioning itself performs h − l comparisons (ct⟨1⟩ = ct⟨2⟩) and that (ii) the rank of the pivot lies uniformly in the interval [l, h] (res⟨1⟩ = res⟨2⟩). While the former point is quite trivial to prove, the latter property essentially states that pivot positions and ranks are in a bijective relationship, a property that rests on functional correctness of partition and uniqueness.
Formal Reasoning about Expectations. Through the correspondence (equiv_qselect), we have achieved a separation of concerns, as functional correctness properties relevant to the complexity analysis have been dealt with. Knowing that qselect_abs is a cost-preserving abstraction of qselect, we can thus focus on the core of the complexity analysis, as carried out in the informal analysis above.
For this, we use a Hoare logic for reasoning about expectations. This logic, dubbed Expectation Hoare Logic (eHL), constitutes a sound and complete logic for reasoning about judgments of the form ⊢ {Φ} C {Ψ}, where Φ and Ψ are (non-negative) real-valued functions over the program state of C, also referred to as pre- and post-expectation, respectively. Informally, this judgment states that the expected value of Ψ after execution of C is bounded by Φ. More formally, this judgment is valid if, for any initial program memory m, the expected value of Ψ on the (sub)distribution ⟦C⟧(m) of memories obtained after evaluating C on m is at most Φ(m).
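As an executable reading of this validity condition, the following Python sketch (a toy two-variable memory (x, ct); all names ours) checks the judgment ⊢ {ct + 2} ct ← ct + 1; x $← unif(0, 2) {x + ct}, which in fact holds with equality:

```python
def expect(d, psi):
    """E_d[psi] for a subdistribution d given as a dict memory -> prob."""
    return sum(p * psi(m) for m, p in d.items())

def sem_C(m):
    """Denotation of  ct <- ct + 1; x $<- unif(0, 2)  on memory m = (x, ct):
    three equiprobable output memories."""
    x, ct = m
    return {(i, ct + 1): 1.0 / 3.0 for i in range(3)}

psi = lambda m: m[0] + m[1]   # post-expectation  x + ct
phi = lambda m: m[1] + 2.0    # pre-expectation   ct + 2
```

For every initial memory m, expect(sem_C(m), psi) equals phi(m): the average sampled value 1 plus the incremented counter.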
Coming back to quickselect, the judgment (qselect_abs_cost) bounds the expected value of ct after execution by 4(n − 1). The guard 0 ≤ k < n in the pre-expectation should be understood as a classical pre-condition; for details see Section 5. We have decorated the code of Figure 1b with the corresponding eHL assertions at each line of the listing. The proof of this statement relies again on an auxiliary statement on partitioning, namely (rpartition_abs_cost). Here, the free variable should be understood as a universally quantified, logical (function) variable, and, as above, res refers to the return value of rpartition_abs. Notice how this statement reflects that the cost counter is advanced by h − l, and that the return value is sampled uniformly from the interval [l, h]. eHL is in many aspects reminiscent of classical HL. Indeed, the core rules, when restricted to predicates, are identical. As such, it transfers Hoare-style backward reasoning to probabilistic programs. Where eHL does depart from HL is in the support of sampling instructions S, embodied by an axiom whose pre-expectation averages the post-expectation over the sampled values. The assignment axiom extends naturally from HL to eHL, and implications turn into inequalities. The two axioms, together with the consequence rule, tacitly employed before the first statement within the procedure's body, should be sufficient to comprehend the annotations given in Figure 1b around the definition of rpartition_abs.
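The backward computation of pre-expectations just sketched can be replayed in a few lines. The hypothetical Python rendering below (names ours) applies the sampling axiom (average the post-expectation over [l, h]) and then the assignment axiom to the body of rpartition_abs, and compares the result against the exact semantics:

```python
def wpe_rpartition_abs(l, h, ct, X):
    """Pre-expectation of the body  ct <- ct + (h - l); res $<- unif(l, h)
    for the post-expectation ct + X(res), computed rule by rule: first the
    sampling axiom (average X over [l, h]), then the assignment axiom."""
    avg_X = sum(X(i) for i in range(l, h + 1)) / (h - l + 1)  # sampling
    return (ct + (h - l)) + avg_X                             # assignment

def simulate(l, h, ct, X):
    """Expected value of ct + X(res) under the exact semantics."""
    p = 1.0 / (h - l + 1)
    return sum(p * (ct + (h - l) + X(i)) for i in range(l, h + 1))
```

The two agree for every instantiation of the logical variable X, which is the content of the soundness of the two axioms on this body.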
In a similar fashion, the annotations of qselect_abs can be traced from bottom to top. As in classical HL, the treatment of loops rests on finding a suitable invariant; here, it is essentially the expectation ct + 4(h − l), under the guard 0 ≤ l ≤ k ≤ h < n. Within the loop, the guard l < h can additionally be assumed; the guard is falsified immediately after the loop. Concerning the nested conditional in the loop, the term ct + g(i, k, l, h) is computed syntactically as the weakest pre-expectation given post-expectation ct + 4(h − l). (See the caption for the precise definition of g.) Concerning the call rpartition_abs(l, h), the logical variable is instantiated by the function i ↦ g(i, k, l, h), since the result of the call is bound to i. Interestingly, one recovers, in a formal and syntax-directed way, the recurrence relation of the previous paragraph through the weakening performed in (★). Indeed, the (approximate) solution of the recurrence (†) becomes the invariant of the main while loop.
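That ct + 4(h − l) is indeed an upper invariant can be double-checked numerically: one iteration costs h − l and then continues on the sub-interval determined by the sampled rank i. The Python sketch below (assuming the interval updates of qselect_abs) verifies that the expected value of the invariant after one iteration is at most its value before:

```python
def step(l, h, k):
    """Expected value of the invariant  ct + 4*(h - l)  after one loop
    iteration, started with ct = 0 (the invariant is linear in ct, so
    this is without loss of generality): the iteration costs h - l, then
    the interval shrinks according to the uniformly sampled rank i."""
    n = h - l + 1
    nxt = 0.0
    for i in range(l, h + 1):            # rank i, uniform over [l, h]
        if i < k:
            nxt += 4 * (h - (i + 1))     # continue on [i + 1, h]
        elif i > k:
            nxt += 4 * ((i - 1) - l)     # continue on [l, i - 1]
        # i == k: the loop exits; only ct remains in the invariant
    return (h - l) + nxt / n
```

The check step(l, h, k) ≤ 4(h − l) for all l ≤ k ≤ h is exactly the proof obligation discharged for the upper invariant of the main while loop.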
The combination of (equiv_qselect) and (qselect_abs_cost) yields the corresponding bound for the concrete implementation, confirming the linear bound, O(size(a)), on the expected cost of qselect derived above in the informal analysis.
Integration within EasyCrypt. The case study on quickselect presented here clarifies the effectiveness of our verification methodology. Relational reasoning provided by pRHL, in particular that employed to guarantee functional correctness, and quantitative reasoning provided by eHL, formalizing the original (informal) complexity proof, work together in a synergistic way. As mentioned, the development is fully formalized (within EasyCrypt), rendering heightened assurance that none of the (necessary) intricacies of a complexity analysis of a randomized algorithm have been overlooked. To this end, EasyCrypt has been extended with support for eHL, see Section 8.

A PROBABILISTIC PROGRAMMING LANGUAGE
We consider here a simple imperative probabilistic programming language, pWhile, capturing the core language of EasyCrypt without adversaries. This language follows the spirit of Dijkstra's Guarded Command Language, but includes (non-recursive) procedures and a separation of global and (statically scoped) local variables. The language will subsequently be extended to permit adversarial code in Section 7, when we discuss applications to cryptography.
Syntax. Let Fun = {f, g, . . .} be a set of procedure names, and Var = {x, y, z, . . .} a set of variables, partitioned into local variables LVar and global variables GVar. The set Stmt of statements is defined by the following syntax:

    C ::= skip | x ← E | x $← S | x ← f(E) | C; C | if B then C else C | while B do C

Here, E ∈ Expr is drawn from a set of expressions, B ∈ BExpr is a Boolean expression, and S ∈ SExpr a sampling expression. The statements are mostly standard. The statement x ← E is the usual, deterministic assignment, whereas x $← S samples a value from S, and thereby makes the language probabilistic. The statement x ← f(E) calls a procedure with argument E and assigns its return value to x. Zero or more than one argument can be passed to procedures as tuples. We require that x is a local variable. A procedure is declared through a procedure definition of the form proc f(x) C; return E, where x ∈ LVar is the formal parameter, C ∈ Stmt the body, and E ∈ Expr the return expression of f. Global variables should be understood as implicit input and output to procedures, whereas local ones are statically scoped. A program P ∈ Prog is a finite sequence of (mutually distinct) procedure definitions.
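As an illustration of the grammar, one could render Stmt as Python dataclasses, one constructor per production (a hypothetical sketch, with expressions kept as strings for brevity):

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Skip:   pass                           # skip
@dataclass
class Assign: x: str; e: str                 # x <- E
@dataclass
class Sample: x: str; s: str                 # x $<- S
@dataclass
class Call:   x: str; f: str; e: str         # x <- f(E)
@dataclass
class Seq:    c1: 'Stmt'; c2: 'Stmt'         # C; C
@dataclass
class If:     b: str; tt: 'Stmt'; ff: 'Stmt' # if B then C else C
@dataclass
class While:  b: str; body: 'Stmt'           # while B do C

Stmt = Union[Skip, Assign, Sample, Call, Seq, If, While]
```

For instance, the first steps of the loop body of qselect could be written Seq(Sample('p', 'unif(l, h)'), If('i < k', Assign('l', 'i + 1'), Skip())).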
Monadic Denotational Semantics. The semantics of imperative programs can be given in many ways. Here, we endow the language with a denotational (monadic) style semantics, lending itself better to the proofs of soundness and completeness of our logic. Since programs are probabilistic, we interpret them as functions from states to subdistributions of states, rather than as mere (partial) state transformers.
A subdistribution over a set A is a function d : A → [0, 1] such that Σ_{a∈A} d(a) ≤ 1; with D(A) we denote the set of all subdistributions over A. For d : D(A), the support supp(d) ⊆ A is given by the collection of elements a ∈ A with d(a) > 0. Throughout the following, we consider only discrete subdistributions, that is, where the set A is countable. Let R+∞ denote the non-negative extended reals. The semantics of a loop while (B) do C is given as the supremum, over n ∈ N, of the semantics of its n-fold unrollings, and we use ⟦f⟧(g, v) as a short-hand for the semantics of a call to f on global memory g and argument v when f is declared in the program as above.
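The subdistribution monad underlying this semantics is easily sketched in Python: unit is the Dirac distribution, bind sequences probabilistic computations, and total mass strictly below 1 accounts for divergence. This is a toy model with dict-based distributions, not the formal semantics:

```python
def unit(m):
    """Dirac distribution on m (the deterministic result)."""
    return {m: 1.0}

def bind(d, f):
    """Sequential composition: run f on each outcome of d, weighted by d."""
    out = {}
    for m, p in d.items():
        for m2, q in f(m).items():
            out[m2] = out.get(m2, 0.0) + p * q
    return out

def mass(d):
    """Total probability mass; strictly below 1 models divergence."""
    return sum(d.values())

coin = {0: 0.5, 1: 0.5}     # a fair coin flip
diverge = {}                # the nowhere-defined (diverging) computation
out_dist = bind(coin, lambda b: diverge if b == 0 else unit(b))
```

Here out_dist has mass 0.5: the branch that diverges simply drops its probability mass, which is exactly why subdistributions (rather than full distributions) interpret possibly non-terminating programs.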

EXPECTATION HOARE LOGIC
In this section, we present the Expectation Hoare Logic (eHL) formally, starting with the core logic and then integrating relational reasoning towards the end of the section.
As seen in Section 3, eHL is designed for reasoning about judgments of the form {Φ} C {Ψ}, where C is a pWhile statement and Φ and Ψ, dubbed pre- and post-expectations respectively, are functions from states to (non-negative) extended reals. In effect, eHL manipulates slightly more complex judgments in order to address a well-known issue with completeness of proof rules for procedures. In a nutshell, standard proof systems for procedures aim to achieve modularity by proving, for each procedure, a procedure specification. These are triples of the form {Φ} f {Ψ}. For instance, in (rpartition_abs_cost) on page 9, we employed a specification whose pre-expectation is parameterised in the argument (here, a tuple) whereas the post-expectation is parameterised in the return value res. Both may reference global variables, like the counter above: these are implicit input and output of the procedure. Modularity is then achieved by using the procedure specification every time the procedure is called. Unfortunately, a naive realization of this approach does not achieve completeness. Incompleteness arises because the specification of a function is independent of its call site. Since independence in itself is desirable for reducing proof effort, the standard compromise is to provide users with a means to adapt a declaration to specific call sites, to reason about properties potentially involving local state. To this end, we borrow the notion of auxiliary (or logical) variables from Kleymann [1998]. Auxiliary variables may occur in pre- and post-expectations and are (implicitly) universally quantified. Effectively, they turn declarations into schemata, in which auxiliary variables can be freely instantiated. For instance, in the above specification of rpartition_abs we used an auxiliary variable, with the intended meaning that the triple holds for any concrete instantiation of it. As for Kleymann, auxiliary variables yield a conceptually simple solution to recover
(relative) completeness of our logic. With this in mind, we can embark on defining eHL. Our presentation closely follows the presentation of (classical) Hoare Logic HL given by Nipkow [2002b], with pre- and post-expectations given by semantic objects parameterized by a type of auxiliary variables, rather than by terms or expressions. In eHL, judgments now take one of two forms, one for statements and one for procedures, with pre- and post-expectations of types Λ → Mem → R+∞ and Λ → GMem × Val → R+∞ respectively, where Λ is the type of auxiliary variables. As indicated above, pre- and post-expectations of procedures are parametric only in the global memory. In the pre-expectation, the additional value argument refers to the formal parameter of f, whereas in the post-expectation it refers to the returned value. To avoid notational overhead, in examples we will continue to write pre- and post-expectations as expressions, potentially referring to extra auxiliary variables besides program variables. For instance, Λ = Z × Z admits two integer-valued auxiliary variables, say a and b. If x is a program variable, an expression such as a + b + x formally represents λ(a, b). λm. a + b + m(x). In a similar vein, we will use variables arg and res to refer to the formal parameter and return value within procedure specifications.
eHL is tailored to proving upper-bounds on the value that a function takes, in expectation, after running a program.This meaning is made precise through the notion of validity.
Finally, through the binary operator ( | ) that we have already used when reasoning about quickselect, we can also combine classical with probabilistic reasoning. Semantically, the operator restricts an expectation to the memories satisfying a Boolean assertion.

The Core Rules. Figure 3 presents the core rules of eHL. Interestingly, and what we believe makes the logic particularly usable, is that the core rules are in essence identical in shape to those of classical HL. This is in particular visible in the rules (skip), (seq) and (assign). In Rule (assign), Φ[x/E] is shorthand for λm. Φ(m[x/⟦E⟧(m)]). Rule (sample) generalizes the usual assignment rule to sampling instructions: the pre-expectation is the weakest one bounding the post-expectation when x is sampled from S. For instance, ⊢ {0.5} x $← unif({0, 1}) {x} states that in expectation the value of x is given by 0.5, when sampled uniformly from {0, 1}.
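The pre-expectation computed by rule (sample) can be phrased operationally: it is the expected value of the post-expectation over the sampling distribution, in the updated memory. The sketch below, with memories encoded as Python dictionaries (an encoding of ours, not EasyCrypt's), checks the unif({0, 1}) example from the text.

```python
def wp_sample(x, dist, post):
    # Pre-expectation of `x $<- dist` w.r.t. post-expectation `post`:
    # the expected value of `post` in the updated memory, over `dist`.
    def pre(mem):
        return sum(p * post({**mem, x: v}) for v, p in dist.items())
    return pre

# The example from the text: sampling x uniformly from {0, 1} with
# post-expectation "the value of x" yields the constant pre-expectation 0.5.
unif01 = {0: 0.5, 1: 0.5}
pre = wp_sample("x", unif01, lambda mem: mem["x"])
```

Note that the resulting pre-expectation is constant: it ignores the previous value of x, as the sampled value overwrites it.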
Rule (if) is the direct adaptation of the corresponding classical HL rule. The rule descends into the then- and else-branches, where one can additionally assume that the guard, respectively its negation, holds. Concerning loops, rule (while) requires establishing an invariant on the loop's body. As in classical HL, the invariant needs to be established only on initial memories making the guard evaluate to true. The rule also establishes that the guard evaluates to false after exiting the loop.
The rule (call) allows one to use a procedure specification {Φ} f {Ψ} to reason about a call site x ← f(E). Recall that Φ and Ψ are parameterized, besides auxiliary variables and global state, by the formal argument and return value of f, respectively. The rule adapts these to the call site, by substituting the value of the argument E for the formal argument in Φ, and by identifying the return value of f with that of the assigned variable x within Ψ. Dual to (call), rule (proc) establishes that a procedure proc f(x) C; return E satisfies a specification {Φ} f {Ψ}. Here, one essentially has to validate that the procedure's body C; return E adheres to the specification. Following the semantics of procedure calls, a pre-condition stating that the local memory coincides with the initial local memory l₀, apart from the formal argument x which ranges over an arbitrary value, permits one to restrict attention to such memories. This completes the definition of all structural rules.

Fig. 4. Integration of Relational Hoare Logic
The final two logical rules deal with auxiliary variables and approximate reasoning, through a rule of consequence. A natural candidate for the latter is the rule we have seen in Section 3, corresponding to the law of monotonicity in pre-expectation transformers [McIver and Morgan 2005]. Alas, ignoring auxiliary variables, that rule is too weak and its addition alone would render our logic incomplete. Rather, our rule (conseq) is an embodiment of the one of Nipkow [2002b], which is strictly more powerful in the presence of local variables. Observe how the additional premise is just enough to lift validity ⊨ {Φ′} C {Ψ′} of the premise to that of the conclusion. Although a bit cumbersome, its generality allows one to derive various rules more useful in practice, such as the simple rule from Section 3. It also encompasses book-keeping rules on auxiliary variables, such as the instantiation (or substitution) rule, in which an auxiliary variable is instantiated by an expression over the remaining auxiliary variables.
The final rule (nmod) captures the observation that if a variable is not touched by statement C, it remains constant throughout evaluation, and can thereby be regarded as an auxiliary variable. In the rule, Mod C denotes the set of variables modified by C ∈ Stmt. The rule gives a means to internalise the local memory across procedure calls, indispensable in our setup since procedure specifications reference only global memories. This rule, together with the rule of consequence, is powerful enough to derive e.g. a framing rule based on Jensen's inequality. We elaborate more on that in Section 8. Theorem 5.2 (Soundness and Completeness). For all procedures f, derivable and valid procedure specifications coincide. The proof of this theorem is given in the appendix.
Relational Reasoning. Formally reasoning about the complexity of intricate programs can be very hard. However, complexity can often be studied with much less burden on simplified (but complexity-preserving) versions of the original programs. Probabilistic relational Hoare logic (pRHL for short) allows one to formally relate two programs that behave the same [Barthe et al. 2012, 2015, 2017]. Judgments relate two statements C and D via a pre-assertion Φ and a post-assertion Ψ, where Φ, Ψ ⊆ Mem × Mem relate memories of C and D. The intuitive meaning behind such a judgment is that, when programs C and D are run on initial memories related by Φ, the resulting output distributions are coupled via the relation Ψ. Probabilistic coupling is formalised via the notion of relational lifting of Ψ to a relation Ψ† ⊆ D(Mem) × D(Mem). Precisely, μ1 Ψ† μ2 iff there exists a (sub)distribution μ ∈ D(Mem × Mem) such that (i) the marginal (sub)distributions of μ are μ1 and μ2; and (ii) supp(μ) ⊆ Ψ. We are now ready to state the definition of validity of a pRHL judgment. The proof system underlying pRHL is extensively described in the literature [Barthe et al. 2012, 2015, 2017]; notably, an implementation is available in EasyCrypt. Here, the notion of validity is sufficient to relate eHL with pRHL. Indeed, we would like to transfer eHL properties from one program C′ to a potentially more complex one C. The rule in Figure 4 allows for just that. Concerning post-expectations, the second side-condition is sufficient to establish E_μ[Ψ] ≤ E_{μ′}[Ψ′] for any coupling μ Ψ† μ′. Through the pRHL judgement, this holds in particular for the output distributions of C and C′, on any pair of initial memories related by Φ. The first side-condition now essentially demands that each initial memory of C can be paired with a memory of C′ related through Φ, but also through the pre-expectations. From here, soundness is not difficult to establish.
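Relational lifting is existentially quantified, but a claimed witness is easy to check. The following sketch (our finite encoding, with distributions as dictionaries) verifies that a candidate distribution over pairs couples two given distributions under a relation, and shows the diagonal coupling of two fair coins under equality.

```python
def is_coupling(mu, mu1, mu2, rel, tol=1e-9):
    # Check that mu, a subdistribution over pairs, witnesses the lifting
    # mu1 rel† mu2: its left/right marginals are mu1 and mu2, and its
    # support is contained in rel.
    left, right = {}, {}
    for (a, b), p in mu.items():
        if p > 0 and not rel(a, b):
            return False                      # support escapes the relation
        left[a] = left.get(a, 0.0) + p        # accumulate left marginal
        right[b] = right.get(b, 0.0) + p      # accumulate right marginal
    ok1 = all(abs(left.get(a, 0.0) - p) < tol for a, p in mu1.items())
    ok2 = all(abs(right.get(b, 0.0) - p) < tol for b, p in mu2.items())
    return ok1 and ok2

# Two fair coins are related by equality via the diagonal coupling,
# while the independent product coupling is not a witness for equality.
coin = {0: 0.5, 1: 0.5}
diag = {(0, 0): 0.5, (1, 1): 0.5}
indep = {(a, b): 0.25 for a in (0, 1) for b in (0, 1)}
```

Lifting of the equality relation coincides with equality of distributions, which is exactly how pRHL judgments with post-relation "=" transfer probability statements between two programs.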

AVERAGE CASE COMPLEXITY OF SKIP LISTS
In this section, we demonstrate the flexibility of our framework via a complexity analysis of the skip list data structure. Skip lists were introduced by Pugh [1990b] as a randomized alternative to balanced binary trees that is easier to implement and maintain. Being a probabilistic data structure, their formal average-case complexity analysis is intricate. A skip list can be thought of as an ordered linked list, where nodes may have additional forward pointers skipping several nodes ahead, so as to facilitate a more efficient search. Forward pointers are organised in levels, each level skipping ahead (ideally) half of the nodes found in the level below. By introducing log2(n) levels for a list of n elements, search becomes effectively an O(log2(n)) operation. For illustration, Figure 5a shows a (perfectly balanced) skip list with three levels of forward pointers, organized as a stack above keys. Elements −∞ and +∞ act as head and terminator of the list, respectively.
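To fix intuitions, the following Python sketch performs a skip-list search directly on the height abstraction discussed later: nodes are (key, height) pairs, and we count key comparisons in the spirit of the paper's cost counter ct. The layout and exact accounting are our simplification, so absolute counts may differ from the EasyCrypt implementation of find.

```python
def find_cost(nodes, k):
    # nodes: sorted list of (key, height) pairs; the -inf head is taller
    # than every node, and +inf acts as an implicit terminator.
    # Counts key comparisons "k < key", mimicking the cost counter ct.
    n = len(nodes)
    head_h = max((h for _, h in nodes), default=0) + 1
    cost, i = 0, -1                 # i = -1 stands for the -inf head
    for level in range(head_h - 1, -1, -1):
        while True:
            # next node on this level: first node right of i tall enough
            j = next((j for j in range(i + 1, n) if nodes[j][1] > level), None)
            if j is None:           # next node is +inf: drop a level
                break
            cost += 1               # comparison k < key_j
            if k < nodes[j][0]:
                break               # overshoot: drop a level
            i = j                   # move right
    return cost

# The perfectly balanced three-level list of Figure 5a, in spirit.
balanced = [(1, 1), (2, 2), (3, 1), (4, 3), (5, 1), (6, 2), (7, 1)]
```

In a real skip list the heights would be sampled geometrically at insertion time; here they are fixed so that the comparison count is deterministic.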
Maintaining perfect balance of forward pointers when elements are inserted or deleted is a costly operation. In a skip list, forward pointers are instead chosen at random when a node is inserted, so that balance holds in expectation. Figure 5b depicts the search for key 3, highlighting the search path traversed by find(3). The implementation increments the cost counter ct whenever a comparison of keys, k < get_key(q), is performed. Observe how this cost measure is directly related to the length of the search path.
Insertion. Inserting a datum d with key k involves first finding the location for the given k, and linking a new node within the skip list in case the key k is not present. Figure 5c illustrates insertion of a datum z with unoccupied key 3.5. The size of the array of forward pointers fwd is drawn at random. Pre-existing pointers that would "skip through the new column" are separated in two, pointing now to and from the new node, respectively. Finally, the forward pointers of −∞ are extended in case insertion increases the maximal level, as in the figure. The full implementation of insertion is given by procedure proc insert(k, d) in Figure 6a. It uses a variation find_path of find that returns an array spf of forward pointers on the search path where search took a downward turn (those that will link to a newly inserted entry) together with the key k′ where search terminated. Insertion incurs no cost, as we will be interested in the complexity of search.

Average search complexity.
In what follows, we outline our formalization of the search complexity of skip lists, i.e., the number of comparisons performed within a search. To this end, our starting point is the function find_cost(lst, k), which searches for key k in a skip list built from the provided list of key/value pairs lst. The procedure returns the cost counter, storing the number of comparisons performed by find. Since the skip list is constructed at random through the implementation of insert, the expectation of the return value ct reflects precisely the average search complexity.

Outline of the Formalization
Height abstraction. Since pointers in a skip list always point forward to the first node of sufficient height, the structure of a skip list is fully determined by the heights of its nodes, i.e., the sizes of their arrays of forward pointers. This, in turn, justifies abstracting nodes by their height; specifically, we have the following mapping in mind: As insert is mostly concerned with managing the pointer structure after update, this abstraction considerably simplifies its implementation, see Figure 6b. Correctness of this abstraction is justified by the (classical) Hoare judgment where upd(hs, k, hk) updates the height of k to hk in hs, only in the case where k ∉ dom(hs). The predicate wf(nodes) collects several well-formedness conditions expressing that nodes forms a skip list (e.g., keys are ordered, pointers reference the first larger key, etc.). Reasoning inductively, this auxiliary result establishes the following correspondence: As we have already alluded to above, the search complexity corresponds to the length of the search path. Formally, this statement is expressed by a (classical) Hoare triple where cost refers to the value of the cost counter before execution, and where path_len(hts, k) expresses the length of the search path to key k. It is worth mentioning that the proof of this judgment depends crucially on wf(nodes). For instance, were nodes to contain a cycle, find(k) would potentially loop and no bound on ct could be derived. This explains why we have proven preservation of well-formedness (in essence, functional correctness) of insertion. Indeed, this turned out to be the most delicate part in the proof of (equiv_from_list).
By (find_spec), to analyze the search complexity it is sufficient to bound the length of the search path path_len(hts(nodes), k), which in turn is computable within the abstraction. The procedure path_len_to(k), given in Figure 6b, gives an explicit definition of the search path length. To give some intuition about the definition, reconsider the search path for key k = 3 depicted in Figure 5b. The procedure starts by scanning keys in reverse order, pictorially from right to left, until it reaches head(keys) = 3. From there on, the procedure traverses the search path in reverse order, starting at level l = −1. Observe that search always reaches a new key through the top-most incoming forward pointer. Correspondingly, the backward traversal moves up by raising the level l to the node's maximal level and by incrementing the length len of the path traversed so far, accounting for the upward moves and the move to the left. From here, the procedure iterates. In the example, at key 3 the procedure moves this way to level l = 1, advancing len = 0 to len = 3, accounting for the two upward moves and the move to the left. The procedure then iterates, to key 2, skipping along key 1 not on the search path (due to the condition l < hts[k]), until finally arriving at −∞. The final increment in the return statement accounts for the final move upwards on key −∞, in the example from level 2 to level 3. With this intuition in mind, functional correctness is easily provable in classical Hoare logic. Summing up, the following relational Hoare judgment states correctness of the abstraction with respect to search complexity.
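The reverse traversal above is the classic backward view of skip-list search: read the path from the target back to the head, where each backward step is an upward move when the node is tall enough (probability p = 1/2 for geometric heights) and a leftward move otherwise. The following Monte Carlo sketch (our simplification, not the paper's path_len_to) estimates the expected number of backward steps needed to climb j levels, which is j/p.

```python
import random

def backward_steps(levels_to_climb, p=0.5, rng=random):
    # Walk the search path in reverse: each backward step is an upward
    # move with probability p (the node is tall enough) and a leftward
    # move otherwise; count steps until enough upward moves happened.
    steps, ups = 0, 0
    while ups < levels_to_climb:
        steps += 1
        if rng.random() < p:
            ups += 1
    return steps

rng = random.Random(42)
trials = [backward_steps(5, rng=rng) for _ in range(20000)]
avg = sum(trials) / len(trials)   # expected value is 5 / 0.5 = 10
```

Climbing the roughly log2(n) levels of an n-element skip list thus costs about 2·log2(n) backward steps in expectation, matching the shape of the logarithmic bound derived below.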
Estimation of the path length through the height abstraction. The judgment (equiv_find_cost_h) formally justifies analyzing the search complexity through its height abstraction given in Figure 6b. The crux in proving the latter directly is to find a suitable upper invariant for the loop in from_list_h. In effect, this requires expressing the search path length after inserting a column in terms of the search path length before the insertion. At the same time, this invariant has to lead to a sufficiently tight bound in the size of keys. However, this technicality can be avoided altogether by sampling hts on demand, rather than eagerly. The procedure find_cost_d, given in Figure 6c, is obtained by inlining path_len_to within find_cost_h from Figure 6b. Heights hk corresponding to hts[k] are sampled on demand, within the path traversal, rendering the call to from_list_h obsolete. The auxiliary variable h refers to the maximal sampled height, viz. the height of −∞. One can prove that, semantically, find_cost_h and find_cost_d coincide: Notice that the left program inserts keys in the order they occur in lst, whereas the right program processes keys in reverse-sorted order, removing duplicates. Thus, a rather involved key step towards this equivalence is proving that the path length is independent of the order of insertions. Final cost analysis via eHL. The judgments (equiv_find_cost_h) and (equiv_find_cost_d) establish that the complexity of searching for a key k in an arbitrary skip list built from lst is computed by find_cost_d(lst, k). The final puzzle piece now lies in bounding this result, in expectation. To this end, we make use of eHL; compare the assertions in Figure 6c. The gist of the proof lies in finding an invariant for the loop. As the definition and the related weakening proofs are technically quite involved, we have relegated further discussion to the Appendix. Very briefly, terms Δ_h(n) and Δ_{h+1}(n) are used to account for changes to the path length,
through vertical and horizontal steps, respectively. Concerning horizontal steps, for instance, in the common case where the current height h does not exceed the (average) logarithmic overall height, Δ_h(n) = log2((n+1)/2^h) measures the expected length increase of completing the loop in terms of the (n+1)/2^h nodes found at current height h. The invariant turns out slightly more complicated, as it also accounts for the final difference h − 1 − l contributing to the result of the procedure (see weakening (★1)). Once carried over the initialisation statements (see weakening (★2)), the invariant yields the final logarithmic bound 2 · log2(size(lst) + 1) + 4. Apart from defining the invariant, the most delicate step concerned the proof of the weakening step (★2). Towards this proof, we have built a considerable library on laws of expectations, such as the law of linearity, Jensen's inequality, etc.
Concluding Remarks. Splitting the correctness proof, done via pRHL, from the complexity analysis, carried out via eHL, seems essential to achieving our goal. The modularity provided by our framework has allowed us to develop the proof step by step, in a compositional way, which would not have been possible without the EasyCrypt implementation.

ADVERSARIES AND APPLICATIONS TO CRYPTOGRAPHIC PROOFS
In this section, we extend our programming language and logic with adversary calls, and illustrate how the extended logic can be used to reason about cryptographic proofs. Our example is inspired by recent work of Barbosa et al. [2023], which uses our implementation of eHL for proving security of Dilithium [Ducas et al. 2017], a post-quantum signature scheme recently standardized by NIST (the National Institute of Standards and Technology).
Extension of the language. We now extend the language to adversarial code by permitting adversary calls x ← A^o(E), where A is drawn from a set Adv = {A, B, . . .} of adversary names. Each adversary call is parameterised by an oracle, i.e., a pre-defined procedure o ∈ Fun. The code of an adversary A is supplied by an adversary environment η, mapping A and the provided oracle o to a procedure declaration (η A)(o). We require that adversary environments are consistent with the writeable variables Write_A, i.e., neither the body of (η A)(o), nor any of its subprocedures except the oracle o, modifies the memory outside of Write_A. For instance, if the adversary executes an instruction x ← E, then x ∈ Write_A.
In contrast to the notion of modified variables Mod C, which is semantic, Write_A is a syntactic notion, with subtle differences. The memory content of a variable x ∉ Write_A may change during an invocation, but only through invocations of the oracle. The semantics of an adversarial call is now identical to that of ordinary procedure calls, except that the declaration of the adversary is provided by the adversary environment η: we let ⟦A^o⟧ = ⟦(η A)(o)⟧, but treat a call x ← A^o(E) otherwise identically to an ordinary procedure call.
Extension of eHL. To extend the logic to programs with adversarial code, the notion of judgment can remain identical, apart from the fact that program statements may now contain adversarial calls. However, judgments will now be valid if validity in the original sense holds independently of the adversarial code, that is, for all instantiations of the auxiliary variables and all adversary environments η. Similarly, validity for procedure declarations is defined by quantifying over all adversary environments.
The following now gives our adversarial rule, for an invariant Φ : Λ → GMem → R+∞ depending only on the global memory.
This rule lifts invariants on oracles to invariants of adversaries. The hypothesis Φ ⊥ Write_A, stating that Φ is independent of the variables writable by the adversary, ensures that Φ remains invariant throughout the complete invocation of the adversary.
Example. We illustrate how eHL can be used to upper-bound the probability of bad events in rejection sampling. The example captures the essence of a key step in the security proof of the Dilithium signature scheme, formalized by Barbosa et al. [2023] using our implementation of eHL. Our goal is to provide an upper bound on the probability that a fresh, random value appears in the history of samplings performed during rejection sampling. This stage of the proof is represented, in slightly simplified form, in Figure 7. Procedure rsample (Figure 7a) performs rejection sampling from distribution sample with predicate test. The global variable log keeps track of all sampled values. Each invocation of the oracle o, later provided to the adversary, performs rejection sampling and thereby populates log. A global counter c keeps track of the number of oracle invocations. Once the counter reaches a bound Q ≥ 0, a bad event is signaled by setting the global variable bad, precisely when log contains a randomly sampled value r*. The main program (Figure 7c) consists simply of a call to the adversary A^o, with global variables initialised correspondingly. The adversary has access to the global variables only through the oracle, that is, Write_A = ∅. Our goal is to bound the probability of the Boolean variable bad (its expectation) within this program.
Figure 7 is annotated with the corresponding eHL proof. The central proof step lies in annotating the oracle in Figure 7b with an invariant bounding the value of bad. Being initialized to false, this variable is only set once the invocation counter c of the oracle reaches Q, and then only when a freshly sampled value r* collides with a previously sampled value in log. In turn, the probability of a collision r* ∈ log is bounded from above by ε · size(log), for ε an upper bound on the probabilities assigned by sample. This, in effect, allows us to estimate the value of bad in terms of the size of log when c reaches Q. To this end, let p > 0 be the probability that a sample satisfies the predicate test. As indicated in Figure 7a, rejection sampling increases the length of log, on average, by 1/p.
The invariant given in the specification (see Figure 7b) lifts this observation to the oracle. In the term ε · (size(log) + (Q − c)/p), the factor ε stems from the approximation of bad in terms of the size of log, while the fraction (Q − c)/p accounts for the potential size increase of log until the invocation counter reaches the limit Q. Once the limit is reached, the invariant simply refers to the value of bad. The overall program is now treated essentially by an application of the adversary rule, using the invariant on the oracle as provided, see Figure 7c. The weakening at the end follows from a classical invariant. The derived bound ε · Q/p is obtained by simplification of the invariant, with global variables initialised correspondingly.
The proof hinges essentially on the fact that, on average, the size of log is bounded, although rsample is potentially non-terminating and may produce a log of arbitrary size. Lacking capabilities for expectation-based reasoning, the phoare logic present in EasyCrypt renders such a proof significantly more involved. The most natural approach there is to proceed via an approximation of rejection sampling in which the number of iterations is bounded, say by a constant N. Thereby, within the Q invocations of the oracle, the size of log becomes bounded by Q · N in the worst case. On the so-transformed, certainly terminating, program, one can then obtain a bound of Q · N · ε on the probability of bad being set. The approximation itself, however, introduces an additional error term. In contrast, the use of eHL not only significantly reduced the proof effort, it also led to a better bound. The complete formal proof in EasyCrypt takes in total only 48 lines. The frame rule (detailed in the next section) turned out particularly useful: it allowed us to lift the specification of rsample, which talks only about the expected size increase of log, to the call within the oracle o.
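The two quantities driving the invariant, the expected growth of log by 1/p per oracle call and the collision bound ε · size(log), can be checked empirically. The sketch below is our simplified model of Figure 7, with sample uniform over M values (so ε = 1/M) and an acceptance test succeeding with probability p; the names and parameters are ours.

```python
import random

def play(Q, M, p, rng):
    # One run of the simplified game: Q oracle calls, each performing
    # rejection sampling from uniform({0..M-1}) and logging every draw;
    # afterwards, bad is set iff a fresh uniform r* already occurs in log.
    log = []
    for _ in range(Q):
        while True:
            r = rng.randrange(M)
            log.append(r)
            if rng.random() < p:    # r passes the acceptance test
                break
    r_star = rng.randrange(M)
    return len(log), (r_star in log)

rng = random.Random(7)
Q, M, p = 5, 1000, 0.5
runs = [play(Q, M, p, rng) for _ in range(20000)]
avg_len = sum(l for l, _ in runs) / len(runs)     # expected Q / p = 10
bad_freq = sum(b for _, b in runs) / len(runs)    # about eps * Q / p = 0.01
```

The simulation illustrates why the bound ε · Q/p is essentially tight: the only slack comes from repeated values in log, which a union bound counts twice.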

IMPLEMENTATION
We have implemented eHL in the EasyCrypt proof assistant [Barthe et al. 2013]. EasyCrypt is a natural choice for implementing eHL, since it is specially tailored to reasoning about probabilistic programs. Informally, EasyCrypt combines a proof engine for an ambient higher-order logic (HOL) with several program logics for proving properties of probabilistic programs. Judgments of the program logics are terms of the ambient logic, and proofs in the program logics are carried out by means of (proof) tactics. In essence, a tactic implements a rule of the logic, by turning the conclusion into its hypotheses. This way, proofs are built gradually from the conclusion, upwards, ending in the axioms of the logic.
Fig. 8. Excerpt of derived rules implemented in EasyCrypt.
In order to support expectation-based reasoning, we have added eHL judgments as assertions of the ambient logic, and built support to reason about such judgments. In particular, we have:
- added tactics for the core and several derived eHL rules. The core rules are in the trusted computing base (TCB) of the tool, whereas the derived rules are designed to generate sequences of core tactics, in order to minimize the TCB as much as possible;
- added a library to reason about expectations. The library is required to discharge the many ambient-logic goals generated by applying eHL tactics.
Derived proof rules. The proof rules in Section 5 follow the conventional presentation of program logics, but are tedious to use in practice. For instance, reasoning about a sequence of instructions would first require a sequence of applications of rule (seq) to split the sequence apart, and then use of the syntax-directed rules, possibly combined with non-structural rules, on the individual program instructions. Even more tedious, working towards a triple this way would entail that, in many situations, the intermediate pre-/post-expectations need to be supplied by the user, as they cannot be inferred in general. To overcome these complications and enhance the usability of the logic, the implementation provides derived rules composing the core syntax-directed rules with sequential composition and structural rules. An excerpt of the derived rules can be found in Figure 8.
Rule (skipEc) is a variation of the ordinary rule (skip), combined with rule (conseq) to make it applicable to the usual scenario where pre- and post-expectations differ. Rule (assignEc), the combination of rules (seq) and (assign), embodies the backward style of analysis commonly found across the implementations of different logics in EasyCrypt, close to traditional weakest pre-condition reasoning. Generalising this idea with rule (wpEc), our implementation provides a tactic wp computing the weakest pre-expectation, wp(D, Ψ), for a tail D containing neither loops nor procedure calls. To illustrate the advantage of these derived rules, note that the proof of rpartition_abs in Figure 1b is completely automated by the tactic wp, apart from the initial weakening step. This would not have been possible otherwise.
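The backward computation performed by the wp tactic can be sketched in a few lines. The following Python model (our encoding; EasyCrypt's tactic operates on its own program syntax) processes a straight-line program of assignments and samplings from last instruction to first, applying the (assign) and (sample) transformations in turn.

```python
def wp(code, post):
    # Compute the weakest pre-expectation of straight-line code backwards,
    # one instruction at a time, as the wp tactic does.
    for kind, x, rhs in reversed(code):
        prev = post
        if kind == "assign":        # x <- E : substitute E's value for x
            post = lambda m, x=x, e=rhs, q=prev: q({**m, x: e(m)})
        elif kind == "sample":      # x $<- d : average over the distribution
            post = lambda m, x=x, d=rhs, q=prev: sum(
                p * q({**m, x: v}) for v, p in d(m).items())
    return post

# wp of  y <- x + 1; z $<- unif({0, y})  w.r.t. post-expectation z:
# averaging gives y/2, and substitution gives (x + 1)/2.
prog = [
    ("assign", "y", lambda m: m["x"] + 1),
    ("sample", "z", lambda m: {0: 0.5, m["y"]: 0.5}),
]
pre = wp(prog, lambda m: m["z"])
```

Loops and procedure calls are exactly where this mechanical computation stops, which is why they are handled by the (while), (call) and structural rules instead.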
Among the more interesting derived rules is the final rule (frameEc). In classical Hoare logic, the frame rule, also known as the rule of constancy, allows one to derive {P ∧ R} C {Q ∧ R} from {P} C {Q}, provided C does not modify the variables occurring in R. It is indispensable in practice, since it allows one to focus only on the relevant part of an assertion, namely the one that is potentially altered by C. But how does this rule transfer to our quantitative logic; in particular, how should logical conjunction be translated? There are three valid candidate rules, all derivable from rules (conseq) and (nmod). Rather than imposing a concrete choice, our rule (frameEc) abstracts over the choice, and permits placing pre- and post-expectations in an arbitrary context κ ⊥ Mod C that is concave and non-decreasing (i.e., monotone), when seen as a function κ : R+∞ → R+∞. For instance, this rule has been applied in the previous section, adapting the function specification of rsample to the call site within o (see Figure 7). In that application, κ is given by a context with a hole □; seen as a function in the hole □, this term can be proven monotone and concave, as demanded by the third premise of rule (frameEc). Since it mentions only local, unmodified variables, the second premise is easy to discharge. The rule itself is derivable by a composition of (nmod) and (conseq). To see this, assume ⊢ {Φ} C {Ψ}. Rule (conseq) leaves two premises on expectations: the first effectively imposes concavity of κ (it is then a consequence of the reverse Jensen's inequality), the second imposes that κ is non-decreasing. Allowing κ to depend on the part of the memory that is not modified by C explains why the rule relies on (nmod) to be justified.
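The side condition on κ is exactly what makes the framing step sound: for concave κ, the reverse Jensen's inequality gives E[κ ∘ f] ≤ κ(E[f]), so a bound on E[f] yields a bound on E[κ ∘ f]. A small numeric check, with κ = log2(1 + ·) as a representative concave, non-decreasing context of our choosing:

```python
import math

def expect(mu, f):
    # Expected value of f under a finite distribution mu.
    return sum(p * f(a) for a, p in mu.items())

# kappa must be concave and non-decreasing, as rule (frameEc) demands;
# log2(1 + t) is a standard example satisfying both.
kappa = lambda t: math.log2(1 + t)

mu = {1: 0.25, 3: 0.25, 8: 0.5}   # an arbitrary finite distribution
f = lambda a: a                    # post-expectation: the value itself

lhs = expect(mu, lambda a: kappa(f(a)))   # E[kappa(f)]
rhs = kappa(expect(mu, f))                # kappa(E[f])
```

For a convex context such as t², the inequality would flip, which is why rule (frameEc) insists on concavity rather than admitting arbitrary contexts.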
To ease the application of the frame rule, we have proven a list of lemmas showing that functions like the identity, multiplication by a constant, or log satisfy those properties. EasyCrypt is then able to automatically and recursively apply those lemmas to prove the last premise. Furthermore, this list of lemmas is user-extensible. This way, for instance, EasyCrypt can automatically discharge the premises related to the context (×) in the proof mentioned above.
The final rule, rule (callEc), implemented by tactic call, allows one to compute the weakest pre-expectation of a procedure call, given a specification. The specification itself is usually already proven as a lemma in EasyCrypt. The rule implicitly features an application of rule (frameEc), more precisely its instance (frame1) given in the motivation above, to internalise an implicit weakening of the post-expectation within the pre-expectation. This aids usability in connection with wp, which will, for instance, automatically propagate variable initialisations within the internalised weakening. The tactic call also takes a further context (adhering to the restrictions imposed by rule (frameEc)) as an optional argument, in order to lift a procedure specification directly to its use at a call site.
Last but not least, in addition to these derived rules, we have extended already existing tactics that do not change the semantics of programs to deal with eHL judgments, such as the inline tactic that replaces a procedure call by its body.
Libraries of extended positive reals and expectations. We have developed a library to reason about extended positive reals and expectations. The library formalizes the type of positive reals R+ as a subtype of R, and the type of extended positive reals as a disjoint union of R+ and +∞. The library establishes that both positive and extended positive reals form additive monoids, which allows us to instantiate the EasyCrypt library on big operators. This library, inspired by [Bertot et al. 2008], provides a wealth of facts to reason about indexed sums, via the mathematical operator Σ. Using big operators, it is thus relatively simple to define the notion of expectation, and to prove elementary facts about expectations. These facts are used to discharge many proof obligations automatically. At the time of writing, the library weighs in at around 1,100 lines of proof scripts.

CONCLUSION
We have proposed a proof-hopping approach for reasoning about expectation-based properties of (adversarial) probabilistic programs, and have extended the EasyCrypt proof assistant to support it. In addition, we have shown that our approach is useful for reasoning about the expected cost of randomized algorithms and for cryptographic proofs. Our implementation of eHL has been integrated into the EasyCrypt proof assistant. Future directions include extending eHL to quantum adversaries and quantum programs, and further developing and formally capturing the use of expectation-based properties in cryptography.

DATA-AVAILABILITY STATEMENT
The implementation of the logic and case studies are available on Zenodo [Avanzini et al. 2024].

Fig. 2. Semantics of statements ⟦·⟧(·) : Stmt → Mem → D(Mem).

reals extended with top element +∞. Given a function f : A → R+∞ and a distribution μ : D(A), we denote by E_μ[f] ≜ Σ_{a ∈ supp(μ)} f(a) · μ(a) the expected value of f on μ. By the Monotone Convergence Theorem, this value always lies within R+∞. The subdistribution functor D forms a monad. The unit dunit : A → D(A) returns, on a ∈ A, the Dirac distribution δ_a (where δ_a(b) ≜ 1 if a = b and δ_a(b) ≜ 0 otherwise). The bind dbind : D(A) → (A → D(B)) → D(B) is defined as dbind μ f ≜ λb. Σ_{a ∈ supp(μ)} μ(a) · f(a)(b) : D(B). To ease notation, we may write dlet a ← μ in f(a) for dbind μ (λa. f(a)). With fail : D(A) we denote the subdistribution with empty support. We model program memories as mappings m ∈ Mem ≜ Var → Val from variables to (a discrete set of) values Val. Each memory m can be partitioned into a global memory g : GMem ≜ GVar → Val and a local memory l : LMem ≜ LVar → Val. We write m[x/v] for the memory obtained from m by updating x to v. We suppose that expressions E ∈ Expr, Boolean expressions B ∈ BExpr and sampling expressions S ∈ SExpr are equipped with semantics ⟦E⟧(·) : Mem → Val, ⟦B⟧(·) : Mem → B and ⟦S⟧(·) : Mem → D(Val), respectively. Statements C are then interpreted as functions ⟦C⟧(·) : Mem → D(Mem), see Figure 2. The definition is mostly standard. Notably, each procedure f is interpreted as a function in GMem × Val → D(GMem × Val), parameterised by the global memory before execution and a value (the formal parameter), and yielding as output a subdistribution of modified global memories and return values. Upon invocation, the local memory is initialised to an initial memory l0 assigning to each variable x ∈ LVar a default value, and the formal parameter x is bound to the argument. Upon completion, the return value is evaluated and returned, together with the potentially modified global memory. Precisely, we interpret a declaration by
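Over finite supports, the subdistribution monad above can be sketched as follows; dunit, dbind and fail mirror the definitions in the text, while the encoding of a subdistribution as a dictionary mapping values to probabilities is purely illustrative:

```python
# Sketch of the subdistribution monad, finite supports only.

def dunit(a):
    """Dirac distribution on a: all mass on the single value a."""
    return {a: 1.0}

def dbind(mu, f):
    """dbind mu f = lambda b. sum over a in supp(mu) of mu(a) * f(a)(b)."""
    out = {}
    for a, p in mu.items():
        if p > 0:  # restrict to the support of mu
            for b, q in f(a).items():
                out[b] = out.get(b, 0.0) + p * q
    return out

fail = {}  # the subdistribution with empty support

# Example: sample a fair coin, then return twice the outcome.
coin = {0: 0.5, 1: 0.5}
doubled = dbind(coin, lambda a: dunit(2 * a))
print(doubled)  # {0: 0.5, 2: 0.5}
```

Since subdistributions need not sum to 1, sequencing through fail simply yields fail again, which is how non-termination and aborted runs are absorbed by the semantics.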

Fig. 3. Kernel rules of eHL. The operator ⟨· | ·⟩ is defined such that ⟨true | r⟩ ≜ r and ⟨false | r⟩ ≜ ∞ for any real value r ∈ R+∞, and is extended to pre- and post-expectations in the obvious way. This way, {⟨Φ | f⟩} C {⟨Ψ | g⟩} for instance asserts validity of {f} C {g} under pre-condition Φ, guaranteeing post-condition Ψ.
(a) List with two extra levels of balanced forward pointers.
(b) Searching for the value associated to key k = 3 in a random skip list.
(c) Inserting z with key k at sampled height 5 into (b).

Fig. 5. Several representations of skip lists over elements [1, . . ., 6]. Figure (a) depicts a perfectly balanced skip list. Figure (b) depicts the dictionary {1 ↦ a, . . ., 6 ↦ f} implemented on top of a (random) skip list. The search path for value c with key 3 is indicated as a solid green arrow. Figure (c) is obtained from (b) by inserting an element with key 3.5, with a sampled height of ht = 5. The dotted green arrow indicates the search path followed by insert(3.5, z) to determine the position of the new node; it is identical to the path from (b). The search path array sp is outlined with thick green borders; note that it is given by those nodes on the search path where the search proceeded downwards. The bent thick blue arrows indicate new pointers.
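The search strategy underlying Fig. 5 can be sketched in a few lines of Python (an illustration of the standard algorithm, not the paper's in-place EasyCrypt implementation): each node carries an array of forward pointers, one per level, and the search drops a level whenever the next key would overshoot. The nodes where the search proceeds downwards form the search path array sp used by insertion.

```python
# Illustrative skip-list search; node heights below are fixed by hand
# rather than sampled, to keep the example deterministic.

class Node:
    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height  # forward[i] = next node at level i

def search(head, k):
    """Return (node with key k or None, search path array sp)."""
    sp = []
    node = head
    for level in reversed(range(len(head.forward))):
        while node.forward[level] is not None and node.forward[level].key < k:
            node = node.forward[level]
        sp.append(node)  # the search proceeds downwards here
    node = node.forward[0]
    return (node if node is not None and node.key == k else None), sp

# Build a small skip list over keys 1..6 with hand-picked heights.
head = Node(None, 3)
nodes = {k: Node(k, h) for k, h in [(1, 1), (2, 2), (3, 1), (4, 3), (5, 1), (6, 2)]}
prev = [head] * 3
for k in sorted(nodes):
    n = nodes[k]
    for lvl in range(len(n.forward)):
        prev[lvl].forward[lvl] = n
        prev[lvl] = n

hit, sp = search(head, 3)
print(hit.key)  # 3
```

Insertion would then splice the new node in after each entry of sp up to the sampled height, which is exactly the role of the thick blue arrows in Figure (c).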
Adversaries A refer to arbitrary procedures, granted only partial access to the global memory through a set Write_A ⊆ GVar of writable global variables. In a call to A⟨o⟩, the adversary may modify variables outside Write_A only by invoking the oracle o. To model adversarial code in the semantics, we index the interpretation of program statements by an adversary environment. This environment maps each A ∈ Adv to a declaration of the form o ↦ (proc A(x) C_o; return E), indexed by an oracle o. Note that the code of the adversary is parametric in the oracle. The body C_o may contain oracle calls x ← o(E). Invocation of A⟨o⟩ executes the procedure proc A(x) C_o; return E, where in the body the meta-variable has been substituted by the
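A toy rendering of this adversary model (our illustration, not EasyCrypt's semantics; all names are ours) passes the oracle to the adversary as a parameter, so the adversary's code is literally parametric in the oracle, and direct writes are confined to the writable set:

```python
# Toy adversary/oracle model: A may write only within Write_A directly;
# any other effect on the global memory must go through the oracle.

glob = {"ctr": 0, "secret": 7, "scratch": 0}  # global memory
WRITE_A = {"scratch"}                          # Write_A for adversary A

def oracle(x):
    """The oracle o: the only way A can affect 'ctr'."""
    glob["ctr"] += 1
    return glob["secret"] + x

def adversary(o, arg):
    """A<o>: parametric in the oracle o; writes only within Write_A."""
    glob["scratch"] = o(arg)  # allowed: 'scratch' is in Write_A
    return glob["scratch"]

result = adversary(oracle, 3)
print(result, glob["ctr"])  # 10 1
```

Instantiating the oracle parameter with different procedures then yields the different adversary executions A⟨o⟩, just as the declaration o ↦ (proc A(x) C_o; return E) is instantiated per oracle.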

Fig. 7. Rejection sampling with bad. Variables c, log and bad are global. Here, 0 ≤ Q is a constant, δ ≜ Pr[sample : test] > 0 is the probability of event test on the distribution given by sample, Pr[sample : 1_v] ≤ β is an upper bound on the probability of sampling a value v; Φ ≜ (bad ⇒ Q ≤ c) and f ≜ β · (size(log) + (Q − c)/δ).