Shoggoth: A Formal Foundation for Strategic Rewriting

Rewriting is a versatile and powerful technique used in many domains. Strategic rewriting allows programmers to control the application of rewrite rules by composing individual rewrite rules into complex rewrite strategies. These strategies are semantically complex, as they may be nondeterministic, they may raise errors that trigger backtracking, and they may not terminate. Given such semantic complexity, it is necessary to establish a formal understanding of rewrite strategies and to enable reasoning about them in order to answer questions like: How do we know that a rewrite strategy terminates? How do we know that a rewrite strategy does not fail because we compose two incompatible rewrites? How do we know that a desired property holds after applying a rewrite strategy? In this paper, we introduce Shoggoth: a formal foundation for understanding, analysing and reasoning about strategic rewriting that is capable of answering these questions. We provide a denotational semantics of System S, a core language for strategic rewriting, and prove its equivalence to our big-step operational semantics, which extends existing work by explicitly accounting for divergence. We further define a location-based weakest precondition calculus to enable formal reasoning about rewriting strategies, and we prove this calculus sound with respect to the denotational semantics. We show how this calculus can be used in practice to reason about properties of rewriting strategies, including that they terminate, that they are well-composed, and that desired postconditions hold. The semantics and calculus are formalised in Isabelle/HOL and all proofs are mechanised.

Finally, we show how to use the weakest precondition calculus to reason about rewrite strategies by applying it to various examples, showing termination, that a strategy is well-composed, and that a rewrite strategy satisfies a particular postcondition after its execution. One of our examples is a strategy for βη-normalisation taken from the Elevate project by Hagedorn et al. [2020], demonstrating the applicability of our work to practical scenarios.
In summary, we make the following contributions:

• We design, formalise and mechanise using Isabelle/HOL the semantics of System S, including both denotational and operational models with a full accounting of nondeterminism, errors, and divergence. We prove these two semantics equivalent (Section 3).
• We design, formalise and mechanise using Isabelle/HOL a location-based weakest precondition calculus for System S. We prove its soundness with respect to the denotational semantics (Section 4).
• We demonstrate how to use the weakest precondition calculus to prove practically useful properties of strategic rewriting (Section 5):
  - that a strategy terminates, i.e., that it does not diverge;
  - that a strategy is well-composed, i.e., that there exist input expressions for which the strategy execution will succeed;
  - that a desired property is satisfied after execution of the strategy.

Before stepping into the formalisation of System S, in the next section we present the syntax of System S as well as some example strategies to facilitate the understanding of strategic rewriting.

THE SYNTAX OF SYSTEM S
System S [Visser and Benaissa 1998] is a core calculus providing basic constructs of strategic rewriting, including atomic strategies (rewrite rules) and operators composing strategies and performing expression traversals in an abstract syntax tree (AST). A successful execution of a strategy transforms an expression into some other expression while preserving its semantics. The expressions being rewritten can either be Leafs or nodes. Figure 1 presents the syntax of strategies in System S. We use S to denote the set of all strategies. Variables, atomic strategies, SKIP and ABORT are basic strategies. Basic strategies are not decomposable. An atomic strategy is simply a rewrite rule. For instance, the commutativity of addition add_com and the commutativity of multiplication mult_com are atomic strategies:

add_com : e1 + e2 ⇝ e2 + e1    (Commutativity of addition)
mult_com : e1 * e2 ⇝ e2 * e1    (Commutativity of multiplication)

SKIP can always be executed successfully, while executing ABORT always causes failure. To compose strategies, one can make use of combinators including sequential composition (;), left choice (<+) and nondeterministic choice (<+>). Sequential composition executes two strategies one after the other. Left choice prefers executing the strategy on the left-hand side of the combinator over the strategy on the right-hand side, while nondeterministic choice executes one of the two given strategies nondeterministically. In addition, one, some and all are traversals that navigate within the AST. Intuitively, one(s) applies s to one immediate sub-expression of an input expression, some(s) applies s to as many immediate sub-expressions of an input expression as possible, and all(s) applies s to all immediate sub-expressions of an input expression. Lastly, System S provides a fixed-point operator μ to model recursion.
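As a concrete, entirely illustrative rendering of atomic strategies as rewrite rules, the two commutativity rules can be modelled as partial functions over expressions; the nested-tuple encoding and the use of None for undefinedness are our own assumptions, not part of System S:

```python
# Illustrative model (not the paper's Isabelle encoding): expressions are
# nested tuples ("op", left, right), and an atomic strategy is a partial
# function that returns the rewritten expression, or None when the rule
# does not apply (modelling err).

def add_com(e):
    # add_com : e1 + e2 ~> e2 + e1
    if isinstance(e, tuple) and e[0] == "+":
        return ("+", e[2], e[1])
    return None

def mult_com(e):
    # mult_com : e1 * e2 ~> e2 * e1
    if isinstance(e, tuple) and e[0] == "*":
        return ("*", e[2], e[1])
    return None

print(add_com(("+", 1, 3)))   # ("+", 3, 1)
print(add_com(("*", 1, 3)))   # None: the rule is undefined here, i.e. err
```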
Comparison of the expressiveness to the original System S. One difference between our formalism and the original System S is that we abstract away the term building details for atomic strategies, instead modelling atomic strategies as partial functions. We believe that applying this abstraction does not limit the expressiveness of our system. In fact, the purpose of such a design is to allow flexibility in the term language, not limiting it to that of the original System S, but also capturing other strategic rewriting languages that use term constructs different from those of System S. Moreover, this design enables us to focus on reasoning about properties of compositions of rewriting strategies that hold independently of the term building behaviour.
Composing strategies. We can compose strategies together with these combinators, traversals and the fixed-point operator to define more strategies. For example, we define a strategy try(s) using left choice and SKIP, which attempts to apply a strategy s to an input expression. If an error occurs, it leaves the input expression unchanged by executing the strategy SKIP:

try(s) := s <+ SKIP

With the fixed-point operator and sequential composition, we can then define a strategy repeat(s) which keeps applying a strategy s to an input expression until it is no longer applicable:

repeat(s) := μx. try(s ; x)

With the fixed-point operator, the traversal one(s) and left choice, we can define top-down and bottom-up traversals in an AST:

topDown(s) := μx. (s <+ one(x))        bottomUp(s) := μx. (one(x) <+ s)

We can further compose repeat(s) and topDown(s) to define a strategy normalise(s), which keeps applying a strategy s to all sub-expressions of an input expression until it is no longer applicable:

normalise(s) := repeat(topDown(s))

The normalise strategy is very commonly used to express program transformations. Given beta and eta reductions for λ-expressions, we can use the normalisation strategy normalise(beta <+ eta) for normalising an input λ-expression into its βη-normal form.
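These derived strategies can be sketched executably. The Python model below is deliberately simplified: it is deterministic (it always tries the left branch first and returns a single result or an error), it omits some and all, and it models divergence by actual non-termination. All names and the tuple encoding of expressions are illustrative assumptions:

```python
ERR = None  # an error outcome; divergence shows up as real non-termination

def skip(e):
    return e

def seq(s1, s2):          # s1 ; s2
    def go(e):
        r = s1(e)
        return ERR if r is ERR else s2(r)
    return go

def lchoice(s1, s2):      # s1 <+ s2
    def go(e):
        r = s1(e)
        return s2(e) if r is ERR else r
    return go

def one(s):               # one(s): apply s to one immediate sub-expression
    def go(e):
        if not isinstance(e, tuple):
            return ERR    # a Leaf has no sub-expressions
        op, l, r = e
        nl = s(l)
        if nl is not ERR:
            return (op, nl, r)
        nr = s(r)
        return ERR if nr is ERR else (op, l, nr)
    return go

def try_(s):              # try(s) := s <+ SKIP
    return lchoice(s, skip)

def repeat(s):            # repeat(s) := mu x. try(s ; x)
    def go(e):
        r = s(e)
        return e if r is ERR else go(r)
    return go

def top_down(s):          # topDown(s) := mu x. (s <+ one(x))
    def go(e):
        return lchoice(s, one(go))(e)
    return go

def normalise(s):         # normalise(s) := repeat(topDown(s))
    return repeat(top_down(s))

# An illustrative rule removing additions of zero: 0 + e ~> e.
def add_zero(e):
    if isinstance(e, tuple) and e[0] == "+" and e[1] == 0:
        return e[2]
    return ERR

print(normalise(add_zero)(("+", 0, ("+", 0, "x"))))  # x
```

Note that on this model repeat of an always-succeeding strategy such as skip really does loop forever, matching the possibility of divergence that the fixed-point operator introduces.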
As previously mentioned, the composition of strategies can be invalid, and the executions of strategies are not always successful. For instance, the strategy mult_com ; add_com is not well composed, since it cannot be successfully executed on any input expression. repeat(SKIP) is a strategy that never terminates. Although normalise(beta <+ eta) can certainly be successfully executed on some input expressions, on other inputs it may not terminate. It is important to know that when it does terminate, it will indeed leave the expression in βη-normal form.
To reason about the successful and unsuccessful executions of strategies, we design the location-based weakest precondition calculus, which is discussed in section 4. With this calculus, we are able to detect bad strategies that have no successful executions, like mult_com ; add_com and repeat(SKIP), by concluding that there is no input expression that can be successfully rewritten by such strategies into a desired form. Also, for a good strategy that has successful executions, we are able to distinguish inputs that indeed lead to successful executions of the strategy from inputs that lead to erroneous or diverging executions. Such reasoning power is demonstrated in section 5.
To design the location-based weakest precondition calculus, we need to understand the behaviour of executing these strategies in System S. Therefore, before introducing the calculus and its reasoning power, we first study the formal semantics of System S.

THE SEMANTICS OF SYSTEM S
For a given collection of expressions E, System S defines nondeterministic execution for given strategies that can result in expressions or errors. We extend the original System S by allowing divergence as a possible result of executing a strategy. Therefore, applying a strategy to an expression can result in expressions, an error, or divergence.

The Plotkin Powerdomain
We provide a denotational semantics of System S as an instance of Plotkin's powerdomain construction [Plotkin 1976], which allows us to assign least fixed points as the semantics of the recursion construct. An ω-complete partial order (ω-cpo) is a partially ordered set (D, ⪯) in which each ω-chain (d1 ⪯ d2 ⪯ d3 ⪯ ...) has a least upper bound. A function f : D → D on such a set is continuous if for each ω-chain d1 ⪯ d2 ⪯ d3 ⪯ ... with least upper bound d, one has that f(d) is the least upper bound of the set {f(d1), f(d2), f(d3), ...}. A continuous function is certainly monotone, in the sense that d1 ⪯ d2 implies f(d1) ⪯ f(d2); this follows by considering the ω-chain d1 ⪯ d2 ⪯ d2 ⪯ d2 ⪯ ..., and its least upper bound d2. Now Kleene's fixed-point theorem says that each continuous function on an ω-cpo with a least element has a least fixed point.
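Kleene's theorem is constructive: the least fixed point is the limit of the chain ⊥ ⪯ f(⊥) ⪯ f(f(⊥)) ⪯ .... On a finite domain this chain stabilises, so the least fixed point can be computed by plain iteration. The sketch below, with a domain and function of our own choosing (subsets of a finite set ordered by inclusion, with the empty set as bottom), illustrates this:

```python
# Kleene iteration sketch: iterate a monotone function from the bottom
# element until the chain bottom <= f(bottom) <= f(f(bottom)) ... stabilises.

def lfp(f, bottom):
    x = bottom
    while True:
        y = f(x)
        if y == x:          # chain has stabilised: least fixed point reached
            return x
        x = y

# A monotone function on subsets of {1..5} ordered by inclusion:
# always add 1, and add the successor of every element below 3.
f = lambda s: s | {1} | {n + 1 for n in s if n < 3}

print(lfp(f, frozenset()))  # the chain {} <= {1} <= {1,2} <= {1,2,3} stabilises
```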
Consider a nondeterministic, possibly diverging, algorithm that transforms values into values. If V is the set of values, this algorithm can be modelled as a function f : V → P¬∅(V⊥), where P¬∅(X) is the set of non-empty subsets of X, the non-empty powerset, and V⊥ := V ⊎ {⊥} is the set in which we embed V together with a new element ⊥. The newly added element ⊥ represents the outcome where the algorithm diverges. We equip the set V⊥ with a partial order by defining:

x ⪯ y if and only if x = ⊥ or x = y

This fits with the intuition that ⊥ represents a computation that has not yet terminated, and x ⪯ y holds when y is a later stage of the computation x.
Terminated computations are identified by the values they compute. We compare sets of values using the Egli-Milner ordering:

X ⪯ Y if and only if (∀x ∈ X. ∃y ∈ Y. x ⪯ y) and (∀y ∈ Y. ∃x ∈ X. x ⪯ y)

Lifting a partial order from elements to sets in this fashion always yields a preorder. For a flat domain V⊥, ⪯ is a partial order on P¬∅(V⊥). It is characterised by:

X ⪯ Y if and only if X = Y, or ⊥ ∈ X and X \ {⊥} ⊆ Y

The resulting poset P¬∅(V⊥) is an ω-cpo. Each ω-chain either enters a spine of the porcupine, and thus contains a largest element which is its least upper bound, or ⊥ is a member of all elements in the chain, so that its least upper bound is simply the union of all sets in the chain.
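The flat-domain characterisation of the Egli-Milner ordering can be checked mechanically. The sketch below, with an encoding of our own, implements both the general two-sided definition and the flat characterisation over a small flat domain and confirms by brute force that they agree:

```python
from itertools import combinations

BOT = "div"   # the divergence element of the flat domain V_bot

def leq(x, y):
    # Flat order: bottom is below everything; values compare only to themselves.
    return x == BOT or x == y

def egli_milner_le(X, Y):
    # X <= Y iff every x in X is below some y in Y, and
    # every y in Y is above some x in X.
    lower = all(any(leq(x, y) for y in Y) for x in X)
    upper = all(any(leq(x, y) for x in X) for y in Y)
    return lower and upper

def flat_characterisation(X, Y):
    return X == Y or (BOT in X and X - {BOT} <= Y)

# Brute-force check over all non-empty subsets of a three-element domain.
elems = [BOT, 1, 2]
subsets = [set(c) for r in (1, 2, 3) for c in combinations(elems, r)]
ok = all(egli_milner_le(X, Y) == flat_characterisation(X, Y)
         for X in subsets for Y in subsets)
print(ok)  # True
```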
Aside on the powerdomain construction and the Egli-Milner ordering. To give some further insight into the powerdomain construction and the Egli-Milner ordering, recall the following well-known characterisation. Hennessy and Plotkin [1979, Remark after Lemma 3.5] show that Plotkin's [1976] powerdomain construction extends to all ω-complete partial orders (ω-cpos) by sending each ω-cpo to the free semi-lattice over it. In detail, given an ω-cpo D, we define a free semi-lattice over D as an ω-cpo F(D), together with a Scott-continuous function ι : D → F(D) and a Scott-continuous binary operation ∨ : F(D)² → F(D) that is associative, commutative, and idempotent. A free semi-lattice always exists, but its explicit description may be complicated. Hennessy and Plotkin show that, when the ω-cpo D is ω-algebraic, we can construct the free semi-lattice explicitly by taking F(D) := P¬∅(D), the powerdomain construction with the Egli-Milner ordering, ι(d) := {d} as the embedding of D into this semi-lattice, and subset union as the binary operation. So in a specific and technical sense, the powerdomain is the simplest extension of the ω-cpo with an associative, idempotent and commutative binary operator.
(end of aside) In our mechanised Isabelle/HOL formalisation, we opt to use posets that are complete with respect to all chains, not merely countable or directed ones, without maintaining continuity as an assumption. The stronger assumption on posets allows us to weaken the assumption on functions: we only require monotonicity to ensure the existence of fixed points. This choice was made purely for ease of formalisation, as Isabelle/HOL already includes a library for chain-complete partial orders. While this means that our domain may contain monotone functions that do not correspond to any expressible strategy, and that Hennessy and Plotkin's characterisation does not directly apply, our meta-theoretic results below show how to relate our semantics to the operational semantics, and our reasoning examples show that this semantics suffices to reason about practically interesting examples. We conjecture that our results will easily carry over to a semantics defined with ω-cpos.

Formalised Denotational Semantics
We now present and discuss the denotational semantics for System S, capturing successful and erroneous executions of strategies as well as nondeterminism, divergence and recursion. A strategy is a nondeterministic algorithm/function that rewrites expressions into expressions. This nondeterministic algorithm can sometimes yield an error err instead of an expression, and it might fail to terminate. In the latter case, we say that it yields the value div. Formally, we instantiate Plotkin's powerdomain construction from the previous section by setting V := E ∪ {err} and ⊥ := div, noting that the result is a flat domain. We denote the resulting powerdomain by PD := P¬∅(E ∪ {err, div}), and we define the denotational semantics of System S over the point-wise lifting of the powerdomain, i.e. over the function space D := E → PD. To define the denotational semantics of strategies in a concise manner, we provide semantic combinators and traversals that encapsulate the meaning of syntactic combinators and traversals. Figure 2 illustrates the definitions of the combinators. The definition of sequential composition s1 ; s2 is straightforward: the execution of the composed strategy depends on the result of applying s1 to the input expression e. If applying s1 to e results in an error or divergence, the sequential composition produces an error or divergence, respectively. Otherwise, the result of the sequential composition s1 ; s2 is produced by applying s2 to the expression obtained by the execution of s1. The definition of left choice s1 <+ s2 prioritises the execution of the strategy s1 over s2. The strategy s2 will only be executed if the execution of s1 produces an error. Our treatment of nondeterminism is demonic with respect to divergence and angelic with respect to errors. If the execution of either s1 or s2 diverges, then the nondeterministic choice s1 <+> s2 diverges as well. The nondeterministic choice will only result in an error if both executions of s1 and s2 result in an error. When both s1 and s2 give cause for a successful execution, the choice is nondeterministic. These combinators are sufficient for composing
strategies applied to the root of an AST. System S also provides the traversals one, some and all to apply strategies to sub-expressions. Their semantics are shown in figure 3. The traversal one(s)(e) nondeterministically chooses one immediate sub-expression of e and applies the strategy s to it. The treatment of nondeterminism here is again demonic with respect to divergence and angelic with respect to errors. If applying s to one of the sub-expressions results in divergence, one(s) will diverge. An error will only occur when e has no sub-expression or when applying s to all sub-expressions of e results in an error. The traversal some(s)(e) applies s to as many immediate sub-expressions of e as possible. Its divergence and error cases are the same as those of one(s). The successful execution of all(s) on an input expression e requires the successful application of s to all immediate sub-expressions of e, or e being a Leaf. If applying s to one sub-expression leads to an error or divergence, all(s)(e) yields err or div, respectively. For simplicity of the presentation and illustration, we have restricted ourselves to binary trees in this paper. However, the traversals can easily be generalised to ASTs with wider branching.
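One plausible executable reading of the semantic combinators, with strategies denoting set-valued functions and results drawn from expressions plus err and div, is the following sketch. It mirrors the demonic treatment of div and the angelic treatment of err described above, but the encoding and helper names are our own assumptions:

```python
# Results are expressions, "err", or "div"; a strategy denotes a function
# from an expression to a non-empty set of results (a powerdomain element).

ERR, DIV = "err", "div"

def lift(rule):
    # An atomic strategy as a partial function: undefined input yields err.
    return lambda e: {rule(e)} if rule(e) is not None else {ERR}

def seq(f, g):                       # s1 ; s2
    def h(e):
        out = set()
        for r in f(e):
            out |= {r} if r in (ERR, DIV) else g(r)
        return out
    return h

def lchoice(f, g):                   # s1 <+ s2: run s2 only where s1 can err
    def h(e):
        rs = f(e)
        out = {r for r in rs if r != ERR}
        if ERR in rs:
            out |= g(e)
        return out
    return h

def choice(f, g):                    # s1 <+> s2
    def h(e):
        rs, ss = f(e), g(e)
        out = {r for r in rs | ss if r != ERR}   # successes and div from either
        if ERR in rs and ERR in ss:              # angelic: err only if both may err
            out.add(ERR)
        return out
    return h

inc = lambda e: e + 1 if isinstance(e, int) else None
dbl = lambda e: e * 2 if isinstance(e, int) else None

print(seq(lift(inc), lift(dbl))(3))      # {8}
print(lchoice(lift(inc), lift(dbl))(3))  # {4}
print(choice(lift(inc), lift(dbl))(3))   # {4, 6}
```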
With the semantic combinators and semantic traversals introduced, we provide the denotational semantics for System S shown in figure 4. The semantics of a strategy is modelled as a function that takes in a semantic environment ε, which is a function mapping variables to elements of our domain D.
The semantics of a variable consists of looking up the variable in a given semantic environment. We model an atomic strategy as a partial function, which can successfully rewrite an input expression into an output expression when it is defined for the input expression. When an atomic strategy is not defined for an input expression, applying it to the input expression results in an error. SKIP is a strategy that always rewrites an input expression to itself, while ABORT is a strategy that always produces an err. The denotational semantics of combinators and traversals are straightforwardly defined with the semantic combinators and traversals. Lastly, the semantics of the fixed-point operator is the least fixed point in our domain, where we extend the semantic environment with a mapping from the syntactic fixed-point variable to the fixed point in our domain. We denote this environment extension with the syntax ε[x ↦ d].
The denotational semantics is monotone. Given two environments ε1 and ε2, if the values obtained from looking up the variables in the environments satisfy the ordering ε1(x) ⪯ ε2(x) for every variable x, then the values obtained from evaluating a strategy under these environments also satisfy the ordering ⟦s⟧ε1 ⪯ ⟦s⟧ε2. Formally, we present the monotonicity theorem 3.1:

Theorem 3.1 (Semantics monotonicity theorem). For given environments ε1 and ε2, and strategy s, we have:

(∀x. ε1(x) ⪯ ε2(x)) ⟹ ⟦s⟧ε1 ⪯ ⟦s⟧ε2

We prove this theorem in Isabelle/HOL by structural induction on the strategy s.

Formalised Big-Step Operational Semantics
In this section, we present the formalised big-step operational semantics of System S, with our extension allowing for divergent strategies. Figure 5 depicts the big-step operational semantics for the non-diverging cases of System S. These cases are essentially the same as those of Visser and Benaissa [1998], albeit with the aforementioned simplification to binary trees applied. The semantic rules are given in a straightforward way. On top of these rules for terminating cases, we define the semantics for divergence as the coinductive judgement [Leroy and Grall 2009] satisfying the rules shown in figure 6. Here we use ⟨s, e⟩ →∞ to indicate that the evaluation of an expression e by a strategy s leads to divergence.

The Denotational Semantics is Equivalent to The Big-Step Operational Semantics
In section 3.2 and section 3.3, we have provided two styles of semantics for System S. It is essential to prove that these two semantics are equivalent, since we would like our formal semantics to provide an unambiguous and unique interpretation of strategies in System S. In addition, with the equivalence of the two semantics established, we only need to prove a property of System S against one of them, and it then also holds for the other.
We reason about the equivalence between the denotational semantics and big-step operational semantics via computational soundness and computational adequacy theorems.More specifically, we have a pair of computational soundness and adequacy theorems to relate the non-diverging cases and a pair of computational soundness and adequacy theorems to relate the diverging cases.

Fig. 5. Big-step operational semantics of non-diverging cases
Fig. 6. Big-step operational semantics of diverging cases

Firstly, we show that if an expression e is evaluated to another expression or an error r using the big-step operational semantics of a strategy s♣, this result must also be in the set obtained by executing the denotational semantics of s♣ with the given expression e. Formally, this is described by our first computational soundness theorem 3.2. The subscript ♣ indicates that s♣ is a closed strategy: a strategy with no free variables, i.e. fv(s♣) = ∅.

Theorem 3.2 (Computational soundness theorem one). For a given closed strategy s♣ and any environment ε, we have for an arbitrary expression e and result r:

⟨s♣, e⟩ → r ⟹ r ∈ ⟦s♣⟧ε(e)

We prove this by induction on the derivation of ⟨s♣, e⟩ → r from the rules of Figure 5. As the strategy s♣ is always closed, to instantiate our inductive hypothesis in the cases for the fixed-point operator, we make use of the following substitution lemma 3.3 to semantically relate the syntactic substitution of a closed strategy s♣ for a variable x in s with the strategy s under the environment where x maps to the semantics of s♣:

Lemma 3.3 (Substitution lemma). ⟦s[x ↦ s♣]⟧ε = ⟦s⟧(ε[x ↦ ⟦s♣⟧ε])
This lemma can easily be generalised to allow s♣ to instead be an open strategy, so long as x is not free in s♣; however, our operational semantics only ever substitutes closed strategies, so this generalisation is not necessary for proving our semantic equivalence theorems.
We now prove a computational adequacy theorem, the converse of the computational soundness theorem 3.2. It states that if a non-diverging result is among the results of the denotational semantics for a closed strategy s♣ with an input expression e, then the big-step operational semantics of s♣ with the input expression e will produce the same result.
Theorem 3.4 (Computational adequacy theorem one). For an expression e, a result r, and a closed strategy s♣, we have:

r ∈ ⟦s♣⟧ε(e) ∧ r ≠ div ⟹ ⟨s♣, e⟩ → r

To prove this theorem, we first generalise to open strategies. To do this, we define an approximation relation between a closed strategy and an element of our domain, and state an approximation lemma. Here, we employ, for a simultaneous substitution φ : V → S♣, the notation s[φ] for the application of φ to all free variables in s.

Definition 3.1 (Approximation relation one). Given a closed strategy s♣ and a function f ∈ D, we say s♣ △ f if and only if for any given input expression e, when r is a non-diverging result obtained by applying f to e, r will also be a result of evaluating the big-step operational semantics of s♣ with the input expression e.

Lemma 3.5 (Approximation lemma one). If φ(x) △ ε(x) for every x ∈ fv(s) and s♣ = s[φ], then s♣ △ ⟦s⟧ε.

The proof of this lemma is by induction on the strategy s, and Scott induction is required for the fixed-point cases. From the approximation lemma, we prove the computational adequacy theorem 3.4 by setting s := s♣. As there are no free variables in s♣, the approximation relation trivially implies our goal.
The computational soundness and adequacy theorems presented above state that the denotational semantics and big-step operational semantics are equivalent for the non-diverging cases. Next, we present computational soundness and adequacy theorems for divergent strategies.
The computational soundness theorem for the diverging cases states that, if the evaluation of the big-step operational semantics of a closed strategy s♣ with an input expression e diverges, div must be in the resulting set obtained by executing the denotational semantics of s♣ with the given expression e.
Theorem 3.6 (Computational soundness theorem two). For a given closed strategy s♣, any environment ε, and expression e, we have:

⟨s♣, e⟩ →∞ ⟹ div ∈ ⟦s♣⟧ε(e)

As before, the proof goes via an approximation relation and an accompanying approximation lemma.

Definition 3.2 (Approximation relation two). Given a closed strategy s♣ and a function f ∈ D, we say s♣ △∞ f if and only if for any given input expression e, whenever evaluating the big-step operational semantics of s♣ with the input expression e diverges, the divergence div will be obtained by applying f to e.

Lemma 3.7 (Approximation lemma two). If φ(x) △∞ ε(x) for every x ∈ fv(s) and s♣ = s[φ], then s♣ △∞ ⟦s⟧ε.

The proof of this lemma is (again) by induction on the strategy s, where Scott induction is used for the fixed-point cases. For the cases which involve terminating sub-steps, such as sequential composition or left choice, we make use of our soundness theorem for non-diverging cases, theorem 3.2. We utilise this approximation lemma 3.7 to prove the computational soundness theorem 3.6.
Lastly, we prove the computational adequacy theorem for the diverging cases, which is again the converse of the soundness theorem 3.6. It states that if div is among the results of executing the denotational semantics of a closed strategy s♣ with an input expression e, then evaluating the big-step operational semantics of s♣ with the given expression e leads to divergence:

Theorem 3.8 (Computational adequacy theorem two). div ∈ ⟦s♣⟧ε(e) ⟹ ⟨s♣, e⟩ →∞
We prove this by coinduction over big-step operational semantics for diverging cases while making use of the computational adequacy theorem 3.4 for the non-diverging cases.Just as with our computational soundness proof for non-diverging cases, we work only with closed strategies ♣ , and rely on our substitution lemma 3.3 for the fixed point cases.
With these two pairs of computational soundness and adequacy theorems, we can conclude that the denotational semantics and the big-step operational semantics are equivalent. Formally, we obtain:

Theorem 3.9 (Equivalence between semantics). For a closed strategy s♣, any environment ε, expression e and result r:

r ∈ ⟦s♣⟧ε(e) ⟺ ⟨s♣, e⟩ → r ∨ (r = div ∧ ⟨s♣, e⟩ →∞)
In this section, we have studied two styles of semantics of System S, namely a denotational semantics and a big-step operational semantics.To complete our semantic accounting, it may be worthwhile for us to study its small-step operational semantics in the future.

LOCATION-BASED WEAKEST PRECONDITION CALCULUS
As we have seen, a strategy either successfully rewrites an expression into another expression, generates an error, or fails to terminate.
Naturally, we care mainly about the successful executions of a strategy, in particular when it rewrites an input expression into another expression that satisfies a desired property. In order to formally understand successful and unsuccessful executions of strategies, we design and formalise a location-based weakest precondition calculus. Weakest preconditions were introduced by Dijkstra [1975] as an axiomatic semantics for his guarded command language. Different from other weakest precondition calculi, we introduce the notion of a location in an AST as a parameter in our calculus for reasoning about traversals, which is discussed in section 4.1. Before presenting the formal definition of the calculus, we recapitulate the definition of a weakest precondition.

Definition 4.1 (Weakest precondition). Given a program C and a postcondition Q, the weakest precondition wp(C, Q) is an assertion such that for any precondition P:

{P} C {Q} if and only if P ⟹ wp(C, Q)

Here {P} C {Q} is a Hoare triple stating that C will successfully terminate in a state satisfying the assertion Q if the state before executing C satisfies P. Intuitively, the weakest precondition of C under Q characterises all those states that lead to successful termination in a state satisfying Q when executing C. In Dijkstra's [1975] calculus, a function wp is defined which, given a program and an assertion as a postcondition, computes the weakest precondition inductively on the program structure. Bonsangue and Kok [1992] extend Dijkstra's calculus to assign weakest preconditions to a fixed-point operator by additionally including a logic environment as an input to the wp function, which associates a predicate transformer with each variable. As we also have a fixed-point operator for general recursion, we do the same in this formalisation.
When dealing with strategies, assertions take the form of sets of expressions, and a state is an expression we are rewriting. Thus, the weakest precondition is a set of input expressions for a strategy to be applied to, such that the application of the strategy will lead to another expression. That means the strategy will neither yield an error nor diverge. Moreover, the weakest precondition has to guarantee that an expression of the postcondition is reached.

Definition 4.2 (Weakest must succeed precondition). A weakest must succeed precondition takes the form ω ⊩ s@l(P). This is the set of those expressions that, by applying the strategy s at location l under the logic environment ω, will be successfully transformed into expressions satisfying P.
To calculate this set of input expressions constituting the weakest must succeed precondition, we also introduce the following auxiliary function. In fact, ω ⊩ s@l(P) and ω ↑⊩ s@l(P) will be defined by mutual induction.
Definition 4.3 (Weakest may error precondition). A weakest may error precondition takes the form ω ↑⊩ s@l(P). This is the set of those expressions that, by applying the strategy s at location l under the logic environment ω, will be successfully transformed into expressions satisfying P, or result in an error.

Modelling Traversals
In definitions 4.2 and 4.3, we introduce the location l for specifying the particular sub-expression to which the strategy should be applied. This allows us to express that after applying a strategy to the sub-expression at the location l of an input expression e, the input expression should be transformed into an expression that satisfies the postcondition P. Consequently, the weakest precondition for traversals such as one(s), some(s), and all(s) can be defined inductively in terms of the weakest precondition of s, just at different locations. Kieburtz [2001] proposes an alternative approach, using modal logic for assertions about traversals. However, it is unclear how this technique could be used to define a complete calculus. We discuss this in section 6.
A location is essentially a path into the abstract syntax tree. Such a path consists of a sequence of positions, for our binary trees either left or right. Positions are prepended to a location with ⊳ and appended with ⊲. For instance, in an AST representing the arithmetic expression 1 + 3, the whole expression sits at the empty location, the sub-expression 1 at the position left, and the sub-expression 3 at the position right. With locations being introduced in the assertions, accompanied by the two helper functions lookup and update discussed in the next section, we can model the execution of a strategy at a given location in the input expression, which enables the assignment of weakest preconditions inductively for traversals just as for other operators.

The Calculus
We now introduce the location-based weakest precondition calculus for System S in its full formal detail.We first provide definitions of helper functions and essential notations for the formalisation.
To connect locations and expressions, we introduce two partial functions lookup and update, shown in figure 7. Given a location l and an expression e, the partial function lookup returns the sub-expression of e located at l. The function is partial, as it is only defined when the location actually exists in the expression e. The partial function update takes a set of expressions xs, and updates an expression e at the location l with each expression in xs, resulting in a set of expressions where each element is obtained by replacing the sub-expression of e at l with an element of xs, with appropriate handling of errors and divergence.
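As a sketch of the two helpers, with an encoding of our own (locations as tuples of "left"/"right" positions, expressions as nested tuples, and None standing in for undefinedness):

```python
# Locations are tuples of positions ("left" / "right"); expressions are
# nested tuples ("op", left, right), with anything else treated as a leaf.
# Both functions return None where they are undefined.

def lookup(loc, e):
    if not loc:
        return e
    if not isinstance(e, tuple):
        return None                    # the path descends below a leaf
    pos, rest = loc[0], loc[1:]
    return lookup(rest, e[1] if pos == "left" else e[2])

def update(loc, e, xs):
    # Replace the sub-expression at loc with each element of xs, giving
    # the set of all updated expressions.
    if not loc:
        return set(xs)
    if not isinstance(e, tuple):
        return None
    pos, rest = loc[0], loc[1:]
    if pos == "left":
        subs = update(rest, e[1], xs)
        return None if subs is None else {(e[0], s, e[2]) for s in subs}
    subs = update(rest, e[2], xs)
    return None if subs is None else {(e[0], e[1], s) for s in subs}

e = ("+", 1, ("*", 2, 3))
print(lookup(("right", "left"), e))        # 2
print(update(("right", "left"), e, {9}))   # {('+', 1, ('*', 9, 3))}
```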
Figure 8 shows the essential notations for defining the weakest precondition calculus. Since we again have fixed-point operators in the weakest precondition calculus, we need to ensure that least fixed points exist, by operating in a domain which is again a cpo, and we show that our wp function is monotone with respect to that domain. The ordering of our domain is a point-wise lifted set ordering, whose bottom element is the empty set.
Similar to the semantic environment introduced for the denotational semantics in figure 4, the logic environment contains mappings of (fixed point) variables to an element of our logic domain (which is a function).Since we mutually define weakest must succeed preconditions and weakest may error preconditions, a fixed-point variable can map to two different functions.We use the tags • (must succeed) and ↑ (may error) to distinguish these two different mappings.
With these notations and helper partial functions, we provide the location-based weakest precondition calculus. For presentation purposes, we simplify our definitions by only considering the cases where the location actually exists in the expression; in our Isabelle/HOL formalisation, we make this explicit in the definition of wp and wp↑. Figure 9 shows the weakest preconditions for the basic strategies: SKIP, ABORT and atomic strategies. Trivially, the weakest must succeed precondition and the weakest may error precondition for SKIP are the same, namely the given postcondition P, since the execution of SKIP never results in error or divergence, nor changes the input expression. As for ABORT, since it always results in an error no matter what input expression is given, its weakest must succeed precondition is the empty set and its weakest may error precondition is the set of all expressions. The weakest preconditions of atomic strategies are defined using their denotational semantics (cf. figure 4): the weakest must succeed precondition is the set of input expressions for which applying the atomic strategy to the sub-expression at the given location results in a (singleton) set of expressions that is a subset of the given postcondition P. The weakest may error precondition is defined in a similar manner; the only difference is that the resulting set of expressions should be a subset of P ∪ {err}. It does not matter which semantic environment is given here when we invoke the semantics, so we just use the environment which maps all variables to {div}, denoted by ∅. Recall that ⋔ is the infix notation for lookup introduced in figure 7, and update likewise has its own infix notation. Figure 10 shows the weakest preconditions for combinators: sequential composition, left choice and nondeterministic choice. Intuitively, the weakest must succeed precondition of the sequential composition s1 ; s2 simply composes the weakest must succeed preconditions of s1 and s2,
where the postcondition given to s1 is the weakest must succeed precondition of s2 with respect to P:

⊩ s1 ; s2 @ (P) = ⊩ s1 @ (⊩ s2 @ (P))

The same approach is taken for defining the weakest may error precondition. The weakest must succeed precondition of the left choice s1 <+ s2 is the union of the set of expressions that can be successfully rewritten by the strategy s1 and the set of expressions for which applying s1 may result in an error but which can be successfully rewritten by the strategy s2. Its weakest may error precondition additionally includes the set of expressions for which applying the strategy s2 may result in an error. The definitions of the weakest preconditions of the nondeterministic choice s1 <+> s2 capture the angelic nondeterminism for err and the demonic nondeterminism for div. Its weakest must succeed precondition is the set of expressions on which applying neither s1 nor s2 diverges and which can be successfully rewritten by at least one of s1 and s2. The weakest may error precondition relaxes this last requirement by including the set of expressions on which applying both s1 and s2 may result in an error.
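The sequential-composition law wp(s1 ; s2, P) = wp(s1, wp(s2, P)) can be tested directly in the same outcome-set style; the numeric universe and the atomic rewrites inc and dbl are hypothetical stand-ins for atomic strategies.

```python
# Checking wp(s1 ; s2, P) = wp(s1, wp(s2, P)) on a toy universe.
ERR = "err"
E = {0, 1, 2, 3}

def wp_must(s, post):
    return {e for e in E if s(e) <= post}

def seq(s1, s2):
    # s1 ; s2: feed every successful outcome of s1 into s2; err propagates
    def s(e):
        out = set()
        for r in s1(e):
            out |= {ERR} if r == ERR else s2(r)
        return out
    return s

# hypothetical atomic rewrites that err when they would leave the universe
def inc(e): return {e + 1} if e + 1 in E else {ERR}
def dbl(e): return {2 * e} if 2 * e in E else {ERR}

P = {2, 3}
assert wp_must(seq(inc, dbl), P) == wp_must(inc, wp_must(dbl, P)) == {0}
```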
Location is very important for defining the weakest preconditions of traversals. As demonstrated in figure 11, the approach to defining the weakest preconditions for one(s) is again similar to nondeterministic choice, as one(s) nondeterministically chooses either the left or the right child of the current expression to apply the strategy s to. Its weakest must succeed precondition is a set of expressions that are not leaf nodes: for each of them, applying s to either child must not diverge, and at least one of the children must be successfully rewritten by s. The weakest may error precondition of one(s) includes all leaf expressions, as well as expressions for both of whose children applying s may result in an error. The weakest must succeed precondition of some(s) is a set of expressions that are not leaf nodes: for each of them, if the given strategy s can be applied successfully to both children, the result of applying s to both of them, regardless of the order of application, must satisfy the postcondition P; alternatively, applying s to one of the children may result in an error, but not to both. Again, expressions with children on which applying s diverges are excluded from the weakest must succeed precondition. As for one(s), the weakest may error precondition of some(s) includes all leaf expressions and expressions for both of whose children applying s may result in an error. Since all(s) requires the strategy s to be applied either to a leaf expression or to both children of a non-leaf expression, its weakest must succeed precondition intuitively consists of leaf expressions, together with expressions both of whose children can be successfully rewritten by s regardless of the order of application. Its weakest may error precondition again includes all leaf expressions and expressions with children on which applying s may result in an error.
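The intuition behind one(s) and all(s) can be made concrete with a small executable sketch over binary trees. The Leaf/Node classes and the child-wise outcome-set semantics below are our own illustrative encoding, not the paper's formalisation: one(s) errs exactly when s errs on both children (and always on leaves), while all(s) behaves as the identity on leaves.

```python
# Illustrative outcome-set semantics for the traversals one(s) and all(s)
# on binary trees; this encoding is an assumption of this sketch.
ERR = "err"

class Leaf:
    def __init__(self, v): self.v = v
    def __eq__(self, o): return isinstance(o, Leaf) and self.v == o.v
    def __hash__(self): return hash(("L", self.v))

class Node:
    def __init__(self, l, r): self.l, self.r = l, r
    def __eq__(self, o): return isinstance(o, Node) and (self.l, self.r) == (o.l, o.r)
    def __hash__(self): return hash(("N", self.l, self.r))

def one(s):
    # apply s to exactly one child; err only when s errs on both children
    def go(e):
        if isinstance(e, Leaf):
            return {ERR}                      # a leaf has no child to rewrite
        lres, rres = s(e.l), s(e.r)
        out = {Node(r, e.r) for r in lres if r != ERR}
        out |= {Node(e.l, r) for r in rres if r != ERR}
        if ERR in lres and ERR in rres:
            out.add(ERR)
        return out
    return go

def all_(s):
    # apply s to both children; identity on leaves
    def go(e):
        if isinstance(e, Leaf):
            return {e}
        lres, rres = s(e.l), s(e.r)
        out = {Node(a, b) for a in lres for b in rres if ERR not in (a, b)}
        if ERR in lres or ERR in rres:
            out.add(ERR)
        return out
    return go

def inc_leaf(e):
    # hypothetical atomic strategy: rewrite a leaf value, fail on nodes
    return {Leaf(e.v + 1)} if isinstance(e, Leaf) else {ERR}

t = Node(Leaf(1), Leaf(2))
assert one(inc_leaf)(t) == {Node(Leaf(2), Leaf(2)), Node(Leaf(1), Leaf(3))}
assert all_(inc_leaf)(t) == {Node(Leaf(2), Leaf(3))}
assert one(inc_leaf)(Leaf(1)) == {ERR}       # one(s) always errs on leaves
assert all_(inc_leaf)(Leaf(1)) == {Leaf(1)}  # all(s) keeps leaves unchanged
```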
Lastly, we introduce the weakest preconditions for the fixed-point operator, shown in figure 12, which are defined using simultaneous induction. Δ contains a pair of simultaneously defined least fixed points, used to define the weakest must succeed precondition and the weakest may error precondition respectively. In these fixed-point equations, we extend the logic environment with mappings from the fixed-point variable, tagged with • and ↑, to the respective least fixed points.
(Fixed-point variable) The weakest must succeed precondition of a (fixed-point) variable X is calculated by applying the function obtained by looking up (X, •) in the logic environment to the location and the postcondition P. For the weakest may error precondition, the function applied to the location and P is instead obtained by looking up X with the may error tag ↑.

4.3 The Soundness of the Weakest Precondition Calculus w.r.t. the Formal Semantics
Since our weakest precondition calculus is designed for reasoning about the execution of strategies, it is essential to prove that it is sound with respect to the formal semantics introduced in section 3. Specifically, we define the soundness of the weakest must succeed precondition as theorem 4.1, and the soundness of the weakest may error precondition as theorem 4.2. Both theorems share the same assumption, which relates the logic environment and the semantic environment. This assumption states that, for any variable X, location and postcondition P, the function obtained by looking up X in the logic environment (with the must succeed tag or the may error tag, correspondingly) gives the set of expressions at the given location of each of which executing the semantics of the variable results in a subset of the postcondition P, or of P ∪ {err}, respectively. From this assumption, theorem 4.1 concludes that the weakest must succeed precondition ⊩ s @ (P) equals the set of expressions on which executing the semantics of s gives a subset of P. Similarly, theorem 4.2 states that, under the same assumption, the weakest may error precondition ⊩↑ s @ (P) equals the set of expressions on which executing the semantics of s gives a subset of P ∪ {err}.
We prove these two theorems simultaneously, by induction on the strategy s. For the fixed-point operator cases, we make use of Scott induction. The proof is mechanised in Isabelle/HOL.

5 REASONING ABOUT STRATEGIES WITH WEAKEST PRECONDITION CALCULUS
As discussed in section 2, some strategies can never be executed successfully, such as strategies that always diverge, like repeat(SKIP), and strategies that are not well composed, like mult_com ; add_com. We call such strategies bad strategies. Formally, we define good and bad strategies in terms of our weakest precondition calculus in definition 5.1 and definition 5.2, where the formal definition of a bad strategy is the negation of a good one.
Definition 5.1 (Good strategies). A strategy s is good iff for a given postcondition P: ⊩ s @ (P) ≠ ∅

Definition 5.2 (Bad strategies). A strategy s is bad iff for a given postcondition P: ⊩ s @ (P) = ∅

Even strategies that can terminate and are well composed may not be able to successfully rewrite every input expression into an expression satisfying our desired postcondition. For instance, even though the atomic strategy add_com is a good strategy, applying it to 3 * 4 results in an error. Also, as illustrated in section 2, when encoding a normalisation strategy for rewriting an input lambda expression into its βη-normal form, such a strategy can diverge on some input expressions (e.g., the expression Ω given below). If it does terminate on an input expression, it ought to rewrite all reducible sub-expressions of that expression. We formally define the successful and unsuccessful executions of good strategies in definition 5.3 and definition 5.4.

Definition 5.3 (Successful execution). An execution of a good strategy s on an input expression e is successful iff for a given postcondition P: e ∈ ⊩ s @ (P)

Definition 5.4 (Unsuccessful execution). An execution of a good strategy s on an input expression e is unsuccessful iff for a given postcondition P: e ∉ ⊩ s @ (P)

Next, we demonstrate how to use the location-based weakest precondition calculus to reason about the execution of strategies. All examples we discuss are mechanised in Isabelle/HOL.
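Definitions 5.1-5.4 amount to simple set-emptiness and set-membership checks once strategies are modelled as outcome-set functions. The following toy checker over a finite integer universe is our own illustrative encoding, not the paper's calculus.

```python
# Toy checks for definitions 5.1-5.4: a strategy is good when its
# must-succeed precondition is non-empty, and an execution on e is
# successful when e lies in that precondition.
ERR = "err"
E = {0, 1, 2}

def wp_must(s, post):
    # must-succeed precondition over the finite universe E
    return {e for e in E if s(e) <= post}

def is_good(s, post):
    return wp_must(s, post) != set()

def successful(s, e, post):
    return e in wp_must(s, post)

def always_err(e): return {ERR}     # e.g. an ill-composed pipeline
def ident(e):      return {e}       # SKIP

assert not is_good(always_err, E)   # bad: empty precondition even for post = E
assert is_good(ident, E)            # good
assert successful(ident, 1, E)      # SKIP succeeds on any input
assert not successful(always_err, 1, E)
```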

5.1 Reasoning About Termination
Strategies can diverge. Recall from section 2 that repeat(s) is defined as μx. try(s ; x), where try(s) is defined as s <+ SKIP. We can derive the weakest precondition formula of repeat(s) from the weakest precondition formulae of SKIP, left choice, sequential composition and the fixed-point operator, where Δ is the corresponding fixed-point equation. Although the execution of repeat(s) never results in an error, since its weakest may error precondition formula is identical to its weakest must succeed precondition formula, it may diverge.
A simple example of a diverging strategy we have introduced is repeat(SKIP). Using the weakest precondition calculus, it is straightforward to conclude that it is a bad strategy. With the weakest must succeed precondition formulae of repeat(s) and SKIP, we calculate that, taking the set of all expressions E as the postcondition, the weakest must succeed precondition of repeat(SKIP) is the empty set: ⊩ repeat(SKIP) @ (E) = ∅. Intuitively, this result indicates that no expression can be successfully rewritten by the strategy repeat(SKIP). According to definition 5.2, we conclude that the diverging strategy repeat(SKIP) is a bad strategy.
Since we apply demonic nondeterminism to divergence, as discussed in section 4, the strategy SKIP <+> repeat(SKIP) always diverges. To show that it is a bad strategy, we again calculate its weakest must succeed precondition with the set of all expressions as the postcondition: ⊩ SKIP <+> repeat(SKIP) @ (E) = ∅. Again, we obtain the empty set as its weakest must succeed precondition, indicating that this strategy can never be successfully executed on any input expression.
Strategies that can terminate are potentially good strategies. For instance, the strategy SKIP <+ repeat(SKIP) always terminates. To conclude that it is a good strategy, we calculate its weakest must succeed precondition: ⊩ SKIP <+ repeat(SKIP) @ (E) = E. Intuitively, because left choice prioritises the strategy on the left-hand side of the combinator over the strategy on the right-hand side, SKIP is always preferred over repeat(SKIP) here. Therefore, SKIP <+ repeat(SKIP) always terminates and produces expressions. According to definition 5.1, we conclude that the terminating strategy SKIP <+ repeat(SKIP) is a good strategy.
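The three termination examples above can be replayed with a fuel-bounded interpreter: running out of fuel approximates divergence with an explicit token. The encoding of the combinators as outcome-set functions is our own simplification, not the paper's semantics.

```python
# Fuel-bounded sketch of repeat(s) = mu x. try(s ; x), try(s) = s <+ SKIP;
# exhausting the fuel yields the divergence token "div".
ERR, DIV = "err", "div"

def skip(e):
    return {e}

def repeat(s, fuel=32):
    def go(e, n):
        if n == 0:
            return {DIV}                  # cannot finish: (approximate) divergence
        out = set()
        for r in s(e):
            if r == ERR:
                out.add(e)                # try's SKIP branch: keep e unchanged
            elif r == DIV:
                out.add(DIV)
            else:
                out |= go(r, n - 1)       # keep rewriting the result
        return out
    return lambda e: go(e, fuel)

def lchoice(s1, s2):
    # s1 <+ s2: fall back to s2 only when s1 errs
    def s(e):
        r1 = s1(e)
        out = {r for r in r1 if r != ERR}
        if ERR in r1:
            out |= s2(e)
        return out
    return s

def ndchoice(s1, s2):
    # s1 <+> s2: err only when both err (angelic for err); any possible
    # divergence remains among the outcomes (demonic for div)
    def s(e):
        r1, r2 = s1(e), s2(e)
        out = (r1 | r2) - {ERR}
        if ERR in r1 and ERR in r2:
            out.add(ERR)
        return out
    return s

assert repeat(skip)(0) == {DIV}                   # repeat(SKIP) diverges
assert DIV in ndchoice(skip, repeat(skip))(0)     # SKIP <+> repeat(SKIP) may diverge
assert lchoice(skip, repeat(skip))(0) == {0}      # SKIP <+ repeat(SKIP) terminates
```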

5.2 Reasoning About Well-Composed Strategies
Strategies that terminate may still not be good strategies, since they may not be well composed and thus always result in an error. An example of a strategy that is not well composed, introduced in section 2, is mult_com ; add_com. According to the weakest precondition formulae for atomic strategies and sequential composition presented in figure 9 and figure 10, we calculate its weakest must succeed precondition with the set of all expressions as the postcondition: ⊩ mult_com ; add_com @ (E) = ∅. Since its weakest must succeed precondition is the empty set, by definition 5.2 we conclude that the strategy mult_com ; add_com is a bad strategy.
Well-composed terminating strategies are good strategies. For example, given an atomic strategy add_id that eliminates an addition with zero, the strategy add_com ; add_id is well composed. In practice, it can successfully rewrite the expression 3 + 0 into the expression 3. We can conclude that the strategy add_com ; add_id is a good strategy, again by checking its weakest must succeed precondition for the set of all expressions as the postcondition. Since the calculated weakest must succeed precondition is not empty, according to definition 5.1 the strategy add_com ; add_id is a good strategy.
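The two compositions can be checked concretely on tuple-encoded arithmetic expressions. The encoding, and in particular the assumed orientation of add_id (rewriting 0 + a to a, so that it applies after add_com has commuted 3 + 0), are conventions of this sketch rather than the paper's definitions.

```python
# Tuple-encoded expressions: ("+", a, b), ("*", a, b) or plain integers.
ERR = "err"

def add_com(e):
    # a + b ~> b + a
    return {("+", e[2], e[1])} if isinstance(e, tuple) and e[0] == "+" else {ERR}

def mult_com(e):
    # a * b ~> b * a
    return {("*", e[2], e[1])} if isinstance(e, tuple) and e[0] == "*" else {ERR}

def add_id(e):
    # assumed orientation: 0 + a ~> a
    return {e[2]} if isinstance(e, tuple) and e[0] == "+" and e[1] == 0 else {ERR}

def seq(s1, s2):
    def s(e):
        out = set()
        for r in s1(e):
            out |= {ERR} if r == ERR else s2(r)
        return out
    return s

# mult_com ; add_com is not well composed: after mult_com the root is still
# a multiplication, so add_com errs on every input that mult_com accepts.
assert seq(mult_com, add_com)(("*", 3, 4)) == {ERR}
assert seq(mult_com, add_com)(("+", 3, 4)) == {ERR}   # mult_com errs outright

# add_com ; add_id is well composed: 3 + 0 ~> 0 + 3 ~> 3
assert seq(add_com, add_id)(("+", 3, 0)) == {3}
```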

5.3 Reasoning About Beta-Eta Normalisation
In section 2, we defined the normalise strategy by composing the strategy repeat(s) with the top-down traversal topDown(s), as normalise(s) = repeat(topDown(s)), which keeps applying a given strategy s to every possible sub-expression of an expression until s is no longer applicable.
One example usage of the normalisation strategy we demonstrated is reducing an expression of the λ-calculus to its βη-normal form. Given β-reduction and η-reduction as two atomic strategies beta and eta, we can express the strategy for computing the βη-normal form as the strategy BENF. Furthermore, we define a predicate asserting that an expression is in βη-normal form, simply by stating that the beta and eta atomic strategies must not be defined at any location that exists in the expression (i.e., at which the lookup ⋔ is defined). It is well known that not every λ-expression has such a normal form. With our location-based weakest precondition calculus, we are able to reason about whether an expression can be normalised by the strategy BENF into a βη-normal form. First, in figure 13, we provide an encoding of the λ-calculus with de Bruijn indices using the expression tree structure we introduced, which takes the form of either a Leaf or a Node.
Specifically, we encode an Id expression (a de Bruijn index) as a Leaf, and both an abstraction and an application as Nodes. Then we encode β-reduction and η-reduction as the two atomic strategies beta and eta, where b[a/0] denotes the de Bruijn substitution of the index 0 with the expression a in b, and ⫰0 b is the de Bruijn down-shifting eliminating the index 0 in b. Next we introduce the weakest precondition formula for the strategy normalise(s), which is calculated using the weakest precondition formulae of repeat(s) (introduced in section 5.1) and topDown(s). Recall that in section 2 the strategy topDown(s) is defined using the left choice combinator, the traversal one(s) and the fixed-point operator. We can derive its weakest must succeed precondition and weakest may error precondition formulae, and with those defined, we can subsequently provide the weakest precondition formula for the strategy normalise(s). Note that its weakest must succeed precondition and weakest may error precondition share the same formula, just like repeat(s). With the weakest precondition formula for normalise(s), we can first conclude that the strategy BENF for computing the βη-normal form of expressions is a good strategy. Although the strategy BENF is good, some expressions cannot be rewritten by it to a βη-normal form. For instance, consider the expression Ω, the standard diverging term (λx. x x) (λx. x x). Applying the strategy BENF to Ω diverges; that is, the execution of the strategy BENF on Ω is unsuccessful. We draw this conclusion by showing that Ω is not in the weakest must succeed precondition of BENF, no matter what the postcondition is. We prove this proposition straightforwardly using Scott induction.
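The beta rule and the divergence of Ω can be made concrete in a minimal de Bruijn sketch. The Var/Lam/App tuple encoding and the helper names shift and subst are our own illustrative stand-ins for the paper's Leaf/Node encoding and the ⫰ and [·/0] operators.

```python
# Minimal de Bruijn beta step: (lam b) a ~> shift-down of b[shift-up(a)/0].
def shift(e, d, cutoff=0):
    # add d to every free index >= cutoff
    tag = e[0]
    if tag == "var":
        return ("var", e[1] + d) if e[1] >= cutoff else e
    if tag == "lam":
        return ("lam", shift(e[1], d, cutoff + 1))
    return ("app", shift(e[1], d, cutoff), shift(e[2], d, cutoff))

def subst(e, j, v):
    # e with index j replaced by v
    tag = e[0]
    if tag == "var":
        return v if e[1] == j else e
    if tag == "lam":
        return ("lam", subst(e[1], j + 1, shift(v, 1)))
    return ("app", subst(e[1], j, v), subst(e[2], j, v))

def beta(e):
    # the atomic beta strategy: defined only on a redex, err otherwise
    if e[0] == "app" and e[1][0] == "lam":
        body, arg = e[1][1], e[2]
        return {shift(subst(body, 0, shift(arg, 1)), -1)}
    return {"err"}

identity = ("lam", ("var", 0))
assert beta(("app", identity, ("var", 3))) == {("var", 3)}

# Omega reduces to itself, so repeatedly applying beta (as BENF would) diverges
omega = ("lam", ("app", ("var", 0), ("var", 0)))
Omega = ("app", omega, omega)
assert beta(Omega) == {Omega}
```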
Besides identifying expressions that fail to be normalised into a βη-normal form using BENF, we are also interested in examining whether a complex expression is indeed rewritten into a βη-normal form after applying the strategy BENF. For instance, given a concrete expression e, we show that applying the strategy BENF to e does rewrite it into a βη-normal form. The proof of this proposition is also straightforward, merely requiring the repeated unfolding of fixed-point operators. On the basis of this result, we can conclude that the strategy BENF performs the rewrite on the input expression as expected, namely, rewriting e into its βη-normal form.

5.4 Discussion
As this section demonstrates, our formal calculus provides a precise description of strategies, independent of their length and complexity. It also provides a good characterisation of the desired properties to be satisfied after the execution of a strategy, as well as of the expressions that can be successfully rewritten. Additionally, our calculus is capable of performing non-trivial reasoning about rewrite strategies. Specifically, the reasoning about beta-eta normalisation already features strategy combinators, traversals and recursion: the fundamental ingredients of strategic rewriting. As our framework is fully mechanised in Isabelle/HOL, reasoning can be performed directly in, and facilitated by, the proof assistant. Therefore, it is conceivable, still with significant effort, to use our framework for reasoning about complex applications, including Elevate [Hagedorn et al. 2020] compiler optimisations. A significant initial hurdle is to encode the language that is rewritten (e.g. the lambda calculus in section 5.3), as well as application-specific rewrites, in Isabelle/HOL, before we can start reasoning about the behaviour of more complex rewrite strategies. With our formal calculus and its Isabelle/HOL implementation, it would be possible to build up a library of standard languages and rewrites to facilitate reasoning about increasingly complex practical applications.

6 RELATED WORK
Strategic Rewriting and Traversals. Term rewriting systems [Dershowitz 1985] are a powerful and versatile method for expressing syntactic transformations. Strategic rewriting languages, which give programmers control over the rewriting process, have seen applications in many areas. Initial efforts, such as the language ELAN [Borovanský et al. 1996], focused on using rewriting as a way to model deduction and computation. The previously mentioned Stratego [Bravenboer et al. 2008; Visser 2001; Visser et al. 1998], which uses System S as its core language, is designed for developing language interpreters in the Spoofax Language Workbench [Wachsmuth et al. 2014]. Elevate [Hagedorn et al. 2020, 2023] is very much in the style of Stratego, but is instead targeted towards guiding optimisations in a compiler for high-performance computing. The language TL [Winter and Subramaniam 2004] applies strategic rewriting to data processing tasks, and Strafunski [Lämmel and Visser 2002], which is again a Stratego-like language, uses strategies for datatype-generic programming. Traversals are an essential feature of System S that also appear in other program transformation designs, such as the 'Scrap your boilerplate' (SYB) style traversals (e.g. everywhere, everything, anyDescendant, anyAncestor) for XML programming [Lämmel 2007]. Reachability constraints are added to the types of these traversals for detecting queries that result in an empty set and transformations that always fail or do not change anything. To analyse strategic programs, some algebraic laws are discussed by Cunha and Visser [2007] for equational reasoning, and by Lämmel et al. [2013] as hints of potential dead code. One could potentially make use of our weakest precondition calculus to prove and generalise these laws.
Weakest Preconditions. Weakest preconditions were introduced by Dijkstra [Dijkstra 1975]. Bonsangue and Kok [1992] extend Dijkstra's calculus to include recursion in the same way that we do. Weakest preconditions are key to Cook's proof [Cook 1978] of the relative completeness of Floyd-Hoare Logic [Floyd 1967; Hoare 1969], and are similarly used by Goncharov and Schröder [2013] to show relative completeness of their Hoare Logic for programs with monadic effects. Morgan [1994] uses weakest preconditions as the semantic foundation for his refinement calculus, enabling stepwise derivation of programs from their specifications. In recent work, Aguirre et al. [2022] explore the categorical structure of compositional weakest preconditions, characterising them as those that are obtained from the Cartesian lifting of some monad. As a related application of weakest preconditions, Swierstra and Baanen [2019] provide a predicate transformer semantics for effectful programs, accounting for exceptions, state, non-determinism and general recursion. Their work could possibly be an alternative approach to achieving some of the goals of our work, although the application of such a formalism to the form of rewriting found in formalisms like System S is not immediate. For example, it is unclear whether System S with its handling of errors and non-termination would actually form a monad. Errors alone can, of course, be handled by the Error monad; the interaction between divergence and errors is more sophisticated. As a consequence, this may give rise to complications of a similar order of magnitude as the ones addressed in this paper.
Existing Formalisation and Verification. We are not the first to examine strategic rewriting languages formally. Both the initial paper on Stratego [Visser et al. 1998] and the paper on System S [Visser and Benaissa 1998] present big-step operational semantics. However, these semantics do not model divergence, and are not the basis for any formal claims. In this work, by contrast, we model all possible outcomes including divergence denotationally, and we show the denotational model equivalent to an extended big-step operational semantics of System S that includes divergence, by establishing computational soundness and adequacy with respect to the extended big-step operational semantics. Kaiser and Lämmel [2009] formalise a subset of System S without divergence in Isabelle/HOL by shallow embedding, but this formalisation does not include the general fixed-point operator of System S, and the choice of a shallow embedding, while convenient for some tasks, precludes the formalisation of general, meta-theoretic properties about all strategies. In our formalisation, we opt for a deep embedding, enabling us to mechanise all of the definitions and proofs in this paper. Focusing on traversals in strategic languages, Lämmel et al.
[2013] characterise a list of strategic programming errors and discuss ways to avoid these errors with static typing and static analysis. With a different approach, we provide a general and formal characterisation of "good" and "bad" strategies, as well as of successful and unsuccessful executions of strategies, using our location-based weakest precondition calculus. Kieburtz [2001], an inspiration for this work, informally sketches some weakest precondition rules for Stratego. Rather than a location-based weakest precondition calculus such as ours, Kieburtz [2001] includes assertions in modal logic (specifically a combination of CTL and the modal μ-calculus), where the various tree modalities allow movement to different sub-expressions. However, this modal logic variant does not have the expressive power of our framework, because of our choice of location language. For instance, CTL is not expressive enough to reason about the one operator. When it comes to traversals, Kieburtz [2001] does not define general predicate transformers for modal assertions, and thus Kieburtz's [2001] rules do not form a complete calculus. It is not clear how Kieburtz's [2001] approach could be extended to handle traversals in their full generality. In our work, our assertions are just sets of expressions, and we move around an expression by associating locations with our weakest preconditions. This enables us to define general rules for traversals, yielding a compositional and complete calculus for all strategies and all postconditions. In addition, the fixed-point operator is not well constructed in Kieburtz's [2001] work and is not proven to be monotone, whereas we give a correct treatment of the fixed-point operator and have proven monotonicity of all our formulae. Also, in Kieburtz's [2001] work, soundness is not proven, whereas we prove the soundness of our weakest precondition calculus w.r.t.
the formal semantics. Lastly, we provide a careful treatment of divergence with the mutually defined wp and wp↑, a feature not reflected in Kieburtz's [2001] work.
Type Systems for Strategic Rewriting Languages. A related but parallel strand of work is in giving types to strategic rewriting languages. Smits and Visser [2020] add gradual typing to Stratego and use it to find bugs in their strategies for language interpreters. Koppel [2023] uses typed strategies to model multi-language program transformations, Lämmel [2003] adds types to strategies with applications to generic programming in typed languages, and Fu et al. [2023] make use of structural typing with traces for checking ill-composed strategies statically. These type systems emphasise lightweight static checking, or a hybrid of dynamic and static checking, to find bugs, whereas our focus is on a complete semantic account of rewriting strategies, and the development of a weakest precondition calculus that can demonstrate the absence of bugs, not merely their presence.

Kleene Algebra. Strategic rewriting languages resemble a Kleene Algebra [Kozen 1991] extended with traversals and a biased choice operator. There have been many other extensions to Kleene Algebra, most notably Concurrent Kleene Algebra [Hoare et al. 2011], which adds parallel composition, and Kleene Algebra with Tests [Kozen 1997], which adds Boolean guards to model the semantics of while programs. Kozen [1999] shows that reasoning by Kleene Algebra with Tests entirely subsumes Hoare Logic for while programs. A version of Kleene Algebra with Tests, NetKAT, has been used to reason about packet-switching networks [Anderson et al. 2014]. Recently, Concurrent Kleene Algebra and NetKAT have been combined for reasoning about concurrent network systems [Wagemaker et al. 2022].
Denotational Semantics and Adequacy. The appeal of the Scott-Strachey approach to semantics [Stoy 1985] lies in its local and compositional reasoning, and over the last 50 years it has been applied to many diverse programming languages. As far as programming language abstractions go, the strategic rewriting language we consider is mostly standard, and we were able to use the relevant semantic tools with relatively minor modification. Plotkin pioneered the powerdomain construction [1976] and later characterised it as the free semilattice over a domain [Hennessy and Plotkin 1979]. Most denotational accounts include an adequacy proof, and it is possible to prove them wholesale for standard programming languages with a myriad of expressive features [Johann et al. 2010; Plotkin and Power 2001; Simpson 2004]. We found the decomposition of computational adequacy into dual inductive and coinductive arguments interesting, and we hope it can inform other reflective accounts of adequacy [Devesas Campos and Levy 2018].

7 CONCLUSION AND FUTURE WORK
We have presented Shoggoth, a rigorous formal foundation for strategic rewriting languages, including a comprehensive semantic account of System S and a weakest precondition calculus that facilitates formal reasoning about rewriting strategies. Our semantic treatment models all possible executions of strategies, including divergence, in both denotational and big-step operational models, and our proofs of soundness and adequacy demonstrate the equivalence of these models. Our location-based weakest precondition calculus is the first formal axiomatic treatment of rewriting strategies, and enables reasoning about traversals via a notion of location indicating where in an expression a given strategy operates. Our soundness proof justifies our location-based weakest precondition calculus with respect to our semantic models, and we demonstrate the practical application of this calculus by applying it to realistic examples. All of our work has been mechanised in over 5,000 lines of Isabelle/HOL proof script.
Weakest precondition calculi form the basis of verification condition generators (VCGs), which are a key component of many automatic and semi-automatic verification tools such as VCC [Cohen et al. 2009] and Dafny [Leino 2010], as well as of static analysers such as the popular Extended Static Checking extension for Java [Flanagan et al. 2002; Leino 2005]. We envision that our weakest precondition calculus could similarly inform the design of a VCG for the automatic verification or static checking of rewriting strategies. In future work, we intend to use Shoggoth as a foundation for the development of tools for verification and, potentially, synthesis of rewriting strategies.

Fig. 1. The Syntax of System S
div ∈ ⟦s⟧ — Just as with computational adequacy for the non-diverging cases, we must first generalise to open strategies. We define a second approximation relation together with an approximation lemma to prove this soundness theorem.
Proc. ACM Program. Lang., Vol. 8, No. POPL, Article 3. Publication date: January 2024.

Fig. 13. The syntax of the lambda calculus