Bringing the WebAssembly Standard up to Speed with SpecTec

WebAssembly (Wasm) is a portable low-level bytecode language and virtual machine that has seen increasing use in a variety of ecosystems. Its specification is unusually rigorous – including a full formal semantics for the language – and every new feature must be specified in this formal semantics, in prose, and in the official reference interpreter before it can be standardized. With the growing size of the language, this manual process with its redundancies has become laborious and error-prone, and in this work, we offer a solution. We present SpecTec, a domain-specific language (DSL) and toolchain that facilitates both the Wasm specification and the generation of artifacts necessary to standardize new features. SpecTec serves as a single source of truth — from a SpecTec definition of the Wasm semantics, we can generate a typeset specification, including formal definitions and prose pseudocode descriptions, and a meta-level interpreter. Further backends for test generation and interactive theorem proving are planned. We evaluate SpecTec’s ability to represent the latest Wasm 2.0 and show that the generated meta-level interpreter passes 100% of the applicable official test suite. We show that SpecTec is highly effective at discovering and preventing errors by detecting historical errors in the specification that have been corrected and ten errors in five proposals ready for inclusion in the next version of Wasm. Our ultimate aim is that SpecTec should be adopted by the Wasm standards community and used to specify future versions of the standard.


INTRODUCTION
A programming language is defined by its syntax, static semantics (checking), and dynamic semantics (execution). Since they provide the foundation for all subsequent development and analysis, it is important to define them clearly and rigorously. Many programming languages, such as Java [50] or C# [46], therefore have elaborate language definitions. Languages that are prone to implementation inconsistency issues, such as C [28] and JavaScript [17], manage their standardization processes through international bodies. Few languages, like Standard ML [47], go even further and have formal specifications formulated in precise mathematical terms. However, it is an open problem how to continuously engineer and maintain such formal standards, especially at industrial scale.
WebAssembly (Wasm) is a low-level language and virtual machine [18]. Initially released as an efficient and portable compilation target on the Web platform, it has since been adopted across a wide range of ecosystems, including cloud and edge computing [26,80], mobile and embedded systems [82], IoT [25], and blockchains [78]. Browsers implement Wasm within vendor-specific architectures using multi-tier interpretation or just-in-time compilation; in addition, there exist almost a dozen stand-alone implementations of Wasm. The portability of Wasm across all of these implementations is critically important, especially for Web developers, who have no control over what implementation ultimately executes their code. The diversity of environments and platforms means that there is a heightened risk of implementation divergence.
To mitigate these risks, Wasm has been standardized by the W3C [77] with particularly high rigor. It requires four key artifacts for a feature to be standardized: 1) a formal specification for the feature given by declarative-style typing and reduction rules, written in LaTeX; 2) a prose pseudocode presenting an algorithmic-style semantics, written in reStructuredText (reST) markup; 3) a reference interpreter providing an implementation of the feature, written in OCaml; and 4) a test suite for the feature, written in the Wasm text format (.wast). All of these artifacts must define, or for the test suite evaluate, the same behavior.
Notably, the specification provides both declarative and algorithmic styles of semantic definitions. For example, Fig. 1 shows the semantics of the select instruction in the official Wasm specification. Fig. 1a presents it in a prose notation broken down into seven steps, while Fig. 1b specifies it in a formal notation using two rewrite rules. The two styles enjoy complementary strengths: on the one hand, declarative rewrite rules specify the language semantics through rigorous and succinct mathematical rules that enable faster turn-around in the design phase and are particularly well-suited to formal reasoning, including proving properties such as type safety; on the other hand, algorithmic prose pseudocode specifies the language semantics through a step-by-step natural language description that, though much more verbose, is comprehensible to a broader audience.
Proc. ACM Program. Lang., Vol. 8, No. PLDI, Article 210. Publication date: June 2024.
Despite the demanding standardization process, the Wasm specification has been written and maintained manually, which creates challenges for specification authors. The prose is transliterated from the formal rules, writing it is extremely laborious [61], and code reviews of specification changes written in LaTeX and reST are not user-friendly. Manual processes are vulnerable to human error, potentially leading to inconsistency or incorrectness in the specification. As Wasm grows with new language features such as garbage collection [63], exceptions [2], and threads [68], manually crafting all of the above artifacts poses a challenge to scalability.
These challenges call for automating the process via mechanizing the language semantics. The literature shows general-purpose language frameworks such as Ott [67], PLT Redex [20], Skeleton [66], Spoofax [74], and K [81]. However, the most successful examples of a mechanized normative specification involve mechanizations tailored to the specific target languages. This is because by narrowing attention to a specific language, a far more ambitious variety of mechanizations can be supported with ease. For example, ASL [59], which is singularly concerned with Arm's architectural specification, achieves far more impressive automation than is possible in a general-purpose framework. Taking a slightly different route, the ESMeta framework [1] reconstructs a mechanized specification from the JavaScript standard [17], allowing diverse tools [55,54,53] to be incorporated into the continuous integration checks of the specification and official test suite [73,72].
We aim to develop a mechanized specification that will ultimately become the normative specification for Wasm, hence it is preferable to have a framework specialized for Wasm. But Wasm poses unique challenges. In order to replace the existing manual specification, it is required to generate all standardization artifacts in a format acceptable to the Wasm Community Group (CG). In particular, a framework is needed that can directly support and match both styles of semantics used in its specification (declarative and algorithmic), without requiring authors to write both. To the best of our knowledge, no framework in the literature directly supports both styles of semantics.
To this end, we propose Wasm SpecTec, a framework for mechanizing the Wasm semantics with these goals in mind. SpecTec provides a domain-specific language (DSL), in which Wasm syntax, type system, and execution semantics can be defined in a declarative style. Fig. 2 illustrates this with the reduction rules defining the semantics of Wasm's select instruction, expressed in this DSL. Specification authors only need to write these two rules instead of what is seen in Fig. 1, which involves half a page of LaTeX and reST markup sources not shown here. The definition in SpecTec then becomes a single source of truth from which various artifacts can be auto-generated: (1) declarative representations, including the LaTeX-based specification and mechanized definitions for diverse theorem provers, (2) algorithmic representations, such as the prose specification in reST and a Wasm interpreter, and (3) executable tests in Wasm text format, to check conformance of third-party implementations. This is made possible by leveraging the domain-specific knowledge present in a specialized tool.
Fig. 3 gives an overview of the SpecTec architecture. In this paper, we focus on the first stage of the project, which aims at generating those artifacts necessary for the published specification document, namely LaTeX and prose. We leave investigating the greyed-out parts of the diagram, which are concerned with generating test cases and facilitating the automated generation of mechanized semantics for Wasm in theorem provers, for future work.
The SpecTec DSL is designed to resemble familiar "textbook-style" notation for formally describing language semantics. We deliberately keep the syntax of the DSL close to the syntax used in the existing formal specification of Wasm, in order to provide a sort of WYSIWYG experience to the specification authors. All definitions and rules are "type-checked" to detect meta-level errors. From this, SpecTec can directly produce the LaTeX-based formal specifications.
To generate the algorithmic representations, especially prose, SpecTec definitions are first translated, with a few intermediate steps, into an intermediate representation called the Algorithmic Language (AL). It is designed to provide a representation that is close in structure to the prose of the official specification. Thus, the English prose in reST markup can be generated directly from the definitions in the AL. SpecTec also supports the interpretation of Wasm programs through a meta-level interpreter for the AL, following the approach of the ESMeta framework [1].
To bridge the gap between the two styles of semantics, we present a mechanism for automatically deriving the algorithmic AL from the declarative DSL. Several challenges arise in transforming declarative-style reduction rules into algorithmic-style pseudocode. For example, a single Wasm instruction may have multiple reduction rules with different premises (such as select in Fig. 2), but they must all be combined in a single prose pseudocode conditional. Moreover, mathematical equality '=' in rules can be interpreted as an (unconditional) assignment or a (conditional) equality check, depending on the context (declaratively, there is no meaningful difference between the two, but algorithmically there is). We show that disambiguating such premises while introducing minimal auxiliary variables is an NP-hard problem and propose a practical solution.
SpecTec is available as an open-source project [69] and covers all of Wasm 2.0, the current release of Wasm. The prototype meta interpreter passes 100% of the official test suite and identified several specification errors in current and upcoming features, as confirmed by standards body members. In fact, we are up to speed with the development of the Wasm standard in the sense that we have extended the SpecTec specification with multiple major proposals soon to be merged into the standard, such as subtyping, garbage collection, and multiple memories. Our aim is for SpecTec to be adopted by the Wasm standards community and used to produce future versions of the standard.
Though the Wasm SpecTec design choices and tooling are specific to Wasm, we believe that the general methodology and architecture we present are applicable to other languages too. The main technical contributions of this paper are the following:
• We design and implement Wasm SpecTec, a declarative specification language and its toolchain, optimized for reading, writing, and maintaining the Wasm specification (§2).
  - It is a framework embracing both declarative and algorithmic styles of semantics.
  - It automatically generates multiple representations from a single source of truth.
  - It is a forward-compatible framework that can adapt to the evolving Wasm semantics.
• We design and implement various backends for SpecTec:
  - a LaTeX backend, which outputs LaTeX code describing both static and dynamic semantics declaratively in the format used by the official Wasm specification (§2.4),
  - a prose backend, which outputs prose pseudocode in the style used by the official Wasm specification (§2.5 and §3.4), and
  - an interpreter backend that indirectly executes Wasm programs by meta-interpreting the specified semantics in its derived algorithmic representation (§3.5).
• We evaluate SpecTec by formalizing all of Wasm 2.0, generating a specification document and a meta-level interpreter from it, and validating them (§4).

FORMAL SEMANTICS IN SPECTEC
The SpecTec DSL is intended as the single source of truth for all representations of the Wasm semantics. As such, its design needs to strike a balance between the purposes of easy authoring of the official Wasm standard and its artifacts, as well as generating mechanizations for multiple theorem provers. Since the Wasm specification already exists with established idiosyncrasies that the community is familiar with, (1) SpecTec should have minimal impact on the resulting document, and (2) specification authors should not need to learn and understand yet another formalism or second-guess how a tool would translate that to the intended representation in the document. Historically, the formal specification has usually been the first form in which the semantics of Wasm or a new feature has been specified (after informal "explainers"). The prose then was the result of a subsequent manual translation of the formal rules into stylized English. These dynamics are due to the conciseness of the declarative mathematics, which enables much faster turnaround, as well as its precision, which cannot always be adequately matched in prose.
SpecTec hence is designed to closely resemble the formal style used in the Wasm specification, which in turn is closely based on pen-and-paper notation used widely in literature and textbooks on programming language semantics. The main components of Wasm's formal specification are: (1) formal grammars defining binary and textual representations; (2) deduction rules defining the type system; (3) a functional definition of module instantiation and linking; (4) small-step reduction rules defining Wasm's runtime execution. For SpecTec, we deliberately decided against a type-theoretic notation that would be closer to theorem provers like Coq. The goal is a WYSIWYG user experience, where authors write formal rules (and code-reviewers see them) in the same notation as they will later appear on screen, except in a plain and readable (and diff'able) ASCII. To that end, SpecTec builds in much of the meta-level notation used by the Wasm specification, such as basic arithmetic, meta-variables, the notion of records, iterated sequences, and manipulation of these.
On the other hand, SpecTec does not hard-code any Wasm-specific concepts in its syntax or meta-theory (though its backends specialize for it to a varying degree, see §2.4 and §2.5). Instead, there are three generic mechanisms for describing the language and its semantics: (1) Syntax definitions, which can capture abstract syntax of a language or auxiliary constructs; (2) Relations and respective rules, which can specify typing, evaluation, or other predicates; (3) Functions, which enable auxiliary definitions.

Wasm in SpecTec
Fig. 4 shows a small, simplified excerpt from the Wasm specification in SpecTec. Fig. 4a defines the abstract syntax of types and instructions: a value type is one of Wasm's numeric types; a function type consists of a sequence of value types for its parameters and another for its results, where the arrow is just custom meta-level notation; a global type can be marked with an optional mutability attribute; an instruction can be one of many forms, some of which have additional parameters representing the instruction's "immediates." Mnemonics (essentially, keywords) are written in all caps, distinguishing them from meta-variables.
To define a typing relation for instructions, we first need to define typing contexts (Fig. 4b). This again is a syntax definition, but this time describing a record of different namespaces. Each namespace simply contains a list of respective types; that is because, in Wasm, definitions are referenced by index (via a form of de Bruijn notation), not by names. The typing relation itself (Fig. 4c) is defined over a triple of context, instruction, and a function type describing its effect on Wasm's operand stack. The stylized turnstile and colon used as separators are again just user-defined notation within SpecTec, chosen to make the relation more readable and reminiscent of standard paper notation. Each rule defining this relation then describes how to type one individual instruction: nop pops nothing from and pushes nothing to the stack (eps denotes the empty sequence); drop consumes a single value of arbitrary type (t is unrestricted in the rule); in contrast, select consumes three values, two of which are of arbitrary but same type t, and produces one t; the rules for local.get and global.get look up the result types in the context C, with the Boolean side conditions representing a form of premises to the rules.
Execution is defined as a small-step reduction relation over configurations (Fig. 4d); a configuration consists of the state and the instruction sequence to reduce. The state, in turn, consists of the global store and the local function frame; besides locals, the latter holds a reference to the current module instance, needed to resolve module-internal indices to global addresses in the store. In order to ease notation, we define a couple of short-hands for accessing locals and globals as auxiliary meta-functions (Fig. 4e). Meta-definitions are distinguished from meta-variables by a preceding $.
Finally, the actual reduction rules (Fig. 4f) are given by two stepping relations: the first is the main relation that defines a step on configurations, while the second is a simpler variant used to define pure reductions that do not access the store. The rule Step/pure defines how they interact: we can take a step if we can take a pure step, which results in the state z being unchanged. The premise in this rule is not a side condition but inductively invokes another relation, which has to be named explicitly. The nop, drop, and select instructions are pure so that they can be formulated with this simpler relation: nop does nothing, hence is reduced to the empty instruction sequence; drop likewise, but it also eliminates a value instruction before it, which amounts to popping a value from the stack; select is defined by two cases with different side conditions inspecting the selector value c, and the side condition "otherwise" is a short-hand negating all previous premises for a rule with the same left-hand side; the rules for local.get and global.get look up the value in the respective part of the state, invoking the auxiliary meta-functions previously defined.
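The pure reductions for nop, drop, and select can be mimicked by a small rewriting function. The following is a minimal Python sketch, not SpecTec code: the tuple encoding of instructions and the names is_val and step_pure are assumptions made purely for illustration.

```python
# Hypothetical encoding (not SpecTec syntax): ("CONST", type, value) is a value
# instruction; ("NOP",), ("DROP",) and ("SELECT",) are plain instructions.

def is_val(instr):
    return instr[0] == "CONST"

def step_pure(instrs):
    """Rewrite the first non-value instruction once; return None if no rule applies."""
    i = next((k for k, ins in enumerate(instrs) if not is_val(ins)), None)
    if i is None:
        return None  # only values are left, nothing to reduce
    op, rest = instrs[i][0], instrs[i + 1:]
    if op == "NOP":
        return instrs[:i] + rest              # nop reduces to the empty sequence
    if op == "DROP" and i >= 1:
        return instrs[:i - 1] + rest          # drop removes the value before it
    if op == "SELECT" and i >= 3 and instrs[i - 1][1] == "I32":
        val1, val2, (_, _, c) = instrs[i - 3:i]
        if c != 0:                            # first rule: side condition c =/= 0
            return instrs[:i - 3] + [val1] + rest
        return instrs[:i - 3] + [val2] + rest  # second rule: "otherwise"
    return None
```

The two select branches play the role of the two declarative rules: the second branch needs no explicit condition because, like "otherwise", it is reached only when the first premise fails.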
Following this style, we have translated most of the technical content of the recent Wasm 2.0 specification [18] to SpecTec, as well as substantial newer features, such as garbage collection [63], multiple memories [64], and tail calls [65] (see §4.3). This translation covers (1) abstract syntax, (2) validation, (3) runtime system, (4) execution, (5) module instantiation, and (6) binary format. The main pieces still left for future work are numeric primitives and the text format. By design, the SpecTec tooling allows us to convert the existing specification progressively, that is, different parts of it can be replaced with SpecTec-generated content incrementally.

2.2 The SpecTec Language
Fig. 5 presents the syntax of SpecTec definitions. Expressions are either Boolean or arithmetic formulas, applications of meta-functions, sequences of expressions (where eps denotes the empty sequence), indexing into a (homogeneous) sequence, taking the length of a sequence, records, record projection, record and sequence updates, tuples, or user-defined notation. The latter consists of sequences of expressions interspersed with atoms, which are either upper-case identifiers or one of a set of predefined symbolic tokens. Types describe the shape of expressions and can describe either Booleans or natural numbers, sequences, records, variants, tuples, or user-defined notation.
In both cases, homogeneous sequences of variable length can be denoted by iterations, similar to regular expressions. For example, nat* would be the type of possibly empty sequences of natural numbers, while nat^3 specifies a sequence of exactly three numbers. In an expression, the use of a variable under an iteration implies that this variable has a respective dimension; SpecTec infers the dimension of each variable and checks that it is used consistently when the same variable occurs multiple times in a rule or definition. A variable can occur as part of a larger expression under an iteration, which is a mapping over that variable. For example, list = {A x, B y}* /\ (x < 100)* states that list is a sequence of records with fields A and B, and every value x of respective field A must be smaller than 100. This corresponds to the use of overbars in formal notation [70], and like overbars, iterations can nest freely, leading to variables of higher dimension.
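One way to read the iterated example is as a mapping over parallel sequences. The following minimal Python sketch illustrates that reading only; it is not how SpecTec evaluates iterations, and the function name check_iterated and the dictionary encoding of records are assumptions.

```python
def check_iterated(records, bound=100):
    """Check `list = {A x, B y}* /\\ (x < 100)*` for a list of records.

    Under the iteration, x and y each stand for a whole sequence
    (dimension 1): one value per record in the list.
    """
    xs = [r["A"] for r in records]   # all values bound to x
    ys = [r["B"] for r in records]   # all values bound to y
    assert len(xs) == len(ys)        # variables iterated together share a length
    return all(x < bound for x in xs)  # (x < 100)* maps the condition over xs
```

For instance, check_iterated([{"A": 3, "B": 7}, {"A": 42, "B": 0}]) holds, while any record whose A field is 100 or more makes the condition fail.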
Syntax definitions are essentially type definitions: they name a given type. Besides creating a shorthand, this also enables (mutual) recursion, by which syntax definitions can be interpreted as inductive types.
Relations are declared with a type that specifies their notation, which will typically consist of a sequence of syntax types separated by atoms. Each corresponding rule then consists of an expression of the respective type, possibly accompanied by a sequence of premises. In its basic form, a premise can either invoke another relation, in which case the name of that relation has to be given along with an expression of a suitable type, or it is a Boolean side condition. A special premise is otherwise, which represents the negation of all premises previously used for the same left-hand side. Premises can also be iterated. All rules must be uniquely named, and these names can be hierarchical; the names have no semantic relevance, but allow referring to individual rules in tooling (§2.4).
Function definitions are given by a declaration of their type and then individual equational clauses. The clauses express a (sequential) pattern match, with their left-hand side argument expression being the pattern and possible premises acting as pattern guards. Like syntax, relations and functions are all interpreted as inductive definitions.
Finally, variable declarations allow us to globally declare the type implicitly associated with each use of a meta-variable. SpecTec recognizes suffixes as in x_1 or x' as variations with the same type. In addition to explicit variable declarations, syntax definitions implicitly declare a variable of the same name as the type. All variables used in expressions must be declared in one of these two ways.

Elaboration and Lowering
SpecTec definitions are first parsed into a representation called the External Language (EL). This representation is an AST that corresponds directly to the surface syntax of SpecTec, as presented in Fig. 5. On this representation, the frontend then performs binding and recursion analysis, type checking, dimension inference, and type-driven resolution of notational overloading in expressions. During this procedure, the EL is elaborated into a more explicit and unambiguous Internal Language (IL).
• Perform dataflow analysis on premises, annotate suitable equations as variable bindings, and reorder premises accordingly (§3.3).
Not all these transformations are run for every backend; each backend can select them individually.

LaTeX Backend
The simplest of all backends is the LaTeX backend. It produces LaTeX directly from the EL, bypassing the IL, giving users of SpecTec fairly direct control over the output. Most of it is straightforward, except for the rendering of identifiers, which distinguishes different identifier classes and can handle subscripts of mixed classes. The main complication is handling rendering hints like "hint(show ...)" in the DSL (omitted from Fig. 5), which allows customizing the rendering of selected identifiers, atoms, and whole function invocations. Moreover, it distinguishes forms of relations based on their syntactic shape and renders them as either typing or reduction relations accordingly.
The LaTeX backend operates on secondary input in the form of text files with splice commands. For example, it replaces the splice @@{rule: Instr_ok/local.get Instr_ok/global.get} with a single LaTeX array containing the corresponding inference rules. The splice @@{rule: Step_pure/select-*}, on the other hand, recognizes Step_pure as a reduction relation and renders the rules matched by the wildcard appropriately. An overall specification document hence is a skeleton (in LaTeX, reST, or another format) with respective splices, that is transformed into the final document by the SpecTec tool.
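The splice mechanism can be pictured as a text substitution pass over the skeleton document. The sketch below is a hypothetical Python illustration, not the SpecTec implementation: the function expand_splices, the dictionary of pre-rendered rules, the rule names, and the handling of a trailing * wildcard are all assumptions.

```python
import re

# Matches splices of the form @@{rule: Name1 Name2 ...}
SPLICE = re.compile(r"@@\{rule:\s*([^}]*)\}")

def expand_splices(skeleton, rendered):
    """Replace each rule splice by the rendered text of the rules it names.

    `rendered` maps rule names to already-rendered text (an assumed data
    structure); a name ending in * selects every rule with that prefix.
    """
    def replace(match):
        selected = []
        for name in match.group(1).split():
            if name.endswith("*"):
                prefix = name[:-1]
                selected += [k for k in rendered if k.startswith(prefix)]
            else:
                selected.append(name)
        return "\n".join(rendered[n] for n in selected)

    # A callable replacement is inserted literally, so LaTeX backslashes
    # in the rendered text are not treated as escapes.
    return SPLICE.sub(replace, skeleton)
```

Keeping the skeleton separate from the rendered rules is what lets a document mix hand-written text with generated content.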

Prose Backend
The prose backend covers validation, runtime system, execution, and module instantiation; it does not apply to abstract syntax and binary format, which do not have corresponding prose. Like the LaTeX backend, it can process splice commands and distinguishes forms of relations based on their syntactic shape and handles them accordingly. Because the Wasm specification's prose description of the validation rules is still declarative, generating it from the EL is relatively straightforward: Fig. 6 shows the result for the local.get and global.get instructions.
In contrast, generating the English prose specification for execution (and some auxiliary meta-functions) is not trivial because it requires conversion to an algorithmic formulation. Moreover, it assumes that reduction rules can be interpreted as a suitable form of stack machine. We explain the challenges and our solution in the next section.

DL: Declarative Language
For presentational purposes, we abstract the SpecTec syntax in Fig. 5 into a Declarative Language (DL) that only shows the features relevant to the translation of Wasm's runtime semantics into an algorithmic representation. Fig. 7 presents the syntax of DL.
The Wasm runtime semantics is defined by a sequence of definitions. A definition is either a reduction rule or an auxiliary helper function. A reduction rule, which relates two configurations under a sequence of premises, is an abstraction for the specific subset of rules from Fig. 5 whose role is to express Wasm reduction rules as illustrated in Fig. 4f. Note that we only abstract the reduction rules but not the typing rules in Fig. 4c, since the prose notation of typing rules is not algorithmic and cannot be executed. A configuration denotes a pair of a Wasm program state, which can be omitted, and a list of Wasm instructions that represents the current stack. If the state is omitted, the corresponding reduction rule is a simpler Step_pure rule. A premise is a premise prem extended with a special case, written with ←, that denotes an explicit variable binding. A helper function, given by argument expressions, a result expression, and a sequence of premises, is an abstraction for the function definitions defs from Fig. 5. Because the details of the expression are not relevant to this section, we show only some cases used for concrete examples. A constructor is an atom that denotes the name of Wasm types, such as I32, or Wasm instructions, such as CONST and SELECT.
For example, the semantics of select in Fig. 2 has a direct counterpart in DL. To avoid clutter, we will sometimes omit brackets around singular lists.

AL: Algorithmic Language
The Algorithmic Language (AL) represents Wasm's runtime semantics in an imperative style. DL definitions are translated into AL definitions, such that pseudocode representations can be generated. Fig. 8 presents the syntax of AL. An algorithm expresses the semantics of a Wasm instruction or a helper function in imperative form. An algorithm consists of a name, parameters, and a body of meta-level statements. A statement denotes a prose statement in the Wasm specification. An expression denotes an expression in the prose that is evaluated to a value, and a condition denotes a condition in the prose that is evaluated to a Boolean value. The figure also indicates the intended prose rendering of statements and expressions. The Wasm specification often uses specific phrases like "the current frame" and "a label is now on the top of the stack." We abstract such Wasm-specific expressions and conditions for brevity.
For example, the semantics of select in Fig. 2 corresponds to the following in AL:
  algorithm SELECT() [
    assert "an I32 value is on the top of the stack",
    pop (CONST I32 c),
    assert "a value is on the top of the stack",
    pop val_2,
    assert "a value is on the top of the stack",
    ...
  ]
The semantics of AL is imperative. The interpretation of a program starts by calling one of its algorithms, which sequentially executes its body statements. Executing a statement may alter the implicit program state, and the resulting state after executing the last statement of the algorithm is the result of the program execution.
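To make the imperative reading concrete, the select algorithm can be written as a function over an explicit operand stack. This is a minimal Python sketch under stated assumptions, not the AL interpreter: the function name exec_select and the tuple encoding of values are hypothetical, and the final pop and the conditional push follow the select prose of the Wasm specification rather than anything shown in the listing above.

```python
def exec_select(stack):
    """Execute select on `stack`, a Python list whose last element is the top.

    Values are encoded as ("CONST", type, value) tuples (an assumed encoding).
    """
    assert stack and stack[-1][:2] == ("CONST", "I32"), \
        "an I32 value is on the top of the stack"
    _, _, c = stack.pop()                       # pop (CONST I32 c)
    assert stack, "a value is on the top of the stack"
    val2 = stack.pop()                          # pop val_2
    assert stack, "a value is on the top of the stack"
    val1 = stack.pop()                          # pop val_1
    if c != 0:
        stack.append(val1)                      # push val_1 when c is non-zero
    else:
        stack.append(val2)                      # otherwise push val_2
    return stack
```

Unlike the two declarative rules, the algorithm has a single entry point, and the order of the pops is fixed by the statement sequence.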

DL to AL Translation
Now, we describe how to translate a Wasm semantics in DL into an AL program. First, the definitions are grouped to represent algorithms: the reduction rules are grouped according to the Wasm instruction in their redex, and the helper functions are grouped according to their names. Each group is translated into a single algorithm in two phases: (1) preprocess the group's definitions into a more restricted form, and (2) generate an AL algorithm from the preprocessed DL definitions. Since the translation of helper functions is similar to the translation of reduction rules, this section focuses on the translation of reduction rules.
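The grouping step can be sketched in a few lines. The following Python fragment is an illustration only: the dictionary encoding of rules and the helper redex_instr are assumptions, and the helper simply takes the last element of the left-hand side as the instruction being reduced, which holds for simple rules like select but not for rules whose redex is nested, such as br inside a label.

```python
from collections import defaultdict

def redex_instr(lhs):
    """Constructor of the instruction being reduced (assumed: last in the lhs)."""
    return lhs[-1][0]

def group_rules(rules):
    """Group reduction rules by the instruction in their redex."""
    groups = defaultdict(list)
    for rule in rules:
        groups[redex_instr(rule["lhs"])].append(rule)
    return dict(groups)

# Hypothetical rule encoding: names and premises are for illustration only.
rules = [
    {"name": "select-true", "lhs": [("VAL", "val_1"), ("VAL", "val_2"),
                                    ("CONST", "I32", "c"), ("SELECT",)],
     "premises": ["c =/= 0"]},
    {"name": "select-false", "lhs": [("VAL", "val_1"), ("VAL", "val_2"),
                                     ("CONST", "I32", "c"), ("SELECT",)],
     "premises": ["otherwise"]},
    {"name": "nop", "lhs": [("NOP",)], "premises": []},
]
groups = group_rules(rules)
```

Each resulting group then becomes one algorithm, so both select rules end up in the same conditional.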

3.3.1 DL to DL Preprocessing. Preprocessing consists of two steps: for each group of reduction rules, (1) preprocess the left-hand sides of the reduction rules in the group to make them the same, and (2) preprocess the premises of the definitions in the group so that every variable is bound exactly once before its uses.
The first preprocessing step is to generalize the left-hand sides of the reduction rules so that the left-hand sides in each group become identical. Most Wasm definitions satisfy this condition, but some do not, such as the two inductive reduction rules $r_1$ and $r_2$ for the br instruction, whose left-hand sides in DL are $(\mathsf{LABEL}\ n\ \mathit{instr}'^{*}\ (\mathit{val}'^{*} \mathbin{+\!+} \mathit{val}^{n} \mathbin{+\!+} (\mathsf{BR}\ 0) \mathbin{+\!+} \mathit{instr}^{*}))$ and $(\mathsf{LABEL}\ n\ \mathit{instr}'^{*}\ (\mathit{val}^{*} \mathbin{+\!+} (\mathsf{BR}\ l+1) \mathbin{+\!+} \mathit{instr}^{*}))$, respectively. Generalizing such left-hand sides is known as anti-unification [12].

For each group of reduction rules, the anti-unification algorithm AU takes a list of left-hand side expressions as input; for the br instruction, for example, AU takes the two expressions above. For a list of expressions to anti-unify $e_1 \cdots e_n$, the algorithm AU (1) generates a general expression $e$, possibly containing some fresh variables, and then (2) generates premises $p_i^{*}$ ($1 \le i \le n$) using the fresh variables to make each $e_i$ the same as $e$ under $p_i^{*}$. More specifically, the general expression $e$ is the most specific expression that generalizes the expressions $e_1 \cdots e_n$, replacing only the differing components with fresh variables. For example, the general expression for the reduction rules for br is $(\mathsf{LABEL}\ n\ \mathit{instr}'^{*}\ (x_1 \mathbin{+\!+} (\mathsf{BR}\ x_2) \mathbin{+\!+} \mathit{instr}^{*}))$ with two fresh variables $x_1$ and $x_2$.

After generating the general expression, AU generates premises $p_i^{*}$ for each $e_i$ so that $e_i$ is an instance of the general expression satisfying $p_i^{*}$. For example, to make the general expression for br the same as the left-hand sides of $r_1$ and $r_2$, AU infers the conditions $[x_1 = \mathit{val}'^{*} \mathbin{+\!+} \mathit{val}^{n},\ x_2 = 0]$ and $[x_1 = \mathit{val}^{*},\ x_2 = l+1]$, respectively. Finally, the two rules for the br instruction are rewritten to use the general expression. Thanks to the fresh variables $x_1$ and $x_2$, both $r_1$ and $r_2$ have the same left-hand side, and the definitions of $x_1$ and $x_2$ are in their premises.

The second preprocessing step is to change the premises of each group's definitions so that each variable is bound exactly once before it is used. This step is necessary because the order of premises in declarative reduction rules can be arbitrary. In addition, equality expressions in premises can be ambiguous because they can represent equality check conditions or variable bindings. Thus, the goal of this second preprocessing step is to replace every equality premise denoting a variable binding with $e \leftarrow e'$. In order to do that, we need to identify each variable's binding occurrence, keep track of the variables that each premise binds, and reorder premises so that preceding premises bind all free variables in each premise.
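The anti-unification step described above can be illustrated with a minimal sketch. It is not SpecTec's implementation: the tuple encoding of terms, the function name `anti_unify`, and the fresh-variable names `x1`, `x2` are assumptions made only for this example, and the two br left-hand sides are written in a simplified, hypothetical form.

```python
from itertools import count


def anti_unify(terms):
    """Anti-unify a list of terms.

    A term is an atom (str or int) or a non-empty tuple (constructor, arg, ...).
    Returns the most specific generalization together with one substitution
    per input term, mapping each fresh variable back to the original subterm.
    """
    fresh = count(1)
    substs = [{} for _ in terms]
    seen = {}  # differing subterms -> fresh variable, so repeats share a name

    def go(ts):
        first = ts[0]
        if all(t == first for t in ts):
            return first  # identical in every term: keep unchanged
        if isinstance(first, tuple) and all(
            isinstance(t, tuple) and len(t) == len(first) and t[0] == first[0]
            for t in ts
        ):
            # Same constructor and arity: generalize argument by argument.
            args = [go([t[i] for t in ts]) for i in range(1, len(first))]
            return (first[0], *args)
        key = tuple(ts)
        if key not in seen:
            seen[key] = f"x{next(fresh)}"
            for subst, t in zip(substs, ts):
                subst[seen[key]] = t
        return seen[key]

    return go(list(terms)), substs


# Hypothetical encoding of the two br left-hand sides.
lhs1 = ("LABEL", "n", "instr'*",
        ("seq", ("cat", "val'*", "val^n"), ("BR", 0), "instr*"))
lhs2 = ("LABEL", "n", "instr'*",
        ("seq", "val*", ("BR", ("succ", "l")), "instr*"))
general, substs = anti_unify([lhs1, lhs2])
```

Here `general` plays the role of the shared left-hand side, and each entry of `substs` corresponds to the equality premises that AU attaches to one rule.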
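For example, consider the first rule for the br instruction again, whose premises are $x_1 = \mathit{val}'^{*} \mathbin{+\!+} \mathit{val}^{n}$ and $x_2 = 0$. The second premise $x_2 = 0$ is an equality check condition, while the first premise is a binding of the fresh variables $\mathit{val}$ and $\mathit{val}'$. Therefore, this step rewrites the first premise as $\mathit{val}'^{*} \mathbin{+\!+} \mathit{val}^{n} \leftarrow x_1$, which clearly indicates that the fresh variables $\mathit{val}$ and $\mathit{val}'$ are newly introduced in the first premise. Once the variable binding occurrences in premises are identified, reordering the premises is a simple def-use dataflow analysis.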
The primary challenge now lies in identifying each variable's binding occurrence in premises. An intriguing difficulty emerges because of partial binding, in which certain free variables on one side of an equality expression are binding occurrences, but others are not. In a premise $(y_1, y_2) = e$, for example, only $y_1$ might be a binding occurrence but not $y_2$. Unfortunately, such partial bindings introduce fresh variables while generating prose. The prose rendering of the premise $(y_1, y_2) = e$ would be "1. Let $(y_1, t)$ be $e$. 2. If $t = y_2$, then: $\cdots$", introducing the fresh variable $t$. Overuse of such fresh variables may result in a prose specification that is very different from the current prose specification, leading to a less readable document.
Therefore, it is preferable to minimize the number of partial bindings when identifying the binding occurrences. It turns out that this is actually an NP-hard problem. We prove it by reduction from a known NP-hard problem, the exact cover problem [45]. The exact cover problem aims at deciding whether it is possible to select some subsets within a given collection of subsets in such a way that each element of a given set belongs to exactly one selected subset. This problem is NP-complete [35]. Here, we provide the formal definition of the exact cover problem:

Definition 3.1 (Exact Cover Problem). An instance of the Exact Cover Problem (EC) is defined by a tuple $(X, S)$ such that $X$ is a set of elements and $S \subseteq P(X)$ is a collection of subsets of $X$. EC aims at deciding if there exists a subcollection $S^{*} \subseteq S$ which is a partition of $X$, that is, $\bigcup_{s \in S^{*}} s = X$ and $s \cap s' = \emptyset$ for all distinct $s, s' \in S^{*}$.

For example, consider $X = \{a, b, c, d, e\}$ and $S = \{\{a, b\}, \{b, c\}, \{c, d, e\}\}$. Because $\{\{a, b\}, \{c, d, e\}\}$, one of the subcollections of $S$, is a partition of $X$, the answer is yes. On the other hand, for $S' = \{\{a, b\}, \{b, c\}, \{c, d\}, \{d, e\}\}$, since no subcollection of $S'$ is a partition of $X$, the answer is no.
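To make the definition concrete, the following sketch decides EC by brute force. It is only an illustration of the problem statement, not of the method used later; the function name `exact_covers` and the sample instance (a five-element set, two pairs and one triple) are assumptions chosen for this example.

```python
from itertools import combinations


def exact_covers(universe, collection):
    """Yield every subcollection of `collection` that partitions `universe`."""
    universe = frozenset(universe)
    sets = [frozenset(s) for s in collection]
    for r in range(len(sets) + 1):
        for chosen in combinations(sets, r):
            # If the sizes add up to |universe| and the union is the whole
            # universe, the chosen sets must be pairwise disjoint.
            total = sum(len(s) for s in chosen)
            if total == len(universe) and frozenset().union(*chosen) == universe:
                yield list(chosen)


X = "abcde"
S = [{"a", "b"}, {"b", "c"}, {"c", "d", "e"}]
S_prime = [{"a", "b"}, {"b", "c"}, {"c", "d"}, {"d", "e"}]

covers = list(exact_covers(X, S))
no_covers = list(exact_covers(X, S_prime))
```

The search enumerates all $2^{|S|}$ subcollections, which is why it is only practical for small instances.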
Because EC is one of Karp's 21 NP-complete problems [35], we can prove that the problem of identifying each variable's binding occurrence in premises is NP-hard, if we can reduce EC into the problem in polynomial time.
Theorem 3.2. The problem of identifying each variable's binding occurrence in premises with the minimal number of partial bindings is NP-hard.
Proof. Assume that we are given EC with a set $X$ and a collection of its subsets $S \subseteq P(X)$. Let $n$ be the size of $S$. Let $S_i = \{x_1, \ldots, x_k\}$ be the $i$-th subset of $S$. Now, consider a reduction rule with premises $p^{*}$, and let the $i$-th premise of $p^{*}$ be $t_i = (x_1, \ldots, x_k)$. Note that this reduction rule can be constructed in linear time. The claim is that when we identify each variable's binding occurrence in $p^{*}$ with the minimal number of partial bindings, this gives a solution to EC. Specifically, the answer to EC is YES if and only if the minimum number is zero. Note that every variable must be bound exactly once. Thus, given that there is no partial binding, when we collect all premises that bind new variables, the subsets corresponding to such premises form a subcollection of $S$ which is a partition of $X$. Conversely, if an exact cover exists for the given collection, then the identification of binding occurrences can be done without any partial bindings, by regarding every variable in the premises that correspond to the sets in the exact cover as a binding occurrence, and regarding anything else as not. □

Example. Let's consider the example of $X = \{a, b, c, d, e\}$ and $S = \{\{a, b\}, \{b, c\}, \{c, d, e\}\}$ again. Following the proof above, EC for $X$ and $S$ is reduced to the problem of identifying each variable's binding occurrence in the premises of a reduction rule with the premises $t_1 = (a, b)$, $t_2 = (b, c)$, and $t_3 = (c, d, e)$. If we successfully solve the problem, then the result should look like the following: $(a, b) \leftarrow t_1$, $t_2 = (b, c)$, and $(c, d, e) \leftarrow t_3$. From the result, we can reconstruct the partition of the set $X$ by collecting the binding premises, $\{\{a, b\}, \{c, d, e\}\}$, giving the answer to EC.

Thus, unless P = NP, no polynomial-time algorithm can solve the problem. If the number of premises is small, a simple brute-force algorithm might be a solution. However, the reduction rule for module instantiation, for example, has more than 10 premises and variables, so a more efficient method is preferable. Another solution is to use an SMT solver like Z3 [14], since this is a constraint solving problem. However, encoding the problem as an SMT query and invoking the Z3 solver from our implementation can be an unnecessary overhead.

Algorithm 1: Preprocess Premises. Input: a list of premises $p^{*}$ and a set of bound variables $V$. Output: a list of preprocessed premises $p'^{*}$. Its final step returns the reordered $p'^{*}$ if reordering $p'^{*}$ succeeds. Here, free($e$) returns the free variables in $e$, and Knuth returns the partitions of $X$ that do not have any partial bindings.
As a practical solution to this NP-hard problem, we adopt an all-or-nothing heuristic approach. We first try to solve the problem under the condition that partial bindings are not allowed, and if that fails, we resort to a greedy algorithm, inevitably introducing some fresh variables. In order to solve the problem under the no-partial-bindings constraint, we reduce the problem to EC and then adopt the Knuth algorithm [38], a well-known and effective algorithm for solving EC. The high-level idea is to encode the premises as a collection of sets, where a solution to EC of the encoded collection corresponds to a solution to our problem.
Algorithm 1 describes the process. It takes two inputs, a list of premises $p^{*}$ in reduction rules and a set of already bound variables $V$, and returns a list of new premises $p'^{*}$. First, it encodes premises as a collection of subsets of $X$, where $X = \{1, \cdots, |p^{*}|\} \cup \mathrm{free}(p^{*})$ is a set of the numbers from 1 to the number of premises and all free variables in them, on which the Knuth algorithm operates. For each $p_i$, if it is an equality expression $e = e'$, it is encoded as three subsets: $\{\{i\}, \{i\} \cup \mathrm{free}(e), \{i\} \cup \mathrm{free}(e')\}$, which denote its possible interpretations. The first set $\{i\}$ denotes when the $i$-th premise does not bind any variables, meaning $p_i$ is an equality check condition. The second set $\{i\} \cup \mathrm{free}(e)$ denotes when $p_i$ binds ALL the variables in $e$, and the third set $\{i\} \cup \mathrm{free}(e')$ denotes when $p_i$ binds ALL the variables in $e'$. For example, if the first premise is $x = y$, it is encoded as three subsets: $\{1\}$, $\{1, x\}$, and $\{1, y\}$. If $p_i$ is not an equality expression, it is encoded as only one subset, $\{i\}$, meaning $p_i$ does not bind any variables but checks some non-equality condition.

Then, the Knuth algorithm takes the collection containing all the encoded subsets and the set of bound variables $V$ and returns the partitions of the set $X$. Note that the Knuth algorithm may not return a unique partition. In addition, due to the definition of partition, for any partition $P$, there should be exactly one subset that contains the number $i$ for $1 \le i \le |p^{*}|$. Thanks to the design of the encoding, a subset is either a singleton set $\{i\}$ or a set with an index and some variables $\{i, x_1, x_2, \cdots\}$. Using a partition $P$, the algorithm replaces every equality expression denoting a variable binding with an explicit variable binding expression via replaceVariableBinding in Algorithm 2.
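The encoding and the exact-cover search can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's Algorithm 1: premises are represented only by their free variables, already bound variables are handled by removing them before encoding (one plausible reading of how the bound set is used), and the solver is a plain recursive version of Knuth's Algorithm X without dancing links. The names `encode_premises` and `solve` are hypothetical.

```python
def encode_premises(premises, bound):
    """Encode premises as exact-cover options.

    Each premise is ("eq", free_left, free_right) or ("cond", free_vars).
    Returns the universe and a list of options (subset, index, binding side).
    """
    bound = frozenset(bound)
    universe, options = set(), []
    for i, prem in enumerate(premises, start=1):
        universe.add(i)
        options.append((frozenset({i}), i, None))  # premise i is a plain check
        if prem[0] == "eq":
            for side, free in (("left", prem[1]), ("right", prem[2])):
                new_vars = frozenset(free) - bound
                universe |= new_vars
                if new_vars:  # premise i binds ALL variables on this side
                    options.append((frozenset({i}) | new_vars, i, side))
        else:
            universe |= frozenset(prem[1]) - bound
    return universe, options


def solve(universe, options, chosen=()):
    """Yield every exact cover of `universe` as a list of options."""
    if not universe:
        yield list(chosen)
        return
    # Heuristic: branch on the element covered by the fewest options.
    elem = min(universe, key=lambda x: sum(x in o[0] for o in options))
    for opt in options:
        if elem in opt[0]:
            rest = [o for o in options if not (o[0] & opt[0])]
            yield from solve(universe - opt[0], rest, chosen + (opt,))


# Hypothetical premises: (a, b) = c;  a != 0;  d = f(a, b), with c already bound.
premises = [("eq", ["a", "b"], ["c"]), ("cond", ["a"]), ("eq", ["d"], ["a", "b"])]
universe, options = encode_premises(premises, bound={"c"})
solutions = list(solve(universe, options))
```

In this instance the only cover makes the first premise bind `a` and `b`, keeps the second as a condition, and makes the third bind `d`, so no partial binding is needed.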
For example, if a singleton set $\{2\}$ is in the input partition $P$, then the second premise $p_2$ is a condition and therefore remains the same. If a set $\{3, x\}$ is in $P$ and the third premise is $x = y$, then the third premise is replaced with $x \leftarrow y$. For preprocessed premises $p'^{*}$, Algorithm 1 tries to reorder them so that all variables are bound before their uses. Reordering premises may fail if they contain cyclic bindings like $x \leftarrow f(y)$ and $y \leftarrow g(x)$. If reordering $p'^{*}$ succeeds, the algorithm returns the reordered $p'^{*}$; otherwise, it tries with the next partition. If the reordering fails for every possible partition, the translation from DL to AL fails. There were two such cases in the Wasm spec for helper functions regarding module instantiation, instantiate and allocmodule. We rewrote them to have equivalent non-circular definitions.
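The reordering step, including its failure on cyclic bindings, can be sketched as a simple def-use scheduling loop. The triple representation (variables bound, variables used, display text) and the function name `reorder` are assumptions for illustration only.

```python
def reorder(premises, bound):
    """Order premises so that every variable is bound before it is used.

    Each premise is a triple (binds, uses, text). Returns the reordered list,
    or None if no valid order exists (for example, with cyclic bindings).
    """
    known = set(bound)
    pending = list(premises)
    ordered = []
    while pending:
        # Pick the first premise whose used variables are all bound already.
        ready = next((p for p in pending if set(p[1]) <= known), None)
        if ready is None:
            return None
        pending.remove(ready)
        ordered.append(ready)
        known |= set(ready[0])
    return ordered


ok = reorder(
    [
        (set(), {"a"}, "if a != 0"),
        ({"d"}, {"a", "b"}, "d <- f(a, b)"),
        ({"a", "b"}, {"c"}, "(a, b) <- c"),
    ],
    bound={"c"},
)
cyclic = reorder(
    [({"x"}, {"y"}, "x <- f(y)"), ({"y"}, {"x"}, "y <- g(x)")],
    bound=set(),
)
```

When `reorder` returns `None`, a driver in the spirit of Algorithm 1 would move on to the next partition.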

AL Generation.
After preprocessing, DL definitions satisfy two conditions: (1) the left-hand sides of the reduction rules in each group are identical and (2) every variable in premises is bound exactly once before it is used. For each group of reduction rules $r_1 \cdots r_n$, the translation algorithm T generates an AL algorithm by translating the shared left-hand side first, and then translating the premises and right-hand side of each rule $r_i$ ($1 \le i \le n$). Consider, for example, the rules for select in DL. For the first rule $r_1$, the premise $c \neq 0$ and the right-hand side $(z, [\mathit{val}_1])$ are translated to if ($c \neq 0$) and push $\mathit{val}_1$, respectively.

Roughly speaking, translating the left-hand side of a reduction rule corresponds to generating the beginning of an algorithm, which pops the values from the stack to use as the inputs for the target instruction and binds new variables that contain the information about the inputs. Translation of premises corresponds to generation of the middle of an algorithm, which consists of either let statements that bind variables or if statements with conditions. Note that a binding premise $e \leftarrow e'$ may introduce side conditions. For example, a binding $[a, b] \leftarrow \mathit{arr}$ introduces a side condition that the size of the array arr is two. T generates such side conditions based on binding patterns. Finally, translation of the right-hand side of a reduction rule corresponds to generation of the end of an algorithm, which pushes the result value onto the top of the stack or executes other Wasm instructions.
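The three-part shape of a generated algorithm (pops from the shared left-hand side, then one guarded block per rule) can be sketched as below. This is a toy rendering, not the translator T: the function name `translate_group`, the string form of the steps, and the use of one `if` per rule are assumptions; the select rules are written from the standard Wasm semantics (the first operand is chosen when c is non-zero, the second otherwise).

```python
def translate_group(operands, rules):
    """Translate a group of rules with a shared left-hand side into AL-like steps.

    `operands` lists the stack operands in the order they appear on the
    left-hand side; `rules` is a list of (condition, values_to_push).
    """
    # Beginning: pop the inputs, topmost operand first.
    steps = [f"pop {operand}" for operand in reversed(operands)]
    for condition, pushes in rules:
        # Middle: the premise becomes a condition.
        steps.append(f"if ({condition}):")
        # End: the right-hand side becomes pushes.
        steps.extend(f"  push {value}" for value in pushes)
    return steps


select_steps = translate_group(
    ["val_1", "val_2", "(i32.const c)"],
    [("c != 0", ["val_1"]), ("c == 0", ["val_2"])],
)
```

Printing `"\n".join(select_steps)` gives a pseudocode listing with three pops followed by the two guarded pushes.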

Prose Backend
As AL is designed to resemble pseudocode, generating an English prose specification from AL algorithms is a straightforward task. Fig. 9 shows the prose pseudocode semantics of the select instruction, generated from the specification in Fig. 2, which is very close to the original handwritten prose in Fig. 1a. AL algorithms are rendered into a prose specification document through three steps. First, the semantics described in AL is printed into English prose in reST markup. For example, the second step of select is printed as follows:
2. Pop :math:`\xref{syntax/types}{syntax-numtype}{\mathsf{i32}}.\\ \xref{syntax/instructions}{syntax-instr-numeric}{\mathsf{const}}~c` from the stack.
Next, as in the LaTeX backend (§2.4), prose in reST is spliced into a skeleton specification document. Finally, the spliced document is processed by Sphinx [56], producing formats like PDF and HTML. Note that prose in reST is not simple plaintext, but has inline math blocks as denoted by the :math: markup. Expressions in math blocks are typeset with LaTeX. Furthermore, the prose backend embeds cross-references into the math blocks with \xref{doc}{section}{text}. This serves as a reference to a section in some reST file doc, to be rendered as text. As in the original specification document, const in the second step references its syntax production rule in a separate file. This systematic insertion of references rules out the missing, broken, or misplaced links that can arise when references are inserted manually.
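How such a prose step with embedded cross-references might be assembled is sketched below. The helper names `xref` and `pop_step` are hypothetical, and the separator between the two references is simplified to a plain period; only the \xref{doc}{section}{text} shape and the :math: role are taken from the text above.

```python
def xref(doc, section, text):
    """Build a cross-reference macro of the form \\xref{doc}{section}{text}."""
    return rf"\xref{{{doc}}}{{{section}}}{{{text}}}"


def pop_step(number, operand_math):
    """Render a numbered 'Pop ... from the stack' step as reST with inline math."""
    return f"{number}. Pop :math:`{operand_math}` from the stack."


operand = (
    xref("syntax/types", "syntax-numtype", r"\mathsf{i32}")
    + "."
    + xref("syntax/instructions", "syntax-instr-numeric", r"\mathsf{const}")
    + "~c"
)
step = pop_step(2, operand)
```

Because every reference goes through one helper, a missing or misspelled target can be detected in a single place rather than in each handwritten step.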

Interpreter Backend
SpecTec enables the interpretation of Wasm programs through a meta-level interpretation approach, similar to the ESMeta framework [1,55]. By interpreting an AL program that denotes the Wasm semantics, with a Wasm program as its input value, we can indirectly interpret the Wasm program.
Initially, we parse a Wasm program into an AST, which an AL algorithm takes as input. Unlike ESMeta, which also generates the parser automatically, SpecTec does not generate a parser for Wasm's text format, but outsources this task to the parser of the existing reference interpreter [87]. Then, the executable semantics represented in AL is automatically extracted as described in §3.3. We have developed an AL interpreter in OCaml based on the AL meta-level semantics, enabling the execution of an AL program. By executing the extracted Wasm semantics using the parsed AST of a Wasm module as input, we can indirectly execute Wasm programs.
As described in §3.2, we can execute an AL program by calling one of its algorithms. According to the Wasm specification, code is executed either when instantiating a module [18, §4.5.4] or when invoking a function exported by a module instance [18, §4.5.5]. Thus, the AL interpreter calls the instantiate algorithm to instantiate a module or the invoke algorithm to call a function. The algorithms either return a list of values or produce a trap, which is the result of execution.
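The idea of meta-level interpretation, running an algorithm written as data rather than hard-coding each instruction, can be sketched with a toy step language. The step forms (`pop`, `push`, `if`, `trap`), the `Trap` exception, and the `call` function are assumptions for this illustration and do not reflect the actual AL syntax or the OCaml implementation.

```python
class Trap(Exception):
    """Raised when the interpreted algorithm traps."""


def run_steps(steps, stack, env):
    """Interpret a list of AL-like steps against a value stack."""
    for step in steps:
        op = step[0]
        if op == "pop":
            if not stack:
                raise Trap("stack underflow")
            env[step[1]] = stack.pop()
        elif op == "push":
            stack.append(env[step[1]])
        elif op == "if":
            _, (var, rel, const), then_steps, else_steps = step
            holds = env[var] != const if rel == "!=" else env[var] == const
            run_steps(then_steps if holds else else_steps, stack, env)
        elif op == "trap":
            raise Trap(step[1])
        else:
            raise ValueError(f"unknown step: {op}")


def call(algorithm, args):
    """Run an algorithm; the outcome is either a list of values or a trap."""
    stack = list(args)
    try:
        run_steps(algorithm, stack, {})
    except Trap as trap:
        return ("trap", str(trap))
    return ("values", stack)


SELECT = [
    ("pop", "c"), ("pop", "val_2"), ("pop", "val_1"),
    ("if", ("c", "!=", 0), [("push", "val_1")], [("push", "val_2")]),
]
UNREACHABLE = [("trap", "unreachable")]
```

For instance, `call(SELECT, [10, 20, 1])` pops the condition and both operands and leaves the first operand on the stack, while `call(UNREACHABLE, [])` reports a trap instead of values.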

EVALUATION
We developed SpecTec as an open-source project [69], and evaluated it based on the following:
• RQ1. Correctness: Does SpecTec correctly generate formal and prose specifications, as well as the interpreter backend? (§4.1)
• RQ2. Bug Prevention: Can SpecTec prevent bugs during Wasm standard development? (§4.2)
• RQ3. Forward Compatibility: Can SpecTec support future language features? (§4.3)
We evaluate SpecTec with the latest Wasm specification, Wasm 2.0 [18]; we manually specified its syntax, validation, and execution in the DSL, which we name Wasm2 ST. We have not yet included auxiliary functions for numeric instructions since they are numerous but do not pose new challenges for SpecTec. Instead, we use the reference interpreter's implementations for numeric functions. Also, for some parts of the Wasm 2.0 standard that we expect to be difficult to reason about in theorem provers, we use a different, but equivalent formulation that is closer to the one used by the official reference interpreter. For example, while Wasm 2.0 uses evaluation contexts to escape multiple block contexts in a single step, Wasm2 ST uses a bubbling-up technique to exit one block at a time. We also rewrite the module instantiation semantics to remove cyclic bindings within premises. Wasm2 ST amounts to 2,957 Lines of Code (LoC), while the corresponding official specification document written in reST is 5,526 LoC.

Correctness
We evaluate the correctness of the artifacts generated by SpecTec using Wasm2 ST as an input.
Formal and Prose Specification. SpecTec can generate two of the four key artifacts required by the W3C Wasm CG in order to standardize a feature. It can generate a formal specification in declarative style, written in LaTeX, for the abstract syntax, validation semantics, execution semantics, and binary format. It can also generate a prose pseudocode presenting validation in declarative style and execution in algorithmic style, written in reST. A generated PDF or HTML document with formal and prose notations [75] is very close to the respective parts of the hand-written specification [18].
Interpreter Backend. We evaluate the correctness by executing the official Wasm test suite [86]. This evaluation demonstrates two things: 1) that the meta-level interpreter can interpret Wasm programs; and 2) that Wasm2 ST, the process of translating it to AL, and the resultant Wasm semantics in AL, as represented by the prose, are correct with high confidence. A Wasm test consists of one or more Wasm modules and assertions to verify whether the implementation behaves as expected. Of the seven kinds of assertions [88], we exclude three related to parsing and validation (for which SpecTec does not have an interpreter backend) and one related to testing infinite loops. Thus, we use the following three kinds of assertions:
(assert_return <invoke> <result>*) ;; assert invocation has expected results
(assert_trap <invoke> <failure>) ;; assert invocation traps with given failure string
(assert_trap <module> <failure>) ;; assert module traps on instantiation
We performed our experiments with a MacBook Pro (16-inch, 2019) with a 2.4 GHz 8 core Intel Core i9 and 32GB of RAM. On this machine, the meta-level interpreter executed all 49,833 tests (47,391 assert_return, 2,408 assert_trap for actions, and 34 assert_trap for modules) in 58 seconds. While the meta-level interpreter currently exhibits slower performance compared to the reference interpreter, its design is focused primarily on ensuring fidelity of the algorithmic semantics generated from the DSL. Every applicable test in the official test suite has successfully passed, giving high confidence in the correctness of both the SpecTec implementation and the Wasm semantics written in it, and by extension, the correctness of the generated prose specification.
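A test driver for the assertion kinds used here could look like the following sketch. It is not the actual harness: the tuple encoding of assertions, the `invoke` callback, and the choice to match the failure string as a prefix of the trap message are assumptions made for illustration, and the sample actions are invented.

```python
def check(assertion, invoke):
    """Return True if a single assertion holds for the given implementation."""
    kind, action, expected = assertion
    outcome = invoke(action)  # ("values", [...]) or ("trap", message)
    if kind == "assert_return":
        return outcome == ("values", list(expected))
    if kind == "assert_trap":
        return outcome[0] == "trap" and outcome[1].startswith(expected)
    raise ValueError(f"unsupported assertion kind: {kind}")


def run_suite(assertions, invoke):
    """Return the number of passing assertions and the list of failures."""
    failures = [a for a in assertions if not check(a, invoke)]
    return len(assertions) - len(failures), failures


# Stand-in for an interpreter: a fixed table of outcomes (hypothetical actions).
OUTCOMES = {
    "add 1 2": ("values", [3]),
    "div 1 0": ("trap", "integer divide by zero"),
}
assertions = [
    ("assert_return", "add 1 2", [3]),
    ("assert_trap", "div 1 0", "integer divide"),
    ("assert_return", "add 1 2", [4]),  # deliberately wrong expectation
]
passed, failures = run_suite(assertions, OUTCOMES.__getitem__)

# The reported per-kind counts add up to the reported total.
reported_total = 47_391 + 2_408 + 34
```

Here the third assertion fails by construction, which shows how an injected semantics bug would surface as a failing test.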

Bug Prevention
We evaluate SpecTec's ability to detect or prevent bugs during the authoring process by evaluating whether it can prevent the actual bugs that occurred during the Wasm standard's development [89]. We collected the bugs by investigating specification fixes within the Wasm standard's main GitHub branch over the last two years. A few bugs were not in scope for our framework, such as edits to non-normative explainers. Other than that, our analysis reveals that all remaining bugs would have been prevented using our tool. We classified these bugs into four categories based on the phase at which they would have been prevented.
Type Errors. We found three bugs that SpecTec's type checking would have prevented: 1) a missing field in the execution semantics of the elem.drop instruction [8], 2) a missing argument in the semantics of module instantiation [31], and 3) a wrong use of tabletype in the validation rule of the table.set instruction [32]. We injected each bug into Wasm2 ST, used it as an input to SpecTec, and confirmed that SpecTec detected all three bugs as type errors with informative error messages that included the error locations in the specification as well as the reasons for the errors.
Prose Errors. We found seven bugs that SpecTec's automatic prose generation would have prevented: 1) free identifiers in the execution semantics of control instructions [15], 2, 3) a free identifier in the execution semantics of the allocelem function [90] and the memory.init instruction [3], 4) a missing parameter in the execution semantics of the table.set instruction [36], 5, 6) missing steps in the execution semantics of function invocation [57] and module instantiation [4], and 7) an obsolete step in the execution semantics of function invocation [6]. Such prose errors do not exist in prose specifications generated by SpecTec, and we confirmed that the corresponding semantics were correctly specified in the generated PDF document [75].
Semantics Errors. We found three bugs that SpecTec's meta-level interpretation may have prevented: 1) a missing value in the value stack of the reduction rule of the table.grow instruction [79], 2) a wrong memory index for the memory.fill and memory.init instructions [30], and 3) popping a wrong number of values when exiting from a label [5]. We injected each bug into Wasm2 ST, used it as an input to SpecTec, executed the Wasm test suite with the generated meta-level interpreter, and confirmed that the meta-level interpreter detected all three bugs as semantics errors.
Editorial Fixes. Finally, we found numerous editorial and presentational issues, such as typographical errors in LaTeX, inconsistencies in writing style across the specification, or incorrect cross-references or hyperlinks to definitions. These errors do not arise in formal and prose specifications generated by SpecTec since they follow a predefined structure and style, removing the possibility of human errors or inconsistencies.
These results demonstrate that SpecTec is effective in preventing a wide range of human errors throughout the Wasm standardization process. It can detect or prevent various errors, including type errors, prose errors, semantics errors, and editorial issues. We believe that SpecTec can significantly enhance the robustness and reliability of the Wasm standardization process while also reducing the burden on specification authors.

Forward Compatibility
We evaluate SpecTec's forward compatibility by applying it to five proposals ready for inclusion in the next version of Wasm ("Wasm 3.0"): typed function references [62], garbage collection [63], tail calls [65], multiple memories [64], and extended constant expressions [13]. The proposals add or modify 44 Wasm instructions and dozens of auxiliary helper functions; in particular, the garbage collection proposal adds substantial complications to the Wasm type system, introducing structural types, subtyping, and type recursion.
For each proposal, we extended Wasm2 ST with the proposal specified in SpecTec. Remarkably, describing all five proposals required only trivial adjustments to the DSL, mostly adding new custom operators for new user-definable notation. Qualitatively, we found that SpecTec's type-checking of DSL definitions was a significant boon during this process. In Wasm's current LaTeX specification, which is manually written, updating the structure of an abstract syntax definition does not ensure that all of the definition's uses are likewise updated. SpecTec, in contrast, ensures that all uses respect the definition's new structure, avoiding a very common class of drafting error.
After extending Wasm2 ST, we used SpecTec to generate formal and prose specifications automatically. It readily generated formal and prose specifications for 42 of the 44 Wasm instructions and for all helper functions. The remaining two instructions' reduction rules used new patterns, such as complex use of inverse functions, which the DL-to-AL translator was not able to handle. After we revised the rules to reflect the translator's intended style, it correctly generated their prose specifications. Extending the translator's capability to handle reduction rules with more diverse styles is future work. Finally, we executed the proposals' test suites using the generated meta-level interpreter. For the subtyping validation of Wasm types introduced by the garbage collection proposal, SpecTec uses the reference interpreter's implementation. During this evaluation, we found ten bugs in the proposals: two type errors, two prose errors, four semantics errors, and two editorial fixes. We reported them to the specification authors and received confirmation [16,29,92,95,93,94,91]. After fixing the errors in the proposals, we applied SpecTec to the revised Wasm2 ST. For each proposal, the meta-level interpreter passed all the 1,331 tests.
The results demonstrate that, with few adjustments, SpecTec can handle future language features and detect and prevent numerous human mistakes throughout the standardization process. We believe that SpecTec is a long-term solution for supporting a growing language like Wasm.

RELATED WORK
Programming Language Frameworks. Researchers have presented numerous frameworks to mechanize the definitions of programming languages.
Sail [22] is a DSL and toolchain for defining processor instruction set architectures, which can output SMT encodings, theorem prover definitions, and LaTeX documentation fragments. While SpecTec relies extensively on relations, following a declarative approach to language semantics as used in the existing Wasm standard, Sail is first-order and imperative, and hence its definitions can be directly executed. The Sail type system can express complex dimension constraints which must be checked by an external SMT solver. SpecTec uses a simpler form of dimension constraints which are checked fully within the tool (§2.2).
Lem [48] is a toolchain and DSL for defining semantic models, which can output LaTeX, executable code, and theorem prover definitions. Lem does not include strong support for custom syntax, and so it is challenging for Lem to output LaTeX which fits the format of an existing specification such as Wasm. While Lem does support both relational and functional styles of definition, unlike SpecTec it does not support inference of a functional definition from a relational one, as in our algorithmic backends (§3). SpecTec benefits from its specialization to Wasm, as its inference algorithm for this purpose can be tightly scoped and can fail in cases that are irrelevant to the Wasm semantics.
Ott [67] allows language designers to specify the semantics in inference rules, and generates code for Coq, HOL, and Isabelle/HOL. It has been used for case studies like a large fragment of OCaml. Spoofax [74] supports agile development of textual domain-specific languages with Eclipse IDE support. Skeleton [66] specifies language semantics in big-step semantics and constructs both concrete and abstract interpretations. PLTRedex [20] describes language semantics in reduction rules, and it has been used to specify the semantics of Scheme [60]. The K framework [81] can generate various tools, including interpreters, model checkers, and verifiers, so long as the definitions under consideration can be encoded in K's term-rewriting system. K has been used to specify semantic fragments of languages such as C [19], Java [11], and JavaScript [51]. While the aforementioned frameworks aim to support general-purpose languages with declarative semantics, ESMeta [1] is designed to support JavaScript with imperative semantics. By devising an algorithmic language, IR ES, to specify the semantics in the prose ECMAScript standard, ESMeta can generate diverse tools [55,54,53,52], including a test harness which has been officially integrated into the continuous integration (CI) systems of ECMAScript [73] and the Test262 conformance test suite [72] since November 2022.
Wasm Semantics in Existing Language Frameworks. Fragments of the Wasm 1.0 semantics have been mechanized in PLTRedex and K. Two PLTRedex models specify a large core of its reduction rules [24]: wasm-redex [71] and Wasm-Redex [21], which are executed by the rewrite system of PLTRedex. Neither model specifies the Wasm module semantics, so complete execution of Wasm programs is not available. KWasm [27] specifies the Wasm runtime semantics, including the module semantics, in a single rewrite system in the K framework. Since it does not model the entire Wasm semantics yet, it fails a number of tests in the official Wasm test suite. While both PLTRedex and K execute the Wasm semantics using their rewriting engines, SpecTec provides indirect interpretation over an algorithmic representation of the semantics.
Theorem Prover Language Mechanizations. Two complete mechanizations of the Wasm 1.0 semantics exist in the Coq and Isabelle theorem provers [84]. Unlike the works described above, which use frameworks specifically designed for language semantics, these two mechanizations are written manually as collections of inductive definitions and functions within the formal languages of their theorem provers. The models are accompanied by proofs of the soundness of the Wasm type system. However, these models have not yet been updated to support all the upcoming features of Wasm 2.0 due to the onerous process of manually extending all of the models' existing definitions. Moreover, in comparison to a purpose-built framework for language semantics, the process of constructing a proof involves many hand-coded inference steps, and therefore, even minor changes to the structure of the underlying model can invalidate large swathes of existing results, such as each model's type soundness proof. A number of mechanizations of other language semantics exist in general-purpose interactive theorem provers, such as Java [37], JavaScript [10,23], Standard ML [42,40], C [43,39,49], and Rust [33]. Except for the mechanizations of Standard ML, these almost invariably stick to fixed fragments of the languages in order to make the mechanization process tractable. The process of generating executable code from a mechanized inductive definition is sometimes referred to as animation [9,44]. In contrast with our approach, prior work is generally not concerned with minimizing the number of new variables introduced by the animation process.

FUTURE WORK
Declarative specification to algorithmic pseudocode. The current paper focuses on the use of SpecTec to generate algorithmic pseudocode from a declarative specification of Wasm's dynamic semantics (i.e., the instantiation and execution phases of Wasm). The declarative specification provides a single source of truth and is rendered directly into the LaTeX formal specification. The algorithmic pseudocode generated from the declarative specification is rendered into the reST markup prose pseudocode. The generated algorithmic representation is executable, so it can be directly tested against the existing Wasm test suite and reference interpreter.
While SpecTec supports the definition of Wasm's type system and basic formal grammar and their rendering into specification text, it does not yet support the generation of executable code (i.e., a type checker and parser) from these definitions. SpecTec also does not yet include the formal definition of Wasm's arithmetic operations. Instead, we modularize SpecTec's generated artifacts so that handwritten definitions for these parts of the language can be separately provided. We intend to investigate further support for defining and executing these parts in future work.
The AL interpreter also allows us to easily analyze the coverage of the existing Wasm test suite. In the future, we intend to investigate the automatic generation of test cases and fuzzing harnesses, guided by the semantic definitions in SpecTec. We expect that approach to have the potential for producing highly comprehensive tests tailored to maximize coverage of semantic edge cases.
Declarative specification to theorem prover formal specification. A major future direction not covered by the current paper is the implementation of SpecTec backends for a range of different proof assistants. This will enable the generation of formal specifications for proof assistants, including Agda, Coq, Isabelle, and Lean, from the single source of truth provided by the SpecTec declarative specification. Wasm has previously benefited from the mechanization of its initial draft specification: in attempting to give a mechanized proof of the soundness of the Wasm type system, several errors in the draft specification were discovered and corrected before the proof was ultimately completed [83]. However, this mechanization was transcribed entirely by hand from the pen-and-paper specification and has not been kept up to date with upcoming new Wasm features.
Compounding the issue of mechanization effort, it is beneficial to support several mechanizations of Wasm since each theorem prover has its own strengths and weaknesses. For the two mechanizations of Wasm 1.0 [84], the Coq mechanization uses the Coq-only Iris framework [34] in defining a mechanized program logic [58], while the Isabelle mechanization partially uses the Isabelle-only Sepref [41] in defining a verified monadic interpreter for Wasm [85]. Once SpecTec supports the automatic generation of formal specifications for theorem provers, it will improve confidence that mechanized proofs correspond to the Wasm specification and speed up the mechanization process as the specification evolves, further improving confidence in Wasm's design and specification.
We have begun work on backends for Agda, Coq, and Lean. In doing so, we implemented a number of IL to IL passes to simplify the output of the elaboration phase and bring it closer to a format suitable for theorem provers (§2.3). An internal IL type-checker re-validates that the IL is still internally consistent after each pass, in order to capture possible bugs in the transformation. We envisage substantial parts of this infrastructure being shared amongst theorem prover backends.
A particular challenge for theorem prover backends is posed by the differing needs of SpecTec and theorem provers. Whereas SpecTec can be tailored to faithfully replicate the precise metatheory of the existing Wasm formalism, theorem provers come with their own existing metatheories, and so we must carefully manage any mismatches to achieve faithful translations. Agda, in particular, heavily encourages the use of an intrinsically-typed representation of a language's semantics, where the language's type system is directly encoded into the language's abstract syntax using Agda-level dependent types [7]. The existing Wasm specification and its mechanizations in Coq and Isabelle use an extrinsic representation of well-typed terms where the typing judgment is a separate inductive definition. We plan to investigate techniques for generating intrinsically-typed representations of Wasm using SpecTec.
Meta-meta-theory for SpecTec. While the SpecTec DSL is used to describe formal language semantics, we did not give a meta-level semantics for this DSL itself. Other frameworks such as Ott/Lem have taken similar approaches in their academic presentations.
We expect that the core work of doing meta-theory for Wasm will continue to happen in actual theorem provers that are intended for this task and have their own meta-semantics; we have no intention of turning SpecTec into one itself. Rather, we regard it as a versatile frontend supporting such work. Still, formalizing the semantics of the DSL (possibly as an undecidable logic) would be interesting future work; it was not a priority for us, as it is not on the critical path to getting SpecTec adopted by Wasm's standards body, unlike other research questions that we had to address in supporting the variety of backends required. One interesting exercise would be to try specifying the semantics of SpecTec in itself.
Pragmatics of varying backend assumptions. The SpecTec DSL and its frontend are fairly generic. While they build in most of the meta-level notation used in the Wasm specification, they know nothing about the actual Wasm syntax and semantics, which are entirely specified within the DSL. Consequently, the overall design of SpecTec is relatively future-proof, and our experience with translating a variety of non-trivial Wasm proposals validated that. We expect extensions to the DSL itself to be necessary only on the rare occasion that the specification of a new Wasm feature requires entirely new meta-level concepts.
However, this does not hold for all of the tool's backends, some of which make much more specific assumptions about the input, to varying degrees. For example, the prose backend needs to recognize typing relations and reduction relations in order to generate appropriate English (much of this is already embodied in the transformation to the AL). The interpreter backend needs to specialize even further, since it (currently) connects to an externally provided parser and a numerics library from the pre-existing reference interpreter in OCaml.
We hope to be able to eliminate some of this special-casing over time, e.g., by encoding numerics in SpecTec itself. However, the fact will remain that different backends hard-code knowledge to a differing degree, and pattern-match certain aspects of the input specification. The consequence is that individual backends will reject specifications that they cannot (yet) fully handle.
To function as part of the official authoring workflow, we will need to define an appropriate process for supporting specification authors that hit one of these limitations and need an extension to some of the tool's backends. In general, we do not expect authors to touch the implementation of the tool, though some expertise will have to grow within the Wasm standards community to maintain it over time.

CONCLUSION
We have presented Wasm SpecTec, a domain-specific language and toolchain intended to replace the current authoring and specification workflow of the Wasm language standard. As Wasm evolves, we expect to iterate further on some design details of the DSL, for example, revolving around concrete syntax and striking the right balance between WYSIWYG and precision, since classical notation can sometimes be difficult to disambiguate. Likewise, there are various levels of polishing possible for the generated prose, and we expect it to improve further over time. More experience with prover backends in the future may also impact the design.
We ultimately aim for the Wasm standards community to specify all current and future Wasm features using SpecTec and replace most of the manually authored artifacts necessary for Wasm's standardization process with our generated ones, enhancing the efficiency and reliability of the process. Initial feedback from the community has been positive. Our evaluations demonstrate that SpecTec would effectively avoid many historical mistakes in the published Wasm standard and many more in currently in-progress feature specifications.
SpecTec has the potential to empower those working on the standardization of Wasm to engage more directly with the writing and maintenance of the specification. While SpecTec still requires some understanding of the underlying mathematical formalism, its interface is a checked domain-specific language rather than unstructured LaTeX, along with tools that automatically generate the LaTeX and prose. Moreover, SpecTec supports rapid prototyping of extensions, including execution of programs that use those extensions. We believe SpecTec is ready for evaluation by Wasm's standards body and have begun the process of working towards its official adoption.
Finally, we see Wasm and SpecTec as evidence that a rigorous and formal approach to language specification is possible and can be scaled to industrial-strength languages. Although the SpecTec tool itself is specialized to Wasm's needs, we believe that its overall architecture can be used as a blueprint to replicate a similar approach for other language specifications.

Fig. 6. Validation of local.get and global.get in a generated prose specification.

This representation (the IL) groups definitions and makes all binders, types, dimensions, recursions, and uses of subsumption explicit. It also annotates every iteration with the variables it iterates over. From here, the middlend takes over and runs a number of simplifying transformations on the IL:
• Infer implicit side conditions, such as the fact that the use of s[i] in a premise requires i < |s|, or that joint iterations like (x < y + 1)* require |x*| = |y*|.
• Lift the result type of partial functions (whose clauses are not exhaustive) into options, i.e., iterations t?, and adjust their use sites accordingly.
• Introduce auxiliary variables for option types without variable content, like ATOM?, in order to turn them into Booleans later.
• Turn subsumption (e.g., on variant types) into explicit injections.
• Perform dataflow analysis on premises, annotate suitable equations as variable bindings, and reorder premises accordingly (§3.3).
Not all these transformations are run for every backend; each backend can select them individually.
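The first transformation in the list, inferring side conditions from indexing, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the expression classes (`Var`, `Index`, `Len`, `Less`, `Eq`) and both functions are hypothetical and far simpler than SpecTec's IL; only the rule "s[i] requires i < |s|" is taken from the text.

```python
from dataclasses import dataclass


# Hypothetical toy expression AST.
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Index:  # seq[idx]
    seq: object
    idx: object


@dataclass(frozen=True)
class Len:  # |seq|
    seq: object


@dataclass(frozen=True)
class Less:  # lhs < rhs
    lhs: object
    rhs: object


@dataclass(frozen=True)
class Eq:  # lhs = rhs
    lhs: object
    rhs: object


def side_conditions(expr):
    """Collect the bounds checks implied by every indexing subexpression."""
    if isinstance(expr, Index):
        inner = side_conditions(expr.seq) + side_conditions(expr.idx)
        # s[i] is only defined when i < |s|.
        return inner + [Less(expr.idx, Len(expr.seq))]
    if isinstance(expr, (Less, Eq)):
        return side_conditions(expr.lhs) + side_conditions(expr.rhs)
    if isinstance(expr, Len):
        return side_conditions(expr.seq)
    return []  # variables imply no conditions


def add_side_conditions(premises):
    """Insert each implied condition before the premise that needs it,
    skipping conditions that are already stated or already inserted."""
    out = []
    for p in premises:
        for c in side_conditions(p):
            if c not in out and c not in premises:
                out.append(c)
        out.append(p)
    return out


s, i, t = Var("s"), Var("i"), Var("t")
premise = Eq(t, Index(s, i))  # t = s[i]
print(add_side_conditions([premise]) == [Less(i, Len(s)), premise])  # True
```

Placing the inferred condition ahead of the premise that uses the index reflects the intent that a backend evaluating premises in order never indexes out of bounds; whether SpecTec orders them this way is not stated in the text and is an assumption of this sketch.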