Solving Infinite-State Games via Acceleration

Two-player graph games have found numerous applications, most notably in the synthesis of reactive systems from temporal specifications, but also in verification. The relevance of infinite-state systems in these areas has led to significant attention towards developing techniques for solving infinite-state games. We propose novel symbolic semi-algorithms for solving infinite-state games with $\omega$-regular winning conditions. The novelty of our approach lies in the introduction of an acceleration technique that enhances fixpoint-based game-solving methods and helps to avoid divergence. Classical fixpoint-based algorithms, when applied to infinite-state games, are bound to diverge in many cases, since they iteratively compute the set of states from which one player has a winning strategy. Our proposed approach can lead to convergence in cases where existing algorithms require an infinite number of iterations. This is achieved by acceleration: computing an infinite set of states from which a simpler sub-strategy can be iterated an unbounded number of times in order to win the game. Ours is the first method for solving infinite-state games to employ acceleration. Thanks to this, it is able to outperform state-of-the-art techniques on a range of benchmarks, as evidenced by our evaluation of a prototype implementation.


INTRODUCTION
Reactive synthesis, introduced by Church [13], has the goal of automatically generating an implementation (for instance, a program or a finite-state controller) from a formal specification that describes the desired behavior of the system. The system requirements are typically specified using temporal logics, such as Linear Temporal Logic (LTL), which provide expressive high-level specification languages. Recent advancements [10, 21] have made possible the successful application of synthesis to industrial protocols and robotics.
The problem of synthesizing strategies in two-player games over graphs is tightly connected to reactive synthesis. Synthesis from temporal logic specifications can be reduced to computing a winning strategy in a game. Games can also be used to model the interaction between a system and its environment directly. There is a large body of algorithmic techniques and tools for solving finite-state games.
Many applications of two-player games, however, naturally require the treatment of infinite-state models, such as software synthesis and repair [24], controller synthesis in domains like robotics [29], and software verification against hyperproperties [7]. The problem of solving (i.e., determining the winner of) infinite-state games is in general undecidable, and many practical applications lie outside of decidable classes, making incomplete approaches necessary. As different approaches have different strengths, a number of techniques have been developed in recent years [8, 17, 35, 37]. However, the state of the art is still far from the level of the algorithmic approaches and tools available for the finite-state case.
We propose a novel technique for solving infinite-state games that aims to address one of the limitations of existing approaches, namely, that they usually diverge on game-solving tasks that require reasoning about the unbounded iteration of strategic decisions. We illustrate this challenge with an example.
Example 1.1. Consider the simple game shown in Fig. 1a. It models the interaction between a reactive program (the system) and its environment. The game has two locations, $\ell_0$ and $\ell_1$, an integer input variable $i$, and an integer program variable $x$. The edges depict transitions and are labeled with guard conditions over the input and program variables, and with updates, which are assignments to the program variable. When we depict transitions we separate guards and updates: edges originating in locations are labeled with guards and end in a black square, and edges originating in the black squares are labeled with the possible updates for the system to choose from. Note that the black squares are used simply for visualization. When playing the game, in each step the environment chooses a value for $i$. Then, in $\ell_0$, if $x \leq 42$ or $i$ is zero, the system must leave $x$ unchanged and the game moves to $\ell_1$. Otherwise, the system has to decide whether to add $i$ to $x$ or subtract $i$ from $x$, and the game stays in $\ell_0$. Once in $\ell_1$, the game stays there forever and $x$ is not changed. Let us consider the temporal specification where the system is required to eventually reach location $\ell_1$ from $\ell_0$ for any initial value of $x$. The program can indeed enforce this property by choosing the appropriate update at every step: adding $i$ to $x$ if $i < 0$ and subtracting $i$ from $x$ if $i > 0$. This ensures that $x$ is decreased in every step (unless the environment sets $i$ to 0, in which case the game moves to $\ell_1$), and hence $i = 0 \vee x \leq 42$ will eventually be reached.
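As a sanity check, the winning strategy described above can be simulated in a few lines of Python. The encoding below is our own illustration, not part of the formal model: locations, guards, and the update choice follow the informal description of Fig. 1a.

```python
# Simulation of the system strategy from Example 1.1 (illustrative encoding).
def system_update(x, i):
    """The system's choice in l0: add i if i < 0, subtract i if i > 0.
    Either way, x strictly decreases, since |i| >= 1 when i != 0."""
    return x + i if i < 0 else x - i

def play(x, env_inputs):
    """Play from location l0 with initial value x against a sequence of
    environment inputs; return the step at which the game moves to l1."""
    for step, i in enumerate(env_inputs):
        if x <= 42 or i == 0:    # guard of the only transition to l1
            return step
        x = system_update(x, i)  # otherwise stay in l0 with updated x
    return None  # l1 not reached within the given input sequence
```

Running `play` with any nonzero input sequence confirms that $\ell_1$ is reached after a number of steps that depends on the initial value of $x$, which is precisely the unbounded behavior discussed next.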
The number of times $x$ has to be decreased depends on its initial value, and therefore the number of iterations through $\ell_0$ is a priori unbounded. However, not only is the number of iterations in the system execution unbounded, which would correspond to an unbounded loop, but in every iteration the system has to make strategic decisions that depend on the input from the environment and impact the execution. We call such constructs unbounded strategy loops.
While necessary for many practical applications, handling unbounded strategy loops is out of the scope of existing methods for reactive synthesis. In our example, however, the argument needed to establish that the system can satisfy the requirement of the game can be stated as the following simple property: whenever $x > 42$, if the system can ensure that $x$ is decreased, then eventually $x \leq 42$ will be reached. To utilize this reasoning, the synthesis process has to establish that the task postulated in the sub-property, i.e., "$x$ is decreased", can be realized in certain situations, i.e., when $i \neq 0$. This is itself a reactive synthesis task. In this paper we propose a symbolic method for solving infinite-state games that utilizes such reasoning in order to handle unbounded strategy loops like the one in Fig. 1a.
Contributions. We present a symbolic approach for solving infinite-state games with temporal objectives, such as reachability, Büchi, and parity, lifting the respective finite-state algorithms. The key novelty of our technique is an acceleration method which, in contrast to existing approaches, accelerates unbounded strategy loops. It uses the newly introduced notion of acceleration lemmas, simple inductive statements that the symbolic algorithm applies to accelerate the termination of the game-solving process. To the best of our knowledge, this is the first acceleration-based method for game solving. We implemented our method in a prototype and demonstrate its feasibility by successfully applying it to standard benchmarks from the literature, on which it outperforms existing tools, as well as to several new benchmarks that are out of the scope of these tools.
Paper Outline. After discussing related work in Section 2, we present in Section 3 several examples illustrating different challenges, how our technique approaches them, and where existing methods fall short. After the technical preliminaries in Section 4, we introduce our game model in Section 5. We then describe the basis of our solving techniques in Section 6. In Section 7, we formally present our acceleration method. We continue by discussing the synthesis of acceleration lemmas in Section 8. Lastly, Section 9 evaluates our method and discusses our empirical observations.

RELATED WORK
We now overview the landscape of existing approaches for solving infinite-state games and explain how our work addresses some of their limitations. We also discuss relevant techniques for the verification of infinite-state systems and why they do not directly apply in the context of synthesis.
Infinite-State Game Solving. In general, games over infinite graphs are undecidable. Decidable classes, such as pushdown games [43] and downward-closed safety games [1], as well as termination criteria for symbolic procedures [14], have been found. As many practical applications lie outside of these classes, incomplete methods are needed.
Different approaches have been proposed for safety games. [35] presents an automata-learning method for safety games over infinite graphs, and [33] develops a learning-based technique for parameterized systems with safety specifications. In [17], the authors study safety and reachability games defined within the theory of linear rational arithmetic. [28] presents a method for synthesis from Assume-Guarantee contracts describing safety properties. The tool GenSys [37] implements a fixpoint-based solver for infinite-state safety games using an SMT solver. [16] presents a method for solving infinite-state reachability games by reducing the problem to checking the satisfiability of a system of constrained Horn clauses; it is restricted to games where the reachability player has finitely many actions to choose from. More expressive winning conditions, such as the Büchi condition, which requires repeated visits to some set of states, are out of the scope of these methods.
The constraint-based approach in [8] handles infinite-state games with winning conditions given by LTL specifications. It can solve problems with unbounded strategy loops. However, it requires the user to provide templates that structure how the synthesized system works, including how it handles the unbounded strategy loops. As can be seen in the examples in [8], such templates can be quite complex even for small games. In contrast, our approach uses inductive statements that are automatically generated from generalizable templates.
Solvers for first-order fixpoint logics [40, 41] can be used, as described in [40], to solve games with $\omega$-regular winning conditions. This approach uses the fixpoint encoding of the sets of states winning for a player in the game. Its ability to find a solution depends on the constraint-solving engine. In Section 9 we report on an experimental comparison with the technique described in [40], demonstrating the strengths of our approach.
Abstraction-based techniques have been extended to games [3, 25–27, 42]. Abstraction-based controller synthesis for dynamical systems [39] makes use of discretization. As these approaches reduce the solving of an infinite-state game to the finite-state case, they can use the techniques and tools available for finite-state games. Their effectiveness depends on the abstractions, whose iterative refinement might diverge when unbounded strategy loops are needed.
Infinite-State Synthesis from Temporal Logics. The recent works [12, 32] both study the problem of synthesizing reactive systems from specifications in Temporal Stream Logic modulo theories (TSL-MT [19]). This temporal logic can express conditions on unbounded input data and program variables. [12] integrates a procedure for synthesis from TSL specifications [20] with syntax-guided synthesis (SyGuS). The role of SyGuS is to generate assumptions that are added to a refined TSL formula.
The procedure proposed in [32] is based on abstraction refinement and performs a counterexample-guided synthesis loop that invokes LTL synthesis. The refinement uses an SMT solver to analyze counterstrategies for inconsistency with the theory. While [12] can handle some unbounded looping behavior by using recursive functions, neither [12] nor [32] can handle unbounded strategy loops in which the environment provides new input values at each iteration. A recent preprint [38] proposes a symbolic fixpoint computation for infinite-state games with LTL winning conditions. At the time of writing, the procedures for solving co-Büchi and Büchi games presented in version 1 of that preprint appear to be incorrect. Furthermore, the authors remark that their method has the same non-termination issues as [32], which implies that it will also diverge in the presence of unbounded strategy loops.

Infinite-State Verification.
There is a vast variety of approaches for the verification of infinite-state systems. Above we discussed extensions of abstraction-based and deductive verification techniques to the setting of games and the synthesis of reactive systems. In contrast, other prominent approaches used in the verification of infinite-state systems, such as acceleration [5, 6, 22, 31] or loop summarization [30], do not directly extend to the setting of two-player games and have thus far not been explored there. The key difficulty is caused by the alternation of environment inputs and decisions by the system player: a loop in the game structure might not be under the full control of the system. Thus, the existence of a loop whose transitions can be composed does not entail that the system can enforce iterating it. We illustrate this in the next section and discuss this challenge further throughout the paper.

OVERVIEW AND MOTIVATING EXAMPLES
In this section, we give a high-level overview of our approach, highlighting some of its strengths and distinguishing features on several simple but challenging examples. First, we show how our acceleration method enables termination in certain cases, specifically in the presence of unbounded strategy loops (Example 1.1), where existing techniques typically diverge. We explain, at a high level, how acceleration works, and also discuss the challenges that our technique addresses. Second, we demonstrate the advantage of our acceleration-based game-solving procedure over purely constraint-based approaches for solving fixpoint equations. Finally, we illustrate the applicability of our acceleration method to expressive classes of specifications encoded as Büchi objectives in the synthesis game.
We begin with a brief, high-level description of our game model and problem formulation. The underlying formal definitions are presented later, in Section 4 and Section 5.

Symbolic Model and Problem Formulation
We specify synthesis tasks as reactive program game structures, like the one depicted in Fig. 1a. Such a game structure consists of a finite set of control locations, a finite set I of input variables, and a finite set X of program variables. While the sets I and X are finite, the domains of the variables can be infinite; thus, reactive program game structures can represent infinite-state games. A reactive program game structure contains transitions between the locations, labeled with guards and updates. A transition has the form $(\ell, g, u, \ell')$, where $\ell$ and $\ell'$ are the source and target locations respectively, $g$ is the guard, and $u$ is the update. As in Fig. 1a, we depict such a transition using an edge from $\ell$ to a black square labeled with $g$, and an edge from the black square to $\ell'$ labeled with $u$. The intermediate black squares are for visualization only and are not part of the formal definition. A guard has to hold for the transition to be possible, and an update assigns to each program variable a term whose value it should be updated with. A reactive program game is a pair of a reactive program game structure and a winning condition expressed over the locations. For example, in Fig. 1a, we consider the reachability condition requiring that location $\ell_1$ is eventually visited. In Example 3.3 we show a reactive program game with a Büchi winning condition, which states that the system has to enforce visiting a set of accepting locations infinitely often.
The reactive program game is played over a possibly infinite set of states, where each state $(\ell, x)$ consists of the current location $\ell$ and an assignment $x$ of values to the program variables X. A step in the game is played as follows. In the current state $(\ell, x)$, the environment chooses some values $i$ for the inputs I. Then, the system chooses a transition $(\ell, g, u, \ell')$ whose guard $g$ is satisfied by the current values $i$ and $x$ of the inputs and the program variables, respectively. The next location is $\ell'$, and the next values of the program variables are determined according to the update $u$. Starting in some state $(\ell, x)$, this repeated interaction produces a play, an infinite sequence of states, which is winning for the system player if it satisfies the given winning condition.
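The step semantics just described can be sketched in code. This is a hypothetical encoding of our own (not the formal definition from Section 5): guards are predicates over the combined input/program assignment, and updates map program variables to functions of that assignment.

```python
# One step of a reactive program game (illustrative sketch).
# A transition is a tuple (source, guard, update, target); `choose` models
# the system's selection among the transitions enabled by the current input.
def step(location, x, i, transitions, choose):
    env = {**x, **i}  # combined assignment of program and input variables
    enabled = [t for t in transitions
               if t[0] == location and t[1](env)]  # matching source & guard
    src, guard, update, target = choose(enabled)   # system resolves the choice
    # apply the update; variables without an update term keep their value
    x_next = {v: update[v](env) if v in update else x[v] for v in x}
    return target, x_next
```

Instantiating `transitions` with the guards and updates of Fig. 1a reproduces the interaction from Example 1.1: the environment supplies `i`, and the system's `choose` picks which update to take.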
The realizability problem is to determine whether there exists a strategy for the system to resolve the transition (i.e., update) choices that guarantees that every play starting in any possible state $(\ell_{init}, x)$, for a given initial location $\ell_{init}$, is winning for the system, regardless of the input values chosen by the environment. The synthesis problem asks to compute such a strategy if one exists. One such strategy for Example 1.1 with initial location $\ell_0$ is depicted in Fig. 1b, in the form of a reactive program. The program contains an auxiliary variable, whose role will be explained later.

Acceleration for Unbounded Strategy Loops
Finite-state games can be solved by fixpoint algorithms that compute the sets of states from which a given player can enforce winning the game. To solve reactive program games, we lift those algorithms to compute with symbolically represented infinite sets of states. We represent such sets as elements of a symbolic domain D that maps each location to a formula in FOL(X), the set of first-order logic formulas with free variables among the program variables X. A symbolic set $d \in D$ represents the set of those states $(\ell, x)$ where the assignment $x$ to X satisfies $d(\ell)$. A basic notion in fixpoint-based game-solving algorithms is the attractor: for a given player and a given set of goal states, the attractor consists of the states from which the player can enforce reaching a goal state in some number of steps. Attractor sets can be computed via iterative fixpoint computation, which is not guaranteed to terminate in the case of an infinite state space. We illustrate this on Example 1.1.
Example 1.1 (Continued). Suppose we want to compute the attractor for the system player for the symbolically represented set of states $d := \{\ell_0 \mapsto \bot, \ell_1 \mapsto \top\}$ (describing the set of states whose location is $\ell_1$). After the first iteration, we get $\{\ell_0 \mapsto (x \leq 42), \ell_1 \mapsto \top\}$. In the next one, we get $\{\ell_0 \mapsto (x \leq 43), \ell_1 \mapsto \top\}$, and so on. A fixpoint is never reached, since at every step of the computation we add one state from the infinite attractor set.
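The divergence can be observed concretely. For this particular game the symbolic set at $\ell_0$ always has the shape $x \leq k$, so, purely for illustration, we can track the bound $k$ instead of manipulating first-order formulas as the real procedure does:

```python
# Iterating the controllable-predecessor step of the attractor computation
# for Example 1.1, tracking only the bound k of the set x <= k at l0
# (an illustration-only simplification).
def cpre_bound(k):
    # From l0 with value x, the system wins in one more step if the guard to
    # l1 fires for every input (x <= 42), or if for every i != 0 it can move
    # x into the current set: the best move reaches x - |i| <= x - 1, so the
    # worst case |i| = 1 gives the new bound k + 1.
    return max(42, k + 1)

def attractor_bounds(n):
    """Bounds after the first n iterations of the attractor computation."""
    k, bounds = None, []
    for _ in range(n):
        k = 42 if k is None else cpre_bound(k)
        bounds.append(k)
    return bounds
```

The bounds grow by one per iteration and never stabilize, mirroring the non-terminating formula sequence $x \leq 42, x \leq 43, \ldots$ above.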
In Section 1 we argued intuitively why in Example 1.1 every state $(\ell_0, x)$ belongs to the attractor for the system player. The method we propose in this paper is able to establish this automatically and compute the attractor. More concretely, this is achieved as follows.
• We introduce the notion of acceleration lemmas, which allow us to formally express arguments like the one outlined in Section 1. An acceleration lemma, precisely defined in Section 7.1, is a triple (base, step, conc) of FOL formulas. The conclusion conc characterizes a set of states with the property that from each of them, by iterating the step relation described by step, a state in the set characterized by the condition base must be reached. Intuitively, conc characterizes the states added to the attractor by applying this acceleration lemma. For our Example 1.1, one possible acceleration lemma is (base, step, conc) with base $:= (x \leq 42)$, step $:= (x' < x)$ and conc $:= \top$. In the formula step $:= (x' < x)$, the variable $x$ refers to the value of the respective program variable at one visit to location $\ell_0$, and $x'$ refers to its value the next time $\ell_0$ is visited. Clearly, starting in any state $(\ell_0, x_0)$, any sequence of states $(\ell_0, x_0), (\ell_0, x_1), \ldots$ that satisfies step eventually reaches a state that satisfies $x \leq 42$.
• The purpose of an acceleration lemma (base, step, conc) is to accelerate the computation of an attractor set for a given player in a reactive program game structure G, by adding all the states $(\ell, x)$ that satisfy conc to the attractor at once. In order to apply such a lemma, the attractor computation procedure must first ensure that the set of states described by base is included in the subset of the attractor computed thus far. Second, it is necessary to establish that the player can enforce the repeated iteration of the step relation in the game G against all possible behaviors of the opponent. Intuitively, this can be guaranteed by showing that the player has a strategy to enforce looping in a location $\ell$ such that each iteration satisfies step. In Example 1.1 this is indeed the case for location $\ell_0$ and the acceleration lemma given above. Thus, our method is able to add all states $(\ell_0, x)$ to the attractor for the system player.
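For intuition, the soundness of the example lemma can be checked dynamically: any sequence of integers respecting step must eventually enter base, because a strictly decreasing sequence of integers cannot stay above 42 forever. The sketch below is our own illustrative encoding, not the formal procedure of Section 7:

```python
# Dynamic check of the acceleration lemma from Example 1.1:
#   base = (x <= 42), step = (x' < x), conc = True.
base = lambda x: x <= 42
step_ok = lambda x, x_next: x_next < x

def iterate_to_base(x, update):
    """Iterate a step-respecting update until base holds; return the
    reached value and the number of iterations. Termination is exactly
    what the lemma guarantees for integer-valued x."""
    steps = 0
    while not base(x):
        x_next = update(x)
        assert step_ok(x, x_next), "update violates the step relation"
        x, steps = x_next, steps + 1
    return x, steps
```

Any update satisfying the step relation, regardless of how much it decreases $x$ per iteration, drives the value into base; the attractor acceleration exploits precisely this fact to add all conc-states at once.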
In Section 7.1 we describe in detail how our method generates acceleration lemmas while at the same time ensuring that their step relations can be enforced by the respective player in the underlying reactive program game structure. Next, we give an example that illustrates why the latter condition is crucial for the correct application of an acceleration lemma. In fact, this is one of the major challenges for acceleration in the context of synthesis, and a key difference to the application of acceleration techniques in verification.
Example 3.1. Consider a modification of the reactive program game depicted in Fig. 1a in which we omit the parts of the guards that refer to the input variable $i$ and are colored blue. Now, the only states $(\ell_0, x)$ from which the system can enforce reaching location $\ell_1$ are those where $x \leq 42$, since otherwise the environment can prevent the system from decreasing $x$ by always setting $i$ to 0. Thus, although the reactive program game contains loops that decrease $x$, these loops cannot be enforced by the system. This is a major difference to the acceleration techniques employed in verification: the existence of a loop with certain properties is a realizability problem in itself. In Section 7.1 we show how we address this problem by means of a so-called loop game.

Embedding Acceleration in Iterative Symbolic Fixpoint Computation
Our acceleration technique is embedded in symbolic game-solving algorithms based on iterative fixpoint computation. This enables our method to combine the strengths of both techniques, applying acceleration to unbounded strategy loops in concert with performing a finite number of concrete iterations of the fixpoint computation. The latter can be helpful when the set of winning states that the game-solving procedure has to compute does not have a "simple" and "easy to discover" symbolic characterization, as the next example illustrates.
Example 3.2. Consider the reactive program game structure depicted in Fig. 2. The specification again requires the system to reach a designated target location from $\ell_0$ for any initial values of the program variables. Note that if the value of one of the program variables is at least 6 when transitioning from location $\ell_0$ to $\ell_1$, the system can reach the target location after doubling that value six times. Regardless of the initial values of the variables in $\ell_0$, it is possible to reach $\ell_1$ with the variable having value 6 or larger. Thus, the specification is realizable.
If linear arithmetic formulas are used to represent sets of states, then the set of values at location $\ell_1$ from which the target location can be reached is characterized by a formula that constraint-solving techniques for fixpoint equation systems can have difficulty generating, as we will see in our experimental evaluation in Section 9. On the other hand, methods for iterative symbolic attractor computation that do not apply acceleration diverge due to the presence of the loop in $\ell_0$.

[Fig. 2: Reactive program game structure with two system-controlled variables and no input variables. We use skip to denote the update that leaves both variables unchanged. If the update to some program variable is missing from the label of an update edge, the value of that variable remains unchanged.]
Our acceleration-based procedure successfully determines realizability in this case, as it integrates acceleration into the iterative fixpoint computation. Note that, for simplicity, this example does not contain environment inputs, but the same challenges are present even when it does.

Acceleration Beyond Reachability and Safety Games
The synthesis approach that we propose is applicable beyond reachability specifications. More concretely, we consider Büchi specifications, which require that some set of locations is visited infinitely often, and parity objectives, which can express all $\omega$-regular specifications. Accelerated attractor computation can be readily integrated into symbolic fixpoint-based procedures for infinite-state games with these types of objectives. The next example shows a reactive program game with a Büchi specification for which our attractor acceleration method allows us to establish realizability.
Example 3.3. In the reactive program game depicted in Fig. 3, the Büchi specification defined by the set of locations $\{\ell_2, \ell_3\}$ requires the system to visit one of these two locations infinitely often. Looping between $\ell_0$ and $\ell_3$ is possible as long as $y > 16$ at location $\ell_0$. Otherwise, upon reaching $\ell_3$ with $y < 16$, the program has to transition to $\ell_4$, and revisiting either $\ell_3$ or $\ell_2$ is not possible from that point on. From location $\ell_0$ the system can transition to $\ell_1$, from where, similarly as in the game from Example 1.1, the system can ensure reaching $\ell_2$ as long as $x \leq 42$ or $y \leq 32$. If $y > 32$ and $x > 42$ in location $\ell_1$, the environment can prevent the system from decreasing $x$ by always setting $i$ to a positive value. Thus, if every time it is in location $\ell_0$ the system transitions to $\ell_1$ when $y \leq 32$ and to $\ell_3$ when $y > 32$, it can ensure that the set of locations $\{\ell_2, \ell_3\}$ is visited infinitely often and thus satisfy the given Büchi specification. Our procedure successfully computes that every state at location $\ell_0$ is winning for the system. As part of the computation, attractor acceleration is applied to establish that from every state at location $\ell_0$ in which the value of $y$ is at most 32, the system has a strategy to enforce reaching $\ell_2$. Note the presence of an unbounded strategy loop, due to which the standard symbolic method for solving Büchi games is bound to diverge.

[Fig. 3: Reactive program game structure with two system-controlled variables $x$ and $y$ and an input variable $i$. We use skip to denote the update that leaves both variables unchanged. If the update to a variable is missing from the label of an update edge, its value remains unchanged. Double circles denote the Büchi accepting locations $\{\ell_2, \ell_3\}$.]
In the example above, applying acceleration to the attractor computation performed as part of the procedure for solving Büchi games suffices to ensure convergence. In Section 7.3 we give an example where this is not enough, and we describe an acceleration method for Büchi conditions.

TECHNICAL PRELIMINARIES
Functions, Terms, and First-Order Logic. Let V be the set of all values of arbitrary types, $F := \{f : V^n \to V \mid n \in \mathbb{N}\}$ be the set of all functions, and $P := \{p \in F \mid \mathrm{Range}(p) \subseteq \mathbb{B}\}$ be the set of all predicates. Let Vars be the countably infinite set of all variables. For $W \subseteq \mathrm{Vars}$, we denote with $W' := \{w' \mid w \in W\}$ the set of primed variables, such that $W \cap W' = \emptyset$. Terms are used to describe functions and predicates. Let $\Sigma$ be the set of all function symbols and $\Sigma_P \subset \Sigma$ the set of all predicate symbols. Function terms T are defined by the grammar $t ::= f(t_1, \ldots, t_n) \mid v$ for $f \in \Sigma$ and $v \in \mathrm{Vars}$, and we denote with $T_{\mathbb{B}}$ the function terms of Boolean type.
A function $\nu : \mathrm{Vars} \to V$ is called a variable assignment (or simply assignment). The set of all assignments over a set of variables $W \subseteq \mathrm{Vars}$ is denoted as $\mathrm{Assignments}(W)$. We denote the combination of two assignments $\nu', \nu''$ over disjoint sets of variables by $\nu' \uplus \nu''$. Given an assignment $\nu$, a variable $v \in \mathrm{Vars}$ and a value $d \in V$, we define the assignment $\nu[v \mapsto d]$ that maps $v$ to $d$ and agrees with $\nu$ on all other variables. A function $\mathcal{I} : \Sigma \to F$ is called an interpretation of the function symbols (or simply interpretation). We require $\mathcal{I}(p) \in P$ for $p \in \Sigma_P$. The set of all interpretations of a set of function symbols $\Sigma' \subseteq \Sigma$ is denoted as $\mathrm{Interpretations}(\Sigma')$. We denote the combination of two interpretations $\mathcal{I}', \mathcal{I}''$ over disjoint symbol sets by $\mathcal{I}' \uplus \mathcal{I}''$. The evaluation of function terms $\llbracket \cdot \rrbracket_{\nu,\mathcal{I}} : T \to V$ is defined by $\llbracket v \rrbracket_{\nu,\mathcal{I}} := \nu(v)$ for $v \in \mathrm{Vars}$, and $\llbracket f(t_1, \ldots, t_n) \rrbracket_{\nu,\mathcal{I}} := \mathcal{I}(f)(\llbracket t_1 \rrbracket_{\nu,\mathcal{I}}, \ldots, \llbracket t_n \rrbracket_{\nu,\mathcal{I}})$ for $f \in \Sigma$.
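The evaluation of function terms can be sketched directly. In the following hypothetical representation, terms are strings (variables) or nested tuples (symbol applications), and assignments and interpretations are dictionaries:

```python
# Evaluation of function terms relative to a variable assignment `nu` and an
# interpretation `interp` of the function symbols, mirroring the recursive
# definition above. Illustrative encoding only.
def evaluate(term, nu, interp):
    if isinstance(term, str):   # a variable: look up its value in nu
        return nu[term]
    symbol, *args = term        # an application f(t_1, ..., t_n)
    return interp[symbol](*(evaluate(t, nu, interp) for t in args))
```

For instance, evaluating the term $plus(x, plus(y, y))$ under the assignment $\{x \mapsto 3, y \mapsto 4\}$ and the standard interpretation of $plus$ yields 11.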
We denote the set of all first-order formulas as FOL. Let $\varphi$ be a formula and $W = \{w_1, \ldots, w_n\} \subseteq \mathrm{Vars}$ be a set of variables. For a quantifier $Q \in \{\exists, \forall\}$, we write $QW.\varphi$ as a shortcut for $Qw_1 \ldots Qw_n.\varphi$. We denote by QF the set of all quantifier-free formulas in FOL. We write $\varphi(W)$ to denote that the free variables of $\varphi$ are a subset of $W$. We also denote with $\mathrm{FOL}(W)$ and $\mathrm{QF}(W)$ the sets of formulas (respectively quantifier-free formulas) whose free variables belong to $W$. Given variables $v_1, \ldots, v_n \in \mathrm{Vars}$, constant function terms (of arity zero) $c_1, \ldots, c_m \in T$ and function terms $t_1, \ldots, t_{n+m} \in T$, we denote with $\varphi[t_1/v_1, \ldots, t_n/v_n, t_{n+1}/c_1, \ldots, t_{n+m}/c_m]$ the formula obtained from $\varphi$ by the simultaneous replacement of all free occurrences of each $v_j$ with the respective term $t_j$ and of each occurrence of $c_j$ with $t_{n+j}$. We denote with $\models \ \subseteq \mathrm{Assignments}(\mathrm{Vars}) \times \mathrm{Interpretations}(\Sigma) \times \mathrm{FOL}$ the entailment relation of first-order logic. A first-order theory $\mathcal{T} \subset 2^{\mathrm{FOL}(\emptyset)}$ is a finite set of closed first-order logic formulas, called axioms, that restrict the possible interpretations of the function and predicate symbols. The models of a theory are defined as $\mathrm{Models}(\mathcal{T}) := \{\mathcal{I} \in \mathrm{Interpretations}(\Sigma) \mid \mathcal{I} \text{ satisfies all axioms of } \mathcal{T}\}$, i.e., all possible interpretations of the function symbols that satisfy all axioms. Given a theory $\mathcal{T}$, for a formula $\varphi(W)$ and an assignment $\nu \in \mathrm{Assignments}(W)$ we define that $\nu \models_{\mathcal{T}} \varphi$ if and only if $\nu, \mathcal{I} \models \varphi$ for all $\mathcal{I} \in \mathrm{Models}(\mathcal{T})$.
Two-Player Games of Infinite Duration. We consider two-player games between a system player and an environment player. A game structure is a tuple $G = (S, M_e, M_s, T)$, where $S$ is a set of states, $M_e$ and $M_s$ are the sets of possible moves for Player Env and Player Sys respectively, and $T \subseteq S \times M_e \times M_s \times S$ is a transition relation where: (1) for all $s \in S$, there exist $m_e \in M_e$, $m_s \in M_s$ and $s' \in S$ such that $(s, m_e, m_s, s') \in T$, and (2) for all $s \in S$, $m_e \in M_e$ and $m_s \in M_s$, if $(s, m_e, m_s, s_1) \in T$ and $(s, m_e, m_s, s_2) \in T$, then $s_1 = s_2$. Condition (1) states that every state has a successor, and condition (2) states that the moves chosen by the two players uniquely determine a successor. We define the functions $\mathrm{Act}_e : S \to 2^{M_e}$ and $\mathrm{Act}_s : S \times M_e \to 2^{M_s}$ that indicate the enabled moves of each player: $\mathrm{Act}_e(s) := \{m_e \mid \exists m_s, s'.\ (s, m_e, m_s, s') \in T\}$ and $\mathrm{Act}_s(s, m_e) := \{m_s \mid \exists s'.\ (s, m_e, m_s, s') \in T\}$. A game on $G$ is played by Player Env and Player Sys as follows. In a state $s \in S$, Player Env chooses a move $m_e \in \mathrm{Act}_e(s)$, and then Player Sys chooses a move $m_s \in \mathrm{Act}_s(s, m_e)$. These choices define the next state $s'$ such that $(s, m_e, m_s, s') \in T$, and the game continues from $s'$. The resulting infinite sequence $s_0, s_1, s_2, \ldots \in S^\omega$ of states is called a play. For $p \in \{\mathrm{Env}, \mathrm{Sys}\}$ we define $1 - p := \mathrm{Sys}$ when $p = \mathrm{Env}$, and $1 - p := \mathrm{Env}$ when $p = \mathrm{Sys}$. A strategy for Player Env is a function $\sigma_e : S^+ \to M_e$ where $\sigma_e(s_0, \ldots, s_n) = m_e$ implies $m_e \in \mathrm{Act}_e(s_n)$. A strategy for Player Sys is a function $\sigma_s : S^+ \times M_e \to M_s$ such that $\sigma_s((s_0, s_1, \ldots, s_n), m_e) = m_s$ implies $m_s \in \mathrm{Act}_s(s_n, m_e)$. We denote with $\mathrm{Strat}_p(G)$ the set of all strategies for Player $p \in \{\mathrm{Env}, \mathrm{Sys}\}$.
Given $s \in S$ and strategies $\sigma_e$ and $\sigma_s$ for the two players, we denote with $\mathrm{Outcome}(s, \sigma_e, \sigma_s)$ the unique play $s_0, s_1, s_2, \ldots$ such that $s_0 = s$, and for all $n \in \mathbb{N}$ there exist $m_e \in M_e$ and $m_s \in M_s$ such that $\sigma_e(s_0, s_1, \ldots, s_n) = m_e$, $\sigma_s((s_0, s_1, \ldots, s_n), m_e) = m_s$ and $(s_n, m_e, m_s, s_{n+1}) \in T$. Given a strategy $\sigma \in \mathrm{Strat}_p(G)$ and $s \in S$, we define $\mathrm{Plays}(\sigma, s)$ as the set of plays that start in $s$ and are consistent with $\sigma$. An objective is a set $\Omega \subseteq S^\omega$. The set of states winning for Player $p$ with respect to objective $\Omega$, denoted $W_p(G, \Omega)$, consists of the states $s \in S$ for which there exists a strategy $\sigma \in \mathrm{Strat}_p(G)$ such that every play in $\mathrm{Plays}(\sigma, s)$ belongs to $\Omega$.

REACTIVE PROGRAM GAMES
In this section, we define reactive program games, a symbolic model that we use to specify and synthesize reactive programs operating over infinite domains. Fig. 1a, Fig. 2, Fig. 3, and Fig. 4a depict examples of reactive program games. The interpretation of the function symbols appearing in a reactive program game is required to conform to a given first-order theory. Unless stated otherwise, we consider the theories of linear (integer or real) arithmetic. The reactive program game in Fig. 1a is defined in the theory of linear integer arithmetic.
Definition 5.1 (Reactive Program Game Structure). A reactive program game structure is a tuple $G = (\mathcal{T}, I, X, L, \mathrm{Inv}, \Delta)$ with the following components. $\mathcal{T}$ is a first-order theory. $I \subseteq \mathrm{Vars}$ is a finite set of input variables. $X \subseteq \mathrm{Vars}$ is a finite set of program variables where $I \cap X = \emptyset$. $L$ is a finite set of game locations. $\mathrm{Inv} : L \to \mathrm{FOL}(X)$ maps each location to a location invariant.
$\Delta \subseteq L \times \mathrm{QF}(X \cup I) \times (X \to T) \times L$ is a finite symbolic transition relation, where for every $\ell \in L$ the set of outgoing transition guards $\mathrm{Guards}(\ell) := \{g \mid \exists u, \ell'.\ (\ell, g, u, \ell') \in \Delta\}$ satisfies the conditions: (1) $\bigvee_{g \in \mathrm{Guards}(\ell)} g \equiv \top$, and for all $g_1, g_2 \in \mathrm{Guards}(\ell)$ with $g_1 \neq g_2$ it holds that $g_1 \wedge g_2 \equiv \bot$; (2) for all $\ell, g, u, \ell_1, \ell_2$, if $(\ell, g, u, \ell_1) \in \Delta$ and $(\ell, g, u, \ell_2) \in \Delta$, then $\ell_1 = \ell_2$; and (3) for every $\ell \in L$ and $x \in \mathrm{Assignments}(X)$ such that $x \models \mathrm{Inv}(\ell)$ there exist a transition $(\ell, g, u, \ell') \in \Delta$ and $i \in \mathrm{Assignments}(I)$ such that $x \uplus i \models g$. The requirements on $\Delta$ imply for each $\ell \in L$ that: (1) the guards in $\mathrm{Guards}(\ell)$ partition the set $\mathrm{Assignments}(X \cup I)$; (2) each pair of guard $g \in \mathrm{Guards}(\ell)$ and update $u$ can label at most one outgoing transition from $\ell$; and (3) if there is an assignment satisfying the invariant at $\ell$, then there is an input assignment for which there is a possible transition, i.e., there are no dead-end states. We define the set of input assignments in $G$ as $\mathrm{Inputs}_G := \mathrm{Assignments}(I)$, and the set of updates in $G$ as $\mathrm{Updates}_G := \{u \mid \exists \ell, g, \ell'.\ (\ell, g, u, \ell') \in \Delta\}$. The semantics of the reactive program game structure $G$ is a possibly infinite-state game structure. Its set of states consists of the pairs $(\ell, x)$ of a game location and an assignment $x$ to the program variables $X$. The moves of Player Env (modeling the environment) are the input assignments $\mathrm{Inputs}_G$, of which there may be infinitely many. Player Sys (corresponding to the program being synthesized) chooses the updates to the program variables (from the set $\mathrm{Updates}_G$) and therefore has, by definition, only finitely many moves.
Remark: When all the location invariants in a game structure are ⊤, for instance in our examples, we omit them for the sake of brevity.

Definition 5.2 (Semantics of Reactive Program Game Structures). Let G = (𝒯, I, X, L, Inv, δ) be a reactive program game structure. The semantics of G is the game structure ⟦G⟧ = (S, M_e, M_s, Δ) where

Δ((ℓ, x), i, u) = (ℓ′, x′) if and only if
- there exists g ∈ Guards(ℓ) such that (ℓ, g, u, ℓ′) ∈ δ and x ⊎ i ⊨ g, and
- x′(v) = (x ⊎ i)(u(v)) for every v ∈ X.
The transition relation Δ is well-defined, as the conditions on δ in Definition 5.1 ensure that, given (ℓ, x) ∈ S, i ∈ Inputs_G, and u ∈ Updates_G, the successor location ℓ′ and program variable assignment x′ are uniquely determined. The successor assignment x′ is obtained by updating the value of each program variable v ∈ X according to the corresponding update term u(v) and the assignment i to the input variables. For a state s = (ℓ, x) ∈ S, we denote with loc(s) := ℓ the location of s.
We consider the realizability problem for reactive program games, formally defined below.

Realizability and Program Synthesis for Reactive Program Games
Given a reactive program game structure G = (𝒯, I, X, L, Inv, δ), an objective Ω for Player Sys, and ℓ ∈ L, the realizability problem is to determine whether (ℓ, x) ∈ W_Sys(⟦G⟧, Ω) for every x ∈ Assignments(X) with x ⊨ Inv(ℓ). The program synthesis problem is to compute a strategy for Player Sys that is winning from every (ℓ, x) ∈ W_Sys(⟦G⟧, Ω).

SYMBOLIC PROCEDURES FOR REACTIVE PROGRAM GAMES
Given a reactive program game structure G = (𝒯, I, X, L, Inv, δ) and an objective Ω for Player p, the game-solving problem is to compute the set of states W_p(⟦G⟧, Ω). The realizability question for a given location ℓ can be answered by checking whether this set contains all states s ∈ S with loc(s) = ℓ.
We present procedures for solving reactive program games with the main types of objectives considered in reactive synthesis. Similarly to the respective algorithms for finite-state games, a building block of these procedures is the computation of the so-called attractor sets. We lift the classical algorithms to infinite-state games by employing a symbolic attractor computation procedure. Since the corresponding game-solving problems are generally undecidable, the game-solving procedures are not guaranteed to terminate. In the next section, we propose an acceleration technique which, when it succeeds, enforces convergence.

Symbolic Representation and Operations
We now present the basic building blocks of our procedures for solving reactive program games: the symbolic representation of sets of states and the necessary operations on this representation.
Attractor. Let A ⊆ S be a set of states. The set of states from which Player p can enforce reaching a state in A is called the Player-p attractor for A in G and is denoted by Attr_{G,p}(A). Formally, Attr_{G,p}(A) is the least set of states that contains A and contains every state from which Player p can enforce reaching it in one step. In the finite-state case, attractors are computed by a fixpoint iteration using the so-called controllable predecessor operators. We define their symbolic counterparts, where for φ ∈ FOL(X) and u ∈ Updates_G, φ[u] := φ[v ↦ u(v) | v ∈ X] applies the substitution defined by u : X → T to φ, resulting in the formula obtained by the simultaneous replacement of every v ∈ X with the respective term u(v). By definition, we have that CPre_{G,Env}, CPre_{G,Sys} : D → D.
Algorithm 1 can be used to compute Attr_{G,p}(A) symbolically, given a symbolic representation of A. Fig. 4b shows an example of such a computation, and the next proposition states its soundness.
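Algorithm 1 operates on symbolic formulas in D. Purely for intuition, the same fixpoint can be run over an explicit finite restriction of a game resembling the running example (the restriction of c to 0..50, the labels, and the trivial Env input are all our own simplifications):

```python
from itertools import product

# Illustrative finite-state version of the attractor fixpoint: states are
# (location, c) pairs; Env's input is irrelevant here, so it is one dummy
# value.  Sys can jump to the goal when c <= 42, else decrement c.

LOCS = ["l0", "l3"]
CS = range(51)
INPUTS = [None]

def successors(state, i):
    """Successor states per Sys choice, after Env fixed input i."""
    loc, c = state
    if loc == "l3":
        return {("l3", c)}            # the goal location is absorbing
    if c <= 42:
        return {("l3", c)}            # guarded jump to the goal
    return {("l0", c - 1)}            # Sys enforces a decrease

def cpre_sys(A):
    """Controllable predecessor for Sys: for every Env input there is a
    Sys move leading into A."""
    return {s for s in product(LOCS, CS)
            if all(successors(s, i) & A for i in INPUTS)}

def attractor_sys(target):
    A = set(target)
    while True:                        # least fixpoint: A = A ∪ CPre(A)
        A2 = A | cpre_sys(A)
        if A2 == A:
            return A
        A = A2
```

On this finite restriction the iteration converges; the point of Section 7 is that on the unbounded domain the analogous symbolic iteration would add one more value of c per round and never stabilize.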

Symbolic Game Solving
In reactive program games we consider objectives defined in terms of the locations appearing in a play. Below we recall the definitions of common types of objectives and the classical algorithms for solving such games, formulated symbolically in the context of reactive program games.

Reachability and Safety Games. The reachability objective Reach(T) for T ⊆ L requires that some state with location in T is visited eventually. The dual safety objective Safety(T) for T ⊆ L requires that only locations in T are visited by the play. Reactive program games with reachability objectives for a Player p can be solved by applying Algorithm 1 to compute the Player-p attractor for the set of states with locations in T. More concretely, Proposition 6.1 directly implies that W_p(⟦G⟧, Reach(T)) is exactly this attractor.

Büchi and co-Büchi Games. Employing the symbolic attractor computation procedure in Algorithm 1, we lift the classical algorithm for solving finite-state games with Büchi and co-Büchi objectives to a procedure for solving reactive program games with these objectives. The procedure, given in Algorithm 2, is based on a nested fixpoint computation, like the classical algorithm. The inner fixpoint iteration computes attractor sets for the Player p with Büchi objective Buchi(B), and the outer fixpoint iteration computes increasing underapproximations of the set of winning states of the Player 1 − p with co-Büchi objective coBuchi(B). At each iteration, we first compute the set of states from which Player p can enforce a visit to B, by calling Algorithm 1. Then, we compute the set of states from which Player 1 − p can prevent Player p from revisiting B. This is done by calling Algorithm 1 to compute an attractor for Player 1 − p.
To solve a reactive program Büchi game defined by a set of locations B ⊆ L in G, we execute Algorithm 2 with the symbolic state {ℓ ↦ Inv(ℓ) | ℓ ∈ B}. If the computation reaches a fixpoint, that is, for some iteration the computed symbolic state coincides with that of the previous iteration (and hence the underapproximation for Player 1 − p also stabilizes), then the computed set is the winning region of Player p. This follows from the correctness of the classical algorithm for solving Büchi games [9, 23].
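For intuition, here is the classical finite-state counterpart that Algorithm 2 lifts: repeatedly remove the opponent's attractor to the states that escape the Büchi player's attractor to B. The paper's games are symbolic and not turn-based, so this turn-based toy graph (states, edges, and ownership are all our own) is only an analogy:

```python
# Toy turn-based graph: player 0 is the Büchi player (Sys).
EDGES = {"s0": ["s1"], "s1": ["s0"], "s2": ["s2", "s0"]}
OWNER = {"s0": 0, "s1": 0, "s2": 1}

def attr(player, target, S):
    """Player's attractor to target inside the subgame induced by S."""
    A = set(target) & S
    changed = True
    while changed:
        changed = False
        for s in S - A:
            succ = [t for t in EDGES[s] if t in S]
            if (OWNER[s] == player and any(t in A for t in succ)) or \
               (OWNER[s] != player and succ and all(t in A for t in succ)):
                A.add(s)
                changed = True
    return A

def buchi_win(B):
    """Winning region of player 0 for 'visit B infinitely often'."""
    S = set(EDGES)
    while True:
        R = attr(0, B, S)          # inner fixpoint: reach an accepting state
        W1 = attr(1, S - R, S)     # opponent escapes the recurrence forever
        if not W1:
            return S
        S = S - W1                 # outer fixpoint: shrink the candidate set
```

Here `s2` is winning for the opponent (it can loop at `s2` forever and avoid `s0`), while `s0` and `s1` form a cycle through the accepting state.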
Parity Games. A parity objective is defined via a function col : L → {0, 1, . . . , d}, for a given d ∈ N, that associates each location with a color. The parity objective requires that the maximal color a play visits infinitely often is even. Formally, Parity(col) contains the plays π for which the maximal color occurring infinitely often in the sequence colAt_0(π), colAt_1(π), . . . is even, where colAt_k(π) := col(loc(π[k])). To solve reactive program games with parity objectives, we lift Zielonka's algorithm [45] to a symbolic procedure in a similar manner as those above. The algorithm makes recursive calls with game structures obtained from the original one by removing states from it. In the symbolic computation we achieve this by strengthening the location invariants Inv. We recall the formal algorithm in Appendix A.

Strategy Extraction
When the answer to the realizability question for a reactive program game is positive, that is, for the given location ℓ_init all states s ∈ S with loc(s) = ℓ_init are winning for Player Sys, we might want to extract a winning strategy for Player Sys in the form of a program. To this end, we lift the strategy-generation extensions of the classical algorithms on which the symbolic procedures are based.
We represent winning strategies for Player Sys as simple GOTO programs, with labels corresponding to locations, which contain goto statements, read statements for the input variables, conditionals, and the selected updates from the game in the form of (parallel) assignments to the program variables. Later on, we will extend these programs with additional labels, auxiliary variables, and simple assignments to the auxiliary variables. We now describe how the symbolic game-solving procedures described earlier in this section are extended to produce such programs.
The basic building block for strategy generation is the extraction of a program statement in a location ℓ ∈ L from a strategic decision based on the controllable predecessor CPre_{G,Sys}(φ)(ℓ) for some φ ∈ D. This requires a strategy that enforces reaching φ in one step from ℓ. Such a strategy selects a transition (ℓ, g, u, ℓ′) where the guard g holds and, after the update u, both φ(ℓ′) and Inv(ℓ′) hold. For example, for φ = {ℓ′ ↦ x > 0}, guard x < 5, invariant Inv(ℓ′) = ⊤, and update x := x + 1, we extract the program statement if (x < 5 ∧ x + 1 > 0) then x := x + 1; goto ℓ′ else . . ., which could then be followed by other such transition statements. Note that, before branching according to the choice of the strategy at a given location, we have to add read statements for the input variables.
For games with reachability objectives, a winning strategy can be generated based on the attractor computation, augmented with keeping a record of the "layers" of the attractor, that is, the individual iterates X₁, X₂, . . . until termination. For each location ℓ ∈ L and the states in X_{k+1}(ℓ) ∧ ¬X_k(ℓ), that is, the states added to the attractor in step k + 1, the winning strategy selects a transition based on the controllable predecessor CPre_{G,Sys}(X_k)(ℓ), resulting in a program statement as described above.
For safety games, we first compute the set of states W_Sys(⟦G⟧, Safety(T)) winning for Player Sys, symbolically represented by some φ ∈ D. Then, for each location ℓ ∈ T, we generate a program statement representing a strategy to remain in φ, that is, based on CPre_{G,Sys}(φ)(ℓ).
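The winning region of a safety game is a greatest fixpoint, computed by shrinking rather than growing. The following finite-state sketch (the state space, Env input, and Sys moves are our own example, not from the paper) makes the shrinking concrete:

```python
# Greatest fixpoint for safety: W = Safe ∩ CPre_Sys(W), iterated until
# stable.  Env picks d in {0, 1}; Sys then picks one of its moves; leaving
# the range 0..10 is unsafe.

STATES = set(range(11))

def sys_choices(c, d):
    # Above 7, Sys has no decrementing move and must follow Env upward,
    # so those states will be pruned out of the winning region.
    return [c + d] if c >= 8 else [c + d, c - 1]

def cpre_sys(W):
    """States where, for every Env input, some Sys move stays in W."""
    return {c for c in STATES
            if all(any(c2 in W for c2 in sys_choices(c, d)) for d in (0, 1))}

def safety_region(safe):
    W = set(safe) & STATES
    while True:
        W2 = W & cpre_sys(W)
        if W2 == W:
            return W
        W = W2
```

The states 8..10 are removed one per iteration: from 10 Env can force the play to 11, then 9 loses because its only escape leads to the already-removed 10, and so on; 0..7 survive because the decrement move is always available.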
The procedures for solving games with Büchi and parity objectives can also be extended with the necessary bookkeeping to generate winning strategies, using the strategies (programs) generated from the controllable predecessor operator and the attractor computation as building blocks.
What about the environment? Winning strategies for Player Env are less useful, as we are usually interested in strategies that give us an implementation of the desired system. They can be helpful as counterexamples, but for reactive program games, extracting a strategy for Player Env is more difficult, as its set of moves, Inputs_G, is potentially infinite.

ENFORCEMENT ACCELERATION
The symbolic procedure for attractor computation, and hence the procedures for solving games, presented in the previous section do not always terminate when the set of states S is infinite.
As we explained in Section 3, the attractor computation for the reactive program game structure in Fig. 1a with the reachability winning condition for Player Sys does not terminate. Applying the method from the previous section, that is, computing the attractor of the symbolic state that maps the goal location to ⊤, the symbolic state at location ℓ₀ covers one more value of c in every iteration, so a fixpoint is never reached. However, arguing for termination on an intuitive level is quite simple, as already shown in Section 1. If c ≤ 42 or d = 0 in location ℓ₀, then the goal is reached in one step. Otherwise, by choosing the correct transition in location ℓ₀, Player Sys can force c to decrease and go back to location ℓ₀. Since the game is back in ℓ₀, decreasing c can be iterated, eventually leading to c ≤ 42.
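This intuitive argument can be sanity-checked concretely. The function below is our own simplification of the running example's winning strategy: iterate the sub-strategy "decrement and return to ℓ₀" an unbounded number of times until the guard c ≤ 42 fires.

```python
# Simulate the iterated sub-strategy of the (simplified) running example:
# from any initial c, Sys reaches the goal location in finitely many steps.

def play(c):
    loc, steps = "l0", 0
    while loc != "l3":
        if c <= 42:
            loc = "l3"       # guarded transition to the goal
        else:
            c -= 1           # Sys enforces the step relation c' < c
        steps += 1
    return steps
```

The number of iterations depends on the initial value of c and is unbounded over all initial states, which is exactly why no fixed number of CPre applications suffices.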
While we cannot ensure termination in general, we will augment the attractor computation with an acceleration operation.This operation should extend the computed attractor set with states from infinitely many levels in finitely many steps.Since the attractor captures strategic decisions, this acceleration operation allows us to handle unbounded strategy loops as the one in Fig. 1a.
For the rest of this section, we fix a first-order theory 𝒯.

Accelerating Symbolic Attractor Computation
The termination argument of our running example above is based on a simple inductive statement. From a more general perspective, it states that if, in location ℓ, starting at some state not part of the goal states, Player p can enforce coming back to ℓ while making progress towards the goal, then Player p can enforce reaching the goal eventually. To perform such reasoning automatically, we formalize this intuition by introducing the notion of acceleration lemmas. An acceleration lemma, formally introduced below, is a triple (base, step, conc) of FOL formulas. The conclusion conc characterizes a set of states with the property that, from each of them, by iterating the step relation described by step, a state in the set characterized by the base condition base must be reached. Acceleration lemmas express generic inductive statements in a logical form. In order to employ such a lemma to accelerate the computation of an attractor set for some Player p in a reactive program game structure G, it is necessary to establish that Player p can enforce the repeated iteration of the step relation in the game, against all possible behaviors of the opponent, Player 1 − p. To this end, we define the so-called loop game, a reactive program game structure obtained from G for a given location ℓ. Intuitively, the loop game "splits" the loops from location ℓ to itself, thus allowing us to reason about iterated behavior from ℓ back to itself. Furthermore, since the loop game captures all possible interactions between Player Sys and Player Env in G that start from location ℓ, it enables us to reason about the enforcement of a step relation by a player.
For a given location ℓ, the loop game LoopGame(G, ℓ, ℓ_End) is constructed by adding a new location ℓ_End ∉ L to G and redirecting all edges in G with target location ℓ to the new location ℓ_End. This is formalized in the next definition.

Definition 7.3 (Loop Game). Given a reactive program game structure G = (𝒯, I, X, L, Inv, δ), a location ℓ_Split ∈ L, and a fresh location ℓ_End ∉ L, the loop game is the reactive program game structure LoopGame(G, ℓ_Split, ℓ_End) := (𝒯, I, X, L ∪ {ℓ_End}, Inv′, δ′), where Inv′ := Inv ∪ {ℓ_End ↦ Inv(ℓ_Split)}, and δ′ is obtained from δ by replacing the target location ℓ_Split of every transition with ℓ_End.

Example 7.4. Fig. 5 depicts the loop game LoopGame(G, ℓ₀, ℓ_End) of our running example in Fig. 1a. We now illustrate how this loop game can be used to accelerate the diverging attractor computation shown at the beginning of this section, at the point where the third iterate X₃ has been computed. We want to apply the acceleration lemma from Example 7.2 at location ℓ₀. To this end, we have to check that, starting in ℓ₀, Player Sys can either enforce the step relation and come back to ℓ₀, or reach X₃. To do this, we consider the loop game LoopGame(G, ℓ₀, ℓ_End). The condition above is satisfied if, in the loop game, Player Sys can enforce reaching ℓ_End from ℓ₀ such that the relation step is satisfied by the initial (at ℓ₀) and final (at ℓ_End) values of the program variables, or can reach X₃. Therefore, we compute the attractor for Player Sys of the symbolic state that maps ℓ₀ to c ≤ 43, the goal location to ⊤, and ℓ_End to c < ĉ in the loop game. Here, c < ĉ corresponds to the step relation step = (c′ < c), where the fresh constant ĉ represents the initial value of c before the iteration. This "shift" is necessary as we are doing a backward computation. Since at ℓ₀ the iteration starts and ends, we can set ĉ := c, and the symbolic state computed by this attractor computation simplifies to {ℓ₀ ↦ ⊤, . . .}. This means that Player Sys can indeed iterate enforcing the step relation, or reach X₃. Now, we check that the base condition base = (c ≤ 42) of our acceleration lemma is contained in X₃(ℓ₀), which is indeed the case. The property of acceleration lemmas allows us to conclude that Player Sys can reach X₃, and hence the goal, starting from conc by iterating step. Hence, we can add conc = ⊤ to the attractor set at location ℓ₀, which results in reaching a fixpoint, and the computation terminates.
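The structural part of Definition 7.3, redirecting every edge with target ℓ_Split to the fresh ℓ_End, is easy to render executably. In this sketch of ours, transitions are plain tuples with opaque guard and update labels:

```python
# Loop-game construction in the spirit of Definition 7.3: redirect every
# transition with target l_split to the fresh location l_end.

def loop_game(transitions, l_split, l_end):
    """transitions: list of (source, guard, update, target) tuples."""
    assert all(t != l_end for (_, _, _, t) in transitions), "l_end is fresh"
    return [(s, g, u, l_end if t == l_split else t)
            for (s, g, u, t) in transitions]
```

Applied to a self-loop at `l0`, the loop becomes an edge into `l_end`, so an attractor computation in the loop game can speak about exactly one full iteration from the split location back to itself.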
To transform the method illustrated in Example 7.4 into a general acceleration procedure, we have to account for the following. In general, the loop game can be complex itself, and the attractor computation in the loop game may also benefit from acceleration. Hence, our acceleration technique should support nested applications of lemmas. Furthermore, a simple way to realize our acceleration idea would be to first generate acceleration lemmas and then check their applicability with the described technique. However, this quickly becomes infeasible for large spaces of possible lemmas and for game structures requiring the application of nested acceleration. To address this, we compute a symbolic attractor with "unknown" lemmas represented by uninterpreted symbols. Our attractor acceleration technique thus has two main aspects:
• In this section, we show how to apply acceleration lemmas as uninterpreted lemmas. This includes collecting constraints that capture the conditions a lemma must satisfy to be applicable, e.g., to establish that the respective player can enforce the step relation in the game.
• In Section 8, we present how we generate instances of the unknown lemmas that meet the accumulated constraints and, therefore, can be applied.
Let us fix a countably infinite set LemmaSymb of triples of uninterpreted predicate symbols, where the three symbols in a triple represent the corresponding elements of an acceleration lemma. In the course of the symbolic computation of attractors, we generate and accumulate constraints, elements of Constraints := FOL(∅), featuring these symbols. For Φ ⊆ FOL, we denote with UsedLemmaSymbols(Φ) the set of triples whose symbols appear in Φ.

Accelerated Attractor. Algorithm 3 shows the procedure AttrAcc for attractor computation with acceleration. At each iteration, a heuristic function Acc? is used to decide whether an acceleration should be attempted at a location ℓ at the current step of the computation. This function returns either ∅ or a singleton set of locations. We discuss possible heuristics in Section 9.
The acceleration consists in computing a formula ψ ∈ FOL(X) such that the set {(ℓ, x) | x ⊨ ψ} is a subset of the computed attractor. The procedure AccA : GameStructure × {Sys, Env} × L × D → Constraints × FOL(X) computes such a ψ, featuring uninterpreted predicate symbols that represent the acceleration lemmas, together with a constraint capturing the conditions for lemma application. It uses the procedure IterA : GameStructure × {Sys, Env} × Constraints × D → Constraints × D. The function InstantiateLemmas : Constraints × FOL → FOL(X) finds an instantiation of the uninterpreted symbols used to represent the acceleration lemmas. These functions are defined in Fig. 6.
IterA performs steps of an attractor computation plus acceleration and computes an underapproximation of an attractor. We cannot use the normal attractor computation anymore, since the uninterpreted predicates prevent us from identifying whether we have reached a fixpoint. Therefore, a heuristic function Iter? determines when to stop the computation, and the heuristic function Acc? determines whether to accelerate by invoking AccA. Given a location ℓ ∈ L and symbolic goal φ, AccA computes a symbolic state for ℓ from which Player p can enforce going to φ, provided that the uninterpreted lemmas satisfy the computed constraint. AccA does so by introducing and symbolically applying an acceleration lemma at location ℓ, as follows.
• We first pick a fresh triple (β, σ, γ) of lemma symbols from LemmaSymb for the uninterpreted lemma that we are applying.
• As in Example 7.4, we construct the loop game G_loop for ℓ, since we want to accelerate at location ℓ.
• We construct φ_loop, obtained from φ by mapping the new location ℓ_End to the formula σ(X̂, X). Here, σ(X̂, X) expresses the relation between X̂, a set of fresh uninterpreted constants with one constant for each variable in X, representing the initial assignment, and X, representing the values of the program variables at ℓ_End.
• We invoke IterA, which recursively performs iterations of the attractor computation. Intuitively, if Player p can enforce reaching φ_loop(ℓ_End), it can either get to the goal or enforce the step relation σ(X̂, X). As IterA might call AccA, it also returns a constraint Ψ_Rec. The substitution [X̂ ↦ X] denotes the syntactic transformation that replaces each x̂ ∈ X̂ by the respective x ∈ X. This transformation is necessary as, intuitively, for Ψ_Rec and σ we have to consider in the constraint the initial assignment to X in ℓ before performing the step.
• Finally, AccA constructs the new constraint Ψ. Intuitively, the first conjunct states that the base of the lemma, β(X), is included in the goal φ(ℓ). The second conjunct states that every state in the conclusion γ(X) that is not an element of the goal must be in the set computed by IterA, i.e., either the goal or the step relation is enforceable from γ(X) if not already at the goal. We assume that each quantifier uses a fresh copy (or De Bruijn indexing) of the variables in X and applies the appropriate renaming. This ensures the correct replacement of the constants in X̂ with variables, as described in the previous step.
• We return the new constraint Ψ and the conclusion γ(X) of the applied lemma. The constraint Ψ ensures, together with the defining property of acceleration lemmas, that from γ(X), Player p can enforce reaching φ.
• Note that the incompleteness of IterA is not a problem, as terminating early with an underapproximation only results in a stronger constraint.
What remains is to define the function InstantiateLemmas(Ψ, ψ) that searches for an instantiation of the uninterpreted symbols representing the used acceleration lemmas that satisfies Ψ and defines valid lemmas. If such an instantiation is found, it returns a formula in which the terms applying these symbols have been replaced with formulas in FOL(X). In Section 8 we discuss the practicalities of the check for the existence of lemmas. Function InstantiateLemmas, shown in Fig. 6, requires finding a mapping μ : UsedLemmaSymbols({Ψ, ψ}) → Lemmas(X) of the lemma symbol triples used in Ψ and ψ to the set of lemmas Lemmas(X). For ρ ∈ {Ψ, ψ} we denote with ρ[μ] the formula obtained from ρ by the following transformation: for each triple (β, σ, γ) with μ((β, σ, γ)) = (base, step, conc), replace each predicate term built from β, σ, or γ by the corresponding formula base, step, or conc, with the term's arguments substituted for the lemma's variables. By the construction of Ψ and ψ in AccA, all the terms containing the lemma predicate symbols as top symbols are of this form, and hence no uninterpreted acceleration lemma symbols appear in ρ[μ]. Furthermore, all uninterpreted constants introduced during the construction have been mapped back to variables and do not appear in the result returned to AttrAcc.

Extracting Reactive Programs from Acceleration Lemmas
The extraction of programs in the presence of acceleration follows the same principles as in Section 6.3. We only need to define the sub-program constructed for the application of acceleration during the attractor computation, that is, for the invocation of the procedure AccA.
When extracting a program for AccA, we add new labels to the GOTO program for each location in the loop game G_loop. Note that these new labels are different from the labels corresponding to the locations of the original game G. The program labels in Fig. 1b result from the loop game in Fig. 5, constructed by AccA. Before starting the extraction for IterA, we introduce an auxiliary copy of the program variables that stores the values of X at the label of ℓ (this corresponds to the set of fresh constants X̂). Storing these values is necessary in order to check conditions on the step relation in the program for IterA. The program in Fig. 1b contains such an auxiliary variable c₀ and the copy assignment c₀ := c at the label corresponding to ℓ₀.
For IterA we extract the program as outlined in Section 6.3, except that the conditions in the program might contain uninterpreted symbols for the lemmas. If in the program for IterA parts of the original attractor are reached, we add a jump back to the respective location of G. If ℓ_End is reached, the program jumps back to the label of ℓ. If IterA applies acceleration (invokes AccA), we apply this extraction process recursively.
The result is a program skeleton from which we still have to remove the uninterpreted lemma symbols. After InstantiateLemmas has found an instantiation with concrete lemmas, we use it to replace all occurrences of uninterpreted lemma symbols in the conditions of the extracted program. In our running example, when generating the program in Fig. 1b, we first get a program including statements of the form if (step(c₀, c + d)) { c := c + d; goto ℓ₀ }. Then, we plug in the acceleration lemma from Example 7.2, and step(c₀, c + d) becomes c + d < c₀ (as c + d corresponds to c′ in step and c₀ to c), which is exactly the statement that we have in Fig. 1b.

Acceleration Beyond Attractors
The symbolic semi-algorithms for solving Büchi and parity games outlined in Section 6 are accelerated by using the accelerated attractor procedure of Algorithm 3 instead of the plain attractor computation of Algorithm 1. This accelerates the computation of attractor sets in the solving procedures and helps the overall computation of the set of winning states to converge. For the Büchi game in Example 3.3, the procedure for solving Büchi games with accelerated attractor computation successfully terminates.
However, the convergence of the accelerated attractor computation is in general insufficient to ensure the convergence of the procedure for Büchi games. For example, consider the reactive program game structure in Fig. 7a and the Büchi objective for Player Sys defined by B := {ℓ₀}. The table in Fig. 7b shows the computation of the procedure in Algorithm 2. We observe that the sets computed for Player 1 − p keep growing, and the computation never terminates. Player Env wins the game from every possible state, since the initial value of the counter variable bounds the number of visits to ℓ₀. However, this argument cannot be captured with attractor acceleration, as Player Env cannot force Player Sys to decrease the counter.
Therefore, we introduce an additional acceleration method. Intuitively, it captures the idea that if the Player p with the Büchi objective keeps visiting a given accepting location, then the set of winning states of Player 1 − p must be reached eventually. The procedure is formalized in Fig. 8. It employs a process we call co-Büchi acceleration, which accelerates the outer fixpoint of the Büchi-solving procedure, computing the sets Z¹, Z², . . . Analogously to attractor acceleration, we use symbolic procedures AccB and IterB, which generate constraints for acceleration lemmas and are mutually recursive. The function AccB : GameStructure × {Sys, Env} × L × D → Constraints × FOL(X) operates in a similar manner as the function AccA. The difference is that the loop game is now constructed for a location ℓ ∈ B, the set of Büchi winning locations, and is equipped with a Büchi winning condition B_loop := B ∪ {ℓ_End ↦ ¬σ(X̂, X)}, where σ is the uninterpreted step symbol of the applied lemma. Intuitively, B_loop requires that in the loop game the Büchi player enforces infinitely many visits to a location in B \ {ℓ}, or a visit to ℓ_End that violates the step relation of the applied lemma. Thus, the winning region of the co-Büchi player in this game consists of the states from which it can enforce that the play does not visit B \ {ℓ} infinitely often, and that if it visits location ℓ_End, then the step relation is satisfied there. If Player p loses this loop game, Player p can only win the overall game by visiting ℓ_Split infinitely often while applying step. This allows extending the winning region of Player 1 − p using an acceleration lemma. The function IterB : GameStructure × {Sys, Env} × Constraints × D × D → Constraints × D is the analogue of IterA, but performs the steps of the procedure for Büchi games instead, and invokes IterA for the accelerated attractor computation. The key difference here is that the attractor sets computed for the Büchi player must not be underapproximated; to this end, we add an additional condition to the generated constraint. Acceleration lemmas can be obtained from a function r whose value decreases with every application of the step relation; we call r the ranking function.
The above property provides us with a way to generate acceleration lemmas that satisfy the condition in Definition 7.1. To ensure that we can effectively check the satisfiability of the generated constraints, a meaningful class of ranking functions are affine functions with bounded coefficients. For instance, when X_Num are the program variables of numerical type, we consider affine ranking functions over X_Num with coefficients from a bounded range. Such a lemma is typically applicable only in some part of the state space. This necessitates the application of multiple lemmas, and specifying some properties that are invariant under the step relation. Hence, our templates should allow adding invariants, as stated in the following.

Proposition 8.2. Let (base, step, conc) be an acceleration lemma over X and let inv ∈ FOL(X) be a formula. Then (base ∧ inv, step ∧ inv[X ↦ X′], conc ∧ inv) is also an acceleration lemma.
We call inv an invariant.
We define templates for invariants as linear inequalities with bounded coefficients. More concretely, we consider linear inequalities over the variables in X_Num with coefficients bounded by some selected bound B ∈ N. We distinguish the special case where B = 1, which results in the simpler set of invariant templates inv₁ := {x ≤ m, x ≥ m, x = m | x ∈ X_Num, m ∈ R}. As invariants need not satisfy any additional restrictions, we can allow conjunctions of inequalities, or even other given predicates.
Putting everything together, the lemma template that we use is the one from Proposition 8.1, combined according to Proposition 8.2 with an invariant inv that is a conjunction of several instances of the invariant templates inv₁ and inv₂, where each part of the template has its own unique set of meta-variables.
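The search over bounded-coefficient templates can be illustrated executably. The sketch below is our own drastic simplification of the SMT-based search described in this section: rather than solving constraints symbolically, it enumerates affine step templates a·x′ < a·x with coefficients bounded by B and keeps those consistent with a sampled set of observed step pairs.

```python
from itertools import product

# Enumerate affine step templates  sum(a_i * x_i') < sum(a_i * x_i)  with
# integer coefficients in [-B, B], filtered against sampled (x, x') pairs.

def candidate_steps(varnames, samples, B=1):
    kept = []
    for coeffs in product(range(-B, B + 1), repeat=len(varnames)):
        if all(a == 0 for a in coeffs):
            continue                     # the zero template is trivial
        def val(assign, cs=coeffs):
            return sum(a * assign[v] for a, v in zip(cs, varnames))
        # Keep the template if the ranking value strictly decreases on
        # every sampled step.
        if all(val(x2) < val(x1) for (x1, x2) in samples):
            kept.append(coeffs)
    return kept
```

On samples of the running example's step c′ = c − 1, the template with coefficient +1 (i.e., the ranking function r(c) = c) survives, while the negated one is rejected.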

Lemma Instantiation via Quantifier Elimination
It remains to discuss how to transform the templates into a mapping μ : Symb → Lemmas(X) of the used lemma symbols Symb := UsedLemmaSymbols({Ψ, ψ}) to actual lemmas. Recall that μ is required by the function InstantiateLemmas in Fig. 6 in order to compute the acceleration result.
We define a template mapping to be a function τ : Symb → FOL(X ∪ Meta) × FOL(X ∪ Meta) × FOL(X ∪ Meta) that maps the uninterpreted lemma symbols to lemma templates containing meta-variables Meta (like the coefficients and bounds from above). For an instantiation of the meta-variables ι : Meta → T that maps the meta-variables to terms without meta-variables, we define τ_ι : Symb → Lemmas(X) by τ_ι(L) := τ(L)[ι], where in the image of τ all meta-variables m ∈ Meta are simultaneously replaced by their respective terms ι(m) (slightly abusing the substitution notation). Thus, given a constraint Ψ and a formula ψ ∈ FOL(X) with uninterpreted lemma symbols, the task of finding a mapping μ reduces to finding a model ι : Meta → T of the formula Ψ[τ(Meta)], where Ψ[τ(Meta)] is obtained from Ψ by applying τ as a substitution. If a model ι is found (by the SMT solver), then Ψ[τ_ι] evaluates to true, and the function InstantiateLemmas returns ψ[τ_ι]. For program extraction, ι provides a concrete instantiation of the meta-variables, and hence a concrete acceleration lemma that we use as described in Section 7.2.
However, the generated model might yield an acceleration lemma that is not general enough. For example, recall the acceleration lemma used for our running example in Example 7.5, but suppose that we now want to use the method outlined above to generate an acceleration lemma from our template format. As our template allows for the use of invariants, a model generated by the SMT solver might result in an acceleration lemma of the form (c ≤ 42, c′ < c ∧ c′ ≤ 100, c ≤ 100), which is the original lemma with the additional invariant c ≤ 100. Unlike the lemma in Example 7.5, this lemma does not result in immediately reaching a fixpoint in the attractor computation. One possible way to mitigate this is to enumerate models, and thus generate multiple acceleration lemmas that can all be applied in AccA to further extend the computed attractor.
Instead of computing instantiations of the meta-variables one by one from the satisfying assignments of Ψ[τ(Meta)], we can take a different approach. We can consider a formula in which the meta-variables are explicitly existentially quantified: Φ(X) := ∃Meta. Ψ[τ(Meta)] ∧ ψ[τ(Meta)]. The set of assignments that satisfy Φ(X) consists of those assignments to X for which there exists an instantiation of the meta-variables (that is, a valid acceleration lemma). Thus, it implicitly captures all the acceleration lemmas that are possible results of InstantiateLemmas. Applying quantifier elimination to Φ(X), we obtain a formula QElim(Φ) that characterizes the final conclusion, which can be seen as the union of the conclusions of all acceleration lemmas satisfying the templates and the constraints. In our running example, we obtain QElim(Φ) ≡ ⊤, as there exists an instantiation of the meta-variables that results in a lemma with conclusion ⊤ (for example, the one from our example). With this approach, InstantiateLemmas returns QElim(Φ) as the result of the acceleration.

Program Extraction via Skolem Function Synthesis
The formula QElim(Φ) characterizes all possible results of InstantiateLemmas and is sufficient for computing a symbolically represented set of states to add to the attractor. However, for extracting a program as in Section 7.2, we need concrete acceleration lemmas, i.e., concrete values for Meta. Since the concrete values of Meta can differ between the assignments satisfying QElim(Φ), our goal is to generate functions for the meta-variables with arguments in Assignments(X); for each x satisfying QElim(Φ), these functions should yield a valid lemma. We consider the formula ∀X. ∃Meta. QElim(Φ) → (Ψ[τ(Meta)] ∧ ψ[τ(Meta)]), and the task becomes to compute Skolem functions SK_m : Assignments(X) → Domain(m) for the meta-variables m ∈ Meta. This is essentially a functional synthesis problem. One way to solve it is to Skolemize the above formula by substituting the variables in Meta with uninterpreted function symbols, and then solve the resulting second-order satisfiability problem.
A tuple of Skolem functions SK_m, one for each m ∈ Meta, characterizes a lemma as a function of the values of X at the point when we start applying acceleration. Our concrete lemmas are therefore given by the instantiation m ↦ SK_m(X′), where X′ is a copy of X storing the values of the program variables before entering the sub-program generated for acceleration. Aside from these additional auxiliary variables, the program generation proceeds as described in Section 7.2.
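As a toy illustration of the functional synthesis step (all names and predicates here are our own, chosen for illustration, not taken from the implementation), a Skolem function for a meta-variable can in principle be found by searching, for each state satisfying QElim(Φ), for a value that makes the lemma valid:

```python
# Hypothetical finite-domain sketch: SK maps each state x satisfying
# qelim(x) to a meta-variable value m such that valid(x, m) holds,
# i.e. m = SK(x) instantiates a lemma that is valid at x.

X_DOMAIN = range(0, 20)
META_DOMAIN = range(1, 6)

def qelim(x: int) -> bool:
    return True                      # QElim(Phi) = T in the running example

def valid(x: int, m: int) -> bool:
    # toy validity condition: steps of size m bridge the distance to the
    # target region x >= 10 within ten iterations
    return x + 10 * m >= 10

def synthesize_skolem() -> dict:
    """Pick, for each x in QElim(Phi), the smallest valid meta-value."""
    return {x: next(m for m in META_DOMAIN if valid(x, m))
            for x in X_DOMAIN if qelim(x)}

SK = synthesize_skolem()
# SK witnesses the formula: forall x. qelim(x) -> exists m. valid(x, m)
assert all(valid(x, SK[x]) for x in X_DOMAIN if qelim(x))
```

The real problem instances are over infinite domains, so such enumeration does not apply directly; there the search is delegated to the SMT solver over uninterpreted function symbols, as described above.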

EVALUATION
We implemented our game-solving method in a prototype using Z3 [15] as the SMT solver. The implementation realizes the heuristics from Section 7 based on the current number of symbolic state changes per location so far in the computation. In Algorithm 3, the frequency with which acceleration is attempted grows linearly in this counter. In InstantiateLemmas, we query the SMT solver with a timeout that is quadratic in the counter. The lemma template is as described in Section 8 and uses an invariant whose number of conjuncts is linear in the counter. For IterA, the heuristics bound the depth of nested acceleration linearly in the counter, and ensure two updates of the symbolic state per location: first an application of the enforceable predecessor and then a potential nested acceleration. We compare our prototype against GenSys [37], MuVal [40], Raboniel, and TeMoS. Note that the last two perform synthesis from temporal logic specifications, while our tool solves directly specified games. For benchmarks where a TSL encoding was available, we use the existing benchmark; otherwise we encoded the game into TSL with an automatic translation, where the game locations are encoded with an additional data cell in the TSL formula. For MuVal, we encoded the games into CLP as described in [40], where the set of winning states is described by (nested) fixpoint equations. We did not compare to [17,35], as those tools are outperformed by GenSys [37], and did not compare to [33], as they use a fairly different model and perform similarly to GenSys on the shared benchmarks. The implementation from [8] is not available. For [16] no tool is available (instead, a small set of SMTLib benchmark files that also contain the respective CHC encoding is available). However, while [16] did not compare to GenSys, their performance seems similar.
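The counter-based heuristics can be sketched as follows; this is our reading of the description above with purely illustrative constants, not the actual prototype's tuning:

```python
# c is the number of symbolic state changes observed at a location so far.
# All constants below are illustrative assumptions.

BASE_PERIOD = 8   # hypothetical initial gap between acceleration attempts

def should_try_acceleration(c: int) -> bool:
    """Acceleration is attempted more and more often as c grows: the gap
    between attempts shrinks linearly, so the attempt frequency grows
    linearly in c (and becomes every step once c >= BASE_PERIOD)."""
    period = max(1, BASE_PERIOD - c)
    return c % period == 0

def smt_timeout_ms(c: int, base_ms: int = 100) -> int:
    """Timeout for the InstantiateLemmas SMT query, quadratic in c."""
    return base_ms * (c + 1) ** 2

def invariant_conjuncts(c: int) -> int:
    """Number of conjuncts in the invariant template, linear in c."""
    return c + 1
```

The point of the sketch is only the asymptotic shape of the schedule: cheap, frequent queries early in the fixpoint computation, and progressively larger SMT budgets and richer templates as a location keeps changing.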
Benchmarks. [35] introduced a set of benchmarks that are safety games modeling one or two robots moving on an infinite two-dimensional grid under the influence of both the system and the environment. The reasoning these games require can be localized, and no unbounded strategy loops occur. The Cinderella game [11,44] is a standard safety-game benchmark for infinite-state synthesis, parameterized by the bucket size; the number of iterations needed in a strategy is bounded. Bloem and Maderbacher introduced a set of (TSL) benchmarks in [32] which model simple elevator and water-tank controllers, and whose variables have infinite domain but take values in a bounded set. We introduce a new set of benchmarks, described in Appendix C, where a robot moves on a grid, in a continuous space, or through a finite set of abstract locations. It has to perform different tasks, such as reaching a location, commuting between locations, or executing tasks at specific locations, while handling environment disturbances, decreasing energy levels, or avoiding an environment-controlled cat. Our benchmarks have unbounded variable ranges, and many contain unbounded strategy loops. We did not use the TSL benchmarks from [12], as most are almost deterministic (and require a manual translation from a formula into a program-like game).

Analysis
Table 1 shows the results of our evaluation. For game solving, the evaluation demonstrates that on standard benchmarks from the literature our method performs better than or as well as other tools. More significantly, on most benchmarks featuring unbounded strategy loops, our prototype outperforms the state of the art. Furthermore, it also performs well when acceleration must be combined with explicit steps in the fixpoint computation. Except for MuVal, none of the other tools is able to handle unbounded strategy loops. MuVal performs well on benchmarks where the set of winning states and the required ranking arguments have concise representations. However, it runs into scalability issues when many steps of the fixpoint computation are needed or when the ranking arguments become more complex. Our prototype, on the other hand, demonstrates more consistent performance in both cases, due to the ability of our method to combine acceleration with iterative game solving.
For strategy extraction, on benchmarks from the literature our prototype performs comparably to the other tools in the evaluation that support strategy extraction, namely Raboniel, TeMoS, and GenSys. On our new benchmarks, where acceleration is required to solve the game, our tool is able to extract a strategy within the timeout in 5 out of 16 benchmarks (note that a strategy is only extracted for realizable problems). To our knowledge, no other tool is capable of extracting programs for these benchmarks, so our tool improves on the state of the art. The results in Table 1 show that when acceleration is applied, the time our tool takes for strategy extraction is significantly higher than that for solving the game. The underlying reason is that generating strategies for acceleration requires solving relatively complex functional synthesis problems in order to synthesize Skolem functions for the acceleration lemmas. Our current implementation first performs Skolemization on the respective formulas and then invokes Z3 to search for a model for the Skolem function symbols. For selected queries we experimented with syntax-guided synthesis (cvc5 [4] with option --sygus-inference) and with the AE-solver tool from [18], but neither was able to produce a solution to any of the problem instances. We remark that some of the ∀∃ formulas contain more than 100,000 logical and arithmetic operators.
We evaluate the performance of realizability checking and strategy extraction separately, since realizability checking is of crucial importance on its own, especially in early design stages, when initial specifications frequently turn out to be unrealizable.

Discussion
As part of future work, we plan to further improve the scalability of our prototype, especially for program extraction. The first step is to identify ways to simplify the Skolem function synthesis problem instances. For example, we noticed that in our problems the Skolem functions for many of the existentially quantified meta-variables are constant, and only for some of them do the functions involve case distinctions. If an analysis at the game level can identify the different types of meta-variables, combining different search techniques might result in simpler functional synthesis problems.
Second, the selected templates and the employed generation technique for identifying acceleration lemmas are just one way to use the general framework we propose. One major strength of our approach is that the needed templates capture localized arguments that are automatically combined by the acceleration procedure to reason about strategies. Thus, even if more customized user-provided templates are necessary for more complex arguments, these are still localized and do not require the user to reason about strategic decisions. The development of more refined lemma generation techniques, for example inspired by syntax-guided synthesis [2,36], offers a different angle on the lemma instantiation problem for future work.
A third avenue for future work is integrating solving techniques for fixpoint logics into our approach. Our evaluation shows that the solver MuVal performs well on benchmarks where the set of winning states and the required ranking arguments have concise representations. Employing similar techniques could also improve the computation of acceleration lemmas. The main challenge is designing the interface between the game-solving/acceleration procedure and the underlying reasoning method, that is, decomposing the game-solving process such that the constraint-solving method is only required to solve simpler, localized sub-problems.
At a higher level, another direction is the investigation and integration of approximation techniques. Our current approach computes exact winning sets, which might require complex representations, while in many cases computing approximations might suffice. Further directions include exploring the relation of reactive program games to temporal logics.

CONCLUSION
We study reactive program games, a type of infinite-state games with ω-regular objectives. We propose a symbolic method for solving such games that relies on a novel technique for accelerating the game-solving process in the presence of unbounded strategy loops. This acceleration is the key reason why our method can solve games on which state-of-the-art techniques diverge, as we demonstrate in the evaluation of our prototype implementation. Since the core idea of our acceleration method is based on generic inductive statements, we believe our work expands the scope of infinite-state synthesis and opens up a range of interesting directions for further development.

The robot may also still lose energy at this location, which has to be visited infinitely often. On floor two, there is a trap that the robot cannot leave. Note that the choices of the robot are encoded using integers. In Warehouse-Empty, the robot has to do nothing other than avoid losing too much energy. In Warehouse-Clean, while idling the robot may receive environment-issued cleaning orders for the different floors; it then has to clean before returning to idling. In Warehouse-Stock, the robot is not allowed to idle but has to restock up to four items infinitely often.

Proposition 6.1. Let G be a reactive program game structure, p ∈ {Sys, Env}, and d ∈ D. If A(G, p, d) terminates and returns a result r, then r = Attr_{G,p}(d).

Let G be a reactive program game structure, p ∈ {Sys, Env}, and d ∈ D. If AA(G, p, d) terminates returning r, then it holds that r = Attr_{G,p}(d).

Fig. 7. Example demonstrating the need for acceleration of the computation of the set W_{1−p} in the procedure for solving Büchi games outlined in Section 6: (a) game with an integer program variable; (b) sets computed during the procedure.