Neural Abstraction-Based Controller Synthesis and Deployment

Abstraction-based techniques are an attractive approach for synthesizing correct-by-construction controllers that satisfy high-level temporal requirements. A main bottleneck for the successful application of these techniques is the memory requirement, both during controller synthesis and in controller deployment. We propose memory-efficient methods for mitigating the high memory demands of abstraction-based techniques using neural network representations. To perform synthesis for reach-avoid specifications, we propose an on-the-fly algorithm that relies on compressed neural network representations of the forward and backward dynamics of the system. In contrast to usual applications of neural representations, our technique maintains soundness of the end-to-end process. To ensure this, we correct the output of the trained neural network such that the corrected output representations are sound with respect to the finite abstraction. For deployment, we provide a novel training algorithm to find a neural network representation of the synthesized controller and experimentally show that the controller can be correctly represented as a combination of a neural network and a look-up table that requires substantially less memory. We demonstrate experimentally that our approach significantly reduces the memory requirements of abstraction-based methods. For the selected benchmarks, our approach reduces the memory requirements for synthesis and deployment, respectively, by factors of $1.31\times 10^5$ and $7.13\times 10^3$ on average, and up to $7.54\times 10^5$ and $3.18\times 10^4$. Although this reduction comes at the cost of increased off-line computation to train the neural networks, all steps of our approach are parallelizable and can be implemented on machines with a larger number of processing units to reduce the required computation time.


INTRODUCTION
Designing controllers for safety-critical systems with formal correctness guarantees has been studied extensively in the past two decades, with applications in robotics, power systems, and medical devices [1,22,24]. Abstraction-based controller design (ABCD) has emerged as an approach that can algorithmically construct a controller with formal correctness guarantees for systems with non-linear dynamics, bounded adversarial disturbances [3,26,28,34,36], and complex behavioral specifications. ABCD schemes construct a finite abstraction of a dynamical system that has continuous state and input spaces, and solve a two-player graph game on the abstraction. When the abstraction is related to the original system through an appropriate behavioral relation (alternating bisimulation or feedback refinement [28]), the winning strategy of the graph game can be refined to a controller for the original system. Finite abstractions can be computed analytically when the system dynamics are known and certain Lipschitz continuity properties hold. Even when the system dynamics are unknown, one can use data-driven methods to learn finite abstractions that are correct with respect to a given confidence [7,18,27].
A main bottleneck of ABCD is the memory requirement, both in representing the finite abstract transition relation and in representing the controller. First, the abstract state and input spaces grow exponentially with the state and input dimensions, respectively, and the size of the abstract transition relation grows quadratically with the number of abstract states and linearly with the number of abstract inputs. While symbolic encodings using BDDs can be used, in practice the transition relation very quickly exceeds the available RAM. Memory-efficient methods sometimes exploit the analytic description of the system dynamics or growth bounds [15,25,31], but these techniques are not applicable when the finite abstractions are learned directly from sampled system trajectories, or when a compact analytical expression of the growth bound is not available. Second, the winning strategy in the graph game is extracted as a look-up table mapping winning states to one or more available inputs. Thus, the controller representation is also exponential in the system dimension. Such controllers cannot be deployed on memory-constrained embedded systems.
In this work, we address the memory bottleneck using approximate, compressed representations of the transition relation and the controller map based on neural networks. We learn an approximate representation of the abstract transition relation as a neural network with a fixed architecture. In contrast to the predominant use of neural networks to learn a generalization of an unknown function through sampling, we train the network on the entire data set (the transition relation or the controller map) offline. We store the transitions on disk and train our networks in batch mode by bringing blocks of data into the RAM as needed. The trained network is small and fits into RAM. Since the training of the network minimizes the error but does not eliminate it, we apply a correction to the output to ensure that the representation is sound with respect to the original finite abstraction, i.e., every trajectory in the finite abstraction is preserved in the compressed representation. We propose an on-the-fly synthesis approach that works directly on the corrected representations of the forward and backward dynamics of the system. Although we present our results for reach-avoid specifications, our approach can be generalized to other classes of properties and problems (e.g., linear temporal logic specifications [2]) in which the solution requires the computation of sets of predecessors and successors in the underlying transition system.
Similarly, we store the winning strategy on disk as a look-up table mapping states to sets of valid inputs, and propose a novel training algorithm to find a neural network representation of the synthesized controller. The network is complemented with a look-up table that provides the "exceptions" at which the network deviates from the winning strategy. We experimentally demonstrate that a controller can be correctly represented as a combination of a neural network and a look-up table that requires substantially less memory than the original representation.
An important aspect of our approach is that, instead of using neural networks to learn an unknown data distribution, we train them over the entire data domain. Therefore, in contrast to many other applications in which neural networks provide function representation and generalization over unseen data, we are able to provide formal soundness guarantees for the performance of the trained neural representations over the whole dataset.
Our compression scheme uses additional computation to learn a compressed representation and avoid the memory bottleneck. In our implementation, the original relations are stored on the hard drive and data batches are loaded sequentially into the RAM to perform the training. Hard drives generally have much larger capacities than RAM, but reading data from the hard drive takes much longer. However, data access during training is predictable, so we can prefetch data to hide the latency. During synthesis, the trained corrected neural representations fit into the RAM. In contrast, a disk-based synthesis algorithm does not have predictable disk-access patterns and is unworkable. Similarly, the deployed controller consists only of the trained compact representation and (empirically) a small look-up table, which can be loaded into the RAM of the controlling chip for real-time operation of the system.
We evaluate the performance of our approach on several examples of different difficulties and show that it is effective in reducing the memory requirements in both the synthesis and deployment phases. For the selected benchmarks, our method reduces the space requirements of synthesis and deployment respectively by factors of $1.31 \times 10^5$ and $7.13 \times 10^3$ on average, and up to $7.54 \times 10^5$ and $3.18 \times 10^4$, compared to the abstraction-based method that stores the full transition system. Moreover, we empirically show that, unlike other encodings, the memory requirement of our method is not affected by the system dimension on the considered benchmarks.
In summary, our main contributions are:
• Proposing a novel and sound representation scheme for compressing finite transition systems using the expressive power of neural networks;
• Proposing a novel on-the-fly controller synthesis method using the corrected neural network representations of the forward and backward dynamics;
• Proposing an efficient scheme for compressing the controller computed by abstraction-based synthesis methods;
• Demonstrating a significant reduction in memory requirements, by orders of magnitude, on a set of standard benchmarks.
The rest of this paper is organized as follows. After a brief discussion of related work, we give a high-level overview of our proposed approach in Subsec. 1.2. The preliminaries and the problem statements are given in Sec. 2. We provide the details of our synthesis and deployment algorithms in Sec. 3 and 4, respectively. In Sec. 5, we provide experimental results of applying our approach to several examples. We state the concluding remarks in Sec. 6.

Related Work
Synthesis via reinforcement learning. The idea of using neural networks as function approximators to represent tabular data for synthesis purposes has been used in different fields, such as the reinforcement learning (RL) literature and aircraft collision avoidance system design. RL algorithms try to find an optimal control policy by iteratively guiding the interaction between the agent and the environment, modeled as a Markov decision process [35]. When the space of the underlying model is finite and small, q-tables are used to represent the required value functions and the policy. When the space is large and possibly uncountable, such finite q-tables are replaced with neural networks acting as function approximators. Convergence guarantees that hold for the q-table representation [4] are not valid in the non-tabular setting [5,6,39]. A similar behavior is observed in our setting: without correcting the output of the neural network representations of the transition systems and the tabular controller, we lose the correctness guarantees of our approach.
Neural-aided controller synthesis. Constructing neural network representations of the dynamics of a control system and using them for synthesis has been studied in specific application domains, including the design of unmanned airborne collision avoidance systems [17]. The central idea of [17] is to start from a large look-up table representing the dynamics, train a neural network on the look-up table, and use it in the dynamic programming for issuing horizontal and vertical advisories. Several techniques are used to reduce the storage requirement, since the obtained score table, i.e., the table mapping every discrete state-input pair to its associated score, becomes huge (hundreds of gigabytes of floating-point numbers). Since simple techniques such as downsampling and block compression [21] are unable to achieve the required storage reduction, Julian et al. have shown that deep neural networks can successfully approximate the score table [16]. However, as in RL-based controller synthesis, there is no guarantee that the control input computed using the neural representation matches the one computed using the original score table. In contrast, our corrected neural representations are guaranteed to produce formally correct controllers.
Reactive synthesis. Binary decision diagrams (BDDs) are used extensively in the reactive synthesis literature to represent the underlying transition systems [11,29]. While BDDs are compact enough for low-order dynamical systems, recent synthesis tools such as SCOTS v2.0 [32] have already migrated to the non-BDD setting in order to avoid the large runtime overheads. In fact, motivated by reducing the required memory footprint, the current trend is to synthesize controllers on the fly in a non-BDD setting to eliminate the need for storing the transition system [15,19,20,23,25,31]. These memory-efficient methods exploit the analytic description of the system dynamics or growth bounds. In contrast, our technique is also applicable when the finite abstractions are learned directly from sampled system trajectories, i.e., when no compact analytical expressions of the dynamics and growth bounds are available.
Verifying systems with neural controllers. An alternative approach developed for safety-critical systems is to use neural networks as a representation of the controller and learn the controller using techniques such as reinforcement learning and data-driven predictive control [8,37]. In this approach, the controller synthesis stage does not provide any safety guarantee on the closed-loop system, i.e., on the feedback connection of the neural controller and the physical system. Instead, the safety of the closed-loop system is verified a posteriori for the designed controller. Ivanov et al. have considered dynamical systems with sigmoid-based neural network controllers, used the fact that the sigmoid is the solution to a quadratic differential equation to transform the composition of the system and the neural controller into an equivalent hybrid system, and studied reachability properties of the closed-loop system by utilizing existing reachability computation tools for hybrid systems [14]. Huang et al. have considered dynamical systems with Lipschitz-continuous neural controllers and used Bernstein polynomials to approximate the input-output model of the neural network [13]. The development of formal verification ideas for closed-loop systems with neural controllers has led to the emergence of dedicated tools such as NNV [38] and POLAR [12]. While these methods provide guarantees on closed-loop control systems with neural controllers, they can only consider finite-horizon specifications for a given set of initial states. In contrast, we consider controllers that are synthesized for infinite-horizon specifications.
Minimizing the memory footprint of symbolic controllers. Girard et al. have proposed a method to reduce the memory needed to store safety controllers by determinizing them, i.e., choosing one control input for every state [40]. Both the ADD-based scheme of [9,15] and the BDD-based scheme of [40] are able to determinize the symbolic controller and reduce its memory footprint. However, the computed controller still suffers from the additional runtime overhead of the ADD/BDD encoding. Further, as mentioned by the authors of [40], their regression-based method is not able to represent the original controller with high accuracy. In contrast, our tool produces real-valued representations for symbolic controllers and can (additionally) be applied on top of the simplified version found by either of the methods proposed in [15,40].
Compressed representations for model predictive controllers (MPCs). Hertneck et al. have proposed a method to train an approximate neural controller representing the original robust (implicit) MPC satisfying the given specification [10]. While reducing the online computation time is the main motivation for implicit MPCs, minimizing the memory footprint is the main objective in explicit MPCs. Salamati et al. have proposed a method based on solving an optimization problem to compute a memory-optimized controller whose coefficients are stored with mixed precision [33]. Our method considers a different class of controllers, namely those that can fulfill infinite-horizon temporal specifications.

Overview of the Proposed Approach
In this subsection, we provide a high-level description of our approach for both synthesis and deployment.
Corrected neural representations. Fig. 1 gives a pictorial description of the steps for computing a corrected neural network representation. Given a finite abstraction $\bar\Sigma$ that corresponds to the forward dynamics of the system and is stored on the hard drive, we first compute the transition system $\bar\Sigma_B$ corresponding to the backward dynamics. Next, we extract the input-output training datasets $D_F$ and $D_B$ from the forward and backward systems, respectively, and store them on the hard drive. Each data point contains one state-input pair and the characterization of an $\ell_\infty$ ball for the corresponding reachable set. We train two neural networks $N_F$ and $N_B$ such that they represent compressed input-output surrogates for the datasets $D_F$ and $D_B$, respectively. Finally, we compute the soundness errors $e_F$ and $e_B$, which correspond to the difference between the outputs of $N_F$ and $N_B$ and the respective values in $D_F$ and $D_B$, calculated over all state-input pairs. We use the computed errors $e_F$ and $e_B$ to construct the corrected neural representations $\hat N_F$ and $\hat N_B$. We obtain memory savings by using $\hat N_F$ and $\hat N_B$ instead of $\bar\Sigma$ and $\bar\Sigma_B$, respectively.
Synthesis. Fig. 2 gives a pictorial description of our proposed synthesis algorithm for a reach-avoid specification with target set Goal and obstacle set Avoid, both subsets of the state space. Let $W_0 \subseteq \bar X$ represent a discrete under-approximation of the target set Goal. We initialize the winning set as $W = W_0$, the controller as $C = \emptyset$, and the set of state-input pairs that must be added to the controller as $\Gamma_0 = \emptyset$. In each iteration, we compute the set of new states that belong to the winning set and update the controller accordingly, until no new state is added to $W$. To this end, we first use $\hat N_B$ and its corresponding soundness error $e_B$ to compute a set of candidates $Q_i$, some of which belong to $W$, with the guarantee that no winning state lies outside of $Q_i$ in the $i$th iteration. We then use $\hat N_F$ and its corresponding soundness error $e_F$ to compute the set of new winning states $W_{i+1} \subseteq Q_i$. We also compute the set of control inputs for every new winning state and the corresponding set of state-input pairs $\Gamma_{i+1}$ that must be added to the controller. Finally, if $W_{i+1} = \emptyset$, we terminate the computation, as we have already computed the winning set $W$ and the controller $C$. Otherwise, we add the new winning states and state-input pairs into the overall winning set ($W \leftarrow W \cup W_{i+1}$) and the controller ($C \leftarrow C \cup \Gamma_{i+1}$), and repeat the steps in the next iteration.
Deployment. Fig. 3 shows our method for compressing controllers that are obtained from abstraction-based approaches. In the first step, we collect the training dataset $D_C$ and reformat it so that it is suitable for our specific formulation of a classification problem. Each data point contains one state and an encoding of the corresponding set of control inputs. We then train a neural network $N_C$ on the data with a loss function designed for this specific classification problem. Finally, we find all states at which the output label generated by $N_C$ is invalid, and store the corresponding state-input pairs in a look-up table, denoted by $\hat C$. We experimentally show that $\hat C$ contains only a very small portion of the state-input pairs.
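The deployment step can be sketched as follows; here `predict` stands in for the trained classifier, and the data layout (a dictionary from winning states to sets of valid inputs) is an illustrative assumption of ours.

```python
def build_exception_table(predict, controller):
    """Collect the states at which the network's proposed input is invalid.

    predict: maps a state to a single proposed input label.
    controller: dict mapping each winning state to its set of valid inputs.
    Returns a look-up table covering only the mispredicted states.
    """
    return {x: min(valid) for x, valid in controller.items()
            if predict(x) not in valid}

def deploy(predict, exceptions):
    """Deployed controller: consult the (small) exception table first,
    and fall back to the neural network everywhere else."""
    return lambda x: exceptions.get(x, predict(x))
```

If the classifier is accurate, the exception table is tiny compared to the original controller, which is exactly the memory saving exploited at deployment.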

Notation
We denote the sets of integer numbers and of natural numbers including zero by $\mathbb{Z}$ and $\mathbb{N}$, respectively. We use the notation $\mathbb{R}$ and $\mathbb{R}_{>0}$ to denote respectively the set of real numbers and the set of positive real numbers. We use the superscript $n > 0$ with $\mathbb{R}$ and $\mathbb{R}_{>0}$ to denote the Cartesian product of $n$ copies of $\mathbb{R}$ and $\mathbb{R}_{>0}$, respectively. For a vector $x \in \mathbb{R}^n$, we denote its $i$th component, element-wise absolute value, and $\ell_2$ norm by $x(i)$, $|x|$, and $\|x\|$, respectively. For a pair of vectors $a, b \in \mathbb{R}^n$, $[\![a, b]\!]$ denotes the hyper-rectangular set $[a(1), b(1)] \times \cdots \times [a(n), b(n)]$. Further, given $c \in \mathbb{R}^n$, $c + [\![a, b]\!]$ is another hyper-rectangular set, shifted relative to $[\![a, b]\!]$ to the extent determined by $c$. Similarly, for a vector $\eta \in \mathbb{R}^n_{>0}$ and a pair of vectors $a, b \in \mathbb{R}^n$ for which $a(i)/\eta(i) \in \mathbb{Z}$ and $b(i)/\eta(i) \in \mathbb{Z}$ for every $i$, we define the grid $[\![a, b]\!]_\eta = \prod_{i=1}^n \{a(i), a(i) + \eta(i), \ldots, b(i)\}$. Let $A$ be a finite set of size $|A|$. The empty set is denoted by $\emptyset$. When $A$ inherits a coordinate structure, i.e., when its members are vectors in a Euclidean space, $A(i)$ denotes the projection of the set $A$ onto its $i$th dimension. Further, we use the notation $A^\infty$ to denote the set of all finite and infinite sequences formed using the members of $A$. Our control tasks are defined using a subset of Linear Temporal Logic (LTL). In particular, we use the until operator $\mathsf{U}$. Let $A$ and $B$ be subsets of $\mathbb{R}^n$ and $\sigma = (\sigma_0, \sigma_1, \ldots)$ be an infinite sequence of elements from $\mathbb{R}^n$. We write $\sigma \models A \,\mathsf{U}\, B$ if there exists $k \in \mathbb{N}$ s.t. $\sigma_k \in B$ and $\sigma_j \in A$ for all $0 \le j < k$. For the detailed syntax and semantics of LTL, we refer to [2] and references therein.
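For illustration, the semantics of the until operator can be checked directly on a finite prefix of a sequence; this small helper is ours, not part of the paper.

```python
def satisfies_until(sigma, A, B):
    """Check sigma |= A U B on a finite prefix of a sequence.

    Returns True iff some element sigma[k] lies in B while all earlier
    elements lie in A.
    """
    for x in sigma:
        if x in B:          # reached B: the until formula is satisfied
            return True
        if x not in A:      # left A before reaching B: violated
            return False
    return False            # undetermined on this prefix: treat as unsatisfied
```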

Control Systems
We consider the class of continuous-state continuous-time control systems characterized by the tuple $\Sigma = (X, U, W, f)$, where $X \subset \mathbb{R}^n$ is the compact state space, $U \subset \mathbb{R}^m$ is the compact input space, and $W \subset \mathbb{R}^n$ is the disturbance space, a compact hyper-rectangular set of disturbances that is symmetric with respect to the origin (i.e., for every $w \in W$ it is also the case that $-w \in W$). The vector field $f : X \times U \to \mathbb{R}^n$ is such that $f(\cdot, u)$ is locally Lipschitz for all $u \in U$. The evolution of the state of $\Sigma$ is characterized by the differential inclusion
$$\dot\xi(t) \in f(\xi(t), u) + W. \quad (1)$$
Given a sampling time $\tau > 0$, an initial state $x_0 \in X$, and a constant input $u \in U$, define the continuous-time trajectory $\xi_{x_0,u}$ of the system on the time interval $[0, \tau]$ as an absolutely continuous function $\xi_{x_0,u} : [0, \tau] \to X$ such that $\xi_{x_0,u}(0) = x_0$ and $\xi_{x_0,u}$ satisfies the differential inclusion $\dot\xi_{x_0,u}(t) \in f(\xi_{x_0,u}(t), u) + W$ for almost all $t \in [0, \tau]$. Given $\tau$, $x_0$, and $u$, we define $\mathrm{Sol}(x_0, u, \tau)$ as the set of all $x \in X$ such that there is a continuous-time trajectory $\xi_{x_0,u}$ with $\xi_{x_0,u}(\tau) = x$. A sequence $x_0, x_1, x_2, \ldots$ is a time-sampled trajectory of the continuous control system if $x_0 \in X$ and for each $k \ge 0$ we have $x_{k+1} \in \mathrm{Sol}(x_k, u_k, \tau)$ for some $u_k \in U$.
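To make the definition of $\mathrm{Sol}$ concrete, the following sketch samples endpoints of trajectories by forward-Euler integration under sampled constant disturbances. This is only a simulation aid (it under-approximates $\mathrm{Sol}$ by sampling), not the over-approximation used to build abstractions; all names and the integration scheme are illustrative choices of ours.

```python
import numpy as np

def sample_solutions(f, x0, u, tau, w_box, n_dist=20, steps=100):
    """Sample endpoints of trajectories of dx/dt = f(x, u) + w over [0, tau]
    for constant disturbances w drawn uniformly from the box w_box = (lo, hi)."""
    lo, hi = np.asarray(w_box[0], float), np.asarray(w_box[1], float)
    dt = tau / steps
    rng = np.random.default_rng(0)
    out = []
    for _ in range(n_dist):
        x = np.asarray(x0, float)
        w = rng.uniform(lo, hi)            # one constant disturbance sample
        for _ in range(steps):
            x = x + dt * (f(x, u) + w)     # forward Euler step
        out.append(x)
    return np.array(out)
```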

Finite Abstractions
In order to satisfy a temporal specification on the trajectories of the system, it is generally needed to over-approximate the dynamics of the system with a finite discrete-time model. Let $\bar X \subset X$ and $\bar U \subset U$ be the finite sets of states and inputs, computed by (uniformly) quantizing the compact state and input spaces $X$ and $U$ using rectangular discretization partitions of size $\eta_x \in \mathbb{R}^n_{>0}$ and $\eta_u \in \mathbb{R}^m_{>0}$, respectively. A finite abstraction associated with the dynamics in Eq. (1) is characterized by the tuple $\bar\Sigma = (\bar X, \bar U, T_F)$, where $T_F \subseteq \bar X \times \bar U \times \bar X$ denotes the system's forward-in-time transition relation. The transition relation $T_F$ is defined such that $(\bar x, \bar u, \bar x') \in T_F$ whenever the solution set started from the cell of $\bar x$ under input $\bar u$ intersects the cell of $\bar x'$, i.e., $\mathrm{Sol}(x, \bar u, \tau) \cap (\bar x' + [\![-\eta_x/2, \eta_x/2]\!]) \neq \emptyset$ for some $x \in \bar x + [\![-\eta_x/2, \eta_x/2]\!]$. When the dynamics in Eq. (1) are known and satisfy the required Lipschitz continuity condition, the finite abstraction can be constructed using the method proposed in [28]. For systems with unknown dynamics, data-driven schemes for learning finite abstractions can be employed [7,18,27]. Abusing notation, we denote the reachable set for a state-input pair by $T_F(\bar x, \bar u) := \{\bar x' \in \bar X \mid (\bar x, \bar u, \bar x') \in T_F\}$. We assume that the reachable sets take hyper-rectangular form, meaning that for every $\bar x \in \bar X$, $\bar u \in \bar U$, the corresponding reachable set $R = T_F(\bar x, \bar u)$ can be rewritten as $R = \prod_{i=1}^n R(i)$, where $R(i)$ corresponds to the projection of the set $R$ onto its $i$th coordinate. Otherwise, in case $R$ is not hyper-rectangular, it is over-approximated by $\prod_{i=1}^n R(i)$.
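A toy version of this abstraction construction can be sketched as follows. The helper `reach_box` is a placeholder for whatever over-approximation of the reachable set is available (e.g., from a growth bound as in [28], or learned from data); the gridding and index arithmetic are our own illustrative choices.

```python
import itertools
import numpy as np

def build_abstraction(reach_box, grid_lo, grid_hi, eta, inputs):
    """Build the forward transition map T_F of a finite abstraction on a
    uniform grid: T[(cell, u)] = set of grid cells overlapping the
    reachable hyper-rectangle returned by reach_box(cell_center, u)."""
    n = len(eta)
    dims = [int(round((h - l) / e)) for l, h, e in zip(grid_lo, grid_hi, eta)]
    def center(c):
        return np.array([grid_lo[i] + (c[i] + 0.5) * eta[i] for i in range(n)])
    T = {}
    for c in itertools.product(*[range(d) for d in dims]):
        for u in inputs:
            lo, hi = reach_box(center(c), u)
            # indices of all cells overlapping the reachable box, clipped to the grid
            lo_i = [max(0, int(np.floor((lo[i] - grid_lo[i]) / eta[i]))) for i in range(n)]
            hi_i = [min(dims[i] - 1, int(np.floor((hi[i] - grid_lo[i]) / eta[i]))) for i in range(n)]
            T[(c, u)] = set(itertools.product(*[range(a, b + 1) for a, b in zip(lo_i, hi_i)]))
    return T
```

Note that even this toy version materializes the full transition map, which is exactly the memory bottleneck the neural representation is designed to avoid.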
Note that $\bar\Sigma$ can in general correspond to a non-deterministic control system, i.e., $|T_F(\bar x, \bar u)| > 1$ for some $\bar x \in \bar X$, $\bar u \in \bar U$. Given $\bar\Sigma$, one can easily compute the characterization of the backward-in-time dynamics as
$$T_B := \{(\bar x, \bar u, \bar x') \mid (\bar x', \bar u, \bar x) \in T_F\}. \quad (2)$$
A trajectory of $\bar\Sigma$ is a finite or infinite sequence $\bar x_0, \bar x_1, \bar x_2, \ldots \in \bar X^\infty$ such that for each $k \ge 0$ there is a control input $\bar u_k \in \bar U$ such that $(\bar x_k, \bar u_k, \bar x_{k+1}) \in T_F$. The operator $\mathrm{Pre}(\cdot)$ acting on sets $A \subseteq \bar X$ is defined as
$$\mathrm{Pre}(A) := \{\bar x \in \bar X \mid \exists \bar u \in \bar U \text{ s.t. } T_F(\bar x, \bar u) \subseteq A\}.$$
Finally, to compute an over-approximating set of the discrete states that have overlap with a hyper-rectangular set $[\![a, b]\!]$, we define the (over-approximating) quantization mapping as
$$\bar Q([\![a, b]\!]) := \{\bar x \in \bar X \mid (\bar x + [\![-\eta_x/2, \eta_x/2]\!]) \cap [\![a, b]\!] \neq \emptyset\}.$$
Similarly, the under-approximating quantization mapping is defined as
$$\underline Q([\![a, b]\!]) := \{\bar x \in \bar X \mid (\bar x + [\![-\eta_x/2, \eta_x/2]\!]) \subseteq [\![a, b]\!]\}.$$
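Over an explicitly stored transition map, the backward dynamics and the controlled-predecessor operator look as follows; the dictionary-of-sets layout is our illustrative choice.

```python
def backward(T):
    """Compute T_B from T_F: (x', u, x) is a backward transition
    iff (x, u, x') is a forward one."""
    TB = {}
    for (x, u), succs in T.items():
        for x2 in succs:
            TB.setdefault((x2, u), set()).add(x)
    return TB

def pre(T, target, states, inputs):
    """Controlled predecessor Pre(target): states from which some input
    forces every successor into `target` (robust to non-determinism)."""
    return {x for x in states
            if any(T.get((x, u), set()) and T[(x, u)] <= target
                   for u in inputs)}
```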

Controllers
For a finite abstraction $\bar\Sigma = (\bar X, \bar U, T_F)$, a feedback controller is denoted by $C \subseteq \bar X \times \bar U$. The set of valid control inputs at every state $\bar x \in \bar X$ is defined as $C(\bar x) := \{\bar u \in \bar U \mid (\bar x, \bar u) \in C\}$.

Neural Networks
A neural network $N(\theta, \cdot) : \mathbb{R}^n \to \mathbb{R}^m$ of depth $l \in \mathbb{N}$ is a parameterized function that transforms an input vector $x \in \mathbb{R}^n$ into an output vector $y \in \mathbb{R}^m$ and is constructed by the forward composition of $l$ functions as follows:
$$y = g_l(\theta_l, \cdot) \circ g_{l-1}(\theta_{l-1}, \cdot) \circ \cdots \circ g_1(\theta_1, x),$$
where $\theta = (\theta_1, \ldots, \theta_l)$ and $g_i(\theta_i, \cdot) : \mathbb{R}^{d_{i-1}} \to \mathbb{R}^{d_i}$ denotes the $i$th layer of $N$, parameterized by $\theta_i$, with $d_0 = n$, $d_i \in \mathbb{N}$ for $i \in [1; l]$, and $d_l = m$. The $i$th layer of the network, $i \in [1; l]$, takes an input vector in $\mathbb{R}^{d_{i-1}}$ and transforms it into an output representation in $\mathbb{R}^{d_i}$ depending on the value of the parameter vector $\theta_i$ and the type of activation function used in $g_i$. During the training phase of the network, the set of parameters $\theta$ is learned over the training set, which consists of a number of input-output pairs $\{(x_i, y_i) \mid i = 1, 2, \ldots, s\}$, in order to achieve the highest performance with respect to an appropriate metric such as mean squared error. For a trained neural network, we drop its dependence on the parameters $\theta$. In this paper, we characterize a neural network of depth $l$ by its corresponding list of layer sizes, i.e., $(d_1, d_2, \ldots, d_l)$, and the type of activation function used, e.g., hyperbolic tangent, Rectified Linear Unit (ReLU), etc.
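The layer composition above amounts to a few lines of code; in this sketch of ours we assume tanh activations on all hidden layers and an affine output layer, matching one of the activation choices mentioned.

```python
import numpy as np

def forward(theta, x):
    """Evaluate a depth-l feedforward network N(theta, x).

    theta is a list of (W_i, b_i) pairs; each layer applies an affine map
    followed by tanh, except the final layer, which stays affine."""
    for i, (W, b) in enumerate(theta):
        x = W @ x + b
        if i < len(theta) - 1:
            x = np.tanh(x)
    return x
```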
Neural networks can be used for both regression and classification tasks. In a regression task, the goal is to predict a numerical value given an input, whereas a classification task requires predicting a correct class label for a given input. To measure the performance of a trained neural network, we consider the prediction error. Note that the prediction error is different from metrics such as the mean squared error (MSE), which are used during the training phase to define the objective function. The prediction error is defined differently for regression and classification tasks. For our regression tasks, we define the prediction error of a trained neural network $N$ over a training set $\{(x_i, y_i) \mid i = 1, 2, \ldots, s\}$ as the largest deviation of the network's output from the target over the whole set, $\max_{i \in [1; s]} |N(x_i) - y_i|$. In this paper, we consider classification tasks in which there may exist more than one valid class label for each input. Therefore, the training set is of the form $\{(x_i, y_i) \mid i = 1, 2, \ldots, s\}$, where $y_i \in \{0, 1\}^m$ and $y_i(j) = 1$ iff $j \in [1; m]$ corresponds to a valid label at $x_i$. Since the number of valid labels for each input can be different, we define the prediction error of a trained classifier $N$ as the fraction of inputs $x_i$ for which the label selected by $N$ is not valid, i.e., for which $y_i(\arg\max_j N(x_i)(j)) = 0$. Finally, for a given neural network $N$ with the training set $\{(x_i, y_i) \mid i = 1, 2, \ldots, s\}$, we define the continuity index as a measure of how much the sets of valid labels vary between neighboring inputs in the training set.
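The classifier prediction error, i.e., the fraction of inputs whose top-scoring label is not among the valid ones, can be computed directly; the array layout below is an illustrative assumption of ours.

```python
import numpy as np

def classification_prediction_error(scores, valid):
    """Fraction of inputs whose top-scoring label is invalid.

    scores: (N, m) array of network outputs;
    valid:  (N, m) 0/1 array, valid[i, j] = 1 iff label j is valid at input i."""
    picked = np.argmax(scores, axis=1)            # label chosen by the network
    hits = valid[np.arange(len(picked)), picked]  # 1 where the choice is valid
    return 1.0 - hits.mean()
```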

Problem Statement
We now consider the controller synthesis problem for finite abstractions w.r.t. a reach-avoid specification. Let $\mathrm{Goal}, \mathrm{Avoid} \subseteq X$ with $\mathrm{Goal} \cap \mathrm{Avoid} = \emptyset$ be the sets of states representing the target and unsafe spaces, respectively. The winning domain of the finite abstraction $\bar\Sigma = (\bar X, \bar U, T_F)$ is the set of states $\bar x^* \in \bar X$ for which there exists a feedback controller $C$ such that every trajectory $\bar x_0, \bar x_1, \bar x_2, \ldots$ of the closed loop $C \parallel \bar\Sigma$ with $\bar x_0 = \bar x^*$ satisfies the given specification $\Phi$. The aim is to find the set of winning states $W$ together with a feedback controller $C$ such that $C \parallel \bar\Sigma$ satisfies the reach-avoid specification $\Phi$. To compute the winning domain and the controller, one can use methods from reactive synthesis. For many interesting control systems, however, the size of $T_F$ in the finite abstraction becomes huge, which restricts the application of reactive-synthesis-based methods for computing the controller. Therefore, we look for a method that uses compressed surrogates of $T_F$ to save memory. In particular, we want to train two corrected neural surrogates, i.e., neural network representations whose outputs are corrected to maintain the soundness property: $\hat N_F$ for the forward-in-time dynamics and $\hat N_B$ for the backward-in-time dynamics.
Outputs: Corrected neural representations $\hat N_F$ and $\hat N_B$, a winning domain $W$, and a feedback controller $C$ for $\bar\Sigma$ such that $C \parallel \bar\Sigma$ realizes $\Phi$.
It is important to note that any solution to this problem is required to provide a formal guarantee on the satisfaction of $\Phi$, i.e., the reach-avoid specification $\Phi$ must be satisfied under any disturbance affecting the control system.
Let $C \subseteq \bar X \times \bar U$ be the computed controller for the abstraction $\bar\Sigma$ such that $C \parallel \bar\Sigma$ realizes a given specification $\Phi$. The size of this controller can be large due to the large number of discrete states and inputs. For deployment purposes, we would like to compute a corrected neural controller $\hat C : \bar X \to \bar U$ s.t. $\hat C \parallel \bar\Sigma$ realizes $\Phi$.

Algorithm 1: Regression-based compression algorithm for finite abstractions
Data: Forward dynamics $\bar\Sigma$ and learning rate $\lambda$
1. Compute the backward dynamics $\bar\Sigma_B$ and the datasets $D_F$ and $D_B$ using Eqs. (2), (4), and (5)
2. Train the neural network $N_F$ on the dataset $D_F$ and $N_B$ on $D_B$ using the learning rate $\lambda$
3. Compute the soundness errors $e_F$ and $e_B$ using Eq. (6)
4. Compute the final corrected representations $\hat N_F$ and $\hat N_B$ using Eqs. (7) and (8)
Result: Corrected neural representations $\hat N_F$ and $\hat N_B$

Fig. 4. Comparing the set of successor states in the transition system $T_F$ and its representation $\hat T_F$. We have $T_F(\bar x, \bar u) \subseteq \hat T_F(\bar x, \bar u)$.

SYNTHESIS
One approach to formally synthesize controllers for a given specification is to store the transition system corresponding to the quantization of the state and input spaces, and to use methods from reactive synthesis to design a controller. However, the memory required to store these transition systems increases exponentially with the number of state variables, which causes a memory blowup for many real-world systems. In this section, we propose our memory-efficient algorithm for synthesizing controllers that satisfy reach-avoid specifications over finite abstractions. Our method requires the computation of corrected neural representations of the finite abstraction. The computation of these representations is discussed in Sec. 3.1. Later, in Sec. 3.3, we show how our synthesis method makes use of the computed representations.
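The fixed-point structure of such a synthesis can be sketched over an explicit finite transition system. Here `pre_candidates` abstracts the candidate set produced from the backward dynamics and `post` abstracts the forward successor computation; both are placeholders (in our method they would be served by the corrected neural representations), and the whole sketch is ours.

```python
def synthesize_reach_avoid(pre_candidates, post, goal_cells, avoid_cells, inputs):
    """Iteratively grow the winning set of a reach-avoid game.

    pre_candidates(W): superset of the states that might newly enter W;
    post(x, u): successor set of the pair (x, u).
    Returns the winning set and a map from states to sets of valid inputs."""
    win = set(goal_cells)
    ctrl = {}
    while True:
        new = set()
        for x in pre_candidates(win) - win - set(avoid_cells):
            for u in inputs:
                succ = post(x, u)
                if succ and succ <= win:   # every successor stays winning
                    ctrl.setdefault(x, set()).add(u)
                    new.add(x)
        if not new:
            return win, ctrl
        win |= new
```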

Corrected Neural Representations for Finite Abstractions
Let $\bar\Sigma = (\bar X, \bar U, T_F)$ be a finite abstraction. In this section, we show that $T_F$ can be approximated by generator functions. In particular, we show how to compute generator functions $G_F : \bar X \times \bar U \to \mathbb{R}^n \times \mathbb{R}^n_{\ge 0}$ and $G_B : \bar X \times \bar U \to \mathbb{R}^n \times \mathbb{R}^n_{\ge 0}$ that produce the characterization of an $\ell_\infty$ ball over-approximating the forward- and backward-in-time reachable sets, respectively, for every state-input pair picked from $\bar X \times \bar U$. Our aim is to use the expressive power of neural networks to represent the behavior of $\bar\Sigma$ so that the memory requirements decrease significantly.
Our compression scheme is summarized in Alg. 1. We first compute the backward-in-time system Σ̄_b using Eq. (2). We then calculate the over-approximating ℓ∞ ball for every state-input pair. This is illustrated in Fig. 4 in a two-dimensional space for a given state-input pair (x̄, ū). The dotted red rectangle corresponds to the hyper-rectangular reachable set. The center c_f(x̄, ū) and radius r_f(x̄, ū) are computed from the lower-left and upper-right corners of the reachable set, denoted respectively by ll_f(x̄, ū) and ur_f(x̄, ū): we have c_f(x̄, ū) = (ur_f(x̄, ū) + ll_f(x̄, ū))/2 and r_f(x̄, ū) = (ur_f(x̄, ū) − ll_f(x̄, ū))/2 + η/2. At the end of the first step, we have computed and stored the dataset D_f. Note that every data point in D_f consists of two pairs: one specifies a state-input pair (x̄, ū), and the other characterizes the center and radius of the over-approximating ℓ∞ ball, (c_f(x̄, ū), r_f(x̄, ū)). Similarly, we store another dataset D_b corresponding to the backward dynamics, with center c_b(x̄, ū) and radius r_b(x̄, ū) defined analogously. The sizes of D_f and D_b grow exponentially with the dimension of the state space; hence, we store both datasets (potentially) on the hard drive. Next, we train neural networks N_f and N_b on the datasets D_f and D_b, taking the state-input pairs (x̄, ū) as input and the pairs (c(x̄, ū), r(x̄, ū)) as output, and minimize the mean squared error (MSE) of this input-output mapping. For systems with state and input spaces of dimensions n and m, the input and output layers of both neural networks are of sizes n + m and 2n, respectively. The configuration of the neural networks we used is illustrated in Fig. 5. During training, we load batches of data from D_f and D_b, which are stored on the hard drive, into RAM. We use the stochastic gradient descent (SGD) method to minimize the MSE. As mentioned earlier, in contrast to the usual applications wherein neural networks are used to represent an unknown distribution, we have the full dataset and require representations that are sound with respect to the input dataset. A sound representation for the given finite abstraction produces reachable sets that include the true reachable set T̄(x̄, ū) for every state-input pair (x̄, ū). For instance, the solid green rectangle in Fig. 4 contains the set of reachable states corresponding to N_f(x̄, ū) and includes the set of states in the dotted red rectangle, i.e., the true reachable set; therefore, the representation N_f is sound for the pair (x̄, ū). To guarantee soundness, we need to compute the maximum error induced during the training process over all training data points. To that end, we go over all state-input pairs (stored on the hard drive) and compute the maximum errors in approximating the centers and radii of the ℓ∞ balls, denoted by e_c^f, e_r^f for the forward representation and, similarly, e_c^b, e_r^b for the backward representation. We use these errors to compute the corrected representations T̂_f and T̂_b, corresponding to N_f and N_b, as described next: letting the center and radius components of each corrected representation be defined for every state-input pair (x̄, ū) ∈ X̄ × Ū, for the forward and the backward dynamics, respectively.
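The first step of the scheme, converting each hyper-rectangular reachable set into the center and radius of an ℓ∞ ball, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the corner arrays and the grid parameter `eta` are stand-ins for the abstraction's reachable-set corners and discretization size.

```python
import numpy as np

def ball_from_corners(lower, upper, eta):
    """Center and radius of the l-infinity ball over-approximating a
    hyper-rectangular reachable set with corners `lower` and `upper`.
    The radius is inflated by eta/2 so the ball covers whole grid cells."""
    lower, upper, eta = map(np.asarray, (lower, upper, eta))
    center = (upper + lower) / 2.0
    radius = (upper - lower) / 2.0 + eta / 2.0
    return center, radius

# Example: a 2-D reachable set [1, 3] x [0, 2] on a grid with cell size 0.5
c, r = ball_from_corners([1.0, 0.0], [3.0, 2.0], [0.5, 0.5])
```

A dataset entry for the pair (x̄, ū) then stores ((x̄, ū), (c, r)), matching the structure of D_f described above.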
Let us define the forward transition system T̂_f computed using the trained neural network, where N_f^c(·, ·) and N_f^r(·, ·) denote the components of the output of N_f(·, ·) corresponding to the center and radius of the ball, respectively. Similarly, we can define the transition system T̂_b corresponding to the backward dynamics. The following lemma states that we can use the trained neural networks to compute sound transition systems for both the forward and backward dynamics. Note, however, that our synthesis approach does not require the explicit computation of these transition systems. If the trained representations are accurate, the mismatch rate is low, which results in a less restrictive representation.
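The soundness-correction step can be sketched as follows. This is a schematic under our own simple correction rule (inflate the predicted radius by the worst-case center and radius errors), not necessarily the paper's exact formulas; the trained network is modeled as an arbitrary callable returning a predicted center and radius.

```python
import numpy as np

def soundness_errors(net, dataset):
    """Maximum per-dimension training errors over the full dataset:
    e_c for the predicted centers, e_r for the predicted radii."""
    e_c = e_r = 0.0
    for (x, u), (c_true, r_true) in dataset:
        c_pred, r_pred = net(x, u)
        e_c = np.maximum(e_c, np.abs(c_pred - c_true))
        e_r = np.maximum(e_r, np.abs(r_pred - r_true))
    return e_c, e_r

def corrected(net, e_c, e_r):
    """Corrected representation: inflate the predicted radius so the
    returned ball contains the true ball for every training pair,
    since |c_pred - c_true| <= e_c and r_true <= r_pred + e_r."""
    def rep(x, u):
        c_pred, r_pred = net(x, u)
        return c_pred, r_pred + e_r + e_c
    return rep
```

Because the inflation uses the maximum errors over all state-input pairs, a few badly approximated pairs suffice to make every corrected ball conservative, which motivates the classification-based alternative of Sec. 3.2.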
Remark 1. The method proposed in this section formulates the computation of the representations as a regression problem, wherein the representative neural networks predict the center and radius of the ℓ∞ reachable sets. In Sec. 3.2, we describe a classification-based formulation for compressing finite abstractions, wherein the representative neural networks predict the vectorized indices corresponding to the lower-left and upper-right corners of the reachable set. We experimentally show that this second formulation, while being more memory demanding, provides a less conservative representation than the formulation discussed in this section.

Classification-Based Computation of Representations for Finite Abstractions
We proposed in Sec. 3.1 a formulation for training neural networks that, for any given state-input pair, guess the center and radius of a hyper-rectangular over-approximation of the reachable states. This guess is then corrected using the computed soundness errors. A nice aspect of this formulation is that we only need to store the trained representations and their corresponding soundness errors. However, correcting the output values of the neural networks with the soundness errors may give a very conservative over-approximation of the reachable sets, even when the trained representations perform very well on a large subset of the state-input pairs, since the soundness errors must be computed over all state-input pairs. In this section, we provide an alternative formulation for computing a compressed representation of a given abstraction. Intuitively, our idea is to train neural network representations that, for any given state-input pair, guess the vectorized indices corresponding to the lower-left and upper-right corner points of the hyper-rectangular reachable set. The architecture of the representation is shown in Fig. 6. As illustrated, for every state-input pair (x̄, ū) ∈ X̄ × Ū, the output of the representation gives the lower-left and upper-right corners of the rectangular set that is reachable by taking the control input ū at the state x̄. Alg. 2 describes our classification-based compression scheme for finite abstractions. We first compute the backward system Σ̄_b using Eq. (2). We then compute the training datasets for both the forward and backward systems Σ̄ and Σ̄_b. For Σ̄, let ur_f : X̄ × Ū → X̄ and ll_f : X̄ × Ū → X̄ denote the mappings from a state-input pair (x̄, ū) ∈ X̄ × Ū to the upper-right and lower-left corners of the rectangular set reachable from (x̄, ū). We define a labeling function ℓ_f : X̄ × Ū → {0, 1}^(2·Σ_{i=1}^{n} |X̄(i)|), with |X̄(i)| being the cardinality of the projection of X̄ along the i-th axis, whose output has ones exactly at the entries corresponding to I_{X,i}(ll_f(x̄, ū)(i)) and I_{X,i}(ur_f(x̄, ū)(i)) for i ∈ {1, 2, . . ., n}. Intuitively, each element of the dataset D_f contains a state-input pair (x̄, ū) and a vector v ∈ {0, 1}^(2·Σ_{i=1}^{n} |X̄(i)|) that has ones only at those entries. The training dataset D_b for the backward dynamics is defined similarly. Once the training datasets are ready, we train the neural networks N_f and N_b on the datasets D_f and D_b, respectively. Note that the output layers of N_f and N_b are vectors of size 2·Σ_{i=1}^{n} |X̄(i)|, while the final outputs of the representations are of size 2n (cf. Fig. 6). These final outputs give an approximation of the coordinates of the lower-left and upper-right corners of the reachable set corresponding to the pair (x̄, ū). Note that, because X̄ was computed by partitioning X uniformly, both the indexing function I_{X,i} and its inverse can be implemented in a memory-efficient way using floor and ceil operators. We then evaluate the performance of the trained neural networks N_f and N_b. Let the estimated lower-left and upper-right corners of the reachable set produced by N_f (and, similarly, by N_b) be given, and let the sets of misclassified state-input pairs be M_f and M_b. The soundness errors of N_f and N_b can be taken to be their misclassification rates. For the misclassified pairs in M_f and M_b, we extract the related transitions in the abstraction. Finally, we correct the output of the neural network representations to maintain soundness. Note that these corrected neural representations are memory efficient only if the misclassification rates are small, i.e., if M_f and M_b are small compared with X̄ × Ū.
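Because the partition is uniform, the per-axis indexing function and its inverse need no stored table. The following sketch assumes an axis-aligned grid with lower bound `lo` and cell size `eta` (hypothetical parameter names, 1-based indices as in the interval [1; |X̄(i)|]):

```python
import math

def index_of(x, lo, eta):
    """Map any point inside a grid cell to the cell's 1-based index
    along one axis, using only a floor operation."""
    return math.floor((x - lo) / eta) + 1

def center_of(idx, lo, eta):
    """Inverse map: 1-based cell index back to the cell-center coordinate."""
    return lo + (idx - 0.5) * eta

# Round-trip on a 1-D grid over [0, 10) with cell size 0.5
i = index_of(3.26, lo=0.0, eta=0.5)   # cell covering [3.0, 3.5)
x = center_of(i, lo=0.0, eta=0.5)     # its center
```

Only `lo` and `eta` per axis must be stored, which is what makes the indexing memory-efficient.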

On-the-Fly Synthesis
In the previous subsections, we described the computation of compressed representations for the forward and backward dynamics of finite abstractions. In this subsection, we use these representations to synthesize formally correct controllers. Our synthesis procedure is provided in Alg. 3. It takes the corrected forward and backward representations and synthesizes a controller that fulfills the given reach-avoid specification. Let G_0 be a discrete under-approximation of the target set Goal. We take G_0 as input and perform a fixed-point computation to solve the given reach-avoid game. We initialize the winning set and controller with V_0 = G_0 and C = ∅, and in each iteration we add the newly found winning states and state-input pairs to the overall winning set and the controller, respectively, until no new state is found (V_{i+1} = ∅).
Let V_i be the set of new winning states at the beginning of the i-th iteration, and denote the set of all winning states at the beginning of the i-th iteration by W_i = ∪_{j=0}^{i} V_j. In every iteration, for every x̄ ∈ V_i and ū ∈ Ū, we compute the backward over-approximating ℓ∞ ball and discretize it to obtain the candidate pool P_i defined in Eq. (18), where the center and radius components of the output of the corrected backward representation characterize the ball. Note that we compute the candidate pool by running the backward representation over V_i instead of W_i. This is computationally beneficial, because |V_i| ≤ |W_i|. The next lemma shows that P_i includes the whole set of new winning states V_{i+1}. Lemma 3.2. Let the set of candidates P_i be as defined in Eq. (18). Then, we have V_{i+1} ⊆ P_i for all i ≥ 0.
Proof. We prove this lemma by contradiction. Suppose that V_{i+1} ⊄ P_i. Then there exists at least one x̄* ∈ V_{i+1} \ P_i. Since x̄* ∈ V_{i+1}, we know that there exists at least one ū* ∈ Ū whose forward reachable set from x̄* is contained in W_i and intersects V_i; by soundness of the backward representation, x̄* then belongs to the discretized backward ball computed from some state in V_i under ū*, i.e., x̄* ∈ P_i, which is a contradiction. □ Now we can use the forward representation in order to choose the legitimate candidates out of P_i and add the new ones to V_{i+1}. Let O be a discrete over-approximation of the set of obstacles. The next lemma states that we can use the forward representation to compute V_{i+1}. Lemma 3.3. The set of states added to the winning set in the i-th step can be computed as in Eq. (19). Proof. To prove this lemma, we denote the right-hand side of Eq. (19) by A and show V_{i+1} ⊆ A and A ⊆ V_{i+1}. The second direction (A ⊆ V_{i+1}) holds by definition. To prove the first direction (V_{i+1} ⊆ A), we note that A ⊆ P_i and, further, by the result of Lemma 3.2, V_{i+1} ⊆ P_i. Assume V_{i+1} ⊄ A. Then there must exist at least one x̄* ∈ V_{i+1} \ A; note that x̄* ∈ P_i \ A. Since x̄* ∈ V_{i+1}, there exists at least one ū* ∈ Ū for which the forward reachable set of (x̄*, ū*) is contained in W_i. Also, because x̄* ∉ A, the forward reachable set of (x̄*, ū*) is not contained in W_i, which is a contradiction. Therefore, V_{i+1} ⊆ A, and the proof ends. □ In each iteration, we calculate Γ_i, the set of new state-input pairs that must be added to the controller, defined in Eq. (20). Finally, if V_{i+1} = ∅, we can terminate the computation, as we have already computed the winning set and the controller. Otherwise, we add V_{i+1} and Γ_i to the overall winning set (W_{i+1} ← W_i ∪ V_{i+1}) and controller (C ← C ∪ Γ_i) and repeat the process.
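The fixed-point loop of Alg. 3 can be sketched abstractly over finite sets as follows. This is a schematic, not the paper's implementation: `pre_candidates` stands in for the discretized backward ball that generates the candidate pool P_i, and `post` stands in for the corrected forward representation.

```python
def synthesize(states, inputs, goal, avoid, pre_candidates, post):
    """Reach-avoid fixed point: grow the winning set from the goal until
    no new state is found; return (winning set, controller map)."""
    winning = set(goal)
    controller = {}
    frontier = set(goal)          # V_i: states discovered in the last round
    while frontier:
        # Candidate pool: discretized backward reachable sets of the frontier.
        candidates = set()
        for x in frontier:
            for u in inputs:
                candidates |= pre_candidates(x, u)
        new = set()
        for x in (candidates & set(states)) - winning - set(avoid):
            for u in inputs:
                succ = post(x, u)
                # u is valid if every successor is winning and none is avoided.
                if succ <= winning and not (succ & set(avoid)):
                    new.add(x)
                    controller.setdefault(x, set()).add(u)
        winning |= new
        frontier = new
    return winning, controller
```

Running the backward step only over the frontier (rather than the whole winning set) mirrors the |V_i| ≤ |W_i| optimization noted above.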

DEPLOYMENT
Once the controller C is computed such that C ∥ Σ realizes the given specification Φ, we need to deploy C onto an embedded controller platform, e.g., a microcontroller. Since such platforms generally have a small on-board memory, we would like to minimize the size of the stored controller. We define the set of valid control inputs at x̄ as C(x̄) = {ū | (x̄, ū) ∈ C}. The approach we proposed for finding representations of finite abstractions may not work here, since we are not allowed to over-approximate C(x̄), and the set of valid control inputs is thus not representable as a compact ℓ∞ ball described by its center and radius. The following example illustrates a disconnected C(x̄), which cannot be represented by an ℓ∞ ball.

Example 1. Consider a system with one-dimensional state and input spaces (n = m = 1). Fig. 7 illustrates the set of transitions starting from the white middle box (x̄ = 0). Let the boxes with a green check mark and a red cross mark correspond to the target and obstacle states, respectively, and let C be the controller for the corresponding reach-avoid specification. Then we have {(0, 2), (0, 3), (0, −2), (0, −3)} ⊆ C, so the set of valid control inputs at x̄ = 0 is disconnected.

In contrast to the symbolic regression method proposed in [40], we formulate the controller compression problem as a classification task: we train a neural network that assigns to every state a list of scores over the set of control inputs and picks the control input with the highest score. The configuration of the neural network is illustrated in Fig. 8. The justification for our formulation is that any representation of the controller can only perform well if it is trained over a dataset that respects the continuity property, i.e., neighboring states are not mapped to control input values that differ greatly from each other. A representation that respects the continuity property corresponds to a low continuity index (see Eq. (3)). During the training phase, we keep all the valid control inputs and let the training process choose, by minimizing the cost function, the values that best respect the continuity property. Therefore, our formulation automatically takes care of the redundancy problem by mapping a neighborhood in the state space to close-in-value control inputs, respecting the continuity requirement of the trained representation. The reason our formulation does not correspond to a standard classification setting is that during training a non-uniform number of labels (the control input values at the output stage of the neural network) per input (the state values at the input layer) are considered valid, while at runtime we consider only one label, the one with the highest score, as the trained representation's choice.

Remark 2. To formulate the problem of finding a neural-network-based representation of the controller as a regression problem, the training data would first have to be pre-processed so that the continuity property is respected, i.e., the set of valid control inputs per state would have to be pruned so that neighboring states are mapped to close-in-value control inputs. However, this pre-processing is time consuming and does not work efficiently in practice (see, e.g., [9, 40]).
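The non-standard classification setting described above, where several labels per state are valid, can be sketched with the following loss. This is a minimal numpy illustration under our own choice of cost function (cross-entropy against the normalized multi-hot set of valid inputs); the paper's exact loss may differ.

```python
import numpy as np

def multi_valid_loss(scores, valid_mask):
    """Cross-entropy against the normalized multi-hot vector of valid inputs:
    any valid control input counts as correct, and minimization is free to
    concentrate probability mass on whichever valid input best respects
    continuity with neighboring states."""
    scores = scores - scores.max(axis=1, keepdims=True)   # stable softmax
    log_p = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    targets = valid_mask / valid_mask.sum(axis=1, keepdims=True)
    return -(targets * log_p).sum(axis=1).mean()

# Two states; state 0 accepts inputs {1, 2}, state 1 accepts only input 0.
mask = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 0.0]])
good = np.array([[-5.0, 4.0, 4.0], [4.0, -5.0, -5.0]])  # mass on valid inputs
bad = np.array([[4.0, -5.0, -5.0], [-5.0, 4.0, 4.0]])   # mass on invalid ones
assert multi_valid_loss(good, mask) < multi_valid_loss(bad, mask)
```

At runtime only the argmax of the scores is used, which collapses the multi-label training target to a single deployed control input per state.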
Alg. 4 summarizes the proposed procedure for computing a compressed representation of the original controller. In the first step, we store the training set D_C (Eq. (21)), where each data point consists of a state x̄ ∈ W and a vector L(x̄) of length |Ū| that has ones at the entries corresponding to the valid control inputs and zeros elsewhere.
Once the training dataset is ready, we can train a neural network N_C that takes x̄ ∈ X̄ as input and approximates a scoring of the control inputs L(x̄) in the output, where the final control input is recovered through I_U^{-1}(·), the inverse of the indexing function used in Eq. (21).
Remark 3. Note that the output layer of N_C has to be of size |Ū|, and for every x̄ ∈ W we consider the value I_U^{-1}(argmax(N_C(x̄))) as the final control input assigned by N_C to the state x̄. Moreover, because Ū was computed by partitioning U uniformly, both the indexing function I_U and its inverse can be implemented in a memory-efficient way using floor and ceil functions.
Once the neural network N_C is trained, we evaluate its performance by finding all states x̄ at which N_C produces an invalid control input (Eq. (22)); the misclassification rate of the trained classifier N_C is defined as the fraction of such states. To maintain the guarantee provided by the original controller C, it is crucial to correct the output of the trained representation so that it outputs a valid control input at every state. If the misclassification rate is small, we can store N_C together with the look-up table C̃ defined in Eq. (23). The final deployable controller Ĉ consists of both N_C and C̃, and is defined in Eq. (24). Lemma 4.1. Let Ĉ be as defined in Eq. (24). The winning domains of Ĉ ∥ Σ and C ∥ Σ for satisfying a specification Φ are the same. Remark 4. Our deployment method preserves soundness. The input to our deployment approach is a formally guaranteed controller computed by any abstraction-based method. We train a neural representation that maps states to control inputs. This control input is valid for the majority of the states; for the states at which it is not valid, we keep the set of valid control inputs from the original controller and store them as a small look-up table. Therefore, the final corrected neural controller in Eq. (24) is sound with respect to the original controller.
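The construction of the corrected deployable controller can be sketched as follows. This is a schematic, not the paper's Eq. (24) verbatim: `net`, `valid_inputs`, and `states` are hypothetical stand-ins for the trained scorer N_C, the original controller's valid-input sets, and the winning domain.

```python
import numpy as np

def make_corrected_controller(net, valid_inputs, states):
    """Deployable controller: use the network's top-scoring input wherever it
    is valid, and fall back to a small look-up table of stored valid inputs
    for the misclassified states."""
    fallback = {}
    for x in states:
        guess = int(np.argmax(net(x)))
        if guess not in valid_inputs[x]:
            fallback[x] = min(valid_inputs[x])   # any stored valid input works
    def controller(x):
        return fallback.get(x, int(np.argmax(net(x))))
    return controller, fallback
```

The ratio `len(fallback) / len(states)` is the misclassification rate: the smaller it is, the smaller the residual look-up table, and hence the larger the memory saving.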

EXPERIMENTAL EVALUATION
We evaluate the performance of our proposed algorithms on several control systems, whose dynamics are listed in Tab. 1. We used configurations (1) and (2) in Tab. 1 for evaluating our methods for synthesis and deployment, respectively. In all case studies, we construct the transition system using the sampling approach in [18], which generates T̄ from sampled trajectories while providing confidence on the correctness of T̄. Our experiments were performed on a cluster with Intel Xeon E7-8857 v2 CPUs (32 cores in total) at 3 GHz, with 100 GB of RAM. For training the neural networks, we did not use a distributed implementation, as we found that distributing the process across GPUs actually decelerates it. For the rest of our compression and synthesis algorithms, however, we used a distributed implementation.

Synthesis. We considered the ℓ∞ ball centered at (4, 4) with radius 0.8 over the Euclidean plane as the target set for the multi-dimensional car examples, [−0.5, 0.5] × [−1, 1] for the inverted pendulum example, and [−1, 1]^4 for the TORA example. To evaluate our corrected neural method described in Subsec. 3.1, we set the list of neuron numbers in the layers to (n + m, 20, 40, 30, 2n), select hyperbolic tangent activation functions, and set the learning rate to α = 0.001. As discussed in Subsec. 3.2, the corrected neural representations for finite abstractions can also be constructed by classification; for this method, we select ReLU activation functions and set the learning rate to α = 0.0001. We used the stochastic gradient descent method with the corresponding learning rate for training the neural networks [30].

Table 2. Results of regression-based controller synthesis for finite abstractions. |X̄ × Ū| indicates the number of discrete state-input pairs; the soundness errors for the forward and backward representations are computed using Eq. (6); the graph mismatch rates for the forward and backward dynamics use Eq. (4); M_T gives the memory needed to store the original transition system in kB; M_f + M_b denotes the memory taken by the representing neural networks for the forward and backward dynamics in kB; T_c denotes the total execution time for computing the compressed representations in minutes; and T_s denotes the total execution time for synthesizing the controller in minutes.

Table 3. Results of classifier-based controller synthesis for finite abstractions. Columns are as in Tab. 2, with the soundness errors for the forward and backward representations computed using Eq. (14).

Tabs. 2 and 3 report the synthesis results of our experiments for finite abstractions. Notably, the training process does not suffer from over-parametrization of the neural networks. We demonstrate this in Fig. 10 by plotting the error as a function of the depth of the neural representation for the 3D car example: the error always decreases as the depth of the neural representation increases. Therefore, the structure of the neural representations can be selected to achieve an acceptable accuracy within a given time bound for the training process.

DISCUSSION AND CONCLUSIONS
In this paper, we considered abstraction-based methods for controller synthesis to satisfy high-level temporal requirements. We addressed the (exponentially large) memory requirements of these methods in both synthesis and deployment. Using the expressive power of neural networks, we proposed memory-efficient methods to compute compressed representations of both the forward and backward dynamics of the system. Focusing on reach-avoid specifications, we showed how to perform synthesis using corrected neural representations of the system. We also proposed a novel formulation for finding compact corrected neural representations of the controller to reduce the memory requirements of deployment. Finally, we evaluated our approach on multiple case studies, showing reductions in memory requirements by orders of magnitude while providing formal correctness guarantees.

Extension to more general specifications. Our approach is based on computing an under-approximation of the Pre operator and an over-approximation of the Post operator. Therefore, it can be applied to any synthesis problem whose solution is characterized in terms of these operators. This means our approach can be applied to control synthesis for other linear temporal logic specifications, including safety, Büchi, and Rabin objectives.

Reusability of the computed representations. Our approach computes corrected neural representations that are sound on the whole state space. These representations can be reused for any other problem defined over the same finite abstraction.

Application to systems with a known analytical model. Our approach efficiently provides compact representations of a given finite abstraction at the cost of increased off-line computation time, regardless of whether the finite abstraction is constructed using model-based methods or (correct) data-driven methods. Model-based on-the-fly synthesis methods will utilize numerical solutions of differential equations when the analytical model of the system is known, with available bounds on the continuity properties of the system. These methods may perform better in case solving the corresponding differential equations is faster than making a forward pass through the neural representation.
Comparison with a baseline method. We have demonstrated the effectiveness of our method on a number of case studies in compressing finite transition systems and controllers that are stored in the form of look-up tables. In the introduction and related work sections of our paper, we have discussed why other methods cannot be used to solve our problem. Below, we list our main arguments.
• While transition systems and controllers can be encoded using BDDs instead of look-up tables, the memory blow-up problem still exists for higher-dimensional systems. Using our technique, we empirically show that the size of the computed representations is not necessarily affected by the size of the original mapping. See, for example, Fig. 9 (Left), wherein the memory required by the trained compressed representation stays at 488 kB even though the memory required by the original transition system grows by a factor of 5000.
• Our synthesis setting is different from the one considered in [15, 25, 31], wherein memory-efficient synthesis methods are proposed based on a (compact) analytical description of the nominal dynamics of the system and its growth bound. We consider the case wherein the input is a huge finite transition system, which can also be learned from simulations.
• Finally, while the controller determinization and compression schemes proposed in [15, 40] are based on BDD and ADD encodings of the controller, the only method that is methodologically in a similar spirit to our deployment approach is the symbolic regression of [40]. As mentioned by the authors of [40], their regression-based method is not able to represent the original controller with acceptable accuracy. Our superior performance is mainly due to our classification-based formulation, as opposed to a regression-based one.
Utilizing invertible neural networks. Our method requires training two different neural networks for the forward and backward dynamics. A possible future research direction is to use a single invertible neural network instead of training two separate networks. However, given the specific application and the inherent differences between our approach and prior successful uses of invertible neural networks, it is currently not obvious to us that the same performance would be attainable.
Choice of hyper-rectangular reachable sets. Hyper-rectangular sets are the most popular choice for representing reachable sets in abstraction-based controller synthesis (see, e.g., [28]).
Other templates could also be used to represent reachable sets; moreover, the conservativeness of the over-approximation decreases as the discretization becomes finer. Our approach can, in principle, be applied to any parametrization of the reachable sets.
Robustness against adversarial examples. In general, neural networks are not robust to adversarial examples: a small change in their input can produce a large change in their output, resulting in errors. We emphasize that going from a look-up-table representation of the controller to a neural network representation does not affect robustness against adversarial attacks unless the attacker gains access to the middle layers of the neural network. Any adversarial attack on the inputs of the neural network can be studied using similar techniques and concepts from the robustness analysis of abstraction-based methods, and is independent of the controller representation.

Fig. 1 .
Fig. 1. Graphical description of the proposed scheme for compressing finite abstractions.

Fig. 5 .
Fig. 5. The regression-based configuration used in compressing abstractions. The input to the neural network is the state-input pair (x̄, ū), and the output is the pair (c, r) corresponding to the center and radius of the rectangular reachable set, respectively.

Fig. 6 .
Fig. 6. The classification-based representation of finite abstractions. The representation receives a state-input pair (x̄, ū). In the output, ll and ur correspond to the lower-left and upper-right corners of the rectangular reachable set.

Algorithm 2 :
Computing classification-based representations of finite abstractions. Data: forward dynamics Σ̄ and learning rate α. 1: Compute the backward dynamics Σ̄_b and the datasets D_f and D_b using Eqs. (2), (11), and (12). 2: Train neural networks N_f and N_b on the datasets D_f and D_b using the learning rate α. 3: Compute the sets of misclassified state-input pairs M_f and M_b as in Eq. (13). 4: Compute the sets of transitions Ñ_f and Ñ_b associated with M_f and M_b as in Eq. (15). 5: Compute the corrected neural representations using Eqs. (16), (17). Result: corrected representations. The indexing function I_{X,i} : X̄(i) → [1; |X̄(i)|] maps every element of X̄(i) to a unique integer index in the interval [1; |X̄(i)|].

Fig. 7 .
Fig. 7. Illustration of a disconnected set of valid control inputs.

Fig. 8 .
Fig. 8. The configuration used in compressing controllers. Given a state x̄, the representation produces a corresponding control input ū.

Algorithm 4 :
Compression algorithm for the controller. Data: controller C and learning rate α. 1: Compute the dataset D_C using Eq. (21). 2: Train the neural network N_C on the dataset D_C using the learning rate α. 3: Compute the set of misclassified states using Eq. (23). 4: Compute Ĉ using Eq. (24). Result: corrected neural representation Ĉ. The indexing function I_U for the control set Ū assigns every value in Ū a unique integer in the interval [1; |Ū|].

Fig. 10 .
Fig. 10. The effect of increasing the depth of the neural representation on the norm of the soundness error (cf. Eq. (6)) for regression-based controller synthesis (Left), the soundness error (cf. Eq. (14)) for classification-based controller synthesis (Middle), and the misclassification rate (cf. Eq. (22)) for deployment (Right). The experiments are performed on the 3D car example.
and T̂_b, and only uses the compressed representations N_f and N_b. Lemma 3.1. The transition systems T̂_f and T̂_b computed by (9) and (10) are sound for the forward and backward dynamics, i.e., T̄_f ⊆ T̂_f and T̄_b ⊆ T̂_b. To reduce the level of conservativeness, we require that T̂_f and T̂_b do not contain too many additional edges compared to T̄_f and T̄_b. The mismatch rates of the forward and backward dynamics are defined in Eq. (4).

Table 1 .
Catalog of models used to generate the finite abstractions in Sec. 5.