Products, Polynomials and Differential Equations in the Stream Calculus

We study connections among polynomials, differential equations, and streams over a field 𝕂, in terms of algebra and coalgebra. We first introduce the class of (F,G)-products on streams, those for which the stream derivative of a product can be expressed as a polynomial function of the streams themselves and their derivatives. Our first result is that, for every (F,G)-product, there is a canonical way to construct a transition function on polynomials such that the resulting unique final coalgebra morphism from polynomials into streams is the (unique) commutative 𝕂-algebra homomorphism, and vice versa. This implies that one can reason algebraically about streams via their polynomial representation. We apply this result to obtain an algebraic-geometric decision algorithm for polynomial stream equivalence, for a generic underlying (F,G)-product. Finally, we extend this algorithm to solve a more general problem: finding all valid polynomial equalities that fit a user-specified polynomial template.


INTRODUCTION
We investigate a connection among polynomials, differential equations, and streams, i.e., infinite sequences of elements from a set [26]. At a very informal level, this connection can be expressed by the following correspondences: polynomials = syntax; differential equations = operational semantics; streams = abstract (denotational) semantics. There are two important motivations behind this standpoint. (1) Diverse notions of product (convolution, shuffle, ...) arise in streams, in relation to different models: discrete computations, combinatorial sequences, analytic functions, and more [2,26]. There is also a close analogy between several forms of products and forms of parallelism arising in concurrency. Our aim is to uniformly accommodate such diverse notions, by automatically deriving an operational semantics for polynomials that is adequate for a given generic stream product. (2) Once adequate polynomial syntax and operational semantics have been obtained, one can apply powerful techniques both from algebraic geometry (Gröbner bases [13]) and from coalgebra (coinduction [26]) for reasoning on streams. This includes devising algorithms for deciding stream equivalence. Again, one would like to do so in a uniform fashion w.r.t. an underlying notion of stream product.
Technically, achieving these goals amounts to defining a fully abstract semantics from polynomials to streams, which is essential for algebraic-geometric reasoning on streams. Moreover, one wants the resulting construction to be as parametric as possible with respect to the underlying notion of stream product.
As hinted above, we will pursue these goals by relying on tools from algebra and coalgebra. Indeed, it is well known that, when polynomial coefficients and stream elements are drawn from a field K, both polynomials and streams form commutative K-algebras, i.e., rings with an additional vector space structure over K. Note that, while this algebra structure is fixed for polynomials, it varies with the underlying product for streams. On the other hand, streams also possess a coalgebraic structure, arising from the operation of stream derivative. On the side of polynomials, it is also natural to interpret a differential equation ẋ_i = p_i as a transition x_i → p_i: thus one expects a transition structure, hence a coalgebra, over polynomials as well. How to appropriately extend transitions from individual variables x_i to monomials and polynomials, though, nontrivially depends on the notion of stream product one wants to model.
Our first result is that the goals outlined above can be achieved for the class of (F, G)-products on streams, where, basically, the derivative of a product of two streams can be expressed as a polynomial of the streams themselves and their derivatives. As an example, the convolution, shuffle, Hadamard, and infiltration products (see, e.g., [2]) all fall in this class. One can then define a coalgebra structure on polynomials, depending on the given (F, G)-product and differential equations, such that the unique morphism from this coalgebra to the coalgebra of streams is also a commutative K-algebra homomorphism. And vice versa: every homomorphism that satisfies the given differential equations is the unique morphism. Thus, full abstraction is achieved.
A major application of this result is an algorithm based on an algebraic-geometric procedure for deciding stream equivalence, i.e., whether two polynomials denote the same stream. This procedure is then smoothly extended to an algorithm to find, for instance, all valid polynomial identities up to a given degree. These algorithms are illustrated on specific (F, G)-products (convolution, shuffle), by automatically finding nontrivial valid polynomial equations for a few examples of SDEs. Note that algebraically solving such equations in the ring of streams [26,27] leads, in turn, to closed forms for generating functions of sequences [16,30].
To sum up, we make the following two main contributions. (1) A unifying treatment of stream products, implying that, under reasonable assumptions, coalgebra morphisms from polynomials to streams are also commutative K-algebra homomorphisms (full abstraction), and vice versa.
(2) Algorithms, relying on the full abstraction result, for deciding polynomial stream equivalence and for finding valid stream polynomial identities.
Structure of the article. The rest of the article is organized as follows. In Section 2, we introduce the necessary background on polynomials, differential equations, streams, and coalgebras. Section 3 contains our main result, the coincidence of coalgebra morphisms and algebra homomorphisms, under certain conditions on the underlying stream product. As a major application of this result, we present in Section 4 an algorithm for deciding stream equality. This is expanded in Section 5, where we present a method to find all valid polynomial identities that fit a given template. This result is related to existing algorithms for linear weighted automata/expressions. Section 6 briefly draws some concluding remarks and discusses potential directions for future work. For ease of reading, the proofs of some intermediate technical results have been confined to a separate appendix, Appendix A.
The present article is partly based on material presented in the conference paper [12]. The new material here consists of: the detailed proofs of the main results presented in [12] (here Theorem 3.7 and Proposition 3.8 in Section 3, and Theorem 4.2 in Section 4); a new discussion of related work concerning bialgebras and distributive laws (Related work section, and Remark 2 in Section 3); a new discussion of the fixed point characterizations of the kernel of the final morphism (final part of Section 4); a new template-based algorithm to find polynomial identities and its proof of correctness (Theorem 5.3 in Section 5); and a result relating this algorithm to known coalgebraic algorithms for weighted automata (Theorem 5.5 in Section 5).

Related work
Rutten's stream calculus [26,27], a coinductive approach to the analysis of infinite sequences (streams), is a major source of inspiration for our work. Ref. [26] studies streams, automata, languages, and formal power series in terms of coalgebra morphisms and bisimulation. In close analogy with classical analysis, [27] presents coinductive definitions and proofs for a calculus of behavioural differential equations, also called stream differential equations (SDEs) in later works. A number of applications to difference equations, analytical differential equations, continued fractions, and problems from combinatorics are presented. Convolution and shuffle products play a central role in the stream calculus; a duality between them, mediated by a variation of the Laplace transform, exists [28].
A coinductive treatment of analytic functions and the Laplace transform is also presented by Escardó and Pavlović [25]. Basold et al. [2] enrich the stream calculus with two types of products, Hadamard and infiltration, and exhibit a duality between the two, mediated by a so-called Newton transform. Although these works form a conceptual prerequisite of our study, they do not offer a unifying treatment of the existing disparate notions of stream product, nor any algorithmic treatment of the induced stream equivalences.
Boreale [7] and Bonchi et al. [4] consider an operational approach to streams and the convolution product based on weighted automata, which basically correspond to linear SDEs and expressions. They offer an equivalence-checking algorithm for such automata and the recognized streams, based on a linear-algebraic construction; however, the polynomial case is not addressed. Related to this is the work of Bonchi et al. [1], where algorithms for equality of streams specified by linear SDEs are presented. Our results here generalize these algorithms, as we can also work with polynomial SDEs. This will be made precise and discussed in the final part of Section 5.
We also mention [8,11], which adopt a coinductive approach to reasoning on polynomial ODEs. The ring of multivariate polynomials is employed as a syntax, with Lie derivatives inducing a transition structure. An algebraic-geometric algorithm to decide polynomial equivalence is presented. This algorithm as well has inspired our decision method: in particular, as Lie derivatives are precisely the transition structure induced in our framework by the shuffle product, the decision algorithms of [8,11] are in essence special cases of our algorithms in Sections 4 and 5. Furthermore, [9,10] extend the framework of [8,11] to polynomial partial differential equations, which pose significant additional challenges.
Somewhat related to ours is the work of Winter on coalgebra and polynomial systems; see, e.g., [31, Ch. 3]. Importantly, Winter considers polynomials in noncommuting variables: under suitable assumptions, this makes his systems of equations isomorphic to certain context-free grammars; see also [21]. The use of noncommuting variables sets Winter's treatment in a mathematical realm totally different from the one considered in this article. In particular, the algebraic-geometric concepts we rely on here, like ideals and Gröbner bases, are not applicable in Winter's framework.
More closely related to ours is the work of Hansen, Kupke and Rutten [17]. There the authors prove that, when the SDEs defining given operations on streams obey a GSOS syntactic format, the final coalgebra morphism is also a homomorphism from the free term algebra to the algebra (w.r.t. the given operations) of streams [17, Section 8]. It is interesting to note that our notion of (F, G)-product falls in the abstract GSOS format. However, we work with the algebra of polynomials, which, besides being a commutative ring and a vector space over K, possesses additional structure arising from monomials. All this structure is essential for algebraic-geometric reasoning, and sets our approach apart from those based on term algebras: for one thing, in term algebras there is no obvious analog of Hilbert's basis theorem, a result deeply related to the well-ordering of monomials (cf. Dickson's lemma [13, Ch. 2]) and a crucial ingredient in our decision algorithm. One might consider more complicated GSOS frameworks enriched with equational theories: indeed, an abstract version of the GSOS format has also been discussed in the framework of bialgebras [17, Section 9] and distributive laws [6]. Bialgebras require a substantial background in category theory, which we have preferred to avoid here so as to keep our approach as elementary and accessible as possible. A more technical discussion on this point is deferred to Section 3, Remark 2.

BACKGROUND

Polynomials and Differential Equations
Let us fix a finite, nonempty set of symbols or variables X = {x_1, ..., x_n} and a distinct variable x ∉ X. Informally, x will act as the independent variable, while x_1, ..., x_n will act as dependent variables, or functions, defined by differential equations (see below). Notationally, we write x̃ to denote {x} ∪ X. We fix a generic field K of characteristic 0; K = ℝ and K = ℂ are typical choices. We let P := K[x̃], ranged over by p, q, ..., be the set of polynomials with coefficients in K and indeterminates in x̃. We let M, ranged over by m, m′, ..., be the set of monomials, that is, the free commutative monoid generated by x̃. As usual, we shall denote polynomials as formal finite sums of distinct monomials with nonzero coefficients in K: p = Σ_{i∈I} r_i m_i, for r_i ∈ K and m_i ∈ M. By slight abuse of notation, we shall write the zero polynomial and the empty monomial as 0 and 1, respectively. Over P, one can define the usual operations of sum p + q and product p • q, with 0 and 1 as identities, enjoying commutativity, associativity, and distributivity, which make P a ring; multiplication of p ∈ P by a scalar r ∈ K, denoted rp, is also defined and makes (P, +, 0) a vector space over K. Therefore, (P, +, •, 0, 1) forms a commutative K-algebra.
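To make the algebra structure of P concrete, here is a minimal sketch (all function names are ours, purely illustrative): a polynomial is a dict from monomials, encoded as sorted tuples of variable names, to nonzero coefficients, with K modelled by exact rationals.

```python
# Hypothetical encoding of P: monomial = sorted tuple of variable names
# (free commutative monoid), polynomial = {monomial: nonzero coefficient}.
from fractions import Fraction

def poly_add(p, q):
    """Sum in P: merge coefficient dicts, dropping zero coefficients."""
    r = dict(p)
    for m, c in q.items():
        s = r.get(m, Fraction(0)) + c
        if s == 0:
            r.pop(m, None)
        else:
            r[m] = s
    return r

def poly_mul(p, q):
    """Product in P: multiply monomials by merging and re-sorting variables."""
    r = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(sorted(m1 + m2))   # commutativity
            r[m] = r.get(m, Fraction(0)) + c1 * c2
    return {m: c for m, c in r.items() if c != 0}

ONE = {(): Fraction(1)}   # the empty monomial: the unit 1
ZERO = {}                 # the zero polynomial

# Example: (x1 + 1) * (x1 - 1) = x1^2 - 1
p = {("x1",): Fraction(1), (): Fraction(1)}
q = {("x1",): Fraction(1), (): Fraction(-1)}
```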
We shall also fix a set D = {ẋ_1 = p_1, ..., ẋ_n = p_n} of differential equations, one for each x_i ∈ X, where the p_i's belong to P and are called drifts. An initial condition for D is a vector ρ = (r_1, ..., r_n) ∈ K^n. The pair (D, ρ) forms an initial value problem. Informally, each x_i ∈ X represents a placeholder for a function whose derivative is given by p_i, and whose value at the origin is x_i(0) = r_i. This terminology is borrowed from the theory of differential equations. However, note that, depending on the semantics of the polynomial product one adopts (see next section), D can be given diverse interpretations, including SDEs (for convolution; see next subsection) in the sense of Rutten [26] and, of course, ordinary differential equations (ODEs, for shuffle). In the literature on bialgebras and distributive laws, initial value problems are sometimes referred to as coequations [14].
Notationally, it will sometimes be convenient to regard D and ρ as functions D : X → P and ρ : X → K, respectively, such that D(x_i) = p_i and ρ(x_i) = r_i. It is also convenient to extend D and ρ to the variable x by letting D(x) = 1 and ρ(x) = 0; note that, seen as an initial value problem, the last two equations define the identity function. Finally, we let x_0 denote x and, when using D and ρ as functions, use x_i as a metavariable ranging over {x} ∪ X: this makes D(x_i) and ρ(x_i) well defined for 0 ≤ i ≤ n.

Streams
We quickly review some basic notions taken from [26]. We let Σ_K := K^ω, ranged over by σ, τ, ..., denote the set of streams, that is, infinite sequences of elements from K: σ = (r_0, r_1, r_2, ...) with r_i ∈ K. Often K is understood from the context and we shall simply write Σ rather than Σ_K. When convenient, we shall explicitly consider a stream σ as a function from ℕ to K and, e.g., write σ(i) to denote the ith element of σ. By slightly overloading the notation, and when the context is sufficient to disambiguate, the stream (r, 0, 0, ...) (r ∈ K) will be simply denoted by r, while the stream (0, 1, 0, 0, ...) will be denoted by x; see [26] for the motivations behind these notations. Furthermore, a stream made up of all the same element r ∈ K will be denoted as r̄ = (r, r, ...). One defines the sum of two streams σ and τ as the stream σ + τ defined by: (σ + τ)(i) := σ(i) + τ(i) for each i ≥ 0, where the + on the right-hand side denotes the sum in K. Sum enjoys the usual commutativity and associativity properties, and has the stream 0 = (0, 0, ...) as an identity.
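These definitions transcribe directly if one represents a stream, as suggested above, as a function from ℕ to K (the helper names below are ours, purely illustrative):

```python
# Streams as functions N -> K: pointwise sum, the embedded scalar r,
# the constant stream, and the distinguished stream x = (0, 1, 0, 0, ...).
def s_add(sigma, tau):
    return lambda i: sigma(i) + tau(i)

def scalar(r):
    """The stream (r, 0, 0, ...), denoted simply r in the text."""
    return lambda i: r if i == 0 else 0

def const(r):
    """The constant stream (r, r, r, ...)."""
    return lambda i: r

X = lambda i: 1 if i == 1 else 0   # the stream x

def prefix(sigma, n):
    """First n elements, for inspection."""
    return [sigma(i) for i in range(n)]
```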
Various forms of stream products, generically denoted by π and with identity 1_π, can also be considered; this is indeed a central theme of our article. In particular, the convolution product (written ×) and the shuffle product (written ⊗) are defined as follows, for any i ≥ 0:

(σ × τ)(i) := Σ_{k=0}^{i} σ(k) · τ(i−k)        (σ ⊗ τ)(i) := Σ_{k=0}^{i} C(i, k) · σ(k) · τ(i−k)

where C(i, k) denotes the binomial coefficient and the operations on the right-hand side are carried out in K. Both products are commutative, associative, have 1 = (1, 0, 0, ...) as an identity, and distribute over +; multiplication of σ = (r_0, r_1, ...) by a scalar r ∈ K, denoted rσ = (r r_0, r r_1, ...), is also defined and makes (Σ, +, 0) a vector space over K. Therefore, (Σ, +, π, 0, 1) forms a commutative K-algebra for both products. Let us record the following useful properties for future use:

x × σ = (0, r_0, r_1, ...)        and        r π σ = (r r_0, r r_1, ...),        (1)

where r ∈ K and π ∈ {×, ⊗}. In view of the second equation above, r π σ coincides with rσ. The first equation above leads to the so-called fundamental theorem of the stream calculus, whereby each stream σ can be decomposed as σ = σ(0) + x × σ′, where σ′ denotes the stream derivative introduced in the next subsection. Less commonly found forms of products, like the Hadamard and infiltration products, will be introduced in the next subsection; equations similar to (1) exist also for such products [2,17].
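Both products can be computed elementwise straight from their defining formulas; a sketch (helper names ours; `math.comb` supplies the binomial coefficients of shuffle):

```python
# (sigma x tau)(i)   = sum_{k=0..i} sigma(k) * tau(i-k)          (convolution)
# (sigma (x) tau)(i) = sum_{k=0..i} C(i,k) * sigma(k) * tau(i-k) (shuffle)
from math import comb

def conv(sigma, tau):
    return lambda i: sum(sigma(k) * tau(i - k) for k in range(i + 1))

def shuffle(sigma, tau):
    return lambda i: sum(comb(i, k) * sigma(k) * tau(i - k) for k in range(i + 1))

ones = lambda i: 1   # the stream (1, 1, 1, ...)

# Under convolution, (1,1,1,...) squares to (1,2,3,...); under shuffle it
# squares to (2^i), matching the product of exponential generating
# functions e^z * e^z = e^{2z}.
```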

Coalgebras, SDEs, and Bisimulation
We quickly review some basic definitions and results about coalgebras and bisimulation; see, e.g., [26] for a comprehensive treatment. A (stream) coalgebra with outputs in K is a Moore automaton (S, δ, o), that is, a set S equipped with a transition (derivative) function δ : S → S and an output function o : S → K. The set of streams Σ can be naturally given a stream coalgebra structure (Σ, (•)′, o(•)), as follows. The output of a stream σ = (r_0, r_1, ...) is o(σ) := r_0 and its derivative is σ′ := (r_1, r_2, ...); that is, σ′ is obtained from σ by removing its first element, which constitutes the output of σ. In fact, this makes Σ final in the class of all coalgebras with outputs in K [26]. This also implies that one can prove equality of two streams by exhibiting an appropriate bisimulation relation relating them (coinduction).
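Finality is what makes the following unfolding well defined: any state of a Moore automaton denotes exactly one stream, obtained by iterating outputs and transitions. A small hypothetical example with a two-state automaton:

```python
# A finite stream coalgebra (S, delta, o) with outputs in K, given by dicts;
# the unique morphism into streams is computed prefix-wise by unfolding.
delta = {"a": "b", "b": "a"}
o = {"a": 0, "b": 1}

def unfold(s, n):
    """First n elements of the stream denoted by state s."""
    out = []
    for _ in range(n):
        out.append(o[s])
        s = delta[s]
    return out
```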
It is sometimes convenient to consider an enhanced form of bisimulation on Σ that relies on the notion of linear closure. Given a relation R ⊆ Σ × Σ, its linear closure R̄ is the set of pairs of the form (Σ_j r_j σ_j, Σ_j r_j τ_j), for finitely many scalars r_j ∈ K and pairs (σ_j, τ_j) ∈ R. If R̄ is a bisimulation, R is called a bisimulation up to linearity, and R̄ is included in bisimilarity [26]; since by definition R ⊆ R̄, this implies that R ⊆ ∼, the bisimilarity on streams, which coincides with equality.
An SDE in the unknown σ is a pair of equations of the form σ(0) = r and σ′ = ϕ, for r ∈ K and a stream expression ϕ (which can depend on σ′ or its components, or even on σ itself). Under certain conditions on ϕ [17,26], it can be proven that there is a unique stream σ satisfying the above SDE. In this article, we shall focus on the case where ϕ is represented by a polynomial expression; this will be formalized in the next section. For the time being, we observe that the product operations defined in the preceding subsection enjoy a formulation in terms of SDEs. In particular (see [2,17,26]), for given σ and τ, their convolution and shuffle products are the unique streams satisfying the following SDEs (recall that, as a stream, x denotes (0, 1, 0, 0, ...)):

(σ × τ)(0) = σ(0)τ(0)        (σ × τ)′ = σ′ × τ + σ × τ′ − x × σ′ × τ′        (2)
(σ ⊗ τ)(0) = σ(0)τ(0)        (σ ⊗ τ)′ = σ′ ⊗ τ + σ ⊗ τ′

From the last equation, note the analogy between shuffle and interleaving of languages. Moreover, the derivative of the convolution product is usually defined as (σ × τ)′ = σ′ × τ + σ(0) × τ′. We shall generally prefer formula (2) because it is symmetric. The multiplicative inverse of a stream σ w.r.t. × and ⊗ exists under the condition that σ(0) ≠ 0. In the case of the convolution product, the inverse is denoted by σ^{−1} and satisfies the following SDE and initial condition:

σ^{−1}(0) = σ(0)^{−1}        (σ^{−1})′ = −σ(0)^{−1} × σ′ × σ^{−1}

Two additional examples of stream products are introduced below; see [2] for the underlying motivations. The Hadamard product ⊙ and the infiltration product ↑ can be defined by the following two SDEs:

(σ ⊙ τ)(0) = σ(0)τ(0)        (σ ⊙ τ)′ = σ′ ⊙ τ′
(σ ↑ τ)(0) = σ(0)τ(0)        (σ ↑ τ)′ = σ′ ↑ τ + σ ↑ τ′ + σ′ ↑ τ′

The Hadamard product is reminiscent of synchronization in concurrency theory and has 1̄ := (1, 1, 1, ...) as an identity; it is just the componentwise product of two streams, i.e., (σ ⊙ τ)(i) = σ(i)τ(i) for every i ≥ 0. The infiltration product ↑ is again reminiscent of a notion in concurrency theory, namely the fully synchronized interleaving; it has 1 = (1, 0, 0, ...) as an identity.
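The SDE for the convolution inverse unfolds into an effective recurrence on prefixes: each new element of σ^{−1} is determined by the requirement that the product with σ yields 1 = (1, 0, 0, ...). A sketch (names ours; requires σ(0) ≠ 0):

```python
from fractions import Fraction

def conv_inverse_prefix(sigma, n):
    """First n elements of sigma^{-1} w.r.t. convolution; sigma is a list
    prefix of length >= n with sigma[0] != 0. Derived from requiring
    (sigma x tau)(i) == (1 if i == 0 else 0)."""
    tau = [Fraction(1) / sigma[0]]
    for i in range(1, n):
        acc = sum(sigma[k] * tau[i - k] for k in range(1, i + 1))
        tau.append(-acc / sigma[0])
    return tau

# (1, -1, 0, 0, ...) is the stream 1 - x; its convolution inverse is
# (1, 1, 1, ...), the stream-calculus analogue of 1/(1 - x).
```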
Products, Polynomials and Differential Equations in the Stream Calculus 7:7

(CO)ALGEBRAIC SEMANTICS OF POLYNOMIALS AND DIFFERENTIAL EQUATIONS
The main result of this section is that, once an initial value problem (D, ρ) has been fixed, for every product π (with identity 1_π) defined on streams and satisfying certain syntactic conditions, one can build a coalgebra over polynomials such that the corresponding final morphism into Σ is also a commutative K-algebra homomorphism from (P, +, •, 0, 1) to (Σ, +, π, 0, 1_π). In essence, the polynomial syntax and operational semantics reflect exactly the algebraic and coalgebraic properties of the considered π on streams.
To make polynomials a coalgebra, we need to define the output o : P → K and transition δ : P → P functions. The definition of o(•) is straightforward and only depends on the given initial conditions ρ: we let o := o_ρ be the homomorphic extension of ρ, seen as a function defined over {x} ∪ X, to P. Equivalently, seeing ρ as a point in K^{n+1}, we let o_ρ(p) := p(ρ), that is, the polynomial p evaluated at the point ρ. It can be easily checked that o_ρ(1) = 1.
Example 3.2. For the products introduced in Section 2, the pairs of polynomials (F, G), where y_1, y_2 stand for the two argument streams and y_3, y_4 for their derivatives, are read off directly from the corresponding SDEs:
- convolution: F = y_2 • y_3 + y_1 • y_4 − x • y_3 • y_4;
- shuffle: F = y_2 • y_3 + y_1 • y_4;
- Hadamard: F = y_3 • y_4;
- infiltration: F = y_2 • y_3 + y_1 • y_4 + y_3 • y_4.
The identity stream for convolution, shuffle, and infiltration is defined by 1_π(0) = 1 and 1_π′ = 0, i.e., in these cases the polynomial G is 0. For the Hadamard product, the identity is given by 1_π(0) = 1 and 1_π′ = 1_π, i.e., the polynomial G in this case is y_1.
Given an (F, G)-product π on streams, δ is defined in a straightforward manner on monomials, then extended to polynomials by linearity. Below, we assume a total order on variables x_0 < x_1 < ⋯ < x_n and, for any monomial m ≠ 1, let min(m) denote the smallest variable occurring in m w.r.t. such a total order.

Notation 1. In what follows, the definition of the derivative δ (Definition 3.3) and of the coalgebra morphism μ (Theorem 3.7) depend not only on π, but also on a given initial value problem (D, ρ). To avoid excessive notational burden, we will assume that (D, ρ) is fixed once and for all, and omit the dependence on it from the notation, writing, e.g., δ_π in place of δ_{π,(D,ρ)}, and so on.

Definition 3.3 (Transition Function δ_π).
Let π be an (F, G)-product on streams. Given an initial value problem (D, ρ), we define δ_π : P → P by induction on the size of p ∈ P as follows:

The formal similarity between the definition given above and the construction in [17, Prop. 5.4] is noteworthy. Expressed in the language of monads, an important difference is that the monad V of vector spaces in [17] should be replaced here with the monad P of commutative multivariate polynomials; we will return to this point in Remark 2.
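For the shuffle product, δ_⊗ instantiates to the Lie derivative of a polynomial along the drifts D, as noted in Section 4: differentiate formally w.r.t. each variable and multiply by the corresponding drift. A self-contained sketch (names ours; for brevity, D here ranges only over the dependent variables):

```python
from fractions import Fraction

# Polynomials as {sorted variable tuple: coefficient} dicts, as before.
def p_add(p, q):
    r = dict(p)
    for m, c in q.items():
        s = r.get(m, Fraction(0)) + c
        if s == 0:
            r.pop(m, None)
        else:
            r[m] = s
    return r

def p_mul(p, q):
    r = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(sorted(m1 + m2))
            r[m] = r.get(m, Fraction(0)) + c1 * c2
    return {m: c for m, c in r.items() if c != 0}

def partial(p, x):
    """Formal partial derivative of p w.r.t. the variable x."""
    r = {}
    for m, c in p.items():
        k = m.count(x)
        if k:
            m2 = list(m)
            m2.remove(x)
            r[tuple(m2)] = r.get(tuple(m2), Fraction(0)) + c * k
    return r

def delta_shuffle(p, D):
    """Lie derivative: sum over variables of (dp/dx_i) * D(x_i)."""
    out = {}
    for x, drift in D.items():
        out = p_add(out, p_mul(partial(p, x), drift))
    return out

# Drift x1' = x1; then delta(x1^2) = 2 * x1^2.
D = {"x1": {("x1",): Fraction(1)}}
```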
We must now impose certain additional sanity conditions on F to ensure that the final coalgebra morphism induced by δ_π, as just defined, is also an algebra homomorphism. In the rest of the article, we will make use of the following abbreviation: F_π[p; q] := F(p, q, δ_π(p), δ_π(q)), for p, q ∈ P. The necessity of the following conditions is self-evident, if one thinks of F_π[p; q] as δ_π(p • q) (see Lemma 3.6 below).

Definition 3.5 (Well-behavedness).
Let π be an (F, G)-product on streams. We say that π is well-behaved if, for each initial value problem (D, ρ), the following equalities hold, for every p, q ∈ P:

All products defined in Section 2 are well-behaved: the proof of this fact, which is not entirely trivial, is reported in Appendix A (see Proposition A.2).
We are now ready to prove the main result of this section, i.e., that μ_π is a commutative K-algebra homomorphism. Intuitively, its proof consists in showing that μ_π preserves all the operations in P, by exhibiting in each case an appropriate bisimulation relation in Σ × Σ and then applying coinduction. As expected, the most crucial case is product, where one shows that the relation consisting of all pairs (μ_π(p • q), μ_π(p) π μ_π(q)) is a bisimulation up to linearity. In this case, Lemma 3.6 below is used to prove that μ_π preserves transitions: indeed, it connects morphism to homomorphism properties induced by π. The proof is in Appendix A.

Lemma 3.6. Let π be a well-behaved (F, G)-product. Then, for every p, q ∈ P, it holds that δ_π(p • q) = F_π[p; q].

In the next proof and in the rest of this section, we will use the following notation. Given a polynomial substitution (i.e., a map from variables to polynomials) ζ, and a monomial m = x_{i_1} ⋯ x_{i_k}, we let mζ denote the polynomial ζ(x_{i_1}) • ⋯ • ζ(x_{i_k}). Similarly, given a stream substitution (i.e., a map from variables to streams) ξ, we let mξ denote the stream ξ(x_{i_1}) π ⋯ π ξ(x_{i_k}).

Theorem 3.7. Let π be a well-behaved (F, G)-product. Then the (unique) coalgebra morphism μ_π from (P, δ_π, o_ρ) to (Σ, (•)′, o) is a commutative K-algebra homomorphism from (P, +, •, 0, 1) to (Σ, +, π, 0, 1_π).
Proof. We prove that μ = μ_π preserves the ring operations and their identities, as well as multiplication by a scalar.
(a) μ(1)(0) = 1_π(0). Since μ is a coalgebra morphism, by definition of 1, and by Definition 3.1(2), we have that μ(1)′ = … Since G is a polynomial in the variable y_1 (i.e., G = Σ_{i∈I} r_i m_i, where the m_i's are monomials in y_1), we have that … by (10) and (8), where ζ is the substitution that maps y_1 to 1; hence all monomials m_i evaluated under ζ yield 1, which justifies the last step above. By Definition 3.1(2) and the definition of G, we have that …, where ξ is the substitution that maps y_1 to 1_π; hence all monomials m_i evaluated under ξ yield 1_π. This suffices to conclude up to linearity.

(4) μ(p • q) = μ(p) π μ(q). To prove this fact, let us consider the relation consisting of the pairs (μ(p • q), μ(p) π μ(q)), for p, q ∈ P, and prove that it is a bisimulation up to linearity. Let us consider any such pair: … (since μ is a coalgebra morphism) … = (μ(p_1) π μ(q))(0), by Definition 3.1(1).
Remark 2 (Relations with Abstract GSOS and Bialgebras). Polynomial syntax and operational semantics might also be described in terms of distributive laws and bialgebras, thus allowing one to leverage known results in this field [6,17]. Below, we outline this possibility; in doing so, we shall assume a basic knowledge of the language of category theory.
With bialgebras, the algebraic and coalgebraic structures are combined together, and their interaction is modeled via a distributive law. Specifically, a monad is used to describe the (possibly non-free) syntax. A distributive law is a natural transformation that is compatible with the monadic and coalgebraic structure. In the case that interests us, the set of commutative multivariate polynomials can be modelled as a monad on the category Set (of sets and functions), defined by letting P be the set of polynomial terms over {x} ∪ X quotiented by the congruence generated by the axioms of commutative K-algebras. The Eilenberg-Moore algebras for P are then the commutative K-algebras. As already noted, the conditions in the definition of (F, G)-product ensure that the product operation and its identity are defined and fall in the abstract GSOS format. The results in [6] then imply the existence of a distributive law λ of the polynomial terms monad over the (copointed) stream functor defining the coalgebraic structure.
We conjecture that, in this framework, our Theorem 3.7 (that the coalgebra structure on P is such that the final coalgebra morphism is a K-algebra homomorphism for every well-behaved (F, G)-product) should follow from Proposition 3 and Theorem 1 in [6], provided one can show that, for a generic (F, G)-product, the corresponding distributive law λ preserves the K-algebra axioms. In fact, we have checked some of the K-algebra axioms for the distributive laws induced by the SDEs of specific products, viz. convolution and shuffle. Extending this to the general case of an (F, G)-product would presumably involve proving that well-behavedness (our Definition 3.5) implies preservation of the K-algebra axioms. This extension appears nontrivial; we leave a thorough exploration of this connection for future work.

DECIDING STREAM EQUALITY
One benefit of a polynomial syntax is the possibility of applying techniques from algebraic geometry to reason about stream equality. We will devise an algorithm for checking whether two given polynomials are semantically equivalent, that is, are mapped to the same stream under μ_π.

[Algorithm 1, listing fragment: ... if p^(k) ∈ ⟨{p^(0), ..., p^(k−1)}⟩ then return YES; end for]

The Algorithm
First of all, by linearity of μ_π(•), we have that μ_π(p) = μ_π(q) if and only if μ_π(p) − μ_π(q) = μ_π(p − q) = 0. Therefore, checking semantic equivalence of two polynomials reduces to the problem of checking whether a polynomial is equivalent (bisimilar) to 0. Before introducing the actual algorithm for checking this, we quickly recall a few notions from algebraic geometry; see [13, Ch. 1-4] for a comprehensive treatment.

Definition 4.1 (Ideal).
A set of polynomials I ⊆ P is an ideal if 0 ∈ I and, for all p_1, p_2 ∈ I and q ∈ P, it holds that p_1 + p_2 ∈ I and q • p_1 ∈ I. Given a set of polynomials S, the ideal generated by S is ⟨S⟩ := {Σ_{i=1}^{k} q_i • p_i : k ≥ 0, q_i ∈ P, p_i ∈ S}. By the previous definition, we have that ⟨∅⟩ = {0}. Trivially, I = ⟨S⟩ is the smallest ideal containing S, and S is called a set of generators for I. It is well known that every ideal I admits a finite set S of generators (Hilbert's basis theorem). By virtue of this result, any infinite ascending chain of ideals, I_0 ⊆ I_1 ⊆ I_2 ⊆ ⋯ ⊆ P, stabilizes in a finite number of steps: that is, there is k ≥ 0 s.t. I_{k+j} = I_k for each j ≥ 0 (Ascending Chain Condition, ACC). A key result due to Buchberger [13] is that, given a finite S ⊆ P, it is possible to decide whether p ∈ I = ⟨S⟩, for any polynomial p. As a consequence, ideal inclusion I_1 ⊆ I_2 is also decidable, given finite sets of generators for I_1, I_2.
Remark 3. These facts are consequences of the existence of a set of generators B for I, called a Gröbner basis, with a special property (see [13, Ch. 2, Sec. 6, Cor. 2]): p ∈ I if and only if p mod B = 0, where 'p mod B' denotes the remainder of the multivariate polynomial division of p by B. Indeed, by [13, Ch. 2, Sec. 3, Thm. 3], we can define the notion of multivariate polynomial division by a set of polynomials and, when such a set is a Gröbner basis [13, Ch. 2, Sec. 5, Def. 5], we know by [13, Ch. 2, Sec. 6, Prop. 1] that the remainder of the division, denoted by p mod B, is unique (though the quotient is not). There exist algorithms to build Gröbner bases which, despite their exponential worst-case complexity, turn out to be effective in many practical cases [13, Ch. 4].
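As a toy illustration of ideal membership, consider the univariate case K[x], where every ideal is principal: a Gröbner basis collapses to a single generator (the gcd of the generators), and p ∈ ⟨S⟩ reduces to divisibility. A sketch with dense coefficient lists (constant term first) over ℚ; the genuinely multivariate case needs Buchberger's algorithm:

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients in place; returns p."""
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_mod(p, g):
    """Remainder of the division of p by g (g nonzero), coefficients in Q."""
    p = trim([Fraction(c) for c in p])
    g = trim([Fraction(c) for c in g])
    while p and len(p) >= len(g):
        c = p[-1] / g[-1]
        shift = len(p) - len(g)
        for i, gc in enumerate(g):
            p[i + shift] -= c * gc
        trim(p)
    return p

def poly_gcd(a, b):
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    while b:
        a, b = b, poly_mod(a, b)
    return a

def in_ideal(p, gens):
    """p in <gens> in K[x] iff gcd(gens) divides p."""
    g = gens[0]
    for h in gens[1:]:
        g = poly_gcd(g, h)
    return poly_mod(p, g) == []
```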
We first illustrate the algorithm with a simple, linear example.
The algorithm in [8] for polynomial ODEs can be seen as a special case of the algorithm presented here, obtained by letting π = ⊗, the shuffle product. Indeed, it is not difficult to see that δ_⊗ coincides with the Lie derivative, on which the algorithm in [8] is based. Technically, the key step to obtain the present generalization is enucleating a sufficient condition under which, as soon as I_i = I_{i+1}, the ideal chain gets stable: this is the syntactic requirement that the derivative of the product lie in the ideal generated by the derivative arguments, F ∈ ⟨{y_3, y_4}⟩. Let us now discuss a nonlinear example based on the shuffle product, involving the stream (1, 1, 3, 15, 105, 945, 10395, 135135, ...), the sequence of double factorials of odd numbers (sequence A001147 in [24]). We want to check the following equation: … An execution of Algorithm 1 consists of the following steps.
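For purely linear drifts, the ideal chain of Algorithm 1 collapses to a chain of vector spaces, and ideal membership becomes span membership, decided by Gaussian elimination; this is exactly the weighted-automata special case mentioned earlier. A hypothetical sketch (names ours), with a linear polynomial given by its coefficient vector over the variables, the drifts by a matrix A, and outputs by evaluation at ρ:

```python
from fractions import Fraction

def deriv(c, A):
    """Coefficient vector of the derivative of sum_i c[i]*x_i, where row i
    of A holds the coefficients of the drift D(x_i)."""
    n = len(c)
    return [sum(c[i] * A[i][j] for i in range(n)) for j in range(n)]

def reduce_against(v, basis):
    """basis maps pivot index -> row whose first nonzero entry sits there;
    returns v fully reduced against basis (standard Gaussian elimination)."""
    v = list(v)
    for j in range(len(v)):
        if v[j] != 0 and j in basis:
            f = v[j] / basis[j][j]
            v = [vi - f * bi for vi, bi in zip(v, basis[j])]
    return v

def denotes_zero(c0, A, rho, bound=100):
    """Linear instance of Algorithm 1: True iff the polynomial denotes 0."""
    basis = {}
    c = [Fraction(x) for x in c0]
    A = [[Fraction(x) for x in row] for row in A]
    for _ in range(bound):
        if sum(ci * ri for ci, ri in zip(c, rho)) != 0:
            return False          # some derivative has nonzero output: NO
        r = reduce_against(c, basis)
        piv = next((j for j, x in enumerate(r) if x != 0), None)
        if piv is None:
            return True           # derivative in span: chain stabilized, YES
        basis[piv] = r
        c = deriv(c, A)
    raise RuntimeError("no stabilization within bound")

# Two linear SDEs both defining (1,1,1,...): x1' = x1 and x2' = x2, with
# rho = (1, 1); the polynomial x1 - x2 then denotes the zero stream.
A = [[1, 0], [0, 1]]
rho = [1, 1]
```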
Remark 4. We can define the generating function associated with the Fibonacci numbers, that is, the function g(z) whose Taylor series expansion at z = 0 is Σ_{j≥0} f_j z^j, where the f_j are the Fibonacci numbers. For z in the radius of convergence of this series, we have … From [2] it is known that the convolution product inverse of a given stream σ exists whenever σ(0) ≠ 0. From (16) we obtain …, where we use the usual notation σ/τ to denote σ × τ^{−1}. This equation for μ_×(x_1) is structurally identical to (19): this is of course no coincidence, as algebraic identities on streams correspond exactly to algebraic identities on generating functions. Similarly, the equivalence μ_⊗(p) = 0 obtained for the double factorial equations yields the exponential generating function for A001147 when solved algebraically for x_1, which, as seen in Example 4.4 above, is g(z) = 1/√(1 − 2z). For additional details on generating functions and algebraic series, see [12].
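The correspondence between stream identities and generating-function identities can be checked numerically on prefixes. With the convention f_0 = 0, f_1 = 1 (ours; the article's indexing may differ), the Fibonacci stream satisfies (1 − x − x²) × f = x under convolution, mirroring the classical identity g(z) = z/(1 − z − z²):

```python
def conv_prefix(a, b, n):
    """Prefix of the convolution product of two streams given as lists."""
    return [sum(a[k] * b[i - k] for k in range(i + 1)) for i in range(n)]

n = 12
fib = [0, 1]
while len(fib) < n:
    fib.append(fib[-1] + fib[-2])

one_minus_x_minus_x2 = [1, -1, -1] + [0] * (n - 3)   # the stream 1 - x - x^2
lhs = conv_prefix(one_minus_x_minus_x2, fib, n)
# lhs agrees with the stream x = (0, 1, 0, 0, ...) on the first n elements
```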

Remark 5 (Complexity).
The exact theoretical complexity of Algorithm 1 is difficult to characterize, as it also depends on the specific (F, G)-product π that is considered. One can at least work out some very conservative bounds. Let us denote by d the sum of the degree of the input polynomial p and of the maximal degree of polynomials in D, and by n the number of variables. Assume that the algorithm stops at iteration i + 1, where i is the least integer such that p^{(i+1)} ∈ ⟨p^{(0)}, ..., p^{(i)}⟩. We note that: (a) each iteration of the main loop involves the computation of a Gröbner basis, for which known algorithms have a worst-case time complexity upper bounded approximately by O(D^{2^n}), where D is the maximum degree in the input polynomial set (see [13]); (b) the maximum degree D of the derivatives p^{(k)}, for 0 ≤ k ≤ i, depends on the actual (F, G)-product that is considered. For instance, in the case π = ⊗, it is not difficult to see that D ≤ i·d, which gives a worst-case time complexity of approximately O(i^{2^n + 1} d^{2^n}). The number of steps i before stabilization also depends on the actual (F, G)-product. As an example, in the case of the shuffle product (Lie derivative), according to a result in [23], the number of steps i before stabilization of an ascending chain of ideals generated by successive Lie derivatives is upper bounded by d^{n^{O(n²)}}. One should stress that these are very conservative bounds, and that the algorithm works reasonably well in many practical cases.

A Fixed-Point Theoretic Perspective
To set our algorithm in a more general coalgebraic perspective, it is useful to relate it to a characterization of the morphism's kernel in terms of fixed points. Given any coalgebra with outputs in K, say C = (S, δ, o), consider the function Φ : 2^S → 2^S defined, for any I ⊆ S, by Φ(I) := {s ∈ S : o(s) = 0 and δ(s) ∈ I}. We say that I is a post-fixed point of Φ if I ⊆ Φ(I), and a fixed point if this inclusion holds with equality. Let us denote by gfp(Φ) the greatest fixed point of Φ. An easy application of the Knaster-Tarski fixed-point theorem and of the monotonicity of Φ shows that the unique coalgebra morphism μ from C to the final coalgebra of streams can be characterized as ker(μ) = gfp(Φ) = ⋃{I : I is a post-fixed point of Φ}. In terms of the function Φ induced by the polynomial coalgebra C = (P, δ_π, o_ρ), Algorithm 1, in case of a YES answer, builds precisely a post-fixed point I = ⟨p^{(0)}, ..., p^{(k−1)}⟩: the minimal post-fixed point that is an ideal and contains the input polynomial p. In general, however, I ≠ gfp(Φ). This leaves open the problem of actually computing gfp(Φ), that is ker(μ_π), which can be regarded as the main object of interest here. The theory of fixed points ensures that gfp(Φ) can be iteratively obtained as gfp(Φ) = ⋂_{j≥0} Φ^{(j)}(S) where, inductively, Φ^{(0)}(S) := S and Φ^{(j+1)}(S) := Φ(Φ^{(j)}(S)). While the resulting procedure is formally correct, it is far from clear how to make it effective.
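On a finite coalgebra (Moore automaton) the iteration of Φ does become effective, since the descending sequence of subsets must stabilize after finitely many steps. A minimal sketch, on a hypothetical three-state automaton:

```python
def zero_kernel(states, delta, out):
    """Greatest fixed point of Phi(I) = {s : out(s) == 0 and delta(s) in I},
    i.e. the set of states denoting the zero stream."""
    I = set(states)
    while True:
        J = {s for s in I if out[s] == 0 and delta[s] in I}
        if J == I:
            return I
        I = J

# Hypothetical automaton: a -> b -> a (outputs 0), c -> a (output 1)
delta = {'a': 'b', 'b': 'a', 'c': 'a'}
out = {'a': 0, 'b': 0, 'c': 1}
assert zero_kernel(delta.keys(), delta, out) == {'a', 'b'}
```

States a and b carry the zero stream, while c outputs 1 at time 0 and is excluded at the first iteration.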
In the next section, we will introduce a generalization of Algorithm 1 that actually allows one to find all polynomials in ker(μ π ) up to a prescribed degree.We shall adopt a hybrid approach: we will start from a set of polynomials of bounded degree, that can be effectively described via a template; then refine this set, similarly to the construction of gfp(Φ) described above.However, since applying δ π can actually increase the degree of a polynomial, hence leading outside the initial set, at each iteration we shall also need to add polynomials to the current set.Establishing termination of the resulting procedure, as we shall see, is nontrivial.

FINDING POLYNOMIAL IDENTITIES
In the previous section, we have described an algorithm to check whether two given polynomials have the same denotation, hence are semantically equivalent. We now generalize this algorithm and give a method to find all valid polynomial equations of a given form. This can be used, for instance, to find all polynomial equalities up to a given degree. To this aim, we use polynomial templates [29], a way to compactly specify sets of polynomials. We note that the template-based algorithm presented in [8] is a special instance of the present one, obtained by letting π = ⊗.
Fix a tuple of h ≥ 1 distinct parameters, say a = (a₁, ..., a_h), disjoint from x. Let Lin(a), ranged over by ℓ, be the set of linear expressions with coefficients in K and variables in a; e.g., by taking K = R, we have that ℓ = 5a₁ + ½a₂ − 3a₃ is one such expression.⁶ A template is an element of the set Lin(a)[x], that is, a polynomial with linear expressions as coefficients; we let T range over templates; (21) displays an example of a template. Given r = (r₁, ..., r_h) ∈ K^h, we let ℓ[r] denote the element of K resulting from the evaluation of the expression obtained by replacing each a_i with r_i in ℓ; we let T[r] ∈ K[x] denote the polynomial obtained by replacing each ℓ with ℓ[r] in T. The (formal) derivative of a template is defined as expected, once linear expressions are treated as constants; note that δ_π(T) is still a template. Because of (10), for each T and r, one has δ_π(T[r]) = δ_π(T)[r]; this holds in general for the jth derivative (for every j ≥ 0): δ_π^{(j)}(T[r]) = δ_π^{(j)}(T)[r] (22). To make notation lighter, when π is clear from the context, we shall write δ_π^{(j)}(T) as T^{(j)}. We now present an algorithm that, given a template T with h parameters, finds all instances p of T such that μ_π(p) = 0. More precisely, given a template T, the algorithm computes the intersection of T[K^h] with the kernel of μ_π, that is, the set Z_T := T[K^h] ∩ ker(μ_π). Equivalently, Z_T = {p ∈ T[K^h] : ∀j ≥ 0, o_ρ(p^{(j)}) = 0}, i.e., Z_T is the subset of the instances of T whose derivatives of any order vanish when evaluated at the initial conditions ρ. In order to compute Z_T, we shall rely on a special kind of ideals, namely invariants, that are defined by relying on the following notation: given P′ ⊆ P, we denote by δ_π(P′) the set {δ_π(p) : p ∈ P′}. The algorithm we are going to present returns a pair (R, I), where R ⊆ K^h is such that T[R] = Z_T and I is the smallest invariant that includes T[R]. We calculate these two sets by building two chains: a descending
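For the shuffle product, where δ_⊗ is the Lie derivative along the SDEs, the commutation δ_⊗(T[r]) = δ_⊗(T)[r] can be checked symbolically. A sketch with SymPy, under hypothetical SDEs ẋ = 1, ẋ₁ = x₁ and an invented template:

```python
from sympy import symbols, diff, expand

x, x1, a1, a2, a3 = symbols('x x1 a1 a2 a3')
D = {x: 1, x1: x1}   # hypothetical linear SDEs: x' = 1, x1' = x1

def lie(p):
    # shuffle-product transition delta = Lie derivative along D
    return expand(sum(rhs * diff(p, v) for v, rhs in D.items()))

T = a1 * x**2 + a2 * x * x1 + a3   # a template: linear expressions as coefficients
r = {a1: 2, a2: -1, a3: 5}         # one instantiation of the parameters

# derivative commutes with parameter instantiation
assert expand(lie(T.subs(r))) == expand(lie(T).subs(r))
```

Since the parameters a_i occur only linearly as coefficients, instantiating them and differentiating can be done in either order, which is what the assertion verifies on this instance.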
chain of vector spaces and an (eventually) ascending chain of ideals. A pseudo-code description⁷ of this procedure is presented as Algorithm 2. Note that each set R_i is actually a vector subspace of K^h: this is a consequence of the fact that, for each linear expression ℓ, the set V_ℓ := {r ∈ K^h : ℓ[r] = 0} is, in turn, a vector space, and that R_i can equivalently be described as follows, where T(ρ) denotes the polynomial obtained by replacing each x_i ∈ x with ρ(x_i): R_i = ⋂ {V_ℓ : ℓ occurs as a coefficient in one of T(ρ), ..., T^{(i)}(ρ)}.
The ideal chain is used to detect the stabilization of the sequence. In fact, in the sequence of vector spaces, R_{i+1} = R_i does not imply that R_{i+k} = R_i for each k ≥ 1; see Example 5.4 for an illustration of this phenomenon. For this reason, the algorithm returns the pair (R_m, I_m), where m is the least integer such that R_{m+1} = R_m and I_{m+1} = I_m.
The correctness of Algorithm 2 is proven under the same mild condition on F as we assumed for proving the correctness of Algorithm 1. We begin with a preliminary lemma stating that the algorithm terminates, and that this happens exactly when the two chains stabilize. Recall from Algorithm 2 that R_i := {r ∈ K^h : ∀j ≤ i, o_ρ(T^{(j)}[r]) = 0}.

Lemma 5.2. Consider a template T and a well-behaved (F, G)-product π such that F ∈ ⟨y₃, y₄⟩. Then Algorithm 2 terminates. Furthermore, let (R_i, I_i) be the pair of sets returned by the algorithm; then, R_i = R_{i+k} and I_i = I_{i+k}, for every k ≥ 1.

Proof. We first prove that there exists an i such that R_{i+1} = R_i and I_{i+1} = I_i. Indeed, R_0 ⊇ R_1 ⊇ ⋯ forms an infinite descending chain of finite-dimensional vector spaces, which must stabilize in finitely many steps; hence, we can consider the least i′ such that R_{i′} = R_{i′+k} for each k ≥ 1. Then, I_{i′} ⊆ I_{i′+1} ⊆ ⋯ forms an infinite ascending chain of ideals, which must stabilize at some i ≥ i′.
Then, let (R_i, I_i) be the sets returned by the algorithm; the proof is by induction on k. The base case holds by line 6 of the algorithm. For the inductive case, we assume that R_i = R_{i+k} and I_i = I_{i+k}, and we prove that R_i = R_{i+k+1} and I_i = I_{i+k+1}. To this aim, we first show that

T^{(i+k+1)}[r] ∈ I_i, for every r ∈ R_i. (24)

Indeed, by the induction hypothesis and the definition of I_{i+k}, we have that T^{(i+k)}[r] = Σ_t q_t · T^{(j_t)}[r_t], with 0 ≤ j_t ≤ i and r_t ∈ R_i. By (22), (10) and Lemma 3.6, we have that T^{(i+k+1)}[r] = δ_π(T^{(i+k)}[r]) = Σ_t F_π[q_t; T^{(j_t)}[r_t]]. Since F ∈ ⟨y₃, y₄⟩, we have that F_π[q_t; T^{(j_t)}[r_t]] ∈ ⟨T^{(j_t)}[r_t], T^{(j_t+1)}[r_t]⟩, for each t; since ideals are closed under sum, T^{(i+k+1)}[r] ∈ Σ_t ⟨T^{(j_t)}[r_t], T^{(j_t+1)}[r_t]⟩. This proves (24) since, for each t, we have that T^{(j_t)}[r_t], T^{(j_t+1)}[r_t] ∈ I_{i+1} = I_i (by definition of I_{i+1} and induction).
Let us now come to the proof of the inductive step.

R_i = R_{i+k+1}: For each r ∈ R_{i+k} (= R_i), it follows from (24) that T^{(i+k+1)}[r] = Σ_t q_t · T^{(j_t)}[r_t], with 0 ≤ j_t ≤ i and r_t ∈ R_i. By construction of R_i, we have that o_ρ(T^{(i+k+1)}[r]) = 0, which shows that r ∈ R_{i+k+1}. This proves that R_{i+k} ⊆ R_{i+k+1}; the reverse inclusion is by construction and, together with the inductive hypothesis, allows us to conclude.

I_i = I_{i+k+1}: By the previous point, the definition of I_{i+k+1}, the inductive hypothesis, and (24), we have that I_{i+k+1} ⊆ I_i; the reverse inclusion holds by construction.

Theorem 5.3 (Correctness and Relative Completeness). Consider a template T and a well-behaved (F, G)-product π such that F ∈ ⟨y₃, y₄⟩. Let (R_i, I_i) be the pair of sets returned by Algorithm 2. Then: (a) T[R_i] = Z_T; (b) I_i is the smallest invariant that includes Z_T.

Proof. Concerning part (a), we first note that each T[r] ∈ Z_T is such that o_ρ(T[r]^{(j)}) = o_ρ(T^{(j)}[r]) = 0, for each j ≥ 0; this, by definition, implies r ∈ R_j, for each j ≥ 0, and so r ∈ R_i. Conversely, if r ∈ R_i = R_{i+1} = R_{i+2} = ⋯ (here we are using Lemma 5.2), then, by definition, o_ρ(T[r]^{(j)}) = o_ρ(T^{(j)}[r]) = 0, for each j ≥ 0, which implies that T[r] ∈ Z_T. Note that, in proving both inclusions, we used (22).
Concerning part (b), we prove that: (1) I_i is an invariant, (2) I_i ⊇ Z_T, and (3) every invariant I that contains Z_T also contains I_i.
(1) For each r ∈ R_i and j ∈ {0, ..., i − 1}, we have that δ_π(T^{(j)}[r]) = T^{(j+1)}[r] ∈ I_i; hence I_i is an invariant. (2) By part (a), Z_T = T[R_i] ⊆ I_i. (3) Let I ⊇ Z_T be an invariant. We show by induction on j that T^{(j)}[r] ∈ I, for each r ∈ R_i; this implies the claim. By part (a), T[r] ∈ Z_T ⊆ I; for the inductive step, T^{(j+1)}[r] = δ_π(T^{(j)}[r]) ∈ I (again, here we used (22)).
Concerning complexity, considerations similar to those for the base algorithm apply; see Remark 5. In particular, step 4 of Algorithm 2 can be carried out via simple linear-algebraic manipulations, leaving the cost of the Gröbner basis computation at step 5 as the asymptotically dominant one. A slight optimization is to perform step 5 only when the condition R_{i+1} = R_i holds.
Example 5.4. Let π = × and let T be the complete template of degree 3 involving x and x₁. We run Algorithm 2 with T.
- At iteration i = 0, T^{(0)} = T; so, for every r, we have that o_ρ(T^{(0)}[r]) = 0 if and only if r ∈ R_0 := {r : r₉ = −r₇ − r₈ − r₆}; I_0 is then built as the ideal generated by all the instances T[r] with r ∈ R_0. In practice, instead of considering the entire R_0, it is sufficient to consider a basis of it.
- For i = 1, we have that o_ρ(T^{(1)}[r]) = 0 for all r ∈ R_0; thus, R_1 := R_0. So, the vector space equality has been detected, but the ideal equality does not hold: in fact, T^{(1)}[R_1] ⊄ I_0 and, hence, I_1 ⊋ I_0. The ideal chain has not yet stabilized and the algorithm goes on with a new iteration of the for loop.
Again, we refer the reader to [12] for a detailed discussion on the relations with generating functions.
In the final part of this section, we discuss the important special case of linear SDEs. Let A ⊆ P be the subset of polynomials of degree ≤ 1, in other words, affine expressions of the form p = Σ_{i=0}^{n} r_i x_i + r_{n+1}, with r_i ∈ K. Assume that in the given initial value problem (D, ρ) all the expressions in D are linear,⁸ that is, of the form Σ_{i=0}^{n} r_i x_i with r_i ∈ K: we call this a linear initial value problem. The restriction of the function δ_π to A only depends on the polynomial G, not on F; cf. equalities (7)-(10). Moreover, since G(1) ∈ K and D(x_i) ∈ A for each variable x_i, clearly for each p ∈ A we have δ_π(p) ∈ A. In other words, denoting by |_A function restriction to A, we see that C_{π,A} := (A, δ_π|_A, o_ρ|_A) forms a sub-coalgebra (Moore sub-automaton) of (P, δ_π, o_ρ). Let μ_π|_A be the final coalgebra morphism from C_{π,A} to Σ. The kernel ker(μ_π|_A) = ker(μ_π) ∩ A yields, informally speaking, all valid identifications among the variables x_i and their affine combinations, under the given linear initial value problem. It is easy to compute ker(μ_π|_A) by resorting to Algorithm 2. In fact, we can specialize the algorithm so as to avoid the (costly) computation of the ideals I_i, as described in the next result. In what follows, we let

T := a₀x₀ + ⋯ + a_n x_n + a_{n+1}, (26)

with a₀, ..., a_{n+1} distinct parameters, be the complete template of degree 1.
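For a linear initial value problem, the stream of an affine expression can be unfolded by repeatedly applying the transposed transition matrix to its coefficient vector. A sketch on an invented two-variable system ẋ₀ = x₁, ẋ₁ = x₀ + x₁ with ρ = (0, 1), whose x₀-stream is the Fibonacci sequence:

```python
def linear_stream(L, rho, c, n):
    """First n outputs of the stream of p = sum_i c[i]*x_i under the SDEs
    x_i' = sum_j L[i][j]*x_j and initial conditions rho.
    Uses sigma_p(k) = o_rho(delta^k p), where delta acts on coefficients as c <- L^T c."""
    out = []
    c = list(c)
    for _ in range(n):
        out.append(sum(ci * ri for ci, ri in zip(c, rho)))
        c = [sum(c[i] * L[i][j] for i in range(len(c))) for j in range(len(c))]
    return out

L = [[0, 1],
     [1, 1]]     # hypothetical linear SDEs: x0' = x1, x1' = x0 + x1
rho = (0, 1)     # initial conditions
assert linear_stream(L, rho, (1, 0), 7) == [0, 1, 1, 2, 3, 5, 8]
```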
Theorem 5.5 (Linear Version of Algorithm 2). Let (D, ρ) be a linear initial value problem and π be a well-behaved product. Let T be the template defined in (26), and R_0, R_1, ... be the sequence of vector spaces defined in steps 1 and 4 of Algorithm 2. Let i be the least index such that R_{i+1} = R_i. Then R_{i+k} = R_i for every k ≥ 1, and ker(μ_π|_A) = T[R_i].

Proof. Let L ∈ K^{(n+1)×(n+1)} be the matrix defining the given linear SDEs: for i = 0, ..., n, the row i of L is (l_{i0}, ..., l_{in}), where ẋ_i = l_{i0}x_0 + ⋯ + l_{in}x_n is the SDE for x_i (recall that x_0 = x). Now let v_0 := (ρ(x), 1)^T ∈ K^{n+2}, seen as a column vector, and let L′ ∈ K^{(n+2)×(n+2)} be the matrix obtained from L by first adding an extra zero row, then an extra zero column, then setting L′[n+1, n+1] := G(1) (here indices run from 0 to n + 1). Also consider the vectors: (x, 1), obtained from the concatenation of x and 1, and δ_π(x) := (δ_π(x_0), ..., δ_π(x_n)). With this notation in place, for every column vector of parameters r ∈ K^{n+2}, we can write: T[r] = r^T · (x, 1)^T and δ_π(T[r]) = r^T · (δ_π(x), δ_π(1))^T = r^T · L′ · (x, 1)^T; more generally, one can easily check that δ_π^{(j)}(T[r]) = r^T · L′^j · (x, 1)^T; here we also exploit the fact that δ_π^{(j)}(1) = G(1)^j for all j ≥ 1, a consequence of (11)-(13). Therefore o_ρ(T[r]) = r^T · v_0 and, more generally, o_ρ(δ_π^{(j)}(T[r])) = r^T · L′^j · v_0 for each j ≥ 0. Now define the vectors v_j := L′^j v_0, for each j ≥ 1. Then, as o_ρ(δ_π^{(j)}(T[r])) = r^T · v_j, by definition of R_j one has R_j = {v_0, ..., v_j}^⊥, the orthogonal complement of the subspace spanned by {v_0, ..., v_j} in K^{n+2}. Consider the least i such that R_{i+1} = R_i, which must exist as the R_i's form a descending chain of finite-dimensional vector spaces. Then v_{i+1} ∈ span{v_0, ..., v_i}, that is v_{i+1} = Σ_{j=0}^{i} λ_j v_j, for some λ_j ∈ K. We now show that, for each k ≥ 1, v_{i+k} ∈ span{v_0, ..., v_i}: this will imply that R_i = R_{i+1} = R_{i+2} = ⋯, that is, the chain of vector spaces has stabilized. For k = 1, this holds by definition of the index i. Assume k > 1. Then, for some λ′_j ∈ K, we have: v_{i+k} = L′ v_{i+k−1} = L′ Σ_{j=0}^{i} λ′_j v_j = Σ_{j=0}^{i} λ′_j L′ v_j = Σ_{j=0}^{i} λ′_j v_{j+1}, where in the second equality we have exploited the induction hypothesis. Now the last expression is a linear combination of elements of span{v_0, ..., v_{i+1}}; in particular, v_{i+1} is in span{v_0, ..., v_i} by assumption. This shows that v_{i+k} ∈ span{v_0, ..., v_i}.

It is worthwhile to note that the transition functions on A for π ∈ {×, ⊗, ↑} coincide: for these products, C_A := C_{π,A} can be identified with the coalgebra of (expressions for) stream weighted automata [26, 27]. In particular, the final morphism μ_A := μ_π|_A represents the standard semantics of weighted automata in terms of streams of [26, 27]; see also Example 5.6 below. On the other hand, for these products the algorithm outlined in Theorem 5.5 basically corresponds to the partition refinement algorithm for linear weighted automata described in [4, 7]. This algorithm has subsequently been generalized to the case where weights are drawn from a (semi)ring, under certain conditions: see, e.g., [3, 20] and the references therein.
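The procedure of Theorem 5.5 amounts to iterating v_{j+1} = L′v_j until the span of {v_0, ..., v_j} stops growing, and then taking an orthogonal complement. A sketch with SymPy; the matrices and vectors are invented for illustration:

```python
from sympy import Matrix, eye

def stable_complement(L, v0):
    """Iterate v_{j+1} = L*v_j until span{v_0..v_i} stabilizes; return a basis
    of R_i = {r : r^T v_j = 0 for all j}, the orthogonal complement."""
    vs = [v0]
    while True:
        M = Matrix.hstack(*vs)
        v_next = L * vs[-1]
        if Matrix.hstack(M, v_next).rank() == M.rank():
            return M.T.nullspace()   # vectors r with r^T v_j = 0 for all j
        vs.append(v_next)

# With L the identity, the chain stabilizes immediately and R is the
# hyperplane orthogonal to v0 = (1, 1)^T.
basis = stable_complement(eye(2), Matrix([1, 1]))
assert len(basis) == 1 and (basis[0].T * Matrix([1, 1]))[0] == 0

# A chain of full rank leaves only the trivial complement.
assert stable_complement(Matrix([[0, 1], [1, 1]]), Matrix([1, 0])) == []
```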
We conclude the section with an example.
Example 5.6. For our purposes, a finite-state weighted automaton is a finite-state automaton where both states and transitions are labelled with weights drawn from a field K. Weights on states are also called output weights. Figure 1 displays a weighted automaton with 10 states {x₁, ..., x₁₀} with outputs in K = R; outputs are assumed to be 1 for x₁₀ and 0 for any other state. The semantics of weighted automata can be given in terms of streams as described in, e.g., [26]. Equivalently, one can associate to each state x_i a linear SDE and an initial condition, as dictated by, respectively, its outgoing weighted transitions and its output weight.

- Property (11) is satisfied, since F_⊗[1; q] = 0 · q + 1 · δ_⊗(q) = δ_⊗(q);
- property (12) follows from Lemma A.1;
- property (13) holds: indeed, F_⊗[Σ_{i∈I} r_i m_i; q] = (Σ_{i∈I} r_i δ_⊗(m_i)) · q + (Σ_{i∈I} r_i m_i) · δ_⊗(q) = Σ_{i∈I} r_i F_⊗[m_i; q], by def. of F_⊗ and (10);
- property (14) trivially holds.
(2) p = m ∈ M: If m = 1, we trivially conclude by (11), since p · q = q. Otherwise, we consider a second structural induction on q. The non-trivial base case of this second induction is for q = m′ ∈ M. Let x_i be the variable with the smallest index in m · m′ and m′′ be m · m′ with one occurrence of x_i removed. Then δ_π(m · m′) = δ_π(x_i · m′′), by commutativity and associativity in M. Now:
- if m′′ = 1, then m′ = 1 (i.e., q = 1) and m = x_i; then δ_π(m · m′) = F_π[m; m′], by the identity of the product, (11) and (14);
- otherwise, δ_π(x_i · m′′) = F_π[x_i; m′′] = F_π[m; m′], by (9) and (12) (applied |m| − 1 times).
For the second inductive step, let q = Σ_{j∈J} r_j m_j, for |J| > 0. We have δ_π(p · q) = δ_π(Σ_{j∈J} r_j (m · m_j)) by distributivity in P = Σ_{j∈J} r_j δ_π(m · m_j) by (10) = Σ_{j∈J} r_j F_π[m; m_j] by the base case for q (q a monomial) = F_π[q; m] by (13) = F_π[p; q] by (14).

(...) by induction = F_⊗[x_i; m₁m₂] by def. of F_⊗ = δ_⊗(x_i m₁m₂) by (9).

Proposition A.2. ⊗ is well-behaved.

Proof. Recall from Example 3.2 that F_⊗ = y₂y₃ + y₁y₄. Then the required properties can be checked directly.

A coalgebra with outputs in K is a triple C = (S, δ, o), where S is a nonempty set of states, δ : S → S is the transition function, and o : S → K is the output function. A bisimulation on C is a binary relation R ⊆ S × S such that, whenever (s, t) ∈ R, then o(s) = o(t) and (δ(s), δ(t)) ∈ R. There is a unique morphism μ from C to the final coalgebra C₀. In this case, ∼ in C₀ coincides with equality, and the following coinduction principle holds: for every C and s ∼ t in C, it holds that μ(s) = μ(t).