Perfect Zero-Knowledge PCPs for #P

We construct perfect zero-knowledge probabilistically checkable proofs (PZK-PCPs) for every language in #P. This is the first construction of a PZK-PCP for any language outside BPP. Furthermore, unlike previous constructions of (statistical) zero-knowledge PCPs, our construction simultaneously achieves non-adaptivity and zero knowledge against arbitrary (adaptive) polynomial-time malicious verifiers. Our construction consists of a novel masked sumcheck PCP, which uses the combinatorial nullstellensatz to obtain antisymmetric structure within the hypercube and randomness outside of it. To prove zero knowledge, we introduce the notion of locally simulatable encodings: randomised encodings in which every local view of the encoding can be efficiently sampled given a local view of the message. We show that the code arising from the sumcheck protocol (the Reed-Muller code augmented with subcube sums) admits a locally simulatable encoding. This reduces the algebraic problem of simulating our masked sumcheck to a combinatorial property of antisymmetric functions.


Introduction
The notion of a zero-knowledge (ZK) proof, a proof that conveys no information except the truth of a statement, is one of the most influential ideas in cryptography and complexity theory of the past four decades.
Zero knowledge was originally defined by Goldwasser, Micali and Rackoff [GMR89] in the context of interactive proofs (IPs). The deep and beautiful insight in this work is that it is possible to rigorously prove that an interaction does not convey information, by exhibiting an efficient algorithm called the simulator which can generate the distribution of protocol transcripts without interacting with the prover. In that work, the authors identify three different notions of zero knowledge, depending on the quality of the simulation. These are: (1) perfect zero knowledge (PZK), where the simulator's distribution is identical to the real distribution of transcripts; (2) statistical zero knowledge (SZK), where the distributions are inverse-exponentially close; and (3) computational zero knowledge (CZK), where the distributions are computationally indistinguishable.
This hierarchy led naturally to the study of the complexity classes PZK, SZK and CZK of languages admitting ZK interactive proofs. Seminal results show that PZK contains interesting "hard" languages, including quadratic residuosity and nonresiduosity [GMR89], and graph isomorphism and nonisomorphism [GMW91]. Despite this, the structure of the class PZK, and its relation to other complexity classes, remains poorly understood. In light of this difficulty, the study of zero knowledge has followed two main routes: (1) studying the "relaxed" notions of SZK and CZK, and (2) studying zero knowledge in other models.
The former line of work has proved highly fruitful. A seminal result of [GMW91] showed that CZK = IP = PSPACE, which launched the cryptographic study of ZK proofs, yielding a plethora of theoretical and practical results [Tha22]. The study of SZK has revealed that this class has a rich structure and deep connections within complexity theory (cf. [Vad99]).
The present work lies along the second route, also hugely influential. The seminal work of Ben-Or, Goldwasser, Kilian, and Wigderson [BGKW88] introduced the model of multi-prover interactive proofs (MIP) in order to achieve perfect zero knowledge without any computational assumptions [LS95; DFKNS92]. This in turn inspired some of the most important models and results in contemporary theoretical computer science, including entangled-prover interactive proofs (MIP*), interactive oracle proofs (IOPs), and most notably, probabilistically checkable proofs (PCPs). Indeed, the celebrated PCP theorem [ALMSS92; AS03; Din07] is widely recognised as one of the most important achievements of modern complexity theory [AAV13].
Zero-knowledge PCPs. Recall that a PCP is a proof which can be verified, with high probability, by a verifier that only makes a small number of queries (even O(1)) to the proof (cf. [AB09]). A zero-knowledge PCP is a randomised proof which, in addition to being locally checkable, satisfies a zero-knowledge condition: the view of any efficient (malicious) verifier can be efficiently simulated (see [Wei22] for a survey). Similarly to the IP case, we can distinguish PCPs with perfect, statistical and computational zero knowledge. Note that there appears to be no formal relationship between ZK-IPs and ZK-PCPs: standard transformations from IP to PCP spoil zero knowledge.
The first statistical zero-knowledge PCPs appeared in the seminal work of Kilian, Petrank and Tardos [KPT97]. Later works [IMS12; IW14] simplified this construction, and extended it to PCPs of proximity. These constructions operate via a 2-step compilation of a (non-zero-knowledge) PCP: first the PCP is endowed with honest-verifier statistical zero knowledge (HVSZK), and then the latter property is boosted into full statistical zero knowledge by "forcing" the malicious verifier's query pattern to be similarly distributed to that of the honest verifier, using an object called a locking scheme. Kilian, Petrank and Tardos use this transformation to prove that NEXP has SZK-PCPs (with a polynomial-time verifier) that are zero knowledge against all polynomial-time malicious verifiers; i.e., SZK-PCP[poly, poly] = NEXP.
Unfortunately, locking schemes have two inherent drawbacks: they require the honest PCP verifier to be adaptive, and they cannot achieve perfect zero knowledge. Another line of work, motivated by cryptographic applications, focuses on obtaining SZK-PCPs for NP with a non-adaptive honest verifier from leakage resilience. These results come with caveats, achieving either a weaker notion of zero knowledge known as witness indistinguishability [IWY16], or simulation against adversaries making only quadratically many more queries than the honest verifier [HVW22].
The complexity landscape of perfect zero knowledge is subtle. As discussed above, we know that PZK-MIP = MIP = NEXP [BGKW88], where MIP is the class of languages with a multi-prover interactive proof. The quantum analogue of this result, PZK-MIP* = MIP* = RE, is also known to hold [CFGS22; GSY19]. We know a similar result for interactive PCPs (IPCPs) [KR08], an interactive generalisation of PCPs (and special case of IOPs): PZK-IPCP = IPCP = NEXP [CFS17]. For IPs, the picture is very different. We know that PZK ⊆ SZK ⊆ AM ∩ coAM [For87; AH91], and so it is unlikely that PZK = IP (= PSPACE), or even that NP ⊆ PZK. It is unknown whether PZK = SZK; indeed, it was recently shown that there is an oracle relative to which this equality does not hold [BCHTV20].
In contrast, nothing at all is known about the class PZK-PCP except for the trivial inclusion BPP ⊆ PZK-PCP.In particular, the following key question in the study of perfect zero knowledge remains open: Do perfect zero-knowledge PCPs exist for any non-trivial language?

Our results
In this work, we give a strong positive answer to this question, showing that there exist perfect zero-knowledge PCPs (PZK-PCPs) for all of #P.
We prove Theorem 1 by constructing a PZK-PCP for the #P-complete language #SAT. This is the first construction of a PZK-PCP for a language (believed to be) outside BPP. Furthermore, unlike previous constructions of zero-knowledge PCPs, our construction simultaneously achieves non-adaptivity for the honest verifier and zero knowledge against arbitrary (adaptive) polynomial-time malicious verifiers. We stress that Theorem 1 is unconditional and does not rely on any cryptographic assumptions.
Remark 1.1. As with its IP counterpart PZK, PZK-PCP is a class of decision problems, and so the inclusion #P ⊆ PZK-PCP refers to the decision version of #P (for which #SAT is complete). This class contains UP and coNP (in a natural way) but may not contain NP. Toda's theorem, that PH ⊆ P^#P, does not directly imply that PH ⊆ PZK-PCP, as the #P oracle in this inclusion is a function oracle.
On the way to proving Theorem 1, we solve the following general algebraic-algorithmic problem, which we consider to be of independent interest.

Problem 1 (Local simulation of low-degree extensions). Given oracle access to a function f : {0,1}^m → F, where F is a finite field, efficiently simulate oracle access to a uniformly random degree-d extension of f; that is, a function f̃ drawn uniformly at random from the set of m-variate polynomials over F of individual degree d that agree with f on {0,1}^m.
By "efficient" we mean polynomial in m, d and log |F|.Chen et al. [CCGOS23] give a query-efficient simulator for d ≥ 2, based on observations of Aaronson and Wigderson [AW09].(For d = 1, there is a query lower bound of 2 m [JKRS09].)In this work we give a computationally efficient simulator for d ≥ 2.

Techniques
We start by outlining how to construct (non-ZK) PCPs for #SAT via the sumcheck protocol [LFKN92]. Recall that the prover and verifier in the sumcheck protocol have oracle access to an m-variate polynomial P (the arithmetisation of Φ) of individual degree d over a finite field F, and the prover wishes to convince the verifier that ∑_{a∈{0,1}^m} P(a) = γ for some γ ∈ F. This is achieved via a 2m-message protocol in which, for every i ∈ [m]:
• the (2i − 1)-th message, sent by the prover, is the univariate polynomial g_i(X) := ∑_{a∈{0,1}^{m−i}} P(c_1, . . ., c_{i−1}, X, a); and
• the 2i-th message, sent by the verifier, is a uniformly random challenge c_i ∈ F.
The verifier accepts if (a) g_1(0) + g_1(1) = γ; (b) for each 1 ≤ i ≤ m − 1, g_i(c_i) = g_{i+1}(0) + g_{i+1}(1); and (c) g_m(c_m) = P(c_1, . . ., c_m). This protocol is complete and sound for sufficiently large F. Moreover, we can "unroll" the interactive sumcheck protocol into an (exponentially large) PCP by writing down the prover's answers to each possible (sub-)sequence of challenges.
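To make the checks above concrete, the following toy Python sketch runs honest sumcheck over a small prime field; the 3-variate polynomial P, the field size p = 101, and the brute-force evaluation of each g_i are illustrative choices, not part of the construction.

```python
import itertools
import random

# Toy instantiation of the sumcheck protocol over F_p.
p = 101
m = 3

def P(x):
    # A hypothetical low-individual-degree polynomial standing in for
    # the arithmetisation of a formula.
    x1, x2, x3 = x
    return (x1 * x2 + 2 * x2 * x3 + 5 * x1 + 1) % p

# The claimed value gamma: the sum of P over the Boolean hypercube.
gamma = sum(P(a) for a in itertools.product((0, 1), repeat=m)) % p

def g(i, prefix, t):
    """Honest prover's i-th message g_i evaluated at t, given the
    verifier challenges prefix = (c_1, ..., c_{i-1})."""
    return sum(P(tuple(prefix) + (t,) + a)
               for a in itertools.product((0, 1), repeat=m - i)) % p

def verify():
    rng = random.Random(0)
    c = []
    if (g(1, c, 0) + g(1, c, 1)) % p != gamma:            # check (a)
        return False
    for i in range(1, m):
        c.append(rng.randrange(p))
        if g(i, c[:i-1], c[i-1]) != (g(i+1, c[:i], 0) + g(i+1, c[:i], 1)) % p:
            return False                                  # check (b)
    c.append(rng.randrange(p))
    return g(m, c[:m-1], c[m-1]) == P(tuple(c))           # check (c)
```

Here each g_i is represented by an evaluation oracle rather than by coefficients; for individual degree d the verifier could recover the coefficients from d + 1 evaluations.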
Unfortunately, the sumcheck PCP is clearly not zero knowledge: even the prover's first message g_1 reveals information about P which is #P-hard to compute. In this overview, we will explain how to modify the sumcheck PCP in order to achieve perfect zero knowledge.
Zero-knowledge IOPs. Our approach to constructing a PZK-PCP for sumcheck is inspired by the [BCFGRS17] construction of a perfect zero-knowledge sumcheck in the interactive oracle proof (IOP) model. The IOP model is an interactive generalisation of PCPs, in which the prover and verifier interact across multiple rounds, with the prover sending a PCP oracle in each round.
In their IOP, the prover first sends the evaluation table of a uniformly random m-variate polynomial R such that ∑_{a∈{0,1}^m} R(a) = 0, which will be used as a mask. Next, to ensure soundness in case R does not sum to zero, the verifier sends a challenge α. Finally, the prover sends a sumcheck PCP for the statement ∑_{a∈{0,1}^m} (αP(a) + R(a)) = αγ. Intuitively, this protocol does not leak much information about P, because αP + R is a uniformly random polynomial that sums to αγ. [BCFGRS17] shows, using techniques from algebraic complexity theory, that this IOP is indeed perfect zero knowledge; we will make use of these techniques later in our construction.
Unfortunately, the interaction in this protocol is crucial in order to balance soundness and zero knowledge. Indeed, one could imagine "unrolling" this IOP into a PCP in the same way as for sumcheck itself: writing down, for each α ∈ F, a sumcheck PCP Π_α for αP + R. This preserves soundness, but completely breaks zero knowledge: since the sumcheck PCP is a linear function of the underlying polynomial, given any two Π_α, Π_{α′} for distinct α, α′, we can recover the sumcheck PCP for P as (α − α′)^{−1} · (Π_α − Π_{α′}). On the other hand, if we have the prover send only one Π_α, then soundness is lost, as the prover can easily cheat by sending an R that does not sum to zero. We could attempt to fix the soundness issue by having the prover additionally prove, via another sumcheck PCP Π_R, that ∑_{a∈{0,1}^m} R(a) = 0. While this would restore soundness, we once again lose zero knowledge, as the sumcheck PCP for P can be recovered as α^{−1} · (Π_α − Π_R).
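The linearity attack described above is easy to check numerically. In the sketch below, random tables stand in for P and R, and a single subcube-sum entry stands in for a proof-oracle location; all names and parameters are hypothetical.

```python
import itertools
import random

# Why "unrolling" the masked IOP breaks zero knowledge: proof entries
# are linear in the underlying polynomial, so two masked copies cancel
# the mask R.
p = 101
m = 3
rng = random.Random(1)

table_P = {a: rng.randrange(p) for a in itertools.product((0, 1), repeat=m)}
table_R = {a: rng.randrange(p) for a in itertools.product((0, 1), repeat=m)}

def subcube_sum(tbl, prefix):
    return sum(tbl[prefix + a]
               for a in itertools.product((0, 1), repeat=m - len(prefix))) % p

def masked_oracle(alpha, prefix):
    # One entry of the proof Pi_alpha for alpha*P + R.
    return (alpha * subcube_sum(table_P, prefix)
            + subcube_sum(table_R, prefix)) % p

alpha1, alpha2 = 7, 13
prefix = (1,)
# (Pi_{a1} - Pi_{a2}) / (a1 - a2) recovers the corresponding entry for P:
diff = (masked_oracle(alpha1, prefix) - masked_oracle(alpha2, prefix)) % p
recovered = diff * pow(alpha1 - alpha2, -1, p) % p
```

The same cancellation works against Π_α together with Π_R, which is why adding a second sumcheck for R does not restore zero knowledge.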

Structure versus randomness
From the above discussion, we see that the central obstacle to obtaining a zero-knowledge PCP is that the prover can break soundness by sending some R with ∑_{a∈{0,1}^m} R(a) ≠ 0, and known methods for detecting this strategy (without interaction) break zero knowledge.
Our approach is to prevent this malicious prover strategy by choosing a polynomial R so that ∑_{a∈{0,1}^m} R(a) = 0 by definition. Of course, we must take care to ensure that R still hides certain information about P. But what kind of information? We observe that, without loss of generality, we can view the sumcheck PCP for P + R as an oracle Π such that Π(c_1, . . ., c_i) = ∑_{a∈{0,1}^{m−i}} (P + R)(c_1, . . ., c_i, a_{i+1}, . . ., a_m) for all i ∈ [m] and c_1, . . ., c_i ∈ F. Computing the "subcube sum" ∑_{a∈{0,1}^{m−i}} P(c_1, . . ., c_i, a_{i+1}, . . ., a_m) requires 2^{m−i} queries to P. Hence, in order to hope to simulate, we must ensure that R hides such subcube sums for any 1 ≤ i < m − O(log m). So, at a minimum, it should be the case that for all such i, and all c_1, . . ., c_i ∈ F, the subcube sum ∑_{a∈{0,1}^{m−i}} R(c_1, . . ., c_i, a_{i+1}, . . ., a_m) is marginally uniform.
In short: (1) for soundness, we need the full sum of the masking polynomial R over the hypercube {0,1}^m to be fixed, and (2) for zero knowledge, other than the sum over the hypercube, we would like R to be distributed as randomly as possible.
We start by ensuring that any partial subcube sum of R is distributed uniformly at random. Our key observation here is that while the full sum is invariant under any reordering of the variables X_1, . . ., X_m, any nontrivial subcube sum (0 < i < m) is not. This leads us to the following choice for the masking polynomial: R(X) := Q(X) − Q(X_rev), where X_rev := (X_m, . . ., X_1) and Q is a uniformly random m-variate polynomial of individual degree d.
Clearly, by the permutation invariance of the complete sum, ∑_{a∈{0,1}^m} R(a) = 0. On the other hand, consider the special case of a subcube sum of R on the hypercube, i.e., for (c_1, . . ., c_i) ∈ {0,1}^i:

∑_{a∈{0,1}^{m−i}} R(c_1, . . ., c_i, a) = ∑_{a∈{0,1}^{m−i}} Q(c_1, . . ., c_i, a) − ∑_{a∈{0,1}^{m−i}} Q(a_rev, c_i, . . ., c_1).

The first term sums Q over the subcube whose first i coordinates are fixed, and the second over the subcube whose last i coordinates are fixed. This expression can only be identically zero for all Q if the two subcubes coincide, which is clearly only possible if i ∈ {0, m}. By linearity, if this subcube sum is not identically zero for all Q, then it is uniform when Q is uniform. However, R as defined retains residual structure off the hypercube: it is antisymmetric over all of F^m, i.e., R(c) = −R(c_rev) for every c ∈ F^m, a correlation that a malicious verifier can detect with just two queries.
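The two properties just established are easy to verify numerically. The sketch below works with the restriction of the mask to the hypercube, r(c) = q(c) − q(c_rev), where q is a hypothetical random table standing in for Q on {0,1}^m:

```python
import itertools
import random

# Sanity check of the mask's hypercube behaviour: the full sum over
# {0,1}^m vanishes identically, while a proper partial subcube sum
# generally does not.
p = 101
m = 4
rng = random.Random(2)
q = {a: rng.randrange(p) for a in itertools.product((0, 1), repeat=m)}

def r(c):
    # mask restricted to the hypercube: q(c) - q(c reversed)
    return (q[c] - q[c[::-1]]) % p

# Full sum over the hypercube: zero by permutation invariance.
full = sum(r(a) for a in itertools.product((0, 1), repeat=m)) % p

# A partial subcube sum (first coordinate fixed to 1): typically nonzero.
partial = sum(r((1,) + a) for a in itertools.product((0, 1), repeat=m - 1)) % p
```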
To mitigate this, we modify R to make it "as random as possible" while preserving antisymmetry on {0,1}^m and low-degree structure over F^m. Our final choice of R is as follows:

R(X) := Q(X) − Q(X_rev) + Z(X), where Z(X) := ∑_{i=1}^m X_i(X_i − 1) · T_i(X),    (1)

where Q is as before, X_rev := (X_m, . . ., X_1), and each T_i is a uniformly random m-variate polynomial with deg_{X_i}(T_i) ≤ d − 2 and deg_{X_j}(T_i) ≤ d for j ≠ i. What is this extra term? It was observed by [CCGOS23], via the combinatorial nullstellensatz [Alo99], that Z(X) is a uniformly random polynomial subject to the condition that Z(a) = 0 for all a ∈ {0,1}^m. Adding this to R retains the antisymmetric structure on the hypercube, but has the effect of "masking out" any structure that appears outside of {0,1}^m; in particular, R is no longer antisymmetric over all of F^m.

Key challenge: local simulation of combinatorial and algebraic structure. Showing the completeness and soundness of the PCP obtained via the foregoing approach is straightforward. Hence, it remains to prove that the perfect zero-knowledge condition holds.
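The vanishing property behind the extra term can be checked directly: any polynomial of the form Z(X) = ∑_i X_i(X_i − 1) · T_i(X) is zero on the hypercube, whatever the T_i are. In this sketch the T_i are hypothetical random multilinear polynomials, kept tiny for illustration.

```python
import itertools
import random

p = 101
m = 3
rng = random.Random(3)

# Represent each T_i by random multilinear coefficients (one per subset).
subsets = list(itertools.product((0, 1), repeat=m))
T = [{s: rng.randrange(p) for s in subsets} for _ in range(m)]

def eval_multilinear(coeffs, x):
    total = 0
    for s, c in coeffs.items():
        term = c
        for xi, si in zip(x, s):
            if si:
                term = term * xi % p
        total = (total + term) % p
    return total

def Z(x):
    # Z(x) = sum_i x_i (x_i - 1) T_i(x): vanishes whenever each x_i is 0 or 1.
    return sum(x[i] * (x[i] - 1) % p * eval_multilinear(T[i], x)
               for i in range(m)) % p
```

Off the hypercube (e.g. Z((2, 3, 5))) the value is generally nonzero, which is what "masks out" structure over F^m.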
Starting with the trivial case, note that our observation about subcube sums in the discussion above provides a simple strategy for simulating any single query (c_1, . . ., c_i) within the hypercube {0,1}^i to Π. Alas, this clearly does not suffice, as even the honest sumcheck verifier makes multiple queries to the proof, at points (c_1, . . ., c_i) ∉ {0,1}^i. Hence, we must simulate multiple queries beyond the hypercube. Indeed, the simulation of multiple queries to the low-degree extension of the hypercube is where the key challenge arises. More specifically, the difficulty is that our masking polynomial R has both algebraic structure, arising from the polynomial degree bound, and combinatorial structure, arising from the antisymmetric reordering of the variables. In order to simulate, we must not only understand both types of structure, but crucially, also how they interact.
In the remainder of this proof overview, we explain how we design an efficient simulator that can simulate responses to any number of queries to Π (the sumcheck PCP for P + R) over the entire space F^{≤m}. Towards this end, we introduce the notion of locally-simulatable encodings.
Loosely speaking, locally-simulatable encodings are randomised encodings in which every local view of the encoding can be efficiently sampled given a local view of the message. More precisely, we say that a randomised encoding function ENC : (H → F) → (D → F) is locally simulatable if there is an algorithm S which, given oracle access to a message m : H → F and a set S ⊆ D, samples from the distribution of ENC(m)|_S ∈ F^S in time polynomial in |S| (which may be much less than |H|).
The notion of locally-simulatable encodings is tightly connected to ZK-PCPs: we can view the mapping from instances to corresponding PCPs as a randomised encoding. A local simulator for this encoding is a zero-knowledge simulator for the PCP system. In the following sections, we will provide an overview of the proofs that the combinatorial antisymmetric structure of our masking, the algebraic structure of the Reed-Muller code, and their augmentations with the subcube sums that arise in the sumcheck protocol all admit locally simulatable encodings.

Combinatorial structure of antisymmetric functions
As discussed above, the response to any single query (c_1, . . ., c_i) ∈ {0,1}^i to Π can be perfectly simulated. In this section, we explain how to simulate any number of queries to Π, provided that those queries lie in the set {0,1}^{≤m} := ∪_{i=1}^m {0,1}^i. Note that, evaluated over the hypercube, the last term of Eq. 1 cancels; hence we can think of the masking polynomial as R(X) = Q(X) − Q(X_rev), where X_rev := (X_m, . . ., X_1).
Observe that, when restricted to {0,1}^m, the distribution of R is well structured; namely, it is a uniformly random element of the vector space AntiSym of "antisymmetric functions", i.e., functions f : {0,1}^m → F such that f(c) = −f(c_rev) for all c ∈ {0,1}^m, where c_rev = (c_m, . . ., c_1) is the reverse of c. We can view the restriction of Π to {0,1}^{≤m} as the output of a randomised encoding function ENC_ΣAntiSym : ({0,1}^m → F) → ({0,1}^{≤m} → F), applied to the restriction of P to {0,1}^m. The function ENC_ΣAntiSym has the following description.

ENC_ΣAntiSym(f): sample a uniformly random r ∈ AntiSym, and output the function Σ[f + r], given by

Σ[f + r](c_1, . . ., c_i) := ∑_{a∈{0,1}^{m−i}} (f + r)(c_1, . . ., c_i, a)    (2)

for all i ∈ [m]; i.e., the function f + r, augmented with all of its subcube sums. In this perspective, our task is now to provide a local simulator for ENC_ΣAntiSym. That is, for a query set S ⊆ {0,1}^{≤m}, we want to efficiently simulate Σ[f + r]|_S for a uniformly random r ∈ AntiSym, given oracle access to f. Let ΣAntiSym|_S denote the vector space {Σ[r]|_S : r ∈ AntiSym}, and let B be a basis for its dual code (ΣAntiSym|_S)^⊥; that is, v ∈ ΣAntiSym|_S if and only if Bv = 0. Then, by linearity, Σ[f + r]|_S is distributed as a uniformly random vector w subject to Bw = B(Σ[f]|_S). With this in mind, our local simulator S could work as follows.
1. Compute a basis B for (ΣAntiSym|_S)^⊥.
2. Compute y := B(Σ[f]|_S) by querying f.
3. Output a random w such that Bw = y.
Correctness is straightforward. The only potential issue is efficiency: evaluating Σ[f] at a point in S may require exponentially many queries to f. Note, however, that if the column of B corresponding to c := (c_1, . . ., c_i) is all zero, then we do not need to know Σ[f](c). Hence, for efficiency, it would suffice to (efficiently) find a basis B for (ΣAntiSym|_S)^⊥ such that ∑_{c∈supp(B)} 2^{m−ℓ(c)} = poly(|S|), where supp(B) is the set of nonzero columns of B and ℓ(c) = i for c ∈ {0,1}^i. Does such a basis exist? Not exactly: for example, if S = {(0), (1)}, then supp(B) = {(0), (1)}, since Σ[r](0) + Σ[r](1) = ∑_{a∈{0,1}^{m−1}} r(0, a) + ∑_{a∈{0,1}^{m−1}} r(1, a) = 0 for all r ∈ AntiSym. However, this counterexample does not actually cause a problem for zero knowledge, since for this basis B, it holds that B(Σ[P]|_S) = ∑_{a∈{0,1}^m} P(a) = γ, which is part of the input to the problem. Our key technical result in this section is that all possible counterexamples are essentially of this form.

Lemma 1. There is a polynomial p such that, for any prefix-free S ⊆ {0,1}^{≤m}, there exists a basis B of (ΣAntiSym|_S)^⊥ where each row b_i of B is a 0-1 vector and, letting T(b_i) := ∪_{c∈supp(b_i)} {c} × {0,1}^{m−ℓ(c)} denote the union of the subcubes in the support of b_i, we have min(|T(b_i)|, 2^m − |T(b_i)|) ≤ p(|S|). Moreover, B can be efficiently computed from S.
For the purposes of this overview, we will assume that S is indeed prefix-free; in the full proof we show that this holds without loss of generality.
To gain some intuition for this result, let us consider the 2-dimensional case of antisymmetric functions A : [n]^2 → F; i.e., n × n antisymmetric matrices A over F. Suppose that X ⊆ [n] × [n] is a set consisting of r full rows and t individual entries (so |X| = rn + t) such that for all antisymmetric A, it holds that ∑_{(i,j)∈X} a_ij = 0. The latter condition implies that the indicator matrix for X must be symmetric; one possible choice of X is highlighted in Fig. 1. It is straightforward to show that this symmetry implies t ≥ r · (n − r), from which we obtain that either r ≲ t/n or r ≳ n − t/n. This in turn yields appropriate bounds on |X|: |X| ≲ 2t + t²/n² or |X| ≳ n² − t²/n². The proof of Lemma 1 generalises this approach to higher dimensions.
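The pairing argument behind the 2-dimensional case can be checked directly: whenever the index set X has a symmetric indicator matrix, the entries of an antisymmetric matrix pair up as a_ij + a_ji = 0 and the sum cancels. A small sketch (matrix size, field, and choice of X are arbitrary):

```python
import random

p = 101
n = 5
rng = random.Random(4)

# A random antisymmetric matrix over F_p (zero diagonal, a_ji = -a_ij).
A = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        A[i][j] = rng.randrange(p)
        A[j][i] = (-A[i][j]) % p

# A symmetric index set: one full row together with the matching column.
r = 2
X = {(r, j) for j in range(n)} | {(i, r) for i in range(n)}

# Entries (i, j) and (j, i) cancel, and the diagonal entry is zero.
total = sum(A[i][j] for (i, j) in X) % p
```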
Lemma 1 suggests the following modified local simulator, which we additionally provide with the total sum γ:

S^f_AntiSym(S; γ): proceed as before, computing each constraint value b_i · (Σ[f]|_S) by summing f over either T(b_i) or its complement (correcting by the total sum γ in the latter case), whichever is smaller. The number of queries to f required per row b_i is then min(|T(b_i)|, 2^m − |T(b_i)|), which is poly(|S|) by Lemma 1; hence the overall running time of the simulator is poly(|S|).

Local simulation of random low-degree extensions
In the previous section we addressed the combinatorial problem of simulating queries to Π that lie in the set {0,1}^{≤m}, which exhibits antisymmetric structure. In this section we consider the algebraic problem of simulating queries to Π that lie outside of the hypercube (i.e., general point queries), which exhibits pseudorandom structure. We then bring these two parts together in Section 2.4.
Recall from Eq. 1 that our choice of R is R(X) = Q(X) − Q(X_rev) + Z(X), where Q is as before, X_rev := (X_m, . . ., X_1), and each T_i appearing in Z is a uniformly random m-variate polynomial of the appropriate degree. For a function f : {0,1}^m → F and d ∈ N, let LD[f, d] denote the set of degree-d extensions of f. With the above modification, we can describe the sumcheck PCP for P + R, restricted to F^m, as the output of the following randomised encoding. ENC_RMAntiSym(P): sample a uniformly random r ∈ AntiSym, and output a uniformly random element of LD[(P + r)|_{{0,1}^m}, d]. Our problem now becomes to design a local simulator for ENC_RMAntiSym.
We will do this by solving a much more general problem: we prove that random low-degree extensions are locally simulatable. More precisely, let ENC^d_RM(f) be the encoding function that outputs a uniformly random element of LD[f, d]. We prove that ENC^d_RM has a local simulator S^d_RM for any d ≥ 2. We can then easily obtain a simulator for ENC_RMAntiSym by composing S_RM with the local simulator for AntiSym described in the previous section.
Our starting point. We will build on the "stateful emulator" that was constructed in [CCGOS23] (for a particular oracle model that they introduced), which can be conceptualised as an inefficient local simulator for ENC_RM. To describe it, we first introduce some notation.
For w ∈ {0,1}^m, we denote by δ_w the unique multilinear polynomial satisfying δ_w(w) = 1 and δ_w(x) = 0 for all x ∈ {0,1}^m \ {w}. For a set S ⊆ F^m, we say that w is S-good if there exists an m-variate polynomial q_w of individual degree at most d such that (i) q_w(x) = 0 for every x ∈ {0,1}^m \ {w}; (ii) q_w(z) = 0 for every z ∈ S; and (iii) q_w(w) = 1. We say that w is S-bad if it is not S-good. Intuitively, when f̃ is a uniformly random low-degree extension of f, f̃|_S only conveys information about f(w) for S-bad points w. For example, any point in {0,1}^m ∩ S is trivially S-bad. Less trivially, if S consists of sufficiently many points on a curve passing through w ∈ {0,1}^m, then w may be S-bad even if {0,1}^m ∩ S is empty.
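For concreteness, the multilinear indicator δ_w has the explicit form δ_w(x) = ∏_i (x_i if w_i = 1, else 1 − x_i). A quick check over a small hypercube (the field size is an illustrative choice):

```python
import itertools

p = 101
m = 3

def delta(w, x):
    # delta_w(x) = prod_i (x_i if w_i = 1 else 1 - x_i), over F_p.
    out = 1
    for wi, xi in zip(w, x):
        out = out * (xi if wi == 1 else (1 - xi) % p) % p
    return out

w = (1, 0, 1)
values = {x: delta(w, x) for x in itertools.product((0, 1), repeat=m)}
```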
The [CCGOS23] emulator proceeds in two steps: (1) compute the set W of S-bad points, and (2) query f at the points of W and sample simulated answers consistent with those values. However, they leave open the question of whether S_RM can be made computationally efficient. We note that Step 2 can be achieved efficiently using an algorithm of [BCFGRS17]. The issue is in Step 1; namely, it is not clear how to compute the set W from S. Indeed, at first glance this appears to be hopeless: naively, computing whether a single w is S-bad requires solving an exponentially large linear system, and there are 2^m possible choices for w. Nonetheless, we will present an efficient algorithm which, given S, outputs a list of all S-bad points.

Step 1: solving the decision problem. Our first step is to consider the decision variant of this problem: given a set S, does there exist an S-bad w? Our key insight here is that we can relate the existence of an S-bad w to the dimension of the vector space LD[Z, d]|_S, where Z is the constant zero function. We show that all w are S-good if and only if LD[Z, d]|_S = RM[F, m, d]|_S. Indeed, if every w is S-good, then for any degree-d polynomial P, the polynomial P − ∑_{w∈{0,1}^m} P(w) · q_w is a degree-d extension of Z that agrees with P on S. On the other hand, if the two spaces are equal, then for every w ∈ {0,1}^m there exists Ẑ_w ∈ LD[Z, d] that agrees with δ_w on S; hence we can take q_w = δ_w − Ẑ_w, and so w is S-good.
Hence, to solve the decision problem it suffices to compute the dimensions of RM[F, m, d]|_S and LD[Z, d]|_S. The former can be achieved efficiently (in time poly(|S|, m, d, log |F|)) using the succinct constraint detector for Reed-Muller discovered by [BCFGRS17].
To compute the latter, we build on ideas introduced in [CCGOS23]. As discussed above, in that work the authors observe that to sample a vector v ← LD[Z, d]|_S, it suffices to (lazily) sample random polynomials T_i, i ∈ [m], of the appropriate degrees, and then set v(α) := ∑_{i=1}^m α_i(α_i − 1) · T_i(α) for each α ∈ S. Similarly, we show how to compute a basis for LD[Z, d]|_S by combining bases for the subspaces corresponding to the terms X_i(X_i − 1) · T_i(X). These bases can also be computed using the succinct constraint detector for Reed-Muller; see Section 5.
Step 2: search-to-decision reduction. Next, we build upon the techniques we developed in Step 1 to solve the original search problem. A natural strategy is to employ a binary search: for each b ∈ {0, 1}, test whether {b} × {0,1}^{m−1} contains any S-bad points, and recurse if it does. Since the number of S-bad points is bounded by |S|, the recursion will terminate quickly. It therefore suffices to give an efficient algorithm that, given a set A of the form A = {(b_1, . . ., b_k)} × {0,1}^{m−k}, decides whether A contains an S-bad point. Our algorithm for computing a basis for LD[Z, d]|_S can be straightforwardly extended to compute a basis for LD[Z_A, d]|_S. We can also extend the reasoning from Step 1 to show that if every w ∈ A is S-good, then the corresponding restricted spaces coincide. Unfortunately, the converse does not hold: it may be that A contains S-bad points even though the spaces coincide. This can happen, for example, if S contains points on a curve passing through both w ∈ A and w′ ∈ {0,1}^m \ A. In particular, the argument from Step 1 fails for such sets A. To obtain our final algorithm, we will instead exploit the algebraic relationship between RM[F, m, d] and its subcode RM[F, m, d − 1] to derive a sufficient condition for every w ∈ A being S-good. Observe that if there exists Ẑ_w ∈ LD[Z, d − 1] agreeing with δ_w on S, then p_w := δ_w − Ẑ_w satisfies p_w(w) = 1 and p_w(α) = 0 for all α ∈ S. It follows that all such w ∈ A are S-good, as we can set q_w = p_w · δ_w.
Thus, our efficient test for whether A contains any S-bad point is as follows: compute bases for LD[Z_A, d − 1]|_S and LD[Z, d − 1]|_S, and output "no" if they are of the same dimension; otherwise output "maybe". By the above discussion, this test does not have any false negatives; however, there may be false positives. Provided there are not too many false positives, this does not cause a problem (we can either include them in W or perform an extra test to filter them out). We bound the number of false positives by noting that if the test says "maybe", this in fact means that there exists an "S-bad" point in A with respect to polynomials of degree d − 1. By [AW09], provided d ≥ 2, there are at most |S| such points.

Local simulation of subcube sums of LDEs
So far, we have built a simulator that can answer arbitrary queries to our sumcheck PCP Π, provided those queries lie within the set {0,1}^{≤m} ∪ F^m, handling the combinatorial structure on the hypercube and the pseudorandom algebraic structure outside of it. In this final part of the overview, we outline how this simulator can be extended to handle all queries; i.e., queries in the set F^{≤m}.
To do this, it suffices to show that the code ΣRM admits a locally-simulatable encoding, where ΣRM[F, m, d] = {Σ[P] : P ∈ RM[F, m, d]} and Σ[P] is defined as in Eq. 2 (where (c_1, . . ., c_i) now ranges over F^i). Specifically, we show that the following encoding function is locally simulatable: ENC_ΣRM(Σ[f]) outputs Σ[f̃] for a uniformly random f̃ ∈ LD[f, d]. Note that the message (input to ENC_ΣRM) is Σ[f], even though ENC_ΣRM operates only on f. This is necessary for local simulation, since individual locations in Σ[f̃] depend on partial sums of f̃ which cannot be computed from few queries to f itself. We would like to follow the strategy from the previous section: given a set S ⊆ F^{≤m} on which we want to simulate ENC_ΣRM(Σ[f]), compute the set of all "bad" points W ⊆ {0,1}^{≤m}; i.e., w ∈ W if ENC_ΣRM(Σ[f])|_S conveys information about Σ[f](w). Unfortunately, unlike in the "plain" Reed-Muller case, it is not clear how even to bound the size of W, let alone compute it.
The issue here is that a single evaluation of Σ[f̃](α) for α ∈ F^i depends on f̃(β) for every point β ∈ {α} × {0,1}^{m−i}. Naively applying the lemma of [AW09] to this set of points yields a set W ⊆ {0,1}^m of size 2^{m−i}. To bound |W| by a polynomial, we must therefore crucially make use of the fact that making a few queries to Σ[f̃] can reveal only a few (possibly large) partial sums of f̃, which we can hope to deduce from a small number of queries to Σ[f].
To achieve this we will give a decomposition of ΣRM as a sequence of RM codes on different numbers of variables, "tied together" with constraints that enforce summation structure. In more detail, let T ⊆ F^{≤m} be a set with the following special structure, which we call "closed": if (s_1, . . ., s_i) ∈ T, then its "parent" (s_1, . . ., s_{i−1}) and its "siblings" (s_1, . . ., s_{i−1}, 0) and (s_1, . . ., s_{i−1}, 1) are also in T. The set {0,1}^{≤m} is closed; it is also easy to see that any set T can be made closed by adding at most 3m|T| points. We prove the following theorem about the structure of restrictions of ΣRM to closed sets T.

Theorem 2.1 (Informally stated, see Theorem 6.5). For any closed set T, any constraint z ∈ (ΣRM[F, m, d]|_T)^⊥ lies in the span of the following two types of constraint:
• low-degree constraints, which are given by, for each i, the i-variate Reed-Muller constraints on the set T ∩ F^i; and
• summation constraints, which enforce that each entry is the sum of its two "children", i.e., Σ[P](s_1, . . ., s_{i−1}) = Σ[P](s_1, . . ., s_{i−1}, 0) + Σ[P](s_1, . . ., s_{i−1}, 1).

If we take T = {0,1}^{≤m} ∪ S, then, using this decomposition, we show that we can take as the "bad" set W := ∪_{i=1}^m W_i, where W_i is the set of (S ∩ F^i)-bad points in RM[F, i, d]. Each W_i can be computed efficiently using the algorithm described in Section 2.3.
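The closure step can be sketched directly: repeatedly add each point's parent and two siblings until the defining condition holds. The tuple encoding below is ours; the bound of at most 3m|T| added points is visible from the fact that each original point contributes at most three new points per ancestor level.

```python
def close(T):
    """Close a set of tuples under parents and siblings."""
    closed = set(T)
    frontier = list(T)
    while frontier:
        t = frontier.pop()
        if len(t) == 0:
            continue
        # parent (drop last coordinate) and siblings (last coordinate 0/1)
        for u in (t[:-1], t[:-1] + (0,), t[:-1] + (1,)):
            if u not in closed:
                closed.add(u)
                frontier.append(u)
    return closed

# Example: closing a single length-2 point.
C = close({(5, 7)})
```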
We conclude by giving some brief intuition on how we prove Theorem 2.1. We show that any constraint z supported on T ∩ F^i lies in the span of the i-variate Reed-Muller constraints. We then use a dimension-counting argument to show that these, augmented with the summation constraints, span the whole of (ΣRM[F, m, d]|_T)^⊥.

Preliminaries
For a vector a ∈ Σ^i, we denote by ℓ(a) the number of entries in a, i.e. ℓ(a) = i. Throughout, F is a finite field. For two vectors x, y ∈ F^n, we denote the dot product by x · y := ∑_{i=1}^n x_i y_i. For sets A, B, we denote by A ⊔ B the disjoint union of A and B.
Algorithms. We write A^π(x) to denote the output of A when given input x (explicitly) and oracle access to π. We use the abbreviation PPT to denote probabilistic polynomial-time. We generally omit the internal randomness of an algorithm from probability statements.

Low-degree extensions. Given a function f : {0,1}^m → F and d ∈ N, a degree-d extension of f is an m-variate polynomial over F of individual degree at most d that agrees with f on {0,1}^m.

Lagrange polynomials. Let F be a field, and let m ∈ N. Given S_1, . . ., S_m ⊆ F and a point w ∈ S_1 × · · · × S_m, we define the Lagrange interpolating polynomial for S and w to be L_{S,w}(X) := ∏_{i=1}^m ∏_{s∈S_i \ {w_i}} (X_i − s)/(w_i − s).

Fact 3.1. The set of Lagrange interpolators {L_{S,w}}_{w∈S} forms a basis for the space of m-variate polynomials with deg_{X_i} < |S_i| for each i ∈ [m].

Vanishing functions. Given a subset S ⊆ F, we denote the function which is vanishing on S by Z_S(x) := ∏_{s∈S} (x − s).
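The univariate versions of these two objects can be sketched in a few lines; the field size and the set S below are illustrative.

```python
p = 101
S = [2, 3, 5]

def Z_S(x):
    # vanishing polynomial: product of (x - s) over s in S
    out = 1
    for s in S:
        out = out * (x - s) % p
    return out

def lagrange(S, w, x):
    # L_{S,w}(x): equals 1 at w and 0 at the other points of S
    out = 1
    for s in S:
        if s != w:
            out = out * (x - s) % p * pow(w - s, -1, p) % p
    return out
```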

Coding theory
Let Σ be an alphabet and let ℓ ∈ N. A code C is a subset C ⊆ Σ^ℓ. Given two strings x, y ∈ Σ^n, we denote the Hamming distance between x and y by ∆(x, y) := |{i ∈ [n] : x_i ≠ y_i}|. We say that a vector x is ε-far from a set S ⊆ Σ^n if min_{y∈S} ∆(x, y)/n ≥ ε.
Note that C|_I is itself a linear code, as it is a linear subspace of F^I.
Linear codes. Let F be a finite field. A code C :

Reed-Muller codes and low-degree extensions. The Reed-Muller (RM) code is the code consisting of evaluations of multivariate low-degree polynomials over a finite field.
Definition 3.4. Given a field F, a positive integer m, and a degree vector

Zero and sum codes. A zero code is a subcode of a linear code that is zero on a given subset.

Definition 3.5 (Zero codes). Given a code C ⊆ F^D and S ⊆ D, we define the zero code

A sum code is the extension of a linear code obtained by appending all sums of entries over a given subdomain.

Definition 3.6 (Sum codes). For a product set and A ⊆ D, we define the sum code Σ_A C ⊆ F^D to be the code consisting of all of the subcube sums over A of the codewords in C. That is, Σ_A[w](x) := Σ_y w(x, y).

If D = A, we will typically omit A from the notation (i.e., we write ΣC, Σ[w]).
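A small sketch of the subcube sums appearing in Definition 3.6 (with conventions assumed by us: a prefix (s_1, . . ., s_i) indexes the sum of w over the subcube fixing those coordinates):

```python
# Minimal illustration of a subcube sum: given w : A^m -> F_p and a prefix
# a = (s_1, ..., s_i), sum w(s_1, ..., s_i, x_1, ..., x_{m-i}) over all tails
# x in A^{m-i}. (Conventions are ours; the paper's display was lost.)

from itertools import product

def subcube_sum(w, prefix, A, m, p):
    """Sum w over the subcube of A^m whose first coordinates equal `prefix`."""
    total = 0
    for tail in product(A, repeat=m - len(prefix)):
        total = (total + w[prefix + tail]) % p
    return total
```

For example, with A = {0, 1}, m = 2 and w(x, y) = x + 2y mod 5, the full sum over A^2 is 0 + 2 + 1 + 3 = 6 ≡ 1 (mod 5).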

Linear-algebraic claims
We will make use of the following simple linear-algebraic algorithms and facts.
Claim 3.7. Let U ≤ F^n be a vector space. There is a polynomial-time algorithm which, given a basis for U, computes a basis for U^⊥.

Claim 3.8. Let U ≤ F^n and V ≤ F^m be vector spaces. There is a polynomial-time algorithm which, given a linear map M ∈ F^{m×n} such that {Mu : u ∈ U} = V and a basis B ∈ F^{n×k} for U^⊥, outputs a basis for V^⊥.

Proof. Use Claim 3.7 to compute a basis B′ ∈ F^{n×(n−k)} for U. Then compute A = MB′ ∈ F^{m×(n−k)}; we have that V = colspan(A) by assumption on M. Compute a basis for V from A by Gaussian elimination, and from this a basis for V^⊥ by Claim 3.7.
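Claims 3.7 and 3.8 only assert that such algorithms exist; the following sketch (ours) shows the core step over a small prime field F_p: the kernel of a matrix whose rows span U, computed by Gaussian elimination mod p, is exactly U^⊥.

```python
# Sketch of Claim 3.7 over F_p: if the rows of B span U, then
# {z : B z = 0} = U^perp, and a kernel basis is read off the RREF of B.

def kernel_basis(B, n, p):
    """Basis of {z in F_p^n : B z = 0}, for B a list of length-n rows."""
    M = [row[:] for row in B]
    pivots, r = [], 0
    for c in range(n):
        piv = next((i for i in range(r, len(M)) if M[i][c] % p), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], p - 2, p)          # normalise the pivot to 1
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] % p:        # clear the rest of the column
                f = M[i][c]
                M[i] = [(x - f * y) % p for x, y in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f_col in free:                        # one kernel vector per free column
        v = [0] * n
        v[f_col] = 1
        for row_i, pc in enumerate(pivots):
            v[pc] = (-M[row_i][f_col]) % p
        basis.append(v)
    return basis
```

For instance, for U spanned by (1, 2, 3) in F_5^3, the kernel is 2-dimensional and every basis vector is orthogonal to (1, 2, 3).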
Proof. We have that

Definition 3.10. Let (D, <) be an ordered domain. For a vector v : D → F, we define λ(v) := min{x ∈ D : v(x) ≠ 0}, where the minimum is taken with respect to <. A sequence of vectors v_1, . . ., v_n : D → F is in echelon form if λ(v_i) < λ(v_{i+1}) whenever v_i and v_{i+1} are both nonzero, and any all-zero vectors are at the end of the sequence.
It is a straightforward consequence of Gauss-Jordan elimination that every subspace of (D → F) has a basis in echelon form. Conversely, a sequence of vectors in echelon form (not containing the all-zero vector) forms a linearly independent set.

Definition 3.11. Let (R, <_R) and (C, <_C) be ordered domains and let A ∈ F^{R×C}. The leading entry of a row w ∈ F^C of A is w(λ(w)), i.e., the first non-zero entry in that row, with respect to the ordering <_C.

Definition 3.12. Let (R, <_R) and (C, <_C) be ordered domains. A matrix A ∈ F^{R×C} is said to be in reduced row echelon form if (i) the rows of A, listed in increasing order by their indices with respect to <_R, are in echelon form (with respect to <_C); (ii) all leading entries are 1; and (iii) all entries in the same column as a leading entry are 0.
Every matrix can be converted to reduced row echelon form via Gauss-Jordan elimination, and moreover, the resulting matrix is unique. We will often represent systems of linear equations in matrix form. We refer to variables whose column in the reduced row echelon form matrix contains a leading entry as leading variables, and to the remaining variables as free variables.

The combinatorial nullstellensatz
The combinatorial nullstellensatz, due to Alon [Alo99], is a powerful tool in combinatorial number theory. It is stated as follows.

Lemma 3.13. Let F be an arbitrary field, and let f be a polynomial in F[X_1, . . ., X_n]. Let S_1, . . ., S_n be nonempty subsets of F and define Z_{S_i}(x_i) := ∏_{s∈S_i}(x_i − s). If f vanishes over all the common zeros of Z_{S_1}, . . ., Z_{S_n} (that is, if f(s_1, . . ., s_n) = 0 for all s_i ∈ S_i), then there are polynomials h_1, . . ., h_n ∈ F[X_1, . . ., X_n] satisfying deg(h_i) ≤ deg(f) − |S_i| such that f = Σ_{i=1}^n h_i Z_{S_i}.
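As a toy illustration of the lemma's conclusion (an example of ours, not from the paper):

```python
# f(x, y) = (x^2 - x) * y vanishes on S_1 x S_2 = {0,1} x {0,1}, and indeed
# it decomposes as f = h_1 * Z_{S_1} + h_2 * Z_{S_2} with h_1(x, y) = y,
# h_2 = 0, and Z_{S_1}(x) = x(x - 1); note deg(h_1) = 1 <= deg(f) - |S_1| = 1.

def f(x, y):
    return (x * x - x) * y

def Z1(x):          # vanishing polynomial of S_1 = {0, 1}
    return x * (x - 1)

# f vanishes on all common zeros of Z_{S_1}, Z_{S_2}:
assert all(f(x, y) == 0 for x in (0, 1) for y in (0, 1))

# ... and the guaranteed decomposition f = y * Z_{S_1} + 0 holds identically
# (checked here on an integer grid):
assert all(f(x, y) == y * Z1(x) for x in range(-3, 4) for y in range(-3, 4))
```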

Locally simulatable encodings
In this section we introduce the notion of locally simulatable encodings. First, we define randomised encoding functions, which are randomised mappings from a message space M into a code C.

Definition 4.1 (Randomised encoding function). Let Σ be an alphabet, let S, D be sets, and let M ⊆ Σ^S, C ⊆ Σ^D be the message space and code space, respectively. A randomised encoding function is a random variable ENC taking values in M → C.

Next, we define what it means for a randomised encoding function to be locally simulatable. At a high level, we require that the (conditional) distribution of any local view of the encoding can be efficiently simulated using only a local view of the message.

Definition 4.2 (Locally simulatable encoding). Let ENC : M → C be a randomised encoding function, and let t : N → N and ℓ : N → N. We say that ENC is (t, ℓ)-locally simulatable if there is a probabilistic algorithm S, called a local simulator, which receives oracle access to a message m ∈ M, a query-answer set T ∈ (D × Σ)^n and a new query α ∈ D, runs in time t(n) and makes at most ℓ(n) queries to m, satisfying: for all m ∈ M, α ∈ D, β ∈ Σ, and for all T such that there exists c ∈ supp(ENC_m) with c_x = y for all (x, y) ∈ T,
To give a sense of the definition, we consider two examples.
Example 4.3. The randomised encoding function which maps m ∈ F^S, S ⊆ F, to the evaluation of a random univariate polynomial p over F of degree 2(|S| − 1) such that p(x) = m(x) for all x ∈ S is (poly(n, d, log |F|), n)-locally simulatable. Indeed, for |I| ≥ |S|, the simulator S can simply fully determine m and directly conditionally sample p.
Example 4.4. The encoding function which maps m ∈ F^S, S ⊆ F, to the evaluation of the unique univariate polynomial p over F of degree |S| − 1 such that p(x) = m(x) for all x ∈ S is not (t, ℓ)-locally simulatable unless ℓ(n) ≥ |S| for all n > 0. Indeed, determining p(x) for x ∈ F \ S requires knowledge of the entire message m.
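The contrast between the two examples can be made concrete. The following sketch (our own illustration of Example 4.3 over a small prime field) answers fresh queries uniformly at random while the encoding is still underdetermined, and interpolates once 2|S| − 1 evaluations are fixed, so that all answers lie on a single polynomial of degree 2(|S| − 1) agreeing with the message:

```python
# Sketch of the Example 4.3 simulator (our illustration): messages
# m : S -> F_p are encoded as a uniformly random polynomial of degree
# 2(|S| - 1) agreeing with m on S. Fresh queries are uniform until
# 2|S| - 1 evaluations are fixed; afterwards, answers are interpolated.

import random

P = 97

def interp(points, x):
    """Lagrange-evaluate at x the unique poly through `points` (dict) mod P."""
    total = 0
    for w, yw in points.items():
        num, den = yw, 1
        for s in points:
            if s != w:
                num = num * (x - s) % P
                den = den * (w - s) % P
        total = (total + num * pow(den, P - 2, P)) % P
    return total

def make_simulator(m):                # m : dict S -> F_p (the message oracle)
    known = dict(m)                   # points fixed so far (message + answers)
    deg_plus_1 = 2 * (len(m) - 1) + 1
    def query(alpha):
        if alpha not in known:
            if len(known) >= deg_plus_1:   # polynomial now fully determined
                base = dict(list(known.items())[:deg_plus_1])
                known[alpha] = interp(base, alpha)
            else:                          # still a free degree of freedom
                known[alpha] = random.randrange(P)
        return known[alpha]
    return query
```

This relies on the fact that, conditioned on fewer than 2|S| − 1 fixed evaluations, the value at a fresh point is uniform; the design mirrors the constraint-locator-based simulator of Claim 4.8.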

Linear randomised encoding functions
In this work we will focus on a special family of randomised encoding functions, for which the randomised encoding is given by appending a random vector to the message, and then applying a linear function.
Definition 4.5 (Linear randomised encoding function). We say that ENC : M → C is a linear randomised encoding function if Σ = F for some finite field F, and ENC(m) is distributed as ENC(m; µ), where µ ∼ U(F^r) and ENC : M × F^r → C is a (deterministic) linear function (of both message and randomness).
That is, we use ENC to refer to both the fixed encoding function and the resulting random variable.
Linear randomised encodings have certain structural properties that make them easier to work with. One important property is that the distribution of ENC(m)|_I can be characterised as the solution space of an affine system. An ℓ-constraint locator is an algorithm which outputs a description of this system, defined as follows.
Definition 4.6. Let ENC : M → C be a linear randomised encoding function, and let ℓ : N → N. We say that an algorithm CL is an ℓ-constraint locator for ENC if, for every subdomain I ⊆ D, it satisfies

where R ⊆ S and Z ∈ F^{k×(R⊔I)} are such that |R| = ℓ(|I|), and for every message m ∈ M and β ∈ F^I, (m|_R, β)^T ∈ ker(Z) ⊆ F^{R⊔I} if and only if there exists randomness µ ∈ F^r such that:

Our notion of an ℓ-constraint locator generalises the notion of a "constraint detector" introduced in [BCFGRS17]. A constraint detector for a linear code C ⊆ F^D takes as input a subdomain I ⊆ D and outputs a matrix representing all linear relations that hold between codewords in C|_I. Setting the randomised encoding function for our ℓ-constraint locator to be the uniform distribution over all codewords in C, we recover the definition of a constraint detector.

Definition 4.7 (Constraint detector). Let C ⊆ F^D be a linear code. A constraint detector for C is a 0-constraint locator for ENC_C : ∅ → C, the linear randomised encoding function given by ENC_C ∼ U(C).
Given an efficient constraint locator, it is straightforward to construct a local simulator.
Claim 4.8. Let ENC : M → C be a linear randomised encoding with an ℓ-constraint locator running in time t. Then ENC is (t, ℓ)-locally simulatable.

Proof. Given a row z of Z with z(α) ≠ 0, define κ := −z(α)^{−1}(Σ_{x∈supp(T)} y_x z(x) + Σ_{x∈R} m_x z(x)). Define the vector y ∈ F^{supp(T)} to be the vector consisting of the values y_x for each (x, y_x) ∈ T.
We break the proof into two parts, corresponding to the two possibilities for the output of the construction in Step 3 and Step 4. First, we show that if there exists a row z ∈ F^{R∪supp(T)∪{α}} of the matrix Z with z(α) ≠ 0 (as in Step 3), then p_β = 1 if β = κ and p_β = 0 otherwise. If there exists such a row, then the vector (β, y, m|_R) ∈ ker(Z) if and only if β = κ. By the correctness of CL, (β, y, m|_R) ∈ ker(Z) if and only if there exists randomness µ such that ENC(m; µ)|_I = (β, y). So there exists randomness µ ∈ F^r with ENC(m; µ)|_I = (β, y) if and only if β = κ; hence p_β = 1 if β = κ and p_β = 0 otherwise.
Second, we show that if there does not exist a row z with z(α) ≠ 0 (as in Step 4), then p_β = 1/|F| for all β ∈ F. For each β ∈ F, consider the set of all possible choices of randomness which agree with T ∪ {(α, β)}; that is,

|F|, since µ is chosen uniformly at random. In this case, (β, y, m|_R) ∈ ker(Z) for all β ∈ F. Then, by the correctness of CL, for all β ∈ F, there exists randomness µ ∈ F^r such that ENC(m; µ)

We prove that ℓ-constraint locators satisfy a useful composition property. That is, given constraint locators for encodings ENC_in and ENC_out, we can construct a constraint locator for the composed encoding ENC_out ◦ ENC_in.

CL*(I): where k_in and k_out are the number of rows in Z_in and Z_out respectively. 4. Using Gaussian elimination, compute a basis B for the space V := ker(Z*). 5. Compute a basis B′ for the space (V|_{I∪R_in})^⊥, using Claim 3.7 and Claim 3.8. Define Z to be the matrix whose rows are the elements of B′. 6. Output (R_in, Z).
Proof. Let m ∈ F^S, and denote w_m := ENC_in(m; µ_in) and c_w := ENC_out(w; µ_out).

By the correctness of CL_out, for every message w ∈ F^{D_in}, α ∈ F^I satisfies (α, w|_{R_out}) ∈ ker(Z_out) if and only if there exists randomness µ_out such that c_w|_I = α. Similarly, for every message m ∈ F^S, β ∈ F^{R_out} satisfies (β, m|_{R_in}) ∈ ker(Z_in) if and only if there exists randomness µ_in such that w_m|_{R_out} = β.

Then by definition of Z* we must have that (α, β, m|_{R_in}) ∈ ker(Z*) if and only if there exist µ_out, µ_in such that w_m|_{R_out} = β and c_{w_m}|_I = α. Finally, as Z is defined to be the matrix whose rows are the elements of a basis for (V|_{I∪R_in})^⊥ = (ker(Z*)|_{I∪R_in})^⊥, we have that (α, m|_{R_in}) ∈ ker(Z) if and only if there exists randomness µ_in, µ_out for which ENC*(m; µ_in, µ_out)|_I = α.
Note that this claim does not seem to generalise to permit composition of arbitrary locally simulatable encodings.Indeed, here we are implicitly using the fact that a constraint locator allows efficient sampling of a random input to ENC out consistent with (the restriction to I of) a given output.For general locally simulatable encodings, such an algorithm may not exist.
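The case analysis in the proof of Claim 4.8 can be sketched directly (our reconstruction, with the local view encoded as a dict from column labels to values): a row of Z with a nonzero coefficient at the query α forces the answer κ, and otherwise the answer is uniform.

```python
# Sketch of the simulator step in Claim 4.8 (our illustration; the column
# labels and dict representation are our conventions). Each row z of Z is a
# linear constraint on the local view: a row with z(alpha) != 0 forces the
# value at alpha, and otherwise the value is uniform in F_p.

import random

def simulate_query(Z, known, alpha, p):
    """Z: list of dicts column->coeff; known: fixed values on R and supp(T)."""
    for z in Z:
        if z.get(alpha, 0) % p:
            # the constraint z forces the answer kappa at alpha
            s = sum(c * known[x] for x, c in z.items() if x != alpha) % p
            return (-s) * pow(z[alpha], p - 2, p) % p
    return random.randrange(p)   # alpha is unconstrained given the view
```

For instance, under the single constraint v_a + v_b + 2·v_α = 0 over F_5, with v_a = 1 and v_b = 2 fixed, the query α is forced to the value 1.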

Constraint location for random low-degree extensions
In this section we construct an efficient constraint locator for random low-degree extensions of a given function. In Section 5.1, we construct an efficient constraint detector (cf. Definition 4.7) for polynomials which are zero on a given product set S, i.e., for the code Z_S(RM[F, m, d]). In Section 5.2, we use this constraint detector to construct a "decision" constraint locator for random low-degree extensions, which accepts on inputs which are constrained and rejects on inputs which are unconstrained. In Section 5.3, we efficiently reduce the task of (search) constraint location to the decision version.
In order to state the main theorem of this section, we first define a randomised encoding that maps a function f to a random extension of specified degree.
Definition 5.1. Let ENC_{RM(d,S)} be the randomised encoding function that maps f : S → F, where

Note the restriction that |d
We prove the following theorem.
Theorem 5.2.There is an n-constraint locator for ENC RM( d,S) running in time poly(log |F|, m, max i d i , n).
As a corollary, we obtain a local simulator for ENC RM( d,S) : that is, we obtain an algorithm that can efficiently simulate a random low-degree extension of f given oracle access to f .

A constraint detector for random low-degree extensions of the zero function
We construct an efficient constraint detector for Z_S(RM[F, m, d]), where S is a product set and Z_S(·) denotes the subcode consisting of codewords which are zero on S. We shall employ the following result, which is proved in [BCFGRS17].
Theorem 5.4. There is a constraint detector CD_d for RM[F, m, d], running in time poly(log |F|, m, max_i d_i, n).

We prove the following.

Before we proceed, it will be very useful to write down an expression for an individual entry of G. For β ∈ I and ℓ ∈ [k], the (β, ℓ)-entry of G, which we will denote by G_ℓ(β), is given by the expression

where b_ℓ is the ℓ-th basis element computed in Step 3 and Z_{S_i}(X) := ∏_{s∈S_i}(X − s). We now proceed with the proof.

Correctness. We first show
Then, for some scalars a_1, . . ., a_k ∈ F and for each β ∈ I, we can write

where the second equality follows from Eq. 3. We also know that b_j(i, for some a_j ∈ F. Then substituting into the previous expression for Z yields

Efficiency. Construction 5.6 makes m calls to CD_{d_i}, which runs in time poly(log |F|, m, max_i d_i, n). In the remaining steps, the algorithm performs only Gaussian elimination and multiplication of matrices whose sizes are polynomial in |I| and m.

The decision problem
Before stating the main result of this section we define the notion of a constraint with respect to a linear code on a subset of its domain.
Definition 5.7. Let C ⊆ F^D be a linear code. A subset I ⊆ D is constrained with respect to C if there exists a nonzero vector z ∈ F^I such that, for every codeword w ∈ C, z · w|_I = 0 (equivalently, if there exists nonzero z ∈ C^⊥ with supp(z) ⊆ I); we refer to z as a constraint with respect to C on I. We say that I is unconstrained with respect to C if it is not constrained with respect to C.

We construct an algorithm CheckConstraints which, given as input a (large) product set S and a polynomial-size set I, efficiently determines whether I ∪ S is constrained with respect to the Reed-Muller code. To prove the correctness of Construction 5.9, we show an equivalence between constraint detection for Z_S(RM) and deciding whether I ∪ S is constrained with respect to RM. In order to do so, we shall require the following two properties of linear codes, whose proofs we defer to Appendix A.1 and Appendix A.2.
Claim 5.10.Let C ⊆ F D be a linear code and let I ⊆ D. Then I is unconstrained with respect to C if and only if for all x ∈ I there exists a codeword w x ∈ C satisfying (i) w x (x) = 1 and (ii) w x (y) = 0 for all y ∈ I \ {x}.
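As a concrete illustration of Definition 5.7 (our example, for the univariate Reed-Muller, i.e. Reed-Solomon, code of degree at most d over F_p): any set of d + 2 points is constrained, with an explicit dual vector z(x) = ∏_{s≠x}(x − s)^{−1}, while the Lagrange-interpolator witnesses of Claim 5.10 show that any d + 1 points are unconstrained.

```python
# Our illustration: for the degree-<=D Reed-Solomon code over F_P, a set I of
# D + 2 points carries the classical constraint z(x) = 1 / prod_{s != x}(x - s)
# (every degree-<=D polynomial w satisfies sum_x z(x) w(x) = 0 on I).

P, D = 13, 2          # field F_13, degree bound d = 2

def inv(a):
    return pow(a % P, P - 2, P)

def constraint(I):
    """z in (RS_D|_I)^perp for |I| = D + 2 points."""
    z = {}
    for x in I:
        prod = 1
        for s in I:
            if s != x:
                prod = prod * (x - s) % P
        z[x] = inv(prod)
    return z

I = [0, 1, 3, 7]      # |I| = D + 2, so I is constrained
z = constraint(I)
for coeffs in [(1, 0, 0), (2, 3, 0), (5, 1, 4)]:     # sample codewords
    w = lambda t, c=coeffs: (c[0] + c[1] * t + c[2] * t * t) % P
    assert sum(z[x] * w(x) for x in I) % P == 0      # z is a dual vector
```

The underlying fact is that the leading coefficient of the interpolant through |I| points is Σ_x w(x)/∏_{s≠x}(x − s), which vanishes for polynomials of degree at most |I| − 2.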
Lemma 5.11.Let C ⊆ F D be a linear code, and let I, S ⊆ D. Suppose that S is unconstrained with respect to C. Then I ∪ S is constrained with respect to C if and only if I \ S is constrained with respect to Z S (C).
As a corollary, we show an equivalence between constraints with respect to the general Reed-Muller code and the Reed-Muller code fixed to be zero on a product set.
Proof. The Lagrange interpolators L_{S,w} are codewords of RM[F, m, d] for each w ∈ S. These codewords satisfy L_{S,w}(w) = 1 and L_{S,w}(x) = 0 for all x ≠ w ∈ S, so by Claim 5.10, S is unconstrained with respect to RM[F, m, d].
Proof of Lemma 5.8.The correctness follows from Corollary 5.12 and the correctness of Construction 5.6.The runtime follows from the fact that Construction 5.6 runs in time poly(log |F|, m, max i d i , n).

Search-to-decision reduction
In this section we show a reduction from constraint location for random low-degree extensions (a search problem) to CheckConstraints. Before proving Theorem 5.2, we define what it means for a subset I of the domain D of a code to determine a point x ∈ D, and then prove some useful technical lemmas.

Definition 5.13. Let C ⊆ F^D be a linear code. We say that I ⊆ D determines x ∈ D with respect to C if x ∈ I or there exists a constraint z with respect to C on I ∪ {x} such that z(x) ≠ 0.
The following lemma is central to our reduction. It shows that points which are constrained with respect to the Reed-Muller code are determined with respect to another Reed-Muller code of lower degree. Thus, in our algorithm, it suffices to check the latter condition.

Proof. We proceed via the contrapositive: if some w ∈ S is not determined by I with respect to RM[F, m, d′], then I ∪ S is unconstrained with respect to RM[F, m, d].

Let w ∈ S be not determined by I with respect to RM[F, m, d′]. Then there exists p(X) ∈ F^{≤d′}[X_1, . . ., X_m] such that (i) p(w) = 1 and (ii) p(i) = 0 for all i ∈ I. Now define q(X) := p(X) · L_{S,w}(X), where L_{S,w} denotes the Lagrange polynomial. Note that q(X) satisfies: (i) q(w) = 1, (ii) q(x) = 0 for all x ∈ S \ {w}, and (iii) q(i) = 0 for all i ∈ I. By Claim 5.10,

We shall also need the following claim, which defines the notion of an interpolating set and gives an algorithm for reducing an arbitrary subdomain of a linear code to an interpolating set. We defer the proof of this claim to Appendix A.3.

Claim 5.15. Let C ⊆ F^D be a linear code which has an efficient constraint detector. Given a subdomain I ⊆ D, there exists an efficient algorithm which computes an interpolating set for I with respect to C: a set I′ ⊆ I such that (i) I′ is unconstrained with respect to C; and (ii) for every S ⊆ D which is unconstrained with respect to C and satisfies I ∩ S = ∅, if there exists a constraint z : I ∪ S → F such that z(s) ≠ 0 for some s ∈ S, then there exists a constraint z′ : I′ ∪ S → F such that z′(s) ≠ 0.
We will also require the following simple monotonicity property of CheckConstraints (see Construction 5.9).
We are now ready to prove the main result of this section.

Correctness. First, we show that if I ∪ S is unconstrained with respect to RM[F, m, d] then Reduction 1 outputs ⊥. In this case, as T ⊆ S, I ∪ T is unconstrained, so CD_d(I ∪ T) = ⊥, as required. Now we consider the case where

Lastly, CD_d(I ∪ T) outputs a basis for the space of constraints on I ∪ T, so the output is in the desired form.

Efficiency. We show that the reduction is efficient by bounding the number of calls made to CheckConstraints_{d′}. For i ∈ [m], we say a point (s_1, . . ., s_i) is i-accepting if

Note that by the correctness of CheckConstraints and the fact that I′ is unconstrained, this means that a point (s_1, . . ., s_i)

For i ∈ [m − 1], let a_i denote the number of points which are i-accepting. Now we count the number of calls our reduction makes to CheckConstraints in terms of a_1, . . ., a_{m−1}. Note that regardless of the value of I, Reduction 1 makes |S_1| calls to CheckConstraints_{d′} at the beginning. Subsequently, for each i ∈ [m − 1], Reduction 1 makes a_i·|S_{i+1}| calls to CheckConstraints_{d′}. Overall, the total number of calls that Reduction 1 makes to CheckConstraints_{d′} is:

Now we bound the size of a_i.

Lemma 5.17 implies that there exists a set G_I with |G_I| ≥ |S| − |I| such that for all g ∈ G_I there exists a polynomial p_g(X) ∈ F^{≤d′}[X_1, . . ., X_m] satisfying (i) p_g(x) = 0 for all x ∈ (I ∪ G_I) \ {g} and (ii) p_g(g) = 1. Thus, by Claim 5.10, at most |I| points in S are constrained by I with respect to RM[F, m, d′]. In other words,

Thus, as each call to CheckConstraints takes time poly(log |F|, m, max_i d_i, n), and |S_i| is bounded by a polynomial in d_i for each i, we have that t(n) = poly(log |F|, m, max_i d_i, n).
Finally, the fact that at most |I| many points in S are constrained implies that ℓ(n) = n.
Constraint location for subcube sums of random low-degree extensions

In this section we extend the efficient constraint locator of the previous section to support queries to subcube sums. We begin by defining the corresponding randomised encoding.
Definition 6.1.

We prove this theorem by giving, in Section 6.1, a local characterisation of ΣRM in terms of the plain Reed-Muller code. We give our construction based on this characterisation in Section 6.2.

Local characterisation of ΣRM
In this subsection we give a local characterisation of ΣRM in terms of the plain Reed-Muller code.We define the code ΣRM as follows.
Given a set X ⊆ F^≤m, we refer to the smallest A-closed set containing X as the A-closure of X; i.e., X̄ is the A-closure of X if X̄ is A-closed, X ⊆ X̄, and for all A-closed Y ⊇ X it holds that X̄ ⊆ Y.
Globally, the ΣRM code can be viewed as a collection of plain Reed-Muller codes on 1, 2, . . ., m variables, related by summation constraints. The following theorem asserts that this characterisation also holds locally: over any A-closed subdomain S, (ΣRM|_S)^⊥ is spanned by summation constraints {z_s}_{s∈S*} and the local constraints on a collection of Reed-Muller codes. For technical reasons, we show this with respect to both the "plain" ΣRM code and the subcode Z_A(ΣRM) of encodings of the zero word.

Theorem 6.5. Let A_1, . . ., A_m ⊆ F, A := A_1 × · · · × A_m, and let S be an A-closed set. Then for X ∈ {∅, A}, we have

We will use the above notation for X_i, S_i, S* and z_s(t) throughout the remainder of this section. We prove Theorem 6.5 by a dimensionality argument. In particular, we show that any basis for (Z_X(ΣRM[F, A, d])|_S)^⊥ which is in echelon form can be mapped to a set of linearly independent vectors in the span of summation constraints and Reed-Muller constraints.
The key technical tool used here is a "flattening lemma" (Lemma 6.8).This lemma provides a means to map any constraint z on subcube sums of m-variate polynomials to a new constraint z ′ on subcube sums of (m−1)-variate polynomials, while preserving the value of λ(z).Crucially, in the case where |λ(z)| = m−1, all dependence of z ′ on subcube sums vanishes, and we obtain a constraint with respect to the plain Reed-Muller code.
By repeated applications of the flattening lemma we map any constraint over subcube sums to a Reed-Muller constraint; we then show by a counting argument that these "flattened" constraints, along with the summation constraints, span the dual code.
Proof of Theorem 6.5. Define Z* := {z_s}_{s∈S*}, and

Fix some ordering < on S so that (i) if ℓ(s) < ℓ(t) then s < t, and (ii)

These vectors are defined as follows:

where R is as guaranteed by Claim 6.6 below. To show that the b′_i are linearly independent, it suffices to show that they are in echelon form with respect to <. This follows since λ(b′_i)

It remains to prove the following technical claim. This claim will be a consequence of Lemma 6.8 (the "flattening lemma"), which provides a way to transform a constraint on an m-variate ΣRM code into a related constraint on an (m − 1)-variate ΣRM code.

Definition 6.7. Let S ⊆ F^≤m and an element a ∈ A_i; we define the "flattening map" Q_{i,a} : (S → F) → (S \ S_i → F) as follows, for s ∈ S \ S_i:

and z(s) otherwise.
Proof. We will proceed by showing how a constraint on subcube sums of m-variate polynomials induces a constraint on subcube sums of (m − 1)-variate polynomials.

Define the set P_0 := {p ∈ F^{≤d}[X_1, . . ., X_m] : p(x) = 0 ∀x ∈ X}. Then for all polynomials p ∈ P_0 it holds that Σ_{s∈S} z(s)

Let a*_m ∈ A_m and define the set of (m − 1)-variate polynomials

where d′ := (d_1, . . ., d_{m−1}). Then for any q ∈ Q_0, the polynomial defined by q_{a*_m}(x_1, . . ., x_m) := q(x_1, . . ., x_{m−1}) · L_{A_m,a*_m}(x_m) satisfies q_{a*_m}(x) = 0 for all x ∈ X, so q_{a*_m} ∈ P_0. Thus, substituting q_{a*_m} into Equation 4 implies that for all polynomials q ∈ Q_0,

where A′ := A_1 × · · · × A_{m−1}. Recall that for points (s_1, . . ., s_{m−1}) ∉ S*, there does not exist s_m ∈ S_m such that (s_1, . . ., s_{m−1}, s_m) ∈ S. Thus we can rearrange this expression as:

Notice that the coefficients of Eq. 5 are precisely the values of Q_{m,a*_m} (Definition 6.7) at each point in S \ S_m. This demonstrates that for all z ∈ Z

Now we deal with the two cases of the lemma statement separately.
As a result, Equation 5 simplifies greatly and becomes:

Note that Equation 6 is a constraint with respect to the Reed-Muller code (Z

For the second part, assume λ(z) ∉ S*. By the definition of <, if s < λ(z), then it must be that s ∉

We now apply Lemma 6.8 to prove Claim 6.6, which will complete the proof of Theorem 6.5.
Proof of Claim 6.6. For each i ∈ {|λ(z)| + 1, . . ., m}, arbitrarily choose an element a_i ∈ A_i and, for notational convenience, denote by Q_i := Q_{i,a_i} the flattening map (Definition 6.7). For each z ∈ (Z

By part 1 of Lemma 6.8, we see that

Combining this with part 2 of Lemma 6.8, we see that

and, provided that λ(z) ∉ S*, that λ(T(z)) = λ(z).

An ℓ-constraint locator for ΣRM
In this section we construct an ℓ-constraint locator for ΣRM by combining the structure implied by Theorem 6.5 with the constraint locator constructed in Section 5. We will often write ΣRM := ΣRM[F, m, d] when F, m, d are clear from context.

Construction 6.9. An ℓ-constraint locator for ENC_{ΣRM(d,A)}. For each i ∈ [m], let CL_i be an ℓ_i-constraint locator for ENC_{RM((d_1,...,d_i),A_i)} (cf. Theorem 5.2). Input: a subdomain I ⊆ F^m.
CL_Σ(I):

Define Ẑ_i to be the natural embedding of Z_i into the space of matrices, where k_i is the number of rows of Z_i.
5. Construct the matrix
6. Construct the matrix Z* whose (s, γ)-th entry, for s ∈ (R ⊔ Î)* and γ ∈ R ⊔ Î, is defined by
7. Define Z to be the matrix obtained by vertically stacking Z′ on top of Z*.
8. Compute a basis B for the space {(r, β)

Our proof of correctness for Construction 6.9 consists of two main steps. The first is showing that ker(B) = ΣRM|_{R⊔I}, which we accomplish by appealing to Theorem 6.5. We then show that (r, β) ∈ ΣRM|_{R⊔I} if and only if for all messages m ∈ M such that m|_R = r, we have (m, β) ∈ ΣRM|_{Ā⊔I}.

Claim 6.10. Let I ⊆ F^m and let (R, B) := CL_Σ(I) (cf. Construction 6.9); then ker(B) = ΣRM[F, m, d]|_{R⊔I}.
To see that ker(Z) = ΣRM|_{R⊔Î}, consider the following. By construction, the rows of Z consist of trivial constraints over (Î ⊔ R)* and Reed-Muller constraints on the sets R_i ⊔ Î, padded with zeros so as to be supported on R ⊔ Î. That is,

By the correctness of each CL_i, we know that for all i ∈ [m], for all constraints z ∈ (RM[F, i, (d_1, . . .,

where the final equality follows from Theorem 6.5 and the fact that R ⊔ Î is A-closed. In particular, we have shown that Z is a parity-check matrix for ΣRM|_{R⊔Î}; in other words, ker(Z) = ΣRM|_{R⊔Î}.

Now we focus on the second part of our proof of correctness: showing that (r, β) ∈ ΣRM|_{R⊔I} if and only if for all messages m ∈ M such that m|_R = r, we have (m, β) ∈ ΣRM|_{Ā⊔I}. We accomplish this via the following property of linear codes, whose proof is deferred to Appendix A.5.

Claim 6.11. Let C ⊆ F^D be a linear code, and let

We prove that Z_A(ΣRM)|_Î = Z_R(ΣRM)|_Î, where Î, R are as above. To do this, we first show an analogous statement for "plain" RM, and then lift that result to ΣRM via Theorem 6.5.

Claim 6.12. Let w ∈ RM be such that w(r) = 0 for all r ∈ R. Then w|_S is a general element of Z_R(RM)|_S.

Proof. We will show that there exists w′ ∈ RM which agrees with w on S but is zero on A, i.e., w′|_S = w|_S and w′|_A = 0.
By the correctness of CL, all constraints on A ⊔ S with respect to RM are supported only on R ⊔ S; in particular, they have no support on the set A \ R. For α ∈ A \ R, define w_α : A ⊔ S → F by w_α(α) := 1 and w_α(β) := 0 for all β ∈ (A ⊔ S) \ {α}. Then, since w_α is nonzero only in A \ R, we have w_α ∈ RM|_{A⊔S}, and so w_α = w′_α|_{A⊔S} for some w′_α ∈ RM. Consider the codeword defined by

This choice of w′ satisfies w′ ∈ Z_A(RM) and w′|_S = w|_S. Thus w|_S ∈ Z_A(RM)|_S.
Corollary 6.13. For Î, R as in Construction 6.9,

so we show the reverse inclusion. For R_i, Z_i, Î_i as in Construction 6.9, we have that

The first equality is a consequence of Theorem 6.5. The second equality is a consequence of Claim 6.12.

Constraint location for ΣAntiSym
In this section we define the code AntiSym of antisymmetric functions, and provide a constraint locator for an encoding function related to its sum code ΣAntiSym.
The code ΣAntiSym is the sum code of AntiSym (see Definition 3.6).
The main theorem of this section is the following, which gives an efficient constraint locator for ΣAntiSym with polynomial locality.
where G is a codeword sampled uniformly at random from AntiSym [A].
In Section 7.1 we analyse the combinatorial structure of (ΣAntiSym|_I)^⊥, obtaining a bound on the "complexity" of constraints in terms of the size of I. In Section 7.2, we show how this bound leads to an efficient constraint locator for ENC_ΣAntiSym.

Properties of ΣAntiSym
We start by introducing some useful notation.
We associate with an element a ∈ Ā the set a × A >ℓ( a) ; we denote this set also by a.
Proof. For the "if" direction, note that w(x) + w(x_rev) = f(x) − f(x_rev) + f(x_rev) − f(x) = 0 for all x ∈ A. For the "only if" direction, let w ∈ AntiSym[A], and consider some strict total ordering < on A. Define f(x) := w(x) if x < x_rev, and f(x) := 0 otherwise.

Then for all x, f(x) − f(x_rev) = w(x): it equals w(x) directly if x < x_rev, and it equals −w(x_rev) = w(x) if x_rev < x.
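The two directions of this argument can be checked mechanically; a small sketch (our illustration, taking x_rev to be the coordinate-reversed tuple and working mod 7):

```python
# Our check of the characterisation above: w(x) = f(x) - f(x_rev) always
# satisfies w(x) + w(x_rev) = 0, and conversely f(x) := w(x) if x < x_rev
# (and 0 otherwise) recovers w. Conventions (x_rev = reversed tuple, F_7)
# are assumptions of ours.

from itertools import product

A = [(x, y) for x, y in product(range(3), repeat=2)]
rev = lambda x: x[::-1]

f = {x: (3 * x[0] + x[1]) % 7 for x in A}           # an arbitrary function
w = {x: (f[x] - f[rev(x)]) % 7 for x in A}          # its antisymmetrisation

assert all((w[x] + w[rev(x)]) % 7 == 0 for x in A)  # "if" direction

g = {x: w[x] if x < rev(x) else 0 for x in A}       # "only if" direction
assert all((g[x] - g[rev(x)]) % 7 == w[x] for x in A)
```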
We say that H is symmetric if ∪H = ∪H_rev, where ∪H := ∪_{h∈H} h. We say that H is minimal symmetric if H is symmetric and no nonempty proper subset H′ ⊊ H is symmetric.
Proposition 7.6. If H ⊆ Ā is symmetric and prefix-free, then H = ∪_{i=1}^k H_i for some H_1, . . ., H_k ⊆ H, where each H_i is minimal symmetric and H_i ∩ H_j = ∅ for i ≠ j.
Proof. We proceed by induction on the size of H. Suppose that the statement is true for all sets H′ ⊆ Ā of size less than n, and let H ⊆ Ā be a set of size n. If H is minimal symmetric, then the statement holds trivially, so suppose not. Then there is a nonempty symmetric H′ ⊊ H, and H \ H′ is also symmetric. Since H′ and H \ H′ are strictly smaller than H, we can apply the inductive hypothesis to complete the proof.
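Proposition 7.6 can be illustrated by brute force (our conventions, which the excerpt leaves implicit: a prefix h stands for the subcube of strings in A^m extending it, and reversal acts elementwise on full strings): repeatedly peel off a smallest symmetric subset, which is necessarily minimal.

```python
# Brute-force illustration of Proposition 7.6 (conventions are assumptions
# of ours): decompose a symmetric prefix-free H into disjoint minimal
# symmetric subsets by always removing a smallest symmetric subset.

from itertools import product, combinations

A, m = (0, 1), 3
cube = lambda h: {h + t for t in product(A, repeat=m - len(h))}
union = lambda H: set().union(*map(cube, H))
rev = lambda X: {x[::-1] for x in X}
symmetric = lambda H: union(H) == rev(union(H))

def minimal_symmetric_parts(H):
    """Greedily peel off minimal symmetric subsets of a symmetric H."""
    H, parts = set(H), []
    while H:
        found = None
        for k in range(1, len(H) + 1):       # smallest symmetric subset first
            found = next((set(c) for c in combinations(sorted(H), k)
                          if symmetric(set(c))), None)
            if found:
                break
        if found is None:
            raise ValueError("H is not symmetric")
        parts.append(found)
        H -= found
    return parts
```

A smallest symmetric subset is automatically minimal (a symmetric proper subset would have been found earlier), and removing it leaves a symmetric remainder since prefix-freeness makes the subcubes disjoint.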
The following lemma shows that, for G prefix-free, (ΣAntiSym|_G)^⊥ has a basis of constraints of the form "Σ_{h∈H} w(h) = 0", taken over all minimal symmetric subsets H ⊆ G.
Lemma 7.7. Let G ⊆ Ā be prefix-free. For H ⊆ G, let 1_H : G → F be the indicator vector for H, i.e., 1_H(h) = 1 if h ∈ H and 0 otherwise. A basis for (ΣAntiSym|_G)^⊥ is given by B := {1_H : H ⊆ G is minimal symmetric}.

It follows that for all γ, ∪S_γ ⊆ ∪(S_γ)_rev; a similar argument establishes that in fact S_γ is symmetric. By Proposition 7.6, S_γ is a union of minimal symmetric sets. It follows that z ∈ span(B).
Next, we prove the main technical lemma of this section.
Lemma 7.8. Let H, G ⊆ Ā be prefix-free with |H| · |G| ≤ |A|/4, and suppose that ∪H = ∪G_rev. Then

Taking G = H, we obtain strong bounds on the support size of symmetric sets: if H is symmetric, then ∪H must have size either poly(|H|) or |A| − poly(|H|).
The first part of the lemma follows by recalling that | ∪ H| = N + t.The second part then follows since

Algorithms for ΣAntiSym
In this section, we present two useful algorithms for working with ΣAntiSym. The first (Lemma 7.11) takes as input an arbitrary set I ⊆ Ā and outputs a prefix-free set G ⊆ Ā of similar size with the property that any a ∈ I can be obtained as the union of sets in G. The second (Lemma 7.13) takes as input a prefix-free set G and outputs the set of minimal symmetric subsets of G (i.e., a basis for (ΣAntiSym|_G)^⊥).

1. For each a ∈ I, set Λ_a := {a}. Set G := I.
2. Let a* = (a_1, . . ., a_{ℓ*}) be a shortest element in G (i.e., ℓ* is minimal) for which N_G(a*) is nonempty. If there is no such term, output G.
3. Let a′
4. Remove a* from G.
Go to Step 2.
Clearly if the algorithm terminates then the correctness condition is satisfied, and so it remains to bound the number of iterations.Denote by ∆ i the value of | ∪ a∈G N G ( a)| at the beginning of the i-th iteration.
Clearly ∆ 1 cannot be larger than |I|, and if ∆ t = 0 then the algorithm terminates at the beginning of the t-th iteration.We show that ∆ is a progress measure for the above algorithm.
Proof. Let a′ be the term chosen in Step 3 of the i-th iteration. We show that by the end of the i-th iteration, a′ has been removed from ∪_{a∈G} N_G(a), and no element has been added. By choice of a′, we have a′ ∈ N_G(a*), and a′ ∉ N_G(a″) for any a″ ≠ a*, since that would mean a′ ⊂ a″ ⊂ a*. Moreover, for every element α_j added to G in Step 5: a′ ∉ N_G(α_j), since α_j is disjoint from a′ by construction; N_G(α_j) ⊂ N_G(a), since α_j ⊂ a; and α_j ∉ ∪_{a∈G} N_G(a), by choice of a* and a′.
It follows that the number of iterations is at most |I|; since each iteration clearly runs in polynomial time, the algorithm runs in polynomial time. The bound on |G| is obtained by noting that, in each iteration, the size of G increases by at most m − 1.

Lemma 7.13. There is a polynomial-time algorithm SymSets which, given as input a prefix-free set G ⊆ Ā, outputs the set H of all minimal symmetric subsets H of G.
Proof. The algorithm SymSets operates as follows: … this can also be achieved in polynomial time, since H is prefix-free. The correctness of the algorithm is a consequence of the following claim.

Definition 8.5 (Perfect-Zero-Knowledge PCP). We say that a PCP system (P, V) for a language L is perfect zero-knowledge (a PZK-PCP) if there exists an (expected) PPT algorithm Sim (the simulator) such that, for every (possibly malicious) adaptive polynomial-time verifier V* and for every x ∈ L, Sim^{V*}(x) is distributed identically to View_{V*,P}(x).

Our Construction
We define the following randomised encoding function.
For the reverse implication, let z ∈ F^I be a constraint satisfying z · w|_I = 0 for all w ∈ C. We will show that z is identically zero. By assumption, for each x ∈ I there exists a codeword w_x ∈ C satisfying w_x(x) = 1 and w_x(y) = 0 for all y ∈ I \ {x}. Substituting w_x into our constraint equation for each x ∈ I gives z(x) = z · w_x|_I = 0 for each x ∈ I; that is, z is identically zero. In other words, I is unconstrained with respect to C.
A.2 Proof of Lemma 5.11

Lemma 5.11. Let C ⊆ F^D be a linear code, and let I, S ⊆ D. Suppose that S is unconstrained with respect to C. Then I ∪ S is constrained with respect to C if and only if I \ S is constrained with respect to Z_S(C).
Proof. First we show the forward implication. If I ∪ S is constrained with respect to C, then there exists a non-zero constraint z ∈ F^{I∪S} such that for every w ∈ C we have ∑_{x∈I∪S} z(x)w(x) = 0. As Z_S(C) is a subcode of C, every w′ ∈ Z_S(C) satisfies ∑_{x∈I∪S} z(x)w′(x) = ∑_{x∈I} z(x)w′(x) = 0, where we have used the fact that w′(x) = 0 for all x ∈ S. Lastly, there must exist x ∈ I \ S for which z(x) ≠ 0, for if not, then as z is non-zero, z|_S is a non-zero constraint on S, contradicting the hypothesis. Hence, I \ S is constrained with respect to Z_S(C).
For the reverse implication, we employ the contrapositive: if I ∪ S is unconstrained with respect to C, then I \ S is unconstrained with respect to Z_S(C). If I ∪ S is unconstrained with respect to C then, by the forward implication of Claim 5.10, for each x ∈ I ∪ S there exists w_x ∈ C satisfying (i) w_x(x) = 1; and (ii) w_x(y) = 0 for all y ∈ (I ∪ S) \ {x}. In particular, for every x ∈ I \ S we have w_x(y) = 0 for all y ∈ S, so w_x ∈ Z_S(C) for all x ∈ I \ S. Then the reverse implication of Claim 5.10 implies that I \ S is unconstrained with respect to Z_S(C).
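Lemma 5.11 can be checked mechanically on small codes. The following Python sketch works over GF(2) with explicit basis matrices; all function names are our own, and `vanishing_subcode` computes a basis of Z_S(C) by solving for coefficient vectors whose codewords vanish on S.

```python
def rank_gf2(mat):
    """Rank of a 0/1 matrix over GF(2)."""
    mat = [row[:] for row in mat]
    rank, ncols = 0, (len(mat[0]) if mat else 0)
    for col in range(ncols):
        piv = next((i for i in range(rank, len(mat)) if mat[i][col]), None)
        if piv is None:
            continue
        mat[rank], mat[piv] = mat[piv], mat[rank]
        for i in range(len(mat)):
            if i != rank and mat[i][col]:
                mat[i] = [a ^ b for a, b in zip(mat[i], mat[rank])]
        rank += 1
    return rank

def constrained(basis, I):
    """Is I constrained w.r.t. the code spanned by `basis`, i.e. C|_I != F^I?"""
    if not I:
        return False
    return rank_gf2([[w[c] for c in I] for w in basis]) < len(I)

def vanishing_subcode(basis, S):
    """Basis of Z_S(C): the subcode of codewords vanishing on S."""
    k = len(basis)
    # RREF of the system (sum_i x_i * basis[i][s] = 0 for each s in S) in the x_i
    rows = [[basis[i][s] for i in range(k)] for s in S]
    pivots, r = [], 0
    for col in range(k):
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    sub = []
    for f in (c for c in range(k) if c not in pivots):
        x = [0] * k
        x[f] = 1
        for row, pcol in zip(rows, pivots):
            x[pcol] = row[f]
        w = [0] * len(basis[0])
        for i, xi in enumerate(x):
            if xi:
                w = [a ^ b for a, b in zip(w, basis[i])]
        sub.append(w)
    return sub
```

Randomly sampling codes, unconstrained sets S and arbitrary sets I, one can then confirm that I ∪ S is constrained with respect to C exactly when I \ S is constrained with respect to Z_S(C).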
A.3 Proof of Claim 5.15

Claim 5.15. Let C ⊆ F^D be a linear code which has an efficient constraint detector. Given a subdomain I ⊆ D, there exists an efficient algorithm which computes an interpolating set for I with respect to C: a set I′ ⊆ I such that (i) I′ is unconstrained with respect to C; and (ii) for every S ⊆ D which is unconstrained with respect to C and satisfies I ∩ S = ∅, if there exists a constraint z : I ∪ S → F such that z(s) ≠ 0 for some s ∈ S, then there exists a constraint z′ : I′ ∪ S → F such that z′(s) ≠ 0.
First we give the construction, as follows.
1. Compute Z := CD(I).
2. Perform Gaussian elimination on Z to obtain a matrix Z′ which is in reduced row echelon form.
3. Output the set I′ := {x ∈ I : x is a free variable in Z′}.
The efficiency is clear by construction. To see that this construction is correct, we will show that I′ is unconstrained and that each x ∈ I \ I′ is determined by I′. We then show that the second property in Claim 5.15 is a consequence of these two properties.
To see that I′ is unconstrained, note that, as each element x ∈ I′ is a free variable, for each choice of values u ∈ F^{I′} for the variables in I′ there is some setting v ∈ F^{I\I′} of the remaining leading variables in I \ I′ such that (u, v) ∈ ker(Z′). By the correctness of CD, (u, v) ∈ ker(Z′) if and only if (u, v) ∈ C|_I. We have just demonstrated that C|_{I′} = F^{I′}; in other words, I′ is unconstrained.
For the second part, as each x ∈ I \ I′ is a leading variable, the row of Z′ in which it is leading forms a constraint z : I → F with z(x) = 1 and z(x′) = 0 for all x′ ∈ (I \ I′) \ {x} (as Z′ is in reduced row echelon form). Thus the restriction of this constraint to I′ ∪ {x} is a constraint z : I′ ∪ {x} → F with z(x) = 1, so x is determined by I′.
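In code, the free-variable extraction above amounts to reading off the non-pivot columns of the reduced row echelon form. A minimal GF(2) sketch (our own notation; `Z` plays the role of the constraint matrix output by CD):

```python
def rref_gf2(mat, ncols):
    """Reduced row echelon form over GF(2); returns (nonzero rows, pivot columns)."""
    m = [row[:] for row in mat]
    pivots, r = [], 0
    for col in range(ncols):
        piv = next((i for i in range(r, len(m)) if m[i][col]), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][col]:
                m[i] = [a ^ b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    return m[:r], pivots

def interpolating_set(Z, ncols):
    """I' := the free (non-pivot) variables of Z, as in Step 3 of the construction."""
    _, pivots = rref_gf2(Z, ncols)
    return [c for c in range(ncols) if c not in pivots]

def kernel_basis(Z, ncols):
    """One kernel vector per free variable, by back-substitution from the RREF."""
    rows, pivots = rref_gf2(Z, ncols)
    basis = []
    for f in (c for c in range(ncols) if c not in pivots):
        v = [0] * ncols
        v[f] = 1
        for row, p in zip(rows, pivots):
            v[p] = row[f]
        basis.append(v)
    return basis
```

Since each kernel-basis vector restricted to I′ is a distinct standard basis vector, ker(Z) projects onto all of F^{I′}: exactly the statement that I′ is unconstrained.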
For the third part, suppose there exists a constraint z : I ∪ S → F with respect to C such that z(s) ≠ 0 for some s ∈ S. Denote I* := {x ∈ I \ I′ : z(x) ≠ 0}. We may assume that I* is non-empty, for if it were not then the restriction of z to I′ ∪ S would suffice. Then we can write … Observe that z(s) = z′(s) for all s ∈ S.
A.4 Proof of Lemma 5.17

for some scalars a_w ∈ F. Then the constraints p(x) = 0 for all x ∈ I can be written as an |I| × |S| matrix A of linear equations over F, relating the |S| many coefficients a_w. By applying Gaussian elimination we can uniquely convert A to a matrix B which is in reduced row echelon form. As rank(B) = rank(A) ≤ |I|, B has at most |I| many "leading ones". Set G_I to be the set of all w ∈ S such that there is no leading one in the w-th column of B; i.e., for each g ∈ G_I, the coefficient a_g is a free variable with respect to the linear system. Thus, in particular, for any g ∈ G_I we can set a_g = 1 and a_{g′} = 0 for all g′ ∈ G_I \ {g} and obtain a solution to the linear system, yielding the result.
A.5 Proof of Claim 6.11

Proof. The reverse implication is straightforward: if (v, x) ∈ C|_{V⊔I}, then (v|_U, x) ∈ C|_{U⊔I} by taking the restriction to U ⊔ I.
For the forward implication, let (u, x) ∈ C|_{U⊔I}. As U ⊆ V, there exists v ∈ C|_V such that v|_U = u and (v, x) ∈ C|_{V⊔I}. Let v′ ∈ C|_V be an arbitrary message which agrees with v when restricted to U, that is, v′|_U = u. Then for some x′ ∈ C|_I we have (v′, x′) ∈ C|_{V⊔I}. Restricting this vector to U ⊔ I, we have (u, x′) ∈ C|_{U⊔I}. By linearity of C|_{U⊔I}, we have that (u, x) − (u, x′) = (0_U, x − x′) ∈ C|_{U⊔I}, where 0_U ∈ C|_U is the zero codeword. If we can show that (0_V, x − x′) ∈ C|_{V⊔I}, where 0_V ∈ C|_V is the zero codeword, we will be done. Clearly x − x′ ∈ Z_U(C)|_I, and by Corollary 6.13 we have that Z_U(C) = Z_V(C), so x − x′ ∈ Z_V(C)|_I. In other words, (0_V, x − x′) ∈ C|_{V⊔I}. Hence, by linearity again, (v′, x′) + (0_V, x − x′) = (v′, x) ∈ C|_{V⊔I}, as required.
For d ∈ N, we write F^{≤d}[X_1, …, X_m] for the ring of polynomials in m variables over F of individual degree at most d. For d = (d_1, …, d_m) ∈ N^m, we use F^{≤d}[X_1, …, X_m] to denote the ring of polynomials in m variables over F of degree at most d_i in X_i for each i ∈ [m].

Definition 3.2. Given a function w : D → F and a subset I ⊆ D, we denote by w|_I the restriction of w to I, that is, the function w|_I : I → F such that w|_I(x) = w(x) for all x ∈ I.

Definition 3.3. Given a linear code C ⊆ F^D and a subset I ⊆ D, we denote by C|_I the code consisting of codewords from C restricted to I, that is, C|_I := {w|_I : w ∈ C}.

Construction 4.9. A local simulator S for a linear randomised encoding function ENC : F^S → F^D, given an ℓ-constraint locator CL for ENC.

S^m_ENC(T, α):
1. Set (R, Z) := CL(supp(T) ∪ {α}).
2. For each γ ∈ R, query the message and set m_γ := m(γ).
3. If there exists a row z ∈ F^{R ∪ supp(T) ∪ {α}} of the matrix Z with z(α) ≠ 0, output β := −(1/z(α)) · (∑_{x∈supp(T)} y_x z(x) + ∑_{x∈R} m_x z(x)), where (x, y_x) ∈ T.
4. Otherwise, output β ← F.

Proof. First we introduce some notation. Denote p_β := Pr[ENC(m)_α = β | ENC(m)_x = y for all (x, y) ∈ T]. Denote I := supp(T) ∪ {α}, and denote κ := −(1/z(α)) …

Claim 4.10. Let ENC_in : F^S → F^{D_in} and ENC_out : F^{D_in} → F^{D_out} admit ℓ_in-constraint location and ℓ_out-constraint location respectively. Then ENC_out ∘ ENC_in(m; μ_in, μ_out) := ENC_out(ENC_in(m; μ_in); μ_out) admits (ℓ_in · ℓ_out)-constraint location.

Construction 4.11. A constraint locator CL* for ENC*, which receives oracle access to the message m and is given constraint locators CL_in and CL_out for ENC_in and ENC_out respectively. It receives as input a subdomain I ⊆ D_out.
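To make Step 3 of Construction 4.9 concrete, here is a Python sketch of the simulator's answer rule over a prime field F_p, exercised on a toy one-time-pad-style encoding ENC(m; μ) = (μ, m + μ). The dictionary representation of constraints and the toy constraint-locator output are illustrative assumptions, not the paper's formalism.

```python
import random

def simulate_query(T, alpha, R, Z, query_message, p):
    """Answer rule of the local simulator (Construction 4.9) over F_p.

    T: dict {position: simulated value} of answers given so far.
    (R, Z): (assumed) constraint-locator output on supp(T) plus {alpha};
            each row of Z is a constraint, given as a dict {position: coefficient}.
    query_message: oracle access to the message.
    """
    m_vals = {gamma: query_message(gamma) for gamma in R}
    for z in Z:
        if z.get(alpha, 0) % p != 0:
            s = sum(T[x] * z[x] for x in T if x in z)
            s += sum(m_vals[x] * z[x] for x in R if x in z)
            return (-pow(z[alpha], -1, p) * s) % p   # beta is determined by the constraint
    return random.randrange(p)                        # no constraint involves alpha: uniform

# Toy check over F_7: ENC(m; mu) = (c0, c1) with c0 = mu and c1 = m + mu,
# so c1 - c0 - m = 0, i.e. the constraint {c1: 1, c0: -1, m: -1} (mod 7).
m_secret = 3
oracle = lambda pos: {"m": m_secret}[pos]
beta = simulate_query({"c0": 2}, "c1", ["m"], [{"c1": 1, "c0": 6, "m": 6}], oracle, 7)
assert beta == (m_secret + 2) % 7   # matches the real encoding: c1 = m + c0
```

When no constraint mentions α (e.g. the very first query), the simulator answers with a uniform field element, matching the unconditioned marginal of the masked encoding.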
to a random degree-d extension of f, i.e., a uniformly random element of LD^d[f].

Lemma 5.5. Construction 5.6 is a constraint detector for Z_S(RM[F, m, d]), running in time poly(log |F|, m, max_i d_i, n).

Construction 5.6. A constraint detector for Z_S(RM[F, m, d]), where S := S_1 × ⋯ × S_m, given a constraint detector CD_d for RM[F, m, d]. Receives as input a subdomain I ⊆ F^m.

CD_{Z_S(RM[F,m,d])}(I):
1. For each i ∈ [m], compute Z_i := CD_{d_i}(I), where d_i := (d_1, …, d_i − |S_i|, …, d_m).
2. Define the block-diagonal matrix Z′ with blocks Z_1, …, Z_m.
3. Using Gaussian elimination, compute a basis b_1, …, b_k for ker(Z′), and define B to be the k × |I|·m matrix whose i-th row is b_i.
4. Define the |I|·m × |I| matrix A_I to be the matrix whose ((α, i), γ)-entry is Z_{S_i}(α_i) if α = γ and 0 otherwise (where Z_{S_i}(x) := ∏_{s∈S_i}(x − s)).
5. Compute G := (B·A_I)^T, an |I| × k matrix, which is a generator matrix for Z_S(RM[F, m, d])|_I.
6. Using G, compute a parity-check matrix H for Z_S(RM[F, m, d])|_I, and output H.

Proof. In order to prove that Construction 5.6 is a constraint detector for Z_S(RM[F, m, d]), it suffices to show that G := (B·A_I)^T is a valid generator matrix for Z_S(RM[F, m, d])|_I; i.e., Im(G) = Z_S(RM[F, m, d])|_I. We show that Im(G) ⊆ Z_S(RM[F, m, d])|_I using the fact that ker(Z′) = ∏_{i=1}^m RM[F, m, d_i]|_I (Theorem 5.4) and some linear algebra. We prove the reverse inclusion via the combinatorial nullstellensatz (see Lemma 3.13).

Lemma 5.8. Let S := S_1 × ⋯ × S_m, I ⊆ F^m and d = (d_1, …, d_m) with d_i ≥ |S_i| for all i ∈ [m]. CheckConstraints_d(I, S) outputs yes if and only if I ∪ S is constrained with respect to RM[F, m, d], and runs in time poly(log |F|, m, max_i d_i, |I|).

Construction 5.9. An algorithm which efficiently determines whether a set I ∪ S is constrained with respect to the Reed-Muller code, given an efficient constraint detector CD_{Z_S(RM[F,m,d])} for Z_S(RM[F, m, d]).

CheckConstraints_d(I, S):
1. Compute H := CD_{Z_S(RM[F,m,d])}(I \ S) (see Construction 5.6).
2. If H = 0, output no; otherwise output yes.
and denote S := S_1 × ⋯ × S_m. Then I ∪ S is constrained with respect to RM[F, m, d] if and only if I \ S is constrained with respect to Z_S(RM[F, m, d]).

Proof. By Lemma 5.11, it suffices to show that S is unconstrained with respect to RM[F, m, d]. Consider, for each w ∈ S, the Lagrange polynomial L_w
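This equivalence can be sanity-checked in the univariate case (m = 1), where RM[F, 1, d] is a Reed-Solomon code and Z_S amounts to multiplication by ∏_{s∈S}(x − s). A self-contained Python check over F_13 (all helper names are ours):

```python
def eval_poly(coeffs, x, p):
    """Evaluate a polynomial (low-order coefficients first) at x, mod p."""
    y = 0
    for c in reversed(coeffs):
        y = (y * x + c) % p
    return y

def rank_modp(mat, p):
    mat = [row[:] for row in mat]
    rank, ncols = 0, (len(mat[0]) if mat else 0)
    for col in range(ncols):
        piv = next((i for i in range(rank, len(mat)) if mat[i][col] % p), None)
        if piv is None:
            continue
        mat[rank], mat[piv] = mat[piv], mat[rank]
        inv = pow(mat[rank][col], -1, p)
        mat[rank] = [x * inv % p for x in mat[rank]]
        for i in range(len(mat)):
            if i != rank and mat[i][col] % p:
                c = mat[i][col]
                mat[i] = [(x - c * y) % p for x, y in zip(mat[i], mat[rank])]
        rank += 1
    return rank

def constrained(points, basis_polys, p):
    """Is the evaluation set `points` constrained w.r.t. span(basis_polys)?"""
    if not points:
        return False
    G = [[eval_poly(b, x, p) for x in points] for b in basis_polys]
    return rank_modp(G, p) < len(points)

def polymul(a, b, p):
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

p, d = 13, 4
mono = [[0] * i + [1] for i in range(d + 1)]   # basis 1, x, ..., x^d of RS = RM[F,1,d]
S = [1, 2]
zs = [1]
for s in S:
    zs = polymul(zs, [-s % p, 1], p)           # Z_S(x) = (x - 1)(x - 2)
zs_basis = [polymul(zs, m, p) for m in mono[:d + 1 - len(S)]]  # basis of Z_S(RS)
for I in ([3, 4, 5], [3, 4, 5, 6], [1, 3, 4, 5, 6], [3, 4, 5, 6, 7, 8]):
    lhs = constrained(sorted(set(I) | set(S)), mono, p)
    rhs = constrained(sorted(set(I) - set(S)), zs_basis, p)
    assert lhs == rhs   # I ∪ S constrained  iff  I \ S constrained w.r.t. Z_S
```

Here a set of evaluation points is unconstrained precisely when its Vandermonde-style evaluation matrix has full column rank, so the check reduces to two rank computations per test case.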
where T is the output of Reduction 1. By inspection of Reduction 1 and Claim 5.16, we see that s ∈ T if and only if CheckConstraints_{d′}(I′, {s_1} × ⋯ × {s_m}) = yes. If there exists a constraint z : I ∪ S → F with respect to RM[F, m, d] with z(s) ≠ 0, then by Claim 5.15 there exists a constraint z′ : I′ ∪ S → F with respect to RM[F, m, d] with z′(s) ≠ 0. Further, by Lemma 5.14, s is determined by I′ with respect to RM[F, m, d′].
by first choosing a random degree-d extension F̃ of F, and then outputting the word Σ_A[F̃] : F^{≤m} → F obtained by augmenting F̃ with partial sums over A (see Definition 3.6). Note that the input to ENC_{ΣRM(d,A)} includes partial sums of F; these are not necessary to define the encoding, but are essential to permit local simulation. The main theorem of this section is the following.

Theorem 6.2. There is a (t, ℓ)-constraint locator for ENC_{ΣRM(d,A)}, where ℓ(n) = nm(m(a + 1) + 1)^2 for a := max_i |A_i|, and t(n) = poly(log |F|, m, max_i d_i, n).
where (R, Z) := CL(S), for an ℓ-constraint locator CL for ENC_{RM(d,A)}.

Proof. As R ⊆ A, it is clear that Z_A(RM[F, m, d])|_S ⊆ Z_R(RM[F, m, d])|_S. We now argue the reverse containment. For brevity, we write RM for RM[F, m, d].

Theorem 7.2. Let A be as above. Let M := {F : A ∪ {⊥} → F | F(⊥) = ∑_{a∈A} F(a)}. Let ENC_ΣAntiSym : M → F^Ā be the following randomised encoding: … In particular, |a| = |A_{>ℓ(a)}| = ∏_{i=ℓ(a)+1}^{m} |A_i|. (See Definition 3.6.) It is straightforward to see that b ⊆ a if and only if a is a prefix of b. The following simple proposition gives a useful alternative characterisation of AntiSym.

Proposition 7.4. w ∈ AntiSym[A] if and only if there exists f
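The condition F(⊥) = ∑_{a∈A} F(a), and more generally the partial-sum structure underlying Definition 3.6, is easy to exercise: the value attached to a prefix a is the sum of F over the subcube of completions of a, and these values telescope level by level. A small Python check (the tuple representation of prefixes is our own):

```python
from itertools import product
import random

def partial_sum(table, prefix, A, p):
    """Sum of F over all points of A = A_1 x ... x A_m extending `prefix`, mod p."""
    rest = A[len(prefix):]
    return sum(table[prefix + tail] for tail in product(*rest)) % p

random.seed(2)
p = 11
A = [(0, 1), (0, 1, 2)]                           # A_1 x A_2
table = {pt: random.randrange(p) for pt in product(*A)}

# the empty prefix recovers the full sum, i.e. the value F assigns to ⊥
assert partial_sum(table, (), A, p) == sum(table.values()) % p

# telescoping: the value at a prefix is the sum of the values at its children
for v in A[0]:
    children = sum(partial_sum(table, (v, w), A, p) for w in A[1]) % p
    assert partial_sum(table, (v,), A, p) == children
```

The telescoping identity is exactly the consistency that the verifier's local checks on the augmented word rely on.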

Lemma 7.11. There is a polynomial-time algorithm PrefixFree which, given as input a set I ⊆ Ā, outputs a prefix-free set G of size at most |I| · m and a list (Λ_a ⊆ G)_{a∈I} such that for each a ∈ I, ∪Λ_a = a.

Proof. For a ∈ Ā and G ⊆ Ā, define N_G(a) := {b ∈ G : b ⊊ a}. The algorithm PrefixFree operates as follows.

3. Output the set {H ∈ C(Γ) : ∪H = ∪H^rev}.

It is straightforward to construct Γ and compute C(Γ) in polynomial time. To determine whether ∪H = ∪H^rev, it suffices to compute K := |(∪H) ∩ (∪H^rev)| and then check whether |∪H| = |∪H^rev| = K.

Theorem 8.8 (Low individual degree test [GS06; GR15]). Let m ∈ N and d ∈ N^m be such that ∑_i d_i < |F|/10, and let ε ∈ (0, 1/10) and δ ∈ [0, 1]. There exists an efficient test that, given oracle access to a function P : F^m → F, makes O(md · poly(1/ε) · log(1/δ)) queries to P, and:
• if P ∈ RM[F, m, d], then the test accepts with probability 1;
• if P is ε-far from RM[F, m, d], then the test accepts with probability at most δ.

Construction 8.9. A PZK-PCPP for sumcheck. Both parties receive the common input (F, m, d, H, γ) and oracle access to the evaluation table of F : F^m → F. The proof proceeds as follows.

Proof:
1. Sample a polynomial Q ← F^{≤d}[X_1, …, X_m] uniformly at random. Compute the full evaluation table π_Q of Q.
2. For each i ∈ [m], sample T_i ← F^{≤d_i}[X_1, …, X_m], where d_i = (d, …, d − |H|, …, d) is the vector which takes the value d in every coordinate except for the i-th, which takes the value d − |H|.
∑_{x∈I∪S} z(x)w(x) = 0. As Z_S(C) is a subcode of C, we have that all w′ ∈ Z_S(C) satisfy ∑_{x∈I∪S} z(x)w′(x) = ∑_{x∈I} z(x)w′(x) = 0,

Construction A.1. An algorithm to compute the interpolating set for a linear code C, given a constraint detector CD for C. Receives as input a subdomain I ⊆ F^m.

Interpolate(I):
1. Compute Z := CD(I).

Claim 6.11. Let C ⊆ F^D be a linear code, and let U ⊆ V ⊆ D and I ⊆ D be such that Z_U(C)|_I = Z_V(C)|_I. Then (u, x) ∈ C|_{U⊔I} if and only if, for all v ∈ C|_V such that v|_U = u, (v, x) ∈ C|_{V⊔I}.
Query f(w) at all w ∈ W, sample a uniformly random polynomial P such that P(w) = f(w) for all w ∈ W, and output P|_S. Chen et al. prove that S_RM is correct, and moreover that it is query-efficient provided that d ≥ 2: the number of queries it makes to f (equal to |W|) is at most |S| (this follows from [AW09, Lemma 4.3]).
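In the univariate case the sampling step of S_RM is simple to realise: fix the queried values on W, pick the remaining d + 1 − |W| interpolation values uniformly, and interpolate. A Python sketch over F_p (names ours; the multivariate setting treated via [AW09] is more delicate):

```python
import random

def interpolate(points, p):
    """Lagrange interpolation over F_p; returns the polynomial as a callable."""
    xs = [x for x, _ in points]
    def poly(t):
        total = 0
        for xi, yi in points:
            num, den = 1, 1
            for xj in xs:
                if xj != xi:
                    num = num * (t - xj) % p
                    den = den * (xi - xj) % p
            total = (total + yi * num * pow(den, -1, p)) % p
        return total
    return poly

def random_consistent_poly(f_on_W, d, p):
    """Uniformly random polynomial of degree <= d agreeing with f on W.

    The values at any d + 1 distinct points determine the polynomial
    bijectively, so fixing the values on W and randomising the rest samples
    the correct conditional distribution."""
    W = list(f_on_W)
    assert len(W) <= d + 1 <= p
    extra = [x for x in range(p) if x not in f_on_W][: d + 1 - len(W)]
    points = [(x, f_on_W[x]) for x in W] + [(x, random.randrange(p)) for x in extra]
    return interpolate(points, p)
```

For example, with p = 7, d = 3 and W = {1 ↦ 4, 2 ↦ 0}, every sample satisfies P(1) = 4 and P(2) = 0, and agrees everywhere with the degree-≤3 interpolation through any d + 1 of its own values.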