Improved Stabilizer Estimation via Bell Difference Sampling

We study the complexity of learning quantum states in various models with respect to the stabilizer formalism and obtain the following results: - We prove that $\Omega(n)$ $T$-gates are necessary for any Clifford+$T$ circuit to prepare computationally pseudorandom quantum states, an exponential improvement over the previously known bound. This bound is asymptotically tight if linear-time quantum-secure pseudorandom functions exist. - Given an $n$-qubit pure quantum state $|\psi\rangle$ that has fidelity at least $\tau$ with some stabilizer state, we give an algorithm that outputs a succinct description of a stabilizer state that witnesses fidelity at least $\tau - \varepsilon$. The algorithm uses $O(n/(\varepsilon^2\tau^4))$ samples and $\exp\left(O(n/\tau^4)\right) / \varepsilon^2$ time. In the regime of $\tau$ constant, this algorithm estimates stabilizer fidelity substantially faster than the na\"ive $\exp(O(n^2))$-time brute-force algorithm over all stabilizer states. - In the special case of $\tau>\cos^2(\pi/8)$, we show that a modification of the above algorithm runs in polynomial time. - We exhibit a tolerant property testing algorithm for stabilizer states. The underlying algorithmic primitive in all of our results is Bell difference sampling. To prove our results, we establish and/or strengthen connections between Bell difference sampling, symplectic Fourier analysis, and graph theory.

• Given an n-qubit pure quantum state |ψ⟩ that has fidelity τ with some stabilizer state, we give an algorithm that outputs a succinct description of a stabilizer state that witnesses fidelity at least τ − ε.The algorithm uses O(n/(ε 2 τ 4 )) samples and exp O(n/τ 4 ) /ε 2 time.In the regime of τ constant, this algorithm estimates stabilizer fidelity substantially faster than the naïve exp(O(n 2 ))-time brute-force algorithm over all stabilizer states.
• In the special case of τ > cos 2 (π/8), we show that a modification of the above algorithm runs in polynomial time.
• We exhibit a tolerant property testing algorithm for stabilizer states.
The underlying algorithmic primitive in all of our results is Bell difference sampling.To prove our results, we establish and/or strengthen connections between Bell difference sampling, symplectic Fourier analysis, and graph theory.

Introduction
A central goal in quantum information is to understand which quantum states are efficiently learnable.While many quantum state learning algorithms are extremely efficient in sample complexity [Aar18,BO21,HKP20], fewer classes of time-efficiently-learnable quantum states are known.One such example is the class of stabilizer states, which are n-qubit states that are stabilized by a group of 2 n commuting Pauli matrices. 1Stabilizer states are well-studied because of their broad importance and widespread applications throughout quantum information, including in quantum error correction [Sho95,CS96,Got97], efficient classical simulation of quantum circuits [BSS16, BBC + 19], randomized benchmarking [KLR + 08], and measurement-based quantum computation [RB00], to name a few examples.
The first computationally efficient algorithm for learning a complete description of an unknown stabilizer state was given by Montanaro [Mon17]. 2 Given copies of a stabilizer state |ϕ⟩, Montanaro's algorithm utilizes the algebraic properties of Pauli matrices and Bell-basis measurements to efficiently learn the generators of the stabilizer group of |ϕ⟩, which suffices to determine |ϕ⟩.More specifically, Montanaro (implicitly) introduced Bell difference sampling, which, at a high level, is an algorithmic primitive that takes copies of some state and induces a measurement distribution on Pauli matrices.Bell difference sampling was studied more thoroughly in [GNW21] and has seen extended success in the development of algorithms for stabilizer states and states that are close to stabilizer states [Mon17, GNW21, LC22, GIKL23c, HK23].
In this work, we extend the use of Bell difference sampling to give faster, more general, and otherwise improved algorithms for learning properties of quantum states related to the stabilizer formalism.By understanding how these properties affect the Bell difference sampling distribution, we are able to find relevant certificates of these properties faster than the previous state-of-the-art.

Tight Pseudorandomness Bounds
Pseudorandom states are a quantum cryptographic primitive that have recently attracted much attention in quantum cryptography and complexity theory.They can be thought of as a quantum analogue of pseudorandom generators, with the main difference being that pseudorandom states mimic the Haar measure over n-qubit states, rather than the uniform distribution over n-bit strings.Formally, they are defined as follows: Definition 1.1 (Pseudorandom quantum states [JLS18]).A keyed family of n-qubit quantum states {|φ k ⟩} k∈{0,1} κ is pseudorandom if the following conditions hold: 1. (Efficient generation) There is a polynomial-time quantum algorithm G that generates |φ k ⟩ on input k (so, in particular, n ≤ poly(κ)).
Pseudorandom states suffice to build a wide range of cryptographic primitives, including quantum commitments, secure multiparty computation, one-time digital signatures, and more [JLS18, AQY22, MY22, BCKM21, GLSV21, HMY23].The language of pseudorandom states has also been found to play a key role in resolving some paradoxes at the heart of black hole physics [BFV20,Bra23].Finally, and perhaps most surprisingly, there is recent evidence to suggest that pseudorandom states can be constructed without assuming the existence of one-way functions [Kre21,KQST23].
Collectively, these results have motivated recent works that seek to characterize what computational properties or resources are required of pseudorandom states.For example, [ABF + 22] investigates the possibility of building pseudorandom quantum states with limited entanglement, and prove the existence of pseudorandom state ensembles with entanglement entropy substantially smaller than n, assuming the existence of quantum-secure one-way functions.
Analogously, Grewal, Iyer, Kretschmer, and Liang [GIKL23c] study quantum pseudorandomness from the perspective of stabilizer complexity.They treat the number of non-Clifford gates in a circuit as a resource, similar to size or depth.The main result of [GIKL23c] shows that states having fidelity at least 1 poly(n) with a stabilizer state cannot be computationally pseudorandom.As a consequence, they deduce that ω(log n) non-Clifford gates are necessary for a family of circuits to yield an ensemble of pseudorandom quantum states.
We give an exponential improvement on this lower bound:3 Theorem 1.2 (Informal version of Corollary 4.10).Any family of Clifford circuits that produces an ensemble of n-qubit computationally pseudorandom quantum states must use at least n/2 auxiliary non-Clifford single-qubit gates.
In the special case that the non-Clifford gates are all diagonal (e.g.T -gates), our lower bound improves to n.
Under plausible computational assumptions, Theorem 1.2 is tight up to constant factors.In particular, the existence of linear-time quantum-secure pseudorandom functions implies the existence of linear-time constructible pseudorandom states [BS19,GIKL23c], which of course have at most O(n) non-Clifford gates.Note that linear-time classically-secure pseudorandom functions are strongly believed to exist [IKOS08,FLY22], and it seems conceivable that these constructions remain secure against quantum adversaries.
We remark that Theorem 1.2 bears analogy to a recent result of Leone, Oliviero, Lloyd, and Hamma [LOLH22] that information scrambled by an n-qubit unitary implemented with Clifford gates and t < n T -gates can be efficiently unscrambled.In particular, both Theorem 1.2 and [LOLH22] establish different forms of non-pseudorandomness (for states and unitaries, respectively) in the same parameter regime of non-Cliffordness.

Faster Stabilizer State Approximation
As noted earlier, one of the prominent applications of stabilizer states is in classical simulation algorithms of quantum circuits.Such algorithms work by modeling the output state of a quantum circuit as a decomposition of stabilizer states (e.g., as a linear combination) [BBC + 19].The runtime of these algorithms then scale with respect to one of several measures of the "amount of non-stabilizerness" in this decomposition.These measures are sometimes called magic monotones [VMGE14, Definition 3] [GLG + 23, Definition 3], because they are non-increasing under Clifford operations.Typically, magic monotones increase exponentially as non-Clifford gates are applied. 4xamples of well-known magic monotones include the stabilizer rank, stabilizer extent, and inverse of stabilizer fidelity [BBC + 19].
A series of recent and simultaneous works have explored the question of whether magic monotones can be estimated efficiently, or whether states with low magic are efficiently learnable.For example, recall that [GIKL23c] showed that states with non-negligible stabilizer fidelity are weakly learnable, in the sense that they are efficiently distinguishable from random.[GIKL23a, GIKL23b, HG23, LOH23] proved that states with bounded stabilizer nullity are efficiently learnable, and [GIKL23a] also gave an efficient property tester for stabilizer nullity.[GLG + 23] showed that various magic monotones cannot be estimated efficiently in certain parameter regimes, by constructing states with low magic that are cryptographically indistinguishable from states with large magic.Finally, [ABDY23,AA23] raised the question of whether states of bounded stabilizer rank are efficiently learnable.
Our second result is a further contribution towards understanding the learnability of low-magic states: we give an algorithm that finds stabilizer state approximations of states with non-negligible stabilizer fidelity.As its name suggests, stabilizer fidelity (denoted F S (|ψ⟩)) measures how close a state |ψ⟩ is to a stabilizer state: it is simply the maximum of |⟨ϕ|ψ⟩| 2 over all stabilizer states |ϕ⟩.Hence, it is not hard to see that the inverse of stabilizer fidelity is a magic monotone.Assuming |ψ⟩ has stabilizer fidelity at least τ , our algorithm returns a stabilizer state that witnesses overlap at least F S (|ψ⟩) − ε with |ψ⟩.
To our knowledge, this is the first nontrivial algorithm to approximate an arbitrary quantum state with a stabilizer state.5Indeed, we are not aware of any prior algorithm better than a brute-force search over all stabilizer states, which takes 2 O(n 2 ) time and O(n 2 ) samples. 6Thus our algorithm offers a substantial improvement in the regime of τ = ω(n −1/4 ).Arguably, the most interesting setting of parameters is constant τ , in which case we have a quadratic improvement in sample complexity and a superpolynomial improvement in time complexity.
Observe that, because we output a witness of stabilizer fidelity at least τ − ε with high probability, assuming a state with fidelity τ exists, our algorithm can additionally be used as a subroutine to estimate stabilizer fidelity and, moreover, find a stabilizer state that witnesses this.More precisely, if the goal is to estimate stabilizer fidelity to accuracy ±ε, then one can break [0, 1] into intervals of width ε and perform a binary search procedure using our algorithm.Overall, this takes O(n/ε 6 ) samples and exp(O(n/ε 4 )) time.
As an application, our stabilizer state approximation algorithm could be used to search for better stabilizer decompositions of magic states.Recall that magic states are states that, when injected into Clifford circuits, allow for universal quantum computation [BK05].The best-known algorithms for simulating quantum circuits dominated by Clifford gates use decompositions of magic states into linear combinations of stabilizer states and have a runtime that scales polynomially in the complexity of the decomposition, either in terms of the stabilizer rank or stabilizer extent [BBC + 19].Hence, better stabilizer decompositions of magic states yield faster algorithms.These decompositions are often obtained by writing the tensor product of a small number of magic states (usually on the order of 10 qubits) as linear combination of a slightly larger number of stabilizer states [BSS16,Koc22].Therefore, if a classical simulation of our algorithm could be made practical for (say) n ≈ 15 qubits, there is reason to believe that running this algorithm on magic states, combined with a meta-algorithm such as matching pursuit [MZ94], could find better stabilizer decompositions of magic states and, as a result, improve the runtime of near-Clifford simulation.
Finally, we remark that the problem we solve is similar in spirit to the agnostic probably approximately correct (PAC) learning framework [Val84,KSS92].In the agnostic PAC model, a learner is given labeled training data {(x 1 , y 1 ), . . ., (x m , y m )} from some unknown distribution D, as well as some concept class C to choose a hypothesis from.The goal of the learner is to find a hypothesis function h ∈ C that approximates the best fit for the training data, even though no function in C will necessarily fit the training data perfectly.In an analogous fashion, our algorithm finds a stabilizer state |ϕ⟩ that approximates the best fit for |ψ⟩ over the set of stabilizer states, which need not contain |ψ⟩.We note that Aaronson studied PAC learning of quantum states in the so-called realizable setting [Aar07].However, agnostic PAC learning of quantum states has not yet appeared in the literature.

Bounded-Distance Stabilizer Approximation
Although our stabilizer state approximation algorithm significantly improves upon brute force, it still requires exponential time in general.One might wonder whether this exponential runtime is necessary.For example, is it possible that finding stabilizer state approximations is computationally hard, even for states whose distance to the nearest stabilizer state is bounded by some small constant?A priori, this might even be expected, because in other contexts, learning stabilizer states with a constant rate of noise can be as hard as the Learning Parities with Noise (LPN) problem [GL22, HIN + 23], which is believed to be hard.What if the stabilizer fidelity is large enough to guarantee the existence of a unique closest stabilizer state?Our third result shows that in this regime, a modification of the algorithm from Theorem 1.3 is computationally efficient.In particular, this modification works when the stabilizer fidelity is larger than cos 2 (π/8) ≈ 0.8536, which is precisely threshold above which |ψ⟩ is guaranteed to have a unique closest stabilizer state.

Tolerant Stabilizer Testing
Our final result is a tolerant property testing algorithm for stabilizer states.In the tolerant property testing model [PRR06], which generalizes ordinary property testing [RS96,GGR98], a tester must accept objects that are at most ε 1 -far from having some property ("completeness") and reject objects that are at least ε 2 -far from having that same property ("soundness") for 0 ≤ ε 1 < ε 2 ≤ 1.
The standard property testing model is recovered when ε 1 = 0, and the relaxed completeness condition generally makes tolerant testing a much harder problem.Nonetheless, the tolerant testing model is natural to consider in certain error models, such as in the presence of imprecise quantum gates.
Our result extends work by Gross, Nezami, and Walter [GNW21], who gave a property tester (hereafter, the "GNW algorithm") for stabilizer states.When combined with the prior work of [GIKL23c], we deduce the existence of a tolerant property testing algorithm for stabilizer states.Our algorithm takes copies of an n-qubit quantum state |ψ⟩ and decides whether |ψ⟩ has stabilizer fidelity at least α 1 or at most α 2 , promised that one of these is the case.Note that we have taken α 1 := 1 − ε 1 and α 2 := 1 − ε 2 for notational simplicity.
There is an algorithm that uses O(1/γ 2 ) copies of a quantum state |ψ⟩, O(n/γ 2 ) time, and decides whether |ψ⟩ has stabilizer fidelity at least α 1 or at most α 2 , promised that one of these is the case.
While our algorithm does not work for all settings of ε 1 and ε 2 -giving such an algorithm is an open problem-our algorithm does significantly improve over prior work.In Section 7.3, we compare the parameter regimes in which our algorithm works to the existing literature and show those regimes visually in Fig. 2.
We remark that the algorithm from Theorem 1.4 can also be modified into a tolerant property testing algorithm that works whenever α 1 > cos 2 (π/8), by simply estimating the fidelity |⟨ψ|ϕ⟩| 2 of the stabilizer state |ϕ⟩ output by the algorithm.The main advantages of Theorem 1.5 are its improved sample complexity and runtime: whereas Theorem 1.5 uses a system-size independent number of samples and linear time, Theorem 1.4 requires a linear number of samples and cubic time.Additionally, Theorem 1.5 operates in some parameter regimes where α 1 ≤ cos 2 (π/8); see Fig. 2.

Our Techniques
The unifying tool in our work is Bell difference sampling, a measurement primitive that has recently found applications in a variety of algorithms related to stabilizer states [Mon17,GNW21,GIKL23c].We defer a full definition of Bell difference sampling to Section 2.3, but note some of its important properties here.Bell difference sampling involves measuring pairs of qubits of |ψ⟩ ⊗2 in the Bell basis, repeating again with |ψ⟩ ⊗2 , and combining the measurements to interpret the result as corresponding to an n-qubit Pauli operator.Overall, this consumes four copies of |ψ⟩, though it only performs measurements across two copies of |ψ⟩ at a time.It will be most convenient to parameterize the sampled Pauli operators by strings in F 2n 2 , which we do as follows.For x = (a, b) ∈ F 2n 2 , where a and b are the first and last n bits of x, respectively, we define the Weyl operator W x as Importantly for us, the Weyl operators form an orthogonal basis for C 2 n ×2 n , and so they give rise to the Weyl expansion of a quantum state |ψ⟩ as For pure states, the squared coefficients in this expansion sum to 1, and therefore form a distribution over F 2n 2 .We denote this distribution by p ψ (x) := 2 −n ⟨ψ|W x |ψ⟩ 2 .7 Gross, Nezami, and Walter [GNW21] give an explicit form for the distribution obtained by performing Bell difference sampling.In particular, they showed that Bell difference sampling a quantum pure state |ψ⟩ is equivalent to sampling from the following distribution: i.e., the convolution of p ψ with itself.At a high level, we establish our results by proving some structure on q ψ and p ψ for certain quantum states.
Tight Pseudorandomness Bounds To prove our lower bound on the number of non-Clifford gates required to prepare pseudorandom states, we give an algorithm that distinguishes Haarrandom states from quantum states prepared by circuits with fewer than n/2 non-Clifford singlequbit gates.The key insight is that if |ψ⟩ is the output of such a circuit, then q ψ is concentrated on a proper subspace of F 2n 2 , whereas for Haar-random states, q ψ is anticoncentrated on all such subspaces with overwhelming probability over the Haar measure.Proving these properties of q ψ reveals a simple algorithm: draw a linear number of samples from q ψ and compute the number of linearly independent vectors in the sample.Haar-random states will have 2n such vectors with high probability and, otherwise, there will be strictly less than 2n such vectors.
Faster Stabilizer State Approximation Our algorithms for stabilizer approximation also rely on proving anticoncentration properties of q ψ .We begin by showing that if |ψ⟩ has large fidelity with some stabilizer state |ϕ⟩, then q ψ is well-supported on the n-dimensional subspace Weyl(|ϕ⟩) := {x ∈ F 2n 2 : W x |ϕ⟩ = ± |ϕ⟩} of Weyl operators that stabilize |ϕ⟩ (up to sign).Next, we establish that if |ϕ⟩ is the state that maximizes stabilizer fidelity, then the mass of q ψ on Weyl(|ϕ⟩) is not too concentrated on any proper subspace.Hence, by sampling from q ψ enough times, we can be guaranteed that with high probability, Weyl(|ϕ⟩) will be generated by some subset of the sampled Weyl operators.By iterating through all mutually commuting subsets of the sampled Weyl operators, we compile a list of candidate stabilizer states |ϕ⟩ that must contain the fidelity-maximizing |ϕ⟩.Therefore, our algorithm reduces to estimating the fidelity of |ψ⟩ with each candidate |ϕ⟩.We further improve the time efficiency via an algorithm for finding maximal cliques, due to [TTT06], by observing that the candidate subsets must correspond to maximal cliques in the graph of commutation relations. 8We also improve the sample complexity by using the classical shadows protocol [HKP20] to estimate all of the fidelities with candidate states |ϕ⟩ efficiently.For more details on these improvements, see Section 5.2.

Bounded-Distance Stabilizer Approximation
In the case where stabilizer fidelity is bounded below by cos 2 (π/8), we follow the same approach, but use a different and more efficient subroutine for determining which of the sampled Weyl operators generate Weyl(|ϕ⟩).In particular, we show that there is a simple statistical test for this purpose: if |⟨ϕ|ψ⟩| 2 > cos 2 (π/8), then for any x ∈ F 2n 2 , x ∈ Weyl(|ϕ⟩) if and only if ⟨ψ|W x |ψ⟩ 2 > 1 2 (Corollary 6.4).This allows us to eschew the maximal clique algorithm entirely, and we instead directly estimate ⟨ψ|W x |ψ⟩ 2 to determine whether W x belongs to Weyl(|ϕ⟩).We further improve upon the sample complexity of this subroutine by making use of an algorithm due to Huang, Kueng, and Preskill [HKP21] for estimating the expectation of m different Weyl operators from only O(log m) samples.We also provide a simpler proof of this result in Appendix A, based on the Fourier-analytic techniques described below.
Tolerant Stabilizer Testing Our last result, the tolerant property testing algorithm for stabilizer states, is based on running the GNW stabilizer testing algorithm [GNW21] repeatedly to estimate its acceptance probability.We prove this algorithm's correctness by combining existing bounds on the completeness and soundness of the GNW test in terms of stabilizer fidelity.The bound on completeness is due to [GIKL23c], while the soundness analysis comes from [GNW21].
Symplectic Fourier Analysis An essential tool for proving the above results is symplectic Fourier analysis, wherein the Fourier transform over real-valued functions is defined with respect to the symplectic product on F 2n 2 .To give a sense of the usefulness of symplectic Fourier analysis in our work, we showcase two powerful theorems whose proofs are symplectic-Fourier-analytic.In what follows, for a subspace T ⊆ F 2n 2 identified with a set of Weyl operators {W x : x ∈ T }, the subspace T ⊥ denotes the set of Weyl operators that commute with T .
Theorem 1.6 (Restatement of Theorem 3.1 and Theorem 3.2).Let T ⊆ F 2n 2 be a subspace, and let |ψ⟩ be an n-qubit quantum pure state.Then and In words, Theorem 1.6 shows that p ψ and q ψ exhibit a strong duality property with respect to the commutation relations among Weyl operators.In particular, the first part shows that the mass of p ψ on a subspace T of Weyl operators is directly proportional to the mass on the subspace T ⊥ of Weyl operators that commute with T .Theorem 1.6 is especially powerful when the subspace T is very large, because T and T ⊥ always have inversely proportional size (see Fact 2.6).Hence, using our duality theorems, we can convert summations over high-dimensional subspaces into summations over just a few terms.

Preliminaries
We introduce notation and background that is central to our work.We assume familiarity with common concepts in quantum information and computer science, such as the stabilizer formalism and basic graph theory.For more background on the stabilizer formalism, see, e.g., [Got97,NC02].
We write [n] := {1, . . ., n}.For x = (a, b) ∈ F 2n 2 , a and b always denote the first and last n bits of x, respectively.For a probability distribution D on a set S, we denote drawing a sample s ∈ S according to D by s ∼ D. We denote drawing a sample s ∈ S uniformly at random by s ∼ S. In an undirected graph G, a clique is a complete subgraph of G.A maximal clique is a clique that is not a proper subgraph of another clique.For quantum pure states |ψ⟩ , |ϕ⟩, let d tr (|ψ⟩ , |ϕ⟩) = 1 − |⟨ψ|ϕ⟩| 2 denote the trace distance and F (|ψ⟩ , |ϕ⟩) = |⟨ψ|ϕ⟩| 2 denote the fidelity.The trace distance quantifies the distinguishability between two quantum states by a two-outcome measurement.We also use the following Chernoff bound.
Fact 2.2 (Hoeffding's inequality).Suppose X 1 , . . ., X n are independent random variables subject to Then for all t ≥ 0 it holds that: The n-qubit Pauli group P n is the set {±1, ±i} × {I, X, Y, Z} ⊗n , where I, X, Y, Z are the standard Pauli matrices.We refer to unitary transformations in the Clifford group as Clifford circuits (equivalently, Clifford circuits are quantum circuits comprised only of Clifford gates, namely, the Hadamard, Phase, and CNOT gates).Clifford gates with the addition of any single-qubit non-Clifford gate form a universal gate set.The T -gate is often the non-Clifford gate of choice, where the T -gate is defined by T := |0⟩⟨0| + e iπ/4 |1⟩⟨1|.We denote the set of n-qubit stabilizer states by S n .One way to measure the "stabilizer complexity" of a quantum state is the stabilizer fidelity.

Symplectic Vector Spaces
We work extensively with F 2n 2 as a symplectic vector space by equipping it with the symplectic product.
Definition 2.4 (Symplectic product).For x, y ∈ F 2n 2 , we define the symplectic product as where all operations are performed over F 2 .
The symplectic product gives rise to the notion of a symplectic complement, much like the orthogonal complement for the standard inner product.

Definition 2.5 (Symplectic complement). Let T ⊆ F 2n
2 be a subspace.The symplectic complement of T , denoted by T ⊥ , is defined by We present the following useful facts about the symplectic complement, many of which are similar to that of the more familiar orthogonal complement.
Fact 2.6.Let S and T be subspaces of F 2n 2 .Then: 2 is isotropic when for all x, y ∈ T , [x, y] = 0.A subspace T ⊂ F 2n 2 is Lagrangian when T ⊥ = T .Lagrangian subspaces can equivalently be defined as isotropic subspaces with dimension n.

Symplectic Fourier Analysis
Our work uses symplectic Fourier analysis, which is similar to Boolean Fourier analysis (see e.g., [O'D14]), except the Fourier characters are defined with respect to the symplectic product.
Definition 2.7 (Symplectic Fourier transform).Let f : F 2n 2 → R. We define the symplectic Fourier transform of f , which is given by a function f : Hence, the symplectic Fourier expansion of f is The well-known Plancherel's Theorem holds under this Fourier transform.
Convolution plays an important role in our work.
Definition 2.9 (Convolution).Let f, g : Convolution corresponds to the multiplication of Fourier coefficients, even under the symplectic Fourier transform.
Proposition 2.10.Let f, g : Using this, we can expand and simplify: We prove a fact that will be useful in our symplectic Fourier analysis.
Lemma 2.11.For any subspace T ⊆ F 2n 2 and a fixed Proof.If x ∈ T ⊥ then this is easy to see.Suppose x ̸ ∈ T ⊥ .Then we claim [a, x] = 0 for exactly half of the elements a ∈ T .To see this, we observe that there exists a y ∈ T such that [y, x] = 1.Let T /y denote T modulo addition by y.Given a pair {a, a + y} ∈ T /y, observe that exactly one of [a, x] and [a + y, x] is 0 and the other is 1.As such we have that for half of all a ∈ T , [a, x] = 0 and for the other half, [a, x] = 1, giving us a∈T (−1) [a,x] = 0.

Weyl Operators and Bell Difference Sampling
For x = (a, b) ∈ F 2n 2 , the Weyl operator W x is defined as The stabilizer dimension of a stabilizer state is n, which is maximal, and, for most states, the stabilizer dimension is 0.
The Weyl operators collectively form an orthogonal basis for 2 n × 2 n matrices with respect to the inner product ⟨A, B⟩ = tr(A † B).This gives rise to the so-called Weyl expansion of a quantum state.
Definition 2.14 (Weyl expansion).Let |ψ⟩ ∈ C 2 n be an n-qubit quantum pure state.The Weyl expansion of |ψ⟩ is Squaring the c ψ (x)'s gives rise to a distribution over F 2n 2 and therefore over the Weyl operators (see Footnote 7 for a proof).We denote this distribution by p ψ (x) := c ψ (x) 2 and refer to it as the characteristic distribution.Note that, for all x, p ψ (x) ∈ [0, 2 −n ].A convenient fact about the p ψ is its invariance (up to scaling) under the symplectic Fourier transform.
Fact 2.15.For any n-qubit pure state |ψ⟩ and any x ∈ F 2n 2 , p ψ (x) = 2 n p ψ (x).For a proof of this fact, we refer the reader to [GNW21, Equation 3.5], noting our slight difference in normalization. 10 A significant algorithmic primitive in our work is Bell difference sampling 2 } forms an orthonormal basis of C 2 ⊗C 2 , which we call the Bell basis.Bell difference sampling an n-qubit state |ψ⟩ just means the following.First, take two copies of a pure state |ψ⟩.Take the first qubit in each copy and measure them in the Bell basis.Repeat this for each remaining pair of qubits.Let (a i , b i ) denote the two-bit measurement outcome from measuring the ith pair of qubits.Then, we denote the measurement outcome on the two copies by x = (a 1 , . . ., a n , b 1 , . . ., b n ) ∈ F 2n 2 .Repeat this once more with two fresh copies of |ψ⟩ to obtain a string y ∈ F 2n 2 .Finally, output x + y. 11 Historically, Bell difference sampling has found use in algorithms for stabilizer states.However, Gross, Nezami, and Walter proved that Bell difference sampling is meaningful for all quantum states. 9The stabilizer dimension is closely related to the stabilizer nullity [BCHK20] (in fact, for n-qubit states, the stabilizer dimension is simply n minus the stabilizer nullity).
10 Alternatively, one can refer to [GIKL23c, Proposition 17], where the normalization is consistent with this work, but [GIKL23c] uses the standard Fourier transform rather than the symplectic one.Despite this difference, the proof goes through in a similar way.
11 Even when |ψ⟩ is a stabilizer stabilizer state, measuring two copies of |ψ⟩ in the Bell basis returns x ∈ F 2n 2 with probability p ψ (x + a), where a ∈ F 2n 2 is an unwanted shift.Bell difference sampling essentially cancels out this unwanted shift a. See [Mon17,GNW21] for more detail.
Lemma 2.16 (Bell difference sampling, [GNW21, Theorem 3.2]).Let |ψ⟩ be an arbitrary n-qubit pure state.Bell difference sampling corresponds to drawing a sample from the following distribution: and uses four copies of |ψ⟩.We refer to q ψ (x) as the Weyl distribution.

On the Weyl and Characteristic Distributions
We prove identities related to the characteristic distribution p ψ and Weyl distribution q ψ that are critical for our results.We emphasize that these results hold for all pure quantum states.First, we show that the mass on a subspace T ⊆ F 2n 2 under p ψ is proportional to the mass on T ⊥ under p ψ .Theorem 3.1.Let T ⊆ F 2n 2 be a subspace.Then Proof.
A similar result is true for q ψ .In words, we show that the average probability mass on a subspace T under q ψ is equal to the squared-ℓ 2 -norm of the probability mass on T ⊥ under p ψ .Theorem 3.2.Let T ⊆ F 2n 2 be a subspace.Then Proof.

Quantum Circuits With Few Non-Clifford Gates
To begin, we show that the output state |ψ⟩ of a t-doped Clifford circuit, where t < n/2, induces a distribution q ψ that is supported over a subspace of dimension at most 2n − 2.
Lemma 4.2.Let |ψ⟩ be the output state of a t-doped Clifford circuit.Then the stabilizer dimension of |ψ⟩ is at least n − 2t.
Proof.We proceed by induction on t.In the base case t = 0, so |ψ⟩ is a stabilizer state and has stabilizer dimension n.
For the inductive step, let t > 0. Write |ψ⟩ = CU |φ⟩, where |φ⟩ is the output of a (t − 1)-doped Clifford circuit, U is a single-qubit gate, and C is a Clifford circuit.Because the stabilizer dimension is unchanged by Clifford gates, it suffices to show that the stabilizer dimension of U |φ⟩ is at least n − 2t.
Let S = Weyl(|φ⟩), which by the induction assumption has dimension at least n − 2(t − 1).Observe that for any x ∈ S, if the Weyl operator W x commutes with U , then: Hence, letting T := {x ∈ S : U W x U † = W x }, we see that the stabilizer dimension of U |φ⟩ is at least the dimension of T .But |T | ≥ |S|/4, because T contains all elements x of S for which W x restricts to the identity on the qubit to which U is applied.Thus, the stabilizer dimension of U |φ⟩ is at least n − 2t, as desired.
We remark that the stabilizer dimension lower bound in Lemma 4.2 can be improved to n − t in the case that all of the non-Clifford gates are diagonal (for example, if all of the non-Clifford gates are T -gates).This is because diagonal gates commute with both I and Z.

Anticoncentration of Haar-Random States
Now we show that if |ψ⟩ is Haar-random, then q ψ is well-supported over the entirety of F 2n 2 in the sense that every proper subspace of F 2n 2 contains a bounded fraction of the q ψ mass.This implies that sampling from q ψ gives 2n linearly independent elements of F 2n 2 after a reasonable number of iterations.
We first require the following lemma, which shows that the Weyl measurements are concentrated around 0. Proved in [GIKL23c], this is a consequence of Lévy's lemma.
Combining with the fact (Theorem 3.2) that the q ψ mass on a subspace is proportional to its p 2 ψ mass on the symplectic complement, we obtain the following.
Lemma 4.7.Let |ψ⟩ be a Haar-random n-qubit state.Then all subspaces T ⊆ F 2n 2 of dimension 2n − 1 simultaneously satisfy except with probability at most Proof.Let T be any subspace of dimension 2n − 1.Then the symplectic complement T ⊥ has dimension 1, so it is the span of a single nonzero x ∈ F 2n 2 .By Theorem 3.2, Hence, the probability that there exists a T for which x∈T q ψ (x) exceeds 2 3 is at most the probability that there exists a nonzero x for which |⟨ψ|W x |ψ⟩| ≥ 4 » 1 3 .By Lemma 4.6, this probability is at most 2 2n+1 exp Ä − 2 n 36 √ 3π 3 ä .

Distinguishing From Haar-Random
We are now ready to state and analyze our algorithm that, given copies of |ψ⟩, efficiently distinguishes whether |ψ⟩ is (i) Haar-random or (ii) a state prepared by a (n/2−1)-doped Clifford circuit, promised that one of these is the case.While the analysis is not so trivial, the algorithm itself is straightforward: Bell difference sample O(n) times, and, with high probability, we will have a set of Weyl operators that span F 2n 2 when |ψ⟩ is Haar-random.On the other hand, if |ψ⟩ is the output of an t-doped Clifford circuit, for t < n/2, this can never happen because q ψ is supported on a subspace of dimension at most n + 2t (which we proved in Corollary 4.5).To prove the correctness of Algorithm 1, we need the following lemma.
Lemma 4.8.Let |ψ⟩ be an n-qubit Haar-random quantum state and fix δ > 0. Taking 6n + 9 2 log(2/δ) samples from q ψ suffices to sample 2n linearly independent elements of F 2n 2 with probability at least 1 − δ over both the Haar measure and the sampling process.
2 , let T i = ⟨x 1 , . . ., x i−1 ⟩ be the subspace spanned by the first i − 1 samples for arbitrary 1 ≤ i ≤ m.Define the {0, 1} indicator random variable such that we have achieved our goal if and only if m i=1 X i ≥ 2n.We see that if By Lemma 4.7, x∈T ′ i q ψ (x) ≤ 2 3 for all T ′ , with overwhelmingly high probability over the Haar measure.Let us assume that this has happened.Since x∈F 2n 2 q ψ (x) = 1, we know that in the scenario where T i ̸ = F 2n 2 that x∈F 2n 2 \T i q ψ (x) ≥ 1 3 .Since both scenarios give expectation over 1 3 , this gives us E[X i | X 1 , . . ., X i−1 ] ≥ 1 3 for any assignment of X 1 , . . ., X i−1 .By Hoeffding's inequality (Fact 2.2), Writing m = (6 + γ)n, this bound becomes exp ã Thus taking γ = 9 2n log(2/δ), we see that after m = 6n + 9 2 log(2/δ) samples, this probability is at most δ/2.By the union bound, the total failure probability over both the Haar measure and the samples is at most ã which in turn is at most δ, for reasonable choices of δ. 14 Theorem 4.9.Algorithm 1 succeeds with probability at least 1 − δ, and it uses O(n + log(1/δ)) copies of the input state and O(n 3 + n 2 log(1/δ)) time.
Proof.In the case that |ψ⟩ is the output of a t-doped Clifford circuit, q ψ is supported on a subspace of dimension at most n + 2t by Corollary 4.5, so the algorithm always outputs 1.Therefore, we need only argue that in the case that |ψ⟩ is Haar-random we see at least 2n linearly independent elements of F 2n 2 .By Lemma 4.8, this happens with probability at least 1 − δ over the Haar measure and the sampling process.
It is clear that the sample complexity is O(m) = O(n + log(1/δ)).To analyze the time complexity, we note that each sample takes O(n) time, so sampling takes O(mn) time.Gaussian elimination takes O(mn 2 ) = O(n 3 + n 2 log(1/δ)) time and is the dominating term.
Our distinguishing algorithm immediately implies a lower bound on the number of non-Clifford gates needed to prepare computationally pseudorandom quantum states.
Corollary 4.10.Any family of t-doped Clifford circuits that produces an ensemble of n-qubit computationally pseudorandom quantum states must satisfy t ≥ n/2. 13A cautious reader may object that the random variables here are not independent, which is needed for Hoeffding's inequality.However, because the conditional expectations are at least 1/3, the sum (first-order) stochastically dominates a sum of independent Bernoulli random variables with means 1/3.Hence, the left tail probability of the former can only be smaller than the latter.
14 Of course, the union bound fails when δ is doubly exponentially small, as our bound for the error over the Haar measure is 2 −2 O(n) .However, in this setting, it is information-theoretically impossible to distinguish a state from Haar-random.
Note that this lower bound can be improved by a factor of 2 in the special case that all of the non-Clifford gates are diagonal (e.g.T -gates), because of the improved lower bound on stabilizer dimension in Lemma 4.2 for this case.
Corollary 4.11.Any family of Clifford+T circuits that produces an ensemble of n-qubit computationally pseudorandom quantum states must use at least n T -gates.

Stabilizer State Approximations
We state and analyze our algorithm that, given copies of an n-qubit quantum pure state |ψ⟩ that has stabilizer fidelity at least τ , outputs a succinct description of a stabilizer state that witnesses fidelity at least τ − ε.
Our presentation is split into two parts.First, in Section 5.1, we prove a useful lemma regarding q ψ on S * = Weyl(|ϕ⟩), where |ϕ⟩ is the stabilizer state that maximizes stabilizer fidelity with |ψ⟩.At a high level, we argue that any sample from q ψ has a good enough chance of "making progress" towards learning a complete set of generators for S * .Formally, we prove that the q ψ -mass on S * is not heavily concentrated on proper subspaces of S * , so that when we sample an element of S * , we obtain an element of S * that is linearly independent of the previous samples with a reasonable probability.Second, in Section 5.2, we state our algorithm, prove its correctness, and analyze its sample and time complexities.

Bell Difference Sampling Makes Progress
The next lemma gives a way to argue that in many of our proofs, we can suppose without loss of generality that |0 n ⟩ maximizes stabilizer fidelity.
Lemma 5.1.Given an n-qubit stabilizer state |ψ⟩, let S = Weyl(|ψ⟩) be its unsigned stabilizer group, and let T ⊆ S be a subspace of dimension n − t.Then there exists a Clifford circuit C such that Proof.Because the Clifford group acts transitively on stabilizer states, there exists a Clifford circuit So, it only remains to show that C can be chosen so as to map T to 0 n+t × F n−t 2 while preserving these properties.This holds because a CNOT gate between qubits i and j in its action on F 2n 2 maps (0 n , x) ∈ F 2n 2 to (0 n , M x) where M ∈ GL n (F 2 ) is an elementary matrix (in particular, a matrix equal to the identity except with the (i, j) entry equal to 1).Hence, CNOT gates between arbitrary qubits generate all of GL n (F 2 ).So, we can choose CNOT gates so as to map T to an arbitrary subspace of 0 n × F n 2 of the same dimension, while preserving |0 n ⟩ and 0 n × F n 2 .
Now, we show that the p ψ -mass on S * is bounded below by the squared stabilizer fidelity of |ψ⟩.

It remains to lower bound x∈F
Since we know that x∈F n 2 p ψ ′ (0 n , x) ≥ F S (|ψ⟩) 2 , this tells us that x∈S * p ψ (x) ≥ F S (|ψ⟩) 2 as well.
We can generalize this result to arbitrary subspaces of S * .
Corollary 5.3.Given an n-qubit state |ψ⟩, let |ϕ⟩ be a stabilizer state that maximizes the stabilizer fidelity, and let S * = Weyl(|ϕ⟩).Let T ⊆ S * be a subspace of S * .Then where we have used the fact that S * ⊆ T ⊥ .Now we show a series of anticoncentration lemmas on proper subspaces of S * .For these next lemmas, we will find it more convenient to assume without loss of generality (because of Lemma 5.1) that the state maximizing fidelity is |0 n ⟩, which conceptually simplifies the computations.
We can visualize the first qubit of this state on the Bloch sphere.The surface enclosed by the red and blue curve is exactly the set of points on the sphere for which |0⟩ is the closest stabilizer state.By our assumption that the stabilizer fidelity of |ψ⟩ is maximized by |0 n ⟩, α 0 |0⟩ + α 1 |1⟩ must lie on this surface, up to normalization.The corners of this surface (the intersection of a blue curve with a red curve) represent a choice of α 0 and α 1 that minimizes α 0 .
Choose the global phase on |ψ⟩ to assume without loss of generality that α 0 is positive-real and α 1 = |α 1 |e iθ .We may write: All of these values need to be less than |α 0 | 2 , as otherwise |ψ⟩ would have larger fidelity with one of the above states.Due to symmetry of both sin and cos, we will only consider θ ∈ [0, π 2 ] such that the only relevant equations to consider are the first and last.This allows us to write the largest of the above inner products as which is minimized for θ = π/4.Plugging that back in and comparing to α 2 0 leads to and solving for the maximum Lemma 5.5.Let |ψ⟩ be an n-qubit state.Suppose the fidelity |⟨ψ|ϕ⟩| 2 is maximized by |ϕ⟩ = |0 n ⟩ over stabilizer states |ϕ⟩.Let S * = 0 n × F n 2 = Weyl(|0 n ⟩), and let T = 0 n+1 × F n−1 2 be a maximal subspace.Then Proof.We can write Now apply Corollary 5.3 and Lemma 5.4 respectively and we get as desired.
Lemma 5.6.Given an n-qubit state |ψ⟩, let |ϕ⟩ be a stabilizer state that maximizes the stabilizer fidelity, and let S * = Weyl(|ϕ⟩).Let T ⊂ S * be a proper subspace of S * .Then Proof.Use Lemma 5.1 to choose a Clifford circuit such that By Lemma 5.5, this sum is lower bounded by

The Algorithm
Our algorithm for stabilizer state approximations uses the powerful classical shadows framework [HKP20] to improve its sample complexity.
Then there exists a quantum algorithm that first performs m shadow = O(log(K/δ)/ε 2 ) random Clifford measurements on independent copies of ρ.Then, later given K different observables O 1 , O 2 , . . ., O K in an online fashion, where each O i is a rank-1 projector, the algorithm uses the measurement results to output estimates o 1 , . . ., o K , such that with probability at least 1 − δ, for ev- a projector onto a stabilizer state, then each o i can be computed from the measurement results by a classical algorithm that takes time O(n 2 m shadow ).
For the "moreover" part of Theorem 5.7, see the remarks on Page 1053 of [HKP20]. 15e also require an algorithm, due to [TTT06], for computing all of the maximal cliques in a graph.
Theorem 5.8 (Computing maximal cliques [TTT06]).Given an undirected graph G with n vertices, there is a classical algorithm that outputs a list of all of the maximal cliques in G in time O(3 n/3 ).
Note that this implies that the number of maximal cliques is at most O(3 n/3 ).We are now ready to describe the stabilizer state approximation algorithm.At a high level, it uses Bell difference sampling to obtain a list of candidate Lagrangian subspaces generated by the sampled Weyl operators.Then, it iterates through the candidate groups to find the stabilizer state with largest fidelity, using classical shadows to perform the estimation.
We first argue that with high probability, one of the maximal cliques generates the Lagrangian subspace corresponding to a state that maximizes stabilizer fidelity.bound (Fact 2.1), we have Hence, choosing suffices to guarantee that Eq. ( 1) is at most δ.Now we have everything needed to prove the correctness of Algorithm 2.
Theorem 5.10.Let |ψ⟩ be an n-qubit state with F S (|ψ⟩) ≥ τ .Then choosing ã suffices to guarantee that with probability at least 1 − δ, Algorithm 2 outputs a state |ϕ⟩ satisfying Proof.Choose the failure probability in Lemma 5.9 to be at most δ/2.Choose the parameters in Theorem 5.7 so that the additive error in the estimates is ε/2 and the failure probability is at most δ/2; this requires choosing We assume henceforth that both Theorem 5.7 and Lemma 5.9 do not fail, which happens with probability at least 1 − δ over the samples.
Letting |φ⟩ be the state maximizing stabilizer fidelity and S * = Weyl(|φ⟩), Lemma 5.9 guarantees that the algorithm samples a complete set of generators for S * .These generators are necessarily contained in some maximal clique of G because they all commute, and moreover, the subspace spanned by this clique must equal S * because S * equals its symplectic complement (so the maximal clique cannot contain any elements not in S * ).
By Theorem 5.7, the estimate o φ is at least F S (|ψ⟩) − ε/2, so max ϕ o ϕ ≥ F S (|ψ⟩) − ε/2.Thus, the state |ϕ⟩ that maximizes the estimate o ϕ (and is output by the algorithm) has Finally, we briefly comment on the sample and time complexities of Algorithm 2. The sample complexity of the algorithm is The runtime is dominated by iterating through all of the maximal cliques, iterating through all of the stabilizer states |ϕ⟩ such that Weyl(|ϕ⟩) = S, and computing o ϕ .There are at most O Ä 3 m clique /3 ä maximal cliques, by Theorem 5.8.There are exactly 2 n stabilizer states in each basis.Theorem 5.7 then guarantees that computing each o ϕ from the classical shadows takes time O(n 2 m shadow ).Thus the overall time complexity is at most Plugging in the bounds on m clique and m shadow gives exp which further simplifies to exp by absorbing the rightmost term into the big-O in the exponent.
Proof.We write the input state as where |ϕ⟩ is a stabilizer state and |ϕ ⊥ ⟩ is the part orthogonal to |ϕ⟩.Let P = ±W x be the signing of W x for which ⟨ϕ|P |ϕ⟩ = 1.Then we conclude that ⟨ψ|P |ψ⟩ ≥ 2τ − 1, and hence (using If an operator is not in |ϕ⟩'s unsigned stabilizer group, then it must anticommute with at least half of the Pauli operators in that group.The uncertainty principle states that the expectation of these operators must be small, since the expectation of the operators in Weyl(|ϕ⟩) is large.To show this formally, we use the Schrödinger uncertainty relation.Proof.Let W x be a Pauli operator that anticommutes with W y and for which x ∈ Weyl(|ϕ⟩).Note that such an operator must exist, for if it didn't, then W y would commute with every stabilizer of |ϕ⟩, and therefore we would have y ∈ Weyl(|ϕ⟩) ⊥ = Weyl(|ϕ⟩), which contradicts the supposition y / ∈ Weyl(|ϕ⟩).Fact 6.2 simplifies to the following: where we use the fact that all Pauli operators are Hermitian and unitary and that W x and W y anticommute.Then In the second-to-last step, we used Proposition 6.1.Proposition 6.1 and Proposition 6.3 suggest that we can determine whether a given Pauli operator is in the unsigned stabilizer group only from its squared expectation as long as, for all y / ∈ Weyl(|ϕ⟩) and for all x ∈ Weyl(|ϕ⟩), which happens only when 4τ (1 − τ ) < (2τ − 1) 2 ⇐⇒ cos 2 (π/8) < τ .However, we must also take into account the fact that we cannot know the squared expectations exactly.Rather, we can only recover them to some ±O(γ) accuracy, which in turn implies that τ must be at least cos 2 (π/8) + γ for some γ > 0. We formalize this in the following corollary.
A noteworthy consequence of Corollary 6.4 is that the state |ϕ⟩ must be unique:

The Algorithm
We now state and analyze our algorithm.Corollary 6.4-which is the starting point of our algorithm-implies that, based only on the squared expectation of a Weyl operator, we can decide whether nor not it is in Weyl(|ϕ⟩), where |ϕ⟩ is the stabilizer state that has fidelity at least cos 2 (π/8) + γ with the input state |ψ⟩ .At a high level, there are two missing pieces to complete our algorithm.First, we need to find a polynomial-size list of Weyl operators that is guaranteed to contain a list of generators of Weyl(|ϕ⟩).By Lemma 5.9, we can achieve this by Bell difference sampling repeatedly from Weyl(|ϕ⟩).Second, we must estimate the squared expectations of the Weyl operators we sample.One way to do so is by naïvely measuring each Weyl operator repeatedly, one at a time.However, an algorithm due to Huang, Kueng, and Preskill [HKP21] achieves a better runtime, letting us estimate the squared expectation value of many Weyl operators with only a logarithmic sample complexity.
We also provide a simple proof of Theorem 6.6 in Appendix A.
Putting the pieces together, our algorithm works as follows.
Our goal now is to filter out the samples outside of Weyl(|ϕ⟩).To do so, we estimate the squared expectation of each of the sampled Weyl operators to within error ε := 2 √ 2γ, using Theorem 6.6.This requires 4 log (6m/δ) ε 2 = log (6m/δ) 2γ 2 copies to succeed with probability 1 − δ/3.Conditioned on the success of Line 3, Corollary 6.4 tells us that at the end of Line 4, we will have filtered out all elements outside of Weyl(|ϕ⟩).Assuming Lines 2 and 3 both succeed, we will be left with S = Weyl(|ϕ⟩), which is Lagrangian.
Finally, we need to bound how many samples are necessary to measure |ϕ⟩ for the majority result when we measure in the basis specified by S. Let k be the number of times we measure in Line 5, and for indicator random variables Then via Hoeffding's inequality (Fact 2.2): Therefore, by taking k ≥ 4 log(3/δ), we can guarantee that we output the correct state |ϕ⟩ with probability at least 1 − δ/3.Via the union bound, the failure probability is at most δ.

Tolerant Property Testing of Stabilizer States
We collect (and give alternative proofs of) a few results related to the property testing algorithm for stabilizer states due to Gross, Nezami, and Walter [GNW21] (hereafter, the "GNW algorithm").
We combine these results with the prior work of [GIKL23a] to give a tolerant property testing algorithm for stabilizer states.By tolerant property testing, we mean that the tester must accept inputs that are ε 1 -close to having some property and reject inputs that are ε 2 -far from having the same property.This is more general than the standard setting where ε 1 is set to 0.
The following two remarks are important for understanding the extent of our contribution.First, our algorithm is similar to the algorithm given in [GIKL23c] 17 that distinguishes Haarrandom states from quantum states with at least 1/poly(n) stabilizer fidelity.Second, we note that our algorithm only works in certain parameter regimes, not for all sensible settings of ε 1 and ε 2 .This is discussed further in Section 7.3.
To explain our property testing model in more detail, we are testing whether or not a quantum state is close to a stabilizer state, where distance is measured with fidelity.Specifically, we are given copies of an n-qubit quantum pure state |ψ⟩ as input, and we must decide whether F S (|ψ⟩) ≥ 1 − ε 1 or F S (|ψ⟩) ≤ 1 − ε 2 , promised that one of them is the case.

Completeness and Soundness Analysis
We record several statements that will go into our proof that our tolerant property testing algorithm is sound.It is first convenient to recall the GNW algorithm, which works as follows.Perform Bell difference sampling on the input state to get a string x ∈ F 2n 2 .Then perform the {±1} measurement W ⊗2 x on |ψ⟩ ⊗2 and accept if the result is 1.The algorithm uses six copies of the input state.Following the notation of [GIKL23c] let: be the expected value of such a measurement.We use the following identity proven by [GNW21] and [GIKL23c].where η is the expected value of the GNW algorithm measurement defined in Eq. (2). Proof.
17 Which is itself a repeated application of the base GNW algorithm, for the purposes of error amplification.
Prior to this work, it was known that, for any quantum pure state |ψ⟩, where the lower bound is due to [GNW21] and the upper bound is due to [GIKL23c, Lemma 15]. 18n this subsection, we re-prove these bounds using the formalism and techniques developed in this work.We begin by giving an alternative proof of the upper bound.
Proof.Let |ϕ⟩ be the stabilizer state that achieves the maximum overlap with |ψ⟩ and let S * := Weyl(|ϕ⟩).Then The first inequality follows from Lemma 5.2, the second inequality follows from Hölder's inequality, the third inequality follows from the nonnegativity of p ψ , and the final equality follows from Fact 7.1.
We now move on to the lower bound which says that (4η −1)/3 ≤ F S (|ψ⟩).We begin by proving a lower bound on stabilizer fidelity in terms of p ψ -mass.Proof.We apply Lemma 7.3 with the Lagrangian subspace T (so that t = 0).Following the notation in Lemma 7.3, there must exist a state | ψ⟩ whose fidelity with |ψ⟩ is at least x∈T p ψ (x).Note also that | ψ⟩ is a stabilizer state since its unsigned stabilizer group is Lagrangian.We conclude that the stabilizer fidelity F S (|ψ⟩) must be at least x∈T p ψ (x) because the stabilizer fidelity is the maximum fidelity over all stabilizer states.
We also need the following fact: any pair of Weyl operators whose expectation with |ψ⟩ is each at least 1/2 must commute.
We now prove the lower-bound.
Our algorithm is stated in Algorithm 4. The number of copies follows from the fact that Bell difference sampling consumes 4 copies of the input state, the measurement in Step 4 of Algorithm 4 consumes 2 copies of the input state, and that the loop is repeated m times.The running time is clearly O(mn).

Parameter Regime Discussion
We conclude this section by studying the regime in which our tolerant testing algorithm works in comparison to prior work.The GNW algorithm already implicitly functions as a tolerant property tester: because it uses 6 copies of |ψ⟩ and accepts any stabilizer state with probability 1, if the trace distance between |ψ⟩ and some stabilizer state is at most ε, then the test accepts |ψ⟩ with probability at least 1 − 6ε.We can use this observation to establish the values of α 1 and α 2 in which repeated applications of the GNW algorithm works, given only the soundness analysis of [GNW21] but not applying the completeness bound implied by Lemma 7.2.

Discussion and Open Problems
A natural direction for future work is to improve the performance of our algorithms or to prove (conditional or unconditional) lower bounds.In particular, can the exponential running time of Algorithm 2 be improved upon, or is stabilizer state approximation computationally hard for general parameter regimes?We are optimistic that the exponential factors in our runtime analysis could be made much smaller in practice, because our bound on the sample complexity of finding a complete set of generators is probably far from optimal.We also remark that, at least superficially, our problem of finding the nearest stabilizer state resembles the closest vector problem (CVP): given a lattice L and a target vector, find the nearest lattice point to the target vector.In our problem, we are given a target vector, and we want to find the nearest stabilizer state to the target vector.While not a lattice, the stabilizer states are "evenly spread" across the complex unit sphere due to their 3-design property [KG15,Web16,Zhu17].CVP is known to be NP-hard to solve approximately to within any constant and some almost-polynomial factors [vEB81,ABSS97,DKS98].Is there a formal connection between these two problems?
Can tighter bounds between η and stabilizer fidelity be proven?In [GIKL23c, Appendix B], the authors prove that one can hope for at most a roughly quadratic improvement in the bound F S (|ψ⟩) 6 ≤ η.In addition to η, are there other statistics related to stabilizer fidelity (or any other stabilizer complexity measure) that can be estimated efficiently?Progress in this direction would extend the parameter regimes for which our property testing algorithm works (see Fig. 2).
One can view the output of Algorithm 2 as an approximation of the input state by a nearby stabilizer state.Following this theme, a natural objective is to design similar approximation algorithms relative to other classes of quantum states such as product states or matchgate states.We note that it is even open to design a time-efficient algorithm that, given copies of an n-qubit quantum state, outputs the nearest state from the set {|0⟩ , |1⟩ , |+⟩ , |−⟩ , |i⟩ , |−i⟩} ⊗n , which is a subset of stabilizer states.In addition to potentially improving Clifford+T simulation algorithms (as discussed in Section 1.1), are there other applications for these types of state approximation algorithms?

Figure 1 :
Figure 1: An illustration of the argument in the proof of Lemma 5.4.Consider the (possibly unnormalized) stateα 0 |0 n ⟩ + α 1 |10 n−1 ⟩.We can visualize the first qubit of this state on the Bloch sphere.The surface enclosed by the red and blue curve is exactly the set of points on the sphere for which |0⟩ is the closest stabilizer state.By our assumption that the stabilizer fidelity of |ψ⟩ is maximized by |0 n ⟩, α 0 |0⟩ + α 1 |1⟩ must lie on this surface, up to normalization.The corners of this surface (the intersection of a blue curve with a red curve) represent a choice of α 0 and α 1 that minimizes α 0 .

Fact
Theorem 6.6 ([HKP21, Proof of Theorem 4]).Given any m Weyl operators P 1 , . . ., P m and copies of an unknown pure state |ψ⟩, there is an algorithm that estimates ⟨ψ|P i |ψ⟩ 2 to ±ε accuracy with probability at least 1 − δ by performing Bell measurements on 4 log(2m/δ) ε 2 copies of the unknown state |ψ⟩.The time it takes is O

Figure 2 :
Figure2: The shaded regions indicate the parameter regimes of α 1 and α 2 that are permissible by the analysis of the GNW algorithm in Section 7.3 (orange) and Algorithm 4 (blue).Thus, the difference between the orange and blue regions illustrates the improvement due to Lemma 7.2.
Weyl operator is a Pauli operator, and every Pauli operator is a Weyl operator up to a phase.Because the Clifford group normalizes the Pauli group, Clifford circuits induce an action on F 2n 2 by conjugation of the corresponding Weyl operators (up to phase).That is, for every Clifford circuit C and x ∈ F 2n 2 , there exists a unique y ∈ F 2n 2 and phase α ∈ {±1} such that CW x C † = αW y .In a slight abuse of notation, we denote this action on F 2n 2 by C(x) = y.Definition 2.12 (Unsigned stabilizer group).Let Weyl(|ψ⟩) := {x ∈ F 2n 2 : W x |ψ⟩ = ± |ψ⟩} denote the unsigned stabilizer group of |ψ⟩.It is not hard to show that, as a consequence of the uncertainty principle, Weyl(|ψ⟩) is an isotropic subspace of F 2n 2 .Additionally, if T ⊂ F 2n 2 is a Lagrangian subspace, then the set of states {|φ⟩ : Weyl(|φ⟩) = T } forms an orthonormal basis of the n-qubit Hilbert space.Moreover, since each basis state |φ⟩ is stabilized by 2 n Weyl operators (up to phase), every basis state is a stabilizer state.Conversely, observe that for any stabilizer state |φ⟩, Weyl(|φ⟩) is a Lagrangian subspace.We now define a new stabilizer complexity measure based on the unsigned stabilizer group.Definition 2.13 (Stabilizer dimension).Let |ψ⟩ be an n-qubit pure state.The stabilizer dimension of |ψ⟩ is the dimension of Weyl(|ψ⟩) as a subspace of F 2n 2 . 9 x , W y commute when [x, y] = 0 and anticommute when [x, y] = 1.So, if T ⊆ F 2n 2 is a subspace, then T is isotropic if and only if {W x : x ∈ T } is a set of mutually commuting Weyl operators.Similarly, T is Lagrangian if and only if {W x : x ∈ T } is a set of 2 n mutually commuting Weyl operators.
[LOLH22] that the output state of any Clifford circuit augmented with fewer than n/2 non-Clifford single-qubit gates can be efficiently distinguished from Haar random. 12 result, any circuit family that prepares an ensemble of n-qubit pseudorandom quantum states must use at least Ω(n) non-Clifford single-qubit gates.The key idea is that Haar-random states have minimal stabilizer dimension (Definition 2.13) with overwhelming probability.By contrast, for a quantum circuit that acts on a stabilizer state (which has stabilizer dimension n), each single-qubit non-Clifford gate decreases the stabilizer dimension by at most 2.We introduce the following definition to simplify the exposition, borrowing terminology from[LOLH22].Definition 4.1 (t-doped Clifford circuits).A t-doped Clifford circuit is a quantum circuit comprised only of Clifford gates (i.e., Hadamard, Phase, and CNOT) and at most t single-qubit non-Clifford gates that starts in the state |0 n ⟩.
2 4 In Line 4, determining whether S is Lagrangian and computing a Clifford circuit that measures in the basis induced by S can be done using Gaussian elimination on an m × 2n matrix in O (mn • min(m, n)) time [AG04, Section VI].16Furthermore, the computed Clifford circuit contains at most O(n 2 ) gates.Thus, Line 4 takes