A one-query lower bound for unitary synthesis and breaking quantum cryptography

The Unitary Synthesis Problem (Aaronson-Kuperberg 2007) asks whether any $n$-qubit unitary $U$ can be implemented by an efficient quantum algorithm $A$ augmented with an oracle that computes an arbitrary Boolean function $f$. In other words, can the task of implementing any unitary be efficiently reduced to the task of implementing any Boolean function? In this work, we prove a one-query lower bound for unitary synthesis. We show that there exist unitaries $U$ such that no quantum polynomial-time oracle algorithm $A^f$ can implement $U$, even approximately, if it only makes one (quantum) query to $f$. Our approach also has implications for quantum cryptography: we prove (relative to a random oracle) the existence of quantum cryptographic primitives that remain secure against all one-query adversaries $A^{f}$. Since such one-query algorithms can decide any language, solve any classical search problem, and even prepare any quantum state, our result suggests that implementing random unitaries and breaking quantum cryptography may be harder than all of these tasks. To prove this result, we formulate unitary synthesis as an efficient challenger-adversary game, which enables proving lower bounds by analyzing the maximum success probability of an adversary $A^f$. Our main technical insight is to identify a natural spectral relaxation of the one-query optimization problem, which we bound using tools from random matrix theory. We view our framework as a potential avenue to rule out polynomial-query unitary synthesis, and we state conjectures in this direction.


Introduction
This paper is about unitary synthesis, the task of implementing a given n-qubit unitary transformation U as a quantum circuit. Unitary synthesis is ubiquitous throughout quantum computing, since virtually any quantum computational task (be it preparing a state, performing a measurement, or transforming one state into another) can be done by implementing some unitary. Of course, not every unitary can be implemented efficiently. As a special case, consider the classical task of evaluating an (n − 1)-bit Boolean function f : {0, 1}^{n−1} → {0, 1}. This can be solved by implementing an n-qubit unitary transformation, namely the unitary U : |x, b⟩ → |x, b ⊕ f(x)⟩, and so Shannon's classic counting argument [Sha49] implies that even these unitaries require Ω(2^n/n) gates to implement. But are worst-case unitaries hard to compute only because they can solve hard classical problems? Or is it possible that unitaries could still be hard even if it were easy to solve all classical problems?
This question was first posed in 2006 in an influential work by Aaronson and Kuperberg [AK07], and it was later dubbed "the Unitary Synthesis Problem" by Aaronson in his 2016 Barbados lectures [Aar16]. Formally, they considered poly(n)-size quantum oracle circuits A (·) that have the ability to make quantum queries to an arbitrary Boolean function f : {0, 1} ℓ → {0, 1} on ℓ = poly(n) bits. This gives these circuits the power to instantaneously compute any Boolean function of their choice, and Aaronson and Kuperberg asked if this power enables them to efficiently implement any unitary transformation as well. More concretely, they asked the following question.
The Unitary Synthesis Problem [AK07,Aar16]: Is there a universal efficient oracle circuit A (·) such that for any unitary U , there is a corresponding Boolean function f for which A f implements U ?
In other words, the Unitary Synthesis Problem asks whether the task of implementing an arbitrary unitary can be efficiently reduced to computing Boolean functions. Notably, if the answer turns out to be negative, this would give strong evidence (in the form of a black-box separation) that the hardest quantum problems are harder than the hardest classical problems.
Since it was first posed, the Unitary Synthesis Problem has become arguably the central open problem in the rapidly growing field of unitary complexity, which we will discuss in more detail in Sections 1.1 and 1.3 below. To date, there is no clear consensus on what the true complexity of unitary synthesis should be: for all we knew, it might require as little as one query to the oracle, or as many as 2 Ω(n) .
One reason this question is subtle is that algorithms that make just one query to an arbitrary Boolean function are already quite powerful. For example, it turns out that such algorithms can solve the state synthesis problem, in which the goal is to produce an arbitrary quantum state |ψ⟩ [Aar16, INN+22, Ros23a] (see Section 1.3 for discussion). The state synthesis and unitary synthesis problems share a number of similarities, and there has been some speculation that extending state synthesis techniques could lead to positive results on unitary synthesis (for example, see [INN+22, Section 7.2]). An excellent treatment of these and related problems can be found in Aaronson's Barbados notes [Aar16], Rosenthal's Ph.D. thesis [Ros23b], as well as in recent course notes of Yuen [Yue22a, Yue22b].
How hard is unitary synthesis? There are several inefficient algorithms for the Unitary Synthesis Problem, the most basic of which queries an oracle O(2^{2n}) times to learn a classical description of U and then implements it using O(2^{2n}) gates. As noted by Yuen [Yue22a], this basic algorithm can be implemented with a single quantum query to f using the Bernstein-Vazirani algorithm [BV97], at the expense of making the query extremely large: in particular, it requires a quantum query to a Boolean function f : {0, 1}^ℓ → {0, 1} on inputs of length ℓ = O(2^{2n}). If we restrict to algorithms that only make efficient queries to f, i.e., queries that only evaluate f on ℓ = poly(n)-length inputs, the best known query complexity is O(2^{n/2}), achieved by a Grover-style algorithm due to Rosenthal [Ros22].
On the other hand, prior to this work, no general query lower bound for the Unitary Synthesis Problem was known. There is a well-known lower bound due to Aaronson and Kuperberg [AK07] that rules out a certain class of one-query algorithms A f , namely those that exactly implement a unitary operation on their first n qubits for all choices of f . More recently, Rosenthal [Ros22] proved a lower bound ruling out a different specialized class of many-query algorithms. We discuss both of these lower bounds further in Section 1.3. However, the problem of ruling out (or constructing) even one-query unitary synthesis algorithms has remained open since Aaronson and Kuperberg first posed it nearly two decades ago (cf. Open Problem 4 and footnote 13 in [AK07]).
In this work, we resolve this open question and prove the first one-query lower bound for the Unitary Synthesis Problem.
Theorem 1.1 (informal, see Theorem 4.18). There is no efficient oracle circuit A (·) that approximately implements an arbitrary n-qubit unitary U by making one quantum query to a U -dependent Boolean function f .
Our lower bound applies even if the oracle circuit is allowed to use an unbounded number of non-oracle gates and ancilla qubits. In fact, it even applies to circuits that are allowed to query f on inputs that are extremely long, as long as they are not too long; technically, we require that A^(·) can only query Boolean functions f : {0, 1}^ℓ → {0, 1} on ℓ = o(2^n) bits. Note that some restriction on the size of the queries is necessary, due to the one-query Bernstein-Vazirani-style algorithm mentioned above, which queries a Boolean function on ℓ = O(2^{2n}) bits. Finally, our lower bound also extends to circuits that query arbitrary functions f with poly(n) bits of output, and to circuits that make poly(n)-many non-adaptive queries, i.e., queries that are all issued in parallel. This is because such queries can be simulated using a single query to a more complex function, via another Bernstein-Vazirani trick (see Remark 3.6).
We prove Theorem 1.1 by leveraging a connection between the unitary synthesis problem and quantum cryptography, as we discuss next.

Unitary synthesis and quantum cryptography
Background and motivation. The past few years have seen a surge of interest in so-called inherently quantum problems, which are computational tasks in which either the input is a quantum state, the output is a quantum state, or both. These include many of the most important tasks in quantum computing, such as breaking computationally-secure quantum bit commitments, performing quantum state tomography, preparing the ground state of a local Hamiltonian, and decoding black hole radiation. The central goal of this area is to classify these problems according to the computational resources needed to solve them. Normally, we would do so using the language of computational complexity theory. However, after initial classification attempts, a mysterious, recurring phenomenon has emerged: computational complexity theory appears to be completely unable to classify many of these problems at all.
As just one example of this phenomenon, let us look to the field of quantum cryptography, where some of the most exciting work involving inherently quantum problems is being done today. This is due to the remarkable discovery that certain quantum cryptographic primitives, such as pseudorandom states and quantum bit commitments, are sufficient for a wide array of cryptographic applications, and yet appear to be weaker than traditional "minimal" cryptographic assumptions such as one-way functions or pseudorandom generators (PRGs).
Pseudorandom states. Of these quantum primitives, we focus on single-copy pseudorandom states (PRSes), introduced by Ji, Liu, and Song [JLS18], which can be seen as a quantum analogue of PRGs. Classically, a PRG is a set of K ≪ N := 2^n efficiently computable n-bit strings {x_k}_{k∈[K]} in which a string x_k drawn uniformly at random from the set is computationally indistinguishable from a truly random n-bit string. Quantumly, a (single-copy) PRS is a set of K ≪ N efficiently computable n-qubit quantum states {|ψ_k⟩}_{k∈[K]} in which a state |ψ_k⟩ drawn uniformly at random from the set is computationally indistinguishable from a Haar random n-qubit state. Single-copy PRSes are known to imply the existence of quantum bit commitments [Yan22, MY22, BCQ23], which are a key ingredient in many cryptographic protocols, ranging from zero-knowledge proof systems [BG22, GJMZ23] to secure multiparty computation [GLSV21, BCKM21, AQY22].
With these definitions in mind, what can we say about the computational complexity of breaking cryptographic pseudorandomness? Classically, it is easy to see that secure PRGs do not exist if P = NP. In fact, there is a polynomial-time black-box (Turing, or even Karp) reduction A^(·) which can break PRGs given oracle access to a function f : {0, 1}^* → {0, 1} that decides an NP-complete language. This explains why attempts to prove the existence of unconditionally secure PRGs have so far been unsuccessful, as doing so would imply the breakthrough complexity-theoretic lower bound P ≠ NP.
In the quantum setting, what can we say about the computational complexity of breaking a PRS? Is there a complexity assumption that we can make, such as BQP = QMA, which would imply that PRSes can be broken in polynomial time? The answer to this question is currently unknown, and the difficulty stems from the fact that the computational task associated with breaking a PRS is an inherently quantum problem. In particular, the adversary's goal is to distinguish between a pseudorandom state and a Haar random state, given one of the two at random; this is a quantum-input, classical-output task. On the other hand, traditional complexity classes such as P and PSPACE, and even quantum complexity classes such as QMA, only capture problems with classical inputs. For example, even though the witness for a QMA statement is a quantum state, the input to the problem is always a classical string, such as the description of a local Hamiltonian.
How hard is it to break quantum cryptography? As a result of this mismatch between classical-input and quantum-input problems, it is not at all clear how breaking a PRS is related to traditional complexity assumptions. For example, a recent work of Kretschmer, Qian, Sinha, and Tal [KQST23] has shown that the existence of PRSes is independent of the P-versus-NP question, at least in the oracle setting, by constructing an oracle relative to which PRSes exist but P = NP. However, [KQST23] derives security of their candidate PRS from the hardness of an oracle problem, OR ∘ FORRELATION, which is easily solvable in PSPACE. Despite this, it is not clear whether a PSPACE oracle should be powerful enough to break every PRS; in fact, it is not even clear that an oracle for the halting problem would suffice.
This raises a tantalizing question: what if the existence of PRSes is independent of traditional complexity altogether? Could we show that breaking a PRS does not black-box reduce to deciding any language? Let us now relate this back to the Unitary Synthesis Problem. Given a PRS, there always exists a unitary U which one could use to break the PRS if one could implement it efficiently, namely any unitary which maps span{|ψ 1 ⟩ , . . . , |ψ K ⟩} to span{|1⟩ , . . . , |K⟩}. If an efficient quantum oracle circuit A (·) can synthesize such a U given oracle access to some Boolean function f , then the PRS can be efficiently broken relative to f .
Our second main result is to rule out any single-query algorithm for this task, relative to a random oracle.
Theorem 1.2 (informal, see Theorem 5.2). Relative to a random oracle, there exists a PRS (and a quantum bit commitment scheme) secure against all one-query oracle algorithms A^f for every Boolean function f.
Theorem 1.2 offers the strongest evidence to date that the security of PRSes might be independent of all of traditional computational complexity. Our two results, when taken together, demonstrate the close connection between the Unitary Synthesis Problem and the security of PRSes; as we will see below, we essentially prove these two results simultaneously, because in constructing a PRS which cannot be broken with one query, we are implicitly constructing a unitary which cannot be synthesized with one query.

Our approach
We prove Theorems 1.1 and 1.2 by analyzing an oracle version of the single-copy PRS security game, which we call the "Oracle State Distinguishing Game" (see Section 3). To state this task, let us define two pieces of relevant notation. First, given a Boolean function h : {0, 1}^n → {±1}, we define the corresponding binary phase state as |ψ_h⟩ := (1/√(2^n)) · ∑_{x∈{0,1}^n} h(x) · |x⟩. Next, a function family is a function R : [K] × {0, 1}^n → {±1}. We think of R as defining a family of K Boolean functions as follows: for each 1 ≤ k ≤ K, we let R_k : {0, 1}^n → {±1} be the function R_k(·) := R(k, ·). In general, we require K ≪ N; a typical setting will be K = N/2.
Let R : [K] × {0, 1}^n → {±1} be a uniformly random function family. The Oracle State Distinguishing Game involves two parties, a challenger and an adversary. The adversary is modeled as an oracle circuit A^(·) which is allowed to query an arbitrary Boolean function f depending on R. The game is played as follows.
1. The challenger samples a uniformly random bit b ∈ {0, 1}.
2. The challenger generates a random n-qubit state |ψ⟩ in one of two ways:
• If b = 0, the challenger samples a uniformly random k ∼ [K] and generates |ψ⟩ := |ψ_{R_k}⟩, the binary phase state corresponding to the Boolean function R_k.
• If b = 1, the challenger samples a uniformly random Boolean function h : {0, 1}^n → {±1} and generates |ψ⟩ := |ψ_h⟩, the corresponding binary phase state.
3. The challenger sends |ψ⟩ to the adversary.
4. The adversary runs the oracle circuit A^f on |ψ⟩ and outputs a bit b′ ∈ {0, 1}.
5. If b′ = b, then the adversary wins. Otherwise, they lose.
Intuitively, the function family R specifies a family of pseudorandom states {|ψ_{R_k}⟩}_{k∈[K]}, and the adversary's goal is to distinguish a randomly chosen state from this family from a uniformly random binary phase state |ψ_h⟩. As discussed above, an algorithm for the Unitary Synthesis Problem yields a successful adversary for the Oracle State Distinguishing Game, and so a query lower bound for the Oracle State Distinguishing Game implies a query lower bound for the Unitary Synthesis Problem. We show the following lower bound for the Oracle State Distinguishing Game.
Theorem 1.4. Suppose that A^f is a one-query oracle circuit that achieves advantage ε in the Oracle State Distinguishing Game. Then, A^f must make a query of size at least ℓ = Ω(Kε^2) bits.
This lower bound implies that for typical settings of K (such as K = N/2), to achieve a nonnegligible distinguishing probability, the adversary's query must have length exponential in n; in particular, a superpolynomial-length query is required whenever ε ≥ n^{ω(1)}/√K. This dependence on K is optimal, as there are polynomial-time 1-query algorithms which do achieve distinguishing advantage Ω(1/√K) (see Appendix C). As discussed above, Theorem 1.4 immediately implies Theorem 1.1, our one-query lower bound for the Unitary Synthesis Problem. In fact, since we show that the adversary's distinguishing advantage is negligible, this gives a unitary U_R which is hard to synthesize even in an extremely weak sense: no efficient one-query algorithm A^f can correctly implement any unitary that even remotely approximates the behavior of U_R. In addition, since the Oracle State Distinguishing Game is an oracle analogue of the security game for a single-copy PRS family, standard techniques (see Section 5) allow us to transform Theorem 1.4 into a proof that, relative to a random oracle, there exist PRS families and quantum bit commitment schemes secure against all one-query adversaries. This gives Theorem 1.2.
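As a worked instance of Theorem 1.4, the following calculation plugs in the typical setting K = N/2 together with an illustrative target advantage. The constant c > 0 stands for the constant hidden in the Ω(·) notation, and the choice ε = 2^{−n/4} is an assumption made only for this example.

```latex
\[
K = \frac{N}{2} = 2^{n-1}, \qquad \varepsilon = 2^{-n/4}
\quad\Longrightarrow\quad
\ell \;\ge\; c \cdot K \varepsilon^{2}
\;=\; c \cdot 2^{\,n-1} \cdot 2^{-n/2}
\;=\; c \cdot 2^{\,n/2 - 1}.
\]
```

So even an inverse-exponentially small advantage of 2^{−n/4} already forces a query of exponential length. Read in the other direction, a query of length ℓ = poly(n) caps the advantage at ε = O(√(ℓ/K)) = O(√(poly(n)/2^{n−1})), which is negligible in n.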
These results demonstrate the usefulness of the Oracle State Distinguishing Game as a means for studying the Unitary Synthesis Problem, and we believe that it is also a useful avenue for proving stronger lower bounds against algorithms which use more than one query. To this end, we make the following conjecture.
Conjecture 1.5 (Strong Non-Synthesis Conjecture). For all K ≥ n^{ω(1)}, any polynomial-query oracle algorithm A^f wins the Oracle State Distinguishing Game with advantage at most negl(n).
A proof of Conjecture 1.5 would imply a negative resolution to the Unitary Synthesis Problem. In addition, it would imply the existence of single-copy PRSes (and thus, quantum bit commitments) secure against all efficient polynomial-query adversaries, relative to a random oracle. In other words, computationally secure quantum cryptography would not black-box imply the existence of any hard language. We note that the lower bound K ≥ n ω(1) in Conjecture 1.5 is necessary; as discussed above, if K = poly(n), then there is a simple attack that achieves 1/ √ K = 1/poly(n) advantage. 2 In Section 2.5, we state weaker conjectures which correspond to simpler cases of Conjecture 1.5. In particular, in Conjecture 2.7, we give a self-contained mathematical conjecture which corresponds to the simplest class of oracle adversaries that we do not know how to rule out.
2 In fact, there is another attack that achieves advantage close to 1 in this regime, based on the LMR algorithm [LMR14,Yue22b]. The adversary can make a single call to its R-dependent oracle f to generate m = poly(n) copies of each state |ψ R k ⟩. Then for each 1 ≤ k ≤ K, the adversary can test if the challenge state |ψ⟩ is equal to |ψ R k ⟩ by measuring |ψ⟩ ⊗ |ψ R k ⟩ ⊗m with {Πsym, Id − Πsym}, where Πsym is the projector onto the symmetric subspace. If they are not equal, doing so will only perturb the state |ψ⟩ slightly, allowing the adversary to reuse it for further tests.
Additional remarks. We make two final observations about the Oracle State Distinguishing Game. First, note that the adversary's task is to perform a measurement {M_0, M_1} which distinguishes between the two cases of the game. In particular, writing U_R for the unitary written above, the adversary would like to carry out the measurement specified by the two projectors M_0 := U_R^† · (∑_{k=1}^{K} |k⟩⟨k|) · U_R and M_1 := Id − M_0. This is an example of a measurement synthesis task, an inherently quantum problem in which the input is quantum but the output is classical. Measurement synthesis has been discussed much less than state synthesis and unitary synthesis in the literature (the only work we are aware of that discusses it is [BEM+23]). However, our results suggest that it is measurement synthesis that is the hard problem at the core of unitary synthesis. Combined with the fact that state synthesis has efficient one-query algorithms [Ros23a], this suggests that the crucial distinction between classical problems and inherently quantum problems is whether the input, and not necessarily the output, is classical or quantum.
Second, we note that the Oracle State Distinguishing Game is fairly robust to the precise distribution of states used to specify it. For example, rather than specifying the game in terms of random binary phase states, we could have specified it using Haar random states. In this version of the game, K independent Haar random states |ψ_1⟩, . . . , |ψ_K⟩ are sampled in advance. Then the adversary is given either (b = 0) one of these K states sampled uniformly at random, or (b = 1) a new Haar random state |ψ⟩, and asked to distinguish between these two cases. Though we do not prove it here, our lower bound in Theorem 1.4 also holds for this variant of the Oracle State Distinguishing Game. One nice property of this distribution is that hardness of the Oracle State Distinguishing Game for this distribution directly implies hardness of Unitary Synthesis for a Haar-random unitary U.
We refer the reader to Section 3.3 for further discussion.
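To make the ideal (inefficient) distinguishing measurement from the remarks above concrete, here is a minimal NumPy sketch for very small n. It builds the projector onto the span of the family states and computes the two acceptance probabilities exactly by enumerating every Boolean function h. The function names and the brute-force enumeration are illustrative assumptions, not the paper's construction, and the "gap" it returns is not claimed to match the paper's normalization of advantage.

```python
import itertools
import numpy as np

def phase_state(h):
    """Binary phase state (1/sqrt(N)) * sum_x h(x)|x> for h with values in {+1, -1}."""
    h = np.asarray(h, dtype=float)
    return h / np.sqrt(h.shape[0])

def span_projector(states, tol=1e-10):
    """Orthogonal projector onto the span of the given state vectors."""
    a = np.asarray(states, dtype=float).T           # columns are the states
    u, sv, _ = np.linalg.svd(a, full_matrices=False)
    basis = u[:, sv > tol]                          # orthonormal basis of the span
    return basis @ basis.T

def ideal_measurement_gap(R):
    """Acceptance probabilities of the projector onto span{|psi_{R_k}>}.

    R is a K x N array with entries in {+1, -1}. Returns
    (Pr[accept | b = 0], Pr[accept | b = 1], difference).
    Enumerates all 2^N functions h, so it is only usable for tiny N.
    """
    R = np.asarray(R)
    _, N = R.shape
    family = [phase_state(row) for row in R]
    m0 = span_projector(family)
    p_family = np.mean([psi @ m0 @ psi for psi in family])
    p_random = np.mean([phase_state(h) @ m0 @ phase_state(h)
                        for h in itertools.product([1, -1], repeat=N)])
    return p_family, p_random, p_family - p_random

# Example with n = 3 (N = 8) and K = N/2 = 4.
rng = np.random.default_rng(0)
R = rng.choice([1, -1], size=(4, 8))
p_family, p_random, gap = ideal_measurement_gap(R)
```

Because the average of |ψ_h⟩⟨ψ_h| over all h equals Id/N, the second probability is exactly rank(M_0)/N, while the first is exactly 1. The measurement therefore distinguishes well whenever K ≪ N; the difficulty studied in this paper is implementing it with few oracle queries.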

Related Work
In this section, we elaborate on some works related to the Unitary Synthesis Problem and our results. We discuss (1) prior lower bounds, (2) positive results on the closely-related state synthesis problem, and (3) related work in unitary complexity theory.

Lower bounds for unitary synthesis
The best known prior lower bound for the Unitary Synthesis Problem comes from the original paper on this topic by Aaronson and Kuperberg [AK07]. To understand their lower bound, let us first make more explicit the computational model we are assuming for our oracle circuit A^(·). A general oracle circuit A^(·) may wish to make use of additional ancilla qubits, in which case it will be structured as follows: it will have an n-qubit input register and an a-qubit input ancilla register initialized to |0^a⟩, as well as an n-qubit output register and an a-qubit output "junk" register. Indeed, if A^(·) does not have ancillas, then it is unable to query any oracle f on inputs of length greater than n, which turns out to make A^(·) quite weak. This is because for such an A^(·), the number of possible unitaries it can synthesize when ranging over all functions f is bounded by 2^{2^n}, which is simply not enough to "cover all unitaries" by a counting argument. (See Appendix B for a simple lower bound along these lines.)
Now we can state the Aaronson and Kuperberg [AK07] lower bound. They showed a one-query lower bound against any oracle circuit A^(·) which has the following property: for every choice of oracle f, the oracle circuit A^f is required to exactly implement an n-qubit unitary on its first n qubits. Mathematically, this means that for any n-qubit state |ψ⟩, we must have that A^f · |ψ⟩ |0^a⟩ = (U_f |ψ⟩) ⊗ |junk_f⟩, where U_f is some n-qubit unitary which depends on f, and |junk_f⟩ is some a-qubit junk state which depends on f. This defines a class of oracle algorithms that turns out to be highly restrictive, for several reasons. We list two. To elaborate on (2), consider the following simple attack: the oracle circuit A^f queries f to learn an ℓ-bit classical string s on an ancilla space, and then applies an n-qubit unitary U_s that depends on s.
By the Bernstein-Vazirani trick, A^f can learn an ℓ-bit string s in a single query by first preparing the uniform superposition on ℓ qubits, then querying the Boolean function f_s(x) := s · x, and finally applying a Hadamard transform. Even though this oracle circuit A^f always implements a unitary on the first n qubits when f computes an inner-product function, this is not guaranteed in general: for arbitrary f, the oracle circuit may obtain a superposition over different s, in which case the operation on the first n qubits is not guaranteed to be unitary. Indeed, Aaronson and Kuperberg are able to prove their lower bound against this class by a counting argument: they prove that the number of distinct unitaries that a one-query oracle circuit A^(·) in this class can synthesize, ranging over all oracles f, is at most 4^{2^n} [AK07, Theorem 6.7], irrespective of the number of ancilla qubits a. Unfortunately, as we discuss in Section 2.1 and Appendix B, these types of counting arguments are insufficient to prove a general query lower bound.
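The Bernstein-Vazirani trick described above can be checked with a small state-vector simulation. The sketch below is illustrative only: the function names are hypothetical, the query is modeled as the phase version |x⟩ → (−1)^{s·x}|x⟩ of the inner-product function f_s, and the simulation takes time exponential in ℓ (with ℓ ≥ 1).

```python
import numpy as np

def hadamard_transform(vec):
    """Apply the l-qubit Hadamard transform to a length-2^l real vector."""
    out = np.asarray(vec, dtype=float).copy()
    size = out.shape[0]
    h = 1
    while h < size:
        for start in range(0, size, 2 * h):
            a = out[start:start + h].copy()
            b = out[start + h:start + 2 * h].copy()
            out[start:start + h] = a + b
            out[start + h:start + 2 * h] = a - b
        h *= 2
    return out / np.sqrt(size)

def bernstein_vazirani(s_bits):
    """Recover the hidden string s from one simulated phase query to f_s(x) = s.x mod 2."""
    ell = len(s_bits)
    size = 2 ** ell
    s = int("".join(str(bit) for bit in s_bits), 2)
    # Step 1: uniform superposition over all l-bit strings.
    state = np.full(size, 1.0 / np.sqrt(size))
    # Step 2: a single phase query, |x> -> (-1)^{s.x} |x>.
    phases = np.array([(-1) ** bin(x & s).count("1") for x in range(size)])
    state = phases * state
    # Step 3: the Hadamard transform maps the state exactly to |s>.
    state = hadamard_transform(state)
    outcome = int(np.argmax(np.abs(state)))
    return [int(c) for c in format(outcome, f"0{ell}b")]

recovered = bernstein_vazirani([1, 0, 1, 1])
```

The final state has all of its amplitude on |s⟩, so the measurement outcome is deterministic. For an oracle f that is not an inner-product function, the same circuit generally ends in a superposition over many strings, which is the failure mode discussed above.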
A more recent lower bound, due to Rosenthal [Ros22], shows that unitary synthesis is hard relative to a state synthesis oracle. Roughly speaking, this lower bound states that synthesizing a unitary U requires roughly 2^{n/2} queries to an oracle that, on any classical input |x⟩, for x ∈ {0, 1}^n, outputs the state |x⟩ ⊗ U |x⟩. This shows that the power to produce any state of the form U |x⟩ is insufficient to implement U efficiently. However, the technique says little about the problem of synthesizing U relative to an arbitrary function oracle f.

Relationship to state synthesis.
Let us contrast our one-query lower bound for the Unitary Synthesis Problem with the state of affairs for a related problem known as state synthesis. State synthesis is the task of implementing a quantum circuit that outputs a specified n-qubit quantum state |ψ⟩ when run on the all-0's input. Alternatively, one can view state synthesis as an easier version of unitary synthesis, where the goal is merely to implement the unitary correctly on the all 0's input, rather than on all possible inputs.
Like unitary synthesis, state synthesis requires large quantum circuits: it can be shown via counting arguments that there exist worst-case states on n qubits that require circuits of size Ω(2^n/n) to compute approximately (see the excellent discussion of this in [Ros23b, Section 1.3.4]).
It turns out, however, that state synthesis becomes easy if Boolean functions are easy [Aar16, Ros23a]. In particular, Rosenthal's state-of-the-art result [Ros23a] gives a quantum polynomial-time oracle algorithm A^(·) such that for any n-qubit pure state |ψ⟩, there exists a Boolean function f : {0, 1}^m → {±1} such that A^f(1^n) makes one quantum query to f and outputs |ψ⟩ up to inverse exponential precision. For some intuition behind this result, observe that binary phase states (1/√(2^n)) · ∑_{x∈{0,1}^n} f(x) · |x⟩ are trivial to synthesize with one query: simply prepare the uniform superposition (1/√(2^n)) · ∑_{x∈{0,1}^n} |x⟩, and make one query to the phase oracle O_f : |x⟩ → f(x) · |x⟩. It turns out that worst-case states can then be synthesized via a careful reduction to the binary phase state case. This can be viewed as a one-query reduction from the task of state synthesis to the problem of computing an arbitrary Boolean function. In contrast, our main result shows that no such reduction is possible for unitary synthesis.
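The one-query synthesis of a binary phase state described above amounts to multiplying the uniform superposition by a diagonal phase oracle. A minimal NumPy sketch follows; the function name and the truth-table input format are assumptions made for illustration.

```python
import numpy as np

def synthesize_phase_state(f_values):
    """Simulate one phase query on the uniform superposition.

    f_values is the truth table of f : {0,1}^n -> {+1, -1}, listed in
    lexicographic order of x, so it has length 2^n.
    """
    f = np.asarray(f_values, dtype=float)
    size = f.shape[0]
    uniform = np.full(size, 1.0 / np.sqrt(size))   # (1/sqrt(2^n)) * sum_x |x>
    return f * uniform                             # O_f is diagonal: |x> -> f(x)|x>

# Example for n = 2 with f = (+1, -1, -1, +1).
state = synthesize_phase_state([1, -1, -1, 1])
```

The output is a unit vector whose amplitudes all have magnitude 1/√(2^n), with signs given by f. Synthesizing an arbitrary state requires the more careful reduction of [Ros23a], which this sketch does not attempt.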

Quantum cryptography and unitary complexity
A connection between (plain model) quantum cryptography and the Unitary Synthesis Problem was recently discovered by Kretschmer [Kre23], who showed that if the Unitary Synthesis Problem is resolved in the positive, then showing the existence of a secure PRS implies that BPP ≠ NEXP. This result says that traditional complexity theory does have something to say about the existence of PRSes, but only if unitaries are easy to synthesize.
Beyond "traditional complexity theory," a very recent and intriguing line of work has introduced a complexity theory of inherently quantum problems, with complexity classes corresponding to both state synthesis problems and unitary synthesis problems [RY22, INN+22, Ros23a, MY23, BEM+23, DGLM23]. As above, this line of work argues that traditional complexity theory is ill-equipped to address the complexity of inherently quantum problems, as traditional complexity theory is only about classical-input, classical-output problems, i.e., functions f : {0, 1}^* → {0, 1}^*. An important open direction is to study the relationship between these new inherently quantum complexity theories and the traditional "classical" complexity theory. Interestingly, Kretschmer's result above [Kre23] suggests that these seemingly different complexity theories might be closer than they first appear, if the Unitary Synthesis Problem is resolved in the positive. In particular, his result, stated more broadly, is the following: suppose the Unitary Synthesis Problem is resolved in the positive. Then unitaryBQP ≠ unitaryPSPACE implies that BPP ≠ NEXP. In this light, our Theorem 1.1, providing negative evidence for the Unitary Synthesis Problem, can also be interpreted as providing positive evidence that these complexity theories are in fact distinct.

Organization
The remainder of this paper is organized as follows. Section 2 gives a technical overview of our proofs. Section 3.1 includes preliminary details about oracle circuits, building towards a simple normal form for these circuits that we will use in our proofs. In Section 4, we give the proof of our main result, the one-query lower bound for the Oracle State Distinguishing Game, which we then use in Section 5 to show the existence of secure PRSes and quantum bit commitments relative to a random oracle. Appendix A includes a second proof of our main result with slightly worse parameters. In Appendix B, we show a counting lower bound against even many-query oracle circuits which can only compute a small number of distinct unitaries, generalizing the one-query lower bound of Aaronson and Kuperberg [AK07]. Finally, in Appendix C, we give a one-query algorithm to match our main lower bound (Theorem 4.18) in its dependence on K.

Technical overview
We will sketch the proof of Theorem 1.4, beginning by describing our mathematical model for single-query adversaries in Section 2.1. Following this, we will develop our proof strategy in the context of three different and increasingly complicated types of adversaries. First, in Section 2.2, we will look at adversaries which use their one query to prepare a quantum advice state. Next, in Section 2.3, we will look at adversaries which have no ancilla qubits and do not apply any gates prior to their oracle query. Finally, in Section 2.4, we will look at general single-query adversaries.

Modeling the adversary
A single-query adversary can be modeled as a quantum circuit with an input register of n qubits and an ancilla register of a qubits, for a size of m = n + a total qubits. Given an n-qubit input state |ψ⟩, the adversary acts as follows.
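1. The adversary will initialize its ancilla qubits to |0^a⟩. Then, it applies a unitary U to |ψ⟩ |0^a⟩. We write V := U · (Id ⊗ |0^a⟩) for the resulting isometry, so that the adversary's state at this point is V · |ψ⟩.
2. Next, the adversary makes its single query by applying the oracle O_f to its state.
3. Finally, the adversary performs a binary projective measurement {Π, Id − Π} on its state. This produces a measurement outcome b′ ∈ {0, 1}, which it outputs as its guess.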
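The steps above can be sketched as a small state-vector simulation. This is a minimal sketch under stated assumptions: the oracle is modeled as a phase oracle O_f : |x⟩ → f(x)|x⟩ acting on all m = n + a qubits, the returned value is the probability of the outcome associated with Π, and the function and parameter names are hypothetical.

```python
import numpy as np

def accept_probability(psi, U, f_values, Pi):
    """Probability of the outcome associated with Pi for a one-query adversary.

    psi      : input state on n qubits (length 2^n)
    U        : unitary on m = n + a qubits (2^m x 2^m)
    f_values : truth table of f : {0,1}^m -> {+1, -1} (length 2^m)
    Pi       : projector on m qubits (2^m x 2^m)
    """
    psi = np.asarray(psi, dtype=complex)
    U = np.asarray(U, dtype=complex)
    Pi = np.asarray(Pi, dtype=complex)
    anc_dim = U.shape[0] // psi.shape[0]
    ancilla = np.zeros(anc_dim, dtype=complex)
    ancilla[0] = 1.0                                  # ancilla register |0^a>
    state = U @ np.kron(psi, ancilla)                 # step 1: V|psi> = U(|psi>|0^a>)
    state = np.asarray(f_values, dtype=complex) * state   # step 2: assumed phase oracle O_f
    return float(np.linalg.norm(Pi @ state) ** 2)     # step 3: squared norm after projecting

# Tiny example with n = 1 and no ancillas: the oracle flips |+> to |->.
plus = np.array([1.0, 1.0]) / np.sqrt(2)
Pi_plus = np.outer(plus, plus)
p_trivial = accept_probability(plus, np.eye(2), [1, 1], Pi_plus)
p_flipped = accept_probability(plus, np.eye(2), [1, -1], Pi_plus)
```

In the example, the trivial oracle leaves |+⟩ unchanged, while the oracle f = (+1, −1) maps it to |−⟩, so a single query moves the acceptance probability from one extreme to the other.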
After the oracle, the adversary's state is O_f · V · |ψ⟩. Thus, the probability it outputs b′ = 0 is ∥Π · O_f · V · |ψ⟩∥^2 = ⟨ψ| V^† · O_f^† · Π · O_f · V |ψ⟩.
Intuitively, one should think of the size m as "small", say m = poly(n). This is because m is also the length of the adversary's oracle query, and it is necessary for us to assume a bound on the query length so that the problem remains nontrivial. Otherwise, there is a simple attack based on the Bernstein-Vazirani algorithm [BV97] which solves the problem using a single extremely large query of length K · N, which we describe below.
The state on the right-hand side is simply the Hadamard transform of |r⟩, and thus the adversary can obtain the entire truth table of R.
As it turns out, once we assume our adversary has "small" query length, it can be converted to one with "small" size m as well (see Section 3.5). Hence, we may assume that the adversary's oracle is applied to all m qubits. To simplify notation throughout the paper, we set N := 2^n and M := 2^m and associate the set {1, . . . , N} with {0, 1}^n and {1, . . . , M} with {0, 1}^m. As a result, a "Boolean" function is now formatted as h : [N] → {±1} and is associated with the phase state |ψ_h⟩ := (1/√N) · ∑_{x∈[N]} h(x) · |x⟩. The adversary's distinguishing advantage is then | Pr_{k∼[K]}[A^f(|ψ_{R_k}⟩) = 0] − Pr_h[A^f(|ψ_h⟩) = 0] |, where here and throughout this section we are writing h for a uniformly random Boolean function h : [N] → {±1}. Substituting in Equation (2), this is equal to | E_{k∼[K]} ⟨ψ_{R_k}| V^† O_f^† Π O_f V |ψ_{R_k}⟩ − E_h ⟨ψ_h| V^† O_f^† Π O_f V |ψ_h⟩ |. Our goal is to prove Theorem 1.4, which upper-bounds this quantity.
Let us now briefly discuss one potential approach for proving Theorem 1.4: counting arguments. These are based on the simple observation that the distinguishing advantage is easily upper-bounded for any fixed oracle f, which corresponds to an adversary that does not depend on R. This can be argued using standard concentration of measure tools from probability theory, and the resulting concentration bound is extremely good: in particular, the probability that a fixed A^f has distinguishing advantage at least ε is at most 2^{−Ω(ε^2 KN)}. Given this degree of concentration, it is tempting to simply union bound over all choices of f to upper-bound the maximum distinguishing advantage; this is known as a counting argument. Unfortunately, this approach quickly begins to fail as the adversary's space grows: the number of possible functions f is 2^M, where M is potentially much larger than KN. Recall that KN ≤ N^2 = 2^{2n}, while M could be (at least) 2^{poly(n)}, for an arbitrary poly(n). Thus, this type of counting argument cannot give a general one-query lower bound.
That said, it can rule out some interesting special cases of adversaries, which we discuss in Appendix B. Finally, we note that there is a more powerful version of counting arguments known as chaining (cf. [Ver18, Chapter 8]), but we were unable to successfully apply chaining arguments to this problem.
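To see where the union bound stops being useful, it may help to write the counting argument out as a single chain of inequalities. The following is a worked sketch; the constant $c > 0$ stands for the unspecified constant hidden inside the $\Omega(\cdot)$ and is our notation, not the paper's.

```latex
\Pr_{R}\Bigl[\exists f : A^f \text{ has advantage} \ge \varepsilon\Bigr]
\;\le\; \sum_{f : [M] \to \{\pm 1\}} \Pr_{R}\Bigl[A^f \text{ has advantage} \ge \varepsilon\Bigr]
\;\le\; 2^{M} \cdot 2^{-c\,\varepsilon^{2} K N}
\;=\; 2^{\,M - c\,\varepsilon^{2} K N}.
```

The right-hand side is below 1 only when $M < c\,\varepsilon^2 K N \le c\,\varepsilon^2\, 2^{2n}$, which is why the argument says nothing once $M$ is allowed to be as large as $2^{\mathrm{poly}(n)}$.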
In the next few subsections, we will describe an alternative approach for bounding the maximum distinguishing advantage across all choices of f simultaneously via matrix concentration inequalities.

Adversaries with quantum advice
We begin with the simple but conceptually useful special case of one-query adversaries, namely those that use the query to f to synthesize an f -dependent advice state. In other words, the adversary acts as follows.
1. First, it applies an isometry $V$ that acts by appending a fixed $m$-qubit state $|\phi\rangle$. Thus, the $n$-qubit input state $|\psi\rangle$ is mapped to the $(n + m)$-qubit state $|\psi\rangle \otimes |\phi\rangle$. (We are abusing notation in this subsection by writing $m$ only for the qubits in the advice state, rather than for all of the qubits. We will return to the normal definition of $m$ in Sections 2.3 and 2.4 below.)
2. Next, it makes an oracle query $O_f$ that acts as the identity on the input state $|\psi\rangle$ and only modifies $|\phi\rangle$. Then the adversary's state becomes $|\psi\rangle \otimes |\phi_f\rangle$, where $|\phi_f\rangle$ is some $f$-dependent state.
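In total, for such an adversary, $O_f \cdot V \cdot |\psi\rangle = |\psi\rangle \otimes |\phi_f\rangle$. Attacks of this form can synthesize many kinds of states: for example, if $|\phi\rangle$ is a uniform superposition, then $|\phi_f\rangle$ can be any binary phase state. (We remark that there are techniques in the cryptography literature for proving lower bounds against quantum advice [HXY19, CLQ20, CGLQ20, Liu23]. However, the techniques seem to be highly tailored to the advice setting and are not related to our approach.)

Supposing the adversary works in this manner, we can compute its maximum distinguishing advantage on a uniformly random $R : [K] \times [N] \to \{\pm 1\}$. By Equation (3), it equals
$$\max_f \Bigl|\, \mathbb{E}_{k \sim [K]} \bigl(\langle\psi_{R_k}| \otimes \langle\phi_f|\bigr)\, \Pi\, \bigl(|\psi_{R_k}\rangle \otimes |\phi_f\rangle\bigr) - \mathbb{E}_{h} \bigl(\langle\psi_h| \otimes \langle\phi_f|\bigr)\, \Pi\, \bigl(|\psi_h\rangle \otimes |\phi_f\rangle\bigr) \Bigr|.$$
The benefit of focusing on advice states is that we can factor out the $f$-dependent term $|\phi_f\rangle$ from each expectation. To do so, for any Boolean function $h : [N] \to \{\pm 1\}$, define the $M \times M$ matrix
$$\Pi_h := \bigl(\langle\psi_h| \otimes \mathrm{Id}\bigr) \cdot \Pi \cdot \bigl(|\psi_h\rangle \otimes \mathrm{Id}\bigr).$$
Note that $0 \le \Pi_h \le \mathrm{Id}$, since $|\psi_h\rangle$ is a unit vector and $\Pi$ is a projection. Then we can rewrite the distinguishing advantage as
$$\max_f \Bigl|\, \langle\phi_f| \cdot \bigl(\mathbb{E}_{k \sim [K]} \Pi_{R_k} - \mathbb{E}_{h} \Pi_h\bigr) \cdot |\phi_f\rangle \Bigr|. \qquad (4)$$
Since $|\phi_f\rangle$ is a unit vector, we can upper bound this by a maximum over all unit vectors, i.e.
$$(4) \;\le\; \max_{|v\rangle} \Bigl|\, \langle v| \cdot \bigl(\mathbb{E}_{k \sim [K]} Z_{R_k}\bigr) \cdot |v\rangle \Bigr| \;=\; \Bigl\| \mathbb{E}_{k \sim [K]} Z_{R_k} \Bigr\|_{\mathrm{op}}, \qquad \text{where } Z_{R_k} := \Pi_{R_k} - \mathbb{E}_{h} \Pi_h.$$
Here, we are writing $\|\cdot\|_{\mathrm{op}}$ for the operator norm. Thus, we have reduced our problem to bounding the operator norm of the average of $K$ random matrices $Z_{R_1}, \ldots, Z_{R_K}$. We will bound this operator norm using the technique of matrix concentration, which generalizes scalar concentration bounds (such as Chernoff-Hoeffding bounds) to the random matrix setting. Specifically, the matrix Hoeffding inequality (roughly) says the following (see [Tro12], Theorem 1.3 or Theorem A.18 for the precise statement).

Theorem 2.3 (Matrix Hoeffding (informal)). If $K$ independent and identically distributed mean-zero random $D \times D$ Hermitian matrices $Z_1, \ldots, Z_K$ always have bounded operator norm, then with high probability,
$$\Bigl\| \mathbb{E}_{k \sim [K]} Z_k \Bigr\|_{\mathrm{op}} \le O\Bigl(\sqrt{\log(D)/K}\Bigr).$$
(Note that the scalar Hoeffding bound can be recovered by taking $D = 1$ above.)

To apply the matrix Hoeffding inequality to our problem, we need to verify that when $R : [K] \times [N] \to \{\pm 1\}$ is uniformly random, our matrices $Z_{R_1}, \ldots, Z_{R_K}$ satisfy these properties. Indeed:
• $Z_{R_1}, \ldots, Z_{R_K}$ are independent and identically distributed since each $R_k$ is an independent, uniformly random Boolean function $R_k : [N] \to \{\pm 1\}$.
• For each $1 \le k \le K$, $Z_{R_k}$ has expectation zero: $\mathbb{E}_{R}[Z_{R_k}] = \mathbb{E}_{R_k}[\Pi_{R_k}] - \mathbb{E}_{h}[\Pi_h] = 0$.
• For each $1 \le k \le K$, the operator norm $\|Z_{R_k}\|_{\mathrm{op}}$ is always bounded by 2, since $\|Z_{R_k}\|_{\mathrm{op}} \le \|\Pi_{R_k}\|_{\mathrm{op}} + \|\mathbb{E}_{h} \Pi_h\|_{\mathrm{op}} \le 2$.
As a result, since in our setting $D = M$, the maximum distinguishing advantage is at most $O(\sqrt{\log(M)/K})$ with high probability, and so an $\varepsilon$-distinguisher requires $\log(M) = \Omega(K\varepsilon^2)$, as claimed.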
In other words, the adversary needs a huge advice state to win the distinguishing game. In summary, our strategy involved identifying a well-behaved quantity that governs the advantage of $A^f$ across all choices of $f$ simultaneously. As we have seen, the operator norm is an example of such a quantity: although bounding the quadratic form $\langle v| \cdot (\mathbb{E}_{k \sim [K]} Z_{R_k}) \cdot |v\rangle$ for all vectors $|v\rangle$ would naively require the concentration of $O(1/\varepsilon)^M$ different scalars (corresponding to an $\varepsilon$-net over $\mathbb{C}^M$), matrix concentration shows that the operator norm behaves as if it has $M$, rather than $2^M$, "independent degrees of freedom".
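The advice-state reduction can be checked numerically on a toy instance. The sketch below is only an illustration under stated assumptions: the sizes `N`, `M_adv`, `K`, the randomly chosen projector, and all helper names are ours, and the closed form used for the mean relies on the fact that a uniformly random phase state averages to $\mathrm{Id}/N$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions): input dimension N, advice dimension M_adv, K functions.
N, M_adv, K = 4, 8, 64

def random_projector(dim, rank, rng):
    """Projector onto a random rank-`rank` subspace of C^dim."""
    g = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    q, _ = np.linalg.qr(g)          # q has orthonormal columns
    return q @ q.conj().T

def phase_state(h):
    """Binary phase state |psi_h> = N^{-1/2} * sum_i h(i) |i>."""
    return np.asarray(h) / np.sqrt(len(h))

def reduced_projector(proj, psi, adv_dim):
    """Pi_h = (<psi_h| (x) Id) Pi (|psi_h> (x) Id), an adv_dim x adv_dim matrix."""
    embed = np.kron(psi.reshape(-1, 1), np.eye(adv_dim))   # |psi_h> (x) Id
    return embed.conj().T @ proj @ embed

proj = random_projector(N * M_adv, (N * M_adv) // 2, rng)

# E_h[Pi_h] in closed form: E_h |psi_h><psi_h| = Id/N, so E_h[Pi_h] is the
# partial trace of Pi over the input register, divided by N.
mean_pi = np.einsum("iaib->ab", proj.reshape(N, M_adv, N, M_adv)) / N

R = rng.choice([-1, 1], size=(K, N))                       # random function family
Z = [reduced_projector(proj, phase_state(R[k]), M_adv) - mean_pi for k in range(K)]

# Operator norm of the average: this upper-bounds the advantage of every advice state.
avg_norm = np.linalg.norm(np.mean(Z, axis=0), ord=2)
print(avg_norm)
```

Increasing `K` while keeping `M_adv` fixed should shrink `avg_norm`, in line with the $\sqrt{\log(M)/K}$ behaviour described above; the code makes no claim about the constants.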

Adversaries with a trivial isometry
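Let us recall Equation (3), our expression for the adversary's maximum distinguishing advantage:
$$\max_f \Bigl|\, \mathbb{E}_{k \sim [K]} \langle\psi_{R_k}|\, V^\dagger O_f\, \Pi\, O_f V\, |\psi_{R_k}\rangle - \mathbb{E}_{h} \langle\psi_h|\, V^\dagger O_f\, \Pi\, O_f V\, |\psi_h\rangle \Bigr|.$$
The advice state case above suggests the following approach to bounding this expression:
1. Factor the dependence on the oracle $O_f$ to the "outside" of the expression, and
2. Rely on a matrix concentration inequality to bound the advantage for all $f$ simultaneously.
Unfortunately, the advice state case does not tell us whether this approach is possible, or how to carry it out, in general. To gain some intuition, we will analyze another simple special case, the case where $V = \mathrm{Id}$, in which the adversary does not use any ancilla qubits and only applies the identity unitary. In this case, $M = N$, and we will allow the adversary to query an arbitrary oracle $f : [N] \to \{\pm 1\}$. Then, because $V = \mathrm{Id}$, the adversary's maximum distinguishing advantage on a uniformly random $R : [K] \times [N] \to \{\pm 1\}$ is
$$\max_f \Bigl|\, \mathbb{E}_{k \sim [K]} \langle\psi_{R_k}|\, O_f\, \Pi\, O_f\, |\psi_{R_k}\rangle - \mathbb{E}_{h} \langle\psi_h|\, O_f\, \Pi\, O_f\, |\psi_h\rangle \Bigr|.$$
Towards "factoring out" the $O_f$ dependence to the outside of the expression, we make use of the fact that any binary phase state $|\psi_h\rangle$ can be written as the product of a diagonal $\{\pm 1\}$-matrix and the uniform superposition state:
$$|\psi_h\rangle = D_h \cdot |{+}_N\rangle, \qquad \text{where } D_h := \sum_{i=1}^{N} h(i)\, |i\rangle\langle i| \text{ and } |{+}_N\rangle := \frac{1}{\sqrt{N}} \sum_{i=1}^{N} |i\rangle.$$
The key benefit of this "diagonal decomposition" is that the diagonal matrices $D_h$ and $O_f$ commute, which allows us to rewrite the state $O_f \cdot |\psi_h\rangle$ as follows:
$$O_f \cdot |\psi_h\rangle = O_f \cdot D_h \cdot |{+}_N\rangle = D_h \cdot O_f \cdot |{+}_N\rangle = D_h \cdot |\phi_f\rangle,$$
where $|\phi_f\rangle := O_f \cdot |{+}_N\rangle$ is the binary phase state corresponding to $f$. Plugging this back into our expression for the maximum distinguishing advantage, we can again employ a spectral relaxation:
$$\max_f \Bigl|\, \langle\phi_f| \cdot \bigl(\mathbb{E}_{k \sim [K]} D_{R_k} \Pi D_{R_k} - \mathbb{E}_{h} D_h \Pi D_h\bigr) \cdot |\phi_f\rangle \Bigr| \;\le\; \Bigl\| \mathbb{E}_{k \sim [K]} Z_{R_k} \Bigr\|_{\mathrm{op}}, \qquad \text{where } Z_{R_k} := D_{R_k} \Pi D_{R_k} - \mathbb{E}_{h} D_h \Pi D_h.$$
As in the advice state case, our problem has again reduced to bounding the operator norm of $\mathbb{E}_{k \sim [K]} Z_{R_k}$ for a uniformly random $R$. And just like before, the matrices $Z_{R_1}, \ldots, Z_{R_K}$ are mean-zero, independent and identically distributed, and their norm is bounded by 2, since
$$\|Z_{R_k}\|_{\mathrm{op}} \;\le\; \|D_{R_k} \Pi D_{R_k}\|_{\mathrm{op}} + \|\mathbb{E}_{h} D_h \Pi D_h\|_{\mathrm{op}} \;\le\; 2,$$
where the second inequality uses the fact that $D_h$ is a unitary matrix for any Boolean function $h : [N] \to \{\pm 1\}$. Thus, we can apply the matrix Hoeffding inequality as before. Since the $Z_{R_1}, \ldots, Z_{R_K}$ are $N \times N$ matrices, we obtain a bound on the maximum distinguishing advantage of $O(\sqrt{\log(N)/K}) = O(\sqrt{n/K})$.

To summarize, the key new idea in this special case was to introduce a diagonal decomposition which holds for arbitrary phase states $|\psi_h\rangle$.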
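The commutation step behind the diagonal decomposition is easy to verify numerically. The following is a minimal sketch with toy dimension `N = 8` and randomly drawn `h` and `f` (both our choices for illustration); indices run from 0 rather than 1.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8                                # toy dimension (assumption)

h = rng.choice([-1, 1], size=N)      # Boolean function defining the input state
f = rng.choice([-1, 1], size=N)      # oracle function chosen by the adversary

plus = np.ones(N) / np.sqrt(N)       # uniform superposition |+_N>
D_h = np.diag(h)                     # diagonal {+1,-1} matrix for h
O_f = np.diag(f)                     # phase oracle for f

psi_h = D_h @ plus                   # binary phase state |psi_h>
phi_f = O_f @ plus                   # binary phase state |phi_f>

lhs = O_f @ psi_h                    # the state after the query
rhs = D_h @ phi_f                    # same state, with f moved next to |+_N>
print(np.allclose(lhs, rhs))
```

Because both matrices are diagonal, the check succeeds for every choice of `h` and `f`, which is exactly what lets the $f$-dependence be pushed onto the unit vector $|\phi_f\rangle$.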

The general one-query bound
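Now we consider the case of a general adversary. Let us recall one last time Equation (3), our expression for the adversary's maximum distinguishing advantage:
$$\max_f \Bigl|\, \mathbb{E}_{k \sim [K]} \langle\psi_{R_k}|\, V^\dagger O_f\, \Pi\, O_f V\, |\psi_{R_k}\rangle - \mathbb{E}_{h} \langle\psi_h|\, V^\dagger O_f\, \Pi\, O_f V\, |\psi_h\rangle \Bigr|.$$
The previous special case suggests the following strategy for bounding this expression: first, for any Boolean function $h : [N] \to \{\pm 1\}$, find a "diagonal decomposition" that writes $V \cdot |\psi_h\rangle$ as the product of an $h$-dependent diagonal matrix and a fixed state; then, as before, commute $O_f$ past the diagonal matrix and bound the resulting expression by an operator norm. Importantly, the diagonal decomposition should satisfy the following two properties:
1. the fixed state should have unit norm, so that we can perform a spectral relaxation, and
2. the $h$-dependent diagonal matrix should have bounded operator norm, so that we can apply the matrix Hoeffding inequality.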
Unfortunately, it turns out that for a general isometry V , a diagonal decomposition satisfying the above requirements does not exist. Consider the following example.
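Example 2.4. Let $V = H^{\otimes n}$ be the $n$-qubit Hadamard transform (so $M = N$), and for each $r \in \{0,1\}^n$ let $h_r(x) := (-1)^{r \cdot x}$. Then $|\psi_{h_r}\rangle = H^{\otimes n} \cdot |r\rangle$, and thus $V \cdot |\psi_{h_r}\rangle = H^{\otimes n} \cdot |\psi_{h_r}\rangle = |r\rangle$. Suppose that we try to write each state $V \cdot |\psi_{h_r}\rangle$ as the product of an $h_r$-dependent diagonal matrix and the uniform superposition state:
$$V \cdot |\psi_{h_r}\rangle = D_r \cdot |{+}_N\rangle.$$
Then the only choice of $D_r$ satisfying the above is $D_r = \sqrt{N} \cdot |r\rangle\langle r|$, which has exponentially large operator norm. Moreover, there is nothing special about the uniform superposition: no matter what fixed state we use, there will exist $h_r$ such that $D_r$ has exponentially large operator norm.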
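The obstruction in this example can be reproduced directly. The sketch below assumes the functions are $h_r(x) = (-1)^{r \cdot x}$ and takes `n = 3`, `r = 5` as arbitrary toy values; basis states are indexed from 0.

```python
import numpy as np

n = 3                                # toy number of qubits (assumption)
N = 2 ** n

H1 = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
V = np.array([[1.0]])
for _ in range(n):
    V = np.kron(V, H1)               # V = n-fold tensor power of the Hadamard gate

def h_r(r):
    """h_r(x) = (-1)^(r . x) for x in {0,1}^n, with x encoded as an integer."""
    return np.array([(-1) ** bin(r & x).count("1") for x in range(N)])

plus = np.ones(N) / np.sqrt(N)       # uniform superposition |+_N>
r = 5
psi = h_r(r) / np.sqrt(N)            # |psi_{h_r}>
out = V @ psi                        # equals the basis state |r>

# The diagonal matrix D_r with D_r |+_N> = |r> has one nonzero entry, sqrt(N).
D_r = np.diag(out / plus)
op_norm = np.linalg.norm(D_r, ord=2)
print(op_norm, np.sqrt(N))
```

The operator norm grows as $\sqrt{N} = 2^{n/2}$, so no bounded-norm decomposition against the uniform superposition exists for this family.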
Thus, Example 2.4 shows that we cannot hope for a diagonal decomposition that satisfies our desired conditions for all binary phase states |ψ h ⟩. Nevertheless, we will show that a meaningful diagonal decomposition is still possible in the general case. The key insight, which we will show next, is that for any isometry V , there exists a diagonal decomposition of V · |ψ h ⟩ in which the h-dependent diagonal matrix has bounded operator norm with extremely high probability over the choice of h : [N ] → {±1}.

The weight vector decomposition
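Our goal is to find a diagonal decomposition of the form
$$V \cdot |\psi_h\rangle = D_{V,h} \cdot |\phi\rangle$$
in which $D_{V,h}$ has a "small" operator norm, with high probability over a uniformly random $h$. Let us consider what would be implied if such a decomposition were to exist, and then work backwards to construct the decomposition.
If such a decomposition exists, then for each $1 \le i \le M$, let us consider the $i$-th coordinates of the left-hand and right-hand sides, which are given by
$$\langle i| \cdot V \cdot |\psi_h\rangle = (D_{V,h})_{i,i} \cdot \phi_i. \qquad (6)$$
Because $D_{V,h}$ has "small" operator norm for a "typical" $h$, this means that $(D_{V,h})_{i,i}$ is "small" for a "typical" $h$. Hence, for such an $h$, the right-hand side of Equation (6) must be roughly equal to $\phi_i$, the magnitude of the $i$-th coordinate in $|\phi\rangle$. (More correctly, it must be not too much larger than $\phi_i$.) This, in turn, implies that the left-hand side of Equation (6) must be roughly equal to $\phi_i$ as well, at least for a "typical" $h$. This motivates studying the magnitude of the $i$-th coordinate of $V \cdot |\psi_h\rangle$ for a "typical" Boolean function $h$. We can do so by looking at its average squared magnitude
$$p_i := \mathbb{E}_{h} \bigl|\langle i| \cdot V \cdot |\psi_h\rangle\bigr|^2.$$
In other words, $p_i$ denotes the probability that measuring the state $V |\psi_h\rangle$ in the standard basis results in an outcome of $i$. Then we expect the $i$-th coordinate of $V \cdot |\psi_h\rangle$ to have magnitude roughly $\sqrt{p_i}$, and that suggests the following choice for our fixed state in the diagonal decomposition:
$$|\mathrm{wt}_V\rangle := \sum_{i=1}^{M} \sqrt{p_i} \cdot |i\rangle,$$
which we refer to as the weight vector for $V$. We observe that $|\mathrm{wt}_V\rangle$ is indeed a unit vector because
$$\sum_{i=1}^{M} p_i = \mathbb{E}_{h} \sum_{i=1}^{M} \bigl|\langle i| \cdot V \cdot |\psi_h\rangle\bigr|^2 = \mathbb{E}_{h} \bigl\| V \cdot |\psi_h\rangle \bigr\|^2 = 1,$$
where the last equality holds because $V$ is an isometry. Intuitively, the state $|\mathrm{wt}_V\rangle$ encodes how much weight the isometry $V$ places on each individual coordinate $1 \le i \le M$.
To compute the full diagonal decomposition, we write the isometry $V : \mathbb{C}^N \to \mathbb{C}^M$ coordinate-wise as $V = \sum_{i=1}^{M} |i\rangle\langle i| \cdot V$, yielding the decomposition
$$V \cdot |\psi_h\rangle = D_{V,h} \cdot |\mathrm{wt}_V\rangle, \qquad \text{where } D_{V,h} := \sum_{i=1}^{M} \frac{\langle i| \cdot V \cdot |\psi_h\rangle}{\sqrt{p_i}} \cdot |i\rangle\langle i|. \qquad (7)$$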
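The construction of the weight vector can be sketched in a few lines. The example below uses a randomly generated isometry with toy dimensions `N = 4`, `M = 16` (our assumptions), and relies on the observation that averaging over a uniformly random `h` gives $p_i$ equal to the squared norm of the $i$-th row of $V$ divided by $N$.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 4, 16                         # toy dimensions (assumption)

# A random isometry V : C^N -> C^M, i.e. an M x N matrix with orthonormal columns.
g = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))
V, _ = np.linalg.qr(g)

# p_i = E_h |<i|V|psi_h>|^2 = (squared norm of row i of V) / N,
# because E_h[h(j) h(j')] is 1 when j == j' and 0 otherwise.
p = np.sum(np.abs(V) ** 2, axis=1) / N
wt = np.sqrt(p)                      # weight vector |wt_V>

def diag_factor(h):
    """Diagonal entries of D_{V,h}: <i|V|psi_h> / sqrt(p_i)."""
    psi_h = np.asarray(h) / np.sqrt(N)
    return (V @ psi_h) / wt

h = rng.choice([-1, 1], size=N)
d = diag_factor(h)
reconstructed = d * wt               # D_{V,h} |wt_V>
print(np.linalg.norm(wt), np.max(np.abs(d)))
```

The sketch assumes every $p_i$ is nonzero, which holds with probability 1 for a random isometry; a coordinate with $p_i = 0$ would need to be handled separately.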

Bounding the operator norm of D V,h .
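Our next step is to determine whether the random matrix $D_{V,h}$ actually has bounded operator norm with high probability. Its operator norm is given by
$$\|D_{V,h}\|_{\mathrm{op}} = \max_{1 \le i \le M} \bigl|(D_{V,h})_{i,i}\bigr|,$$
and we know from Equation (7) that for every $1 \le i \le M$,
$$(D_{V,h})_{i,i} = \frac{\langle i| \cdot V \cdot |\psi_h\rangle}{\sqrt{p_i}} =: \alpha_{h,i}.$$
Therefore, if each coordinate $\alpha_{h,i}$ has good enough (scalar) concentration, we can bound $\|D_{V,h}\|_{\mathrm{op}}$ with high probability. Thus, we have reduced the problem to understanding the concentration of the random variables
$$\alpha_{h,i} = \frac{1}{\sqrt{N p_i}} \sum_{j=1}^{N} h(j) \cdot \langle i| V |j\rangle.$$
To see that this expression is small with high probability, we observe that it is a weighted linear combination of the independent random signs $h(1), \ldots, h(N)$, has mean zero, and has variance $\mathbb{E}_{h} |\alpha_{h,i}|^2 = 1$ by the definition of $p_i$. Therefore, standard (scalar) concentration tools (see Theorem 4.21) tell us that this random variable exhibits "sub-Gaussian concentration," implying (in this case) that its magnitude is larger than any $t$ with probability at most $2 \cdot \exp(-t^2/2)$. Union bounding over all $M$ coordinates, we conclude that $\|D_{V,h}\|_{\mathrm{op}} > t$ with probability at most $2M \cdot \exp(-t^2/2)$, and so it is, for example, unlikely to be much larger than $O(\sqrt{\log M})$.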
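The last step can be made explicit by solving the union bound for $t$. In the worked calculation below, $\delta \in (0,1)$ is a target failure probability; it is our notation for illustration and does not appear in the text.

```latex
2M \cdot \exp(-t^{2}/2) \le \delta
\;\iff\; t^{2} \ge 2 \ln(2M/\delta)
\;\iff\; t \ge \sqrt{2 \ln(2M/\delta)}.
% Example: \delta = 1/M gives t = \sqrt{2 \ln(2M^{2})} = O(\sqrt{\log M}).
```

So with probability at least $1 - \delta$ over $h$, the operator norm of $D_{V,h}$ is at most $\sqrt{2\ln(2M/\delta)}$, which is $O(\sqrt{\log M})$ whenever $\delta$ is inverse-polynomial in $M$.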

Putting everything together
With the weight-vector decomposition in hand, we can proceed to bounding the adversary's maximum distinguishing advantage along similar lines as in Section 2.3. To begin, we can rewrite the state $O_f \cdot V \cdot |\psi_h\rangle$ as follows:
$$O_f \cdot V \cdot |\psi_h\rangle = O_f \cdot D_{V,h} \cdot |\mathrm{wt}_V\rangle = D_{V,h} \cdot O_f \cdot |\mathrm{wt}_V\rangle$$
for every function $h$. In other words, the choice of $R$-dependent function $f$ here is accounted for as the vector $O_f \cdot |\mathrm{wt}_V\rangle$, which is a unit vector for all functions $f$. By an argument similar to the one in Section 2.3, we can then bound the adversary's maximum distinguishing advantage by the operator norm
$$\Bigl\| \mathbb{E}_{k \sim [K]} Z_{R_k} \Bigr\|_{\mathrm{op}}, \qquad \text{where } Z_{R_k} := D_{V,R_k}^\dagger \cdot \Pi \cdot D_{V,R_k} - \mathbb{E}_{h}\, D_{V,h}^\dagger \cdot \Pi \cdot D_{V,h}.$$
Thus, we have again reduced our problem to bounding the operator norm of an average of $K$ independent and identically distributed matrices $Z_{R_k}$ whose operator norms are bounded with high probability. In particular, since the operator norm $\|D_{V,h}\|_{\mathrm{op}}$ is usually no more than $O(\sqrt{\log M})$ over a uniformly random $h$, the operator norm of $D_{V,h}^\dagger \cdot \Pi \cdot D_{V,h}$ is usually no more than $O(\log M)$. We would like to then conclude that this average has small operator norm with high probability, which would imply our claimed result. Unfortunately, we cannot quite apply the matrix Hoeffding inequality directly, since it requires that the matrices have bounded operator norm with probability 1, not just with high probability. Getting around this issue requires some additional technical ideas, and we give two ways of handling it in the main body of the paper.
1. Our first approach is to truncate the diagonal matrices $D_{V,R_k}$ so that any entries whose magnitude exceeds some number $B$ are scaled down so that their magnitude is equal to $B$. The result is that all matrices now have bounded operator norm, which means we are in fact able to apply the matrix Hoeffding inequality. Ultimately, this results in a bound of $O(1/K^{1/4})$ on the adversary's distinguishing advantage for reasonably small values of $M$ (say, $M \le \exp(K^{1/8}/4)$), which is more than enough to prove the one-query lower bound for the Unitary Synthesis Problem. That said, this bound is not quite strong enough to prove the bound claimed in Theorem 1.4.

2. To prove the precise bound claimed in Theorem 1.4 and thereby achieve the correct asymptotic dependence on $K$, we give a somewhat different analysis.
(a) First, we show that it suffices to bound the expected distinguishing advantage on a random R, rather than proving a bound with high probability. To show this, we show that the maximum distinguishing advantage concentrates extremely well around its expectation (see Lemma 3.18).
(b) To bound the expected distinguishing advantage, we use a different technique called "decoupling", which is common in the random matrix theory literature [Ver11,vH17]. At a high level, the technique (when combined with the ideas from this technical overview) allows us to reduce to bounding the expected operator norm of the random matrix where R and R ′ are independent and uniformly random function families. This is easier to give a sharp bound on because the dependence on each of R and R ′ is linear rather than quadratic, allowing us to prove an optimal bound on the expected value by applying a different matrix concentration inequality for matrix Rademacher series (Theorem 4.10). Unlike the matrix Hoeffding inequality, this matrix concentration inequality does not require the matrices to have bounded operator norm with probability 1 (though it does require the matrices to be random Rademacher matrices), which is why it gives stronger bounds than we achieve using our first approach.
The second proof, which achieves the optimal dependence on K, is presented in Section 4. The first proof is presented in Appendix A. We believe that theoretical computer scientists might find the first proof more straightforward to follow.
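The truncation step in the first approach can be sketched as a small helper. This is only an illustration of "scale entries above $B$ down to magnitude $B$": the function name, the example entries, and the value `B = 2.0` are our assumptions, and the paper's actual choice of $B$ is not reproduced here.

```python
import numpy as np

def truncate_diagonal(d, B):
    """Scale every entry of d with |d_i| > B down to magnitude exactly B, keeping its phase."""
    d = np.asarray(d, dtype=complex)
    mags = np.abs(d)
    # Entries at or below the threshold are left untouched (scale factor 1).
    scale = np.where(mags > B, B / np.maximum(mags, 1e-300), 1.0)
    return d * scale

# Diagonal entries of a hypothetical D_{V,h}; two of them exceed the threshold.
d = np.array([0.5, -3.0, 4.0j, 1.0 + 1.0j])
d_trunc = truncate_diagonal(d, B=2.0)
print(d_trunc)
```

After truncation the diagonal matrix has operator norm at most $B$ with probability 1, which is the property the matrix Hoeffding inequality needs.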

Future directions: beyond one query
Theorem 1.4 proves that efficient one-query oracle algorithms achieve at most negligible advantage in the Oracle State Distinguishing Game (and thus cannot synthesize arbitrary unitaries). We conjecture (see Conjecture 1.5) that efficient oracle circuits making poly(n)-many sequential queries cannot win our distinguishing game. Towards resolving the full conjecture, we believe it may be useful to focus on the special case of two-query adversaries. In this subsection, we present several conjectures -all weaker than Conjecture 1.5 -that capture the simplest unresolved special cases of two-query attacks. First, we will need the following observation about the power of classical oracle queries. Let us fix a function family R : with M outcomes. Suppose the adversary, upon receiving |ψ⟩, applies an isometry V : C N → C M followed by an oracle query O f . Next, it performs the measurement P, obtaining some outcome in {1, . . . , D}. Depending on whether the adversary's input state is sampled from the "pseudorandom" or "random" distribution, the outcome of measuring P is distributed as either: We observe that if the total variation distance (or statistical distance) between Dist 0 and Dist 1 is ε, then, by making one classical oracle query, the adversary can distinguish the two cases with advantage ε. This is because the second query can be made to the Boolean function g : . If the output of g is +1, the adversary guesses that it was in the pseudorandom case, and if the output of g is −1, the adversary guesses that it was in the Haar random case, and attains distinguishing advantage ε.
The "1.5-query" conjecture. We conjecture that adversaries that make one quantum query followed by one classical query cannot win our distinguishing game.
is at most negl(n).
One potential approach towards bounding this expression is to observe that if we fix a subset S, the remaining expression has the same form as the maximum distinguishing advantage for a one-query adversary. We can therefore apply the same weight-vector decomposition described in Section 2.4.1 and invoke a spectral relaxation. The result is that the following is an upper bound for the adversary's distinguishing advantage: where the definitions of D V,h and D V,R k are the same as in Section 2.4.1.
The central difficulty we face is that matrix concentration inequalities are not sufficient to bound Eq. (8). Indeed, they can be applied for any fixed choice of $S$, but it is unclear how to bound the operator norm of $2^M$ matrices simultaneously, one for each $S$. Nevertheless, we believe that the expression (8) is in fact negligible with high probability over $R$.
The "(1 + ε)-query" conjecture. Finally, we highlight a sub-class of 1.5-query adversaries that we do not know how to rule out, which we refer to as (1 + ε)-query adversaries. Instead of making an arbitrary first query to the oracle, these adversaries use their first query to synthesize an advice state |v⟩ ∈ C M/N (similar to the adversaries we considered in Section 2.2); note that while |v⟩ is technically restricted to states of a certain type, treating it as an arbitrary unit vector is essentially without loss of generality. 3 Conjecture 2.6 (The (1 + ε)-query conjecture.). Fix any M -outcome projective measurement P = {Π i } i∈[M ] acting on C M . With high probability over a uniformly random R : is at most negl(n).
Again, the difficulty we face in bounding Eq. (9) is that matrix concentration inequalities only seem to apply when the subset S is fixed, and not when we maximize over all S.
A simple mathematical conjecture. Finally, in order to state the simplest mathematical conjecture that captures this "simultaneous matrix concentration" problem, we give a slightly different version of the above (1 + ε)-query conjecture (which corresponds to the case where |ψ R k ⟩ and |ψ h ⟩ are Haar random).
where each Id is M/N × M/N -dimensional and |ψ⟩ is Haar-random.
We note that it would be extremely surprising to us if Conjecture 2.7 turns out to be false, since that would imply that a two-query algorithm can win (the Haar-random state version of) the Oracle State Distinguishing Game.

The Oracle State Distinguishing Game
The purpose of this section is to define the Oracle State Distinguishing Game and prove several fundamental properties about it. We note that our main proof in Section 4 can be understood without reading Sections 3.3 to 3.5.
The section is organized as follows. In Section 3.1, we introduce some notation and formalism for oracle algorithms. In Section 3.2, we define the Oracle State Distinguishing Game. In Section 3.3 we show that hardness of the Oracle State Distinguishing Game for T -query adversaries implies hardness of T -query unitary synthesis for any parameter T .
In Section 3.4, we appeal to concentration of measure to give (for any oracle adversary A) an upper tail inequality on the optimal distinguishing advantage in the oracle state distinguishing game, which implies that it suffices to bound the adversary's expected distinguishing advantage over the choice of R.
In Section 3.5, we show that two complexity measures of an oracle algorithm -query length and space complexity -are tightly related in the oracle state distinguishing game. The assumption that our adversaries are space-efficient as well as query-efficient will be crucial in both proofs of the one-query lower bound.
Finally, in Section 3.6, we give an explicit "normal form" for one-query adversaries (using Section 3.5), setting up simplified notation that suffices for Section 4.

Preliminary notation
We will use boldface to denote random variables. We will write ln(·) for the natural logarithm and log 2 (·) for the base-2 logarithm.
Notation 3.1 (Register size versus dimension). A quantum register consisting of $m$ qubits has dimension $M = 2^m$. Viewing it as a space of $m$ qubits, it is natural to index the basis by binary strings $x \in \{0,1\}^m$. On the other hand, viewing it as a space of dimension $M$, it is natural to index the basis by integers $1 \le i \le M$. We can associate these two indexing schemes by associating the number $i$ with the string $x$ that is the $m$-bit binary representation of $i - 1$. We will typically prefer the second indexing scheme, and will therefore typically represent $m$-qubit states as $|\psi\rangle = \sum_{i=1}^{M} \alpha_i \cdot |i\rangle$.

Throughout this work, we will consider algorithms which take as input a quantum state. We will typically reserve $n$ for the length of the input register in qubits and $N := 2^n$ for the dimension of this register.
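The correspondence between the two indexing schemes in Notation 3.1 can be sketched as a pair of helper functions. The function names are hypothetical; bit strings are represented as Python strings of `'0'` and `'1'`.

```python
def index_to_bits(i, m):
    """Map i in {1, ..., 2^m} to the m-bit binary representation of i - 1."""
    if not 1 <= i <= 2 ** m:
        raise ValueError("index out of range")
    return format(i - 1, f"0{m}b")

def bits_to_index(x):
    """Inverse map: an m-bit string x to the integer in {1, ..., 2^m}."""
    return int(x, 2) + 1

print(index_to_bits(1, 3), index_to_bits(8, 3), bits_to_index("101"))
```

With `m = 3`, the first basis state corresponds to `"000"` and the last (index 8) to `"111"`.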
Most of our quantum state inputs will come in the form of binary phase states.
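Definition 3.2 (Binary phase state). A Boolean function is a function $h : \{0,1\}^n \to \{\pm 1\}$. Due to the association between $\{0,1\}^n$ and $[N]$ given in Notation 3.1, we will typically prefer to write such a function as $h : [N] \to \{\pm 1\}$, and we will elect to still refer to such a function as a "Boolean function". The corresponding binary phase state is
$$|\psi_h\rangle := \frac{1}{\sqrt{N}} \sum_{i=1}^{N} h(i) \cdot |i\rangle.$$

Our algorithms access a Boolean function $f : [L] \to \{\pm 1\}$ through the corresponding phase oracle, the $L \times L$ diagonal matrix $O_f := \sum_{i=1}^{L} f(i) \cdot |i\rangle\langle i|$. Operationally, for any $1 \le i \le L$, the oracle acts as $O_f \cdot |i\rangle = f(i) \cdot |i\rangle$. If $L = 2^\ell$ for some integer $\ell$, then we refer to $\ell$ as the input length of the phase oracle and $L$ as the dimension of the phase oracle.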
Phase oracles can be contrasted with bit flip oracles.
In general, a bit flip oracle can always be used to implement a phase oracle, but the reverse is only partially true: implementing a bit flip oracle requires a controlled phase oracle. However, we will see below that the class of phase oracles we consider are actually powerful enough to implement controlled phase oracles, and hence can be converted to bit flip oracles if desired.
Throughout this work, we will consider a class of circuits which take as input a quantum state and are allowed to perform several queries to a phase oracle. We define these formally as follows.
Definition 3.4 (Oracle circuit). A t-query oracle circuit A (·) begins with an input register of n qubits and an ancilla register of a qubits, each initialized to |0⟩, for a total of m = n + a qubits. It then performs the m-qubit unitaries U 1 , . . . , U t+1 . In addition, between each pair of unitaries, it performs a query to a phase oracle of input length ℓ, which acts on the first ℓ qubits. We write A (·) = (n, m, ℓ, U 1 , . . . , U t+1 ) in order to specify these parameters.
The precise execution of the oracle circuit depends on which Boolean function it is given query access to. Given a Boolean function $f : \{0,1\}^\ell \to \{\pm 1\}$, we write $A^f$ for the oracle circuit given access to $f$. On input an $n$-qubit state $|\psi\rangle$, it computes the state
$$U_{t+1} \cdot (O_f \otimes \mathrm{Id}) \cdot U_t \cdots (O_f \otimes \mathrm{Id}) \cdot U_1 \cdot \bigl(|\psi\rangle \otimes |0^a\rangle\bigr),$$
where each $O_f \otimes \mathrm{Id}$ applies the phase oracle to the first $\ell$ qubits. This is illustrated in Figure 1.

Remark 3.5 (Querying multiple functions). Note that we have defined our oracle circuits so that every application of the oracle gate queries the same Boolean function $f$. One can consider an alternative model of $t$-query oracle circuits which are instead allowed to query a different Boolean function $f_i$ for each oracle call $1 \le i \le t$. However, one can simulate access to these $t$ Boolean functions using a single Boolean function $f$ defined as $f(\mathrm{bin}(i), x) := f_i(x)$, where $\mathrm{bin}(i)$ is the $a = \lceil \log_2(t) \rceil$-bit binary encoding of $i$. Hence, an adversary which queries $t$ different Boolean functions can be simulated by an adversary which queries one Boolean function and has a small $a$-qubit overhead. Thus, it is essentially without loss of generality to focus on adversaries which query a single function, as we do. We note that this transformation is standard and appears, for example, at the top of page 5 in Rosenthal's Ph.D. thesis [Ros23b].
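The reduction in Remark 3.5 from several functions to one can be sketched classically. In the sketch below, functions take bit strings and return $\pm 1$; the helper name, the 0-based selector (the remark indexes from 1), and the choice to return $+1$ on unused selector values are all our assumptions.

```python
def combine_oracles(fs, x_bits):
    """Return (f, a) with f(bin(i) + x) = fs[i](x), using a = ceil(log2(len(fs))) selector bits.

    Selector values beyond len(fs) are mapped to +1, an arbitrary padding choice.
    """
    t = len(fs)
    a = (t - 1).bit_length()         # equals ceil(log2(t)) for every t >= 1

    def f(z):
        if len(z) != a + x_bits:
            raise ValueError("wrong input length")
        i = int(z[:a], 2) if a > 0 else 0
        return fs[i](z[a:]) if i < t else 1

    return f, a

# Three hypothetical 2-bit Boolean functions with values in {+1, -1}.
parity = lambda x: 1 if x.count("1") % 2 == 0 else -1
const_minus = lambda x: -1
first_bit = lambda x: -1 if x[0] == "1" else 1

f, a = combine_oracles([parity, const_minus, first_bit], x_bits=2)
print(a, f("00" + "11"), f("01" + "00"), f("10" + "10"))
```

The combined function takes `a + x_bits` input bits, which is the small $a$-qubit overhead mentioned in the remark.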
Remark 3.6 (Querying many-bit functions). Yet another model of t-query oracle circuits allows for making bit-flip queries to d-bit output functions of the form |x⟩ |y⟩ → |x⟩ |y ⊕ f (x)⟩ for x ∈ [M ], y ∈ {0, 1} d . As pointed out in [Ros23b] Section 2.1, such queries can be simulated by a single quantum query to a 1-bit function: This also allows us to simulate parallel queries of the form by defining x = (x 1 , . . . , x t ). Therefore, our one-query lower bounds imply lower bounds against a bounded (e.g., polynomial or sub-exponential) number of parallel queries.

Defining the Oracle State Distinguishing Game
In this section, we define the Oracle State Distinguishing Game. To begin, every such game is parameterized by a particular family of functions, which is defined as follows. We have chosen the letter "R" for function families as shorthand for the word "Random", as our function families will often (though not always) be random variables. Remark 3.9 (Computational complexity of the challenger). In the "b = 1 case", the view of the adversary is that it receives a maximally mixed state Id N /N . Hence, we can equivalently view the challenger as sampling a random state from any distribution, so long as an average state drawn from this distribution is maximally mixed. For example, we can equivalently view the challenger as sampling a uniformly random Boolean function h : [N ] → {±1} and setting |ψ⟩ := |ψ h ⟩, or sampling |ψ⟩ as an N -dimensional Haar-random state. We will typically prefer the first of these points of view throughout this work.
We have chosen to have the challenger sample a random basis state $|x\rangle$ in this case to emphasize that the challenger is computationally efficient in our construction. Note that they can also efficiently construct the state $|\psi_{R_k}\rangle$ in the "$b = 0$ case" given oracle access to $R$. In particular, they need only query the oracle $R(k, \cdot)$ on the uniform superposition state $\frac{1}{\sqrt{N}} \sum_{i=1}^{N} |i\rangle$.

We will model our adversary as an oracle circuit $A^{(\cdot)}$ with an $N$-dimensional input register. Intuitively, the adversary will be allowed to select its own preferred oracle $f$ to give it the best chance of winning the Oracle State Distinguishing Game on $R$. When the game is played on a uniformly random choice of the function family $R : [K] \times [N] \to \{\pm 1\}$, the adversary will be allowed to select an oracle $f_R$ which depends on $R$.
Definition 3.10 (Adversary). An adversary is specified by an oracle circuit A (·) . Let L be the dimension of the queries the oracle circuit makes. Given oracle access to a Boolean function f : [L] → {±1}, the adversary acts as follows. On input the quantum state |ψ⟩, it applies A f , and then it measures the first qubit in the standard basis. It outputs the measurement outcome b ′ ∈ {0, 1}. Now we introduce several pieces of notation which will help us describe the adversary's winning probability in the Oracle State Distinguishing Game.
Notation 3.11 (Adversary's acceptance probability). Let $A^{(\cdot)}$ be an adversary which has an $N$-dimensional input register and makes $L$-dimensional queries. Let $h : [N] \to \{\pm 1\}$ be a Boolean function, and let $f : [L] \to \{\pm 1\}$ be another Boolean function. We will use the notation
$$p_A(h \mid f) := \Pr\bigl[A^f \text{ outputs } 0 \text{ on input } |\psi_h\rangle\bigr].$$

Now let $R : [K] \times [N] \to \{\pm 1\}$ be a function family. Then in the "$b = 0$ case", the probability the adversary wins $\mathrm{Game}_R$ can be expressed in this notation as $\mathbb{E}_{k \sim [K]}[p_A(R_k \mid f)]$. As for the "$b = 1$ case", let us follow Remark 3.9 and view the challenger as sampling a uniformly random Boolean function $h : [N] \to \{\pm 1\}$ and setting $|\psi\rangle := |\psi_h\rangle$; given this state, the adversary wins with probability $1 - p_A(h \mid f)$. Putting these two together, the probability the adversary wins $\mathrm{Game}_R$ is
$$\tfrac{1}{2} \cdot \mathbb{E}_{k \sim [K]}\bigl[p_A(R_k \mid f)\bigr] + \tfrac{1}{2} \cdot \mathbb{E}_{h}\bigl[1 - p_A(h \mid f)\bigr]. \qquad (10)$$
Note that the adversary can trivially win with probability $1/2$ by always outputting $b' = 0$. Thus, we care about the amount by which the adversary's winning probability differs from $1/2$, which is known as its advantage. Formally, the distinguishing advantage of $A^f$ on $\mathrm{Game}_R$ is
$$2 \cdot \Bigl| \Pr\bigl[A^f \text{ wins } \mathrm{Game}_R\bigr] - \tfrac{1}{2} \Bigr|.$$
The factor of 2 in front was chosen so that the distinguishing advantage is a number between 0 and 1 and is equal to 1 if the adversary always wins (or loses). If we plug in Equation (10) for the adversary's winning probability, we can rewrite the distinguishing advantage as
$$\Bigl|\, \mathbb{E}_{k \sim [K]}\bigl[p_A(R_k \mid f)\bigr] - \mathbb{E}_{h}\bigl[p_A(h \mid f)\bigr] \Bigr|, \qquad (11)$$
where $h : [N] \to \{\pm 1\}$ is a uniformly random Boolean function. This is the form in which we will most typically express the distinguishing advantage, and it explains the name: it measures how well the adversary's output can be used to distinguish between the two cases.
The adversary's goal is to maximize the distinguishing advantage, and it can do so by picking the best possible function f : [L] → {±1} to perform oracle queries to. This motivates the following quantity, which is the main quantity we will be studying throughout this paper.
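Definition 3.13 (Maximum distinguishing advantage). Let $R : [K] \times [N] \to \{\pm 1\}$ be a function family. Let $A^{(\cdot)}$ be an adversary which has an $N$-dimensional input register and makes $L$-dimensional queries. The maximum distinguishing advantage of $A^{(\cdot)}$ on $\mathrm{Game}_R$ is defined as
$$\Delta_A(R) := \max_{f : [L] \to \{\pm 1\}} \Bigl|\, \mathbb{E}_{k \sim [K]}\bigl[p_A(R_k \mid f)\bigr] - \mathbb{E}_{h}\bigl[p_A(h \mid f)\bigr] \Bigr|.$$
Finally, the maximum distinguishing advantage of $A^{(\cdot)}$ on $\mathrm{Game}_{K,N}$ is equal to
$$\Delta^{\mathrm{avg}}_A := \mathbb{E}_{R}\bigl[\Delta_A(R)\bigr],$$
where $R : [K] \times [N] \to \{\pm 1\}$ is a uniformly random function family.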
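For very small parameters, the maximum distinguishing advantage can be computed by brute force over all oracles $f$. The sketch below assumes a one-query adversary of the form "isometry $V$, query $O_f$, projector $\Pi$", with the toy choices $V = \mathrm{Id}$, $N = M = 4$, $K = 2$, and a random rank-2 projector; all of these, and the helper names, are ours.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
N = M = 4                                         # toy dimensions (assumption)
K = 2

V = np.eye(M, N)                                  # trivial isometry (toy choice)
q, _ = np.linalg.qr(rng.normal(size=(M, M // 2)))
Pi = q @ q.T                                      # random rank-2 real projector (toy choice)

def accept_prob(h, f):
    """p_A(h | f) = || Pi O_f V |psi_h> ||^2 for this toy adversary."""
    psi = np.asarray(h) / np.sqrt(N)
    out = Pi @ (np.asarray(f) * (V @ psi))        # O_f is diagonal, so multiply entrywise
    return float(np.vdot(out, out).real)

all_signs_N = list(itertools.product([-1, 1], repeat=N))
all_signs_M = list(itertools.product([-1, 1], repeat=M))

def max_advantage(R):
    """Brute-force Delta_A(R) = max_f | E_k p(R_k | f) - E_h p(h | f) |."""
    best = 0.0
    for f in all_signs_M:
        pseudo = np.mean([accept_prob(R[k], f) for k in range(len(R))])
        rand = np.mean([accept_prob(h, f) for h in all_signs_N])
        best = max(best, abs(pseudo - rand))
    return best

R = rng.choice([-1, 1], size=(K, N))
delta = max_advantage(R)
print(delta)
```

The enumeration over $2^M$ oracles is exactly the cost that makes brute force (and the counting argument) hopeless for realistic $M$, which is why the paper's analysis goes through operator norms instead.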
The goal of this work is to show that $\Delta^{\mathrm{avg}}_A$ is small for any adversary $A^{(\cdot)}$ which makes a single query of length $\ell = o(K)$. Moreover, we will prove that in the same parameter regime, with high probability over $R$, $\Delta_A(R)$ is small.

Relationship to the Unitary Synthesis Problem
In this section, we formalize the Unitary Synthesis Problem and its relationship to the Oracle State Distinguishing Game. Or, rather, we will suggest one possible way of formalizing the Unitary Synthesis Problem, as there seems to be no generally agreed upon precise formulation of the problem. For example, the task is to approximate a general n-qubit unitary U , but there are many different ways of defining what it means to approximate a unitary. This was addressed by Scott Aaronson in a comment on the Shtetl-Optimized blog [Aar21], in which he said the following.
"The unitary synthesis problem is interesting for any reasonable notion of approximating U . In other words, we lack a positive result even for the loosest notions of approximation you mentioned, or a negative result even for the most stringent ones! Once we have some results, then we can start worrying about these distinctions." The last few years have seen increasing interest in fundamentally quantum tasks, and as a result we now do have some results on problems related to unitary synthesis [RY22, Ros22, BEM + 23], and these have given several ways of precisely formalizing unitary synthesis.
Let us first recall several standard notions from quantum information theory. Given two $n$-qubit density matrices $\rho_1$ and $\rho_2$, their trace distance is given by
$$\tfrac{1}{2} \cdot \|\rho_1 - \rho_2\|_1,$$
where $\|\cdot\|_1$ is the trace norm. Given two quantum channels $\Phi_1, \Phi_2$, both with $n$-qubit inputs and outputs, their diamond distance is given by
$$\|\Phi_1 - \Phi_2\|_\diamond := \max_{\rho} \bigl\| (\Phi_1 \otimes \mathrm{Id})(\rho) - (\Phi_2 \otimes \mathrm{Id})(\rho) \bigr\|_1,$$
where the maximization is over all $2n$-qubit density matrices $\rho$, and both $\mathrm{Id}$ operators refer to the $n$-qubit identity channel. For more background on these distances, see [Wat18, Chapter 3]. Following [BEM+23], we will define what it means to approximate a unitary in terms of the diamond distance.
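As a concrete illustration of the first of these notions, the trace distance can be computed from the eigenvalues of the difference of the two density matrices. The sketch below assumes the standard normalization $\tfrac{1}{2}\|\rho_1 - \rho_2\|_1$; the function name and the single-qubit examples are ours.

```python
import numpy as np

def trace_distance(rho1, rho2):
    """Trace distance (1/2) * ||rho1 - rho2||_1 between two density matrices."""
    diff = np.asarray(rho1, dtype=complex) - np.asarray(rho2, dtype=complex)
    eigs = np.linalg.eigvalsh(diff)      # diff is Hermitian, so its eigenvalues are real
    return 0.5 * float(np.sum(np.abs(eigs)))

zero = np.array([[1, 0], [0, 0]])             # |0><0|
plus = np.array([[0.5, 0.5], [0.5, 0.5]])     # |+><+|
mixed = np.eye(2) / 2                         # maximally mixed state

print(trace_distance(zero, mixed), trace_distance(zero, plus))
```

The diamond distance involves a maximization over input states on an enlarged system and is usually computed by semidefinite programming, which is not attempted here.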
Definition 3.14 (Approximating a unitary). Let U be an n-qubit unitary, and let Φ U be the associated quantum channel. Let Φ approx be a quantum channel with n-qubit input and output registers.
We will also define the channel associated with an oracle circuit A (·) in the natural way.
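Definition 3.15 (Channel implemented by an oracle circuit). Given a $t$-query oracle circuit $A^{(\cdot)} = (n, m, \ell, U_1, \ldots, U_t, U_{t+1})$ and a Boolean function $f : \{0,1\}^\ell \to \{\pm 1\}$, the associated $n$-qubit channel $\Phi_{A^f}$ is defined as follows:
1. Given an $n$-qubit input $|\psi\rangle$, compute the state
$$U_{t+1} \cdot (O_f \otimes \mathrm{Id}) \cdot U_t \cdots (O_f \otimes \mathrm{Id}) \cdot U_1 \cdot \bigl(|\psi\rangle \otimes |0^{m-n}\rangle\bigr).$$
2. Return the first $n$ qubits as the output (and discard the rest).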
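The two steps of this definition can be simulated directly for small registers. The sketch below assumes the input register occupies the first $n$ of the $m$ qubits (most significant in the tensor ordering) and that the ancillas start in the all-zeros state; the function name and the toy example are ours.

```python
import numpy as np

def run_oracle_circuit(unitaries, f, ell, n, m, psi):
    """Return the n-qubit output density matrix of a t-query oracle circuit.

    unitaries: list of t+1 matrices of size 2^m x 2^m (U_1, ..., U_{t+1}).
    f: array of 2^ell values in {+1, -1}; the oracle acts on the first ell qubits.
    psi: input state vector of dimension 2^n; the m - n ancillas start in |0...0>.
    """
    a = m - n
    anc = np.zeros(2 ** a)
    anc[0] = 1.0
    state = np.kron(psi, anc).astype(complex)                  # |psi> (x) |0^a>
    oracle = np.kron(np.asarray(f), np.ones(2 ** (m - ell)))   # diagonal of O_f (x) Id
    state = unitaries[0] @ state
    for U in unitaries[1:]:
        state = U @ (oracle * state)                           # query, then next unitary
    # Keep the first n qubits and trace out the remaining m - n.
    mat = state.reshape(2 ** n, 2 ** a)
    return mat @ mat.conj().T

# Toy example: n = 1, m = 2, one query (t = 1), both unitaries equal to the identity.
psi = np.array([1.0, 1.0]) / np.sqrt(2)
I4 = np.eye(4)
f = np.array([1, 1, -1, 1])                                    # flips the phase of |10>
rho_out = run_oracle_circuit([I4, I4], f, ell=2, n=1, m=2, psi=psi)
print(rho_out.real)
```

In the toy example the query turns $|{+}\rangle$ into $|{-}\rangle$ on the first qubit, so the returned density matrix is $|{-}\rangle\langle{-}|$.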
With these definitions in hand, we can give a formal statement of the Unitary Synthesis Problem.
Definition 3.16 (The Unitary Synthesis Problem). Fix an error parameter $\varepsilon(n) = 1/2^{\Omega(n)}$. Does there exist a poly(n)-query oracle circuit A (·) computable by a poly(n)-sized quantum circuit such that for all n-qubit unitaries U , there exists a Boolean function f for which the channel Φ A f approximates U to within error ε(n)?
As discussed in Section 1.1, a bound on the maximum distinguishing advantage in the Oracle State Distinguishing Game immediately implies a lower bound for the worst-case version of the Unitary Synthesis Problem, since there always exists an information-theoretic distinguisher that wins the corresponding Oracle State Distinguishing Game for R. In fact, if we make a slight modification to the Oracle State Distinguishing Game, then the distinguishing advantage bound would imply a slightly stronger claim, namely that the Unitary Synthesis Problem is hard for a Haar-random U (we note that this is technically the version of the problem stated by Aaronson and Kuperberg [AK07]).
In more detail, one can consider a variant of the Oracle State Distinguishing Game where every |ψ R k ⟩ is sampled as a Haar random state, rather than as a binary phase state (we do not give a separate analysis for the version of Oracle State Distinguishing with Haar random states, but our proof technique can easily be adapted to handle it). Next, suppose that there exists an oracle circuit A (·) that can synthesize an n-qubit Haar random unitary U . Then for a random U and any K < N , there exists (with high probability) a choice of f such that A f implements the channel corresponding to U . In particular, this means that for a Haar-random subspace S = span{U † |1⟩ , . . . , U † |K⟩}, there exists f such that A f maps S to span{|1⟩ , . . . , |K⟩}. Such an oracle circuit A (·) can be used to win the Oracle State Distinguishing Game, since the subspace span{|ψ R 1 ⟩ , . . . , |ψ R K ⟩} is distributed as a K-dimensional Haar-random subspace, and the ability to map this subspace to span{|1⟩ , . . . , |K⟩} immediately yields a distinguisher for the game.
To summarize, we have argued that a lower bound for breaking a (single-copy) pseudorandom state family (in an oracle setting where the K pseudorandom states are distributed as Haar random states) directly implies hardness of synthesizing the first K columns of a Haar-random unitary. Thus, we have the following claim.

Upper tail inequality for the maximum distinguishing advantage
Throughout this subsection, we will write A (·) = (n, m, ℓ, U 1 , . . . , U t+1 ) for a t-query adversary with an (N := 2 n )-dimensional input register and (L := 2 ℓ )-dimensional queries which is playing Game R for function families of the form R : [K] × [N ] → {±1}. Let R : [K] × [N ] → {±1} be a uniformly random function family. In this subsection, we consider the random variable ∆ A (R) corresponding to the maximum distinguishing advantage of A (Definition 3.13), and we show that it has strong one-sided concentration around its mean E[∆ A (R)]. Our main result is as follows.
The main technical lemma we will need to prove this is the following version of Talagrand's concentration inequality, which is stated in [Ver18, Theorem 5.2.16].
To derive Lemma 3.18 using Talagrand's concentration inequality, we will view a uniformly random function family R : [K] × [N ] → {±1} as a collection of KN independent {±1} random variables. We would then like to apply Talagrand's concentration inequality with the "g" function set to the maximum distinguishing advantage ∆ A (·), interpreted as a function of an input R. However, doing so faces two difficulties: first, ∆ A (·) is defined only for {±1}-valued inputs, whereas the "g" function in Talagrand's concentration inequality must be defined over [−1, 1] inputs. Second, ∆ A (·) is not convex. The first difficulty is straightforward to address, and we begin to do so in the following definition.
Definition 3.21 (Expanding the acceptance probability to bounded inputs). Let us fix a Boolean function f : [L] → {±1}. Given as input the state |ψ⟩, the adversary applies the oracle circuit A f and then measures the first qubit of the resulting state. We can therefore view the adversary as applying a POVM measurement We will now extend this expression to functions which are [−1, 1]-valued rather than {±1}-valued.
Note that ψ h is sub-normalized, meaning that ψ h ψ h ≤ 1, and so it is no longer necessarily a quantum state. In addition, if h : [N ] → {±1} is a Boolean function, then by Equation (11), p A,b (h | f ) still recovers our traditional definition of p A (· | f ) when b = 0. As for the b = 1 case, note that because h is a Boolean function, However, this is not necessarily true of bounded functions h. Now we address the second issue, that of ∆ A (·) not being convex. To do so, we will have to define two variants of ∆ A (·) called ∆ A,0 (·) and ∆ A,1 (·) which we will eventually show are convex. This motivates the following definition, which will only be used in this subsection.
We note that unlike in the definition of ∆ A (·), there is no absolute value in the definition of ∆ A,b (·). (This is needed so that we can later show that it is convex.) Finally, we define is a uniformly random function family.
We will now make some observations about these definitions.
(As before, this is not necessarily true of bounded functions R.) Thus, we have that As a result, We will show the following concentration bound for these two variants of the maximum distinguishing advantage.
This completes the proof.
Now we focus on proving Lemma 3.23. To do so, we would like to show that ∆ A,0 (·) and ∆ A,1 (·) are convex and Lipschitz. Prior to doing so, however, we will first prove this for the p A,b (· | f ) function.
Now, because ∥ · ∥ 2 is convex and x → x 2 is convex, we also have that ∥ · ∥ 2 2 is convex. Hence, by Jensen's inequality, this is at most By Cauchy-Schwarz, we can bound the first term by (by definition of |ψ h ⟩ and |ψ h ′ ⟩) A similar argument shows that the second term is also bounded by ∥h − h ′ ∥ 2 / √ N . Putting these together, this shows that p A,b (· | f ) is (2/ √ N )-Lipschitz.
Next, we use this lemma to show that ∆ A,b (· | f ) is also convex and Lipschitz.
Proof. We first prove the lemma for the map in Equation (14 Thus, this map is convex. Next, Now, we apply Lemma 3.24, which states that p A,b (· | f ) is (2/ √ N )-Lipschitz. Hence, we can upper-bound this by where the inequality is due to Cauchy-Schwarz. Thus, this map is (2/ √ KN )-Lipschitz. As for ∆ A,b (· | f ), we recall that it is defined as follows: This is just the map in Equation (14), offset by a constant. Hence, it too is convex and (2/ √ KN )-Lipschitz. This completes the proof.
We have finally reached our goal, which is to show that the ∆ A,b (·) = max f ∆ A,b (· | f ) functions are convex and Lipschitz.
By Lemma 3.25, the function ∆ A,b (· | f ) is convex. Hence, this is at most Hence, ∆ A,b (·) is convex. Now we show that ∆ A,b (·) is Lipschitz. To do so, we will show that for any two bounded functions R, R ′ : This will show that ∆ A,b (·) is (2/ √ KN )-Lipschitz, as To begin, Let f be a function maximizing the first expression. Then this is equal to

Recalling that ∆ avg
, this completes the proof.

The adversary's space is bounded without loss of generality
In this subsection, we will show that if A (·) is an oracle circuit that makes t queries, each of which has size at most ℓ, then we can assume without loss of generality that A (·) uses at most t · ℓ ancilla qubits, in addition to the n qubits in its input register. We prove this by showing that for any such oracle circuit (that potentially uses unbounded space), there is an oracle circuit B (·) that simulates A (·) using only t · ℓ ancilla qubits. This will allow us to restrict our attention to adversaries that are space-efficient when proving our one-query lower bounds, which is necessary given the technical tools we apply. We begin by defining in what sense B (·) simulates A (·) .
Notation 3.27 (Query register). In this subsection, we will assume that every oracle circuit makes an oracle call on a register of exactly ℓ qubits. We will write L = 2 ℓ for the dimension of this register, and we will write H query := C L for the vector space corresponding to this register.
Now we state the main lemma of this section, namely that an oracle circuit that makes t queries of size ℓ can be converted to one of space n + t · ℓ. Typical values for these parameters are t, ℓ = poly(n), in which case this results in an oracle circuit of poly(n) space.

Lemma 3.29 (Space reduction for oracle circuits). Consider a t-query oracle circuit
Then A (·) can be simulated by a t-query oracle circuit that uses m B = (n + t · ℓ) qubits of space.
The key technical ingredient we will use in the proof of this lemma is the following method for compressing an isometry with a large output dimension into an isometry with a small output dimension.
To get intuition for this definition, note that the operators {M z } correspond to the following measurement: first apply the original isometry V , and then measure the resulting query register to obtain an outcome z. As a result, compress(V ) is the natural isometry that corresponds to the {M z } measurement. We note that compress(V ) is indeed an isometry, because where the last step used the assumption that V is an isometry. The following technical lemma gives one sense in which compress(V ) does indeed compress V , in that whenever V is used to temporarily transition into H query ⊗ C S in order to query an oracle, we can use compress(V ) to move into H query ⊗ C D instead with the exact same results.
Proof. The proof is via a straightforward calculation: That completes the proof. Now we use this technical lemma to show that the action of compress(V ) followed by an oracle is actually equivalent to the action of V followed by an oracle, up to an isometry.
Prior to proving this lemma, we will establish the following linear-algebraic proposition. We expect that this proposition is well-known, although we were unable to find a reference for it.
Proposition 3.33 (Matching inner products implies an isometry). Let d 1 ≤ d 2 be integers. Consider two sets of m vectors |x 1 ⟩ , . . . , |x m ⟩ ∈ C d 1 and |y 1 ⟩ , . . . , |y m ⟩ ∈ C d 2 . Suppose that these sets have the same pairwise inner products, meaning that ⟨x i |x j ⟩ = ⟨y i |y j ⟩ , for all 1 ≤ i, j ≤ m. Then there exists an isometry T : Because the two sets of vectors have matching inner products, Given a complex matrix A, we will denote by A + the Moore-Penrose pseudo-inverse of A. The one fact we will use about the pseudo-inverse, which can be found in [Pet12, Proposition 4.9.2], is that A + · A is the projector onto the image of A † . Multiplying both sides of Equation (15) From our pseudo-inverse fact, (Y † ) + · Y † is the projector onto the image of (Y † ) † = Y . Hence, Now, let us define T := (Y † ) + · X † , so that T · X = Y . Note that for all 1 ≤ i ≤ m, this implies that as desired. Next, write span X := span{|x 1 ⟩ , . . . , |x m ⟩} and span Y := span{|y 1 ⟩ , . . . , |y m ⟩}.
Then we claim (i) T maps any vector in span ⊥ X to 0, and (ii) T is an isometry from span X to span Y . We prove these as follows.
(ii) To show that T is an isometry mapping span X to span Y , it suffices to show that it maps any vector in span X to span Y , and that it preserves lengths. Let |v⟩ ∈ span X . Then |v⟩ = α 1 · |x 1 ⟩ + · · · + α m · |x m ⟩ for some complex coefficients α 1 , . . . , α m . By Equation (16), which is indeed an element of span Y . Next, the squared length of |v⟩ is which is the squared length of T · |v⟩. This proves the claim.
Hence, T is an isometry mapping span X to span Y , acts as 0 outside of span X , and satisfies T · |x i ⟩ = |y i ⟩, for all 1 ≤ i ≤ m. As a result, it can be extended to an isometry mapping C d 1 to C d 2 which satisfies this property by picking any isometry that maps span ⊥ X to span ⊥ Y . This gives the desired construction. Now we prove Lemma 3.32.
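The construction in Proposition 3.33 can be checked numerically. The sketch below builds two families of vectors with identical Gram matrices (by applying a random isometry `W`, an illustrative choice that is not part of the proposition), forms $T := (Y^\dagger)^+ \cdot X^\dagger$ using NumPy's pseudo-inverse, and checks that $T$ maps each $|x_i\rangle$ to $|y_i\rangle$ and acts as an isometry on span X while vanishing on its orthogonal complement. The dimensions and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d1, d2, m = 4, 6, 3

# Columns of X are the vectors |x_1>, ..., |x_m> in C^{d1}.
X = rng.normal(size=(d1, m)) + 1j * rng.normal(size=(d1, m))

# Produce vectors |y_i> in C^{d2} with the same pairwise inner products
# by applying a random isometry W (d2 x d1, orthonormal columns).
G = rng.normal(size=(d2, d1)) + 1j * rng.normal(size=(d2, d1))
W, _ = np.linalg.qr(G)
Y = W @ X

# Matching inner products: X^dagger X = Y^dagger Y.
gram_matches = np.allclose(X.conj().T @ X, Y.conj().T @ Y)

# T := (Y^dagger)^+ . X^dagger, as in the proof of the proposition.
T = np.linalg.pinv(Y.conj().T) @ X.conj().T

# Projector onto span X, used to check that T^dagger T is exactly this
# projector (isometry on span X, zero on the orthogonal complement).
P_span_X = X @ np.linalg.pinv(X)

maps_x_to_y = np.allclose(T @ X, Y)
isometry_on_span = np.allclose(T.conj().T @ T, P_span_X)
```

Here m < d1, so span X is a proper subspace and the check that $T^\dagger T$ equals the projector onto span X is nontrivial.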
We will prove that there exists an isometry T : H query ⊗ C D → H query ⊗ C S such that for all 1 ≤ x ≤ D and Boolean functions f : [L] → {±1}. This will in turn imply the desired claim by linearity. By Proposition 3.33, it suffices to show that { |Φ f,x ⟩} f,x and { | Φ f,x ⟩} f,x have the same pairwise inner products, i.e.
for all 1 ≤ x, y ≤ D and Boolean functions f, g : [L] → {±1}. To complete the proof, we verify this by direct calculation for all x, y, f, g: This completes the proof.
With this in hand, we can finally prove the main result of this section, Lemma 3.29.
Proof of Lemma 3.29. In this proof, we will construct a sequence of isometries V 1 , . . . , V t in which for each 1 ≤ i ≤ t, Given a Boolean function f : [L] → {±1}, we will use the shorthand Operationally, Prod U,f,i corresponds to alternating between i unitaries and oracle calls, and similarly for Prod V,f,i .
We will first prove the following statement: for each 0 ≤ i ≤ t, there exists an isometry At the end, we will derive Lemma 3.29 from this statement. The proof is by induction on i, the base case being i = 0. In this case, the statement follows from setting T 0 : C N → C M A as T 0 := Id N ⊗ |0 m A −n ⟩. This is because as desired. As for the induction step, we suppose the statement is true for some i ≤ t − 1 and prove that it holds for i + 1. By the induction hypothesis, we have that as desired. Applying Lemma 3.32, there exists an isometry such that for all Boolean functions f :

Plugging this into Equation (20), we have that
Thus, the (i + 1) case of the statement is also true, completing the proof by induction. It remains to show that the existence of isometries V 1 , . . . , V t and T 0 , . . . , T t satisfying Eqs. (18) and (19) implies Lemma 3.29. Recall that our goal is to construct an oracle circuit B (·) = (n, m B , ℓ, U B 1 , . . . , U B t+1 ) that uses m B = (n + t · ℓ) qubits of space and simulates A (·) in the sense of Definition 3.28: namely, there exists an isometry T : where D A = 2 m A −ℓ and D B = 2 m B −ℓ . To this end, for 1 ≤ i ≤ t, we will extend each isometry V i : C N ⊗ (C L ) ⊗i−1 → C N ⊗ (C L ) ⊗i to a unitary U B i acting on m B = n + t · ℓ qubits as follows. First, for 1 ≤ i ≤ t, define U B i to be an extension of the isometry V i to a unitary on n + ℓ · i qubits, i.e., We then extend this to a unitary on n + t · ℓ qubits by setting To put everything together, we need to prove the existence of an isometry T satisfying Eq. (21). Plugging in our definitions for U B i into Eq. (19), there exists an isometry T t : The equation (22)
Simulating the measurement. Our notion of what it means for B (·) to simulate A (·) only guarantees that there exists an isometry T such that (on any input) running B (·) and then applying T produces the same state as running A (·) . However, our aim is to use B (·) in place of A (·) as an adversary in the Oracle State Distinguishing Game; recall that an adversary (Definition 3.10) in this game first applies the oracle circuit on a given input state, and then measures the first qubit of the resulting state to produce a guess bit b ′ . Thus, what we need is a way to run a low-space oracle circuit B (·) so that when we measure the first qubit of the resulting state, the outcome distribution is the same as if we had run A (·) and measured its first qubit. Fortunately, we can resolve this issue with standard techniques from quantum information (namely Naimark dilation; see, e.g., page 94 of [NC10]).
First, define the m B -qubit binary-outcome Now, observe that if we run B (·) and then measure {E 0 , E 1 = Id − E 0 }, the resulting outcome b ′ is distributed exactly the same as it would be if we had instead run B (·) , then applied T , and measured the first qubit (and the latter is equivalent to running A (·) and measuring the first qubit).
To implement this POVM as a measurement of the first qubit of the adversary's state, we will define the isometry V guess : We note that V guess is in fact an isometry, since Moreover, applying V guess and measuring the first qubit of the resulting state produces the same distribution as measuring {E 0 , E 1 }, since for any state |ψ⟩ ∈ C M B and any b ∈ {0, 1}, Thus, given any circuit B (·) = (n, m B , ℓ, U B 1 , . . . , U B t+1 ) that simulates A (·) in the sense of Definition 3.28, we can easily modify B (·) to obtain another low-space t-query oracle circuit C (·) = (n, m B + 1, ℓ, U C 1 , . . . , U C t+1 ) that has the additional guarantee that running C (·) and measuring its first qubit produces a guess from the correct output distribution.
Concretely, define U guess to be an (m B + 1)-qubit unitary that extends the isometry V guess in the sense that U guess · (|0⟩ ⊗ Id M B ) = V guess . Then define The unitaries corresponding to 1 ≤ i ≤ t are defined as they are in B (·) except that they act on one additional qubit, i.e., U C i := Id 2 ⊗ U B i . By the preceding discussion, these definitions guarantee that running C (·) and measuring its first qubit yields the outcome distribution of the original oracle adversary A (·) . Thus, we have the following corollary of Lemma 3.29.
Corollary 3.34. Without loss of generality, any t-query adversary in the Oracle State Distinguishing Game uses an oracle circuit (n, m, ℓ, U 1 , . . . , U t , U t+1 ) that requires m ≤ n + t · ℓ + 1 qubits of space.
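The Naimark-dilation step used above can be illustrated numerically. The sketch below assumes one standard choice of dilation, $V_{\mathrm{guess}} = |0\rangle \otimes \sqrt{E_0} + |1\rangle \otimes \sqrt{E_1}$, which may differ in detail from the isometry used in the text; the random POVM and the helper `psd_sqrt` are hypothetical and serve only as an example. It checks that this map is an isometry and that measuring its first qubit reproduces the outcome distribution of the POVM $\{E_0, E_1\}$.

```python
import numpy as np

def psd_sqrt(E):
    """Square root of a positive semidefinite Hermitian matrix (helper)."""
    vals, vecs = np.linalg.eigh(E)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

rng = np.random.default_rng(1)
dim = 4

# A random binary-outcome POVM {E0, E1 = Id - E0} with 0 <= E0 <= Id.
G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = G @ G.conj().T
E0 = H / (1.25 * np.linalg.eigvalsh(H).max())
E1 = np.eye(dim) - E0

# Assumed dilation: the first qubit indexes the two blocks of the output,
# so stacking sqrt(E0) on top of sqrt(E1) gives a (2*dim) x dim matrix.
V_guess = np.vstack([psd_sqrt(E0), psd_sqrt(E1)])

psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)

out = V_guess @ psi
# Probability that the first qubit is 0 or 1 after applying V_guess.
p_first_qubit = [np.linalg.norm(out[:dim]) ** 2, np.linalg.norm(out[dim:]) ** 2]
# Probabilities assigned by the POVM directly.
p_povm = [np.real(psi.conj() @ E0 @ psi), np.real(psi.conj() @ E1 @ psi)]
```

The isometry property follows from $\sqrt{E_0}^2 + \sqrt{E_1}^2 = E_0 + E_1 = \mathrm{Id}$, which is the same one-line calculation the argument above relies on.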

One-query adversary model, final problem setup
In this section, we give a "normal form" for one-query adversaries A (·) with bounded query length. By Corollary 3.34, for any A (·) with query length bounded by ℓ, we may assume without loss of generality that A (·) uses at most a ≤ ℓ + 1 ancilla qubits, for a total number of m = n + a qubits. As a result, following Definition 3.4, we may assume that A f operates as follows, for some choice of unitaries U 1 , U 2 :
1. Given an n-qubit input |ψ⟩, compute the state $U_2 \cdot O_f \cdot U_1 \cdot (|\psi\rangle \otimes |0^a\rangle)$, where $O_f$ denotes the oracle unitary.
2. Measure the first qubit of the resulting state in the standard basis.
In this special case, we simplify our notation slightly with the following definitions: • Let M := 2 m denote the dimension of the adversary's final Hilbert space.
• Let V = U 1 · (Id N ⊗ |0 a ⟩) denote the isometry describing A's behavior prior to the query.
To summarize, we have modeled the adversary as A (·) = (M, V, Π), where M is an integer, V : C N → C M is an isometry, and Π ∈ C M ×M is a projection. In this language, the adversary's probability of outputting "0" on a binary phase state |ψ h ⟩ is given by $p_A(h \mid f) = \langle \psi_h | V^\dagger \cdot O_f \cdot \Pi \cdot O_f \cdot V | \psi_h \rangle$. Our goal in Section 4 will be to prove an upper bound on ∆ A (R) (see Definition 3.12) of the form $\Delta_A(R) \leq O\big(\sqrt{\log(M)/K}\big)$ with high probability over R, which will establish a lower bound of m = log(M ) = Ω(Kε 2 ) for adversaries with distinguishing advantage ε.
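To make the normal form (M, V, Π) concrete, the following sketch simulates a one-query adversary on a small instance. It rests on several assumptions that are labeled in the comments: the oracle is taken to be the diagonal ±1 matrix $O_f$ on $\mathbb{C}^M$, the projection is taken to be $\Pi = U_2^\dagger (|0\rangle\langle 0| \otimes \mathrm{Id}) U_2$, and the acceptance probability is $\|\Pi \cdot O_f \cdot V |\psi_h\rangle\|^2$. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, a = 2, 1                      # input qubits and ancilla qubits
N, M = 2 ** n, 2 ** (n + a)

def random_unitary(dim, rng):
    """A random unitary from a QR decomposition (not exactly Haar)."""
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    Q, _ = np.linalg.qr(G)
    return Q

def phase_state(h):
    """Binary phase state (1/sqrt(N)) * sum_x h(x) |x>."""
    return np.asarray(h, dtype=complex) / np.sqrt(len(h))

U1, U2 = random_unitary(M, rng), random_unitary(M, rng)

# V = U1 (Id_N (x) |0^a>): keep the columns of U1 whose ancilla index is 0.
V = U1[:, :: 2 ** a]

# Assumption: Pi absorbs U2 and the final first-qubit measurement.
first_qubit_zero = np.diag([1.0] * (M // 2) + [0.0] * (M // 2))
Pi = U2.conj().T @ first_qubit_zero @ U2

def accept_probability(h, f):
    """Probability of outputting 0: || Pi O_f V |psi_h> ||^2 (assumed form)."""
    state = Pi @ (f * (V @ phase_state(h)))
    return np.linalg.norm(state) ** 2

h = rng.choice([-1, 1], size=N)   # the function defining the input state
f = rng.choice([-1, 1], size=M)   # the oracle, as a vector of +-1 phases
p = accept_probability(h, f)
```

Under these assumptions, `p` agrees with the probability obtained by running the circuit $U_2 \cdot O_f \cdot U_1$ on $|\psi_h\rangle \otimes |0^a\rangle$ and measuring the first qubit directly.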

Proof of the one-query lower bound
In this section, we will consider a single-query adversary A and show that its advantage in the oracle state distinguishing game ∆ A (R) is very small with overwhelming probability over R. By Lemma 3.18, it suffices to bound the expectation E[∆ A (R)] = ∆ avg A for every A. We will bound this expected value as follows: in Section 4.1, we will apply a standard decoupling trick to the expression for the adversary's distinguishing advantage. Next, in Section 4.2 we will develop a natural spectral relaxation of this decoupled distinguishing advantage. Following that, in Section 4.3 we will use a matrix concentration inequality to bound the expectation of the spectral relaxation in terms of a quantity that we call the width of a collection of binary phase states. Then, in Section 4.4, we show how to bound the expected width of a random family of binary phase states. Finally, in Section 4.5 we will combine these ingredients and complete the proof of the one-query lower bound.
Notation. We will first fix some notation to use throughout the section. By Section 3.6, we can model the adversary as A = (M, V, Π), where M is an integer (a typical value of which is $M = 2^{\mathrm{poly}(n)}$), V : C N → C M is an isometry, and Π ∈ C M ×M is a projection. Writing v i,x for the (i, x)-th entry of V , we can express it as $V = \sum_{i=1}^{M} |i\rangle \langle v_i|$, where $\langle v_i| := \sum_{x=1}^{N} v_{i,x} \langle x|$ is the i-th row of V . This motivates the following definition.
Definition 4.1 (Isometry weights). The isometry weights are the numbers $\mathrm{wt}_{V,i} := \frac{1}{N} \cdot \langle v_i | v_i \rangle$, for 1 ≤ i ≤ M . Note that these sum to one and therefore form a probability distribution.

Decoupling the quadratic form
Our overall goal is to bound the adversary's distinguishing advantage. Writing h : [N ] → {±1} for a uniformly random Boolean function, the distinguishing advantage can be written as The coefficients of the vector |ψ R k ⟩ are independent {±1} Rademacher random variables, and indeed there are tools from random matrix theory which allow us to prove concentration bounds on matrices whose entries are linear combinations of Rademachers. However, the first term in Equation (24) is quadratic in the |ψ R k ⟩ vector, and so these tools cannot be immediately applied. What we would like to do is decouple the left random vector ⟨ψ R k | from the right random vector |ψ R k ⟩ so that this expression becomes a function of two independent random vectors, and is linear in both of them, rather than being quadratic in a single random vector. This motivates the following definition, a natural decoupled analogue of the distinguishing advantage. Then the corresponding decoupled distinguishing advantage is given by Unlike the normal distinguishing advantage, the decoupled distinguishing advantage has no natural operational interpretation. However, it still gives a convenient upper bound to the normal distinguishing advantage, as shown in the following lemma.
Decoupling inequalities are standard in the random matrix theory literature. Our proof follows an outline similar to other decoupling arguments, for example those in [vH17, Lemma 5.2] and [Ver11].
Proof of Lemma 4.3. Throughout this proof, we will adopt the following shorthand for convenience: given an oracle O acting on C M , we will write be a uniformly random function family. Let R ′ be an independent copy of R. Then for each 1 ≤ k ≤ K, R ′ k is distributed as a uniformly random function, even conditioned on R. As a result, the average distinguishing advantage is given by For each 1 ≤ k ≤ K, consider the two vectors Note that for each 1 ≤ k ≤ K and 1 ≤ independently and uniformly at random from {±1}. Define S − k (x) similarly. Then the next two equations follow by definition: Plugging this in above, The expression inside the max only depends on R and R ′ through S + and S − . Hence, this is equal to But it can be checked that S + and S − are just distributed as two independent and uniformly random function families. This completes the proof.

A spectral relaxation for the decoupled distinguishing advantage
In this section, we develop a spectral relaxation for the decoupled distinguishing advantage. As a precursor to this, we will develop a formula for expressing the vectors V · |ψ R k ⟩ and V · |ψ R ′ k ⟩. Let h : [N ] → {±1} be a Boolean function, and consider the binary phase state $|\psi_h\rangle = \frac{1}{\sqrt{N}} \sum_{x=1}^{N} h(x) \cdot |x\rangle$. The isometry V maps |ψ h ⟩ to $V \cdot |\psi_h\rangle = \sum_{i=1}^{M} \langle v_i | \psi_h \rangle \cdot |i\rangle$. Thus, the amplitude on the i-th basis element is ⟨v i |ψ h ⟩. We would like to estimate the magnitude of this amplitude for a "typical" binary phase state. This is given by the following proposition.

Proposition 4.4 (Typical amplitudes). Let h : [N ] → {±1} be a uniformly random Boolean function. Then for each 1 ≤ i ≤ M , $\mathbb{E}_h\, |\langle v_i | \psi_h \rangle|^2 = \mathrm{wt}_{V,i}$,
where wt V,i is the isometry weight defined in Definition 4.1.
Proof. We calculate the expectation as follows: where the second-to-last equality used the fact that E h [h(x)h(y)] = 1 if x = y and 0 otherwise, because h is uniformly random. The proof concludes by applying the definition of wt V,i .
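The identity behind Proposition 4.4, namely that the second moment of the amplitude $\langle v_i | \psi_h \rangle$ over a uniformly random h equals the isometry weight $\frac{1}{N}\langle v_i | v_i \rangle$, can be verified exactly on a small instance by enumerating all $2^N$ Boolean functions. The random isometry and the dimensions below are illustrative assumptions.

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
N, M = 4, 8

# A random M x N isometry V (orthonormal columns), chosen for illustration.
G = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))
V, _ = np.linalg.qr(G)

# wt_{V,i} = (1/N) * <v_i|v_i>, where <v_i| is the i-th row of V.
weights = np.sum(np.abs(V) ** 2, axis=1) / N

# Exact expectation of |<v_i|psi_h>|^2 over all 2^N functions h: [N] -> {+1,-1}.
second_moments = np.zeros(M)
for h in itertools.product([1, -1], repeat=N):
    psi_h = np.array(h) / np.sqrt(N)          # binary phase state
    second_moments += np.abs(V @ psi_h) ** 2  # |amplitude on |i>|^2
second_moments /= 2 ** N
```

Because the average is taken over every h, the agreement between `second_moments` and `weights` is exact up to floating-point error, and the weights sum to one since V is an isometry.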
In light of this, it is natural to define the following vector, which contains the "typical" amplitudes of V · |ψ h ⟩.
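Definition 4.5 (The weight vector). The weight vector is the unit vector given by $|\mathrm{wt}_V\rangle := \sum_{i=1}^{M} \sqrt{\mathrm{wt}_{V,i}} \cdot |i\rangle$. We can express V · |ψ h ⟩ in terms of the weight vector as $V \cdot |\psi_h\rangle = \sum_{i=1}^{M} \frac{\langle v_i | \psi_h \rangle}{\sqrt{\mathrm{wt}_{V,i}}} \cdot \sqrt{\mathrm{wt}_{V,i}} \cdot |i\rangle$. This motivates the following definition.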
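Definition 4.6 (The rescaling matrix). Let h : [N ] → {±1} be a Boolean function. Then the corresponding rescaling matrix is the diagonal matrix given by $D_{V,h} := \sum_{i=1}^{M} \frac{\langle v_i | \psi_h \rangle}{\sqrt{\mathrm{wt}_{V,i}}} \cdot |i\rangle\langle i|$. By construction, we have that V · |ψ h ⟩ = D V,h · |wt V ⟩.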
Remark 4.7. Loosely speaking, the size of the rescaling matrix indicates how close the amplitudes of V · |ψ h ⟩ are to their "typical" values. If each diagonal entry of D V,h is close to 1 in magnitude, then the amplitudes of V · |ψ h ⟩ are roughly typical; otherwise, at least one of the amplitudes of V · |ψ h ⟩ is atypically large or small.
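The relationship between the weight vector and the rescaling matrix can be sketched as follows. The code assumes that the weight vector has entries $\sqrt{\mathrm{wt}_{V,i}}$ and that the rescaling matrix is diagonal with entries $\langle v_i | \psi_h \rangle / \sqrt{\mathrm{wt}_{V,i}}$, which is the form forced by the identity V · |ψ h ⟩ = D V,h · |wt V ⟩; the random isometry is an illustrative choice whose rows are nonzero with probability one.

```python
import numpy as np

rng = np.random.default_rng(4)
N, M = 8, 16

# Random M x N isometry, for illustration only.
G = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))
V, _ = np.linalg.qr(G)

weights = np.sum(np.abs(V) ** 2, axis=1) / N   # isometry weights wt_{V,i}
weight_vector = np.sqrt(weights)                # assumed form of |wt_V>

def rescaling_matrix(h):
    """Diagonal matrix with entries <v_i|psi_h> / sqrt(wt_{V,i}) (assumed form)."""
    psi_h = h / np.sqrt(N)
    return np.diag((V @ psi_h) / np.sqrt(weights))

h = rng.choice([-1, 1], size=N)
D = rescaling_matrix(h)

lhs = V @ (h / np.sqrt(N))     # V |psi_h>
rhs = D @ weight_vector        # D_{V,h} |wt_V>

# Magnitudes of the diagonal entries; values near 1 indicate that the
# corresponding amplitudes of V |psi_h> are close to typical.
diag_magnitudes = np.abs(np.diag(D))
```

Inspecting `diag_magnitudes` for a few random choices of h gives a feel for the remark above: individual entries fluctuate around 1, and the width defined later averages the squares of these entries over the K states.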
We can therefore express the decoupled distinguishing advantage as Now we observe that O and D V,R k are both diagonal matrices, and hence they commute (and similarly for D V,R ′ k ). As a result, this is equal to Note that for any function f , O f · |wt V ⟩ is a unit vector. We can therefore upper-bound this expression by relaxing O f · |wt V ⟩ to be an arbitrary unit vector maximizing this expression. This gives the spectral relaxation.
From the above discussion, the following lemma is immediate.
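The relaxation step can be checked by brute force on a tiny instance. The sketch below assumes that the decoupled quantity has the form $\max_f |\langle \mathrm{wt}_V | O_f \cdot B \cdot O_f | \mathrm{wt}_V \rangle|$ with $B = \frac{1}{K} \sum_k D_{V,R_k}^\dagger \cdot \Pi \cdot D_{V,R'_k}$, and that the spectral relaxation is the operator norm of $B$; these forms are reconstructed from the surrounding discussion rather than quoted from it. All parameters and names are illustrative.

```python
import itertools
import numpy as np

rng = np.random.default_rng(5)
N, M, K = 4, 8, 3

# Random M x N isometry V and a random rank-(M/2) projector Pi (illustrative).
V, _ = np.linalg.qr(rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N)))
Q, _ = np.linalg.qr(rng.normal(size=(M, M)) + 1j * rng.normal(size=(M, M)))
Pi = Q[:, : M // 2] @ Q[:, : M // 2].conj().T

weights = np.sum(np.abs(V) ** 2, axis=1) / N
wt = np.sqrt(weights)                      # assumed weight vector |wt_V>

def rescaling_matrix(h):
    return np.diag((V @ (h / np.sqrt(N))) / wt)

R = rng.choice([-1, 1], size=(K, N))        # function family R
R_prime = rng.choice([-1, 1], size=(K, N))  # independent copy R'

# Assumed form of the matrix inside the spectral relaxation.
B = sum(rescaling_matrix(R[k]).conj().T @ Pi @ rescaling_matrix(R_prime[k])
        for k in range(K)) / K

# Maximize over all 2^M oracles f; O_f |wt_V> is the unit vector f * wt.
decoupled_value = 0.0
for f in itertools.product([1, -1], repeat=M):
    u = np.array(f) * wt
    decoupled_value = max(decoupled_value, abs(u.conj() @ B @ u))

spectral_value = np.linalg.norm(B, 2)       # operator norm of B
```

Since each $O_f |\mathrm{wt}_V\rangle$ is a unit vector, `decoupled_value` can never exceed `spectral_value`; the point of the relaxation is that the right-hand side no longer depends on f.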

Expectation of the spectral relaxation with one parameter held fixed
The spectral relaxation is the operator norm of a matrix which is bilinear in both R and R ′ . Keeping R fixed, we can consider a uniformly random R ′ : [K] × [N ] → {±1}, and doing so makes this a random matrix whose entries are linear combinations of random {±1} variables. The key technical result we will use to study such matrices is the following, stated in [Tro15, Theorem 4.1.1].
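Theorem 4.10 (Concentration for matrix Rademacher series). Let x 1 , . . . , x n be n independent, uniformly distributed {±1} random variables. Let Z be a d 1 × d 2 complex matrix whose entries are linear combinations of the x k 's, i.e. $Z_{i,j} = \sum_{k=1}^{n} c_{i,j,k} \cdot x_k$,
where each c i,j,k is a fixed, complex number. Let v(Z) be the matrix variance statistic of Z, i.e. $v(Z) = \max\{\|\mathbb{E}[Z Z^\dagger]\|_{\mathrm{op}}, \|\mathbb{E}[Z^\dagger Z]\|_{\mathrm{op}}\}$. Then $\mathbb{E}\,\|Z\|_{\mathrm{op}} \leq \sqrt{2 \cdot v(Z) \cdot \ln(d_1 + d_2)}$. Furthermore, for all t ≥ 0, $\Pr[\|Z\|_{\mathrm{op}} \geq t] \leq (d_1 + d_2) \cdot \exp\big(-t^2 / (2 \cdot v(Z))\big)$.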
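As a sanity check on how this kind of bound is used, the following Monte Carlo sketch compares the empirical mean of the operator norm of a matrix Rademacher series with the expectation bound $\sqrt{2 \cdot v(Z) \cdot \ln(d_1 + d_2)}$ from [Tro15, Theorem 4.1.1]. The coefficient matrices are random and purely illustrative, and the sample size is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(6)
d1, d2, n_vars = 5, 7, 20

# Fixed complex coefficient matrices C_k, so that Z = sum_k x_k * C_k.
C = rng.normal(size=(n_vars, d1, d2)) + 1j * rng.normal(size=(n_vars, d1, d2))
C /= np.sqrt(n_vars)

# Matrix variance statistic: since E[x_k x_l] = 1 if k = l and 0 otherwise,
# E[Z Z^dagger] = sum_k C_k C_k^dagger, and similarly for Z^dagger Z.
row_term = sum(Ck @ Ck.conj().T for Ck in C)
col_term = sum(Ck.conj().T @ Ck for Ck in C)
v = max(np.linalg.norm(row_term, 2), np.linalg.norm(col_term, 2))

bound = np.sqrt(2 * v * np.log(d1 + d2))

# Empirical mean of the operator norm over random sign vectors x.
samples = []
for _ in range(2000):
    x = rng.choice([-1, 1], size=n_vars)
    Z = np.tensordot(x, C, axes=1)          # sum_k x_k * C_k
    samples.append(np.linalg.norm(Z, 2))
empirical_mean = float(np.mean(samples))
```

The empirical mean should sit below `bound`, typically with some slack coming from the logarithmic dimension factor.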
We now use this to upper bound the expectation of the spectral relaxation when one of the parameters is held fixed. It states that this expectation can be bounded in terms of a quantity called the width of the function family R. Roughly speaking, the width is a measure of the "size" of the diagonal rescaling matrices D V,R k , over all 1 ≤ k ≤ K. As discussed in Remark 4.7, when R is a "typical" function family, we expect that these rescaling matrices should have small (i.e. close to 1) entries on the diagonal, in which case the width of R will be small. For atypical function families, on the other hand, the width might be large, but we expect such families to be extremely rare.
In addition, let R ′ : [K] × [N ] → {±1} be a uniformly random function family. Then Proof. For the reader's convenience, we will recall the definition of the diagonal rescaling matrix corresponding to a Boolean function h : [N ] → {±1}: Note that each entry of D V,h is a linear combination of the Boolean values h(1), . . . , h(N ). In addition, note that Our goal is to compute To this end, define the matrix For each 1 ≤ k ≤ K, D V,R ′ k is a matrix whose entries are linear combinations of the R ′ k (x)'s. As a result, the entries of Z are linear combinations of the K · N many {±1}-valued random variables in R ′ . Hence, we can apply Theorem 4.10 to bound E R ′ [∥Z∥ op ]. To do so, we must first compute the matrix variance statistic of Z. To begin, Now, if k ≠ k ′ , then R ′ k and R ′ k ′ are distributed independently from each other. As a result, for any fixed matrix C, because D V,R ′ k and D V,R ′ k ′ are mean-zero. (For Equation (28) above we only need the C = Id M ×M case, but we will apply it below using a different matrix C.) On the other hand, if k = k ′ , then by Proposition 4.4, Combining these two facts, we have that Finally, we bound this by So far, we have shown that Thus far, we have only computed the first term in the matrix variance statistic of Z. Now we move on to the second term. Fortunately, we can reuse many of the steps involved in computing the first term to compute the second term: (by Equations (27) and (29)) Now, let h : [N ] → {±1} be a uniformly random Boolean function. Then h has the same distribution as R ′ k for each 1 ≤ k ≤ K. As a result, this is equal to (by Equations (27) and (30)) In total, this shows that As a result, the matrix variance statistic of Z is Now we apply Theorem 4.10. It states that This completes the proof.

A bound on the width of a random state family
In the previous section, we showed that the expectation of the spectral relaxation, when one of the input state families R ′ : [K] × [N ] → {±1} is randomized, can be bounded by a parameter of the other input family R referred to as its width. In this section, we show how to bound the expected width of a uniformly random family of binary phase states R : [K] × [N ] → {±1}. For intuition, recall that width(R) is defined to be the quantity $\mathrm{width}(R) = \max_{1 \leq i \leq M} \frac{1}{K} \sum_{k=1}^{K} \frac{|\langle v_i | \psi_{R_k} \rangle|^2}{\mathrm{wt}_{V,i}}$. Let us fix a value 1 ≤ i ≤ M and consider the i-th average being maximized over. By Proposition 4.4, for each 1 ≤ k ≤ K, the k-th term in the average has expectation exactly equal to 1, and indeed we will show that this term is close to 1 with high probability. As the i-th average is an average over K such terms, we expect that it should be extremely close to 1 with an extremely high probability, a probability so high that we can then union bound over all 1 ≤ i ≤ M and show that width(R) itself is close to 1 with high probability. From this, we will be able to conclude that the expectation is close to 1 as well.
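To start, let us focus on the k-th term in the i-th average. It is the absolute value squared of the following quantity: $\frac{\langle v_i | \psi_{R_k} \rangle}{\sqrt{\mathrm{wt}_{V,i}}} = \sum_{x=1}^{N} \frac{v_{i,x}}{\sqrt{N \cdot \mathrm{wt}_{V,i}}} \cdot R_k(x)$. This is just a complex-weighted linear combination of random {±1} variables. In addition, the sum of the squared weights is given by $\sum_{x=1}^{N} \frac{|v_{i,x}|^2}{N \cdot \mathrm{wt}_{V,i}} = \frac{\langle v_i | v_i \rangle}{N \cdot \mathrm{wt}_{V,i}} = 1$. We would like to show that weighted sums of this form are highly concentrated. In particular, we will show that they possess a particular concentration property known as being sub-exponential.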
Definition 4.12 (Sub-exponential random variables). A random variable X is sub-exponential with parameter γ > 0 if This is shown in the next lemma, whose proof we defer to Section 4.6.
Lemma 4.13 (Each term in the width is sub-exponential). There exists a constant γ ≥ 1 such that the following is true. Let b 1 , . . . , b m be independent and uniform ±1 random variables, and let a 1 , . . . , a m be complex numbers such that |a 1 | 2 + · · · + |a m | 2 = 1. Define S = a 1 · b 1 + · · · + a m · b m . Then the random variable |S| 2 − 1 is mean-zero and sub-exponential with parameter γ.
Now that we have shown our random variables are well-concentrated, we would like to show that averages of them, as occur in the formula for the width (Equation (32) above), are extremely well-concentrated. This can be shown using Bernstein's inequality for averages of independent sub-exponential random variables, which is stated in [Ver18, Corollary 2.8.3].
Theorem 4.14 (Bernstein's inequality). There exists a constant c > 0 such that the following is true. Let X 1 , . . . , X m be independent, mean-zero, sub-exponential random variables, each with sub-exponential parameter at most γ. Then we have Now we combine these ingredients to show the following tail bound on the width. Lemma 4.13 states that there is a constant γ ≥ 1 such that (width i,k (R) − 1) is sub-exponential with parameter γ, for all 1 ≤ i ≤ M and 1 ≤ k ≤ K. Now, fix a value 1 ≤ i ≤ M . Since each width i,k (R) only depends on R k , the random variables (width i,k (R) − 1) are independent across all 1 ≤ k ≤ K. As a result, Bernstein's inequality states that there exists a constant c > 0 such that for all t ≥ 0, Now, since the width is defined as width(R) = max 1≤i≤M {width i (R)}, we have that This completes the proof, by taking the constant "c" in the lemma statement to be c/γ 2 .
Finally, we use our tail bound to derive an expectation bound on the width. Our proof will allow us to prove a bound of 1 + o(1) for a wide range of parameters M and K, as our initial intuition suggested. However, to get a bound which applies to the widest relevant range of parameters, we will prove a slightly weaker O(1) bound, which is still sufficient for our applications.  Proof. Fix some α ≥ 1, to be determined later. Then We can compute the integral exactly: In total, this gives us a bound of Now we select α to be α = max{1, c −1 }. Then we get a bound of When M ≤ e K , this is at most which is a constant. Picking this for the "C" in the lemma statement completes the proof.
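The behavior of the width for a random family can be observed numerically. The sketch below assumes the width is $\max_i \frac{1}{K} \sum_k |\langle v_i | \psi_{R_k} \rangle|^2 / \mathrm{wt}_{V,i}$, as suggested by the discussion at the start of this subsection, and evaluates it for a random isometry and random families R of increasing size K. The parameters are illustrative and the code is a simulation rather than a proof of any bound.

```python
import numpy as np

rng = np.random.default_rng(7)
N, M = 16, 32

# Random M x N isometry V, for illustration.
G = rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))
V, _ = np.linalg.qr(G)
weights = np.sum(np.abs(V) ** 2, axis=1) / N

def width(R):
    """Assumed width of a family R, given as a (K, N) array of +-1 entries."""
    # Entry (k, i) is the amplitude <v_i | psi_{R_k}> = sum_x v_{i,x} R_k(x) / sqrt(N).
    amplitudes = (R / np.sqrt(N)) @ V.T
    ratios = np.abs(amplitudes) ** 2 / weights   # one term per (k, i)
    return ratios.mean(axis=0).max()             # average over k, max over i

widths = {K: width(rng.choice([-1, 1], size=(K, N)))
          for K in (10, 100, 1000, 10000)}
```

Under this definition the width is always at least 1, because the weighted average of the M per-index averages (with weights $\mathrm{wt}_{V,i}$) equals $\frac{1}{K} \sum_k \|V |\psi_{R_k}\rangle\|^2 = 1$. The simulated values approach 1 from above as K grows relative to ln M, in line with the intuition given earlier in this subsection.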

The one-query lower bound
Now we complete the proof of the one-query lower bound. We begin by proving a bound on the expected value of the distinguishing advantage.
(by Lemma 4.9)
$\le 4 \cdot \mathbb{E}_R\left[\sqrt{2\ln(2M) \cdot \mathrm{width}(R)/K}\right]$ (by Lemma 4.11)
$\le 4 \cdot \sqrt{\mathbb{E}_R\left[2\ln(2M) \cdot \mathrm{width}(R)/K\right]}$ (by Jensen's inequality)
$\le 4 \cdot \sqrt{2\ln(2M) \cdot C/K}$ (by Lemma 4.16, for some constant C ≥ 1)
Picking the "C" in the theorem statement to be $8\sqrt{C}$, this completes the $M \le e^K$ case. As for the $M > e^K$ case, we note that because the distinguishing advantage is a difference of two probabilities, it is always at most 1. Hence, The first inequality is because C ≥ 1, and the second inequality is because $M > e^K$. This completes the $M > e^K$ case, and therefore completes the proof.
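The choice of constant can be checked in one line. The sketch below assumes that the theorem's bound has the form of a constant times $\sqrt{\ln(M)/K}$ (as the treatment of the $M > e^K$ case suggests) and that M ≥ 2, so that ln(2M) ≤ 2 ln(M):

```latex
4\cdot\sqrt{\frac{2\ln(2M)\cdot C}{K}}
\;\le\; 4\cdot\sqrt{\frac{4\ln(M)\cdot C}{K}}
\;=\; 8\sqrt{C}\cdot\sqrt{\frac{\ln M}{K}}.
```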
In particular, this implies Theorem 1.4.

Technical lemma: sub-exponential random variables
Now we prove Lemma 4.13. For convenience, we restate it here.
To compute the mean of |S| 2 − 1, we will use the following proposition.
Proposition 4.20. Let b 1 , . . . , b m be independent and uniform ±1 random variables, and let a 1 , . . . , a m be complex numbers. Then S = a 1 · b 1 + · · · + a m · b m satisfies $\mathbb{E}[|S|^2] = |a_1|^2 + \cdots + |a_m|^2$.
Proof. We calculate $\mathbb{E}[|S|^2] = \sum_{i,j} a_i \overline{a_j} \cdot \mathbb{E}[b_i b_j] = \sum_{i} |a_i|^2$, where the final step used E[b i b j ] = 1 if i = j and 0 if i ≠ j. This completes the proof.
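The identity behind Proposition 4.20 can be checked exactly for small m by enumerating all sign patterns. The following is a minimal sketch; the function name `second_moment` and the coefficient vector `a` are illustrative choices, not part of the paper.

```python
import itertools


def second_moment(a):
    """Exact E[|S|^2] for S = sum_i a_i * b_i with uniform signs b_i,
    computed by averaging over all 2^m sign patterns."""
    m = len(a)
    total = 0.0
    for signs in itertools.product((1, -1), repeat=m):
        s = sum(ai * bi for ai, bi in zip(a, signs))
        total += abs(s) ** 2
    return total / 2 ** m


# Arbitrary complex coefficients (hypothetical example values).
a = [0.5 + 0.5j, -0.3j, 0.2, 1 - 2j]
lhs = second_moment(a)                  # E[|S|^2] by enumeration
rhs = sum(abs(ai) ** 2 for ai in a)     # |a_1|^2 + ... + |a_m|^2
print(lhs, rhs)
```

The cross terms cancel because each pair of distinct signs is uncorrelated, which is exactly the step used in the proof.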
To show concentration for |S| 2 − 1, we use the following tail bound, a version of Hoeffding's inequality for complex-weighted random sums.
Theorem 4.21 (Sub-Gaussian concentration for sums of complex random variables). Let b 1 , . . . , b m be independent and uniform ±1 random variables, and let a 1 , . . . , a m be complex numbers. Then S = a 1 · b 1 + · · · + a m · b m satisfies As it turns out, this can be proved as a (very) special case of the matrix concentration tail bound stated in Theorem 4.10.
Then its "matrix" variance parameter is where the final step used Proposition 4.20. Then the tail bound of Theorem 4.10 implies that for all t ≥ 0, This completes the proof.
The following is an immediate corollary of Theorem 4.21.
Corollary 4.22. Let S be as in Lemma 4.19. Then |S| 2 is a sub-exponential random variable with parameter γ = 2.
Proof. By Theorem 4.21, This means that |S| 2 is a sub-exponential random variable with parameter γ = 2.
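As a sanity check on Corollary 4.22, one can compute the tail of |S| 2 exactly for a small unit-norm coefficient vector. The sketch below assumes that the tail bound in question reads Pr[|S| 2 ≥ t] ≤ 2 · exp(−t/2), which is what the value γ = 2 suggests; the coefficient vector and the grid of thresholds are illustrative choices.

```python
import itertools
import math


def tail_probability(a, t):
    """Exact Pr[|S|^2 >= t] for S = sum_i a_i * b_i with uniform signs b_i."""
    m = len(a)
    hits = sum(
        1
        for signs in itertools.product((1, -1), repeat=m)
        if abs(sum(ai * bi for ai, bi in zip(a, signs))) ** 2 >= t
    )
    return hits / 2 ** m


# Hypothetical coefficients, rescaled so that sum_i |a_i|^2 = 1.
raw = [1 + 1j, 2, -1j, 0.5 - 0.5j, 1, 1j, -2 + 1j, 1]
norm = math.sqrt(sum(abs(x) ** 2 for x in raw))
a = [x / norm for x in raw]

# Thresholds t = 0.25, 0.5, ..., 10 at which the assumed bound fails (expected: none).
violations = [
    t
    for t in (0.25 * j for j in range(1, 41))
    if tail_probability(a, t) > 2 * math.exp(-t / 2)
]
print(violations)
```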
Now we want to show that |S| 2 − 1 is also sub-exponential, taking advantage of the fact that E[|S| 2 ] = 1. To do so, we will use standard facts about sub-exponential random variables from [Ver18, Section 2.7]. In particular, we will rely on an alternative method of parameterizing sub-exponential random variables in terms of their moment generating functions (MGFs).
Definition 4.23 (Sub-exponential norm). Given a real random variable X, the MGF of |X| is The smallest κ for which this equation holds is given by the sub-exponential norm of X, denoted ∥X∥ ψ 1 , and is defined formally as follows:
We require two facts about this method of parameterizing sub-exponential random variables. The first is stated in [Ver18, Proposition 2.7.1] and the second is stated in [Ver18, Exercise 2.7.10].
Proposition 4.24 (Approximate equivalence of the two parameterizations). There is an absolute constant C 1 > 0 such that the following is true. If the MGF of |X| is bounded at point κ, then X is sub-exponential with parameter γ, for some γ ≤ C 1 · κ. Likewise, if X is sub-exponential with parameter γ, then the MGF of |X| is bounded at point κ, for some κ ≤ C 1 · γ.
Proposition 4.25 (Centering). There is an absolute constant C 2 > 0 such that the following is true. If X is a sub-exponential random variable, then so is X − E[X], and it satisfies ∥X − E[X]∥ ψ 1 ≤ C 2 · ∥X∥ ψ 1 .

Now we prove Lemma 4.19.
Proof of Lemma 4.19. First, Proposition 4.20 states that Hence, |S| 2 − 1 is mean-zero. Next, Corollary 4.22 states that |S| 2 is a sub-exponential random variable with parameter γ 1 = 2. Proposition 4.24 then implies that the MGF of ||S| 2 | is bounded at point κ 1 , for some By definition of the sub-exponential norm, this immediately implies that ∥|S| 2 ∥ ψ 1 ≤ 2 · C 1 . Proposition 4.25 then implies that Now |S| 2 − 1 is a non-constant random variable, and in particular it is nonzero with positive probability.
In addition, it only attains a discrete set of values. Hence, the infimum over {t > 0} in the definition of the sub-exponential norm ∥|S| 2 − 1∥ ψ 1 is achieved at a nonzero minimizing value κ 2 > 0; in other words, if we set then the MGF of ||S| 2 − 1| is bounded at point κ 2 . Applying Proposition 4.24 again, this implies that |S| 2 − 1 is sub-exponential with parameter γ 2 , for some Now, we note that if a random variable X is sub-exponential with parameter a > 0, then it is also sub-exponential with parameter b, for any b ≥ a. This is because for all t > 0, Hence, because |S| 2 − 1 is sub-exponential with parameter γ 2 ≤ 2 · C 2 1 · C 2 , it is also sub-exponential with parameter γ = max{1, 2 · C 2 1 · C 2 }. This is a constant which is greater than or equal to 1, which completes the proof.

Pseudorandom states relative to a random oracle
In this section, we use Theorem 4.18 to derive Theorem 1.2, our lower bound for breaking pseudorandom state families. We begin with a definition of (single-copy) pseudorandom states in the plain model, for reference.
• Efficient constructability: there is a polynomial-time quantum algorithm that on input (1 λ , k), for k ∈ {0, 1} λ , outputs |ψ λ,k ⟩.
• Pseudorandomness: for all algorithms A described by polynomial-size quantum circuit families, we have that where |ψ⟩ is drawn from the Haar distribution on n(λ)-qubit states.
Our Oracle and Adversary Model. In this paper, we consider pseudorandom state families defined relative to an oracle R : In that case, the efficient constructability property requires that there is a quantum polynomial-time oracle algorithm that on input (1 λ , k), for k ∈ {0, 1} λ , outputs |ψ λ,k ⟩, given oracle access to R.
In addition, the pseudorandomness property should require that the PRS family be secure against all algorithms A f , where A (·) is an oracle algorithm described by a polynomial-size oracle circuit family and f = f R is an arbitrary R-dependent oracle; equivalently, A (·) is computable by a family of quantum circuits output by a polynomial-time Turing machine with the help of polynomial-size non-uniform advice.
The main result of this section (Theorem 5.2 below) proves that relative to a random oracle, there are PRS families secure against all one-query attacks. Explicitly, the adversary model we consider is as follows:
• For a given function R : {0, 1} * × {0, 1} * → {±1}, the adversary is described by an R-dependent Turing machine and R-dependent collection of advice strings (z λ ) λ∈N .
• On input z λ , the Turing machine outputs the description of a one-query oracle circuit A (·) := A (·) R,z λ .
Theorem 5.2 (Theorem 1.2 formalized). Let n(λ) be any efficiently computable polynomial function in λ such that n(λ) ≥ λ + 1 for all λ. Then with probability 1 over the choice of a random oracle R : {0, 1} * × {0, 1} * → {±1}, the following is true relative to R. There exists a PRS family consisting of n(λ)-qubit quantum states that is secure against all polynomial-time quantum algorithms A f that have polynomial-size non-uniform classical advice and make one query to an arbitrary Boolean function f : Proof. For each λ ∈ N, we define the function family R λ : for each k ∈ {0, 1} λ and x ∈ {0, 1} n(λ) . Then the candidate PRS family is the state family ensemble which contains the family of n(λ)-qubit quantum states for each security parameter λ ∈ N. By construction, the state |ψ R λ k ⟩ can be generated in time poly(λ) given a single oracle call to R. Thus, all that remains is to establish security.
Security nearly follows from Theorem 4.18, except that the order of quantifiers is wrong: in Theorem 4.18, the oracle circuit A (·) is not allowed to depend on R, although the function f it queries is. However, in this setting, A (·) is allowed to depend on R. We handle this by a standard quantifier-switching argument using the Borel-Cantelli lemma [BG81,IR89], which applies even in the case of A with bounded non-uniformity.
The argument is as follows. We abuse notation and let A(·) denote a polynomial-time Turing machine that on input z λ outputs a one-query oracle circuit A (·) z λ (·). The adversary runs A f z λ on input state |ψ⟩ using an arbitrary R-dependent oracle f = f R . Here, z = {z λ } λ is a collection of advice strings in which z λ has length poly(λ). Because A(·) runs in polynomial time, the query length of A f z λ is bounded by some p(λ) = poly(λ). As a result, by Theorem 4.18 (setting ε = 1/√K), we know that for every security parameter λ ∈ N, where the last inequality uses the fact that n(λ) ≥ λ + 1. We may then union bound over the 2 p(λ) possible advice strings z λ and conclude that for a universal constant c > 0 and all sufficiently large λ. Let E λ denote the above event. Then, we know that the summation Therefore, for all sufficiently large λ ∈ N, no matter what advice z = {z λ } λ the algorithm is given. Finally, we observe that the probability space above is uncountable. Therefore, we may union bound over all countably many polynomial-time Turing machines A(·) and conclude that Equation (33) holds for all A(·) and all sufficiently large λ ∈ N. This shows that the PRS family satisfies the claimed pseudorandomness property, concluding the proof.
We will largely follow the same notation as the proof in Section 4, which can be found in Section 3.1 as well as in Section 4.2. For convenience, we repeat several important definitions and results here.
Definition A.1 (Isometry weights, Definition 4.1 restated). The isometry weights are the numbers for 1 ≤ i ≤ M . Note that these sum to one and therefore form a probability distribution.
Definition A.3 (The weight vector, Definition 4.5 restated). The weight vector is the unit vector given by
Definition A.4 (The rescaling matrix, Definition 4.6 restated). Let h : [N ] → {±1} be a Boolean function. Then the corresponding rescaling matrix is the diagonal matrix given by By construction, we have that V · |ψ h ⟩ = D V,h · |wt V ⟩.
Theorem A.5 (Sub-Gaussian concentration for sums of complex random variables, Theorem 4.21 restated). Let b 1 , . . . , b m be independent and uniform ±1 random variables, and let a 1 , . . . , a m be complex numbers. Then S = a 1 · b 1 + · · · + a m · b m satisfies
Now we move to the proof.

A.1 A spectral relaxation for the distinguishing advantage
Next, it applies an oracle O f . This will produce the state Now we observe that O f and D V,h are both diagonal matrices, and hence they commute. As a result, Note that O f · |wt V ⟩ is always a unit vector, and that it is independent of h. Finally, the adversary performs the measurement {Π, I − Π} and accepts if it observes the first outcome. We can therefore calculate the acceptance probability of the adversary A with oracle access to a function f : [M ] → {±1} as:
Let R : [K] × [N ] → {±1} be a function family. The above calculation and the definition of the distinguishing advantage ∆ A (R) allow us to conclude that Recall that O f · |wt V ⟩ is a unit vector which depends on V and f . We can therefore upper-bound this expression by relaxing O f · |wt V ⟩ to be an arbitrary unit vector maximizing this expression. This gives the spectral relaxation.
Definition A.6 (Spectral relaxation). Let R : [K] × [N ] → {±1} be a function family. The spectral relaxation of the distinguishing probability on R is given by
From the above discussion, the following lemma is immediate.
In the worst case, this relaxation can be quite poor. The following is an example in which the relaxation is equal to √ N − 1, even though the distinguishing value ∆ A (R) can never be more than one.
Example A.8 (A large relaxation value). For this example, we will view the space C N as corresponding to n qubits, so that the standard basis contains the vector |x⟩ for each x ∈ {0, 1} n . With this viewpoint, a binary phase state is specified by a Boolean function h : {0, 1} n → {±1} and is given by Suppose the adversary's strategy does not expand the Hilbert space (so that M = N ). In addition, suppose that the isometry V is just the n-qubit Hadamard transform V = H ⊗n , and that the measurement Π is just the n-qubit identity matrix Π = I N ×N . In this case, the rows of V are just the binary phase states |ψ χα ⟩, where χ α : {0, 1} n → {±1} is the Boolean function χ α (x) = (−1) ⟨α,x⟩ ; in other words, $V = \sum_{\alpha \in \{0,1\}^n} |\alpha\rangle\langle\psi_{\chi_\alpha}|$.
As a result, the weight wt V,α = 1/N for all α ∈ {0, 1} n , and so the rescaling matrix is given by Now we compute the two terms in the spectral relaxation. The second term is independent of the function family R, and so we compute it first: where the last equality used the fact that $\mathbb{E}_h |\langle\psi_{\chi_\alpha}|\psi_h\rangle|^2 = \mathrm{wt}_{V,\alpha} = 1/N$ due to Proposition A.2. As for the first term, consider a worst-case function family R in which every R k is equal to the same parity R k = χ α , for some α ∈ {0, 1} n . Then for every 1 ≤ k ≤ K, the rescaling matrix is given by As a result, The operator norm of this matrix is √ N − 1, and so ∆ Spectral
The reason that this example has such a large relaxation value is that the rescaling matrices D V,R k all have an extremely large diagonal entry and therefore an extremely large operator norm. We would like to rule out examples like this by only considering function families R in which the rescaling matrices have operator norms which are not too much larger than 1. This motivates the following definition.
Example A.8 showed that there exist worst-case function families R which are not B-bounded for small values of B. However, the next lemma shows that an average-case function family will in fact be B-bounded with extremely high probability. This lemma follows as a simple corollary of the following lemma by applying it to each function R k separately and then union bounding over all 1 ≤ k ≤ K. The key technical tool in the proof of this lemma is Theorem A.5.
Proof of Lemma A.10. Fix a 1 ≤ i ≤ M , and let us consider D V,h,i . By definition, where As a result, Theorem A.5 says that Union bounding over all 1 ≤ i ≤ M , we have that This completes the proof.

A.2 Truncating the spectral relaxation
Although Lemma A.10 shows that the overwhelming majority of function families R are B-bounded, it will still be convenient to modify the spectral relaxation slightly so that the rare bad events do not lead to extremely large values, as in Example A.8. We will handle this by truncation.
Definition A.12 (B-truncation). Let trunc B : C → C be the function which, on input t ∈ C, acts as follows: By design, |trunc B (t)| ≤ B for all t ∈ C. Now we use this to define a truncated version of the rescaling matrix. Then the B-truncated rescaling matrix is the diagonal matrix given by With this in hand, we can define truncated analogues of the distinguishing advantage and the spectral relaxation.
In addition, the B-truncated distinguishing advantage and B-truncated spectral relaxation are defined as follows.
Note that the B-truncated spectral relaxation remains a spectral relaxation of the B-truncated distinguishing advantage, in that ∆ A,B (R) ≤ ∆ Spectral A,B (R).
As we have seen, a random function family is B-bounded with overwhelming probability. This suggests that the B-truncated analogue of the distinguishing advantage should not be too far from the regular distinguishing advantage. This is shown in the next lemma.
Before proving this, we will need to establish the following technical lemma.
Proof. The first step of the proof is a simple triangle inequality: where the second inequality is because p A (h | f ) is an acceptance probability and therefore at most 1. As for the second term, we note that We note that although this bound is quantitatively weaker than Theorem 4.18, it still gives a strong lower bound. For our typical settings of K and M , it states that ∆ A (R) is roughly bounded by 1/K 1/4 with all but a negligible probability. The key technical result we use to prove this is the following variant of the matrix Chernoff bound, stated in [Tro12, Theorem 1.3].
Theorem A.18 (Matrix Hoeffding). Let {Z k } be a set of independent, Hermitian, random matrices with dimension D. Let {C k } be a set of fixed Hermitian matrices. Assume that for all k, E Z k = 0 and Z 2 k ⪯ C 2 k . Set Then
By applying matrix Hoeffding to both $\sum_k Z_k$ and $-\sum_k Z_k$, we can derive the following concentration bound for the operator norm.
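For reference, the following is a sketch of the standard statement of [Tro12, Theorem 1.3] in the notation above, together with the two-sided bound obtained by applying it to both signs and union bounding. The exact arrangement of constants in the corollary used in this appendix may differ.

```latex
\sigma^2 := \Bigl\|\sum_k C_k^2\Bigr\|,
\qquad
\Pr\Bigl[\lambda_{\max}\Bigl(\sum_k Z_k\Bigr) \ge t\Bigr] \;\le\; D\cdot e^{-t^2/8\sigma^2},
\qquad
\Pr\Bigl[\Bigl\|\sum_k Z_k\Bigr\| \ge t\Bigr] \;\le\; 2D\cdot e^{-t^2/8\sigma^2}
\quad\text{for all } t \ge 0.
```

The second bound holds because the matrices −Z k satisfy the same hypotheses as the Z k , and the operator norm of a Hermitian matrix is the larger of λ max of the matrix and λ max of its negation.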
where the second inequality is due to Lemma A.10 and the fact that ∆ A,B (R) ≤ ∆ Spectral A,B (R). We will now focus on bounding the first term. By definition of the B-truncated spectral relaxation, To analyze this, we note that for each 1 ≤ k ≤ K, R k is distributed as a uniformly random function, and so R k has the same distribution as h. Hence, if we keep k fixed and randomize over R, This means that the random matrix has E R [X R k ] = 0. In terms of these matrices, our goal is to bound Note the following properties of X R k :
1. For each k, X R k only depends on R k . Hence, the random variables X R k , over all 1 ≤ k ≤ K, are independent and identically distributed.
2. X R k is an M × M matrix.
3. To bound the operator norm of X R k , we begin with the bound Now, let us bound the operator norm of D † V,R k ,B · Π · D V,R k ,B . Let |v⟩ be any unit vector. Then because D V,R k ,B is a diagonal matrix whose diagonal entries have magnitude at most B, D V,R k ,B · |v⟩ has norm at most B. Hence, As a result, D † V,R k ,B · Π · D V,R k ,B has spectral norm at most B 2 ; a similar argument will show that D † V,h,B · Π · D V,h,B has spectral norm at most B 2 as well. Thus, the spectral norm of X R k is at most 2B 2 , and so X 2 R k ⪯ 4B 4 · Id always.
Now we are in a good place to apply matrix Hoeffding. In our setting, the matrices X R 1 , . . . , X R K are independent, Hermitian, and have dimension M . Furthermore, we know that X 2 R k ⪯ 4B 4 · Id always. Hence, our value of σ 2 is

Now, our goal is to bound
We can apply Corollary A.19 to this quantity, which tells us that Pr R
This completes the proof.

B On the power of counting arguments
In this section, we will consider the power of counting arguments to show lower bounds for the Oracle State Distinguishing Game. Counting arguments apply to the case when the adversary cannot compute too many unitaries, which we formalize as follows.
Definition B.1 (Small oracle circuits). An oracle circuit A (·) is S-small if the number of distinct unitaries A f , ranging over all oracles f , is at most S.
Although most oracle circuits are not "small" enough to be useful, there are a few interesting families of small oracle circuits, which we state below. As always, we will write N := 2 n for the size of the oracle circuit's input register, and M := 2 m for the total size of the oracle circuit's registers.
Example B.2 (Aaronson-Kuperberg adversaries). As described in Section 1.3, Aaronson and Kuperberg [AK07] considered oracle circuits A (·) in which for every oracle f , A f exactly computes some unitary transformation on its first n qubits. They showed that any such oracle circuit is 4 N -small [AK07, Theorem 6.7].
Example B.3 (Adversaries with no ancillas but multiple oracles). Consider an oracle circuit with n input qubits and no ancilla qubits which makes t queries. For this example only, we will depart from our usual notation and allow the queries to be made to t potentially different functions f 1 , . . . , f t . Then there are at most 2 N choices for each function f i , and so this oracle circuit is (2 N ) t = 2 N t -small. If t = poly(n), then this is 2 N ·poly(n) -small.
Example B.4 (Small-ancilla adversaries). Let A (·) be an oracle circuit which makes multiple queries to a single function f and uses n input qubits and a ancilla qubits, for a total of m = n + a qubits. Then A (·) is 2 M -small.
Note that the bound in Example B.4 subsumes Example B.3. This is because by Remark 3.5, an adversary which makes t queries to different functions f 1 , . . . , f t can be simulated by an oracle circuit A (·) which uses ⌈log 2 t⌉ additional ancilla qubits and queries a single function f . Now we state our main bound, which rules out adversaries for the Oracle State Distinguishing Game which are "small".
Theorem B.5 (Counting bound). There is a universal constant c > 0 such that the following is true. Consider the Oracle State Distinguishing Game played with a uniformly random function family R : [K] × [N ] → {±1}. Let A (·) be an adversary which is S-small, for $S = \exp(c \cdot \varepsilon^2 K N)$. Then Pr
In the context of our examples, this rules out Aaronson-Kuperberg adversaries, so long as K = Ω(1/ε 2 ). This also rules out small-ancilla adversaries. For example, if the adversary uses $a = \tfrac{1}{2} \cdot \log_2(K)$ ancilla qubits, then its total size is $M = N \cdot \sqrt{K}$, and so it is $2^{N\sqrt{K}}$-small, which is small enough (assuming reasonable settings of parameters) for Theorem B.5 to apply. On the other hand, Theorem B.5 cannot rule out general adversaries which use a = log 2 (K) ancilla qubits or more. For example, a natural adversary might intend to perform the query (k, x) → R(k, x) in superposition, and to do so it needs log 2 (K) + n qubits, putting it in the range where Theorem B.5 no longer applies. This shows the limitation of this style of counting argument: it becomes ineffective once the adversary has even a small number of ancilla qubits.
Now we prove Theorem B.5. We will do so using a standard "concentration and union bound" approach: we prove a tail inequality on the probability that A f results in a good attack for a fixed f , and then we union bound over all f .
Lemma B.6 (Success probability of no-query adversaries). There is a universal constant c > 0 such that the following is true. Let R : [K] × [N ] → {±1} be a uniformly random function family. Let A (·) be an adversary that does not make any queries. Then
Proof. We prove this using tools from Section 3.4. Since A (·) makes no queries, let us fix an arbitrary f and note that ∆ A (R) = ∆ A (R | f ) for all function families R. Then where p A,0 (· | f ) is the extension of p A (· | f ) to bounded functions R : for some absolute constant c > 0.
But by Equation (34) the left-hand side is Pr R [∆ A (R) ≥ ε], and so this completes the proof, with the "c" in the lemma statement being equal to c/4.
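Returning to the discussion following Theorem B.5, the threshold on the number of ancilla qubits can be made explicit. The sketch below assumes the theorem applies whenever the adversary is S-small for $S = \exp(c \cdot \varepsilon^2 K N)$ and uses the bound $2^M$ from Example B.4 with $M = 2^a \cdot N$; the constant c is the unspecified universal constant from the theorem.

```latex
2^{M} \le \exp\!\left(c\,\varepsilon^2 K N\right)
\;\iff\; 2^{a} N \ln 2 \le c\,\varepsilon^2 K N
\;\iff\; a \le \log_2\!\left(\frac{c\,\varepsilon^2 K}{\ln 2}\right).
```

For a = (1/2) · log 2 (K) this requires only $\varepsilon^2 \ge \ln 2/(c\sqrt{K})$, which tends to zero as K grows. For a = log 2 (K) it requires $\varepsilon^2 \ge \ln 2/c$, a constant independent of K, so the bound no longer certifies a vanishing advantage.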
Deriving Theorem B.5 from Lemma B.6 is relatively straightforward.
Proof of Theorem B.5. Let A (·) be an S-small adversary, for a value of S to be determined later. Then there exist S functions f 1 , . . . , f S such that the set of unitaries A f 1 , . . . , A f S contains every unitary computable by A (·) . Fix a 1 ≤ i ≤ S. Then by hard-coding the function f i into A (·) , we can view A f i as an oracle circuit that does not make any queries. Thus, Lemma B.6 says that As a result, we can upper-bound the maximum distinguishing probability by (by the union bound) $\le \sum_{i=1}^{S} 2 \cdot \exp(-c \cdot \varepsilon^2 K N) = S \cdot 2 \cdot \exp(-c \cdot \varepsilon^2 K N)$.
Now, let us choose S to be $S = \exp((c/2) \cdot \varepsilon^2 K N)$. Then this upper bound on the maximum distinguishing probability equals $2 \cdot \exp(-(c/2) \cdot \varepsilon^2 K N)$. This concludes the proof, with the constant "c" in the statement of the theorem equal to c/2.

C A one-query attack with advantage Ω(1/√K)
In this section, we give a one-query adversary for the Oracle State Distinguishing Game achieving advantage Ω(1/√K) using only one ancilla qubit. This demonstrates that the dependence on K in our main theorem (Theorem 4.18) is tight. The adversary is given as follows.
2. Measure H ⊗n · |ψ⟩ in the standard basis. Let y ∈ {0, 1} n be the measurement outcome.
The Hadamard adversary's one ancilla qubit is used to store the outcome of the query to the bit flip oracle. Note that this can be simulated by an adversary which makes a query to a phase oracle instead, as discussed following the statement of Definition 3.3. We also remark that because the Hadamard adversary applies an n-qubit Hadamard, it will be more convenient to think of the adversary's state space as consisting of n qubits, rather than being a single space of overall dimension N := 2 n . As a result, using the correspondence between {0, 1} n and [N ] mentioned in Notation 3.1, we will prefer to format our function families as R : [K] × {0, 1} n → {±1}, with the k-th binary phase state being Our main goal is to prove the following bounds on the Hadamard adversary's distinguishing probability.
Theorem C.2 (Distinguishing advantage of the Hadamard adversary). There exists a constant c > 0 such that the following is true. Let K, N ≥ c, and let R : [K] × {0, 1} n → {±1} be a uniformly random function family. Then
When the Oracle State Distinguishing Game is played with some function family R : [K]×[N ] → {±1}, with probability 1/2 the Hadamard adversary is given the state |ψ R k ⟩ for k chosen uniformly at random. We will write M R for the probability distribution on the measurement outcome y in this case. In other words, M R (y) := E
With the remaining 1/2 probability, the Hadamard adversary is given a uniformly random phase state; equivalently, it is given the maximally mixed state Id N /N . In this case, the measurement outcome y is distributed as a uniformly random string in {0, 1} n . We will write U N for this uniform probability distribution, i.e. U N (y) := 1/N .
The Hadamard adversary measures a y which is sampled either from M R or U N , and it feeds y into the function f , which can be thought of as a statistical test to distinguish these two distributions. The following lemma characterizes the Hadamard adversary's distinguishing advantage in terms of the total variation distance d TV (·, ·) between these two distributions. Proof. Let f : {0, 1} n → {0, 1} be a Boolean function. By definition, Pr[A f Had outputs "0" on |ψ R The maximum distinguishing advantage is then computed by optimizing this expression over all f , but that is exactly the definition of the total variation distance.
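The relationship between the best statistical test f and the total variation distance can be illustrated numerically for small parameters. The sketch below is only an illustration under stated assumptions: it models the advantage of a test f as |Σ y f(y) · (M R (y) − U N (y))|, represents each R k as a list of N signs, and uses hypothetical helper names throughout.

```python
import random


def parity(x):
    """Parity of the number of set bits in x."""
    return bin(x).count("1") & 1


def outcome_distribution(R, n):
    """M_R(y): probability of outcome y when H^{(x)n}|psi_{R_k}> is measured,
    averaged over a uniformly random index k. R is a list of K sign lists."""
    N = 2 ** n
    K = len(R)
    dist = [0.0] * N
    for signs in R:
        for y in range(N):
            # Amplitude <y| H^{(x)n} |psi_h> = (1/N) * sum_x (-1)^{x.y} h(x).
            amp = sum(s * (-1) ** parity(x & y) for x, s in enumerate(signs)) / N
            dist[y] += amp * amp / K
    return dist


def advantage(f, dist):
    """Assumed advantage of the 0/1 test f against the uniform distribution."""
    N = len(dist)
    return abs(sum(f[y] * (dist[y] - 1.0 / N) for y in range(N)))


def total_variation_to_uniform(dist):
    N = len(dist)
    return 0.5 * sum(abs(p - 1.0 / N) for p in dist)


rng = random.Random(0)
n, K = 3, 4
R = [[rng.choice((1, -1)) for _ in range(2 ** n)] for _ in range(K)]
dist = outcome_distribution(R, n)
# The optimal test accepts exactly the outcomes that M_R overweights.
best_f = [1 if p > 1.0 / len(dist) else 0 for p in dist]
tv = total_variation_to_uniform(dist)
best_adv = advantage(best_f, dist)
print(tv, best_adv)
```

The test that accepts exactly the overweighted outcomes attains the total variation distance, and no other test exceeds it.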
Our goal is to calculate the expectation E R [∆ A Had (R)]. The following lemma gives an alternative expression for this expectation in terms of a new random variable.
Then for a uniformly random function family R : Proof. By Lemma C.3, Now, fix a y ∈ {0, 1} n . Consider the random variable For each value of k, the corresponding term is distributed as the square of the sum of N independent and uniformly random {±1} numbers, and the terms are independent across different values of k. Hence, this random variable is distributed identically to X R . In addition, these K random variables are independent. As a result, by linearity of expectation, This completes the proof. Now we study the distribution of the random variable X R . To begin, we compute its mean and variance.
Next, we compute the variance. To begin, note that for each R, Thus, the variance is given by The expectation over R is zero if k ̸ = k ′ . On the other hand, if k = k ′ , then the expectation is 1 if {x, y} = {z, w} and 0 otherwise. As a result, This completes the proof.
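The mean and variance can be double-checked exactly for small parameters using rational arithmetic. The sketch below assumes, following the description in the preceding proof, that X R is the average of K independent copies of (b 1 + · · · + b N ) 2 /N for uniform signs b i ; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def single_term_moments(N):
    """Exact mean and variance of Y = (b_1 + ... + b_N)^2 / N for uniform signs.
    The sum equals N - 2j with probability C(N, j) / 2^N."""
    mean = Fraction(0)
    second = Fraction(0)
    for j in range(N + 1):
        p = Fraction(comb(N, j), 2 ** N)
        y = Fraction((N - 2 * j) ** 2, N)
        mean += p * y
        second += p * y * y
    return mean, second - mean ** 2


def x_r_moments(N, K):
    """Mean and variance of the average of K independent copies of Y."""
    mean, var = single_term_moments(N)
    return mean, var / K


N, K = 8, 5
mean, var = x_r_moments(N, K)
print(mean, var)
```

Under this model the output matches a mean of 1 and a variance of ((N − 1)/N) · (2/K).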
From Lemma C.5, we roughly expect that X R tends to be around the values $1 \pm \sqrt{2/K}$. If this were true, then the expectation E R |X R − 1| we are trying to compute would be roughly $\sqrt{2/K}$, and we would be done. However, it could be that the variance of X R being roughly 2/K is due to some small-probability events where X R is very far from 1, whereas with high probability X R is much closer to 1 than $\sqrt{2/K}$. To rule this out, we prove the following concentration bound for X R .
Proof. For each 1 ≤ k ≤ K, define Then Lemma 4.13 implies that there exists a constant γ ≥ 1 such that for each 1 ≤ k ≤ K, the random variable |X R,k | 2 − 1 is sub-exponential with parameter γ. Since X R = E k∼[K] [X R,k ], Bernstein's inequality (Theorem 4.14) states that For ε a sufficiently small constant and K a sufficiently large constant, this is at most $\frac{3}{2} \cdot \frac{1}{K}$. But we already calculated that the variance of X R is $\frac{N-1}{N} \cdot \frac{2}{K}$, in Lemma C.5. Hence, we have a contradiction for sufficiently large N , completing the proof.