On The Fourier Coefficients of High-Dimensional Random Geometric Graphs

The random geometric graph $\mathsf{RGG}(n,\mathbb{S}^{d-1}, p)$ is formed by sampling $n$ i.i.d. vectors $\{V_i\}_{i = 1}^n$ uniformly on $\mathbb{S}^{d-1}$ and placing an edge between pairs of vertices $i$ and $j$ for which $\langle V_i,V_j\rangle \ge \tau^p_d,$ where $\tau^p_d$ is such that the expected density is $p.$ We study the low-degree Fourier coefficients of the distribution $\mathsf{RGG}(n,\mathbb{S}^{d-1}, p)$ and its Gaussian analogue. Our main conceptual contribution is a novel two-step strategy for bounding Fourier coefficients which we believe is more widely applicable to studying latent space distributions. First, we localize the dependence among edges to a few fragile edges. Second, we partition the space of latent vector configurations $(\mathbb{S}^{d-1})^{\otimes n}$ based on the set of fragile edges and, on each subset of configurations, we define a noise operator acting independently on edges not incident (in an appropriate sense) to fragile edges. We apply the resulting bounds to: 1) Settle the low-degree polynomial complexity of distinguishing spherical and Gaussian random geometric graphs from Erdős-Rényi both in the case of observing a complete set of edges and in the non-adaptively chosen mask $\mathcal{M}$ model recently introduced by [MVW24]; 2) Exhibit a statistical-computational gap for distinguishing $\mathsf{RGG}$ and the planted coloring model [KVWX23] in a regime when $\mathsf{RGG}$ is distinguishable from Erdős-Rényi; 3) Reprove known bounds on the second eigenvalue of random geometric graphs.


INTRODUCTION
Random graphs with a latent high-dimensional geometric structure are increasingly relevant in an era of massive networks over complex computer, social, or biological populations. Such graphs provide a fruitful, even if idealized, model in which to study algorithmic and statistical questions. For these reasons, in the last 15 years random geometric graphs have seen a surge of attention in the combinatorics, statistics, and computer science communities. Tasks addressed in the literature include: 1) Detecting the presence of a latent geometric structure [4,5,11,12,14,16,18,26,28,29], 2) Estimating the dimension of the latent geometry [4,14,20], 3) Embedding the graph in a geometric space and clustering [24,30,35], 4) Matching unlabelled noisy copies of the same geometric graph [25,39]. In a different direction of study, high-dimensional random geometric graphs exhibit an intricate and useful combinatorial structure. Most notably, in [27], the authors show that in certain regimes spherical random geometric graphs are efficient 2-dimensional expanders, objects for which no other simple randomized constructions are currently known.
Two of the most common models, studied since the early works [14,16], are spherical and Gaussian (hard threshold) random geometric graphs.
The main goal of the current paper is to analyse the low-degree Fourier coefficients of the probability mass functions of those two distributions. The Fourier coefficients of an $n$-vertex random graph distribution $R$ are parametrized by edge-subgraphs $H$. The $p$-biased Fourier coefficient corresponding to $H$ is defined by
$$\widehat{R}(H) = \mathbb{E}_{\mathbf{G}\sim R}\Big[\prod_{(ij)\in E(H)}\frac{\mathbf{G}_{ij}-p}{\sqrt{p(1-p)}}\Big]. \quad (1)$$
Low-degree Fourier coefficients of distributions (and, more generally, Boolean functions) are at the core of many milestone results in theoretical computer science and combinatorics such as constructing succinct nearly $k$-wise independent distributions [2], learning various classes of Boolean functions [13,19], the Margulis-Russo formula on sharp thresholds [33,38], and many more (see [36]). More recently, low-degree Fourier coefficients have become central to the design of efficient algorithms for problems in high-dimensional statistics, as well as to providing evidence for computational hardness, via the low-degree polynomial framework [21,22].
Unfortunately, estimating Fourier coefficients is a highly nontrivial task for complex distributions with dependencies among variables. We introduce a conceptually novel approach (described shortly in Section 3.2) for bounding the Fourier coefficients of distributions with random latent structure and use it for $\mathsf{RGG}(n,\mathbb{S}^{d-1},p)$ and $\mathsf{RGG}(n,\mathcal{N}(0,\frac{1}{d}I_d),p)$. This unlocks the powerful methods mentioned above, which leads to several applications, described next.
1. Testing. Testing against Erdős-Rényi is one of the most natural and well-studied questions on high-dimensional random geometric graphs, starting with [16]. Testing is a prerequisite for more sophisticated tasks: if one cannot even distinguish a graph from pure noise, one can hardly hope to do any other meaningful inference about its structure.
In the spherical case, one observes a graph $\mathbf{G}$ and the goal is to test between the two hypotheses $\mathbb{H}_0: \mathbf{G} \sim \mathsf{G}(n,p)$ and $\mathbb{H}_1: \mathbf{G} \sim \mathsf{RGG}(n,\mathbb{S}^{d-1},p)$. The state-of-the-art results for $p \le 1/2$ and $p = \Omega(1/n)$ are as follows. By counting signed triangles, one succeeds with high probability whenever $d \le c(np)^3\log(1/p)$ for some constant $c$ [14,26]. Counting signed triangles is conjectured to be information-theoretically optimal, i.e., for $d \gg (np)^3\log(1/p)$ it is believed to be impossible to test between the two graph distributions [5,12,26]. The best bounds on when $\mathsf{RGG}(n,\mathbb{S}^{d-1},p)$ and $\mathsf{G}(n,p)$ are indistinguishable, due to [26], are: 1) $d \ge n^3p^2\log(1/p)$ for all $p \le 1/2$; 2) $d \ge (np)^3\,\mathrm{polylog}(n) = \mathrm{polylog}(n)$ for $p = \Theta(1/n)$. In particular, the threshold in dimension at which testing becomes possible is only known (up to lower-order terms) when $p = \Theta(1)$ or $p = \Theta(1/n)$. We make progress in the intermediate regime $1/n \ll p \ll 1/2$ by showing that the signed triangle statistic is computationally optimal with respect to low-degree polynomial tests at all densities, even in a stronger non-adaptive edge-query model recently introduced by [32]. Surprisingly, we show that this is not the case for Gaussian random geometric graphs. For small $p$, low-degree tests other than the signed triangle statistic are much more powerful: when $p = \Theta(1/n)$, one can distinguish $\mathsf{RGG}(n,\mathcal{N}(0,\frac1d I_d),p)$ and $\mathsf{G}(n,p)$ for dimensions as large as $\sqrt{n}/(\log n)^{c'}$, in sharp contrast to the $d = \mathrm{polylog}(n)$ threshold in the spherical case [26]. We additionally prove low-degree indistinguishability between $\mathsf{RGG}(n,\mathbb{S}^{d-1},\frac12)$ and a planted coloring model [23] in a regime in which both are distinguishable from $\mathsf{G}(n,1/2)$ via simple low-degree tests. The two models can be easily distinguished from one another by determining the largest clique, a computationally inefficient test, which shows a computation-information gap for this testing problem. To the best of our knowledge, this is the first negative result for testing $\mathsf{RGG}(n,\mathbb{S}^{d-1},\frac12)$ against a non-geometric distribution when $d \ll n^3$.
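As a toy illustration of the signed triangle statistic (a minimal Monte Carlo sketch with arbitrary parameters, not the paper's actual experiments), the following compares the statistic under a spherical RGG with density $1/2$ against Erdős-Rényi:

```python
import random

random.seed(0)

def sphere_rgg_adj(n, d):
    """Adjacency matrix of RGG(n, S^{d-1}, 1/2): edge iff <V_i, V_j> >= 0.
    Normalizing the Gaussian vectors is unnecessary for the sign."""
    vs = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    return [[1 if sum(a * b for a, b in zip(vs[i], vs[j])) >= 0 else 0
             for j in range(n)] for i in range(n)]

def er_adj(n, p=0.5):
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = a[j][i] = 1 if random.random() < p else 0
    return a

def signed_triangles(a, p=0.5):
    """Sum over vertex triples of (G_ij - p)(G_jk - p)(G_ik - p)."""
    n = len(a)
    return sum((a[i][j] - p) * (a[j][k] - p) * (a[i][k] - p)
               for i in range(n) for j in range(i + 1, n) for k in range(j + 1, n))

n, d, trials = 50, 2, 20
rgg_avg = sum(signed_triangles(sphere_rgg_adj(n, d)) for _ in range(trials)) / trials
er_avg = sum(signed_triangles(er_adj(n)) for _ in range(trials)) / trials
print(rgg_avg, er_avg)
```

For $d = 2$ the geometric signal is very strong, so the RGG average sits far above the Erdős-Rényi one, whose expectation is exactly $0$.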
2. Spectral properties. The second eigenvalue $\lambda_2$ of $\mathbf{G} \sim \mathsf{RGG}$ is captured by low-degree polynomials via the trace method. $\lambda_2$ naturally plays an important role in the expansion properties of $\mathsf{RGG}(n,\mathbb{S}^{d-1},p)$ [27]. The top eigenvalues are also used in embedding and clustering random geometric graphs via the top eigenvectors [24]. These works have characterized the behavior of $\lambda_2$: when $d \ll n$, $\lambda_2 = \Theta(n/\sqrt{d})$, and when $d \gg n$, the behaviour is similar to Erdős-Rényi and $\lambda_2 = \Theta(\sqrt{n})$. We reprove this bound in the case $p = 1/2$ using our estimates on the Fourier coefficients. While our approach yields the same quantitative bounds, its methodology is rather different and much more combinatorial.

Organization of Paper
Our main contribution is a new methodology for deriving strong bounds on Fourier coefficients, which we use to argue about the random geometric graph distributions. In Section 3.1 we describe the challenges in bounding low-degree Fourier coefficients, followed by the main ideas used to overcome them in Section 3.2. Our main theorem, followed by applications to testing and the second eigenvalue, is stated in Section 4. In Section 5 we give the full proof of our main theorem. The different applications follow by variations of what are by now well-known techniques and are given in the arXiv version [6]. For testing, we use the $\chi^2$-based low-degree advantage formula (when testing against the planted coloring model, we need a more subtle version of it from [23]). For the second eigenvalue, we use the trace method.

Low-Degree Polynomials
Our results in Sections 4.2 to 4.4 are based on the low-degree polynomial framework introduced in [21,22]. One way to motivate it is the following. When testing between graph distributions $\mathbb{H}_0$ and $\mathbb{H}_1$ (say, $\mathbb{H}_0 = \mathsf{G}(n,p)$, $\mathbb{H}_1 = \mathsf{RGG}(n,\mathbb{S}^{d-1},p)$), one observes a single graph $\mathbf{G}$ and needs to output 0 or 1. The graph $\mathbf{G}$ is simply a bit sequence in $\{0,1\}^{\binom{n}{2}}$. Hence, the output is a function $f : \{0,1\}^{\binom{n}{2}} \longrightarrow \{0,1\}$. All Boolean functions are polynomials [36]. Therefore, one simply needs to compute a polynomial in the edges. Importantly, one can write polynomials over $\{0,1\}$ in their Fourier expansion. In the $p$-biased case over graphs, one represents $f : \{0,1\}^{\binom{n}{2}} \longrightarrow \mathbb{R}$ as
$$f = \sum_{H} \hat{f}(H)\,\chi_H. \quad (2)$$
Here, $\hat{f}(H)$ is just a constant (the Fourier coefficient corresponding to $H$) and $\{\chi_H\}_H$ is a basis of polynomials. Conveniently, as can be seen from Eq. (1), this basis is composed of signed subgraph indicators. What makes it useful is the following fact [36]: the polynomials $\{\chi_H\}_H$ form an orthonormal basis with respect to the inner product induced by $\mathsf{G}(n,p)$. When computationally restricted, a tester needs to apply a poly-time computable polynomial $f$. What are classes of poly-time computable polynomials? One such class is that of sufficiently low-degree polynomials (where degree refers to the largest number of edges in a monomial corresponding to some $H$ for which $\hat{f}(H)$ is non-zero). Since those are usually not $\{0,1\}$-valued, one needs to threshold after computing the polynomial, which leads to the following definition, motivated by Chebyshev's inequality.
Definition 2 (Success of a Low-Degree Polynomial, e.g. [21]). We say that a polynomial $f : \{0,1\}^{\binom{n}{2}} \longrightarrow \mathbb{R}$ distinguishes $\mathbb{H}_0$ and $\mathbb{H}_1$ with high probability if
$$|\mathbb{E}_{\mathbb{H}_1}[f] - \mathbb{E}_{\mathbb{H}_0}[f]| = \omega\Big(\sqrt{\max\big(\mathrm{Var}_{\mathbb{H}_0}[f],\,\mathrm{Var}_{\mathbb{H}_1}[f]\big)}\Big).$$
If $f$ is poly-time computable, this leads to a poly-time algorithm which compares $f(\mathbf{G})$ to a suitable threshold. Very commonly, one takes $f$ to be a signed subgraph count [14]. That is, for some small graph $H$ (e.g. a triangle or a wedge), one computes
$$\mathsf{SC}_H(\mathbf{G}) := \sum_{K \subseteq K_n,\; K \sim H} \;\prod_{(ij) \in E(K)} (\mathbf{G}_{ij} - p),$$
where $\sim$ denotes graph isomorphism. I.e., one computes the total signed weight of copies of $H$. Importantly, the framework of [21] allows one to refute the existence of low-degree polynomials which distinguish $\mathbb{H}_0$ and $\mathbb{H}_1$ with high probability, i.e., to show that the condition in Definition 2 fails for all low-degree polynomials. Of course, one needs to quantify "low-degree". Typically, this means degree $O(\log n)$. While not all $O(\log n)$-degree polynomials are necessarily poly-time computable, the class of $O(\log n)$-degree polynomials captures a broad class of algorithms including subgraph counting algorithms [21], spectral algorithms [3], SQ algorithms (subject to certain conditions) [9], and approximate message passing algorithms (with a constant number of rounds) [34], and is in general conjectured to capture all poly-time algorithms for statistical tasks in sufficiently noisy high-dimensional regimes [21].
Definition 3 (Low-Degree Polynomial Hardness). We say that no low-degree polynomial distinguishes $\mathbb{H}_0$ and $\mathbb{H}_1$ with probability $\Omega(1)$ if there exists some $D = \omega(\log n)$ such that
$$|\mathbb{E}_{\mathbb{H}_1}[f] - \mathbb{E}_{\mathbb{H}_0}[f]| = O\Big(\sqrt{\max\big(\mathrm{Var}_{\mathbb{H}_0}[f],\,\mathrm{Var}_{\mathbb{H}_1}[f]\big)}\Big)$$
holds for all polynomials $f$ of degree at most $D$. In particular, this holds (e.g., [23]) if $\mathbb{E}_{\mathbb{H}_1}[f] = O\big(\sqrt{\mathbb{E}_{\mathbb{H}_0}[f^2]}\big)$ for all polynomials $f$ of degree at most $D$.
For our results on planted coloring, we need a more sophisticated version of Claim 2.1, due to [23], applicable when $\mathbb{H}_0 = \mathsf{PCol}$.
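To make the orthonormality fact above concrete, here is a small self-contained check: exact enumeration of all graphs on 3 vertices verifies that the $p$-biased characters $\chi_H$ are orthonormal under $\mathsf{G}(3,p)$ (the density $p = 0.3$ is an arbitrary choice):

```python
from itertools import combinations, product
import math

p = 0.3
edges = list(combinations(range(3), 2))      # the 3 possible edges on 3 vertices
subgraphs = [frozenset(H) for r in range(4) for H in combinations(edges, r)]

def chi(H, G):
    """p-biased character: product over edges e of H of (G_e - p)/sqrt(p(1-p))."""
    s = math.sqrt(p * (1 - p))
    out = 1.0
    for e in H:
        out *= (G[e] - p) / s
    return out

def expect(F):
    """Exact expectation of F(G) under the Erdos-Renyi measure G(3, p)."""
    total = 0.0
    for bits in product([0, 1], repeat=len(edges)):
        G = dict(zip(edges, bits))
        prob = math.prod(p if b else 1 - p for b in bits)
        total += prob * F(G)
    return total

# maximal deviation of the Gram matrix <chi_H, chi_H'> from the identity
max_err = max(abs(expect(lambda G: chi(H1, G) * chi(H2, G)) - (1.0 if H1 == H2 else 0.0))
              for H1 in subgraphs for H2 in subgraphs)
print(max_err)
```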

CHALLENGES AND MAIN IDEAS

Challenges in Bounding Low-Degree Fourier Coefficients
The importance of Fourier coefficients of graph distributions has motivated a series of previous works on (hyper)graphs with latent random vectors. Existing methods for computing Fourier coefficients, however, seem not fully adequate for our goal.
Approach 0: Direct Integration. The most naive approach to estimating Fourier coefficients is a direct integration (summation) over the latent space. Recalling Eq. (1), one can compute the Fourier coefficient of $H$ by integrating $\prod_{(ij)\in E(H)}(\mathbb{1}[\langle V_i,V_j\rangle \ge \tau] - p)$ against $\mathcal{N}(0,\frac1d I_d)^{\otimes |V(H)|}$. Such a calculation, however, seems out of reach due to the complex dependencies between different terms in the product. As latent vectors $V_i, V_j$ vary smoothly, so does the distance between $V_i$ and $V_j$, and consequently also the probabilities of various events (such as some vertex being a common neighbour). As concrete evidence of the difficulty of this approach, even in the simplest case of triangles for $\mathsf{RGG}(n,\mathbb{S}^{d-1},p)$, the authors of [14] spend 5 pages of calculations. For a similar random graph model with geometry, the calculation for triangles is open [4].
Approach 1: Vertex Conditioning. Many Fourier computations are for problems defined by planting small dense communities in an ambient Erdős-Rényi graph [17,22,23,32,37]. In such works, one can use the following simple vertex-conditioning strategy (exploiting the ambient Erdős-Rényi structure) to overcome the technical difficulty of a direct summation (integration). As a prototypical example, discussed in [7,21], consider the planted $k$-clique distribution in which each vertex $u \in [n]$ independently receives a label $x_u \in \{0,1\}$, where $\mathbb{P}[x_u = 1] = k/n$. Conditioned on the labels, each edge appears with probability 1 if $x_u = x_v = 1$ and independently with probability 1/2 otherwise. Now, consider the Fourier coefficient $\mathbb{E}[\prod_{(uv)\in E(H)}(2\mathbf{G}_{uv}-1)]$ indexed by a graph $H$ without isolated vertices. Again, there are complex correlations between different edges. However, unless all vertices of $H$ have label 1, there is a random (probability 1/2) edge and this zeroes out the Fourier coefficient $\mathbb{E}[\prod_{(uv)\in E(H)}(2\mathbf{G}_{uv}-1)]$. By conditioning on all vertex labels being 1, one shows that the Fourier coefficient equals $(k/n)^{|V(H)|}$. An approach based on vertex conditioning seems not to be applicable to hard threshold random geometric graphs: in $\mathsf{RGG}(n,\mathbb{S}^{d-1},p)$, conditioned on the latent vectors there is no randomness left in $\mathbf{G} \sim \mathsf{RGG}(n,\mathbb{S}^{d-1},p)$. Hence, one cannot exploit cancellations due to left-over randomness in edges once labels are known, which is crucial in models with an ambient Erdős-Rényi component.
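The conditioning argument above can be verified exactly on a toy instance of the Bernoulli planted-clique model described in the text (a sketch; $n = 6$, $k = 2$, and the triangle $H$ are arbitrary choices):

```python
from fractions import Fraction
from itertools import product

def clique_fourier_coeff(H_edges, vertices, n, k):
    """E[prod_{(uv) in H} (2 G_uv - 1)]: each vertex independently gets label 1
    w.p. k/n; an edge is present surely if both labels are 1, a fair coin otherwise."""
    q = Fraction(k, n)
    total = Fraction(0)
    for labels in product([0, 1], repeat=len(vertices)):
        lab = dict(zip(vertices, labels))
        prob = Fraction(1)
        for v in vertices:
            prob *= q if lab[v] == 1 else 1 - q
        # E[2G_uv - 1] = 1 if both endpoints are labeled 1, else 0 (fair coin).
        contrib = Fraction(1)
        for (u, v) in H_edges:
            if not (lab[u] and lab[v]):
                contrib = Fraction(0)
                break
        total += prob * contrib
    return total

triangle = [(0, 1), (1, 2), (0, 2)]
coeff = clique_fourier_coeff(triangle, [0, 1, 2], n=6, k=2)
print(coeff)   # -> 1/27, i.e. (k/n)^{|V(H)|} = (1/3)^3
```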
Approach 2: Lifting From a Single Dimension. The work [4] bounds the low-degree Fourier coefficients of (hard threshold) random geometric graphs over the $d$-dimensional torus $\mathbb{R}^d/\mathbb{Z}^d$ with the $\ell_\infty$ metric. The insight in [4] is that an $\ell_\infty$ random geometric graph is the AND of $d$ 1-dimensional random geometric graphs over $\mathbb{S}^1$. They combine the contributions of the different coordinates via an analytical approach mimicking the cluster-expansion formula from statistical physics. As explained in [4], in our case of $\mathsf{RGG}(n,\mathcal{N}(0,\frac1d I_d),p)$ the edges are closer to a MAJORITY over the coordinates. Unfortunately, extending the techniques from the simpler AND combination to the present setting seems technically challenging (in particular, because the Fourier expansion of AND is much simpler than that of MAJORITY).

Main Ideas
We focus on $\mathsf{RGG}(n,\mathcal{N}(0,\frac1d I_d),\frac12)$ for concreteness, as this case captures most of the main ideas. The argument for other densities is similar, but requires some modification (most notably, it additionally exploits a novel energy-entropy trade-off of the $\mathsf{RGG}$ distribution; see Remark 5). Modifying to the sphere can be done via the observation that when $V \sim \mathcal{N}(0,\frac1d I_d)$ and $U = V/\|V\|_2 \sim \mathrm{Unif}(\mathbb{S}^{d-1})$, the variables $U$ and $\|V\|_2$ are independent and $\|V\|_2$ concentrates strongly around 1. Recall that $\tau_{1/2} = 0$. Our goal is to bound $\widehat{\mathsf{RGG}}(H) = \mathbb{E}[\prod_{(ij)\in E(H)}(2\mathbf{G}_{ij}-1)]$. We assume that $|E(H)| = \mathrm{polylog}(n)$, as this is the case most relevant to our applications.
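The sphere-vs-Gaussian reduction rests on $\|V\|_2$ concentrating around 1; a quick numerical sanity check (the dimension and sample size are arbitrary):

```python
import random, math

random.seed(1)

d, trials = 2000, 400
norms = []
for _ in range(trials):
    # V ~ N(0, (1/d) I_d)
    v = [random.gauss(0.0, 1.0) / math.sqrt(d) for _ in range(d)]
    norms.append(math.sqrt(sum(x * x for x in v)))

mean_norm = sum(norms) / trials
sd_norm = (sum((x - mean_norm) ** 2 for x in norms) / trials) ** 0.5
print(mean_norm, sd_norm)   # mean close to 1, fluctuations of order 1/sqrt(2d)
```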
Motivation: A Noise-Operator View. The following noise-operator interpretation, from [9], of the calculation for planted clique will turn out to be useful. Consider first the standard noise operator $T_\rho$ for functions over $\{\pm1\}^n$ [36]. It acts on functions $f$ via $T_\rho f(x) = \mathbb{E}_{y\sim N_\rho(x)}[f(y)]$, where $N_\rho(x)$ is the distribution in which each coordinate independently equals $x_i$ with probability $\rho$ and otherwise, with probability $1-\rho$, it is re-randomized. This noise operator contracts Fourier coefficients as $\widehat{T_\rho f}(S) = \rho^{|S|}\hat{f}(S)$. Observation 3.1 (Noise Operator View of Planted Clique). The planted clique distribution can itself be viewed as arising from the application of a different noise operator. Given a function $f$ on graphs, let $T_\rho f(\mathbf{G}) = \mathbb{E}_{\mathbf{H}\sim N_\rho(\mathbf{G})}[f(\mathbf{H})]$, where now $N_\rho(\mathbf{G})$ is the distribution obtained by including each vertex in a set $S$ independently with probability $\rho$ and then rerandomizing all edges of $\mathbf{G}$ except those with both endpoints in $S$. This operator again contracts Fourier coefficients, $\widehat{T_\rho f}(H) = \rho^{|V(H)|}\hat{f}(H)$. If we start with the point-mass distribution on the complete graph, then the planted clique probability mass function is obtained by applying $T_{k/n}$.
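The contraction identity $\widehat{T_\rho f}(S) = \rho^{|S|}\hat{f}(S)$ for the standard noise operator can be checked by exhaustive computation on $\{\pm1\}^3$ (a minimal sketch; the test function $f$ is an arbitrary choice):

```python
from itertools import product, combinations

n, rho = 3, 0.5
pts = list(product([-1, 1], repeat=n))

def chi(x, S):
    out = 1
    for i in S:
        out *= x[i]
    return out

def fhat(f, S):
    """Fourier coefficient over the uniform measure on {-1,1}^n."""
    return sum(f[x] * chi(x, S) for x in pts) / len(pts)

def noise(f, rho):
    """T_rho f: each coordinate is kept with probability rho, rerandomized otherwise."""
    g = {}
    for x in pts:
        acc = 0.0
        for keep in product([0, 1], repeat=n):
            pk = 1.0
            for b in keep:
                pk *= rho if b else 1 - rho
            # average f over all points agreeing with x on the kept coordinates
            match = [y for y in pts if all(y[i] == x[i] for i in range(n) if keep[i])]
            acc += pk * sum(f[y] for y in match) / len(match)
        g[x] = acc
    return g

f = {x: float(i * i % 7 - 3) for i, x in enumerate(pts)}
g = noise(f, rho)
subsets = [S for r in range(n + 1) for S in combinations(range(n), r)]
err = max(abs(fhat(g, S) - rho ** len(S) * fhat(f, S)) for S in subsets)
print(err)
```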
Our goal will be to derive such a noise-operator perspective for $\mathsf{RGG}$ as well and use it to bound Fourier coefficients. We formally give such a view in Observation 5.1, but due to its more complicated nature we gradually build towards it. Remarkably, our noise operator also implies that (at least on a small enough scale) $\mathsf{RGG}$ can be represented as a small planted subgraph in an ambient Erdős-Rényi graph!

1. Strategy: Localizing Edges That Create Dependencies. We solve the challenges outlined in the previous subsection with the following high-level idea. We will localize the dependence among edges to a small set of edges $F$ (depending on the latent vectors). The other edges, in $\overline{F}$, will be close to uniformly random. Edges in $F$ will in general depend also on edges in $\overline{F}$, and we write $\partial F$ for the set of edges upon which those in $F$ depend. Note that, by definition, edges in $\overline{F}\setminus\partial F$ are independent of all other edges. Hence, conditioning on $F \cup \partial F$, we can re-randomize $\overline{F}\setminus\partial F$ (i.e., apply the noise operator $T_0$ on $\overline{F}\setminus\partial F$). With this idea, we solve both difficulties arising in the first two approaches outlined before: randomness ensures cancellations, and independence makes calculations easy!

2. Construction: A Convenient Basis via Gram-Schmidt. To implement this idea, we define a convenient basis for the latent vectors. Namely, for each edge $(ij) \in E(H)$, we construct a random variable $r_{ij}$ (depending on the latent vectors) such that the collection $\{r_{ij}\}_{(ij)\in E(H)}$ is independent and nearly determines the edges. We exploit the fact that independent Gaussian vectors in high dimension are nearly orthonormal. This suggests that the Gram-Schmidt operation on the latent vectors will produce an orthonormal basis close to the original vectors and, hence, projections on the Gram-Schmidt basis will approximate the inner products. Applying Gram-Schmidt to the $k = |V(H)|$ Gaussian vectors $V_1, \ldots, V_k$ corresponding to vertices of $H$, we obtain the Bartlett decomposition [8]: $V_j = \sum_{\ell < j} r_{\ell j} e_\ell + r_{jj} e_j$ with $e_1, \ldots, e_k$ orthonormal. Here $r_{\ell j} \sim \mathcal{N}(0,\frac1d)$ for each $\ell < j$ and $d\,r_{jj}^2 \sim \chi^2(d-j+1)$, $r_{jj} \ge 0$.
The collection $(r_{\ell j})_{\ell \le j}$ is jointly independent. These properties can be easily derived from the isotropic nature of $\mathcal{N}(0,\frac1d I_d)$. With respect to this decomposition, for $i < j$,
$$\langle V_i, V_j\rangle = r_{ij} + \sum_{\ell < i} r_{\ell i} r_{\ell j} + (r_{ii}-1)r_{ij}.$$
Now, each term of the form $r_{\ell i}r_{\ell j}$, as well as $(r_{ii}-1)r_{ij}$, is typically on the order of $\tilde{O}(1/d)$, so the entire right-hand side expression beyond $r_{ij}$ is on the order of $\tilde{O}(|V(H)|/d) = \tilde{O}(1/d)$. In contrast, $r_{ij} \sim \mathcal{N}(0,1/d)$, so it is typically on the order of $\tilde{\Theta}(1/\sqrt{d})$. Therefore, the random variable $r_{ij}$ nearly determines whether $(ij)$ is an edge (recall that $(ij) \in E(\mathbf{G})$ if and only if $\langle V_i,V_j\rangle \ge \tau_{1/2} = 0$). This is very promising, as the variables $\{r_{ij}\}$ are also independent, so we can define the noise operator by rerandomizing (a subset of) the variables $r_{ij}$ independently and, thus, affecting the edges $(ij)$ independently.
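A numerical sketch of the decomposition (plain Gram-Schmidt on $k$ Gaussian vectors; the dimensions are arbitrary choices): the $r$-coefficients reproduce the inner products exactly, the off-diagonal entries are of order $1/\sqrt{d}$, and the diagonal entries concentrate near 1.

```python
import random, math

random.seed(2)

d, k = 400, 5
V = [[random.gauss(0.0, 1.0) / math.sqrt(d) for _ in range(d)] for _ in range(k)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Gram-Schmidt: V_j = sum_{l <= j} r[l][j] e_l with e_0, ..., e_{k-1} orthonormal.
E, r = [], [[0.0] * k for _ in range(k)]
for j in range(k):
    w = list(V[j])
    for l in range(j):
        r[l][j] = dot(E[l], V[j])
        w = [wi - r[l][j] * ei for wi, ei in zip(w, E[l])]
    r[j][j] = math.sqrt(dot(w, w))
    E.append([wi / r[j][j] for wi in w])

# Exact identity: <V_i, V_j> = sum_{l<i} r[l][i] r[l][j] + r[i][i] r[i][j]  (i < j)
err = max(abs(dot(V[i], V[j])
              - (sum(r[l][i] * r[l][j] for l in range(i)) + r[i][i] * r[i][j]))
          for i in range(k) for j in range(i + 1, k))

# Off-diagonal r[l][j] behave like N(0, 1/d); diagonal r[j][j] is close to 1.
offdiag = max(abs(r[l][j]) for j in range(k) for l in range(j))
diag_dev = max(abs(r[j][j] - 1.0) for j in range(k))
print(err, offdiag, diag_dev)
```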
3. Construction: Fragile Edges Localize Dependencies. So far, we constructed independent variables $\{r_{ij}\}_{(ij)\in E(H)}$ which nearly determine the edges. The key word here is nearly: it may well be the case that $r_{ij} = \tilde{O}(1/d)$, in which case (and, with high probability, only in this case) $\sum_{\ell<i} r_{\ell i}r_{\ell j}$ or $(r_{ii}-1)r_{ij}$ could be comparable to or even larger than $r_{ij}$. In that case, $\mathbf{G}_{ij} = \mathbb{1}[r_{ij} + \sum_{\ell<i} r_{\ell i}r_{\ell j} + (r_{ii}-1)r_{ij} \ge 0]$ depends on edges of the form $(i\ell), (j\ell)$ via the variables $r_{\ell i}, r_{\ell j}$. As edges $(ij)$ for which $r_{ij} = \tilde{O}(1/d)$ are the only ones that can depend on other edges, they localize dependence. We call pairs $(ij)$ for which $r_{ij} = \tilde{O}(1/d)$ fragile pairs, and they form the fragile set $F$. The rest of the edges are independent, as demonstrated by a noise operator rerandomizing $r_{ij}$ for all non-fragile $(ij)$ (in a way such that $r_{ij}$ continues to be large enough that $(ij)$ is not fragile).
Recall that $r_{ij}$ is distributed as $\mathcal{N}(0,1/d)$ and, hence, is smaller than $\tilde{O}(1/d)$ only with probability $\tilde{\Theta}(1/\sqrt{d})$. As the variables $r_{ij}$ are independent, edges are fragile independently. Thus, the probability of observing many fragile edges is very low.
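In closed form, $\mathbb{P}(|r| \le \Delta/d)$ for $r \sim \mathcal{N}(0,1/d)$ equals $\mathrm{erf}(\Delta/\sqrt{2d}) \approx \sqrt{2/\pi}\cdot \Delta/\sqrt{d}$, exhibiting the $1/\sqrt{d}$ scaling (a quick check with $\Delta = 1$; the dimensions are arbitrary):

```python
import math

def fragile_prob(d, delta=1.0):
    """P(|r| <= delta/d) for r ~ N(0, 1/d), via the Gaussian CDF: erf(delta / sqrt(2d))."""
    return math.erf(delta / math.sqrt(2 * d))

p1, p2 = fragile_prob(10_000), fragile_prob(40_000)
print(p1, p2, p1 / p2)   # quadrupling d halves the fragility probability
```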
4. Analysis: Combinatorics of Edge Incidences. Our construction so far is of a noise operator which acts independently on all non-fragile edges $\overline{F}$. Hence, even if we condition on all fragile edges, there is some randomness left (unless all edges are fragile, but this happens with very low probability) and, so, we have solved the issue, outlined in Section 3.1, of destroying all randomness by conditioning. However, it is still difficult to integrate $\prod_{(ij)\in E(H)}(2\mathbf{G}_{ij}-1)$ even conditioned on the set of fragile edges. The reason is that if $(ij)$ is fragile, but $(i\ell)$ is not for some $\ell < i$, applying the noise operator on $(i\ell)$ via $r_{\ell i}$ may also affect $\mathbb{1}[r_{ij} + \sum_{\ell<i} r_{\ell i}r_{\ell j} + (r_{ii}-1)r_{ij} \ge 0]$.
Our approach to this issue is simple: we define the noise operator only over edges not incident to fragile edges at their lexicographically larger vertex (a set which we formalize via Definition 6). If there is even a single such edge, the noise operator re-randomizes it and zeroes out the Fourier coefficient (as in planted clique).
This leads us to analyzing the combinatorics of edge incidences of subgraphs of $H$. A crucial step in this analysis is the realization that we have the freedom to choose an optimal ordering (with respect to the graph $H$) for the Gram-Schmidt process, so that incidences with lexicographically larger fragile edges are minimized. Optimizing over orderings leads us to a combinatorial quantity associated to the graph $H$ which we call the ordered edge independence number $\mathrm{OEI}(H)$. Altogether, our bound on Fourier coefficients is roughly of the form $\tilde{O}(1/\sqrt{d})^{\mathrm{OEI}(H)}$. Our last step is to understand the growth of $\mathrm{OEI}(H)$. We derive several bounds, the simplest and most easily interpretable of which appears in Proposition 4.2.

RESULTS
We now formally describe our results, beginning with the exact bounds on Fourier coefficients we obtain. Throughout, we will make the following assumption: there exist some absolute constants $c, C > 0$ such that Eq. (A) holds. Admittedly, some non-trivial cases are not covered by this assumption, specifically $p = n^{-1+o(1)}$ and $d = \mathrm{polylog}(n)$. Nevertheless, we note that in the case $p = \Theta(1/n)$, $d = \mathrm{polylog}(n)$, the testing problem between $\mathsf{RGG}(n,\mathbb{S}^{d-1},p)$ and $\mathsf{G}(n,p)$ is fully resolved by [26] and, thus, Eq. (A) captures most of the open regimes, at least for the question of testing against Erdős-Rényi.

Main Result: The Fourier Coefficients of Gaussian and Spherical RGG
Fourier coefficients of $\mathsf{RGG}$ factorize over connected components, so we only state our bounds for connected $H$. We first define the ordered edge independence number mentioned in Section 3.2. Given an ordering $\pi$ of the vertices (think of $\pi$ as the Gram-Schmidt ordering), we denote an edge between $i$ and $j$ as $(ij)$ if $i > j$ and as $(ji)$ otherwise. We formalize as follows.

Definition 4 (Covering Property). An edge $(ij) \in E(H)\setminus S$ is covered by $S$ if there exists an edge in $S$ with endpoint $i$. Denote by $C_S$ the set of all edges covered by $S$.
Going back to Eq. (6), we interpret covering as follows. If $(ij)$ is covered by the set of fragile edges $F$, there exists some fragile edge with endpoint $i$. Hence, that fragile edge might depend on the variable associated with $(ij)$, so we cannot freely rerandomize $(ij)$.
Definition 5 (Ordered Edge Independence Number). For a connected graph $H$ on $k$ vertices and a bijective labelling $\pi$ of the vertices with the numbers $\{1, 2, \ldots, k\}$, we say that a subset of edges $S \subseteq E(H)$ covers $H$ if $S \cup C_S = E(H)$. We define the ordered edge independence number of $H$ with respect to $\pi$, denoted $\mathrm{OEI}_\pi(H)$, as the size of the smallest covering $S$. Let $\mathrm{OEI}(H) = \max_\pi \mathrm{OEI}_\pi(H)$.
One should think of $E(H)\setminus(F \cup C_F)$ as the set of edges which the noise operator rerandomizes.
While the resulting bounds on Fourier coefficients are strong enough for all of our low-degree hardness results, one may still wonder if they are optimal. It turns out that the likely answer is no: in the special case of density 1/2, the symmetry of the Gaussian distribution around $\tau_{1/2} = 0$ allows us to slightly improve the argument outlined in Section 3.2 and define a noise operator that acts also on certain (but not all) edges adjacent to fragile edges. We describe this next.
Definition 6 (Strong Covering). Consider an edge $(ij) \in E(H)\setminus S$ and denote by $S_{\ge j}$ the subset of $S$ formed by edges with both endpoints at least as large as $j$. Roughly speaking, $(ij)$ is strongly covered by $S$ if the connected component of $S_{\ge j}$ containing $i$ also contains a neighbour of $j$; the strong ordered edge independence number $\mathrm{SOEI}(H)$ is then defined analogously to $\mathrm{OEI}(H)$, with covering replaced by strong covering. The analogue of Theorem 4.1 is the following (we state it only for $\mathsf{RGG}(n,\mathbb{S}^{d-1},\frac12)$, as the Gaussian and spherical models coincide in the 1/2-density case). Proposition 4.3. Suppose that Eq. (A) holds and $H$ is connected. Then, there exists some absolute constant depending only on $c, C$ in Eq. (A) such that for $\mathbf{G} \sim \mathsf{RGG}(n,\mathbb{S}^{d-1},\frac12)$, the bound of Theorem 4.1 holds with $\mathrm{OEI}(H)$ replaced by $\mathrm{SOEI}(H)$.
If an edge $(ij)$ is strongly covered by $S$, then the connected component of $S_{\ge j}$ containing $i$ contains a neighbour of $j$. Hence, when $S = F$, there exists a fragile edge with endpoint $i$, and $(ij)$ is also covered according to Definition 4. Thus, $\mathrm{SOEI}(H) \ge \mathrm{OEI}(H)$, so Proposition 4.3 is at least as strong as Theorem 4.1. It turns out that the inequality is strict for many sparse graphs. For example, one can check that when $H = C_k$ is a cycle, $\mathrm{OEI}(H) = \lceil (k-1)/2\rceil$, but $\mathrm{SOEI}(H) = k - 2$; the latter follows from a more general bound stated in Proposition 4.4. It turns out that the $\mathrm{OEI}(H)$ bound is too weak for our results on the second eigenvalue of $\mathsf{RGG}(n,\mathbb{S}^{d-1},\frac12)$.
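The cycle values for $\mathrm{OEI}$ can be checked by brute force. The snippet below is a sketch under one concrete reading of Definitions 4-5 (orient each edge toward its $\pi$-larger endpoint; an edge is covered by $S$ if some edge of $S$ touches that larger endpoint; $\mathrm{OEI}_\pi$ is the smallest covering set, and $\mathrm{OEI}$ maximizes over orderings); under this reading it reproduces $\mathrm{OEI}(C_k) = \lceil(k-1)/2\rceil$ for small $k$:

```python
from itertools import combinations, permutations

def oei(vertices, edge_list):
    """Brute-force ordered edge independence number under the reading above."""
    best = 0
    for pi in permutations(vertices):
        rank = {v: i for i, v in enumerate(pi)}
        # the pi-larger endpoint of each edge
        larger = {e: max(e, key=lambda v: rank[v]) for e in edge_list}
        opt = None
        for r in range(len(edge_list) + 1):
            for S in combinations(edge_list, r):
                touched = {v for e in S for v in e}
                if all(e in S or larger[e] in touched for e in edge_list):
                    opt = r
                    break
            if opt is not None:
                break
        best = max(best, opt)
    return best

def cycle(k):
    return list(range(k)), [(i, (i + 1) % k) for i in range(k)]

vals = {k: oei(*cycle(k)) for k in (4, 5, 6)}
print(vals)   # ceil((k-1)/2): {4: 2, 5: 2, 6: 3}
```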

Application I: Testing Between Spherical RGG and Erdős-Rényi
In the case of spherical random geometric graphs, we not only confirm that the signed triangle statistic is optimal among low-degree polynomial tests, but also show that this is the case even in the non-adaptive edge-query model recently introduced by [32].
Testing between graph distributions with masks corresponds to a non-adaptive edge-query model. Instead of viewing a full graph, one can choose to observe a smaller, more structured set of edges in order to obtain a more data-efficient algorithm. The number of edges $m$ of $\mathcal{M}$ is a natural proxy for "sample complexity" in the case of low-degree polynomials, as the input variables of low-degree polynomials are edges rather than vertices. This idea was introduced recently in [32], focusing on the planted clique problem. We obtain the following result for $\mathsf{RGG}$. In it, we use $\tilde{n} \coloneqq \sqrt{m}$ as a lower bound on the number of vertices of $\mathcal{M}$. The variable $\tilde{n}$ is useful both in phrasing the assumptions (A) and in comparing with the unmasked case. Theorem 4.5. Consider parameters for which $\tilde{n} = \sqrt{m}$, $d$, $p$ satisfy the assumptions in Eq. (A). Let $\mathcal{M}$ be any graph on $m$ edges without isolated vertices, and denote by $N$ the number of vertices of $\mathcal{M}$. If $d \ge (m^{1/2}p)^{3+\epsilon}$ for some constant $\epsilon > 0$, no degree-$(\log \tilde{n})^{1.1}$ polynomial can distinguish with probability $\Omega(1)$ the distributions $\mathbf{G}_0 \odot \mathcal{M}$ and $\mathbf{G}_1 \odot \mathcal{M}$, where $\mathbf{G}_0 \sim \mathsf{G}(N,p)$ and $\mathbf{G}_1 \sim \mathsf{RGG}(N,\mathbb{S}^{d-1},p)$. In the case $\mathcal{M} = K_n$, we match the conjectured information-theoretic threshold. Theorem 4.5 is tight in light of the signed triangle statistic [26].

Application II: Testing Between Gaussian RGG and Erdős-Rényi
We begin with a brief comparison of the Gaussian and spherical models.
Remark 1 (Gaussian vs Spherical Random Geometric Graphs). The Gaussian and spherical models coincide in the case $p = 1/2$. More generally, they are intimately related due to the facts: (I) if $V \sim \mathcal{N}(0,\frac1d I_d)$, then $U := V/\|V\|_2 \sim \mathrm{Unif}(\mathbb{S}^{d-1})$, and (II) $\|V\|_2$ concentrates strongly around 1. This correspondence has been used to argue about either model: in some arguments, independence of the Gaussian coordinates is more helpful [14,16], while in others it is the orthonormality of the Gegenbauer basis over the sphere [24]. We exploit this correspondence in both directions.
We also show that the two models are qualitatively different in the sparse regime (see Fig. 1). The cause of this difference is the perhaps benign-looking fact that Eq. (II) is only an approximate statement. This creates dependence between edges in the Gaussian case which are independent in the spherical case: for example, under $\mathbf{G} \sim \mathsf{RGG}(n,\mathbb{S}^{d-1},p)$, the edges $\mathbf{G}_{21}$ and $\mathbf{G}_{31}$ are independent. In contrast, under $\mathbf{H} \sim \mathsf{RGG}(n,\mathcal{N}(0,\frac1d I_d),p)$, $\mathbf{H}_{21}, \mathbf{H}_{31}$ are positively correlated, as both are monotone in $\|V_1\|_2$. The dependence turns out to be quite strong for small values of $p$, to the point where signed wedges are better than signed triangles for testing against Erdős-Rényi. Theorem 4.6. Consider testing between $\mathsf{RGG}(n,\mathcal{N}(0,\frac1d I_d),p)$ and $\mathsf{G}(n,p)$ under Eq. (A).
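The positive correlation through $\|V_1\|_2$ can be seen numerically (a Monte Carlo sketch with an arbitrary small dimension and a threshold chosen so that $p \approx 0.1$; not the paper's experiment):

```python
import random, math

random.seed(3)

d, N = 5, 100_000
tau = 1.28 / math.sqrt(d)          # so that each edge has density roughly 0.1

def gauss_vec():
    return [random.gauss(0.0, 1.0) / math.sqrt(d) for _ in range(d)]

def unit(v):
    s = math.sqrt(sum(x * x for x in v))
    return [x / s for x in v]

def edge(u, v):
    return 1 if sum(a * b for a, b in zip(u, v)) >= tau else 0

def wedge_cov(spherical):
    """Estimate Cov(edge(V1,V2), edge(V1,V3)) for the two latent models."""
    sx = sy = sxy = 0
    for _ in range(N):
        v1, v2, v3 = gauss_vec(), gauss_vec(), gauss_vec()
        if spherical:
            v1, v2, v3 = unit(v1), unit(v2), unit(v3)
        x, y = edge(v1, v2), edge(v1, v3)
        sx += x; sy += y; sxy += x * y
    return sxy / N - (sx / N) * (sy / N)

c_gauss, c_sphere = wedge_cov(False), wedge_cov(True)
print(c_gauss, c_sphere)   # positive for Gaussian, near zero for spherical
```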
In the non-adaptive query-complexity model, the difference turns out to be even more dramatic. One can exploit the fact that wedges are highly informative by querying a star graph, as star graphs maximize the number of wedges for a fixed number of edges.
The main message of Section 4.3 is that even though the Gaussian and spherical models are closely related, and each is useful for reasoning about the other, they are also fundamentally different.
The proofs are similar to the ones in Section 4.2, except that we need to take extra care with graphs with leaves (as their Fourier coefficients are non-zero, unlike in the spherical case).
Remark 2. The work [10] studies the convergence of masked Wishart matrices to GOE. Of course, for $p = o(1)$ the RGG testing problem becomes very different from the Wishart versus GOE problem [12,26].

Application III: Testing Between Spherical RGG and Planted Coloring
In the regime $d \le (np)^{3-\epsilon}$, $\mathsf{RGG}$ is very different from Erdős-Rényi. But is it, perhaps, closely approximated by some other simple model? We show that, with respect to low-degree polynomial tests, $\mathsf{RGG}(n,\mathbb{S}^{d-1},1/2)$ is indistinguishable from a slight variation of the planted coloring distribution in [23]. We focus on the density-1/2 case, but our arguments can be easily extended (we only use our bounds on Fourier coefficients). In comparison, in [23] vertices $u$ and $v$ are adjacent with probability 1/2 when $\sigma_u \ne \sigma_v$, where $\sigma$ is the planted coloring. Choosing a value of $q$ so that the signed triangle counts of $\mathsf{RGG}(n,\mathbb{S}^{d-1},\frac12)$ and $\mathsf{PCol}(n,q)$ (nearly) match, we prove the following fact.
Remark 3. The condition $q = \Theta(d^{1/4})$ establishes a statistical-computational gap when $d \le n^{4-\epsilon}$ for any constant $\epsilon > 0$. An instance of $\mathsf{PCol}(n,q)$ has a clique of size $n/q = \Omega(n d^{-1/4})$ with probability 1. However, $\mathsf{RGG}(n,\mathbb{S}^{d-1},\frac12)$ does not contain a clique of size more than $3\log_2 n$ with high probability under Eq. (A) by [16]. Perhaps surprisingly, our result holds in the exact same regime as the results of [23] for refuting $q$-colorability. Namely, $q = \Theta(d^{1/4})$, $d \ge n^{8/3}$ is equivalent to $q = \Omega(n^{2/3})$. Our contribution here is not the analysis, but the realisation that $\mathsf{RGG}$ is indistinguishable from $\mathsf{PCol}$. We prove hardness for refuting $q$-colorability against the natural $\mathsf{PCol}$ model and do not need to construct a more sophisticated "quiet distribution" as in [23].

Remark 4.
[15] studies a similar question for Wishart matrices in the regime $d = o(n^3)$, when Wishart and GOE are distinguishable. The authors obtain a sequence of phase transitions for the Wishart density. The approximating densities are defined in terms of an inverse Fourier transform and are not easily interpretable, in contrast to the simple $\mathsf{PCol}$ distribution.

Application IV: The Second Eigenvalue of Spherical RGG

Theorem 4.9. Suppose that $\mathbf{G} \sim \mathsf{RGG}(n,\mathbb{S}^{d-1},\frac12)$. Then, with high probability, $\lambda_2(\mathbf{G}) = \tilde{O}(n/\sqrt{d} + \sqrt{n})$. Here, we need the strong bounds in Proposition 4.4 for sparse graphs. As these bounds provably do not hold for $\mathrm{OEI}$, more work is needed to extend to $p \ne 1/2$.

PROVING THE BOUNDS ON FOURIER COEFFICIENTS
Here, we prove our bounds on the Fourier coefficients of random geometric graphs by formalizing the argument outlined in Section 3.2. Specifically, in Section 5.1 we prove Theorem 4.1. In Section 5.3, we modify the argument slightly to prove the stronger Proposition 4.3 in the density-1/2 case. In Section 5.2 we prove the bounds on the edge independence numbers stated in Propositions 4.2 and 4.4.

The Main Argument in Theorem 4.1
Fix a connected graph $H$ on $k = |V(H)|$ vertices and $m = |E(H)|$ edges such that $m(\log n)^{3/2} \le \sqrt{d}$. Let $\pi$ be any bijective labeling of its vertices by $[k]$. We will identify vertices with their labels in $\pi$ and optimize over $\pi$ at the end. We prove Theorem 4.1 in the Gaussian setting and state the necessary modifications for the spherical setting at the end.
Step 1: High-Probability Bound on $\sum_{\ell<i} r_{\ell i}r_{\ell j} + (r_{ii}-1)r_{ij}$. Recall Eq. (6). As discussed, dependence between edges is due to the term
$$\sum_{\ell < i} r_{\ell i} r_{\ell j} + (r_{ii}-1)r_{ij}. \quad (7)$$
We bound the size of this term, along the way introducing notation that will be used later. By Gaussian and $\chi^2$-concentration, for any desired constant $c'$, there exists some absolute constant $C$ such that under Eq. (A) (which implies $\log 1/p = O(\log n)$ and $\log d = O(\log n)$) each summand in (7) lies in a "reasonable interval" of length $O(C^2\log n/d)$ with probability at least $1 - n^{-c'}$, and the same holds for $(r_{ii}-1)r_{ij}$. Denote by RSC the "reasonable set of configurations" on which all of these events hold. By the union bound, its complement has polynomially small probability. As (7) is a sum of at most $k$ terms, each of order $(C\sqrt{\log n/d})^2$ under the high-probability event RSC, we conclude that with probability at least $1 - n^{-c'}$, simultaneously for all $(ij) \in E(H)$, the term (7) is at most $\Delta/d$ in absolute value, where we defined $\Delta \coloneqq C^2 k \log n$. We condition on RSC for the rest of the argument; on this event, it remains only to bound the first term $r_{ij}$.
Step 2: Fragile Edges. Observe that under the high-probability event in Eq. (9), as long as $\eta_{ij}$ is at distance more than $\Delta$ from the threshold $\tau$, the edge indicator is determined by $\eta_{ij}$ alone, and the $\eta_{ij}$ variables are independent (even conditioned on RSC). Thus, all edges besides the ones for which $\eta_{ij}$ is close to $\tau$ are independent. We localize the dependence to the following fragile edges.

Definition 9 (Fragile Interval and Fragile Edges). Denote by $[F^L, F^U]$ the fragile interval, an interval of length $\Delta$ around the threshold $\tau$. An edge $\{i,j\}$ with $i <_\pi j$ is fragile if $\eta_{ij} \in [F^L, F^U]$.
Note that each edge is fragile independently, as the variables $\{\eta_{ij}\}_{1 \le i \le j \le n}$ are independent. Let F be the set of fragile edges. Now, $\eta_{ij} \sim N(0, 1/d)$, and $\mathbb{P}[\eta_{ij} \in [F^L, F^U]] \le C''\Delta\sqrt{d}$ for some absolute constant $C''$, because $[F^L, F^U]$ has length $\Delta$ and the Gaussian density around $\tau$ is $O(\sqrt{d})$. Conditioning on the fragile set yields Eq. (12). We used the fact that edges are fragile independently. This last conditioning is useful, because our noise operator depends on the set of fragile edges.
Observation 5.1 (Noise Operator View on RGG). The noise operator $T$ on the distribution $\mathsf{RGG}(n, N(0, \frac{1}{d}I_d), p)$ is parametrized by an ordering $\pi$ of the vertices and a marginal edge probability $\tilde{p}$. To sample from RGG, one first samples a fragile set F by including each edge independently with probability $\mathbb{P}[\eta_{ij} \in [F^L, F^U]]$ (recall that edges are fragile independently). F together with $\pi$ determines $\partial F$, the set of edges incident (in the appropriate sense) to F. Then, one samples the edges in $F \cup \partial F$ from the marginal distribution of $\mathsf{RGG}(n, N(0, \frac{1}{d}I_d), p)$ on $F \cup \partial F$, conditioned on F being the fragile set with respect to $\pi$. Conditioned on F and the edges in $F \cup \partial F$, the operator $T$ acts independently on edges with the following noise rates: there is no noise on the edges in $F \cup \partial F$, and the rest of the edges are fully rerandomized. Hence, $T(G)$ is a sample from $\mathsf{RGG}(n, N(0, \frac{1}{d}I_d), p)$.
Phrased differently, the above noise operator represents RGG restricted to the edges of $F \cup \partial F$ as a small planted subgraph in an ambient Erdős–Rényi graph $G(n, \tilde{p})$.
One key difference with Observation 3.1 is that the distribution of the edge set $F \cup \partial F$ is much more complicated than the distribution of the planted clique. The latter is simply a clique, while the graph on $F \cup \partial F$ is a subgraph of a random geometric graph which is further conditioned on its fragile set.
Nevertheless, just as in Observation 3.1, the independent rerandomization of the rest of the edges (those outside $F \cup \partial F$) yields an (exponentially fast) decay of Fourier coefficients, which we discuss next.
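The mechanism behind this decay can be checked exactly in a toy example (our own sketch, not from the paper; the three-edge state space, the kept set, and the source distribution $P$ are all hypothetical): any $p$-biased Fourier coefficient whose graph contains a rerandomized edge picks up a factor $\mathbb{E}\bigl[\frac{B - p}{\sqrt{p(1-p)}}\bigr] = 0$ with $B \sim \mathrm{Bern}(p)$, and hence vanishes.

```python
import itertools
import math

# Exact brute-force illustration: rerandomizing every edge outside a
# kept set S to independent Bern(p) kills the Fourier coefficient of
# any graph H that contains a rerandomized edge, while coefficients
# supported on S are preserved.
p = 0.3
chi = {1: math.sqrt((1 - p) / p), 0: -math.sqrt(p / (1 - p))}
edges = range(3)
keep = {0}                              # edges NOT rerandomized

# an arbitrary "structured" source distribution P over 3 edges
raw = {g: 1.0 + 2.0 * g[0] * g[1] + g[2]
       for g in itertools.product((0, 1), repeat=3)}
Z = sum(raw.values())
P = {g: w / Z for g, w in raw.items()}

def pushforward(P):
    """Apply the channel: keep edges in `keep`, rerandomize the rest."""
    Q = {g: 0.0 for g in P}
    for g, w in P.items():
        for h in itertools.product((0, 1), repeat=3):
            if any(h[e] != g[e] for e in keep):
                continue
            pr = 1.0
            for e in edges:
                if e not in keep:
                    pr *= p if h[e] == 1 else 1 - p
            Q[h] += w * pr
    return Q

def fourier(Q, H):
    """p-biased Fourier coefficient of the edge set H under Q."""
    return sum(w * math.prod(chi[g[e]] for e in H) for g, w in Q.items())

Q = pushforward(P)
print(abs(fourier(Q, [0, 1, 2])))              # ~0: H touches noise
print(abs(fourier(Q, [0]) - fourier(P, [0])))  # ~0: kept edge unchanged
```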
Step 4: Factors in $E(H) \setminus (F \cup \partial F)$ Are Rerandomized By The Noise Operator. A simple calculation with one-dimensional Gaussian variables, using Eqs. (8) and (11), bounds the contribution of these factors, yielding (14). The factors on the edges in $F \cup \partial F$ are in turn controlled via the low probability of fragility, yielding (16).
Step 6: Putting It All Together. Plugging (14) and (16) into (13), we bound the conditional Fourier coefficients. We combine this with Eq. (12) to obtain the unconditional bound. Finally, we can choose $\pi$ as the maximizer of $\mathrm{OEI}_\pi(H)$ and conclude Theorem 4.1 in the Gaussian case.
The quantity $|E(H) \setminus (F \cup \partial F)|$ corresponds to entropy in the distribution, as it is the size of the subset of edges which the noise operator rerandomizes (and which are independent of all other edges in $E(H)$). The term $|F|$ measures energy, as F is the subset of edges with nontrivial interactions (dependence) with other edges in $E(H)$. The inequality shows that energy and entropy cannot both be small, and either one being large results in a small Fourier coefficient: entropy due to randomness and energy due to low probabilities.
A related question to the tightness of Proposition 4.3 is finding lower bounds and precise estimates of the Fourier coefficients. These are useful for the design of low-degree algorithms. Our upper bounds on Fourier coefficients are mostly suited to showing hardness.
Finally, the information-theoretic counterparts of many of the questions addressed in this paper remain open. Is it possible to prove such information-theoretic convergence using $\chi^2$-like arguments based on squares of Fourier coefficients? A simple calculation shows that bounds scaling as $C^{|V(H)| - |E(H)|}$ for a constant $C$ (of which form Theorem 4.1 is) are insufficient to show $\chi^2(\mathsf{RGG}(n, \mathbb{S}^{d-1}, p)\,\|\,G(n,p)) = o(1)$. One could hope to surpass this barrier by using a tensorization argument [26,29] and/or conditional $\chi^2$-divergence [17,31].
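The identity underlying such $\chi^2$ arguments can be verified by brute force on a toy edge space (our own illustration; the three-edge ground set and the distribution $P$ are arbitrary): for the product measure $Q = G(n,p)$, the $p$-biased characters are orthonormal, so $\chi^2(P \,\|\, Q)$ equals the sum of squared Fourier coefficients of $P$ over all nonempty edge subsets.

```python
import itertools
import math

# Brute-force check of: chi^2(P || Q) = sum over nonempty H of
# (E_P[prod_{e in H} chi(g_e)])^2, where Q is independent Bern(p) edges
# and chi is the p-biased character on a single edge.
p, m = 0.3, 3                           # m = number of potential edges
chi = {1: math.sqrt((1 - p) / p), 0: -math.sqrt(p / (1 - p))}
configs = list(itertools.product((0, 1), repeat=m))

# an arbitrary distribution P and the product (Erdos-Renyi) measure Q
raw = {g: 1.0 + g[0] * g[1] + 0.5 * g[2] for g in configs}
Z = sum(raw.values())
P = {g: w / Z for g, w in raw.items()}
Q = {g: math.prod(p if x else 1 - p for x in g) for g in configs}

chi2 = sum((P[g] - Q[g]) ** 2 / Q[g] for g in configs)
fourier_sq = sum(
    sum(P[g] * math.prod(chi[g[e]] for e in H) for g in configs) ** 2
    for r in range(1, m + 1)
    for H in itertools.combinations(range(m), r)
)
print(abs(chi2 - fourier_sq))  # ~0: Parseval for the p-biased basis
```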

$$\mathrm{SW}_H(G) := \prod_{\{i,j\} \in E(H)} \frac{G_{ij} - p}{\sqrt{p(1-p)}}. \qquad (1)$$
$\mathrm{SW}_H(G)$ is the signed weight of $H$ defined by the above equation. Fourier coefficients are (signed) expectations of subgraphs.
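A minimal exact check of this centering (our own sketch; the three-edge ground set is hypothetical): under Erdős–Rényi, each factor $(G_e - p)/\sqrt{p(1-p)}$ is centered and the factors are independent, so $\mathbb{E}[\mathrm{SW}_H(G)] = 0$ for every nonempty $H$.

```python
import itertools
import math

# Exact enumeration over all graphs on 3 potential edges, weighted by
# their G(n, p) probabilities: the signed weight of any nonempty H has
# expectation exactly zero.
p = 0.4
edges = range(3)                        # potential edges of H

def sw(g, H):                           # signed weight of H in graph g
    return math.prod((g[e] - p) / math.sqrt(p * (1 - p)) for e in H)

def er_expect(H):                       # exact expectation under G(n, p)
    return sum(
        math.prod(p if x else 1 - p for x in g) * sw(g, H)
        for g in itertools.product((0, 1), repeat=3)
    )

print(abs(er_expect([0, 1, 2])))  # ~0
print(abs(er_expect([0])))        # ~0
```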

Figure 1: Detecting $d$-dimensional geometry via low-degree polynomials. In the model of non-adaptively queried edges $\mathcal{M}$, $m := |E(\mathcal{M})|$. A wedge is a path on 3 vertices.
$C_e(\pi)$ is the connected component containing $e$ of the subgraph of $H$ on the edges that come after $e$ with respect to $\pi$. We say that an edge $f \in E(H)$ is strongly covered by $e$ if $f$ has a neighbour other than $e$ in $C_e(\pi)$. We denote by $\mathrm{SC}(e)$ the set of edges strongly covered by $e$. See Fig. 2 for an illustration. The analogue of Definition 5 is: Definition 7 (Strong Ordered Independence Number). We define $\mathrm{SOEI}_\pi(H)$, the strong independence number of $H$ with respect to $\pi$, as the minimal cardinality of a set $S \subseteq E(H)$ such that $S \cup \bigcup_{e \in S} \mathrm{SC}(e) = E(H)$. Then $\mathrm{SOEI}(H) = \max_\pi \mathrm{SOEI}_\pi(H)$.