Hypergraph Unreliability in Quasi-Polynomial Time

The hypergraph unreliability problem asks for the probability that a hypergraph gets disconnected when every hyperedge fails independently with a given probability. For graphs, the unreliability problem has been studied over many decades, and multiple fully polynomial-time approximation schemes are known starting with the work of Karger (STOC 1995). In contrast, prior to this work, no non-trivial result was known for hypergraphs (of arbitrary rank). In this paper, we give quasi-polynomial time approximation schemes for the hypergraph unreliability problem. For any fixed $\varepsilon \in (0, 1)$, we first give a $(1+\varepsilon)$-approximation algorithm that runs in $m^{O(\log n)}$ time on an $m$-hyperedge, $n$-vertex hypergraph. Then, we improve the running time to $m\cdot n^{O(\log^2 n)}$ with an additional exponentially small additive term in the approximation.


Introduction
In the hypergraph unreliability problem, we are given an unweighted hypergraph $G = (V, E)$ and a failure probability $0 < p < 1$. The goal is to compute the probability that the hypergraph disconnects when every hyperedge is independently deleted with probability $p$. The probability of disconnection is called the unreliability of the hypergraph $G$ and is denoted $u_G(p)$. The hypergraph unreliability problem is a natural generalization of network unreliability, which is identically defined but on graphs (i.e., hypergraphs of rank 2). The latter is a classical problem in the graph algorithms literature that was shown to be #P-hard by Valiant [Val79], and its algorithmic study dates back to at least the 1980s [KLM89, AFW95]. By now, several fully polynomial-time approximation schemes achieving a $(1 + \varepsilon)$-approximation are known for the network unreliability problem [Kar99, HS18, Kar16, Kar17, Kar20, CHLP24]. In contrast, to the best of our knowledge, no non-trivial approximation was known for the unreliability problem on hypergraphs of arbitrary rank prior to this work.
Reliability problems are at the heart of analyzing the robustness of networks to random failures. (This can be contrasted with minimum cut problems that analyze robustness to worst-case failures.) Since real-world networks often exhibit random failures, there is much practical interest in reliability algorithms, with entire books devoted to the topic [Cha16, Col87]. However, many basic questions remain unanswered from a theoretical perspective. One bright spot from a theoretical standpoint is the network unreliability problem, for which the first FPTAS was given by Karger in STOC 1995 [Kar99]. Since then, many other FPTASes have been discovered with ever-improving running times [HS18, Kar16, Kar17, Kar20, CHLP24], the current record being a recent $O(m + n^{1.5})$-time algorithm (for a fixed $\varepsilon$) due to Cen et al. [CHLP24]. (Throughout the paper, $m$ and $n$ respectively denote the number of (hyper)edges and vertices in the (hyper)graph.) At the heart of these algorithms is the well-known fact that a graph has a polynomial number of near-minimum cuts: cuts whose value exceeds that of the minimum cut by at most a constant factor [Kar93]. This polynomial bound extends to hypergraphs of rank at most $O(\log n)$ [KK15], and as a result, the FPTASes for network unreliability also apply to such hypergraphs. However, this approach fails for hypergraphs of arbitrary rank. In general, a hypergraph of rank $r$ can have as many as $\Omega(m \cdot 2^r)$ near-minimum cuts (see Kogan and Krauthgamer [KK15] for an example), which rules out an enumeration of the near-minimum cuts in polynomial time for hypergraphs of large rank. This presents the main technical challenge in obtaining an approximation algorithm for hypergraph unreliability, and the main barrier that we overcome in this paper.
In addition to being a natural and well-studied generalization of graphs in the combinatorics literature, hypergraphs have also gained prominence in recent years as a modeling tool for real-world networks. While graphs are traditionally used to model networks with point-to-point connections, more complex "higher-order" interactions in modern networks are better captured by hypergraphs, as observed by many authors in different domains (see, e.g., the many examples in the recent survey of higher-order networks by Bick et al. [BGHS23]). Indeed, the use of random hypergraphs as a modeling tool for real-world phenomena has also been observed previously [GZCN09]. Therefore, we believe that the study of reliability in hypergraphs is a natural tool for understanding the connectivity properties of such real-world networks subject to random failures. We initiate this line of research in this paper and hope that it will be further expanded in the future.

Our Results
We give two algorithms for hypergraph unreliability. The first algorithm is simpler and achieves the following result:

Theorem 1.1. For any fixed $\varepsilon \in (0, 1)$, there is a randomized Monte Carlo algorithm for the hypergraph unreliability problem that runs in $m^{O(\log n)}$ time on an $m$-hyperedge, $n$-vertex hypergraph and returns an estimator $X$ that satisfies $X \in (1 \pm \varepsilon) u_G(p)$ whp.

The running time of the algorithm in the theorem above (and also that in the next theorem) is inversely polynomial in the accuracy parameter $\varepsilon$. For brevity, we assume that $\varepsilon$ is fixed throughout the paper and do not explicitly state this dependence in our running time bounds.
Note that the number of hyperedges in a hypergraph can be exponential in $n$. This makes a quasi-polynomial-time hypergraph algorithm with a running time of $\mathrm{poly}(m) \cdot n^{\mathrm{poly}\log n}$ qualitatively superior to one with a running time of $m^{\mathrm{poly}\log n}$. (Contrast this with graphs, where the two bounds are qualitatively equivalent because $m = O(n^2)$.) To this end, we give a second (more involved) algorithm that achieves this sharper bound on the running time while incurring a small additive error in the approximation guarantee.
To interpret this result, set $\delta = \exp(-n)$. Then, we get an algorithm that runs in $m \cdot n^{O(\log^2 n)}$ time and returns an estimator $X$ that satisfies $X \in (1 \pm \varepsilon) u_G(p) \pm \exp(-n)$ whp. In other words, we obtain the sharper running time bound that we were hoping for in exchange for an exponentially small additive error in the approximation. We may also note that, in general, a simple Monte Carlo simulation of the hypergraph disconnection event also gives an estimator for $u_G(p)$ with an additive error. But this additive error would be exponentially larger than the one in Theorem 1.2; in particular, in order to ensure that $X \in (1 \pm \varepsilon) u_G(p) \pm \exp(-n)$ whp, we would need to run the Monte Carlo simulation $\exp(n)$ times, thereby giving an exponential-time algorithm as opposed to the quasi-polynomial running time in Theorem 1.2.

Our Techniques
We now give a description of the main technical ideas that are used in our algorithms. Let us start with a rough (polynomial) approximation to $u_G(p)$. In graphs, this is easy. Let $\lambda$ denote the value of a minimum cut. Since there are at least one and at most $O(n^2)$ minimum cuts [DKL76], their collective contribution to $u_G(p)$ is between $p^\lambda$ and $O(n^2) \cdot p^\lambda$. Now, since the number of cuts of value $\le \alpha\lambda$ is at most $n^{O(\alpha)}$ [Kar93], the collective contribution of all other cuts to $u_G(p)$ is also at most $O(n^2) \cdot p^\lambda$ (for sufficiently small $p$; otherwise we can just use Monte Carlo sampling). The bound of $O(n^2)$ on the number of minimum cuts continues to hold in hypergraphs (see [GKP17, CX18]; this is implicitly shown in [Cun83]). So, their collective contribution is still between $p^\lambda$ and $O(n^2) \cdot p^\lambda$. But the number of cuts of value $\le \alpha\lambda$ can be exponential in the rank $r$, and therefore exponential in $n$ for $r = \Omega(n)$ [KK15]. Therefore, a naïve union bound over these cuts only gives a trivial exponential approximation to $u_G(p)$.
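To make the graph case concrete, the following is a small brute-force check (a toy instance of our choosing, not part of the paper's algorithm) that $u_G(p)$ indeed lies between $p^\lambda$ and $n^2 p^\lambda$ on a 4-cycle, where $\lambda = 2$:

```python
from itertools import product

def connected(vertices, edges):
    # union-find connectivity check over the surviving (hyper)edges
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for e in edges:
        vs = list(e)
        for u in vs[1:]:
            parent[find(u)] = find(vs[0])
    return len({find(v) for v in vertices}) == 1

def u_exact(vertices, edges, p):
    # exact unreliability by enumerating all 2^m failure patterns
    total = 0.0
    for fails in product([False, True], repeat=len(edges)):
        prob = 1.0
        for f in fails:
            prob *= p if f else 1 - p
        surviving = [e for e, f in zip(edges, fails) if not f]
        if not connected(vertices, surviving):
            total += prob
    return total

V = set(range(4))
E = [frozenset({i, (i + 1) % 4}) for i in range(4)]  # 4-cycle: lambda = 2, n = 4
u = u_exact(V, E, 0.1)  # lies between p^2 = 0.01 and n^2 p^2 = 0.16
```

For the 4-cycle with $p = 0.1$, the exact value is $1 - (1-p)^4 - 4p(1-p)^3 = 0.0523$, comfortably inside the claimed window.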
Our first technical contribution is to show that, somewhat surprisingly, the upper bound of $O(n^2) \cdot p^\lambda$ on the value of $u_G(p)$ continues to hold for hypergraphs of arbitrary rank. As described above, we cannot simply use a union bound over cuts, but must go deeper into the interactions between different cuts. To this end, we consider an alternative view of the random failure of hyperedges. For each hyperedge, we generate an independent exponential variable (at unit rate) and superpose the corresponding Poisson processes on a single timeline. We contract each hyperedge as it appears on this timeline; then, the disconnection event corresponds to having $\ge 2$ vertices in the contracted hypergraph at time $\ln(1/p)$. As hyperedges contract, the vertices (which we call supervertices) of the contracted hypergraph represent a partition of the vertices of the original hypergraph; we assign leaders to the subsets in this partition in a way that lets us argue that any two vertices survive as leaders until time $\ln(1/p)$ with probability at most $p^\lambda$. This allows us to recover the $O(n^2) \cdot p^\lambda$ bound on the value of $u_G(p)$ by a union bound over all vertex pairs.

We now use this rough $O(n^2)$-approximation to $u_G(p)$ in designing a recursive algorithm. We generate a random hypergraph $H$ by contracting each hyperedge in $G$ with probability $1 - q$ for some $q > p$.
(See [Kar16, Kar17, Kar20, CHLP24] for the use of random contraction in network unreliability.) The intuition is that, by coupling, these edges will survive if the failure probability is $p$; hence, contracting them does not affect the disconnection event. The algorithm now makes a recursive call on $H$ with the conditional failure probability $p/q$ and obtains an estimator for $u_H(p/q)$. But how do we bound the variance due to the randomness of $H$? This is where the $O(n^2)$-approximation comes in handy: it bounds the range of $u_H(p/q)$ to $O(n^2) \cdot (p/q)^\lambda$, thereby giving a bound of $n^2 \cdot q^{-\lambda}$ on the (relative) variance of the overall estimator. Thus, if we select $q$ such that $q^{-\lambda} = \mathrm{poly}(n)$, then we only need a polynomial number of random trials.
For this plan to work, we need to make progress in the recursion, i.e., make recursive calls on subgraphs that are smaller by a constant factor. Unfortunately, we are unable to ensure this in hypergraphs of arbitrarily large rank. To see this, consider a hypergraph containing $n$ hyperedges of rank $n - 1$, i.e., $\lambda = n - 1$. In this case, we have $n^2 \cdot q^{-(n-1)}$ trials, and the probability of each trial returning the input hypergraph is $q^n$ (if none of the $n$ hyperedges is contracted). So, $\ge 1$ recursive calls (in expectation) will run on the input hypergraph itself, which defeats the recursion. However, we show that this is really an extreme scenario, and we can make sufficient progress in all hypergraphs with rank at most $n/2$; we call these universally small, and the rest existentially large, hypergraphs.
We are now left to handle existentially large hypergraphs. This is where the two algorithms (Theorem 1.1 and Theorem 1.2) differ. The first algorithm (Theorem 1.1) simply enumerates over all outcomes (survival/failure) of the large hyperedges, i.e., those of rank $> n/2$. To do this efficiently, it orders the large hyperedges and creates a new recursive instance based on the first large hyperedge that is contracted in this order. This generates $\ell + 1 \le m + 1$ subproblems, where $\ell$ denotes the number of large hyperedges. In the last subproblem, all $\ell$ large hyperedges fail (i.e., none of them is contracted), and we are left with a universally small hypergraph. In all the other subproblems, at least one large hyperedge is contracted, and we are left with a hypergraph containing at most $n/2$ vertices. So, we make progress in either case.
The second algorithm (Theorem 1.2) cannot afford to enumerate over all large hyperedges. Instead, it partitions the set of hyperedges in $G$ into the large and small hyperedges and creates two hypergraphs, $G_{\text{large}}$ and $G_{\text{small}}$. Now, for $G$ to be disconnected, both $G_{\text{small}}$ and $G_{\text{large}}$ must be disconnected (but not vice-versa!). Recall that earlier, we ran into a problem where our naïve sampling process could not make progress in terms of reducing the size of the hypergraph when sampling large hyperedges. This was epitomized by a hypergraph containing $n$ hyperedges of rank $n - 1$ each. But if we think of this instance in isolation, then it is actually quite easy to estimate $u_G(p)$: whenever the hypergraph disconnects, it does so at a degree cut of a vertex. So, there are only $n$ cuts that we need to enumerate over. In fact, this property is true for the hypergraph $G_{\text{large}}$ obtained from any hypergraph $G$: since every pair of large hyperedges shares at least one vertex, any disconnected sub-hypergraph must have an isolated vertex. We exploit this property by writing a DNF formula for all the degree cuts of $G_{\text{large}}$ (where each variable denotes the survival/failure of a large hyperedge) and use the classical importance sampling technique of Karp, Luby, and Madras [KLM89] to generate a sample of $G_{\text{large}}$ conditioned on it being disconnected.
How do we augment this sample in $G_{\text{small}}$? We have two cases. To understand the distinction, let us informally imagine that the minimum cuts of $G_{\text{large}}$ and $G_{\text{small}}$ coincide, and together form the minimum cut of $G$. (Of course, this is not true in general!) The two cases are defined based on the relative values of the minimum cuts in $G_{\text{large}}$ and $G_{\text{small}}$. If $G_{\text{large}}$ contributes most of the hyperedges to the minimum cut (we call this the full revelation case), then the probability that $G_{\text{small}}$ gets disconnected is quite high (recall that $u_G(p) \ge p^\lambda$). In this case, it suffices to do Monte Carlo sampling in $G_{\text{small}}$ to augment the sample obtained from $G_{\text{large}}$. The other case is when $G_{\text{small}}$ contributes a sizeable number of hyperedges to the minimum cut (we call this the partial revelation case). Note that the extreme example of this second case is when $G_{\text{large}}$ is empty, i.e., when the hypergraph is universally small. This suggests generalizing the use of random contraction from universally small hypergraphs to this case, i.e., failing hyperedges at a higher probability $q > p$ in a recursive step. But to synchronize the sample across $G_{\text{large}}$ and $G_{\text{small}}$, we must use the same value $q$ in $G_{\text{large}}$ as well. Unfortunately, as we observed earlier, the algorithm might not make progress in terms of the size of the hypergraph in this case. To overcome this, we introduce a second recursive parameter: the value of the failure probability itself. This second recursive parameter requires us to define a new base case when the probability of failure is very small (denote the threshold by a parameter $\delta$); this is where we incur the additive loss of $\delta$ in the approximation. The overall running time now follows from the fact that each subproblem branches into polynomially many subproblems, and the depth of the recursion is bounded by $\log n \cdot \log\log(1/\delta)$, where the first factor comes from the recursion on size and the second from that on the failure probability.
Organization. We give some preliminary definitions and terminology in Section 2. We then establish Theorem 1.1 in Section 3. Finally, we establish Theorem 1.2 in Section 4. We give some concluding thoughts in Section 5.

Preliminaries
Hypergraphs. We start with some basic notation for hypergraphs. A hypergraph $G = (V, E)$ comprises a set of vertices $V$ and a set of hyperedges $E$, where each hyperedge $e \in E$ is a non-empty subset of the vertices, i.e., $\emptyset \subset e \subseteq V$. The rank of a hyperedge $e$ is $|e|$; the rank of a hypergraph $G$, denoted $r_G$, is the maximum rank of a hyperedge in $G$.
For any hypergraph $G = (V, E)$ and subset of hyperedges $F \subseteq E$, denote by $G - F := (V, E \setminus F)$ the hypergraph obtained by deleting the hyperedges in $F$ from $G$. A cut in a hypergraph is defined as a set of hyperedges $C$ such that $G - C$ is disconnected. The value of a cut $C$ is the number of hyperedges in $C$. A minimum cut of a hypergraph is a cut of minimum value. We denote the value of a minimum cut in a hypergraph $G$ by $\lambda_G$. The following is a known result (it follows from Theorem 4 in [CQ21] using the maximum flow algorithm in [CKL+22]):

Theorem 2.1. The minimum cut of a hypergraph can be computed in $(\sum_{e \in E} |e|)^{1+o(1)}$ time.
In this paper, we often make use of hyperedge contractions. Contracting a hyperedge $e$ in a hypergraph $G$ replaces the vertices in $e$ by a single vertex to form a new hypergraph denoted $G/e := (V/e, E/e)$. Note that there is a natural surjective map $\phi : V \to V/e$ that maps vertices in $e$ to the contracted supervertex in $V/e$, and maps vertices outside $e$ to themselves. Each hyperedge $f \in E$ is replaced in $E/e$ by its element-wise image $\{\phi(u) : u \in f\}$. By extension, contracting a set of hyperedges $F = \{e_1, e_2, \ldots, e_k\}$ is equivalent to contracting all hyperedges in $F$ in arbitrary order: we write $G/F := (((G/e_1)/e_2) \cdots)/e_k$. $H$ is called a contracted hypergraph of $G = (V, E)$ if $H = G/F$ for some $F \subseteq E$. To distinguish between the uncontracted vertices in $G$ and the contracted vertices in $H$, we usually call the former vertices and the latter supervertices.
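The contraction operation $G/F$ can be sketched with a union-find structure. The following is an illustrative sketch; the representation of a hypergraph as a vertex set plus a list of frozensets is our own assumption, not the paper's:

```python
class DSU:
    """Union-find over vertex labels, used to merge contracted vertices."""
    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}
    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v
    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if ru != rv:
            self.parent[ru] = rv

def contract(vertices, edges, F):
    """Return the contracted hypergraph G/F as (supervertices, mapped hyperedges)."""
    dsu = DSU(vertices)
    for e in F:                       # merge all vertices of each contracted hyperedge
        vs = list(e)
        for u in vs[1:]:
            dsu.union(vs[0], u)
    phi = {v: dsu.find(v) for v in vertices}   # the surjective map phi : V -> V/F
    new_vertices = set(phi.values())
    new_edges = [frozenset(phi[u] for u in e) for e in edges]
    return new_vertices, new_edges

# small demo: contract {1,2} in a path-like hypergraph on 4 vertices
V2, E2 = contract({1, 2, 3, 4},
                  [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})],
                  [frozenset({1, 2})])
```

After the contraction, the hyperedge $\{1,2\}$ collapses to a singleton on the new supervertex, while the other hyperedges are mapped element-wise, exactly as in the definition of $E/e$.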
A key operation in our algorithm is uniform random hyperedge contraction. We use $H \sim G(q)$ for some $q \in (0, 1)$ to denote the distribution of a random contracted hypergraph $H$ obtained from $G$ by contracting each hyperedge independently with probability $1 - q$. The next lemma states that $u_H(p/q)$ is an unbiased estimator of $u_G(p)$:

Lemma 2.2. Suppose $H \sim G(q)$ and $q \ge p$. Then, $\mathbb{E}[u_H(p/q)] = u_G(p)$.
Proof. Deleting each hyperedge independently with probability $p$ is equivalent to first choosing each hyperedge with probability $q \ge p$ and then deleting each chosen hyperedge with probability $p/q$. $u_G(p)$ is the probability that $G$ disconnects in the former distribution. In the latter distribution, note that the hyperedges that are not chosen must be connected in the resulting hypergraph after deletion, so contracting them does not affect the disconnection event of the resulting hypergraph. Thus, $\mathbb{E}[u_H(p/q)]$ is the probability that $G$ disconnects in the latter distribution. In conclusion, the two probabilities are equal.
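The two-stage view in this proof can be checked numerically on a toy instance (a triangle graph of our choosing, with known $u_G(0.5) = 0.5$); this is a sanity-check sketch, not the paper's algorithm:

```python
import random

def connected(n, edges):
    # union-find connectivity over vertices 0..n-1
    parent = list(range(n))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for e in edges:
        vs = list(e)
        for u in vs[1:]:
            parent[find(u)] = find(vs[0])
    return len({find(v) for v in range(n)}) == 1

def estimate_direct(n, edges, p, trials, rng):
    # delete each hyperedge with probability p, count disconnections
    fails = sum(not connected(n, [e for e in edges if rng.random() >= p])
                for _ in range(trials))
    return fails / trials

def estimate_two_stage(n, edges, p, q, trials, rng):
    # keep each hyperedge with probability q, then delete kept ones with
    # probability p/q; unkept hyperedges surely survive (they are "contracted")
    fails = 0
    for _ in range(trials):
        surviving = []
        for e in edges:
            if rng.random() < q:              # e enters H, may still be deleted
                if rng.random() >= p / q:
                    surviving.append(e)
            else:                              # e is contracted: it surely survives
                surviving.append(e)
        fails += not connected(n, surviving)
    return fails / trials

rng = random.Random(7)
tri = [frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 2})]
# exact u_G(0.5) for a triangle: 3 p^2 (1-p) + p^3 = 0.5
d = estimate_direct(3, tri, 0.5, 20000, rng)
t = estimate_two_stage(3, tri, 0.5, 0.8, 20000, rng)
```

Both estimates concentrate around the same value, reflecting that the coupling leaves the disconnection event unchanged.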
Random Variables. Next, we give some basic facts about random variables that we will use in this paper. All random variables considered in the paper are non-negative.

The relative variance of a random variable $X$ is $\eta[X] := \operatorname{Var}[X]/\mathbb{E}[X]^2$. Since we use a biased estimator in Theorem 1.2, we need a non-standard (capped) version of relative variance. We define it and state its properties below.
Definition 2.3 (Capped relative variance). The ($\delta$-)capped relative variance of a random variable $X$ is denoted $\eta_\delta[X]$. We state some basic facts about capped relative variance (proofs in Appendix A). Note that relative variance is the special case of capped relative variance with $\delta = 0$; therefore, these facts also hold for relative variance as a special case.
Fact 2.4. The average of $M$ independent samples of $X$ has capped relative variance $\eta_\delta[X]/M$.
Fact 2.5. Suppose $Y$ is an unbiased estimator of $x$ and, conditioned on a fixed $Y$, $Z$ is a biased estimator of $Y$ with bias in $[-\delta, 0]$ and bounded capped relative variance; then the bias and capped relative variance of $Z$ as an estimator of $x$ are bounded as shown in Appendix A. In particular, when $\delta = 0$ (i.e., for the relative variance of an unbiased estimator $Z$), there is the stronger bound $\eta[Z] \le (1 + \eta[Y]) \cdot (1 + \max_Y \eta[Z \mid Y]) - 1$.

Fact 2.6. Suppose $X$ and $Z$ are independent random variables with expectation in $(0, 1)$, and $\delta \ge 0$. Then, the capped relative variance of their product is bounded as shown in Appendix A.

The next two facts concern relative variance and are proved in Appendix A.

Exponential distribution. Recall that the exponential distribution of rate $r$ gives a continuous random variable $X \ge 0$ satisfying $\Pr[X \ge t] = e^{-rt}$ for all $t \ge 0$. We state some standard properties of exponential variables:

Fact 2.10 (Moment generating function). Let $X$ follow the exponential distribution of rate $r$. Then, for any $t < r$, $\mathbb{E}[e^{tX}] = \frac{r}{r - t}$.

Fact 2.11 (Memoryless property). Let $X$ follow an exponential distribution. Then, for any $s, t \ge 0$, $\Pr[X \ge s + t \mid X \ge s] = \Pr[X \ge t]$.

Fact 2.12. Let $X_1, X_2, \ldots, X_k$ be independent random variables following the exponential distribution of rate $r$, and let $X = \min_{i \le k} \{X_i\}$. Then, $X$ follows the exponential distribution of rate $kr$. Moreover, $X = X_i$ for every value of $i$ with probability $1/k$.
Monte Carlo sampling. Suppose we want to estimate the probability $p_D$ that an event $D$ happens. (For $u_G(p)$, $D$ is the event that the hypergraph disconnects.) The Monte Carlo sampling algorithm first draws a sample from the underlying space. (For $u_G(p)$, it deletes each hyperedge independently with probability $p$.) The estimator returns 1 if $D$ happens, and 0 otherwise. The following is a standard property of this estimator (proof in Appendix A):

Lemma 2.13. Monte Carlo sampling outputs an unbiased estimator of $p_D$ with relative variance at most $\frac{1}{p_D}$ and $\delta$-capped relative variance at most $\min\{\frac{1}{p_D}, \frac{1}{\delta}\}$.
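As a numerical sanity check of the variance bound in Lemma 2.13, the following toy sketch estimates an event of known probability (the event and its probability are our own illustrative choices):

```python
import random

rng = random.Random(1)
p_D = 0.2                    # toy event probability (illustrative)
N = 100000

# Monte Carlo: 0/1 indicator samples of the event D
samples = [1 if rng.random() < p_D else 0 for _ in range(N)]
est = sum(samples) / N

# for a 0/1 variable the second moment equals the mean, so the
# empirical relative variance Var[X]/E[X]^2 simplifies to (1 - est)/est
relvar = (est - est ** 2) / est ** 2
# Lemma 2.13 predicts relative variance at most 1/p_D = 5;
# the exact value for a Bernoulli(p_D) indicator is (1 - p_D)/p_D = 4
```

The empirical relative variance lands near $(1 - p_D)/p_D = 4$, below the bound $1/p_D = 5$ from the lemma.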
Given Lemma 2.13, we can use Lemma 2.7 to obtain the following:

Lemma 2.14. We can obtain a $(1 \pm \varepsilon)$-approximation of $p_D$ whp by averaging $O(\varepsilon^{-2} \log n / p_D)$ Monte Carlo samples.

DNF probability. In the DNF probability problem, we are given a DNF formula $F$ with $N$ variables and $M$ clauses and a value $p \in (0, 1)$. The goal is to estimate the probability $u_F(p)$ that $F$ is satisfied when each variable is set to True with probability $p$ independently. This problem is #P-hard even in the special case of $p = \frac{1}{2}$ [Val79]. In a seminal work, Karp, Luby, and Madras [KLM89] provided an FPRAS that runs in $O(NM)$ time.

Theorem 2.15 ([KLM89]). The DNF probability problem can be $(1 \pm \varepsilon)$-approximated whp in $O(NM)$ time (for fixed $\varepsilon$).
Our algorithm will need an unbiased estimator for DNF probability.The estimator in Theorem 2.15 could be biased, but we can get an unbiased estimator by using its primitive version, at the cost of a slower running time.We state this in the next two lemmas; these are essentially shown in [KLM89], but we include a proof in the appendix for completeness.
Lemma 2.16. An unbiased estimator of $u_F(p)$ with relative variance at most 1 can be computed in $O(NM^2)$ time.
Lemma 2.17 (DNF sampling). There exists an algorithm that draws a sample of variable values in $O(NM^2)$ time according to the following distribution: each variable independently takes value True with probability $p$ and False with probability $1 - p$, conditioned on the values satisfying $F$.
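To illustrate the flavor of the [KLM89] technique, the following is a hedged sketch of the standard coverage (importance sampling) estimator for monotone DNF probability, where each clause is a set of variables that must all be True. The function name and representation are our own; this is not the exact estimator of Lemmas 2.16 and 2.17:

```python
import random

def klm_estimate(clauses, num_vars, p, trials, rng):
    """Coverage estimator for Pr[F satisfied], F a monotone DNF."""
    weights = [p ** len(c) for c in clauses]   # Pr[clause i satisfied]
    S = sum(weights)
    idx = list(range(len(clauses)))
    total = 0.0
    for _ in range(trials):
        # pick clause i with probability proportional to Pr[clause i satisfied]
        i = rng.choices(idx, weights=weights)[0]
        # sample an assignment conditioned on clause i being satisfied:
        # force clause i's variables to True, draw the rest Bernoulli(p)
        x = [rng.random() < p for _ in range(num_vars)]
        for v in clauses[i]:
            x[v] = True
        # coverage = number of clauses this assignment satisfies (always >= 1)
        cov = sum(all(x[v] for v in c) for c in clauses)
        total += S / cov                        # unbiased: E[S/cov] = u_F(p)
    return total / trials

# demo: F = x0 OR x1 with p = 0.5, so u_F(p) = 1 - (1-p)^2 = 0.75
est = klm_estimate([frozenset({0}), frozenset({1})], 2, 0.5, 20000, random.Random(3))
```

The per-sample value $S/\text{cov}$ lies in $[S/M, S]$, which is what gives the estimator its small relative variance.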

Random Contraction with Large Edge Enumeration
In this section, we design an $m^{O(\log n)}$-time algorithm that outputs an unbiased estimator of $u_G(p)$ with relative variance $O(1)$. It follows by Lemma 2.7 that a $(1 \pm \varepsilon)$-approximation can be computed in $m^{O(\log n)} \varepsilon^{-2}$ time, thereby establishing Theorem 1.1.

Algorithm Description
Overview. The algorithm is recursive. Before describing the algorithm formally, we give some intuition for the recursive step. The recursive case is divided into two sub-cases depending on the maximum rank of the hyperedges. We call a hypergraph universally small if all hyperedge ranks are at most $n/2$; otherwise, it is said to be existentially large. If the hypergraph is universally small, the algorithm runs a single recursive step of random hyperedge contraction and recursively estimates the unreliability of the contracted hypergraph. This is repeated $\mathrm{poly}(n)$ times to reduce the variance of the estimator, and the average of all estimates is taken as the output. If the hypergraph is existentially large, the algorithm lists all large hyperedges of rank greater than $n/2$, enumerates the first large hyperedge in the list that does not fail, and recursively estimates the unreliability of the resulting subgraph. The algorithm also handles the case that all large hyperedges fail by recursing on the (universally small) sub-hypergraph formed by deleting all large hyperedges. Now, we describe the algorithm formally.
Algorithm 1: Unreliability($G = (V, E)$, $p$)

    if $G$ is disconnected then return 1.
    if $p > n^{-10/\lambda}$ then
        return the average of $n^{10}$ Monte Carlo samples (Lemma 2.13).
    if $n = O(1)$ then
        Merge parallel hyperedges into weighted hyperedges.
        Enumerate all $2^m$ possible outcomes of random hyperedge removal, where $m \le 2^n$ is the number of weighted hyperedges and a hyperedge of weight $w(e)$ has failure probability $p^{w(e)}$.
        return the total probability that the hypergraph is disconnected after random hyperedge removal.
    if $G$ is universally small then
        $q \leftarrow n^{-10/\lambda}$.
        for $i = 1$ to $2n^{12}$ do
            Sample $H_i \sim G(q)$.
            Recursively call Unreliability($H_i$, $p/q$) to get estimator $X_i$.
        return the average of all $X_i$'s.
    else
        List the large hyperedges $e_1, e_2, \ldots, e_\ell$; let $E_i$ be the set of the first $i$ hyperedges in the list.
        for $i = 0$ to $\ell - 1$ do
            Recursively call Unreliability($(G - E_i)/e_{i+1}$, $p$) to get estimator $X_i$.
        Recursively call Unreliability($G - E_\ell$, $p$) to get estimator $X_\ell$.
        return $\sum_{i=0}^{\ell-1} p^i (1-p) X_i + p^\ell X_\ell$.
Base cases. There are three base cases:

1. $G$ is disconnected. In this case, we output 1.
2. $p$ is larger than $n^{-10/\lambda}$. In this case, we use Monte Carlo sampling (Lemma 2.13) and take the average of $n^{10}$ samples.
3. The number of vertices $n$ is a constant. In this case, we merge all parallel hyperedges to form weighted hyperedges. We need to estimate $u_G(p)$ when each weighted hyperedge $e$ is removed with probability $p^{w(e)}$, where $w(e)$ is the weight of $e$. We enumerate over all possible subsets of weighted hyperedges that are deleted, and compute $u_G(p)$ exactly. The first step takes $O(m)$ time; the rest is $O(1)$ time. We have established the following lemma:

Lemma 3.1. When $n = O(1)$, the algorithm computes $u_G(p)$ exactly in $O(m)$ time.

Recursive case. We start by classifying hypergraphs as follows:

Definition 3.2 (universally small, existentially large hypergraphs). A hypergraph is universally small if all hyperedges are of rank at most $n/2$. A hypergraph is existentially large if there exists a hyperedge of rank greater than $n/2$.
Recursive algorithm for universally small hypergraphs. The algorithm repeats a random contraction step independently $2n^{12}$ times. In the $i$-th random contraction step, the algorithm samples $H_i \sim G(q)$ by contracting each hyperedge with probability $1 - q$ independently, where $q = n^{-10/\lambda}$. Note that $q \ge p$; otherwise, we are in a base case. Then, the algorithm recursively estimates $u_{H_i}(p/q)$. We will show later that $u_{H_i}(p/q)$ is an unbiased estimator of $u_G(p)$ with bounded relative variance. After all $2n^{12}$ recursive calls, the algorithm takes the average of the estimators returned by these recursive calls as the output.
Recursive algorithm for existentially large hypergraphs. Suppose there are $\ell$ large hyperedges, ordered arbitrarily as $e_1, e_2, \ldots, e_\ell$. Let $E_i$ be the set of the first $i$ hyperedges in the list; in particular, $E_0 = \emptyset$. We divide the event of hypergraph disconnection into $\ell + 1$ disjoint events by enumerating the first hyperedge in the list that does not fail. Formally, for $i = 0, 1, \ldots, \ell - 1$, let $A_i$ be the event that the first $i$ hyperedges in the list all fail, but the $(i+1)$-th hyperedge survives; let $A_\ell$ be the event that all $\ell$ hyperedges fail. Then, $\Pr[A_i] = p^i(1-p)$ for $i \le \ell - 1$ and $\Pr[A_\ell] = p^\ell$. Conditioned on each event $A_i$, we can remove the failed hyperedges in $E_i$ and contract the first surviving hyperedge $e_{i+1}$ to form a subgraph $H_i$. Formally, let $H_i = (G - E_i)/e_{i+1}$ for $i = 0, 1, \ldots, \ell - 1$, and $H_\ell = G - E_\ell$. The event that $G$ disconnects conditioned on $A_i$ is equivalent to $H_i$ disconnecting when each hyperedge is removed with probability $p$ independently. We have
$$u_G(p) = \sum_{i=0}^{\ell-1} p^i (1-p) \cdot u_{H_i}(p) + p^\ell \cdot u_{H_\ell}(p). \qquad (1)$$
The algorithm runs $\ell + 1 = O(m)$ recursive calls, one on each $H_i$, to get unbiased estimators $X_i$ of $u_{H_i}(p)$. The subproblems are easier for the following reason: in $H_i$ for $i \le \ell - 1$, we contracted a large hyperedge of $G$, so the number of vertices decreases by at least half; in $H_\ell$, we removed all large hyperedges from $G$, so $H_\ell$ is universally small.
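The decomposition of $u_G(p)$ over the events $A_i$ can be verified numerically by brute force on a toy instance. The 4-vertex hypergraph below is our own illustrative choice; both sides of the decomposition are computed exactly by enumeration:

```python
from itertools import product

def connected(vertices, edges):
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for e in edges:
        vs = list(e)
        for u in vs[1:]:
            parent[find(u)] = find(vs[0])
    return len({find(v) for v in vertices}) == 1

def u_exact(vertices, edges, p):
    # brute-force unreliability: sum over all 2^m failure patterns
    total = 0.0
    for fails in product([False, True], repeat=len(edges)):
        prob = 1.0
        for f in fails:
            prob *= p if f else 1 - p
        surviving = [e for e, f in zip(edges, fails) if not f]
        if not connected(vertices, surviving):
            total += prob
    return total

def contract(vertices, edges, ce):
    # contract hyperedge ce into its minimum-label vertex
    rep = min(ce)
    phi = {v: (rep if v in ce else v) for v in vertices}
    return {phi[v] for v in vertices}, [frozenset(phi[u] for u in e) for e in edges]

V = {0, 1, 2, 3}
large = [frozenset({0, 1, 2}), frozenset({1, 2, 3})]   # rank > n/2 = 2
small = [frozenset({0, 1}), frozenset({2, 3})]
E = large + small
p, l = 0.3, len(large)

lhs = u_exact(V, E, p)
rhs = 0.0
for i in range(l):
    # event A_i: large hyperedges e_1..e_i fail, e_{i+1} survives and is contracted
    Vi, Ei = contract(V, E[i + 1:], E[i])
    rhs += p ** i * (1 - p) * u_exact(Vi, Ei, p)
rhs += p ** l * u_exact(V, E[l:], p)                    # A_l: all large hyperedges fail
# lhs and rhs agree up to floating-point error
```

Since the events $A_0, \ldots, A_\ell$ partition the probability space, the two exact computations must coincide.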

Correctness
In this section, we prove the following lemma, which establishes the correctness of the algorithm.

Lemma 3.3. Algorithm 1 outputs an unbiased estimator with relative variance at most 1.
Note that the base cases of a disconnected $G$ and of constant size output the exact value of $u_G(p)$, and the base case of Monte Carlo sampling outputs an unbiased estimator of $u_G(p)$. Also, an enumeration step in the existentially large case does not introduce variance. So, we only need to bound the relative variance introduced in the universally small case. To do so, we first analyze the variance introduced in a random contraction step.
The key to bounding the relative variance of a random contraction step is the following property of a random subgraph, which we prove later.

Lemma 3.4. For any hypergraph $G$ on $n$ vertices with minimum cut value $\lambda$ and any $p \in (0, 1)$,
$$p^\lambda \le u_G(p) \le n^2 p^\lambda. \qquad (2)$$
Lemma 3.4 provides an upper bound on the relative variance of random contraction:

Lemma 3.5. Suppose $H \sim G(q)$ and $q \ge p$. Then, the relative variance of $u_H(p/q)$ is at most $n^2 q^{-\lambda} - 1$.

Proof. Because $H$ is constructed by contraction from $G$, its min-cut value $\lambda_H$ is at least the min-cut value $\lambda$ of $G$. By Lemma 3.4, since $|V(H)| \le n$, $\lambda_H \ge \lambda$, and $q \ge p$, we have $u_H(p/q) \le n^2 (p/q)^\lambda$. By Lemma 2.2, $u_H(p/q)$ is an unbiased estimator of $u_G(p)$. Next, we bound its relative variance $\eta[u_H(p/q)]$. By Fact 2.9, the relative variance is upper bounded by $\frac{\max_H u_H(p/q)}{u_G(p)} - 1$. We have $\max_H u_H(p/q) \le n^2 (p/q)^\lambda$ by Equation (2), and $u_G(p) \ge p^\lambda$ by Lemma 3.4. Therefore, $\eta[u_H(p/q)] \le \frac{n^2 (p/q)^\lambda}{p^\lambda} - 1 = n^2 q^{-\lambda} - 1$.

We are now prepared to prove Lemma 3.3.
Proof of Lemma 3.3. We will prove the lemma by induction on the number of vertices $n$, the number of hyperedges $m$, as well as the value of $m \ln \frac{1}{p}$. The base case of $n = O(1)$ is given by Lemma 3.1. The base case of $m = 0$ is handled by the disconnected case in the algorithm. These two base cases output the exact value of $u_G(p)$. For the induction on $m \ln \frac{1}{p}$, notice that this value is positive because $p \in (0,1)$ in all recursive calls. The base case $m \ln \frac{1}{p} \le \frac{10 m \ln n}{\lambda}$ implies $p^\lambda \ge n^{-10}$. Hence, it is handled by Monte Carlo sampling in the algorithm, which outputs an unbiased estimator of $u_G(p)$ with relative variance at most $\frac{1}{u_G(p)} \le \frac{1}{p^\lambda} \le n^{10}$ by Lemma 2.13. After taking the average of $n^{10}$ samples, the relative variance is reduced to at most 1 by Fact 2.4.
For the inductive step, there are two cases. We first consider a random contraction step when the hypergraph is universally small. This step generates $2n^{12}$ random subgraphs $H_i \sim G(q)$, where $q^\lambda = n^{-10}$. Lemma 2.2 gives that $u_{H_i}(p/q)$ is an unbiased estimator of $u_G(p)$. Lemma 3.5 gives that $u_{H_i}(p/q)$ has relative variance at most $n^2 q^{-\lambda} - 1 = n^{12} - 1$. By the inductive hypothesis, each subproblem returns an unbiased estimator $X_i$ of $u_{H_i}(p/q)$ with relative variance at most 1. By Fact 2.5, $X_i$ is an unbiased estimator of $u_G(p)$ with relative variance at most $n^{12}(1 + 1) - 1 \le 2n^{12}$. Taking the average over all $2n^{12}$ estimators $X_i$ gives an unbiased estimator of $u_G(p)$ with relative variance at most 1 by Fact 2.4.
Next, we consider a large hyperedge enumeration step when the input hypergraph is existentially large. The algorithm computes the estimator $X = \sum_{i=0}^{\ell-1} p^i (1-p) \cdot X_i + p^\ell \cdot X_\ell$, where the $X_i$'s are returned by recursive calls on the $H_i$'s. By the inductive hypothesis, the recursive calls give unbiased estimators, i.e., $\mathbb{E}[X_i] = u_{H_i}(p)$ for all $i$. By linearity of expectation,
$$\mathbb{E}[X] = \sum_{i=0}^{\ell-1} p^i (1-p) \cdot u_{H_i}(p) + p^\ell \cdot u_{H_\ell}(p) = u_G(p),$$
where the last step is by Equation (1).
$X$ is a convex combination of independent recursive estimators $X_i$, which have relative variance at most 1 by the inductive hypothesis. Hence, $X$ also has relative variance at most 1 by Fact 2.8.
Finally, we argue that the induction is valid, i.e., that we always make progress on one of the inductive parameters in every recursive call. Whenever a hyperedge is contracted or deleted, we decrease $n$ or $m$. It is possible that a random contraction step does not change the hypergraph. In that case, notice that $p$ is changed to $p/q$ in the subproblem, so $m \ln \frac{1}{p}$ decreases by $m \ln \frac{1}{q} = \frac{10 m \ln n}{\lambda} \ge 10$. Therefore, we also decrease $m \ln \frac{1}{p}$, and the induction is valid.
In the rest of this subsection, we give a proof of our main technical lemma, Lemma 3.4.
Proof of Lemma 3.4. The lower bound of $p^\lambda$ in Lemma 3.4 holds because a minimum cut fails with probability $p^\lambda$. The rest of the proof is devoted to the upper bound of $n^2 p^\lambda$. We first assume that in the hypergraph $G$, the hyperedges are partitioned into pairs of parallel hyperedges. This is without loss of generality because we can replace each hyperedge by two copies and change the failure probability $p$ to $\sqrt{p}$.
We introduce some definitions that are used only in the analysis (i.e., the algorithm does not need to compute them). For any contracted hypergraph of $G$, we choose an orientation of the hyperedges in the sense that, in each hyperedge, one vertex is designated the head and all other vertices are tails. We require the orientation to satisfy the property that any pair of parallel hyperedges (in the partition of hyperedges into pairs) have different heads. This is always possible because the rank of each hyperedge is at least 2. Beyond this property, the choice of heads is arbitrary. The orientation is chosen in a consistent way; that is, any fixed contracted hypergraph always receives the same orientation throughout the analysis.
The orientation is used to define representatives of contracted supervertices. Each contracted supervertex during the contraction process is assigned a representative vertex, which is an original vertex contracted into the supervertex. Initially, each vertex is its own representative. Whenever a hyperedge $e$ is contracted, we assign the representative of the head of $e$ to be the representative of the new contracted supervertex.
Claim 3.6. For any pair of supervertices $u \neq v$ in a contracted hypergraph of $G$, there are at least $\lambda$ hyperedges that contain at least one of $u$ or $v$ as a tail.
Proof. Consider any pair of supervertices $u, v$. The (undirected) degree cut of $u$ (denoted $\partial u$) and the degree cut of $v$ (denoted $\partial v$) correspond to cuts in the original hypergraph, and hence have cut value at least $\lambda$. The hyperedges in the degree cuts come in pairs of parallel hyperedges. For a pair in $\partial u \setminus \partial v$, at least one copy contains $u$ as a tail. Similarly, for a pair in $\partial v \setminus \partial u$, at least one copy contains $v$ as a tail. Finally, a hyperedge in $\partial u \cap \partial v$ contains both $u$ and $v$; since the hyperedge has only one head, either $u$ or $v$ is a tail in it. Therefore, there are at least $\lambda$ hyperedges that contain $u$ or $v$ as a tail.
We now proceed to prove the upper bound.
Exponential contraction process. For the sake of analysis, consider the following continuous-time random process, called the exponential contraction process. Let each hyperedge $e$ independently arrive at a time $Y_e$ following the exponential distribution of rate 1. Then, the probability that a hyperedge does not arrive before time $\ln \frac{1}{q}$ is $e^{-\ln \frac{1}{q}} = q$. Therefore, contracting the hyperedges that arrive before time $\ln \frac{1}{q}$ produces the same distribution as $H \sim G(q)$. In the contraction process, if the hypergraph is not contracted into a single supervertex at time $\ln \frac{1}{p}$ (which happens with probability $u_G(p)$), there are at least two supervertices. Consequently, at least two vertices survive as representatives. We will show that the probability that any fixed pair of vertices $s, t$ both survive is at most $p^\lambda$. By union bounding over all $\binom{n}{2} \le n^2$ pairs of vertices, we have $u_G(p) \le n^2 p^\lambda$, which completes the proof.
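As a sanity check on the clock view, the following sketch samples $H \sim G(q)$ by thresholding Exp(1) arrival times at $\ln \frac{1}{q}$ and contracting with a union-find structure. This is only an illustration under my own naming conventions, not the paper's implementation.

```python
import math
import random

def exponential_contraction(n, hyperedges, q, rng):
    """Sample H ~ G(q): give each hyperedge an Exp(1) clock Y_e and
    contract exactly the hyperedges arriving before time ln(1/q), so
    each hyperedge survives uncontracted with probability
    exp(-ln(1/q)) = q. Returns the number of surviving supervertices."""
    T = math.log(1.0 / q)
    parent = list(range(n))

    def find(x):  # union-find root with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for e in hyperedges:
        if rng.expovariate(1.0) < T:  # hyperedge e arrives before ln(1/q)
            roots = {find(v) for v in e}
            r = roots.pop()
            for s in roots:
                parent[s] = r  # contract e into one supervertex
    return len({find(v) for v in range(n)})
```

In particular, $\Pr[Y_e \ge \ln \frac{1}{q}] = e^{-\ln \frac{1}{q}} = q$ exactly, and $q = 1$ gives $T = 0$, so no hyperedge is ever contracted.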
To bound the probability that a pair of vertices $s, t$ both survive, we choose a set of critical edges during the contraction process as follows. When $s$ and $t$ lie in different supervertices $\bar{s}$ and $\bar{t}$, we choose $\lambda$ hyperedges that contain at least one of $\bar{s}$ or $\bar{t}$ as a tail, as guaranteed by Claim 3.6; otherwise, we choose $\lambda$ arbitrary edges. (The critical edges may change after each contraction.) Note that whenever a critical edge arrives, one of $s$ or $t$ is no longer a representative. Hence, if $s, t$ both survive as representatives after the contraction process, then no critical edge arrives during the contraction process. By Lemma 3.7, this happens with probability at most $e^{-\lambda \ln \frac{1}{p}} = p^\lambda$.

Lemma 3.7. Suppose during the exponential contraction process from time 0 to $T$, we maintain a subset of uncontracted hyperedges called the critical edges. The critical edges may change immediately after each arrival of an uncontracted hyperedge, but do not change between two arrivals. Suppose that there are always at least $\lambda$ critical edges. Then, the probability that no critical edge arrives up to time $T$ is at most $e^{-\lambda T}$.
Proof. The proof is by induction on the number of uncontracted hyperedges $m$. The base case is $m = \lambda$, where the set of critical edges cannot change, and the probability that no critical edge arrives is $e^{-\lambda T}$.
For the inductive case, let $m_{\mathrm{crit}} \ge \lambda$ be the current number of critical edges. We bound the probability that no critical edge arrives up to time $T$ (denoted $p_G(T)$) by integrating over the earliest time $t$ at which a hyperedge $e_i$ arrives. By Fact 2.12, $t$ follows the exponential distribution of rate $m$, and $e_i$ is not a critical edge with probability $1 - \frac{m_{\mathrm{crit}}}{m}$. After the earliest hyperedge $e_i$ arrives, we apply the inductive hypothesis on the hypergraph $G/e_i$ and remaining time $T - t$, which completes the inductive case.
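Writing out the integral in the inductive step (with $m_{\mathrm{crit}} \ge \lambda$ critical edges among $m$ uncontracted hyperedges), the calculation can be completed as follows; this display is a reconstruction consistent with the surrounding proof, not the paper's exact equation.

```latex
\begin{align*}
p_G(T) &\le e^{-mT} + \int_0^T m e^{-mt}
          \Big(1 - \frac{m_{\mathrm{crit}}}{m}\Big) e^{-\lambda (T-t)}\, dt \\
       &=   e^{-mT} + (m - m_{\mathrm{crit}})\, e^{-\lambda T}
            \int_0^T e^{-(m-\lambda)t}\, dt \\
       &=   e^{-mT} + \frac{m - m_{\mathrm{crit}}}{m - \lambda}\,
            e^{-\lambda T}\big(1 - e^{-(m-\lambda)T}\big) \\
       &\le e^{-mT} + \big(e^{-\lambda T} - e^{-mT}\big) \;=\; e^{-\lambda T},
\end{align*}
```

where the first term accounts for no arrival at all, and the last inequality uses $m_{\mathrm{crit}} \ge \lambda$, i.e., $\frac{m - m_{\mathrm{crit}}}{m - \lambda} \le 1$.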

Running Time
In this section, we prove the following lemma on the running time of the algorithm.
Lemma 3.8. The expected running time of Algorithm 1 is $m^{O(\log n)}$.
Size decrease bound. In order to bound the size of the recursion tree, we would ideally want each random contraction step to reduce the number of supervertices by a constant factor, so that the recursion tree has depth $O(\log n)$. This does not generally hold with high probability when some hyperedge's rank is large. As an extreme example, consider the hypergraph that consists of $n$ distinct hyperedges of rank $n-1$. The probability to contract nothing is $q^n$, while the contraction step is repeated $n^2 q^{-\lambda} = n^2 q^{-(n-1)}$ times, so we expect to see at least one bad subproblem where the number of vertices does not decrease. If this happens, then the size of the recursion tree would be unbounded. This is where the maximum rank assumption in the universally small case becomes useful: we can control the probability of getting a bad subproblem to be small enough, which is crucial in bounding the size of the recursion tree.

Lemma 3.9. Fix constants $A > B > 1$. Suppose that all hyperedges have rank at most $R = n/A$, and $n$ is larger than some constant depending on $A$. Let $n^* = \lceil BR \rceil$, and let $H \sim G(q)$. Then,

Proof. Define $t(G)$ to be the stopping time in the contraction process when the vertex size of the contracted hypergraph decreases to at most $n^*$. By definition

By choosing the same parameters $A, B$ in Lemma 3.11 (which we will prove later),

where the last inequality uses that $1 + \ln x \le x$ for all $x > 0$.
Lemma 3.11. Fix constants $A > B > 1$. Suppose the maximum rank is upper bounded by $R$, and $R$ is larger than some constant depending on $A$.

Let $\tau = \min_{e \in E(\tilde G)} X_e$ be the earliest arrival time of a hyperedge in $\tilde G$. Because the $X_e$'s are sampled from i.i.d. exponential distributions of rate 1, the random variable $\tau$ follows the exponential distribution of rate $m$ by Fact 2.12. Note that any degree cut in $\tilde G$ has value at least the min-cut value $\lambda$ in $\tilde G$. By summing over the degrees of all vertices, we have

Denote by $\mathrm{rate}(\cdot)$ the rate of an exponentially distributed random variable. Because $n > n^*$ and $\bar r \le R$, we have $\mathrm{rate}(\tau) = m \ge \frac{n\lambda}{\bar r} > \frac{n^* \lambda}{R} \ge B\lambda$. By the moment generating function of the exponential distribution in Fact 2.10,

Now, consider the distribution of $t(\tilde G)$ after revealing $\tau$. To decrease the size from $n > n^*$ to below the threshold $n^*$, at least one hyperedge needs to be contracted; so $t(\tilde G) \ge \tau$. Moreover, by the memoryless property of the exponential distribution in Fact 2.11, $\tau$ and $t(\tilde G) - \tau$ are independent. If we further reveal the information that the hyperedge arriving earliest is $e_i$, then the random process after the earliest arrival is equivalent to running the contraction process starting from $\tilde G/e_i$. Thus, $t(\tilde G) - \tau$ follows the same distribution as $t(\tilde G/e_i)$ conditioned on $e_i$ arriving first. Because the hyperedges follow i.i.d. exponential distributions, they are equally likely to arrive earliest. Therefore, unconditionally, $t(\tilde G) - \tau$ follows the distribution of $t(\tilde G/e_i)$ for each $e_i$ with probability $\frac{1}{m}$. Formally,

We now apply the inductive hypothesis to bound the term $\mathbb{E}\big[e^{B\lambda \cdot t(\tilde G/e_i)}\big]$ for each $e_i$ separately. Note that the min-cut value $\lambda_i$ in $\tilde G/e_i$ is at least $\lambda$ because a contraction cannot decrease the min-cut value.
Next, we prove that

For the second term, we have

For the first term, define $x = \frac{r(e_i) - 1}{n}$ in terms of $r(e_i)$, which satisfies

Note that the LHS is convex on $(0, 1)$ when $A \ln N \ge 2$, and the RHS is linear. So we only need to prove (5) for the two endpoints $x = \frac{1}{n}$ and $x = \frac{R-1}{n}$. For $x = \frac{1}{n}$, we have

Here, the second inequality is by $e^{-x} \le 1 - \frac{x}{2}$ for $x \in [0, 1]$, using $A \ln N / n \le 1$, which holds when $N$ is larger than some constant (depending on $A$). The last inequality holds when $\ln N \ge 4$.
Next, we consider the second endpoint $x = \frac{R-1}{n}$.

Here, the last inequality is equivalent to $1 - \frac{1}{A} + \frac{1}{n} \le e^{-1/A}$, which holds when $n > 3A^2$ and $A > 1$. Now, we have

Proof of Lemma 3.8. We color each recursive call black or red. Intuitively, these represent a "success" or a "failure" of the recursive call, respectively. A call is black if it decreases the size by a constant factor, more precisely when $|V(H)| \le 0.8n$. Otherwise, the call is red.
The recursion tree has the following properties:

1. Each subproblem makes at most $M = m^{O(1)}$ recursive calls. This is guaranteed by the algorithm.
2. The algorithm reaches a base case after $O(\log n)$ black recursive calls. This is because each black recursive call decreases $n$ by a constant factor, and $O(\log n)$ black calls reduce $n$ to a constant, which is a base case.
3. For each subproblem, the expected number of red recursive calls is at most $1/2$. This is because in random contraction, we make $O(n^2 q^{-\lambda})$ recursive calls, and each call fails (to get a size decrease) with probability at most $n^{2.7} q^{1.5\lambda}$ by Corollary 3.10. The expected number of red calls is their product, which is upper bounded by $n^{4.7} q^{0.5\lambda} = o(1)$ when $q^\lambda = n^{-10}$.

Lemma 3.12 below shows that these properties give an upper bound of $m^{O(\log n)}$ on the number of recursive calls. If we charge the time of sampling a random contracted hypergraph to the subproblem on the contracted hypergraph, then each subproblem spends $O(nm)$ time outside the recursive calls. Therefore, the overall expected running time is $m^{O(\log n)}$. This concludes the proof of Lemma 3.8.

Lemma 3.12. Suppose in a randomly growing tree, each node $u$ is either a leaf, or has $M(u) \le M$ children, where $M = \Omega(n)$ is a parameter. Each edge from $u$ to its children is colored red with probability $f(u)$ such that $M(u) f(u) \le \theta = \frac{1}{2}$, and black otherwise. The different children of a parent node are independent (including independence between the parent-child edges); also, a subtree is independent of everything outside the subtree. Moreover, on any path from the root to a leaf, there can be at most $L$ black edges. Then, the expected number of nodes in the tree is at most $M^{O(L)}$.
Proof. We say a node $w$ is a red descendant of node $u$ if $u$ is an ancestor of $w$, and the path from $u$ to $w$ is formed by red edges only. See Figure 1 for an illustration.
Let $K$ be the number of red descendants of some node $u$. Let $K_i$ be the number of red descendants of $u$ that are $i$ steps deeper than $u$, so that $K = \sum_{i \ge 1} K_i$. Inductively, suppose at level $i$ there are $K_i$ red descendants $\{u_1, \ldots, u_{K_i}\}$. By definition, the red descendants at level $i+1$ must be children of red descendants at level $i$. Each $u_j$ generates at most $\theta$ red edges in expectation, so $\mathbb{E}[K_{i+1}] \le \theta \cdot \mathbb{E}[K_i]$, and hence $\mathbb{E}[K_i] \le \theta^i$. The series $\sum_{i \ge 1} \theta^i = \frac{\theta}{1-\theta} \le 1$ converges and the $K_i$'s are nonnegative, so we can apply Fubini's theorem to get $\mathbb{E}[K] \le 1$.

Let $V_k$ be the set of nodes that have $k$ black edges on their path from the root. We prove by induction that $\mathbb{E}[|V_k|] \le 2(2M)^k$. The base case for $k = 0$ counts the expected number of red descendants of the root, as well as the root itself, which is at most $1 + 1 = 2 = 2(2M)^0$. Next consider the inductive step. For any node $w \in V_{k+1}$, let $(u, v)$ be the black edge closest to $w$ on the path from the root to $w$. Then $u \in V_k$, and $w$ is either a red descendant of $v$ or $v$ itself. Each node $u$ in $V_k$ generates at most $M(u) \le M$ children $v$, and each $v$ generates at most 1 red descendant in expectation. Therefore, $\mathbb{E}[|V_{k+1}|] \le \mathbb{E}[|V_k|] \cdot M \cdot 2 \le 2(2M)^{k+1}$.

Random Contraction with DNF Sampling
In this section, we strengthen the previous algorithm to obtain a running time of $m \cdot n^{O(\log n \cdot \log \log \frac{1}{\delta})}$, at the cost of an additive error of $\delta$; i.e., the output estimator is within $(1 \pm \varepsilon) u_G(p) \pm \delta$ w.h.p. (Theorem 1.2). When $\delta = 2^{-\mathrm{poly}(n)}$, the running time of the algorithm is $m \cdot n^{O(\log^2 n)}$. We show that the algorithm outputs an estimator of $u_G(p)$ with bias at most $\delta$ and $\delta$-capped relative variance $O(1)$. Theorem 1.2 then follows by Lemma 2.7.

Algorithm Description
The algorithm is recursive. We start by defining the simple base cases. Then, we introduce the definition of large hyperedges, which characterizes the last base case. Finally, we define the recursive cases.

Base Cases
There are four base cases, three of which are the following:

1. When the number of vertices $n$ is a constant, we enumerate all possible outcomes by brute force, which is identical to Lemma 3.1.

2. When the hypergraph is already disconnected, output 1.

3. When $p^\lambda < 2^{-3N}$, output 0.

These three base cases are deterministic. The algorithms for the first two base cases return the exact value of $u_G(p)$. The third case has an additive bias of $u_G(p)$, which is at most $n^2 p^\lambda$ by Lemma 3.4. We assumed $N \ge \log^2 n$ and $p^\lambda < 2^{-3N}$, so

The fourth base case is called full revelation; it will be described later in the section.
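The constant-size brute-force base case can be sketched as follows; this is a minimal illustration of exhaustive enumeration under my own naming, not the paper's Lemma 3.1 verbatim. It enumerates all $2^m$ failure patterns and sums the probability mass of the disconnecting ones.

```python
from itertools import product

def unreliability_bruteforce(n, hyperedges, p):
    """Exact u_G(p) for constant-size hypergraphs: enumerate all 2^m
    failure patterns, merge the vertices of each surviving hyperedge,
    and sum the probability of every disconnected outcome."""
    total = 0.0
    for pattern in product([True, False], repeat=len(hyperedges)):
        parent = list(range(n))

        def find(x):  # union-find root with path halving
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        prob = 1.0
        for e, failed in zip(hyperedges, pattern):
            prob *= p if failed else (1.0 - p)
            if not failed:  # a surviving hyperedge merges its vertices
                roots = {find(v) for v in e}
                r = roots.pop()
                for s in roots:
                    parent[s] = r
        if len({find(v) for v in range(n)}) > 1:  # disconnected outcome
            total += prob
    return total
```

For example, on a triangle graph (rank-2 hyperedges), the hypergraph disconnects exactly when at least two edges fail, so $u_G(p) = 3p^2(1-p) + p^3$.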

Large and Small Hyperedges
Before proceeding with the formal description of the remaining algorithm, let us provide some intuition.
Recall from Algorithm 1 that the large hyperedges are the bottleneck in the random contraction algorithm: we branch $m$ times to halve the number of vertices, which leads to the $m^{O(\log n)}$ running time. However, if we ignore the small hyperedges and consider the hypergraph with large hyperedges only, it turns out that the structure of cuts becomes much simpler. This motivates us to partition the set of hyperedges $E$ into two sets, $E_{\mathrm{large}}$ and $E_{\mathrm{small}}$. Intuitively, these are the sets of hyperedges of large and small rank respectively, but for technical reasons, the precise definition needs to be more nuanced.
We now formally define the set $E_{\mathrm{large}}$. It depends on phase nodes in the recursive computation tree, which we define first. Initially, the root of the computation tree is a phase node. For any non-root node $w$ of the computation tree, let $u$ be the closest ancestor of $w$ that is a phase node (which exists because the root is a phase node). $w$ is a phase node if and only if the numbers of vertices $n_u$ in $u$ and $n_w$ in $w$ satisfy $n_w \le 0.8 n_u$. If $w$ is not a phase node, such a $u$ is called the phase ancestor of $w$. Define a phase with phase node $u$ to be all computation nodes whose phase ancestor is $u$, as well as $u$ itself. See Figure 2 for an illustration.
Given the definition of phase nodes, we define $E_{\mathrm{large}}$ as follows. In a phase node, $E_{\mathrm{large}}$ is the set of all hyperedges of rank $> n/2$. In a non-phase node $w$, $E_{\mathrm{large}}$ is inherited from its phase ancestor $u$, i.e., it is the set of all hyperedges of rank $> n_u/2$ in $u$. Let $G_{\mathrm{large}}$ denote the hypergraph $(V(G), E_{\mathrm{large}})$.
We let $E_{\mathrm{small}}$ be the complement set of hyperedges $E \setminus E_{\mathrm{large}}$, and $G_{\mathrm{small}} = (V(G), E_{\mathrm{small}})$.

The Last Base Case: Full Revelation
We are now ready to describe the last base case, which we call full revelation. Let $\beta = \lambda - \lambda_L$, where $\lambda_L$ is the min-cut value in $G_{\mathrm{large}}$. The last base case is invoked when $\beta < \lambda/N$.
The algorithm samples a random subgraph $H \sim G(p)$ conditioned on the event that the contracted hyperedges in $E_{\mathrm{large}}$ do not contract the whole hypergraph into a singleton. This is done in two steps. First, we write a DNF formula for the disconnection event in $G_{\mathrm{large}}$, and apply Lemma 2.17 to contract each hyperedge in $E_{\mathrm{large}}$ with probability $1 - p$, conditioned on the event that $G_{\mathrm{large}}$ is not contracted into a single vertex. Second, we directly sample the remaining uncontracted hyperedges in $E_{\mathrm{small}}$; that is, we independently contract each of those hyperedges with probability $1 - p$. The resulting hypergraph $H$ follows the desired distribution.
The algorithm repeats $8n^2$ independent samples of the above process to obtain samples $H_i$, and sets $X_i = 0$ if $H_i$ is contracted into a singleton, and $X_i = 1$ otherwise. Let $X$ be the average of all these estimators $X_i$. Next, we use the DNF counting algorithm in Lemma 2.16 to get an unbiased estimator $Z$ of $u_{G_{\mathrm{large}}}(p)$. The product $XZ$ is the estimator of $u_G(p)$ output by the algorithm.
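Lemma 2.16 is referenced but not restated here. As a hedged illustration, the classical Karp-Luby-Madras importance sampler is one standard way to obtain such an unbiased DNF estimator under a product measure; the clause encoding (a dict from variable index to its required truth value) and the function name below are my own assumptions, not the paper's notation.

```python
import random

def dnf_prob_estimate(clauses, nvars, p, samples, rng):
    """Karp-Luby-Madras style unbiased estimator of Pr[D] for a DNF D
    whose variables are independently True with probability p.
    Each clause is a dict {var: required_bool}."""
    # weight of clause i = probability that clause i alone is satisfied
    w = []
    for c in clauses:
        wi = 1.0
        for var, val in c.items():
            wi *= p if val else (1.0 - p)
        w.append(wi)
    U = sum(w)
    if U == 0.0:
        return 0.0
    acc = 0.0
    for _ in range(samples):
        # pick a clause proportionally to its weight
        i = rng.choices(range(len(clauses)), weights=w)[0]
        # sample an assignment conditioned on clause i being satisfied
        x = {v: (rng.random() < p) for v in range(nvars)}
        x.update(clauses[i])
        # count how many clauses x satisfies (>= 1 by construction)
        cnt = sum(all(x[v] == val for v, val in c.items()) for c in clauses)
        acc += U / cnt  # unbiased: E[U / cnt] = Pr[D]
    return acc / samples
```

Unbiasedness follows because an assignment $x$ satisfying $c(x)$ clauses is sampled with probability proportional to $c(x)\mu(x)/U$, so $\mathbb{E}[U/c(x)] = \Pr[D]$.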
We are now left to describe the recursive step of the algorithm. For this purpose, we first need to establish some properties of large hyperedges:

Properties of Large Edges
The first property is that the association of $E_{\mathrm{large}}$ with large ranks and $E_{\mathrm{small}}$ with small ranks is approximately correct:

Fact 4.1. Any hyperedge in $E_{\mathrm{large}}$ has rank at least $0.3n$, and any hyperedge in $E_{\mathrm{small}}$ has rank at most $0.7n$.
Proof. Let $w$ be the current computation node. If $w$ is a phase node, then the fact holds by definition of $E_{\mathrm{large}}$. Otherwise, let $u$ be the phase ancestor of $w$. For any hyperedge $e \in E_{\mathrm{large}}(w)$, we have $r_w(e) \ge r_u(e) - (n_u - n_w) \ge 0.5 n_u - 0.2 n_u = 0.3 n_u \ge 0.3 n_w$.
The algorithm only contracts non-trivial hyperedges during recursion, i.e., hyperedges that are contracted into a singleton supervertex are removed. We assert that all large hyperedges remain candidates for contraction:

Fact 4.2. Suppose $w$ is a non-phase node and $E_{\mathrm{large}}(w)$ is inherited from the phase ancestor $u$ of $w$. Then, every hyperedge in $E_{\mathrm{large}}(u)$ still appears in $E_{\mathrm{large}}(w)$, though it may be partially contracted.
Proof. If there were a hyperedge $e$ in $E_{\mathrm{large}}(u) \setminus E_{\mathrm{large}}(w)$, then that hyperedge is contracted to a single vertex at node $w$. But then $n_w \le n_u - r_u(e) + 1 \le n_u/2$, so $w$ is a phase node by definition, a contradiction.
Finally, we come to the most important property: the simple structure of cuts in $G_{\mathrm{large}}$. To explain this, consider $\min_u d_{\mathrm{large}}(u)$, the minimum degree of a vertex in $G_{\mathrm{large}}$. Intuitively, $\beta$ is used to control the number of small hyperedges in each degree cut, which measures the speed of random contraction when no large hyperedges get contracted. Note that $0 \le \beta \le \lambda$. Ideally, we want to decrease $\beta$ to as small as $\lambda/N$, which reduces to the full revelation case. However, $\beta$ can be non-monotone, as both $\lambda$ and $\lambda_L$ can increase because of contractions during recursion. So, we define another parameter $\gamma$ that can be related to $\beta$ to bound the depth of recursion in a phase. Let $\gamma = \ell - \lambda_L$, where $\ell = |E_{\mathrm{large}}|$ is the number of large hyperedges. We show that unlike $\beta$, $\gamma$ is monotone in a phase:

Lemma 4.7. Suppose $v, w$ are nodes in the same phase. If $w$ is a descendant of $v$, then $\gamma_v \ge \gamma_w$.
Proof. Since $E_{\mathrm{large}}$ is the same set of hyperedges within a phase by Fact 4.2, we have $\ell_v = \ell_w$. Now, since $w$ is a descendant of $v$, the hypergraph $G_{\mathrm{large}}(w)$ is formed by contracting some set of hyperedges in $G_{\mathrm{large}}(v)$. Hyperedge contractions cannot decrease the value of the minimum cut; hence, $\lambda_L(w) \ge \lambda_L(v)$. The lemma follows.
Algorithm for partial revelation. The algorithm runs random contraction at a more aggressive rate, with $q^\beta = n^{-700}$. This is done in two steps. First, we write a DNF formula for the disconnection event in $G_{\mathrm{large}}$, and apply Lemma 4.6 to contract each hyperedge in $E_{\mathrm{large}}$ with probability $1 - q$, conditioned on $G_{\mathrm{large}}$ not being contracted into a singleton. Second, we independently contract each uncontracted hyperedge in $E_{\mathrm{small}}$ with probability $1 - q$. The resulting hypergraph $H$ follows the distribution of $H \sim G(q)$ conditioned on the event that the contracted hyperedges in $E_{\mathrm{large}}$ do not contract the whole hypergraph into a singleton.
The algorithm repeats $32 n^{704}$ independent samples $H_i$, and recursively computes a (biased) estimator $X_i$ of $u_{H_i}(p/q)$. Let $X$ be the average of all these estimators $X_i$. Next, we use the DNF counting algorithm in Lemma 4.6 to get an unbiased estimator $Z$ of $u_{G_{\mathrm{large}}}(q)$. The product $XZ$ is the estimator of $u_G(p)$ output by the algorithm.

Bias of the Estimator
We first show that all base cases have bias at most $\delta$, and that the recursive steps are unbiased. Then, we prove by induction that the recursion maintains the same bound $\delta$ on the bias.
We introduce some notation used when $E_{\mathrm{large}}$ and $E_{\mathrm{small}}$ are uniquely defined in context. Let $G(p_1, p_2)$ be the random subgraph formed by independently contracting each hyperedge in $E_{\mathrm{large}}$ with probability $1 - p_1$, and each hyperedge in $E_{\mathrm{small}}$ with probability $1 - p_2$. Let $D_L$ be the event that, in some random contraction, the contracted hyperedges in $E_{\mathrm{large}}$ do not contract the whole hypergraph into a singleton.
Base cases. The first base case of $n = O(1)$ outputs the exact value of $u_G(p)$ by Lemma 3.1. The second base case of a disconnected $G$ is trivial. In the third base case, the bias is $0 - u_G(p) \in [-\delta, 0]$. Next, we prove that the algorithm in the full revelation case is unbiased.
Lemma 4.8. The algorithm in the full revelation case outputs an unbiased estimator of $u_G(p)$.
Proof. The algorithm first samples $H_L \sim G(p, 0) \mid D_L$, then forms $H_S$ by deleting all hyperedges in $E_{\mathrm{large}}$ from $H_L$, and finally samples $H \sim H_S(p)$. In other words, the subgraph $H_S$ is sampled by either contracting or deleting every large hyperedge, and then $H$ is sampled from $H_S$ by randomly contracting small hyperedges. Let $a$ be the number of contracted hyperedges in $E_{\mathrm{large}}$. Then, $H_S$ is sampled with probability $p^a (1-p)^{\ell - a} / u_L$, where $u_L = u_{G_{\mathrm{large}}}(p)$.
The step estimates $u_G(p)/u_L$ by $X_H$, where $X_H = 0$ if $H$ is contracted into a singleton, and $X_H = 1$ otherwise. The sum is over all $H_S$ sampled by either contracting or deleting every large hyperedge, conditioned on $H_S$ not being contracted into a singleton by the large hyperedges. The condition can be dropped because whenever it fails, we have $u_{H_S}(p) = 0$. Then the sum is $u_G(p)/u_L$.
The algorithm outputs the product $XZ$. $X$ is an average of copies of $X_H$, so its expectation is $u_G(p)/u_L$. $Z$ has expectation $u_L$. So the total expectation is $u_G(p)$.
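In symbols, with $u_L = u_{G_{\mathrm{large}}}(p)$, the argument above can be summarized as follows; this display is a reconstruction consistent with the surrounding proof, not the paper's exact equation.

```latex
\mathbb{E}[X_H]
  = \sum_{H_S} \frac{\Pr[H_S]}{u_L}\, u_{H_S}(p)
  = \frac{u_G(p)}{u_L},
\qquad
\mathbb{E}[XZ]
  = \mathbb{E}[X]\,\mathbb{E}[Z]
  = \frac{u_G(p)}{u_L}\cdot u_L
  = u_G(p),
```

where the first identity uses that revealing the large hyperedges and then the small hyperedges, each at rate $p$, is exactly a sample from $G(p)$, so $\sum_{H_S} \Pr[H_S]\, u_{H_S}(p) = u_G(p)$, and the second uses the independence of $X$ and $Z$.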
Recursive cases. A random contraction step in the universally small case is unbiased by Lemma 2.2. So, we only need to show this for the partial revelation case. We do this in two steps. First, we assume that the inductive subproblems in this case return exact estimators, and show that the resulting estimator after this step is unbiased. Then, we use this fact to show that if the inductive subproblems return biased estimators, then the bias does not increase after the partial revelation step.
Define a partial revelation step to be a step of the algorithm in the partial revelation case, except that we now directly use $u_{G_{\mathrm{large}}}(q)$ times the average of the $u_{H_i}(p/q)$ as the estimator, instead of estimating them recursively.

Lemma 4.9. A partial revelation step gives an unbiased estimator of $u_G(p)$.
Proof. A partial revelation step can be expressed as $H_L \sim G(q, 0) \mid D_L$, and $H \sim H_L(0, q)$. Here, the sampling is viewed as two steps. First, the algorithm samples a subgraph $H_L$ by contracting each large hyperedge with probability $q$, and rejects samples that are contracted into a singleton. The probability of keeping a sampled $H_L$ is $u_L = u_{G_{\mathrm{large}}}(q)$, so the probability mass of each $H_L$ is multiplied by $\frac{1}{u_L}$ compared to the distribution of contracting each large hyperedge with probability $q$. Second, the algorithm samples $H$ from $H_L$ by contracting each small hyperedge with probability $q$.
Finally, a partial revelation step estimates $u_G(p)$ by $u_L \cdot u_H(p/q)$. We have

The third equality drops the indicator function $\mathbf{1}[H_L \text{ disconnects}]$ because $H_L$ being connected implies $u_H(p/q) = 0$. The last step follows from Lemma 2.2.
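The omitted chain of equalities can be reconstructed as follows, consistent with the surrounding proof (with $u_L = u_{G_{\mathrm{large}}}(q)$); this is a reconstruction, not the paper's exact display.

```latex
\mathbb{E}\big[u_L \cdot u_H(p/q)\big]
  = u_L \sum_{H_L} \frac{\Pr[H_L]\,\mathbf{1}[H_L \text{ disconnects}]}{u_L}\,
      \mathbb{E}_{H \sim H_L(0,q)}\big[u_H(p/q)\big]
  = \sum_{H_L} \Pr[H_L]\,\mathbb{E}_{H \sim H_L(0,q)}\big[u_H(p/q)\big]
  = \mathbb{E}_{H \sim G(q)}\big[u_H(p/q)\big]
  = u_G(p).
```

The indicator is dropped because the corresponding terms vanish, and the last equality is Lemma 2.2.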
We now prove the inductive claim on the bias of the estimator.
Lemma 4.10.The algorithm outputs an estimator with negatively one-sided bias of at most δ.
Proof. We proceed by induction. In the base case of $p^\lambda < 2^{-3N}$, the output is 0; so, the bias is negatively one-sided and upper bounded by

The other base cases are unbiased.
In a random contraction step of the universally small case, we take the average $X = \frac{1}{M} \sum_{i \le M} X_i$. By the inductive hypothesis, each $X_i$ has negatively one-sided bias of at most $\delta$.

In the partial revelation case, $Z$ is the DNF sampling estimator of $u_L = u_{G_{\mathrm{large}}}(q)$, which is unbiased and independent of $X$. Next, we bound the bias of $X$ compared to $u_G(p)/u_L$. We take the average $X = \frac{1}{M} \sum_{i \le M} X_i$, and each $X_i$ is an estimator of $u_{H_i}(p/q)$ with bias in $[-\delta, 0]$ by the inductive hypothesis. Note that here $H_i$ is sampled from a different distribution, where

After scaling by $\mathbb{E}[Z] = u_L \le 1$, the overall bias of the partial revelation case is at most $\delta$.

Capped Relative Variance of the Estimator
We first show that the recursive calls do not introduce relative variance. Then we show that the full revelation base case outputs an unbiased estimator with bounded relative variance. However, because the base case when $p^\lambda < 2^{-3N}$ is biased, we need to control the $\delta$-capped relative variance instead of the relative variance. We conclude by bounding the capped relative variance of the whole recursion in Lemma 4.14.
For universally small hypergraphs, we have shown in Lemma 3.5 that a random contraction step has relative variance at most $n^2 q^{-\lambda} = n^{12}$.

Fact 4.11. Suppose $X$ is a nonnegative random variable that takes value 0 outside a subspace $D$, and the measure of $D$ is $u_D$. If we do rejection sampling to sample $X'$ that only accepts samples in $D$, then

The base case when $p^\lambda < 2^{-3N}$ outputs $X = 0$, whose $\delta$-capped relative variance is $0/\delta^2 = 0$. The other base cases are unbiased estimators of $u_G(p)$ with relative variance at most 3 by Lemmas 3.1 and 4.13.
In the inductive step of random contraction, the first-level estimator $u_{H_i}(p/q)$ has relative variance at most $n^2 q^{-\lambda} - 1 = n^{12} - 1$ by Lemma 3.5. The recursive estimator $X_i$ for $u_{H_i}(p/q)$ has negatively one-sided bias $\le \delta$ by Lemma 4.10, and $\eta_\delta[X_i \mid H_i] \le 3$ by the inductive hypothesis. By Fact 2.5, (unconditionally) $\eta_\delta[X_i] \le 4 \cdot n^{12} \cdot (3 + 1) = 16 n^{12}$. The algorithm takes the average of $16 n^{12}$ estimators, so the overall estimator has capped relative variance $\le 1$ by Fact 2.4.
In the inductive step of the partial revelation case, the first-level estimator $u_{H_i}(p/q)$ has relative variance $n^4 q^{-\beta}$ by Lemma 4.12. The recursive estimator $X_i$ for $u_{H_i}(p/q)$ has negatively one-sided bias $\le \delta$ by Lemma 4.10, and $\eta_\delta[X_i \mid H_i] \le 3$ by the inductive hypothesis. By Fact 2.5, (unconditionally) $\eta_\delta[X_i] \le 16(n^4 q^{-\beta} + 1) \le 32 n^{704}$. The algorithm takes the average of $32 n^{704}$ estimators to obtain $X$, so $\eta_\delta[X] \le 1$ by Fact 2.4. Finally, the algorithm multiplies $X$ by $Z$, which is an unbiased estimator of $u_{G_{\mathrm{large}}}(q)$ with relative variance $\le 1$ by Lemma 4.6 and is independent of $X$. So, the product $XZ$ has $\delta$-capped relative variance $\le 3$ by Fact 2.6.

Running Time
The argument is similar to Section 3.3. We color each recursive call black or red; intuitively, these represent a "success" or a "failure", respectively.
For both the universally small and partial revelation cases, if the child subproblem is a phase node, then the recursive call is marked a success (i.e., a black node). This is the only type of success for the universally small case, which we call a type 1 success. For the partial revelation case, we have an additional situation where we declare a type 2 success: when the parameter $\gamma$ decreases to $0.9\gamma$ and $|\ell - \lambda| \le 0.1\beta$. Now, the recursion tree satisfies the following properties:

1. Each subproblem makes $n^{O(1)}$ recursive calls. This is clear from the algorithm description.
2. The algorithm reaches a base case after $O(\log n \cdot \log N)$ black recursive calls (interleaved with red recursive calls). We prove this in Lemma 4.15.
3. At each subproblem, the expected number of red recursive calls is $o(1)$. We prove this later in the section.
Lemma 3.12 shows that these properties give an upper bound of $n^{O(\log n \cdot \log N)}$ on the number of recursive calls. If we charge the time of DNF sampling and random contraction to the subproblem on the contracted hypergraph, then each subproblem spends $O(n^2 m)$ time outside the recursive calls, where the bottlenecks are DNF sampling and DNF probability estimation given by Lemma 4.6. Therefore, the overall expected running time is $m \cdot n^{O(\log n \cdot \log N)}$.
Lemma 4.15. There can be at most $O(\log n \cdot \log N)$ black recursive calls from the root to a base case.
Proof. There are at most $O(\log n)$ phases from the root to a base case, because each phase node decreases $n$ to at most $0.8n$ compared to the previous phase node. Within each phase, $\gamma$ is non-increasing by Lemma 4.7, and each type 2 success decreases $\gamma$ to at most $0.9\gamma$. Note that whenever a type 2 success happens, we are in the case $|\gamma - \beta| = |\ell - \lambda| \le 0.1\beta$. Initially, $\beta \le \lambda$, and the algorithm reaches a base case when $\beta < \lambda/N$; so, there can be at most $O(\log N)$ recursive calls with a type 2 success on any branch of the recursion tree within a phase.
For failure event 1, no large hyperedge gets contracted. This happens with probability at most $q^\ell / u_L \le q^{\ell - \lambda_L}$.
We now bound failure event 3, conditioned on failure event 1 occurring. Here, it suffices to upper bound the probability that some vertex $s \in G_w$ is incident to fewer than $\lambda - 0.7\beta$ hyperedges in $E_{\mathrm{large}}$, given that no large hyperedge is contracted. The hypergraph under random contraction is now $G_{\mathrm{small}}$, and this random contraction is independent of the previous event on $E_{\mathrm{large}}$.
Consider the exponential contraction process on $G_{\mathrm{small}}$. Replace each hyperedge with two copies (these copies are "half-hyperedges" with survival probability $\sqrt{q}$ instead of $q$). Assign a head to each hyperedge, in such a way that the two copies of a hyperedge have different heads. The orientation may change during the process after each hyperedge arrives. The orientation is consistent, meaning that for any fixed subgraph we always choose the same orientation. Beyond this requirement, the orientation is arbitrary. We assign a representative to each contracted supervertex during the process. Initially, each vertex is its own representative. When a hyperedge $e$ is contracted, the representative of the head of $e$ becomes the representative of the new contracted supervertex. Define the critical hyperedges for a supervertex $\bar s$ to be the hyperedges that contain $\bar s$ as a tail. For failure event 3 to happen, there must exist a supervertex $\bar s$ in $w$ that is incident to fewer than $\lambda - 0.7\beta$ hyperedges in $E_{\mathrm{large}}$. Let $s$ be the representative of $\bar s$, so $s$ is a vertex in node $v$. By extension, we also denote by $\bar s$ the supervertices that contain $s$ as a representative throughout the exponential contraction process. (Note that initially $\bar s = s$.) Since we do not contract any large hyperedge, the number of large hyperedges that contain $\bar s$ can only increase over time, but by assumption, $\bar s$ is incident to fewer than $\lambda - 0.7\beta$ large hyperedges at the end. So throughout the process, $\bar s$ is always incident to fewer than $\lambda - 0.7\beta$ hyperedges in $E_{\mathrm{large}}$. Because the degree cut of $\bar s$ has value at least $\lambda$, $\bar s$ is always incident to at least $0.7\beta$ hyperedges in $E_{\mathrm{small}}$. Then, it always has at least $0.7\beta$ critical (half-)hyperedges after duplication.
By Lemma 3.7, the probability that such a vertex $s$ survives as a representative until time $\ln \frac{1}{\sqrt{q}}$ is at most $(\sqrt{q})^{0.7\beta} = q^{0.35\beta}$. By a union bound, the probability of failure event 3 (conditioned on failure event 1) is at most $n q^{0.35\beta}$. The total failure probability is $q^{0.9\beta} \cdot n q^{0.35\beta} = n q^{1.25\beta}$.

Conclusion
In this paper, we initiated the study of unreliability in hypergraphs and provided quasi-polynomial time approximation schemes for the problem. The immediate open question is whether there is a PTAS (or even an FPTAS) for this problem. More generally, we hope that our work will inspire further exploration of the rich space of reliability problems in hypergraphs. For instance, a natural complementary question to network unreliability is that of network reliability, i.e., estimating the probability that a network stays connected under independent random failures of the (hyper)edges.
Recall that all random variables discussed in the paper are non-negative.This will be implicitly used in the following proofs.
Proof of Fact 2.4. For i.i.d. samples, the denominator $\max\{(\mathbb{E}[X])^2, \delta^2\}$ is identical, while the numerator $\mathrm{Var}[X]$ is divided by $M$ after taking the average. So, the capped relative variance is divided by $M$.
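As a tiny numeric illustration of Fact 2.4 (the function name and test distribution below are my own assumptions), one can compute the variance of the average of $M$ i.i.d. copies of a small discrete random variable exactly and check that the relative variance drops by a factor of $M$:

```python
from itertools import product

def mean_and_var_of_average(values, probs, M):
    """Exact E and Var of the average of M i.i.d. copies of a discrete
    random variable, computed by enumerating all M-tuples of outcomes."""
    mean = second = 0.0
    for outcomes in product(range(len(values)), repeat=M):
        pr = 1.0
        avg = 0.0
        for i in outcomes:
            pr *= probs[i]
            avg += values[i] / M
        mean += pr * avg
        second += pr * avg * avg
    return mean, second - mean * mean
```

For a Bernoulli(0.25) variable, a single copy has variance $0.25 \cdot 0.75 = 0.1875$, and the average of $M = 3$ copies has variance $0.1875/3 = 0.0625$, so the (capped) relative variance is divided by exactly 3.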
Proof of Fact 2.5. The special case is given by Fact A.1. Next, we consider the general case $\delta > 0$. We start with

This implies $\mu + \delta \ge x$ and $\max\{\mu, \delta\} \ge \frac{x}{2}$. We can bound $\eta_\delta[Z]$ by

Notice that

Thus,

Proof of Fact 2.6. Under the assumptions,

So we can apply Lemma A.2 to get a $(1+\varepsilon)$-approximation of $X + \delta$, which is a $(1+\varepsilon, \delta)$-approximation of $X$ after subtracting $\delta$.
Proof of Fact 2.8. Suppose $X = \sum_{i \le k} \alpha_i X_i$ and $\mathbb{E}[X_i] = \mu_i$. Then

Proof of Fact 2.9.
We prove Lemmas 2.16 and 2.17 together.

Proof of Lemmas 2.16 and 2.17. For each clause $C_i$, suppose it has $a_i$ positive literals and $b_i$ negative literals. Then the probability that $C_i$ is satisfied is $p^{a_i} (1-p)^{b_i}$. Denote this by $u_i$.
For any hypergraph $\tilde G$ formed by contraction from $G$, define $t(\tilde G)$ to be the following stopping time: in a contraction process starting at $\tilde G$, the vertex size of the contracted hypergraph decreases to at most $n^* = \lceil BR \rceil$. Suppose $|V(\tilde G)| \le N$, where $N = AR$. Let $\lambda$ be the min-cut value in $\tilde G$. Then, $\mathbb{E}\big[e^{B\lambda \cdot t(\tilde G)}\big] \le \max\{\cdots\}$.

Our proof is by induction on $n = |V(\tilde G)|$. As the base case, when $n \le n^*$, by definition $t(\tilde G) = 0$ and $e^{B\lambda \cdot t(\tilde G)} = 1$, so the statement holds. Next consider the inductive step, where $n > n^*$. Let $\bar r = \frac{1}{m} \sum_{e \in E(\tilde G)} r(e)$ be the average rank in $\tilde G$, where $m = |E(\tilde G)|$.

Figure 1: A depiction of a portion of the computation tree. The failed recursive calls are shown in dashed red, while the successful ones are shown in solid black. Lemma 3.12 analyzes the expected size of the recursion tree.

Figure 2: A depiction of phases in the computation tree. The filled-in nodes are phase nodes. The blue and green nodes respectively root the blue and green phases. Each phase can contain successful recursive steps, shown by solid black edges and black nodes, and failed recursive steps, shown by dashed red edges and red nodes. In a phase, every node has a phase ancestor which is the root node of the phase; for instance, $u$ is the phase ancestor of $v$ (and of every other node in the blue phase).

Proof of Lemma 2.13. The estimator $X$ follows a binomial distribution with parameter $p_D$, so $\mathbb{E}[X] = p_D$ and $\mathrm{Var}[X] = p_D(1 - p_D) \le p_D$. It follows that $\eta[X] = \frac{\mathrm{Var}[X]}{\mathbb{E}[X]^2} \le \frac{1}{p_D}$. For the capped relative variance, when $p_D \ge \delta$, $\eta_\delta[X] = \frac{\mathrm{Var}[X]}{p_D^2} \le \frac{1}{p_D}$; when $p_D < \delta$, $\eta_\delta[X] = \frac{\mathrm{Var}[X]}{\delta^2}$.

, CGZZ23]; the corresponding problem in hypergraphs remains open.

semester program on Data Structures and Optimization for Fast Algorithms at the institute. DP also wishes to acknowledge the support of Google Research, where he was a (part-time) visiting faculty researcher at the time of this research. RC and DP would also like to thank William He and Davidson Zhu for discussions at an early stage of this research.