Set Cover in the One-pass Edge-arrival Streaming Model

We study the Set Cover problem in the one-pass edge-arrival streaming model. In this model, the input stream consists of a sequence of tuples (S, u), indicating that element u is contained in set S. This setting captures the streaming Dominating Set problem and is more general and harder to solve than the Set Cover set-arrival setting, where entire sets with all their elements arrive in the stream one-by-one. We prove the following results (n is the size of the universe, m is the number of sets): A work by [Khanna, Konrad, ITCS'22] on streaming Dominating Set implies a one-pass Õ(√n)-approximation algorithm with space Õ(m) for edge-arrival Set Cover in adversarially ordered streams. We show that this space bound is best possible up to poly-log factors in that every α-approximation algorithm, for α = Ω(√n), requires space Ω(mn^2/α^4) in adversarially ordered streams, even if the algorithm is only required to output an α-approximation of the size of an optimal cover. As our main result, we give a one-pass Õ(√n)-approximation algorithm with space Õ(m/√n) for edge-arrival Set Cover in random order streams. This result together with the lower bound mentioned above establishes a strong separation between the adversarial and random order settings. Finally, in adversarial order streams, we show that non-trivial algorithms with space o(m) can be achieved at the expense of increased approximation factors Ω(√n), which is in contrast to the set-arrival setting, where space Õ(n) is enough for a Θ(√n)-approximation, and space Ω(n) is needed for an o(n/log n)-approximation. We give an α-approximation algorithm for one-pass edge-arrival Set Cover with space Õ(mn/α^2), for every α = Ω(√n).


INTRODUCTION
The advent of Big Data has fueled the need for algorithms that are able to process huge quantities of data while maintaining a memory that is much smaller than the size of the input. Data streaming algorithms fulfil this role and have received significant attention for more than two decades. A data streaming algorithm processes its input sequentially in a single pass (or few passes) and uses a memory of size sublinear in the input size. We are interested in how well fundamental problems can be solved in this model, in particular, how the space requirements of such algorithms depend on the desired solution quality, as well as on various other aspects of the model, such as the arrival order of the input data or the number of passes.
In this paper, we consider the Set Cover problem in the one-pass streaming model. In Set Cover, we are given a universe U of size n and a family S = {S_1, . . . , S_m} of m subsets S_i ⊆ U, i ∈ [m], of the universe U. The objective is to output a smallest subset T ⊆ S that covers the entire universe, i.e., such that ∪_{S ∈ T} S = U, together with a cover certificate c : U → T, indicating for each element u a set in T that covers/contains u. We will consider approximation algorithms for Set Cover. We say that a (streaming) algorithm is an α-approximation algorithm if it outputs a cover of size at most α times the size of a smallest set cover.
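As a concrete illustration of these definitions, the following toy instance (our own, purely illustrative; the names `is_cover`, `U`, `S` are not from the paper) checks a cover and its certificate:

```python
# Toy illustration of the Set Cover definitions above (hypothetical instance).
# A cover T ⊆ S must satisfy: the union of the chosen sets equals the universe U.

def is_cover(universe, family, chosen):
    """Check that the chosen sets cover the entire universe."""
    covered = set()
    for name in chosen:
        covered |= family[name]
    return covered == universe

U = {1, 2, 3, 4, 5}
S = {"S1": {1, 2, 3}, "S2": {3, 4}, "S3": {4, 5}, "S4": {1, 5}}

T = ["S1", "S3"]  # a cover of size 2
# A cover certificate maps each element to a set in T that contains it:
certificate = {1: "S1", 2: "S1", 3: "S1", 4: "S3", 5: "S3"}

assert is_cover(U, S, T)
assert all(u in S[c] for u, c in certificate.items())
```

Here {"S1", "S3"} is optimal, since no single set covers all of U.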
Set Cover in the Set-arrival Model. Set Cover in the so-called set-arrival model has been extensively studied in the literature [1,4,10,12,13,15,22]. In the set-arrival model, a streaming algorithm sees a sequence of the input sets in arbitrary order, where each set arrives together with all its elements. The one-pass setting in the set-arrival model is fully understood: For any α = o(√n), results by Assadi, Khanna, and Li [4] show that space Θ(mn/α) is necessary and sufficient. It is also known that a Θ(√n)-approximation can be computed with space Õ(n) [10,13]. Together, these results show that set-arrival Set Cover undergoes a phase transition at approximation factor α = Θ(√n), namely, Õ(n) space is sufficient for an O(√n)-approximation but Θ(mn/α) space is needed for α = o(√n). Furthermore, it is not hard to see that the lower bound for streaming Dominating Set by [19] also applies to set-arrival Set Cover and shows that space Ω(n) is necessary for any approximation factor o(n/log n).
Set Cover in the Edge-arrival Model. In this paper, we study Set Cover in the one-pass edge-arrival model. In this model, the input stream consists of a sequence of tuples (S, u), indicating that element u ∈ U is contained in set S. Bateni et al. [6] were the first to consider this model and gave a p-pass ((1 + ε) log n)-approximation streaming algorithm with space Õ(mn^{O(1/p)} + n). The edge-arrival setting also appeared in a work by Indyk et al. [16], who observed that their multi-pass streaming algorithm for fractional Set Cover can also be implemented in the edge-arrival setting. Furthermore, Khanna and Konrad [19] studied the Dominating Set problem in the graph streaming model, which can be seen as a special case of edge-arrival Set Cover with m = n sets. Their results imply the following algorithm:

Theorem 1 (KK-algorithm, implied by [19]). There is a one-pass Õ(√n)-approximation streaming algorithm for edge-arrival Set Cover in adversarial order streams with space Õ(m).
Since the edge-arrival setting is more general than the set-arrival setting, lower bounds for Set Cover in the set-arrival setting, in particular, the Ω(mn/α) space lower bound for α = o(√n) by Assadi et al. [4], also apply to the edge-arrival setting. Furthermore, since the Õ(mn/α)-space algorithm by Assadi et al. can also be implemented in the edge-arrival setting (see the Appendix of [19] for details), the edge-arrival setting is also completely understood for α = o(√n). The KK-algorithm together with the Ω(mn/α) space lower bound by Assadi et al. imply that, similar to the set-arrival setting, Set Cover in the edge-arrival setting also undergoes a phase transition at approximation factor α = Θ(√n).

Our Results
While the regime α = o(√n) is fully understood for edge-arrival Set Cover in the adversarial order setting, the space complexity for α = Ω(√n) is open. In our first result, we resolve the space complexity for α = Θ(√n) up to poly-logarithmic factors, showing that space Ω(m) is necessary, which renders the KK-algorithm (Theorem 1) best possible:

Theorem 2. Let α ≥ √n. Then any randomized α-approximation one-pass streaming algorithm for edge-arrival Set Cover in adversarial order streams requires Ω(mn^2/α^4) space, even if the algorithm is only required to output an α-approximation of the size of an optimal cover.
The fact that our lower bound holds even for algorithms that only output an approximation of the optimal set cover size is a substantial strength. Indeed, the Ω(mn/α) lower bound for α = o(√n) by Assadi, Khanna, and Li [4] crucially relies on the fact that algorithms output a cover certificate, and it is an open problem whether this requirement can be lifted.
Next, as our main result, we give a one-pass Õ(√n)-approximation streaming algorithm with space Õ(m/√n) for random order streams, i.e., streams where the arrival order of the tuples (S, u) is chosen uniformly at random. This result together with the lower bound stated in Theorem 2 establishes a strong separation between the adversarial and random order settings.
Theorem 3. There is a one-pass Õ(√n)-approximation streaming algorithm for edge-arrival Set Cover with space Õ(m/√n), when the input stream is in random order.

Table 1. Results on one-pass edge-arrival Set Cover (n = universe size, m = number of sets).

Approx.       Space                      Stream order    Ref.
Õ(√n)         Õ(m)                       adversarial     Theorem 1 ([19])
α = Ω(√n)     Ω(mn^2/α^4) (lower bound)  adversarial     Theorem 2
Õ(√n)         Õ(m/√n)                    random          Theorem 3
α = Ω(√n)     Õ(mn/α^2)                  adversarial     Theorem 4
Random order streams have received significant attention in the data streaming literature for a variety of problems, including matchings [2,7,17,18,20], ruling sets [3], frequency moments [8], and submodular maximization [14]. The random order model is considered to be a more realistic model than the worst-case (adversarial) order model since, in practice, data rarely arrives in the worst possible order. Regarding Set Cover, our paper is the first to study edge-arrival Set Cover in random order streams. In the set-arrival setting, it is known that the one-pass random order setting is not easier than the adversarial order setting for approximation factors α = o(√n) [4]. Last, returning to the adversarial order setting, we show that the regime α = Ω(√n) is non-trivial in the edge-arrival setting, which is in contrast to the set-arrival setting, where space Θ(n) is enough for a Θ(√n)-approximation and necessary for an o(n/log n)-approximation. We obtain the following result:

Theorem 4. Let α = Ω(√n). There is a one-pass randomized α-approximation streaming algorithm for edge-arrival Set Cover in adversarial order streams with space Õ(mn/α^2).

Table 1 summarizes all results known on the edge-arrival setting.

Techniques
We describe below the techniques behind our results. We describe the techniques behind our main result, our random order streaming algorithm, last, since this description builds upon the description of our adversarial order algorithm.
Lower Bound for Adversarial Order Streams. We first discuss our Ω(mn^2/α^4) space lower bound for α-approximation algorithms.
Our lower bound is proved in the one-way multi-party communication setting, where each of overall k parties holds a portion of the input instance. The parties communicate via messages in order, that is, the first party sends a message M_1 to the second, who in turn sends a message M_2 to the third. This continues until the last party k has received a message M_{k−1} and then outputs the result of the protocol. It is well-known that a lower bound on the length of the longest individual message implies a space lower bound for one-pass streaming algorithms.
We first point out that, in order to prove lower bounds above Θ(m) for an approximation factor α = Ω(√n), we require k = Ω(α^2/n) parties, since there is a simple deterministic k-party protocol with approximation factor 2√(kn) and maximum message length Õ(m) (omitted due to space restrictions). Indeed, our lower bound construction uses k = Θ(α^2 log^2 n/n) parties and is a reduction from the k-party version of the Set-Disjointness problem. In k-party Set-Disjointness, each party holds a subset of a universe of size N. The parties are guaranteed that either their sets are all pairwise disjoint or there is a unique element that appears in all sets. The goal is to decide between these two cases, and it is known that every protocol solving k-party Set-Disjointness requires at least one message of size Ω(N/k^2).
We work with a family of random sets A_1, . . . , A_N ⊆ [n] with small pairwise intersections (Lemma 1), where set A_j corresponds to element j of the Set-Disjointness universe. The implementation of this idea requires the last party to fork the execution of the algorithm in the reduction N times. In parallel run j, the set U \ A_j is added to the instance, which allows focusing on covering the elements A_j. See Section 3 for further details.
Algorithm for Adversarial Order Streams. We will now explain the ideas underlying our one-pass randomized α-approximation algorithm with space Õ(mn/α^2), for α = Ω(√n). Our algorithm constitutes an improvement over the KK-algorithm by Khanna and Konrad [19]. We will first discuss the KK-algorithm and then our improvements.
The key challenge in designing algorithms for edge-arrival Set Cover is the fact that sets may be spread out over the input stream, and algorithms cannot take decisions based on the entire content of a set. In the set-arrival setting, where entire sets arrive in the stream, algorithms can greedily add sets to the solution if they cover enough not-yet-covered elements. This strategy does not work here. The KK-algorithm provides a solution to this problem based on the use of uncovered-degree counters for the sets: Every tuple (S_i, u) arriving in the stream increments the uncovered-degree d(S_i) of S_i if u is not yet covered by the algorithm. Intuitively, we only want to add a set to the solution if it covers enough yet-uncovered elements. However, since this information is not available at any one moment, a probabilistic inclusion process is used instead: Whenever the uncovered-degree of a set reaches j · √n, for any integral j ≥ 1, the set is included in the solution with probability (2^j · √n)/n, and, if included, the set then covers all of its elements that arrive from this moment onward in the stream.
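To make the inclusion process concrete, the following is a minimal Python sketch of the counter-plus-sampling loop, under our own simplifying assumptions: a toy universe of size n = 16, an inclusion probability of min(1, 2^j · √n/n) at the j-th threshold, and no post-processing for elements that remain uncovered. It illustrates the idea; it is not the KK-algorithm as published.

```python
import random

SQRT_N = 4  # sqrt(n) for a toy universe of size n = 16 (assumed known upfront)

def kk_stream(stream, sqrt_n=SQRT_N, rng=random.Random(0)):
    """Sketch of the probabilistic inclusion process described above.

    stream: iterable of (set_name, element) tuples in arrival order.
    Whenever the uncovered-degree of a set reaches j * sqrt(n), the set is
    included with probability min(1, 2**j * sqrt(n) / n), where n = sqrt_n**2.
    An included set covers every element of it arriving from then on.
    """
    n = sqrt_n * sqrt_n
    deg = {}          # uncovered-degree counters d(S_i)
    solution = set()  # sets added to the cover
    covered = set()   # elements covered so far
    for s, u in stream:
        if u in covered:
            continue              # tuple towards a covered element: ignored
        if s in solution:
            covered.add(u)        # s covers u from this moment onward
            continue
        deg[s] = deg.get(s, 0) + 1
        if deg[s] % sqrt_n == 0:  # reached the next threshold j * sqrt(n)
            j = deg[s] // sqrt_n
            if rng.random() < min(1.0, (2 ** j) * sqrt_n / n):
                solution.add(s)
                covered.add(u)
    return solution, covered
```

For example, a set receiving many tuples towards uncovered elements is tested at degrees 4, 8, 12, . . . with geometrically growing probabilities, so it is included quickly once it demonstrates large uncovered-degree.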
The key point of the analysis of the KK-algorithm is to show that the probabilistic inclusion process does not add too many sets to the solution. To see this, denote by S_j ⊆ S the subset of level-j sets, which are those sets with uncovered-degree in [j√n, (j + 1)√n) at the end of the stream. Then, according to the inclusion rule, we expect to add at least |S_j| · (2^j √n)/n level-j sets to the solution. In order for not too many sets to be included, this can only work if the number of level-j sets decreases exponentially, and, indeed, it is shown in [19] that |S_j| = Õ(n/2^j), for every j, which implies that each level only contributes Õ(√n) sets to the final solution. The KK-algorithm requires space Θ(m) for storing the uncovered-degrees of all sets. To go below this space bound, we thus cannot maintain the uncovered-degrees of all sets. Instead, for each set, we maintain its current level instead of its uncovered-degree. This requires a different promotion strategy for the sets to reach the next level, since uncovered-degrees are no longer available. Whenever a tuple (S_i, u) arrives in the stream such that u is not yet covered, we increase the level of S_i with probability 1/α. At first glance, this strategy does not appear to allow us to decrease the space requirements, since we need to maintain the current level of every set. However, we show that, for α = Ω(√n), only Õ(mn/α^2) sets reach the second level over the course of the algorithm. Hence, it is sufficient to explicitly store only the levels of those Õ(mn/α^2) sets that were promoted at least once, which achieves the desired space bound.
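The level-promotion idea can likewise be sketched as follows. Everything here is an illustrative stand-in of our own: in particular, the rule that a set is included once it reaches level 2 replaces the level-dependent inclusion probabilities of the actual algorithm. The point of the sketch is only that the dictionary `levels` stores entries solely for sets promoted at least once, which is the source of the space saving.

```python
import random

def low_space_levels(stream, alpha, rng=random.Random(1)):
    """Sketch of the level-promotion idea described above (toy stand-in).

    Instead of exact uncovered-degrees, each set keeps only a level, and a
    tuple towards an uncovered element promotes the set's level with
    probability 1/alpha.  Only promoted sets are stored explicitly.
    """
    levels = {}       # levels stored ONLY for sets promoted at least once
    solution = set()
    covered = set()
    for s, u in stream:
        if u in covered:
            continue
        if s in solution:
            covered.add(u)
            continue
        if rng.random() < 1.0 / alpha:
            levels[s] = levels.get(s, 0) + 1  # promotion to the next level
            # Stand-in inclusion rule: include a set once it reaches level 2
            # (the full algorithm uses level-dependent probabilities here).
            if levels[s] >= 2:
                solution.add(s)
                covered.add(u)
    return levels, solution
```

With alpha = 1 every uncovered tuple promotes, so two tuples of the same set towards distinct uncovered elements already trigger inclusion; for large alpha, `levels` stays small.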
Algorithm for Random Order Streams. At a high level, our random arrival order algorithm for Set Cover aims to simulate the KK-algorithm [19], albeit this time by rotating sets through the memory in blocks of O(m/√n) sets at a time: at any given moment in time, the algorithm tracks arriving edges for only a predetermined collection of O(m/√n) sets. Intuitively speaking, even though each set effectively only sees an O(1/√n)-fraction of the stream, the random arrival order of the stream should still make it possible to collect a statistical signal that highlights sets that can cover Ω(√n) yet-uncovered elements. However, a number of challenges emerge in successfully implementing this high-level intuition. First, the length of the stream that needs to be examined to register such a signal for a set S depends on the number of new elements that S is able to cover. In particular, if this number is s · √n, it is crucial that any such set S is included in the solution by the time an Õ(1/s)-fraction of the stream is consumed because otherwise, Ω(√n) elements that S could have covered would have already passed by in the stream. So the duration of the stream that should be assigned to processing a set needs to depend on how many elements it would end up covering in the solution. We address this by designing a family of algorithms that are run successively, where the first algorithm consumes only about a 1/√n-fraction of the stream, and, in general, the j-th algorithm consumes the next 2^j/√n-fraction of the stream. This ensures that every relevant set gets detected in a timely manner, before too many of its incident edges have passed.
A second challenge emerges in ensuring that not too many sets are chosen in our implementation of the KK-algorithm. With each successive level, the KK-algorithm geometrically increases the rate at which a relevant set S (one that covers Ω(√n) yet-uncovered elements) is sampled. It is then necessary that the number of sets considered for sampling at each successive level is geometrically decreasing. This goal is achieved in [19] via the following argument. If too many sets are being considered for sampling at level j, it must necessarily be the case that some uncovered element u is contributing many edges towards such sets. The original implementation of the KK-algorithm, where each set is being tracked at all times, naturally confers a monotonicity property, ensuring that if a set S is chosen for sampling at a level j, then it must have been also chosen for sampling at all previous levels. The analysis in [19] then shows that, with high probability, one of the sets containing the element u would have been sampled before level j starts, and hence element u would have been marked as covered when level j starts. Both these properties are somewhat difficult to ensure in our setting since the reduced Õ(m/√n) space requires working with very sparse information about the sets and elements, making it hard to ensure either monotonicity or coverage of elements that have many incident edges.
We ensure monotonicity by demanding an increasingly stronger statistical signal as we go from one level to the next. This ensures that if a set S is chosen for sampling at a level j, then, even allowing for stochastic deviations, with high probability, it must have also been chosen at every previous level. However, an element u that would have been covered in the KK-algorithm may continue providing statistical signal for sampling other sets in our setting, simply because, while a set S containing the element u has been added to our solution, the actual edge (S, u) is yet to arrive in the stream. This can lead to sampling too many sets at level j. We overcome this by tracking another statistical signal with the goal of identifying elements that would have been covered in the KK-algorithm in a timely manner. This task is made more difficult by the availability of only Õ(m/√n) space. Nonetheless, we show that we can detect this early enough in the stream and optimistically mark such elements as covered, and hence prevent them from creating too many sets for sampling. These ideas together allow us to obtain an Õ(√n)-approximation using only Õ(m/√n) space.

Further Related Work
Saha and Getoor [22] initiated the study of (set-arrival) Set Cover in the streaming model and gave an O(log n)-approximation algorithm that makes O(log n) passes over the input and uses space Õ(n).
Emek and Rosén [13] showed that an O(√n)-approximation can be obtained in a single pass with the same amount of space. This has been extended to multiple passes by Chakrabarti and Wirth [10], who showed that an O(n^{1/(p+1)})-approximation can be achieved using p passes and space Õ(n), as long as p is constant, which, as proved in their paper, is best possible.
Regarding algorithms that use substantially more space, results by Assadi [1] and Har-Peled et al. [15] show that, using a polylogarithmic number of passes, an α-approximation can be achieved using space Õ(mn/α^2), and this space bound is optimal.
Set Cover on massive inputs can be solved well in practice. Most practical approaches are based on efficient implementations of the greedy Set Cover algorithm [11,21,23]. It is also known that the streaming algorithm of Emek and Rosén [13] produces only slightly larger covers than those produced by Greedy in practice, albeit using substantially less memory [5].

Outline
We present in Section 2 a graphical representation of Set Cover instances that is used throughout the paper.We then give our lower bound for adversarial order streams in Section 3. Next, we present our streaming algorithm for the random order setting in Section 4, and our algorithm for the adversarial order setting is presented in Section 5. Finally, we conclude in Section 6.

PRELIMINARIES
Throughout this paper, we assume that every element  ∈ U is contained in at least one set since otherwise the set cover instance would not be feasible.
We make use of a representation of the input Set Cover instance (S, U) with m = |S| and n = |U| as a bipartite graph. This representation is obtained by taking S and U as the two bipartitions of the graph, and, for every set S_i ∈ S and every u ∈ S_i, we include the edge (S_i, u) in the graph. More formally, we define G = (S, U, E) with (S_i, u) ∈ E if and only if u ∈ S_i. Then, a cover T ⊆ S of the input instance corresponds to a subset of vertices of the left bipartition whose neighborhood equals the entire right bipartition.
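A minimal sketch of this graph view, with illustrative set names of our own: the adjacency of the left bipartition is collected directly from the stream tuples, and a candidate cover is checked by comparing its neighborhood with the universe.

```python
# Building the bipartite representation G = (S, U, E) from an edge-arrival
# stream, as described above: (S_i, u) is an edge iff u ∈ S_i.

def bipartite_from_stream(stream):
    """Collect the adjacency of each set-vertex from stream tuples (S_i, u)."""
    adj = {}
    for s, u in stream:
        adj.setdefault(s, set()).add(u)
    return adj

def neighborhood(adj, chosen):
    """Neighborhood of a subset of the left (set) bipartition."""
    out = set()
    for s in chosen:
        out |= adj.get(s, set())
    return out

stream = [("S1", 1), ("S2", 2), ("S1", 3), ("S2", 1)]
adj = bipartite_from_stream(stream)
# {"S1", "S2"} is a cover iff its neighborhood equals the whole universe:
assert neighborhood(adj, {"S1", "S2"}) == {1, 2, 3}
```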

LOWER BOUND FOR ADVERSARIAL ORDER STREAMS
In this section, we will prove that every one-pass streaming algorithm for Set Cover in the edge-arrival model with approximation factor α ≥ √n requires Ω(mn^2/(α^4 log^4 n)) space. Our lower bound is obtained by a reduction from the one-way k-party version of the Set-Disjointness problem. In this problem, each party 1 ≤ i ≤ k holds a subset X_i of the universe [N]. The parties are promised that either all sets X_1, . . . , X_k are pairwise disjoint, i.e., X_i ∩ X_j = ∅ holds for every i ≠ j, or the sets uniquely intersect, i.e., |∩_{i∈[k]} X_i| = 1 and |X_i ∩ X_j| = 1 hold for every i ≠ j. The objective is to decide which is the case.
It is well-known that the one-way communication complexity of k-party Set-Disjointness is Ω(N/k) [9], which implies that at least one of the k − 1 messages involved needs to be of size Ω(N/k^2).

Theorem 5 ([9]). Let P be a protocol that solves the k-party Set-Disjointness problem with error at most 1/4. Then, at least one message in P is of size Ω(N/k^2).
Our lower bound requires a family of sets with small pairwise intersections. We first establish the existence of such a set family in Lemma 1, and then give our lower bound in Theorem 2.

Lemma 1. Let n, a, N be integers with a ≤ n and N = O(poly(n)). Then, there exists a family of sets A_1, . . . , A_N ⊆ [n], each of size a, such that |A_i ∩ A_j| = O(log n), for every i ≠ j.
Proof. We will show that random sets A_1, . . . , A_N fulfil this lemma with non-zero probability, which implies the existence of such a set family.
Proof of Theorem 2. Let A be an algorithm as in the statement of the theorem, and let (X_1, . . . , X_k) be an instance of k-party Set-Disjointness with X_i ⊆ [N]. The k parties use A to solve the instance. To this end, every party i includes all the partial sets A_j^i with j ∈ X_i in the Set Cover instance, where A_j^1, . . . , A_j^k denotes a fixed partition of A_j into k parts. The parties then execute A on this instance by forwarding the memory state. More concretely, the first party runs A on the sets A_j^1, for every j ∈ X_1, and sends the memory state of the algorithm to the second party. Upon receiving the memory state of A from party i − 1 (i ≥ 2), party i continues the execution of A on the sets A_j^i, for every j ∈ X_i. The last party behaves as follows. After having executed A on her share of the input, the party forks the execution of A and runs N parallel executions. In the parallel execution j, the last party continues the execution of A on the set U \ A_j. Observe that if (X_1, . . . , X_k) uniquely intersect, then there exists a cover of size 2 in the parallel run j with {j} = ∩_{i∈[k]} X_i. This is because the sets A_j and U \ A_j are part of the input in run j and form such a cover.
Suppose now that (X_1, . . . , X_k) are pairwise disjoint. Then, for any 1 ≤ j ≤ N, an optimal cover in parallel run j is of size at least OPT_0 = Θ((a − a/k)/log n), since the a elements in A_j need to be covered, which can be achieved by the at most one partial set A_j^i, for some i, that is part of the input, and then via other sets which, by the properties of the set family, have an intersection with A_j of size at most O(log n).
The last party generates the output as follows: If one of the parallel runs returns an estimate of the size of the optimal cover of at most OPT_0 − 1, then the party reports "uniquely intersecting". If no such run exists, then the party reports "pairwise disjoint".
For the algorithm A to output a solution of size at most OPT_0 − 1 when there is a solution of size 2, it is required that its approximation factor α is such that 2α ≤ OPT_0 − 1. We can thus choose a value k = Θ(α^2 log^2 n/n). Then, by Theorem 5, and since the constructed Set Cover instance consists of m = Θ(N) sets, we obtain that A uses space Ω(N/k^2) = Ω(mn^2/(α^4 log^4 n)).
In order to have correctly invoked Theorem 5, we require that the probability that none of the parallel runs fails is at least 3/4. Using the union bound, we see that this is achieved if the error probability of A is at most 1/(4N). □

In the full version of this paper, we give a k-party protocol with approximation factor α = 2√(kn) and maximum message length Õ(m). The existence of such a protocol highlights the need for k = Ω(α^2/n) parties in order to prove our lower bound.
Remark. While the lower bound above is stated for algorithms with an error probability of at most 1/(4N), we note that any algorithm A with a success probability of at least 3/4 can be converted into an algorithm with a success probability of at least 1 − 1/(4N) by running O(log N) parallel copies of A and outputting the smallest answer over all runs. Thus, we can also conclude an Ω(mn^2/(α^4 log^4 n log m)) space lower bound for algorithms that are only required to succeed with probability at least 3/4.
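The amplification step in the remark can be sketched as follows; `noisy_estimate` is a hypothetical estimator invented purely for illustration, with overestimation probability 1/4.

```python
import random

def amplify(algo, num_copies, rng=random.Random(2)):
    """Run independent copies of a randomized estimator and keep the smallest
    answer, as in the remark above.  If each copy overestimates with
    probability at most 1/4 (and never underestimates below the true value),
    the minimum of the copies overestimates with probability at most
    (1/4)**num_copies, so Theta(log N) copies suffice for error 1/(4N).
    """
    return min(algo(rng) for _ in range(num_copies))

# Hypothetical noisy estimator: returns the true cover size 5, except that
# with probability 1/4 it returns an inflated value.
def noisy_estimate(rng):
    return 5 if rng.random() < 0.75 else 20

assert amplify(noisy_estimate, 30) == 5
```

The price of the amplification is the O(log N) factor in space, which is exactly the log m factor lost in the bound stated in the remark.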

ALGORITHM FOR RANDOM ORDER STREAMS
In this section, we present our algorithm for random order streams. We will first present the algorithm and discuss some of its key properties in Subsection 4.1. Three key invariants that hold throughout the algorithm and the main theorem are then presented in Subsection 4.2. Due to space restrictions, the proofs of two of these invariants are deferred to the appendix.

Algorithm
The key idea behind our algorithm (see Algorithm 1 for a listing) is to process the input sets S in batches of O(m/√n) sets, focusing on at most one batch at any one moment. This allows us to reduce the space complexity from Õ(m) to Õ(m/√n). We assume that the number of sets m and the size of the universe n are known to the algorithm, and, w.l.o.g., we also assume that the input stream length N is known. This is without loss of generality for the following reason: First, we can assume that N ≥ m/√n since otherwise the entire stream would fit into memory. Furthermore, we also know that N ≤ m · n since every set is of size at most n. We can therefore run a logarithmic number of executions of our algorithm in parallel, using the guess 2^i · m/√n for the value N in run i. Since our algorithm is not sensitive to the exact value of N, the run with the guess closest to N will therefore produce a valid solution.
Our algorithm gradually adds sets to the initially empty set Sol (Line 1), which constitutes the output set cover when the algorithm terminates. In particular, a set that is added to Sol is never removed. The algorithm also maintains the set of elements marked as covered, i.e., elements that are either covered by one of the sets in Sol or that are likely to be covered at a later stage (Line 3). We also store, for each element, a cover certificate, i.e., a set that covers the element. Furthermore, for each element u, we remember the first set in the stream that contains u (Line 4). These sets are required in the post-processing stage in order to cover any elements that have not been covered over the course of the algorithm.
The algorithm then enters epoch 0. In epoch 0, the algorithm adds every set with probability p_0 = c · (√n log n)/m to Sol (Line 6). This ensures that elements with degree Ω(m/√n) are covered by one of these sets with high probability. We therefore next identify elements of such large degree by detecting their signal in only a small fraction of the stream and mark these elements as covered (Line 7). Observe that we have not yet necessarily observed an edge that covers such an element u; however, with high probability, such an edge will be observed at a later stage.
Next, the algorithm runs K = (1/2) log(n) − 3 log log(n) − 2 algorithms A^(1), A^(2), . . . , A^(K) sequentially. Each algorithm A^(j) consists of log m − (1/2) log n epochs, and, in each epoch, √n subepochs are used for processing the batches S_1, . . . , S_{√n} of sets in turn. Each subepoch of algorithm A^(j) processes ℓ_j = Θ((2^j N log n)/n) input edges, which implies that, overall, a Õ(2^j/√n)-fraction of the input edges is processed in algorithm A^(j). During each epoch i, a set of the current batch becomes special if it receives sufficiently many edges towards yet unmarked elements, and special sets are included in Sol with an epoch-dependent probability p_i. Elements that are incident to many special sets are covered by special sets that are added to Sol with high probability, and we can thus mark these elements as covered (observe that this is very similar to epoch 0). On an intuitive level, once an element is marked, the element cannot continue to contribute to increasing the counters of sets, which implies that we expect fewer and fewer sets to become special as the algorithm proceeds. This observation also justifies that the inclusion probabilities double between epochs, i.e., p_i = 2 · p_{i−1}. In order to identify elements of such high degree, we make use of the fact that the special sets of epoch i are a subset of the special sets of epoch i − 1 with high probability (Lemma 5), a property implied by the fact that, for a set to be special in epoch i + 1, we require it to receive log^6 n more edges towards yet unmarked elements than in epoch i, and a probabilistic argument shows that it is extremely unlikely that a set exceeds the higher threshold in epoch i + 1 but not the lower threshold in epoch i. We therefore take a uniform random sample of the special sets of epoch i (the set Q′, which becomes Q in epoch i + 1) and track/store all edges between the sampled sets and the elements arriving in epoch i + 1. We prove in Lemma 6 that this sample is indeed enough to detect elements that are incident to at least 1.1 · 2^{i+1} √n special sets in epoch i + 1. These elements are then marked as covered in Line 31.
Once algorithm A^(K) has finished, we process the remaining edges and mark yet-unmarked elements if they are incident to sets in Sol, and store their covering witnesses. In the post-processing step, we cover the yet-uncovered elements at a rate of one set per element in order to make sure that we indeed output a legal cover (Line 38). We observe that some marked elements may not have found a covering witness prior to the post-processing stage. This can happen when a high-degree element is marked as covered in Line 31, but all its incident edges towards sets in Sol have appeared earlier in the stream. Our analysis accounts for this via the notion of missed edges. We prove that, for every set S added to Sol, there are only Õ(√n) missed edges incident to S.

Analysis
We use N to denote the input stream length. We assume that the number of sets m is larger than c · n^2 log^3 n, for a sufficiently large constant c, and polynomially bounded (in n), i.e., m = poly(n).
For any set S, we will denote by Γ(S) the set of elements that are contained in S. Thus, the stream contains the edges (S, u) for all u ∈ Γ(S). For any set S and a subset Y ⊆ [n], we will denote by (S, Y) the set of all edges of the form (S, u) with u ∈ Y.
Denote by Sol^(j) the variable Sol at the moment when algorithm A^(j) finishes, and let U^(j) ⊆ U be the set of elements that are not covered by Sol^(j), i.e., U^(j) = U \ ∪_{S ∈ Sol^(j)} S. Note that even if an element u can be covered by a set S ∈ Sol^(j), it may be that the edge (S, u) appeared in the stream before the set S was included in our solution; in this case, the element u may remain marked as uncovered, but u will not be included in U^(j). We will refer to such edges as missed edges.
Our analysis relies on proving the three invariants (I1), (I2), and (I3) from which our main result is derived.We will next state these invariants, then state and prove our main result.For space reasons, Invariants (I2) and (I3) are proved in the appendix.
(I1): At the end of A^(j), with probability at least 1 − j/n^5, for any set S ∉ Sol^(j), the set S can cover at most (n/2^j) · log^9 n elements in U^(j).
(I2): If a set S is included in Sol^(j) during the execution of A^(j), then, with probability at least 1 − 1/n^3, there are O(√n log^9 n) missed edges incident on the set S.
(I3): With probability at least 1 − 1/n^3, the total number of sets added to Sol during the execution of A^(1), . . . , A^(K) is O(√n log^3 n).
We note at this point that we have not attempted to minimize the poly-log factors appearing in our analysis.

Theorem 3. Let m = Ω(n^2) ∩ poly(n). Then, Algorithm 1 is a one-pass Õ(√n)-approximation streaming algorithm for edge-arrival Set Cover in the random order setting that uses space Õ(m/√n) and succeeds with high probability over the random ordering of the input and the random coin flips of the algorithm.
Proof.We first analyse the space requirements of the algorithm and then establish the approximation factor.
Space Analysis. We start by observing that, for any j, all variables used by algorithm A^(j) require space Õ(m/√n). Last, the algorithm can easily be modified so that |Sol| ≤ n always holds. Indeed, if |Sol| reaches the size n, then we report a trivial cover that covers every element with a single set (recall that the algorithm stores for every element the first set that contains it). This establishes the space complexity of the algorithm.
Approximation Ratio Analysis. Assuming the three invariants (I1), (I2), and (I3) hold, we can show that, with probability at least 1 − 1/n, our algorithm outputs a solution that is at most O(√n log^12 n) times the size of the optimal solution. To see this, we first observe that, by Invariant (I3), with probability at least 1 − 1/n^2, the total number of sets added to Sol by A^(1), A^(2), . . . , A^(K) is bounded by O(√n log^3 n). We next observe that, with probability at least 1 − 1/n^2, by Invariant (I2), for every set S included in Sol during the execution of the algorithms A^(1), A^(2), . . . , A^(K), there are only O(√n log^9 n) missed edges. Finally, by Invariant (I1), we know that, when A^(K) terminates, no set outside of Sol can cover more than O(√n log^12 n) elements in U^(K). So, during the patching phase, the total number of sets added by our algorithm is at most O(√n log^12 n) times the number of sets used by an optimal solution.
Putting everything together, we obtain that, with probability at least 1 − 1/n, we output a solution whose size is at most O(√n·log^12 n) = Õ(√n) times the size of an optimal solution. □

Concentration Result
The random order assumption allows us to prove a concentration result that we will use throughout our analysis.
Lemma 2. Let I ⊆ [N] be a subset of positions in the stream with |I| = ℓ. Let S ∈ S be any set and let T ⊆ S be a fixed subset of S. Then, with probability at least 1 − 1/n^20, the total number of edges of the form (S, u) with u ∈ T that appear in the locations I is: (1) at least 0.99·(ℓ/N)·|T| and at most 1.01·(ℓ/N)·|T|, if ℓ/N ≤ 0.001 and (ℓ/N)·|T| ≥ C·log n, for a large enough constant C; (2) at most C·log n · max{(ℓ/N)·|T|, 1}, for some large constant C; and (3) at least (ℓ/N)·|T| − √((ℓ/N)·|T|)·log^{1.5} n and at most (ℓ/N)·|T| + √((ℓ/N)·|T|)·log^{1.5} n, if ℓ ≤ N/√n and (ℓ/N)·|T| ≥ log^6 n.
Due to space restrictions, the proof of this lemma is deferred to the appendix.
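The first guarantee of Lemma 2 is a concentration statement for sampling without replacement (the count is hypergeometric). A quick Monte Carlo sanity check of this behaviour, with illustrative parameters of our choosing rather than the lemma's:

```python
import random

# Monte Carlo sanity check of the first guarantee of Lemma 2: when ell of
# the N stream positions are inspected, the number of the |T| edges (S, u),
# u in T, landing in the inspected window is hypergeometric and concentrates
# around its mean (ell/N)*|T|.

def count_in_window(N, T_size, window, rng):
    # Place the T-edges at T_size uniformly random distinct stream positions
    # and count how many fall inside the inspected window.
    positions = rng.sample(range(N), T_size)
    return sum(1 for p in positions if p in window)

rng = random.Random(0)
N, T_size, ell = 20_000, 5_000, 2_000          # mean = (ell/N)*|T| = 500
window = set(range(ell))                       # w.l.o.g. a prefix window
mean = ell / N * T_size
trials = [count_in_window(N, T_size, window, rng) for _ in range(200)]
within = sum(0.9 * mean <= x <= 1.1 * mean for x in trials)
print(within / len(trials))                    # empirical fraction in the band
```

With these parameters the standard deviation is roughly 18 around a mean of 500, so essentially all trials land inside the ±10% band, mirroring the multiplicative 0.99/1.01 guarantee.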
Proof. We will prove invariant (I1) by induction. At the start of A^(j), with probability at least 1 − (j − 1)/n^5, we have that every set S ∉ Sol^(j) covers at most (n/2^{j−1})·log^9 n elements. This is clearly true when the algorithm A^(1) starts, as no set can cover more than n elements. Now suppose that this invariant holds at the end of the algorithm A^(j−1). Assume, by way of contradiction, that there is a set S such that S does not get sampled during A^(j) but S contains at least (n/2^j)·log^9 n elements in U^(j). Let T = S ∩ U^(j); by our assumption, we have |T| ≥ (n/2^j)·log^9 n.
Consider any epoch E_i^(j) that is devoted to the processing of set S. Let I ⊆ [N] denote the set of indices (positions in the stream) during which we have processed set S so far. Since S ∉ Sol^(j), S has only been observed in epoch 0, in all subepochs of the algorithms A^(1) through A^(j−1) dedicated to processing S, in the first (i − 1) subepochs of A^(j) dedicated to processing S, and in part of the i-th epoch of A^(j).

We can thus bound |I|, where we used the bound 2^i = O(√n) and m = Ω(n^2). By Lemma 2, we know that, with probability at least 1 − 1/n^20, the total number of edges incident on S that appear in I can be bounded, where we used the trivial bound |S| ≤ n. Now note that the indices in I are the only locations where the algorithm has thus far observed edges incident on S. Thus, if we fix the set J of all the locations in the stream where edges incident on S appear, and the set V ⊆ U of elements such that only edges in (S × V) appear in the stream at indices in I, then all edges in (S × (S \ V)) appear in a uniformly random permutation over the indices in J \ I. In particular, this means that all elements in T \ V appear as in a uniformly random permutation over the indices in J \ I. We also note at this stage that the algorithm would process these elements since, as we show in Lemma 7, uncovered elements are not marked with high probability.
Again invoking Lemma 2, we know that, with probability at least 1 − 1/n^20, the total number of edges of the form (S, u) with u ∈ T \ V that appear in the sub-epoch of E_i^(j) devoted to processing set S can be bounded from below by 0.99·(ℓ_i/N)·|T \ V|. Thus, with probability at least 1 − 1/n^20, during the sub-epoch of E_i^(j) devoted to processing the set S, the set S gets selected for sampling (since 0.99·log^8 n − O(log^{4.5} n) ≥ i·log^6 n). Moreover, by taking a union bound over all epochs, we know that this assertion holds for set S in every epoch with probability at least 1 − 1/n^19. Thus, conditioned on this event, the set S is guaranteed to get sampled with probability 1, and hence included in the solution before A^(j) terminates. By taking a union bound over all sets, we conclude that the probability that Invariant (I1) is violated during A^(j) is bounded by 1/n^18, completing the proof. □

ALGORITHM FOR ADVERSARIAL ORDER STREAMS
We will now present our Õ(mn/α^2)-space α-approximation streaming algorithm, for α = Ω(√n), for edge-arrival Set Cover.
Proof. The experiment described in the statement of the lemma can be seen as follows. We are given a ground set of size N and a subset T of the ground set of size |T|. We now pick a random subset of the ground set of size ℓ. We would like to show that the number of elements of T in this subset is concentrated with high probability.
We first prove the upper bound. To this end, consider the process of repeatedly (ℓ times) drawing elements from the ground set without replacement. Denote by X_t the number of elements of T that were drawn within the first t steps. Then, the probability that the (t + 1)-th element drawn is from T is at most |T|/(N − ℓ). This process is thus stochastically dominated by a sequence of ℓ independent Bernoulli trials with success probability p = |T|/(N − ℓ), which allows us to obtain an error probability of at most 1/n^20 (for large enough n). We thus conclude that, with probability 1 − 1/n^20, we have X_ℓ ≤ 1.01·(ℓ/N)·|T|. Next, we prove the lower bound results, and we first focus on result 1. We condition on the event that the upper bound holds. We now turn to the lower bound stated as result 3. By the same calculation as for result 1 (observe that we even use the upper bound from result 1 here), we see that the resulting process stochastically dominates a sequence of ℓ independent Bernoulli trials with success probability p = (|T| − X_ℓ)/N. The expected number of successes of this process is therefore μ = ℓ·(|T|/N)·(1 − o(1)).
Proof. Fix any set S ∈ Sol^(j). W.l.o.g. assume that the set S was added to Sol^(j) during the execution of A^(j). We also assume that |S| ≥ c′·√n·log^3 n, for some large constant c′, since otherwise the statement is trivially true.
Let Γ be the set of slots in the stream in which edges incident on S appear during the execution of epoch 0 and algorithms A^(1), ..., A^(j). To prove invariant (I2), it suffices to show that, with probability at least 1 − 1/n^3, the number of elements in (S ∩ U^(j)) that appear in Γ is bounded by O(√n·log^9 n), as this is precisely the set of missed edges incident on S. To bound this quantity, we will partition Γ into two sets, Γ_1 and Γ_2, where Γ_1 ⊆ Γ denotes the set of slots in epoch 0 and in the executions of A^(1), ..., A^(j) which reside in the sub-epochs of these algorithms devoted to the processing of the set S. We will refer to these slots as the observed slots of S, as they are the slots where the algorithm observes which elements incident on S appear. We will refer to the set of remaining slots, namely Γ_2 = Γ \ Γ_1, as the unobserved slots of S. Note that, once we fix the set of elements in S that appear in the observed slots, the execution of the algorithms A^(1), ..., A^(j) is oblivious to the remaining elements of S that appear in the unobserved slots. Moreover, each remaining element of set S is equally likely to appear in any of the unobserved slots.
Now, to prove invariant (I2), we will bound the number of elements in (S ∩ U^(j−1)) that appear in Γ, as this is an upper bound on the set of missed edges incident on set S. By the same argument as in the proof of (I1), the number of elements in (S ∩ U^(j−1)) that appear in Γ_1 is at most 1.01·(2^{j+1}/√n)·|S ∩ U^(j−1)| = O(√n·log^9 n) with probability 1 − 1/n^20, and we will prove that the number of elements in (S ∩ U^(j−1)) that appear in Γ_2 is O(√n·log^9 n). These two properties together complete the proof.
We now bound |Γ_2|. The slots in Γ_2 lie within the portion of the stream processed by A^(1), ..., A^(j), which constitutes at most a Σ_{j′ ≤ j} 2^{j′}/√n ≤ 2^{j+1}/√n fraction of the stream, where each term 2^{j′}/√n corresponds to the fraction of the stream observed by algorithm A^(j′). Note that we also have a corresponding lower bound. Hence, by Lemma 2, with probability at least 1 − 1/n^20, the total number of edges incident on S that appear during the execution of A^(1), ..., A^(j) is at most 1.01 times its expectation. We now note that every unobserved element in the set S is equally likely to appear in any unobserved slot. Thus, the expected number of elements in (S ∩ U^(j−1)) that appear in the unobserved slots Γ_2 can be bounded from above by 1.01·(2^{j+1}/√n)·|S ∩ U^(j−1)|. By Invariant (I1), with probability at least 1 − 1/n^4, we know that |S ∩ U^(j−1)| ≤ (n/2^{j−1})·log^9 n. Thus, with probability at least 1 − 1/n^4, the expected number of elements in (S ∩ U^(j−1)) that appear in Γ_2 is at most O(√n·log^9 n). Finally, using Chernoff bounds for negatively correlated random variables, we conclude that, with probability at least 1 − 1/n^3, the number of elements in (S ∩ U^(j)) that appear in Γ_2 is at most 10·√n·log^9 n. It now follows that, with probability at least 1 − 1/n^3, the total number of elements in (S ∩ U^(j)) that appear during the execution of A^(1), ..., A^(j) is O(√n·log^9 n).
Before presenting the proof of Invariant (I3), we require some technical lemmas, which rely on the notion of forward-edges. Definition 1. We say that, at a given position in the stream, an element u ∈ U is forward-incident to a set S if the edge (S, u) appears in the remainder of the stream. We then also say that (S, u) is a forward-edge.
Our first lemma states that, if a set is special in epoch i of algorithm A^(j), then the set was also special in epoch i − 1 of A^(j). This monotonicity property is a consequence of the gradually increasing thresholds i·log^6 n: if a set is special in epoch i, i.e., it has seen i·log^6 n incident edges towards uncovered elements, then it is extremely unlikely that it has seen fewer than (i − 1)·log^6 n incident edges towards uncovered elements in epoch i − 1.
Lemma 5. For every set S, every algorithm A^(j), and every epoch i ≥ 2, with probability 1 − 1/n^10, if set S is chosen for sampling in epoch i of A^(j), then the set S was also chosen for sampling in epoch (i − 1) of A^(j).
Proof. Fix any set S, any algorithm A^(j), and any epoch i ≥ 2. Now consider the moment at the beginning of epoch (i − 1) of algorithm A^(j), and let S′ ⊆ S denote the yet unmarked forward-incident elements of S, i.e., the elements u ∈ S that are not yet marked and such that (S, u) is a forward-edge.
We will prove that, if |S′| is small, then S would not be sampled in epoch i with high probability, and, on the other hand, if |S′| is large, then S would be sampled in epoch (i − 1) with high probability. Hence, no matter what the exact size of |S′| is, the event that S is sampled in epoch i but not sampled in epoch i − 1 can only happen with vanishingly small probability.

Suppose thus that |S′| is small. As argued in Inequality 1, when algorithm A^(j) finishes, at most a 1/log^3 n fraction of the stream has been processed. Let N′ denote the length of the stream that has not yet been processed. Then, N′ ≥ (1 − 1/log^3 n)·N. Observe further that ℓ_i ≤ N′/√n holds for all epochs i. Then, by Lemma 2, at most (ℓ_i/N′)·|S′| + log^{4.5} n < i·log^6 n elements of S′ would appear in the subepoch of epoch i dedicated to processing S with probability 1 − 1/n^20, and the set would therefore not be sampled (since the threshold in the algorithm is i·log^6 n).
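The comparison above hinges on the additive slack from Lemma 2 being lower-order than the sampling threshold. A quick numeric check (under our reconstruction of the exponents, which may differ from the paper's exact constants) that log^{4.5} n is indeed negligible against i·log^6 n already for moderate n:

```python
import math

# Check that the additive slack log^{4.5} n from Lemma 2 is lower-order than
# the sampling threshold i * log^6 n used in the argument above, so the
# comparisons go through already for moderate n (exponents per our
# reconstruction).

def slack_vs_threshold(n, i=1):
    L = math.log(n)
    return L ** 4.5, i * L ** 6

for n in (10 ** 3, 10 ** 6, 10 ** 9):
    slack, threshold = slack_vs_threshold(n)
    assert slack < 0.1 * threshold   # slack below 10% of the threshold
```

The ratio slack/threshold is 1/(i·log^{1.5} n), so it only shrinks as n or the epoch index grows.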

Suppose now that |S′| is large. Similar to the reasoning above, by Lemma 2, at least (ℓ_{i−1}/N′)·|S′| − log^{4.5} n ≥ (i − 1)·log^6 n elements of S′ would appear in the subepoch of epoch i − 1 with probability 1 − 1/n^20, and the set would therefore be sampled. Taking a union bound over the error probabilities incurred for all sets, algorithms A^(j), and epochs, the result follows. □
Next, we take the perspective of an unmarked element. We show that every unmarked element is forward-incident to at most 1.1·m/(2^i·√n) special sets at the end of epoch i with high probability. This implies that, the higher the epoch, the less elements can contribute to increasing the counters of sets. Consequently, the number of special sets decreases from epoch to epoch. This is proved in Lemmas 6 and 8. In Lemma 7, we argue that uncovered elements are not marked with high probability. This property is used in the proof of (I1).
Since the reasoning follows from the proof of Lemma 6, we give the lemma here. Proof. We consider the initial sampling of sets with probability p_0 as epoch 0 of any of the algorithms A^(j) (i.e., all algorithms A^(1), A^(2), ... have the same epoch 0). Denote by Sol^(0) the sampled sets. After the initial sampling, the number of occurrences of all elements in a substream of length

Theorem 4. For any α = Ω(√n), there is a randomized one-pass streaming algorithm for Set Cover with expected approximation ratio α in the edge-arrival model with space Õ(mn/α^2). While the lower bound of Theorem 2 and the upper bound of Theorem 4 match up to poly-logarithmic factors for α = Θ(√n), we leave it as an interesting open problem to close the gap between the two bounds for other values of α = Ω(√n).

This process has the expected value μ = p·ℓ = ℓ·|T|/(N − ℓ). Hence, by a Chernoff bound, X_ℓ ≤ C·log n · max{ℓ·|T|/(N − ℓ), 1}, for a large constant C, which implies the second result. To prove the upper bound of the first result, we use the assumptions ℓ/N ≤ 0.001 and (ℓ/N)·|T| ≥ C·log n, for some large constant C. Then, the expected value μ is at least C·log n, for a large constant C. Hence, by a Chernoff bound, we have that, with probability at least 1 − 1/n^21, X_ℓ ≤ 1.01·(ℓ/N)·|T|. Next, to prove the upper bound of result 3, we use the assumptions ℓ ≤ N/√n and (ℓ/N)·|T| ≥ log^6 n. In this case, the expected value μ is at least log^6 n. By a Chernoff bound, the probability that the outcome is larger than μ by an additive δ is at most e^{−δ^2/(2μ+δ)}. Hence, we can choose δ = √μ·log^{1.5} n. For the lower bounds, we condition on the event that X_ℓ ≤ 1.01·(ℓ/N)·|T|. We then bound the inclusion probability of item t + 1 as follows: (|T| − X_t)/(N − t) ≥ (|T|/N)·(1 − 0.001·1.01). Our process thus stochastically dominates a sequence of independent Bernoulli trials with success probability p = (|T|/N)·(1 − 0.001·1.01). Similar to the calculation above, by Chernoff bounds, we obtain that X_ℓ ≥ 0.99·(ℓ/N)·|T| holds with probability at least 1 − 1/n^21. By a union bound, both upper and lower bounds hold with probability 1 − 1/n^20.
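The stochastic-domination step in this proof can be made concrete through an explicit coupling: driving both the without-replacement process and the dominating Bernoulli process with the same uniforms makes the dominated count never overtake. A minimal sketch with illustrative parameters (the function name is ours):

```python
import random

# Coupling behind the stochastic-domination step: at step t the draw without
# replacement hits T with probability (|T| - X_t)/(N - t), which is at most
# p = |T|/(N - ell) for every t < ell. Sharing one uniform per step makes
# the dominated count never exceed the Bernoulli count, pathwise.

def coupled_counts(N, T_size, ell, rng):
    """Run ell coupled steps; returns (without-replacement hits, Bernoulli hits)."""
    x = 0                                  # draws without replacement hitting T
    b = 0                                  # coupled Bernoulli(p) successes
    p = T_size / (N - ell)                 # dominating success probability
    for t in range(ell):
        u = rng.random()                   # one shared uniform drives both
        if u < (T_size - x) / (N - t):     # this probability is <= p
            x += 1
        if u < p:
            b += 1
    return x, b

rng = random.Random(42)
x, b = coupled_counts(10_000, 2_000, 500, rng)
assert x <= b                              # domination holds pathwise
```

Since (T_size − x)/(N − t) ≤ T_size/(N − ell) whenever t < ell, every step in which the without-replacement process succeeds is also a success for the Bernoulli process, so x ≤ b on every sample path, not just in expectation.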

Lemma 6. For any algorithm A^(j) and any epoch i ≥ 0, with probability at least 1 − (i + 1)/n^8, at the end of epoch i of algorithm A^(j), every element not yet marked as covered is forward-incident to at most 1.1·m/(2^i·√n) special sets of epoch i.

N/(√n·log n) are computed. Consider an element of degree at least 1.1·m/(2^i·√n). Then, by Lemma 2, the element appears at least 0.99·1.1·

Table 1: Set Cover in the one-pass edge-arrival model.
B_1, ..., B_m, each of size √(nk), and partitions of each of these sets into k random subsets B_j^1, ..., B_j^k of size √(n/k) each, such that B_j = B_j^1 ∪ ... ∪ B_j^k. Then, given a k-party Set-Disjointness instance (X_1, ..., X_k) with X_t ⊆ [m], every party t includes the partial set B_j^t into the Set Cover instance if and only if j ∈ X_t. Observe that, if the sets (X_1, ..., X_k) are pairwise disjoint, then every set in the Set Cover instance is of size √(n/k). On the other hand, if the sets (X_1, ..., X_k) are uniquely intersecting, then there exists one set in the Set Cover instance of size √(nk). Denote this set by B_{j*}. We will argue that, since the parties cannot determine which is the case, most of the elements of B_{j*} need to be covered using other partial sets. By Chernoff bounds, we obtain that |B_j^t ∩ B_{j′}^{t′}| = O(log n) with probability 1 − 1/n^c, for any constant c. Taking a union bound over all indices j, j′, t, and t′ (t ≠ t′), we see that the previous statement holds for all such indices with probability 1 − m^2·k^2·(1/n^c), which can be bounded from below by 1 − 1/n by choosing c large enough and using the fact that m = O(poly(n)).
Theorem 2. Let α ≥ √n. Then, every α-approximation one-pass streaming algorithm for Set Cover in the edge-arrival model with error probability at most 1/(4k) uses space Ω(mn^2/(α^4·log^4 n)), even if the algorithm only outputs an α-approximation of the size of an optimal cover.
Proof. Let k be an integer. Let B_1, B_2, ..., B_m ⊆ [n] with |B_j| = β = √k·α, for all j, be a set family as in the statement of Lemma 1, i.e., each set B_j can be partitioned into subsets B_j^1, ..., B_j^k, each of size β/k, such that, for any j, j′, t, t′ with t ≠ t′, |B_j^t ∩ B_{j′}^{t′}| = O(log n) holds.
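The reduction described above can be sketched concretely. In this toy version (sizes and names are illustrative, not the paper's parameters), each base set is split into k random blocks, and each party contributes a block per index in its Set-Disjointness input; a unique intersection produces one large streamed set.

```python
import random

# Toy version of the Set-Disjointness reduction: each base set B_j of size
# sqrt(n*k) is split into k random blocks of size sqrt(n/k); party t
# contributes block B_j^t iff j is in its input X_t. In the disjoint case
# every streamed set has sqrt(n/k) elements; a unique intersection j*
# creates one streamed set of size sqrt(n*k).

def build_instance(blocks, X):
    """blocks[j][t] = block B_j^t (a frozenset); X[t] = party t's input."""
    instance = {}               # set id j -> union of contributed blocks
    for t, Xt in enumerate(X):
        for j in Xt:
            instance.setdefault(j, set()).update(blocks[j][t])
    return instance

rng = random.Random(0)
n, k = 64, 4                    # sqrt(n*k) = 16, sqrt(n/k) = 4
universe = list(range(n))
blocks = []
for j in range(3):              # three base sets suffice for the demo
    elems = rng.sample(universe, 16)
    blocks.append([frozenset(elems[4 * t: 4 * (t + 1)]) for t in range(k)])

disjoint = build_instance(blocks, [{0}, {1}, {2}, set()])
intersecting = build_instance(blocks, [{0, 1}, {1}, {1, 2}, {1}])
assert all(len(s) == 4 for s in disjoint.values())
assert len(intersecting[1]) == 16   # the uniquely intersecting set j* = 1
```

The demo makes the size gap visible: disjoint inputs yield only small partial sets, while the shared index assembles the full base set, which is what a too-accurate streaming algorithm would be forced to detect.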
More specifically, in epoch i of algorithm A^(j), the algorithm counts the number of edges observed between every set S and the yet unmarked elements (Line 27), where the edges incident on a set are considered in the subepoch dedicated to that set. We call a set in epoch i special if we have observed at least i·log^6 n edges between the set and yet unmarked elements. Each special set is then added to Sol with probability p_i = 2^i·p_0 (Line 29) and to a set Q′ with probability q_i = 2^i·q_0 (Line 30). Before illustrating the purpose of Q′, we give more intuition as to why the sampling probability p_i = 2^i·p_0 for adding special sets to Sol is suitable. In Lemma 8, we will prove that the number of special sets in epoch i is bounded by 1.1·m/2^i with high probability. Hence, since each of the special sets is included in Sol with probability p_i = 2^i·p_0, only Õ(√n) sets are expected to be added to Sol in every epoch of every algorithm A^(j), and, thus, Sol contains at most Õ(√n) sets when the last algorithm A^(k) has finished. We will next discuss the purpose of Q′. Recall that every special set in epoch i of algorithm A^(j) is included in Q′ with probability q_i = 2^i·q_0. The set Q′, which becomes the set Q in epoch i + 1 (see Line 32), is used for tracking. Our analysis crucially relies on being able to identify elements that are incident to at least 1.1·m/(2^i·√n) special sets, since special sets are included in Q′ with probability q_i.
Epoch 0 runs on only O(N/log^3 n) edges. Hence, when algorithm A^(j) finishes, only N/(2·log^3 n) + O(N/log^3 n) ≤ N/log^3 n edges have been processed. Algorithm A^(j) focuses on those sets that are incident to at least Ω(n/2^j) yet uncovered elements. All variables, except Q, Q′, and Sol, are of size at most Õ(m/√n). We will see in Lemma 8 that the number of special sets in epoch i is at most 1.1·m/2^i with high probability. Since the tracked sets in epoch i are the special sets from epoch i − 1, subsampled with probability q_{i−1}, by Chernoff bounds, we track at most Õ(q_{i−1}·m/2^{i−1}) = Õ(q_0·m) special sets in epoch i with high probability. This bounds the sizes of Q′ and Q by Õ(q_0·m). Furthermore, we will prove in Lemma 6 that every not yet marked element is forward-incident to at most 1.1·m/(2^i·√n) special sets of epoch i. The number of edges of the form (S, u) with u ∈ T that appear in the locations I is (1) at least 0.99·(ℓ/N)·|T| and at most 1.01·(ℓ/N)·|T|, if ℓ/N ≤ 0.001 and (ℓ/N)·|T| ≥ C·log n, for a large enough constant C, and (2) at most C·log n · max{(ℓ/N)·|T|, 1}, for some large constant C. Using the Chernoff bound Pr[X ≤ μ − δ] ≤ e^{−δ^2/(2μ)}, we obtain a failure probability of e^{−log^2 n/2}, which is at most 1/n^20 for large enough n. The result follows.
Lemma 4 (Invariant (I2)). If a set S is included in Sol^(j) during the execution of A^(j), then with probability at least 1 − 1/n^3, there are only O(√n·log^9 n) missed edges that are incident on the set S.
2^{i−1}·log^2 n forward-edges in epoch i (we assumed here that m is large enough, e.g., m = Ω(n^2·log^3 n), to get concentration, which allows us to apply Lemma 2). By a similar reasoning, for an element u with fd(u, Q_i) ≤ 1.07·m/(2^{i+1}·√n), there will be at most 1.01·1.071·2^{i−1}·log^2 n forward-edges in epoch i between u and Q_i. We can thus distinguish the two cases and mark as covered all elements that appear at least 1.085·2^{i−1}·log^2 n times (observe that 1.01·1.071 < 1.085 < 0.99·1.099). The argument is then completed by the fact that the special sets in epoch i + 1 are a subset of those in epoch i. Last, bounding the error probabilities in the induction step with a union bound, including the error of i/n^8 from the induction hypothesis, we see that the statement holds with probability 1 − (i + 1)/n^8.
Lemma 7. With probability 1 − 1/n^18, an element u ∈ U^(j) is not marked when algorithm A^(j) terminates.
Proof. The only possibility for an element u ∈ U^(j) to be marked is if u is marked in Line 31 but none of the special sets that contain u are added to Sol. The analysis of the previous lemma reveals that, if u is marked, then u was incident to at least 1.01·1.07·m/(2^{i+1}·√n) special sets, and each special set is added to Sol with probability p_i = c·2^i·√n·log n/m. By Chernoff bounds, the probability that none of these sets is added to Sol is at most 1/n^30. The result follows. □
Lemma 8. For any i, j, there are at most 1.1·m/2^i special sets in epoch i of algorithm A^(j) with probability 1 − 1/n^5.
Proof.
Consider the state of the algorithm at the end of epoch i − 1. Then, by Lemma 6, every unmarked element u has a forward-degree to the set of special sets of epoch i of at most 1.1·m/(2^{i−1}·√n), and only part of these forward-edges appear in epoch i. Observe that any of these edges (S, u) appears in the subepoch of epoch i that processes S with probability 1/√n, so only this fraction of the edges contributes to increasing the counters. Since there are n elements, the sum of the counters can reach at most 1.1·m/2^{i−1}, and since a set becomes special only once a counter reaches i·log^6 n ≥ 1, the result follows. □
Lemma 9 (Invariant (I3)). With probability at least 1 − 1/n^3, the total number of sets added to Sol during the execution of A^(j) is O(√n·log^2 n).
Proof. By Lemma 8, for any i, j, there are O(m/2^i) special sets in epoch i of algorithm A^(j) with probability 1 − 1/n^5, and, by a union bound, this statement holds for all i, j with probability 1 − 1/n^3. Since each of these sets is added to the solution with probability p_i = 2^i·p_0, by a Chernoff bound, at most O(√n·log n) sets are added per epoch. Hence, overall, O(√n·log^2 n) sets are added in algorithm A^(j). □
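The counting in Lemma 9 rests on a simple cancellation: the bound on the number of special sets shrinks by the same factor 2^i by which the inclusion probability grows, so the expected number of sets added per epoch is epoch-independent. A small arithmetic check (parameter values are illustrative, under our reconstructed p_i = 2^i·p_0):

```python
# Arithmetic behind Lemma 9: with at most ~m/2^i special sets in epoch i,
# each kept with probability p_i = 2^i * p0, the expected number of sets
# added to Sol per epoch is m * p0, independent of the epoch, so O(log n)
# epochs contribute O(m * p0 * log n) sets in total.

def expected_additions(m, p0, epochs):
    per_epoch = [(m / 2 ** i) * (2 ** i * p0) for i in range(epochs)]
    assert all(abs(x - m * p0) < 1e-6 for x in per_epoch)  # epoch-independent
    return sum(per_epoch)

print(expected_additions(10 ** 6, 1e-3, 10))  # 10 epochs x m*p0
```

With p_0 of order √n·polylog(n)/m, each epoch contributes Õ(√n) sets in expectation, which is exactly the per-epoch count used in the proof above.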