Faster Approximation Algorithms for Parameterized Graph Clustering and Edge Labeling

Graph clustering is a fundamental task in network analysis where the goal is to detect sets of nodes that are well-connected to each other but sparsely connected to the rest of the graph. We present faster approximation algorithms for an NP-hard parameterized clustering framework called LambdaCC, which is governed by a tunable resolution parameter and generalizes many other clustering objectives such as modularity, sparsest cut, and cluster deletion. Previous LambdaCC algorithms are either heuristics with no approximation guarantees or computationally expensive approximation algorithms. We provide fast new approximation algorithms that can be made purely combinatorial. These rely on a new parameterized edge labeling problem we introduce, which generalizes previous edge labeling problems based on the principle of strong triadic closure and is of independent interest in social network analysis. Our methods are orders of magnitude more scalable than previous approximation algorithms, and our lower bounds allow us to obtain a posteriori approximation guarantees for previous heuristics that come with no guarantees of their own.


INTRODUCTION
In network analysis, graph clustering is the task of partitioning a graph into well-connected sets of nodes (called communities, clusters, or modules) that are more densely connected to each other than they are to the rest of the graph [20,36,39]. This fundamental task has widespread applications across numerous domains, including detecting related genes in biological networks [8,41], finding communities in social networks [34,50], and image segmentation [43], to name only a few. A standard approach for finding clusters in a graph is to optimize some type of combinatorial objective function that encodes the quality of a clustering of nodes. Just as there are many different applications and reasons why one may wish to partition the nodes of a graph into clusters, there are many different types of objective functions for graph clustering [10,16,34,41,43], all of which strike a different balance between the goal of making clusters dense internally and the goal of ensuring that few edges cross cluster boundaries. In order to capture many different notions of community structure within the same framework, many graph clustering optimization objectives come with tunable resolution parameters [16,33,38,40,51], which control the tradeoff between the internal edge density and the inter-cluster edge density resulting from optimizing the objective.
One of the biggest challenges in graph clustering is that the vast majority of clustering objectives are NP-hard. Thus, while it is often easy to define a new way to measure clustering structure, it is very hard to find optimal (or even certifiably near-optimal) clusters in practice for any given objective. There has been extensive theoretical research on approximation algorithms for different clustering objectives [5,11,30,50], but most of these come with high computational costs and memory constraints, often because they rely on expensive convex relaxations of the NP-hard clustering objective. On the other hand, scalable graph clustering algorithms have been designed based on local node moves and greedy heuristics [9,32,34,42,45,50], but these come with no theoretical approximation guarantees. As a result, it can be challenging to tell whether the structure of an output clustering depends more on the underlying objective function or on the mechanisms of the algorithm being used.
This paper focuses on an existing optimization graph clustering framework called LambdaCC [21,23,42,50], which comes with two key benefits. The first is that it can detect different types of community structure by tuning a resolution parameter λ ∈ (0, 1). Many existing clustering objectives can be recovered as special cases for specific choices of λ [50]. The second benefit is that LambdaCC can be viewed as a special case of correlation clustering [6], a framework for clustering based on similarity and dissimilarity scores that has been studied extensively from the perspective of approximation algorithms [12,17,23]. As a result, LambdaCC directly inherits an O(log n) approximation algorithm that holds for any correlation clustering problem [12,17] and is amenable to even better approximation guarantees in some parameter regimes. Gleich et al. [23] showed that for very small values of λ, the O(log n) approximation is the best that can be achieved by rounding a linear programming relaxation (the most successful known approach for approximating the objective). However, a 3-approximation algorithm has been developed for the regime where λ ≥ 1/2 [50]. Despite these results, LambdaCC suffers from a similar theory-practice gap as many other clustering frameworks. These previous approximation algorithms rely on expensive linear programming relaxations and are therefore not scalable. While faster heuristic algorithms do exist [42,50], these come with no approximation guarantees.
The present work: fast approximation algorithms for parameterized graph clustering. We develop algorithms for LambdaCC that come with rigorous approximation guarantees and are also far more scalable than existing approximation algorithms for this problem. We present new algorithms for all values of the parameter λ ∈ (0, 1), focusing especially on the regime λ ∈ (1/2, 1), since constant factor approximations are possible in this regime and have been a focus in previous research. This is also the regime where LambdaCC interpolates between two existing objectives known as cluster editing and cluster deletion [41]. We first design a fast combinatorial approximation algorithm that returns a 6-approximation for any value of λ ∈ (1/2, 1) and runs in only O(Σ_{v ∈ V} d_v²) time, where d_v is the degree of node v. While this is a factor of 2 worse than the best existing 3-approximation, it is orders of magnitude faster than this previous approach, which requires solving an LP relaxation with O(n³) variables for an n-node graph and takes Ω(n⁶) time. Our second algorithm is an improved (7 − 2/λ)-approximation for λ ≥ 1/2 (which ranges from 3 to 5 as λ goes from 1/2 to 1) based on rounding an LP relaxation with far fewer constraints. In numerical experiments, we confirm for a large collection of real-world networks that the number of constraints in this cheaper LP tends to be orders of magnitude smaller than the O(n³) constraint set of the canonical LP relaxation. It can also be run on graphs that are so large that even forming the O(n³) constraint matrix for the canonical LP relaxation leads to memory issues. Even more significantly, this cheaper LP is a covering LP, a special type of LP that can be solved using combinatorial algorithms based on the multiplicative weights update method [19,22,37].
We also adapt our techniques to obtain a (1 + 1/λ)-approximation by rounding the cheaper LP when λ < 1/2. As is the case when rounding the tighter and more expensive canonical LP relaxation, this guarantee gets increasingly worse as λ decreases. This is not surprising, given that even the canonical LP relaxation has an Ω(log n) integrality gap [23]. Our (1 + 1/λ)-approximation is in fact quite close to the 1/λ-approximation for small λ previously developed by Gleich et al. [23] based on the canonical LP.
All of our approximation algorithms rely on a new connection between LambdaCC and an edge labeling problem that is based on the social network analysis principle of strong triadic closure [18,24,44]. This principle posits that if two people share strong links to a mutual friend, then they are likely to share at least a weak connection with each other. This principle has inspired a line of research on strong triadic closure (STC) labeling problems [25,26,35,44,48], which label edges in a graph as weak or strong (or in some cases add "missing" edges) in order to satisfy the strong triadic closure property. Previous research has shown that unweighted variants of this labeling problem are related to cluster editing and cluster deletion [25,26] (special cases of LambdaCC when λ = 1/2 and λ ≈ 1, respectively). Recently it was shown that lower bounds and algorithms for these unweighted STC problems can be useful tools in designing faster approximation algorithms for cluster editing and cluster deletion [48]. We generalize this strategy by defining a new parameterized edge labeling problem we call LambdaSTC, which provides new types of lower bounds for LambdaCC. We also provide a 3-approximation algorithm for LambdaSTC that applies for every value of λ ∈ (0, 1). All of these constitute new results for an edge labeling problem of independent interest in social network analysis, but our primary motivation is to use them to develop faster clustering approximation algorithms.
We demonstrate in numerical experiments that our algorithms are fast and effective, far surpassing their theoretical guarantees.
In our experiments, we even find that solving our cheaper LP relaxation actually tends to return a solution that can quickly be certified to be the optimal solution for the more expensive canonical LP relaxation for LambdaCC. When this happens, we can use previous rounding techniques that guarantee a 3-approximation for λ ≥ 1/2.

PRELIMINARIES AND RELATED WORK
We begin with technical preliminaries on graph clustering, correlation clustering, and strong triadic closure edge labeling problems.

The LambdaCC Framework
Given an undirected graph G = (V, E), the high-level goal of a graph clustering algorithm is to partition the node set V into disjoint clusters in such a way that many edges are contained inside clusters and few edges cross between clusters. These two goals are often in competition with each other, and there have been many different approaches for defining and forming clusters, all of which implicitly strike a different type of tradeoff between these goals. The LambdaCC clustering objective [50] provides one approach for implicitly controlling this tradeoff using a resolution parameter λ ∈ (0, 1). Formally, given G = (V, E) and parameter λ ∈ (0, 1), LambdaCC seeks a clustering that minimizes the following objective:

minimize Σ_{(i,j) ∈ E} (1 − λ)(1 − δ_ij) + Σ_{(i,j) ∉ E} λ · δ_ij,    (1)

where δ_ij is a binary cluster indicator for every node pair, i.e., δ_ij = 1 if i and j are clustered together and 0 otherwise. The number of clusters to form is not specified. Rather, the optimal number of clusters is controlled implicitly by tuning λ. Observe that the two pieces of the LambdaCC objective directly correspond to the two goals of graph clustering: the term (1 − λ)(1 − δ_ij) for (i, j) ∈ E is a penalty incurred if i and j are separated, and the term λ · δ_ij for (i, j) ∉ E places a penalty on putting i and j together if they do not share an edge. The relative importance of the two competing goals of graph clustering (form clusters that are internally dense and externally sparse) is then controlled by tuning λ. Smaller values of λ tend to produce a smaller number of (larger) clusters, and choosing large λ leads to a larger number of (smaller) clusters. One of the benefits of the LambdaCC framework is that it generalizes and unifies a number of previously studied objectives for graph clustering, including the sparsest cut objective [5], cluster editing [6,41], and cluster deletion [41]. It is also equivalent to the popular modularity clustering objective [34] under certain conditions. The definition of modularity depends on an underlying null distribution for graphs. When the Erdős-Rényi null model is used, modularity is equivalent to Objective (1) for an appropriate choice of λ. When the Chung-Lu null model is used, modularity can be viewed as a special case of a degree-weighted version of LambdaCC [50], though we focus on Objective (1) in this paper. Finally, for an appropriate choice of λ, the LambdaCC objective is equivalent to graph clustering based on maximum likelihood inference for the popular stochastic block model [1], which can be seen from LambdaCC's relationship to modularity [33].
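To make the LambdaCC objective concrete, the following small evaluator is a sketch of our own (the function name and data layout are illustrative, not from the paper): it charges (1 − λ) for every edge whose endpoints are separated and λ for every non-edge whose endpoints are clustered together.

```python
import itertools

def lambda_cc_objective(n, edges, clusters, lam):
    """Evaluate the LambdaCC objective for a given clustering.

    n        -- number of nodes, labeled 0..n-1
    edges    -- set of frozensets {i, j} giving the edge set E
    clusters -- dict mapping node -> cluster id
    lam      -- resolution parameter lambda in (0, 1)
    """
    cost = 0.0
    for i, j in itertools.combinations(range(n), 2):
        together = clusters[i] == clusters[j]
        if frozenset((i, j)) in edges:
            if not together:          # cut edge: penalty (1 - lambda)
                cost += 1.0 - lam
        elif together:                # non-edge inside a cluster: penalty lambda
            cost += lam
    return cost
```

For example, on a triangle {0, 1, 2} with a pendant edge (2, 3) and λ = 0.4, clustering the triangle together and isolating node 3 cuts one edge (cost 0.6), while merging everything pays λ for the two non-edges (0, 3) and (1, 3) (cost 0.8), illustrating how λ trades off the two penalty types.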

Correlation Clustering
The CC in LambdaCC stands for correlation clustering [6], a framework for clustering based on pairwise similarity and dissimilarity scores. In the most general setting, an instance of weighted correlation clustering is given by a set of vertices V, along with two non-negative weights (w+_ij, w-_ij) for each pair of distinct vertices i, j ∈ V. If nodes i and j are placed in the same cluster, this incurs a disagreement penalty of w-_ij, whereas if they are separated, a disagreement penalty of w+_ij is imposed. In correlation clustering, disagreements are also called mistakes. This terminology is especially natural when only one of the weights (w+_ij, w-_ij) is positive and the other is zero (which is true for the most widely studied special cases). In this case, each node pair (i, j) is either "similar" (w+_ij > 0) and wants to be clustered together, or "dissimilar" (w-_ij > 0) and wants to be clustered apart. A mistake or disagreement happens precisely when nodes are clustered in a way that does not match this "preference."
Formally, the disagreements minimization objective for correlation clustering can be represented as the following integer linear program (ILP):

minimize Σ_{i<j} w+_ij · x_ij + w-_ij · (1 − x_ij)
subject to x_ij ≤ x_ik + x_jk for all distinct i, j, k ∈ V,
x_ij ∈ {0, 1} for all i < j,    (2)

where x_ij is a binary distance variable between nodes i and j, i.e., x_ij = 0 means nodes i and j are clustered together, and x_ij = 1 means they are separated. The most well-studied special case is when (w+_ij, w-_ij) ∈ {(1, 0), (0, 1)}. This is known as complete unweighted correlation clustering, as it is often viewed as a clustering objective on a complete signed graph where each pair of nodes either defines a positive edge or a negative edge. This is equivalent to the cluster editing problem [41], which seeks to add or delete the minimum number of edges in an unsigned graph G = (V, E) to partition it into a disjoint set of cliques. This is in turn related to cluster deletion, where one can only delete edges in G in order to partition it into cliques. Cluster deletion is the same as solving Objective (2) when (w+_ij, w-_ij) ∈ {(1, 0), (0, ∞)} for every pair of nodes (i, j).

Approximation algorithms. Correlation clustering is NP-hard even for the special cases of cluster editing and cluster deletion, but many approximation algorithms have been designed [4,12,13]. Most of these algorithms rely on solving and rounding a linear programming (LP) relaxation of ILP (2), obtained by replacing x_ij ∈ {0, 1} with the constraint x_ij ∈ [0, 1]. For the general weighted case, the best approximation guarantee is O(log n), which matches the integrality gap of the linear program [12,17]. However, constant factor approximations are possible for certain weighted cases [3,4,53]. Ailon et al.
[4] designed a fast randomized combinatorial algorithm called Pivot for the complete unweighted case.This algorithm repeatedly selects a uniform random pivot node in each iteration and clusters it together with its unclustered neighboring nodes that share a positive edge.This algorithm comes with an expected 3-approximation guarantee.However, for general weighted correlation clustering, it can produce poor results.
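The Pivot procedure just described can be sketched as follows for the complete unweighted case, where positive edges are simply the graph's edges (this is our own illustrative implementation; names and the seeding convention are ours):

```python
import random

def pivot(nodes, positive_neighbors, seed=0):
    """Randomized Pivot for complete unweighted correlation clustering.

    positive_neighbors -- dict: node -> set of nodes joined by a positive edge.
    Returns a dict mapping node -> cluster id.
    """
    rng = random.Random(seed)
    unclustered = set(nodes)
    clusters, cid = {}, 0
    while unclustered:
        p = rng.choice(sorted(unclustered))        # uniform random pivot
        members = {p} | (positive_neighbors[p] & unclustered)
        for v in members:                          # pivot grabs its remaining
            clusters[v] = cid                      # positive neighbors
        unclustered -= members
        cid += 1
    return clusters
```

On a graph consisting of two disjoint triangles, any pivot order recovers the two triangles exactly; the 3-approximation guarantee concerns harder inputs where positive triangles overlap.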
Deterministic pivot. A derandomized version of the standard Pivot algorithm was developed by van Zuylen and Williamson [47], which can be applied to a broader class of weighted correlation clustering problems. Instead of randomly choosing pivot nodes, this technique relies on solving the LP relaxation of correlation clustering, constructing a derived graph Ĝ based on the solution to this LP, and then running a pivoting procedure in Ĝ that deterministically selects pivot nodes based on the LP output. They showed that this produces a deterministic 3-approximation algorithm for the complete unweighted case, and it can also be applied to other weighted cases, including the case of probability constraints (where w+_ij + w-_ij = 1 for every pair {i, j}). In proving these results, van Zuylen and Williamson presented a useful theorem (Theorem 3.1 in [47]) that can be used as a general strategy for developing approximation algorithms for other special weighted variants. We state a version of this theorem below, as it will be a useful step for some of our results. The original theorem includes details for choosing pivot nodes deterministically; the approximation holds in expectation when choosing pivot nodes uniformly at random.

Theorem 2.1. Consider an instance of weighted correlation clustering given by a node set V and weights {w+_ij, w-_ij}_{i≠j}. Let b_ij ≥ 0 represent a budget for each node pair {i, j} with i ≠ j, and let Ĝ = (V, Ê) be a graph which for some α > 0 satisfies the following conditions: (1) w-_ij ≤ α · b_ij for all pairs (i, j) ∈ Ê, and w+_ij ≤ α · b_ij for all pairs (i, j) ∉ Ê; (2) for every open wedge (i, j, k) centered at j in Ĝ, w+_ik + w-_ij + w-_jk ≤ α · (b_ij + b_jk + b_ik). Applying Pivot to Ĝ will return a clustering solution with an expected weight of disagreements bounded above by α · Σ_{i<j} b_ij.
Approximations for LambdaCC. The LambdaCC objective on a graph G = (V, E) corresponds to a special case of Objective (2) where (w+_ij, w-_ij) = (1 − λ, 0) if (i, j) ∈ E and (w+_ij, w-_ij) = (0, λ) if (i, j) ∉ E. Veldt et al. [50] previously showed a 3-approximation algorithm for the case where λ ≥ 1/2, based on rounding the standard correlation clustering LP relaxation. However, because this LP has O(n³) constraints for an n-node graph, in practice it is challenging to solve for graphs with even a thousand nodes. Later, Gleich et al. [23] proved that the LP integrality gap can be Ω(log n) for small values of λ. They also developed approximation guarantees for smaller values of λ, but these get increasingly worse as λ → 0. Faster heuristic algorithms for LambdaCC have also been developed [42,50], but these come with no approximation guarantees. Thus, a limitation of this previous work is that existing LambdaCC algorithms either depend on an extremely expensive linear programming relaxation or come with no guarantees. The focus of our paper is to bridge this gap.

Strong Triadic Closure and Edge Labeling
In social network analysis, the principle of strong triadic closure [18,24] posits that two individuals in a social network will share at least a weak connection if they both share strong connections to a common friend. This has been used to define certain types of edge labeling problems where the goal is to label edges in a graph G = (V, E) in such a way that this principle holds [2,25,26,35,44].
Given a graph G = (V, E) (which could represent an observed set of social interactions), a triplet of vertices (i, j, k) is an open wedge centered at j if the vertex pairs (i, j) and (j, k) are edges (i.e., in E) while (i, k) is not. The strong triadic closure principle suggests that if such an open wedge exists, then either (i, j) or (j, k) is a weak edge, or else (i, k) is a missing connection that should appear as an edge in E but was not observed when G was constructed. With this principle in mind, Sintos and Tsaparas [44] defined the strong triadic closure labeling problem (minSTC), where the goal is to label edges as weak and strong so that every open wedge contains at least one weak edge, and in such a way that the number of edges labeled weak is as small as possible. They showed that the problem is NP-hard but has a 2-approximation algorithm based on a reduction to the Vertex Cover problem. They also considered a variation of the problem that allows for edge additions (minSTC+).
In our paper, we use W_j to denote the set of open wedges centered at j in G, and let W = ∪_{j ∈ V} W_j. We use the term STC-labeling to indicate a labeling of node pairs that satisfies strong triadic closure in the following sense: for every open wedge (i, j, k) centered at j, at least one of the edges {(i, j), (j, k)} is labeled weak, or the non-edge (i, k) ∉ E is labeled as a missing edge. Such a labeling is encoded by a collection of weak edges denoted E_weak ⊆ E, as well as a set of missing edges E_miss drawn from the node pairs that are not edges of G. The minSTC+ problem seeks an STC-labeling (E_weak, E_miss) that minimizes |E_weak| + |E_miss|. This can be formally cast as the following ILP:

minimize Σ_{(i,j)} x_ij
subject to x_ij + x_jk + x_ik ≥ 1 for every open wedge (i, j, k) ∈ W,
x_ij ∈ {0, 1} for every node pair (i, j).

If x_ij = 1, this represents the presence of either a weak edge (if (i, j) ∈ E) or a missing edge (if (i, j) ∉ E). This problem is also NP-hard but can be reduced to Vertex Cover in a 3-uniform hypergraph in order to obtain a 3-approximation algorithm.
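The open-wedge set W can be enumerated directly from an adjacency structure by visiting each node j and each pair of its neighbors; the helper below is our own illustration of that definition (not code from the paper):

```python
from itertools import combinations

def open_wedges(adj):
    """Enumerate open wedges (i, j, k): j is adjacent to both i and k,
    while (i, k) is not an edge.

    adj -- dict: node -> set of neighbors (undirected graph).
    Returns a set of (i, j, k) tuples with i < k and center j.
    """
    wedges = set()
    for j, nbrs in adj.items():
        for i, k in combinations(sorted(nbrs), 2):
            if k not in adj[i]:          # (i, k) missing, so the wedge is open
                wedges.add((i, j, k))
    return wedges
```

A path 0-1-2 has exactly one open wedge, (0, 1, 2), while a triangle has none, matching the definition above.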
While the strong triadic closure principle and the resulting edge labeling problems are of independent interest, we are particularly interested in these problems given their relationships with certain clustering objectives. The optimal solution value for minSTC is known to lower bound the cluster deletion objective, and minSTC+ similarly lower bounds cluster editing [25,26,29,48]. The LP relaxations for these problems therefore provide lower bounds for cluster deletion and cluster editing that are cheaper and easier to compute than the standard linear programming relaxations. Veldt [48] recently showed how to round these LP relaxations, and how to round approximate solutions for minSTC+ and minSTC, to develop faster approximation algorithms for cluster editing and cluster deletion. We generalize these techniques in order to develop faster approximation algorithms for the full parameter regime of LambdaCC.

LAMBDASTC LABELING
We now introduce a parameterized edge labeling problem called LambdaSTC, which generalizes previous edge labeling problems and can also be used to develop new approximations for LambdaCC.
Problem definition. Given a graph G = (V, E) and a parameter λ ∈ (0, 1), LambdaSTC is the problem of finding an STC-labeling (E_weak, E_miss) that minimizes

(1 − λ) · |E_weak| + λ · |E_miss|.

We first note that this problem is equivalent to the minSTC+ problem when λ = 1/2. When λ is close enough to 1, LambdaSTC is equivalent to minSTC. To see why, note that if λ > |E|(1 − λ), then labeling a single non-edge as "missing" (cost λ) is more expensive than labeling all edges in E as "weak" (cost |E|(1 − λ)). Hence, with a couple of steps of algebra, we can see that when λ > |E|/(|E| + 1), the optimal LambdaSTC solution will not place any non-edges in E_miss, but will only add edges to E_weak in order to construct a valid STC-labeling, so this differs from minSTC only by a multiplicative constant factor.
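A quick numeric check of this threshold argument (the cost function is just the LambdaSTC objective restated in code; the tiny example graph and numbers are our own):

```python
def lambda_stc_cost(e_weak, e_miss, lam):
    """LambdaSTC labeling cost: (1 - lambda)|E_weak| + lambda|E_miss|."""
    return (1 - lam) * len(e_weak) + lam * len(e_miss)

# With |E| = 3 edges the threshold is lambda > 3/4. At lambda = 0.8,
# labeling every edge weak (cost 3 * 0.2 = 0.6) is cheaper than labeling
# even a single non-edge missing (cost 0.8), so an optimal solution uses
# no missing edges, as in the minSTC regime.
all_weak = lambda_stc_cost({(0, 1), (1, 2), (2, 3)}, set(), 0.8)
one_miss = lambda_stc_cost(set(), {(0, 2)}, 0.8)
```

Here `all_weak < one_miss`, matching the claim that for λ above the threshold the optimizer avoids E_miss entirely.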
Varying λ between 1/2 and 1 offers us the flexibility to interpolate between minSTC+ and minSTC. Meanwhile, the λ < 1/2 regime corresponds to a new family of edge labeling problems where it is cheaper to label non-edges as missing. If we think of the graph G = (V, E) as a (potentially noisy) representation of some social network observed from the real world, then the parameter λ can be chosen based on a user's belief about the accuracy of the process that was used to observe edges. If the user has a strong belief that there are many friendships in the social network that were just not directly observed (and hence are not included as edges in the graph), then a smaller value of λ may be appropriate. If missing edges are unlikely, then a large value of λ is appropriate.
Approximating LambdaSTC. Approximation algorithms for minSTC and minSTC+ can be obtained by reducing these problems to unweighted Vertex Cover problems (in graphs and hypergraphs, respectively) [26,44]. We generalize this approach and design an approximation algorithm that applies to LambdaSTC for any choice of the parameter λ, based on the Local-Ratio algorithm for weighted Vertex Cover [7]. Algorithm 1 is pseudocode for our method, CoverLabel. This method "covers" all the open wedges in graph G by either adding a missing edge between a pair of non-adjacent nodes or labeling at least one of the two edges as weak. This can be seen as finding a weighted vertex cover on a 3-uniform hypergraph H = (V_H, E_H) constructed as follows:
• Every node pair (i, j) of G is assigned to a vertex v_ij in V_H with a node weight of (1 − λ) if (i, j) ∈ E and λ otherwise.
• Every open wedge (i, j, k) ∈ W defines a hyperedge {v_ij, v_jk, v_ik} in E_H.
Nodes in H correspond to node pairs in G, and hyperedges in H correspond to open wedges in G. Therefore, a vertex cover in H corresponds to a labeling of node pairs in G that "covers" all open wedges in a way that produces an STC-labeling. If a covered vertex is associated with an edge (i, j) ∈ E, we consider (i, j) a weak edge. However, if it corresponds to a non-edge pair (i, j) ∉ E, this pair is labeled as a missing edge. This provides an approximation-preserving reduction from LambdaSTC to 3-uniform hypergraph weighted Vertex Cover, so employing a 3-approximation algorithm for hypergraph vertex cover results in a 3-approximation for LambdaSTC. CoverLabel is equivalent to implicitly applying the Local-Ratio algorithm [7] to the hypergraph H described above. By implicitly, we mean that we do not form H explicitly, but we apply the mechanics of this algorithm directly to find an STC-labeling in G. The following theorem follows from the guarantee of the Local-Ratio algorithm for node-weighted 3-uniform hypergraphs: CoverLabel is a 3-approximation algorithm for LambdaSTC for every λ ∈ (0, 1).

We also make use of the LP relaxation of the LambdaSTC objective:

minimize Σ_{(i,j) ∈ E} (1 − λ) · x_ij + Σ_{(i,j) ∉ E} λ · (1 − x_ij)
subject to x_ik ≤ x_ij + x_jk for every open wedge (i, j, k) ∈ W,
0 ≤ x_ij ≤ 1 for every node pair (i, j).    (5)

This linear program shares the same objective function as the canonical LambdaCC LP relaxation, but has a subset of its O(n³) triangle inequality constraints. In particular, it only enforces the constraint x_ik ≤ x_ij + x_jk when (i, j, k) is an open wedge centered at j, rather than for all triplets of nodes. This makes the LP relaxation easier to solve on a large scale. Furthermore, this is an example of a covering LP (substituting y_ij = x_ij for edges and y_ij = 1 − x_ij for non-edges turns each wedge constraint into y_ij + y_jk + y_ik ≥ 1), and covering LPs can be solved much more quickly than a generic LP using the multiplicative weights update method [19,22,37].
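The local-ratio mechanics behind CoverLabel can be sketched compactly. The implementation below is our own illustration (it processes wedges in an arbitrary fixed order and is not Algorithm 1's exact pseudocode): for each open wedge, it subtracts the minimum residual weight among the wedge's three node pairs from all three; the pairs whose residual reaches zero form the cover, which the local-ratio guarantee bounds at 3 times the optimal weight.

```python
from itertools import combinations

def cover_label(adj, lam):
    """Local-ratio sketch for LambdaSTC (our illustrative version).

    adj -- dict: node -> set of neighbors; lam -- parameter in (0, 1).
    Returns (E_weak, E_miss) as sets of sorted node-pair tuples.
    """
    residual = {}

    def res(pair):
        # Lazily initialize residual weight: 1 - lam for edges, lam otherwise.
        if pair not in residual:
            i, j = pair
            residual[pair] = (1 - lam) if j in adj[i] else lam
        return residual[pair]

    for j in adj:
        for i, k in combinations(sorted(adj[j]), 2):
            if k in adj[i]:
                continue                              # closed triangle, not a wedge
            pairs = [tuple(sorted((i, j))), tuple(sorted((j, k))), (i, k)]
            delta = min(res(p) for p in pairs)        # local-ratio step
            for p in pairs:
                residual[p] -= delta                  # zeroes at least one pair
    cover = {p for p, r in residual.items() if r <= 1e-12}
    e_weak = {p for p in cover if p[1] in adj[p[0]]}  # covered edges -> weak
    e_miss = cover - e_weak                           # covered non-edges -> missing
    return e_weak, e_miss
```

On the path 0-1-2, λ = 0.8 makes the two edges the cheap choice (both are labeled weak), while λ = 0.2 makes the non-edge (0, 2) cheapest (it is labeled missing, which is optimal here).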

FASTER LAMBDACC ALGORITHMS
We now present faster algorithms for LambdaCC, using lower bounds derived from the LambdaSTC objective. For a fixed λ value, we use the notation STC*(λ) and CC*(λ) to represent the optimal solution values for LambdaSTC and LambdaCC, respectively.

CoverFlipPivot algorithm
We present the first combinatorial algorithm for LambdaCC, called CoverFlipPivot (CFP), which provides a 6-approximation for every λ ≥ 1/2. As outlined in Algorithm 2, CFP comprises three steps: run CoverLabel to obtain an STC-labeling (E_weak, E_miss), "flip" the labeled pairs to form a derived graph Ĝ (weak edges are deleted and missing edges are added), and then run Pivot on Ĝ. Before proving any approximation guarantees for CFP, we begin with a more general result that sheds light on the relationship between LambdaSTC and LambdaCC. This generalizes previous results showing that the optimal cluster deletion and minSTC objectives (and similarly, the cluster editing and minSTC+ objectives) differ by at most a factor of 2 [48].

Theorem 4.1. Let (E_weak, E_miss) be an STC-labeling of G and let Ĝ be the graph obtained by flipping the labeled pairs. For λ ≥ 1/2, applying Pivot to Ĝ returns a clustering whose expected LambdaCC objective is at most 2((1 − λ)|E_weak| + λ|E_miss|).

Proof. To prove this result, we show that all conditions of Theorem 2.1 are satisfied for α = 2. Recall that for the LambdaCC framework, weights are defined as (w+_ij, w-_ij) = (1 − λ, 0) if (i, j) ∈ E and (0, λ) otherwise. To bound the LambdaCC objective in terms of the STC-labeling, we define budgets based on flipped edges: b_ij = 1 − λ if (i, j) ∈ E_weak, b_ij = λ if (i, j) ∈ E_miss, and b_ij = 0 otherwise. The sum of budgets can now be written as Σ_{i<j} b_ij = (1 − λ)|E_weak| + λ|E_miss|, and a case analysis over the possible labelings of each pair and each open wedge in Ĝ verifies Conditions (1) and (2) of Theorem 2.1 with α = 2. □

This yields the following guarantee for CFP.

Theorem 4.2. If A is a γ-approximation algorithm for LambdaSTC, then running CFP with A in place of CoverLabel yields an expected (2γ)-approximation for LambdaCC when λ ≥ 1/2.

Proof. The optimal LambdaCC solution provides an upper bound for the optimal LambdaSTC solution, i.e., STC*(λ) ≤ CC*(λ). Algorithm A produces a γ-approximate labeling solution (E_weak, E_miss) with LambdaSTC objective value STC(A) = (1 − λ)|E_weak| + λ|E_miss| ≤ γ · STC*(λ), so we have that STC(A) ≤ γ · CC*(λ). Therefore, (1/γ) · STC(A) lower bounds CC*(λ). Applying Theorem 4.1 with algorithm A, we obtain a clustering with LambdaCC objective score CC(A) ≤ 2 · STC(A) ≤ 2γ · CC*(λ), so CC(A) is a (2γ)-approximation for LambdaCC. If A optimally solves LambdaSTC, then γ = 1 and so STC*(λ) ≤ CC*(λ) ≤ 2 · STC*(λ). If A represents our 3-approximate CoverLabel algorithm for LambdaSTC, then combining it with Theorem 4.1 shows that Algorithm 2 is a 2 · 3 = 6-approximation for LambdaCC. □

Algorithm 2 can be easily derandomized using the deterministic strategy for choosing pivot nodes for Theorem 2.1 (see [47]).
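The "flip" step that turns an STC-labeling into the derived graph Ĝ can be sketched as follows (this reflects our reading of the cover-flip-pivot pipeline, with our own function names): weak edges are deleted and missing edges are inserted, after which Pivot is run on the result.

```python
def flip_graph(n, edges, e_weak, e_miss):
    """Derived graph for the flip step: take E, remove the weak edges,
    and add the missing edges. Inputs are iterables of (i, j) pairs;
    returns an adjacency dict over nodes 0..n-1."""
    norm = lambda pairs: {frozenset(p) for p in pairs}
    flipped = (norm(edges) - norm(e_weak)) | norm(e_miss)
    adj = {v: set() for v in range(n)}
    for e in flipped:
        i, j = tuple(e)
        adj[i].add(j)
        adj[j].add(i)
    return adj
```

For the path 0-1-2 with (0, 1) labeled weak and (0, 2) labeled missing, the flipped graph is the path 0-2-1, i.e., a triangle with the weak edge removed and the missing edge present.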

Faster LP algorithm
Algorithm 3 is an approximation algorithm for LambdaCC based on rounding the LambdaSTC LP relaxation. This LP has |W| constraints, whereas the canonical LP has O(n³). In the worst case, it is possible for |W| to be as large as Θ(n³), but our experimental results demonstrate that |W| is far smaller for all of the real-world graphs we consider. Even more significantly, the LambdaSTC LP is a covering LP, which makes it possible to use fast existing techniques for approximating covering LPs. This leads to much faster algorithms, at the expense of an only slightly worse approximation factor since the LP is only solved approximately. The next section provides a more detailed runtime analysis.
Our approach for rounding the LambdaSTC LP follows a similar strategy as CFP, which involves building a new graph Ĝ and then running Pivot. The construction of Ĝ depends on the LP variables {x_ij}, the edge structure in G, and the value of λ. When λ ≥ 1/2, we always ensure that a non-edge in G maps to a non-edge in Ĝ. For λ < 1/2, we always ensure that an edge in G maps to an edge in Ĝ. We first prove that the algorithm has an approximation factor that ranges from 3 to 5 as λ goes from 1/2 to 1.
Proof. To prove Theorem 4.4, we show the conditions in Theorem 2.1 are satisfied for α = 7 − 2/λ. In our analysis, we define budgets b_ij for each distinct pair of nodes (i, j) based on the LP objective (5). Specifically, we set b_ij = (1 − λ) · x_ij if (i, j) ∈ E, and b_ij = λ · (1 − x_ij) if (i, j) ∉ E. We begin by checking Condition (1) in Theorem 2.1 for each distinct pair of nodes (i, j), i.e., that w-_ij ≤ α · b_ij whenever (i, j) ∈ Ê and w+_ij ≤ α · b_ij whenever (i, j) ∉ Ê.

A 3-approximation via an intermediate LP
Veldt et al. [49,50] originally presented a 3-approximation algorithm for λ ≥ 1/2 based on the canonical LP relaxation. This algorithm, however, comes with an O(n³)-size constraint matrix, since triangle inequality constraints are included for all triplets of nodes (i, j, k) ∈ V³. In contrast, in the previous section we proposed a faster 6-approximation algorithm based on the LambdaSTC LP relaxation, which includes a triangle inequality constraint x_ik ≤ x_ij + x_jk only when (i, j, k) is an open wedge (centered at j) in G. In this section, we show how to obtain a 3-approximation by rounding an LP relaxation whose constraint set lies somewhere between the LambdaSTC and canonical LambdaCC LP relaxations. In more detail, we include a triangle inequality constraint for every wedge in G as well as for every triangle in G. This is a superset of the constraint set for the LambdaSTC LP relaxation but does not include a triangle inequality constraint for every triplet {i, j, k}. Formally, this LP relaxation is given by

minimize Σ_{(i,j) ∈ E} (1 − λ) · x_ij + Σ_{(i,j) ∉ E} λ · (1 − x_ij)
subject to x_ik ≤ x_ij + x_jk for every (i, j, k) ∈ W_j ∪ T_j and every j ∈ V,
0 ≤ x_ij ≤ 1,    (11)

where T_j represents the set of triangles that include node j as one of their vertices. Algorithm 4 uses the same rounding strategy that was used for the canonical LP relaxation [50], except that it is applied to the LP in (11) rather than the canonical LP.
The LP relaxation presented here has a constraint count determined by the number of wedges and triangles, |W| + |T|, in the graph. While both |W| and |T| can potentially be Θ(n³) in the worst case, this is not typically the scenario in practice. In real-world networks, the number of wedges and triangles is significantly smaller. Figure 1 illustrates this observation by comparing the number of constraints in the intermediate LP (11) against the canonical LP. Thus, solving and rounding this LP is more efficient than existing techniques, and we now prove that this can be done without a loss in the approximation factor.

Theorem 4.5. Algorithm 4 is a randomized 3-approximation algorithm for LambdaCC when λ ≥ 1/2.

Proof. We can prove that Algorithm 4 satisfies Theorem 2.1 by making slight modifications to the proof of Theorem 6 in the work of Veldt et al. [49]. Condition (1) of Theorem 2.1 can be satisfied following the proof in [49] as is. To prove Condition (2), we demonstrate that inequality (12) holds for every triplet of nodes (i, j, k) that is mapped to an open wedge centered at j in Ĝ. This means that x_ij, x_jk < 1/3 and x_ik ≥ 1/3. Note that we are only able to apply the triangle inequality x_ik ≤ x_ij + x_jk if (i, j, k) is also an open wedge or a triangle in the original graph G. Given an arbitrary triplet (i, j, k) that maps to an open wedge in Ĝ, there are 8 possibilities for the edge structure in G, depending on which pairs of nodes in (i, j, k) share an edge in G. Following Veldt et al. [49], we can prove inequality (12) for each of the 8 cases separately. Note that we do not need to update the analysis for cases where (i, j, k) is an open wedge or a triangle in G, since our new LP (11) includes triangle inequality constraints for these cases. This means that the following cases from the analysis of Veldt et al. [49] remain unchanged:
• Case 1: (i, j, k) forms a wedge centered at j in G.
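The |W| + |T| constraint count can be compared against the canonical O(n³) scale with a small counter, in the spirit of the Figure 1 comparison (this helper is our own, not the paper's experimental code):

```python
from itertools import combinations
from math import comb

def constraint_counts(adj):
    """Return (|W|, |T|, C(n, 3)): open wedges, triangles, and the number
    of node triplets that the canonical LP constrains.

    adj -- dict: node -> set of neighbors (undirected graph).
    """
    wedges = closed = 0
    for j in adj:
        for i, k in combinations(sorted(adj[j]), 2):
            if k in adj[i]:
                closed += 1       # each triangle is seen once per corner
            else:
                wedges += 1
    return wedges, closed // 3, comb(len(adj), 3)
```

For instance, a star with 3 leaves has 3 wedges and no triangles against C(4, 3) = 4 triplets, while a triangle has no wedges and a single triangle; on sparse real-world graphs the first two counts are typically far below the third.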

Runtime Analysis
For a graph G = (V, E), let n = |V| and m = |E|. When written in standard form min{cᵀx : Ax = b, x ≥ 0}, the canonical LP relaxation for LambdaCC has O(n³) constraints and variables. Even using recent theoretical algorithms for solving linear programs in matrix multiplication time [14,28], the runtime is Ω((n³)^ω), where ω ≥ 2 is the matrix multiplication exponent, so the runtime for solving and rounding the canonical relaxation is Ω(n⁶). Not only does this have a prohibitively expensive runtime, but in practice even forming such a large constraint matrix can lead to memory issues that make it infeasible to solve on a very large scale. Thus, although this approach provides the best theoretical approximation factor, it is not scalable.
Our new approximation algorithms come with good approximation guarantees and are much faster than solving the canonical relaxation, both in theory and in practice. Finding the open wedges of G can be done in O(Σ_{u∈V} d_u^2) time by visiting each node and then visiting each pair of neighbors of that node in turn. This runtime is upper bounded by O(mn). When applying the randomized Pivot algorithm, this is in fact the most expensive part of CFP, so the overall runtime for CFP is O(Σ_{u∈V} d_u^2) = O(mn). If we use the deterministic pivoting strategy of van Zuylen and Williamson [47], this step can be implemented in O(n^3) time, so that is the runtime for a derandomized version of CFP.
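The wedge-finding step described above can be sketched as follows (a minimal illustration in Python; the adjacency-dict representation and names are ours): for each node u, examine every pair of its neighbors and report the pair as an open wedge centered at u when the two neighbors are not themselves adjacent, for Σ_u d_u^2 total work.

```python
from itertools import combinations

def open_wedges(adj):
    """Enumerate open wedges (i, u, k): u is adjacent to both i and k,
    but i and k are not adjacent. Runs in O(sum of squared degrees).

    adj: dict mapping each node to the set of its neighbors.
    """
    wedges = []
    for u, neighbors in adj.items():
        for i, k in combinations(sorted(neighbors), 2):
            if k not in adj[i]:          # (i, k) missing => open wedge at u
                wedges.append((i, u, k))
    return wedges
```

A path 0-1-2 yields the single open wedge (0, 1, 2), while a triangle yields none, matching the definition of |W|.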
The LambdaSTC LP is a covering LP, so for any ε > 0 we can find a (1 + ε)-approximate solution in Õ(|W|/ε^2) time using the multiplicative weights update method [19, 22, 37], where Õ suppresses logarithmic factors. This assumes we already know W; if we factor in the time it takes to find all open wedges, the runtime comes to Õ(|W|/ε^2 + Σ_{u∈V} d_u^2). A minor alteration to our analysis shows that a (1 + ε)-approximate solution to the LP translates into approximation factors that are a factor (1 + ε) larger. Once again, applying the deterministic pivot selection adds O(n^3) to the runtime, which is still far better than Ω(n^6).
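To make the multiplicative weights idea concrete, here is a toy sketch on a generic 0/1 covering LP, not the LambdaSTC LP itself (the loop structure, oracle, and names are our own illustrative assumptions): constraints act as experts, weights concentrate on unsatisfied constraints, and an oracle spends the whole budget on the column that best covers the current weighted constraints; the averaged iterate approaches feasibility.

```python
import numpy as np

def mwu_cover(A, budget, eta=0.1, iters=200):
    """MWU loop for the decision version of a 0/1 covering LP:
    is there x >= 0 with sum(x) <= budget and A @ x >= 1?
    Returns the averaged iterate; the caller checks near-feasibility.
    """
    m, n = A.shape
    w = np.ones(m)                      # one weight per covering constraint
    x_sum = np.zeros(n)
    for _ in range(iters):
        p = w / w.sum()                 # distribution over constraints
        col = np.argmax(p @ A)          # oracle: best column for these weights
        x = np.zeros(n)
        x[col] = budget                 # spend the whole budget on that column
        gains = A @ x - 1.0             # constraint j satisfied iff gain >= 0
        w *= np.exp(-eta * gains / max(budget, 1.0))  # damp satisfied rows
        x_sum += x
    return x_sum / iters

# Two constraints: x0 + x1 >= 1 and x1 + x2 >= 1; optimum puts mass on x1.
A = np.array([[1, 1, 0],
              [0, 1, 1]], dtype=float)
x = mwu_cover(A, budget=1.0)
```

On this instance the oracle always selects the middle column, so the average converges to the optimal solution x = (0, 1, 0) of cost 1.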

EXPERIMENTS
This section presents the empirical performance of our algorithms. We conduct experiments on publicly available datasets from various domains, including the SNAP [31] and Facebook100 [46] datasets, which are available in the SuiteSparse matrix collection [15]. We implement the algorithms in the Julia programming language, and we run all experiments on a Dell XPS machine with 16 GB RAM and an Intel Core i7 processor. Both the canonical and the LambdaSTC LP relaxations are solved using the Gurobi optimization software [27]. We focus here on finding exact solutions for the LambdaSTC LP relaxation using existing optimization software; this is already far more scalable than forming the constraint matrix for the canonical LP relaxation and solving it with Gurobi. Finding faster approximate solutions for the LP using the multiplicative weights update method is a promising direction for future research, but is beyond the scope of the current paper. Code for our implementations and experiments is available at https://github.com/Vedangi/FastLamCC.

Approximation algorithms for LambdaCC
A natural question to ask is how well our approximation algorithms compare against previous algorithms for LambdaCC based on the canonical LP relaxation. It is worth noting first of all that even forming the full constraint matrix for the canonical LP (which has n(n − 1)(n − 2)/2 = O(n^3) triangle inequality constraints) becomes infeasible for even modest-sized graphs due to memory constraints. Meanwhile, the LambdaSTC LP relaxation has one triangle inequality constraint for each open wedge. Although there exist graphs with |W| = Θ(n^3), this does not happen in practice. Figure 1 plots |W| for all of the Facebook100 networks, as well as for a range of graphs of different classes from SNAP (e.g., social networks, citation networks, web networks). In all cases |W| is orders of magnitude smaller than n(n − 1)(n − 2)/2, illustrating that solving and rounding this LP is far more practical than using existing LP-based techniques. We also plot the number of constraints in the intermediate LP relaxation from Section 4.3, showing that it has only a slight increase in constraint size over the LambdaSTC LP.
An additional reason to use the LambdaSTC relaxation is that in practice, solving the LambdaSTC relaxation often also solves the canonical LP relaxation. This can be checked by verifying whether the optimal LP variables for the LambdaSTC relaxation are also feasible for the canonical LP.¹ Table 2 shows results for solving and rounding the LambdaSTC LP (Algorithm 3) on three graphs for a range of λ values. The graphs are Simmons81 (a social network), ca-GrQc (a collaboration network), and Polblogs (a political blogs network). We attempted to form and solve the full canonical LP relaxation for these graphs but quickly ran out of memory. We were able to form and solve the LambdaSTC LP relaxation, and in almost all cases the optimal solution variables for this LP were certified as being feasible (and hence optimal) for the canonical LP. Thus, our LP-based algorithm far exceeded its theoretical guarantees: in practice, it produced solutions within a factor of 2 or less of the LP lower bound. When rounding, we applied both our new approach (Algorithm 3) and the existing rounding strategy for the canonical LP relaxation, since the rounding step is very fast. We used the previous rounding strategy for the canonical LP whenever we could certify we had solved the canonical LP (since this has an improved a priori guarantee). In practice, though, the results for the two rounding strategies were nearly indistinguishable.
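The feasibility check just described can be sketched as follows (an illustrative Python version; representing LP distances as a dict keyed by node pairs is our own choice): for every triple of nodes, verify all three triangle inequalities on the pairwise LP distances.

```python
from itertools import combinations

def satisfies_all_triangle_inequalities(nodes, x, tol=1e-9):
    """Check whether pairwise LP distances x[(u, v)] (keys with u < v)
    satisfy x_uv <= x_uw + x_wv for every triple, i.e. whether a
    LambdaSTC solution is also feasible for the canonical LP.
    """
    def d(u, v):
        return x[(u, v)] if u < v else x[(v, u)]
    for i, j, k in combinations(sorted(nodes), 3):
        for a, b, c in ((i, j, k), (i, k, j), (j, k, i)):
            if d(a, b) > d(a, c) + d(c, b) + tol:
                return False
    return True
```

If the check passes, the LambdaSTC optimum is feasible (hence optimal) for the canonical LP; the brute-force loop over all O(n^3) triples is only for verification, not optimization.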
Table 2 also displays results for CFP, showing that it is orders of magnitude faster even than solving and rounding the LambdaSTC LP relaxation, while producing comparable approximation ratios (the ratio between the clustering solution and the computed lower bound). While solving our LP relaxation takes up to hundreds of seconds on the three graphs, CFP takes mere fractions of a second.

Combining CFP with Fast Heuristics
The Louvain method is a widely used heuristic clustering technique that greedily moves nodes in order to optimize a clustering objective [9]. The original Louvain method was designed for maximum modularity clustering, but many variations have since been designed, including a fast heuristic called LambdaLouvain [50], which greedily optimizes the LambdaCC objective for a given parameter λ, as well as a parallel version of this method [42]. Although these methods are fast and perform well in practice, they compute no lower bounds for the LambdaCC objective and provide no approximation guarantees. One benefit of our algorithms is that they come with lower bounds that can be used not only to design faster approximation algorithms for LambdaCC, but also to obtain a posteriori guarantees for other heuristic methods.

Figures 2 and 3 showcase the combined results of LambdaLouvain with CFP lower bounds on graphs even larger than those considered in Table 2. These results demonstrate superior a posteriori approximation ratios (clustering objective divided by CFP lower bound) compared to running CFP by itself. Notably, as λ → 1, the difference in approximation factors between CFP and LambdaLouvain decreases, converging toward a similar outcome. We execute both the CFP rounding procedure and the LambdaLouvain method for 15 iterations, reporting the mean and standard deviation of the approximation ratios and runtimes. While CFP+LambdaLouvain yields better approximations, it comes with longer runtimes than CFP alone. Even for small values of λ, we can certify that LambdaLouvain produces a respectable factor of around 2 by using the lower bounds generated by CFP.

¹ This can also be viewed as the first step in a more memory-efficient approach for solving correlation clustering LP relaxations that has been applied elsewhere [48, 52]: solve the LP over a subset of the constraints and iteratively add more constraints until the variables satisfy all triangle inequalities. Our results indicate that for these graphs and λ values, enforcing triangle inequality constraints only at open wedges is sufficient.
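The a posteriori certification used in this subsection amounts to a simple division: the heuristic's objective value over a valid lower bound. As a sketch (in Python for illustration; this uses one standard form of the LambdaCC disagreement objective, and the function names are ours):

```python
from itertools import combinations

def lambda_cc_objective(nodes, edges, clusters, lam):
    """LambdaCC disagreement objective (one standard form): a cut edge
    costs (1 - lam); a non-edge kept inside a cluster costs lam.
    """
    label = {u: idx for idx, cl in enumerate(clusters) for u in cl}
    edge_set = {tuple(sorted(e)) for e in edges}
    cost = 0.0
    for u, v in combinations(sorted(nodes), 2):
        same = label[u] == label[v]
        if (u, v) in edge_set:
            if not same:
                cost += 1.0 - lam      # edge cut between clusters
        elif same:
            cost += lam                # non-edge placed inside a cluster
    return cost

def a_posteriori_ratio(objective_value, lower_bound):
    """Certified approximation factor for any heuristic clustering."""
    return objective_value / lower_bound
```

Any heuristic, LambdaLouvain included, can be certified this way: no matter how the clustering was produced, its objective divided by the CFP lower bound upper-bounds its true approximation factor.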

Scalability of CoverFlipPivot
We further test the limits of CFP by running it on much larger graphs. Figure 4 shows approximation results on a social network with 1.59 million edges (Texas84), a road network with 2.76 million edges (roadNet-CA), a citation network with 16.5 million edges (cit-Patents), an Amazon product co-purchasing network with 2.4 million edges (amazon0601), a Wikipedia web network with 25.4 million edges (wiki-topcats), and a blogging community network with 34.6 million edges (com-LiveJournal). CFP consistently outperforms its theoretical 6-approximation guarantee. For λ ≥ 0.55, it produces approximations of 2.1 or better. When λ = 0.5, approximation factors increase to between 2.4 and 2.8, which still outperforms the 6-approximation guarantee. (We omit these results from the plot in Figure 4 in order to zoom in and better display factors near 2 for λ ≥ 1/2.) For each value of λ, the method takes around 58 minutes on the largest graph, com-LiveJournal (34.6 million edges).
In contrast, the cit-Patents graph, with 16.5 million edges, is processed in less than 11 minutes, a notably faster runtime, and the method is quicker still on the other graphs. An intriguing observation is that as the objective transitions from cluster editing to cluster deletion (λ → 1), both the approximation factor and the runtime improve.

CONCLUSION
We present new approximation algorithms for the LambdaCC graph clustering framework that are far more scalable than existing approximation algorithms, which rely on LP relaxations with O(n^3) constraints. We introduce the first combinatorial algorithm for LambdaCC in the parameter regime λ ∈ (1/2, 1), where the problem interpolates between cluster editing and cluster deletion, which comes with a 6-approximation guarantee. We then provide algorithms for all parameter regimes based on rounding a less expensive LP relaxation. A major theoretical benefit of these alternative LPs is that they are covering LPs, which means that the multiplicative weights update method provides fast combinatorial methods for finding approximate solutions. Although in this work we focused on using existing optimization software to solve these relaxations exactly, a clear direction for future research is to implement these faster approximate solvers in order to achieve additional runtime improvements. Another direction for future work is to determine whether it is possible to develop a 3-approximation for all λ ∈ (1/2, 1) by rounding the LambdaSTC LP. Though our theoretical approximation factors grow worse as λ → 1, in practice we see no deterioration in approximation quality. Finally, a compelling open question is whether we can develop an O(log n)-approximation algorithm for LambdaCC that applies for all values of λ, can be made purely combinatorial, and does not rely on the canonical LP.

Figure 1 :
Figure 1: Comparing the number of constraints in the canonical LP against the number of constraints in the LambdaSTC LP and the intermediate LP (from Section 4.3) for graphs from Facebook100 (left) and SNAP datasets (right). Each dot represents the number of open wedge constraints for each graph, while each star represents the number of constraints for the intermediate LP. We use the same SNAP graphs as Veldt [48], and color code based on their type (location-based social, other social, web, communication, road, product, collaboration, and citation networks).

Figure 3 :
Figure 3: Runtimes for combining CFP and LambdaLouvain on ca-HepPh, Auburn71, cit-HepTh, and Michigan23 for different values of λ. The time to compute the CFP lower bound is shown in blue; this is the bottleneck for the algorithm.