SAT-boosted Tabu Search for Coloring Massive Graphs

Graph coloring is the problem of coloring the vertices of a graph with as few colors as possible, avoiding monochromatic edges. It is one of the most fundamental NP-hard computational problems. For decades researchers have developed exact and heuristic methods for graph coloring. While methods based on propositional satisfiability (SAT) feature prominently among these exact methods, the encoding size is prohibitive for large graphs. For such graphs, heuristic methods have been proposed, with tabu search among the most successful ones. In this article, we enhance tabu search for graph coloring within the SAT-based local improvement (SLIM) framework. Our hybrid algorithm incrementally improves a candidate solution by repeatedly selecting small subgraphs and coloring them optimally with a SAT solver. This approach scales to dense graphs with several hundred thousand vertices and over 1.5 billion edges. Our experimental evaluation shows that our hybrid algorithm beats state-of-the-art methods on large dense graphs.


INTRODUCTION
Graph coloring is the fundamental computational problem of coloring the vertices of a given undirected graph with as few colors as possible, avoiding monochromatic edges, or, equivalently, partitioning the graph's vertex set into as few independent sets as possible. Graph coloring arises naturally in many applications, including scheduling, register allocation, pattern matching, and computational geometry. The decision version of the problem, where the number of colors is given and one asks whether a coloring exists, can be naturally cast as a constraint satisfaction problem: The graph's vertices are variables that range over a finite domain of colors, and each edge represents a binary inequality constraint. Graph coloring is one of Karp's 21 fundamental NP-hard problems [20].
1.5:2 A. Schidler and S. Szeider

For decades, much research has been devoted to developing algorithmic methods for graph coloring. One can distinguish between exact methods that search for a coloring with the smallest number of colors possible and heuristic methods that possibly yield suboptimal colorings.
Exact methods for graph coloring include constraint programming (CP), propositional satisfiability (SAT), and integer linear programming (ILP) formulations [6,16,18]. Here, the problem is expressed in terms of constraints, propositional logic, or linear constraints over integer domains, respectively, and then solved by a general solver. Generally, these exact methods do not scale to graphs with more than a few thousand vertices, as these encodings become prohibitively large. In our experiments, the largest graph successfully colored by a SAT encoding had around 14,000 vertices and was comparatively easy to color due to the graph's sparsity.
Heuristic graph coloring methods include various forms of greedy colorings combined with local search, especially tabu search, which reduces the number of colors used by the greedy coloring [4,5,17]. Such heuristic methods scale to very large graphs and find good colorings for sparse graphs but struggle with large, dense graphs.
The CG:SHOP Challenge 2022 posed the Minimum Partition into Plane Subgraphs Problem (MPPS): the problem of finding the smallest number of classes we can partition a given set of line segments into, such that line segments within the same class do not intersect. This problem is reducible to graph coloring (see Section 2.2), and the competition instances were crafted to be noticeably different from well-known graph coloring instances [10], yielding graph coloring instances that are comparatively large and dense. Since the aforementioned methods do not perform well on them, new approaches for graph coloring were developed, one of which is this article's subject.
In this article, we propose a hybrid approach between exact and heuristic techniques, following the general framework of the SAT-based Local Improvement Method (SLIM) that has recently been successfully customized for various problems [13,15,21,26,27,30,31,33]. Our idea is to enhance tabu search by applying SAT encodings locally. Our hybrid algorithm GC-SLIM incrementally improves a candidate coloring by repeatedly selecting small subgraphs (local instances) and coloring them optimally with a SAT solver. The problem solved by the SAT solver is a list coloring problem, where each vertex has a list of available colors. The lists ensure that the subgraph's coloring is consistent with the colors of the vertices outside the selected subgraph. GC-SLIM's most essential ingredients include strategies for local instance selection, the SAT-based solution of the local instance, and a technique called chain propagation.
GC-SLIM scales to dense graphs with several hundred thousand vertices and over 1.5 billion edges. Our experimental evaluation shows that our hybrid algorithm beats state-of-the-art methods on large, dense graphs.

Related Work
Since the work on graph coloring is extensive, we discuss only the most relevant work for this article. We refer to Sun's dissertation [35] for a more exhaustive survey of graph coloring algorithms.
Greedy colorings are the most common and simplest heuristics for graph coloring. Given an ordering of the vertices, each vertex gets assigned the smallest color that avoids monochromatic edges in the given order. Different heuristics use different orderings. DSatur [5] is one of the most successful greedy heuristics, and we use it in our approach. DSatur always chooses as the next vertex one that is most constrained, i.e., one with the fewest colors available.
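The heuristic can be sketched in a few lines of Python. This is only an illustrative sketch, not the article's implementation; the dictionary-based adjacency representation and the tie-breaking by degree are assumptions of the sketch:

```python
def dsatur(adj):
    """DSatur greedy coloring (sketch).

    adj maps each vertex to its set of neighbors.
    Returns a dict vertex -> color, with colors starting at 1.
    """
    color = {}
    # sat[v]: set of colors already used by v's colored neighbors
    sat = {v: set() for v in adj}
    uncolored = set(adj)
    while uncolored:
        # most constrained vertex: largest saturation, breaking ties by degree
        v = max(uncolored, key=lambda u: (len(sat[u]), len(adj[u])))
        c = 1
        while c in sat[v]:  # smallest color avoiding monochromatic edges
            c += 1
        color[v] = c
        uncolored.remove(v)
        for w in adj[v]:
            if w in uncolored:
                sat[w].add(c)
    return color
```

On a triangle with a pendant vertex, the sketch produces a proper 3-coloring, matching the chromatic number of that graph.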
Tabu search has been successfully used for graph coloring. Most relevant to this article is Partialcol [4], which we discuss in more detail in Section 2.3.

Iterated-DSatur (I-DSatur) [19] is a SAT-based extension of DSatur that combines DSatur with extensive pre-processing and SAT solving into a new method that can compute optimal colorings for small graphs. Used as a heuristic, it scales to sparse graphs with several million vertices. I-DSatur adds a reordering mechanism to DSatur, invoked whenever the current uncolored vertex v cannot be colored with any of the existing colors, i.e., the current partial k-coloring would become a partial (k + 1)-coloring. At this point, I-DSatur tries to find a better coloring for all the vertices colored so far and v. If successful, then no new color is required; if unsuccessful, then the best lower bound on the number of required colors known to I-DSatur can be increased. The main difference between GC-SLIM and I-DSatur is that GC-SLIM tries to reduce the number of colors by improving several smaller local instances, while I-DSatur tries to find improvements for a single local instance that is as large as possible. The former scales better on dense graphs, while the latter performs better on sparse graphs, as we will further discuss in our experimental evaluation.
Large graphs yield a prohibitively large encoding size when the standard SAT encodings for graph coloring are used. Recently, a new approach based on clause learning has been proposed [16,18], which can circumvent the size issue for many instances. Here, only those clauses required for a correct solution are added iteratively. This approach is also used in I-DSatur [19].
Much research in recent years has focused on very large and sparse graphs. The advantage of sparse graphs is that they can often be colored with a small number of colors relative to their size and are easily reducible to smaller graphs. State-of-the-art approaches use these and other structural properties of sparse graphs to scale to graphs with millions of vertices [25,32,37]. We compare GC-SLIM to the most recent such algorithms, FastColor [25] and I-DSatur [19].
The top three submissions to the CG:SHOP Challenge 2022 used different variations of the same idea. They perform local search guided by a conflict score, i.e., how often a vertex has been recolored [8,9,14,34]. This strategy performed better on the competition instances than other established local search strategies. We will further discuss this strategy in Section 2.3.

PRELIMINARIES

2.1 Graphs and Colorings
We consider a connected simple graph G with the set of vertices V(G) and set of edges E(G). We will often assume without loss of generality that V(G) = {1, . . . , |V(G)|}. We denote the edge between vertices v, w ∈ V(G) by vw or equivalently wv. For X ⊆ V(G), we denote by N_G(X) the set of vertices outside X that are adjacent to a vertex in X. For an integer k ≥ 1, we denote the set {1, . . . , k} by [k]. A partial k-coloring c of G is a mapping c : V(c) → [k], defined on a subset V(c) ⊆ V(G), such that c(v) ≠ c(w) for every edge vw with v, w ∈ V(c); if V(c) = V(G), then c is a k-coloring. For a (partial) k-coloring c of G, we call the integers in [k] colors and the sets c_ℓ(G) = {v ∈ V(c) : c(v) = ℓ}, for ℓ ∈ [k], the color classes of c. Observe that each color class is an independent set of G and that color classes are pairwise disjoint. We also write c_0(G) = V(G) \ V(c) for the set of uncolored vertices and write c(v) = 0 for a vertex v ∈ V(G) \ V(c). We write c_ℓ, instead of c_ℓ(G), if G is clear from the context. Since its color classes uniquely determine a partial k-coloring, we will often specify a k-coloring this way. Further, we write N_{G,c,ℓ}(X) = N_G(X) ∩ c_ℓ(G), the ℓ-colored neighborhood. Whenever G and c are clear from context, we drop the subscript and use N_ℓ(X); for a single vertex v, we write N_ℓ(v) instead of N_ℓ({v}). The prevalence of a color ℓ is |c_ℓ|, and the prevalence of a color ℓ with respect to a vertex v is |N_ℓ(v)|.

The graph coloring problem takes an undirected graph G as input; the task is to produce a coloring of G that uses the least possible number of colors. The decision version of the problem takes as input G and an integer k; the task is to decide whether G admits a k-coloring.
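These definitions translate directly into code. The following Python sketch (the representation of a partial coloring as a dict with 0 for uncolored is our own convention for illustration) computes color classes, prevalences, and ℓ-colored neighborhoods:

```python
def color_classes(c, k):
    """Color classes c_l for l in [k]; c maps vertex -> color, 0 = uncolored."""
    return {l: {v for v, cv in c.items() if cv == l} for l in range(1, k + 1)}

def prevalence(c, l):
    """Prevalence |c_l| of color l."""
    return sum(1 for cv in c.values() if cv == l)

def colored_neighborhood(adj, c, X, l):
    """N_{G,c,l}(X) = N_G(X) ∩ c_l: neighbors of X outside X carrying color l."""
    nbrs = set().union(*(adj[v] for v in X)) - set(X)
    return {w for w in nbrs if c[w] == l}
```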

Minimum Partition into Plane Subgraphs Problem (MPPS)
The Minimum Partition into Plane Subgraphs Problem (MPPS) takes as an instance a geometric graph G, with vertices V(G) represented by points in the plane and edges E(G) by straight-line connections between vertices. The task is to find a partitioning of E(G) into as few classes E_1, . . . , E_k as possible, such that each subgraph G_i, with V(G_i) = V(G) and E(G_i) = E_i, is plane, i.e., no two of its edges cross.

In this article, we consider the MPPS problem in terms of graph coloring. There is a natural reduction from the MPPS problem to graph coloring, which reduces an MPPS instance G to the conflict graph G′, containing a vertex for each line segment and where two vertices are adjacent if the corresponding line segments intersect. Evidently, G admits a partitioning into k plane subgraphs if and only if G′ has a k-coloring.
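The reduction can be sketched as follows. This is a simplified illustration: the quadratic pairwise loop and the restriction to proper crossings (ignoring shared endpoints and collinear overlaps) are assumptions of the sketch, not of the competition's intersection semantics:

```python
def orient(a, b, c):
    """Orientation of the point triple (a, b, c): sign of the cross product."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(s, t):
    """Proper crossing test via orientations; degenerate cases are ignored."""
    p, q = s
    r, u = t
    d1, d2 = orient(p, q, r), orient(p, q, u)
    d3, d4 = orient(r, u, p), orient(r, u, q)
    return d1 * d2 < 0 and d3 * d4 < 0

def conflict_graph(segments):
    """Conflict graph G' of a list of segments ((x1,y1),(x2,y2)): one vertex
    per segment, edges between crossing segments. O(n^2) pairwise check."""
    adj = {i: set() for i in range(len(segments))}
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if segments_cross(segments[i], segments[j]):
                adj[i].add(j)
                adj[j].add(i)
    return adj
```

A k-coloring of the returned graph then corresponds to a partition of the segments into k crossing-free classes.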

Tabu Search
Tabu search is a very successful local search approach to graph coloring. We use Partialcol's search strategy [4], which maintains a partial k-coloring c and tries to empty the set c_0 of uncolored vertices. A p-swap takes an uncolored vertex v* ∈ c_0 and a color ℓ ∈ [k] with |N_ℓ(v*)| = p, assigns color ℓ to v*, and uncolors the p vertices in N_ℓ(v*). In each iteration, the algorithm performs a p-swap with smallest p. The choice of color ℓ for the p-swap is restricted by a tabu list for vertex v*: a list of the colors assigned to v* in the last few iterations. This mechanism ensures that vertices do not get re-assigned the same colors within a certain number of iterations and forces the algorithm to explore more of the search space. Figure 1 shows how a series of swaps can empty c_0. Partialcol terminates if c_0 = ∅, in which case c is now a full k-coloring, or when it reaches a prescribed number of iterations. Usually, tabu search is run repeatedly, choosing different colors to eliminate.
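A single Partialcol-style iteration can be sketched as follows. This is a minimal sketch under our own assumptions (a fixed tabu tenure of 10 iterations, no aspiration criterion), not the authors' implementation:

```python
from collections import defaultdict

def partialcol_step(adj, c, k, uncolored, tabu, it, tenure=10):
    """One tabu-search iteration (sketch): p-swap with smallest p on an
    uncolored vertex. c maps vertex -> color (0 = uncolored); tabu maps
    vertex -> {color: iteration until which that color is tabu}.
    Returns the recolored vertex, or None if every color is tabu."""
    v = next(iter(uncolored))
    best_l, best_p = None, None
    for l in range(1, k + 1):
        if tabu[v].get(l, -1) >= it:
            continue  # color l is tabu for v in this iteration
        p = sum(1 for w in adj[v] if c[w] == l)  # p = |N_l(v)|
        if best_p is None or p < best_p:
            best_l, best_p = l, p
    if best_l is None:
        return None  # real implementations would use an aspiration criterion
    for w in adj[v]:
        if c[w] == best_l:  # uncolor the p conflicting neighbors
            c[w] = 0
            uncolored.add(w)
    c[v] = best_l
    uncolored.discard(v)
    tabu[v][best_l] = it + tenure  # forbid re-assigning this color for a while
    return v
```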

Conflict Scores. The winning submissions [8,9,14,34] to the CG:SHOP Challenge are based on heuristic algorithms that utilize a different selection criterion based on the conflict count. For a vertex v, the conflict count q(v) measures how often v has been removed from c_0 by a swap, i.e., how often v has been colored. Initially, we set q(v) = 0 for all vertices v. The conflict count is then used to calculate a conflict score that is used for picking the next swap. The different submissions calculate the conflict score differently. We follow the approach by Spalding-Jamieson et al. [34] due to its simplicity: For the next swap, the solver picks a random vertex v ∈ c_0 and swaps it to the color ℓ that minimizes ∑_{u ∈ N_ℓ(v)} (1 + q(u)²).
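A sketch of this conflict-score swap, under the representation used in the earlier sketches (colorings as dicts, 0 for uncolored):

```python
import random

def conflict_swap(adj, c, uncolored, q, k):
    """Conflict-score swap: pick a random uncolored vertex v and the color l
    minimizing the sum over u in N_l(v) of (1 + q(u)^2)."""
    v = random.choice(sorted(uncolored))

    def score(l):
        return sum(1 + q[u] ** 2 for u in adj[v] if c[u] == l)

    l = min(range(1, k + 1), key=score)
    for u in adj[v]:
        if c[u] == l:  # uncolor the conflicting neighbors
            c[u] = 0
            uncolored.add(u)
    c[v] = l
    uncolored.discard(v)
    q[v] += 1  # v has been removed from c_0 once more
    return v, l
```

The squared term makes the heuristic strongly avoid uncoloring vertices that have already been recolored often.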

SAT-BASED LOCAL IMPROVEMENT FOR GRAPH COLORING
The propositional satisfiability problem (SAT) asks whether a given propositional formula is satisfiable. As the first problem to be shown to be NP-complete [7,24], it forms a cornerstone in computational complexity. In contrast to its theoretical hardness, SAT provides an important framework for solving hard combinatorial problems in practice by encoding instances in propositional logic and solving them with a SAT solver [12]. Today's SAT solvers are extremely efficient, robust, and can routinely solve instances that encode real-world problems with hundreds of thousands of variables. The progress achieved by algorithm engineering for SAT is "nothing short of spectacular" [36]. SAT-based methods automatically benefit from further improvements to SAT solvers, making them even more attractive.
SAT-based Local Improvement (SLIM) is an anytime meta-heuristic that embeds SAT encodings into heuristic algorithms. It improves a given (sub-optimal) global solution through a series of local improvements accomplished by a SAT solver. SLIM has been successfully utilized in several applications [13,26,27,30,33] and allows us to apply the solving power of SAT to instances that are too large to be encoded as a whole to SAT. Instead, we repeatedly choose smaller local instances that can be quickly encoded and solved. SLIM is a special case of Large Neighborhood Search [29], distinguishing itself by combining a structurally constrained notion of a neighborhood with a complete method (SAT).
Our new SLIM approach to graph coloring, GC-SLIM, tries to eliminate one color at a time in a fashion similar to Partialcol. Starting from a heuristically computed (k + 1)-coloring, GC-SLIM selects a color e ∈ [k + 1], removes e from c, and tries to iteratively recolor subgraphs using a SAT solver until all vertices are colored and c gives rise to a k-coloring.
We first discuss the core of every SLIM algorithm: a method to extract local instances such that their improvement eventually translates to an overall improvement. First, we discuss how we define local instances, i.e., we show how we can color subgraphs of G with a SAT solver while maintaining consistency with the coloring of the remaining graph. Then, we discuss how we find good local instances. We also discuss further additions to GC-SLIM that enhance its performance.

Local Instances and SAT
Let G be the input graph and c a partial k-coloring of G. Since G is too large to be encoded as a whole to SAT, we select a subset X ⊆ V(G), based on a process described in the next subsection, limiting the size of X in terms of a budget parameter b. The goal is now to find a partial k-coloring of the induced subgraph G′ = G[X] that colors more vertices of X than c does. However, a newly found k-coloring of G′ will, in general, not be compatible with the coloring c of the vertices outside X. We consider the vertices adjacent to X as extra constraints by defining the local instance in terms of the list coloring problem: Let L be a mapping that assigns each vertex v ∈ X a list L(v) ⊆ [k] of admissible colors, namely those colors not assigned by c to a neighbor of v outside X. We call (G′, L) the local instance for X; a partial list coloring c′ of (G′, L) is a partial k-coloring of G′ with c′(v) ∈ L(v) for every v ∈ V(c′). Here in particular, we consider the partial k-coloring c′′ obtained by composing c and c′: c′′(v) = c′(v) for v ∈ X and c′′(v) = c(v) for v ∈ V(G) \ X. The following lemma provides an important link between colorings and list colorings:

Lemma 3.1. Given a graph G and X ⊆ V(G), let c be a partial k-coloring of G, (G′, L) be the local instance for X, and c′ be a partial list coloring of (G′, L). Then the composition of c and c′ is a partial k-coloring of G.
We note in passing that the list coloring problem is a proper generalization of the graph coloring problem. For instance, graph coloring is fixed-parameter tractable in the graph's treewidth, while list coloring is W[1]-hard when parameterized by treewidth [11].
Our general aim is to increase the number of colored vertices. Ideally, we would find a full k-coloring for (G′, L). While this is often not possible, it turns out that it is still useful to obtain a partial list coloring c′ of (G′, L), which colors all previously uncolored vertices and minimizes the number of newly introduced uncolored vertices.
We achieve this by a slight tweak of the local instance. For all v ∈ X \ c_0, we add 0 to L(v) and thereby allow them to become uncolored. The problem is now a minimization problem: find a partial list coloring c′ for (G′, L) that minimizes |c′_0|. We encode the existence of a partial list coloring of G′ that minimizes the number of uncolored vertices. To this end, for r ≤ |X|, we define a propositional formula F(G′, L, r) that is satisfiable if and only if (G′, L) has a partial list coloring c′ where |c′_0| ≤ r. We can minimize the number of uncolored vertices by solving F(G′, L, r) for different values of r.
The encoding requires one set of variables and two sets of clauses. For each v ∈ X and ℓ ∈ L(v), we introduce a propositional variable c_{v,ℓ} that is true if v is assigned color ℓ. Hence, the first set of clauses encodes that each vertex v ∈ X is assigned at least one color from its list: ⋁_{ℓ ∈ L(v)} c_{v,ℓ}. The second set of clauses encodes that adjacent vertices in G′ must not have the same color: ¬c_{v,ℓ} ∨ ¬c_{w,ℓ} for each edge vw ∈ E(G′) and each color ℓ ∈ (L(v) ∩ L(w)) \ {0}. Note that c_{v,0} and c_{w,0} can both be true even if vw ∈ E(G′). Finally, we use a totalizer encoding [2] to express the cardinality constraint that at most r of the variables c_{v,0} are true.
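The clause generation can be sketched as follows. For brevity, this sketch replaces the totalizer cardinality encoding the article uses with a naive pairwise at-most-r constraint, which is only practical for very small r:

```python
from itertools import combinations, count

def encode_list_coloring(edges, L, r):
    """CNF sketch of F(G', L, r). L maps each vertex to its color list, where
    color 0 means "may stay uncolored". Returns (clauses, variable map),
    with clauses as lists of signed integers in DIMACS style."""
    vid = {}
    ctr = count(1)

    def var(v, l):
        if (v, l) not in vid:
            vid[(v, l)] = next(ctr)
        return vid[(v, l)]

    clauses = []
    for v, colors in L.items():
        clauses.append([var(v, l) for l in colors])   # at least one color (or 0)
    for v, w in edges:
        for l in (set(L[v]) & set(L[w])) - {0}:       # color 0 never conflicts
            clauses.append([-var(v, l), -var(w, l)])
    zero_vars = [var(v, 0) for v in L if 0 in L[v]]
    for subset in combinations(zero_vars, r + 1):     # naive at-most-r uncolored
        clauses.append([-x for x in subset])
    return clauses, vid
```

Handing the clauses to an incremental solver and decreasing r while the formula stays satisfiable then minimizes |c′_0|.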

Local Instance Selection
In this section, we describe how GC-SLIM constructs local instances for the SAT encoding described in the previous section. Let G be the input graph and c a partial k-coloring of G. Our goal is to select a suitable subset X ⊆ V(G) that defines our local instance. The overall approach is to start at a single uncolored vertex and perform a breadth-first search among the least prevalent colors in the neighborhoods, where the size of X is limited by a budget b and the breadth by a branching factor f. We first select an uncolored vertex v* ∈ c_0. We initially put X_0 = ∅, X_1 = {v*} and continue computing a chain of sets X_0 ⊊ X_1 ⊊ · · · ⊊ X_s as long as |X_s| ≤ b. If no further addition is possible, then we stop, as we have found the set X = X_s.
Assume we have constructed X_i, i ≥ 1. We now construct X_{i+1} by starting from X_{i+1} := X_i and incrementally extending X_{i+1}. Let S = X_i \ X_{i-1}. For each w ∈ S, we find the smallest non-empty set N_ℓ(w) not yet contained in X_{i+1}; if the budget permits, we add N_ℓ(w) to X_{i+1}, and in any case, we proceed to the next vertex in S. We repeat this step at most f times, i.e., for each vertex in S, we add at most f colors from the neighborhood of the vertex to finish constructing X_{i+1}.
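The selection process can be sketched as a budget-bounded breadth-first search. This sketch simplifies the layered X_0 ⊊ X_1 ⊊ · · · construction to an explicit frontier and is an illustration, not the article's implementation:

```python
def select_local_instance(adj, c, v_star, b, f):
    """Budget-bounded BFS from the uncolored vertex v*: for each frontier
    vertex, add up to f of its smallest non-empty colored neighborhoods,
    as long as the budget b permits."""
    X = {v_star}
    frontier = [v_star]
    while frontier and len(X) <= b:
        nxt = []
        for w in frontier:
            # group w's neighbors outside X by their color, skipping uncolored
            by_color = {}
            for u in adj[w]:
                if u not in X and c[u] != 0:
                    by_color.setdefault(c[u], set()).add(u)
            # take the f smallest (least prevalent) color neighborhoods
            for l in sorted(by_color, key=lambda l: len(by_color[l]))[:f]:
                if len(X) + len(by_color[l]) > b:
                    continue  # budget exceeded, skip this color
                X |= by_color[l]
                nxt.extend(by_color[l])
        frontier = nxt
    return X
```

Since only colored vertices are ever added, v* remains the only uncolored vertex in the returned set.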
We observe that X \ V(c) = {v*}, i.e., v* is the only uncolored vertex in X. Figure 2 illustrates local instance selection on a simple graph.
The goal of the budget b is to keep the size of the local instance small enough such that the SAT solver can solve it within an expected timeout. In practice, the best budget varies greatly with the instance, so we automatically adjust it. Whenever a specified number of consecutive SAT solver calls time out, the budget is decreased, and conversely, whenever the same number of consecutive SAT solver calls return a result, the budget is increased.
We described the process such that we always expand X_{i+1} using the color ℓ such that N_ℓ(w) is minimal. Alternatively, we can also use the conflict score discussed in Section 2.3. We discuss both options in our experimental section.

Chain Propagation
In this section, we describe chain propagation, a powerful technique that lets us determine whether we can quickly color a given uncolored vertex v* by using a chain, or sequence, of swaps, propagating the impact of the swaps in the chain until hopefully finding a 0-swap. This concept is inspired by s-chain tabu search [28], where chains up to a length s are explored, and by the consideration of a single flat chain in I-DSatur [19], where a chain of 1-swaps is applied within a single iteration whenever available. Another way to view chain propagation is as a lookahead for the actions that Partialcol would perform. We start with the set of uncolored vertices U = {v*} and try to empty this set by applying the following rules until either U = ∅ or no rule is applicable. Whenever we find a chain of swaps that empties U, we have found a chain that successfully colors v*. Figure 3 illustrates these rules using our running example.
Rule 1 (0-swap). Take a vertex w ∈ U and a color ℓ ∈ [k] such that N_ℓ(w) = ∅. Swap the color of w to ℓ and remove w from U.
The immediate goal of local search is finding a 0-swap, as a 0-swap decreases the number of uncolored vertices. The problem with 0-swaps is that they only consider the immediate neighborhood of the vertex.
Therefore, local search may miss possible 0-swaps if they are not included in our local instance or hidden behind larger swaps. We remedy this issue by extending chain propagation beyond 0-swaps and exploring all chains of limited complexity with the goal of coloring v*.
A slightly more elaborate case prevails when we apply the following rule multiple times, keeping the number of uncolored vertices constant, completed by a final application of Rule 1:

Rule 2 (1-swap). Take a vertex w ∈ U such that for a color ℓ ∈ [k] and a vertex u, we have N_ℓ(w) = {u}. Swap the color of w to ℓ, make u uncolored, and replace w with u in set U.
We call such a sequence of rule applications a 1-swap chain, sometimes called a flat chain [19]. Even more powerful but also more costly is a p-swap chain, where p > 1 is a fixed constant. It uses the following generalization of Rule 2:

Rule 3 (p-swap). Take a vertex w ∈ U such that for a color ℓ ∈ [k], we have |N_ℓ(w)| ≤ p. Swap the color of w to ℓ, make all the vertices in N_ℓ(w) uncolored, and replace w with N_ℓ(w) in U.
Chain propagation explores the possible chains exhaustively. Bookkeeping is necessary to avoid re-applying the same series of swaps, as this leads to cycles, and consequently, chain propagation may not terminate. Further, we apply the rules in order, as it is faster to explore chains without p-swaps.
Two hyperparameters regulate the complexity of the chains. Since Rule 3 increases the number of uncolored vertices, it is the main factor for the complexity of the chains explored and, therefore, the main factor in the runtime of chain propagation. We limit the applications of the rule in two ways: (i) We limit p and thereby how much the number of uncolored vertices can increase within one rule application, and (ii) we limit how often Rule 3 can be applied within one chain for the same reason. Together, the two hyperparameters regulate how much the number of uncolored vertices can increase within a single chain.
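The three rules can be sketched as follows. Note that this sketch is greedy, following only a single chain, whereas the article's chain propagation explores chains exhaustively with bookkeeping; the default limits p_max and p_budget are stand-ins for the two hyperparameters:

```python
def chain_propagate(adj, c, k, v_star, p_max=2, p_budget=3):
    """Greedy sketch of chain propagation: try to color v* by a chain of
    swaps, preferring 0-swaps, then p-swaps with p <= p_max, applying at
    most p_budget p-swaps per chain. Mutates c; returns True on success."""
    U = {v_star}
    swaps_left = p_budget
    while U:
        progress = False
        for p in range(p_max + 1):  # Rule 1 first, then increasing p
            if p > 0 and swaps_left == 0:
                break
            for w in list(U):
                for l in range(1, k + 1):
                    conflict = {u for u in adj[w] if c[u] == l}
                    if len(conflict) != p:
                        continue
                    for u in conflict:  # uncolor the p conflicting neighbors
                        c[u] = 0
                    c[w] = l
                    U.discard(w)
                    U |= conflict
                    if p > 0:
                        swaps_left -= 1
                    progress = True
                    break
                if progress:
                    break
            if progress:
                break
        if not progress:
            return False  # chain stuck; caller falls back to the SAT step
    return True
```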

Putting It All Together
Algorithm 1 combines the main ingredients of GC-SLIM that we have discussed. Initially, we compute a k-coloring c with a heuristic like DSatur and then repeatedly call GC-SLIM with different colors as the elimination goal. Each call either succeeds, reducing the number of colors, or fails, in which case we restore c to the state it had before a color was removed. For each call, we pick the least prevalent color we have not yet tried as the elimination goal, breaking ties arbitrarily.

[Algorithm 1: pseudocode listing not reproduced here. The surviving fragments indicate that each iteration updates the tabu list, calls the SAT solver via call_sat(I, m, sat_timeout) when chain propagation for v* is not successful, performs an m-swap of v* and checks whether the budget should decrease if no list coloring of I with at most m uncolored vertices is found, and restores c and returns Failed when the iteration limit is reached.]

In Algorithm 1, GC-SLIM starts with adjusting c according to the given elimination goal and then tries to complete c for a prescribed number of iterations. In each iteration, it picks a vertex v* and first tries to color it using chain propagation. If this fails, then the algorithm creates a local instance based on v* and tries to color it using the SAT encoding. The number of uncolored vertices in the solution to the local instance is limited to m, where m is the prevalence of the least prevalent color in the neighborhood of v*. This limit of m ensures that GC-SLIM will not perform worse than a p-swap. If the local instance is (partially) colored successfully, then GC-SLIM proceeds to the next vertex; otherwise, it defaults to a p-swap as Partialcol would perform.
The algorithm contains several hyperparameters, which we will discuss next.

Hyperparameters
The hyperparameters controlling Algorithm 1 are the iteration limit, the timeout for the SAT solver, and the choice of SAT solver.
The iteration limit controls how much time the algorithm spends on eliminating a single color. Lower iteration limits cause shorter runtimes. Therefore, one can try to eliminate more colors in the same amount of time at the price of possibly missing some improvements: Sometimes GC-SLIM will run many iterations with very few uncolored vertices until eventually finding the 0-swaps that complete the coloring. A low iteration limit will miss these improvements. Omitted in the listing is a mechanism that grants GC-SLIM an extra 10% of the iteration limit whenever the number of uncolored vertices decreases. Thus, GC-SLIM runs as long as it reduces the number of uncolored vertices, no matter the iteration limit.
The timeout for the SAT solver follows a similar tradeoff. Lower values lead to quicker search space exploration by trying many different local instances, while larger values may discover new improvements. While the iteration limit regulates how often GC-SLIM generates a local instance, the timeout for the SAT solver strongly influences the budget for the local instances.
The SAT solver can also impact the performance of GC-SLIM, both in terms of memory usage and speed, and different solvers may perform very differently for different instances.
The hyperparameters from this section, together with the branching factor, budget, and p-limit for chain propagation as discussed above, control GC-SLIM. As we will discuss next, some further options can severely impact GC-SLIM's performance. We will further explore this impact in our experiments.

Further Options
In this section, we discuss several minor options that can affect GC-SLIM's efficiency positively or negatively, depending on the instance. We will explore their effects further in the next section.
Prerun Tabu Search. Partialcol iterations are much faster than GC-SLIM iterations and can often reduce the number of colors more quickly, while GC-SLIM can find improvements that Partialcol misses. Running Partialcol for several iterations before starting GC-SLIM aims to combine the best of both worlds.
Flexible Vertices. We say that a vertex v ∈ V(G) is flexible with respect to a (partial) k-coloring c if v ∈ V(c) and there is a color ℓ ∈ [k] \ {c(v)} such that N_ℓ(v) = ∅. Thus, we can change the color of a flexible vertex and still have a (partial) k-coloring. We let F_c ⊆ V(G) be the set of all vertices that are flexible w.r.t. c. Flexible vertices provide an additional option when choosing a color: Instead of simply choosing the least prevalent color, we redefine the prevalence of a color ℓ as |c_ℓ \ F_c| and the prevalence w.r.t. the neighborhood of a vertex analogously. This can lead to a more accurate estimation, since flexible vertices allow immediate 0-swaps. This calculation is heuristic, as adjacent flexible vertices can block each other's color options, so we actually can only change the color of one of them.
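The definition translates into a short sketch, using the same dict-based representation as the earlier sketches:

```python
def flexible_vertices(adj, c, k):
    """F_c: colored vertices that have at least one alternative color l with
    N_l(v) empty, i.e., whose color can be changed without a new conflict."""
    flex = set()
    for v, cv in c.items():
        if cv == 0:
            continue  # uncolored vertices are not flexible by definition
        used = {c[u] for u in adj[v]}
        if any(l not in used for l in range(1, k + 1) if l != cv):
            flex.add(v)
    return flex

def prevalence_wo_flexible(c, flex, l):
    """Redefined prevalence |c_l \\ F_c| of color l."""
    return sum(1 for v, cv in c.items() if cv == l and v not in flex)
```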
Parallelization. GC-SLIM can run in parallel with minimal synchronization: Each thread runs GC-SLIM, and whenever one thread successfully eliminates a color, GC-SLIM is restarted in each thread with the improved coloring. This introduces the new hyperparameter thread count. Generally, more threads are better, as they enable faster search space exploration. Threads can either try to eliminate different colors, the same colors with different hyperparameter settings, or a mixture of both.

EXPERIMENTS
The aim of this article is not to determine the fastest graph coloring method but to investigate how SAT/CP methods can be utilized for the coloring of large, dense graphs.
We conduct three sets of experiments: the first evaluates the impact of the different hyperparameter settings and, by extension, the different features of GC-SLIM. The second experiment compares GC-SLIM to the state-of-the-art graph coloring methods FastColor and I-DSatur. In the last experiment, we look at GC-SLIM's performance on the whole set of CG:SHOP instances.
Setup. We ran our experiments on a cluster where each server had two Intel Xeon E5-2640 v4 CPUs with 10 cores running at 2.4 GHz for the first and second experiment, and two AMD EPYC 7402 CPUs, each with 24 cores running at 2.8 GHz, for the last experiment. The servers ran on Ubuntu 18.04 and used gcc 7.5.0. The runs were limited to 64 GB of memory. We implemented our approach in C++ and used Glucose 3 [1] and Cadical 1.5.0 [3] as SAT solvers.
Our implementation of DSatur [5] computes the initial colorings. We compare GC-SLIM against FastColor [25] and I-DSatur [19], representing state-of-the-art methods for coloring massive graphs. FastColor and I-DSatur runs were limited to 128 GB, as lower memory limits were insufficient for large and dense graphs.
We used an initial budget of 300 for the local instance. Whenever three consecutive SAT solver calls time out, we decrease the budget by 60 vertices; whenever three consecutive calls succeed, we increase the budget by 60. In practice, the budget varied between 60 for very dense graphs and over 2,000 for sparse graphs.
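This adaptive budget scheme can be sketched as a small controller. The start value, step, and streak length match the numbers above; the lower bound of 60 is an assumption of this sketch, inferred from the observed budget range:

```python
class BudgetController:
    """Adaptive local-instance budget: shrink after `streak` consecutive
    timeouts, grow after `streak` consecutive successes."""

    def __init__(self, start=300, step=60, streak=3, floor=60):
        self.budget, self.step, self.streak, self.floor = start, step, streak, floor
        self.timeouts = self.successes = 0

    def report(self, timed_out):
        """Record the outcome of one SAT solver call and adjust the budget."""
        if timed_out:
            self.timeouts += 1
            self.successes = 0
            if self.timeouts == self.streak:
                self.budget = max(self.floor, self.budget - self.step)
                self.timeouts = 0
        else:
            self.successes += 1
            self.timeouts = 0
            if self.successes == self.streak:
                self.budget += self.step
                self.successes = 0
```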
The CG:SHOP competition instances are very dense and have more structure than random graphs. The instances have up to 73,000 vertices and 1.5 billion edges. We picked 10 instances of the 225 used in the competition for our second experiment: Five instances are from the largest instances in the set with over 73,000 vertices; the other five were chosen such that density and size vary.
We used random graphs, as it was hard to find large benchmark instances that were not also very sparse. We generated 14 Erdős–Rényi random graphs, which vary in size between 10,000 and 100,000 vertices and in density between 0.05 and 0.5.
The instances from the Snap repository and 10th DIMACS instances contain large graphs with up to several million vertices and have been used in related work [19,25]. We preprocessed the instances by removing all vertices with degrees smaller than the lower bound on the chromatic number, determined in related work [19]. We picked the 11 instances with more than 10,000 and fewer than 280,000 vertices from these preprocessed instances. The 280,000 limit was due to memory constraints, since we focused on supporting dense graphs, and adjacency matrices become very memory-intensive for larger graphs. The mentioned sizes refer to preprocessed instances.
The DIMACS instances have been used in many papers for graph coloring and are included for reference, as GC-SLIM was not designed for such small instances. We used 10 instances that are considered hard [17].

Hyperparameter Impact
We explore how the different hyperparameter settings change the results in the first set of experiments. We use a base configuration and vary the setting of one parameter at a time. Each run for each instance was limited to seven hours.
As the base configuration, we use a SAT solver timeout of 10 seconds, 300 iterations, no chain propagation, a branching factor for local instances of 3, no multithreading, and no prerun tabu search. We count the instances for which a hyperparameter value finds the best coloring. We also count the instances where it does so uniquely, i.e., no other setting for this hyperparameter found an equally good or better coloring.
SAT Solver Timeout. We try different values for the SAT solver timeout: 5, 10 (default), 30, and 60 seconds. A timeout of 5 seconds finds the best result for 37 of the 44 instances, where it reaches the unique best result on 15 of the instances. The results quickly deteriorate with higher timeouts: While a timeout of 10 seconds finds the unique best result for 4 instances, the higher timeouts achieve the same for only one instance.
A closer look at the number of SAT calls and the size of the local instances explains the results. While local instances contain 1,511 vertices on average with a 5-second timeout, they only increase to 1,610 vertices on average with a 60-second timeout. This is in stark contrast to the number of SAT calls, which decreases from 13,910 to 6,764. Depending on the instance, 25% to 50% of the SAT calls eventually time out, leading to a significant decrease in the number of possible SAT calls with higher timeouts. Therefore, higher timeouts should be reserved for later stages when lower values fail to find improvements.
Chain Propagation. We use different limits for the maximum size of the swaps in the chains we propagate: 0 (default), 1, 2, 3, and 5. A limit of 2 achieves the overall best result, reaching the best result on 34 of the 44 instances and the unique best on 18 instances. Limiting the chains to 1-swaps or using no chain propagation at all performs very poorly. Higher limits can be beneficial for some instances; for example, a limit of 5 performs slightly better on the Snap instances. This indicates that higher limits might be beneficial for large and very sparse graphs.
Figure 4 shows how chain propagation impacts GC-SLIM on large, dense instances. The rate of improvements over time increases significantly and becomes comparable to that of Partialcol, sometimes surpassing it.
Impact of SAT Solver. We consider Glucose and Cadical as SAT solvers due to their good results and their support for incremental solving. Overall, Cadical achieves the unique best result on 19 of the 44 instances and Glucose on 10. This makes Cadical the better default choice, while Glucose may be better suited for individual instances. In our experiments, Glucose generally performs better on random instances and worse on the other instances.
Flexible Vertices. Using flexible vertices does not give a clear advantage in the number of best results: considering flexible vertices achieves the unique best result on 17 of the instances, and not considering them on 14. While this does not seem like a clear advantage, the reduction in colors is significant, up to 100 colors for the instances where it performs better; for the instances where it performs worse, the increase in the number of colors is never more than 6. Considering flexible vertices performs consistently better on the CG instances.
Prerun Tabu Search. The benefits of running tabu search prior to GC-SLIM are very instance-specific. It achieves consistently better results on the DIMACS and Snap instances and worse results on the random and CG instances. This indicates that this configuration is beneficial for small and sparse graphs. One possible explanation is that tabu search iterations become slower on dense graphs, reducing the efficiency gain over GC-SLIM.
Conflict Score. Another option is using a conflict score for determining swaps and selecting local instances, instead of simply picking the smallest colored neighborhood. This achieves the unique best result for 16 instances, while not using this option gives the unique best result on 15, making its benefits very instance-specific. Using a tabu search prerun together with the conflict score gives the unique best result on 19 instances, in contrast to 17 instances where the basic configuration finds the unique best result. It performs consistently badly on the random instances, mixed on the DIMACS instances, and consistently well on the CG and Snap instances.
Iteration Limit. We try iteration limits of 100, 300 (default), 500, 1,000, and 5,000. None of the limits performs significantly better than the others, with 500 and higher performing better and 5,000 performing best overall with 7 uniquely best results. While 1,000 iterations is a good default setting, different settings may perform better for different instances. Furthermore, higher iteration limits may be necessary when the number of colors is close to optimal and eliminating a color becomes harder.
Branching. We try branching factors of 2, 3 (default), 5, 10, and 15. The overall best is a branching factor of 2, which achieves the best result on 31 of the 44 instances and the unique best on 13. The results worsen with higher branching factors, except for the Snap instances, where a branching limit of 10 performs best. This matches the results for chain propagation, where higher limits also perform better on the Snap instances, suggesting that a search focused on breadth over depth may be a generally good strategy for sparse instances.

Initial Node Limit. When creating the local instance, we treat the initial vertex as a special case: instead of limiting the number of colors we choose, we limit the number of neighbors we add for the initial vertex. This scheme chooses more different colors if there are many low-prevalence colors and fewer otherwise. We try limits of 10, 25, 50 (default), 75, and 100.
Each limit leads to the best result on about 16 to 21 instances and a unique best result on 4 to 5 instances. Therefore, there is no discernibly good default, and choosing the right value always depends on the instance at hand. A pattern similar to chain propagation and the branching limit emerges here: the lower the density, the better a higher initial limit, i.e., more focus on breadth, works.
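One plausible reading of this selection procedure is a bounded breadth-first expansion: the initial vertex contributes up to a fixed number of neighbors, every later vertex contributes neighbors from at most `branching` distinct color classes, and the whole instance is capped by a vertex budget. The sketch below follows that reading; parameter names, tie-breaking, and all other details are our assumptions, not the exact GC-SLIM procedure.

```python
from collections import deque

def select_local_instance(adj, coloring, start, branching=3,
                          init_limit=50, budget=1500):
    """Hedged sketch of local-instance selection: grow a subgraph
    around `start` via BFS.  For the initial vertex we cap the
    *number of neighbors* added (init_limit); for every later vertex
    we cap the *number of distinct colors* among the neighbors we
    follow (the branching factor).  Stops at `budget` vertices."""
    selected = {start}
    queue = deque()
    # Initial vertex: add up to init_limit neighbors directly.
    for w in adj[start][:init_limit]:
        if len(selected) >= budget:
            break
        if w not in selected:
            selected.add(w)
            queue.append(w)
    while queue and len(selected) < budget:
        v = queue.popleft()
        chosen_colors = set()  # distinct colors followed from v
        for w in adj[v]:
            if len(selected) >= budget:
                break
            if w in selected:
                continue
            c = coloring.get(w)
            # Skip neighbors that would introduce a new color
            # beyond the branching factor.
            if c not in chosen_colors and len(chosen_colors) >= branching:
                continue
            chosen_colors.add(c)
            selected.add(w)
            queue.append(w)
    return selected
```

With branching factor 2 and a budget of 8 this corresponds to the setting illustrated in Figure 2.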

Comparison and Parallelism
We use the results from the first experiment to determine the best configuration for each of the four sets of instances; the configurations are shown in Table 1. In our comparison, we use different GC-SLIM configurations, FastColor, I-DSatur, and Partialcol. Each was run with a 24-hour time limit. We additionally run other methods and use their output as GC-SLIM's input, i.e., we prerun our own implementations of I-DSatur (IS) and/or Partialcol (PC). Partialcol runs either 7 hours alone or 4 hours together with I-DSatur. We also apply multithreading with 4 threads, varying the configuration parameters using the values shown in Table 1. The multithreaded runs last only 6 hours but receive twice the memory. The results are shown in Table 2.
The results show how well GC-SLIM performs on large and dense graphs. On the CG and random instances, it significantly outperforms FastColor and I-DSatur. Interestingly, for random instances, a Partialcol prerun performs consistently better than any other configuration. For the CG instances, a portfolio combining the varied configurations from Table 1 performs best, while the overall best single configuration performs comparatively poorly.
While GC-SLIM also outperforms both algorithms on the small DIMACS instances, it fails to come close to the best-known value on almost all of them. This shows that specialized algorithms work better for these small instances.
FastColor and I-DSatur shine on the Snap instances, where they benefit from the structure of sparse graphs, which GC-SLIM does not particularly exploit. Still, FastColor, which performs better than I-DSatur, finds the best solution for six of the instances, as does the varied configuration. Note that our goal was not to compare FastColor and I-DSatur with each other, but to compare these approaches to GC-SLIM on various graphs.
The GC-SLIM configurations perform very differently. Figure 5 shows the progression over time for selected instances.
Overall, the results show that a varied configuration is better than a single, tuned configuration. This finding is strengthened by the fact that multithreading incurs an overhead, as each thread has to restart once a color has been eliminated.

CG:SHOP Instances
The competition used 225 instances in total, separated into different classes, where the instances in each class are created by the same process but with different sizes and densities. The instances were created such that the graph coloring solvers available at the time failed to produce good results. This is supported by the fact that these solvers would have placed very low in the competition [10].
We started our submission by implementing Partialcol. This implementation was run for several days on each instance, and we only started implementing GC-SLIM when Partialcol failed to find improvements. During that time, we varied and randomized every parameter and decision in our implementation. This allows us to compare GC-SLIM to these long Partialcol runs in Figure 6. The figure shows that GC-SLIM is able to significantly improve the colorings, even after the long Partialcol runs. We also have results from a more controlled experiment, where GC-SLIM runs for 24 hours using the best configuration. Figure 7 shows that the final GC-SLIM implementation achieves almost the same results as the long development runs of Partialcol and GC-SLIM described above, but in a fraction of the time. This further shows how well GC-SLIM performs on these instances. Finally, Figure 8 compares the best GC-SLIM run with the best results from the competition.

CONCLUSION
With GC-SLIM, we have presented a new hybrid approach to graph coloring that enhances tabu search with SAT-based local improvement. Key elements of this combination are the selection method for local instances, the SAT-based solving of local instances, and chain propagation. Further improvements are due to hyperparameter tuning, prerunning tabu search, and different metrics for selecting vertices for color elimination. We also proposed and tested a parallel version of GC-SLIM. Our experimental evaluation shows that GC-SLIM complements existing methods and can find colorings with significantly fewer colors than other methods on large dense graphs.
For future work, we see two main paths. The first is improving the selection of local instances, as we expect that a better criterion exists than using the least prevalent color. The other is adapting GC-SLIM to more general graphs. We have seen that FastColor excels even on large random graphs and can handle even larger graphs. GC-SLIM could be adapted to handle large and sparse graphs as well; this would require implementing features from FastColor and I-DSatur that exploit the structural properties of sparse graphs, as well as preprocessing. Integrating these features may also lead to a better method for local instance selection.

Fig. 1 .
Fig. 1. Tabu search example that shows how vertex x is colored through a series of swaps. Note that for the last two graphs, two swaps are performed at once.
Starting from a (non-optimal) (k+1)-coloring c of the given graph G, Partialcol selects a color e ∈ [k+1] to eliminate. The vertices in c_e are then removed from c and considered uncolored, making c a partial k-coloring. Partialcol now tries to complete c and color the vertices in c_0 by performing swaps: for a partial k-coloring c, a vertex v* ∈ c_0, and a color ℓ ∈ [k], a (color) swap of v* to ℓ is obtained from c by setting c(v*) := ℓ and c(w) := 0 for all w ∈ N_ℓ(v*). The swap is a p-swap if |N_ℓ(v*)| = p. If u = |c_0| is the number of uncolored vertices before a p-swap, then |c_0| = u + p − 1 afterwards.
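As a minimal sketch of this swap operation (assuming an adjacency-list graph and a dict-based coloring where `None` stands for the paper's color 0, i.e., uncolored; names are illustrative, not from the GC-SLIM implementation):

```python
def swap(coloring, adj, v, color):
    """Assign `color` to uncolored vertex v and uncolor every
    neighbor of v that currently holds `color`.  This is a p-swap,
    where p is the number of such conflicting neighbors.
    Returns the list of vertices that became uncolored."""
    conflicting = [w for w in adj[v] if coloring.get(w) == color]
    coloring[v] = color
    for w in conflicting:
        coloring[w] = None  # color 0 / uncolored in the paper's notation
    return conflicting

# Tiny example: a triangle 1-2-3 with a pendant vertex 4 on vertex 3.
adj = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
coloring = {1: 1, 2: 2, 3: None, 4: 1}  # vertex 3 is uncolored
uncolored = swap(coloring, adj, 3, 1)   # 2-swap: neighbors 1 and 4 held color 1
```

In the example, one uncolored vertex is replaced by two (u + p − 1 = 1 + 2 − 1 = 2), matching the bookkeeping above.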

Fig. 2 .
Fig. 2. Example of local instance selection with branching factor 2 and a budget of 8. The selected component is indicated by discolored (gray) vertices.

Fig. 3 .
Fig. 3. Example of a chain propagation sequence coloring vertex x.

Fig. 4 .
Fig. 4. Comparison of how many colors are eliminated over time with different swap limits for chain tracing. Time is in minutes on the x-axis, and the number of colors is on the y-axis.

Fig. 5 .
Fig. 5. Comparison of how many colors are used by different configurations over time. The x-axis shows the time in minutes and the y-axis the number of colors. The vertical dotted line indicates when Partialcol preruns end; for multithreaded runs, the times are multiplied by the number of threads. PC refers to running Partialcol on the instance before GC-SLIM; IS refers to running I-DSatur to obtain the initial coloring.

Fig. 6 .
Fig. 6. The CG:SHOP 2022 instance results. For each instance, we show the initial solution using DSatur, the long-term improved coloring using Partialcol, and the eventually submitted solution obtained using GC-SLIM.

Fig. 7 .
Fig. 7. The results from a 24-hour run of GC-SLIM compared to Partialcol and GC-SLIM as used in the competition and in Figure 6.

Fig. 8 .
Fig. 8. Comparison between the best GC-SLIM and the best CG:SHOP 2022 colorings. Each mark is an instance, and its position indicates the number of colors used.

Table 1 .
Best Hyperparameter Settings for Instance Sets

Table 2 .
Comparison of 24-hour Runs. DS shows the DSatur run used as input for GC-SLIM. Best shows the best known result for the instance from the literature. For comparison, we list FastColor, Iterated DSatur (IS), and our Partialcol (PC) implementation. GC-SLIM configurations start with G; V denotes varying parameters, I an I-DSatur prerun, and P a Partialcol prerun.