Online Edge Coloring is (Nearly) as Easy as Offline

The classic theorem of Vizing (Diskret. Analiz.'64) asserts that any graph of maximum degree $\Delta$ can be edge colored (offline) using no more than $\Delta+1$ colors (with $\Delta$ being a trivial lower bound). In the online setting, Bar-Noy, Motwani and Naor (IPL'92) conjectured that a $(1+o(1))\Delta$-edge-coloring can be computed online in $n$-vertex graphs of maximum degree $\Delta=\omega(\log n)$. Numerous algorithms made progress on this question, using a higher number of colors or assuming restricted arrival models, such as random-order edge arrivals or vertex arrivals (e.g., AGKM FOCS'03, BMM SODA'10, CPW FOCS'19, BGW SODA'21, KLSST STOC'22). In this work, we resolve this longstanding conjecture in the affirmative in the most general setting of adversarial edge arrivals. We further generalize this result to obtain online counterparts of the list edge coloring result of Kahn (J. Comb. Theory A'96) and of the recent "local" edge coloring result of Christiansen (STOC'23).


Introduction
Online edge coloring is one of the first problems studied through the lens of competitive analysis [BNMN92]. In this problem, a graph is revealed piece by piece (either edge-by-edge or vertex-by-vertex). An algorithm must assign colors to edges upon their arrival irrevocably, so that no two adjacent edges are assigned the same color. The algorithm's objective is to use few colors in any graph of maximum degree ∆, close to the (offline) optimal ∆ or ∆ + 1 guaranteed by Vizing's Theorem [Viz64].
A trivial greedy online algorithm that assigns any available color to each edge upon arrival succeeds while using a palette of 2∆ − 1 colors. As shown over three decades ago, no algorithm does better on n-vertex graphs with sufficiently small maximum degree ∆ = O(log n) [BNMN92]. While our understanding of online edge coloring is thus complete on low-degree graphs, the dynamics are considerably more complex and interesting when ∆ = ω(log n). The authors of [BNMN92] even conjectured that, barring the low-degree case, edge coloring can be performed online while nearly matching the guarantees of offline methods.
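The greedy baseline is straightforward to implement; a minimal sketch (function and variable names are ours, not from the paper):

```python
def greedy_edge_coloring(delta, edges):
    """Greedy online edge coloring with palette {0, ..., 2*delta - 2}:
    each arriving edge takes the smallest color unused at both endpoints.
    Each endpoint blocks at most delta - 1 colors, so one of the
    2*delta - 1 palette colors is always available."""
    used = {}       # vertex -> set of colors already used at that vertex
    coloring = {}
    for (u, v) in edges:
        blocked = used.setdefault(u, set()) | used.setdefault(v, set())
        color = min(c for c in range(2 * delta - 1) if c not in blocked)
        coloring[(u, v)] = color
        used[u].add(color)
        used[v].add(color)
    return coloring
```

On a triangle (∆ = 2), greedy is forced to use all 2∆ − 1 = 3 colors, illustrating why the palette cannot be smaller for this algorithm.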

Conjecture 1.1 ([BNMN92]). There exists an online edge-coloring algorithm for n-vertex graphs that colors the edges of the graph online using (1 + o(1))∆ colors, assuming known maximum degree ∆ = ω(log n).¹

Progress towards resolving Conjecture 1.1 was obtained for restricted settings, including random-order edge arrivals [AMSZ03, BMM12, BGW21, KLS+22] and vertex arrivals [CPW19, SW21, BSVW24]. In the most general setting, i.e., under adversarial edge arrivals, [KLS+22] recently provided the first algorithm outperforming the trivial algorithm, showing that an (e/(e−1) + o(1))∆-edge-coloring is achievable. Most prior results for online edge coloring under adversarial arrivals [CPW19, SW21, KLS+22] were attained via the following tight connection between online edge coloring and online matching.² Given an α∆-edge-coloring algorithm, it is easy, by sampling a color, to obtain an online matching algorithm that matches each edge with probability at least 1/(α∆). In contrast, [CPW19] provided an (asymptotically) optimal reduction in the opposite direction, from online (α + o(1))∆-edge-coloring algorithms when ∆ = ω(log n) to online matching algorithms that match each edge with probability at least 1/(α∆). (See Lemma 2.1.) A positive resolution of Conjecture 1.1 therefore requires, and indeed is equivalent to, designing an online matching algorithm that matches each edge with probability at least 1/((1 + o(1))∆).
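The easy direction of this connection is immediate to implement: fix a uniformly random color in advance and output that color class, which is a matching containing each edge with probability exactly 1/(number of colors). A sketch (names ours):

```python
import random

def matching_from_coloring(num_colors, colored_edges, rng=random):
    """Given an edge coloring with palette {0, ..., num_colors - 1},
    output the color class of a uniformly random color: a matching
    containing each edge with probability exactly 1/num_colors.
    colored_edges is a list of (edge, color) pairs in arrival order."""
    target = rng.randrange(num_colors)  # sampled before any edge arrives
    return [e for (e, c) in colored_edges if c == target]
```

Since the target color is fixed up front, this works online: each arriving edge is accepted into the matching exactly when the coloring algorithm assigns it the target color.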
Our Main Result. In this work, we resolve Conjecture 1.1:

Theorem 1.2 (See exact bounds in Theorem 4.11). There exists an online algorithm that, on n-vertex graphs with known maximum degree ∆ = ω(log n), outputs a (1 + o(1))∆-edge-coloring with high probability.
Via the aforementioned reduction, we obtain the above from our following key technical contribution.
Theorem 1.3. There exists an online matching algorithm that, on graphs with known maximum degree ∆, outputs a random matching M satisfying Pr[e ∈ M] ⩾ 1/(∆ + q) for every edge e, where q = Θ(∆^{3/4} log^{1/2} ∆).

We note that the above matching probability of 1/(∆ + q) for q = Θ(∆^{3/4} log^{1/2} ∆) approaches a (lower-order) lower bound of q = Ω(√∆) implied by the competitiveness lower bound of 1 − Ω(1/√∆) for online matching in regular graphs due to [CW18].
Before explaining further implications of our results and techniques, we briefly discuss our approach for Theorem 1.3 and its main differences compared to prior work.

¹ Knowledge of ∆ is necessary: no algorithm can use fewer than (e/(e−1))∆ ≈ 1.582∆ colors otherwise [CPW19].
² Online matching algorithms must decide for each arriving edge whether to irrevocably add it to their output matching.
Techniques overview. For intuition, consider a simple "algorithm" that, by its very design, appears to match every edge with probability 1/(∆ + q): when an edge e_t = (u, v) arrives and connects two unmatched vertices, match it with probability

P(e_t) = (1/(∆ + q)) · (1 / Pr[u, v both unmatched until time t]).
However, the caveat, and the reason for the quotation marks around "algorithm", is that this process is viable only if P(e_t) constitutes a probability. This raises the question: how large must q be to ensure that P(e_t) ⩽ 1 for all edges e_t? Suppose we naively assume that the events "u is unmatched until time t" and "v is unmatched until time t" are independent. In that case, straightforward calculations show that q = O(√∆) would suffice for well-defined probabilities. However, the assumption of independence rarely holds outside of simplistic graphs like trees, and so the aforementioned events may exhibit complex and problematic correlations. Such correlations present the central challenge in establishing tight bounds for general edge arrivals.
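Under the independence assumption, the straightforward calculation is: each endpoint of e_t is free with probability at least 1 − ∆/(∆ + q) = q/(∆ + q), so the scaled value P(e_t) is at most (∆ + q)/q², which is at most 1 as soon as q² ⩾ ∆ + q, i.e., for some q = O(√∆). A quick numeric sanity check of this back-of-the-envelope bound (parameters are purely illustrative):

```python
import math

def max_scaled_prob(delta, q):
    """Worst-case P(e_t) if endpoint-free events were independent:
    each endpoint is free w.p. >= q/(delta+q), so
    P(e_t) <= 1/((delta+q) * free**2) = (delta+q)/q**2."""
    free = q / (delta + q)          # lower bound on Pr[endpoint free]
    return 1 / ((delta + q) * free * free)

delta = 10**6
q = math.isqrt(delta) + 2           # q of order sqrt(delta) already suffices
assert max_scaled_prob(delta, q) <= 1
```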
Previous studies addressed the above challenge by circumscribing and managing these correlations. For instance, in more constrained arrival models, [CW18, CPW19] used a variant of this approach; they rely heavily on one-sided vertex arrivals in bipartite graphs to choose an edge to match in a correlated way upon each vertex arrival, while creating useful negative correlation allowing for Chernoff bounds beneficial for future matching choices. In contrast, the only known method for general edge arrivals was given by [KLS+22]: they subsample locally tree-like graphs and employ sophisticated correlation decay techniques to approximate the independent scenario, albeit at the expense of only being able to match each edge with probability at least 1/(α∆) for α := e/(e−1) + o(1). Unfortunately, the ratio of e/(e−1) appears to be an intrinsic barrier for this approach.
Our approach deviates from the one guiding [CW18, CPW19, KLS+22], by allowing for correlations instead of controlling and taming them. Crucially, we present a different but still simple algorithm, with a subtle difference, which we describe informally here (see detailed exposition in Section 3): instead of obtaining the probability P(e_t) by scaling 1/(∆ + q) by 1/Pr[u, v both unmatched until time t], our scaling factor depends upon the algorithm's actual execution path (sequence of random decisions) so far. The modified algorithm allows us to analyze the scaling factor for an edge as a martingale process. While there may still be correlations, we show that this martingale has (i) small step size and (ii) bounded observed variance. These properties allow for strong Chernoff-type concentration bounds, specifically through Freedman's inequality (Lemma 2.3), which is pivotal to our analysis. This change of viewpoint is crucial for achieving our result and leads to a simple and concise algorithm and analysis. The results and techniques also extend to more general settings, as we explain next.
Secondary Results and Extensions. In Appendix A we combine Theorem 1.3 with a new extension of the above-mentioned reduction, from which we obtain online counterparts to two (offline) generalizations of Vizing's theorem, concerning both "local" and list edge coloring. For some background, a list edge coloring of a graph is a proper coloring of the edges, assigning each edge a color from an edge-specific palette. The list chromatic number of a graph, also introduced by Vizing [Viz76], is the least number of colors needed for each edge to guarantee that a proper list edge coloring exists. A seminal result of Kahn [Kah96] shows that the list chromatic number is asymptotically equal to ∆. Another, "local", generalization of Vizing's Theorem was recently obtained by Christiansen [Chr23], who showed that any graph's edges can be properly colored (offline) with each edge (u, v) assigned a color in the set {1, 2, ..., 1 + max(deg(u), deg(v))}. In this work, using Theorem 1.3 and extensions of the aforementioned reduction, we show that results of the same flavor as [Kah96] and [Chr23] can be obtained by online algorithms.
Theorem 1.4 (See exact bounds in Theorem A.1). There exists an online algorithm that computes an edge coloring which, with high probability, assigns each edge e a color from its list L(e) (revealed online, together with edge e), provided each list has sufficiently large size (1 + o(1))∆ and that ∆ = ω(log n).
Finally, in Appendix B, we show that our algorithmic and analytic approach underlying Theorem 1.3 allows us, more generally, to round fractional matchings online. Here, an α-approximate online rounding algorithm for fractional matchings is revealed (online) an assignment of non-negative values to the edges, x : E → R_{⩾0}, with value x_e revealed upon arrival of edge e, such that the total value assigned to the edges incident to each vertex is at most one: ∑_{e∋v} x_e ⩽ 1. The algorithm's objective is to output online a randomized matching M that matches each edge e with probability at least x_e/α. For one-sided vertex arrivals in bipartite graphs, it is known that the optimal α lies in the range (1.207, 1.534) [NSW23], while if the fractional matching is "sufficiently spread out", i.e., max_e x_e ⩽ o(1), then α = 1 + o(1) is possible [Waj20, Chapter 5]. We generalize the latter result to the more challenging setting of edge arrivals in general graphs.
Theorem 1.6 (See exact bounds in Theorem B.1). There exists an online (1 + o(1))-approximate rounding algorithm for fractional matchings ⃗x under adversarial edge arrivals, subject to the promise that max_e x_e ⩽ o(1).
We note that Theorem 1.3 is the special case of Theorem 1.6 applied to the fractional matching assigning value 1/∆ = o(1) to each edge. While we focus on this special case in the paper body for ease of exposition, we believe that our more general rounding algorithm is of independent interest and has broader applicability. Illustrating this, in Appendix B.1, we combine our rounding algorithm with a rounding framework and algorithm for fractional edge coloring of [CPW19] to obtain the first online edge coloring algorithm beating the naive greedy algorithm for online edge coloring under vertex arrivals with unknown maximum degree ∆ = ω(log n); specifically, we show (details in Theorem B.11) that (1.777 + o(1))∆-edge-colorings are attainable in this setting, approaching the lower bound of 1.606∆ proved by [CPW19].

Related Work
Since edge coloring is the problem of decomposing a graph into few matchings, it is natural to relate this problem to online matching.
The study of online matching was initiated by Karp, Vazirani and Vazirani [KVV90], whose main result was a positive one: they presented an optimal algorithm under one-sided vertex arrivals in bipartite graphs, showing in particular that the greedy algorithm's competitive ratio is suboptimal for this problem. Similar positive results were later obtained for several generalizations, including weighted matching [AGKM11, FHTZ20, BC21, GHH+21], budgeted allocation (a.k.a. AdWords) [MSVV07, HZZ20], and fully online matching [HKT+20, HTWZ20]. However, in the most general setting, i.e., under edge arrivals, the competitive ratio of the trivial greedy algorithm is optimal [GKM+19].
The study of online edge coloring was initiated by Bar-Noy, Motwani and Naor [BNMN92], who presented a negative result: they showed that the greedy algorithm is optimal, at least for low-degree graphs. Positive results were later obtained under random-order arrivals [AMSZ03, BMM12], culminating in a resolution of Conjecture 1.1 for such arrivals, using the nibble method [BGW21]. For adversarial arrivals, [CPW19, BSVW24] show that in bipartite graphs with one-sided vertex arrivals, the same conjecture holds. This was followed by progress in general graphs, under vertex arrivals [SW21] and edge arrivals [KLS+22], though using more than the hoped-for (1 + o(1))∆ colors. We obtain this bound in this work. Thus, we show that not only is the greedy algorithm suboptimal for online edge coloring, but in fact, in the most general edge arrival setting, the online problem is asymptotically no harder than its offline counterpart.

Preliminaries
Notation. As standard, we denote by N(v) and δ(v) the neighborhood and the set of incident edges of v, respectively, and denote the number of vertices and edges of G by n := |V| and m := |E|. We also denote by deg_H(v) the degree of vertex v in (sub)graph H, and use the shorthand deg(v) := deg_G(v). We say an event happens with high probability in a parameter k if it happens with probability at least 1 − k^{−c} for some constant c > 0.
Problem definition and notation. In the online problems studied in this paper, the input is an undirected simple graph G := (V, E) with known maximum degree ∆. Its edges arrive one at a time, with edge e_t ∈ E arriving at time t. An online edge coloring algorithm must color each edge e_t upon arrival with a color distinct from those of its adjacent edges. Similarly, an online matching algorithm must decide whether to match e_t upon arrival, if none of its endpoints are matched. For both problems, we consider randomized algorithms and assume that the input is generated by an oblivious adversary, which fixes the input graph and the edges' arrival order before the algorithm receives any input.³ The objective of edge coloring algorithms is to output a coloring using as few colors as possible, close to the offline optimal ∆ or ∆ + 1 colors [Viz64]. The objective of online matching algorithms is traditionally to output a large matching. However, due to the reduction mentioned in the introduction, and restated more formally below, our interest will be in online matching algorithms that match each edge with probability close to 1/∆.

Lemma 2.1 (Reduction ([CPW19, SW21])). Let A be an online matching algorithm that, on any graph of maximum degree ∆ = ω(log n), matches each edge with probability at least 1/(α · ∆), for α ⩾ 1. Then, there exists an online edge coloring algorithm A′ that on any graph with maximum degree ∆ = ω(log n) outputs an edge coloring with (α + O((log n/∆)^{1/4})) · ∆ colors with high probability in n.
In Appendix A, we generalize the above lemma, and use this generalization to obtain results for online list edge coloring (each edge has a possibly distinct palette) and for online local edge coloring (each edge e should be colored with a color of index not much higher than max_{v∈e} deg(v)). In particular, the appendix implies (see Lemma A.15) that one can reduce the slack above to (α + O((log n/∆)^{1/3})) · ∆ colors.
Martingales. A crucial ingredient in the analysis of our algorithms is the use of martingales.
Definition 2.2 (Martingale). A sequence of random variables Y_0, ..., Y_m is a martingale with respect to another sequence of random variables X_1, ..., X_m if the following conditions hold: for every k, the variable Y_k is a function of X_1, ..., X_k, satisfies E[|Y_k|] < ∞, and satisfies E[Y_k | X_1, ..., X_{k−1}] = Y_{k−1}.

The technical advantage of using martingales in our analysis is their amenability to specialized concentration inequalities which, unlike Chernoff–Hoeffding type bounds, do not require independence (or negative correlation) between the involved random variables. In particular, we will use a classic theorem due to Freedman providing a Chernoff-type bound depending only on the step size and on the observed variance of the martingale, with the latter defined as follows. For any possible outcomes (x_1, ..., x_{m−1}) of the random variables (X_1, ..., X_{m−1}), let

W_k := ∑_{j=1}^{k} Var[Y_j − Y_{j−1} | X_1 = x_1, ..., X_{j−1} = x_{j−1}]

be the observed variance encountered by the martingale on the particular sample path x_1, ..., x_{m−1} it took. To simplify notation, we usually assume that x_1, ..., x_{m−1} are chosen arbitrarily and write

W_k := ∑_{j=1}^{k} Var[Y_j − Y_{j−1} | X_1, ..., X_{j−1}].

Lemma 2.3 (Freedman's Inequality [Fre75]; see also [BDG16, Theorem 12], [HMRAR98, Theorem 3.15]). Let Y_0, ..., Y_m be a martingale with respect to the random variables X_1, ..., X_m. If |Y_k − Y_{k−1}| ⩽ A for all k ⩾ 1 and W_m ⩽ σ² always, then for any real λ ⩾ 0:

Pr[|Y_m − Y_0| ⩾ λ] ⩽ 2 exp(−λ² / (2(σ² + Aλ))).

We remark that we have tailored the inequality to our use, and a more general version holds [Fre75].
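For a concrete feel for the quantities in Freedman's inequality, the following self-contained simulation (our own illustration, not from the paper) compares the empirical tail of a lazy random walk, a martingale with step size A = 1 and observed variance W_m = 2pm, against the stated bound:

```python
import math
import random

def freedman_bound(lam, step, sigma2):
    """Tail bound Pr[|Y_m - Y_0| >= lam] <= 2*exp(-lam^2/(2*(sigma2 + step*lam)))."""
    return 2 * math.exp(-lam ** 2 / (2 * (sigma2 + step * lam)))

def lazy_walk(steps, p, rng):
    """A martingale: each step is +1 w.p. p, -1 w.p. p, 0 otherwise,
    so the step size is A = 1 and the observed variance is 2*p*steps."""
    y = 0.0
    for _ in range(steps):
        r = rng.random()
        if r < p:
            y += 1.0
        elif r < 2 * p:
            y -= 1.0
    return y

rng = random.Random(1)
steps, p, lam = 1000, 0.05, 30.0
sigma2 = 2 * p * steps  # = 100, the observed-variance bound
trials = 500
hits = sum(abs(lazy_walk(steps, p, rng)) >= lam for _ in range(trials))
assert hits / trials <= freedman_bound(lam, 1.0, sigma2)
```

As expected, the empirical deviation frequency sits well below the bound, since Freedman's inequality is not tight for this simple walk.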

Online Matching Algorithm
In this section, we design an online matching algorithm as guaranteed by Theorem 1.3.

Our First Matching Algorithm
We first describe our key modification of the basic algorithm presented in the introduction. Analogous to that algorithm, upon arrival of an edge e_t whose endpoints are still unmatched, we match e_t with a "scaled" probability P(e_t). However, and crucially for our analysis, our scaling factor will depend on the specific execution (random choices) of the algorithm. To illustrate this modification, refer to Figure 1, which depicts the neighborhood surrounding e_t. Here, we denote by e_{t_j} the j-th edge connecting a vertex in e_t with a vertex w_j, with these k = 7 edges appearing sequentially before e_t, with t_1 < t_2 < · · · < t_k < t. Suppose now that we fixed the randomness associated with all edges other than {e_t, e_{t_1}, e_{t_2}, ..., e_{t_k}}, and, conditioned on this event, which we refer to as R, let P_R(e_{t_j}) represent the probability that the algorithm will add the edge e_{t_j} to the matching, assuming that none of the preceding edges e_{t_1}, e_{t_2}, ..., e_{t_{j−1}} have been matched. Consequently, we have

Pr[u, v both unmatched until time t | R] = ∏_{j=1}^{k} (1 − P_R(e_{t_j})),

and overall, by total probability,

Pr[u, v both unmatched until time t] = E_R [ ∏_{j=1}^{k} (1 − P_R(e_{t_j})) ].

The primary complication with the initial algorithm in the introduction is the intricate correlation within the joint distribution of P_R(e_{t_1}), ..., P_R(e_{t_k}) as functions of the randomness R of edges outside the direct neighborhood of e_t. This correlation complicates both the computational aspect of determining the probability that vertices u and v are unmatched until time t, as well as the theoretical analysis of the algorithm's competitive ratio.
Our algorithm overcomes this challenge with an, in hindsight, simple strategy. We utilize a scaling factor conditional on the randomness R, i.e., we scale with respect to the "observed" probabilities, thus ensuring that the resulting online algorithm is both computationally efficient and (as we will see) theoretically tractable to analyze. Specifically, we obtain the following algorithm: when an edge e_t = (u, v) arrives, match it with probability

P(e_t) := (1/(∆ + q)) · 1/∏_{j=1}^{k} (1 − P(e_{t_j})) if u and v are still unmatched, and P(e_t) := 0 otherwise,

where e_{t_1}, ..., e_{t_k} are the previously-arrived edges incident to the endpoints of e_t.
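This first algorithm admits a compact implementation. The sketch below is our own rendering (names ours); it also returns the computed values P(e) so the ill-defined case discussed next can be observed, raising an error whenever some P(e_t) exceeds 1:

```python
import random

def matching_algorithm_1(delta, q, edges, rng=random):
    """Match each arriving edge e_t = (u, v), if both endpoints are free,
    with probability P(e_t) = 1/((delta+q) * prod over previously arrived
    incident edges e' of (1 - P(e'))).  May fail (P > 1) on some inputs."""
    P = {}              # edge -> the value P(e) assigned on its arrival
    incident = {}       # vertex -> previously arrived incident edges
    matched = set()     # currently matched vertices
    M = []
    for e in edges:
        u, v = e
        scale = 1.0     # the product of (1 - P(e')) over earlier incident edges
        for f in incident.setdefault(u, []) + incident.setdefault(v, []):
            scale *= 1 - P[f]
        if u in matched or v in matched:
            P[e] = 0.0
        else:
            P[e] = 1 / ((delta + q) * scale)
            assert P[e] <= 1, "ill-defined probability"
            if rng.random() < P[e]:
                M.append(e)
                matched.update(e)
        incident[u].append(e)
        incident[v].append(e)
    return M, P
```

For instance, on a star whose center sees edges one by one (and on a run where nothing gets matched), the successive values P are 1/(∆+q), then 1/((∆+q)(1 − 1/(∆+q))), and so on, exactly the scaling described above.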
Note that the values P(e_{t_j}) needed to compute P(e_t) are all defined at time t (and easy to compute), since any such edge e_{t_j} arrived before e_t. Moreover, assuming u and v are unmatched (free), these values equal P_R(e_{t_j}), where R is the event corresponding to the random bits used in this execution for the edges outside the neighborhood of e_t. Thus, if P(e_t) ⩽ 1 for every edge e_t, this algorithm is well-defined, and attains the right marginals, by total probability over R:

Pr[e_t ∈ M] = E_R [ Pr[u, v both unmatched until time t | R] · P(e_t) ] = 1/(∆ + q).

However, our new natural algorithm is ill-defined, as P(e_t) may exceed 1, even on trees. For example, suppose the edges in Figure 2 arrive in a bottom-up, left-to-right order, and no edge before e_t is matched by the algorithm (this is a very low-probability event). Then P(e_{t_1}) = (1/(∆ + q)) · 1/∏_i (1 − P(e_i)), where e_1, ..., e_{∆−2} are the edges below e_{t_1}. But the term ∏_i (1 − P(e_i)) is the probability that none of the edges e_i is matched by the algorithm, which is exactly 1 − (∆−2)/(∆+q) = (q+2)/(∆+q), since each e_i in this example is matched with probability exactly 1/(∆+q) and these are disjoint events. Hence P(e_{t_1}) = (1/(∆+q)) · (∆+q)/(q+2) = 1/(q+2). We can calculate P(e_{t_2}) in the same way, remembering to also scale up by 1/(1 − P(e_{t_1})), and we get P(e_{t_2}) = (1/(q+2)) · 1/(1 − P(e_{t_1})) = 1/(q+1). Continuing in this fashion, P(e_{t_i}) = 1/(q + 3 − i) for any i = 1, ..., q + 1 (importantly, P(e_{t_i}) < 1, so indeed with some non-zero probability the algorithm will not match any of them). Finally, by a similar calculation, P(e_t) = (q+2)/(q+1) > 1, making the algorithm undefined. However, in most runs of the algorithm, many bottom-level edges are matched, and so P(e_{t_i}) = 0 for many i, and P(e_t) is much smaller. In the above example, the event that P(e_t) exceeds one hinges on low-probability events. This therefore does not rule out matching probabilities of, say, 1/(∆ + q) − 1/∆³ ⩾ 1/(∆ + O(q)), so long as we avoid using the values P(e_t) as probabilities when P(e_t) > 1. In the next section we present a modification of Algorithm 1 doing just this, and provide an overview of its analysis.
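The recurrence in the example above can be checked mechanically with exact rational arithmetic. Here we fold the common factor contributed by the bottom-level edges into the constant 1/(q+2), so each P(e_{t_i}) is 1/(q+2) further scaled by 1/(1 − P(e_{t_j})) for every earlier j (the function name and the sample value q = 5 are ours):

```python
from fractions import Fraction

def tree_example_probs(q, count):
    """Values P(e_t^1), ..., P(e_t^count) in the tree example:
    P(e_t^1) = 1/(q+2), and each later value is additionally scaled
    by 1/(1 - P(e_t^j)) for every earlier j.  The products telescope,
    giving P(e_t^i) = 1/(q + 3 - i)."""
    probs = []
    scale = Fraction(1)          # running product of (1 - P(e_t^j))
    for i in range(1, count + 1):
        p = Fraction(1, q + 2) / scale
        probs.append(p)
        scale *= 1 - p
    return probs

q = 5
probs = tree_example_probs(q, q + 1)
assert probs == [Fraction(1, q + 3 - i) for i in range(1, q + 2)]
assert all(p < 1 for p in probs)   # each individual value is a valid probability
```

It is only the next edge e_t, scaled by all q + 1 of these factors at once, whose value overshoots 1.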

The Analysis-Friendly Matching Algorithm
As discussed, it is not at all clear whether the random variables P(e_t) in Algorithm 1, which we interpret as probabilities, are even valid probabilities, i.e., whether they are upper bounded by 1. To avoid working with potentially invalid probabilities, we use a slightly different variant of Algorithm 1, whose pseudocode is given by Algorithm 2. This variant not only addresses the ill-defined probability concern, but also introduces a more precise notation, facilitating our analysis. However, this increased precision might initially obscure the connection between Algorithm 2 and the more intuitive Algorithm 1.
To clarify this relationship, consider the scaling factor 1/∏_{j=1}^{k} (1 − P(e_{t_j})) for an edge e_t = (u, v), which can be partitioned based on the edges incident to vertices u and v. Let δ_t(u) denote the set of edges incident to u and arriving before e_t. We then define F_t(u) (and similarly F_t(v)) as follows:

F_t(u) := ∏_{e_{t_j} ∈ δ_t(u)} (1 − P(e_{t_j}))  and  F_t(v) := ∏_{e_{t_j} ∈ δ_t(v)} (1 − P(e_{t_j})).

Consequently, as G is a simple graph, the probability P(e_t) used in Algorithm 1 can be reformulated as

P(e_t) = (1/(∆ + q)) · 1/(F_t(u) · F_t(v)) if u and v are still unmatched by time t, and P(e_t) = 0 otherwise.
The above aligns with the pseudocode in Algorithm 2. Assuming that P̂(e_t) = P(e_t) (and using that G is a simple graph and hence has no parallel edges), it becomes evident that Algorithm 2 is equivalent to Algorithm 1 under the premise that all probabilities P(e_t) are at most one. The critical modification in Algorithm 2 is the introduction of P̂, possibly distinct from P, ensuring that P̂(e_t) ⩽ 1. This is accomplished by constraining the values of F_t(v), now redefined in terms of P̂, i.e., F_t(v) := ∏_{e_{t_j} ∈ δ_t(v)} (1 − P̂(e_{t_j})), to never fall below q/(4∆), implying P̂(e_t) ⩽ 1 for appropriately chosen q (see Observation 4.3).
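This variant can also be sketched in code. The sketch below is our own illustration: in particular, the specific capping rule, taking P̂ to be the minimum of P(e_t) and the largest probability keeping both endpoints' F-values at or above the floor q/(4∆), is one natural way to enforce the constraint and is not necessarily the paper's exact rule (names and parameters are ours):

```python
import random

def matching_algorithm_2(delta, q, edges, rng=random):
    """Variant of the matching algorithm keeping F(v) >= q/(4*delta)
    for every vertex, so the capped probabilities P_hat are always
    valid (illustrative capping rule, not the paper's pseudocode)."""
    floor = q / (4 * delta)
    F = {}              # vertex -> its current F-value (starts at 1)
    matched = set()
    M = []
    for (u, v) in edges:
        fu, fv = F.setdefault(u, 1.0), F.setdefault(v, 1.0)
        if u in matched or v in matched:
            p_hat = 0.0
        else:
            p = 1 / ((delta + q) * fu * fv)
            # cap so that fu*(1-p_hat) and fv*(1-p_hat) stay >= floor
            p_hat = min(p, 1 - floor / fu, 1 - floor / fv)
            if rng.random() < p_hat:
                M.append((u, v))
                matched.update((u, v))
        # F-values shrink by the factor (1 - p_hat), hence never below floor
        F[u] = fu * (1 - p_hat)
        F[v] = fv * (1 - p_hat)
    return M, F
```

With this cap, every p_hat lies in [0, 1] by construction, so the process is always well-defined, at the price that P̂ may fall below P on low-probability execution paths.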
Analysis: Intuition and Overview. As in Algorithm 1, it is not hard to prove that edge e_t is matched with probability 1/(∆ + q) if P̂(e_t) always equals P(e_t), as detailed in Section 4.1. Therefore, the meat of our analysis, in Section 4.2, focuses on proving that for any edge e_t, the equality P(e_t) = P̂(e_t) holds with high probability in ∆. For intuition why this should be true, note that if each of the deg_t(v) edges of vertex v arriving by time t is matched with probability 1/(∆ + q), then the value F_t(v), which intuitively stands for the probability of v being free at time t, should be

F_t(v) = 1 − deg_t(v)/(∆ + q) ⩾ 1 − ∆/(∆ + q) = q/(∆ + q) ≈ q/∆.

Above, the inequality follows from deg_t(v) ⩽ ∆ and the approximation follows by our choice of q = o(∆). Moreover, basic calculations (see Observation 4.4) imply that P(e_t) = P̂(e_t) if min{F_t(u), F_t(v)} ⩾ q/(3∆). In other words, P(e_t) ≠ P̂(e_t) only if the F_t(·)-value has dropped significantly below its "expectation" for one of the endpoints of e_t.
The core of the analysis then boils down to proving concentration bounds that imply, for any vertex v and time t, that F_t(v) ⩾ q/(3∆) with high probability in ∆. The values F_t(v) are non-increasing as t grows, so it suffices to prove this inequality for the final value F(v) := F_m(v).

Algorithm 2 (MatchingAlgorithm). At the arrival of edge e_t = (u, v) at time t:
• Define P̂(e_t), equal to P(e_t) = 1/((∆ + q) · F_t(u) · F_t(v)), capped so that no F-value drops below q/(4∆), if u and v are unmatched in M_t, and equal to 0 otherwise.
• Sample X_t and add e_t to the matching with probability P̂(e_t).

Let u_1, ..., u_ℓ be the neighbors of v in the final graph. By simple calculations (Lemma 4.5), we show that

F(v) ⩾ 1 − (1/(∆ + q)) · ∑_{i=1}^{ℓ} E_i / F_{t_i}(u_i),    (1)

where the binary random variables E_i are non-zero if and only if the neighbor u_i of v is unmatched by the time t_i at which the edge (v, u_i) arrives. If we denote by S := {u_i | E_i = 1} the set of such neighbors, and by Y := (1/(∆ + q)) · ∑_{i=1}^{ℓ} E_i / F_{t_i}(u_i) the sum in (1), then the expectation of Y can be shown to be at most ∆/(∆ + q). If the variables E_i and F_{t_i}(u_i) were independent, one could now use Chernoff–Hoeffding type bounds to conclude that Y ⩽ ∆/(∆ + q/2) with high probability in ∆, proving in the process that F(v) ⩾ q/(3∆) (see Lemma 4.5). However, in general, the events of different neighbors u_i of v being matched when (v, u_i) arrives are not independent, and so the variables E_i, F_{t_i}(u_i) are correlated, making such an approach inapplicable. We overcome this by interpreting the right-hand side of (1) as the final state of a martingale.
Concretely, our main idea is to parameterize the set S, the random variables E_i, and the sum Y over time. Assuming that the input stops at time step t ⩽ m, one can naturally define the analogues of S, E_i, and Y up to time t by S_t := {u_i ∈ N(v) | u_i not matched until time min{t, t_i}}, E_{t,i} := 1[u_i ∈ S_t] and

Y_t := (1/(∆ + q)) · ∑_{i=1}^{ℓ} E_{t,i} / F_{min{t,t_i}}(u_i).

So, at each time step, we either change the value of a term or drop a term. With this notation, we trivially have that S = S_m, E_i = E_{m,i} and Y = Y_m. The advantage of this representation is that Y_0, ..., Y_m turns out to be a martingale with respect to the random variables X_1, ..., X_m sampled by Algorithm 2. As Y_0 ⩽ ∆/(∆ + q), our objective reduces to proving that the following holds with high probability in ∆:

Y_m ⩽ ∆/(∆ + q/2).

As the martingale Y_0, ..., Y_m takes ∆² non-trivial steps (based on the two-hop neighborhood), it is not enough to use the maximum step size and the number of steps to argue about concentration, as done in, e.g., Azuma's inequality. However, we can bound the maximum step size and the observed variance, which is sufficient for applying Freedman's inequality (Lemma 2.3), yielding our desired result. In the next section we substantiate the above intuition, and analyze Algorithm 2.

Analysis of the Online Matching Algorithm
In this section we present the formal analysis of Algorithm 2, and prove that it matches each edge e_t with probability at least 1/(∆ + O(q)), with q to be chosen shortly. Our analysis is divided into two parts.
In the first part (Section 4.1), we prove that if P(e_t) = P̂(e_t), i.e., if the values F_t(v) for both v ∈ e_t are large enough, then we match e_t with probability at least 1/(∆ + q) (Lemma 4.2).
In the second part (Section 4.2), we remove the assumption that P(e_t) = P̂(e_t) and prove that Algorithm 2 achieves a matching probability of at least 1/(∆ + 4q), by showing that, with high probability in ∆, the values F_t(v) are large enough to guarantee P(e_t) = P̂(e_t). To prove the latter high-probability bound, we interpret a sufficient desired lower bound (Lemma 4.5) as the final state of a martingale, and use Freedman's inequality (Lemma 2.3) to prove that F_t(v) is likely sufficiently large for our needs.
Choice of q. We will use q := √200 · ∆^{3/4} · ln^{1/2} ∆, for reasons that will become clear in the proof of Lemma 4.10. For the rest of the section we will only make use of the following corollaries of our choice of q:

8√∆ ⩽ q ⩽ ∆/4.    (2)

Note that the upper bound on q not only follows from its choice (for sufficiently large ∆), but we may also assume this bound without loss of generality: if q > ∆/4, then simply picking a random color used by the (2∆ − 1)-color greedy online coloring algorithm will match each edge with probability 1/(2∆ − 1), greater than our desired 1/(∆ + 4q) matching probability.
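For concreteness, the two corollaries 8√∆ ⩽ q ⩽ ∆/4 can be verified numerically for a large ∆ (the specific value below is only illustrative; the bounds hold for all sufficiently large ∆):

```python
import math

def q_of(delta):
    """The paper's choice q := sqrt(200) * delta**(3/4) * ln(delta)**(1/2)."""
    return math.sqrt(200) * delta ** 0.75 * math.log(delta) ** 0.5

delta = 10**12
q = q_of(delta)
# the two corollaries used throughout the section:
assert 8 * math.sqrt(delta) <= q <= delta / 4
```

Note that q/∆ = Θ(∆^{−1/4} ln^{1/2} ∆) → 0, consistent with the requirement q = o(∆) used in the intuition above.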

A Sufficient Condition for 1/(∆ + q) Matching Probability
We begin with a simple observation, which will facilitate our characterization of random values associated with Algorithm 2 under various conditionings.
Observation 4.1. For any time t, the random variables F_t(v), P(e_t), and P̂(e_t) are determined by the current partial input e_1, ..., e_t and the current matching M_{t−1}.
Proof. Since P(e_t) and P̂(e_t) are determined by the values of the variables F_t(v), it suffices to prove the statement only for these latter variables. This follows by induction on t. For the base case, we have F_1(v) = 1 for all vertices, which implies the statement trivially. For the inductive step with t > 1, note that by construction F_t(v) is determined by the values {P̂(e_{t′}) : t′ < t and e_{t′} = (u′, v) is incident to v}. Any such value P̂(e_{t′}) is in turn a function of F_{t′}(u′), F_{t′}(v) and P(e_{t′}). By the inductive hypothesis, F_{t′}(u′) and F_{t′}(v) are functions of M_{t′−1} and therefore also functions of M_{t−1} (since t′ < t). Finally, the value P(e_{t′}) is determined by F_{t′}(u′), F_{t′}(v) and M_{t′−1}, completing the induction.

The following lemma formalizes the intuition discussed in Section 3; it proves that Algorithm 2 has the correct behavior for an edge e_t = (u, v) by assuming that the randomness outside the 1-neighborhood (δ(u) ∪ δ(v)) of e_t is fixed and P(e_t) = P̂(e_t).
Proof. Fix all randomness except for the edges incident to u and v; that is, we fix the outcomes ⃗x of the random choices for all edges not incident to u or v, an event we denote by A(⃗x). Above, for the first equality we used the independence of the different X_{t_i}, while for the last equality we partitioned the edges incident to u and v to obtain the factors f_t(u) and f_t(v), respectively. Note that, because the graph is simple and so parallel edges are disallowed, this partitioning is well defined, as none of the edges e_{t_i} connects u to v. If u or v is unmatched before time t, we get the desired expression. Since the random variable X_t is independent of M_t, the above then yields the desired equality when conditioning on A(⃗x). The lemma now follows by the law of total probability over all possible values of A(⃗x).
Note that a consequence of the above lemma is that, if the values of P and P̂ were to always coincide during the execution of Algorithm 2, then all edges would be matched with probability 1/(∆ + q). In the following section, we prove that the equality P(e_t) = P̂(e_t) holds with high probability in ∆ for any edge e_t, which, by simple calculations, implies a matching probability of 1/(∆ + O(q)).

Analysis of Algorithm 2 in General
In this section we wish to prove that P and P̂ coincide for each edge with high probability in ∆. This requires proving that F_t(v) is likely to be large for all times t and vertices v; intuitively, this means that the probability that v is unmatched is never too low. We start by observing a trivial lower bound on F_t(v) and upper bound on P̂(e) that follow directly from the algorithm's definition.

Observation 4.3. For any vertex v and any time t, it holds that F_t(v) ⩾ q/(4∆) and P̂(e_t) ⩽ 1/4.
Proof. For any fixed v, we prove the first statement by induction on t. For t = 0, using (2), we have F_t(v) = 1 ⩾ q/(4∆). Now, assume that F_t(v) ⩾ q/(4∆) for some t ⩾ 0. If either v ∉ e_t or P̂(e_t) = 0, then clearly F_{t+1}(v) = F_t(v) and the statement is proven. Otherwise, by the definition of P̂ it follows that F_{t+1}(v) ⩾ q/(4∆), and again the statement is proven. The fact that P̂(e_t) ⩽ 1/4 is now a consequence of the previously proven fact and q ⩾ 8√∆, by (2):

P̂(e_t) ⩽ 1/((∆ + q) · F_t(u) · F_t(v)) ⩽ (4∆/q)²/(∆ + q) ⩽ 16∆/q² ⩽ 16∆/(64∆) = 1/4.

To prove that P(e_t) = P̂(e_t) with high probability in ∆, we note that, by their construction in Algorithm 2, the values P̂(e_t) can only differ from P(e_t) if the values of the variables F_t(v) are too small, and in particular are close to their lower bound of q/(4∆) guaranteed by Observation 4.3.
Given Observation 4.4 and Lemma 4.2, it suffices to show that the probability that F_t(v) < q/(3∆) is very small for all time steps t. As F_{t+1}(v) ⩽ F_t(v), it is thus sufficient to bound this probability at the very last step, i.e., to bound the probability that F(v) < q/(3∆), where F(v) := F_m(v). In the following lemma, we identify a sufficient condition, Equation (3), for the condition F(v) ⩾ q/(3∆) (and thus also P(e) = P̂(e)) to hold. In particular, we lower bound F(v) by only focusing on the impact of neighbors of v on P̂(e_{t_i}) for edges e_{t_i} ∋ v and then applying the union bound.
Lemma 4.5. Let e_{t_1} = (u_1, v), …, e_{t_ℓ} = (u_ℓ, v) be the edges incident to v, arriving at times t_1 < ⋯ < t_ℓ, and let S := {u_i | u_i ∉ M_{t_i}} be those neighbors u_i that are not matched before the time t_i at which the edge e_{t_i} = (u_i, v) arrives. Then
F(v) ⩾ 1 − Σ_{u_i∈S} 1/((∆+q)·F_{t_i}(u_i)).
As a consequence, F(v) ⩾ q/(3∆) holds if
Σ_{u_i∈S} 1/((∆+q)·F_{t_i}(u_i)) ⩽ ∆/(∆ + q/2).   (3)
Proof. We track how F_t(v) develops throughout the run of Algorithm 2. When edge e_{t_i} = (u_i, v) arrives, the algorithm updates F(v), yielding the following lower bound on F_{t_i+1}(v), relying on P̂(e_{t_i}) ⩽ P(e_{t_i}):
F_{t_i+1}(v) ⩾ F_{t_i}(v) − 1/((∆+q)·F_{t_i}(u_i)).
Above, the value of F(v) can actually change at time t_i only if both v and u_i are not matched before time t_i, in which case in particular u_i ∈ S. In the alternative case, we have P̂(e_{t_i}) = P(e_{t_i}) = 0, and so F_{t_i+1}(v) = F_{t_i}(v). We conclude that F_{t_i+1}(v) − F_{t_i}(v) can be non-zero only for times t_i with u_i previously unmatched (u_i ∈ S), in which case F_{t_i}(v) − F_{t_i+1}(v) ⩽ 1/((∆+q)·F_{t_i}(u_i)). Since F_1(v) = 1 initially, the first part of the lemma follows by summing over all u_i ∈ S.
The second part of the lemma, whereby F(v) ⩾ q/(3∆) provided Equation (3) holds, now follows from the first part and a simple calculation, using that q ⩽ ∆/4 by Equation (2). Similar arguments to those in Section 4.1 can be used to prove that the sum Σ_{u_i∈S} 1/((∆+q)·F_{t_i}(u_i)) has expectation at most ∆/(∆+q). (This also follows from the subsequent martingale analysis.) That the expectation of this sum is bounded away from the upper bound of ∆/(∆ + q/2) required in (3) hints at using appropriate concentration inequalities, such as Chernoff–Hoeffding type bounds, to prove that the sum is at most ∆/(∆ + q/2) with high probability in ∆. Unfortunately, this is not a sum of independent or negatively correlated random variables in general, and therefore Chernoff–Hoeffding bounds are not applicable. Instead, in the subsequent sections we model the development of this sum as a martingale process, which will allow us to prove the desired concentration inequality without having to argue explicitly about correlations.

Our martingale process
Fix a vertex v. To prove that the sufficient condition F(v) ⩾ q/(3∆) of inequality (3) holds often in general, we view the development of the left-hand side of (3) as a martingale. For a time step t, define:
S_t := {u_i ∈ N(v) | u_i is not matched before time min{t, t_i}}  and  Y_t := Σ_{u_i∈S_t} 1/((∆+q)·F_{min{t,t_i}}(u_i)).
Recall that t_i is the time step at which the edge e_{t_i} = (v, u_i) arrives, and that N(v) is the final set of neighbors of v in the graph. Hence, S_t contains all neighbors of v in the final graph (including future neighbors), except those neighbors that were already matched by time min{t, t_i}. In particular, if u_i ∈ S_{t_i}, i.e., u_i was not matched by the time t_i at which it gets connected to v, then u_i remains inside all future sets S_t, for t ⩾ t_i. Also notice that both S_t and Y_t are unknown to the algorithm at time t, as their definition requires "future" knowledge of the input graph; they are used only in the analysis.
With the above notation, Y_0 = deg(v)/(∆+q) ⩽ ∆/(∆+q), and Y := Y_m equals the left-hand side of Equation (3). Moreover, Y_{t−1} is determined by the independent random variables X_1, …, X_{t−1} sampled by Algorithm 2. As we now show, Y_0, …, Y_m indeed form a martingale:
Lemma 4.6. Y_0, …, Y_m form a martingale w.r.t. the random variables X_1, …, X_m. Furthermore, the difference Y_t − Y_{t−1} is given by the following two cases:
• If e_t is added to M_{t+1}, which happens with probability P̂(e_t), then:
Y_t − Y_{t−1} = − Σ_{u_i ∈ S_t ∩ e_t, t < t_i} 1/((∆+q)·F_t(u_i)).   (4)
• If instead e_t is not added to M_{t+1}, which happens with probability 1 − P̂(e_t), then:
Y_t − Y_{t−1} = (P̂(e_t)/(1 − P̂(e_t))) · Σ_{u_i ∈ S_t ∩ e_t, t < t_i} 1/((∆+q)·F_t(u_i)).   (5)
Proof. To prove the martingale property, we check the conditions given by Definition 2.2. First, notice that fixing X_1, …, X_t in Algorithm 2 determines the set S_t and the values of the random variables F_t(·), P(e_t) and P̂(e_t). We first verify the claimed identities (4) and (5). If the edge e_t arriving at time t is not incident to any u_i ∈ S_t with t < t_i, then Y_{t+1} = Y_t deterministically, and (4) and (5) hold, as their right-hand sides equal 0. On the other hand, if e_t is incident to one or two such vertices u_i ∈ S_t, then we have two cases: e_t might be added to M_{t+1} or not. If e_t was matched, then S_{t+1} = S_t \ (S_t ∩ e_t), so the corresponding terms are dropped from the sum and identity (4) follows. Otherwise, if e_t was not matched, we have for any u_i ∈ S_t ∩ e_t that 1/F_{t+1}(u_i) = 1/(F_t(u_i)·(1 − P̂(e_t))), and identity (5) follows by summing the above over all u_i ∈ S_t ∩ e_t. Now E[Y_t | X_1, X_2, …, X_{t−1}] = Y_{t−1} follows by direct computation using (4) and (5). Hence, all conditions of Definition 2.2 are fulfilled, and indeed Y_0, …, Y_m forms a martingale w.r.t. X_1, …, X_m.

Bounding martingale parameters
We recall that our goal is to prove that Equation (3), and hence inequality (6), holds with high probability in ∆. Our idea for proving that inequality (6) holds often is to use specialized concentration inequalities for martingales, which require neither independence nor explicit bounds on positive correlations. Specifically, we will appeal to Freedman's inequality (see Lemma 2.3). To this end, in the following two lemmas we upper bound this martingale's step size and observed variance.
Lemma 4.8 (Step Size). For all times t and all realizations of the randomness, |Y_t − Y_{t−1}| ⩽ A, where A := 8∆/(q(∆+q)).
Proof. By using the expressions for the difference Y_t − Y_{t−1} from Lemma 4.6, we obtain the claimed bound. For the second inequality, first notice that P̂(e_t) ⩽ P(e_t) ⩽ 1/2 (by Observation 4.3), which implies P̂(e_t)/(1 − P̂(e_t)) ⩽ 1. Also, we have F_t(u_i) ⩾ q/(4∆) (by Observation 4.3) at any point of time in the algorithm.
For the third inequality we used the trivial fact that |S t ∩ e t | ⩽ 2.
We next upper bound the observed variance W_m. While the following proof is computation-heavy, we note that all of its manipulations are straightforward, except for the insight that the hard lower bound F(v) ⩾ q/(4∆) allows us to also bound Σ_{e∈δ(u)} P̂(e).
Lemma 4.9 (Observed Variance). For the martingale Y_t described above, we have:
Figure 3: The configuration described in Lemma 4.5, i.e., the 2-hop neighborhood of v. The (at most) ∆² edges drawn correspond to the only non-trivial steps of the martingale.
Proof. Using the expression for Y_t − Y_{t−1} (see Lemma 4.6), we first have: Note that the P̂(e_t)'s and F_t(u_i)'s depend on the variables X_1, X_2, …, X_{t−1} we are conditioning on, and that we will show the bound on the observed variance for any execution of the algorithm. The above sums contain either one or two terms, as |S_t ∩ e_t| ⩽ 2. By using the elementary inequality (a + b)² ⩽ 2a² + 2b², we obtain the following upper bound: The above expression can be rewritten compactly by factoring out the common term.
Using Observation 4.3, we have P̂(e_t) ⩽ P(e_t) ⩽ 1/2, and thus 1 + P̂(e_t)/(1 − P̂(e_t)) ⩽ 2 for any u_i ∈ S_t ∩ e_t. Additionally, we note that |S_t ∩ e_t| ⩽ 2, and so we have: By summing this inequality over all t, we obtain an upper bound on W_m. Notice that an edge e_t is thereby summed over on the right-hand side only if it is incident to some vertex u_i ∈ S_t at some time step t (see Figure 3). As S_t ⊆ S_0 = N(v), we obtain the following upper bound, double-counting edges e_t that connect two distinct vertices of S_t (for example, edge (u_2, u_3) in Figure 3): To upper bound the inner sum, fix a vertex u_i ∈ N(v) and note that: which implies that Σ_{e∈δ(u_i)} P̂(e) ⩽ ln(4∆/q) ⩽ ln(∆) (using that q ⩾ 8√∆ ⩾ 8 ⩾ 4). By plugging the above into (8) and using the fact that |N(v)| ⩽ ∆, we can finally conclude the proof of Lemma 4.9:
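The omitted inner-sum bound can be sketched as follows, assuming the multiplicative form F(u_i) = ∏_{e∈δ(u_i)}(1 − P̂(e)) of the algorithm's updates (as used in the proof of Lemma 4.6):

```latex
\frac{q}{4\Delta} \;\le\; F(u_i)
\;=\; \prod_{e\in\delta(u_i)}\bigl(1-\hat{P}(e)\bigr)
\;\le\; \exp\Bigl(-\sum_{e\in\delta(u_i)}\hat{P}(e)\Bigr),
```

where the last step uses 1 − x ⩽ e^{−x}; taking logarithms yields Σ_{e∈δ(u_i)} P̂(e) ⩽ ln(4∆/q).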

Conclusion of the analysis
Having upper bounded both the step size and the variance of the martingale Y_0, Y_1, …, Y_m =: Y, we are now ready to leverage Freedman's inequality to prove that, with high probability (in ∆), our desired upper bound on |Y − Y_0| of inequality (6) holds, and hence F(v) ⩾ q/(3∆).
To obtain the statement of Theorem 1.3, it remains to put the pieces together. With high probability in ∆, the inequality from Lemma 4.5 holds for both vertices incident to an edge e. Also, by a sequence of other lemmas, we know that this inequality implies that P̂(e) = P(e) (with high probability in ∆). Intuitively, this should guarantee the desired marginal probability 1/(∆ + O(q)) of matching e. More formally, we now complete the proof of our main technical contribution:
Theorem 1.3. There exists an online matching algorithm that, on graphs with known maximum degree ∆, outputs a random matching M in which every edge is matched with probability at least 1/(∆ + 4q).
Proof. We prove that Algorithm 2, with q as in this section, is such an algorithm. Fix an edge e_t = (u, v).
The marginal probability that e_t is matched by Algorithm 2 is given by Pr[X_t < P̂(e_t)]. We note that for the event X_t < P̂(e_t) to occur, we need both X_t < P(e_t) and P̂(e_t) = P(e_t) (else P̂(e_t) = 0, and so trivially X_t ⩾ P̂(e_t)). We therefore have that
Pr[X_t < P̂(e_t)] = Pr[(X_t < P(e_t)) ∧ (P̂(e_t) = P(e_t))] = Pr[X_t < P(e_t)] − Pr[(X_t < P(e_t)) ∧ (P̂(e_t) ≠ P(e_t))] ⩾ Pr[X_t < P(e_t)] − Pr[P̂(e_t) ≠ P(e_t)].
Above, the second and third inequalities follow from Lemma 4.2 and Lemma 4.10, together with a union bound over the two endpoints u, v ∈ e_t. The last inequality, on the other hand, is equivalent to 3q/((∆+q)(∆+4q)) ⩾ 4∆^{−3}, which clearly holds for large enough ∆, and can be verified to hold for all ∆ ⩾ 1 using q ⩾ 8√∆.
From Matching to Edge Coloring. We are finally ready to prove our main result: the existence of an online (1 + o(1))∆-edge-coloring algorithm. To do so, it suffices to combine our online matching algorithm of Theorem 1.3 with the known reduction from online edge coloring to online matching given by Lemma 2.1, or alternatively with our strengthening of the latter in Lemma A.15. Combining these, we obtain the following quantitative result.

APPENDIX A Online List Edge Coloring and Local Edge Coloring
In this section we present a number of applications that follow from our Algorithm 2. In particular, we show generalizations to online list edge coloring (Theorem A.1, in Appendix A.3) and online local edge coloring (Theorem A.2, in Appendix A.4).
In the online list edge coloring problem, every online-arriving edge e presents a list L(e) ⊆ ℕ of colors that may be used to color e. Unlike in the classical setting, this list L(e) is not necessarily of the form {1, …, ∆ + q} for some q ⩾ 1, but can instead be arbitrary. The objective of an algorithm is to provide a valid coloring of all edges, with each edge e assigned a color from its list, c(e) ∈ L(e), and no vertex having two incident edges assigned the same color. Our result for this problem is given by:
Theorem A.1 (Online List Edge Coloring). There exists an online list edge-coloring algorithm which, on n-vertex graphs of known maximum degree ∆, outputs with high probability (in n) a valid list edge coloring, provided all lists L(e) satisfy |L(e)| ⩾ ∆ + q, for q := 10^24 ln n + 10^4 · ∆^{3/4} ln^{1/2} ∆ + ∆^{2/3} ln^{1/3} n.
In the online local edge coloring problem, the setting is the same as in the classical version; however, we now aim to color each edge e = (u, v) using a color of index c(e) that is not much larger than the maximum degree of its endpoints, d_max(e) := max{deg(u), deg(v)}. Here we prove the following result:
Theorem A.2 (Online Local Edge Coloring). There exists an online edge-coloring algorithm which, on n-vertex graphs with a priori known degree sequence {deg(v) | v ∈ V}, computes, with high probability (in n), an edge coloring c : E → ℕ that colors each edge e using a color c(e) which satisfies:
To obtain both generalizations of Theorem 1.2 above, we rely on an online coloring subroutine, Algorithm 3, which we provide in the following section. This algorithm colors a (potentially strict) subset of the edges of the graph, and assigns each colored edge e a color from some individual (small) list of colors ℓ(e), revealed when e arrives. Under the mild technical condition (which holds with high probability for our invocations of this algorithm in later sections) that |ℓ(e)| ≈ λ for some appropriate parameter λ, the subroutine reduces the maximum degree ∆ of the remaining uncolored subgraph by ≈ λ. In other words, for all high-degree vertices in U (those with degree ≈ ∆), the algorithm colors ≈ λ of their incident edges. This subroutine is heavily inspired by [CPW19, Algorithm 4], which uses (instead of our Algorithm 2) an older online matching rounding algorithm, called MARKING [CW18, Waj20], for sampling matchings. Unlike our subroutine, which works under edge arrivals, the previous [CPW19, Algorithm 4] is restricted to one-sided vertex arrivals, because MARKING is also restricted to this regime. Moreover, our algorithm's more general ability to color from lists ℓ(e) for each edge e allows us to obtain both the list edge coloring and local edge coloring results, which could not be obtained from [CPW19, Algorithm 4].
In Section A.2, we present Algorithm 4, which applies Algorithm 3 successively on monotonically decreasing subgraphs of G to color its edges. This coloring algorithm is very flexible, in the sense that every arriving edge e presents a personalized list L(e) of colors that may be used to color it. This algorithm forms the underpinning of both theorems above, which are proven in later sections.

A.1 Subroutine to Reduce the Maximum Degree
In this section we provide an algorithmic subroutine for reducing the uncolored maximum degree.This will be used for our list and local edge coloring results, as well as our improvement on the reduction from edge coloring to matching of Lemma 2.1.
The pseudocode of this algorithmic subroutine is given in Algorithm 3, and it works as follows. Its input is a graph U with n vertices and maximum degree upper bounded by some ∆(U), arriving online edge-by-edge. (We think of U as the uncolored subgraph of an initial graph G on the same vertex set.) Let λ := ∆(U)^{2/3} ln^{1/3} n. We call a vertex v ∈ V dense if deg_U(v) ⩾ ∆(U) − λ. For any arriving edge e, the algorithm receives a list ℓ(e) of available colors. If e is incident to a dense vertex v, the list ℓ(e) is guaranteed to have size |ℓ(e)| ≈ λ. For each newly revealed color, we run a copy of Algorithm 2, guaranteeing that each edge is matched (and hence colored) by each of the colors c ∈ ℓ(e) with probability ≈ 1/∆.
Input: Graph U with n vertices, arriving edge-by-edge. Each edge e arrives with a list of colors ℓ(e).
Promise: A value ∆(U ) upper bounding the maximum degree of U is given.
Output: Coloring of a subset of edges of U .
When edge e together with list ℓ(e) arrives: Iterate through c ∈ ℓ(e): • If c was never seen before, launch a new instance of Algorithm 2 corresponding to c on U , using ∆(U ) as an upper bound for the maximum degree of U .
• Input the edge e to the instance of Algorithm 2 corresponding to c.
• If e was matched by Algorithm 2 and was not already colored, color e with color c.
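To make the bookkeeping concrete, the color-iteration logic of Algorithm 3 can be sketched as follows (a minimal Python sketch; the greedy `_match_edge` helper is a hypothetical stand-in for the c-th instance of Algorithm 2, which in the actual algorithm matches each edge only with probability ≈ 1/∆):

```python
class DegreeReducer:
    """Sketch of Algorithm 3's control flow: one matching instance per
    color; an edge is colored by the first color whose instance matches it."""

    def __init__(self):
        self.matchers = {}   # color -> matched vertices of that color's instance
        self.coloring = {}   # edge -> assigned color

    def _match_edge(self, c, e):
        # Hypothetical stand-in for the c-th copy of Algorithm 2:
        # greedily match e if both endpoints are free in instance c.
        matched = self.matchers.setdefault(c, set())  # new instance on first sight of c
        if any(v in matched for v in e):
            return False
        matched.update(e)
        return True

    def on_edge(self, e, colors):
        for c in colors:                      # iterate through c in l(e)
            if self._match_edge(c, e) and e not in self.coloring:
                self.coloring[e] = c          # color e with c if matched and uncolored
        return self.coloring.get(e)
```

Note that, as in Algorithm 3, every arriving edge is fed to the instance of every color on its list (not only until it is colored), so the per-color matchings evolve independently of the coloring decisions.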
As we now show, if the maximum degree of U is sufficiently large (in particular, larger than ln n), the above algorithm guarantees with high probability that dense vertices have ≈ λ of their incident edges colored, and therefore the maximum degree of the remaining uncolored graph decreases by roughly λ as well.
Theorem A.3. Every edge e colored by Algorithm 3 is assigned a color c ∈ ℓ(e). Moreover, after running Algorithm 3 on U with ∆(U) ⩾ 10^24 ln n, and using Algorithm 2 with q := 10² · ∆(U)^{3/4} ln^{1/2} ∆(U), with the necessary promises, the maximum degree of the remaining uncolored subgraph decreases by roughly λ, with high probability in n.
Before proving the above theorem, it will be convenient to collect some useful inequalities in a separate lemma, whose technical proofs (which are easy to verify asymptotically, but which also hold for all choices of n ⩾ 2 due to the large constants chosen) are deferred to the end of the section:
Fact A.4. Suppose ∆(U) ⩾ 10^24 ln n and define the following parameters: Then, these parameters satisfy the following inequalities:
Proof of Theorem A.3. That each edge e is assigned a color (if any) from ℓ(e) is immediate from the algorithm's description. The "meat" of the proof therefore lies in proving that removing the colored edges yields a subgraph of maximum degree as low as claimed in the theorem statement. Consider a dense vertex v of U, i.e., one with degree deg_U(v) ⩾ d − λ. (If no such vertex exists, the statement to prove follows trivially.) By the proof of Theorem 1.3, each edge e incident to v is picked by color c ∈ ℓ(e) with probability at least 1/(d + 4q′) ⩾ 1/(d + q), for q′ = √200 · d^{3/4} ln^{1/2} d, where 4q′ ⩽ q. Let X_e be the indicator variable for the event that e is colored by Algorithm 3 with some color c ∈ ℓ(e). As |ℓ(e)| ⩾ λ for edges at dense vertices, we have, by the first two terms of the Taylor expansion of exp(−λ/(d + q)), for λ/(d + q) ⩽ 1 (as in our case): Let X := Σ_{e∈δ(v)} X_e be the number of edges incident to v that are colored by Algorithm 3. We wish to argue that X is not much less than the lower bound E[X] ⩾ µ := (d − λ) · (λ/(d + q)) · (1 − λ/(d + q)) on its expectation. (This lower bound follows from our lower bound on Pr[X_e = 1], linearity of expectation, and the lower bound on the number of edges incident to dense vertices.)
We now argue that, with high probability in n, X does not fall far below this lower bound of µ. This would follow easily from standard Chernoff bounds if the X_e were independent, though they clearly are not. However, we can interpret these variables as indicator variables of a balls-and-bins process, which are negatively associated (NA), and hence admit the same Chernoff bounds as independent random variables [DR96, Waj17]. In more detail, we have a ball for each color c, and each ball (color) c falls into a (single) bin corresponding to either an edge e of v or a dummy bin, depending on whether or not e is matched by the c-th copy of Algorithm 2, which happens independently (but not i.i.d.) across different c. The values X_e are therefore indicators of whether bin e is non-empty, and these random variables are NA [Waj20, Corollary 2.4.6]. Consequently, the sum X of the NA (but not independent) random variables X_e, which satisfies E[X] ⩾ µ, also satisfies the following standard Chernoff bound for any ε < 1: Fix ε := √(30 ln n / µ). By Equation (12) from Fact A.4, we have that ε ∈ [0, 1]. On the other hand, as we shall soon see, we also have inequality (14). To see this, first note the expression for µ above; furthermore, µ ⩾ λ − 2λ · (q + λ)/(∆ + q), by (10) from Fact A.4. We obtain (14) by summing up these last two inequalities. Consequently, combining (13) and (14) and using our choice of ε, we obtain the desired concentration. The claimed upper bound on the maximum degree of U′ then follows: this bound holds for all non-dense vertices in U (which is a supergraph of U′), and, by a union bound over all dense vertices v (which all have degree at most ∆(U) in U), with high probability in n all vertices in the new graph U′ have degree at most the claimed bound, where the last inequality follows from (11) from Fact A.4.
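As a quick numerical sanity check (with hypothetical parameter values), the Taylor-type lower bound 1 − (1 − 1/(d+q))^λ ⩾ (λ/(d+q))·(1 − λ/(d+q)) used above for Pr[X_e = 1] can be verified directly:

```python
def match_prob_bounds(lam, d, q):
    """Exact probability that at least one of lam independent per-color
    trials (each succeeding with probability 1/(d+q)) succeeds, together
    with the two-term Taylor lower bound used in the proof of Theorem A.3."""
    p = 1.0 / (d + q)
    exact = 1 - (1 - p) ** lam      # at least one of lam trials succeeds
    bound = (lam * p) * (1 - lam * p)  # two-term Taylor lower bound
    return exact, bound
```

The bound is valid whenever λ/(d + q) ⩽ 1, matching the condition stated in the proof.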
We now turn to using this subroutine to completely color a given graph revealed online.

A.2 Generic Coloring Algorithm
Strategy. For our subsequent applications, we consider graphs G with maximum degree known to be at most ∆, with edges arriving online one by one. Moreover, every arriving edge e provides a list of available colors L(e). We will repeatedly apply Algorithm 3, consuming different sublists ℓ(e) ⊆ L(e) of colors per phase, to reduce the maximum degree of the currently uncolored graph U ⊆ G by coloring subsets of the edges of U. The following definition introduces the relevant parameters and is used to define Algorithm 4:
Definition A.5 (Degree Sequence). Let d_0 := ∆ be (an upper bound on) the maximum degree of a graph G with n vertices. For i ⩾ 0, we define the following: Let f be the minimal value for which d_f < 10^24 ln n. (Such an f exists since, as long as d_i ⩾ 10^24 ln n, Fact A.4 guarantees that the sequence keeps strictly decreasing.)
Algorithm 4 (Generic Coloring Algorithm).
Input: Graph G with n vertices, arriving edge-by-edge together with lists L(e) of available colors. We use the notation introduced in Definition A.5, and denote by C := ∪_e L(e) the set of all colors.
Output: Coloring of a subset of edges of G.
• Let C 0 , . . ., C f +1 be a partitioning of C which is computed online.
-For any edge e incident to a dense vertex in U i (i.e., having degree ⩾ d i − λ i ), use the sublist ℓ i (e) := L(e) ∩ C i to input online to Algorithm 3.
- Set U_{i+1} ← U_i \ {edges colored by Algorithm 3 in phase i}.
• Try to color the final uncolored graph U_{f+1} using Greedy, with the remaining lists of available colors ℓ_{f+1}(e) := L(e) ∩ C_{f+1} for each edge e.
Let C := ∪_{e∈G} L(e) be the set of all colors. Note that this set is a priori unknown to the online algorithm and is only revealed indirectly through the lists L(e) of the arriving edges. Algorithm 4 partitions this set online into f + 2 subsets of colors C_0, …, C_{f+1}. More concretely, whenever the arriving list L(e) of an edge e contains a color c ∈ L(e) which is seen for the first time, the algorithm decides online to which of the sets C_0, …, C_{f+1} color c will be assigned. We stress that this choice can be made arbitrarily, depending on the application; however, there is a restriction, which we discuss in the next paragraph.
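The online partitioning step can be sketched as follows (a minimal Python sketch; the round-robin assignment rule is purely illustrative, since the choice may be made arbitrarily, subject to the admissibility restriction discussed next):

```python
class ColorPartitioner:
    """Sketch of the online color partitioning of Algorithm 4: each color is
    assigned to one of the classes C_0, ..., C_{f+1} the first time it is seen."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.klass = {}                  # color -> class index, fixed on first sight

    def assign(self, c):
        if c not in self.klass:          # decide online, irrevocably
            self.klass[c] = len(self.klass) % self.num_classes
        return self.klass[c]

    def sublist(self, L, i):
        """The sublist l_i(e) := L(e) ∩ C_i used in phase i."""
        return [c for c in L if self.assign(c) == i]
```

Because each color's class is fixed the first time it appears, the sublists ℓ_i(e) extracted for different edges are consistent with one global partition of C.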
The i-th phase is the iteration of Algorithm 4 on the uncolored subgraph U_i. The partitioning of C into C_0, …, C_{f+1} defines the colors to be used in each phase. Accordingly, ℓ_i(e) := L(e) ∩ C_i is the chosen sublist of L(e) which is fed online to Algorithm 3 for edge e during the execution of the i-th phase. Let L_i(e) := L(e) \ ∪_{j=0}^{i−1} ℓ_j(e) be the set of colors still available for e during the i-th phase. For all edges e, the sublists ℓ_i(e) ⊆ L_i(e) must have the size required to apply Theorem A.3, that is, λ_i ⩽ |ℓ_i(e)| ⩽ λ_i + 10√(λ_i ln n). This motivates the following:
Definition A.6 (Admissible Partitioning of Colors). Given a fixed input of Algorithm 4, let C_0, …, C_{f+1} be a partitioning of the set of colors C := ∪_{e∈G} L(e) which is computed online. We say that the partitioning C_0, …, C_{f+1} is admissible if, for any edge e and phase i ∈ {0, …, f} in which e is incident to a dense vertex in U_i (i.e., one having degree ⩾ d_i − λ_i), the sublist ℓ_i(e) := L(e) ∩ C_i satisfies:
By successively applying Algorithm 3 (starting with the initial graph U_0 := G), as done in Algorithm 4, and using an admissible (online) partitioning of C into C_0, …, C_{f+1} as defined in Definition A.6, one obtains a sequence of subgraphs such that, by successive applications of Theorem A.3, we obtain the following:
Lemma A.7. With high probability in n, the graphs U_i computed online by Algorithm 4 (consisting of the yet-uncolored edges) have maximum degree bounded by d_i. Moreover, for i ⩽ f we have d_i ⩾ 10^24 ln n.
Proof. Let good_i be the event that U_i has maximum degree at most d_i, for i ∈ {0, …, f}. Clearly, good_0 holds, as d_0 is an upper bound on the maximum degree of the initial graph G. Assuming good_i holds, consider the i-th phase of Algorithm 4. Then, by Theorem A.3, the probability that good_{i+1} holds is at least 1 − 1/n^{10}. Hence: By induction it easily follows that Pr[good_i] ⩾ 1 − i/n^{10} for any i ∈ {0, …, f}. For i = 0 this holds with probability 1, and for the induction step we use (17) together with the induction hypothesis to obtain: In particular, for the complementary events we have Pr[¬good_i] ⩽ 1/n^9 for any i, and the lemma follows by a union bound over the at most n possible values of i.
Analysis. We have designed a generic coloring algorithm which, with high probability, successively reduces the maximum degree of the uncolored subgraphs U_i until it drops below O(log n). It remains to argue how to ensure the following properties, required implicitly by Algorithm 4:
• The lists L_i(e) of remaining colors need to be sufficiently large to allow extracting the sublists ℓ_i(e) ⊆ L_i(e) with |ℓ_i(e)| ⩾ λ_i, as required by the application of Algorithm 3 inside Algorithm 4.
• In particular, to color U_{f+1} successfully using Greedy, one needs to ensure that |L_{f+1}(e)| ⩾ 2d_{f+1}.
To obtain these guarantees, we maintain by induction, throughout all phases i, the property that |L_i(e)| ⩾ d_i + a_i for edges incident to dense vertices, where a_i is a large enough slack. We define these slacks precisely:
Definition A.8 (Slack Sequence). Let d_0 ⩾ 1, and consider the degree sequence D(d_0) of d_0 and the parameters f, λ_i, q_i as in Definition A.5. We define the slack sequence Sl(d_0) := {a_i : 0 ⩽ i ⩽ f + 1}, where: As anticipated, we prove by induction that these slacks fulfill the required guarantees: Lemma A.9. To finish the analysis, it remains to upper bound the slacks a_i in closed form, as opposed to the (convenient but) recursive definition from Definition A.8. In particular, the upper bounds on a_i indicate how large the lists L_i(e) need to be for Algorithm 4 to succeed. It is clear that: where the last inequality follows because all three terms inside the parentheses are non-increasing in j.
By the above manipulations, it remains to upper bound the quantity f − i + 1, where f is the number of phases. For this purpose, the following lemma is helpful:
Lemma A.10. Let a > 0 be any number. Consider a sequence (x_k)_{k⩾0} of non-negative integers such that x_0 ⩾ 1 and x_{k+1} = x_k − ⌈a · x_k^{2/3}⌉. Then x_k ⩽ 1 for any (integer) k ⩾ 3x_0^{1/3}/a.
Proof. We prove the statement by strong induction on the starting value x_0 ⩾ 1 of the sequence. If x_0^{1/3} < a, then a · x_0^{2/3} > x_0, and in particular x_1 = x_0 − ⌈a · x_0^{2/3}⌉ < 0, so the statement holds. Now assume instead that x_0^{1/3} ⩾ a, and that the statement holds for any integer x′_0 < x_0. We apply the induction hypothesis to the sequence starting with x′_0 := x_1 = x_0 − ⌈a · x_0^{2/3}⌉ (trivially x_1 < x_0). We may assume x_1 ⩾ 1, else the statement already holds. The induction hypothesis gives us that x_{k+1} ⩽ 1 for k ⩾ 3x_1^{1/3}/a. It thus suffices to prove that 3x_1^{1/3}/a + 1 ⩽ 3x_0^{1/3}/a, which we rearrange into x_1^{1/3} + a/3 ⩽ x_0^{1/3}. Taking the third power of both sides (which are indeed both positive), the resulting inequality follows from the facts that ⌈a·x_0^{2/3}⌉ ⩾ a·x_0^{2/3}, and a²·x_0^{1/3}/3 ⩾ a³/3 > a³/9 (since we assumed x_0^{1/3} ⩾ a). Thus the induction proof is concluded and the lemma proven.
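The recursion of Lemma A.10 is also easy to check numerically; the sketch below (with hypothetical parameter values) iterates x ← x − ⌈a · x^{2/3}⌉ and counts the steps until x ⩽ 1:

```python
import math

def steps_to_one(x0, a):
    """Iterate x <- x - ceil(a * x**(2/3)) until x <= 1, returning the number
    of steps; Lemma A.10 predicts at most 3 * x0**(1/3) / a steps."""
    x, k = x0, 0
    while x > 1:
        x -= math.ceil(a * x ** (2 / 3))
        k += 1
    return k
```

This matches the use of the lemma in the next proof, where the degree sequence d_i plays the role of x_k.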
We are now ready to upper bound f − i + 1.
Proof. Fix such an i. Notice that for any k with i ⩽ k + i ⩽ f, we have d_{k+i+1} ⩽ d_{k+i} − 0.9λ_{k+i} (Fact A.4), where 0.9λ_{k+i} = 0.9 · d_{k+i}^{2/3} ln^{1/3} n, so that d_{k+i+1} ⩽ d_{k+i} − ⌈a · d_{k+i}^{2/3}⌉ for a suitable a = Θ(ln^{1/3} n), for n, d_{k+i} ⩾ 10 and any k with i ⩽ k + i ⩽ f. By Lemma A.10, the sequence (d_{k+i})_{k⩾0} will drop below 1 after at most k_max := 6 · (d_i / ln n)^{1/3} steps. However, f is defined as the highest index k + i for which d_{k+i} ⩾ 10^24 ln n ⩾ 1. This implies the claimed upper bound on f − i + 1.
Finally, we can now upper bound the slacks a_i:
Lemma A.12 (Upper bounding the slacks a_i). For d_0 ⩾ 1, define the slack sequence Sl(d_0) according to Definition A.8. We have, for any i ∈ {0, …, f}:
Proof. We begin by recalling (18): Replacing λ_i = d_i^{2/3} ln^{1/3} n and applying Lemma A.11, we get: which is the claimed statement.
Furthermore, we have εµ ⩽ 5√(λ_i ln n). Indeed, this is equivalent (by squaring) to: This last expression is equivalent to d_i ⩾ (90/7)^4 ln n, which clearly holds. Hence, the Chernoff inequality gives: Now that the lemma is proven, it follows easily by a union bound that, with high probability in n, all induced sublists ℓ_i(e) in all phases i ∈ {0, …, f} have the required size, and so condition 1 of Theorem A.13 is fulfilled. This finishes the proof of Theorem A.1.
An improvement of Lemma 2.1. As a final note, we observe that the slack q in Theorem A.1 naturally decomposes into two parts. The first part is the O(∆^{3/4} log^{1/2} ∆) term, which comes directly from the application of Theorem A.3, which in turn relies on the fact that, by Theorem 1.3, we have access to an online matching algorithm that matches any edge e with probability 1/(∆ + Θ(∆^{3/4} log^{1/2} ∆)). The second part of the slack is the O(∆^{2/3} log^{1/3} n) term, which comes from our choice of λ in Algorithm 4. However, it is not hard to see that these two parts of the final slack q in Theorem A.1 arise independently of each other; more generally, given an online matching algorithm with a per-edge matching guarantee other than 1/(∆ + Θ(∆^{3/4} log^{1/2} ∆)), the corresponding first term of the final slack q would change accordingly. More concretely, for the classical online edge coloring problem, we can generalize Lemma 2.1 to obtain the following reduction from online edge coloring to online matching:
Lemma A.15 (Improved Reduction). Let A be an online matching algorithm that, on any graph of maximum degree ∆ = ω(log n), matches each edge with probability at least 1/(α · ∆), for some α ⩾ 1. Then there exists an online edge coloring algorithm A′ that, on any graph with maximum degree ∆ = ω(log n), outputs an edge coloring with (α + O((log n/∆)^{1/3})) · ∆ colors, with high probability in n.

A.4 Local Edge Coloring
In this section we consider online local edge coloring, where we recall that we wish to color each edge e = (u, v) with a color not much higher than d max (e) := max{deg(u), deg(v)}.Our main result for this problem is the following.

Theorem A.2 (Online Local Edge Coloring).
There exists an online edge-coloring algorithm which, on n-vertex graphs with a priori known degree sequence {deg(v) | v ∈ V}, computes, with high probability (in n), an edge coloring c : E → ℕ that colors each edge e using a color c(e) which satisfies:
Proof. The statement of Theorem A.2 is almost an immediate consequence of our more general Theorem A.13. For i ∈ {0, …, f + 1}, let d_i and a_i denote the entries of the degree sequence and slack sequence of d_0 := ∆(G), as defined in Definition A.5 and Definition A.8 (see also Theorem A.13). We define the set of colors C to be C := {1, …, d_0 + a_0} and propose the following partitioning: The fact that this is a valid partitioning follows from Lemma A.9.
To finish the proof of the theorem, it suffices to show the following:
Lemma A.17. With g(e) as defined above, we have:
As in the case of Lemma A.16, we defer the proof (see below). By combining (27) and (28), we get the desired upper bound for c(e), and the statement of Theorem A.2 follows.
We now present the deferred proofs of Lemma A.16 and Lemma A.17:
Algorithm 5 (MatchingAlgorithmChanges).
At the arrival of edge e_t = (u, v) at time t: sample X_t ∼ [0, 1] uniformly at random.
Theorem B.1. Let x ∈ ℝ^E be a fractional matching of some graph G, which is revealed online, and which satisfies x_e ⩽ ε for all edges e, for some known ε ⩽ 0.99. Then there exists a randomized online matching algorithm whose output matching M satisfies, for any edge e:
Remark B.2. Note that in general graphs, fractional matchings can be 3/2 times larger than their largest integral counterparts, as exemplified by a triangle with values x_e = 1/2 for all its edges. That is, the integrality gap of this relaxation of matchings is 3/2. Nonetheless, our algorithm works for non-bipartite graphs, and for sufficiently spread-out fractional matchings we round them almost losslessly to integral matchings. This does not contradict the integrality gap of this relaxation in general graphs, as all "odd set" constraints in the integral matching polytope [Edm65] are approximately satisfied for spread-out fractional matchings with x_e ⩽ ε.
On the other hand, when rounding fractional matchings in non-bipartite graphs, it is necessary to incur some loss (with respect to ε), even in offline settings.
The new algorithm, a straightforward adaptation of Algorithm 2, is given by Algorithm 5. The following results are proven analogously to their counterparts from Section 4, and their proofs are therefore omitted here:
Observation B.3 (Corresponds to Observation 4.3). F_t(v) ⩾ s(ε)/4 and P̂(e_t) ⩽ P(e_t) ⩽ ε/s²(ε) for every vertex v ∈ V and time t.
Observation B.4 (Corresponds to Observation 4.1). For any t, the random variables F_t(v), P(e_t), P̂(e_t) are determined by the current partial input e_1, …, e_t and the current matching M_{t−1}.
Lemma B.5 (Corresponds to Lemma 4.2). For any edge e_t, it holds that Pr[X_t < P(e_t)] = x_{e_t} · (1 − s(ε)).
The proof of Lemma B.6 (just like its counterpart, Observation 4.4) requires P̂(e_t) ⩽ ε/s²(ε) ⩽ 1/4. This can be achieved by imposing: However, as C is supposed to be a constant, we still need to check that the right-hand side of the above inequality is also a constant. By the statement of Theorem B.1, we have that ε ⩽ 0.99, and it is easy to check that the relevant function of ε is bounded on the interval (0, 0.99). Hence, the right-hand side of (31) is a constant. The analogue of Lemma 4.5 is:
Lemma B.7 (Corresponds to Lemma 4.5). Let e_{t_1} = (u_1, v), …, e_{t_ℓ} = (u_ℓ, v) be the edges incident to v, with t_1 < ⋯ < t_ℓ. Further, let S := {u_i | u_i ∉ M_{t_i}} be those neighbors u_i that are unmatched at the time t_i when edge e_{t_i} = (u_i, v) arrives. If:
then F(v) ⩾ s(ε)/3.
For ease of notation, let x_i := x_{e_{t_i}} be the fractional input of the edge e_{t_i} = (u_i, v), which connects v to its neighbor u_i. We will derive a martingale from quantities of the form x_i · (1 − s(ε)) / F_{min{t,t_i}}(u_i).
Claim B.8 (Corresponds to Lemma 4.6). Y_0, …, Y_m form a martingale w.r.t. the random variables X_1, …, X_m. Furthermore, the difference Y_t − Y_{t−1} takes one of two values, depending on whether e_t is added to M_{t+1} (which happens with probability P(e_t)) or not (which happens with probability 1 − P(e_t)).

To apply Freedman's inequality, as in Section 4, bounds on the step size and on the variance of the martingale are required. They are given by the following two lemmas:

Lemma B.9 (Corresponds to Lemma 4.8). For all times t and all realizations of the randomness, |Y_t − Y_{t−1}| ⩽ A, where A := 8ε/s(ε).
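For reference, Freedman's inequality in its standard form (the constants invoked in Section 4 may be stated slightly differently): if Y_0, …, Y_m is a martingale with step size |Y_t − Y_{t−1}| ⩽ A and predictable quadratic variation Σ_t E[(Y_t − Y_{t−1})² | F_{t−1}] ⩽ σ², then

```latex
% Freedman's inequality, standard two-sided form:
\Pr\big[\,|Y_m - Y_0| \geq \lambda\,\big]
  \;\leq\; 2\exp\!\left(-\frac{\lambda^2}{2\left(\sigma^2 + A\lambda/3\right)}\right).
```

Thus a bound on the step size A (Lemma B.9) and on the variance σ² (Lemma B.10's role) together yield concentration of Y_m around Y_0.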
Theorem B.11. There exists an online algorithm which, on n-vertex general graphs with maximum degree ∆ only known to satisfy a lower bound ∆ ⩾ ∆′ = ω(log n), with the graph revealed vertex-by-vertex, computes a (1.777 + o(1)) · ∆-edge-coloring with high probability.
[CPW19] study the following relaxation of edge coloring: a fractional α∆-edge-coloring consists of α∆ many fractional matchings such that each edge is matched to an extent of (at least) one when summed across all fractional matchings. In their terminology, a graph is fractionally k-edge-colorable if the following LP has a solution:

Σ_{c ∈ [k]} x_{e,c} ⩾ 1   ∀e ∈ E,
Σ_{e ∈ δ(v)} x_{e,c} ⩽ 1   ∀v ∈ V, c ∈ [k],
x_{e,c} ⩾ 0   ∀e ∈ E, c ∈ [k].

For graphs with unknown degree, [CPW19] provide online fractional edge coloring algorithms using (e/(e − 1))∆ and 1.777∆ matchings, under one-sided vertex arrivals in bipartite graphs and arbitrary vertex arrivals in general graphs, respectively. Both fractional algorithms maintain collections of fractional matchings {x_{e,c}}_c with bounded ℓ∞ norm, max_e x_{e,c} = o(1), whenever ∆ = ω(1). [CPW19] further provide (see their Algorithm 2) a rounding framework for these: they show how to convert online fractional α∆-edge-coloring algorithms with the above bounded ℓ∞ norm guarantee into (randomized) online (α + o(1))∆-edge-coloring algorithms for graphs with unknown maximum degree ∆ satisfying ∆ ⩾ ∆′ = ω(log n), with ∆′ known. (The above o(1) term is of the form poly(log n/∆′).) An important ingredient for their framework is a (1 + o(1))-approximate rounding scheme for online fractional matchings ⃗x with max_e x_e = o(1). Such a rounding scheme for fractional matchings under one-sided vertex arrivals in bipartite graphs was given by [CW18] (see [Waj20, Chapter 5]). Our Theorem B.1 provides such a rounding scheme under edge arrivals (and hence also under vertex arrivals) in general graphs. Combining our new rounding scheme with the rounding framework of [CPW19] allows us to round their online fractional 1.777∆-edge-coloring algorithm and obtain an online (1.777 + o(1))∆-edge-coloring algorithm under general vertex arrivals, as claimed.
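To make the LP concrete, note that every graph is trivially fractionally (2∆ − 1)-edge-colorable, matching greedy's palette size, by splitting each edge uniformly, x_{e,c} = 1/(2∆ − 1). A sketch checking both constraint families on a small hypothetical graph:

```python
from collections import defaultdict

# 5-cycle plus a chord: a small hypothetical test graph with max degree 3.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
deg = defaultdict(int)
for u, v in edges:
    deg[u] += 1
    deg[v] += 1
Delta = max(deg.values())

k = 2 * Delta - 1  # greedy's palette size; fractionally k-colorable via uniform split
x = {(e, c): 1.0 / k for e in edges for c in range(k)}

# Constraint 1: each edge is (fractionally) fully colored.
for e in edges:
    assert abs(sum(x[(e, c)] for c in range(k)) - 1.0) < 1e-9

# Constraint 2: each color class is a fractional matching at every vertex,
# since deg(v)/(2*Delta - 1) <= Delta/(2*Delta - 1) < 1.
for v in deg:
    for c in range(k):
        assert sum(x[(e, c)] for e in edges if v in e) <= 1.0 + 1e-9

print(f"fractional {k}-edge-coloring feasible, Delta = {Delta}")
```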

Figure 1: An example of the neighborhood of e_t = (u, v) with k = 7.

Figure 2: Example where Algorithm 1 might be undefined.

Let S_t := {u_i ∈ N(v) | u_i ∉ M_{min{t,t_i}}} and Y_{t−1} := Σ_{u_i ∈ S_t} x_i · (1 − s(ε)) / F_{min{t−1,t_i}}(u_i). Let A(⃗x) denote the arrival times of the edges in δ(u) ∪ δ(v) before time t. The only randomness left up to the point of e_t's arrival is then given by the random variables X_{t_1}, …, X_{t_ℓ}, which are independent of A(⃗x). For the selection of e_t to be possible, we need to condition on the event that none of the edges e_{t_1}, …, e_{t_ℓ} are taken into the matching. We note that conditioning on A(⃗x) and e_{t_1}, …, e_{t_ℓ} ∉ M_t completely determines M_{t′} for all time steps t′ ⩽ t. Using Observation 4.1, we thus have that P(e_{t_1}), …, P(e_{t_ℓ}) and F_t(u), F_t(v) are uniquely determined under this conditioning. Let p(e_{t_1}), …, p(e_{t_ℓ}) and f_t(u), f_t(v) be the concrete values of these random variables under this conditioning.

Assume that |L(e)| ⩾ 2·10²⁴ ln n for all edges e. Furthermore, assume that, before the execution of the i-th phase of Algorithm 4, for any edge e connected to a dense vertex, |L_i(e)| ⩾ d_i + a_i. Then, after the execution of the i-th phase, one has |L_{i+1}(e)| = |L_i(e) \ ℓ_i(e)| ⩾ d_{i+1} + a_{i+1}. Furthermore, executing Greedy on U_{f+1} in Algorithm 4 is possible, as |L_{f+1}(e)| ⩾ 2d_{f+1} for all edges e ∈ U_{f+1}. Indeed,

|L_{i+1}(e)| = |L_i(e)| − |ℓ_i(e)| ⩾ d_i + a_i − |ℓ_i(e)| ⩾ d_i + a_i − (λ_i + 10√(λ_i ln n)) = d_{i+1} + a_{i+1},

where the last inequality follows by the definitions of a_i and d_{i+1} (see Definition A.5 and Definition A.8). As for the property |L_{f+1}(e)| ⩾ 2d_{f+1}: if e was ever connected to a dense vertex in some phase i, then the property follows because |L_{f+1}(e)| ⩾ d_{f+1} + a_{f+1} ⩾ 2d_{f+1}, where the last inequality is due to a_{f+1} > d_{f+1} (see Definition A.8). If, on the contrary, e was never connected to a dense vertex, then L_{f+1}(e) = L(e) (i.e., the list of available colors never changed during the execution of Algorithm 4), and so we have |L_{f+1}(e)| = |L(e)| ⩾ 2·10²⁴ ln n > 2d_{f+1}.
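The final Greedy step always succeeds because an edge in a graph of maximum degree d is adjacent to at most 2(d − 1) other edges, so any list of size at least 2d − 1 (and a fortiori 2d) always contains a free color. A minimal sketch; the graph, palette, and list sizes below are hypothetical:

```python
import random

random.seed(1)

def greedy_list_edge_coloring(edges, lists):
    """Greedy: give each arriving edge any color from its list unused by
    adjacent edges. Succeeds whenever |L(e)| >= 2d - 1, since each edge
    has at most 2(d - 1) adjacent edges in a graph of maximum degree d."""
    color = {}
    for e in edges:
        u, v = e
        used = {color[f] for f in color if u in f or v in f}
        avail = [c for c in lists[e] if c not in used]
        assert avail, "list too small for greedy"
        color[e] = avail[0]
    return color

# Complete graph K5 (max degree d = 4); lists of size 2*4 - 1 = 7
# drawn from a larger palette of 20 colors.
edges = [(u, v) for u in range(5) for v in range(u + 1, 5)]
lists = {e: random.sample(range(20), 7) for e in edges}
coloring = greedy_list_edge_coloring(edges, lists)

# Verify the coloring is proper: adjacent edges received distinct colors.
for e in edges:
    for f in edges:
        if e != f and set(e) & set(f):
            assert coloring[e] != coloring[f]
print("greedy succeeded on all", len(edges), "edges")
```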