Reducing Reconfiguration Time in Hybrid Optical-Electrical Datacenter Networks

We study how to reduce the reconfiguration time in hybrid optical-electrical Datacenter Networks (DCNs). With a layer of Optical Circuit Switches (OCSes), hybrid optical-electrical DCNs can reconfigure their logical topologies to better match the ongoing traffic patterns, but the reconfiguration time directly affects the benefits of reconfigurability. The reconfiguration time consists of the topology solver running time and the network convergence time after reconfiguration is triggered. Existing topology solvers, however, either incur high algorithmic complexity or fail to minimize the reconfiguration overhead. In this paper, we propose a novel algorithm that combines the ideas of bipartition and Minimum Cost Flow (MCF) to reduce the overall reconfiguration time. For the first time, we formulate the topology solving problem as an MCF problem with piecewise cost, which strikes a better balance between solver complexity and solution optimality. Our evaluation shows that our algorithm can significantly reduce the network convergence time while consuming less topology solver running time, making its overall performance superior to existing algorithms. Our code and test cases are available at a public repository [25].


INTRODUCTION
Interest in hybrid optical-electrical Datacenter Networks (DCNs) has been growing because they enable traffic-aware topology design. As network bandwidth keeps increasing, building Clos networks is becoming cost-prohibitive [4]. In fact, the traffic in DCNs is highly skewed and time-varying [3]. By adapting the network topology to the traffic patterns, the performance-cost ratio can be improved. This necessitates a reconfigurable topology, which can be achieved by introducing novel optical network components such as Optical Circuit Switches (OCSes).
We study how to reduce the reconfiguration time in hybrid optical-electrical DCNs. When the DCN traffic pattern changes, we may need to compute a new topology for this traffic pattern, and then reconfigure the DCN from its old topology to this new topology. The reconfiguration time consists of the topology solver running time and the network convergence time after reconfiguration is triggered. The former depends on the algorithmic complexity and the latter depends on the number of links to be changed during the reconfiguration process. Therefore, we need a topology solver with low algorithmic complexity that also reduces the number of reconfigured links.
Existing algorithms are limited because they either incur high topology solver running time or yield a highly suboptimal solution that requires many link reconfigurations. The topology optimization problem can be formulated as an Integer Linear Programming (ILP) problem, which is hard to solve by directly using an ILP solver. To reduce the algorithmic complexity, [6] takes advantage of the homogeneity of the DCN physical topologies and presents a greedy minimal-rewiring algorithm using Minimum Cost Flow (MCF), but experimental results show that the total number of rewires can be far from optimal. On the other hand, [5] presents an algorithm that utilizes the idea of bipartition and achieves a lower number of rewires than the MCF-based algorithm, but it is still based on ILP and can be very slow in practice.
In this paper, we propose an algorithm that combines the advantages of MCF and bipartition to reduce the total number of rewires, while ensuring a polynomial running time. The standard form of an MCF problem has a linear cost for each link. We note that the cost function of each link can be generalized to a convex piecewise-linear function. This observation allows us to develop a polynomial-time algorithm that achieves the minimum number of rewires for the case where there are two OCSes. We then generalize this algorithm to the $K$-OCS case, for an arbitrary number $K$ of OCSes, using an iterative decomposition approach. Our evaluation shows that our algorithm exhibits very low algorithmic complexity, while achieving a low rewiring ratio at the same time. Our code and test cases are available at [1].

PROBLEM
An OCS possesses many input and output ports that can be interconnected with (electrical) switches. A complete matching between the input and output ports can be configured within the OCS. The physical connections between the OCS and the switches are referred to as the physical topology, while the corresponding equivalent topology in the absence of the OCS is referred to as the logical topology.
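Consider a flat topology comprising $N$ Top-of-Rack (ToR) switches and $K$ OCSes, where the uplinks of the ToR switches are connected only to the OCSes. The physical topology of the network is characterized by two key parameters, denoted as $a \in \mathbb{Z}_{\ge 0}^{K \times N}$ and $b \in \mathbb{Z}_{\ge 0}^{N \times K}$. Here, $a_{kj}$ represents the number of connections from the $k$-th OCS to the $j$-th switch, and $b_{ik}$ signifies the number of connections from the $i$-th switch to the $k$-th OCS. The logical topology is characterized by $c \in \mathbb{Z}_{\ge 0}^{N \times N}$.
The matching of all OCSes can be represented by $x \in \mathbb{Z}_{\ge 0}^{N \times N \times K}$. $x_{ijk}$ is the number of equivalent connections from the $i$-th switch to the $j$-th switch established by the forwarding of the $k$-th OCS. Given the physical and logical topology, a feasible matching should satisfy the following constraints.
$$\sum_{j=1}^{N} x_{ijk} = b_{ik}, \quad \forall i, k \tag{1a}$$
$$\sum_{i=1}^{N} x_{ijk} = a_{kj}, \quad \forall j, k \tag{1b}$$
$$\sum_{k=1}^{K} x_{ijk} = c_{ij}, \quad \forall i, j \tag{1c}$$
We define the set of all feasible matchings as $\mathcal{X}(a, b, c)$. We use $u \in \mathcal{X}(a', b', c')$ and $x \in \mathcal{X}(a, b, c)$ to represent the old and new matching of OCSes, respectively. Since the physical topology seldom changes, we assume that $a' = a$ and $b' = b$.
We may use the minimum number of disconnections to reflect the network convergence time [6]. Therefore, our target is to solve the following optimization problem.
$$\min_{x \in \mathcal{X}(a, b, c)} \; \sum_{i,j,k} (u_{ijk} - x_{ijk})^{+}$$
Here $(z)^{+} = \max(z, 0)$, so each term counts the links between the $i$-th and $j$-th switches on the $k$-th OCS that exist in the old matching but not in the new one. In this paper, we focus on a special case where the physical topology is proportional.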
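The feasibility constraints and the disconnection objective above can be illustrated with a minimal Python sketch. The function names `is_feasible` and `num_disconnections`, the nested-list layout, and the tiny two-switch instance are assumptions made for illustration only; they are not taken from the authors' repository. Here `x[i][j][k]` holds the number of links from switch `i` to switch `j` through OCS `k`, `a[k][j]` and `b[i][k]` describe the physical topology, and `c[i][j]` is the logical topology.

```python
def is_feasible(x, a, b, c):
    """Check that matching x respects the physical (a, b) and logical (c) topology."""
    n, num_ocs = len(c), len(a)
    for i in range(n):
        for j in range(n):
            if any(v < 0 for v in x[i][j]):
                return False
            # Links over all OCSes must add up to the logical topology entry.
            if sum(x[i][j]) != c[i][j]:
                return False
    for k in range(num_ocs):
        for i in range(n):
            # Links leaving switch i through OCS k must use all of its uplinks.
            if sum(x[i][j][k] for j in range(n)) != b[i][k]:
                return False
        for j in range(n):
            # Links entering switch j through OCS k must use all of its downlinks.
            if sum(x[i][j][k] for i in range(n)) != a[k][j]:
                return False
    return True


def num_disconnections(u, x):
    """Count links present in the old matching u but absent from the new matching x."""
    return sum(
        max(u_ijk - x_ijk, 0)
        for u_i, x_i in zip(u, x)
        for u_ij, x_ij in zip(u_i, x_i)
        for u_ijk, x_ijk in zip(u_ij, x_ij)
    )


# Hypothetical instance: 2 switches, 2 OCSes, 2 links per switch per OCS.
a = [[2, 2], [2, 2]]
b = [[2, 2], [2, 2]]
c_old = [[2, 2], [2, 2]]
c_new = [[1, 3], [3, 1]]
u = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]      # old matching
x = [[[1, 0], [1, 2]], [[1, 2], [1, 0]]]      # one candidate new matching
```

In this instance the old matching is feasible for the old logical topology but not for the new one, and moving to the candidate `x` tears down two links (one on the second OCS for each of the two diagonal switch pairs).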

ALGORITHM DESIGN
3.1 An Exact Polynomial-Time Algorithm for a Special Case
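When there are two OCSes ($K = 2$), we can rewrite the objective function and the other constraints using the constraint $x_{ij2} = c_{ij} - x_{ij1}$ to obtain the following equivalent problem.
$$\min \; \sum_{i,j} \left[ (u_{ij1} - x_{ij1})^{+} + (u_{ij2} - c_{ij} + x_{ij1})^{+} \right]$$
$$\text{s.t.} \quad \sum_{j} x_{ij1} = b_{i1} \;\; \forall i, \qquad \sum_{i} x_{ij1} = a_{1j} \;\; \forall j, \qquad 0 \le x_{ij1} \le c_{ij} \;\; \forall i, j$$
The problem is thus equivalent to the following MCF problem. There are supply nodes $\{s_1, s_2, \ldots, s_N\}$ and demand nodes $\{d_1, d_2, \ldots, d_N\}$. The supply node $s_i$ has $b_{i1}$ units of supply, and the demand node $d_j$ has $a_{1j}$ units of demand. This setting models the constraints (1a) and (1b). For each pair $(i, j)$, consider the function
$$f_{ij}(x) = (u_{ij1} - x)^{+} + (u_{ij2} - c_{ij} + x)^{+}, \quad x \in [0, c_{ij}].$$
This is a convex piecewise-linear function. Assume that it has breakpoints (points where the slope changes) $\{p_1, p_2, \ldots, p_m\}$ and define $p_0 = 0$, $p_{m+1} = c_{ij}$. Assume that on $[p_{t-1}, p_t]$ the slope of $f_{ij}(\cdot)$ is $\sigma_t$. Then we add $m + 1$ arcs from $s_i$ to $d_j$. For the $t$-th arc, the cost is $\sigma_t$ and the capacity is $p_t - p_{t-1}$. Because $f_{ij}$ is convex, the slopes are nondecreasing in $t$, so a minimum-cost flow fills the arcs in order and the total arc cost equals $f_{ij}(x) - f_{ij}(0)$ for a flow of $x$ units. This models the objective function and constraint (1c).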
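As a hedged illustration of the arc construction, the sketch below builds the parallel arcs for a single switch pair and checks that filling them in order reproduces the piecewise cost. The names `rewire_cost`, `piecewise_arcs`, and `cost_via_arcs` are hypothetical, and the cost function is written under the assumption that the per-pair cost is the number of old links removed on the first OCS plus the number removed on the second OCS, where `u1` and `u2` are the old link counts on the two OCSes and `c` is the new logical-topology entry.

```python
def rewire_cost(x, u1, u2, c):
    """Disconnections for one switch pair if x links go to OCS 1 and c - x to OCS 2."""
    return max(u1 - x, 0) + max(u2 - (c - x), 0)


def piecewise_arcs(u1, u2, c):
    """Return [(cost, capacity), ...], one parallel arc per linear segment on [0, c]."""
    def clip(v):
        return min(max(v, 0), c)

    # The slope can only change at x = u1 and x = c - u2 (clipped to [0, c]).
    points = sorted({0, clip(u1), clip(c - u2), c})
    arcs = []
    for lo, hi in zip(points, points[1:]):
        # The cost is linear with an integer slope on [lo, hi].
        slope = (rewire_cost(hi, u1, u2, c) - rewire_cost(lo, u1, u2, c)) // (hi - lo)
        arcs.append((slope, hi - lo))
    return arcs


def cost_via_arcs(x, u1, u2, c):
    """Recover the cost of x units of flow by filling the arcs in order."""
    total = rewire_cost(0, u1, u2, c)  # constant offset not carried by the arcs
    remaining = x
    for slope, capacity in piecewise_arcs(u1, u2, c):
        used = min(remaining, capacity)
        total += slope * used
        remaining -= used
    return total


example_arcs = piecewise_arcs(u1=3, u2=1, c=5)  # [(-1, 3), (0, 1), (1, 1)]
```

The arcs carry only the change in cost relative to sending zero flow, so the constant `rewire_cost(0, ...)` of every switch pair has to be added back to the MCF objective to obtain the number of rewires.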
The integral MCF problem is a special case of ILP that can be solved in polynomial time, and in practice it is usually solved quickly, so when there are two OCSes, our problem can be solved efficiently.
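To make the reduction concrete, here is a small self-contained sketch that solves a two-OCS instance as an MCF problem. It uses a plain successive-shortest-path solver with Bellman-Ford, chosen only because it fits in a few lines; it is not the cost-scaling algorithm and not necessarily what the authors' code does. The function `min_cost_flow`, the node numbering, and the instance (two switches with two links to each OCS, one old link per switch pair per OCS, new logical topology [[1, 3], [3, 1]]) are all assumptions for illustration.

```python
def min_cost_flow(num_nodes, arcs, source, sink, flow_value):
    """Send flow_value units from source to sink at minimum cost.

    arcs: list of (tail, head, capacity, cost); parallel arcs are allowed.
    Returns (total_cost, flow_per_arc); raises ValueError if infeasible.
    """
    inf = float("inf")
    adj = [[] for _ in range(num_nodes)]
    edges = []  # edges[e] = [head, residual_capacity, cost]; e ^ 1 is its reverse
    for tail, head, cap, cost in arcs:
        adj[tail].append(len(edges))
        edges.append([head, cap, cost])
        adj[head].append(len(edges))
        edges.append([tail, 0, -cost])

    total_cost, sent = 0, 0
    while sent < flow_value:
        # Bellman-Ford tolerates the negative arc costs of the piecewise model.
        dist = [inf] * num_nodes
        pred = [-1] * num_nodes
        dist[source] = 0
        for _ in range(num_nodes - 1):
            changed = False
            for v in range(num_nodes):
                if dist[v] == inf:
                    continue
                for e in adj[v]:
                    head, cap, cost = edges[e]
                    if cap > 0 and dist[v] + cost < dist[head]:
                        dist[head], pred[head], changed = dist[v] + cost, e, True
            if not changed:
                break
        if dist[sink] == inf:
            raise ValueError("requested flow is infeasible")
        # Find the bottleneck on the shortest path, then augment along it.
        push, v = flow_value - sent, sink
        while v != source:
            push = min(push, edges[pred[v]][1])
            v = edges[pred[v] ^ 1][0]
        v = sink
        while v != source:
            e = pred[v]
            edges[e][1] -= push
            edges[e ^ 1][1] += push
            v = edges[e ^ 1][0]
        sent += push
        total_cost += push * dist[sink]
    return total_cost, [edges[2 * i + 1][1] for i in range(len(arcs))]


# Nodes: 0 = source, 1-2 = supply nodes (switches), 3-4 = demand nodes, 5 = sink.
arcs = [
    (0, 1, 2, 0), (0, 2, 2, 0),                   # supplies: links to OCS 1
    (1, 3, 1, 0),                                 # pair (1,1): c = 1, flat cost
    (1, 4, 1, -1), (1, 4, 1, 0), (1, 4, 1, 1),    # pair (1,2): c = 3
    (2, 3, 1, -1), (2, 3, 1, 0), (2, 3, 1, 1),    # pair (2,1): c = 3
    (2, 4, 1, 0),                                 # pair (2,2): c = 1, flat cost
    (3, 5, 2, 0), (4, 5, 2, 0),                   # demands: links from OCS 1
]
flow_cost, flows = min_cost_flow(6, arcs, source=0, sink=5, flow_value=4)
offset = 4  # sum over the four switch pairs of the cost at zero flow
total_rewires = offset + flow_cost
```

For this instance the result is two rewires, which matches intuition: the new logical topology keeps only one of the two old links on each diagonal pair, so at least two links must be torn down.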

The General Algorithm
For the general case where there are more than two OCSes ($K > 2$), we can merge some OCSes into a larger OCS so that the physical topology can be treated as if it had only 2 OCSes. The merging is an approximation because it widens the range of reconfiguration. Then we can solve the approximated problem using the algorithm proposed in Section 3.1. To obtain a real feasible solution, we need to decompose the solution on each imaginary OCS, which requires solving two subproblems recursively. We prove the correctness of our algorithm when the physical topology is proportional, and we prove that the worst-case time complexity of our algorithm is ( 4 log + 2 log ) ≈ ( 4 log ), if we choose even bipartition at each division step and use the cost-scaling algorithm for solving the MCF problem.
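The merging step can be sketched as follows. This is only an illustration of one plausible reading of the description above: each group of OCSes is collapsed into one imaginary OCS by summing its physical connections and its old matching. The names `even_bipartition` and `merge_group` and the data layout (`a[k][j]`, `b[i][k]`, `u[i][j][k]` for OCS `k` and switches `i`, `j`) are assumptions, and the recursive decomposition itself is omitted.

```python
def even_bipartition(ocs_ids):
    """Split a list of OCS indices into two halves of (almost) equal size."""
    mid = len(ocs_ids) // 2
    return ocs_ids[:mid], ocs_ids[mid:]


def merge_group(a, b, u, group):
    """Collapse the OCSes in `group` into one imaginary OCS.

    Returns (a_g, b_g, u_g): downlinks per switch, uplinks per switch, and the
    old switch-to-switch link counts carried by the merged OCS.
    """
    n = len(b)
    a_g = [sum(a[k][j] for k in group) for j in range(n)]
    b_g = [sum(b[i][k] for k in group) for i in range(n)]
    u_g = [[sum(u[i][j][k] for k in group) for j in range(n)] for i in range(n)]
    return a_g, b_g, u_g


# Hypothetical instance: 2 switches, 4 OCSes, one link per switch per OCS.
a = [[1, 1] for _ in range(4)]
b = [[1, 1, 1, 1] for _ in range(2)]
u = [[[0, 0, 0, 0], [1, 1, 1, 1]],
     [[1, 1, 1, 1], [0, 0, 0, 0]]]
left, right = even_bipartition([0, 1, 2, 3])
merged_left = merge_group(a, b, u, left)
```

With an even bipartition, each recursion level halves the number of OCSes in a group, so the recursion depth grows logarithmically in the number of OCSes.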