Does rate adaptation at daily timescales make sense?

Today, networking hardware is not fast enough to save energy with rate adaptation. Or is it? While we are not (yet) able to turn line cards on and off in milliseconds, we can do so a couple of times per day. The question is, does that save enough energy to be worthwhile? It may: We estimate we could potentially save on the order of MWh per year with link sleeping and down-rating in a cloud provider network. Importantly, these are "easy" gains: They would come without modifying the routing state or impacting the quality of service of the traffic; they only leverage the typically low utilization of the links. This study provides only a rough approximation of the potential savings of rate adaptation and leaves several practical questions open. More than anything else, it motivates further investigation of the approach.


INTRODUCTION
Networking researchers have long observed that the Internet, with its success, has grown to consume massive amounts of energy and to have a substantial carbon footprint. Two decades ago, Gupta and Singh [7] called for researching ways to "green the Internet." Nedevschi et al. [14] then theorized how, in lightly loaded networks, one could save a significant fraction of energy by rate-adapting links or putting line cards to sleep. This was 15 years ago, but despite the promising potential, these techniques did not become state of practice. This can be explained in part by two practical limitations: (1) Today's hardware design does not allow turning line cards on and off fast enough to allow for efficient sleeping. [14] postulated a wake-up time of 1 ms; later studies report values in minutes [15]. (2) The power savings from down-rating, which made sense in copper cables and encouraged the Energy Efficient Ethernet (802.3az) standard, do not translate well to the now-commonplace optical links.
Despite these limitations, the energy-saving potential is still present. Many networks feature average utilization in the tens of percent and, perhaps more importantly, strong diurnal patterns; that is, the network load is strongly correlated to the time of the day. Even if we cannot (yet) turn off part of the network at a millisecond scale, we can definitely do it a couple of times a day. This paper aims to quantify the energy-saving potential of down-rating and sleeping at daily timescales. What if we simply turn parts of the network off at night? How much energy could be saved by leveraging the diurnal utilization patterns in networks today?
As in [14], we do not consider modifications of the network routing state or energy-aware traffic-engineering techniques. Such approaches (e.g., [3,4,19,20]) are relevant as they help create more favorable conditions for rate adaptation, but they create additional operational complexity. In contrast, we look at "simple" actions, i.e., actions which do not affect the routing state. We focus on networks where optical links are dominant and utilization tends to be low, such as ISPs or WANs.
To estimate energy savings from rate adaptation, we need two pieces: (i) a fine-grained time series of link utilization data in a given network; (ii) a power model for the energy an optical link uses, given its rate configuration. We obtained the first piece by analyzing the OVH Weather dataset gathered by [16]. The dataset contains per-link utilization data at a granularity of five minutes over two years. Our second piece, the power model, comes from experiments performed in our lab; we measure the power drawn by a programmable switch (a WEDGE100BF-32X) in controlled conditions. We find that, in the OVH network,
• one can save tens of MWh/year with sleeping, i.e., by turning redundant links off (§ 4);
• one can save about one MWh/year with down-rating, i.e., by reducing the port rate of individual links (§ 5).
While these numbers are approximations building upon many hypotheses (discussed later in the paper), we expect the order of magnitude to be sensible. We believe these initial results justify further research to assess more accurately (and, hopefully, harvest) the energy benefits of rate adaptation at daily timescales. For simplicity, in the rest of this paper, we use "rate adaptation" to refer to both down-rating and sleeping.

POWER PROFILING A NETWORK SWITCH
To estimate the power-saving potential of rate adaptation, we need a model of the power usage of the network device. Perhaps surprisingly, such a model is not easily available today. Prior works such as [12,18] look at the energy consumed by networks but report only aggregate numbers and lack the granularity required to assess the impact of port configuration on power. [13] provides a bottom-up power model more appropriate for our needs, but the 14-year-old study does not specify the devices it profiles, which makes the resulting power models dubious to apply to today's networks.
Thus, we replicate the modeling on more recent hardware. We measure a programmable switch (a WEDGE100BF-32X with a Tofino chipset) under various controlled conditions to derive a model of its power usage under different port configurations (10G, 25G, or 100G) and traffic load. Our main findings are summarized as follows:
• The idle power of the switch is around 108W;
• Each enabled port induces a power increase between 0.3W and 1.6W depending on the port settings, without any traffic;
• Forwarding traffic only marginally increases power: about 1W for 100Gbps of traffic.
In the rest of this section, we detail our experimental procedure and present our empirical power model for our Wedge switch.

Measurement setup
To measure the power drawn by the Wedge switch, we connect an MCP39F511N Power Monitor [1] between the power plug and one of the switch's power supply units and disconnect the second one. The measurement is controlled remotely using the PinPoint driver [10]. The power monitor specifies an accuracy of ±0.5% and can sample active power at up to 200Hz. In our experiments, we measure each setting at 20Hz for 60s. We use a second programmable switch to send traffic to the switch under test over up to 10 parallel 100Gbps QSFP28 short-distance links. We use tcpreplay to send a mixed-traffic packet trace (primarily TCP and SSL traffic) toward the second switch, which amplifies the traffic statelessly to generate controlled-rate incoming traffic for the switch under test. The switch under test only reflects received packets back on their incoming port. We do this to cancel out the effect of the data plane program on power use since we want to focus on the port configuration effects. Refer to the full report [11] for more details about the measurement setup.
It is important to note that we used short-range electrical cables [6] because that is all we had.Long-range optical transceivers are expected to draw more power than electrical ones (see § 2.3).

Profiling the Wedge switch
Using the setup described in § 2.1, we measured the total power used by the Wedge switch under many combinations of port configurations and traffic loads. We make the following observations:
• The power values are very stable within each run (i.e., a 60s measurement of one setting). Thus, in the following, we only report the median values for a given setting.
• There is an inter-run variability of about ±0.5W for the same setting. This may be due to changes in external factors such as the room temperature. However, this variability is smaller than the specified measurement accuracy for the power range we measure (±0.5% of 100-200W); therefore, we report median values across runs.
• The idle power refers to the power drawn by the switch just to be on, without any port enabled. We find it to be around 108W. Notably, the idle power appears independent of the data-plane program.
• Without any traffic, there is a power increase for each enabled port, which we refer to as the port's static power. This increase depends only on the port configuration, i.e., the chosen port rate and forward error correction scheme.
• The port's dynamic power is the cost induced by traffic forwarding. We observed that, for our traffic distribution, the dynamic power grows linearly with the traffic volume in Gbps. We illustrate this in Fig. 1 for ports set at 100G, where we send increasing traffic volumes over 10 ports and derive the port's dynamic power from the linear regression; we made similar observations for other port rates.
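The processing described above (one median per setting, then a linear fit of power against traffic volume) can be sketched as follows. This is a minimal illustration, not our measurement code: the function name `fit_line` is hypothetical, and the sample values are synthetic, chosen only to be of the same order as the figures quoted in the text.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# SYNTHETIC samples (not measured data): total traffic in Gbps -> power samples in W.
runs = {
    0: [124.0, 124.1, 123.9],
    500: [129.0, 129.1, 128.9],
    1000: [134.0, 134.1, 133.9],
}

loads = sorted(runs)
medians = [median(runs[load]) for load in loads]  # one median per setting
slope_w_per_gbps, intercept_w = fit_line(loads, medians)
```

In this sketch, the slope plays the role of the dynamic power per Gbps, and the intercept is the power drawn with the ports enabled but without traffic.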
From these observations, we derive an empirical power model for our switch, which we present next.

A simple power model
From the experimental observations presented in § 2.2, we derive the following power model for the Wedge switch:

$$P_{\text{switch}} = P_{\text{idle}} + \sum_{i} \left( P_{\text{sta}}(c(i)) + P_{\text{dyn}}(c(i)) \cdot l(i) \right)$$

where $l(i)$ and $c(i)$ denote the traffic load and configuration of port $i$, respectively. The idle power $P_{\text{idle}}$, static power $P_{\text{sta}}$, and dynamic power $P_{\text{dyn}}$ values are reported in Table 1. This model is simple but gives a detailed quantitative view of the impact of port configuration on power. It is not meant to generalize beyond this purpose. Specifically, this model has two main limitations, which we discuss next.
First, it considers only the handling of packets by the transceivers and their forwarding over the switching fabric; in particular, it does not measure the impact of the data-plane logic (e.g., memory lookups, basic algorithmics), as this is not the purpose of this model. Second, we only used electrical transceivers (§ 2.1), which are specified to draw less than 0.5W [6]. Long-range optical transceivers are specified at 5W [5]; multiplied by 32 ports, this results in a significant difference in the maximal total power. It is unclear how the power drawn by such transceivers would evolve when adapting their transmission rate; we are currently investigating that point.
Figure 2, interactive version (including the toggling of individual plot lines): nbviewer.org/<shortname>.ipynb
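To make the structure of the power model concrete (idle power, plus a static term per enabled port, plus a dynamic term proportional to the port's load), here is a minimal sketch. Only the 108W idle power comes from the text; the per-configuration parameters below are hypothetical placeholders picked within the ranges quoted in § 2, not the values of Table 1, and the function name `switch_power` is ours.

```python
P_IDLE_W = 108.0  # idle power reported in the text

# HYPOTHETICAL placeholder parameters (NOT the Table 1 values):
# configuration -> (static power in W per enabled port, dynamic power in W per Gbps)
PORT_PARAMS = {
    "10G": (0.5, 0.012),
    "25G": (0.6, 0.011),
    "100G": (1.6, 0.010),
}


def switch_power(ports):
    """Estimate the switch power in W.

    `ports` is an iterable of (configuration, load_gbps) pairs,
    one entry per enabled port; disabled ports are simply omitted.
    """
    total = P_IDLE_W
    for config, load_gbps in ports:
        static_w, dynamic_w_per_gbps = PORT_PARAMS[config]
        total += static_w + dynamic_w_per_gbps * load_gbps
    return total
```

For example, `switch_power([("100G", 8.0)])` and `switch_power([("10G", 8.0)])` compare the same 8Gbps load served at two port rates under these placeholder values.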

Rate adaptation strategies
With the power model presented in § 2.3, we can envision and estimate the efficiency of different rate adaptation strategies.
The key observation we make is that, when compatible with the traffic volume to serve, it saves energy to configure a port at 10G or 25G rather than 100G. Indeed, the static cost of configuring a port at 100G is about 1W higher than configuring it at 10G or 25G, while the dynamic cost is marginally smaller at 100G (Table 1). Hence, for a single 100G link with a load below 10Gbps, it saves more than 1W to configure that link at 10G rather than 100G. This is the basic idea of down-rating to save energy.
In addition, two routers are often connected by parallel links, which allows putting some of them to sleep (turning them off completely) without affecting the routing state. One may combine the two to design many rate adaptation strategies, including:
• No adaptation: Set all links at maximal capacity. This draws the most power possible. We suspect this is the common practice.
• Sleeping: Use only one link set at maximal capacity and keep the others off; turn on additional links when the traffic volume increases beyond the available capacity. This is a simple and power-effective strategy, but it induces local forwarding changes when links are turned on or off.
• Uniform down-rating: Keep all links on, but set their configuration to the lowest setting required to serve the traffic load; up-rate all links when the load increases. This strategy is effective only at low utilization: once the load exceeds 25Gbps per link, the power is raised to its maximum value.
• Optimal down-rating: Keep all links on, and optimize the configuration of each link individually to minimize the power draw while providing sufficient capacity overall. This is more efficient but more complex to orchestrate.
• Optimal rate adaptation: The same strategy as before, but allowing turning off links. This yields optimal power but combines the drawbacks of both approaches.
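The strategies above can be compared with a small brute-force sketch. It rests on several simplifying assumptions that are ours: only static port power is counted (dynamic power is ignored), the load is assumed to be evenly spread for uniform down-rating, at least one link always stays on, and the static power values are hypothetical placeholders rather than the Table 1 values.

```python
import math
from itertools import combinations_with_replacement

# HYPOTHETICAL static power per port in W (placeholders, not Table 1); "off" = asleep.
STATIC_W = {"off": 0.0, "10G": 0.5, "25G": 0.6, "100G": 1.6}
CAPACITY_GBPS = {"off": 0, "10G": 10, "25G": 25, "100G": 100}
RATES = ("10G", "25G", "100G")


def _cheapest(n_links, load_gbps, options):
    """Lowest total static power over all per-link configurations that
    keep at least one link on and provide enough aggregate capacity."""
    feasible = (
        combo
        for combo in combinations_with_replacement(options, n_links)
        if any(c != "off" for c in combo)
        and sum(CAPACITY_GBPS[c] for c in combo) >= load_gbps
    )
    return min(sum(STATIC_W[c] for c in combo) for combo in feasible)


def strategy_power(n_links, load_gbps):
    """Static power in W of each strategy for n parallel 100G links."""
    if load_gbps > n_links * CAPACITY_GBPS["100G"]:
        raise ValueError("load exceeds the total capacity of the links")
    awake = max(1, math.ceil(load_gbps / CAPACITY_GBPS["100G"]))
    per_link = load_gbps / n_links  # even spreading assumed
    uniform = next(r for r in RATES if CAPACITY_GBPS[r] >= per_link)
    return {
        "no_adaptation": n_links * STATIC_W["100G"],
        "sleeping": awake * STATIC_W["100G"],
        "uniform_down_rating": n_links * STATIC_W[uniform],
        "optimal_down_rating": _cheapest(n_links, load_gbps, RATES),
        "optimal_rate_adaptation": _cheapest(n_links, load_gbps, ("off",) + RATES),
    }
```

Calling `strategy_power(4, 30)` returns the static power of each strategy for four parallel links carrying 30Gbps in total. The exhaustive search is only practical for the small link counts considered here.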
Fig. 2 compares the power achievable with the different strategies when applied to parallel links between two routers. With a single link, sleeping is not possible, and all down-rating strategies are equivalent; the saving potential is limited but also easy to harvest. With more parallel links, both sleeping and optimal down-rating provide close to optimal power reduction.
As is often the case, the best strategy depends on the context. In the rest of this paper, we analyze the OVH network dataset [16] to assess what strategy is best there, and how much one could hope to save.

THE OVH DATASET
OVH is a French cloud provider with over 300,000 servers spread across 32 datacenters and a worldwide network of more than 180 routers with a total egress capacity of more than 20 Tbps, in a network solely composed of 100G links [16]. The authors of [16] curated per-link utilization data at a granularity of five minutes over a span of two years. This study analyzes this dataset to estimate the energy-saving potential of rate adaptation in the OVH network.
Two characteristics are helpful for rate adaptation to yield energy savings: low utilization and a high degree of parallelism. In this section, we quantify the prevalence of these characteristics in the OVH network. We analyze only 31 months of data, from June 2020 to December 2022.

Utilization is low and predictable
The overall utilization of the OVH network is low (≈ 18%) and exhibits a typical diurnal pattern. As shown in Fig. 3, the overall network utilization pattern is very stable, with peaks in the evening (≈ 24%) and valleys at night (≈ 12%). We can also observe a weekly component. We observe similar patterns at the level of individual links (not shown); the range of utilization values varies more, but the same daily pattern remains present.
This suggests that the network generally operates under low utilization; we discuss possible causes in § 6. Therefore, there is a possibly large potential for rate adaptation, i.e., temporarily reducing the network capacity to save energy.

Parallelism is high
There is a high degree of parallelism in the OVH network; that is, many router pairs are connected via two or more links (Fig. 4): 55% of links are part of a parallel connection between routers (i.e., there is at least one other link connecting the same router pair). 32% of links are part of a group of at least four links, and 11% of a group of at least ten links.
Given the generally low utilization discussed in § 3.1, this parallelism opens a significant potential to turn redundant links off to save energy without modifying the network topology at layer 3. [14] demonstrated that sleeping and rate adaptation are efficient energy-saving strategies in low-utilization scenarios. But, as discussed in § 1, today's hardware is unable to adapt at the millisecond timescale as hypothesized in [14].

We can adapt at daily timescales
However, we can adapt hardware configurations fast enough for daily timescales. Networks commonly exhibit strong diurnal patterns, such as those of the OVH network illustrated above (Fig. 3). It is conceivable to adapt port configurations (adapting their rate or turning them off completely) a couple of times per day, assuming this would yield significant energy savings.
In the rest of this paper, we apply our power model (§ 2) to the OVH dataset to derive a first-order approximation of those potential savings. We first investigate the potential of sleeping (§ 4), then continue with down-rating (§ 5).

POTENTIAL FOR SLEEPING
We first consider the energy-saving potential of turning off links.
Questions. How many links can we turn off without modifying the L3 topology? Put differently, among existing parallel links between router pairs, how many are required to serve the traffic load between these routers? How much could we save by turning the others off?
Answers. In the OVH network, around 40% of links could be turned off throughout the day. The median value is around 54% and goes up to 60% at night. By turning those links off, one could save between 14MWh and 45MWh per year.
Explanations. We derive those numbers as follows:
• For each router pair, we sum up the traffic load across all parallel links and take the maximum load across both directions. From this, we derive the minimal number of 100G links required to serve the aggregated load and assume all others can be put to sleep.
• We sum up the links that can be put to sleep over all router pairs. This gives the ratio of links that can be put to sleep and the power reduction this would yield: multiply the number of links put to sleep by the static power of a 100G port. The static power value of our switch model (§ 2) gives a lower bound. We use the datasheet power value of a long-range transceiver (5W [5]) for an upper bound.
• We compute energy savings as the median power reduction over all snapshots applied to the whole year.
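The three steps above can be sketched as follows. This is an illustrative reimplementation under assumptions, not our analysis code: the input layout and function names are hypothetical, and we additionally assume that at least one link per router pair stays on so that the L3 topology is preserved.

```python
import math
from statistics import median

LINK_CAPACITY_GBPS = 100  # the OVH network is composed of 100G links
HOURS_PER_YEAR = 8760


def links_asleep(router_pairs):
    """Count the links that can sleep in one 5-min snapshot.

    `router_pairs` is a list of (loads_a_to_b, loads_b_to_a) tuples, where each
    element is the list of per-link loads in Gbps for one direction.
    """
    asleep = 0
    for loads_a_to_b, loads_b_to_a in router_pairs:
        n_links = len(loads_a_to_b)
        # Aggregate over parallel links, then take the busiest direction.
        peak = max(sum(loads_a_to_b), sum(loads_b_to_a))
        # Assumption: keep at least one link on to preserve the topology.
        needed = max(1, math.ceil(peak / LINK_CAPACITY_GBPS))
        asleep += n_links - min(needed, n_links)
    return asleep


def yearly_savings_mwh(asleep_per_snapshot, watts_per_link):
    """Median power reduction over all snapshots, applied to a whole year."""
    median_power_w = median(asleep_per_snapshot) * watts_per_link
    return median_power_w * HOURS_PER_YEAR / 1e6  # Wh -> MWh
```

As a purely illustrative order-of-magnitude check (the link count is made up, not taken from the dataset), 1000 links asleep at the 5W datasheet value give `yearly_savings_mwh([1000], 5.0)`, i.e., 1000 × 5W × 8760h = 43.8MWh.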
Discussion. We summarize the data in Fig. 5: the left side shows the ratio of links that can be put to sleep as a function of the time of day; each dot shows the value for a 5-min snapshot. We observe again the strong daily pattern. Moreover, for any time of the day, the data spreads by at most 5%, which confirms the stability of the utilization data over time. The right side shows the histogram of the same data: we see that the distribution of links that can be put to sleep is roughly uniform.
This pseudo-uniform distribution justifies our simplified computation of the energy savings: extrapolating the median value over the whole year does not bias the result much.
Moreover, we only consider the static power because we observed that the dynamic power is roughly linear with the traffic volume in our experiments (Fig. 1). Since the total traffic does not change, neither should the dynamic power.
Our results suggest that turning links off can yield sizable savings. However, it implies maximizing the load on some links to disable others, which must be turned back on when their capacity is required. One can also do the opposite: keep all links on but match their rate to the demand. That is the principle of down-rating, investigated next.

POTENTIAL FOR DOWN-RATING
We now consider the energy-saving potential of down-rating individual links assuming available port rates of 10, 25, and 100G.
Questions. How many links can we down-rate to 10G or 25G instead of 100G? How much energy could that save?
Answer. On average in the OVH network, half of the links can be down-rated: i.e., 26% and 27% down to 10G and 25G, respectively (median values). This results in a potential yearly saving of 0.83MWh (0.45 and 0.38MWh, respectively).
Explanations. We derive those numbers as follows:
• For each link, we take the maximum load it serves in both directions to define its minimal rate configuration (10, 25, or 100G) for a given snapshot.
• We sum up the links that can be configured at a given rate over this snapshot. This gives the ratio of links that can be down-rated, and the power reduction this would yield: multiply the number of links down-rated by the static power difference between 100G and 10G or 25G, respectively. We only use our switch model values (§ 2) as we are not aware of other data for the impact of down-rating a link on power.
• We compute energy savings as the median power reduction over all snapshots applied to the whole year.
Figure 6: About 50% of links can be down-rated to either 10G or 25G on average, with a strong skew toward lower values. (Axes: time of day; links we can down-rate, in %.)
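The per-link computation above can be sketched as follows. The function names, the input layout, and the optional `max_utilization` bound are our illustrative assumptions; the static power differences passed in the example are hypothetical placeholders, not the Table 1 values.

```python
RATES_GBPS = (10, 25, 100)


def minimal_rate(load_a_to_b, load_b_to_a, max_utilization=1.0):
    """Lowest port rate in Gbps that serves the busiest direction of a link.

    `max_utilization` optionally caps the per-link utilization (e.g., 0.5);
    links that fit no lower rate stay at the highest one.
    """
    peak = max(load_a_to_b, load_b_to_a)
    for rate in RATES_GBPS[:-1]:
        if peak <= rate * max_utilization:
            return rate
    return RATES_GBPS[-1]


def snapshot_power_reduction_w(link_loads, static_saving_w):
    """Power reduction in W for one snapshot.

    `link_loads` is a list of (load_a_to_b, load_b_to_a) pairs in Gbps;
    `static_saving_w` maps a rate (10 or 25) to the static power saved
    relative to a 100G configuration.
    """
    return sum(
        static_saving_w.get(minimal_rate(a, b), 0.0) for a, b in link_loads
    )
```

For instance, with placeholder savings of 1W per down-rated link, `snapshot_power_reduction_w([(3, 8), (3, 12), (30, 2)], {10: 1.0, 25: 1.0})` counts two down-rated links out of three. The yearly figure then follows as in § 4, by applying the median over all snapshots to the whole year.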
Discussion. Fig. 6 summarizes the data as previously. We observe again the nightly peak in the number of links that can be down-rated. However, the distributions reveal a skew towards the lowest values; thus, extrapolating the median value underestimates the potential savings. Conversely, we also neglect the difference in dynamic power between the different link rates, which slightly overestimates the savings. However, the effect of the dynamic power is negligible compared to the static power difference (≈ −1W for the static part; +0.05W maximum for the dynamic part; see Table 1).
Given the average network utilization of ≈ 18% (Fig. 3), it may be surprising that only half of the links may be down-rated. We hypothesize this is because link loads are asymmetrical; i.e., links carry more traffic in one direction than in the other. As down-rating applies to both directions simultaneously, it forces many links to higher rates. Another limitation is that we only consider 10, 25, and 100G rates since those are the only ones for which we have power values available. However, many links have average utilization in the 30-40G range (not shown) and would thus benefit from an intermediate port rate configuration, e.g., 40G or 50G.
Finally, note that, for simplicity, we consider links in isolation (see Fig. 2, left). One could magnify the savings of down-rating by spreading the load evenly between parallel links (which are many, as discussed in § 3.2). One may further combine rate adaptation strategies to optimize power further; see Fig. 2, middle and right.

DISCUSSION AND FUTURE WORK
We aim to estimate whether rate adaptation at daily timescales makes sense to save energy. This paper presents our work in progress on that question, not a definitive answer.
Discussion. We argue that there may be some "easy gains" on the energy usage of networks; power-aware rate adaptation at daily timescales is one candidate that allows exploiting the typical underutilization of networks. We believe it is worth further investigation. Similar efforts considered data center networks, e.g., [2,8]; what can be done in other networking contexts?
One may object that the savings we estimate (tens of MWh/year) would be negligible compared to the total energy usage of the network. That may be true, but it is also a potential saving that is readily harvestable in today's networks and with today's hardware. We argue that any potential saving is worth investigating: getting 1% better every day compounds over time!
Another objection is that network redundancy is not deployed without reason; the low utilization we observed (§ 3.1) may be intended. In any case, aggressive rate adaptation resulting in links at 100% utilization sounds like a bad idea. All that is fair: In this work, we do not consider any bounds on "how much we can turn off." However, such bounds are trivial to add to our analysis; e.g., one can easily derive the potential for down-rating while keeping individual link utilization under 50%.
Note that the rate adaptation strategies we discuss here aim to leave the network service unaffected; i.e., users should not notice that link capacities decrease from 100G to 10G: if they did, they would be utilizing the link, and thus it would not be down-rated. This omits the case of high-throughput bursts, which would not be visible in 5-min utilization data. The prevalence of such bursts in WANs is not clear.
Putting things in a broader context, this work focuses on energy proportionality, which is a fundamental prerequisite for efficient demand-response strategies. Sustainable computing aims at reducing workload when the carbon intensity is high, but this is useful only if the power decreases with the workload. Compute has become much better at this in the past decade, but communication is lagging behind; this is the gap we try to fill.
Future work. There are several practical questions left open before eventually implementing rate adaptation: (i) How quickly can we reconfigure ports or turn them back up? On our Wedge switch, we measure about one second, but how does that generalize to routers? (ii) How do we decide when to adapt a port configuration? Even if the time of day is a good predictor of network utilization, it is not trivial to derive a good "power controller" trading off between reliability and energy savings. (iii) Is rate adaptation compatible with the use of DWDM or fiber amplifiers in WANs?
In addition, there are also some potential risks that require careful investigation, including increased hardware wear from being turned on and off frequently, power management faults or misprediction of the required port configurations, and the emergence of new attack vectors triggering high-power configurations.
Finally, there are other potential gains to investigate. Can we save power within the DWDM or amplifiers, e.g., by tuning the number of wavelengths used at a given time? Reducing port rates or turning links off could yield other energy benefits, such as reduced cooling costs or enabling turning off entire line cards; we do not yet have the means to estimate such potential savings. We could also hope to make the "idle power more proportional." More concretely, can we power-gate hardware components that may not be needed and thus reduce idle power? In addition, one may also aim to magnify the potential savings by allowing updates to the routing state. Our study only starts the exploration. To facilitate future research in this area, we publish all our artifacts [9].

Figure 1: Power increases linearly with the traffic volume.

Figure 2: The efficiency of rate adaptation strategies depends on the number of links between routers and the amount of traffic to serve between them. With a single link, sleeping is not possible, and all down-rating approaches are equivalent. The more parallel links between routers, the more saving potential. Notably, sleeping and optimal down-rating yield comparable savings.

Figure 3: The overall network utilization of the OVH network is low, stable, and with strong daily patterns (2x between peaks and valleys). Illustrative 2-week window; see the interactive version for the full data: nbviewer.org/<shortname>.ipynb

Figure 4: Many links connect the same router pairs. As utilization is generally low (Fig. 3), this opens a significant potential to turn redundant links off to save energy.

Figure 5: 40% of links can be turned off throughout the day; the median value is around 52% and goes up to 60% at night.

Table 1: Empirical power model parameters.