E-WISH: An Energy-aware ABR Algorithm For Green HTTP Adaptive Video Streaming

HTTP Adaptive Streaming (HAS) is the de-facto solution for delivering video content over the Internet. The climate crisis has highlighted the environmental impact of information and communication technologies (ICT) solutions and the need for green solutions to reduce ICT's carbon footprint. As video streaming dominates Internet traffic, research in this direction is vital now more than ever. HAS relies on Adaptive BitRate (ABR) algorithms, which dynamically choose suitable video representations to accommodate device characteristics and network conditions. ABR algorithms typically prioritize video quality, ignoring the energy impact of their decisions. Consequently, they often select the video representation with the highest bitrate under good network conditions, thereby increasing energy consumption. This is problematic, especially for energy-limited devices, because it affects the device's battery life and the user experience. To address the aforementioned issues, we propose E-WISH, a novel energy-aware ABR algorithm, which extends the already-existing WISH algorithm to consider energy consumption while selecting the quality for the next video segment. According to the experimental findings, E-WISH shows the ability to improve Quality of Experience (QoE) by up to 52% according to the ITU-T P.1203 model (mode 0) while simultaneously reducing energy consumption by up to 12% with respect to state-of-the-art approaches.


INTRODUCTION
In recent times, there has been a significant global surge in video streaming usage. As stated in the Ericsson Mobility Report [7], video streaming accounted for 69% of all mobile data traffic by the end of 2021, and it is projected to increase to 79% by 2027. The escalating climate crisis and the urgent need to reduce CO2 emissions have highlighted the crucial role of ICT solutions and their environmental footprint [1,16]. As the majority of Internet traffic is now dominated by video streaming, there is a growing imperative to address the energy consumption associated with this activity [8].
Most online videos are transmitted using HTTP Adaptive Streaming (HAS) [3], which includes Dynamic Adaptive Streaming over HTTP (MPEG-DASH) [6] and HTTP Live Streaming (HLS) [4,18,23]. The original video in HAS is encoded into different representations on the server side, each identified by a specific bitrate and resolution, and then divided into segments of equal duration. To ensure the best possible Quality of Experience (QoE), an Adaptive BitRate (ABR) algorithm tries to maximize video quality while adapting to changes in network traffic [22]. However, this approach often leads to unnecessary and inefficient consumption of energy resources [24].
Most ABR algorithms make decisions based on available bandwidth and buffer occupancy without considering the energy implications of these choices [9,15,20]. As a result, they tend to select higher bitrates and resolutions when network conditions improve, leading to increased energy consumption on both server and client sides. Furthermore, when faced with varying energy constraints on mobile devices or battery-operated platforms, the lack of energy awareness in ABR algorithms can significantly impact the device's battery life and overall user experience. In contrast, energy-aware ABR algorithms offer a promising avenue to reduce the environmental impact of video streaming. These algorithms adaptively adjust video quality and bitrate in response to real-time network conditions, aiming to reduce energy consumption on the end-user device while maintaining a satisfactory user experience. This approach not only reduces the strain on the delivery chain, but also extends the battery life of end-user devices, fostering a more sustainable digital ecosystem.
To address the growing energy challenges associated with video streaming, we propose E-WISH, an energy-aware ABR algorithm situated on the client for HAS, which considers the (i) available throughput, (ii) player buffer, (iii) video quality, and (iv) energy consumption to select the optimal representation for the next video segment.
This paper is organized as follows. Section 2 gives an overview of the related work. Section 3 introduces E-WISH and the details of the algorithm. The experimental setup and preliminary results are reported in Section 4 with a focus on QoE and energy consumption, followed by the concluding remarks and future work in Section 5.

RELATED WORK
ABR algorithms can be divided into (i) throughput-based, (ii) buffer-based, or (iii) hybrid approaches, depending on the metrics used to adjust the bitrate selection [3]. In the family of throughput-based ABR algorithms, we consider AGG [17] (short for aggressive), which selects the highest possible bitrate within the estimated throughput limit. Within the category of buffer-based approaches, BBA-0 (Buffer-Based Adaptation version 0) [9] is an ABR algorithm that requests the next segment at a bitrate given by a monotonically nondecreasing function of the instantaneous buffer occupancy. Moreover, it is necessary to specify a low and a high threshold. When the buffer occupancy falls below the low threshold, BBA-0 requests the lowest-quality segment. Conversely, if the buffer level surpasses the high threshold, the client fetches the highest-quality segment. BBA-0 is designed to handle throughput fluctuations by relying on the buffer state. SARA [12] is a hybrid ABR algorithm that retrieves the next segment based on the current buffer state and the estimated throughput. SARA's approach involves categorizing buffer occupancy into four zones ranging from low to high. When the buffer level falls within the first zone, the "fast start" phase triggers the selection of the lowest quality version. In the second zone, the quality level for the next segment increases by one unit, employing the "additive increase" method. The optimal operational zone, as per SARA, lies in the third area, where the quality level adjusts or stabilizes based on network conditions and buffer level. If the buffer occupancy surpasses the third zone, the quality that best matches the throughput is selected, and the download is intentionally delayed to force the buffer back into the optimal operational zone. Oriented to user preferences, WISH (WeIghted Sum model for HAS) [15] was proposed to take into account different factors and video streaming parameters that influence user experience. WISH selects the quality for the next segment based on throughput cost, buffer cost, and quality cost. Furthermore, the user can assign a priority to the different factors to be taken into account during the selection (e.g., "focus on the quality" or "reduce the data consumption").
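The threshold behaviour described for BBA-0 can be illustrated with a minimal sketch. It assumes a linear rate map between the low and high thresholds and omits the switching hysteresis of the original algorithm [9]; the function name, parameter names, and the bitrate ladder in the usage note are hypothetical.

```python
from typing import Sequence


def bba0_bitrate(buffer_s: float, bitrates: Sequence[float],
                 low_s: float, high_s: float) -> float:
    """Pick a bitrate from the buffer level alone (simplified BBA-0 sketch)."""
    ladder = sorted(bitrates)
    if buffer_s <= low_s:
        return ladder[0]            # below the low threshold: lowest quality
    if buffer_s >= high_s:
        return ladder[-1]           # above the high threshold: highest quality
    # Assumption: the rate map rises linearly between the two thresholds.
    fraction = (buffer_s - low_s) / (high_s - low_s)
    ceiling = ladder[0] + fraction * (ladder[-1] - ladder[0])
    # Highest available bitrate that does not exceed the mapped ceiling.
    return max(r for r in ladder if r <= ceiling)
```

For example, with a hypothetical ladder `[1, 2.5, 5, 8]` (Mbps) and thresholds of 5 s and 15 s, a 10 s buffer maps to a ceiling halfway up the ladder range, so the sketch returns the highest bitrate at or below that ceiling.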
In the realm of video streaming, there has been a recent surge in awareness regarding the energy consumption of end-user devices. This has led to the emergence of various energy-aware techniques [24,25] aimed at mitigating the impact of multimedia streaming services on energy consumption.
Varghese et al. [25] propose eDASH, an energy-aware plugin for MPEG-DASH players designed specifically for mobile devices.The eDASH player seeks to reduce battery consumption on mobile devices by considering factors such as bitrate and video brightness when determining the next video chunk to download.However, this study is hardly replicable in real-world video streaming conditions due to a low number of representations.Furthermore, the impact of vital metadata, such as file size, content complexity, and resolution, on energy consumption remains largely unexplored.
To partially address this, GreenABR has been proposed [24]. It comprises a reinforcement learning (RL) algorithm, whose goal is to select the video representation that maximizes the video quality (expressed in Video Multi-Method Assessment Fusion (VMAF) [14] points) while reducing energy consumption and stall duration. The latter value is predicted by a deep neural network (DNN) accepting input features like bitrate, resolution, file size, video quality, and motion rate. Although it provides promising results, a VMAF score is required for every video segment and representation, whose computation is a time-consuming and energy-consuming operation. Furthermore, the energy consumption required for the training process is not reported in the study.
Consequently, further research is needed to thoroughly explore energy-related metrics and to obtain a comprehensive understanding of energy-efficient video streaming techniques while simplifying the employed ABR techniques.

E-WISH
In this section, we extend the aforementioned WISH algorithm and propose E-WISH as a technique to reduce the energy consumption of the end-user device with minimal impact on the QoE with respect to state-of-the-art approaches.
The E-WISH algorithm is presented in Algorithm 1, whereas all parameters discussed in this section are included in Table 1. E-WISH selects the bitrate $r_j$ for the next segment according to the following two modes:

Conservative mode. If the current buffer level before downloading the next segment $i$ (denoted as $B_i$) is less than a predefined threshold $B_{min}$, then the buffer is in a dangerous zone with a high stall probability. In this case, E-WISH selects the lowest representation, $r_1$, to start the playback as fast as possible at the beginning of the streaming session and to decrease the risk of stalling.

Operative mode. If the current buffer level exceeds $B_{min}$, the streaming session switches to the operative mode. The core of E-WISH operates as follows. It calculates the cost $C(j)$ of each video representation $j$, which extends the formulation defined in the WISH paper [15]. The overall cost $C(j)$ is a weighted sum of the throughput cost $C_t(j)$, the buffer cost $C_b(j)$, and the quality cost $C_q(j)$, plus the energy cost $C_e(j)$, as presented in Equation 1; the weights $\alpha$, $\beta$, $\gamma$, and $\delta$ are positive numbers:

$$C(j) = \alpha \, C_t(j) + \beta \, C_b(j) + \gamma \, C_q(j) + \delta \, C_e(j) \qquad (1)$$

With the throughput $T$ estimated according to WISH [15], the throughput cost $C_t(j)$ of representation $j$ is calculated as a linearly increasing function of its bitrate $r_j$. In the operative mode, to cope with throughput oscillations, E-WISH only considers the representations whose bitrates $r_j$ are less than the last throughput $T_{i-1}$ with the margin $\mu$ ($\mu = 0.1$ in this paper) (line 6 in Algorithm 1).

The buffer cost $C_b(j)$ is defined based on the following observations. While downloading representation $j$ with bitrate $r_j$, the buffer is drained by the download time, which is $\frac{r_j}{T} \times \tau$, where $\tau$ is the segment duration. The longer the download time, the more the buffer is decreased. Additionally, with the same download time, a buffer at a low level has a higher risk of under-running, which would result in a stall event. Therefore, the buffer cost of representation $j$ increases with its download time and with the proximity of the buffer to depletion.

The quality cost $C_q(j)$ comprises two sub-penalties: (i) a penalty when a representation is lower than the highest-bitrate representation, and (ii) a penalty if it is different from the average quality of the recent segments. To make the quality cost positive, we use an exponential function of these penalties, where $Q(j)$ is the quality of representation $j$, and $\bar{Q}_k$ is the average quality of the last $k$ segments (default $k = 10$), as defined in line 4 of Algorithm 1. In this paper, the quality of the $j$-th representation is calculated from its bitrate $r_j$.

Finally, the energy cost $C_e(j)$ of representation $j$ is defined as a linear function of its bitrate $r_j$, content resolution $s_j$ (number of overall pixels), and frames per second $f_j$ [8], where each $w_n$ with $n = 1, \ldots, 3$ is a model parameter specific to the end-user device. The importance given to each weight $w_n$ is derived from the model in [8] and depends on the type of client device, such as PC, laptop, or smartphone.

Algorithm 1: E-WISH ABR algorithm.

Notation from Table 1:
$f_j$: The frames per second of representation $j$
$C(j)$: The total cost of representation $j$
$C_t(j)$: The throughput cost of representation $j$
$C_b(j)$: The buffer cost of representation $j$
$C_q(j)$: The quality cost of representation $j$
$C_e(j)$: The energy cost of representation $j$
We relied on the cited model, given its reliability and adoption for different devices.However, it is worth noting that the adopted model could be replaced in the future by more precise energy estimation models.
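As a hedged illustration, one set of functional forms that is consistent with the verbal descriptions of the four costs can be written as follows. Here $r_j$, $s_j$, and $f_j$ denote the bitrate, pixel count, and frame rate of representation $j$, $r_M$ the highest bitrate, $T$ the estimated throughput, $\tau$ the segment duration, $B_i$ the current buffer level, and $\bar{Q}_k$ the average quality of the last $k$ segments. These forms are assumptions and not necessarily the equations used in this paper or in WISH [15]: the normalization of the throughput cost, the exact dependence of the buffer cost on $B_i$, the bitrate-to-quality mapping $Q(j)$, and the pairing of $w_1, w_2, w_3$ with bitrate, resolution, and frame rate are hypothetical choices.

```latex
\begin{align}
C_t(j) &= \frac{r_j}{T}
  && \text{linear in } r_j \\
C_b(j) &= \frac{r_j \, \tau}{T} \cdot \frac{1}{B_i}
  && \text{download time, weighted more heavily at low buffer} \\
C_q(j) &= \exp\!\Big( \big[ Q(M) - Q(j) \big] + \big[ \bar{Q}_k - Q(j) \big] \Big)
  && \text{two sub-penalties, made positive by } \exp \\
Q(j)   &= \frac{r_j}{r_M}
  && \text{quality derived from bitrate} \\
C_e(j) &= w_1 \, r_j + w_2 \, s_j + w_3 \, f_j
  && \text{linear device-specific energy model}
\end{align}
```

The signed difference $\bar{Q}_k - Q(j)$ can be negative when representation $j$ exceeds the recent average quality, which is one reading of why an exponential is needed to keep the quality cost positive.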
This paper does not cover the mathematical determination of the weights $\alpha$, $\beta$, $\gamma$, and $\delta$, which poses a notable challenge given the complexity of finding an optimal solution. In Section 4.1 we use $\alpha$, $\beta$, and $\gamma$ as determined in [15], while we set $\delta$ to an empirical value.
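The two modes and the weighted-sum selection can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the implementation evaluated in this paper: the individual cost formulas, the bitrate-to-quality mapping, the direction of the throughput margin, the units (Mbps), and all class, function, and parameter names are hypothetical, and the frame rate in the example ladder is an assumed value.

```python
import math
from dataclasses import dataclass
from statistics import fmean
from typing import Sequence


@dataclass(frozen=True)
class Representation:
    bitrate: float  # assumed unit: Mbps
    pixels: int     # width x height
    fps: float


def select_representation(reps: Sequence[Representation], est_throughput: float,
                          last_throughput: float, buffer_level: float,
                          recent_qualities: Sequence[float], *,
                          buffer_threshold: float, segment_duration: float,
                          weights: tuple, energy_params: tuple,
                          margin: float = 0.1) -> Representation:
    ladder = sorted(reps, key=lambda r: r.bitrate)
    lowest, highest = ladder[0], ladder[-1]

    # Conservative mode: buffer below the threshold, take the lowest representation.
    if buffer_level < buffer_threshold:
        return lowest

    alpha, beta, gamma, delta = weights
    w1, w2, w3 = energy_params

    def quality(rep: Representation) -> float:
        # Assumption: quality is the bitrate normalized by the highest bitrate.
        return rep.bitrate / highest.bitrate

    avg_quality = fmean(recent_qualities) if recent_qualities else quality(lowest)

    def cost(rep: Representation) -> float:
        throughput_cost = rep.bitrate / est_throughput
        download_time = rep.bitrate * segment_duration / est_throughput
        buffer_cost = download_time / max(buffer_level, 1e-6)
        quality_cost = math.exp(quality(highest) + avg_quality - 2 * quality(rep))
        energy_cost = w1 * rep.bitrate + w2 * rep.pixels + w3 * rep.fps
        return (alpha * throughput_cost + beta * buffer_cost
                + gamma * quality_cost + delta * energy_cost)

    # Operative mode: keep only bitrates within the margin of the last throughput
    # (assumed here to be an upper bound of (1 + margin) * last_throughput).
    candidates = [r for r in ladder
                  if r.bitrate <= (1 + margin) * last_throughput] or [lowest]
    return min(candidates, key=cost)


# Example ladder built from representations mentioned in Section 4.2.1 (24 fps assumed).
LADDER = [Representation(5.8, 1920 * 858, 24),
          Representation(7.5, 2560 * 1142, 24),
          Representation(17.0, 3840 * 1714, 24)]
CONFIG = dict(buffer_threshold=4.0, segment_duration=4.0,
              weights=(0.074, 0.203, 0.723, 0.1),
              energy_params=(0.19, 6.29e-8, 3.522e-4))
```

Calling `select_representation(LADDER, 20, 20, 2.0, [], **CONFIG)` exercises the conservative mode, while a buffer level above the threshold triggers the cost comparison among the candidates that pass the throughput filter.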

EXPERIMENTS AND PRELIMINARY RESULTS
4.1 Experimental Setup
Our testbed, depicted in Figure 1, relies on a Lenovo ThinkPad P1 Gen 1 laptop with Windows 10 acting as an HTTP server and a desktop computer running Ubuntu 22.04 LTS with 16 GB RAM, an Intel Core i7-8700K CPU, and an NVIDIA GeForce GTX 1060 GPU working as the client that handles the streaming session. To simulate different scenarios, we shape the network connection between the server and the client using Wondershaper, according to a cascade network trace with pattern {20, 8, 4, 8} Mbps (where each value is held for 30 s) repeated over time [2]. We used two test videos stored on the server (Figure 2): (i) the first 5 minutes of Tears of Steel, denoted ToS1, and (ii) the last 5 minutes of Tears of Steel, denoted ToS2. These videos have different complexity in terms of spatial information (SI) and temporal information (TI) [11], namely low SI (28.6) and low TI (14.1) for ToS1 and high SI (74.2) and high TI (31.7) for ToS2.
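The cascade trace can be expressed as a simple schedule of (start time, bandwidth) pairs, as in the following sketch. The function name and the 300 s default duration (matching the 5-minute test sequences) are assumptions; applying each value to the network interface is left to the traffic shaper.

```python
from itertools import cycle, islice
from typing import List, Sequence, Tuple


def cascade_schedule(pattern_mbps: Sequence[int] = (20, 8, 4, 8),
                     step_s: int = 30,
                     duration_s: int = 300) -> List[Tuple[int, int]]:
    """Return (start_time_s, bandwidth_mbps) pairs for a repeating cascade trace."""
    steps = -(-duration_s // step_s)  # ceiling division
    return [(k * step_s, rate)
            for k, rate in enumerate(islice(cycle(pattern_mbps), steps))]


schedule = cascade_schedule()
# Mean bandwidth over one full cycle of the pattern, in Mbps.
mean_cycle_mbps = sum((20, 8, 4, 8)) / 4
```

Over one full 120 s cycle the pattern averages 10 Mbps, which lies between the 7.5 Mbps and 17 Mbps representations discussed in the results.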
The segment duration $\tau$ is 4 s as recommended in [5]. At the client, the buffer capacity $B_{max}$ is set to 20 s to test the responsiveness of the adaptation algorithm. The buffer threshold $B_{min}$ is set to 4 s (i.e., $B_{min} = \tau$). We compare our proposed method, E-WISH, with the state-of-the-art approaches described in Section 2: AGG, BBA-0, SARA, and WISH. Although they are state-of-the-art ABR algorithms, GreenABR and eDASH are not considered in these experiments. GreenABR's RL approach is implemented in Python rather than on an MPEG-DASH player, which makes a comparison on a real testbed unfair. Indeed, the video segments are fetched but not decoded and rendered. Furthermore, the energy consumption values related to decoding and rendering, provided by the authors, are estimated by their energy consumption model, designed for smartphones, and not measured by an accurate multimeter. eDASH, on the other hand, is proposed for MPEG-DASH players but implemented on SimPy, a simulation framework based on Python, so a comparison would be unfair for the same reason as for GreenABR. Furthermore, it would require several modifications to the HAS architecture and to the manifest structure to additionally include the brightness levels at which each video sequence needs to be encoded. Each experiment is run five times, and the experimental results represent the average values with the respective standard deviations.
In this paper, the following metrics are used: (i) QoE score according to the extended version [19] of the original ITU-T P.1203 mode 0 [10]; (ii) energy consumption measured via the Voltcraft VC-7200BT digital multimeter.
For a fair comparison, WISH and E-WISH share the same configuration of the three weights $\alpha$, $\beta$, and $\gamma$, computed mathematically according to the equations defined in [15], which in this scenario are 0.074, 0.203, and 0.723, respectively. The fourth weight, only used in E-WISH, is empirically set to $\delta = 0.1$, so that the energy cost is on the same order of magnitude as the other costs. A detailed investigation of $\delta$ is subject to future work. Since we run the Bitmovin player on a desktop computer, the weights $w_n$ for the energy cost function are defined as $w_1 = 0.19$, $w_2 = 6.29 \times 10^{-8}$, and $w_3 = 3.522 \times 10^{-4}$ according to [8].
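To give a feeling for how the desktop parameters combine, the sketch below evaluates a linear energy cost for three representations of the test content. It rests on several assumptions that are not stated in the paper: the three parameters are paired with bitrate, pixel count, and frame rate in the order the energy cost lists them, the bitrate is expressed in Mbps, and the content plays at 24 fps. The function and variable names are hypothetical.

```python
# Desktop parameters from Section 4.1 (pairing with the inputs is an assumption).
W_BITRATE, W_PIXELS, W_FPS = 0.19, 6.29e-8, 3.522e-4


def energy_cost(bitrate_mbps: float, width: int, height: int, fps: float) -> float:
    """Linear energy cost of one representation (unit-free cost term)."""
    return W_BITRATE * bitrate_mbps + W_PIXELS * width * height + W_FPS * fps


ASSUMED_FPS = 24  # assumption, not stated in the paper
costs = {
    "17 Mbps @ 3840x1714": energy_cost(17.0, 3840, 1714, ASSUMED_FPS),
    "7.5 Mbps @ 2560x1142": energy_cost(7.5, 2560, 1142, ASSUMED_FPS),
    "5.8 Mbps @ 1920x858": energy_cost(5.8, 1920, 858, ASSUMED_FPS),
}

for name, value in costs.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the bitrate term dominates the cost, so lowering the maximum representation reduces the energy term far more than it would through the resolution term alone.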

4.2 Preliminary Results
Figure 3 depicts E-WISH's performance compared to the aforementioned state-of-the-art approaches according to the metrics introduced earlier. The figure reports results for the two video sequences, ToS1 (blue) and ToS2 (orange). Arrows indicate whether higher (↑) or lower (↓) values represent an improvement. The black error bars represent the corresponding standard deviations.

4.2.1 QoE. Figure 3a represents the QoE achieved by E-WISH compared to state-of-the-art ABR algorithms within a video streaming session. The figure shows that SARA obtains the lowest QoE for both video sequences. This is primarily due to its greedy strategy: it requests high-bitrate video segments that, in the presence of a throughput drop, induce several stall events. From the literature, we know that the QoE is influenced by visual quality (e.g., estimated from bitrate, resolution, and codec), video instability (i.e., the quality trend of subsequent video segments over time), and stall events [22]. AGG, BBA-0, and WISH lead to quite similar QoE scores for both ToS1 (2.36, 2.46, 2.48) and ToS2 (2.46, 2.29, 2.40). On the contrary, E-WISH outperforms all other ABR techniques for both video sequences, including WISH. For ToS1, E-WISH scores a QoE of 3.16 points, improving the QoE by 27% compared to WISH and up to 52% compared to SARA. For ToS2, E-WISH achieves a QoE of 2.96 points, increasing the QoE by 19% compared to WISH and up to 48% compared to SARA. The main reason for these results lies in the maximum resolution limit and the QoE trend of the video representations. According to the ITU-T P.1203 QoE model (mode 0), the QoE when playing the highest representation is approximately 4.92 with a resolution of 3840×1714 and a bitrate of 17 Mbps. Moving to the 7.5 Mbps representation with a resolution of 2560×1142, the computed QoE is 4.76 points, reduced by 0.16 points compared to the previous one. Considering one representation lower, for a resolution of 1920×858 and a bitrate of 5.8 Mbps, the estimated QoE drops to 4.18 points, a reduction of more than 15%. In this scenario, we limit E-WISH's decision space by setting the maximum acceptable resolution to 2560×1142, since the QoE reduction from the highest representation is negligible (∼ −3%) while the bitrate decrease is significant (∼ −56%). Additionally, this has a major impact on energy consumption, which is discussed in Section 4.2.2.

4.2.2 Energy consumption. Figure 3b represents the energy consumption measured from the beginning until the end of the playback. It is measured in watt-seconds (W s), i.e., the electrical energy consumed over a given time span, which in our case is the duration of the streaming session. Analyzing the figure, we can see that SARA consumes the highest amount of energy among the compared ABR approaches, namely 1187 W s for ToS1 and 1191 W s for ToS2. As explained before, SARA's greediness leads to long stall events, which prolong the streaming session and, hence, increase the overall energy consumption. AGG and BBA-0 obtain similar results for ToS1 with roughly 1155 W s, while BBA-0 performs slightly better than AGG for ToS2 with 1140 W s versus 1176 W s. Lastly, E-WISH achieves the lowest energy consumption with roughly 1050 W s for both video sequences, whereas WISH achieves approximately 1100 W s.
Similarly to the previous considerations concerning the quality, the primary explanation for these results lies in the maximum resolution limit and the energy consumption trend for decoding the video representations. Running further experiments in which all segments are requested at the same constant bitrate for the whole streaming session, we found that the energy consumption when requesting and playing the highest representation (with a bitrate of 17 Mbps) is approximately 1250 W s, while it reduces to 1100 W s for the lower representation with a bitrate of 7.5 Mbps. These results are averaged between ToS1 and ToS2. Therefore, when E-WISH is constrained to request video segments up to 7.5 Mbps, the QoE decreases negligibly (∼ −3%) compared to the highest representation, as explained in Section 4.2.1, while the energy consumption declines by more than 11%. Specifically, according to our presented results, E-WISH reduces the energy consumption of the video streaming session by 5% compared to WISH and by up to 12% compared to SARA.

CONCLUSION AND FUTURE WORK
In this paper, we proposed E-WISH, a novel energy-aware ABR algorithm that takes into account (i) throughput, (ii) buffer, (iii) quality, and (iv) energy costs to enhance the end-user's QoE while concurrently reducing the end-user device's energy consumption. According to our initial findings, E-WISH can reduce the energy consumption of end-user devices by up to 12% while improving the QoE by up to 52% compared to state-of-the-art approaches.
In the future, we plan to evaluate E-WISH's performance with new video sequences and weight configurations, and to determine a mathematical threshold for limiting the maximum resolution. Lastly, we plan to derive a mathematical formulation to compute the weights for all costs, similar to the formulation proposed in WISH [15], extending the weight determination to include $\delta$.

Figure 3: QoE (a) and energy consumption (b) of the compared ABR algorithms for 20 s buffer capacity.

Table 1: Notations used in the paper.
$R$: The set of incrementally ordered available bitrates, $R = \{r_1, \ldots, r_M\}$, where $r_M$ has the highest bitrate