Towards Seamless Serverless Computing Across an Edge-Cloud Continuum

Serverless computing has emerged as an attractive paradigm due to the efficiency of development and the ease of deployment without managing any underlying infrastructure. Nevertheless, serverless computing approaches face numerous challenges to unlock their full potential in hybrid environments. To gain a deeper understanding and firsthand knowledge of serverless computing in edge-cloud deployments, we review the current state of open-source serverless platforms and compare them based on predefined requirements. We then design and implement a serverless computing platform with a novel edge orchestration technique that seamlessly deploys serverless functions across the edge and cloud environments on top of the Knative serverless platform. Moreover, we propose an offloading strategy for edge environments and four different functions for experimentation and showcase the performance benefits of our solution. Our results demonstrate that such an approach can efficiently utilize both cloud and edge resources by dynamically offloading functions from the edge to the cloud during high activity, while reducing the overall application latency and increasing request throughput compared to an edge-only deployment.


INTRODUCTION
Serverless computing, a new and highly popular paradigm in the cloud computing community, simplifies software development and deployment in a cloud environment. Serverless enables developers to run their code without having to manage any underlying infrastructure [4,5]. One typical offering is Function-as-a-Service (FaaS), exemplified by public cloud services such as AWS Lambda [24], Azure Functions [13], and Google Cloud Functions [16]. FaaS offers event-driven execution of functions for cloud customers to run their code for virtually any type of application or backend service without provisioning or managing any servers.
Apart from serverless computing, emerging applications in areas such as manufacturing [6,21], healthcare [1,20], smart cities [12,15], agriculture and farming [10,19], and transportation [14,27] have been both nurturing and demanding edge computing in recent years. Edge computing brings processing, data storage, and applications closer to the edge of the network, where end devices such as Internet-of-Things (IoT) devices and smartphones generate and consume data, which yields low latency, improved reliability, better data privacy, cost savings, and energy efficiency [2,7,23]. At the same time, it is increasingly common for applications to be composed of different modules that are distributed over different tiers (e.g., edge, fog, cloud) and interoperate with one another [17]. Such a computational model has emerged under the term Edge-Cloud continuum, in which the infrastructure's geo-distributed and heterogeneous nature presents unique challenges and opportunities [3,22]. However, interoperating control across an Edge-Cloud continuum is still a challenge. Several existing works have made considerable progress in tackling this challenge. Nevertheless, resource management, scheduling, fault tolerance, deployment complexity, and cold-start mitigation [8,9,18,25,26] still need to be addressed.
In this paper, we address the interoperability problem between edge and cloud environments through serverless computing, making the deployment process more efficient across the Edge-Cloud continuum and improving the performance of processing applications. Our vision for interoperable serverless computing, which enables an Edge-Cloud continuum, involves a platform that abstracts not just the infrastructure, but also the location where functions are executed, using available resources as efficiently as possible. We aim to gain practical insights into the intricacies and challenges associated with this architecture. Our primary contributions can be summarized as follows:
• We review the current state of serverless platforms and compare them based on predefined requirements.
• We implement a serverless platform and develop a novel edge orchestration technique that enables a seamless deployment of serverless functions across both edge and cloud environments.
• We propose an edge offloading strategy and conduct extensive experiments to showcase its performance benefits.
The source code is available at the GitHub repository 1. The remainder of this paper is structured as follows. Section 2 presents related work. In Section 3, we present the requirement analysis, the platform choices, and our framework. Section 4 details the experimental setup and discusses the results. Finally, we conclude our work in Section 5.

RELATED WORK
Recently, research has addressed serverless applications in the edge-cloud continuum, focusing on three dimensions: (1) scheduling functions in resource-limited edge environments [25,26], (2) optimizing serverless resource usage in edge environments [9,11], and (3) deployment complexity for seamless integration of serverless computing across this continuum [8,18]. Wang et al. [26] introduced Lass, a platform for running latency-sensitive serverless apps on the edge using queuing theory to allocate resources and autoscale as needed. Tang et al. [25] proposed a deep learning task scheduling algorithm for resource utilization improvement at the edge. By contrast, we extend Knative's default round-robin scheduling to enable offloading requests from edge to cloud based on function response times. Gadepalli et al. [9] explored WebAssembly's potential for efficient serverless computing at the edge due to its low resource overhead. Jeon et al. [11] optimized resource usage by caching function dependencies using deep reinforcement learning. Our approach focuses on offloading work to the cloud, not on optimizing runtime overhead. Nastic et al. [18] presented Serverless Computing Fabric (SCF) for the Edge-Cloud continuum, addressing edge-native backend services, resource usage, and edge intelligence. Ferry et al. [8] introduced a solution for Cloud-Edge-IoT applications with a modeling language. Our approach does not target IoT scenarios specifically; however, it may have higher overhead in some IoT contexts, as Knative's features have been designed with a focus on cloud environments.

METHODOLOGY

3.1 Requirement Analysis
The design of a serverless platform running on the Edge-Cloud continuum necessitates careful consideration of multiple factors, including scalability, security, cost-effectiveness, and performance. On the one hand, the platform must be capable of operating both at the Edge, close to the source of data, and in the Cloud, where it can leverage the resources of a large data center. On the other hand, the design should address the unique challenges presented by this hybrid environment, such as managing data transfer between the Edge and Cloud and handling variable traffic levels in real time. By harnessing the advantages of both the Edge and the Cloud, our serverless platform can provide a flexible and efficient solution for a wide range of applications. To achieve this, we define the following functional and non-functional requirements.
The functional requirements we propose for our solution are the following:
• Location Agnostic. The same function definition can be used to run serverless functions in both Edge and Cloud environments.
• Scalability. The platform can accommodate the addition of new edge clusters and nodes dynamically.
• Dynamic Scheduling. The edge cluster gateway must be able to determine the execution location of a function dynamically, either at the Edge or in the Cloud.
We also define the non-functional requirements as follows:
• Heterogeneity. The system should support nodes with varying hardware capabilities (e.g., x86, ARM).
• Resource Allocation. The system must be able to allocate resources dynamically in accordance with workload demands.
• Fault Tolerance. If a worker node fails, the system can still operate normally.
• Reliability. The system should be robust to a poor Edge-Cloud connection.
• Security. Communication between Edge and Cloud should be secure.
We establish a set of selection criteria for platform comparison that will guide the design process.
(1) The serverless platform should be actively maintained in the community.
(2) The serverless platform should be open-source and have a permissive license. We wish to be able to extend the platform while avoiding any restrictive licensing.
(3) The serverless platform must be able to scale to zero and to scale workloads according to actual demand, both of which are essential features for its intended purpose.
(4) The serverless platform should be able to run on limited resources and on heterogeneous hardware.
These criteria serve as the foundation for our solution, ensuring that the final solution aligns with the desired goals and objectives.

3.2 Technology Investigation
We collect information from the literature and official sources, such as GitHub statistics (including stars, forks, issues, and number of commits) and software documentation, and identify nine mainstream open-source serverless platforms. We have checked the capabilities of each platform, as understood from the official documentation and from initial deployments of the platforms on our experimental setup, and evaluated the reasons for including or excluding them from our platform options. This evaluation process helps us to determine which serverless platforms are best suited to meet our platform's functional and non-functional requirements and to make informed decisions about which platform to choose for our implementation. We summarize our findings in Table 1 and explain our reasoning below for each serverless platform on a case-by-case basis.

Table 1: For each investigated serverless platform, we have assessed whether it fits our platform criteria #(1-4). Columns: Serverless Platform; Platform criteria #(1), #(2), #(3), #(4).
• Knative is a well-regarded serverless platform that has gained popularity in academic research circles and within the open-source community. Our requirements analysis has determined that it is a suitable choice to include in the platform options.
• Kyma. As a FaaS solution based on Knative, Kyma inherits this underlying framework's technological capabilities and constraints. It meets all the previously defined requirements.
• OpenFaaS. Despite its technical capabilities, the licensing of core serverless features in OpenFaaS prevents us from using it; it requires a paid license for crucial features such as scaling to zero and event handling using message brokers.
• Fission. Our analysis has indicated that it has the capabilities and features to fulfill the intended purpose of the platform, as well as to meet non-functional requirements such as performance, scalability, security, and cost-effectiveness.
• OpenLambda. Despite being designed for academic research, OpenLambda lacks support for cluster deployments, which is a critical requirement for deploying our solution.
• OpenWhisk is a well-established serverless platform. Our requirements analysis indicates that it satisfies all the previously established functional and non-functional requirements.
• Kubeless. Despite being widely recognized for its performance, we exclude Kubeless from the design of the serverless platform because it is no longer actively maintained.
• Fn. While Fn was one of the pioneers in the open-source serverless space, it is no longer maintained and will not be considered in our analysis.
• IronFunctions. Like Fn, IronFunctions is an early example of an open-source serverless solution. Unfortunately, like Fn and Kubeless, it has become inactive and, as a result, has been excluded from our design's list of platform options.
During our practical assessment using our test bed, we encountered various deployment challenges for different serverless platforms. Notably, we faced memory limitations that prevented us from deploying OpenWhisk due to requirements associated with the necessary message broker. While Fission was successfully deployed (as shown in Table 1), it exhibited suboptimal performance on our test bed, with function instances frequently hanging and requiring restarts. In contrast, both Knative and Kyma were successfully deployed and thoroughly tested. After careful consideration, we selected Knative as the foundation for our solution. This choice was supported by the fact that Kyma is built on top of Knative, ensuring compatibility between our solution and Kyma.

3.3 Platform Design and Implementation
We design and implement Knative Edge as an extension of the existing Knative serverless platform as shown in Figure 1(A).

3.3.1 Cloud-to-Edge Replication. The Knative Edge controller mirrors Knative Services from the Cloud cluster to the Edge cluster by watching for changes in the Kubernetes resources of both clusters and modifying the Edge resource to keep a consistent definition. As shown in the replication part of Figure 1(B), it uses several different components within the edge cluster to achieve its goals. However, replicating a resource risks making unnecessary changes to the resources it replicates. This is most evident when a Knative Service is replicated, as any change can trigger Knative Serving to react and make more changes to the resource, which in turn can trigger Knative Edge to make more changes. This feedback loop can degrade the serverless functions running on either the Edge or the Cloud and can increase the traffic between the Cloud and Edge.
Our approach to this issue is to compare fields in the Knative Service definition selectively. Whenever Knative Edge receives an update for a service it replicates, it copies the current definition from the Edge and overwrites a subset of its fields using the definition from the Cloud. When overwriting, we skip the state of the service and any annotations that are not defined by Knative Edge; thus, the internal state of the Edge resources is preserved. The new service definition is compared to the current definition on the Edge cluster and deployed to the cluster only if a change is detected.
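The selective overwrite described above can be sketched in a few lines of Python. This is an illustrative sketch and not the Knative Edge implementation: it treats a Knative Service as a plain dictionary, assumes that the overwritten subset consists of the `spec` and the annotations owned by Knative Edge, and uses a hypothetical annotation prefix `edge.example.dev/` to identify those annotations. The function names are likewise hypothetical.

```python
import copy

# Hypothetical prefix marking annotations that are defined by Knative Edge.
EDGE_ANNOTATION_PREFIX = "edge.example.dev/"


def merge_service(edge, cloud):
    """Start from the Edge definition and overwrite selected fields from the Cloud.

    The status and all annotations not owned by Knative Edge are left untouched,
    so the internal state of the Edge resource is preserved.
    """
    merged = copy.deepcopy(edge)
    if "spec" in cloud:
        merged["spec"] = copy.deepcopy(cloud["spec"])
    cloud_annotations = cloud.get("metadata", {}).get("annotations", {})
    owned = {
        key: value
        for key, value in cloud_annotations.items()
        if key.startswith(EDGE_ANNOTATION_PREFIX)
    }
    if owned:
        merged.setdefault("metadata", {}).setdefault("annotations", {}).update(owned)
    return merged


def needs_update(edge, cloud):
    """Deploy only when the merged definition differs from what the Edge already has."""
    return merge_service(edge, cloud) != edge
```

Because the comparison is made against the merged definition rather than the raw Cloud definition, changes that Knative Serving makes to the status or to its own annotations do not register as differences, which is what breaks the feedback loop in this sketch.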

3.3.2 Edge-to-Cloud Offloading. To monitor and manage runtime platform metrics efficiently, we deploy an instance of Prometheus on each Edge cluster, as shown in Figure 1(C). This instance is configured to be aware of all Knative components in the cluster, including all running function instances. It scrapes their metrics regularly and temporarily stores them in a time series database, which is optimized for storing and querying metrics such as those generated by Knative. Since Knative Edge queries only recent data, we configure a short data retention period to keep the overhead of Prometheus as low as possible. The scheduling is driven by a load-balancing algorithm that spreads the traffic over the different routes based on a defined percentage. We implement a simple default offloading strategy that uses the request latency metrics of all the functions running at the Edge. The API Gateway makes the routing decision randomly, so that only a percentage of the traffic (decided by the offloading strategy) is sent to the Cloud.
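The random, percentage-based routing decision described above can be illustrated with a minimal Python sketch. It is not the gateway's actual code: the function names, the `"edge"`/`"cloud"` labels, and the seeded helper used to check the split are all assumptions made for illustration.

```python
import random


def choose_route(cloud_percentage, rng=random):
    """Return "cloud" for roughly cloud_percentage % of the calls, else "edge"."""
    if not 0.0 <= cloud_percentage <= 100.0:
        raise ValueError("cloud_percentage must be between 0 and 100")
    # rng.random() is uniform in [0, 1), so 0 never offloads and 100 always does.
    return "cloud" if rng.random() * 100.0 < cloud_percentage else "edge"


def split_counts(cloud_percentage, requests, seed=0):
    """Route a batch of requests and count how many went to each side."""
    rng = random.Random(seed)
    counts = {"edge": 0, "cloud": 0}
    for _ in range(requests):
        counts[choose_route(cloud_percentage, rng)] += 1
    return counts
```

Each request is decided independently, so the observed split only approaches the configured percentage over many requests; short bursts can deviate from it noticeably.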
Let $X(t)$ be the distribution of request latencies at time $t$, and let $P_{95}$ and $P_{50}$ be the 95th and 50th percentiles, respectively. The latest latency response ratio $R(t)$ is given by

$$R(t) = \frac{P_{95}(X(t))}{P_{50}(X(t))}. \qquad (1)$$

To give more importance to recent values than to older ones, we use an exponentially decreasing weighted sum, with $w_{\text{decay}}$ as the exponent, to compute $R'(t)$ in Equation (2), where $N$ defines how many steps in time are used to calculate the weighted sum of the latest request latency ratios measured. From $R'(t)$ we calculate $D(t)$ in Equation (3), the intended traffic percentage that should be forwarded to the Cloud. We define $D(t)$ as

$$D(t) = \begin{cases} 0, & R'(t) < R_{\text{soft}}, \\ 100 \cdot \dfrac{R'(t) - R_{\text{soft}}}{R_{\text{hard}} - R_{\text{soft}}}, & R_{\text{soft}} \le R'(t) \le R_{\text{hard}}, \\ 100, & R'(t) > R_{\text{hard}}, \end{cases} \qquad (3)$$

in which $R_{\text{soft}}$ and $R_{\text{hard}}$ are the soft limit and the hard limit of the percentile ratio. For any $R'(t)$ that falls below $R_{\text{soft}}$, the traffic percentage is set to 0. For values above $R_{\text{hard}}$, the traffic percentage is set to 100. For values in between, we interpolate between 0 and 100, depending on where the value lies with respect to $R_{\text{soft}}$ and $R_{\text{hard}}$. Through different iterations, we found that directly using $D(t)$ for setting the traffic percentage leads to unstable offloading. Hence, we define $T(t)$ in Equation (4) as a more stable way of updating the traffic percentage, where $w_{\text{in}}$ is the inertia factor, which measures how much $D$ influences $T$.
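One possible reading of this offloading strategy is sketched below in Python. Several details are assumptions rather than facts taken from the text: the decayed sum is normalized by its weights so that it stays on the same scale as the soft and hard limits, the weight of a ratio that is `i` steps old is taken to be `exp(-w_decay * i)`, the interpolation between the limits is linear, and the inertia update is a convex combination in which a larger `w_in` keeps the previous percentage longer. All function and parameter names are hypothetical, and percentiles are estimated with the standard library's `statistics.quantiles`.

```python
import math
from statistics import quantiles


def percentile_ratio(latencies):
    """Ratio of the 95th to the 50th percentile of one window of positive latencies."""
    cuts = quantiles(latencies, n=100, method="inclusive")  # 99 cut points
    return cuts[94] / cuts[49]


def smoothed_ratio(ratios, w_decay):
    """Exponentially decayed average of recent ratios; ratios[0] is the newest.

    Assumption: weights exp(-w_decay * i), normalized to sum to one.
    """
    weights = [math.exp(-w_decay * i) for i in range(len(ratios))]
    return sum(w * r for w, r in zip(weights, ratios)) / sum(weights)


def target_percentage(ratio, r_soft, r_hard):
    """Map a smoothed ratio to the intended share of traffic for the Cloud."""
    if ratio <= r_soft:
        return 0.0
    if ratio >= r_hard:
        return 100.0
    return 100.0 * (ratio - r_soft) / (r_hard - r_soft)


def next_traffic_percentage(previous, target, w_in):
    """Inertia update (assumed form): blend the previous percentage with the target."""
    return w_in * previous + (1.0 - w_in) * target
```

Under these assumptions, a smoothed ratio halfway between the two limits yields a target of 50%, and with `w_in = 0.8` the applied percentage moves only one fifth of the way towards that target per step, which dampens oscillations at the cost of a slower reaction.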

PLATFORM EVALUATION

4.1 Experimental Setup
We utilized four Raspberry Pi 3B+ devices, a low-power x64 edge device, and a cloud virtual machine (VM) to establish our platform. The platform was developed and tested on Knative Serving 1.7 and Kubernetes 1.24. Our experiments involved four real-world workloads: matrix multiplication (MatMult), image processing (Image Proc.), random I/O, and a combination of these three loads (Mixed). These workloads place varying demands on CPU, memory, disk, and network resources during execution, making them suitable targets for evaluating our platform's capabilities. We generated requests for these workloads using a specialized experiment runner, which allowed us to control the request rate: initially, we used a low request rate, then increased the rate linearly to a high request rate, which was maintained until the end of the run. The rates were chosen such that the low request rate could be handled entirely by the edge devices, whereas the high request rate would overload the edge devices without offloading. Additionally, we employed an edge scheduling strategy to determine how traffic was distributed between the Edge and the Cloud. This strategy could be configured to various distribution levels (0%, 25%, 50%, 75%, 100%, and auto) in our experiments, as detailed in Table 2. These settings enabled us to assess the impact of different workload types and edge scheduling strategies on the platform's performance. We measured several performance metrics, including average response time, CPU and memory utilization of the edge devices, and network utilization of the cloud VM.

4.2 Results and Evaluation
Figure 2 illustrates the average latency of the responses, the average CPU and memory utilization of the edge devices, and the network utilization of the cloud VM, all measured over the course of each experiment run.
Latency. The offloading functionality considerably improves response times. For instance, our solution initially exhibits slower reaction times, leading to reduced response times. However, as more requests are offloaded to the Cloud, the response time converges towards the lower bound given by our edge network conditions as soon as the request rate is increased (see Figure 2 (Latency)).
CPU. Our results indicate that our solution can effectively offload excessive workload from the Edge cluster but may also underserve many requests (see Figure 2 (CPU)). As our algorithm optimizes the response times of workloads, CPU utilization is reduced as a consequence. However, determining the optimal level of CPU utilization is more complex. If the CPUs of the edge devices are not fully utilized while requests are being forwarded to the Cloud, the Edge is not used efficiently. On the other hand, reserving spare CPU resources could benefit the Edge cluster when they are needed for possible bursts of requests or for system operations. Finally, our test bed uses Raspberry Pi 3B+ devices, which have low CPU power; more powerful edge devices, such as Nvidia Jetson, might need further analysis and investigation in order to optimally balance where requests should be served from.
Memory. We find that, in most cases, memory utilization remains within reasonable bounds. Our initial assessment of the workloads suggested that we would observe a significant increase in memory utilization for both image processing and matrix multiplication. While the latter did show a noticeable increase in memory consumption, the former showed a much smaller effect, performing similarly to the random I/O workload, for which we estimated no increase in memory (see Figure 2 (Memory)).
Network. Bandwidth saturates for image processing and matrix multiplication when all requests are offloaded to the Cloud; however, the mixed workload never reaches the maximum of 100 MB/s. Similarly, our network ingress findings indicate that image processing and matrix multiplication are constrained by the Edge-to-Cloud network bandwidth at full offloading. When the network is the bottleneck (which is the case for the matrix multiplication and mixed workloads), offloading does not help. Instead, it may worsen response times, depending on how they compare to response times at the Edge. Additionally, our offloading strategy does not take into account the network latency or bandwidth between the Edge and Cloud environments, and a more sophisticated strategy is required to offload optimally under different network conditions.

CONCLUSION
The topic of serverless computing on the Edge-Cloud continuum is still in its infancy. This paper aims to provide valuable insights into the design of serverless platforms for a unified Edge-Cloud computing environment. We presented our platform, an extension of the serverless platform Knative that enables deployments across edge and cloud environments and offloads requests from edge devices to the cloud. We demonstrated that our approach can deploy a serverless application across edge and cloud environments, and we demonstrated its offloading capability. In the future, we will further explore offloading strategies for resource optimization and performance improvement.

Figure 1: An overview of the system architecture and core components.

Figure 2: Latency, CPU, Memory, and Network panels for each experiment run.

Table 2: For each workload type and traffic split, we present the total number of successful responses.