Abstract
Recent advancements in distributed systems have enabled deploying low-latency and highly resilient edge applications close to the IoT domain at the edge of the network. The broad range of edge application requirements, combined with heterogeneous, resource-constrained, and dynamic edge networks, makes it particularly challenging to configure and deploy such applications. Moreover, the lack of elastic capabilities at the edge makes it difficult to operate them under dynamic workloads. To this end, this article proposes a lightweight, self-adaptive, and decentralized mechanism (DECENT) for (1) deploying edge applications on edge resources and on premises of Edge-Cloud infrastructure and (2) controlling elasticity requirements. DECENT enables developers to characterize their edge applications by specifying elasticity requirements, which are automatically captured, interpreted, and enforced by our decentralized elasticity interpreters. In response to dynamic workloads, edge applications automatically adapt in compliance with their elasticity requirements. We discuss the architecture and processes of the approach, as well as an experiment conducted on a real-world testbed to validate its feasibility on low-powered edge devices. Furthermore, we show performance and adaptation aspects through an edge safety application and its evolution in the elasticity space (i.e., cost, resource, and quality).
1 INTRODUCTION
The Internet of Things (IoT) has diffused prominently into society in recent years. A wide range of services is designed on top of IoT technologies in various industries such as Industrial Manufacturing, Healthcare, and Smart Buildings. Simultaneously, cloud-based solutions are no longer sufficient to satisfy the stringent requirements (i.e., low latency and high availability) of safety-critical and real-time IoT services. To bridge this gap between cloud and IoT entities, new computational resources named edge devices are being introduced at the edges of networks, providing low-latency services and enhancing privacy within IoT infrastructures [25]. Edge devices are essentially low-powered computers located in edge networks, close to the data source, that is, to IoT domains (i.e., sensors, actuators, etc.). In this context, edge devices can process the data streams flowing into an IoT system. Notably, achieving this requires deploying and running various analytic or decision functions in edge networks [8]. For instance, a scheduler decides whether incoming data must be processed at an edge network or forwarded to a cloud infrastructure. Thus, many operational and business challenges can be solved by running decision-making functions on edge resources.
As a newly introduced paradigm, Edge Computing [25] is a key enabler of IoT proliferation. In contrast to cloud infrastructures, edge networks are resource-constrained environments: essentially, sets of heterogeneous edge devices connected in a peer-to-peer manner. Such devices usually have limited resources in terms of computational and storage capabilities. The wide range of resources available at the edge has introduced new opportunities, such as deploying low-latency, privacy-aware, and resilient edge applications (e.g., IoT applications). Despite the many benefits edge devices introduce, analyzing high-volume IoT data streams on a single device through monolithic applications poses significant limitations and challenges in terms of processing capabilities, storage, energy, and communication bandwidth. To that end, modern applications are no longer monolithic [2]; such edge applications (i.e., services) are divided into a set of independently deployable software components (i.e., microservices) and distributed over edge resources or on premises of Edge-Cloud infrastructure.1 Similarly, resource management techniques need to be designed as decentralized systems to run in resource-constrained environments. This brings completely new challenges, where novel lightweight resource management techniques are needed to fully utilize the resources available at edge networks. Nonetheless, the broad range of requirements concerning latency, Quality of Service (QoS), or fault tolerance, combined with edge networks’ heterogeneous and dynamic nature, makes it particularly challenging to manage, configure, deploy, and operate such applications.
Over the past few years, researchers have focused widely on proposing resource allocation techniques at the edge [24]. However, less attention has been given to providing elastic features at the edge [14, 17]. In most cases, elasticity refers to a system’s capability to adapt to workload changes by (de)provisioning resources in an automatic manner [13]. Resource demands of a particular running application or component may change over time, which can cause poor overall performance and higher latency than the expected response time. For instance, consider a scenario where a health application running in an edge network (e.g., a smart home) monitors residents’ health by processing data streams created by the users’ smartwatches. At time \( t_{0} \), a single edge device has sufficient resources to monitor only one resident’s health. At time \( t_{1} \), there is more than one resident, and the edge device may not have enough resources to process the produced data for all residents. To avoid such situations, edge applications should scale over edge resources or on premises of Edge-Cloud infrastructure. Therefore, introducing elasticity features at the edge is crucial.
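The notion of (de)provisioning under changing workloads can be made concrete with a small sketch. The following Python snippet is our own simplified illustration (not part of DECENT): it computes how many component instances a workload needs, mirroring the smart-home scenario above.

```python
def required_instances(active_streams: int, streams_per_instance: int) -> int:
    """Number of component instances needed for the current workload.

    active_streams: data streams to process (e.g., one per monitored resident)
    streams_per_instance: streams one instance on a single edge device can handle
    """
    if active_streams <= 0:
        return 1  # keep a single instance alive to accept future streams
    # ceiling division without math.ceil
    return -(-active_streams // streams_per_instance)

# At t0, one resident produces one stream and a single device suffices; at t1,
# three residents exceed one device's capacity, so the application should
# scale out over edge resources or on premises of Edge-Cloud infrastructure.
```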
The elasticity concept is deeply rooted in cloud computing and is often considered one of the main features of the cloud paradigm [7]. In Edge Computing, very few works propose methods for controlling application elasticity [10, 12, 23, 27]. Current approaches exhibit several limitations: they are built as centralized systems, are application specific, and scale applications only by considering hardware resources and their capacity to scale. Furthermore, centralized approaches are sensitive to edge system characteristics (i.e., resource constrained, dynamic, and uncertain). The dynamic nature of edge networks thus requires continuously re-evaluating placement decisions for edge functions. Nevertheless, in our conception, besides resource requirements, elasticity in three-tiered infrastructures should also target their relations with the different types of costs and quality.
To address the aforementioned challenges, we propose a lightweight framework called Decentralized Configurator for Controlling Elasticity in Dynamic Edge Networks (DECENT) and its runtime mechanism for controlling elasticity requirements in edge applications. DECENT enables deploying and scaling edge applications in dynamic edge networks and on premises of Edge-Cloud infrastructures. Essentially, the developer defines application requirements, elasticity requirements, and a scaling model for each edge application and its components. DECENT interprets these requirements, deploys components in the three-tier architecture, and enforces various scaling operations at runtime to fulfill edge application demands. The system we propose enables easy configuration, deployment, and operation of edge applications on top of heterogeneous edge infrastructure (i.e., edge network). Furthermore, DECENT is a self-adaptive and decentralized mechanism that can be easily deployed and run on low-powered edge devices.
Our concrete contributions are as follows:
We enable application developers or domain experts to specify high-level elasticity requirements for their edge applications in a declarative way. Besides that, the user specifies the deployment and scaling model inherent to a specific infrastructure configuration. To specify edge application elastic requirements, we consider the declarative language called Simple Yet Beautiful Language (SYBL) [6]. We extend the language and focus on developing novel constraints and enforcement strategies to support edge application characteristics. In our conception, besides resource requirements, elasticity in three-tiered architectures should also target their relations with the different types of costs and quality.
We extend the prototype of [21] with a lightweight mechanism that enables deploying and controlling the elasticity of edge applications in a decentralized manner on an edge network. Edge devices that run application components capture and interpret their elastic requirements through Elasticity Interpreters and report to the configurator device whether the requirements are violated. The configurator device takes action and re-configures the application to meet the specified elastic demands.
To validate the approach’s feasibility, we perform an experimental evaluation that shows that an edge application and its components can easily scale and be controlled by the mechanisms deployed at the edge of a network. Our prototype is evaluated on a real-world testbed composed of several low-powered edge devices.
The rest of the article is structured as follows. Section 2 gives an overview of the platform along with a running example of the article. Related work is considered in Section 3. Section 4 describes edge applications and system modeling, along with describing application requirements. In Section 5, we describe in detail the DECENT framework, the processes, and details regarding prototype implementation. Evaluation and results are presented in Section 6. Finally, Section 7 concludes the article and outlines future work directions.
2 BACKGROUND AND RUNNING EXAMPLE
This section outlines the domain for which we have developed our system. We give a short overview of our previous work on top of which the proposed approach is built. Afterward, we present our motivational scenario through a running example.
2.1 Background
This article substantially extends our previous works [20, 21] with a new elasticity controlling module and its lightweight runtime mechanism. The former introduces the platform and system architecture enabling automatic discovery of heterogeneous resources (i.e., computational, sensing, context data) in edge networks [20].
One prominent approach that has recently emerged is to combine edge, fog, and cloud infrastructures to provide low-latency services [8]. The edge and fog paradigms provide similar features: both foresee enabling more computation resources near end users and IoT domains. However, the most significant difference between the two tiers lies in their administration and responsibilities. We acknowledge that Edge Computing means different things to different people; we envision Edge Computing as a bridge between IoT devices and the edge device nearest to a user. Furthermore, fog devices may provide much more powerful resources and services for larger geographical areas. For instance, smart transportation systems may benefit from connecting and processing vehicle data in fog infrastructure. Nonetheless, both paradigms aim to provide low-latency services, since end devices are closer to the source where the data is produced and consumed.
Furthermore, the three-tier infrastructure architecture offers a compelling opportunity for deploying various applications (e.g., industrial, health, etc.) where low latency, QoS, reliability, and scalability are critical requirements. It enables the distribution of application components over edge, fog, and cloud resources. However, in the past few years, researchers in the field of edge and fog computing have mostly focused on proposing centralized techniques for scheduling, controlling, and monitoring IoT applications deployed at the edge. In fact, such functions are deployed on powerful devices such as local servers or cloud devices [28]. Nonetheless, as we explore new IoT systems and the heterogeneous and dynamic nature of edge networks, distributing system components among various computation entities becomes an increasingly inevitable requirement. As a result, shifting various system functions closer to edge networks and dynamically placing them on the most suitable devices is crucial (see Section 5). Deploying decision mechanisms at the edge in a decentralized manner makes edge networks autonomous environments and less dependent on centralized devices. To that end, in our previous work, we proposed an efficient approach that solves the placement problem of the configurator on the most suited (e.g., most powerful) edge device in a given dynamic edge network [21].
The latter introduces the technical framework and a solution to build and organize devices in edge networks such that the resource discovery complexity can be handled [21]. The proposed framework implements the configurator placement approach and enables system designers to define and configure their edge networks. More specifically, edge devices in edge networks are organized into clusters. Each formed cluster has a cluster coordinator and one global coordinator (i.e., configurator) of an edge network. Both coordinators aim to provide various functionalities to support resource discovery, and they act similarly as superpeers [15] at the edge. All coordinators are placed dynamically on the most suited edge devices in an edge network. Nevertheless, the dynamic nature of edge networks necessitates continuously re-evaluating placement decisions for coordinators. Thus, using a self-adaptive and decentralized configurator aims to solve such challenges at dynamic edge networks. In this article, our proposed approach enables and supports the execution of edge applications on various edge networks and maintains their correct functionality throughout the execution time. The proposed elasticity control mechanism maintains the correct functionality of edge applications by considering multiple elastic perspectives (i.e., quality, cost, and resources). The proposed elastic mechanism is an extension of the introduced framework in [21].
2.2 Running Example
To motivate our subsequent discussion, we consider emergencies such as natural disasters (e.g., earthquakes, fires, floods) in a city. Emergencies like earthquakes may affect various city zones, damaging infrastructure, causing injury or loss of life, and trapping people under buildings. In such situations, time is valuable, and drones may be used to analyze the entire situation and help rescue teams find and communicate with victims under a collapsed building. In this scenario, we consider multiple connected drones (i.e., forming an edge neighborhood) flying over the city’s affected areas, aiming to provide services that help the rescue team find victims under a collapsed building. Each drone (i.e., edge device) is equipped with various computation capabilities and integrated sensors (e.g., radar sensors, infrared cameras, electronic nose, etc.). We consider drones to be multi-purpose devices on which the rescue teams may request to deploy various services depending on the emergency. Meanwhile, base stations may provide computational and storage capabilities (i.e., fog devices) and offer docking stations for charging drones. At the same time, cloud capabilities may be used for long-term data storage.
Referring to the situation illustrated in Figure 1, we assume that a rescue team deploys (1) the lost-person service (i.e., an edge safety application) in the affected area. Such a service aims at helping rescue teams solve missing person cases faster by finding their location in the affected zones (2). The service depends on camera resources, which are integrated into various drones. Specifically, the lost-person service consists of components responsible for specific tasks (i.e., front-end, image processing, generating results, storing results, etc.). The service takes as input images provided by the flying drones in the affected area. Every second, each drone generates multiple images to be processed by the service. However, as the number of drones grows, the number of generated images increases (5). To that end, application components may require more computing resources to process images and fulfill application requirements. For instance, consider that the front-end component that accepts drones’ images has a response time requirement of less than 100 ms. When the response time requirement is violated (i.e., exceeds 100 ms), the service component must scale to multiple instances on edge or on premises (3–4) to meet the desired service quality. Thus, it is evident that to meet service demands at runtime and to avoid resource over-provisioning/under-provisioning, we require a lightweight mechanism that dynamically controls application elasticity at the edge. Furthermore, we assume that the service running in an edge neighborhood is accessible by users within the range covered by the devices.
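The scaling trigger in this scenario boils down to a threshold check on the observed response time. The sketch below is a hypothetical illustration; the function name, scaling step, and instance cap are our own assumptions, not the service's actual interface.

```python
def front_end_scale_decision(response_time_ms: float, instances: int,
                             limit_ms: float = 100.0,
                             max_instances: int = 10) -> int:
    """Return the new instance count for the front-end component.

    Adds one instance (on edge or on premises) whenever the observed
    response time violates the requirement, up to a fixed maximum.
    """
    if response_time_ms > limit_ms and instances < max_instances:
        return instances + 1
    return instances
```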
Fig. 1. The lost-person service.
3 RELATED WORK
Research efforts associated with elasticity at the edge are still at a relatively early development stage. Elasticity features in edge infrastructures have mostly focused on scaling resources up/down to meet application demands. In some approaches, tasks/services are reallocated when devices are overloaded [26]. However, such practices incur an overhead in resource usage, cost, and energy consumption. Even though resource over-provisioning can be considered feasible in resource-rich environments such as the cloud, such an assumption is highly impractical in resource-constrained edge networks. More specifically, reserving more resources than needed to support the intended task workload wastes available resources.
Very few approaches address these challenges at the edge. Furst et al. [12] introduce a framework that enables services to self-adapt and meet the current demands of their Service-Level Objectives (SLOs). To achieve such adaptation dynamically, a novel programming model called Diversifiable Programming (DivProg) uses function annotations as an interface between the service logic, its SLOs, and the execution framework. Essentially, a third-party execution framework captures the service configuration given by the developer through DivProg, interprets it, and scales services to conform to changing SLOs. Tseng et al. [27] provide a lightweight autoscaling mechanism for fog computing in industrial applications. Lujic and Truong [16] propose a novel, holistic approach for architecting elastic edge storage services, featuring three aspects: data/system characterization (e.g., metrics, key properties), system operations (e.g., filtering, sampling), and data processing utilities (e.g., recovery, prediction). The authors of [23] discuss how applications for a fog infrastructure can be packaged into containers and behave elastically. Their approach is built on top of the container orchestration tool Kubernetes and extends it to the fog. In [29], the authors investigate the benefits of virtualization for moving and redeploying mobile components to fog devices near the targeted end devices. Using geometric monitoring, the approach dynamically scales and provisions the resources of the fog layer. Furthermore, several edge platforms such as EdgeX Foundry,2 AWS IoT Greengrass,3 and Google IoT Edge4 promise to bridge the gap between IoT and the cloud by providing a flexible runtime for applications running at the edge.
Any Edge-Cloud system’s goal is to hide the complexity of deploying and operating edge applications in heterogeneous edge networks and to enable developers to specify application requirements in a declarative way. A domain-specific language (DSL) specifies the high-level constraints of edge applications, such as QoS, application criticality, and elasticity requirements; essentially, it makes it easy for users to develop these specifications. Understanding the current and future requirements of edge applications from various domains remains a prominent challenge. A platform for the described IoT scenarios (e.g., as in our running example) needs to hide this operational complexity from application developers. In particular, programmers should not have to worry about the distribution of data or the heterogeneous capabilities of edge and cloud resources. Developers should be able to express, in a high-level way, the context in which applications are allowed to run and their elasticity requirements [7]. The platform should then take care of resource provisioning and data movement. However, this requires a programming model and API that are intuitive for developers yet expressive enough to help the execution platform make runtime decisions on scheduling edge application components. In this article, we consider SYBL [6] to specify elastic requirements in terms of resource, cost, and quality. SYBL enables developers to specify elasticity requirements (i.e., constraints, monitoring, strategies, and priorities) at design time and enables scaling edge applications in an elasticity space. Service-Level Objectives for Next-Generation Cloud Computing (SLOC) is another novel elasticity framework, which promotes a performance-driven, SLO-native approach to cloud computing and, by extension, to Edge-Cloud environments [22]. The SLOC elasticity policy language considers elastic dimensions (i.e., resource, cost, and quality) similar to those of SYBL.
Finally, our work is an effort to advance the current state of Edge Computing platforms and enable more straightforward configuration, deployment, and operation of edge applications on top of heterogeneous edge infrastructure. The above-mentioned systems are extremely limited in their operational capabilities and lack the self-adaptive mechanisms required in dynamic edge and IoT settings. In essence, such systems assume static configurations that do not change over time, provide no way to specify elasticity or QoS requirements, and have no mechanism to enact them. Our proposed approach aims to bridge this gap and ensure multi-dimensional elasticity control (i.e., cost, resources, and quality) for fulfilling the demands of edge applications deployed on Edge-Cloud infrastructure. It enables edge applications and their components to adapt in response to dynamic changes in their workload. Finally, we allow developers to easily define elasticity requirements that are captured and enforced by our lightweight mechanism in a decentralized manner.
4 EDGE MODELING AND ELASTIC REQUIREMENTS
This section formally defines the concepts of edge applications, the edge system, the deployment and scale policy, and elasticity requirements. First, we model edge applications and the system. Then, we describe application deployment and scaling models in Edge-Cloud architectures. Finally, we extensively explain application elasticity requirements, which enable developers to characterize their edge applications.
4.1 Edge Application and System Model
A service-based edge application
Fig. 2. Edge application model.
Each component
The Edge-Cloud architecture consists of edge infrastructures (i.e., multiple edge devices connected in a peer-to-peer manner form an edge infrastructure), a fog infrastructure, and a cloud infrastructure. As mentioned in Section 2, our approach is built on top of an edge network organized as a Distributed Hash Table (DHT) network [19]. We assume that every edge device trusts all other devices and can establish a direct communication link with them, since they all belong to the same local administrative domain. Furthermore, in this work, our primary focus resides in edge infrastructures, while the fog and cloud infrastructures are considered as Infrastructure as a Service (IaaS) [18]. We assume that the system designer configures the edge network to connect to the IaaS services.
Executing components in heterogeneous environments (i.e., the edge, fog, or cloud tier) is crucial. For instance, an application with multiple components can scale to multiple instances running at different locations in either edge or cloud, depending on current demands and the application’s constraints. To overcome the challenges introduced by heterogeneous environments, we consider Docker5 as our homogeneous application runtime platform, following the “build once, run anywhere” model. A Docker container represents a lightweight, stand-alone, executable package that contains everything needed to run a given component. The application runtime is essentially responsible for executing edge applications (i.e., container based) on edge devices or on premises. Thus, to deploy edge applications in Edge-Cloud infrastructure, components are packaged in individual Docker containers.
4.2 Deployment and Scaling Models
In our conception, edge applications can be thought of as a set of deployable software components running on premises of the Edge-Cloud infrastructure. Thus, edge application components can be deployed and scaled according to the following models [3]:
Everything in the Cloud: All application components are deployed in the cloud. Essentially, this model is suitable when applications require significant computation and storage capabilities.

Everything in the Edge: In this model, application components are distributed across the available devices at the edge of the network. In essence, edge networks may exist in different settings, from private edge networks (e.g., a smart home) to enterprise edge networks (e.g., industry, health, etc.) and public edge networks (e.g., a smart city). Furthermore, researchers use two terms interchangeably for the available devices at this layer: edge and fog devices [8]. This study refers to an edge device as low powered and resource constrained (i.e., lower computation and storage capabilities). In contrast, a fog device is much more powerful than edge devices but less powerful than the cloud.

Hybrid Edge-Cloud: In this model, application components are distributed across available resources at the edge, fog, and cloud. Essentially, the hybrid model enables deploying and executing applications with low-latency requirements and resource-demanding processes.
Each mentioned model has different characteristics in terms of cost, latency, privacy, and other quality properties [3].
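As a rough illustration of how these characteristics might drive model selection, the following sketch picks one of the three models from two coarse criteria. The criteria and the decision rule are our own illustrative assumptions, not DECENT's planning algorithm.

```python
def choose_deployment_model(latency_sensitive: bool, compute_heavy: bool) -> str:
    """Pick one of the three deployment models from two coarse criteria."""
    if latency_sensitive and compute_heavy:
        # latency-critical parts at the edge, resource-demanding parts in the cloud
        return "Hybrid Edge-Cloud"
    if latency_sensitive:
        return "Everything in the Edge"
    return "Everything in the Cloud"
```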
4.3 Elasticity Requirements
Elasticity properties at the edge are crucial for executing edge applications and fulfilling their dynamic resource demands. In Edge-Cloud architectures, elasticity targets not only resources and their capacity to scale but also their relationships with various forms of costs and quality [7]. In this context, multiple stakeholders may be involved in specifying elastic requirements. For instance, the developer could specify that the latency between application components must not exceed 20 ms, without specifying how many resources should be used to achieve the desired state. An edge network provider could specify its resource utilization schema; for example, when overall utilization at the edge is higher than 90%, applications may scale toward the fog or cloud infrastructure. To enable such features, we consider a declarative language called SYBL, which allows users to specify an edge application’s elasticity requirements at design time [6]. We extend the language and focus on developing novel constraints and enforcement strategies to support the characteristics of edge applications running on Edge-Cloud infrastructures. In addition, we extend and optimize the language runtime engine to support controlling Docker-based edge applications and to enable execution on low-powered edge devices. We also provide a time-based mechanism that analyzes the workloads generated by incoming requests to avoid unnecessary scaling operations. More specifically, the time-based mechanism checks for a configurable period (e.g., 20 seconds) whether the increased load on a particular component can be handled without executing any scaling operation. This feature is useful when the increased or decreased load on a component is occasional rather than persistent.
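The time-based mechanism can be sketched as follows: scaling fires only if a constraint remains violated throughout the whole observation window, so occasional spikes are ignored. This is a simplified model under our own assumptions, not DECENT's actual implementation.

```python
import time

class ViolationWindow:
    """Trigger a scaling operation only on persistent constraint violations."""

    def __init__(self, window_seconds: float = 20.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock          # injectable for testing
        self.violated_since = None  # time of the first uncleared violation

    def observe(self, violated: bool) -> bool:
        """Record one monitoring sample; return True if scaling should fire."""
        now = self.clock()
        if not violated:
            self.violated_since = None  # occasional spike: reset the window
            return False
        if self.violated_since is None:
            self.violated_since = now
        return now - self.violated_since >= self.window
```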
SYBL enables users (i.e., developer or system user) to specify application elasticity requirements in a declarative way represented in the form of (1) monitoring (i.e., specifying which metrics to monitor), (2) constraints (i.e., specifying the limits in which the monitored metrics can oscillate), (3) strategies (i.e., specifying actions to be followed in case the constraint is violated or becomes true), and (4) priorities (i.e., specifying constraints with higher priority than the other ones). The user can specify elastic requirements at different levels of edge application. Thus, elasticity controls can be achieved at the (1) edge application level (i.e., specifying high-level application elastic requirements) and (2) edge component level (i.e., specifying low-level application elastic requirements). Listing 1 shows an example of elasticity requirements specified by the user. At the edge application level, the user may specify the maximum cost allowed for the entire edge application executed in an Edge-Cloud infrastructure. The user could specify that the application needs to scale down when the cost is high and CPU usage is below 20%. Or, when the cost is below the predefined value (e.g., 5 euros) and the CPU usage is higher than 80%, the application needs to scale up. At the edge component level, for instance, elasticity requirements from the developer side can be applied regarding the quality, such that, e.g., if the edge device battery is less than 10%, the component must scale up to avoid application failure.
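For illustration, requirements of this kind might be encoded roughly as follows; the field names, metric names, and thresholds are our own hypothetical choices and do not reproduce SYBL's concrete syntax.

```json
{
  "application": "lost-person-service",
  "requirements": [
    { "level": "application", "type": "constraint",
      "spec": "totalCost < 5 EUR" },
    { "level": "application", "type": "strategy",
      "spec": "scaleDown WHEN totalCost IS high AND cpuUsage < 20%" },
    { "level": "application", "type": "strategy",
      "spec": "scaleUp WHEN totalCost < 5 EUR AND cpuUsage > 80%" },
    { "level": "component", "component": "front-end", "type": "strategy",
      "spec": "scaleUp WHEN battery < 10%" }
  ]
}
```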
The SYBL elastic requirements can be easily injected/integrated into various description languages. For instance, the elastic requirements can be easily integrated into the cloud application description language TOSCA standard,6 docker-compose files (i.e., YAML), or JavaScript Object Notation (JSON), or specified separately through XML descriptions. In the current version of our prototype, we specify edge application elastic requirements in a JSON file. Nevertheless, future work remains to provide a mechanism that will inject elastic requirements easily in the YAML file (i.e., since our application runtime platform considered is Docker).
Listing 1. An example of elastic requirements.
5 DECENT - DESIGN AND PROCESSES
This section provides an overview of our approach to deploy service-based edge applications and control their elasticity on an edge network. We extensively outline the main components of DECENT and the interaction of its components during runtime.
5.1 System Overview
An overview of our approach at design time and runtime of the system is illustrated in Figure 3. The developer defines the edge application model and its requirements at design time, as described in the previous section. In essence, the developer defines the application structure and resource requirements for each component. The elasticity requirements and deployment policy can be determined by the developer as well as by the system user (i.e., owner) before deployment. Along these lines, the deployment process starts when the user requests the system’s configurator device to deploy an edge application. Exploring the module that enables users to interact with the system is out of this article’s scope.
Fig. 3. Overview of DECENT’s components and their interaction during runtime.
Each edge device consists of similar system components, as illustrated in Figure 3. However, the edge device that becomes the system’s configurator provides the main features to control edge applications and runtime aspects. The architecture of the approach comprises five main modules, as described in the following.
At system design time, the Orchestrator is configured with the number of swarm manager devices that should be maintained at runtime and the expected size of the edge network. The system designer should consider the tradeoff between performance and fault tolerance when defining the number of swarm managers: having more swarm manager devices makes the system more fault tolerant, while write performance is reduced (i.e., due to network round-trip traffic). We configure an odd number of swarm manager devices to take advantage of swarm mode’s fault-tolerance features. The Orchestrator promotes new swarm manager devices whenever the edge network doubles the expected network size. Notice that we allow a maximum of five managers in an edge network.
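Our own simplified reading of this promotion rule can be sketched as follows; we assume managers are promoted in pairs so the count stays odd, and the exact increment is an illustrative assumption.

```python
def desired_manager_count(initial_managers: int, expected_size: int,
                          current_size: int, max_managers: int = 5) -> int:
    """Swarm managers the Orchestrator should maintain for a network size.

    Starts from the designer-configured (odd) manager count and promotes
    two more managers each time the network doubles its expected size,
    keeping the count odd for quorum and capped at max_managers.
    """
    managers, size = initial_managers, expected_size
    while current_size >= 2 * size and managers + 2 <= max_managers:
        managers += 2  # promote in pairs to keep an odd count
        size *= 2
    return managers
```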
Furthermore, the Orchestrator periodically compares the desired number of swarm managers (i.e., the system designer’s perspective) with the current number. If the desired state is violated, it takes the required actions to maintain the swarm managers’ quorum in the system. Notice that the Orchestrator configuration data (i.e., swarm managers, the swarm cluster joining key, etc.) is stored in a DHT, meaning that it is shared and kept consistent across all devices in the network.
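The promotion rule above can be sketched as follows. This is our reading of the design, not the actual implementation: the starting manager count and the step of two (to keep the manager number odd) are assumptions.

```python
def desired_manager_count(network_size, expected_size, base_managers=3, cap=5):
    """Sketch of the Orchestrator's promotion rule (an interpretation, not the
    actual implementation): each time the edge network doubles its expected
    size, promote additional managers, keeping the count odd and capped at
    five, as described in the text."""
    managers, size = base_managers, expected_size
    while network_size >= 2 * size and managers < cap:
        size *= 2
        managers += 2  # stay odd so a majority quorum is always well defined
    return managers
```

For instance, with an expected size of 10 devices, a network that grows to 25 devices would run five managers under this sketch; a larger network stays at the five-manager cap.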
5.2 The Process
Edge applications are multi-container Docker-based applications. The developer defines the components that make up an edge application, including their hardware requirements, in the docker-compose file. Each edge application and each of its components receives a unique name at deployment. Furthermore, at design time, the user specifies the deployment and scaling model and the elastic requirements (as presented in Listing 1). Both kinds of requirements are formatted and stored as a single JSON file. Thus, we assume that the mentioned requirements are given at design time.
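Since Listing 1 is not reproduced here, the following shows only a plausible shape for that single JSON file; all field names are hypothetical and may differ from the actual specification.

```python
import json

# Hypothetical layout of the combined deployment/elasticity requirements file;
# the concrete field names in Listing 1 may differ.
requirements = {
    "application": "edge-safety-app",
    "deployment": {"model": "edge-only", "scaling": "horizontal"},
    "elasticity": [
        {"id": "Co1", "metric": "cpu", "op": ">", "threshold": 80,
         "action": "scale-out", "priority": 1},
    ],
}

blob = json.dumps(requirements)   # stored as a single JSON file
loaded = json.loads(blob)
print(loaded["elasticity"][0]["id"])  # -> Co1
```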
The process starts when the user requests (1) the configurator device to deploy an edge application at an edge network (as illustrated in Figure 4). At this phase, the deployment planner interacts with the Orchestrator to get the current hardware infrastructure status, available edge devices, resource information, resource utilization rates, and latency of the communication links between devices ((2)–(3)). Afterward, it gets the edge application docker-compose file and specified hardware requirements and generates all eligible deployment plans (3). To generate such plans, it runs the algorithm presented in [4]. In essence, for each application component, the
Fig. 4. The process at runtime.
Suppose the
Through the Orchestrator module, the configurator shares elastic requirements with edge devices (5). Elastic requirements are shared via the DHT. Each edge device automatically identifies when the configurator assigns a container (e.g., named \( \varphi _1 \)) to it. Thus, when \( \varphi _1 \) is in the running state, the elasticity interpreter on each device queries the DHT to receive the elastic requirements of the running applications. The user can change elastic requirements at runtime; changes made on the configurator device are automatically propagated to the other edge devices and captured by the corresponding Elasticity Interpreters. Afterward, before starting elasticity monitoring (7), the elasticity interpreter first checks whether it is the only device running \( \varphi _1 \). The configurator device provides this information (6), which is required to avoid situations where multiple elasticity interpreters start monitoring \( \varphi _1 \) (i.e., when \( \varphi _1 \) runs on several devices). Now consider a situation in which an elasticity interpreter monitors a constraint stating that the edge application component must scale up when it uses 80% of the edge device’s CPU (e.g., see Listing 1). When the specified constraint is violated, the interpreter communicates it to the
The
In case the configurator device fails, the edge devices hold an election to determine a new configurator device, as presented in [21]. Until a new configurator device is elected, elasticity interpreters contact the other swarm managers to enforce scaling operations. Since all edge devices keep consistent data through DHTs, the newly elected configurator is initialized quickly from the locally stored data. Moreover, each device in the network knows the system’s current configurator device at any time. Note that the aspects mentioned above are primarily addressed in our previous works (as discussed in Section 2) and are not evaluated in this article. Furthermore, edge applications running at the edge network are not affected by a possible configurator failure.
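The guard described in Section 5.2 — only the single interpreter responsible for \( \varphi _1 \) reports a CPU-threshold violation — can be sketched as follows; this is a simplification of the process under our reading of it.

```python
def should_report(is_sole_interpreter, cpu_percent, threshold=80.0):
    """Report a constraint violation only if this device is the single
    interpreter elected for the container AND the CPU constraint is breached.
    This avoids duplicate scaling actions when the container runs on several
    devices, as described in the text."""
    return is_sole_interpreter and cpu_percent > threshold
```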
6 EVALUATION
This section first presents details about the prototype implementation, setup environment, and limitations. Furthermore, we experimentally evaluate the approach’s effectiveness and present the evolution of an edge application in elasticity space. We conclude with a discussion in Section 6.3.
6.1 Prototype Implementation, Setup, and Limitations
To assess the proposed approach, we extend the prototype of [21] with a lightweight mechanism that enables deploying and controlling the elasticity of edge applications in a decentralized manner at an edge network. The prototype is a partial implementation written in Java. It is tested in a real environment on edge devices (i.e., Raspberry Pi 3 Model B V1.2) with a 4 \( \times \) ARM Cortex-A53 CPU at 1.2 GHz, 1 GB of RAM, and 16 GB of disk storage. The prototype is deployed on each edge device, and each edge device runs the Docker Engine as the edge application runtime platform. To implement the deployment generator, we refined and extended parts of the FogTorch\( \Pi \) [4] simulator to generate all eligible deployment plans for an edge application. However, the simulator does not consider dynamic environments or runtime aspects, does not provide elasticity features, does not implement monitoring tools, and does not implement any communication protocol between computational entities at the edge. Thus, the extensions we refer to are further functionalities developed to support the runtime aspects of edge applications (e.g., elasticity) in a realistic testbed. Along these lines, the planner is fully integrated into the prototype: it gets the infrastructure state (i.e., available devices, network structure) and generates plans by considering real-time infrastructure-specific metrics.
The
To evaluate our prototype, we used a testbed (i.e., an edge network) composed of 10 edge devices placed close to each other. The edge devices are connected through a wireless connection with nominal download and upload speeds of 10 Mbps and 5 Mbps, respectively. The prototype’s main limitation is that it executes in the Java Virtual Machine (JVM) environment. We acknowledge that the JVM is resource-intensive; however, our aim in this article is to show the approach’s feasibility.
6.2 Use Case, Experiments, and Results
Consider an edge application (i.e., an edge safety application) providing a service as described in our motivation scenario (Section 2.2). The edge safety application is partially developed; it is made up of five components, three of them written in Python (as illustrated in Figure 5). The front-end component \( \varphi _1 \) enables edge devices (i.e., drones) to interact with the service and continuously upload their real-time images, including location coordinates. The Redis component \( \varphi _2 \) collects new images and stores them in binary format. The processing component \( \varphi _3 \) consumes and processes the data (i.e., image analysis) and stores the results in the database component \( \varphi _4 \) (i.e., Postgres). Finally, the results component visualizes the safe path for the rescue team member residing in the affected zone.
Fig. 5. Edge safety application.
Software components are containerized (Docker images). Each container is configured with specific resource requirements (i.e., 1 CPU (1.2 GHz) and 60 MB of memory), which bound the resources the container can use on the hosting edge device. Notice that to reserve and use a varying amount of CPU resources per container, the RPi3 must be upgraded to the latest firmware.12 Furthermore, we assume that the component images are already available on each edge device. This assumption is made due to the latency introduced when images are downloaded from centralized devices. In [1], the authors acknowledge the problem and address relevant aspects to improve deployment time.
Figure 6 illustrates the average time required to generate all possible valid deployment plans for each edge safety application component when it needs to be deployed and scaled. We simulate the generation of deployment plans 10 times and report the maximum average times. For the edge safety application with five components and the given testbed, up to 84,960 valid deployment plans may be generated (maximum average generation time: 6.73 s). Nevertheless, once a single valid plan is found, the process is interrupted, and the application components are deployed or scaled. Notice that when the infrastructure changes, the
Fig. 6. Edge safety application deployment at an edge network with 10 low-powered edge devices.
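The early-interrupted exhaustive search described above can be sketched as follows. This is a simplified first-fit stand-in for the FogTorch\( \Pi \)-based algorithm of [4] (it ignores link latencies), and the device capacities are illustrative.

```python
from itertools import product

def first_eligible_plan(components, devices):
    """Try mappings of components to devices in order and stop at the first
    valid one, mirroring the interruption described in the text: every device
    must have enough free CPU and memory for the components assigned to it."""
    for mapping in product(devices, repeat=len(components)):
        used = {d: {"cpu": 0, "mem": 0} for d in devices}
        feasible = True
        for (name, req), dev in zip(components.items(), mapping):
            used[dev]["cpu"] += req["cpu"]
            used[dev]["mem"] += req["mem"]
            if (used[dev]["cpu"] > devices[dev]["cpu"]
                    or used[dev]["mem"] > devices[dev]["mem"]):
                feasible = False
                break
        if feasible:
            return dict(zip(components, mapping))  # first valid plan wins
    return None  # no eligible plan on the current infrastructure

# Illustrative requirements (1 CPU, 60 MB per container) and device capacities.
components = {"frontend": {"cpu": 1, "mem": 60}, "redis": {"cpu": 1, "mem": 60}}
devices = {"rpi-1": {"cpu": 1, "mem": 500}, "rpi-2": {"cpu": 4, "mem": 500}}
```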
The edge safety application is configured to run and scale only on the devices available at the target edge network. To simplify the scenario, we evaluate the front-end component and show its adaptation process in response to its changing workload at runtime. The front-end is the first component with which drones (i.e., edge devices) interact, uploading their images continuously (i.e., every 1 s).
Figure 7 illustrates the workload generated for the edge safety application and used in the experiment. First, the workload increases linearly every 3 seconds. Afterward, we stress test our approach by creating concurrent requests (i.e., up to 30 requests/s) and examine the front-end component’s behavior at runtime. Such a workload is generated by new edge devices that use the service. For instance, referring to Figure 7, when the component receives up to 50 requests per second, it may be required to scale out to operate correctly, since it may utilize more than 80% of the CPU. Conversely, when the component receives fewer than 50 requests per second, it may need to scale in so as not to overuse resources. Generally speaking, edge applications or their software components may experience various workloads over time (i.e., periodic, continuous, or unpredictable). The given workload is just an example used for testing purposes to show our approach’s ability to automatically control edge application behavior in the elasticity space. To enable elastic behavior for the front-end component, we define the elastic requirements in Listing 2.
Fig. 7. Workload used in the experiment.
Listing 2. An example of elastic requirements: Front-end component.
The elastic requirements given in Listing 2 define the elastic behavior of the front-end component. Strategy St3 states that if the average response latency exceeds 100 ms, the component should scale out to ensure the service’s quality. When Co1 or Co2 is violated, strategies St1 or St2 enforce the specified actions to keep resource utilization within acceptable ranges. As can be noted, the specified metrics are monitored continuously for the front-end component. Furthermore, each constraint may have a different priority. For instance, no matter how much the CPU is utilized, the front-end component must scale if the provided service has higher latency than the specified threshold. To that end, if several constraints are violated simultaneously, Co3 is enforced first, since it is prioritized over both Co1 and Co2.
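The priority rule above can be sketched as follows. The constraint-to-strategy mapping follows Listing 2 as described in the text; the numeric priority values are illustrative, since Listing 2 does not fix concrete numbers here.

```python
def select_strategy(violated, priority, strategy_of):
    """When several elastic constraints are violated at once, enforce only
    the strategy of the highest-priority constraint (e.g., the latency
    constraint Co3 outranks the CPU/memory constraints Co1 and Co2)."""
    if not violated:
        return None
    top = max(violated, key=lambda c: priority[c])
    return strategy_of[top]

# Illustrative priorities; the Co->St mapping is taken from the description
# of Listing 2 in the text.
priority = {"Co1": 1, "Co2": 1, "Co3": 2}
strategy_of = {"Co1": "St1", "Co2": "St2", "Co3": "St3"}
```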
The first graph of Figure 8 shows the CPU utilization of the front-end component under the given workload (see Figure 7). The second graph of Figure 8 shows the front-end component’s adaptation process in response to the workload. Whenever the elastic constraints (i.e., Co1 and Co2) are violated, the front-end component scales up or down. The adaptation occurs automatically in response to the current workload: the front-end component scales to multiple instances (i.e., containers) to provide the desired service quality. Even with the continually changing workload, the CPU utilization of the front-end component remains within the elastic boundaries, ensuring that the desired service quality is always guaranteed. The other important aspect is overcoming resource over-provisioning: when the front-end component’s workload decreases, the number of containers decreases as well (see the second graph of Figure 8). Besides, the front-end component’s memory utilization remained within the elastic boundaries and did not violate the elastic constraints.

The CPU utilization of an edge device may fluctuate up and down very quickly under varying workloads, which may cause undesired scaling operations for the same workload. To overcome this problem, the specified metrics are monitored over a 5-second window, and a scaling operation is enforced only if the mean value violates the elastic constraints. Furthermore, Figure 9 shows the front-end component’s latency over time and the violation of elastic constraint Co3. In this situation, both the Co1 and Co3 constraints are violated. As specified in the elastic requirements, the Co3 constraint is prioritized over Co1; thus, strategy St3 is enforced to keep the latency within the elastic boundary.
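The damping step described above can be sketched as a windowed monitor; the window length of 5 samples and the 80% threshold match the experiment, while the class structure itself is an illustration.

```python
from collections import deque

class SmoothedMonitor:
    """Enforce a scaling operation only when the mean over a short sampling
    window (5 s in the experiment) violates the constraint, so brief CPU
    spikes do not trigger undesired scaling for the same workload."""

    def __init__(self, window=5, threshold=80.0):
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, cpu_percent):
        """Record one sample; return True only once the window is full and
        its mean exceeds the threshold."""
        self.samples.append(cpu_percent)
        if len(self.samples) < self.samples.maxlen:
            return False
        return sum(self.samples) / len(self.samples) > self.threshold
```

A single spike (e.g., one 95% sample among 10% samples) therefore does not trigger scaling, whereas a sustained overload does.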
Fig. 8. The CPU utilization and adaptation process.
Fig. 9. Front-end component latency and adaptation process.
A significant challenge remains in the time required to start containerized components on low-powered edge devices. In our case, starting the edge safety application components takes between 20 and 30 seconds. After a scaling operation is enforced, our approach waits and verifies whether the container has started or shut down. Thus, we avoid undesired situations such as enforcing multiple scaling operations for the same workload. Contrary to the starting operation, the shutdown process takes only a few seconds for all containers. Nevertheless, edge application components can scale vertically and horizontally depending on the available resources at the edge. The application runtime platform (i.e., the Docker Engine) manages this process and scales components within the list of eligible edge devices generated by the deployment planner.
In Figure 10, we show the evolution of the front-end component in the three-dimensional elasticity space (cost, quality, and resources). Quality refers to the latency, resources refer to the allocated CPUs (i.e., edge devices), and the cost is estimated based on the resources allocated. In this case, the cost value is an assumption made to simulate the price paid for resource usage. As can be noted from Figure 10, when the service quality decreases (i.e., the starting point with green dots), the front-end component scales by increasing the amount of resources used, and the cost increases accordingly. Furthermore, the edge application scales down when the service is not used (i.e., the red dot). To that end, this approach ensures that edge application resource demands are met at runtime. The other edge application components evolve in the elasticity space based on their load at runtime. Similarly, the configurator device monitors elastic requirements specified at the edge application level; thus, it considers the overall resource consumption of the edge application’s components. For instance, the user may specify that a particular edge application cannot use more than 50% of the available resources at the edge.
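The mapping of a component’s state to a point in the elasticity space can be sketched as follows; the proportional cost model and the unit price are assumptions, in line with the text’s statement that cost is simulated from the resources allocated.

```python
def elasticity_point(instances, cpu_per_instance, latency_ms, unit_cost=0.01):
    """Hypothetical mapping of a component's state to the (cost, quality,
    resources) elasticity space: resources are the CPUs allocated across all
    running instances, cost is assumed proportional to allocated resources
    (unit_cost is an illustrative price), and quality is the measured
    latency."""
    resources = instances * cpu_per_instance
    return {"resources": resources,
            "cost": resources * unit_cost,
            "quality_ms": latency_ms}
```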
Fig. 10. Evolution of front-end component in the elasticity space.
6.3 Discussion
We have demonstrated through a running example that automatic scaling of edge applications is easily achieved in an edge infrastructure with low-powered devices by using DECENT. Furthermore, we showed that our approach helps avoid highly undesirable situations such as resource over-provisioning, ensuring that the available resources are used whenever edge applications need them. Moreover, elasticity features are crucial in avoiding edge device failures due to resource over-utilization. For instance, a low-powered edge device can quickly fail when an edge application or a software component fully utilizes the device’s resources. Thus, specifying elastic requirements, together with the DECENT mechanism, helps avoid overloading edge devices.
Several assumptions inherent in our approach must be investigated further. In the current prototype, elastic constraints are assumed not to conflict with each other. We acknowledge, however, that a user may specify conflicting elastic constraints, and we plan to investigate techniques that would help identify and avoid such situations. We also focus on developing novel constraints and enforcement strategies for edge applications. Simplifying the development of elastic specifications is among our planned future work: we intend to integrate the language into an IDE such as c-Eclipse and extend it with further functionalities. The c-Eclipse framework provides a user-centric interface through which developers can describe their applications for deployment over edge and cloud. Integrating the language into a development environment will make it easier for users to develop elastic specifications, specify correct values to avoid misconfiguration, and detect conflicting constraints.
Within this article, our primary focus resided in enabling elasticity features at edge infrastructures; thus, we consider only edge applications where all components are deployed on the edge (i.e., the everything-on-the-edge model). Adding cloud or fog devices would expand the overall available resources and allow executing application components (i.e., containers) in those environments when resources at the edge are insufficient. In future work, we will investigate performance aspects of moving edge application components in large-scale Edge-Cloud infrastructures and controlling their elasticity from the edge.
7 CONCLUSION
Satisfying the dynamic and stringent requirements of edge applications has become challenging for resource-constrained edge networks. Even though edge applications can be modeled as multi-component applications, dynamic workloads may cause latencies between application components, IoT devices, and end users that exceed the expected response time. To overcome such challenges, we proposed an efficient solution that simplifies the deployment process and enables elasticity control for edge applications deployed in the Edge-Cloud infrastructure. The developer and the user can characterize edge applications by specifying elasticity requirements that are captured and interpreted by DECENT. The DECENT runtime mechanism then performs complex elasticity controls at the edge of the network.
Edge networks can be different in size and setting; thus, the proposed system is configurable by the system designer. In this article, we consider edge networks as resource-constrained environments composed of low-powered edge devices. The experiments conducted in a realistic testbed showed the feasibility of executing elastic features on low-powered edge devices and adapting edge application components at runtime at the edge. Furthermore, edge applications are executed in a runtime that considers the heterogeneity of edge resources. The proposed framework automatically reconfigures edge applications to meet their specified elastic requirements. In future work, we plan to perform an extensive evaluation of the approach by considering distributed cloud entities in the system. Furthermore, we plan to develop a user-centric interface through which developers and users can easily describe their edge applications for deployment over edge and cloud environments.
Footnotes
1. We consider the Edge-Cloud infrastructure as a three-tier architecture composed of edge, fog, and cloud entities [9].
2. EdgeX Foundry, https://www.edgexfoundry.org/.
3. AWS IoT Greengrass, https://aws.amazon.com/greengrass/.
4. Google IoT Edge, https://cloud.google.com/solutions/iot.
5. Docker, https://www.docker.com/.
6. OASIS, Topology and Orchestration Specification for Cloud Applications (TOSCA), http://docs.oasis-open.org/tosca/TOSCA/v1.0/cs01/TOSCA-v1.0-cs01.html.
7. Nagios, https://www.nagios.com/.
8. Ganglia, http://ganglia.sourceforge.net.
9. Raft Consensus Algorithm, https://raft.github.io/.
10. Docker Java API, https://github.com/docker-java/docker-java.
11. Hyperic Sigar, https://github.com/hyperic/sigar.
12. Raspberry Pi, https://github.com/raspberrypi/firmware.
References
[1] 2018. Docker container deployment in fog computing infrastructures. In 2018 IEEE International Conference on Edge Computing (EDGE’18). IEEE, 1–8.
[2] 2015. Automatic deployment of distributed software systems: Definitions and state of the art. Journal of Systems and Software 103 (2015), 198–218.
[3] 2018. Cloud, edge, or both? Towards decision support for designing IoT applications. In 2018 5th International Conference on Internet of Things: Systems, Management and Security. IEEE, 155–162.
[4] 2017. QoS-aware deployment of IoT applications through the fog. IEEE Internet of Things Journal 4, 5 (2017), 1185–1192.
[5] 2019. Measuring the fog, gently. In International Conference on Service-Oriented Computing. Springer, 523–538.
[6] 2013. SYBL: An extensible language for controlling elasticity in cloud applications. In 2013 13th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing. IEEE, 112–119.
[7] 2011. Principles of elastic processes. IEEE Internet Computing 15, 5 (2011), 66–71.
[8] 2020. Towards distributed edge-based systems. In 2020 2nd IEEE International Conference on Cognitive Machine Intelligence. IEEE, 1–9.
[9] 2021. Towards IoT processes on the edge. In Next-Gen Digital Services. A Retrospective and Roadmap for Service Computing of the Future. Springer, 167–178.
[10] 2019. Elastic computing in the fog on Internet of Things to improve the performance of low cost nodes. Electronics 8, 12 (2019), 1489.
[11] 2020. Lightweight self-organising distributed monitoring of fog infrastructures. Future Generation Computer Systems 114 (2020), 605–618.
[12] 2018. Elastic services for edge computing. In 2018 14th International Conference on Network and Service Management (CNSM’18). IEEE, 358–362.
[13] 2013. Elasticity in cloud computing: What it is, and what it is not. In Proceedings of the 10th International Conference on Autonomic Computing (ICAC’13). 23–27.
[14] 2019. Resource management in fog/edge computing: A survey on architectures, infrastructure, and algorithms. ACM Computing Surveys (CSUR) 52, 5 (2019), 1–37.
[15] 2006. Proximity-aware superpeer overlay topologies. In IEEE International Workshop on Self-managed Networks, Systems, and Services. Springer, 43–57.
[16] 2019. Architecturing elastic edge storage services for data-driven decision making. In European Conference on Software Architecture. Springer, 97–105.
[17] 2018. Fog computing: A taxonomy, survey and future directions. In Internet of Everything. Springer, 103–130.
[18] 2014. Resource management for infrastructure as a service (IAAS) in cloud computing: A survey. Journal of Network and Computer Applications 41 (2014), 424–440.
[19] 2002. Kademlia: A peer-to-peer information system based on the XOR metric. In International Workshop on Peer-to-Peer Systems. Springer, 53–65.
[20] 2019. Edge-to-edge resource discovery using metadata replication. In 2019 IEEE 3rd International Conference on Fog and Edge Computing (ICFEC’19). IEEE, 1–6.
[21] 2021. A decentralized approach for resource discovery using metadata replication in edge networks. IEEE Transactions on Services Computing PP, 99 (2021), 1–11.
[22] 2020. SLOC: Service level objectives for next generation cloud computing. IEEE Internet Computing 24, 3 (2020), 39–50.
[23] 2020. ElasticFog: Elastic resource provisioning in container-based fog computing. IEEE Access 8 (2020), 183879–183890.
[24] 2017. Edge mesh: A new paradigm to enable distributed intelligence in internet of things. IEEE Access 5 (2017), 16441–16458.
[25] 2016. Edge computing: Vision and challenges. IEEE Internet of Things Journal 3, 5 (2016), 637–646.
[26] 2018. A framework for optimization, service placement, and runtime operation in the fog. In 2018 IEEE/ACM 11th International Conference on Utility and Cloud Computing (UCC’18). IEEE, 164–173.
[27] 2018. A lightweight autoscaling mechanism for fog computing in industrial applications. IEEE Transactions on Industrial Informatics 14, 10 (2018), 4529–4537.
[28] 2017. LAVEA: Latency-aware video analytics on edge computing platform. In Proceedings of the 2nd ACM/IEEE Symposium on Edge Computing. ACM, 15.
[29] 2018. Elastic provisioning of Internet of Things services using fog computing: An experience report. In 2018 6th IEEE International Conference on Mobile Cloud Computing, Services, and Engineering (MobileCloud’18). IEEE, 17–22.
DECENT: A Decentralized Configurator for Controlling Elasticity in Dynamic Edge Networks