Article

Fine-Grained Management for Microservice Applications with Lazy Configuration Distribution

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
*
Author to whom correspondence should be addressed.
Electronics 2023, 12(16), 3404; https://doi.org/10.3390/electronics12163404
Submission received: 30 June 2023 / Revised: 6 August 2023 / Accepted: 7 August 2023 / Published: 10 August 2023
(This article belongs to the Special Issue Advances in Cloud/Edge Computing Technologies and Applications)

Abstract

Service mesh is gaining popularity as a microservice architecture paradigm due to its lightness, transparency, and scalability. However, fully releasing configurations to the data plane during the business development phase can result in noticeable performance degradation. Therefore, fine-grained traffic management of microservice applications is crucial to service performance. This paper proposes a novel configuration distribution algorithm, DATM, which utilizes inter-service dependencies from the service call chain to manage data-plane traffic and dynamically maintain cluster services. The proposed algorithms enable on-demand distribution based on the obtained service dependency relationships by combining monitoring, information processing, and policy distribution. We validate the proposed mechanism and algorithms via extensive experiments. We show that the approach reduces the memory usage of data-plane agents and improves system resource utilization. It also reduces the time needed to issue configurations while effectively saving storage space and significantly reducing the number of cluster updates. Consequently, this approach ensures application performance and guarantees the quality of microservice applications in clusters.

1. Introduction

Monolithic architectures are extremely difficult to develop and maintain in large projects, especially in the cloud computing era. Most Internet suppliers, such as Amazon, Netflix, and Uber, have been using microservice architectures (MSA) [1,2] in recent years, and conventional enterprises are also making the switch from monolithic architecture to MSA. Although the service-oriented architecture (SOA) design methodology can already partially modularize cloud services [3], microservices draw on SOA thinking to further granularize services based on business logic. Docker [4] containers and Kubernetes [5,6] orchestration have revolutionized many domains, leading to a preference for MSA in distributed systems. However, as microservices achieve the goal of independent deployment and evolution, communication among them becomes a major challenge, and service governance capabilities at a higher level urgently need to be improved. In this context, service mesh [7] is gaining popularity as an MSA paradigm due to its lightness, transparency, and scalability.
Service mesh has emerged to deal with the problem of service-to-service communication in distributed systems or microservices. By employing lightweight proxies alongside the applications, a service mesh governs inbound and outbound traffic based on configured rules fetched from the control plane. This approach enables the service mesh to take control of the cluster’s traffic, offering advanced features such as network resilience, security, and traffic management [8]. However, the deployment of a service mesh also introduces specific challenges. Cost reduction and efficiency improvement are critical requirements from stakeholders. To improve the utilization of existing resources [9], various technologies have been proposed, such as (1) resource scheduling and business deployment, (2) resource utilization estimation, (3) application expansion and configuration, and (4) workload management [10]. We consider resource utilization from the perspective of traffic management, i.e., improving system resource utilization by controlling the direction and amount of traffic through appropriate policies. However, certain pain points exist in service mesh deployments. For instance, the full configuration issued during the business development stage can lead to significant performance degradation in both the data and control planes. Therefore, fine-grained traffic management of microservices is crucial to ensure optimal application service performance.
Istio [11] is a powerful service mesh implementation that makes operating cloud-native service architectures in a hybrid environment easier. Istio has two components: the data plane and the control plane. The data plane in Istio primarily consists of agents operated by Envoy, while Pilot serves as the central component of the control plane. Accordingly, traffic in Istio is architecturally divided into data-plane traffic, which refers to the traffic exchanged between services, and control-plane traffic, which refers to the configuration and management traffic exchanged between the control-plane components and the proxies. Previous studies mainly concentrate on data-plane traffic, with control-plane traffic being relatively neglected. The native mode of control-plane traffic is full configuration distribution, which wastes memory and blocks threads. For example, in a small cluster where service A calls service B in the data plane, A ideally only requires the configuration of B plus some basic configuration. Under the native mechanism, however, when a request arrives and triggers the control-plane component, configuration is delivered according to the services in the cluster rather than the configuration that A actually needs. Hence, A simultaneously receives DiscoveryResponses with xDS (x Discovery Service) resources related to the other services in the cluster. Other services also receive the same configuration, even if they do not call any services. Moreover, a recent empirical study shows that in a mesh with 325 clusters and 175 listeners, one proxy occupies about 100 MB; with 466 instances in the mesh, all proxies use 466 × 100 MB = 46.6 GB. The full configuration consumes a lot of storage space in data-plane agents and causes threads to block while processing the redundant configuration. As the number of services in distributed clusters grows, the demand for configuration management becomes increasingly important in the microservice environment. The performance issue caused by full configuration distribution is the main issue at present. To address this issue, the concept of on-demand configuration loading needs to be explored. The control plane must know the routes of endpoints (Kubernetes pods) and the dependencies between services to guide traffic within the mesh [12]. Therefore, a reasonable control mechanism is needed to minimize the amount of configuration that services receive, so as to improve the utilization of existing resources. We therefore investigate the traffic management of the control plane under microservices.
This paper aims to improve resource utilization in microservice clusters and guarantee the quality of microservice applications by studying traffic management in the service mesh architecture. To achieve this, we abstract the service request path and introduce a dedicated proxy, the global sidecar (GL), as illustrated in Figure 1. When a request with a chained invocation arrives, the front-end service determines whether it has the routing information of the back-end. If it does, it directly visits the back-end service; otherwise, it asks the proxy to retrieve the necessary information from the global sidecar. The contributions of this paper can be summarized as follows:
  • Management mechanism: We observe the configuration redundancy problem in service meshes and propose the corresponding config on-demand idea for memory efficiency. We design the dependency-aware traffic management (DATM) mechanism, combining monitors and a controller. The mechanism is application-agnostic, non-intrusive, and does not require any source or business code changes.
  • Dependency-aware traffic algorithm: We analyze the characteristics of microservices and research the configuration and dependencies of microservice applications. We propose the algorithm of service dependencies extraction and implement the controller of the control-plane traffic in the form of plugins. The configuration can be distributed on demand.
  • Evaluation: We extensively evaluate DATM using a comprehensive benchmark. The experimental results demonstrate that the DATM mechanism significantly reduces storage resource usage of a single agent by 40% to 60% and greatly reduces the number of cluster updates. Additionally, the DATM mechanism improves the efficiency of issuing configurations, resulting in reduced configuration time. Furthermore, from the perspective of the entire cluster, the optimization results are even more impressive.
The rest of this paper is organized as follows. Section 2 gives the problem statement of traffic management, Section 3 presents our proposed solution framework, Section 4 and Section 5 conduct some experiments and evaluate the efficiency of the controller, Section 6 reviews the related works, and Section 7 offers our conclusion.

2. Problem Analysis

This section analyzes traffic management in a service mesh architecture with the goal of improving resource utilization and ensuring the quality of service (QoS) in microservice clusters. We explore key aspects such as service descriptions, problem scenarios, memory usage, configuration distribution time, and expected improvements. For convenience, Table 1 summarizes the notation used throughout the paper.

2.1. Service Description

To comprehend the relationships and communication requirements between services, a thorough analysis of service descriptions and a discussion of problem scenarios are necessary. Our research is based on a number of microservices in distributed clusters. A cluster contains services of many types, and not all of them are relevant: some control components have nothing to do with control-plane traffic. Moreover, the state of a service can change over time, so the description must account for this. By examining the status of each service, such as whether it is running and whether it is a business service, we can identify the key attributes that define the services within the cluster. Additionally, we explore the namespaces to which the services belong and the related destinations they interact with. These insights lay the foundation for our traffic management analysis and resource utilization optimization in the subsequent sections. To describe a microservice clearly, we characterize it as a tuple $\gamma = \langle \chi, \psi, \xi, \sigma \rangle$, where
$\chi$ indicates whether the service is up;
$\psi$ identifies whether the service is a business service or not;
$\xi$ indicates which namespace the service belongs to;
$\sigma$ is the set of all related destinations of this service. For the service, we have
$$\psi_{\gamma}(svc_j) = \begin{cases} 1, & svc_j \text{ is a business service;} \\ 0, & \text{otherwise.} \end{cases}$$
$$\chi_{\gamma}(svc_j) = \begin{cases} 1, & svc_j \text{ is a running service;} \\ 0, & \text{otherwise.} \end{cases}$$
To distinguish between different types of services, we have selected the state and characteristics shown in Table 2.
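For concreteness, the following Go sketch mirrors the service tuple $\gamma = \langle \chi, \psi, \xi, \sigma \rangle$ and the two indicator functions defined above; the struct and function names are illustrative and not taken from the paper's implementation.

```go
package model

// Service mirrors the tuple γ = <χ, ψ, ξ, σ> used in Section 2.1.
// Field names are illustrative, not taken from the paper's code.
type Service struct {
	Running      bool     // χ: whether the service is up
	Business     bool     // ψ: whether it is a business service
	Namespace    string   // ξ: namespace the service belongs to
	Destinations []string // σ: all related destinations of this service
}

// Psi is the indicator ψγ(svc_j): 1 for a business service, 0 otherwise.
func Psi(s Service) int {
	if s.Business {
		return 1
	}
	return 0
}

// Chi is the indicator χγ(svc_j): 1 for a running service, 0 otherwise.
func Chi(s Service) int {
	if s.Running {
		return 1
	}
	return 0
}
```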

2.2. Problem Scenario

The data plane of the service mesh consists of an Envoy [13] agent for each service, which utilizes iptables technology to hijack and rewrite routes for traffic. This effectively intercepts and manipulates traffic between services and provides advanced communication features such as traffic management. In the native mechanism, the control plane Pilot is responsible for distributing and maintaining all information within the mesh. Each service can access any other service within the mesh network, and its agents hold a complete set of data, including service discovery information, network routing rules, and network security rules. However, much of this data is never actually used. In large clusters with frequent configuration updates and many instances, any update triggers the distribution of the full configuration to all Envoys. This can have the following adverse effects on data-plane and control-plane traffic.
(1)
Redundant Memory Footprint. Pilot cannot provide accurate data based on specific requests due to the unpredictable dependencies between services. Instead, it provides available data based on the existing cluster. When a change is detected, Pilot sends a discovery response (full configuration) to Envoy to update the configuration, increasing memory overhead. The memory footprint of Envoy is proportional to the number of configured endpoints: the more endpoints there are, the more memory is consumed. This increase in memory usage presents a significant challenge. The delivery of a complete configuration by the control plane increases memory usage, which can easily lead to out-of-memory (OOM) errors.
(2)
Frequent Configuration Updates. In large clusters with frequent updates and many instances, any update triggers the delivery of the full configuration to all Envoys. This scenario is depicted in Figure 2a. The ideal update frequency of services is shown in Figure 2b,c. When an edge service is updated and has no dependencies, it only updates itself and does not trigger others. Other services can be added to record all the cluster’s updates for consolidation. On the other hand, when a service with dependencies is updated, we need to trigger the related service while the rest of the services are unaffected.
(3)
Configuration Time Increase. In the Pilot push process, the full configuration leads to a longer control-plane push time and convergence time. This is not conducive to normal service invocation and simultaneously degrades Pilot’s overall performance and efficiency.

2.3. Basic Platform

The base platform includes the underlying cloud platform, which involves container images, container orchestration, and the service mesh. The system first builds multiple physical nodes in the base platform, deploys multiple microservices on each one, and packages the microservice applications using the Docker engine. Therefore, it needs to ensure that the sum of microservice traffic at both endpoints of a link stays within the acceptable bandwidth. A Kubernetes cluster is built, and the Istio service mesh is used on top of the cluster for service governance. Kubernetes nodes are divided into two types: master and worker nodes. Master nodes are responsible for managing the state of the whole cluster, including scheduling containers, monitoring cluster resource usage, processing client requests, etc. Worker nodes are responsible for running containers and providing compute and storage resources for them. The combination of Kubernetes and Istio provides a stable and scalable foundation platform for building and deploying complex microservices, providing an efficient and reliable way to manage containerized applications.
The Pilot of the Istio control plane watches for user-defined rules. Kubernetes-managed services are deployed in the Kubernetes cluster as pods at deployment time, with the pod requirements and metadata typically declared in a YAML file. These requirements can include the container image used to create the pod and the amount of resources (CPU or memory) required for each container. The workload owner then submits the pod (YAML file) to the Kubernetes cluster, where different components contribute to getting the workload up and running on one of the cluster’s nodes. Pilot also watches these resources, and user-defined rules are installed into the cluster as extended CRDs, with the full set of rules contained in each Envoy.

2.4. Memory Footprint

According to our previous analysis, the main impact of full configuration distribution is on memory. Therefore, we explore the relationship between memory usage and load and propose ways to optimize the storage footprint. There are $n$ services $Svc = \{svc_1, svc_2, \ldots, svc_n\}$ in the working cluster. $I_i$ is the number of instances of microservice $svc_i$. Let $svc_i = \{svc_i^1, svc_i^2, \ldots, svc_i^m\}$, $i \in [1, n]$, represent the $m$ instances of this service. We optimize the memory footprint by considering the amount of configuration.
As the number of services increases, the amount of storage needed by the agents and the number of updates rise considerably. The relationship between memory usage $U$ and load $L$ is:
$$U \approx k_1 L \quad (k_1 > 1)$$
The main configuration size $C$ of the service is determined by the bootstrap, listeners, clusters, routes, and a basic file in JSON format. The relationship between load $L$ and the configuration size $C$ of the service is:
$$L \approx k_2 C \quad (k_2 > 1)$$
Most of the memory usage $U$ is the accumulation of the configuration of all instances. Since the configuration is delivered in full, it has the same size regardless of the instance, so it is represented by $C$ in all cases. By adding up the full configuration size across all service instances, we obtain the memory footprint of the configuration in the native cluster. The relationship between memory usage $U$ and the configuration size $C$ of the service is:
$$U \approx \Big(\sum_{i=1}^{n} I_i\Big)\, C$$
Optimizing a single service’s memory footprint $MEM$ can be regarded as omitting the basic configuration and default settings for the namespace, such as node and software version information, admin address configuration, and trace Zipkin cluster address references, from the entire configuration.
$$MEM = nC - \sigma_{\gamma}(svc_j) - \omega\,\xi_{\gamma}(svc_j)$$
Expected improvement $EI(MEM)$ represents the storage space saved for the whole mesh. It can be defined as:
$$EI(MEM) = U - \sigma_{\gamma}(svc_j)\sum_{j=1}^{n} I_j - W$$
Here, $W \subseteq \xi_{\gamma}(svc_j)$, $j \in [1, n]$. The $EI(MEM)$ contains configuration not needed for this service, including configuration information irrelevant to this service call.
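To make the memory model concrete, the following Go sketch estimates the mesh-wide footprint under the native full-push mode, where $U \approx (\sum_i I_i)\,C$, and under on-demand distribution, and computes the expected improvement as their difference; the per-destination and baseline sizes are illustrative parameters, not values measured in the paper.

```go
package model

// FullConfigMemory estimates the mesh-wide memory footprint U under the
// native full-push mode: every instance of every service holds the same
// full configuration of size fullConfigMB, so U ≈ (Σ I_i) · C.
func FullConfigMemory(instances []int, fullConfigMB float64) float64 {
	total := 0
	for _, n := range instances {
		total += n
	}
	return float64(total) * fullConfigMB
}

// OnDemandMemory estimates the footprint when each instance only keeps the
// configuration of its own destinations plus a shared namespace baseline.
// perDestMB and baselineMB are illustrative parameters, not measured values.
func OnDemandMemory(destCounts, instances []int, perDestMB, baselineMB float64) float64 {
	var total float64
	for i, n := range instances {
		total += float64(n) * (float64(destCounts[i])*perDestMB + baselineMB)
	}
	return total
}

// ExpectedImprovement returns EI(MEM): the storage saved for the whole mesh
// by switching from full configuration to on-demand configuration.
func ExpectedImprovement(full, onDemand float64) float64 {
	return full - onDemand
}
```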

2.5. Configuration Time

Another effect of issuing the full configuration is on the configuration time. We explore the configuration distribution time $TIME$. To evaluate the impact of $TIME$ on the performance of both the data and control planes, we observe the configuration work queue.
The push queue is a buffer that manages the distribution of configurations. This metric covers the time spent in the network transmission, agent reception, and configuration loading phases. The push queue time $T_{push}$ is the time from when the configuration enters the queue to when it is sent to the agent. Note that the agent receive time and the push queue time may affect each other.
$$T_{push} = \sum_{i=1}^{n}\sum_{j=1}^{m} push_{ij}$$
The receive time $T_{conv}$ is the time from when an agent receives a configuration update to when it reaches an available state. This metric provides the cumulative value of the convergence time, and we can obtain the time of one convergence by observing the amount of change in the timestamp.
$$T_{conv} = \sum_{i=1}^{n}\sum_{j=1}^{m} conv_{ij}$$
The time of configuration dispatch is affected by various factors, such as the agent receive time $T_{conv}$, the push queue time $T_{push}$, and the network latency $o(Net)$. The agent receive time $T_{conv}$ is the time from when the configuration information is sent to the agent until the agent successfully receives it. The push queue time $T_{push}$ is the time from when the configuration enters the queue to when it is sent to the agent.
$$TIME = T_{push} + T_{conv} = (\alpha C) + (\beta C) + o(Net)$$
Both $\alpha$ and $\beta$ are correlation coefficients ($\alpha > 0$, $\beta > 0$). We use these two coefficients to relate the processing time linearly to the configuration size received by the agent. In Istio, the control plane manages and dispatches configurations to agents throughout the service mesh. Pilot and other components in the control plane collect and process configuration information such as service registration information, policy rules, routing rules, and so on, and distribute it to the appropriate agents. The network latency varies depending on the network conditions at the time of the request.
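As a small worked example of the model above, the following Go sketch computes $TIME = T_{push} + T_{conv} = \alpha C + \beta C + o(Net)$ for a single proxy; the coefficient values and the latency term are assumptions to be fitted from measurements rather than numbers reported in the paper.

```go
package model

// DistributionTime estimates TIME = T_push + T_conv = αC + βC + net for a
// single proxy, following the linear model in Section 2.5. The coefficients
// alpha and beta and the network latency term are illustrative assumptions.
func DistributionTime(configSizeMB, alpha, beta, netLatency float64) float64 {
	tPush := alpha * configSizeMB // time spent queuing and sending the configuration
	tConv := beta * configSizeMB  // time for the agent to load the configuration and converge
	return tPush + tConv + netLatency
}
```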

3. DATM Mechanism

In this section, we describe the overall architecture of the DATM mechanism and its implementation. The insight behind DATM is that configuration efficiency can be improved through the combined optimization of runtime metrics and service visibility. Figure 3 illustrates the design of DATM, which obtains runtime metrics through its monitor coordinator and controls service visibility through the control extractor.
(1)
DATM obtains runtime metrics (including the name and status of the invoking service, etc.) when a service request arrives in the data plane or when historical invocation information exists. It does so using the monitor coordinator, which is marked in Figure 3. The monitor collects telemetry data from each microservice instance and stores them in a centralized database for processing; it is described in Section 3.1. This runtime information is used to guide the control extractor in calculating and determining the optimal runtime configuration.
(2)
GL stands for the global service (marked in Figure 3 and described in Section 3.2). It is a normal Envoy proxy that carries the full configuration information of the mesh. GL exists to guarantee access when routing information is missing. When a new request arrives and the corresponding proxy does not have the required route, GL access is triggered to obtain it.
(3)
The control extractor detects alarm messages from the monitor. It queries the collected runtime data to (a) extract the HB (marked in Figure 3 and described in Section 3.3) and (b) extract the DEM (marked in Figure 3 and described in Section 3.3). Using the collected telemetry data and the identified HB, DATM makes a mitigation decision to reconfigure the sidecar dependencies for the HB. The policy used to make this decision is generated using a control loop. The control loop derives a mitigation policy by comparing the existing configuration list with the new change information (a simplified sketch of this control loop is given at the end of this overview).
(4)
Finally, through Istio’s native deployment module (marked in Figure 3 and described in Section 3.4), the specific configuration information is distributed in the form of xDS resources that data-plane agents can understand. Actions are verified and executed on the underlying Kubernetes cluster.
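To make step (3) more tangible, the following Go sketch shows a minimal version of the control loop that compares the stored dependency list with newly observed dependencies and emits create/update/delete actions for Sidecar resources; the type names and action labels are hypothetical simplifications of the description above, not the actual controller code.

```go
package datm

// Action describes one mitigation decision produced by the control loop.
type Action struct {
	Kind    string   // "CreateSidecar", "UpdateSidecar", or "DeleteSidecar"
	Service string   // the business service the Sidecar belongs to
	Hosts   []string // egress hosts the Sidecar should be allowed to reach
}

// Reconcile compares the locally stored dependencies with the newly observed
// ones and returns the actions needed to bring the Sidecar resources in line.
// It is a simplified sketch of step (3) in Figure 3, not the real controller.
func Reconcile(stored, observed map[string][]string) []Action {
	var actions []Action
	for svc, hosts := range observed {
		old, known := stored[svc]
		switch {
		case !known:
			actions = append(actions, Action{Kind: "CreateSidecar", Service: svc, Hosts: hosts})
		case !equal(old, hosts):
			actions = append(actions, Action{Kind: "UpdateSidecar", Service: svc, Hosts: hosts})
		}
	}
	for svc := range stored {
		if _, still := observed[svc]; !still {
			actions = append(actions, Action{Kind: "DeleteSidecar", Service: svc})
		}
	}
	return actions
}

func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}
```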

3.1. Monitor Coordinator

Software developers usually need to know the runtime state of a service, unless the underlying logic remains stable over time. However, in the context of service meshes, the complex invocation relationships between microservices make it a challenge to determine service states and invocation dependencies in advance. As our results demonstrate, it is feasible for software to dynamically check service invocation relationships in real time. Real-time connectivity and monitoring of running clusters are established through monitors. We employ Prometheus as the monitor. Prometheus periodically extracts monitoring metrics from active (running) target hosts and stores them in a time series database (TSDB). Monitoring data can be collected for services within a Kubernetes cluster by configuring static scrape targets or utilizing service discovery. Triggered alerts can be sent to Alertmanager by configuring alert rules. By utilizing monitoring logs and metrics data, one can obtain real-time information about the Kubernetes cluster to identify the services involved in a business process and their corresponding service dependencies.
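As a concrete illustration, the following Go sketch queries the Prometheus HTTP API for Istio request metrics and collects the destination services observed in the cluster; the Prometheus address, the chosen metric, and the label handling are assumptions for illustration rather than the paper's actual monitor implementation.

```go
package datm

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// promResult mirrors only the fields of the Prometheus HTTP API response
// that the monitor coordinator needs (instant queries against /api/v1/query).
type promResult struct {
	Status string `json:"status"`
	Data   struct {
		Result []struct {
			Metric map[string]string `json:"metric"`
		} `json:"result"`
	} `json:"data"`
}

// RunningBusinessServices queries Prometheus for Istio request metrics and
// returns the destination services seen in the cluster. The Prometheus
// address and the PromQL expression are deployment-specific assumptions.
func RunningBusinessServices(promAddr string) ([]string, error) {
	query := "istio_requests_total" // filter by additional labels as needed
	resp, err := http.Get(fmt.Sprintf("%s/api/v1/query?query=%s", promAddr, url.QueryEscape(query)))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var pr promResult
	if err := json.NewDecoder(resp.Body).Decode(&pr); err != nil {
		return nil, err
	}
	seen := map[string]bool{}
	var services []string
	for _, r := range pr.Data.Result {
		// The destination_service label identifies the called service; the
		// paper additionally uses the security.istio.io/tlsMode label to
		// keep only business services.
		if svc := r.Metric["destination_service"]; svc != "" && !seen[svc] {
			seen[svc] = true
			services = append(services, svc)
		}
	}
	return services, nil
}
```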

3.2. Information Acquisition

To populate its service registry, Istio establishes connections with the discovery system of the platform on which it is installed. When Istio is deployed on Kubernetes, it automatically discovers services and instances, thereby simplifying the registration process. The main concept behind our approach is to consider cluster traffic as a spatio-temporal range storage operation. This approach entails storing the most probable call trajectories and efficiently computing them based on given call and time parameters. By adopting this framework, we aim to enhance the management and optimization of service mesh traffic within a cluster.
The control plane Pilot passes the full configuration information to GL, which therefore holds information about all services in the mesh. Pilot listens to the API-Server, and when App A generates a call, the first call from App A to App B is relayed through GL. Prometheus monitors the cluster, analyzes the received traffic characteristics, and alerts the control extractor of the change. By comparing the original and new invocation information, the controller writes the new dependency relationship into the Sidecar custom resource (CR) to update the Sidecar resources. After this step, the configuration information sent by Pilot to App A is limited to what it needs. From then on, when App A initiates access to App B again, the traffic goes straight to App B.

3.3. Control Extractor

Building on the insights gained from previous studies, we propose a control extractor (Figure 4) that can be easily deployed within service meshes. It retrieves the data necessary for cluster traffic management from the monitoring system. The controller acquires the most up-to-date service information in the cluster, including the associated service dependency information, and stores it within the local store. In a control loop, the local repository is continuously compared with the newest configuration to derive the required actions. These actions may include adding a Sidecar for a new service, deleting a Sidecar, or updating the configuration of an existing Sidecar. The results are sent to the controller’s work queue, and the queue updates the local storage, ensuring that the storage remains synchronized with the most recent configuration data.
DATM first detects changes of the HB within the cluster. HB stands for hub service, a central service with complex dependencies. DEM stands for dependencies of microservices and is used to store the dependencies of a service. The controller is implemented in the Golang language. The pseudocode of the HB extraction algorithm is presented in Algorithm 1. The controller is configured with one parameter: the microservice execution history metrics, denoted as M. The result is provided to the control plane, which distributes it to each sidecar via xDS and records all activities to Etcd to provide data support for other services based on the policy’s final service dependencies. To function, our controller requires the metrics of recently executed services as input. We select all services carrying Istio’s destination_service metric label as running services from the result array that Prometheus monitored. We then filter them as business services using the security.istio.io/tlsMode label from Istio, which correlates well with Istio’s business services. We store the names of these services in a slice and then deduplicate it through a map to determine the length of the business service queue. We leverage the above-mentioned metrics and Istio’s istio_requests_total metric to test our controller. The basic idea behind the processing algorithm is a three-step approach:
Initialization. This step initializes some variables. The GLqueue is a priority queue that records all applications in the cluster. The RBlist is used to store the running business services of Istio. The DEMmap is a map data structure based on hash tables. The DEMmap values are initialized to nil and are later used to record the call relationships between services. The Dlist is used to store the services with calling relationships.
Generate the initial cluster relationships. In this step, we first use a condition query to find all the running services. Note that we may obtain a lot of worthless data, such as services that are no longer running or services used to control scheduling.
Algorithm 1: Hub Service Extraction
[Pseudocode of Algorithm 1 is shown as a figure in the original article.]
Determine the final results. The overall procedure is shown in Algorithm 2. This component checks whether the initial relationships in the RBlist must be collected into the DEMmap. We compute the new service information, including namespace and default configuration. By comparing the original call information with the new one, the controller expresses the new dependency in the Sidecar to update the resources. For services with the load-on-demand feature enabled, the controller creates a Sidecar resource to limit App A’s service visibility and notifies the Kubernetes API-server of the service’s updates.
Algorithm analysis. The space usage of these two algorithms mainly comes from storing intermediate and output results. The space complexity of the Hub Service Extraction (Algorithm 1) therefore depends on the size of the RBlist that stores the business services in the cluster, which is O(|RBlist|). The complexity of the Service Dependencies Extraction (Algorithm 2) depends on the DEMmap that stores the dependencies between services, which is O(|HB|). The time complexity of Algorithm 1 depends on the scale of the cluster, the implementation of the function getGlobalApplication, and the judgment and collection operations. The function getGlobalApplication retrieves the global application list GLqueue in constant time, written as O(1). The for loop traverses each application q in GLqueue. The operations inside the loop, including conditional judgment and collection operations, have time complexity denoted as O(f). Thus, the total time complexity of the algorithm is O(|GLqueue| · f), where |GLqueue| is the length of the global list of applications and f represents the time complexity of each operation inside the loop. Two main components determine the time complexity of Algorithm 2. First, for each child service childService, the algorithm collects the keys from the DEMmap by calling the keyCollect operation. Second, when the Dlist matches an attribute of childService, the algorithm enters a while loop and collects values from the DEMmap by calling the valueCollect operation. The number of iterations of this loop depends on the length of the Dlist, so the time complexity is O(|Dlist|). For applications with a large number of cycles and frequently changing service dependencies, the magnitude of the algorithm’s difficulty factor usually depends on the number of initial access services. Although such a workflow is highly unlikely to exist in a practical application scenario due to practical constraints (e.g., limitations on duration, number of services, and concurrency level), we analyze the efficiency of the algorithm by considering the worst-case scenario shown in Figure 5. The workflow consists of n microservices that form a chain of invocations. Thus, the worst-case time complexity of the overall algorithm is O(|HB| · (f1 + f2)), where |HB| is the number of sub-services in the hub service and f1 and f2 are the time complexities of the keyCollect and valueCollect operations. This shows that the algorithm is applicable now and in the foreseeable future. A simplified Go sketch of both extraction steps is given after Algorithm 2.
Algorithm 2: Service Dependencies Extraction
[Pseudocode of Algorithm 2 is shown as a figure in the original article.]
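Since the pseudocode of Algorithms 1 and 2 is only available as figures, the following Go sketch reconstructs the two extraction steps from the textual description above; the type names, the ServiceMetric fields, and the filtering logic are our own simplifications and assumptions, not the authors' controller code.

```go
package datm

// ServiceMetric is one row of the runtime metrics M collected by Prometheus.
type ServiceMetric struct {
	Name        string
	Destination string // called service, empty if none
	Running     bool
	Business    bool
}

// ExtractHubServices is a sketch of Algorithm 1: from the global application
// queue it keeps running business services (RBlist) and returns those with
// calling relationships as hub-service candidates (HB).
func ExtractHubServices(GLqueue []ServiceMetric) (RBlist []ServiceMetric, HB []string) {
	seen := map[string]bool{}
	for _, q := range GLqueue {
		if !q.Running || !q.Business {
			continue // drop control components and stopped services
		}
		RBlist = append(RBlist, q)
		if q.Destination != "" && !seen[q.Name] {
			seen[q.Name] = true
			HB = append(HB, q.Name)
		}
	}
	return RBlist, HB
}

// ExtractDependencies is a sketch of Algorithm 2: it builds DEMmap, mapping
// each hub service to the list of services it depends on (Dlist entries).
func ExtractDependencies(RBlist []ServiceMetric, HB []string) map[string][]string {
	DEMmap := make(map[string][]string)
	for _, hub := range HB {
		for _, m := range RBlist {
			if m.Name == hub && m.Destination != "" {
				DEMmap[hub] = append(DEMmap[hub], m.Destination)
			}
		}
	}
	return DEMmap
}
```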

3.4. Control Traffic Delivery

The rules with which Istio controls the data plane mainly include traffic governance rules such as VirtualService (VR), DestinationRule (DR), Gateway (GW), etc. Pilot transforms the various rules into Envoy-recognizable formats and sends them to Envoy through the standard xDS protocol as JSON files, and these protocols drive Envoy’s actions until completion. In turn, Envoy subscribes to Pilot’s configuration resources through gRPC streaming. In our mechanism design, the rule controlling the data plane is mainly the Sidecar. Pilot isolates services based on namespaces during the creation of listeners and clusters. This approach helps reduce the overall number of listeners and clusters in Envoy. However, it is still a coarse-grained approach and has limited effectiveness in optimizing memory usage. Figure 6 concisely depicts the lazy configuration process in action. This process involves the Sidecar and the Custom Resource Definition (CRD), a community-provided resource. By leveraging these tools, more granular and efficient configuration management can be achieved within the service mesh environment. The Sidecar resource object controls the ports, protocols, etc. that Envoy forwards and receives, and can limit the set of destinations that Sidecar outbound traffic is allowed to reach. These rules are defined in terms of different control fields, such as WorkloadSelector and IstioIngressListener.
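For illustration, the following Go sketch assembles such a Sidecar resource for a single workload, restricting its egress visibility to the hosts extracted by DATM; the label key, host format, and function name are assumptions for illustration and not part of the paper's implementation.

```go
package datm

import "encoding/json"

// BuildSidecarManifest assembles an Istio Sidecar resource that limits the
// egress visibility of one workload to the hosts DATM has extracted for it.
// The manifest is built as a generic map and could equally be written as
// YAML; label keys and host formats are illustrative and cluster-specific.
func BuildSidecarManifest(name, namespace, appLabel string, egressHosts []string) ([]byte, error) {
	manifest := map[string]interface{}{
		"apiVersion": "networking.istio.io/v1beta1",
		"kind":       "Sidecar",
		"metadata": map[string]interface{}{
			"name":      name,
			"namespace": namespace,
		},
		"spec": map[string]interface{}{
			// Restrict the rule to the workload that owns this Sidecar.
			"workloadSelector": map[string]interface{}{
				"labels": map[string]string{"app": appLabel},
			},
			// Only the listed hosts (e.g., "bookinfo/reviews.bookinfo.svc.cluster.local")
			// will be pushed to this proxy by Pilot.
			"egress": []interface{}{
				map[string]interface{}{"hosts": egressHosts},
			},
		},
	}
	return json.MarshalIndent(manifest, "", "  ")
}
```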
When the resources are distributed, the traffic control resources mainly control the routing of services and the traffic at cluster entrances and exits. The final dependency of each service obtained according to the policy is transmitted to the control plane. Meanwhile, all operations are recorded to Etcd to provide data support for other services. This scheduling mechanism repeats this process in each working cycle until a manual stop or a failure occurs. Our implementation design objectives are scalability, suitability for research through instrumentation/observability, and easy integration with the existing service mesh. The latter implies an application-independent approach such that existing services can benefit without modifying their source code.

4. Methodology

We start by describing the experimental setting, followed by discussing the pressure test scenario we created.
Experiment Setting. All experiments are performed on a bare-metal machine with a Hygon C86 7151 16-core CPU and a 256 GB NVMe hard drive running Ubuntu 18.04 LTS to evaluate our proposed method. The cluster is set up with Docker 20.10.7, Kubernetes 1.20.0, and Istio 1.8.2. We test a mock service in three isolated environments to evaluate the performance. Among industrial approaches to improving configuration delivery in service meshes, Lazyload is the strongest available baseline.
Benchmarks. Bookinfo [14], chosen for our evaluation, is a polyglot application consisting of four microservices communicating over gRPC. All services are stateless and use Istio to demonstrate the fundamental phases of traffic management. We deployed three sets of Bookinfo in different namespaces: one uses DATM, another uses Lazyload, and the remaining one is left unmodified. The obtained service deployment information in the cluster is as follows:
  • This app models a category in an online bookstore displaying book information. Four business services: Productpage, Details, Reviews, and Ratings. The web page shows a book description, details about the book (ISBN, number of pages, etc.), and some reviews about the book.
  • Productpage and Details each have one version, and Reviews has three versions. These services have no dependency on Istio but constitute a representative example of a service mesh: it consists of multiple services, languages, and versions of the Reviews service.
  • A global service contains the configuration information for all services, e.g., service discovery information, network routing rules, and network security rules.
Workload test settings. The workload for the experiment is generated by Isotope (the load service) [15], Istio’s official load-testing tool set. It is a comprehensive application with a configurable topology. The simulation services component of Isotope is a reasonably simple HTTP server that obtains Prometheus measures by following the instructions in YAML files. The GKE (Google Kubernetes Engine) cluster runs the Fortio 1.1.0 client and an Isotope service, with the service machine limited to 1 vCPU and 3.75 GB of memory. We gradually increase the number of workload services to imitate mesh scale growth. Each namespace contains 19 services, each service initially consisting of 5 pods, for a total of 95 pods. The workload traffic scenario, a multiple server load, is shown in Figure 7. The experiments are each 60 min long (3600 s), constituting a complete period for the above workload.
Tuning the controller. Our controller can adapt to the workload, complex service dependencies, and cluster service changes. Various methods for such controller tuning have been proposed [16]. However, because these methods are beyond the scope of this work, we concentrate on comparing the proposed controller to the default mode and Lazyload. To define good measures for determining appropriate controller settings, we deployed the Bookinfo service and our controller (as specified in Algorithm 1) on our server. We use Isotope, the benchmark officially recommended by Istio, as the influencing factor; the workload of a single namespace has 18 services, the generated workload is shown in Figure 7, and one namespace workload is added every 5 min. Based on these experiments, it is a smart option to include commonly used dependencies in the Sidecar. This is consistent with our intuition that the controller should avoid abrupt output changes and maintain relative stability. The controller queries Prometheus to obtain the service information and the dependencies of services and then tunes the relationships of the running services in the Sidecar’s configuration (EgressHosts).
Note that no service source code has been modified. We only modify the Kubernetes deployment manifests such that our dynamic infrastructure is put in place and services communicate through it. These modifications are simple and can be automated in the future, e.g., via a Kubernetes Mutating Admission Controller, as contemporary service meshes do.

5. Evaluation

We explore the three strategies in this section:
  • Loading configuration by default modes (Default).
  • A control strategy developed using scaffolding (Lazyload).
  • Our proposed controller (DATM).
We evaluate the memory footprint, number of update times, and configuration distribution time for different settings. This section reports the averaged results across five test runs.

5.1. Memory Footprint Evaluation

Comparing the Envoy memory footprint of specific services. To compare the memory utilization of the proxy implemented by our controller with the overall configurable memory, we employed Isotope as the measurement tool and Prometheus to monitor the proxy memory performance. In the memory size experiment, we examined the impact of adding many instances to the cluster on the proxy memory of the deployed service while maintaining the same service scenario. The traffic rate in each experiment was simulated based on the observations from Figure 7. Figure 8 depicts the memory footprint in three different workload modes: DATM (i.e., $MEM_{DATM}$), Lazyload (i.e., $MEM_{Lazyload}$), and Default (i.e., $MEM_{Default}$). The memory reduction rate of our controller reaches 56% when deploying 360 pods spanning five namespaces (i.e., with $MEM_{DATM}$ = 11 MB, $MEM_{Lazyload}$ = 13 MB, and $MEM_{Default}$ = 25 MB). As shown in the figure, the memory utilization of our controller closely matches that of Lazyload, with a slight improvement. Although the controller’s advantage over Lazyload may seem marginal for a single agent in the cluster, the overall cost savings become significant as the number of service versions and instances grows.
Comparing the Envoy memory footprint of different services. To assess the impact of the controller on services with different business relationships, we conducted a series of experiments using the same scenario and workload. Specifically, we evaluated the memory consumption of Productpage, which has complex invocation relationships, and Ratings, which is only invoked by others. Figure 8a,b illustrate the memory usage of one proxy, providing evidence that the configuration in our DATM approach effectively keeps the memory usage of Envoy below 13 MB and 11 MB, respectively, regardless of the specific service. Since Ratings does not have any calling service specified in the workloadSelector field, we examined its memory usage in controlled and uncontrolled scenarios to determine whether our controller needed to take any action. As shown in Figure 9, it is apparent that while Ratings does not require a dedicated Sidecar, configuring a Sidecar without any workloadSelector field so that it applies to all workloads in a particular namespace is preferred. This configuration ensures that the service only communicates with other services running in the same namespace and the Istio control plane (e.g., as required by Istio’s egress functionality). The experimental results demonstrate the effectiveness of our controller in managing memory consumption and ensuring efficient utilization of resources for services with different business relationships. The DATM approach effectively maintains low memory usage of Envoy proxies and provides insights into the optimal configuration strategy for different service scenarios.
In summary, the amount of statically configured agent memory varies significantly depending on the size of the load, whereas our proposed controller ensures minimal memory utilization. During high system load, dynamic and static configurations do not impact standard application services. These results demonstrate that our controller maintains reasonable resource utilization and application capabilities. Furthermore, as the time lengthens, the updates to the query’s scope also increase. However, our approach demonstrates that as the time window lengthens, the effect on Envoy memory becomes negligible without increasing the load. This is due to establishing an effective isolation mechanism, ensuring that the service remains unaffected by any additional updates in the cluster unrelated to the service. These experimental results validate the efficiency and effectiveness of our controller in optimizing memory utilization and ensuring stable performance in microservice clusters.

5.2. Number of Updates Evaluation

Comparing the number of updates. We compare the update metric using Prometheus’ envoy_cluster_manager_cluster_updated. In the passive distribution mode of Pilot–Envoy communication, Envoy subscribes to specific resource events, and configurations are generated and delivered upon resource updates. The full update of Envoy ensures strongly consistent configuration synchronization as a stream. Still, it greatly burdens the entire mesh, as any change triggers a full configuration delivery.
Figure 10 illustrates the performance comparison among the Default delivery mode, Lazyload mode, and DATM mode in Istio. Notably, the DATM mode effectively reduces redundant service data compared to the other two modes. In the Default mode, the number of CDS (Cluster Discovery Service) updates increases dramatically as the workload within the cluster grows. However, both DATM and Lazyload scenarios exhibit fewer updates than the Default mode when the cluster load and the number of instances increase. This is due to the restrictions imposed on configuration distribution. Lazyload adopts a scaffolding approach using operator-sdk to modify the limit’s scope, while DATM represents a more radical rewriting of the sidecar (CR). With our controller in action, the control plane delivers the xDS protocol associated with Sidecar, limiting service visibility to ensure that the Envoy does not receive full xDS updates in the same scenario. As a result, the service is shielded from unnecessary update requests within the system.
Our proposed approach effectively saves storage space and significantly reduces the number of cluster updates, thereby enhancing system resource utilization. Specifically, the DATM algorithm reduces configuration traffic by distributing configuration updates only to the affected services, rather than broadcasting them across all services. This not only improves the performance of the service mesh but also reduces the workload of control plane agents.
Comparing the number of CDS and EDS. The data presented in Figure 11 represents an increasing load on the service namespace, with three values in each data set. Figure 11a indicates the number of CDS detected in the DATM, Lazyload, and Default namespaces, while Figure 11b represents the number of EDS detected in the same namespaces. CDS dynamically retrieves cluster information and is the xDS protocol that undergoes the most changes. An Envoy proxy typically abstracts an upstream cluster as a traffic forwarding target, which can be accessed by an Envoy listener (for TCP) or a route (for HTTP). As the number of namespaces increases, it signifies a growth in the number of services within the entire cluster, resulting in a heavier load on the cluster. The advantage of DATM becomes more prominent in larger clusters because, in such cases, a higher percentage of individual services prefer not to communicate with other services. This leads to the interception of more stored information, resulting in relatively fewer CDS and EDS entries being stored. We demonstrate the effectiveness of our traffic management approach using an official microservice case, where user traffic is minimally affected and user requests are not blocked. The observed performance loss is also relatively small, demonstrating the efficiency and effectiveness of our approach to managing traffic in microservice clusters. These experiments provide insights into the performance and scalability of our DATM approach, showcasing its benefits in handling increasing loads and optimizing resource utilization in large-scale microservice deployments.

5.3. Time of Configuration Distribution Evaluation

Effect on Configuration Distribution. Figure 12 summarizes the effect of different distribution policies on configuration processing time. By monitoring the istio_agent_pilot_xds_push_time_sum metric, we can obtain the total configuration push time. We can evaluate the performance and efficiency of configuration pushing in the DATM scenario by observing the difference in the istio_agent_pilot_xds_push_time_sum metric at the time of an update, which corresponds to the lower part of Figure 12. By monitoring the istio_agent_pilot_proxy_convergence_time_sum metric, we can observe the convergence time of agents, which corresponds to the upper part of Figure 12. Our evaluation shows that the agent convergence time is higher for services at the back of the call chain. The agent convergence time is the time from when the agent receives a new configuration until it is loaded, takes effect, and the agent starts processing requests. Using the DATM mechanism, we found a lower convergence time, indicating that the agent can adapt to the new configuration and start processing traffic quickly.
In summary, the DATM mechanism provides a more flexible and dynamic approach to configuration management. It makes it possible to apply and update an agent’s configuration more quickly as changes occur, without redeploying the entire service. This improves system maintainability and configuration consistency. We can understand the speed of configuration updates and agent performance by evaluating metrics such as configuration dispatch time and agent convergence time. These metrics show that our approach performs better in terms of configuration issuance efficiency and agent adaptation to new configurations, improving the performance and maintainability of the system.

6. Related Works

Service governance in cloud applications is a popular and well-researched topic. Currently, extensive research on system architecture, dynamic performance modeling, and traffic management exists. Most of them focus on the intersection of these areas. Some particularly notable individual studies are highlighted below.
System Architecture. Spring Cloud [17] represents one of the most common microservice architectures, with a Software Development Kit (SDK) built into the application. However, its service governance capabilities are unsuitable for integrating heterogeneous systems. To address this issue, Delavergne et al. [18] introduce a service mesh to enable business processes to focus more on business logic. This approach meets the business requirements and thus ensures smooth operation. Aldea et al. [19] introduce a cybersecurity platform with a microservice architecture. Additionally, the Istio service mesh provides various functionalities extending the high-order capabilities of cloud-native systems. Furthermore, Cilium [20] adds network security filtering to Linux container systems (e.g., Docker and Kubernetes) using eBPF, which enforces security policies of containers and pods at the network and application layers. Yang et al. [21] research a native serverless system. The serverless computing [22,23] paradigm holds great promise for the next generation of microservices.
Dynamic Performance Management. The efficient management of resources in cloud environments has been extensively studied [24,25,26]. Estimating resources based on maximum traffic during deployment often leads to resource waste [27]. To address this, Bao et al. [28] proposed a performance modeling and task scheduling method for microservices, considering their performance overhead. Suresh et al. [29] introduced a service-oriented architecture with rate restrictions and per-service scheduling to optimize deadline compliance. Control solutions based on the Sidecar [30] have gained popularity in the industry. Lazyload-based modular architectures, such as Slime [31] and the lazyXds scheme [32], offer effective resource management and configuration optimization. Several studies have analyzed the performance and cost implications of applications on public platforms. For instance, Lin et al. [33] proposed modeling and optimization algorithms for Function-as-a-Service (FaaS) applications, considering performance and cost constraints. Li et al. [34] offer TETRIS to reduce the memory footprint of inference services through runtime sharing and tensor sharing in serverless platforms. Scheduling strategies based on heuristics [35] allocate resources efficiently by analyzing the performance demands of individual services. Dynamic resource allocation techniques, such as autoscaling and scheduling [20], enhance resource utilization and performance. The impact of microservice invocation topology on overall application performance is another area of research. To cope with the dynamic nature of microservice systems, some techniques focus on more efficient use of resources, such as scheduling techniques [20]. Additionally, techniques that restrict access to resources to limit performance have been explored. These approaches help manage traffic, balance the elastic load, ensure security, and improve observability.
Traffic Management. Efficient traffic management is crucial for maintaining the QoS in microservice clusters. Recent studies have highlighted various aspects of traffic management that require further research [36]. Strategies such as cache-based circuit-breaker [37] and algorithmic optimization [38,39] have been proposed to address request failures and increase service delivery efficiency. Traffic management in multi-cloud environments has also received significant attention. For example, studies have focused on improving load-balancing performance through decentralized algorithms in centralized systems [40].
This paper differs from previous work in the following aspects: (1) We propose a controller solution for dependency-aware adaptive management of control-plane traffic. (2) We achieve fine-grained performance and cost savings for microservices by accumulating the accurate dependencies of the hub service. (3) We design an optimization algorithm to solve the full configuration distribution problem with instance-level constraints in a fine-grained manner. In summary, this paper contributes to the existing body of research by addressing the challenges of cost reduction, performance optimization, and traffic management in cloud applications and microservices. The proposed DATM mechanism offers a novel approach to managing control-plane traffic efficiently, considering dependencies and achieving fine-grained performance and cost savings. The experimental results demonstrate the effectiveness of our approach in improving resource utilization and system performance.

7. Conclusions

In this paper, we propose a configuration traffic distribution mechanism (DATM) to improve resource utilization and guarantee the quality of microservice applications in clusters. Our algorithm analyzes the characteristics of configuration information in service meshes and leverages inter-service dependency relationships to control data-plane traffic. By combining monitoring, information processing, and policy distribution, our approach distributes configurations on demand, dynamically maintaining cluster services and reducing the memory usage of data-plane agents. Our proposed method provides a practical solution to the problem of full-volume configuration traffic distribution in the service mesh. It improves the resource utilization of microservice clusters and guarantees the performance of microservice applications. The proposed approach can be easily integrated into existing microservice architectures and applied to various applications, including cloud-native applications and large-scale distributed systems.

Author Contributions

Conceptualization, N.W., L.W., X.L. and X.Q.; methodology, N.W., L.W., X.L. and X.Q.; software, N.W. and L.W.; validation, N.W. and L.W.; formal analysis, N.W. and L.W.; investigation, N.W. and L.W.; resources, N.W. and L.W.; data curation, N.W. and L.W.; writing—original draft preparation, N.W. and L.W.; writing—review and editing, X.L. and X.Q.; visualization, N.W. and L.W.; supervision, X.L. and X.Q.; project administration, N.W. and L.W.; funding acquisition, X.L. and X.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Fundamental Research Funds for the Central Universities under Grant NT2022028.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors would like to thank all authors of previous papers for approving the use of their published research results in this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
API	Application Programming Interface
CDS	Cluster Discovery Service
CPU	Central Processing Unit
CRD	Custom Resource Definition
DATM	Dependency-Aware Traffic Management mechanism
DEM	Dependencies of Microservices
DR	DestinationRule
EDS	Endpoint Discovery Service
FaaS	Function-as-a-Service
GKE	Google Kubernetes Engine
GL	Global Service
GW	Gateway
HB	Hub Service
HTTP	Hypertext Transfer Protocol
MSA	Microservices Architectures
QoS	Quality of Service
SDK	Software Development Kit
SOA	Service-Oriented Architecture
TSDB	Time Series Database
VR	VirtualService
xDS	x Discovery Service
YAML	YAML Ain't Markup Language

References

  1. Gan, Y.; Delimitrou, C. The architectural implications of cloud microservices. IEEE Comput. Archit. Lett. 2018, 17, 155–158. [Google Scholar] [CrossRef]
  2. Wang, S.; Guo, Y.; Zhang, N.; Yang, P.; Zhou, A.; Shen, X. Delay-Aware Microservice Coordination in Mobile Edge Computing: A Reinforcement Learning Approach. IEEE Trans. Mob. Comput. 2021, 20, 939–951. [Google Scholar] [CrossRef]
  3. Sprott, D.; Wilkes, L. Understanding Service-Oriented Architecture. Archit. J. 2004, 1, 10–17. [Google Scholar]
  4. Merkel, D. Docker: Lightweight linux containers for consistent development and deployment. Linux J. 2014, 239, 2. [Google Scholar]
  5. Kubernetes. Available online: https://kubernetes.io (accessed on 3 March 2023).
  6. Rejiba, Z.; Chamanara, J. Custom Scheduling in Kubernetes: A Survey on Common Problems and Solution Approaches. ACM Comput. Surv. 2022, 55, 151. [Google Scholar] [CrossRef]
  7. Lewis, J.; Fowler, M. Microservices. Library Catalog. 2014. Available online: https://martinfowler.com/ (accessed on 3 March 2023).
  8. Li, W.; Lemieux, Y.; Gao, J.; Zhao, Z.; Han, Y. Service Mesh: Challenges, State of the Art, and Future Research Opportunities. In Proceedings of the 2019 IEEE International Conference on Service-Oriented System Engineering (SOSE), San Francisco, CA, USA, 4–9 April 2019; pp. 122–1225. [Google Scholar] [CrossRef]
  9. Wang, J.; Cao, J.; Wang, S.; Yao, Z.; Li, W. IRDA: Incremental Reinforcement Learning for Dynamic Resource Allocation. IEEE Trans. Big Data 2022, 8, 770–783. [Google Scholar] [CrossRef]
  10. Costa, B.G.; Bachiega, J., Jr.; de Carvalho, L.R.; Araújo, A.P.F. Orchestration in Fog Computing: A Comprehensive Survey. ACM Comput. Surv. 2023, 55, 29. [Google Scholar] [CrossRef]
  11. Istio—Connect, Secure, Control, and Observe Services. Available online: https://istio.io/ (accessed on 1 March 2023).
  12. Song, J.; Guo, X.; Ma, R. Istio Handbook—Advanced Practice of Istio Service Mesh; Electronic Industry Press: Beijing, China, 2020; Volume 8, pp. 10–18. [Google Scholar]
  13. Envoyproxy. Envoy—Adaptive Concurrency Filter. 2023. Available online: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_filters/adaptive_concurrency_filter (accessed on 17 March 2023).
  14. Github—Bookinfo Sample. 2022. Available online: https://github.com/istio/istio/tree/master/samples/bookinfo (accessed on 10 March 2022).
  15. GitHub—Isotope. Available online: https://github.com/istio/tools/tree/master/perf/load (accessed on 23 December 2022).
  16. Leva, A.; Maggio, M. The PI+p controller structure and its tuning. J. Process. Control 2009, 19, 1451–1457. [Google Scholar] [CrossRef]
  17. Christudas, B. Practical Microservices Architectural Patterns: Event-Based Java Microservices with Spring Boot and Spring Cloud; Apress: Pune, India, 2019. [Google Scholar]
  18. Delavergne, M.; Cherrueau, R.; Lebre, A. A Service Mesh for Collaboration Between Geo-Distributed Services: The Replication Case. In Proceedings of the Agile Processes in Software Engineering and Extreme Programming—Workshops—XP 2021 Workshops, Virtual Event, 14–18 June 2021; Springer: Berlin/Heidelberg, Germany, 2021. [Google Scholar] [CrossRef]
  19. Aldea, C.L.; Bocu, R.; Vasilescu, A. Relevant Cybersecurity Aspects of IoT Microservices Architectures Deployed over Next-Generation Mobile Networks. Sensors 2022, 23, 189. [Google Scholar] [CrossRef] [PubMed]
  20. Kuznetsov, A.V. Protein transport in the connecting cilium of a photoreceptor cell: Modeling the effects of bidirectional protein transitions between the diffusion-driven and motor-driven kinetic states. Comput. Biol. Med. 2013, 47, 758–764. [Google Scholar] [CrossRef] [PubMed]
  21. Yang, Y.; Zhao, L.; Li, Y.; Zhang, H.; Li, J.; Zhao, M.; Chen, X.; Li, K. INFless: A Native Serverless System for Low-Latency, High-Throughput Inference. In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, New York, NY, USA, 28 February 2022–4 March 2022; pp. 768–781. [Google Scholar] [CrossRef]
  22. Zhou, Z.; Zhang, Y.; Delimitrou, C. AQUATOPE: QoS-and-Uncertainty-Aware Resource Management for Multi-Stage Serverless Workflows. In Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, New York, NY, USA, 25–29 March 2022; Volume 1, pp. 1–14. [Google Scholar] [CrossRef]
  23. Shahrad, M.; Fonseca, R.; Goiri, I.; Chaudhry, G.; Batum, P.; Cooke, J.; Laureano, E.; Tresness, C.; Russinovich, M.; Bianchini, R. Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider. In Proceedings of the 2020 USENIX Annual Technical Conference (USENIX ATC 20), Virtual, 15–17 July 2020; pp. 205–218. [Google Scholar]
  24. Alencar, D.; Both, C.; Antunes, R.; Oliveira, H.; Cerqueira, E.; Rosário, D. Dynamic Microservice Allocation for Virtual Reality Distribution With QoE Support. IEEE Trans. Netw. Serv. Manag. 2022, 19, 729–740. [Google Scholar] [CrossRef]
  25. Wu, H.; Alay, Ö.; Brunstrom, A.; Ferlin, S.; Caso, G. Peekaboo: Learning-Based Multipath Scheduling for Dynamic Heterogeneous Environments. IEEE J. Sel. Areas Commun. 2020, 38, 2295–2310. [Google Scholar] [CrossRef]
  26. Shah, S.Y.; Dang, X.H.; Zerfos, P. Root Cause Detection using Dynamic Dependency Graphs from Time Series Data. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 1998–2003. [Google Scholar] [CrossRef]
  27. Jennings, B.; Stadler, R. Resource Management in Clouds: Survey and Research Challenges. J. Netw. Syst. Manag. 2015, 23, 567–619. [Google Scholar] [CrossRef]
  28. Bao, L.; Wu, C.Q.; Bu, X.; Ren, N.; Shen, M. Performance Modeling and Workflow Scheduling of Microservice-Based Applications in Clouds. IEEE Trans. Parallel Distrib. Syst. 2019, 30, 2101–2116. [Google Scholar] [CrossRef]
  29. Suresh, L.; Bodík, P.; Menache, I.; Canini, M.; Ciucu, F. Distributed Resource Management across Process Boundaries. In Proceedings of the 2017 Symposium on Cloud Computing, SoCC 2017, Santa Clara, CA, USA, 24–27 September 2017; pp. 611–623. [Google Scholar] [CrossRef] [Green Version]
  30. Bhattacharya, R. Smart Proxying for Microservices. In Proceedings of the 20th International Middleware Conference Doctoral Symposium, Davis, CA, USA, 9–13 December 2019; pp. 31–33. [Google Scholar] [CrossRef]
  31. GitHub—Aeraki-Framework/Aeraki. Available online: https://github.com/aeraki-framework/aeraki (accessed on 15 February 2023).
  32. GitHub—Slime-io/Slime. Available online: https://github.com/slime-io/slime (accessed on 17 March 2023).
  33. Lin, C.; Mahmoudi, N.; Fan, C.; Khazaei, H. Fine-Grained Performance and Cost Modeling and Optimization for FaaS Applications. IEEE Trans. Parallel Distrib. Syst. 2023, 34, 180–194. [Google Scholar] [CrossRef]
  34. Li, J.; Zhao, L.; Yang, Y.; Zhan, K.; Li, K. Tetris: Memory-Efficient Serverless Inference through Tensor Sharing. In Proceedings of the 2022 USENIX Annual Technical Conference (USENIX ATC 22), Carlsbad, CA, USA, 11–13 July 2022. [Google Scholar]
  35. Delimitrou, C.; Kozyrakis, C. Paragon: Qos-Aware Scheduling For Heterogeneous Datacenters. Comput. Archit. News 2013, 41, 77–88. [Google Scholar] [CrossRef] [Green Version]
  36. Xie, X.; Govardhan, S.S. A Service Mesh-Based Load Balancing and Task Scheduling System for Deep Learning Applications. In Proceedings of the 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), Melbourne, VIC, Australia, 11–14 May 2020; pp. 843–849. [Google Scholar] [CrossRef]
  37. Saleh Sedghpour, M.R.; Klein, C.; Tordsson, J. An Empirical Study of Service Mesh Traffic Management Policies for Microservices. In Proceedings of the 2022 ACM/SPEC on International Conference on Performance Engineering, Beijing, China, 9–13 April 2022; pp. 17–27. [Google Scholar] [CrossRef]
  38. Auriol, J.; Boussaada, I.; Shor, R.J.; Mounier, H.; Niculescu, S. Comparing Advanced Control Strategies to Eliminate Stick-Slip Oscillations in Drillstrings. IEEE Access 2022, 10, 10949–10969. [Google Scholar] [CrossRef]
  39. Xu, M.; Buyya, R. Brownout Approach for Adaptive Management of Resources and Applications in Cloud Computing Systems: A Taxonomy and Future Directions. ACM Comput. Surv. 2019, 52, 1–27. [Google Scholar] [CrossRef]
  40. Rusek, M.; Landmesser, J. Time Complexity of an Distributed Algorithm for Load Balancing of Microservice-oriented Applications in the Cloud. ITM Web Conf. 2018, 21, 18. [Google Scholar] [CrossRef] [Green Version]
Figure 1. An example of a request.
Figure 2. Examples of triggering updates. (a) An edge service triggers an update in default mode. (b) An edge service triggers an update under controller processing. (c) Service with a calling relationship triggers an update under controller processing.
Figure 3. DATM architecture overview.
Figure 4. Implementation process of the control extractor.
Figure 5. Worst-case scenario.
Figure 6. Lazy configuration.
Figure 7. The generated workload scenario.
Figure 8. Envoy server memory allocated for microservices in different workload scenarios. (a) Productpage envoy memory. (b) Ratings envoy memory.
Figure 9. The impact of DATM on Ratings in the same namespace.
Figure 10. The number of CDS updates.
Figure 11. The number of CDS/EDS in different workload scenarios. (a) CDS; (b) EDS.
Figure 12. The configuration distribution time of the three strategies.
Table 1. Definition of Notations.

Notation    Definition
Svc         a set of microservices; svc is a service
I_i         the number of instances of microservice svc_i
svc_i^m     instance m of service i
γ_svc_j     status of service j
χ           whether the service is running
ψ           whether the service is a business service
ξ           namespaces to which the service belongs
σ           related destinations of this service
L           number of load
C           size of configuration
U           storage space occupation
TIME        configuration distribution time
T_conv      agent receive time
T_push      push queue time
o(Net)      network latency
EI          performance enhancement
MEM         memory footprint for a single service
M           microservice execution history metric
RB_list     running business services list
GL_queue    priority queue which records all applications in the cluster
D_list      services which have the calling relationship
DEM_map     all relationships between receiving services
Table 2. Prometheus Metric.

         Filter Properties                                         Other Properties
s_1      status_1, security.istio.io/tlsMode_1, destination_1     value_1, resultType_1
s_2      status_2, security.istio.io/tlsMode_2, destination_2     value_2, resultType_2
...      ...                                                       ...
s_i      status_i, security.istio.io/tlsMode_i, destination_i     value_i, resultType_i
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
