*Article* **Software Aging Effects on Kubernetes in Container Orchestration Systems for Digital Twin Cloud Infrastructures of Urban Air Mobility**

**Jackson Costa 1, Rubens Matos 1,2,\*, Jean Araujo 1,3, Jueying Li 4, Eunmi Choi 5, Tuan Anh Nguyen 4,6,\*, Jae-Woo Lee 6,\* and Dugki Min 4,\***


**Abstract:** It is necessary to develop a vehicle digital twin (DT) for urban air mobility (UAM) that uses an accurate, physics-based emulator to model the statics and dynamics of a vehicle. This is because the use of digital twins in the operation and control of UAM vehicles is essential for the UAM operational digital twin infrastructure (UAM-ODT). There are several issues that need to be addressed in this process: (i) the lack of digital twin engines for the digitalization (twinization) of the dynamics and control of UAM vehicles at the core of UAM-ODT systems; (ii) the lack of back-end system engineering in the development of UAM vehicle DTs; and (iii) the lack of fault-tolerant mechanisms for the DT cloud back-end system to run uninterrupted operations 24/7. On the other hand, software aging and rejuvenation are becoming increasingly important in a variety of computing scenarios as the demand for reliable and available services increases. With the increasing use of containerized systems, there is also a need for an orchestrator to support easy management and reduce operational costs. In this paper, an operational digital twin (ODT) of a typical urban air mobility (UAM) infrastructure is developed on a private cloud system based on Kubernetes using a proposed cloud-in-the-loop simulation approach. To ensure the ODT can provide uninterrupted operational control and services in UAM around the clock, we propose a methodology for investigating software aging in Kubernetes-based containerized clouds. We evaluate the behavior of Kubernetes software using the Nginx and K3S tools while they manage pods in an accelerated lifetime experiment. We continuously execute operations for creating and terminating pods, allowing us to observe the utilization of computing resources (e.g., CPU, memory, and I/O), the performance of the Nginx and K3S environments, and the response time of an application hosted in those environments. 
In some conditions and for specific metrics, such as virtual memory usage, we observed the effects of software aging, including a memory leak that is not fully cleared when the cluster is stopped. These issues could lead to system performance degradation and eventually compromise the reliability and availability of the system when it crashes due to memory space exhaustion or full utilization of swap space on the hard disk. This study helps with the deployment and maintenance of virtualized environments from the standpoint of system dependability in digital twin computing infrastructures where a large number of services are running under strict continuity requirements.

**Keywords:** operational digital twin; urban air mobility; cloud-in-the-loop simulation; software aging; software rejuvenation; Kubernetes; Nginx; K3S

**Citation:** Costa, J.; Matos, R.; Araujo, J.; Li, J.; Choi, E.; Nguyen, T.A.; Lee, J.-W.; Min, D. Software Aging Effects on Kubernetes in Container Orchestration Systems for Digital Twin Cloud Infrastructures of Urban Air Mobility. *Drones* **2023**, *7*, 35. https://doi.org/10.3390/drones7010035

Academic Editors: Ivana Semanjski, Antonio Pratelli, Massimiliano Pieraccini, Silvio Semanjski, Massimiliano Petri and Sidharta Gautama

Received: 19 October 2022 Revised: 11 December 2022 Accepted: 22 December 2022 Published: 3 January 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Digital twin (DT) technology is a cutting-edge innovation that has the potential to revolutionize various industries. DT involves creating a virtual replica of a physical object or system, and using data-driven analysis and decision-making to continuously update and improve it. The virtual replica, or digital twin, is made up of computational models that evolve and change over time, reflecting the structure, behavior, and environment of the physical object or system they represent [1,2]. Digital twin systems are digital representations of physical systems, such as vehicles, buildings, or manufacturing processes. They are used to simulate the behavior and performance of the physical system, and to predict its behavior or performance under different conditions. This can be useful for a variety of applications, such as planning for maintenance, optimizing the operation of the physical system, or analyzing the impact of changes to the system's design or operation.

The development of an operational vehicle digital twin system for urban air mobility (UAM-ODT) includes the following fundamental modules: (i) neural digital twin dynamic engines (DTDE), (ii) neural digital twin control engines (DTCE), (iii) digital twin control frame (DTCF), and (iv) digital twin cloud infrastructure (DTCI) as shown in Figure 1. The DTDE module is responsible for creating a virtual replica of the aerodynamics of UAM vehicles using learning-based techniques. The DTCE module performs control tasks, such as robust control, optimal control, and adaptive control, to ensure the safety of the vehicle. These two modules digitalize the dynamics and control of the vehicle to ensure that the operations of the vehicle in the digital space are identical to those in the physical space. The DTCF module serves as a bridge between the digital twin and the physical twin of the vehicle. It can provide teleoperation services, fault-tolerant control, or traffic prediction and management, with the belief that if the dynamics and control of the physical vehicle are accurately captured in the digital space along with the digital environment (e.g., city, region, country), the operations in the digital space can be effectively transferred to the physical space. The DTCI module is the common computing platform that hosts the entire UAM-ODT system, running constantly to create a virtual space of the real-world UAM physical infrastructure. Due to the stringent requirements for the high availability of the digital twin system, the DTCI must handle any failures and maintain constant digital operations and services in the long run. Particularly, if a digital twin runs all day and night, it can be subject to a phenomenon known as "software aging". 
Software aging is the gradual deterioration of the performance and reliability of software over time, due to factors such as changes in the operating environment, errors and defects in the software, or the accumulation of wear and tear on the software. If a digital twin runs continuously, it can experience software aging more quickly than if it were run only intermittently. This can cause the digital twin to become less accurate and less reliable over time, which can affect the quality of the predictions and decisions it makes. In this work, we investigate software aging problems in a digital twin cloud infrastructure built on a Kubernetes-based cloud environment using a cloud-in-the-loop simulation approach.

Software aging is a phenomenon that occurs when software systems become less reliable and less efficient over time. This can happen for a variety of reasons, such as changes in the environment, changes in the software itself, or the accumulation of errors and defects. When software ages, it can become less accurate and less reliable, which can affect the performance and behavior of the systems that it is used to control or manage. Software aging occurs in running software systems, causing continuous performance degradation and sudden failures such as crashes; a proactive strategy such as software rejuvenation can circumvent these effects and avoid abrupt system interruptions [3,4]. The phenomenon is highly relevant because the demand for available and reliable services has increased in practically all areas, where service quality determines competitiveness. Computing, health, security, financial, geolocation, and routing services are examples with high availability requirements. To meet such service demands without the unwanted effects of software aging, an architecture is needed that can sustain the service without the large operational costs of employing several redundant, high-powered servers, which require human resources for their handling and management and also incur higher energy costs. Virtual machines have been an alternative in such contexts because they provide the functionality of a physical server on the same traditional computational architecture. Thus, it is possible to create several virtual machines on a single server, and each virtual machine can run a different environment, allowing the execution of heterogeneous systems [5]. 
The scalability and flexibility of IT (Information Technology) can be increased through virtualization, in addition to generating significant savings in operational costs. Thus, IT administration becomes easier to manage by obtaining better availability, operability, performance, and greater workload mobility through virtualization [6].

**Figure 1.** Operational Digital Twin for Urban Air Mobility (UAM-ODT).

If the flight control software of an unmanned aerial vehicle (UAV) experiences software aging, it can affect the performance and behavior of the UAV. As the software ages, it can become less accurate and less reliable, which can cause the UAV to behave in unexpected or unsafe ways. To address software aging in the flight control software of a UAV, it is important to periodically update and maintain the software. This can involve installing patches and updates, fixing errors and defects, and re-tuning or re-calibrating the software to account for changes in the environment or the UAV itself. Regular maintenance and updates can help to ensure that the flight control software remains accurate and reliable over time, and can help to prevent or mitigate the effects of software aging. In some cases, software aging can cause the flight control software to become unstable or unreliable. If this happens, it may be necessary to take the UAV out of service temporarily in order to perform maintenance or repairs. This can involve replacing or upgrading the flight control software, or making other changes to the UAV in order to improve its performance and reliability. So, software aging in the flight control software of a UAV can affect the performance and behavior of the UAV. To address this problem, it is important to periodically update and maintain the flight control software, and to take the UAV out of service if necessary in order to perform maintenance or repairs. This can help to ensure that the UAV remains safe and reliable over time.

To create a digital twin, a mathematical model of the physical system is created using data about the system's behavior and performance. This model is then used to simulate the behavior of the physical system under different conditions, and to make predictions about its performance. In order to create a reliable and accurate digital twin, it is important to use accurate and reliable software to create the model and simulate the system's behavior. However, software aging can be a problem for digital twin systems. As the software used to create and simulate the digital twin ages, it can become less accurate and less reliable. This can affect the accuracy and reliability of the digital twin, and can cause it to produce incorrect or inconsistent predictions. In some cases, this can lead to incorrect or sub-optimal decisions or actions based on the digital twin's predictions. To address software aging problems in digital twin systems, it is important to periodically update and maintain the software used to create and simulate the digital twin. This can involve installing patches and updates, fixing errors and defects, and re-tuning or re-calibrating the software to account for changes in the environment or the system being modeled. Regular maintenance and updates can help to ensure that the digital twin remains accurate and reliable over time, and can help to prevent or mitigate the effects of software aging.

Digital twin systems can experience a variety of errors, depending on the specific characteristics of the system and the software being used. Common types include data errors, modeling errors, and software errors. To address these errors and improve the accuracy and reliability of the digital twin, it is important to carefully collect and pre-process the data, to create accurate and well-calibrated models, and to use high-quality software that is free of errors and defects.

When using virtualization, it is possible to implement many servers on a smaller number of hosts (physical servers), which saves physical space and reduces energy costs. However, once a virtual machine is initialized, it loads not just a copy of the Operating System (OS) but also all of the virtualized hardware on which that OS runs, consuming many system resources and making virtualization very expensive from a computational point of view [7]. Containerization mitigates the operational cost of traditional virtualization, as stated in [5], which addresses host-level virtualization, known as containers, as another type of virtualization. This type of virtualization runs on top of the physical server and supports several independent systems; since the physical server already has an OS installed, there is no need to load all the host hardware or a copy of it. Container-based virtualization has recently gained much attention [8,9]. It makes an application run efficiently in the most varied computing environments by encapsulating the application together with its dependencies [10]. The author of [11] describes this virtualization technique as lightweight, as the system significantly decreases workloads by sharing OS resources from the host. Containers provide an isolated environment for system resources such as processes, file systems, and networks to run at the host OS level, without having to run a Virtual Machine (VM) with its own OS on top of virtualized hardware. By sharing the same OS kernel, containers start much faster and use a small amount of system memory compared to booting an entire virtualized OS as in VMs [10]. Kubernetes is a widely used tool for configuring, maintaining, and managing solutions that adopt containers instead of VMs. Thus, this work aims to evaluate the effects of software aging and the performance of Kubernetes when undergoing a high-stress load, characterized by creating replicas of pods to maintain service availability in the Nginx and K3S environments. Furthermore, the aging problem on an unmanned vehicle refers to the degradation of the vehicle's performance over time, due to factors such as wear and tear, corrosion, and obsolescence. As an unmanned vehicle ages, its components may become less reliable and less capable of performing their intended functions, which can affect the vehicle's ability to operate safely and effectively. The aging problem can be particularly challenging for unmanned vehicles, as they often operate in harsh or hostile environments, and they may be subjected to high levels of stress and strain. For example, an unmanned aerial vehicle (UAV) may experience high levels of vibration and air turbulence during flight, which can cause its components to wear out faster. 
Similarly, an unmanned underwater vehicle (UUV) may be exposed to corrosive saltwater, which can cause its components to corrode and deteriorate over time. In this work, we focus on a digital twin cloud infrastructure and investigate the software aging phenomenon in its Kubernetes-based cloud environment when it hosts the UAM-ODT with no downtime.
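The high-stress load described above, in which pod replicas are repeatedly created and terminated, can be illustrated with a minimal Python sketch. It is not the authors' experiment script: the function names, the deployment name, the replica counts, and the pause between steps are all hypothetical, and the only external command assumed is the standard `kubectl scale deployment <name> --replicas=<n>`. The command runner is passed in as a parameter so that the loop can be exercised without a cluster.

```python
import subprocess
import time
from typing import Callable, List, Sequence

# A runner takes a command (list of arguments) and executes it.
Runner = Callable[[Sequence[str]], None]


def kubectl_runner(cmd: Sequence[str]) -> None:
    """Execute a command and raise if it fails (needs kubectl on PATH)."""
    subprocess.run(list(cmd), check=True)


def scale_cmd(deployment: str, replicas: int) -> List[str]:
    """Build the kubectl command that sets the replica count of a deployment."""
    return ["kubectl", "scale", "deployment", deployment, f"--replicas={replicas}"]


def stress_cycles(run: Runner, deployment: str, high: int, low: int,
                  cycles: int, pause_s: float = 0.0) -> None:
    """Alternate between `high` and `low` replicas to force pod churn.

    Each cycle scales up (pods are created) and then down (pods are
    terminated), which is the kind of workload an accelerated lifetime
    experiment repeats for a long time.
    """
    for _ in range(cycles):
        for replicas in (high, low):
            run(scale_cmd(deployment, replicas))
            time.sleep(pause_s)


# Hypothetical usage against a real cluster (all values are assumptions):
# stress_cycles(kubectl_runner, "nginx-demo", high=50, low=1,
#               cycles=1000, pause_s=30)
```

Injecting the runner keeps the churn logic separate from the cluster itself, so the same loop could drive different Kubernetes distributions or be dry-run by collecting the commands in a list.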

The study in this work extends the related research area on software aging in virtualized environments through the following *key contributions:*

- It is important to stress that aging events found in test-bed experiments indicate the threats of system failures and performance degradation due to software aging symptoms. However, the time at which those events will occur depends on the characteristics and intensity of the workload that the system needs to process, as well as the hardware and software specification of that Kubernetes system.
- If the system has more resources available or less workload than those employed in this experiment, the aging phenomenon would be slower, and subsequently, the failures due to resource exhaustion would take longer to occur. This fact does not reduce the importance of evaluating software aging in those systems as well as planning actions for their mitigation.
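The dependence of failure time on available resources and workload intensity can be made concrete with a small worked sketch. It assumes, purely for illustration, that memory usage grows roughly linearly; it estimates the growth rate with the Theil-Sen estimator (the median of pairwise slopes, a trend estimator commonly used in software aging studies) and extrapolates from the last sample to a given capacity. The function names and all numbers are hypothetical and are not taken from the experiments in this paper.

```python
from itertools import combinations
from statistics import median
from typing import Optional, Sequence


def sen_slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Theil-Sen estimator: median of the slopes between all sample pairs."""
    slopes = [(v2 - v1) / (t2 - t1)
              for (t1, v1), (t2, v2) in combinations(zip(times, values), 2)
              if t2 != t1]
    if not slopes:
        raise ValueError("need at least two samples with distinct times")
    return median(slopes)


def hours_to_exhaustion(times_h: Sequence[float], used_mb: Sequence[float],
                        capacity_mb: float) -> Optional[float]:
    """Extrapolate the trend from the last sample until capacity is reached.

    Returns None when no growth is detected. This is a linear
    simplification; real aging trends may be nonlinear.
    """
    slope = sen_slope(times_h, used_mb)
    if slope <= 0:
        return None
    return (capacity_mb - used_mb[-1]) / slope


# Hypothetical samples: memory grows by 10 MB per hour.
hours = [0, 1, 2, 3, 4]
used = [1000, 1010, 1020, 1030, 1040]
print(hours_to_exhaustion(hours, used, capacity_mb=2040))  # 100.0
print(hours_to_exhaustion(hours, used, capacity_mb=4040))  # 300.0
```

In this toy case, adding 2000 MB of capacity moves the estimated exhaustion time from 100 to 300 hours, while the leak itself is unchanged: more resources delay the failure but do not remove the need for mitigation.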

To the best of our knowledge, this work contributes to the practical implementation and maintenance of virtualized environments from the perspective of system dependability in digital twin computing infrastructures, in which a large number of services run under a stringent continuity requirement. The findings of this study improve the understanding of software aging phenomena in digital twin computing infrastructures built on top of Kubernetes, a topic that is at a very early stage in current research on software aging for highly dependable and fault-tolerant digital twin computing infrastructures.

The remainder of this paper is organized as follows. Section 2 addresses the related works that inspired this study on software aging assessment; Section 3 presents the fundamental concepts and system design used in this work; Section 4 deals with the methodology used in the research; Section 5 discusses the objective and planning of the experiments, covering the context in which they were produced, the tools selected, the variables involved, the scripts for reproduction, and the hardware used. The results are presented and discussed in Section 6. Section 7 presents the concluding remarks arising from our research results.

#### **2. Related Work**

The work described in [10] analyzes the performance of running containers with services hosted on them, carrying out experiments with containers monitoring system resources, including network, memory, disk, and CPU. The testbed environment consists of a Kubernetes cluster manually deployed to carry out the evaluation, considering the Microsoft Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), or Amazon Elastic Container Service for Kubernetes (EKS).

The authors in [12] evaluated the memory utilization, network overhead, storage, and CPU usage of containers using Docker, comparing them with KVM hypervisors. Their experiments showed that the containers achieved performance similar to the VMs in the worst case and superior otherwise.

The work presented in [13] conducted a similar study, comparing the number of requests an application server could handle when deployed in containers with the same application deployed in a VM. The results showed that the VMs significantly outperformed the containers.

The research reported in [14] performed application experiments for HPC (high-performance computing), using benchmarking tools to evaluate memory, network, disk, and CPU performance in Linux Container (LXC) related virtualization implementations, along with OpenVZ and Linux VServer, showing that all containerized apps performed similarly to a native system.

The authors of [15,16] showed improvements in performance isolation for MapReduce workloads. However, when evaluating disk workloads, LXC failed to fully isolate resources, in contrast to hypervisor-based systems.

Using memory, network, and disk metrics, the authors of [17] evaluated the performance of LXC, Docker, and KVM by running many benchmarking tools to measure the performance of these components. They concluded that the overhead caused by container-based virtualization technologies can be considered negligible, although performance is traded off against safety.

Our main focus is on investigating software aging in a private cloud system hosting an operational digital twin of an eVTOL vehicle flying in a virtualized urban air mobility environment. Operational digital twins of vehicles in urban air mobility are digital representations of real-world vehicles that can be used for a variety of purposes. Some potential uses of operational digital twins in urban air mobility include:


the movement of vehicles, to avoid collisions and other hazards, and to optimize the flow of traffic in urban airspace.

• *Emergency response and rescue:* Operational digital twins can be used to support emergency response and rescue operations in urban air mobility systems, by providing real-time information about the location and status of vehicles. This can help to quickly and accurately identify the location and condition of vehicles in distress, and to coordinate rescue and recovery efforts.

Operational digital twins of vehicles in urban air mobility can be used for a variety of purposes, including performance modeling and simulation, fleet management and maintenance, traffic management and control, and emergency response and rescue. Because these operational services run constantly, the UAM-ODT cloud system will inevitably suffer from software aging problems. In this study, we specifically investigate the software aging problems of a UAM-ODT cloud system based on a Kubernetes virtualization environment.

#### **3. System Design**
