Article

Empowering Sustainable Industrial and Service Systems through AI-Enhanced Cloud Resource Optimization

Department of Hacking & Security, Far East University, Eumseong-gun 27601, Republic of Korea
*
Author to whom correspondence should be addressed.
Sustainability 2024, 16(12), 5095; https://doi.org/10.3390/su16125095
Submission received: 21 April 2024 / Revised: 6 June 2024 / Accepted: 7 June 2024 / Published: 14 June 2024

Abstract
This study focuses on examining the shift of an application system from a traditional monolithic architecture to a cloud-native microservice architecture (MSA), with a specific emphasis on the impact of this transition on resource efficiency and cost reduction. In order to evaluate whether artificial intelligence (AI) and application performance management (APM) tools can surpass traditional resource management methods in enhancing cost efficiency and operational performance, these advanced technologies are integrated. The research employs the refactor/rearchitect methodology to transition the system to a cloud-native framework, aiming to validate the enhanced capabilities of AI tools in optimizing cloud resources. The main objective of the study is to demonstrate how AI-driven strategies can facilitate more sustainable and economically efficient cloud computing environments, particularly in terms of managing and scaling resources. Moreover, the study aligns with model-based approaches that are prevalent in sustainable systems engineering by structuring cloud transformation through simulation-supported frameworks. It focuses on the synergy between endogenous AI integration within cloud management processes and the overarching goals of Industry 5.0, which emphasize sustainability and efficiency that not only benefit technological advancements but also enhance stakeholder engagement in a human-centric operational environment. This integration exemplifies how AI and cloud technology can contribute to more resilient and adaptive industrial and service systems, furthering the objectives of AI and sustainability initiatives.

1. Introduction

Cloud computing has revolutionized the digital business landscape by providing unparalleled scalability, flexibility, and efficiency in data management and service delivery. Despite these advancements, several significant challenges still exist, such as optimizing cloud resources, achieving cost efficiency, and maintaining high performance. Recent technological advancements have identified artificial intelligence (AI) as a crucial enhancer in cloud computing capabilities, providing innovative solutions for more effective resource management [1].
We define ‘sustainable AI solutions’ as the application of AI technologies not only to enhance cloud resource efficiency but also to ensure minimal environmental impact and foster socio-economic benefits. These solutions leverage advanced algorithms to improve resource allocation efficiency, reduce energy consumption, and lower operational costs, while actively contributing to the reduction of carbon emissions and promoting equitable access to technology.
To rigorously assess the sustainability of these AI solutions, our comprehensive measurement framework includes the following:
  • Environmental impact: We conduct a detailed quantitative assessment of energy savings and reductions in carbon emissions for each unit of data processed, aiming to minimize environmental footprints.
  • Social impact: We evaluate the role of AI in promoting social equity through detailed community impact studies and surveys. These assess the accessibility and fairness of technology distribution among diverse populations.
  • Economic impact: Moving beyond traditional ROI, our analysis considers the long-term economic impacts, including the total cost of ownership that encompasses factors such as the end-of-life recycling potential of hardware and ongoing maintenance costs.
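The environmental-impact metric above (energy saved and emissions avoided per unit of data processed) can be sketched as follows. This is an illustrative calculation only: the grid emission factor and the sample figures are assumptions, not measurements from the study.

```python
# Illustrative sketch of the per-unit environmental metric described above.
# The emission factor and sample numbers are assumptions, not study data.

GRID_EMISSION_FACTOR = 0.4  # kg CO2e per kWh (assumed regional grid average)

def per_gb_footprint(kwh_saved: float, gb_processed: float) -> tuple[float, float]:
    """Return (kWh saved per GB, kg CO2e avoided per GB)."""
    kwh_per_gb = kwh_saved / gb_processed
    return kwh_per_gb, kwh_per_gb * GRID_EMISSION_FACTOR

# Hypothetical month: 1,200 kWh saved while processing 50,000 GB.
kwh, co2 = per_gb_footprint(kwh_saved=1_200.0, gb_processed=50_000.0)
print(round(kwh, 4), round(co2, 4))  # 0.024 0.0096
```

Normalizing by data volume makes efficiency gains comparable across workloads of different sizes, which is why the framework reports savings per unit of data rather than absolute totals.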
In the field of resource optimization, traditional AI approaches often require continuous model management and frequent scoring interventions to maintain performance. These methods typically follow a workflow involving initial data arrival, model building, and periodic refresh intervals (e.g., weekly or monthly). However, they often lack the automation needed to sustain optimal performance with minimal manual intervention.
In contrast, our Turbonomic algorithm automates the entire process of analysis, model generation, and refresh. This automated lifecycle not only reduces the need for manual management but also enhances the efficiency and effectiveness of resource optimization.
Figure 1 illustrates the comparison between traditional AI model management and our Turbonomic approach. The left side represents the typical workflow of traditional AI, involving manual intervention and frequent scoring; the right side showcases the automated processes of the Turbonomic algorithm, including automated model generation and refresh cycles based on real-time data.
The key features of our Turbonomic algorithm include:
  • Automated model generation and refresh: Our approach ensures that models are automatically updated and refreshed based on real-time data, significantly improving accuracy and performance without the need for manual intervention.
  • AI recommendation and action reflection standards: We have established clear criteria for implementing AI-driven actions, such as container resizing and pod scaling. These actions are automatically executed based on predefined thresholds (e.g., 80% utilization for downsizing).
  • AI-based resource optimization processes: Our methodology involves the definition and application of KPIs for resource optimization. This includes implementing AI-based recommendations and establishing processes for reflecting AI results, ensuring continuous improvement and adaptation.
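The threshold-driven action standard described above can be sketched in a few lines. This is an illustrative reconstruction, not Turbonomic's actual implementation: the upsize threshold and the automatic/manual split in the return value are assumptions based on the criteria stated in the text.

```python
# Illustrative sketch of threshold-based rightsizing, NOT the Turbonomic
# implementation. The 80% downsizing threshold comes from the text; the
# 90% upsize threshold is an assumption added for symmetry.

DOWNSIZE_THRESHOLD = 0.80  # downsize when peak utilization stays below 80%
UPSIZE_THRESHOLD = 0.90    # upsize when peak utilization exceeds 90% (assumed)

def recommend_action(peak_utilization: float, automatic: bool) -> str:
    """Map an observed peak utilization ratio to a resource action.

    Container resizing and pod scaling are executed automatically;
    VM-level changes are surfaced as recommendations for approval.
    """
    if peak_utilization < DOWNSIZE_THRESHOLD:
        action = "downsize"
    elif peak_utilization > UPSIZE_THRESHOLD:
        action = "upsize"
    else:
        action = "no_change"
    return action if automatic else f"recommend:{action}"

print(recommend_action(0.65, automatic=True))   # downsize
print(recommend_action(0.95, automatic=False))  # recommend:upsize
```

Encoding the thresholds as explicit constants mirrors the "predefined thresholds" requirement: every automated action remains traceable to a stated criterion rather than to an opaque model decision.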
The performance improvements achieved using the Turbonomic algorithm are summarized in Table 1 below, highlighting the efficiency gains compared to traditional methods.
These metrics clearly demonstrate the enhanced capabilities of our approach in maintaining optimal resource utilization and reducing downtime.
The objectives of application modernization are detailed in Table 2, outlining the transition from monolithic to microservice architecture (MSA), among other strategies.
A detailed comparative analysis is conducted to highlight the significant advantages of microservice architecture (MSA) over traditional monolithic setups. Advantages include enhanced scalability, which allows systems to handle increasing workloads with ease; easier and quicker updates due to independent service deployments; and improved fault isolation that ensures system errors are contained and do not disrupt the entire application. This analysis not only substantiates our claims but also demonstrates how MSA contributes to more efficient and resilient cloud computing environments.
To further substantiate our claims, we compare our methods with existing studies as shown in Table 3. This table contrasts our AI-driven approach with other methodologies, such as the human-monitored trial–error approach proposed by Lee et al. (2017) [2] and the dynamic Johnson sequencing algorithm by Banerjee et al. (2023) [3], emphasizing the unique benefits of real-time AI optimization without human intervention.
The table below shows the step-by-step transformation from a monolithic architecture to a microservice architecture. Each step’s impact on the system’s operational and developmental efficiencies is highlighted for better understanding. Additionally, a diagram (Figure 2) is provided to visually represent the structural shift from a single, unified architecture to a compartmentalized and scalable microservice format. This visual aid helps to clearly illustrate the ‘As-Is vs. To-Be’ scenario, enhancing understanding of the architecture’s evolution.
This paper thoroughly investigates the impact of AI-driven resource optimization tools on enhancing cost efficiency and operational performance within real-world cloud environments. We utilize state-of-the-art AI technologies that intelligently allocate resources, thereby minimizing costs and maximizing system performance. Detailed case studies and real-world applications are discussed to illustrate the practical benefits and challenges of implementing these AI solutions.
Our research methodology includes modeling and simulation to accurately evaluate the performance and scalability of these AI tools [4]. This approach enables us to assess AI strategies under different operational conditions, providing a robust framework for our analysis.
In this study, we investigate the different results of implementing AI in various cloud service models and technological approaches. An extensive literature survey has been conducted to compare existing methodologies and highlight the scientific merit of our approach. These efforts are in line with the goals of Industry 5.0, which places importance on sustainability and efficiency, leading not only to technological progress but also to improved engagement with stakeholders within a human-centered operational environment. Table 3 provides a comprehensive comparative analysis of our proposed methods against similar works [5].
Our research addresses the following questions:
  • How can resource management methods be improved by using AI-based resource optimization tools, compared to traditional methods as documented in similar studies?
  • What differences can be observed in the results when applying AI technologies to different cloud service models, and how do these differences compare to previous findings in the literature?
  • What methods can be used to evaluate the practical applications and predictive accuracy of AI tools, in order to optimize resources and reduce costs, and how do these methods advance beyond existing research?
The structure of this paper is as follows: Section 2 will provide a literature review of the current state of AI applications in cloud resource optimization, highlighting the transformative role of AI in enhancing cloud computing [6]. Section 3 will offer a detailed description of the methodologies used, focusing on the setup and evaluation of AI tools’ impact on performance and costs [7]. In Section 4, the findings will be synthesized to suggest improvements in cloud resource management strategies, highlighting the integration of microservice architecture and AI as a way of building more resilient and adaptive industrial and service systems. Finally, the conclusion will summarize the study’s key insights and propose directions for future research, emphasizing the integration of AI and sustainability within cloud computing frameworks [8].

2. Materials and Methods

This study evaluates the use of AI-based tools to optimize cloud resources. It focuses on systems that are moving from traditional on-premise operations to advanced cloud-based methodologies through MSA. To analyze this, we use cutting-edge AI and application performance management (APM) tools such as IBM Turbonomic ARM and Instana APM. The aim is to determine if these innovative technologies are more cost effective and efficient than traditional resource management methods [9].
We evaluated the effectiveness of AI tools for optimizing cloud resources, with a focus on selecting IBM Turbonomic ARM and Instana APM. To avoid any biases that may arise from vendor-specific advantages, we established a thorough evaluation framework. This framework includes predefined criteria for AI-driven actions, which are divided into automatic actions (such as container resizing and pod scaling) and manual actions (such as VM scaling, replacement, and deletion), each governed by specific performance benchmarks (such as downsizing based on 80% utilization thresholds).
We have developed specific key performance indicators (KPIs) to measure the effectiveness of AI optimizations. These KPIs include rates of application performance failure, infrastructure uptime, and cost reductions. Our evaluative framework ensures that AI recommendations can be applied, reviewed, and adjusted objectively, allowing our findings to be reproducible and generalizable across various cloud management environments. We have carefully crafted this framework to maintain objectivity and practicality when applying sustainable AI technologies. It supports the development of sustainable intelligent systems through modeling and simulation, aligning with the Industry 5.0 goals of sustainability and efficiency [10].
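The three KPI families named above (application performance failure rate, infrastructure uptime, and cost reduction) reduce to simple ratios. The sketch below is a minimal illustration; the function names and sample inputs are hypothetical, though the cost figures reuse the monthly totals reported later in the paper (USD 182,228 baseline, USD 40,693 saved).

```python
# Minimal sketch of the three KPI families described in the text.
# Function names and sample inputs are hypothetical.

def failure_rate(failed_requests: int, total_requests: int) -> float:
    """Application performance failure rate as a fraction of requests."""
    return failed_requests / total_requests

def uptime_pct(available_minutes: float, total_minutes: float) -> float:
    """Infrastructure uptime as a percentage of the reporting window."""
    return 100.0 * available_minutes / total_minutes

def cost_reduction_pct(cost_before: float, cost_after: float) -> float:
    """Relative cost reduction achieved by the optimization."""
    return 100.0 * (cost_before - cost_after) / cost_before

print(round(failure_rate(12, 10_000), 4))              # 0.0012
print(round(uptime_pct(43_170, 43_200), 3))            # 99.931
print(round(cost_reduction_pct(182_228, 141_535), 1))  # 22.3
```

Expressing each KPI as a dimensionless ratio is what makes the evaluation "reproducible and generalizable": the same formulas apply unchanged across different cloud environments and reporting windows.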
We have improved our methodology to make it more rigorous and transparent, and have taken steps to address potential biases. Our goal is to provide a reliable and reproducible foundation for assessing how AI impacts cloud resource management. This will ensure that our findings are applicable to a wide range of cloud computing scenarios.

2.1. Tool Selection and Environment Setup

In this section, we provide a detailed account of how we transitioned from an on-premise monolithic architecture to a cloud-native framework by utilizing MSA. This transition was necessary to improve cloud resource optimization and reduce operational costs. We accomplished this by conducting a rigorous three-month surveillance period. During this period, we established baseline costs and optimized resource deployment, and we used the data collected to inform our decisions, as shown in Table 3.
Table 4 provides information about the average and peak utilization of resources for web, application, and database servers in both production and development environments. This table shows how effective our cloud resource management strategy is.
In order to ensure a fair evaluation of different tools, we used a framework to compare several vendors. IBM Turbonomic ARM and Instana APM were chosen based on criteria such as performance metrics, adaptability to cloud-native environments, and vendor neutrality. We followed a strict selection process that included extensive pilot testing and feedback from multiple stakeholders. This ensured that no single vendor’s technology had a disproportionate impact on the study’s outcomes.
We have deployed IBM Turbonomic ARM and Instana APM in order to improve our analysis of system performance. These tools have significantly improved our ability to measure and refine performance metrics, which is in line with our goal of investigating the effectiveness of sustainable AI technologies in enhancing cost efficiency within cloud services.
Below are some key points about the contribution of each tool:
  • IBM Turbonomic ARM: AI-powered engine that can automatically configure and optimize cloud resources by continuously learning from real-time data. This ensures that resource allocation always remains aligned with current demand, without any need for human intervention.
  • Instana APM: Real-time performance monitoring tool that provides detailed insights through its dashboards. It enables the immediate identification of performance bottlenecks, ensuring that response times meet established performance thresholds.
The proof-of-concept (PoC) setup is represented in Figure 3. It shows how IBM Turbonomic ARM and Instana APM integrate and interact within the AWS cloud environment, which is segmented into production and development stages for System A. This setup demonstrates the operational fidelity of our approach and highlights the practical impact and effectiveness of the integrated AI and APM tools in a real-world scenario.
Our approach aligns with the principles of sustainable intelligent systems and supports the broader objectives of Industry 5.0, which emphasize efficiency and sustainability in technological advancements and stakeholder engagement [11].

2.2. Experimental Design

In this section, we will discuss how we use AI algorithms in our experimental design. We use advanced modeling and simulation techniques to effectively deploy AI tools in operational cloud systems. It is important to understand the specific impacts of AI on different architectural configurations and evaluate their effects on operational efficiency and system performance [12]. Our approach involves carefully selecting data sampling intervals and implementing a tiered strategy for metric prioritization. This ensures a balance between real-time responsiveness and analytical accuracy.
Our methodology for optimizing cloud resources involves the use of the Turbonomic platform, which is equipped with built-in learning capabilities. This platform automatically configures and optimizes cloud resources without any human intervention, dynamically adapting to the cloud environment based on real-time data. This allows for the optimization process to be fully driven by AI, reflecting the real-world application of these technologies in our experimental setup.
In addition, we use standardized evaluation metrics and methodologies to control confounding variables to ensure consistency across different studies and cloud configurations. This methodological rigor is essential for advancing our understanding of AI’s role in cloud resource management and for ensuring the reproducibility and reliability of our findings.
To ensure the reliability of our experiments, we carefully designed the cloud infrastructure with specific configurations including CPU models, memory allocations, storage capacities, network configurations, and security protocols that are typical in cloud deployments. These configurations are documented thoroughly to ensure our study’s results are reproducible and applicable to the research community.
We also conducted rigorous simulations to test AI tools under different conditions, assessing their resource management capabilities and adaptability to various operational stresses. This extensive testing not only confirms the immediate benefits of employing AI tools but also highlights their scalability and cost effectiveness, which are critical for the development of sustainable cloud systems [13].
The methodological details above strictly adhere to high academic standards, providing a strong and reliable framework to assess the practical applications of AI in improving cloud resource management. This disciplined approach highlights the significance of careful planning and execution in experimental design to achieve valid and dependable results.

2.3. Performance Evaluation

We conducted a study to determine the effectiveness of AI tools based on key performance metrics such as application throughput, response time, and system availability. These metrics were strategically selected to assess the real-time resource adjustment efficiency and the predictive accuracy of AI technologies when compared to traditional resource management methods.
To ensure an unbiased optimization process, our methodology integrates the autonomous capabilities of Turbonomic with the real-time data analysis provided via Instana APM. This configuration minimizes human intervention and potential biases, allowing Turbonomic to dynamically adjust resources based on the data insights captured using Instana APM. The synergy between these tools facilitates peak operational efficiency, with AI autonomously driving decisions that adapt to changing conditions.
Our evaluation framework rigorously documents each experimental setup, which is replicated in a controlled test environment to validate our findings. We use standardized protocols and conduct external peer reviews to further verify the credibility of our performance assessments. These measures ensure that our results are both reproducible and reliable [14].
As part of the evaluation process, we will consider the following criteria and assumptions:
  • Automatic actions: The system will automatically perform actions such as container resizing and pod scaling based on continuous monitoring and predictive analytics. This anticipates future resource demands.
  • Manual actions: Strategic decisions such as VM scaling, replacements, or deletions will be made based on AI recommendations. However, they will require final approval by system administrators.
  • Assumptions: We assume that all AI operations are based on algorithmic determinations without human bias. The system’s predictive capabilities will continuously improve through a feedback loop from real-time performance data.
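The automatic/manual split in the criteria above can be expressed as a small dispatch rule. The action taxonomy follows the text; the routing mechanics (an approval queue for VM-level actions) are an assumption about how such a policy would typically be wired up, not a description of the actual deployment.

```python
# Sketch of the automatic vs. manual action split described above.
# Action names follow the text; the routing mechanics are assumed.

AUTOMATIC = {"container_resize", "pod_scale"}          # executed directly
MANUAL = {"vm_scale", "vm_replace", "vm_delete"}       # need admin approval

def route(action: str) -> str:
    """Execute container/pod actions immediately; queue VM-level
    actions for administrator approval, per the criteria above."""
    if action in AUTOMATIC:
        return "executed"
    if action in MANUAL:
        return "pending_approval"
    raise ValueError(f"unknown action: {action}")

print(route("pod_scale"))  # executed
print(route("vm_delete"))  # pending_approval
```

Keeping higher-impact VM actions behind human approval is what preserves the "without human bias" assumption for routine scaling while still bounding the blast radius of AI mistakes.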
We will use a range of benchmarks to provide quantitative measurements of performance improvements and resource utilization efficiencies. These benchmarks are selected to give a comprehensive overview of how AI tools anticipate and respond to varying system demands. They reflect the predictive accuracy and practical applicability of these tools in real-world cloud environments [15].
The following points are included in the assessment of the accuracy of AI tools in forecasting resource needs:
  • Historical data comparison: the AI predictions are compared with historical data on resource usage and demand patterns.
  • Real-time performance monitoring: the continuous monitoring of system performance after implementation measures the immediate impacts of AI predictions and adjustments.
  • Controlled testing environments: simulated stress tests and scenario analyses evaluate how AI tools react to hypothetical but plausible operational challenges.
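The historical-data comparison step above amounts to scoring AI demand forecasts against recorded usage. A common way to do this (an illustrative choice on our part; the paper does not name its error metric) is mean absolute percentage error (MAPE), sketched below with invented sample series.

```python
# Hedged sketch of the "historical data comparison" step: score AI
# forecasts against recorded usage with mean absolute percentage error.
# MAPE is our illustrative metric choice; the sample series are invented.

def mape(actual: list[float], predicted: list[float]) -> float:
    """Mean absolute percentage error between observations and forecasts."""
    assert len(actual) == len(predicted) and actual
    return 100.0 * sum(
        abs(a - p) / a for a, p in zip(actual, predicted)
    ) / len(actual)

historical_cpu = [62.0, 70.0, 55.0, 80.0]  # observed utilization (%)
ai_forecast    = [60.0, 73.0, 50.0, 78.0]  # AI-predicted utilization (%)

print(round(mape(historical_cpu, ai_forecast), 2))  # 4.78
```

A low MAPE against held-out historical windows is what licenses the predictive claims in the next paragraphs: only forecasts that track real demand patterns can justify autonomous resizing decisions.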
Our study outlines various methods that contribute to a robust framework for evaluating AI performance. These methods include comprehensive assessments of hardware resource usage, data management processes, cloud resource utilization, and social and economic impacts. We thoroughly analyze the lifecycle and recyclability of hardware resources, ensuring efficient and sustainable use. Data management processes are evaluated to minimize waste and maximize efficiency.
Additionally, we conducted rigorous experiments to validate our findings, ensuring reproducibility. The evaluative process includes details on VM types, networking schemata, and data collection intervals, supporting the granular analysis necessary to fully understand AI’s impact on performance metrics and sustainability. By exploring these aspects, we provide a comprehensive understanding of how AI can optimize cloud resource management, enhance efficiency, and promote sustainability.
By closely examining these performance metrics and validation methods, our study provides clear insights into the practical applications of AI technologies in cloud resource management. Our findings suggest that AI tools not only improve operational efficiency but also offer significant advancements in the predictive management of cloud resources. This sets the stage for more optimized and cost-effective cloud operations [16].

2.4. Cost and Comparative Analysis

In this study, we examine the economic assessment and comparative effectiveness of AI tools for optimizing cloud resources. We conduct a thorough cost analysis to evaluate the cost efficiency of AI in comparison to traditional resource management paradigms. Our analysis includes operational cost data from various configurations and documents the financial benefits achieved through AI-driven optimizations. This addresses criticisms that there has been a lack of profound financial analysis in this area. We also track the data over an extended period to evaluate the long-term economic benefits and sustainability of AI implementations.
Moreover, we critically assess the comparative effectiveness of cloud systems that use AI tools for operations. Our analysis focuses on the enhanced performance of existing configurations managed via AI interventions. We discuss the technology selections highlighted in prior scholarly reviews and expand our analysis to include additional cloud platforms like Google Cloud and IBM Cloud. This broadens the scope of our study and provides a comprehensive view of AI tool performance across diverse ecosystems. Our approach not only emphasizes the cost benefits of AI but also highlights its operational superiority for cloud resource management [17].

2.5. Research Direction and Conclusions

After conducting an extensive comparative analysis, we have identified strategic directions for future research aimed at enhancing AI-driven cloud resource optimization. Our recommendations leverage the verified capabilities of AI within cloud management systems to set a detailed agenda for upcoming studies, emphasizing the need for methodical and empirical exploration of sustainable cloud practices [18].
To address the gap in understanding the real-world impact of AI tools across different cloud service models, we plan to conduct rigorous experimental studies. These studies will include comparative analyses of AI optimization tools to assess performance variability and validate their benefits in diverse operational environments. Standardization of benchmark tests for AI tools is crucial for conducting effective industry-wide comparisons, establishing a universal framework that can evaluate the efficacy of AI interventions across various cloud platforms and configurations, and thus promoting transparency and replicability in cloud resource management.
We acknowledge that while traditional resource measurement methodologies form a solid foundation, their enhancement through the integration of advanced AI-based analytics significantly increases the accuracy and efficiency of cloud resource management. Such integration ensures that systems are not only more adaptable but also resilient. Future research will therefore utilize modeling and simulation to more accurately predict AI applications’ impacts, facilitating more strategic and informed decisions in cloud architecture design and operation [19].
This comprehensive approach aims to revolutionize cloud resource management by improving operational efficiency and system accuracy. Figure 4 illustrates the potential advancements in AI-driven cloud management. Moving forward, we encourage continuous integration of AI into sustainable cloud computing practices, aligning with global trends towards more efficient, intelligent, and environmentally conscious technological solutions [20].
Figure 5 outlines the process of measuring cloud resource utilization through our APM tool.
This study concludes that it is beneficial to integrate AI in cloud resource management. This not only improves operational efficiency but also provides a foundation for further academic research. The insights and methodologies described in this study offer a framework for future studies to build upon. As a result, more effective and economical cloud resource management solutions can potentially be developed [21].
In conclusion, the overall relationships among the methods described in Section 2 can be summarized in a single diagram.

3. Results

3.1. Performance Improvement

This section highlights the positive impact of the IBM Turbonomic ARM and Instana APM tools on the operations of System A, which runs on an MSA. The deployment of these AI-driven tools resulted in measurable improvements in operational efficiency, including a 25% increase in operational throughput and a 30% reduction in response times. These enhancements showcase the effective resource allocation facilitated by these tools, which play a crucial role in enhancing service delivery and system responsiveness.
The decision to select IBM Turbonomic was based on its optimized hybrid cloud management capabilities. It is designed to automate intelligently across diverse platforms such as AWS, Azure, Google Cloud, Kubernetes, and traditional data centers. This feature enables the tool to optimize performance and manage costs effectively across different infrastructural settings.
Moreover, the platform ensures optimal usage of computing, storage, and network resources autonomously, thereby preventing over-provisioning of resources and improving overall system efficiency. This proactive management approach helps to maximize return on investment and reduce operational costs by ensuring resources are utilized judiciously.
The benefits of AI-driven resource management in cloud and edge computing environments have been highlighted in a study by Aryal and Altmann [22]. These benefits include the potential for a more resilient and cost-effective infrastructure, which supports superior operational performance and cost efficiency.
The implementation of AI tools can be validated through performance metrics that demonstrate these improvements. These metrics are displayed in the monitoring dashboard (Figure 6), which provides a clear and quantifiable overview of the gains in operational performance and strategic resource management enabled by these technologies.
Figure 7 provides a detailed overview of transaction analytics in the application. It shows the HTTP status codes for various types of requests, such as GET, POST, and PUT operations. This figure is an essential tool for analyzing transaction volumes and determining the proportion of successful responses to error codes. Additionally, the figure includes charts that show latency and processing times, which offer a detailed view of the application’s response agility and processing efficiency. These visual data points are crucial for identifying areas where performance enhancements are necessary and pinpointing specific optimizations that could lead to significant gains in system responsiveness [23].
In Figure 8, you can see how business transactions flow through the system’s microservice architecture. The figure shows how different applications, services, containers, pods, and virtual machines interact with each other and use resources. Understanding this overview is important because it helps identify potential inefficiencies and bottlenecks. By using AI-powered tools, cloud infrastructure can be streamlined and optimized for better performance. This can result in targeted enhancements that can improve the system’s health and efficiency [24].
Figure 9 compiles the analytical data gathered and presents strategic recommendations for optimizing the system. It shows the suggestions made by the AI-powered tools to improve the system’s performance and address any inefficiencies discovered during assessments of operational and developmental contexts. The figure precisely illustrates the results of the function check, highlighting the proactive strategies employed by the AI to strengthen the system’s robustness and address vulnerabilities before they become critical issues. This focus on prevention is essential, as it helps to mitigate issues before they escalate into significant system impediments [25].
Table 5 provides a detailed overview of the AI’s capabilities in conducting function checks across different environments and system architectures. Each check is represented by a circled (O) marker, highlighting functionalities such as navigation of resource lists, retrieval of metric information by resource, and automated association of resources. The environments are clearly labeled, demonstrating the AI’s operational flexibility in ‘A System’, Kubernetes (K8S), and Amazon Web Services (AWS) platforms, covering both developmental and operational phases. This tabular representation emphasizes the AI tool’s extensive scope and its expertise in ensuring resource consistency and linkage across diverse infrastructural and developmental contexts [26].

3.2. Cost Reduction

This section describes an economic evaluation conducted to assess the monthly operational costs of a system comprising 20 virtual machines (VMs), storage, and database services. Initially, the monthly operational costs were estimated to be USD 182,228. However, after implementing AI-driven optimization strategies, a significant decrease in operational expenses was observed. The system now saves approximately USD 40,693 per month, including a reduction of USD 5210 in the staging environment.
To calculate the cost reductions, we used the manufacturer’s suggested retail prices from the AWS pricing calculator. These estimates are grounded in widely recognized and accessible data points, enhancing the reliability of our findings.
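As an illustration of this pricing approach, the sketch below derives a monthly on-demand compute estimate from an hourly list price, using the 730-hour monthly convention of the AWS pricing calculator. The hourly rate shown is an illustrative placeholder, not an actual AWS price.

```python
# Hypothetical sketch of the cost-estimation approach described above.
# The hourly rate used in the example is illustrative only.

HOURS_PER_MONTH = 730  # convention used by the AWS pricing calculator


def monthly_on_demand_cost(hourly_price_usd: float, instance_count: int = 1) -> float:
    """Estimate monthly on-demand compute cost from an hourly list price."""
    return hourly_price_usd * HOURS_PER_MONTH * instance_count


# Example: 20 VMs at an illustrative rate of USD 0.50/hour
estimate = monthly_on_demand_cost(0.50, instance_count=20)
print(round(estimate, 2))  # 7300.0
```

Summing such per-resource estimates before and after optimization yields the kind of monthly totals reported in this section.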
The cost savings stem from the AI’s capability to predict and prevent the over-allocation of resources and to address instances of under-utilization, in line with the findings of Eramo et al. [27]. The AI’s precision in application-level monitoring has revolutionized traditional resource management paradigms, leading to significant economic impacts.
The detailed analysis shows that AI-driven optimization can substantially reduce costs across different types of tasks. For example, in the operation environment, scaling tasks resulted in monthly savings of USD 10,523 for virtual volumes and USD 20,712 for virtual machines. Similarly, deletion tasks contributed to additional savings. In the development environment, the savings were USD 128 per month for virtual volumes and USD 4783 per month for virtual machines. These optimizations not only reduced the on-demand compute cost but also the on-demand database cost and storage cost, leading to an overall reduction in total costs.
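The itemized scaling savings above can be aggregated by environment as in the sketch below. The figures are those reported in this section; deletion-task and staging savings (not itemized here) account for the remainder of the overall USD 40,693 reduction.

```python
# Monthly scaling-task savings reported in this section, keyed by
# (environment, resource type), in USD.
savings = {
    ("operation", "virtual_volume"): 10_523,
    ("operation", "virtual_machine"): 20_712,
    ("development", "virtual_volume"): 128,
    ("development", "virtual_machine"): 4_783,
}


def total_by_environment(env: str) -> int:
    """Sum the reported savings for one environment."""
    return sum(v for (e, _), v in savings.items() if e == env)


print(total_by_environment("operation"))    # 31235
print(total_by_environment("development"))  # 4911
```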
To further explore variations in the performance of AI tools across different cloud service models, a comparative analysis was conducted. The analysis revealed that different cloud service models led to varying outcomes. For example, Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) models demonstrated differing levels of cost savings and efficiency improvements. In IaaS environments, AI tools were particularly effective at reducing infrastructure-related expenses, whereas, in PaaS setups, the most significant benefits were observed in operational scalability and deployment speeds.
The cost reductions achieved through deploying AI strategies tailored to each cloud service model are visually presented in Figure 10. This detailed account highlights the financial benefits that can be realized by integrating AI into cloud resource management. The analysis reinforces the overall economic value of AI integration and underscores the importance of selecting appropriate AI tools and strategies based on the specific needs and configurations of the cloud service model in use.
By optimizing resource allocation and preventing over-provisioning, AI-driven strategies not only enhance cost efficiency but also improve the overall sustainability of cloud operations. This comprehensive approach ensures that cloud resources are utilized effectively, providing both economic and environmental benefits.

3.3. Detailed Results and Comparative Analysis

This section examines the economic impacts and cost effectiveness of implementing AI utilities, specifically focusing on IBM Turbonomic ARM and Instana APM. Through rigorous analysis, it has been determined that the deployment of AI in cloud infrastructures leads to an average monthly cost reduction of USD 40,693. This highlights AI’s crucial role not only in enhancing technical capabilities but also in significantly advancing financial management and economic efficiency within cloud operations.
Supporting these findings, Schuler et al. [12] highlight the flexibility and financial benefits of AI across various cloud ecosystems. This study extends their observations by providing direct comparative analyses that delineate the cost efficiencies of AI when contrasted with traditional systems and alternative technological approaches. These comparisons demonstrate the superior economic leverage that AI tools offer, optimizing expenses and enhancing performance in the modern cloud landscape.
Further analysis delves into different cloud models to illustrate how AI’s effectiveness varies among Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) frameworks. Each model shows differential benefits in terms of cost savings, with AI tools being particularly effective in IaaS for reducing infrastructure-related costs, while in PaaS and SaaS, improvements are noted in operational efficiency and service delivery. These nuanced findings elucidate the diverse applications of AI, affirming its adaptability and pivotal role in driving economic prudence across varied cloud service models.

3.4. Methodological Rigor and Research Integrity

The study’s integrity relies on its rigorous methodological approach, which guarantees that the results can be independently replicated and verified. To address previously identified issues regarding the lack of detailed technical data, the study extensively documented cloud configurations and VM classifications. This methodological exactness not only enhances the research’s scholarly depth but also reinforces confidence within the scientific community, encouraging further empirical scrutiny and validation.
Methodological transparency is the foundation of this study, establishing a strong base that reinforces the reliability of the findings and sets a benchmark for future research in AI-optimized cloud resource management. The precision in data handling and the emphasis on replicable setups ensure that subsequent studies can build on this work, exploring new dimensions of AI’s impact on cloud technology [29].
Furthermore, the study’s methodology is adaptable, allowing for modifications in response to evolving cloud technologies and AI capabilities. By maintaining a focus on methodological integrity, the research adheres to rigorous scientific standards while remaining flexible enough to incorporate future technological advancements. This ensures its relevance and applicability in ongoing scholarly and practical applications.

4. Discussion

4.1. Major Insights from Comparative Analysis

During our study, we compared the operational efficiency and cost effectiveness of cloud resource management using IBM Turbonomic ARM and Instana APM. We found that these tools significantly improved efficiency and reduced costs. To validate our findings, we developed a set of AI optimization key performance indicators (KPIs) that can be used to evaluate performance improvements across various cloud service models. These KPIs provide a standardized approach to measuring the effectiveness of AI-driven optimization tools.
Our study found that using IBM Turbonomic ARM and Instana APM in environments utilizing MSA not only improved operational efficiency but also resulted in average monthly savings of approximately USD 40,693. These savings align with our goal of enhancing cost efficiency and performance through advanced resource management strategies [30]. Additionally, these tools helped stabilize network traffic, which is crucial for preventing resource over-provisioning and supporting a sustainable cloud ecosystem.
The benefits of strategic AI applications for cloud resource management have been well documented and depicted in Figure 7 and Figure 8. These benefits include substantial financial savings and improved performance, highlighting the transformative role of AI in the cloud industry [31].
Furthermore, this section also discusses the sustainability benefits of AI technologies. IBM Turbonomic ARM and Instana APM have demonstrated their ability to optimize energy consumption and reduce the carbon footprint of cloud operations. By ensuring efficient resource use, these tools lead to reduced energy demand and operational costs, which are crucial for sustainable cloud computing.
In summary, the integration of these AI tools not only promises enhanced operational efficiency and scalability, but also supports the development of more sustainable and adaptable cloud architectures. This aligns with global sustainability standards, which advocate for ongoing research into AI capabilities within cloud environments to fully exploit their potential in creating environmentally conscious and economically viable technological solutions.

4.2. Interpretation of Results with State-of-the-Art Technologies

Advanced AI tools like IBM Turbonomic ARM and Instana APM have been integrated into cloud-native systems that use MSA. This integration provides deep insights into resource management and cost optimization. By using these tools, operational efficiency has been significantly increased, cost reductions achieved, and precise resource allocation in real-time cloud environments facilitated [32].
To manage external factors and uncertainties that may affect the long-term economic and environmental sustainability of AI-driven optimizations, our approach includes the following:
  • AI optimization criteria: We establish clear guidelines for AI-driven actions which are categorized into automated actions like container resizing and pod scaling, and manual interventions such as VM scaling, replacement, and deletion. These actions are governed by predefined thresholds to ensure alignment with current system demands and projected needs.
  • AI optimization KPIs: We monitor key performance indicators such as application performance failure rates, infrastructure uptime, and resource cost savings that are essential for assessing the effectiveness of AI optimizations and guiding resource management decisions.
  • Resource optimization process: We implement a structured process for AI-driven resource optimization, which includes steps for applying AI recommendations, reviewing their performance, and adjusting strategies based on operational feedback and changing conditions.
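The criteria above can be sketched as a simple dispatch policy that routes AI recommendations either to automated execution or to manual review. The action categories follow the list above, but the utilization thresholds and names are illustrative assumptions, not the study’s actual configuration.

```python
from dataclasses import dataclass

# Action categories named in the criteria above; threshold values
# below are hypothetical examples, not the study's settings.
AUTOMATED_ACTIONS = {"container_resize", "pod_scale"}
MANUAL_ACTIONS = {"vm_scale", "vm_replace", "vm_delete"}


@dataclass
class Recommendation:
    action: str             # e.g. "pod_scale"
    cpu_utilization: float  # observed utilization, 0.0-1.0


def dispatch(rec: Recommendation,
             scale_up_at: float = 0.80,
             scale_down_at: float = 0.30) -> str:
    """Route an AI recommendation to automated or manual handling."""
    if scale_down_at < rec.cpu_utilization < scale_up_at:
        return "no_action"  # within the predefined thresholds
    if rec.action in AUTOMATED_ACTIONS:
        return "apply_automatically"
    if rec.action in MANUAL_ACTIONS:
        return "queue_for_review"
    return "unknown_action"


print(dispatch(Recommendation("pod_scale", 0.92)))  # apply_automatically
print(dispatch(Recommendation("vm_delete", 0.05)))  # queue_for_review
```

The review-and-adjust step of the process would then feed observed KPI outcomes back into the threshold choices.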
Figure 7 and Figure 8 present empirical data that highlight the tactical advantages of using a cloud-native framework along with AI. These visual aids demonstrate how AI strategies improve cloud computing resilience and cost efficiency, thereby enhancing resource management and operational workflows.
The section also discusses the sustainability impact of AI technologies in cloud computing. The AI tools used not only optimize resource use and reduce energy consumption but also lower operational costs and minimize environmental impacts. All of these benefits align with global sustainability standards and promote system longevity, advancing sustainable cloud practices that support economic and environmental objectives [33].
It is essential to consider the unique complexities and needs of each cloud service model, from IaaS to PaaS and SaaS, when discussing the outcomes of applying AI technologies. Each model presents distinct challenges and opportunities for AI optimization, affecting the applicability and efficiency of AI tools. Hence, the adaptability of these tools to diverse cloud environments and their ability to learn from various data inputs are crucial for enhancing resource optimization strategies and ensuring broad applicability.
This detailed examination of the interplay between AI and cloud technologies emphasizes how technological advancements can support more robust, flexible, and sustainable industrial and service systems, aligning with the operational efficacy and sustainability goals of Industry 5.0. The synthesis not only elucidates the practical applications of AI in cloud resource management but also frames a future where cloud technologies are both economically viable and environmentally responsible.

4.3. Detailed Technical Insights

Our study focuses on the deployment of advanced AI tools within cloud-native systems. To aid in the replication and comprehensive understanding of the findings, we conducted an in-depth technical exploration. This deep dive includes an exhaustive evaluation of cloud infrastructure components, with a particular focus on how AI enhances system configurations and management. Our research has dissected the variations and distributions of virtual machines (VMs) as well as the intricacies of Kubernetes cluster configurations. Both of these are pivotal for the sustainable optimization of cloud resources [34].
We also pay meticulous attention to the cadence of data sampling for performance metrics, a vital process for affirming the efficacy of AI-powered resource optimization strategies. The precision and regularity of this data collection are imperative for ensuring that results are reproducible and reliable, thereby fortifying the practical relevance of our study to the wider field. Such granularity in data handling lays the groundwork for developing cloud infrastructures that are both resilient and adaptable, in accordance with the precepts of sustainable and intelligent system design [35].
To ensure the long-term sustainability and economic viability of AI-driven optimizations, we also consider potential external factors such as market fluctuations, technological advancements, and regulatory changes:
  • Adaptability to external changes: our AI systems are designed to be adaptable to changes in the external environment, ensuring that optimal resource management is maintained under varying conditions.
  • Risk mitigation strategies: we mitigate technology and vendor risks by diversifying our stack and maintaining cloud flexibility, improving methodology integrity [36].
Our study utilizes various modeling and simulation techniques to anticipate and confirm the outcomes of different operational conditions. By employing these simulations, we can determine the impact of various AI configurations on cloud resource management. This helps us gain essential insights that inform the strategic implementation of AI tools to achieve peak operational efficiency and fiscal responsibility. Taking this comprehensive approach ensures that our cloud systems are not only technologically advanced but also economically sustainable and robust against external disruptions.
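A minimal sketch of such simulation-supported sizing is shown below: sample resource demand from an assumed distribution and estimate how often a candidate allocation would be exceeded. The distribution parameters are purely illustrative, not measurements from the study.

```python
import random

# Hypothetical Monte Carlo sketch: estimate the probability that demand
# exceeds a candidate vCPU allocation. Parameters are illustrative.


def overload_probability(allocated_vcpus: float,
                         mean_demand: float = 12.0,
                         stddev: float = 3.0,
                         trials: int = 100_000,
                         seed: int = 42) -> float:
    """Fraction of sampled demand draws that exceed the allocation."""
    rng = random.Random(seed)  # seeded for reproducible runs
    overloads = sum(
        1 for _ in range(trials)
        if rng.gauss(mean_demand, stddev) > allocated_vcpus
    )
    return overloads / trials


# A 16-vCPU allocation against demand ~ N(12, 3); analytically the
# exceedance probability is about 0.091.
print(f"{overload_probability(16):.3f}")
```

Sweeping the allocation until the overload probability falls below a target tolerance gives a simulation-backed sizing recommendation of the kind the text describes.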

4.4. Strategic Recommendations for Future Research

As AI and cloud computing technologies continue to evolve rapidly, it is crucial for future research to broaden the scope of AI applications in cloud management to include more diverse environments and scenarios. This expansion involves standardizing performance benchmarks across different platforms and further exploring the integration of AI with emerging technologies to enhance the resilience and adaptability of cloud management systems [37].
Evaluating the comparative efficacy of AI tools across diverse cloud services and configurations is essential to gain a comprehensive understanding of tool performance across various operational contexts. Such assessments will not only provide a broader outlook on how different tools can optimize cloud resource management but also highlight their potential to contribute to more sustainable and economically efficient cloud computing environments [38].
To improve the scalability and flexibility of cloud services while reducing operational costs, future studies should focus on integrating AI tools with MSA. The incorporation of model-based approaches and simulation-supported frameworks will be vital in these explorations to validate the enhanced capabilities of AI tools in optimizing cloud resources. This aligns with the goals of Industry 5.0, which emphasize both technological advancements and enhanced stakeholder engagement in a human-centric operational environment [39].
Future research should also consider the synergy between endogenous AI integration and cloud management processes to explore how such integrations can foster more resilient and adaptive industrial and service systems. This will not only support the continuous advancement of cloud technologies but also ensure that these innovations are leveraged to achieve greater efficiency and sustainability in line with global standards and expectations.

4.5. Practical Implications and Recommendations

This paper presents practical insights for industry practitioners and organizations using cloud computing and artificial intelligence. It highlights the importance of deploying AI tools thoughtfully to ensure efficiency, sustainability, and cost effectiveness in cloud environments. As cloud computing evolves, emerging AI technologies play a vital role in maintaining a competitive edge and achieving operational excellence [40].
Our study emphasizes the benefits of cloud services, including enhanced cost efficiency, operational flexibility, automation of management processes, and the ability to leverage economies of scale. These advantages are crucial for transforming business models and managing infrastructure across various industries, promoting operational practices that prioritize flexibility, efficiency, and sustainability [41].
It is important to recognize the complexities and potential cost implications of cloud computing. Although the transition from monolithic applications to microservice architecture offers scalability and operational agility, it may also increase costs under continuous resource utilization scenarios. Our recommendations stress the importance of a holistic approach to cloud computing that considers not only operational improvements but also the broader impact on corporate sustainability and stakeholder value. A comprehensive evaluation of development costs, hardware expenditures, operational overheads, and management complexities is crucial. This evaluation ensures that organizations are fully informed about the potential increases in operational costs associated with various cloud deployment strategies.
Our approach to cloud management involves intelligent automation that dynamically optimizes cloud environments across various platforms including AWS, Azure, Google Cloud, Kubernetes, and traditional data centers. The optimization process ensures resource management is efficient and aligned with business demands and technological shifts, thereby enhancing return on investment (ROI) and reducing cloud expenses.
Our automation processes are designed to minimize human intervention and make real-time adjustments crucial for maintaining system efficiency. This ensures that resources are being used only as necessary, which prevents wasteful expenditure and over-utilization, thereby sustaining economical cloud operations.
We use advanced AI capabilities to effectively manage the software platform and analyze and adjust cloud configurations continuously. This management not only prevents over-provisioning of resources but also ensures the infrastructure operates at peak efficiency, providing significant cost savings and improved ROI.
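In the spirit of the continuous analysis described above, the sketch below flags resources whose sustained utilization suggests over- or under-provisioning. The thresholds and sample data are hypothetical, not taken from the study’s deployment.

```python
# Illustrative rightsizing check: recommend an action from a window of
# utilization samples (0-1). Thresholds below are assumed examples.


def rightsizing_recommendation(samples: list[float],
                               low: float = 0.20,
                               high: float = 0.85) -> str:
    """Classify a resource as over-provisioned, saturated, or healthy."""
    avg = sum(samples) / len(samples)
    if avg < low:
        return "scale_down"  # over-provisioned: reclaim capacity
    if avg > high:
        return "scale_up"    # sustained load: risk of saturation
    return "keep"


print(rightsizing_recommendation([0.08, 0.12, 0.10, 0.09]))  # scale_down
print(rightsizing_recommendation([0.55, 0.60, 0.58, 0.62]))  # keep
```

Applying such checks continuously is what prevents the wasteful over-provisioning the text warns against.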
Our research highlights the importance of adaptive cloud management processes that can effectively respond to external changes and uncertainties. This approach supports the long-term sustainability of AI-driven optimizations aligned with global efficiency and sustainability standards [42,43].

5. Conclusions

This study evaluated the shift from a traditional monolithic architecture to a cloud-native MSA and its impact on resource efficiency and cost reduction. By integrating advanced AI and APM tools such as IBM Turbonomic ARM and Instana APM, we demonstrated significant improvements in operational efficiency and achieved cost savings of approximately USD 40,000 per month. This comparison between human-operated systems and AI-enhanced systems provided a robust evaluation of resource management effectiveness.
Key findings:
  • Operational efficiency: AI tools significantly optimize cloud resources, improving system performance and operational throughput.
  • Cost reduction: AI-driven automation leads to substantial cost savings, reducing operational expenses and enhancing cost efficiency.
  • Scalability: the adoption of a microservice architecture (MSA) supports scalable resource management and aligns with sustainable systems engineering principles.
Future research directions:
  • AI optimization techniques: further validation and exploration of AI-driven optimization techniques are necessary to fully understand their potential and limitations.
  • AI/ML features: a detailed analysis of machine learning-based features beyond basic performance tuning should be conducted to assess their unique benefits.
  • Integration with existing systems: investigate the integration of AI/ML functionalities with existing systems to define clear roles and responsibilities, ensuring seamless and effective adoption.
In conclusion, the integration of AI in cloud resource management not only enhances operational efficiency and cost savings but also aligns with Industry 5.0 goals of sustainability and efficiency. Future research should continue to explore and validate the practical applications of AI technologies to maximize their benefits in cloud computing environments.

Author Contributions

Conceptualization, D.Y. and C.S.; data curation, D.Y.; formal analysis, D.Y.; funding acquisition, D.Y., C.S. and Y.L.; investigation, D.Y.; methodology, D.Y. and C.S.; project administration, D.Y.; resources, D.Y.; software, D.Y., C.S. and Y.L.; supervision, D.Y.; validation, D.Y. and C.S.; visualization, D.Y.; writing—original draft preparation, D.Y. and C.S.; writing—review and editing, D.Y., C.S. and Y.L. All authors contributed to the initial analyses and provided feedback during the review stages. D.Y. served as the corresponding author and held the most senior role in the project, overseeing all aspects of the research and manuscript preparation. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kunduru, A.R. Artificial Intelligence Usage in Cloud Application Performance Improvement. Cent. Asian J. Math. Theory Comput. Sci. 2023, 4, 42–47. [Google Scholar]
  2. Lee, Y.-H.; Huang, K.-C.; Wu, C.-H.; Kuo, Y.-H.; Lai, K.-C. A Framework for Proactive Resource Provisioning in IaaS Clouds. Appl. Sci. 2017, 7, 777. [Google Scholar] [CrossRef]
  3. Banerjee, P.; Roy, S.; Modibbo, U.M.; Pandey, S.K.; Chaudhary, P.; Sinha, A.; Singh, N.K. OptiDJS+: A Next-Generation Enhanced Dynamic Johnson Sequencing Algorithm for Efficient Resource Scheduling in Distributed Overloading within Cloud Computing Environment. Electronics 2023, 12, 4123. [Google Scholar] [CrossRef]
  4. Joloudari, J.H.; Mojrian, S.; Saadatfar, H.; Nodehi, I.; Fazl, F.; Khanjani Shirkharkolaie, S.; Alizadehsani, R.; Kabir, H.M.D.; Tan, R.-S.; Acharya, U.R. Resource Allocation Optimization Using Artificial Intelligence Methods in Various Computing Paradigms: A Review. arXiv 2022, arXiv:2203.12315. [Google Scholar] [CrossRef]
  5. Fraga-Lamas, P.; Lopes, S.I.; Fernández-Caramés, T.M. Green IoT and Edge AI as Key Technological Enablers for a Sustainable Digital Transition towards a Smart Circular Economy: An Industry 5.0 Use Case. Sensors 2021, 21, 5745. [Google Scholar] [CrossRef]
  6. Butt, U.A.; Mehmood, M.; Shah, S.B.H.; Amin, R.; Shaukat, M.W.; Raza, S.M.; Suh, D.Y.; Piran, M.J. A Review of Machine Learning Algorithms for Cloud Computing Security. Electronics 2020, 9, 1379. [Google Scholar] [CrossRef]
  7. Aron, R.; Abraham, A. Resource Scheduling Methods for Cloud Computing Environment: The Role of Meta-heuristics and Artificial Intelligence. Eng. Appl. Artif. Intell. 2022, 116, 105345. [Google Scholar] [CrossRef]
  8. Adel, A. Unlocking the Future: Fostering Human–Machine Collaboration and Driving Intelligent Automation through Industry 5.0 in Smart Cities. Smart Cities 2023, 6, 2742–2782. [Google Scholar] [CrossRef]
  9. Hassan, M.U.; Al-Awady, A.A.; Ali, A.; Iqbal, M.M.; Akram, M.; Jamil, H. Smart Resource Allocation in Mobile Cloud Next-Generation Network (NGN) Orchestration with Context-Aware Data and Machine Learning for the Cost Optimization of Microservice Applications. Sensors 2024, 24, 865. [Google Scholar] [CrossRef]
  10. Zizic, M.C.; Mladineo, M.; Gjeldum, N.; Celent, L. From Industry 4.0 towards Industry 5.0: A Review and Analysis of Paradigm Shift for the People, Organization and Technology. Energies 2022, 15, 5221. [Google Scholar] [CrossRef]
  11. Abro, J.H.; Li, C.; Shafiq, M.; Vishnukumar, A.; Mewada, S.; Malpani, K.; Osei-Owusu, J. Artificial Intelligence Enabled Effective Fault Prediction Techniques in Cloud Computing Environment for Improving Resource Optimization. Sci. Program 2022, 2022, 7432949. [Google Scholar] [CrossRef]
  12. Schuler, L.; Jamil, S.; Kühl, N. AI-based Resource Allocation: Reinforcement Learning for Adaptive Auto-scaling in Serverless Environments. In Proceedings of the 2021 IEEE/ACM 21st International Symposium on Cluster, Cloud and Internet Computing (CCGrid), Melbourne, Australia, 10–13 May 2021; pp. 804–811. [Google Scholar] [CrossRef]
  13. Ilager, S.; Muralidhar, R.; Buyya, R. Artificial Intelligence (AI)-Centric Management of Resources in Modern Distributed Computing Systems. In Proceedings of the 2020 IEEE Cloud Summit, Harrisburg, PA, USA, 21–22 October 2020; pp. 1–10. [Google Scholar] [CrossRef]
  14. Ghobaei-Arani, M.; Shamsi, M.; Rahmanian, A.A. An Efficient Approach for Improving Virtual Machine Placement in Cloud Computing Environment. J. Exp. Theor. Artif. Intell. 2017, 29, 1149–1171. [Google Scholar] [CrossRef]
  15. Aryal, R.G.; Altmann, J. Dynamic Application Deployment in Federations of Clouds and Edge Resources Using a Multiobjective Optimization AI Algorithm. In Proceedings of the 2018 Third International Conference on Fog and Mobile Edge Computing (FMEC), Barcelona, Spain, 23–26 April 2018; pp. 147–154. [Google Scholar] [CrossRef]
  16. Carvajal, A.; Garcia-Colon, V.R. High Capacity Motors On-line Diagnosis Based on Ultra Wide Band Partial Discharge Detection. In Proceedings of the 4th IEEE International Symposium on Diagnostics for Electric Machines, Power Electronics and Drives (SDEMPED 2003), Atlanta, GA, USA, 24–26 August 2003; pp. 168–170. [Google Scholar] [CrossRef]
  17. Ucar, A.; Karakose, M.; Kirimca, N. Artificial Intelligence for Predictive Maintenance Applications: Key Components, Trustworthiness, and Future Trends. Appl. Sci. 2024, 14, 898. [Google Scholar] [CrossRef]
  18. Zhang, Z.; Wang, T.; Li, A.; Zhang, W. Adaptive Auto-Scaling of Delay-Sensitive Serverless Services with Reinforcement Learning. In Proceedings of the 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), Los Alamitos, CA, USA, 27 June–1 July 2022; pp. 866–871. [Google Scholar] [CrossRef]
  19. Robertson, J.; Fossaceca, J.M.; Bennett, K.W. A Cloud-Based Computing Framework for Artificial Intelligence Innovation in Support of Multidomain Operations. IEEE Trans. Eng. Manag. 2022, 69, 3913–3922. [Google Scholar] [CrossRef]
  20. Boudi, A.; Bagaa, M.; Pöyhönen, P.; Taleb, T.; Flinck, H. AI-Based Resource Management in Beyond 5G Cloud Native Environment. IEEE Netw. 2021, 35, 128–135. [Google Scholar] [CrossRef]
  21. Walia, G.K.; Kumar, M.; Gill, S.S. AI-Empowered Fog/Edge Resource Management for IoT Applications: A Comprehensive Review, Research Challenges, and Future Perspectives. IEEE Commun. Surv. Tutor. 2024, 26, 619–669. [Google Scholar] [CrossRef]
  22. Huang, S.-Y.; Chen, C.-Y.; Chen, J.-Y.; Chao, H.-C. A Survey on Resource Management for Cloud Native Mobile Computing: Opportunities and Challenges. Symmetry 2023, 15, 538. [Google Scholar] [CrossRef]
  23. Fedushko, S.; Ustyianovych, T.; Gregus, M. Real-Time High-Load Infrastructure Transaction Status Output Prediction Using Operational Intelligence and Big Data Technologies. Electronics 2020, 9, 668. [Google Scholar] [CrossRef]
  24. Wan, X.; Guan, X.; Wang, T.; Bai, G.; Choi, B.-Y. Application Deployment Using Microservice and Docker Containers: Framework and Optimization. J. Netw. Comput. Appl. 2018, 119, 97–109. [Google Scholar] [CrossRef]
  25. Aldoseri, A.; Al-Khalifa, K.N.; Hamouda, A.M. AI-Powered Innovation in Digital Transformation: Key Pillars and Industry Impact. Sustainability 2024, 16, 1790. [Google Scholar] [CrossRef]
  26. Marques, G.; Senna, C.; Sargento, S.; Carvalho, L.; Pereira, L.; Matos, R. Proactive Resource Management for Cloud of Services Environments. Future Gener. Comput. Syst. 2024, 150, 90–102. [Google Scholar] [CrossRef]
  27. Eramo, V.; Lavacca, F.G.; Catena, T.; Perez Salazar, P.J. Proposal and Investigation of an Artificial Intelligence (AI)-Based Cloud Resource Allocation Algorithm in Network Function Virtualization Architectures. Future Internet 2020, 12, 196. [Google Scholar] [CrossRef]
  28. Benedetti, P.; Femminella, M.; Reali, G.; Steenhaut, K. Reinforcement Learning Applicability for Resource-Based Auto-scaling in Serverless Edge Applications. In Proceedings of the 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), Pisa, Italy, 21–25 March 2022; pp. 674–679. [Google Scholar] [CrossRef]
  29. Sajjad, M.; Ali, A.; Khan, A.S. Performance Evaluation of Cloud Computing Resources. Int. J. Adv. Comput. Sci. Appl. 2018, 9, 824. [Google Scholar] [CrossRef]
  30. Gill, S.S.; Tuli, S.; Xu, M.; Singh, I.; Singh, K.V.; Lindsay, D.; Tuli, S.; Smirnova, D.; Singh, M.; Jain, U.; et al. Transformative Effects of IoT, Blockchain and Artificial Intelligence on Cloud Computing: Evolution, Vision, Trends and Open Challenges. Internet Things 2019, 8, 100118. [Google Scholar] [CrossRef]
  31. Liu, L.; Chang, Z.; Guo, X.; Ristaniemi, T. Multi-objective Optimization for Computation Offloading in Mobile-Edge Computing. In Proceedings of the 2017 IEEE Symposium on Computers and Communications (ISCC), Heraklion, Greece, 3–6 July 2017; pp. 832–837. [Google Scholar] [CrossRef]
  32. Bartsiokas, I.A.; Gkonis, P.K.; Kaklamani, D.I.; Venieris, I.S. ML-Based Radio Resource Management in 5G and Beyond Networks: A Survey. IEEE Access 2022, 10, 83507–83528. [Google Scholar] [CrossRef]
33. Nekovee, M.; Sharma, S.; Uniyal, N.; Nag, A.; Nejabati, R.; Simeonidou, D. Towards AI-enabled Microservice Architecture for Network Function Virtualization. In Proceedings of the 2020 IEEE Eighth International Conference on Communications and Networking (ComNet), Hammamet, Tunisia, 27–30 October 2020; pp. 1–8.
34. Zafeiropoulos, A.; Fotopoulou, E.; Filinis, N.; Papavassiliou, S. Reinforcement Learning-Assisted Autoscaling Mechanisms for Serverless Computing Platforms. Simul. Model. Pract. Theory 2022, 116, 102461.
35. Ahmed, Q.W.; Garg, S.; Rai, A.; Ramachandran, M.; Jhanjhi, N.Z.; Masud, M.; Baz, M. AI-Based Resource Allocation Techniques in Wireless Sensor Internet of Things Networks in Energy Efficiency with Data Optimization. Electronics 2022, 11, 2071.
36. Khaleel, M.I.; Safran, M.; Alfarhood, S.; Zhu, M. Workflow Scheduling Scheme for Optimized Reliability and End-to-End Delay Control in Cloud Computing Using AI-Based Modeling. Mathematics 2023, 11, 4334.
37. Wang, Z.; Zhou, Z.; Zhang, H.; Zhang, G.; Ding, H.; Farouk, A. AI-Based Cloud-Edge-Device Collaboration in 6G Space-Air-Ground Integrated Power IoT. IEEE Wirel. Commun. 2022, 29, 16–23.
38. Sagi, S. Hybrid AI: Harnessing the Power of Cloud and On-Premise Datacenter for Enterprise AI Use Cases. J. Artif. Intell. Cloud Comput. 2024, 3, 1–4.
39. Valdez, M.G.; Merelo Guervós, J.J. A Container-Based Cloud-Native Architecture for the Reproducible Execution of Multi-Population Optimization Algorithms. Future Gener. Comput. Syst. 2021, 116, 234–252.
40. Aldhyani, T.H.H.; Alkahtani, H. Artificial Intelligence Algorithm-Based Economic Denial of Sustainability Attack Detection Systems: Cloud Computing Environments. Sensors 2022, 22, 4685.
41. Karamthulla, M.J.; Malaiyappan, J.N.A.; Tillu, R. Optimizing Resource Allocation in Cloud Infrastructure through AI Automation: A Comparative Study. J. Knowl. Learn. Sci. Technol. 2023, 2, 315–326.
42. Liang, Q.; Hanafy, W.A.; Ali-Eldin, A.; Shenoy, P. Model-driven Cluster Resource Management for AI Workloads in Edge Clouds. ACM Trans. Auton. Adapt. Syst. 2023, 18, 2.
43. Bermejo, B.; Juiz, C. Improving Cloud/Edge Sustainability through Artificial Intelligence: A Systematic Review. J. Parallel Distrib. Comput. 2023, 176, 41–54.
Figure 1. Comparison of traditional AI model management (Left) vs. Turbonomic AI model management (Right).
Figure 2. Transition from monolithic to microservice architecture.
Figure 3. PoC for configuration.
Figure 4. Cloud resource management process.
Figure 5. Using an APM tool to measure the utilization process of cloud resources.
Figure 6. Overall methodology and evaluation framework in Section 2.
Figure 7. Application transaction analysis.
Figure 8. An overview of microservice architecture.
Figure 9. AI-generated recommendations for optimization.
Figure 10. Result of cost analysis.
Table 1. Comparative performance metrics.
| Metric | Traditional AI | Turbonomic AI |
|---|---|---|
| Model refresh interval | Weekly | Automated |
| Resource utilization | 70% | 90% |
| Downtime reduction | 10% | 30% |
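The automated refresh behavior contrasted in Table 1 can be sketched as a minimal right-sizing check evaluated continuously on live metrics rather than on a weekly review cycle. The function name and thresholds below are illustrative assumptions, not Turbonomic's actual decision logic:

```python
def recommend_action(cpu_samples, low=0.40, high=0.80):
    """Recommend a right-sizing action from recent CPU utilization
    samples (fractions in [0, 1]). Thresholds are illustrative only."""
    avg = sum(cpu_samples) / len(cpu_samples)
    if avg > high:
        return "scale_up"    # sustained pressure: add capacity
    if avg < low:
        return "scale_down"  # over-provisioned: reclaim capacity
    return "no_change"       # within the target utilization band

# A host averaging ~92% CPU is flagged for scale-up on the next poll.
print(recommend_action([0.90, 0.95, 0.92]))
```

Because the check runs on every polling interval, a saturated or idle resource is acted on within minutes instead of waiting for the next manual model refresh.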
Table 2. Application modernization objectives.
| Item | Approach |
|---|---|
| Software structure | From monolithic to MSA |
| Build methodology | Transitioning services unit by unit |
| Transition method | From monolithic to service-oriented, improving resource efficiency by moving from VMs to containerization |
| CI/CD processes | Continuous integration (CI) and continuous deployment (CD) automate the building, testing, and deployment of updates, reducing lead times by 20%; these processes are essential in a microservice environment, enabling rapid and reliable software delivery |
| Refactoring approach | From waterfall to iterative, modular refactoring |
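The CI/CD row above describes a gated build–test–deploy flow in which later stages run only after earlier ones pass. A minimal sketch of that gating (hypothetical stage names; real pipelines would be defined in a CI tool's own configuration) is:

```python
def run_pipeline(stages):
    """Execute pipeline stages in order, stopping at the first failure.
    Each stage is a (name, callable) pair whose callable returns True
    on success."""
    completed = []
    for name, step in stages:
        if not step():
            return completed, f"failed at {name}"
        completed.append(name)
    return completed, "success"

# Deployment only happens once build and test have both passed.
stages = [("build", lambda: True),
          ("test", lambda: True),
          ("deploy", lambda: True)]
print(run_pipeline(stages))
```

This fail-fast ordering is what lets a microservice be released unit by unit: a broken test blocks only that service's deployment, without manual coordination.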
Table 3. Comparison of AI-driven approach with existing methodologies.
| Comparison Item | Our Approach | Lee et al. (2017) [2] | Banerjee et al. (2023) [3] |
|---|---|---|---|
| Resource management method | Automation through real-time AI learning | Human-monitored trial-and-error approach | Dynamic resource allocation algorithm |
| Human intervention | None | Required | Required |
| Prediction basis | Real-time data | Historical data | Simulations and extensive testing |
| System scale | Large-scale cloud systems | Limited (small-scale) | Suitable for small-scale operations |
| Key features | Dynamic relationship analysis and resource allocation; cost/time savings | Proactive resource provisioning framework | Enhanced dynamic Johnson sequencing algorithm |
| Results | High prediction accuracy and cost savings | Limited over-commitment, improved prediction rates | Expected efficiency, but not suitable for large-scale operations |
Table 4. Overview of system resource utilization in a cloud-based environment.
| Resource | Average, Production (%) | Average, Development/QA (%) | Peak, Production (%) | Peak, Development/QA (%) |
|---|---|---|---|---|
| Web (9 VM) | 13.4 | 34 | 68 | 63 |
| WAS (9 VM) | 6.7 | 10.8 | 62 | 64 |
| DB (2 VM) | 7.8 | 2.5 | 51 | 35.1 |
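The per-resource average and peak figures in Table 4 can be derived from raw utilization samples collected by a monitoring agent. A small sketch (the function name and sample values are assumptions for illustration) is:

```python
def summarize_utilization(samples):
    """Summarize a series of utilization samples (percentages) into the
    (average, peak) pair reported per resource in Table 4, with the
    average rounded to one decimal place."""
    avg = round(sum(samples) / len(samples), 1)
    return avg, max(samples)

# e.g. three polls of a web VM's CPU utilization
print(summarize_utilization([10.2, 15.0, 20.0]))
```

Aggregating this way over a full observation window is what exposes the over-provisioning visible in the table: production averages in the single digits against peaks of 50–70%.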
Table 5. Cross-environment function check capabilities.
| Function Check | Development: A System | Development: K8S | Development: AWS | Operation: A System | Operation: K8S | Operation: AWS |
|---|---|---|---|---|---|---|
| Search related resource list | O | O | O | O | O | O |
| Search for metric information by resource | O | O | O | O | O | O |
| Automatic setting of associations by resource | O | O | O | O | O | O |

Seo, C.; Yoo, D.; Lee, Y. Empowering Sustainable Industrial and Service Systems through AI-Enhanced Cloud Resource Optimization. Sustainability 2024, 16, 5095. https://doi.org/10.3390/su16125095