Federated Learning for Cloud and Edge Security: A Systematic Review of Challenges and AI Opportunities
Abstract
1. Introduction
- Examine the role of FL and AI in cloud security and determine how these two emerging technologies can improve it.
- Identify opportunities for applying FL and AI across cloud security fields, including threat identification, privacy preservation, and access control.
- Explore the key challenges of FL in cloud security, such as data heterogeneity, communication overhead, and model convergence.
- Review published studies to identify current trends and gaps in FL and AI contributions to cloud security.
- Suggest future directions by recommending aspects of FL and AI in cloud security enhancement that require further study.
2. Methodology
2.1. Planning Phase
2.1.1. Research Questions
- RQ1: What are FL and AI, and how can they effectively help address the data privacy and security concerns of the cloud? Justification: Centralized security models of cloud computing present vulnerabilities in data privacy, unauthorized access, and compliance. Recent studies suggest that FL and AI can address these risks by moving data processing to the edge, reducing the attack surface, and enabling real-time threat detection. Understanding the role of FL and AI in improving security frameworks is crucial for tackling these challenges without compromising system performance and scalability.
- RQ2: In which fields of cloud security are FL and AI most valuable? Justification: Cloud security spans multiple areas, such as threat detection, privacy protection, access control, and compliance monitoring. Traditional AI-driven security models depend on centralized data aggregation, whereas FL provides a privacy-preserving alternative. However, little research explicitly categorizes the security applications that benefit most from FL and AI. This paper aims to address this gap by identifying these fields, which will provide insights into optimizing FL’s implementation for maximum impact.
- RQ3: What challenges arise when applying FL in the cloud? Justification: While FL improves privacy by keeping data decentralized, its practical deployment in cloud environments faces the following challenges:
- Data heterogeneity: Cloud users employ diverse systems with different data formats and distributions, which affects FL model convergence.
- Communication overhead: Frequent model updates between clients and servers cause latency and consume bandwidth.
- Security vulnerabilities: Gradient leakage and model poisoning are among the adversarial attacks likely to affect FL implementations.
It is important to understand these issues in order to develop solutions that improve FL in cloud security.
- RQ4: What are the computational issues related to FL in the cloud environment? Justification: The effectiveness of FL for cloud security is directly related to computational efficiency. Compared with centralized AI, FL needs much more computational power on edge devices and cloud nodes to perform local training and global model aggregation. The main challenges include the following:
- Resource constraints: The limited processing power available on edge devices slows local training.
- Model aggregation complexity: Combining updates from multiple clients can be complex, which may introduce inefficiency and increase computation time.
- Energy consumption: Optimizing power consumption is a critical concern when training FL models in a distributed system, especially on IoT and mobile devices.
By considering these issues, the paper aims to suggest optimization techniques for enhancing FL’s efficiency in cloud environments.
- RQ5: How do FL and AI contribute to meeting data privacy and security regulations for clouds? Justification: The General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) are regulatory frameworks that require cloud service providers to implement strict data protection measures. Traditional security models are generally inadequate for meeting these regulations because they are based on centralized data processing. FL works on a decentralized principle that keeps raw data on local devices, which is in line with legal requirements. However, there is limited empirical evidence to date on how effectively FL complies with data security standards while preserving its performance. Answering this research question will enable the evaluation of FL’s position in satisfying legal and ethical data security standards.
2.1.2. Inclusion and Exclusion Criteria
- Inclusion criteria
  - Date of publication: Only papers published from 2020 to 2024 are included so that the reviewed information is up to date.
  - Relevance to the field: This SLR includes papers that are devoted to using FL and AI to secure the cloud.
  - Language: Only English-language papers are included so that all readers can read and understand them.
  - Peer-reviewed: We include only papers that have undergone peer review, namely peer-reviewed journal papers together with selected conference papers and technical papers.
  - Full-text access: We include only papers whose full content is accessible for further examination of the topic.
  - Original research papers: This SLR covers papers that present research results, whether empirical outcomes or theoretical analysis, that advance knowledge of using FL and AI to secure the cloud.
- Exclusion criteria
  - Irrelevant papers: Papers that do not focus on using FL and AI to secure the cloud are excluded.
  - Non-peer-reviewed: Non-scientific publications, such as grey literature and opinion papers, are excluded. Grey literature, including preprints (e.g., arXiv), often lacks formal peer review, so these papers are excluded from the selection.
  - Non-English papers: To avoid translation errors and ensure that the content is understood correctly, only papers in English are considered for this SLR.
  - Duplicate studies: Studies identified more than once across different databases are treated as duplicates.
  - Inaccessible papers: Papers whose full text cannot be read are not considered in the final selection.
  - Paper length: Brief papers that do not allow comprehensive understanding or give insufficient detail about the topic are excluded.
2.2. Conducting Phase
2.2.1. Data Sources
2.2.2. Search String
2.3. Reporting Phase
2.3.1. Screening Process
2.3.2. Selection Process
- Inclusion and exclusion criteria: We included papers from 2020–2024 that focused on FL and AI in cloud security and were published in English.
- Quality screening: Non-peer-reviewed sources, inaccessible full texts, and duplicate studies were removed.
- Final selection (relevance and contribution): Papers without original contributions, with weak methodologies, or lacking empirical evidence were removed.
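The first two selection steps are mechanical enough to sketch in code. The following is a minimal illustration only, not the tooling used in this review: the `Paper` record and its field names are hypothetical, duplicates are detected by a simple normalized-title match (an assumption), and the final relevance-and-contribution step is left out because it relies on reviewer judgement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical bibliographic record; the field names are illustrative."""
    title: str
    year: int
    language: str
    peer_reviewed: bool
    full_text_available: bool
    on_topic: bool  # focuses on FL and AI for cloud security


def passes_criteria(paper: Paper) -> bool:
    """Apply the inclusion criteria and the quality screening to one record."""
    return (
        2020 <= paper.year <= 2024
        and paper.language == "English"
        and paper.on_topic
        and paper.peer_reviewed
        and paper.full_text_available
    )


def screen(papers: list[Paper]) -> list[Paper]:
    """Drop duplicates (same normalized title), then filter by the criteria."""
    seen: set[str] = set()
    selected: list[Paper] = []
    for paper in papers:
        # Assumption: two records are duplicates if their titles match
        # after lowercasing and collapsing whitespace.
        key = " ".join(paper.title.lower().split())
        if key in seen:
            continue
        seen.add(key)
        if passes_criteria(paper):
            selected.append(paper)
    return selected
```

Calling `screen(records)` on the merged database exports would return the candidates that still need the manual relevance and contribution assessment.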
3. Background
3.1. FL
3.1.1. Overview
3.1.2. Communication and Architectures
- Centralized architecture: In this approach, a central server gathers model parameters from all participating clients (the devices used in the training process) and produces a single, unified model from these updates. Every client trains on its local data and, when local training finishes, sends the new model parameters, such as weights, to the server. The server receives these updates from all the clients, aggregates them, and produces the global FL model, which is sent back to the clients for further updates. Although this approach provides a simple and clear solution for aggregation, it raises privacy and security concerns. For example, even if the raw data are not transferred to the server, potential attackers can still extract some information from the model updates [22].
- Decentralized architecture: In decentralized FL, there is no single point of control. Instead, a number of devices or servers collaborate in a distributed fashion to train the model. Every device or server exchanges information with other devices and performs model fusion locally. This approach eliminates the risks of concentrating all functions in one central point, but it introduces challenges in inter-device communication and coordination. Decentralized architectures are usually supported by technologies such as blockchain, which improves reliability and secures communication among the participating clients [22].
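The centralized round described above (broadcast, local training, server-side aggregation) can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: the server combines client parameters with a sample-count-weighted average in the style of federated averaging, the local model is a toy linear fit, and all function names (`federated_average`, `local_train`, `run_round`) are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Weighted average of client parameter vectors (FedAvg-style aggregation)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all updates must have the same length")
        for i, w in enumerate(weights):
            global_weights[i] += w * n / total
    return global_weights


def local_train(weights, samples, lr=0.05, epochs=20):
    """Toy local step: fit y = w0 * x + w1 by gradient descent on private samples."""
    w0, w1 = weights
    for _ in range(epochs):
        for x, y in samples:
            err = (w0 * x + w1) - y
            w0 -= lr * err * x
            w1 -= lr * err
    return [w0, w1]


def run_round(global_weights, client_datasets):
    """One round: broadcast the model, train locally, aggregate on the server."""
    # Only parameters leave each client; the raw samples stay local.
    updates = [local_train(list(global_weights), data) for data in client_datasets]
    sizes = [len(data) for data in client_datasets]
    return federated_average(updates, sizes)
```

A driver would call `run_round` repeatedly, feeding each returned global model into the next round. Note that the parameter updates themselves are still visible to the server, which is the leakage risk mentioned above.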
3.1.3. Scale of Federation
3.1.4. Security and Privacy in FL
3.2. Privacy-Preserving Mechanisms in FL
3.3. Scalability and Communication Challenges
3.4. Security Vulnerabilities and Adversarial Threats
3.5. Opportunities for Advanced FL Integration
3.5.1. Federated Averaging (FedAvg)
3.5.2. Federated Learning in Healthcare
3.5.3. Federated Learning in Finance and Banking
3.5.4. Edge Computing and IoT
3.5.5. FL in Dynamic Edge Environments
- Real-time decision-making: Integrating FL into edge AI enables models to operate in real time by training and updating locally on edge devices. Training on edge devices cuts the delay of back-and-forth data transmission to central servers, resulting in more rapid responses. The method also improves data privacy through local data management, which reduces the exposure of sensitive information during transfers to external servers [48].
- Applications (autonomous vehicles): FL enables vehicles to improve one another’s driving capabilities by sharing what they learn from different driving conditions, which supports real-time decision-making without disclosing raw information.
- Smart manufacturing: Under Industry 4.0, FL allows machines to learn failure patterns without sharing operation-specific data, thus improving their predictive performance and operational effectiveness.
- Healthcare: Medical institutions can jointly train AI models with FL while keeping their data protected, which supports quick clinical decisions and maintains patient confidentiality.
3.5.6. Federated Learning in Natural Language Processing (NLP)
3.6. Real-World Applications and Insights
3.6.1. Google’s Zero Trust Implementation
3.6.2. FL in Healthcare
3.7. Cloud Computing
3.7.1. Introduction to Cloud Computing
3.7.2. Key Technologies Enabling Cloud Computing
- Virtualization: Most cloud providers rely on virtualization technology to create several virtual machines (VMs) on one physical server. Virtualization lets different applications and separately authenticated users share the same physical hardware by dividing resources such as CPU, memory, and storage and isolating them from each other. It enhances resource utilization, scalability, and flexibility by allowing virtual machines to be provisioned, managed, and scaled easily according to demand.
- Distributed Computing: Distributed computing splits a computational task among multiple computers or servers that work in coordination to achieve a common goal. In cloud computing, it is used for large-scale processing and storage across geographically dispersed data centers. This model increases reliability, because tasks can be distributed among multiple nodes so that no single point of failure exists, and it increases scalability, because additional resources can be brought online as demand grows.
- Network Infrastructure: Cloud computing needs a robust network infrastructure that connects clients, servers, and data centers. High-speed Internet, advanced networking protocols, and data transmission technologies are all essential to guarantee trouble-free access to cloud resources. Together with data redundancy and load balancing, an efficient network infrastructure helps cloud providers deliver consistent performance and availability across global locations.
3.7.3. Cloud Computing Security and Privacy
3.7.4. Security Challenges in Cloud Computing
- Data Breach and Data Leakage: Data breaches are a central concern in securing cloud environments. Sensitive information, such as credit card data, is stored on third-party servers in the cloud, so its owner loses visibility of it, and access over the Internet exposes it to unauthorized access, hacking, and other forms of cyber threats. Unlike conventional data centers, cloud environments are reachable from everywhere, which makes data more likely to be intercepted or exposed if appropriate security measures are not in place. Besides being caused by malicious users, data leakage can also result from misconfigured access settings, a lack of encryption, or vulnerabilities in shared resources [62].
Over the past few years, an increasing number of organizations have adopted cloud solutions, and security breaches have grown along with cloud adoption. According to [63,64], the number of reported cloud breaches rose steadily from 1200 in 2020 to 1800 in 2023. This 50% increase over four years highlights fundamental vulnerabilities in cloud infrastructure as organizations shift to the cloud for data storage, collaboration, and operations. Cloud adoption accelerated in 2020 as companies shifted to remote work, but too many companies did not have adequate security in place, leading to major incidents such as ransomware and unauthorized data access. In 2021, the number of breaches stood at 1350, with cybercriminals targeting misconfigured cloud storage services and weak access controls. The trend continued in 2022, with breaches reaching 1550 incidents, as attackers used increasingly sophisticated phishing techniques, API misconfigurations, and supply chain vulnerabilities to compromise cloud-based systems [64]. In 2023, the number of breaches rose further to 1800. Advanced persistent threats (APTs) and inadequate coordination across platforms in multi-cloud environments were the causes of this sharp rise. According to [65], the higher complexity of hybrid and multi-cloud infrastructures makes it harder to maintain strong security protocols, and cyber adversaries exploit the resulting weak links in cloud ecosystems.
The statistics above underscore the urgent need for organizations to adopt advanced security measures, including Zero Trust architectures, AI-driven security and threat detection solutions, and privacy-preserving technologies like FL. In addition, regulatory compliance frameworks like GDPR and the California Consumer Privacy Act (CCPA) require organizations to take a proactive stance against these vulnerabilities. Cloud breaches are on an upward trajectory, and cloud security has become a critical part of any digital transformation strategy.
- Top 10 security breaches in cloud computing: The frequency of cloud security breaches is increasing, and AI and FL can play an important role in mitigating such risks. Below are several prominent breaches and how AI and FL could have been instrumental in addressing the challenges they presented:
- Facebook Data Leak (2021): Poorly configured databases exposed over 530 million user records, including phone numbers and email addresses [66]. An AI-driven anomaly detection system could have detected unusual database queries or access patterns in real time and stopped data exfiltration.
- Alibaba’s Taobao Breach (2019): The unauthorized scraping of millions of user details was due to unsecured cloud storage. The privacy-preserving capabilities of FL could have allowed for secure, decentralized analysis of sensitive data without exposing them to outside access [67].
- LinkedIn Data Scraping (2021): Insecure API configurations allowed the scraping of personal data of 700 million users [68]. FL could have trained AI systems to dynamically monitor and restrict API misuse, thereby reducing exposure risks by a large margin.
- Capital One Breach (2019): In total, 100 million records were exposed from a misconfigured AWS server [69]. Misconfigurations could have been identified, and security teams could have been proactively alerted by AI-enabled threat detection before data exfiltration.
- Sina Weibo Breach (2020): Weak data management practices led to over 538 million user records being stolen [72]. Using AI-driven behavioral analytics, suspicious access attempts could have been detected and stricter authentication measures put in place to protect data.
- Accenture Ransomware Attack (2021): LockBit encrypted client data and demanded a USD 50 million ransom [73]. AI systems could have identified early ransomware behaviors and isolated the affected systems to prevent widespread encryption.
- Toyota Cloud Breach (2022): Client and employee sensitive data were exposed [74]. Without centralizing sensitive information, FL could have facilitated secure collaboration among Toyota’s global teams, thereby reducing breach risks.
- AWS Credential Leak (2022): Millions of AWS credentials were exposed due to insecure API configurations [75]. AI real-time monitoring of API usage could have detected unusual patterns and automatically disabled compromised credentials.
- Verizon Cloud Leak (2017): A misconfigured cloud storage by a third-party partner exposed over 14 million customer call logs [76]. With FL, secure analytics could have been performed across third-party systems while preserving data privacy and reducing dependency on direct access.
- Insecure APIs and Interfaces: Interfaces and APIs are the route by which users and applications interact with cloud services. If these APIs are not secured properly, they can serve as an entry point for cyber attackers. Insecure APIs may lack the authentication, authorization, or encryption features that prevent others from viewing, changing, or even deleting data. Since most APIs are available over the Internet, any vulnerabilities are exposed to malicious actors, who could exploit them and take full control of the cloud environment [77].
- Account Hijacking: Account hijacking occurs when attackers gain unauthorized access to a user’s accounts, for instance, by phishing users into giving up their usernames and passwords or through weak passwords and credential theft [78]. In cloud environments where resources are shared, account compromise can be severe, because attackers can move laterally to other parts of the network, intercept and decrypt sensitive data, and tamper with resources. Security measures that help prevent account hijacking include MFA, strong passwords, and account monitoring.
- Insider Threats: Cloud security is vulnerable to insider threats, both intentional and unintentional. Staff or contractors with access to sensitive data and systems may exploit that access for personal gain or accidentally expose data through negligence [78]. As an organization grows and gives more users access to its cloud environment, managing and monitoring insider access becomes complicated. Strictly defined access controls, ongoing internal reviews, and employee security awareness training can mitigate insider threats.
- Compliance and Regulatory Issues: Organizations storing or processing sensitive information in the cloud must comply with industry regulations and standards (including GDPR, HIPAA, or PCI DSS) [79]. Because of the shared responsibility model of security in cloud computing, where both cloud providers and customers have to manage security, it is difficult to keep up with compliance. Working with multiple cloud providers adds complexity, since the organization must ensure that each provider complies with regulatory requirements while also implementing the necessary controls on the customer side. Non-compliance can result in severe legal consequences, fines, and reputational damage.
- Shared Responsibility Model Complexity: Security responsibility in cloud computing is divided between the cloud provider and the customer. Cloud providers are typically responsible for the underlying infrastructure and physical security, whereas customers are responsible for their data, applications, and access controls. This model of shared responsibility can create confusion and security gaps if customers mistakenly believe that the provider handles everything when it comes to security [80]. Clearly defined roles and responsibilities are therefore essential in cloud deployments to avoid misconfigurations and vulnerabilities.
- Visibility and Control: As data and applications reside in third-party cloud environments, organizations may lose visibility and control over their resources. Customers often have no control over the infrastructure, so it becomes difficult for them to monitor activities, detect threats, and take necessary actions in real time. Limited visibility can also leave organizations unable to discover possible security issues quickly and enforce policy [80]. Logging, monitoring, and auditing tools help provide visibility in the cloud, but they require additional resources and expertise.
- Data Loss and Disaster Recovery: Cloud environments are prone to data loss through accidental deletion, hardware failure, and similar events. Cloud providers do not necessarily provide a complete disaster recovery solution; instead, it is up to the customer to devise backup strategies and restore procedures for the data and applications under their care [81]. Customers therefore cannot simply fall back on the provider’s recovery solutions and expect a complete recovery every time. Regular data backups, disaster recovery testing, and redundant systems reduce the risk of data loss.
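Several of the breach examples above point to detecting unusual access or API usage patterns. As a rough illustration of the idea, the sketch below flags request counts that deviate strongly from a historical baseline. It is a deliberately simple statistical stand-in (a z-score test), not the AI-driven systems discussed in the cited studies, and the function name, the threshold of 3.0, and the sample numbers are all assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, observed, threshold=3.0):
    """Return indices of observed counts that lie more than `threshold`
    standard deviations away from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("baseline needs at least two points")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Degenerate baseline: any deviation from the constant value is flagged.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]


# Hypothetical per-minute API request counts for one credential.
baseline_counts = [100, 102, 98, 101, 99]
new_counts = [101, 97, 450, 103]
suspicious = flag_anomalies(baseline_counts, new_counts)
```

Here only the burst of 450 requests is flagged. A production system would typically use richer features and learned models, and in an FL setting the detector could be trained across tenants without pooling their raw logs.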
3.8. Edge Computing
3.8.1. Introduction to Edge Computing
3.8.2. Key Technologies Enabling Edge Computing
- IoT: The IoT refers to a network of devices connected to one another and to the Internet that collect, transmit, and sometimes process data about their environment [85]. IoT devices are therefore critical to edge computing, producing massive quantities of data close to the edge of the network, sometimes in real time. With edge computing, IoT devices can process data locally and send information to a centralized cloud less often, which reduces latency, improves response times, and conserves bandwidth. Examples of IoT applications for edge computing range from smart home devices to industrial sensors, healthcare wearables, and autonomous vehicles [85].
- 5G and Network Advancements: Fifth-generation (5G) technology and other networking advancements enable the high-speed, low-latency network connectivity that edge computing depends on. Because 5G increases data transfer rates significantly, edge devices can communicate faster with local data centers or gateways [86]. This speed and low latency are crucial for real-time use cases such as remote surgery, autonomous driving, and augmented reality, which cannot tolerate delays. Furthermore, 5G supports many more connected devices per area than 4G, enabling edge computing to scale much better as the number of IoT devices grows [86].
- AI and Machine Learning at the Edge: By running AI and ML at the edge, devices can analyze data in real time and make autonomous decisions without needing a live connection to a centralized cloud resource. Organizations can carry out image recognition, anomaly detection, predictive maintenance, and language processing with minimal latency by deploying AI and ML algorithms directly on edge devices or local gateways [87]. AI and ML models at the edge are optimized for low power and resource efficiency, so they can run on smaller devices with limited computation power [87]. This local intelligence allows edge devices to function independently of the central hub and react immediately to changing conditions, which makes edge computing more effective for use cases that call for rapid, data-driven responses.
3.8.3. Benefits of Edge Computing
- Reduced Latency and Faster Response Times: Edge computing processes data locally, which saves much of the time the data would otherwise take to travel to a centralized server and back. This low latency is critical for applications that require an immediate answer, such as autonomous vehicles, industrial automation, and healthcare monitoring. Faster response times improve the user experience and enable near-instant decision-making in scenarios where milliseconds matter [82].
- Bandwidth Optimization: Edge computing optimizes bandwidth usage by processing and filtering data near the source and discarding noncritical data, instead of sending raw data to a centralized cloud for analysis. Only relevant information or summary data need to travel over the network [83,88]. Besides conserving bandwidth, this reduces transmission costs, which is especially appealing for IoT deployments in which many devices generate data constantly.
- Enhanced Privacy and Data Security: Because edge computing keeps data closer to their source, it reduces the need to transmit sensitive information over potentially vulnerable networks. Data that are processed and stored locally are less exposed to outside threats, which improves privacy and security [88]. This is very helpful in industries like healthcare and finance, where data privacy requirements are strict. Further, edge devices can implement specific security measures and encryption protocols that add further layers of protection for sensitive data.
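The bandwidth benefit described above comes from sending summaries instead of raw readings. The sketch below illustrates this under simple assumptions: an edge node reduces a window of sensor readings to a few aggregate values and forwards individual readings only when they exceed an alert threshold. The function name, the summary fields, and the threshold are hypothetical choices for illustration.

```python
from statistics import mean


def summarize_window(readings, alert_threshold):
    """Reduce a window of raw sensor readings to a compact summary
    plus any readings that exceed the alert threshold."""
    if not readings:
        raise ValueError("window must contain at least one reading")
    return {
        "count": len(readings),
        "min": min(readings),
        "max": max(readings),
        "mean": mean(readings),
        # Only critical values are forwarded individually.
        "alerts": [r for r in readings if r > alert_threshold],
    }


# Hypothetical temperature window collected on an edge device.
window = [20.0, 21.0, 22.0, 95.0]
payload = summarize_window(window, alert_threshold=80.0)
```

Instead of uploading every reading, the device would transmit only `payload`, whose size stays roughly constant however many routine readings the window contains.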
3.8.4. Challenges and Limitations of Edge Computing
- Limited Processing and Storage Capabilities: Centralized cloud data centers have much more processing power and storage capacity than edge devices. Large-scale data processing or complex computations can be challenging for edge devices because of their limited physical size and power budget. This constraint can limit the types of applications that can run at the edge, and algorithms and data processing may need to be optimized to fit within the available resources [82]. In cases where more extensive processing is required, data may still be offloaded to the cloud, which negates the latency benefits of edge computing.
- Security and Management of Distributed Infrastructure: The distributed nature of edge computing brings security and management challenges. It is hard to secure each node and enforce consistent security protocols when data processing occurs across multiple devices located in different places. Each edge device represents a potential attack surface, so protecting the network from unauthorized access, malware, and data breaches requires robust security measures [83]. Additionally, managing a multitude of distributed devices is difficult, particularly when applying software updates, managing security patches, and troubleshooting issues remotely.
- Integration with Cloud and Existing IT Systems: Integrating edge computing solutions with existing cloud and legacy IT systems can be challenging. Data flows between edge devices and centralized systems need to be coordinated, which can create complex data synchronization and interoperability issues for organizations. Working with legacy systems requires careful planning and often custom solutions to ensure compatibility with both edge and cloud architectures [84]. A smooth transition between local edge data processing and more extensive cloud analytics is also desirable, and it often requires considerable work on network architecture and protocols to ensure data consistency and system performance.
3.8.5. Edge Computing Use Cases
- Real-time Applications (e.g., autonomous vehicles and industrial automation): Edge computing is particularly important for applications where prompt reactions and decision-making are essential, such as cars with an autopilot mode or industrial control systems [89]. Self-driving cars rely on real-time data from the car’s sensors and cameras to make crucial decisions on the road, which would be impractical with a centralized cloud. Likewise, in industry, edge computing allows automated machinery to process sensor data locally, make quick decisions, and avoid downtime [89]. These time-critical use cases benefit from edge computing because it performs computations on data collected at the edge, thereby improving safety, performance, and agility.
- Smart Cities and IoT Applications: In smart cities, edge computing is applied to manage the large number of IoT devices that collect significant data about traffic, the environment, energy consumption, and safety [90]. For instance, traffic cameras and sensors mounted in different parts of a city can process data at the edge to control traffic, minimize congestion, and improve safety without burdening the core system. In smart cities, edge computing helps to ease the network load, process data faster, and protect privacy by keeping data within city limits [90]. It also helps ensure that energy is used optimally and that infrastructure is properly maintained, thus enhancing the livability of cities.
- Health Care and Remote Monitoring: In healthcare, edge computing is applied in real-time patient monitoring, especially at home or in places that are far from the hospital. Wearable devices, connected health monitors, and mobile medical devices can work at the edge by analyzing data and reporting the patient’s condition, including vitals, in real time, with a notification in the event of an adverse event. This form of processing minimizes latency and thus speeds up the response, which is very important for patients. Furthermore, edge computing enhances the privacy of patient information by processing health data near the patient, which helps providers adhere to healthcare data regulations [90].
3.8.6. Edge Security and Privacy
- Data Protection at the Edge: Edge computing security refers to protecting data that are processed on devices or at edge nodes instead of being sent to the cloud. Processing data at the edge of the network decreases the likelihood of the data being intercepted during transfer [91]. However, because the data are handled locally, they must be protected with strong encryption and access control mechanisms. Encryption protects information both at rest and in transit; other methods include anonymization and tokenization [92].
- Securing Edge Devices and Networks: Edge devices such as sensors, gateways, and local servers may be targeted by attacks. Securing them is important to prevent threats from spreading across the entire edge network that these devices form a part of. Measures include MFA and device-specific certificates to ensure that only authorized users and devices are granted access [92]. In addition, edge networks need regular software updates, security patches, and firmware updates to fix vulnerabilities that may expose the network to attacks. Other measures include network segmentation and intrusion detection systems (IDS), which help contain an affected device and identify threats within the network, respectively. In combination, these approaches provide a layered security system that can mitigate a vast number of threats to edge devices and networks [93].
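As an illustration of the tokenization mentioned above, the following minimal sketch replaces a sensitive identifier with a random token before a record leaves the edge device. The `TokenVault` class, the `tok_` prefix, and the sample record are hypothetical names introduced only for this example; a real deployment would keep the mapping in a hardened, access-controlled store.

```python
import secrets


class TokenVault:
    """Toy in-memory token vault (hypothetical; not a production design)."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random, carries no information
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only code with access to the vault can recover the original value.
        return self._token_to_value[token]


vault = TokenVault()
reading = {"patient_id": "P-1042", "heart_rate": 71}  # made-up record
outbound = {**reading, "patient_id": vault.tokenize(reading["patient_id"])}
print(outbound)
```

Because the token is random, an interceptor who captures `outbound` learns nothing about the original identifier unless the vault itself is compromised.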
3.9. AI
3.9.1. Overview of AI
3.9.2. Types of AI
- Narrow AI: Narrow AI, also called weak AI, is designed to perform particular tasks or solve problems in one specific area [95]. Examples include recommendation systems, image recognition, and language translation [94]. Narrow AI cannot act outside its predefined functions and does not have general intelligence. Most current AI applications, including those for cloud security, are narrow AI. For example, a narrow AI system could analyze network traffic patterns to discover anomalies or detect potential threats, but it could not apply this knowledge to an unrelated domain.
- General AI: General AI, or strong AI, is a hypothetical form of AI that would possess human-like intelligence and be capable of learning, understanding, and applying knowledge across various areas without intervention or supervision [95]. Such a system could solve complex general problems without human intervention, adapt to new situations, and reason as humans do. Although general AI remains a long-term goal of AI research, such capabilities could dramatically influence security by producing autonomous systems that perceive and respond to threats as well as human analysts can. It should be noted, however, that general AI does not yet exist and is not used in modern cloud security solutions.
- Machine Learning, Deep Learning, and Reinforcement Learning: In the context of cloud security, various learning techniques are applied to design AI systems that can identify threats and attacks and enhance the protection mechanisms of data.
- ML: ML is a subfield of AI that allows systems to learn from data and make predictions or decisions based on it. In cloud security, ML algorithms are usually applied to detect anomalies, identify intrusions, and classify malware [96]. The main categories of ML in security are supervised learning, where models are trained on labeled datasets, and unsupervised learning, where models work on unlabeled data to discover new threats [96].
- Deep Learning (DL): DL is a subset of ML that applies multi-layered artificial neural networks to data. DL models are mainly used in image and audio recognition. However, they are also used in cloud security to detect subtle patterns in network traffic, user activity, and system event logs [97]. These models are excellent at pattern matching and can help detect intricate and convoluted threats in cloud computing environments. However, deep learning models are resource-intensive and depend on cloud computing for flexible access to computation power [97].
- Reinforcement Learning (RL): RL is a subfield of machine learning in which an agent attempts to determine the best policy by taking actions in an environment and receiving outcomes, which can be positive or negative [98]. In cloud security, reinforcement learning can be applied to design self-tuning security systems that counter constantly changing threats [98]. For instance, an RL-based system may fine-tune firewall settings or access control rules according to emerging threats so that the security policy minimizes risk. This is quite important in cloud security, where threats are not static and can change at any moment.
All these types of AI and their approaches greatly enhance cloud security by deploying systems that can identify, analyze, and counter threats in real-time, thus fostering a secure and enduring cloud environment.
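The RL idea of tuning a security policy from feedback can be sketched with an epsilon-greedy multi-armed bandit, one of the simplest RL settings (it has no state, only actions and rewards). The three policy names, their reward values, and the noise range are assumptions made up for this example; in practice the reward would come from observed security outcomes.

```python
import random

# Hypothetical mean reward of each candidate firewall policy (higher = less risk).
TRUE_REWARD = {"strict": 1.0, "moderate": 0.4, "permissive": -1.0}


def simulate_reward(action, rng):
    # Stand-in for environment feedback: mean reward plus bounded noise.
    return TRUE_REWARD[action] + rng.uniform(-0.1, 0.1)


def train_policy(steps=300, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    actions = list(TRUE_REWARD)
    value = {a: 0.0 for a in actions}  # running estimate of each action's reward
    count = {a: 0 for a in actions}
    for step in range(steps):
        if step < len(actions):
            action = actions[step]                # try every action once first
        elif rng.random() < epsilon:
            action = rng.choice(actions)          # explore
        else:
            action = max(actions, key=value.get)  # exploit the best estimate
        reward = simulate_reward(action, rng)
        count[action] += 1
        # Incremental mean of the rewards observed for this action.
        value[action] += (reward - value[action]) / count[action]
    return value


estimates = train_policy()
print(max(estimates, key=estimates.get))
```

The agent keeps a running estimate of how well each policy performs, mostly applies the best-looking one, and occasionally explores alternatives so that it can adapt if conditions change.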
3.9.3. AI in Cybersecurity
- Role of AI in Threat Detection: AI is increasingly used in threat detection to analyze large volumes of data and identify patterns and potential threats in real-time. Traditional threat detection methods are rule- or signature-based, which restricts their ability to detect newer and evolving threats [99]. Conversely, AI-based systems can continuously learn from data, picking up new threat patterns and identifying novel attack techniques. For example, machine learning algorithms compare current network traffic with historical data to see whether it deviates from what they have seen previously, which might indicate malicious activity [100]. Because they can detect small but important variations, such as changes in users’ behavior or in network traffic, deep learning models are especially good at spotting security incidents. Identifying and predicting threats early on makes it possible for cybersecurity teams to respond proactively and, therefore, minimize the risk of data leakage and other cyber incidents.
- Anomaly Detection: AI is also crucial for anomaly detection, that is, identifying unusual patterns in cloud environments. At cloud scale, user activities, network traffic, and system logs generate massive amounts of data, and it is almost always impossible for a human to identify anomalies manually. AI-driven anomaly detection models can learn what is considered normal in the environment and automatically raise an alert when behavior occurs that is potentially indicative of a security incident. Anomaly detection is sometimes carried out using unsupervised learning, which can be used when labeled data are not available [99,101]. For example, an anomaly detection model in a cloud environment may pick up an unusual login pattern or unexpected data access and alert security teams to a potential threat. By catching these anomalies early, AI-powered systems help prevent data leaks, account takeovers, and similar incidents in cloud systems.
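The notion of learning a baseline of normal behavior and flagging deviations can be illustrated with a very simple statistical detector. This is a minimal sketch using a z-score threshold rather than a trained ML model; the login counts and the threshold of 3 standard deviations are illustrative assumptions.

```python
from statistics import mean, stdev


def fit_baseline(history):
    # Learn what "normal" looks like from past observations.
    return mean(history), stdev(history)


def is_anomalous(value, baseline, threshold=3.0):
    # Flag values that lie more than `threshold` standard deviations from the mean.
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


hourly_logins = [12, 15, 11, 14, 13, 12, 16, 14, 13, 15]  # made-up history
baseline = fit_baseline(hourly_logins)
print(is_anomalous(14, baseline), is_anomalous(90, baseline))
```

No labels are needed here: the detector only requires historical observations, which is why unsupervised approaches suit settings where labeled attack data are unavailable.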
3.9.4. AI Applications in Cloud Security
- IDS and Intrusion Prevention Systems (IPS): IDS and IPS are critical for identifying and mitigating attacks on cloud networks. AI improves IDS and IPS by enabling them to learn from past attack instances and then detect threats in real-time [102,103]. An AI-driven IDS can recognize known attack signatures and adapt to new and unknown threats by detecting anomalies. For example, machine learning algorithms in an IDS can examine network traffic and discover deviations that may represent intrusions. With AI, intrusion prevention systems can go a step further, not just detecting but actually blocking suspicious activities. An AI-powered IPS can modify firewall rules, block malicious IP addresses, and restrict access based on learned threat patterns, enabling a proactive approach to cloud security [103].
- Behavioral Analysis: Behavioral analysis is often performed with AI models to identify and react to suspicious user or device activity in a cloud environment. Given a baseline of normal behavior, AI models can identify deviations indicating security risks, like account takeovers, insider threats, or compromised devices [102]. In cloud security, behavioral analysis matters most when users work on different devices and access the cloud system from different places. AI-powered behavioral analysis systems scrutinize login patterns, data access habits, and user interactions, flagging unusual activity such as a user logging in from an unusual location or accessing sensitive data outside of normal hours. This application of AI improves the ability to detect potential threats and to react to unauthorized access attempts in real-time with greater accuracy.
- Data Encryption and Privacy with AI: With the growing use of the cloud in business environments, AI-based techniques are increasingly used to support data encryption and privacy. While data encryption is crucial for securing sensitive data, conventional encryption approaches may struggle to keep up with the vast quantities of data in the cloud [104]. Through automation, AI can speed up encryption and help identify the most efficient way to carry it out, factoring in the sensitivity of data and data usage patterns. Moreover, AI models can help secure data storage, locate flaws in encryption protocols, and advise on the best configurations to avoid leaking data. For example, in privacy-preserving applications, AI techniques can be leveraged to train a collaborative model across distributed cloud systems without revealing sensitive data, so that privacy is maintained while data are being processed. AI helps build more secure and trustworthy cloud environments by improving both encryption and privacy [104].
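The behavioral analysis described above, in which a baseline of normal behavior is built and deviations are flagged, can be sketched as a per-user profile of login locations and hours. The `BehaviorProfile` class and the sample users and cities are hypothetical; a production system would use a learned model and many more signals.

```python
from collections import defaultdict


class BehaviorProfile:
    """Per-user baseline of login locations and hours (illustrative only)."""

    def __init__(self):
        self.locations = defaultdict(set)
        self.hours = defaultdict(set)

    def observe(self, user, location, hour):
        # Record behavior seen during a trusted observation period.
        self.locations[user].add(location)
        self.hours[user].add(hour)

    def flags(self, user, location, hour):
        # Return the reasons, if any, why this login deviates from the baseline.
        reasons = []
        if location not in self.locations[user]:
            reasons.append("unusual location")
        if hour not in self.hours[user]:
            reasons.append("unusual hour")
        return reasons


profile = BehaviorProfile()
for hour in (9, 10, 11, 14):
    profile.observe("alice", "Riyadh", hour)
print(profile.flags("alice", "Lagos", 3))
```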
3.9.5. Challenges of Using AI in Cloud Security
- Data Privacy and Security: Data privacy and security are critical issues when AI is used in cloud security because AI models are built from data that may be sensitive. Training and analysis with AI algorithms usually require large amounts of data, which commonly contain personal or sensitive information. Using such data poses privacy risks because the data can be accessed or misused by unauthorized parties [105]. In addition, certain AI applications involve the transfer of data from one cloud server to another or across international borders, thus introducing an increased risk of data leakage or compliance violation. Protecting privacy and security in AI requires robust data protection practices, including data anonymization, access controls, and privacy-preserving methods such as FL, which trains a model without sharing the raw data [105,106].
- Explainability and Interpretability: A major challenge in employing AI for cloud security is explaining and interpreting the decisions that AI models make. AI models, especially complex ones such as deep learning models, tend to be regarded as ‘black boxes’ whose decision-making processes are inscrutable to humans. This lack of transparency can make it hard for security teams to trust AI-driven insights or to understand exactly why a model triggered a specific alert or detection. The opacity can become a problem in security contexts where trust and clarity are critical [106]. Explainable AI (XAI) methods aim to remedy this by making AI decisions more transparent and explainable. However, striking the right balance between model complexity and interpretability remains difficult, especially for complicated and nuanced security threats.
- Scalability Issues: Deploying AI models at scale in a cloud environment is very challenging. Training and running AI models, especially resource-intensive ones such as deep learning models, demands large amounts of computing power, memory, and storage. Large-scale cloud environments have difficulty managing these resources efficiently while maintaining performance [107]. Furthermore, as the number of users and devices in a cloud environment grows, so does the volume of data, and AI models must scale to handle this increase, which may lead to latency issues and high costs. Scaling usually requires techniques like model optimization, distributed processing, and load balancing, which add to the complexity of deployment. A continuing challenge in cloud security is to make sure that AI solutions work effectively and remain responsive to the enormous, rapidly changing cloud landscape [107].
3.9.6. AI and FL Synergies in Cloud Security
- Decentralized Training: FL enables decentralized training, in which the AI model is trained locally on edge devices or in different clouds without centralizing the data. In cloud security, this decentralized method is very useful since learning does not need to be performed on a central server, and each device or organization can independently train security models using its own data. With FL, data are kept local, which lowers the risk of data exposure, requires less bandwidth, and reduces latency [108]. This is particularly helpful for threat detection and anomaly detection, where local patterns often uncover security insights associated with a specific region and environment. FL also improves model robustness by drawing on multiple sources of security insight and aggregating them into a more robust and adaptable AI model [12].
- Data Privacy through FL: One of the core advantages of FL is data privacy, especially in the context of cloud security applications. FL addresses the privacy concerns of centralizing data for AI training since sensitive data can remain on local devices or in individual environments [109]. In FL, model updates (not raw data) are sent to a central server for aggregation in order to build a better global model. Using this approach, organizations can take advantage of the insights from multiple datasets without revealing private or sensitive data. In some industries (e.g., healthcare and finance), strict data privacy laws (e.g., GDPR and HIPAA) restrict the transfer of sensitive data out of the local jurisdiction, which can rule out data sharing altogether. FL enables organizations to continue meeting compliance requirements while leveraging the power of AI without compromising data privacy or cloud security [109].
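The aggregation step described above, in which the server combines model updates rather than raw data, can be sketched as follows. A common rule is federated averaging (FedAvg), which weights each client's parameters by its number of training samples; the text does not specify which aggregation rule it assumes, so this choice and the sample values are assumptions.

```python
def federated_average(updates):
    """Weighted average of client parameter vectors.

    `updates` is a list of (weights, num_samples) pairs, one per client.
    """
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    global_weights = [0.0] * dim
    for weights, n in updates:
        for i, w in enumerate(weights):
            # Clients with more local data contribute proportionally more.
            global_weights[i] += w * n / total
    return global_weights


# Two hypothetical clients: only parameters and sample counts reach the server.
client_updates = [([0.2, 1.0], 100), ([0.4, 0.0], 300)]
print(federated_average(client_updates))
```

The server never sees the clients' records, only the parameter vectors and the sample counts used for weighting.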
3.9.7. Examples of FL in AI-Driven Cloud Security
- IDS: To improve IDS in cloud environments, FL can be used to train models on the local network data of multiple organizations or data centers. Each organization trains the model on its own network traffic patterns, and the resulting updates are aggregated to form a robust global model that can detect a broader set of intrusion patterns without data sharing.
- Malware Detection in Distributed Systems: FL helps organizations collaboratively build a model that can identify new malware variants by training on local data. With this approach, cloud security providers’ detection capabilities are enhanced across multiple clients while sensitive client information, like file characteristics and user activity, remains private.
- Financial Fraud Detection: Financial institutions can use FL to collaboratively train a model that detects fraudulent activities in real-time. Each institution trains on its own transaction data locally, and the updates are combined into a single shared fraud detection model that captures the different fraud patterns across institutions without exposing individual transaction records.
4. FL in Cloud and Edge Computing Security
- Data privacy through local training: FL allows every client (device or organization) to train the model on its own data without sharing the data with the central server. Only the model changes or parameters are sent to the central server, which helps to minimize the risk of leaking raw data; this is especially relevant when working with large datasets that contain sensitive personal information.
- Privacy preservation techniques: Several mechanisms are used in FL to protect the parameters exchanged between the clients and the cloud server. Techniques like differential privacy, secure multi-party computation (SMC), and homomorphic encryption are applied so that no single data record can be attributed to an individual and computations cannot be reversed. For example, secure multi-party computation lets several parties jointly compute an aggregate without revealing their individual inputs, and homomorphic encryption enables computation directly on encrypted data, which avoids exposing the data when transmitted.
- Robustness to model inference attacks: Centralized training is prone to attacks such as model inversion and membership inference, in which the attacker is able to learn about the training dataset. FL reduces these risks since the training is conducted across multiple clients. In addition, techniques such as differential privacy, which involves adding noise to the updates, make it challenging for the attacker to make inferences about individual data.
- Efficiency in secure communication: In many FL applications in cloud settings, reducing the amount of data exchanged between the clients and the server is essential for both efficiency and security. Through selective parameter sharing and dynamic client participation, FL can decrease the number and size of updates exchanged. This approach reduces the number of points at which data may be captured during transmission.
- Support for honest-but-curious and collusion scenarios: The ‘Honest-But-Curious’ model supposes that servers obey the protocol but still try to learn something from the updates. FL addresses this through cryptographic measures that ensure that servers and clients cannot see the raw data or unique inputs even when some of them may be working in unison. For example, double-key ElGamal encryption offers very strong protection, given that only partial model parameters are available for aggregation, thus protecting data from insider threats.
- Dynamic client participation and model integrity: In FL, clients may come and go without affecting the quality of the model at any one time. This allows model training to continue smoothly without putting the data at risk. Model parameters can be collected and shared securely across different devices, including those with limited computational capabilities.
- Data privacy and confidentiality: With FL, data can be processed locally on edge devices without the need to transfer raw data to the cloud. This decentralized architecture keeps data on the devices, so they are not at risk of being leaked during transfer. Only model parameters or gradients are shared with a central server, while the user’s data and any other private information remain on the user’s device; this is particularly important in industries with strict privacy regulations, such as healthcare or finance.
- Reduced attack surface for edge devices: Due to the high number of connected devices in edge computing environments and the limited computational and security capabilities of these devices, edge environments are often exposed to external threats. FL reduces the attack surface because data are stored only locally on edge devices, which lowers the possibility of interception by a third party. FL also avoids concentrating sensitive information on a few servers; hence, if one edge device is attacked, the effect will be limited since a large amount of data will not be compromised.
- Enhanced communication efficiency: Rather than transferring massive volumes of raw data from edge devices to central servers, FL transfers model updates, which are typically much smaller. This reduction in bandwidth use improves network efficiency and security, as fewer data packets are exchanged over the network, thus decreasing the probability of interception or leakage of data during transmission.
- Scalability and flexibility in edge networks: FL also allows the dynamic inclusion of multiple edge devices, which can freely join or leave the network without disturbing the global model. This is especially helpful in edge computing, where the network is not always reliable and device availability can be unpredictable. FL can operate under these conditions and still train the model well. It also improves data security since model updates are aggregated and validated from several sources, which strengthens the defense against data poisoning attacks.
- Resilience against privacy legislation compliance challenges: FL is also well suited to data protection standards such as GDPR because raw data never leave the device that generated them. This is particularly advantageous in a decentralized system since the data owner retains full control of the information and thus can meet the legal requirements for data management. This compliance advantage makes FL especially appropriate for edge computing networks that function across multiple jurisdictions with different legal requirements on data protection.
Thus, the decentralized approach of FL provides improved security for edge computing systems by avoiding the leakage of data, allowing processing on the edge, and complying with legal data privacy regulations, which makes it a suitable solution for secure edge computing systems.
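One way to keep an honest-but-curious server from inspecting individual updates is secure aggregation with pairwise masks, a technique in the spirit of the secure multi-party computation mentioned above. In this minimal sketch, each pair of clients shares a random mask that one adds and the other subtracts, so the masks cancel in the sum. The modulus, the integer (quantized) parameters, and the seeded generator standing in for pairwise shared secrets are all simplifying assumptions; real protocols also handle client dropouts.

```python
import random

MODULUS = 2**32  # all arithmetic is done modulo this value


def mask_updates(updates, seed=0):
    """Add pairwise masks that cancel when all masked updates are summed.

    `updates` maps client id -> list of non-negative integers (quantized
    parameters). In a real protocol each pair of clients would derive its
    mask from a shared secret; a seeded generator stands in for that here.
    """
    rng = random.Random(seed)
    clients = sorted(updates)
    dim = len(updates[clients[0]])
    masked = {c: list(updates[c]) for c in clients}
    for i, a in enumerate(clients):
        for b in clients[i + 1:]:
            for k in range(dim):
                m = rng.randrange(MODULUS)
                masked[a][k] = (masked[a][k] + m) % MODULUS  # a adds the mask
                masked[b][k] = (masked[b][k] - m) % MODULUS  # b subtracts it
    return masked


def aggregate(masked):
    # The server sums masked vectors; the pairwise masks cancel out.
    dim = len(next(iter(masked.values())))
    return [sum(v[k] for v in masked.values()) % MODULUS for k in range(dim)]


raw = {"a": [1, 2], "b": [10, 20], "c": [100, 200]}
print(aggregate(mask_updates(raw)))
```

The server recovers the correct sum of the updates, while each individual masked vector looks random to it.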
5. Challenges in Using Federated Learning and AI for the Cloud
- Communication overhead:
- Issue: FL requires frequent transmission of model parameters between client devices and central servers, hence consuming a large communication bandwidth.
- Example: A 2023 study of FL for IoT-based smart cities found that the frequent exchange of model updates between edge devices and cloud servers greatly increased bandwidth demand and reduced the performance of real-time applications such as traffic prediction [120].
- Impact: High communication costs can delay model training and degrade the system’s performance, particularly in bandwidth-limited scenarios.
- Potential solutions: Updates can be compressed efficiently, for example through quantization or sparsification, and synchronized periodically rather than constantly.
- Resource constraints:
- Issue: FL relies mostly on edge devices, which often have too little processing capacity, memory, and battery power to run complex AI models.
- Example: A federated soil fertility analysis running on Raspberry Pi edge devices experienced delays caused by limited memory (2 GB) and computational resources [121].
- Impact: Such limitations can slow down the model training process, increase the execution time, and decrease the performance of the federated system.
- Potential solutions: Low-complexity models, dynamic resource provisioning, and offloading of computation-intensive tasks to cloud hosts can be employed.
- Data heterogeneity:
- Issue: Data in federated systems are partitioned among multiple devices and are often non-IID (not independent and identically distributed).
- Example: FL-based COVID-19 diagnostics across hospitals pose challenges since the datasets have different characteristics (e.g., patients’ demographics and image resolution) [7].
- Impact: This non-uniformity can produce models that are biased or suboptimal for some or all of the clients.
- Potential solutions: Federated optimization algorithms such as FedProx, along with FL techniques designed for heterogeneous data, can be employed.
- Privacy and security risks:
- Issue: Although FL aims to safeguard data privacy, gradient or model updates can still disclose sensitive information through attacks such as gradient inversion.
- Example: Hijazi et al. [122] reported that FL-based financial fraud detection systems were threatened by adversarial attacks in which compromised participants uploaded poisoned model updates.
- Impact: Such vulnerabilities can undermine the system’s confidentiality and trustworthiness.
- Potential solutions: Differential privacy, homomorphic encryption, and secure multi-party computation can be used.
- Scalability challenges:
- Issue: Since FL incorporates a large number of devices, managing updates while ensuring scalability is a challenging task.
- Example: A federated learning system for autonomous vehicles encountered scalability issues once the number of clients grew large (more than 1000 participants in this case), and handling real-time updates became a challenge for the central server [123].
- Impact: An increasing number of participants can congest the servers, causing delays or bottlenecks.
- Potential solutions: Hierarchical aggregation architectures and decentralized FL can be used.
- Lack of standardization:
- Issue: There are no globally recognized guidelines for integrating FL into cloud security systems.
- Impact: This results in incompatible technologies and frameworks and often makes integration between them difficult.
- Potential solutions: FL implementation protocols and APIs can be standardized.
- Adversarial attacks:
- Issue: FL systems are vulnerable to different types of attacks, including poisoning attacks, in which attackers submit malicious updates to the FL model.
- Example: In a federated intrusion detection system, an attacker poisoned the model updates so that malware traffic went undetected, compromising the integrity of the system [120].
- Impact: Such attacks can greatly diminish the model’s performance and decrease its reliability.
- Potential solutions: Robust aggregation approaches, anomaly detection on submitted updates, and model consistency checks can be applied.
- Regulatory and compliance barriers:
- Issue: Federated systems must operate within the rules imposed by GDPR, HIPAA, or similar data protection laws.
- Example: A GDPR compliance problem emerged in an FL system developed for financial fraud detection because some of the participants’ data were transferred across borders [122].
- Impact: These legal requirements can pose great challenges to data-sharing practices and system design.
- Potential solutions: Privacy-by-design principles can be adopted, and compliance checks can be built into the system.
- Latency sensitivity:
- Issue: In cloud security, actions may be required in real-time or near-real-time. This is problematic with FL because of the inherent time required for model training.
- Example: Federated cybersecurity systems for real-time threat detection were unable to respond quickly enough to cloud updates because of network latency [124].
- Impact: Late identification of threats endangers the security of the system.
- Potential solutions: Real-time federated systems and model caching can be employed to obtain fast predictions.
- Cost management:
- Issue: Deploying and managing FL and AI systems in a cloud security environment is capital-intensive and requires considerable funding.
- Impact: The high cost of these technologies can act as a barrier to their adoption by organizations.
- Potential solutions: Deployment costs can be reduced through cost-sharing models and open-source tools.
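Two of the potential solutions listed above can be sketched briefly: sparsification to reduce communication overhead, and robust aggregation to limit the effect of poisoned updates. The sketch assumes top-k sparsification (sending only the largest-magnitude entries) and a coordinate-wise median as the robust aggregator; these are common choices, not necessarily the ones used in the cited studies, and the sample vectors are made up.

```python
from statistics import median


def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries; return them as {index: value}."""
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in sorted(ranked[:k])}


def coordinate_median(updates):
    """Aggregate client updates with a per-coordinate median.

    Unlike a plain mean, the median is not dragged far off by a minority
    of extreme (possibly poisoned) updates.
    """
    return [median(column) for column in zip(*updates)]


# A client sends 2 of 4 entries instead of the full vector.
print(top_k_sparsify([0.01, -0.9, 0.05, 0.4], k=2))

# Two honest clients and one hypothetical poisoned update.
honest_and_poisoned = [[1.0, 2.0], [1.2, 1.8], [100.0, -50.0]]
print(coordinate_median(honest_and_poisoned))
```

In the second example a plain average would be pulled toward the outlier, whereas the median stays within the range of the honest updates.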
6. Related Work
7. Discussion
7.1. Gaps in Current Research
- Limited scalability: Existing implementations still struggle to scale to cross-device federations of millions of IoT devices because of high communication overhead and inconsistent model updates [128].
- Privacy-preserving mechanisms: Existing differential privacy and homomorphic encryption methods are relatively effective, but their computational cost can make them difficult to deploy in resource-constrained environments [136].
- Dynamic changes: Real-world cloud architectures are not static; data distributions, nodes, and network conditions keep changing. These changes affect the effectiveness and reliability of FL systems, making it difficult to achieve consistent model accuracy and real-time responses [133].
- Resource constraints: Many FL deployments are limited by the restricted computational resources and storage of edge and IoT devices [134].
- Security vulnerabilities: FL is still vulnerable to adversarial risks such as poisoning attacks and gradient inversion attacks, which threaten the confidentiality and integrity of collaborative learning systems [108].
- Real-world implementation challenges: Reported challenges include non-IID data distribution across nodes, its effect on model efficiency, and delays in real-time applications. In addition, new issues may emerge when FL is implemented in diverse and dynamic IoT settings [22].
7.2. Actionable Insights
- FL can help to reduce latency in security-sensitive zones if it is adequately implemented on fog computing platforms.
- AI-based proactive threat analysis built on predictive models shows the potential to lower breach rates compared with existing approaches.
- FL systems enhanced by blockchain show potential use cases in the healthcare and finance sectors, including maintaining record integrity and adhering to different guidelines.
7.3. Innovative Proposals
- Neuromorphic computing for FL: Applying neuromorphic computing to FL may enable real-time anomaly detection with low latency and low energy consumption.
- Quantum-enhanced privacy: Quantum encryption could be applied to strengthen privacy in federated learning for industries that require strong data protection.
- FL in smart manufacturing: Using FL to detect anomalies in automated guided vehicles (AGVs) in IIoT settings could improve operational safety and reliability.
8. Future Directions
8.1. Enhancing Privacy and Efficiency
- Developing lightweight encryption techniques to reduce computational overhead, inspired by approaches like the Paillier homomorphic encryption algorithm and double-key ElGamal encryption. These methods can address the trade-offs between privacy and computational efficiency in edge computing environments.
- Exploring the integration of HFL models to improve scalability and efficiency in heterogeneous network environments, as seen in 6G and IoT applications.
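To make the appeal of Paillier-style homomorphic encryption concrete, the following toy sketch shows its additive property: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts, so a server could add encrypted (quantized) updates without decrypting them. The tiny primes and function names are assumptions chosen for readability; this is in no way secure, and real deployments use vetted libraries and keys of 2048 bits or more.

```python
import math
import secrets


def keygen(p=293, q=433):
    # Tiny demo primes; far too small for any real use.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse (requires Python 3.8+)
    return n, (lam, mu)


def encrypt(n, m):
    # Pick a random r that is invertible modulo n.
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    # With generator g = n + 1, g^m mod n^2 equals 1 + m*n.
    return ((1 + m * n) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts modulo n.
    return (c1 * c2) % (n * n)


n, private_key = keygen()
total = add_encrypted(n, encrypt(n, 15), encrypt(n, 27))
print(decrypt(n, private_key, total))
```

The cost of the modular exponentiations, which grows quickly with key size, is the overhead that lightweight schemes for edge devices aim to reduce.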
8.2. Advancing Model Adaptability
- Implementing adaptive algorithms for handling non-IID data distributions and dynamic network conditions, as emphasized in studies like those focusing on UAV networks and collaborative cloud-edge systems.
- Investigating federated reinforcement learning techniques to enhance model training in dynamic and resource-constrained environments, leveraging AI-driven optimization methods.
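One existing algorithm for non-IID data, named earlier in this review, is FedProx, which adds a proximal term (mu/2)·||w − w_global||² to each client's local objective so that local models do not drift too far from the global model. The following minimal sketch shows the resulting local gradient step; the quadratic local loss, learning rate, and step count are assumptions chosen to make the effect easy to see.

```python
def fedprox_local_update(w_global, grad_fn, mu=0.1, lr=0.1, steps=200):
    """Run local gradient descent on loss(w) + (mu / 2) * ||w - w_global||^2."""
    w = list(w_global)
    for _ in range(steps):
        grad = grad_fn(w)
        # The proximal term contributes mu * (w - w_global) to the gradient.
        w = [
            wi - lr * (gi + mu * (wi - wg))
            for wi, gi, wg in zip(w, grad, w_global)
        ]
    return w


def local_grad(w):
    # Hypothetical client whose local loss 0.5 * (w - 4)^2 is minimized at 4.
    return [w[0] - 4.0]


print(fedprox_local_update([0.0], local_grad, mu=0.0))  # plain local training
print(fedprox_local_update([0.0], local_grad, mu=1.0))  # pulled toward global
```

With `mu=0.0` the client converges to its own optimum near 4, while with `mu=1.0` it settles near 2, halfway between its optimum and the global model at 0. Larger values of `mu` keep heterogeneous clients closer together at the cost of less local adaptation.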
8.3. Innovative Applications
- Leveraging GANs for data augmentation and anomaly detection. For instance, GAN-based models could address imbalanced datasets in healthcare and cybersecurity domains, improving detection rates and overall model performance.
- Designing FL and AI frameworks tailored for specific applications, such as spam detection, smart transportation, and healthcare, where decentralized data processing and privacy preservation are critical.
8.4. Improving Interoperability and Scalability
- Creating simulation platforms like ChainFL to enable researchers to test FL models in diverse and dynamic environments, focusing on interoperability across varied hardware and software ecosystems.
- Developing federated architectures that combine edge, cloud, and device-level computations to optimize resource usage while maintaining model accuracy, as illustrated by methods like FedAgg.
8.5. Addressing Security Challenges
- Investigating methods to mitigate adversarial threats such as model poisoning, data leakage, and free-rider problems in FL systems. This could involve integrating blockchain-based solutions for secure model updates and participation verification.
- Exploring robust differential privacy mechanisms to enhance data confidentiality without compromising model utility.
- Incorporating Zero Trust principles into FL frameworks to enhance resilience against insider threats and ensure robust security in hybrid environments. Zero Trust methodologies could redefine access control mechanisms and enable secure collaboration across distributed nodes.
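One common building block behind differentially private FL is to bound each client's influence by clipping its update and then adding random noise before the update leaves the device. The sketch below shows that step only, as an illustration rather than a mechanism from the reviewed studies. The names `clip_update` and `privatize_update` and the parameters `clip_norm` and `noise_multiplier` are assumptions, and the sketch does not compute or guarantee any particular privacy budget.

```python
import math
import random
from typing import List, Sequence


def clip_update(update: Sequence[float], clip_norm: float) -> List[float]:
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def privatize_update(
    update: Sequence[float],
    clip_norm: float,
    noise_multiplier: float,
    rng: random.Random,
) -> List[float]:
    """Clip the update, then add Gaussian noise scaled to the clipping bound."""
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm  # noise standard deviation (assumed calibration)
    return [x + rng.gauss(0.0, sigma) for x in clipped]
```

A larger `noise_multiplier` hides individual contributions more strongly but degrades the aggregated model, which is the confidentiality and utility tension noted above. Choosing the value for a target privacy budget requires a separate accounting step that is outside this sketch.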
9. Conclusions
Abbreviations
APIs | Application Programming Interfaces |
AI | Artificial Intelligence |
BSBODP | Bridge Sample Based Online Distillation Protocol |
CFL | Clustered Federated Learning |
CCPA | California Consumer Privacy Act |
DL | Deep Learning |
EECC | End-Edge-Cloud Collaboration |
FL | Federated Learning |
FRL | Federated Reinforcement Learning |
FedAgg | Agglomerative Federated Learning |
GDPR | General Data Protection Regulation |
GANs | Generative Adversarial Networks |
HierFAVG | Hierarchical Federated Averaging |
IIoT | Industrial Internet of Things |
IoT | Internet of Things |
IDS | Intrusion Detection System |
IPS | Intrusion Prevention System |
IoV | Internet of Vehicles |
MFA | Multi-Factor Authentication |
NLP | Natural Language Processing |
QoS | Quality of Service |
RBAC | Role-Based Access Control |
RL | Reinforcement Learning |
SLR | Systematic Literature Review |
UAV | Unmanned Aerial Vehicle |
XAI | Explainable AI |
- Yazdinejad, A.; Dehghantanha, A.; Karimipour, H.; Srivastava, G.; Parizi, R.M. A robust privacy-preserving federated learning model against model poisoning attacks. IEEE Trans. Inf. Forensics Secur. 2024, 19, 6693–6708. [Google Scholar] [CrossRef]
- Namakshenas, D.; Yazdinejad, A.; Dehghantanha, A.; Srivastava, G. Federated quantum-based privacy-preserving threat detection model for consumer Internet of Things. IEEE Trans. Consum. Electron. 2024, 70, 5829–5838. [Google Scholar] [CrossRef]
- Yazdinejad, A.; Dehghantanha, A.; Parizi, R.M.; Hammoudeh, M.; Karimipour, H.; Srivastava, G. Block Hunter: Federated learning for cyber threat hunting in blockchain-based IIoT networks. IEEE Trans. Ind. Inform. 2022, 18, 8356–8366. [Google Scholar] [CrossRef]
- Zhang, J.; Liu, Y.; Wu, D.; Lou, S.; Chen, B.; Yu, S. VPFL: A verifiable privacy-preserving federated learning scheme for edge computing systems. Digit. Commun. Netw. 2023, 9, 981–989. [Google Scholar] [CrossRef]
- Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71. [Google Scholar] [CrossRef]

Cloud Security Domain | Description | Potential Benefit |
---|---|---|
Threat Detection | Identifying malicious activities and anomalies in real-time | Improved detection without centralizing sensitive data |
Privacy Protection | Ensuring data privacy during model training and inference | High privacy standards with federated data processing |
Access Control | Managing access permissions using AI-driven policies | Adaptive access policies based on real-time data |
Intrusion Detection | Monitoring network activities to detect and mitigate intrusions | Timely detection with decentralized data sources |
Compliance Monitoring | Monitoring adherence to compliance regulations using federated models | Continuous compliance without raw data transfer |

Attribute | FL | Traditional Centralized AI |
---|---|---|
Privacy | High; data remains on local devices, reducing privacy risks | Lower; data are collected centrally, increasing exposure to breaches |
Latency | Lower latency in data access (5–50 ms), but potential delays in model aggregation (100 ms–5 s, depending on communication bandwidth) | High latency due to centralized processing (200 ms–10 s, depending on data center location and workload) |
Scalability | Scalable with more devices, but may face aggregation challenges due to communication overhead. Can support 10,000+ edge devices but suffers from synchronization delays | Scalable but limited by central infrastructure constraints. Performance degrades when handling millions of devices simultaneously |
Model Accuracy | Dependent on data distribution and device capacity. Federated averaging can lead to 1–5% accuracy degradation compared to centralized training | Often higher accuracy (by 2–10%) due to centralized training on complete, diverse datasets |
Data Ownership | Data remains on clients and is stored locally, ensuring regulatory compliance (e.g., GDPR and HIPAA) | Data transferred to central storage, increasing risk of unauthorized access |
Communication Overhead | High; requires frequent communication for model updates (100 MB–1 GB per round for deep learning models) | Low; models are trained centrally, reducing communication costs |
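
To make the federated averaging and per-round communication figures in the comparison above more concrete, the following is a minimal Python sketch, not the implementation used by any of the reviewed studies. It assumes each client sends a flat list of model parameters together with its local sample count. The names `fedavg` and `update_size_mb` are hypothetical, and the size estimate assumes uncompressed 4-byte (float32) parameters.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Sample-weighted average of client parameter vectors (FedAvg-style)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client update")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all client updates must have the same length")
    # Each coordinate is the mean across clients, weighted by local data size.
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def update_size_mb(num_params: int, bytes_per_param: int = 4) -> float:
    """Uncompressed size of one model update in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 local samples, respectively.
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
    # Assumed model size of 25 million float32 parameters.
    print(update_size_mb(25_000_000))
```

Under these assumptions, a model with 25 million float32 parameters yields an update of about 100 MB per client per round, which matches the lower end of the range quoted in the table. Compression or quantization of updates would reduce this figure.
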

Database | Total Papers | Relevant Papers | Final Selected |
---|---|---|---|
Google Scholar | 12,100 | 1219 | 18 |
SDL | 9206 | 996 | 12 |

Aspect | Cloud Computing | Edge Computing |
---|---|---|
Definition | Centralized computing that stores and processes data in large, remote data centers | Decentralized computing that processes data close to the data source |
Latency | Higher latency due to distance from the data source | Lower latency, as data are processed near the device or data source |
Data Processing | Processes data in centralized data centers | Processes data at or near the edge of the network |
Scalability | Highly scalable with almost unlimited resources | Limited scalability, depending on local hardware and network capabilities |
Bandwidth Requirements | Requires high bandwidth for large data transfers to/from the cloud | Reduces bandwidth needs by processing data locally |
Reliability | Dependent on Internet connectivity to the cloud data center | More reliable in environments with intermittent connectivity |
Privacy and Security | Relies on centralized security protocols, vulnerable to external breaches | Enhanced privacy by keeping data closer to the source, reducing exposure to external threats |
Cost | Potentially higher cost due to bandwidth and centralized infrastructure | Reduces costs by minimizing data sent to the cloud and processing locally |
Ideal Use Cases | Data-intensive applications, complex analytics, backup, and storage | Real-time processing, IoT, and applications requiring low latency and a quick response |

Reference | Key Findings | Limitations/Research Gaps | Suggested Mitigation |
---|---|---|---|
[22] | | | |
[108] | | | |
[109] | | | |
[125] | | | |
[112] | | | |
[126] | | | |
[127] | | | |
[128] | | | |
[129] | | | |
[133] | | | |
[134] | | | |
[135] | | | |
[113] | | | |
[136] | | | |
[137] | | | |
[130] | | | |
[131] | | | |
[132] | | | |
[6] | | | |
[138] | | | |
[139] | | | |
[140] | | | |
[141] | | | |
[142] | | | |
[143] | | | |
[144] | | | |
[145] | | | |
[150] | | | |
[146] | | | |
[147] | | | |
[148] | | | |
[149] | | | |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Albshaier, L.; Almarri, S.; Albuali, A. Federated Learning for Cloud and Edge Security: A Systematic Review of Challenges and AI Opportunities. Electronics 2025, 14, 1019. https://doi.org/10.3390/electronics14051019