Federated Learning and Semantic Communication for the Metaverse: Challenges and Potential Solutions

Bian, Yue; Zhang, Xin; Luosang, Gadeng; Renzeng, Duojie; Renqing, Dongzhu; Ding, Xuhui

doi:10.3390/electronics14050868

by Yue Bian ¹, Xin Zhang ¹, Gadeng Luosang ², Duojie Renzeng ², Dongzhu Renqing ² and Xuhui Ding ³,⁴,*

¹ China Telecom Corporation Limited, Beijing 100001, China
² School of Information Science and Technology, Tibet University, Lhasa 850000, China
³ School of Cyberspace Science and Technology, Beijing Institute of Technology, Beijing 100081, China
⁴ Advanced Technology Research Institute, Beijing Institute of Technology, Jinan 250101, China
* Author to whom correspondence should be addressed.

Electronics 2025, 14(5), 868; https://doi.org/10.3390/electronics14050868

Submission received: 18 December 2024 / Revised: 7 February 2025 / Accepted: 17 February 2025 / Published: 22 February 2025

Abstract:

This study investigates high-quality data processing, immersive experience mechanisms, and large-scale access in the Metaverse while ensuring robust privacy and security. We begin with a comprehensive analysis of the Metaverse’s service requirements, followed by an exploration of its principal technologies. We then evaluate the feasibility and potential benefits of integrating semantic communication to enhance the service quality of the Metaverse. A federated semantic communication framework is proposed, integrating semantic data transmission, semantic digital twins, and a Metaverse construction model trained through federated learning. We assess the performance of the proposed framework through simulations, highlighting the enhancements in transmission efficiency, recovery effectiveness, and intelligent recognition ability afforded by semantic communication for the Metaverse. Notably, the framework achieves high compression efficiency with minimal information distortion (0.055), which decreases transmission delays and improves the immersion quality within the Metaverse. Finally, we identify future challenges and propose potential solutions for advancing semantic communication, federated learning, and Metaverse technologies.

Keywords:

Metaverse construction model; semantic communication; federated learning; semantic twin technology

1. Introduction

The Metaverse is a virtual digital space comprising multiple virtual subspaces that coexist and engage with the real world, establishing a collective virtual reality setting. Within this domain, users have the ability to manipulate digital components and interact with other users through controlled elements, thereby facilitating an immersive experience that parallels real-world interactions [1]. A distinctive characteristic of the Metaverse is its inherent linkage between the physical and virtual domains; the former represents the physical space of the real world, while the latter comprises the virtual spaces within the Metaverse. In essence, the primary objective of the Metaverse is to leverage real-world data obtained from the physical domain by virtual service providers (VSPs) to construct and shape the virtual domain.

The Metaverse has enhanced real-world experiences by incorporating a novel interactive dimension. On the one hand, users are empowered to explore the virtual world of the Metaverse through extended reality (XR) technologies, including augmented reality (AR), virtual reality (VR), and mixed reality (MR) [2]. On the other hand, the Metaverse also offers a novel experimental environment for the real world, where the behavior of users in the Metaverse can have an impact on the real world. For instance, the authors of [3] developed a virtual traffic network in the Metaverse to train and test autonomous driving systems, thus offering valuable empirical references for similar real-world experiments. However, achieving an efficient and reliable interaction between the physical and virtual domains remains a challenge. Critical issues include the rapid construction of the virtual domain utilizing data from the physical domain and ensuring that the feedback from the virtual to the physical domain is both efficient and precise. The authors of [4] developed a semantic framework assisted by blockchain that facilitates interactions between these two domains. This framework is built on a trust-building mechanism on the blockchain, which involves VSPs and edge devices.

Furthermore, the quality of interaction between the physical and virtual domains depends on the communication environment, necessitating the assurance of high-bandwidth, high-reliability, and low-latency service quality. Therefore, it is important to consider the different techniques that can effectively minimize transmission delays. The authors of [5] introduced a modified stopping criterion (ISC) for use in bit-interleaved low-density parity-check code modulation and iterative decoding (BILCM-ID) systems, which represents a significant advancement in achieving more efficient data transmission. The rapid development of the Internet of Things (IoT) [6] has resulted in the integration of numerous sensor interfaces into heterogeneous networks. However, this integration has induced several challenges, such as significant spectrum resource consumption and increased latency in complex communication environments. Consequently, mobile edge computing (MEC) and 6G communication have emerged as potential solutions for the Metaverse [7,8,9]. MEC offers proximity-based edge computing services aimed at improving application performance and response times. The upcoming 6G communication technology introduces a novel paradigm offering expanded spectrum resources [10], reduced latency, and higher transmission speeds to support the Metaverse. Even though MEC effectively improves data transmission efficiency, it relies on support from computing resource providers. Insufficient availability of computing resources directly degrades communication quality. Consequently, the development of incentive mechanisms has emerged as a new paradigm to foster the supply of computing resources. In general, incentive mechanisms aim to maximize incentive compatibility (IC) while maintaining individual rationality (IR) and budget balance. 
The authors of [11] introduced the concept of the age of information (AoI) into incentive mechanisms to assess data freshness, encouraging VSPs to refresh data for constructing digital twins (DTs) and facilitating data transactions between VSPs and edge devices. In addition, as the next-generation wireless communication technology, the 6G network is developing in a direction beyond the traditional Shannon theory [12]. To cope with the scarcity of spectrum resources and bandwidth bottlenecks, 6G research explores new paradigms based on semantic coding, such as generating semantic signatures through a hash coding framework to support low-latency and high-efficiency information delivery while ensuring real-time operation and accuracy in multimodal data processing [13].

In recent years, semantic communication based on deep learning has become a new paradigm for communication frameworks, and much progress has been made. Most of the existing semantic communication models are implemented based on the end-to-end autoencoder framework [14,15], which communicates by capturing the semantic information of data. However, the training data of end-to-end models are limited, which restricts the learning efficacy of deep learning models and affects the performance of the overall semantic communication model. Therefore, a distributed semantic communication framework needs to be studied. Unlike existing end-to-end studies, we propose a federated variational autoencoder (VAE) semantic communication framework in the context of a distributed Metaverse. With this framework, we aim to offer a new paradigm for the Metaverse that delivers low-latency, highly reliable service quality while guaranteeing the privacy of users’ data. We focus on methods for federated learning network construction and semantic extraction over physical and virtual domains. Our contributions can be summarized as follows.

Firstly, we conduct a comprehensive analysis of the service requirements for the Metaverse. This analysis necessitates a re-evaluation of the quality of service (QoS) associated with the Metaverse, introduces a multifaceted understanding of immersion, emphasizes the necessity of a distributed framework, and highlights the importance of security in the Metaverse.
Subsequently, we present semantic communication and assess its feasibility and potential benefits when applied within the Metaverse. By integrating semantic communication with Metaverse technologies, we innovatively empower MEC and DT technologies, aiming to provide more dependable technical services for the Metaverse environment.
Finally, we propose a federated semantic communication framework specifically designed for the Metaverse. Semantic extraction facilitates the efficient compression of raw data, significantly reducing storage demands and increasing data transmission rates. In addition, the proposed framework improves communication reliability even in environments with a low signal-to-noise ratio (SNR). Concurrently, users’ private data are securely contained within a trusted domain, mitigating the potential for data leakage.

2. Overview of the Metaverse

In this section, we first introduce the service requirements of the Metaverse from the perspective of service delivery and specifically discuss the potential security risks and corresponding security requirements of the Metaverse. Subsequently, the foundational technologies supporting the Metaverse are explored, focusing on the derived requirements.

2.1. Metaverse Service Requirements

As shown in Figure 1, the Metaverse is a virtual space that maps physical domains to virtual domains using a variety of technologies and provides access interfaces to users with the goal of delivering a service experience. In particular, the Metaverse needs to meet the following service requirements.

Premium User Experience: The QoS of the Metaverse is a crucial factor for delivering a high-quality user experience. In the Metaverse, users can interact with other users, virtual scenes, and the physical world through virtual reality technologies, which require the Metaverse to provide efficient, reliable, and intelligent services. Specifically, high bandwidths are of paramount importance to enable the Metaverse to accommodate a large number of simultaneous online users, ensuring unrestricted transmission of high-definition video, audio, and images. Additionally, a low latency plays a vital role in facilitating smooth and natural user interactions with the virtual world, while a high reliability is essential for maintaining stable user interactions in the Metaverse, preventing service disruptions and data loss, even in low-SNR settings. Advanced intelligence within the Metaverse adapts interactions to user needs through technologies such as speech recognition, natural language processing, and emotional analysis. In summary, the crucial requirements for the Metaverse include providing high bandwidth, low latency, high reliability, and superior service quality to deliver a premium interactive experience for users [16].

Deep Immersion: The Metaverse aims to create a virtual environment that closely resembles the real world while accommodating users’ multimodal interaction methods to foster a deeply immersive experience [17,18]. To be specific, the primary objective of immersive interaction in the Metaverse is to collect extensive information from the physical domain and utilize it to reconstruct a virtual space that closely emulates the real world. Immersion for the Metaverse manifests through three key aspects: (1) Interactive Sociality: Users can engage and communicate with digital entities within the Metaverse. (2) Temporal and Spatial Flexibility: The Metaverse enables access to diverse temporal and spatial dimensions, enabling users to transition between different realms and enter the digital world. (3) Creativity: Users have the ability to generate their own digital assets within the Metaverse.

Large-Scale Virtual Worlds: The Metaverse needs to provide a vast and sustainable virtual environment capable of facilitating simultaneous interaction, gaming, socialization, and online business activities for hundreds of millions of users. In the fundamental structure of the Metaverse, servers face the challenge of processing increasing amounts of data, resulting in heightened storage and computing demands. In response to these challenges, a distributed architecture offers a novel approach to constructing expansive virtual worlds and effectively alleviates the burden on central servers by distributing computing tasks. Furthermore, the Metaverse encompasses diverse spatial and technical interfaces, necessitating a flexible and compatible framework.

In addition, the security of private user data within the Metaverse is also worth discussing. To construct a more realistic virtual domain, users’ identity data and behavior preferences from the physical domain are often collected by multilevel Metaverse servers. However, the multilevel transmission and storage of data increases the possibility of data leakage. Once a server is attacked, sensitive user information is susceptible to unauthorized access. In the virtual domain, detailed records of user behavior, personal details, and transaction histories heighten the risk of data theft, which has profound and lasting consequences for users.

2.2. Metaverse-Related Technologies

MEC and 6G Communication Technologies: MEC alleviates computing pressure on central cloud servers by offloading computational tasks to edge servers and optimizing computing and storage resources for edge devices [19]. Meanwhile, edge servers are strategically positioned in close proximity to terminal devices to ensure low-latency services and enhance the overall service quality and response speed. Concurrently, substantial advancements in 6G technologies strive to redefine the communication paradigms for future networks [20,21,22,23,24].

Key technologies include the following: (1) Heterogeneous Radio: The utilization of a terahertz (THz) bandwidth ranging from 95 GHz to 3 THz allows for the application of terahertz beamforming to precisely direct narrower beams toward users in the Metaverse, significantly diminishing interference [6]. (2) Intelligent Reflective Surfaces: Intelligent reflective surfaces leverage signal reflection technology to redirect signals toward specific targets. (3) Novel Multiple Access: The novel multiple access scheme enables the simultaneous serving of multiple users in each orthogonal resource block, allowing for shared utilization of time, frequency, and code domains [25,26]. In existing research, the combination of MEC and 6G communication technology has led to significant improvements in QoS and remarkable achievements in the Metaverse.

DT and XR: DT and XR are fundamental technologies for achieving deep immersion in the Metaverse. The Metaverse is characterized by its capacity to simulate the real world with high fidelity. DT plays a crucial role in transforming physical entities or users from the physical domain into DTs within the Metaverse, a process facilitated by the VSP. This transformation allows the Metaverse’s virtual domain to achieve a level of realism necessary for immersive experiences. Furthermore, research on distributed DT architectures has shown their effectiveness in reducing the load on centralized server infrastructures, facilitating the development of a significantly larger Metaverse. Nevertheless, DTs impose stringent demands on end-to-end synchronization, necessitating near-zero latency and ultra-high transmission rates [7,27]. In addition, XR represents the core technology that provides users with an interactive interface to access the Metaverse.

Moreover, blockchain is frequently employed as a decentralized trust mechanism. More specifically, multiple transaction records are organized into blocks, linked together, and then distributed to all nodes in the network for verification and storage. Within the Metaverse, blockchain frequently serves as an auxiliary mechanism to enhance the interaction between the VSP and the physical domains, facilitating the provision of security services [28]. Furthermore, the rapid advancement of artificial intelligence (AI) offers novel solutions for the Metaverse [29,30]. Generative adversarial networks (GANs) [31] exploit the adversarial interplay between discriminators and generators to generate high-quality data. The authors of [32] proposed a Metaverse model based on reinforcement learning and GANs and proved its effectiveness. Deep semantic communication (DeepSC) can achieve more accurate and efficient semantic extraction by integrating the powerful learning ability of artificial intelligence. DeepSC endows semantic communication with efficient information transmission capabilities, smaller overhead, and a lower delay, and it shows good performance in low-SNR fading channels [33]. In addition, semantic communication combined with reinforcement learning [34,35] has also been widely studied. Reinforcement learning can provide more intelligent and autonomous responses to semantic communication and improve communication efficiency.

3. Semantic Communication in the Metaverse

In this section, we introduce semantic communication, trace its origin, and explore its necessity and feasibility in the Metaverse. We further analyze the advantages that semantic communication may offer when implemented in a virtual environment.

3.1. Semantic Communication

Semantic communication [36,37] has led to a paradigm shift in communication technologies, as it focuses on the semantic content of information rather than on the bit-stream transmission of traditional communication. Originating from [38], this approach emphasizes the intrinsic meaning within data and distinguishes them from other forms of information. For example, the RGB image $X \in \mathbb{R}^{H \times W \times 3}$ from the physical space highlights the difference in data-handling requirements, where $H$ and $W$ are the height and the width of the image, and 3 is the number of channels. Unlike traditional compression methods, such as JPEG, JPEG2000, and BPG, the compression for Metaverse image transmission requires a higher compression efficiency and greater noise robustness. Therefore, the encoder for semantic communication is given by

$$x = f_{en}(X) \in \mathbb{R}^{L}, \tag{1}$$

where $x$ is the compressed information of the image, $L$ is its dimension, and $f_{en}$ denotes the encoder. Note that the encoder can be built from linear or convolutional neural network layers.

Traditional communication systems, which focus on the accurate restoration of information bits, are not able to meet the growing demands of edge networks in terms of bandwidth, delay, energy efficiency, and intelligence. In contrast, the core of semantic communication lies in the semantic extraction and restoration of information. This approach removes redundant parts of information and retains semantic features by learning semantic information. Consequently, semantic communication models significantly diminish data transmission overhead, thereby alleviating computational and storage burdens and reducing communication latency.

More recently, the rapid advancement of AI has significantly contributed to the research progress in semantic communication. DeepSC leverages the robust learning capabilities of AI to facilitate accurate and efficient semantic extraction. In DeepSC, the encoder and decoder serve as the essential components. The encoder extracts the semantic representation $x$ from the source data, as in Equation (1). Conversely, the decoder is responsible for reconstructing readable information in the semantic space using the extracted semantics, i.e.,

$$\hat{X} = f_{de}(x + n), \tag{2}$$

where $\hat{X}$ is the estimated image, $f_{de}$ is the decoder, and $n$ is the noise vector. DeepSC enhances semantic communication by providing efficient information transmission, reduced overhead, and lower delay, and it exhibits robust performance in low-SNR environments [33]. Additionally, there has been extensive research on the combination of semantic communication and reinforcement learning [34,35]. Incorporating reinforcement learning enables more intelligent and autonomous responses in semantic communication, thereby enhancing communication efficiency.
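The encode, transmit, and decode chain of Equations (1) and (2) can be illustrated with a minimal Python sketch. It is not the DeepSC implementation: the encoder and decoder are single bias-free linear maps with hand-picked (untrained) toy weights, the image is assumed to be flattened into a vector, and the noise is assumed to be additive white Gaussian noise scaled to a target SNR. The names `linear`, `add_noise`, `transmit`, `ENC`, and `DEC` are hypothetical.

```python
import math
import random


def linear(weights, vec):
    """Bias-free dense layer: returns the matrix-vector product weights @ vec."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def add_noise(x, snr_db, rng):
    """Add Gaussian noise n whose power follows from the target SNR in dB (AWGN assumed)."""
    signal_power = sum(v * v for v in x) / len(x)
    noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [v + rng.gauss(0.0, noise_std) for v in x]


def transmit(image_vec, enc_weights, dec_weights, snr_db, rng):
    """Encode the input, pass it through the noisy channel, and decode it."""
    x = linear(enc_weights, image_vec)      # Eq. (1): x = f_en(X), length L
    received = add_noise(x, snr_db, rng)    # x + n
    return linear(dec_weights, received)    # Eq. (2): X_hat = f_de(x + n)


# Toy weights (hypothetical, not trained): average neighbouring pairs, then duplicate.
ENC = [[0.5, 0.5, 0.0, 0.0],
       [0.0, 0.0, 0.5, 0.5]]
DEC = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

X = [1.0, 1.0, 3.0, 3.0]                    # flattened "image" with 4 values, L = 2
X_hat = transmit(X, ENC, DEC, snr_db=20, rng=random.Random(0))
```

Here the semantic vector has half the length of the input, so the sketch also shows the trade-off discussed in this section: a smaller $L$ lowers the transmission cost but leaves less redundancy with which the decoder can absorb the noise $n$.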

With the increasing number of users in the Metaverse, the problem of data explosion becomes pressing, necessitating the development of solutions to enhance data transmission rates and alleviate data storage pressure. DeepSC presents innovative ideas and solutions to address the challenges posed by the data explosion in this context. The semantic extraction scheme can effectively reduce the costs associated with data storage and transmission while enhancing the QoS in the Metaverse. Furthermore, DeepSC’s diverse model framework exhibits adaptability to end-to-end communication scenarios and scalability in multimodal and multi-scenario environments. Thus, DeepSC is compatible with various scenarios encountered in the Metaverse.

3.2. Semantic Communication Empowers the Metaverse

As shown in Figure 2, we propose two possible Metaverse application scenarios combined with semantic communication. The mobility of edge devices necessitates their support for a large number of terminal devices, resulting in increased computing and storage pressures. DeepSC-based MEC can address the challenge of large-scale data communication in the Metaverse by utilizing low-dimensional semantic information communication, thus alleviating the burden on edge servers. Moreover, the availability of computing resources in edge computing is constrained by resource providers. To address this, the proposed semantic communication framework for the Metaverse employs competition theory to establish incentive mechanisms [39,40], fostering the provision of computing resources by resource providers and enhancing the interaction between the physical and virtual domains.

The DT technology integrated into DeepSC, known as semantic DT, offers robust support for processing extensive volumes of data. Semantic DT possesses the ability to analyze vast amounts of data and extract meaningful semantics, resulting in a significant reduction in the construction cost of the virtual domain through the exploitation of semantic extraction redundancy. Moreover, by leveraging the reducibility of the semantic space, semantic DT enables precise construction of the virtual domain, providing users with a heightened sense of immersion within the Metaverse.

Generally, semantic communication serves as a key solution to the challenges posed by data proliferation in the Metaverse, enabling higher data transmission rates, lower data storage requirements, and less communication overhead. However, if the encoder excessively compresses the semantic features to minimize transmission overhead, it compromises the anti-noise capability, hindering the accurate restoration of semantic features. Hence, the essence of semantic communication lies in maintaining a delicate balance between the compression level of semantic features and the accuracy of feature reconstruction.

4. Federated Semantic Communication Framework for the Metaverse

In this section, we propose a federated semantic communication framework for the Metaverse, which includes data upload, semantic digital twins, and a federated semantic Metaverse. In particular, the sub-Metaverse is first distributed and created by semantic digital twins based on their own private data. Then, the model for establishing the global Metaverse is trained in the federated learning framework.

The advent of federated learning presents a novel paradigm for privacy preservation in the Metaverse [41], where each distributed node trains a local model using its private data and shares the local model parameters with a central node. The central node aggregates all of the local model parameters to update the global model and disseminates the information to all distributed nodes, iteratively refining the model until convergence is achieved. The entire system collaboratively trains a global model to accomplish shared tasks. A distinct advantage of federated learning [42] is that the original data remain within each domain, existing locally in each node. Only model parameters, rather than the data themselves, are employed during training. This approach significantly safeguards user data privacy and mitigates the risks of data leakage during transmission and storage. It decentralizes computing tasks, reduces the risk of a single point of failure, enhances the robustness and stability of the system, supports personalized model training to adapt to the privacy needs of different users, improves the efficiency of model training, and saves bandwidth resources. Furthermore, the distributed collaboration framework alleviates the burden on central services, aligning with the distributed concept of constructing expansive virtual worlds in the Metaverse. Simultaneously, the distributed framework offers a unified management approach for numerous heterogeneous spatial interfaces in the Metaverse, promoting convenient and efficient collaboration among diverse data sources [43]. We propose a semantic Metaverse communication within the federated learning framework, which is delineated in the three phases illustrated in Figure 3.
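The training loop described above (local training, parameter upload, aggregation, and broadcast) can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's implementation: the model is a toy scalar regressor $y = w x$, the aggregation is a sample-count-weighted average in the style of FedAvg, and the names `local_update`, `aggregate`, and `federated_training` are hypothetical.

```python
def local_update(params, data, lr=0.1, epochs=1):
    """Train the toy model y = w * x on one node's private data (gradient descent on MSE)."""
    w = params["w"]
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return {"w": w}


def aggregate(local_params, sizes):
    """Average local parameters, weighted by each node's sample count (FedAvg-style assumption)."""
    total = sum(sizes)
    keys = local_params[0].keys()
    return {k: sum(p[k] * n for p, n in zip(local_params, sizes)) / total for k in keys}


def federated_training(node_datasets, rounds=50):
    """Iterate: broadcast the global model, train locally, aggregate the returned parameters."""
    global_params = {"w": 0.0}
    for _ in range(rounds):
        # Only parameters leave a node; the raw (x, y) pairs stay local.
        local_params = [local_update(dict(global_params), data) for data in node_datasets]
        global_params = aggregate(local_params, [len(d) for d in node_datasets])
    return global_params


# Three nodes holding private samples of the same relation y = 2x.
nodes = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)], [(0.5, 1.0), (1.5, 3.0), (1.0, 2.0)]]
model = federated_training(nodes)
```

In this toy setting the global parameter `model["w"]` approaches 2 even though no node ever shares its samples, which is the privacy property the paragraph above relies on.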

Data Upload: Terminal devices, functioning primarily as interfaces between physical domains and VSPs, transmit data and information to trusted computing modules using semantic communication. These devices, which are connected to physical domains, are typically limited in their computing and storage capabilities, necessitating the dispatch of raw private data to trusted modules for secure processing. Here, edge servers, characterized by substantial computing resources and storage capacity, act as trusted computing modules. They establish long-term collaborations with terminals to ensure the shortest path for data transmission, effectively reducing data leakage during multiple transmissions and storage instances. Additionally, terminal equipment collects data in various complex modes and high-dimensional formats, including images, audio, text, and mixed modes. The semantic communication framework utilizes semantic extraction to efficiently extract information features, reduce data dimensionality, and minimize data transmission costs. Subsequently, the data are reconstructed from their semantic representation using decoders, which possess anti-noise properties, particularly in channel-fading scenarios. To implement these semantic communication modules, various AI models are utilized [44]. Autoencoders (AEs) are suitable for reconstruction tasks within semantic communications. VAEs are generative models that learn data distributions for data reconstruction. Recurrent neural networks (RNNs) and their variants, such as long short-term memory networks (LSTMs), can capture temporal data correlations to predict future temporal sequences.
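For the VAE option mentioned above, two standard ingredients of the usual Gaussian VAE formulation are shown in the sketch below: the reparameterization step that samples a latent (semantic) vector, and the closed-form KL divergence between the encoder's diagonal Gaussian and a standard normal prior. The text does not specify the loss used in the framework, so this is the textbook formulation rather than the authors' exact model, and the function names are hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian, summed over dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))


rng = random.Random(0)
z = reparameterize([0.2, -0.1], [0.0, 0.0], rng)   # one latent sample of dimension 2
kl = kl_to_standard_normal([0.2, -0.1], [0.0, 0.0])
```

In a typical VAE, this KL term is added to a reconstruction loss; its weight is one way to tune the balance between compression and reconstruction accuracy.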

Semantic Digital Twins: A semantic digital twin (SCDT) is one potential virtual representation of a real-world asset based on semantic technologies, and it provides a shared definition of concepts within a domain [45,46]. The core mechanism behind an SCDT is that it combines the encoder–decoder architecture in deep learning with the idea of federated learning [47]. In deep learning, encoder–decoder models such as AEs, VAEs, and GANs can indeed be used to implement SCDTs. The essence of these models is to extract a low-dimensional semantic vector representation from the high-dimensional raw data through the encoder. This process can be regarded as data compression and the removal of redundant information. The encoded semantic vector is used for the backup or transmission of DTs. When it is necessary to restore the original information or construct a virtual scene similar to the real world, the decoder uses semantic vectors to reconstruct the data as closely as possible to the original input. The edge server constructs a sub-Metaverse using semantic DT technology. In this process, the edge server initially acquires real-world data, which are subsequently compressed via a semantic encoder to eliminate redundancy, thereby enhancing storage efficiency. As part of constructing a sub-Metaverse, the edge server strategically allocates computational resources and employs distributed DTs during the reconstruction task. Meanwhile, semantic decoders in the semantic communication framework utilize semantic features stored in the server to facilitate the reconstruction of the sub-Metaverse and data resembling real-world data.

Federated Semantic Metaverse: During the training of the global Metaverse based on federated semantic models, edge servers function as key distributed nodes contributing by transmitting the SCDT model of the sub-Metaverse to the central cloud as a communicable local model. The central cloud leverages its substantial computing resources to collaborate with the edge server in the construction of the global Metaverse. It is essential to recognize that the global model in the federated semantic Metaverse architecture is responsible for training the optimal SCDT model to construct the Metaverse. The SCDT supports local training through federated learning. Furthermore, private data solely reside on the edge server near the terminal and do not partake in subsequent transmission tasks related to Metaverse construction, ensuring the prevention of data leakage during transmission. Numerous models of federated learning have been proposed in recent years, encompassing robust federated learning approaches that address data poisoning problems and personalized federated learning methods that cater to heterogeneous data. These models offer additional avenues for expanding our federated learning model.

5. Performance Analysis and Evaluation

In this section, we implement the proposed federated semantic communication framework and report experimental results that demonstrate the benefits of semantic communication.

To evaluate the performance of federated semantic communication schemes in the Metaverse, we utilize a combination of the MNIST and KMNIST datasets as the source for a heterogeneous dataset in PyTorch 1.9.12. Each dataset consists of 60,000 training samples and 10,000 testing samples. In addition, CIFAR-10 is a color image dataset with a more complex data structure. We use this dataset to train the Fed-AE model and assess its reconstruction ability. Moreover, all of the simulation experiments in this study simulate data transmission over a fading channel. When the AE and VAE are trained with centralized learning, the data consist of handwritten digits and Japanese characters, representing the diversity of data in end-to-end transmission. Additionally, we train the AE and VAE with federated learning, and the resulting models are named Fed-AE and Fed-VAE. The AE possesses powerful reconstruction capabilities and can reconstruct high-quality data. The VAE can compress the input data into a low-dimensional latent space by learning the semantic distribution of the input data, even in the presence of high levels of noise interference.

For the experiments on training the semantic communication model with federated learning, we simulate 32 terminal devices participating in information collection. The dataset heterogeneity of each terminal device is simulated using random sampling from the dataset. To enhance the model’s generalization ability, we employ the Model-Agnostic Meta-Learning (MAML) algorithm as a collaborative algorithm. The MAML algorithm helps AI models quickly adapt and learn new knowledge in different tasks, leading to more efficient learning and inference.
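One way to realize the random-sampling partition described above is sketched below. The device count (32) and the dataset sizes come from the text; the shard size of 500, the pooling of MNIST and KMNIST into one index range, and the function names are assumptions made for illustration. Sampling is without replacement inside a device, and different devices may share samples.

```python
import random

MNIST_TRAIN = 60_000     # training samples in MNIST (from the text)
KMNIST_TRAIN = 60_000    # training samples in KMNIST (from the text)


def partition_by_random_sampling(num_samples, num_devices, samples_per_device, seed=0):
    """Give each device an independent random subset of sample indices."""
    rng = random.Random(seed)
    return [rng.sample(range(num_samples), samples_per_device) for _ in range(num_devices)]


def mnist_fraction(shard):
    """Share of a shard drawn from MNIST, assuming indices below MNIST_TRAIN are MNIST."""
    return sum(1 for i in shard if i < MNIST_TRAIN) / len(shard)


# 32 terminal devices, each holding a hypothetical 500 samples of the pooled dataset.
shards = partition_by_random_sampling(MNIST_TRAIN + KMNIST_TRAIN,
                                      num_devices=32, samples_per_device=500)
fractions = [mnist_fraction(s) for s in shards]
```

Uniform sampling like this produces only mild differences in the MNIST/KMNIST mix across devices; a stronger non-IID split would need a skewed sampling rule, which the text does not describe.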

The proposed framework completes a global training cycle in approximately 2.08 s, with a model size of around 0.12 MB. However, the design of the model warrants deeper exploration depending on various real-world scenarios and requirements. Our experimental setup includes an AMD R5 5800H CPU (Advanced Micro Devices, Inc., Santa Clara, CA, USA), an Nvidia RTX 3050 Ti GPU (NVIDIA Corporation, Santa Clara, CA, USA), Windows 11, and PyTorch 1.9.12. In the experiment, we simulate a scenario with 30 distributed nodes, all of which participate in the co-training process of the model in each round of training.
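The aggregation step of one such round, in which every node contributes, can be sketched as follows. The sketch uses sample-weighted federated averaging (FedAvg) as a simple stand-in; it is not the MAML-based collaboration used in the experiments, and the flat-list representation of model weights is an assumption made to keep the example self-contained.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate client models by a sample-weighted average of their parameters.

    client_weights: one flat list of parameters per client (all the same length).
    client_sizes:   number of local training samples held by each client.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]

# Hypothetical usage: two clients, the second holding three times as much data.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

With the reported model size of about 0.12 MB and 30 participating nodes, one round would upload roughly 30 × 0.12 ≈ 3.6 MB in total, assuming each node sends its full model without further compression.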

To evaluate the reliability and effectiveness of the semantic communication model, we use the following three key indicators: the peak signal-to-noise ratio (PSNR), classification test accuracy, and the intuitive image codec effect. The PSNR measures the difference between the reconstructed image and the original image, reflecting the model’s ability to maintain image quality and resist noise at different compression rates. The classification test accuracy verifies the effectiveness of the model in extracting semantic features and its ability to generalize to unseen data. The intuitive image codec effect shows the user’s visual experience in practical applications, ensuring that the model not only performs well numerically but also provides high-quality reconstructed images with few artifacts. Together, these three metrics reveal the performance and user experience of semantic communication models in the Metaverse environment from different perspectives.
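For reference, the PSNR follows the standard definition PSNR = 10 log10(MAX² / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the original and reconstructed images. The sketch below assumes pixel values normalized to [0, 1] and images flattened into lists; both choices are illustrative and are not taken from the paper.

```python
import math

def psnr(original, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(max_value ** 2 / mse)

# Hypothetical usage: a uniform error of 0.1 on every pixel gives about 20 dB.
value_db = psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1])
```

A higher PSNR means the reconstruction is closer to the original, which is why it is used here to compare schemes at the same compression rate.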

We employ four different semantic model schemes, as shown in Figure 4. The two schemes of the AE utilize a single linear layer, while the two schemes of the VAE utilize deeper architectures with three linear layers. We can observe that the VAE performs consistently across different compression rates, showing little sensitivity to the compression rate. However, the peak signal-to-noise ratio (PSNR) performance of the VAE is not as good as that of the AE. The reason for this is that although the VAE can learn the data distribution, it relies on the semantic model to learn the feature space, and the simple linear layers may not be capable of effectively performing feature learning tasks. On the other hand, the AE exhibits a larger performance gap at different compression rates, with poorer PSNR performance at lower compression rates. Federated learning based on the MAML algorithm leads to a significant improvement in the PSNR, but the performance of Fed-VAE is slightly inferior.

As shown in Figure 5, the Fed-AE model is tested on the CIFAR-10 and KMNIST test sets for different CRs of 0.1, 0.3, 0.5, 0.7, and 0.9. At a CR of 0.3, the content of the images can be reconstructed, but there are noticeable artifacts in the reconstructed images. At a CR of 0.5, the reconstructed images become clearer, with significantly reduced artifacts, and the transmission cost is halved compared with that of the original data. At CRs of 0.7 and 0.9, the images are reconstructed with even higher quality, but the transmission cost remains a concern. Therefore, the main task of the semantic model is to reconstruct the data as sharply and with as little noise as possible at a low CR. As shown in Figure 6, the Fed-AE model is tested at different CRs on the MNIST, KMNIST, and mixed datasets to measure the testing accuracy. The semantic communication model achieves high accuracy on MNIST, indicating that simple data reconstructed through semantic encoding and decoding can be effectively learned in the feature space, and it performs well across different compression rates. However, for complex data such as the Japanese characters in KMNIST, the model’s learning capacity is limited. This implies that shallow model architectures struggle to reconstruct complex data effectively, and deeper semantic communication networks are worth exploring.
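To make the relation between the CR and the transmission cost concrete, the sketch below converts a CR into the number of latent values sent per image. It assumes that the CR is the ratio of the latent dimension to the flattened input dimension, which is consistent with the statement that a CR of 0.5 halves the transmission cost; the function name and the rounding rule are illustrative.

```python
def latent_dim(input_dim, cr):
    """Number of latent values transmitted per image at compression rate `cr`.

    Assumption: cr = latent dimension / flattened input dimension.
    """
    return max(1, round(input_dim * cr))

GRAY_28 = 28 * 28        # MNIST / KMNIST image, 784 values
COLOR_32 = 32 * 32 * 3   # CIFAR-10 image, 3072 values

for cr in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"CR={cr}: {latent_dim(GRAY_28, cr)} of {GRAY_28} values, "
          f"{latent_dim(COLOR_32, cr)} of {COLOR_32} values")
```

Under this assumption, a single-linear-layer AE for 28 × 28 images at a CR of 0.5 would map 784 inputs to 392 latent values, and the decoder would map them back to 784 outputs.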

Furthermore, through horizontal comparison, we find that lower compression rates reduce the accuracy of semantic extraction and classification, as data transmitted at lower compression rates are more susceptible to noise interference. Compared with the control set generated through JPEG compression, the advantage of our model is clear. As shown in Figure 7, our model compresses the image further than JPEG does (a compression rate of 0.055, against 0.15 to 0.30 for the JPEG images), while its visual quality is basically equivalent to the JPEG effect with an 85% compression rate. When the compression rate drops below about 0.2, JPEG compression performs notably poorly, with significant problems in color restoration and image clarity. Additionally, even when the compression rate coefficient for JPEG is adjusted, it is difficult to reduce the compression rate further, and the lowest compression rate achieved is only about 0.13.

6. Future Research Directions and Challenges

Despite the advantages presented by the federated semantic framework for the Metaverse in reducing computing and storage overhead while simultaneously ensuring data privacy and security, significant challenges persist in relation to the complexities of semantic communication, federated learning, and Metaverse integration.

Semantic feature representation: While semantic communication possesses a formidable ability for semantic extraction, the growing complexity and high dimensionality of the data in various social contexts bring challenges in extracting the most crucial features of complex objects. Furthermore, AI-based semantic communication lacks interpretability in the hidden space, making it challenging to assess the accuracy of the learned semantic space, thus hindering the optimization of semantic communication models.

Semantic security: Semantic communication based on deep learning introduces security risks in computing and communication [48]. Specifically, during the training of a semantic communication model, attackers can exploit malicious data that exhibit different information representations but share similar semantics with the training data to compromise the model’s training process. This manipulation may not significantly impact the overall performance of the model, but it can cause the model to exhibit different behaviors when confronted with specific trigger sets. Furthermore, in the context of wireless communication, codecs, which are usually trained in pairs, present additional security challenges, given their integrated role in ensuring secure and reliable transmission [49,50]. One solution worth investigating is eliminating poisoned data: the data can be cleaned before training to remove samples that may be poisoned [51].

Privacy and security in the Metaverse: As the Metaverse is an emerging immersive virtual shared space, its privacy and security issues have become increasingly prominent. While users enjoy their digital lives, their personal information may be compromised at every stage of the data service life cycle, including the collection of sensitive information such as private locations, habits, and lifestyles. As the Metaverse scales and technology advances, new threats continue to emerge, and XR and HCI (human–computer interaction) devices naturally collect more sensitive information than traditional smart devices, increasing the difficulty of accountability [52]. Identity management and access control also face challenges, as centralized identity systems are vulnerable to single-point-of-failure risks and to potential leakage caused by service providers. Self-sovereign identity (SSI) allows users to control and share personal data autonomously and to achieve identity interoperability in cross-domain operations, which is an important development direction in the construction of the future Metaverse [53]. In addition, the leakage of secure or private data may occur during edge storage, centralized storage, data transmission, and data processing. Any weakness in these links may lead to the leakage of users’ sensitive information, thus affecting users’ personal security and social stability.

Security of federated learning: Federated learning faces a variety of potential security challenges, such as model poisoning attacks, gradient leakage attacks, and backdoor attacks. Malicious nodes may submit tampered local model updates, compromise the accuracy of the global model, or introduce covert backdoors that cause the system to perform unintended actions under certain conditions. By using robust aggregation techniques, such as Krum, Median, or Trimmed Mean, it is possible to detect and limit the impact of malicious updates [54]. In addition, federated adversarial training is worth exploring: local models are trained on locally generated adversarial examples, and the global model is updated to be robust to adversarial attacks. Although the original data are not directly shared, an attacker may still infer part of the training data by analyzing the uploaded model parameters, causing user privacy leakage [55]. Several techniques are expected to address these challenges: differential privacy can protect the privacy of individual contributors, and adversarial training can improve the resistance of the model to adversarial samples [56].
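Two of the aggregation rules named above, the coordinate-wise median and the trimmed mean, can be sketched in a few lines. This is a minimal illustration that assumes each client update is a flat list of parameters; it is not the specific scheme of [54], and Krum is omitted for brevity.

```python
import statistics

def coordinate_median(updates):
    """Aggregate client updates by taking the median of each coordinate."""
    return [statistics.median(column) for column in zip(*updates)]

def trimmed_mean(updates, trim):
    """Drop the `trim` smallest and `trim` largest values per coordinate, then average."""
    if 2 * trim >= len(updates):
        raise ValueError("trim is too large for the number of updates")
    aggregated = []
    for column in zip(*updates):
        kept = sorted(column)[trim:len(column) - trim]
        aggregated.append(sum(kept) / len(kept))
    return aggregated

# Hypothetical usage: four honest clients and one client sending an extreme update.
updates = [[1.0, 2.0]] * 4 + [[50.0, -50.0]]
plain_mean = [sum(column) / len(column) for column in zip(*updates)]
robust_median = coordinate_median(updates)
robust_trimmed = trimmed_mean(updates, trim=1)
```

In this toy case the plain mean is pulled far from the honest value by the single outlier, whereas both robust rules recover `[1.0, 2.0]`. Such rules only bound the influence of a minority of malicious clients; they do not address gradient leakage.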

Efficient collaboration without redundancy: Federated learning introduces a novel approach to collaborative models. Notably, when the training tasks across multiple cooperative nodes display significant redundancy, this can lead to the squandering of precious computing and communication resources. Therefore, addressing redundant collaboration and reducing the cost of repetitive training are critical challenges in this field. Additionally, considering the similarity and dissimilarity between various training tasks is crucial for eliminating redundant cooperation. An inadequate assessment of these similarities and dissimilarities can adversely affect the generalization capabilities of models.

Interoperability in the Metaverse: The Metaverse encompasses various virtual environments and applications that often employ different technologies and standards, leading to incompatible interactions. Because extended reality serves as a vital interface to the Metaverse, conflicts arising from differing technologies and standards significantly impact the user experience and accessibility of the Metaverse. Thus, designing a generalized Metaverse communication interoperability protocol and framework is the key to solving this challenge [57]. The protocol or framework should focus on solving the following three core problems: ensuring the security of user privacy in the process of interoperability; improving the efficiency of interaction between different systems; and enhancing the compatibility and versatility of the protocol with various technologies and standards. By addressing these challenges, the user experience can be significantly improved and Metaverse technology can be advanced. The authors of [58] propose a four-layer architecture to achieve interoperable virtual world communication, and the AIM system framework is proposed in [59] to solve the problem of cross-platform message transmission.

However, the full potential of combining federated learning and semantic communication for the Metaverse has not yet been demonstrated, and the methods and effects of this combination merit further exploration. Although AEs, VAEs, and GANs have made progress in supporting semantic communication, these methods are well established, and with the development of the internet and the dramatic increase in data volume, research on new semantic communication technologies has become particularly urgent. In addition, our simulations on simple datasets provide only a preliminary verification of the proposed federated semantic communication framework for the Metaverse; its performance in real, complex Metaverse environments still needs to be tested and verified.

7. Conclusions

This study presents a comprehensive examination of the service requirements and associated technologies pertinent to the Metaverse, emphasizing the critical contributions of multi-access edge computing (MEC) and sixth-generation (6G) technologies in facilitating Metaverse applications. To construct high-quality data, enhance user immersion, support extensive access, and improve privacy and security within the Metaverse, we analyzed the feasibility and potential benefits of implementing semantic communication tailored to the Metaverse’s specific services. Furthermore, a federated learning semantic communication framework is proposed, and the operational process of the Metaverse within this framework is detailed. Our simulation results demonstrate the efficacy of the proposed framework in reducing delays and efficiently handling the collection of the extensive data that are crucial for the construction of virtual environments. This framework can improve transmission efficiency, protect privacy, and reduce the burden of computing and storage for practical applications, as well as provide a feasible technical solution for constructing large-scale distributed virtual worlds. Finally, we address future challenges and explore potential strategies for advancing semantic computing and communication in support of the Metaverse.

Author Contributions

Conceptualization, Y.B.; Methodology, X.Z.; Software, G.L.; Formal analysis, D.R. (Duojie Renzeng) and D.R. (Dongzhu Renqing); Validation, X.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant 2023YFC3305904, in part by the National Natural Science Foundation of China under Grants 62301049, 62436006, and 62406257, in part by the National Science and Technology Major Project of China under Grant 2022ZD0116100, and in part by the Tibet Autonomous Region Natural Scientific Foundation under Grant XZ202401ZR0031.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

Authors Yue Bian and Xin Zhang were employed by China Telecom Corporation Limited. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

1. Huang, Y.; Qiao, X.; Wang, H.; Su, X.; Dustdar, S.; Zhang, P. Multi-player immersive communications and interactions in Metaverse: Challenges, architecture, and future directions. arXiv 2022, arXiv:2210.06802.
2. Yu, J.; Alhilal, A.; Hui, P.; Tsang, D.H. 6G mobile-edge empowered Metaverse: Requirements, technologies, challenges and research directions. arXiv 2022, arXiv:2211.04854.
3. Ng, W.C.; Du, H.; Lim, W.Y.B.; Xiong, Z.; Niyato, D.; Miao, C. Stochastic resource allocation for semantic communication-aided virtual transportation networks in the Metaverse. arXiv 2022, arXiv:2208.14661.
4. Lin, Y.; Du, H.; Niyato, D.; Nie, J.; Zhang, J.; Cheng, Y.; Yang, Z. Blockchain-aided secure semantic communication for ai-generated content in Metaverse. IEEE Open J. Comput. Soc. 2023, 4, 72–83.
5. Ding, X.; Xu, Y.; Li, G.; Yang, K.; Yuan, J.; An, J. Design and Performance Evaluation for BILCM-ID System with Improved Stopping Criterion. IEEE Trans. Veh. Technol. 2024.
6. Tan, C.; Cai, D.; Xu, Y.; Ding, Z.; Fan, P. Threshold-enhanced hierarchical spatial non-stationary channel estimation for uplink massive mimo systems. IEEE Trans. Wirel. Commun. 2024, 23, 4830–4844.
7. Hashash, O.; Chaccour, C.; Saad, W.; Yu, T.; Sakaguchi, K.; Debbah, M. The seven worlds and experiences of the wireless Metaverse: Challenges and opportunities. IEEE Commun. Mag. 2025, 63, 120–127.
8. Zawish, M.; Dharejo, F.A.; Khowaja, S.A.; Raza, S.; Davy, S.; Dev, K.; Bellavista, P. Ai and 6g into the Metaverse: Fundamentals, challenges and future research trends. IEEE Open J. Commun. Soc. 2024, 5, 730–778.
9. Zawish, M.; Dharejo, F.A.; Khowaja, S.A.; Raza, S.; Davy, S.; Dev, K.; Bellavista, P. Dynamic Weighted Energy Minimization for Aerial Edge Computing Networks. IEEE Internet Things J. 2024, 12, 683–697.
10. Roy, S.; Rezazadeh, F.; Chergui, H.; Verikoukis, C. Joint Explainability and Sensitivity-Aware Federated Deep Learning for Transparent 6G RAN Slicing. In Proceedings of the ICC 2023-IEEE International Conference on Communications, Rome, Italy, 28 May–1 June 2023.
11. Liew, Z.Q.; Du, H.; Lim, W.Y.B.; Xiong, Z.; Niyato, D.; Yu, H. Economics of semantic communication in Metaverse: An auction approach. In Proceedings of the 2023 IEEE 20th Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2023; pp. 398–403.
12. Sana, M.; Strinati, E.C. Learning semantics: An opportunity for effective 6G communications. In Proceedings of the 2022 IEEE 19th Annual Consumer Communications & Networking Conference (CCNC), Las Vegas, NV, USA, 8–11 January 2022; pp. 631–636.
13. Pokhrel, S.R.; Choi, J. Understand-before-talk (UBT): A semantic communication approach to 6G networks. IEEE Trans. Veh. Technol. 2022, 72, 3544–3556.
14. Wang, Y.; Sun, Z.; Fan, J.; Ma, H. On the uses of large language models to design end-to-end learning semantic communication. In Proceedings of the 2024 IEEE Wireless Communications and Networking Conference (WCNC), Dubai, United Arab Emirates, 21–24 April 2024; pp. 1–6.
15. Zhang, W.; Bai, K.; Zeadally, S.; Zhang, H.; Shao, H.; Ma, H.; Leung, V.C.M. Deepma: End-to-end deep multiple access for wireless image transmission in semantic communication. IEEE Trans. Cogn. Commun. Netw. 2023, 10, 387–402.
16. Chang, L.; Zhang, Z.; Li, P.; Xi, S.; Guo, W.; Shen, Y.; Xiong, Z.; Kang, J.; Niyato, D.; Qiao, X.; et al. 6G-enabled edge AI for Metaverse: Challenges, methods, and future research directions. J. Commun. Inf. Netw. 2022, 7, 107–121.
17. Dudley, J.; Yin, L.; Garaj, V.; Kristensson, P.O. Inclusive Immersion: A review of efforts to improve accessibility in virtual reality, augmented reality and the Metaverse. Virtual Real. 2023, 27, 2989–3020.
18. Sanchez-Acedo, A.; Carbonell-Alcocer, A.; Gertrudix, M.; Rubio-Tamayo, J.L. Metaverse and extended realities in immersive journalism: A systematic literature review. Multimodal Technol. Interact. 2023, 7, 96.
19. Guo, Y.; Zhao, R.; Lai, S.; Fan, L.; Lei, X.; Karagiannidis, G.K. Distributed machine learning for multiuser mobile edge computing systems. IEEE J. Sel. Top. Signal Process. 2022, 16, 460–473.
20. Cai, D.; Fan, P.; Zou, Q.; Xu, Y.; Ding, Z.; Liu, Z. Active device detection and performance analysis of massive non-orthogonal transmissions in cellular internet of things. Sci. China Inf. Sci. 2022, 65, 182301.
21. Zheng, S.; Shen, C.; Chen, X. Design and analysis of uplink and downlink communications for federated learning. IEEE J. Sel. Areas Commun. 2020, 39, 2150–2167.
22. Mao, K.; Zhu, Q.; Wang, C.-X.; Ye, X.; Gomez-Ponce, J.; Cai, X.; Miao, Y.; Cui, Z.; Wu, Q.; Fan, W. A Survey on Channel Sounding Technologies and Measurements for UAV-Assisted Communications. IEEE Trans. Instrum. Meas. 2024, 73, 8004624.
23. Wang, J.; Zhu, Q.; Lin, Z.; Chen, J.; Ding, G.; Wu, Q.; Gu, G.; Gao, Q. Sparse Bayesian Learning-Based Hierarchical Construction for 3D Radio Environment Maps Incorporating Channel Shadowing. IEEE Trans. Wirel. Commun. 2024, 73, 14560–14574.
24. Hang, Y.; Gao, X.; Shi, M.; Ye, N. Energy-Efficient Power Control in D2D Networks: A Distributed ADMM Approach with Dynamic Penalty Coefficient. IEEE Trans. Veh. Technol. 2025.
25. Ding, X.; Zhou, K.; Li, G.; Yang, K.; Gao, X.; Yuan, J.; An, J. Customized Joint Blind Frame Synchronization and Decoding Methods for Analog LDPC Decoder. IEEE Trans. Commun. 2024, 72, 756–770.
26. Ni, H.; Zhu, Q.; Hua, B.; Mao, K.; Pan, Y.; Ali, F.; Zhong, W.; Chen, X. Path Loss and Shadowing for UAV-to-Ground UWB Channels Incorporating the Effects of Built-up Areas and Airframe. IEEE Trans. Intell. Transp. Syst. 2024, 25, 17066–17077.
27. Tan, C.; Cai, D.; Fang, F.; Ding, Z.; Fan, P. Federated unfolding learning for CSI feedback in distributed edge networks. IEEE Trans. Commun. 2025, 73, 410–424.
28. Lin, Y.; Gao, Z.; Du, H.; Niyato, D.; Kang, J.; Deng, R.; Shen, X.S. A unified blockchain-semantic framework for wireless edge intelligence enabled web 3.0. IEEE Wirel. Commun. 2024, 31, 126–133.
29. Huynh-The, T.; Pham, Q.V.; Pham, X.Q.; Nguyen, T.T.; Han, Z.; Kim, D.S. Artificial intelligence for the Metaverse: A survey. Eng. Appl. Artif. Intell. 2023, 117, 105581.
30. Chamola, V.; Bansal, G.; Das, T.K.; Hassija, V.; Reddy, N.S.S.; Wang, J.; Zeadally, S.; Hussain, A.; Yu, F.R.; Guizani, M.; et al. Beyond reality: The pivotal role of generative ai in the Metaverse. IEEE Internet Things Mag. 2024, 7, 126–135.
31. Lu, S.; Dong, Z.; Cai, D.; Fang, F.; Zhao, D. MIM-GAN-based anomaly detection for multivariate time series data. In Proceedings of the 2023 IEEE 98th Vehicular Technology Conference (VTC2023-Fall), Hong Kong, China, 10–13 October 2023; pp. 1–7.
32. Ali, T.; Al-Khalidi, M.; Al-Zaidi, R.; Eleyan, A.; Rehman, M.A.U. Securing the Metaverse: A Deep Reinforcement Learning and Generative Adversarial Network Approach to Intrusion Detection. In Proceedings of the 2024 IEEE International Conference on Communications Workshops (ICC Workshops), Denver, CO, USA, 9–13 June 2024; pp. 263–268.
33. Dong, C.; Liang, H.; Xu, X.; Han, S.; Wang, B.; Zhang, P. Semantic communication system based on semantic slice models propagation. IEEE J. Sel. Areas Commun. 2022, 41, 202–213.
34. Park, J.; Choi, J.; Kim, S.L.; Bennis, M. Enabling the wireless Metaverse via semantic multiverse communication. In Proceedings of the 2023 20th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), Madrid, Spain, 11–14 September 2023; pp. 85–90.
35. Lotfi, I.; Niyato, D.; Sun, S.; Kim, D.I.; Shen, X. Semantic Information Marketing in The Metaverse: A Learning-Based Contract Theory Framework. IEEE J. Sel. Areas Commun. 2023, 42, 710–723.
36. Wang, W.; Cai, D.; Liu, Z.; Ding, Z.; Fan, P. Wireless Image Semantic Cooperative Transmission in Distributed Edge Networks: An Information Disentanglement Method. In Proceedings of the 2024 IEEE 25th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Lucca, Italy, 10–13 September 2024; pp. 461–465.
37. Wang, W.; Cai, D.; Dong, Z.; Yu, L.; Xu, Y.; Liu, Z. Two-view Image Semantic Cooperative Non-orthogonal Transmission in Distributed Edge Networks. Int. J. Intell. Syst. 2024, 2024, 5081017.
38. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
39. Luong, N.C.; Pham, Q.-V.; Huynh-The, T.; Nguyen, V.-D.; Ng, D.W.K.; Chatzinotas, S. Edge computing for semantic communication enabled Metaverse: An incentive mechanism design. arXiv 2022, arXiv:2212.06463.
40. Wang, J.; Du, H.; Tian, Z.; Niyato, D.; Kang, J.; Shen, X. Semantic-aware sensing information transmission for Metaverse: A contest theoretic approach. IEEE Trans. Wirel. Commun. 2023, 22, 5214–5228.
41. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B.A.Y. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 20–22 April 2017; Singh, A., Zhu, J., Eds.; Proceedings of Machine Learning Research. PMLR: Cambridge, MA, USA, 2017; pp. 1273–1282.
42. Wu, B.; Fang, F.; Wang, X.; Cai, D.; Fu, S.; Ding, Z. Client selection and cost-efficient joint optimization for noma-enabled hierarchical federated learning. IEEE Trans. Wirel. Commun. 2024, 23, 14289–14303.
43. Ding, X.; Zhang, Y.; Li, J.; Mao, B.; Guo, Y.; Li, G. A feasibility study of multi-mode intelligent fusion medical data transmission technology of industrial Internet of Things combined with medical Internet of Things. Internet Things 2023, 21, 100689.
44. Xiao, Z.; Cai, D.; Dong, Z.; Xiao, Y.; Shi, Y.; Liu, K. Cnxa: A novel attention mechanism aided convolution network. In Proceedings of the 2022 IEEE 8th International Conference on Cloud Computing and Intelligent Systems (CCIS), Chengdu, China, 26–28 November 2022; pp. 227–233.
45. Gruber, T.R. A translation approach to portable ontology specifications. Knowl. Acquis. 1993, 5, 199–220.
46. Benfer, R.; Muller, J. Semantic digital twin creation of building systems through time series based metadata inference-A review. Energy Build. 2024, 321, 114637.
47. Thomas, C.K.; Saad, W.; Xiao, Y. Causal semantic communication for digital twins: A generalizable imitation learning approach. IEEE J. Sel. Areas Inf. Theory 2023, 4, 698–717.
48. Ud Din, I.; Habib Khan, K.; Almogren, A.; Zareei, M.; Arturo Pérez Díaz, J. Securing the Metaverse: A blockchain-enabled zero-trust architecture for virtual environments. IEEE Access 2024, 12, 92337–92347.
49. Yang, Z.; Chen, M.; Li, G.; Yang, Y.; Zhang, Z. Secure semantic communications: Fundamentals and challenges. IEEE Netw. 2024, 38, 513–520.
50. Shen, M.; Wang, J.; Du, H.; Niyato, D.; Tang, X.; Kang, J.; Ding, Y.; Zhu, L. Secure semantic communications: Challenges, approaches, and opportunities. IEEE Netw. 2023, 38, 197–206.
51. Chen, J.; Zhang, X.; Zhang, R.; Wang, C.; Liu, L. De-pois: An attack-agnostic defense against data poisoning attacks. IEEE Trans. Inf. Forensics Secur. 2021, 16, 3412–3425.
52. Wang, Y.; Su, Z.; Zhang, N.; Xing, R.; Liu, D.; Luan, T.H.; Shen, X. A Survey on Metaverse: Fundamentals, Security, and Privacy. IEEE Commun. Surv. Tutor. 2023, 25, 319–352.
53. Roesner, F.; Kohno, T. Security and Privacy in the Metaverse. IEEE Secur. Priv. 2024, 22, 7–9.
54. Wang, T.; Zheng, Z.; Lin, F. Federated learning framework based on trimmed mean aggregation rules. Expert Syst. Appl. 2025, 270, 126354.
55. Wei, W.; Liu, L. Gradient leakage attack resilient deep learning. IEEE Trans. Inf. Forensics Secur. 2021, 17, 303–316.
56. Hallaji, E.; Razavi-Far, R.; Saif, M.; Herrera-Viedma, E. Label noise analysis meets adversarial training: A defense against label poisoning in federated learning. Knowl. Based Syst. 2023, 266, 110384.
57. Li, T.; Yang, C.; Yang, Q.; Lan, S.; Zhou, S.; Luo, X.; Huang, H.; Zheng, Z. Metaopera: A cross-Metaverse interoperability protocol. IEEE Wirel. Commun. 2023, 30, 136–143.
58. Zaman, S.; Dantu, R.; Badruddoja, S.; Talapuru, S.; Upadhyay, K. Layerwise interoperability in Metaverse: Key to next-generation electronic commerce. In Proceedings of the 2023 IEEE International Conference on Metaverse Computing, Networking and Applications (MetaCom), Kyoto, Japan, 26–28 June 2023; pp. 9–16.
59. Jung, H. Design of an Architecture for Interoperability between heterogeneous Metaverse platforms. In Proceedings of the 2023 14th International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 11–14 October 2023; pp. 1500–1503.

Figure 1. The basic framework of the Metaverse.

Figure 2. The framework of semantic communication (SC) enabling the Metaverse.

Figure 3. The proposed federated semantic communication framework for the Metaverse.

Figure 4. PSNR comparison of different semantic communication schemes with different compression ratios.

Figure 5. Reconstruction of test-set images for the Fed-AE model based on the MNIST and KMNIST datasets with different compression ratios.

Figure 6. Classification accuracy of the Fed-AE model for the MNIST and KMNIST test sets at different compression ratios.

Figure 7. The upper left is the original image, the lower left is the image generated by our model (the compression rate is 0.055), and the images from the upper right to the bottom were generated through JPEG compression at compression rates of 0.15, 0.20, and 0.30.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
