Article

Research into Robust Federated Learning Methods Driven by Heterogeneity Awareness

Junhui Song, Zhangqi Zheng, Afei Li, Zhixin Xia and Yongshan Liu
1 School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
2 School of Mathematics and Information Technology, Hebei Normal University of Science & Technology, Qinhuangdao 066004, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(14), 7843; https://doi.org/10.3390/app15147843
Submission received: 16 June 2025 / Revised: 10 July 2025 / Accepted: 11 July 2025 / Published: 13 July 2025
(This article belongs to the Special Issue Cyber-Physical Systems Security: Challenges and Approaches)

Abstract

Federated learning (FL) has emerged as a prominent distributed machine learning paradigm that facilitates collaborative model training across multiple clients while ensuring data privacy. Despite its growing adoption in practical applications, performance degradation caused by data heterogeneity—commonly referred to as the non-independent and identically distributed (non-IID) nature of client data—remains a fundamental challenge. To mitigate this issue, a heterogeneity-aware and robust FL framework is proposed to enhance model generalization and stability under non-IID conditions. The proposed approach introduces two key innovations. First, a heterogeneity quantification mechanism is designed based on statistical feature distributions, enabling the effective measurement of inter-client data discrepancies. This metric is further employed to guide the model aggregation process through a heterogeneity-aware weighted strategy. Second, a multi-loss optimization scheme is formulated, integrating classification loss, heterogeneity loss, feature center alignment, and L2 regularization for improved robustness against distributional shifts during local training. Comprehensive experiments are conducted on four benchmark datasets, including CIFAR-10, SVHN, MNIST, and NotMNIST under Dirichlet-based heterogeneity settings (alpha = 0.1 and alpha = 0.5). The results demonstrate that the proposed method consistently outperforms baseline approaches such as FedAvg, FedProx, FedSAM, and FedMOON. Notably, an accuracy improvement of approximately 4.19% over FedSAM is observed on CIFAR-10 (alpha = 0.5), and a 1.82% gain over FedMOON on SVHN (alpha = 0.1), along with stable enhancements on MNIST and NotMNIST. Furthermore, ablation studies confirm the contribution and necessity of each component in addressing data heterogeneity.

1. Introduction

1.1. Related Work

The proliferation of distributed computing paradigms, such as the Internet of Things (IoT) and edge computing, has resulted in an unprecedented volume of data being generated and stored locally on end devices. The efficient utilization of this distributed data, while ensuring the preservation of privacy, has emerged as a critical challenge in intelligent computing. Federated learning (FL) has been proposed as a privacy-preserving distributed learning framework, where local clients perform model training without transmitting raw data to a central server. This paradigm has shown promising applications in domains such as healthcare, smart transportation, and industrial IoT [1,2,3,4].
Despite its practical success, FL remains fundamentally constrained by the inherent heterogeneity of client data—commonly referred to as the non-independent and identically distributed (non-IID) problem. Variations in data distributions, feature representations, and label spaces across clients often result in divergent local model updates, severely impairing the generalization ability of the global model [5,6,7,8,9].
Recent efforts to mitigate this issue can be broadly categorized into three methodological streams:
- Data Preprocessing-Based Approaches: Clustering-based techniques have been employed to group clients with similar data characteristics and assign customized models accordingly. Approaches such as Iterative Federated Class Averaging (IFCA), Federated Semi-Supervised (FedSEM), and Fine-Tuning and Clustering (FPFC) cluster clients based on model weights or task similarities, while Federated Model-agnostic Meta-learning with Optimized Objective Normalization (FedMOON) leverages contrastive learning to align feature spaces [10,11,12,13,14].
- Model Robustness-Oriented Strategies: These strategies incorporate regularization techniques or meta-learning paradigms for enhanced robustness against distribution shifts during local training. Federated Proximal (FedProx) introduces a proximal term to constrain local updates, while Federated Contrastive (FedCL) and other personalized FL methods optimize local solutions tailored to individual clients [15,16,17,18].
- Framework-Level Enhancements: Instead of exchanging model weights, these methods transmit distilled “knowledge” to alleviate divergence under non-IID settings. Notable examples include Federated Model Distillation (FedMD), the Federated Generative Network (FedGEN), and Federated Knowledge Distillation (FedKD), which use public or synthetic datasets to facilitate the distillation of knowledge [19,20,21].

1.2. Challenges

While the approaches above have made progress, three core challenges persist. First, there is a lack of quantitative heterogeneity modeling: existing methods often adopt simplistic metrics, such as class imbalance or label distribution, and fail to model the statistical and structural heterogeneity of the feature space, which limits the guidance available to optimization. Second, aggregation responsiveness is inadequate: mainstream aggregation algorithms such as Federated Averaging (FedAvg) apply static or sample-size-based weighting schemes that overlook the severity of client heterogeneity, which can lead to conflicting parameter updates and degraded performance. Third, there is a strategy–distribution mismatch: although regularization and personalization improve robustness, many methods inadequately integrate local adaptation with global learning, and this issue is exacerbated under highly heterogeneous or dynamic conditions, where convergence instability and reduced generalization remain prevalent.
To address these limitations, a heterogeneity-aware federated optimization framework is proposed. The contributions of this work are summarized as follows:
- Heterogeneity Metric Design: A statistical heterogeneity measurement approach is developed by extracting the local mean and variance from latent feature distributions. A differentiable heterogeneity function is integrated into both training and aggregation to effectively quantify client-level discrepancies.
- Multi-Loss Optimization Mechanism: A composite loss function is formulated by combining cross-entropy loss with heterogeneity-aware loss, feature center alignment, and L2 regularization, thereby enhancing convergence and generalization in non-IID environments.
- Heterogeneity-Aware Weighted Aggregation: A dynamic aggregation strategy is implemented on the server to adjust client contributions based on heterogeneity levels, reducing the negative impact of high-variance clients and improving global model fidelity.
- Empirical Validation Across Non-IID Scenarios: Extensive experiments conducted on datasets such as CIFAR-10 under Dirichlet-based heterogeneity (alpha = 0.1 or 0.5) validate the superiority of the proposed method in terms of accuracy, convergence stability, and interpretability. The results indicate enhanced global performance, with client adaptability maintained.

2. Preliminary Knowledge

Federated Learning

Federated learning (FL) is a distributed machine learning paradigm that facilitates collaborative model training across multiple clients while preserving data privacy, as depicted in Figure 1. In practice, however, client data are typically non-independent and identically distributed (non-IID), resulting in divergent local updates and degraded global performance. This heterogeneity poses significant challenges to model convergence and generalization.
To mitigate the impact of data heterogeneity, a wide range of optimization techniques have been proposed, which can be broadly categorized into three groups: model-level alignment, loss function compensation, and aggregation strategy enhancement.
To reduce local model drift, FedProx introduces an L2-based proximal term that constrains local updates [22]. Federated Curvature Regularization (FedCurv) employs the Fisher information matrix to guide regularization along important parameter directions [23]. Similarly, federated sharpness-aware minimization (FedSAM) improves robustness by incorporating neighborhood and global perturbations into a sharpness-aware minimization, making the model more globally aware while maintaining local consistency [24]. Federated Aggregation-Free (FedAF) further reduces client drift by allowing clients to collaboratively learn using condensed data and soft labels, addressing data heterogeneity and improving global model performance while avoiding the problems caused by traditional aggregation [25]. Although these approaches improve the robustness to some extent, they typically lack an explicit modeling or quantification of inter-client heterogeneity.
Several recent works focus on aligning client representations rather than model parameters. Federated Representation Learning (FedRep) decouples the model into a shared feature extractor and a personalized classifier to enhance its generalizability. Federated Model Aggregation (FedMA) performs layer-wise matching to aggregate models with heterogeneous architectures. FedMOON incorporates contrastive learning to ensure local representation consistency across federated rounds, reinforcing client-specific feature learning [26,27,28]. Personalized Federated Learning with Model-Contrastive Learning (PFL-MCL) enhances multi-modal user modeling in Metaverse environments, addressing data and model heterogeneity to improve communication efficiency and networking in human-centric applications [29]. However, most of these methods do not directly measure or model data heterogeneity, limiting their theoretical tractability and optimization adaptability.
Some studies attempt to explicitly account for heterogeneity during aggregation. FedNova addresses the bias introduced by unequal local update steps via normalized aggregation [30]. On the other hand, Federated Domain Shift Eraser (FDSE) mitigates domain shifts in federated learning by deskewing features with a Domain-agnostic Feature Extractor (DFE) and Domain-specific Skew Eraser (DSE), enhancing models’ generalization across diverse client distributions [31]. Additionally, Hierarchical Split Federated Learning (HSFL) optimizes federated learning by splitting models and aggregating them efficiently, enhancing performance under multi-tier systems, especially for resource-constrained edge devices [32]. Nevertheless, these methods largely operate at the aggregation level and do not capture feature-level distributional discrepancies.
To address these limitations, this work proposes a differentiable heterogeneity measurement approach based on the statistical characterization of client feature distributions. This heterogeneity metric is integrated with a multi-loss optimization framework and a heterogeneity-aware aggregation strategy, aiming to enhance models’ adaptability and stability in non-IID environments while ensuring interpretability and generalization.

3. Heterogeneity-Aware Federated Optimization Framework

To effectively mitigate the performance degradation and training instability caused by non-independent and identically distributed (non-IID) data across clients in federated learning (FL), this work proposes a heterogeneity-aware federated optimization framework. The proposed method systematically enhances FL by improving feature modeling, loss function design, and the aggregation strategy, thereby preserving privacy while significantly improving adaptability to heterogeneous environments.
This section provides a comprehensive description of the proposed framework, focusing on four key components: the overall architecture, the heterogeneity quantification mechanism, the multi-loss design, and the aggregation strategy.

3.1. Overall Framework of Heterogeneity-Aware Federated Optimization

3.1.1. Modeling

In this part, we model the problem of federated optimization under non-IID data conditions. The goal is to quantify and incorporate the heterogeneity across clients into the federated learning framework. Our model consists of three main components, as illustrated in Figure 2.
(A) A Global Feature Extractor: This component models the local data distributions across clients and extracts consistent embeddings for each client.
(B) A Latent Information Extractor: We estimate the distributional characteristics (mean and covariance) of the local feature space to quantify the heterogeneity among clients.
(C) A Heterogeneity-Weighted Aggregation Strategy: The heterogeneity metric from each client influences the weight assigned during aggregation, allowing the global model to focus more on stable clients.
The integrated model supports both local training and global aggregation to ensure robustness under heterogeneous conditions.

3.1.2. Algorithm Implementation

On the client side, a Global Feature Extractor is employed to perform the deep modeling of local data, generating unified embedding features with a consistent dimensionality across clients. These features provide a structural foundation for the subsequent global model aggregation. Algorithm 1 is an algorithmic implementation of the framework of heterogeneity-aware federated optimization.
Algorithm 1. Framework of Heterogeneity-Aware Federated Optimization
Inputs:
- Global model parameters: θ^0
- Client datasets: {D_1, D_2, ..., D_K}
Output:
- Optimized global model: θ^T

Client-Side Operations (Parallel Execution):
1. Global Feature Extraction
   For each batch B ⊆ D_k:
      F_k = f_θ(x_k) for x_k ∈ B
   → Output: embedding matrix F_k ∈ R^(B×D)
2. Latent Information Extraction
   a. Compute feature statistics μ_k and σ_k (Equation (1))
   b. Calculate heterogeneity metrics
      - Variance-based metric: H_k^var (Equation (2))
      - Manifold-based metric: H_k^mani (Equations (3) and (4))
   c. Unified heterogeneity score H_k (Equation (5))
3. Multi-Loss Optimization
   a. Composite loss function L_total (Equation (8))
   b. Local model update:
      θ_k^(t+1) = θ_k^t − η ∇_θ L_total
4. Transmit to Server
   Send the tuple (θ_k^(t+1), H_k, n_k), where n_k = |D_k|

Server-Side Operations:
1. Aggregation Initialization
   For round t = 1 to T:
   - Receive client tuples (θ_k^(t+1), H_k, n_k)
2. Heterogeneity-Weighted Aggregation
   a. Compute effective weights ω_k (Equation (9))
   b. Update global model:
      θ^(t+1) = (Σ_{k=1}^{K} ω_k θ_k^(t+1)) / (Σ_{k=1}^{K} ω_k)
3. Model Distribution
   Broadcast θ^(t+1) to all clients

Termination:
Return the final global model θ^T after T rounds
Next, the Latent Information Extractor statistically estimates the distributional characteristics of the local feature space by computing the sample mean and covariance matrix. Based on these statistics, a heterogeneity metric is introduced to measure the deviation of local data distributions from the global distribution. Specifically, the determinant of the covariance matrix is adopted as the heterogeneity indicator. This metric is incorporated into the training process as a regularization term, alongside center loss and weight regularization, to guide local model optimization.
On the server side, after collecting model parameters and heterogeneity indicators from all the clients, a Heterogeneity-Weighted Aggregation Strategy is employed. This strategy adaptively adjusts each client’s aggregation weight based on both the local sample size and the computed heterogeneity level. Clients with a high heterogeneity are assigned lower weights to reduce their negative impact on the global model. The concept of effective weight is introduced to ensure both an efficient model updating and the improved stability and generalization of the aggregated model.
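For concreteness, the round structure of Algorithm 1 can be expressed as a short Python driver. The sketch below is illustrative only: local_update and aggregate are hypothetical placeholders for the client-side multi-loss training of Section 3.3 and the server-side weighting of Section 3.4, and they are passed in as callables rather than being the authors' implementation.

```python
def run_federated_training(global_params, clients, local_update, aggregate, rounds=150):
    """High-level driver mirroring Algorithm 1 (sketch, not the authors' code).

    local_update(client, params) -> (new_params, H_k, n_k)   # client-side steps 1-4
    aggregate(updates)           -> new global parameters    # server-side step 2
    Both callables are hypothetical placeholders for the procedures described
    in Sections 3.3 and 3.4.
    """
    for _ in range(rounds):
        # Clients train in parallel in a real deployment; sequential here for clarity.
        updates = [local_update(client, global_params) for client in clients]
        # Each update is the tuple (theta_k, H_k, n_k) transmitted to the server.
        global_params = aggregate(updates)
        # The new global model is then broadcast back to all clients.
    return global_params
```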

3.1.3. Simulation Procedure

The simulation procedure is designed to comprehensively capture the process of training and aggregating federated models, with consideration of data heterogeneity. The following steps outline the procedure:
  • Local Model Training:
    - Each client trains a local model using its own dataset.
    - Heterogeneity Metrics: for each client, heterogeneity metrics are computed based on
      - Feature Statistics: local feature embeddings are extracted, and statistical measures such as the mean and variance are calculated.
      - Manifold Structures: high-dimensional features are projected into a low-dimensional manifold space using techniques like t-SNE, capturing geometric discrepancies in the latent space.
  • Model and Metric Sharing:
    - At the end of each federated round, each client shares with the server
      - the updated model parameters, and
      - the computed heterogeneity metrics (based on feature statistics and manifold structures).
  • Global Model Update:
    - The server receives the model parameters and heterogeneity metrics from all the participating clients.
    - Heterogeneity-Aware Aggregation: the server uses the received heterogeneity metrics to compute the aggregation weights for each client. Clients with a higher heterogeneity are assigned lower weights to mitigate their negative impact on the global model.
    - The server then aggregates the model parameters from all the clients based on the computed weights, updating the global model.
  • Model Synchronization:
    - After updating the global model, the server sends the updated model parameters back to the clients for the next round of local training.
This simulation procedure ensures that the federated learning process takes into account the heterogeneity present across the clients and adjusts the aggregation strategy accordingly.

3.1.4. Verification Method

To validate the effectiveness of the proposed framework, we conducted extensive experiments comparing our heterogeneity-aware optimization with standard federated learning approaches. We used the following verification methods:
Visualizing Feature Distributions: After training, we projected the feature distributions of the clients into a 2D space (using PCA and t-SNE) and compared the heterogeneity scores.
Quantifying Performance: We evaluated the performance of the global model on standard benchmark datasets such as CIFAR-10 and MNIST under non-IID partitions. We compared our framework’s convergence and accuracy to those of traditional methods such as FedAvg and FedProx.
This framework enhances the resilience of federated learning to non-IID data while preserving privacy. It is highly applicable in real-world federated learning environments such as cross-device or cross-institutional applications.
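The visualization step can be reproduced with standard tooling. The following sketch assumes scikit-learn and Matplotlib are available and uses a hypothetical client_data collection of (features, labels, score) tuples; it illustrates the Figure 3-style check rather than the exact plotting code used in the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

def plot_client_features(features, labels, heterogeneity_score, ax):
    """Project one client's embeddings to 2D with PCA and annotate its
    heterogeneity score (illustrative sketch of the Figure 3-style check)."""
    coords = PCA(n_components=2).fit_transform(features)
    ax.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=8)
    ax.set_title(f"H = {heterogeneity_score:.2f}")

# Example usage (client_data is a hypothetical list of (F, y, h) tuples):
# fig, axes = plt.subplots(1, 4, figsize=(16, 4))
# for ax, (F, y, h) in zip(axes, client_data):
#     plot_client_features(F, y, h, ax)
# plt.show()
```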

3.2. Mechanisms for Measuring Heterogeneity

In federated learning, the presence of non-independent and identically distributed (non-IID) data across clients poses significant challenges, such as a slow convergence and degraded model performance. To systematically quantify distributional differences among clients, we propose a multi-scale heterogeneity quantification mechanism that integrates local feature statistics with a manifold structural analysis. This dual-perspective metric supports both the modeling of heterogeneity and the subsequent optimization strategies.

3.2.1. Local Heterogeneity Based on Feature Statistics

After completing a local forward pass, each client extracts feature embeddings for a batch of data using the feature extractor f(x), resulting in an embedding matrix F ∈ R^(B×D), where B denotes the batch size and D the feature dimensionality. We first compute the mean μ_k and variance σ_k of the batch embeddings as follows:
$\mu_k = \frac{1}{B}\sum_{i=1}^{B} f_i, \qquad \sigma_k = \frac{1}{B}\sum_{i=1}^{B} (f_i - \mu_k)^2$ (1)
Next, incorporating a center shift term, we define a weighted variance-based heterogeneity metric.
$H_k^{\mathrm{var}} = \frac{1}{D}\sum_{j=1}^{D} (\sigma_{k,j} + \delta) + \gamma \lVert \mu_k - c \rVert_2^2$ (2)
where δ is a small constant added to prevent instability due to zero variance, γ is a weighting factor controlling the influence of center deviation, and c represents the client’s current feature center, which may be historical or predefined as a global reference.
This metric jointly considers the dispersion (variance) and deviation (mean shift) of feature distributions, effectively capturing the overall variation trend of the client’s local data.
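A minimal NumPy sketch of Equations (1) and (2) is given below; the default values of δ and γ are illustrative placeholders rather than the paper's tuned settings.

```python
import numpy as np

def variance_heterogeneity(F, center, delta=1e-6, gamma=1.0):
    """Variance-based heterogeneity of a batch of embeddings (Equations (1)-(2)).

    F: (B, D) embedding matrix from the feature extractor.
    center: (D,) reference feature center c (historical or global).
    delta and gamma are the stabilizer and center-shift weight; the defaults
    are illustrative assumptions.
    """
    mu_k = F.mean(axis=0)                                 # Equation (1): batch mean
    sigma_k = F.var(axis=0)                               # Equation (1): per-dimension variance
    dispersion = (sigma_k + delta).mean()                 # (1/D) * sum_j (sigma_kj + delta)
    center_shift = gamma * np.sum((mu_k - center) ** 2)   # gamma * ||mu_k - c||_2^2
    return dispersion + center_shift                      # Equation (2): H_k^var
```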

3.2.2. High-Order Heterogeneity Modeling via Manifold Structure

To further capture geometric discrepancies in the latent space, we introduce a manifold-based heterogeneity metric as a higher-order measure. Specifically, t-distributed Stochastic Neighbor Embedding (t-SNE) is employed to project the high-dimensional embeddings F into a 2D manifold space:
$F_{\mathrm{low}} = \mathrm{tSNE}(F), \qquad c_{\mathrm{low}} = \frac{1}{B}\sum_{i=1}^{B} F_{\mathrm{low},i}$ (3)
We then compute the average squared distance from each low-dimensional point to the center to assess the degree of divergence in the manifold:
$H_k^{\mathrm{mani}} = \frac{1}{B}\sum_{i=1}^{B} \lVert F_{\mathrm{low},i} - c_{\mathrm{low}} \rVert_2^2$ (4)
This term reflects the tendency of samples’ dispersion or concentration within the manifold space, thus capturing heterogeneity under nonlinear data distributions.
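Assuming scikit-learn's t-SNE implementation, Equations (3) and (4) can be sketched as follows; the perplexity value is an illustrative choice and must remain smaller than the batch size. The closing comment shows how the unified score of Equation (5) would combine the two terms.

```python
import numpy as np
from sklearn.manifold import TSNE

def manifold_heterogeneity(F, perplexity=30.0, random_state=0):
    """Manifold-based heterogeneity (Equations (3)-(4)): project embeddings to a
    2D manifold with t-SNE and measure dispersion around the low-dimensional
    center. Sketch only; perplexity must stay below the batch size."""
    F_low = TSNE(n_components=2, perplexity=perplexity,
                 random_state=random_state).fit_transform(F)   # Equation (3)
    c_low = F_low.mean(axis=0)                                  # low-dimensional center
    return np.mean(np.sum((F_low - c_low) ** 2, axis=1))        # Equation (4)

# Unified score (Equation (5)): H_k = H_var + zeta * H_mani, e.g.
# H_k = variance_heterogeneity(F, center) + zeta * manifold_heterogeneity(F)
```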

3.2.3. Unified Heterogeneity Metric

Finally, we integrate the above two components into a unified heterogeneity metric:
$H_k = H_k^{\mathrm{var}} + \zeta H_k^{\mathrm{mani}}$ (5)
where ζ is a weighting factor that adjusts the influence of the manifold-based component. This fused metric balances computational efficiency with a comprehensive modeling of both the statistical and geometric aspects of heterogeneity, providing a solid foundation for heterogeneity-aware optimization.
To further validate the effectiveness of the proposed heterogeneity quantification mechanism, we selected four representative clients and visualized their feature distributions and corresponding heterogeneity scores, as illustrated in Figure 3. In each subfigure, samples are projected into a two-dimensional space using Principal Component Analysis (PCA), with colors indicating class labels. The heterogeneity score of each client is displayed in the title.
As shown in Figure 3, the samples of Client A and Client B come from multiple categories with relatively uniform distributions, and their heterogeneity scores are only 28.05 and 29.18, reflecting comparatively balanced, near-IID characteristics. In contrast, the samples of Client C and Client D are strongly biased towards a single category or a small number of categories, with one class absolutely dominant; the corresponding heterogeneity scores reach 68.47 and 59.75, respectively, indicating a significant class imbalance and a pronounced distribution deviation.
These results demonstrate that the proposed heterogeneity metric captures not only class imbalance but also the clustering or dispersion patterns of samples in the feature space. By reflecting both the statistical and geometric heterogeneity across clients, the quantification mechanism offers strong interpretability and discriminative capability, providing a solid foundation for the subsequent heterogeneity-aware optimization.

3.3. Multi-Loss Design

To enhance model generalization and convergence stability in non-IID data environments, a multi-loss function is designed for the client training stage. This design responds to the client distribution disparities revealed by the heterogeneity quantification mechanism described above. By integrating the heterogeneity scores into the training objective, the model can perceive and adaptively adjust to various heterogeneity factors, thereby coordinating heterogeneity modeling with the optimization strategy. The overall loss function consists of four components: classification loss, heterogeneity loss, a regularization term, and a feature center constraint, as detailed below:
- Classification Loss (Cross-Entropy Loss). As the fundamental component in supervised learning, the classification loss evaluates the model’s prediction capability on the input data. It adopts the cross-entropy form to ensure the model learns discriminative features corresponding to local task labels.
- Heterogeneity Loss. Based on the multi-scale fusion heterogeneity quantification mechanism proposed in Section 3.2, a corresponding heterogeneity loss term is introduced at the client-side training stage. This loss transforms the heterogeneity measurement into an optimizable training signal, guiding the model to suppress unstructured feature diffusion and structural drift. Specifically, this loss comprises two parts: the first is a statistical metric based on feature variance, utilizing per-dimension variance and the center deviation of intermediate embedding features to reflect the spatial dispersion of client features; the second is a nonlinear manifold-based metric, where t-SNE is applied to map features into a low-dimensional manifold space, quantifying their structural complexity and distributional divergence.
- Feature Center Constraint. Considering the semantic feature drift among heterogeneous clients, a feature center constraint is introduced to enhance the consistency of feature representations. Specifically, in each training round, the mean vector of the embedded features from the current batch is computed, and the Euclidean distance between each sample’s feature and this mean is measured. This term encourages a more compact distribution of features within the semantic space, reduces the disruptive effect of feature dispersion on training stability, and improves the model’s discriminative capability after aggregation. It is defined as
$L_{\mathrm{center}} = \frac{1}{B}\sum_{i=1}^{B} \lVert f_i - \mu_k \rVert_2^2$ (6)
- Regularization Term (L2 Regularization). To prevent the model from overfitting to local data distributions, an L2 regularization term is imposed on the model parameters as a structural constraint. By penalizing the L2 norm of all network weight parameters, this term reduces the model’s complexity and enhances its generalization capability during global aggregation. This is particularly crucial in non-IID settings, as it mitigates local performance shifts caused by overfitting. Formally,
$R_{L2} = \sum_{i} \theta_i^2$ (7)
Finally, the total loss function optimized during client training is formulated as
$L_{\mathrm{total}} = L_{\mathrm{sup}} + \alpha H_k + \lambda R_{L2} + \beta L_{\mathrm{center}}$ (8)
where α, λ, and β are the tuning coefficients for the heterogeneity loss, regularization term, and feature center constraint, respectively.
This multi-loss design not only improves the model’s representation ability on local data but also enables effective modeling and adaptation to inter-client distribution differences under the guidance of the heterogeneity quantification mechanism. By introducing heterogeneity-aware losses based on feature statistics, the model dynamically perceives and responds to non-IID characteristics, thereby guiding the local model convergence toward the global distribution during training. This design provides a unified and stable feature representation foundation for subsequent heterogeneity-aware aggregation.
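A compact PyTorch-style sketch of Equation (8) is shown below; the coefficient values and the way the heterogeneity score h_k is supplied are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def total_loss(logits, targets, embeddings, model, h_k,
               alpha=0.1, lam=1e-4, beta=0.01):
    """Composite client objective (Equation (8)); a sketch assuming PyTorch.

    h_k is the (precomputed or differentiable) heterogeneity score for the
    current batch; alpha, lam, and beta are illustrative coefficients.
    """
    # Classification loss (cross-entropy).
    l_sup = F.cross_entropy(logits, targets)

    # Feature center constraint (Equation (6)): pull embeddings toward the
    # batch mean to keep the local feature space compact.
    mu = embeddings.mean(dim=0, keepdim=True)
    l_center = ((embeddings - mu) ** 2).sum(dim=1).mean()

    # L2 regularization over all model parameters (Equation (7)).
    r_l2 = sum((p ** 2).sum() for p in model.parameters())

    # Equation (8): L_total = L_sup + alpha * H_k + lambda * R_L2 + beta * L_center
    return l_sup + alpha * h_k + lam * r_l2 + beta * l_center
```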

3.4. Heterogeneity-Aware Weighted Aggregation Strategy

Based on the heterogeneity quantification mechanism proposed in Section 3.2, each client can compute a heterogeneity metric (HM) after local training in each round by leveraging the statistical information of intermediate feature representations. This metric reflects the dispersion of the client’s data distribution in the feature space. The heterogeneity metric not only serves as a part of local training constraints through the loss function but is further utilized to optimize the global model aggregation strategy.
During the aggregation phase, we propose a heterogeneity-aware weighted aggregation strategy, which integrates both the local dataset size and the heterogeneity information of each client. Specifically, for client k, the aggregation weight of its model parameters is determined by its local sample size n_k and the heterogeneity metric HM_k, calculated as follows:
$\omega_k = \frac{n_k}{1 + HM_k}$ (9)
The core idea of this mechanism is that clients with more stable and internally consistent data distributions are more reliable and thus should receive a greater aggregation weight. Conversely, clients with highly dispersed data and unstable training dynamics should have their influence on the global model suppressed to improve the overall convergence and generalization of the global model. This strategy maintains the communication efficiency of federated learning while dynamically modeling the quality of client data, thereby enhancing the robustness of the federated system under heterogeneous environments.
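Concretely, Equation (9) and the normalized global update can be sketched in NumPy as follows, assuming each client's parameters arrive as a list of per-layer arrays; this is an illustrative sketch rather than the authors' implementation.

```python
import numpy as np

def aggregate(updates):
    """Heterogeneity-weighted aggregation (Equation (9)); illustrative sketch.

    updates: list of (params, H_k, n_k) tuples, where params is a list of
    per-layer numpy arrays, H_k the heterogeneity metric, and n_k the sample count.
    """
    # Effective weight: proportional to data size, damped by heterogeneity.
    raw = np.array([n_k / (1.0 + h_k) for _, h_k, n_k in updates])
    weights = raw / raw.sum()

    num_layers = len(updates[0][0])
    new_global = []
    for layer in range(num_layers):
        stacked = np.stack([params[layer] for params, _, _ in updates])  # (K, ...)
        new_global.append(np.tensordot(weights, stacked, axes=1))        # weighted mean
    return new_global
```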
This section has presented a comprehensive solution under the framework of heterogeneity-aware federated optimization, aiming to balance non-IID sensitivity and federated learning performance. Starting from client-side feature modeling, a heterogeneity quantification mechanism was established using statistical characteristics (e.g., mean and variance) to measure the divergence in client data distributions. Building upon this, a multi-loss function was introduced that integrates cross-entropy loss, heterogeneity loss, center constraint, and L2 regularization, enabling the local training process to adaptively respond to data heterogeneity. Furthermore, a heterogeneity-weighted global aggregation strategy was proposed, where each client’s heterogeneity metric (HM) serves as a weighting factor to regulate the model fusion process. This effectively mitigates the performance fluctuations caused by distribution inconsistencies among clients. The overall framework significantly enhances the training stability and generalization ability of federated learning in heterogeneous environments, while preserving privacy and communication efficiency.

3.5. Summary

In this section, we introduced a heterogeneity-aware federated optimization framework that effectively addresses the challenges posed by non-IID data in federated learning. The framework consists of a comprehensive solution, including feature modeling, heterogeneity quantification, multi-loss design, and a heterogeneity-weighted aggregation strategy. These components work synergistically to improve the stability and generalization of the federated model while preserving privacy and communication efficiency.

4. Experiment

4.1. Experimental Setup

4.1.1. Dataset Settings

Four widely used image classification datasets are selected to ensure evaluations across diverse data characteristics and label structures. Table 1 summarizes the key attributes of these datasets.

4.1.2. Non-IID Simulation

To simulate non-IID scenarios, we partition the original datasets using a Dirichlet distribution. Two concentration parameters, alpha = 0.1 and alpha = 0.5, are used to simulate different degrees of data heterogeneity. Specifically, alpha = 0.1 represents a highly heterogeneous setting (where a few classes dominate certain clients), while alpha = 0.5 indicates a moderately heterogeneous environment. The resulting distributions for each dataset are illustrated in Figure 4.
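A common way to realize such a Dirichlet-based label-skew partition is sketched below with NumPy; the helper name and its defaults are illustrative, and the exact partitioning code used in the paper may differ.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.5, seed=0):
    """Partition sample indices across clients with a per-class Dirichlet(alpha)
    prior, a standard simulation of label-skew non-IID data (sketch only).

    labels: 1-D array of class labels for the whole training set.
    Returns a list of index arrays, one per client.
    """
    rng = np.random.default_rng(seed)
    num_classes = int(labels.max()) + 1
    client_indices = [[] for _ in range(num_clients)]
    for c in range(num_classes):
        idx_c = np.where(labels == c)[0]
        rng.shuffle(idx_c)
        # Proportion of class c assigned to each client; smaller alpha -> more skew.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        splits = (np.cumsum(proportions)[:-1] * len(idx_c)).astype(int)
        for client_id, part in enumerate(np.split(idx_c, splits)):
            client_indices[client_id].extend(part.tolist())
    return [np.array(idx) for idx in client_indices]
```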

4.2. Implementation Details

This section provides a detailed overview of the experimental setup, including the federated learning framework, model architecture, optimization settings, and client–server interaction. All experiments in this work are conducted using the Flower federated learning framework, which supports a flexible and extensible environment for federated learning experiments.
1. Federated Learning Framework:
   - Flower Framework: We implement the proposed heterogeneity-aware optimization method using the Flower framework, which allows for the easy integration of federated learning components, including client–server communication, model aggregation, and distributed training.
   - Client-Side Setup: Each client is assigned a local, non-IID dataset to simulate realistic federated learning conditions. Clients independently perform local model training on their datasets and periodically upload their model updates and heterogeneity metrics to the server.
2. Model Architecture:
   - A Convolutional Neural Network (CNN) with a simple and lightweight architecture is employed across all clients for a fair comparison of methods. Specifically, the model consists of three convolutional layers, followed by a fully connected layer that outputs the class predictions. This architecture ensures that the focus remains on the aggregation strategy and heterogeneity-aware optimization rather than on model complexity. The layer details are as follows:
     - Convolutional Layers: The model uses three convolutional layers with ReLU activations and max-pooling after each convolution.
     - Fully Connected Layer: A fully connected layer outputs the final class prediction, followed by a softmax activation to generate probabilities for classification tasks.
   This choice of architecture is designed to be lightweight, ensuring the experiments are computationally feasible, while maintaining sufficient complexity to test the core ideas of the proposed method; a minimal sketch of such a model is given below.
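The following PyTorch sketch illustrates such a model; the channel widths and the assumed 3 × 32 × 32 input (CIFAR-10/SVHN-sized) are placeholders, since the paper does not specify them, and the input dimensions would change for 28 × 28 datasets such as MNIST.

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Lightweight client model: three conv blocks (ReLU + max-pooling) followed
    by one fully connected classification layer. Channel sizes and the 3x32x32
    input are illustrative assumptions, not the paper's exact configuration."""
    def __init__(self, num_classes=10, in_channels=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 32x32 -> 16x16
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 16x16 -> 8x8
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # 8x8 -> 4x4
        )
        self.classifier = nn.Linear(128 * 4 * 4, num_classes)

    def forward(self, x):
        emb = self.features(x).flatten(1)   # embeddings used for H_k and L_center
        return self.classifier(emb), emb    # logits (softmax is applied in the loss)
```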
3. Federated Learning Procedure:
   - Federated Rounds: The system operates over a series of federated rounds. In each round, all clients perform local training on their datasets, calculate their heterogeneity metrics (based on feature statistics and manifold structures), and upload both the model parameters and heterogeneity scores to the server.
   - Server-Side Aggregation: The server aggregates the model parameters from all the clients using the heterogeneity-aware weighted aggregation strategy described in Section 3.4. This aggregation method adjusts the contribution of each client’s update based on its local heterogeneity, reducing the impact of clients with highly imbalanced data distributions.
   - Global Model Update: After receiving updates from all the clients, the server combines the model parameters using the weighted strategy, updating the global model. The updated global model is then sent back to the clients for the next round of local training.
4. Client Participation:
   - All Clients Participate in Every Round: In each federated round, all clients participate in the model update process. This ensures that the global model benefits from diverse data distributions and that the model learns from all the clients’ datasets over time.
   - Communication Efficiency: The communication overhead is minimized by uploading only the model parameters and heterogeneity metrics, rather than the entire dataset. This ensures the privacy of client data while enabling effective federated learning.
5. Baselines:
   For comparison, FedAvg is used as the baseline aggregation method. FedAvg is a widely used approach for federated learning that aggregates local model updates by averaging, without accounting for client heterogeneity. The performance of the proposed method is compared against FedAvg to evaluate the improvements achieved by the heterogeneity-aware optimization framework.
This detailed setup allows us to rigorously assess the impact of the proposed heterogeneity-aware method on the model’s performance, robustness, and convergence under real-world federated learning conditions.
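As an illustration of how the client side could be wired into Flower, the sketch below assumes the NumPyClient interface and uses hypothetical helpers (get_weights, set_weights, train_one_round, compute_heterogeneity) for the steps described above. Reporting the heterogeneity score through the metrics dictionary would allow a custom server strategy (for example, one derived from Flower's built-in FedAvg strategy) to apply the weighting of Section 3.4.

```python
import flwr as fl

class HeterogeneityAwareClient(fl.client.NumPyClient):
    """Sketch of a Flower client that reports its heterogeneity score to the
    server alongside the updated weights. get_weights, set_weights,
    train_one_round, and compute_heterogeneity are hypothetical helpers."""

    def __init__(self, model, train_loader):
        self.model = model
        self.train_loader = train_loader

    def get_parameters(self, config):
        return get_weights(self.model)

    def fit(self, parameters, config):
        set_weights(self.model, parameters)
        train_one_round(self.model, self.train_loader)       # multi-loss local update
        h_k = compute_heterogeneity(self.model, self.train_loader)
        num_examples = len(self.train_loader.dataset)
        # The heterogeneity score travels in the metrics dict so a custom
        # server strategy can use it when weighting the aggregation.
        return get_weights(self.model), num_examples, {"heterogeneity": float(h_k)}
```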

4.3. Comparative Experiment

To verify the effectiveness of the proposed method under non-IID heterogeneous conditions, four representative federated learning approaches are selected for comparison.
- FedAvg: A basic federated averaging algorithm that does not consider data heterogeneity among clients.
- FedProx: Alleviates the model drift caused by heterogeneity by incorporating an additional regularization term in the local optimization process.
- FedSAM: Introduces gradient perturbation during optimization to enhance models’ robustness against data fluctuations and distributional differences.
- FedMOON: Enhances client feature consistency through contrastive learning, effectively combating client drift.
On top of these, the proposed method introduces a heterogeneity-aware mechanism. By integrating a composite loss function to model data distribution characteristics and adopting a heterogeneity-weighted aggregation strategy, it significantly improves the stability and generalization of the global model in heterogeneous scenarios.
Table 2 summarizes the comparative performance under moderate heterogeneity (Dirichlet α = 0.5), and Table 3 summarizes the results under high heterogeneity (Dirichlet α = 0.1). The proposed heterogeneity-aware method (HAD) consistently achieves the best performance across all four datasets, validating its effectiveness and robustness under non-IID conditions.
When the heterogeneity is moderate (alpha = 0.5), the proposed method achieves accuracy scores of 67.69% (CIFAR-10), 93.9% (SVHN), 98.62% (MNIST), and 93.73% (NotMNIST), outperforming the traditional FedAvg and FedProx and in some cases surpassing FedSAM and FedMOON, demonstrating superior generalization.
Under highly heterogeneous conditions (alpha = 0.1), the method maintains its leading performance, especially on MNIST and NotMNIST, with accuracies of 97.41% and 92.86%, respectively, significantly outperforming the other methods. This strong performance stems from the proposed heterogeneity-aware mechanism and optimization design.
First, the HAD method introduces a heterogeneity metric based on feature statistics, which accurately identifies and quantifies distribution differences across clients. Building on this, the heterogeneity-weighted aggregation strategy adaptively regulates the influence of each client’s update, reducing the convergence disturbance caused by data inconsistency.
Second, HAD enhances the training objective by incorporating feature center constraints, heterogeneity loss, and regularization terms, enabling the model to balance diversity modeling and overfitting mitigation when dealing with non-IID data.
As shown in Figure 5 and Figure 6, which compare the training loss curves, throughout the 150 training rounds on each dataset, the proposed method consistently maintains the lowest loss curve, demonstrating a faster convergence and superior stability. Whether in Figure 5 (alpha = 0.5) or Figure 6 (alpha = 0.1), FedAvg shows a slow loss reduction and a relatively high final loss value, while FedProx performs slightly better but still suffers from considerable fluctuations.
In contrast, FedSAM and FedMOON converge significantly faster, with notably lower loss values. The proposed method further optimizes this trend by maintaining the lowest loss throughout training. On the MNIST dataset, although all the methods converge relatively quickly, FedProx exhibits a severe oscillation, and FedAvg shows a limited convergence, while the proposed method achieves a rapid early loss reduction and stabilizes at the lowest point, indicating a stronger convergence efficiency and training stability.
These results clearly demonstrate that the proposed heterogeneity-aware strategy and multi-objective loss design can significantly improve training quality and convergence performance under non-IID data distributions. The experimental results confirm that our method outperforms all baselines across all datasets and heterogeneity configurations, showing particularly strong advantages in the highly heterogeneous setting (alpha = 0.1) and highlighting its robust adaptability to non-IID environments.

4.4. Ablation Experiment

To evaluate the contribution of each component to the overall performance, the following ablation settings are designed:
  • Method1 (without heterogeneity loss): removes the heterogeneity-aware loss mechanism.
  • Method2 (without center constraint): removes the feature center constraint term.
  • Method3 (without weighted aggregation): replaces the heterogeneity-weighted aggregation with standard averaging.
  • Method4 (full method): the complete method with all modules included.
Table 4 summarizes the ablation study outcomes. It is evident that each module within the proposed heterogeneity-aware framework plays a significant role in improving the model’s performance. The complete method (Method4) achieves the best results in both accuracy (53.74%) and loss (1.3683), demonstrating the synergistic effect of the designed mechanisms in handling non-IID scenarios.
Method1, which removes the heterogeneity loss component, performs the worst, with an accuracy of only 47.63% and the highest loss of 1.6466. This highlights the importance of modeling and perceiving client feature distributions as a key factor for ensuring models’ stability.
In Method2, the removal of the feature center constraint leads to a drop in accuracy to 52.14%, indicating that feature alignment helps mitigate local feature drift and enhances the generalization ability of the model.
Method3 substitutes the heterogeneity-weighted aggregation with a simple averaging strategy, resulting in a decreased accuracy of 51.31%. This demonstrates that incorporating client heterogeneity as a weighting factor during aggregation can more effectively coordinate local model updates and improve the global model’s integration.
In conclusion, each of the three mechanisms is indispensable. Together, they form a robust framework that supports a high performance in highly heterogeneous federated learning environments.

4.5. Practical Application Expansion and Critical Analysis

4.5.1. Practical Application Scenarios

In summary, extensive experiments conducted on several benchmark datasets—CIFAR-10, SVHN, MNIST, and NotMNIST—demonstrate the effectiveness and robustness of the proposed heterogeneity-aware federated learning framework. In comparative experiments, the proposed method consistently outperforms mainstream federated learning approaches such as FedAvg, FedProx, FedSAM, and FedMOON under varying degrees of data heterogeneity, achieving a higher classification accuracy and lower training loss across multiple datasets. These results highlight its superior generalization capability and convergence efficiency.
- Medical imaging combined diagnosis: In a cross-hospital brain MRI collaboration scenario, the heterogeneity perception mechanism of HAD can coordinate the distribution differences between institutions through feature statistics and manifold analysis, effectively overcoming the model bias caused by FedAvg’s ignorance of data heterogeneity. Compared with FedProx, which relies on a single regularization constraint, the multi-scale modeling of HAD addresses feature space shift more comprehensively. Compared with the contrastive learning mechanism of FedMOON, which requires a frequent exchange of feature vectors, its weighted aggregation significantly reduces the communication burden and provides an efficient solution for distributed medical diagnosis.
- Anomaly detection in the industrial IoT: Faced with differing operating conditions in multi-factory equipment monitoring, the feature center constraint of HAD maintains cross-domain feature consistency and avoids the amplification effect of FedSAM’s gradient perturbation on sensor noise. Its heterogeneity-weighted aggregation mechanism accurately screens effective client updates, significantly alleviates the convergence delay caused by ineffective nodes, and provides stable support for industrial equipment condition monitoring in environments with strong distribution differences.
- Financial cross-domain risk control modeling: In inter-bank anti-fraud cooperation, the manifold heterogeneity measure of HAD captures regional nonlinear patterns in depth and goes beyond the limitations of FedMOON in complex distribution modeling. By fusing the feature center constraint and the multi-loss design, the modeling stability of the proposed method is noticeably better than that of FedSAM’s gradient perturbation strategy; it realizes cross-regional collaborative perception of fraud features and provides robust distribution modeling for financial risk control scenarios.

4.5.2. Methodological Critical Review

The core breakthrough of the HAD framework lies in its fusion of feature statistics and manifold structure analysis to construct a multi-scale heterogeneity perception mechanism. Compared with the local optimization idea of FedProx, which relies on a single regularization constraint, HAD describes the essential differences in client distributions through the collaborative calculation of a variance-based measure and manifold divergence. Compared with FedSAM’s strategy of introducing gradient perturbation to improve robustness, HAD’s feature center constraint provides more stable control of training fluctuations. This multi-dimensional modeling paradigm effectively addresses the low efficiency of FedMOON’s contrastive learning in complex nonlinear distribution scenarios and establishes a new methodological benchmark for cross-domain federated learning.
However, at the application adaptation level, the framework still faces three key challenges. The extra overhead introduced by its manifold projection makes its computational efficiency significantly lower than that of FedAvg’s minimalist architecture in edge device scenarios. The regularization mechanism of FedProx shows better resilience when facing highly divergent sample distributions (such as the extreme difference between normal tissue and malignant tumors in medical images). In addition, the need to fine-tune the heterogeneity weight parameters raises the deployment threshold, whereas FedSAM only requires the gradient perturbation amplitude to be set, making it easier to engineer. These limitations essentially stem from the theoretical complexity of the combined statistical–manifold model, which poses a new trade-off between lightweight design and robustness.

5. Conclusions

To address the pervasive non-IID data distribution problem in federated learning, this paper proposes a heterogeneity-aware robust federated learning method. The method introduces heterogeneity quantification and perception mechanisms at both the client training and server aggregation stages to enhance the generalization ability and convergence stability of the global model. Specifically, during local training on clients, a composite loss function is designed, incorporating classification loss, heterogeneity loss, L2 regularization, and feature center constraint, enabling structured optimization of the learning process. On the server side, a heterogeneity score is computed based on the clients’ feature distributions, and a corresponding weighted aggregation strategy is employed to effectively reduce the impact of highly heterogeneous clients on the global model. The proposed method is extensively validated on CIFAR-10, SVHN, MNIST, and NotMNIST under both mild (α = 0.5) and severe (α = 0.1) heterogeneity settings. Compared to existing classical methods, the proposed approach consistently achieves a superior performance across different datasets and heterogeneity levels. Additionally, ablation experiments confirm the critical roles of the heterogeneity loss, feature center constraint, and weighted aggregation in the overall framework.
These findings demonstrate that the proposed method offers significant improvements in robustness, generalization, and adaptability, outperforming existing mainstream approaches across a range of tasks and heterogeneity scenarios. The proposed framework provides both theoretical insight and practical guidance for building robust federated learning systems in real-world non-IID environments.
Future research directions could focus on several promising areas. First, the integration of more advanced heterogeneity metrics and optimization techniques could be explored to further enhance the convergence speed and model performance, particularly in highly heterogeneous environments. Second, the framework could be extended to more complex data distributions, such as multimodal data or heterogeneous label spaces, to improve the generalization in multi-task federated learning scenarios. Additionally, applying this framework to large-scale federated learning systems with real-world constraints, such as a limited communication bandwidth and heterogeneous device capabilities, presents an exciting direction. Exploring techniques such as decentralized learning, federated learning with differential privacy, or federated transfer learning could also provide valuable insights for improving both the security and performance of federated systems.

Author Contributions

The authors confirm their contributions to this paper as follows: study conception and design: J.S. and Z.Z.; data collection: J.S.; analysis and interpretation of results: A.L. and Z.X.; draft manuscript preparation: J.S. and Z.Z.; funding acquisition: Y.L.; manuscript review: Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China, grant number 61972334.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Jarwan, A.; Ibnkahla, M. Edge-based federated deep reinforcement learning for IoT traffic management. IEEE Internet Things J. 2022, 10, 3799–3813. [Google Scholar] [CrossRef]
  2. Deng, W.; Chen, X.; Li, X.; Zhao, H. Adaptive federated learning with negative inner product aggregation. IEEE Internet Things J. 2024, 11, 6570–6581. [Google Scholar] [CrossRef]
  3. Deng, W.; Li, K.; Zhao, H. A flight arrival time prediction method based on cluster clustering-based modular with deep neural network. IEEE Trans. Intell. Transp. Syst. 2023, 25, 6238–6247. [Google Scholar] [CrossRef]
  4. Li, X.; Zhao, H.; Deng, W. BFOD: Blockchain-based privacy protection and security sharing scheme of flight operation data. IEEE Internet Things J. 2023, 11, 3392–3401. [Google Scholar] [CrossRef]
  5. Hu, F.; Zhou, W.; Liao, K.; Li, H.; Tong, D. Toward federated learning models resistant to adversarial attacks. IEEE Internet Things J. 2023, 10, 16917–16930. [Google Scholar] [CrossRef]
  6. Yan, Z.; Yang, H.; Guo, D.; Lin, Y. Improving airport arrival flow prediction considering heterogeneous and dynamic network dependencies. Inf. Fusion 2023, 100, 101924. [Google Scholar] [CrossRef]
  7. Wang, S.; Luo, X.; Qian, Y.; Zhu, Y.; Chen, K.; Chen, Q.; Xin, B.; Yang, W. Shuffle differential private data aggregation for random population. IEEE Trans. Parallel Distrib. Syst. 2023, 34, 1667–1681. [Google Scholar] [CrossRef]
  8. Chang, Y.; Zhang, K.; Gong, J.; Qian, H. Privacy-preserving federated learning via functional encryption, revisited. IEEE Trans. Inf. Forensics Secur. 2023, 18, 1855–1869. [Google Scholar] [CrossRef]
  9. Zhang, L.; Xu, J.; Vijayakumar, P.; Sharma, P.K.; Ghosh, U. Homomorphic encryption-based privacy-preserving federated learning in IoT-enabled healthcare system. IEEE Trans. Netw. Sci. Eng. 2022, 10, 2864–2880. [Google Scholar] [CrossRef]
  10. Yu, X.; Liu, Z.; Sun, Y.; Wang, W. Clustered federated learning for heterogeneous data (student abstract). In Proceedings of the 37th AAAI Conference on Artificial Intelligence, Virtual, USA, 7–14 February 2023; pp. 16378–16379. [Google Scholar]
  11. Ruan, Y.; Joe-Wong, C. Fedsoft: Soft clustered federated learning with proximal local updating. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, Virtual, Canada, 22 February–1 March 2022; pp. 8124–8131. [Google Scholar]
  12. Diao, Y.; Li, Q.; He, B. Towards addressing label skews in one-shot federated learning. In Proceedings of the 11th International Conference on Learning Representations (ICLR), Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
  13. Nagalapatti, L.; Mittal, R.S.; Narayanam, R. Is your data relevant?: Dynamic selection of relevant data for federated learning. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, Virtual, Canada, 22 February–1 March 2022; pp. 7859–7867. [Google Scholar]
  14. Dai, Y.; Chen, Z.; Li, J.; Heinecke, S.; Sun, L.; Xu, R. Tackling data heterogeneity in federated learning with class prototypes. In Proceedings of the 37th AAAI Conference on Artificial Intelligence, Virtual, USA, 7–14 February 2023; pp. 7314–7322. [Google Scholar]
  15. Zhang, F.; Li, Y.; Lin, S.; Shao, Y.; Jiang, J.; Liu, X. Large sparse kernels for federated learning. In Proceedings of the ICLR, Kigali, Rwanda, 31 May 2023; Available online: https://openreview.net/forum?id=ZCv4E1unfJP (accessed on 5 June 2024).
  16. Duan, J.H.; Li, W.; Lu, S. FedDNA: Federated learning with decoupled normalization-layer aggregation for non-IID data. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Bilbao, Spain, 13–17 September 2021; pp. 722–737. [Google Scholar]
  17. Zhang, X.; Hong, M.; Dhople, S.; Yin, W.; Liu, Y. Fedpd: A federated learning framework with adaptivity to non-iid data. IEEE Trans. Signal Process. 2021, 69, 6055–6070. [Google Scholar] [CrossRef]
  18. Vahidian, S.; Morafah, M.; Lin, B. Personalized federated learning by structured and unstructured pruning under data heterogeneity. In Proceedings of the 41st IEEE International Conference on Distributed Computing Systems Workshops, Washington, DC, USA, 7–10 July 2021; pp. 27–34. [Google Scholar]
  19. Zhu, Z.; Hong, J.; Zhou, J. Data-free knowledge distillation for heterogeneous federated learning. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 12878–12889. [Google Scholar]
  20. Wu, C.; Wu, F.; Lyu, L.; Huang, Y.; Xie, X. Communication-efficient federated learning via knowledge distillation. Nat. Commun. 2022, 13, 2032. [Google Scholar] [CrossRef] [PubMed]
  21. Zhao, J.; Zhu, X.; Wang, J.; Xiao, J. Efficient client contribution evaluation for horizontal federated learning. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Toronto, ON, Canada, 6–11 June 2021; pp. 3060–3064. [Google Scholar]
  22. Yuan, X.; Li, P. On convergence of FedProx: Local dissimilarity invariant bounds, non-smoothness and beyond. In Proceedings of the 36th Conference on Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022; pp. 10752–10765. [Google Scholar]
  23. Abdelmoniem, A.M.; Ho, C.Y.; Papageorgiou, P.; Canini, M. Empirical analysis of federated learning in heterogeneous environments. In Proceedings of the 2nd European Workshop on Machine Learning and Systems, Rennes, France, 5 April 2022; pp. 1–9. [Google Scholar]
  24. Li, B.; Peng, Z.; Li, Y.; Xu, M.; Chen, S.; Ji, B.; Shen, C. Neighborhood and Global Perturbations Supported SAM in Federated Learning: From Local Tweaks To Global Awareness. arXiv, 2024; arXiv:2408.14144. [Google Scholar]
  25. Wang, Y.; Fu, H.; Kanagavelu, R.; Wei, Q.; Liu, Y.; Goh, R.S.M. An Aggregation-free Federated Learning for Tackling Data Heterogeneity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 26233–26242. [Google Scholar]
  26. Husnoo, M.A.; Anwar, A.; Hosseinzadeh, N.; Islam, S.N.; Mahmood, A.N.; Doss, R. FedRep: Towards horizontal federated load forecasting for retail energy providers. In Proceedings of the IEEE Asia-Pacific Power and Energy Engineering Conference, Melbourne, Australia, 20–23 November 2022; pp. 1–6. [Google Scholar]
  27. Ek, S.; Portet, F.; Lalanda, P.; Vega, G. Evaluation of federated learning aggregation algorithms: Application to human activity recognition. In Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing, Virtual, 12–17 September 2020; pp. 638–643. [Google Scholar]
  28. Guo, Z.; Liu, A.; Dong, R. Research on Optical Mineral Image Recognition Based on Federated Learning. In Proceedings of the International Conference on Next Generation Data-Driven Networks, Xi’an, China, 15–17 March 2024; pp. 364–369. [Google Scholar]
  29. Zhou, X.; Yang, Q.; Zheng, X.; Liang, W.; Wang, K.I.-K.; Ma, J.; Pan, Y.; Jin, Q. Personalized Federated Learning with Model-Contrastive Learning for Multi-modal User Modeling in Human-centric Metaverse. IEEE J. Sel. Areas Commun. 2024, 42, 817–831. [Google Scholar] [CrossRef]
  30. Wang, J.; Liu, Q.; Liang, H.; Joshi, G.; Poor, H.V. Tackling the objective inconsistency problem in heterogeneous federated optimization. In Proceedings of the 34th Conference on Neural Information Processing Systems, Virtual, Canada, 6–12 December 2020; pp. 7611–7623. [Google Scholar]
  31. Wang, Z.; Wang, Z.; Fan, X.; Wang, C. Federated Learning with Domain Shift Eraser. In Proceedings of the Computer Vision and Pattern Recognition Conference, Nashville, TN, USA, 11–15 June 2025; pp. 4978–4987. [Google Scholar]
  32. Lin, Z.; Wei, W.; Chen, Z.; Lam, C.-T.; Chen, X.; Gao, Y.; Luo, J. Hierarchical Split Federated Learning: Convergence Analysis and System Optimization. IEEE Trans. Mob. Comput. 2025, 1–16. [Google Scholar] [CrossRef]
Figure 1. Federated learning training process.
Figure 2. Overall framework of heterogeneity-aware federated optimization.
Figure 3. Feature distribution and heterogeneity score of four representative clients after PCA projection.
Figure 4. Data distribution plot, with alpha = 0.5 above and alpha = 0.1 below.
Figure 5. Loss comparison plot on different datasets (alpha = 0.5).
Figure 6. Loss comparison plot on different datasets (alpha = 0.1).
Table 1. Overview of datasets used in experiments.
| Dataset Name | Number of Classes | Image Count | Image Size | Train/Test Split | Sample Type | Size |
| CIFAR-10 | 10 | 60,000 | 32 × 32 | 50,000/10,000 | Natural images | 177 MB |
| SVHN Cropped | 10 | 99,289 | 32 × 32 | 73,257/26,032 | Street view digits | 1.32 GB |
| MNIST | 10 | 70,000 | 28 × 28 | 60,000/10,000 | Handwritten digits | 11.6 MB |
| NotMNIST | 10 | 145,000 | 28 × 28 | 120,000/25,000 | Alphabet images | 25.2 MB |
Table 2. Performance comparison under heterogeneous settings (alpha = 0.5). Values are Accuracy/Precision/Recall (%).
| Method | CIFAR-10 (Acc/Prec/Rec) | SVHN (Acc/Prec/Rec) | MNIST (Acc/Prec/Rec) | NotMNIST (Acc/Prec/Rec) |
| FedAvg | 55.67/55.03/56.23 | 88.07/88.78/87.54 | 96.71/96.16/97.43 | 89.95/89.20/90.43 |
| FedProx | 56.44/56.77/55.89 | 76.76/77.26/76.04 | 93.51/93.15/94.14 | 86.25/86.56/85.79 |
| FedSAM | 63.50/64.22/62.78 | 91.82/91.40/92.37 | 95.77/95.38/96.28 | 91.79/91.09/92.31 |
| FedMOON | 66.54/66.19/67.33 | 92.71/93.04/92.21 | 98.28/98.88/97.91 | 92.44/92.00/93.01 |
| FedHAD | 67.69/67.01/68.32 | 93.90/94.48/93.22 | 98.62/98.94/98.91 | 93.73/94.30/93.10 |
Table 3. Performance comparison under heterogeneous settings (alpha = 0.1). Values are Accuracy/Precision/Recall (%).
| Method | CIFAR-10 (Acc/Prec/Rec) | SVHN (Acc/Prec/Rec) | MNIST (Acc/Prec/Rec) | NotMNIST (Acc/Prec/Rec) |
| FedAvg | 47.63/47.12/48.13 | 69.65/69.08/70.30 | 89.16/88.95/89.89 | 82.33/82.91/81.59 |
| FedProx | 47.05/46.65/47.52 | 46.73/47.47/46.17 | 67.58/66.13/66.95 | 61.07/60.60/61.94 |
| FedSAM | 45.94/45.83/46.57 | 76.81/77.34/76.11 | 87.67/87.87/87.41 | 88.67/88.90/88.21 |
| FedMOON | 50.64/51.42/51.03 | 83.37/83.01/83.45 | 96.30/96.71/95.91 | 92.39/92.89/93.59 |
| FedHAD | 52.71/52.03/53.33 | 85.19/84.36/85.60 | 97.41/97.98/96.82 | 92.86/92.39/93.74 |
Table 4. Ablation study of key components.
| Method | Description | Heterogeneity Loss | Center Constraint | Weighted Aggregation | ACC (%) | Loss |
| Method1 | Without heterogeneity loss | ✗ | ✓ | ✓ | 47.63 | 1.6466 |
| Method2 | Without feature center constraint | ✓ | ✗ | ✓ | 52.14 | 1.3859 |
| Method3 | Replacing weighted aggregation with simple mean | ✓ | ✓ | ✗ | 51.31 | 1.4412 |
| Method4 | Full method (all modules included) | ✓ | ✓ | ✓ | 53.74 | 1.3683 |
“✗” indicates the absence or non-use of the corresponding method/feature. “✓” indicates the presence or use of the corresponding method/feature.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
