1. Introduction
The rapid development of the big data era has highlighted the significant advantages of machine learning in numerous domains, giving rise to a plethora of intelligent applications [1]. However, traditional centralized machine learning suffers from a fatal flaw: data are highly centralized, creating serious privacy risks. In real-world applications, owing to factors such as market competition and management strategies [2], participating users (groups or individuals) are reluctant to share their data out of concern over privacy risks, which leads to the problem of data silos. To address this crucial issue, Federated Learning (FL) [3] has emerged as a highly promising solution. Its main innovation lies in providing a distributed machine-learning framework with privacy-preserving characteristics, enabling thousands of participants to collaboratively train a specific machine-learning model in a distributed manner. Because the training data remain stored locally with the participants throughout the federated learning process, this mechanism allows participants to share the benefits of joint training while ensuring privacy protection for each participant [4]. To this day, the improvement and innovation of federated learning frameworks remain a research hotspot in the field of machine learning.
The basic workflow of federated learning is illustrated in Figure 1 and mainly consists of the following steps: (1) participants download the initialized global model from the cloud server, train it on their local datasets, and generate the latest local model updates (i.e., model parameters); (2) the cloud server collects the local update parameters and updates the global model using a model-averaging algorithm. These two steps repeat until the global model converges. Despite this progress in privacy protection, federated learning still faces numerous security and privacy issues; among them, heterogeneity and backdoor attacks are particularly acute.
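As a concrete illustration of this two-step loop, the following minimal sketch implements a FedAvg-style round in PyTorch; the helper names (clients, local_train) are illustrative assumptions rather than the paper's implementation:

```python
import copy
import torch

def fedavg_round(global_model, clients, local_train):
    """One communication round: clients train locally, the server averages."""
    local_states = []
    for client_data in clients:
        # Step (1): each participant downloads the current global model
        local_model = copy.deepcopy(global_model)
        local_train(local_model, client_data)  # local training epochs
        local_states.append(local_model.state_dict())
    # Step (2): the server averages the collected parameters
    avg_state = copy.deepcopy(local_states[0])
    for key in avg_state:
        avg_state[key] = torch.stack(
            [state[key].float() for state in local_states]).mean(dim=0)
    global_model.load_state_dict(avg_state)
    return global_model
```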
Federated learning heterogeneity refers to the differences among participants in federated learning and can be summarized in three aspects: data heterogeneity (DH), objective heterogeneity (OH), and model heterogeneity (MH). Data heterogeneity refers to variations in data characteristics, types, or scales among participants. Each participant may possess data from different domains, such as medical, financial, or image data [5], which can exhibit distinct data distributions, feature representations, and labels. Objective heterogeneity refers to differences in learning objectives or tasks among participants, which may include classification, regression, clustering, or other tasks. Model heterogeneity refers to the use of different machine-learning models or architectures among participants, such as neural networks, decision trees, or support vector machines. Classical federated learning aims to train a universally applicable global model, which may overfit local data and lose personalized features [6]. Federated learning therefore requires suitable protocols and algorithms to handle these differences.
Backdoor attacks are not only easy to execute [7] but also highly effective, making them a subject of significant concern in federated learning security research. A backdoor attack is an attempt by an attacker to insert malicious backdoors or traps into a federated learning model. Because federated learning is a distributed approach in which multiple participants train the model together without sharing raw data, the attacker may be one of the participants or someone attempting to infiltrate them. The objective of a backdoor attack is to implant malicious functionality into the model so that it behaves normally on benign inputs but executes malicious operations when a specific backdoor trigger is present. Examples of triggers include a single pixel [8] or a black-and-white checkerboard [7]. Attackers can mount backdoor attacks by manipulating training data, model parameters, or update rules during the participants' local model updates. Once a backdoor is successfully implanted, the attacker can exploit the trigger to perform unauthorized operations or obtain sensitive information. Preventing backdoor attacks requires a series of security measures in federated learning, including data privacy protection, participant verification, secure model aggregation mechanisms, anomaly detection [9], and robustness enhancements.
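As a concrete example of trigger injection, the sketch below embeds a white pixel-block trigger and relabels the samples with the attacker's target class, mirroring the pixel-block attack used later in Section 4.1; the patch placement and tensor layout are illustrative assumptions:

```python
import torch

def poison_batch(images, labels, target_label=1, patch_size=5):
    """Embed a white pixel-block trigger and flip labels to the target class.

    images: float tensor of shape (batch, channels, H, W) in [0, 1].
    The bottom-right patch location is an illustrative choice.
    """
    poisoned = images.clone()
    poisoned[:, :, -patch_size:, -patch_size:] = 1.0  # white square trigger
    poisoned_labels = torch.full_like(labels, target_label)
    return poisoned, poisoned_labels
```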
In this work, we propose a novel federated learning paradigm called Federated Mutual Distillation Learning (FMDL). FMDL is a federated learning strategy that addresses the challenges of federated heterogeneity and backdoor attacks, guided by federated mutual learning and knowledge distillation [10,11]. FMDL views federated learning as a transfer-learning process between the global model and local models and uses deep mutual learning [12] for local updates. This approach satisfies the universal requirements of global updates while preserving local, personalized requirements. Moreover, under the fine-grained guidance of a teacher model, the student model (i.e., the local model) in FMDL maintains high accuracy even under various backdoor attacks.
We summarize our main contributions as follows:
We propose a dual-model local update for federated learning, in which a meme model performs knowledge transfer between the global model and local models, while a personalized model serves as the client's private model for its own data and tasks. The two models engage in deep mutual learning to address the three types of heterogeneity and meet clients' personalized model requirements.
We construct a clean teacher model based on knowledge distillation to guide the training of the student model. The teacher model is fine-tuned on a small, clean data subset to defend against various types of backdoor attacks. This approach reduces the success rate of backdoor attacks to near random guessing without significant performance degradation, effectively ensuring privacy and security.
To visualize defense performance, we utilize attention maps as an evaluation criterion and define a distillation loss based on the attention maps of the teacher and student models.
We conduct experiments on multiple benchmark datasets to validate the effectiveness of the FMDL method in addressing heterogeneity issues and its security against backdoor attacks.
The remainder of this paper is organized as follows. Section 2 introduces related work. Section 3 describes the proposed federated mutual distillation learning method. Section 4 evaluates and analyzes the experimental results. Finally, Section 5 concludes this paper.
3. Federated Mutual Distillation Learning
To address the three heterogeneity issues in federated learning while maintaining model personalization and improving overall performance, we introduce knowledge distillation in the local update phase. Unlike the typical teacher-student structure in knowledge distillation, a federated learning system has no well-trained teacher model paired with an untrained student model; however, knowledge distillation can still transfer knowledge between two models with different architectures. We therefore deviate from the traditional one-way knowledge transfer of the "teacher-student model" and employ deep mutual learning for local model updates in federated learning. This allows the central server to obtain a generalized global model while enabling each client to train a private, personalized model tailored to its specific data and task requirements, resulting in a win-win situation. As shown in Figure 2, we design two model structures within each client: (1) a local model that receives the global model for local training and updates, and (2) a private model designed by the client for its specific needs. The two models engage in continuous mutual learning.
Additionally, we report our experimental parameters in Table 1.
3.1. Classical Distillation Methods
The fundamental idea behind knowledge distillation is to transfer the “knowledge” of the teacher model to the student model by having the student model learn the teacher model’s predicted outcomes. Through knowledge distillation, the student model can acquire additional information from the teacher model, including relationships between categories, decision boundaries, and data distributions. As a result, the student model can maintain relatively high accuracy while having a smaller model size and faster inference speed. The loss function for the student model can be simplified as follows:
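A standard form of this simplified objective, following the classical distillation literature (the notation is ours: $z_s$ and $z_t$ are the student and teacher logits, $\sigma$ the softmax, $T$ the distillation temperature, and $\alpha$ a balancing weight), is:

$$\mathcal{L}_{student} = \alpha\,\mathcal{L}_{CE}\big(y,\sigma(z_s)\big) + (1-\alpha)\,T^{2}\,\mathrm{KL}\Big(\sigma\big(z_t/T\big)\,\Big\|\,\sigma\big(z_s/T\big)\Big)$$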
3.2. Federated Mutual Learning
In traditional federated learning, each participant (or client) trains a local model on its own data and shares only model updates with the central server. The server aggregates these updates into a global model, which is then distributed back to the participants; collaboration is thus limited to the exchange of model updates. With respect to objective heterogeneity, typical federated learning focuses only on the central server's objective and overlooks the clients' need for personalized models. Moreover, under significant data heterogeneity, the performance of the globally trained model may be far from ideal; if the clients' data are fragmented and dispersed, the model may even fail to converge after many iterations. We therefore introduce federated mutual learning, which leverages the collective intelligence and diverse perspectives of participants to enhance the learning process, improve model performance, and overcome the limitations of individual participants' data.
During the training process of federated mutual learning, the initial model is still distributed by the central server and used as the local model for the first iteration of local updates. Simultaneously, each client customizes an independent private model (which may be similar to or different from the local model). Both models are trained on local data. Unlike normal local updates, each client does not simply train a replica of the global model; instead, the local model and the private model engage in several rounds of deep mutual learning to achieve better performance than independent training. The detailed process is illustrated in Figure 3. During this process, knowledge is transferred bidirectionally: as the local model receives updates from the global model, it migrates knowledge from the central server to the private model, while the private model feeds back the client's personalized features. Finally, the client sends the trained local model to the server, which selects and aggregates the local models to update the global model in the federated aggregation step, preparing for the next round of federated training. This process repeats until the model converges. Both models train on the same dataset with the objective of reaching consistent predictions. The entire process is summarized in Algorithm 1. Throughout training, we redefine the classical knowledge distillation loss as follows:
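One plausible form of this redefined loss, following deep mutual learning [12] with a KL-divergence term in each direction (our notation: $p_l$ and $p_p$ are the softmax outputs of the local and private models), is:

$$\mathcal{L}_{local} = \mathcal{L}_{CE}(y, p_l) + \mathrm{KL}(p_p\,\|\,p_l), \qquad \mathcal{L}_{private} = \mathcal{L}_{CE}(y, p_p) + \mathrm{KL}(p_l\,\|\,p_p)$$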
Algorithm 1: Federated Mutual Learning
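Since the algorithm box itself is not reproduced here, the following sketch reconstructs the FMDL local update from the description above; all identifiers and the per-batch update order are our assumptions:

```python
import torch
import torch.nn.functional as F

def mutual_step(model_a, model_b, x, y, opt_a):
    """Update model_a with CE loss plus KL toward model_b's predictions."""
    logits_a = model_a(x)
    with torch.no_grad():                      # peer predictions are targets
        target = F.softmax(model_b(x), dim=1)
    loss = (F.cross_entropy(logits_a, y)
            + F.kl_div(F.log_softmax(logits_a, dim=1), target,
                       reduction='batchmean'))
    opt_a.zero_grad(); loss.backward(); opt_a.step()
    return loss.item()

def local_update(local_model, private_model, loader, opt_l, opt_p, epochs=5):
    """FMDL-style local update: rounds of deep mutual learning."""
    for _ in range(epochs):
        for x, y in loader:
            mutual_step(local_model, private_model, x, y, opt_l)   # local learns
            mutual_step(private_model, local_model, x, y, opt_p)   # private learns
    return local_model  # only the local model is uploaded to the server
```

Note that only the local model is returned for aggregation; the private model never leaves the client, which is what preserves personalization.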
3.3. Attention Distillation Defense Methods
Attention maps are commonly used in deep learning to visualize the regions of the input that a model focuses on. Attention mechanisms are employed in sequence-data tasks such as natural language processing as well as in computer vision. The mechanism computes a weight for each input element, allowing the model to concentrate on the most relevant or important elements, and the attention map provides a visual representation of this weight allocation. Given a deep neural network model M, we define an attention operator $\mathcal{A}$ that maps activation maps to an attention representation, i.e., it flattens the 3D activation maps into a 2D tensor along the channel dimension. Attention maps play a crucial role in successful knowledge distillation. There are three common forms of attention algorithms, as shown below:
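In the standard activation-based formulation (with $F \in \mathbb{R}^{C \times H \times W}$ an activation tensor, $F_i$ its $i$-th channel slice, and $p \ge 1$), the three forms are likely:

$$\mathcal{A}_{\mathrm{sum}}(F)=\sum_{i=1}^{C}\lvert F_i\rvert,\qquad \mathcal{A}_{\mathrm{sum}}^{p}(F)=\sum_{i=1}^{C}\lvert F_i\rvert^{p},\qquad \mathcal{A}_{\mathrm{max}}^{p}(F)=\max_{i=1,\dots,C}\lvert F_i\rvert^{p}$$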
Since federated learning has no direct counterpart to the teacher and student models of knowledge distillation, we replace the "teacher-student relationship" with the view that the local model and private model are two different student models. They engage in mutual learning and distillation, forming a "student-student relationship". Because the local model continuously participates in federated learning iterations, it may carry potential backdoors. The process of erasing backdoor triggers through attention distillation is illustrated in Figure 4. It is therefore necessary to add an attention loss. Attention distillation couples the local model and the private model through a neural attention extraction process: attention representations are computed after each residual block, and the attention distillation loss is defined over the attention representations of both models:
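A common definition, which we reconstruct under our own notation (summing over the $K$ residual blocks, with $F_l^k$ and $F_p^k$ the block-$k$ activations of the local and private models), compares $\ell_2$-normalized attention representations:

$$\mathcal{L}_{AT}=\sum_{k=1}^{K}\left\lVert \frac{\mathcal{A}\!\left(F_l^{k}\right)}{\bigl\lVert \mathcal{A}\!\left(F_l^{k}\right)\bigr\rVert_{2}}-\frac{\mathcal{A}\!\left(F_p^{k}\right)}{\bigl\lVert \mathcal{A}\!\left(F_p^{k}\right)\bigr\rVert_{2}}\right\rVert_{2}$$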
Therefore, the overall training loss can be expanded as:
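A composition consistent with the preceding definitions, with the weighting scheme being our assumption ($\alpha$ balances the mutual distillation term of Section 3.2 and $\beta$ is the distillation parameter tuned in Section 4.1), would be:

$$\mathcal{L}_{total}=\mathcal{L}_{CE}+\alpha\,\mathcal{L}_{KD}+\beta\,\mathcal{L}_{AT}$$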
4. Experimental Evaluations
In this section, we conducted comparative experiments between the proposed Federated Mutual Distillation Learning (FMDL) method and the traditional Federated Learning (FL) method. The performance of FMDL was evaluated on three commonly used image classification datasets, and we conducted experiments specifically targeting the three types of heterogeneity. In terms of backdoor defense, we compared FMDL with three existing erasure-based defense methods under six common backdoor attacks. Moreover, we clarified the criteria for selecting the form of attention map representation.
4.1. Experimental Setup
We utilized three datasets, namely MNIST, CIFAR-10, and CIFAR-100, for federated learning training, as shown in Table 2. MNIST contains 70,000 handwritten digit images in 10 classes (0–9), with 60,000 training and 10,000 test samples; in all experiments, MNIST samples were normalized to 28 × 28 pixels. For the pixel-block backdoor attack, the attacker embeds a 5 × 5 pixel block in the samples and assigns them the attacker's target label "1". In the watermarking backdoor attack experiment, the attacker added the watermark "1" with different watermarking factors to some real samples and set their label to "1". CIFAR-10 contains 60,000 color images in 10 categories (such as "airplane", "car", and "bird"), with 50,000 training and 10,000 test samples; during preprocessing, the images are normalized to 32 × 32 three-channel inputs. For the attribute backdoor attack experiment, images of the CIFAR-10 "car" class with specific attributes, namely "cars with stripes", "cars next to striped walls", and "green cars", were selected as the attributes that trigger the backdoor. CIFAR-100 has the same total number of images as CIFAR-10 but 100 classes, with 500 training and 100 test images per class.

For the models, we employed a Multilayer Perceptron (MLP), whose weights and biases were updated with the backpropagation algorithm to minimize the loss function and classify the data after training, and a Convolutional Neural Network (CNN) that extracts features by generating convolutional feature maps with 3 × 3 kernels. The CNN applies ReLU activation to two convolutional layers (the first with 6 channels and the second with 16 channels, each followed by 2 × 2 max pooling), with linear and softmax layers for the output. The optimizer was Stochastic Gradient Descent (SGD) with momentum = 0.9, weight decay = 5 × 10⁻⁴, and batch size = 128.
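The described CNN matches a LeNet-style layout; a minimal PyTorch sketch under that assumption follows (the flattened feature size and learning rate are our inferences, not stated in the text):

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Two conv layers (6 and 16 channels, 3x3 kernels), each followed by
    ReLU and 2x2 max pooling, then a linear classifier. Sized here for
    32x32x3 CIFAR inputs; the flattened size 16*6*6 is our inference."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(16 * 6 * 6, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))  # softmax applied in the loss

# Optimizer configuration as reported in the text
# (lr is not stated; 0.01 is an illustrative value).
model = SmallCNN()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
```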
The selection of the distillation parameter β is crucial for clearing the backdoor. Intuitively, a larger β defends more effectively against backdoors; however, increasing β arbitrarily may degrade the performance of the method. Based on the scaling experiments in [48], we set β to 0.5: although increasing β further continues to enhance model robustness, values beyond 0.5 reduce the clean accuracy below the acceptable threshold, making 0.5 the optimal value.
To ensure a fair evaluation, we followed the experimental configurations of the six backdoor attacks as described in their original papers, including trigger patterns, sizes, and target labels, as presented in Table 3. For backdoor defenses, we compared three methods, Fine-tuning, Fine-Pruning, and Mode Connectivity Repair (MCR), with our proposed FMDL method. All defense methods were assumed to have access to 5% of clean data.
We used two metrics to evaluate the performance of the defense mechanisms: Attack Success Rate (ASR) and Accuracy on Clean Samples (ACC). A larger drop in ASR combined with a smaller drop in ACC indicates a stronger defense mechanism.
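Both metrics reduce to top-1 accuracy over different evaluation sets; a minimal sketch follows (the trigger_fn hook, e.g., the earlier poison_batch, is an illustrative assumption):

```python
import torch

@torch.no_grad()
def evaluate(model, loader, trigger_fn=None):
    """Returns ACC on clean data, or ASR when a trigger function is given."""
    model.eval()
    hits, total = 0, 0
    for x, y in loader:
        if trigger_fn is not None:       # ASR: apply the trigger and
            x, y = trigger_fn(x, y)      # score against the target label
        preds = model(x).argmax(dim=1)
        hits += (preds == y).sum().item()
        total += y.numel()
    return hits / total
```

Calling `evaluate(model, clean_loader)` yields ACC, while `evaluate(model, clean_loader, trigger_fn=poison_batch)` yields ASR.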
4.2. Comparison of FMDL and Traditional FL Performance
To test the basic performance of FMDL, i.e., whether it can train a universal and effective global model comparable to classical federated learning, we conducted comparative experiments between FMDL and FL with FedAvg aggregation. We evaluated both methods on three datasets and, for each dataset, constructed two data structures: IID data (as shown in Figure 5) and Non-IID data (as shown in Figure 6). Based on the performance of FMDL across the datasets, our proposed method outperforms traditional federated learning in several respects: compared to FedAvg, FMDL converges faster, achieves higher accuracy, and remains more stable across different dataset structures.
4.3. Performance of FMDL under Three Heterogeneous Settings
Comparing the corresponding (a), (b), and (c) subfigures in Figure 5 and Figure 6 shows that traditional federated learning achieves significantly lower model accuracy on Non-IID data than on IID data for the CIFAR-10 and CIFAR-100 datasets. This is because data heterogeneity produces imbalanced weights during local updates, and simple federated aggregation then hinders model progress. In contrast, FMDL maintains higher accuracy on both IID and Non-IID data; although accuracy drops slightly on Non-IID data, it remains within an acceptable range. Data heterogeneity has a consistent impact on global performance, which aligns with real-world scenarios. On the MNIST dataset, the model achieves near-perfect predictions, indicating the effectiveness of FMDL in addressing data heterogeneity.
FMDL also achieves the central server's goal under objective heterogeneity: training a well-performing generalized model. Regarding personalized requirements, our mutual learning structure keeps the private models trained locally; they never participate in the upload and aggregation of global model updates. As a result, the private models fully satisfy each client's needs, achieving model personalization.
We conducted experiments with FMDL using five participating clients. First, we trained the five clients independently to obtain personalized models with their best achievable accuracy, shown as the yellow portion of Figure 7. We then trained all clients with FMDL; the accuracy of the resulting personalized models is shown in the orange portion of Figure 7. The comparison reveals that models obtained with our method achieve higher accuracy than individually trained models, demonstrating that FMDL enables clients with different models to benefit from a shared model and effectively addresses model heterogeneity.
4.4. Effectiveness of FMDL in Defending Backdoors
In Section 3.3, we presented three representations of the attention function; the following experiments identify which performs best. Using the BadNets attack as the baseline and ASR (Attack Success Rate) and ACC (Accuracy) as evaluation metrics, we obtained the results in Table 4. Accordingly, we adopted the best-performing form as the function for computing the overall distillation loss of the model.
First, we attacked the model with six different backdoor attacks and applied the four defense mechanisms to evaluate the respective attack success rates. We then tested the accuracy of the cleansed backdoor models on clean samples. The results are shown in Table 5. The MCR (Mode Connectivity Repair) defense performed remarkably well against the BadNets and SIG attacks, achieving the lowest backdoor attack success rates, but was mediocre against the other attacks. The Fine-tuning method preserved relatively good prediction accuracy on clean samples after the BadNets and CL attacks, but it did not significantly reduce the backdoor success rate, making it an inadequate defense. Compared with the other three methods, our attention distillation defense excelled at reducing the success rate of multiple backdoor attacks while keeping the loss in clean-sample accuracy small, with an average deviation of 2.66%, which is within an acceptable range. The attention distillation method is thus both effective and efficient against backdoor attacks.
We also examined the impact of different proportions of clean samples on model performance, as shown in Figure 8. As the bar chart shows, when the proportion of clean samples reached 20%, both MCR and our FMDL method reduced the backdoor attack success rate to below 5%, but FMDL converged faster than MCR. Even with only 1% clean samples, FMDL reduced the average ASR from 99.04% to 35.93%, whereas MCR still exhibited an attack success rate of around 80%.
5. Conclusions
To address the challenges posed by the three types of heterogeneity and by backdoor attacks in federated learning, this paper proposes a knowledge distillation-based federated learning paradigm called Federated Mutual Distillation Learning (FMDL). In the local update phase, we introduce mutual learning and mutual distillation between local models and private models to address heterogeneity, and experimental results demonstrate its effectiveness. Additionally, FMDL employs attention maps to evaluate the performance of defense mechanisms; the results show that our method outperforms three other backdoor defense methods against six backdoor attacks. Overall, FMDL makes significant contributions to addressing heterogeneous federated learning and mitigating the threat of backdoor attacks in model deployment. Future research will explore more advanced methods for federated personalization. Furthermore, although the distillation-based approach effectively eliminates backdoors, reliance on teacher models increases the computational burden on clients; in practice, users may prefer less computationally expensive methods with slightly lower performance. Reducing this computational overhead is therefore worth investigating. Finally, the attention map we utilized lacks a strict theoretical analysis, and mature theoretical tools for analyzing backdoor attacks are still missing; exploring theoretical analysis methods for backdoor attacks is thus crucial.