1. Introduction
The growing demand for Internet data processing has led to the increasing application of machine learning [1]. Traditional machine learning [2] methods often require aggregating all data onto a single device or data center to obtain high-precision models. However, the process of data collection and aggregation inevitably raises security and privacy issues [3]. Federated learning (FL) [4], an emerging machine learning paradigm, allows data owners to keep their original data private: the data remain on local devices throughout the entire training process, and only the locally trained models are uploaded to the aggregation server to take part in global model training. This approach has been widely adopted across various industries, including healthcare [5,6], the Internet of Things [7], and finance [8]. However, federated learning still faces many challenges in practical applications, including privacy protection, communication efficiency, unbalanced data, and non-convex optimization [9].
Since model training involves optimizing model parameters to fit the training dataset, the model parameters embed information such as the distribution, features, and membership of the training data. Adversaries can exploit the model parameters to extract private information [10], which presents a significant challenge for privacy protection in federated learning.
To further protect model gradients and parameters, some scholars have applied technologies such as homomorphic encryption (HE) [11], secure multi-party computation (SMPC) [12], and differential privacy (DP) [13] to federated learning. Phong et al. [14] proposed PPDL, a federated learning privacy protection scheme based on additive homomorphic encryption, which effectively prevents the leakage of private information. However, in this scheme all clients share the same key pair, so any client can decrypt the encrypted model updates of the others, making the scheme vulnerable to collusion between the central server and clients and also incurring significant overhead. Bonawitz et al. [15] used secure multi-party computation to construct a secure aggregation protocol called SecAgg, which safely computes the sum of multiple local models. However, this scheme can only compute the sum of multiple values and requires several rounds of key negotiation and secret recovery among nodes, leading to high communication costs. Zhou et al. [16] proposed FedDSD, a GAN-based deep shadow defense scheme for federated learning that generates shadow data, trains shadow models, and uses shadow gradients to protect the original data and models. However, adversaries may still infer key information about the original data through careful analysis of subtle differences in the features of the shadow data. Lu et al. [17] proposed a differential privacy asynchronous federated learning (DPAFL) scheme, which introduces local differential privacy into federated learning by adding Gaussian noise to the stochastic gradient descent (SGD) updates of local models. However, differential privacy methods sacrifice accuracy.
To further mitigate privacy leakage in federated learning, this paper introduces a defense method based on the FedDSD scheme [16]. The proposed method seeks to tackle the identified challenges and strengthen privacy protection by improving upon existing techniques: it utilizes a GAN [13] to generate synthetic data, which are then perturbed with adaptive noise. The synthetic data are ultimately employed for training, thereby effectively safeguarding the original data and the original model.
The main contributions of this paper are as follows: (1) A GAN is used to learn the features of the original real data and generate synthetic data, and adaptive noise is introduced to further obfuscate the generated data; the synthetic data, which approximate the utility of the original data, replace the original data. (2) A pseudo model is trained on the noise-augmented synthetic data to replace the original model, thereby protecting the original model and preventing adversaries from directly accessing a model trained on real data. (3) The original gradients are replaced with pseudo gradients obtained by training the pseudo model, preventing adversaries from acquiring the actual original gradients. The adaptive noise also ensures that it is difficult to extract specific information about the original data even from the synthetic data.
2. Related Work
2.1. Federated Learning
A typical federated learning system comprises a central server and multiple clients, which work together to train a model under the server's coordination. Each client keeps its training data private and independently trains a local model. The central server subsequently combines the locally trained models to update the global model. However, this setup is vulnerable to gradient leakage attacks, in which attackers continuously narrow the gap between the gradient produced by dummy (noise) data and the gradient obtained by training with real data. In the attack method proposed by Zhu et al. [10], the gradient information during the model update process is exploited: by carefully adjusting the noise inputs in each iteration, the attacker gradually approximates the gradient characteristics of the real data. Hitaj et al. [18], on the other hand, approach the problem from the perspective of generative adversarial networks, allowing attackers to generate pseudo gradients that are similar to real gradients; these pseudo gradients are mixed into the normal gradient updates to steal information latent in the original data. After multiple iterations, the real data can be recovered. The federated learning framework under gradient leakage attacks is shown in Figure 1.
2.2. Gradient Leakage Attack
The privacy and security problem arises when attackers obtain global or local gradients. In one attack, a client trains a generative adversarial network locally and combines it with the global gradient obtained from the server to steal the victim's local training set. This method is an active reconstruction attack and requires tampering with the global gradient, which places very high demands on the attacker's capabilities; such background knowledge is not easily obtained in real scenarios. Therefore, a more general scenario needs to be considered. Zhu et al. introduced the DLG algorithm [10], which imposes fewer requirements on attackers. To execute an attack, an attacker only needs access to the global model distributed by the server to the clients, along with the gradient information from the targeted client, without modifying the global gradient. Its general process is as follows. First, two variables are randomly initialized: a dummy label $y'$ and dummy data $x'$. The dummy data are fed into the shared model $F(\cdot; w)$, and the dummy gradient $\nabla w' = \partial L(F(x'; w), y') / \partial w$ is calculated from the output, where $L$ represents the loss function. The objective function of the algorithm is based on the Euclidean distance between the real gradient $\nabla w$ and the generated gradient $\nabla w'$. Mathematically, the objective function can be expressed as:

$$(x'^{*}, y'^{*}) = \arg\min_{x', y'} \left\| \nabla w' - \nabla w \right\|^{2} = \arg\min_{x', y'} \left\| \frac{\partial L(F(x'; w), y')}{\partial w} - \nabla w \right\|^{2}$$

By repeatedly applying the stochastic gradient descent algorithm, the two initialized variables $x'$ and $y'$ are iteratively trained. Specifically, in each iteration $t$, the updates for $x'$ and $y'$ can be formulated as follows:

$$x'_{t+1} = x'_{t} - \eta \nabla_{x'_{t}} \left\| \nabla w' - \nabla w \right\|^{2}, \qquad y'_{t+1} = y'_{t} - \eta \nabla_{y'_{t}} \left\| \nabla w' - \nabla w \right\|^{2}$$

where $\eta$ is the learning rate. At the end of the training, the obtained label $y'$ and data $x'$ will be close to the real label $y$ and data $x$.
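To make the attack procedure above concrete, the following PyTorch sketch performs a DLG-style reconstruction under the notation of this section. It is a minimal illustration, assuming the shared model, the victim's gradients, and all hyperparameters are given; it is not the exact implementation used in the cited work.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch of a DLG-style gradient leakage attack.
# `model` is the shared global model and `real_grads` is the list of the
# victim's gradients (one tensor per model parameter), both assumed given.
def dlg_attack(model, real_grads, input_shape, num_classes, steps=300, lr=0.1):
    # Randomly initialize dummy data x' and a dummy (soft) label y'.
    dummy_x = torch.randn(1, *input_shape, requires_grad=True)
    dummy_y = torch.randn(1, num_classes, requires_grad=True)
    optimizer = torch.optim.LBFGS([dummy_x, dummy_y], lr=lr)

    for _ in range(steps):
        def closure():
            optimizer.zero_grad()
            pred = model(dummy_x)
            # Cross-entropy between predictions and the softmax-normalized dummy label.
            loss = torch.mean(torch.sum(
                -F.softmax(dummy_y, dim=-1) * F.log_softmax(pred, dim=-1), dim=-1))
            # Dummy gradient ∇w' of the loss w.r.t. the shared model parameters.
            dummy_grads = torch.autograd.grad(loss, model.parameters(),
                                              create_graph=True)
            # Objective: squared Euclidean distance ||∇w' - ∇w||².
            grad_diff = sum(((dg - rg) ** 2).sum()
                            for dg, rg in zip(dummy_grads, real_grads))
            grad_diff.backward()
            return grad_diff
        optimizer.step(closure)
    return dummy_x.detach(), dummy_y.detach()
```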
2.3. Generative Adversarial Network (GAN)
GAN [13] has been extensively utilized in domains such as computer vision and privacy protection. Mukherkjee et al. [19] aggregated three basic GAN models to generate synthetic MRI scans of brain tumors and applied style transfer techniques to enhance image similarity. Zhang et al. [20] proposed HAAM-GAN, a GAN-based method for underwater image enhancement. The application scope of GANs is continuously expanding; in the future, GAN models are expected to be deeply integrated with 6G communication and the Internet of Things (IoT), playing a key role in cooperative digital twins for UAV-based scenarios [21].
A GAN generates synthetic samples by learning the underlying distribution of the original training data through a machine learning model. Typically, a GAN consists of two components: a generator (G) and a discriminator (D). In the training stage of the GAN, the generator G and the discriminator D engage in an adversarial game. The generator loss $L_{G}$ can be represented as:

$$L_{G} = \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

The generator tries to deceive the discriminator with the generated artificial samples, which means that the generator minimizes $L_{G}$ to make the generated data as similar as possible to the real data. The discriminator loss $L_{D}$ can be expressed as:

$$L_{D} = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

The discriminator attempts to differentiate between the generated synthetic samples and the real data, which means that the discriminator maximizes $L_{D}$ to distinguish real data from generated data. The optimization processes for both the generator and discriminator can be summarized as the minimax game:

$$\min_{G}\max_{D} V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]$$

where $x$ is an image in the real dataset, $p_{data}(x)$ is its distribution, $z$ is random noise, and $p_{z}(z)$ is its distribution. The GAN structure model is shown in Figure 2.
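As a minimal illustration of this adversarial objective, the PyTorch sketch below performs one discriminator and one generator update using the binary cross-entropy formulation of the losses above. The networks G and D, their optimizers, and the data dimensions are assumptions; D is assumed to output a probability in (0, 1), and the generator update uses the non-saturating surrogate commonly used in practice rather than the literal $\log(1 - D(G(z)))$ term.

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()  # D is assumed to end with a sigmoid

def gan_step(G, D, opt_G, opt_D, real_batch, z_dim=100):
    batch = real_batch.size(0)
    ones, zeros = torch.ones(batch, 1), torch.zeros(batch, 1)

    # Discriminator: maximize log D(x) + log(1 - D(G(z))).
    opt_D.zero_grad()
    z = torch.randn(batch, z_dim)
    fake = G(z).detach()                 # block gradients into G
    loss_D = bce(D(real_batch), ones) + bce(D(fake), zeros)
    loss_D.backward()
    opt_D.step()

    # Generator: fool D (non-saturating surrogate for minimizing log(1 - D(G(z)))).
    opt_G.zero_grad()
    z = torch.randn(batch, z_dim)
    loss_G = bce(D(G(z)), ones)
    loss_G.backward()
    opt_G.step()
    return loss_G.item(), loss_D.item()
```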
2.4. Differential Privacy (DP)
Differential privacy (DP) was initially proposed by Dwork et al. in 2008 [22]. The core idea is to add noise through random perturbation so as to reduce the influence of any single record in the dataset on the model output and thereby better protect data privacy. This approach aims to maximize the protection of personal privacy during statistical analysis, ensuring that attackers cannot infer the private information of specific users. It is defined as follows.
Definition 1 ($(\varepsilon, \delta)$-differential privacy [23]). Let $M: X \rightarrow R$ be a random mechanism, and let $D$ and $D'$ be two datasets that differ in at most one record ($\|D - D'\| \le 1$). For any output set $S \subseteq R$, if it satisfies:

$$\Pr[M(D) \in S] \le e^{\varepsilon}\Pr[M(D') \in S] + \delta$$

then the mechanism $M$ is said to satisfy $(\varepsilon, \delta)$-differential privacy. Here, $\varepsilon$ represents the privacy protection budget, and $D$ and $D'$ are datasets differing in one record. After random perturbation, the probability of outputting a specific value depends on $\varepsilon$: the smaller $\varepsilon$ is, the stronger the privacy protection. $\delta$ represents the relaxation term; when $\delta = 0$, the random mechanism satisfies $\varepsilon$-differential privacy.
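As a brief illustration of Definition 1, the snippet below applies the classical Gaussian mechanism, which calibrates the noise scale to a query's L2 sensitivity so that the perturbed output satisfies $(\varepsilon, \delta)$-differential privacy for $\varepsilon < 1$; the example query and sensitivity are assumptions for illustration only.

```python
import numpy as np

def gaussian_mechanism(value, l2_sensitivity, eps, delta):
    """Perturb `value` so that releasing it satisfies (eps, delta)-DP.

    Uses the classical calibration sigma >= sqrt(2 ln(1.25/delta)) * Δf / eps,
    valid for eps < 1.
    """
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * l2_sensitivity / eps
    return value + np.random.normal(loc=0.0, scale=sigma, size=np.shape(value))

# Example: privatize the mean of an attribute clipped to [0, 1],
# so the mean over n records has L2 sensitivity 1/n.
records = np.clip(np.random.rand(1000), 0.0, 1.0)
noisy_mean = gaussian_mechanism(records.mean(), 1.0 / len(records),
                                eps=0.5, delta=1e-5)
```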
Theorem 1 (Sequential composition [24]). Given random algorithms $M_{1}, M_{2}, \ldots, M_{n}$ that satisfy $\varepsilon_{1}$-, $\varepsilon_{2}$-, ..., $\varepsilon_{n}$-differential privacy, respectively, their combined random algorithm $M(M_{1}, M_{2}, \ldots, M_{n})$ satisfies $\left(\sum_{i=1}^{n} \varepsilon_{i}\right)$-differential privacy.

Theorem 2 (Post-processing invariance [25]). Given a random algorithm $M$ that satisfies $\varepsilon$-differential privacy, for any random algorithm $A$, the composition $A(M(\cdot))$ also satisfies $\varepsilon$-differential privacy.

2.5. Adaptive Noise Addition
Traditional differential privacy methods protect privacy by uniformly adding noise to model parameters, which inevitably degrades model performance. This paper instead adopts adaptive noise addition, which differs from static noise addition in that the noise intensity is adjusted dynamically, thereby minimizing the impact on model performance while ensuring privacy.
The core idea is to dynamically adjust the noise standard deviation based on the generator loss. If the generator loss shows a downward trend across iterations, i.e., $L_{G}^{(t)} < L_{G}^{(t-1)}$, and the current noise standard deviation $\delta_{t}$ is greater than the minimum standard deviation $\delta_{min}$, the noise standard deviation is adjusted to $\delta_{t+1} = \tau \cdot \delta_{t}$, where $\tau$ is the decay factor. Otherwise, the original noise standard deviation $\delta_{t}$ is maintained. Adaptive noise addition thus ensures that the noise intensity decreases reasonably as the model optimizes, avoiding excessive interference with model convergence while preserving a baseline level of privacy protection.
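A minimal sketch of this decay rule follows, assuming the symbols above ($\tau$, $\delta_{min}$) and a generic image tensor; the variable names are illustrative.

```python
import torch

def adaptive_noise_step(gen_images, gen_loss, prev_gen_loss,
                        sigma, sigma_min, tau):
    """Add Gaussian noise to generated images and decay its standard
    deviation when the generator loss is still decreasing.

    sigma     : current noise standard deviation (delta in the paper)
    sigma_min : lower bound delta_min on the noise scale
    tau       : decay factor in (0, 1)
    """
    noisy_images = gen_images + torch.randn_like(gen_images) * sigma
    # Decay only while the generator keeps improving and sigma stays above
    # its floor; otherwise keep the current noise scale.
    if prev_gen_loss is not None and gen_loss < prev_gen_loss and sigma > sigma_min:
        sigma = max(tau * sigma, sigma_min)
    return noisy_images, sigma
```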
3. Federated Learning Privacy Protection Algorithm Based on GAN
3.1. Threat Model
In this security model, the central server is considered an untrusted, honest-but-curious party, while all participants are viewed as trusted entities. An honest-but-curious server follows the federated learning protocol faithfully, without modifying any part of the process, but may attempt to collect all obtainable data in this scheme and infer users' private information from it. Given that the untrusted server is capable of inferring sensitive information from each participant's local gradient, strong privacy protection needs to be provided for the participants. Clients also honestly execute each step of the protocol but are curious about the privacy of other users. In addition, this paper assumes that there is no collusion between clients or between clients and the server; the information of any party is secret to the other participants.
3.2. Framework Structure
This section introduces the proposed GAN-based privacy protection algorithm for federated learning, referred to as FedADG. The approach is designed to enhance data security in federated learning systems and consists of three stages: client local training, secure aggregation, and global update. It involves two types of entities: the server and the clients. The system model is shown in Figure 3.
In the first stage, the client conducts local training, in which local model training and ADP_GAN training are carried out. The ADP_GAN is trained on the original data to learn the original data distribution. A noise vector $z$ is then initialized, images are produced by the generator G, and adaptive noise is added to obtain the generated images $\tilde{x}$. Subsequently, the generated images are assigned corresponding labels $\tilde{y}$ by the discriminator D, and the generated images and their labels are combined into the pseudo dataset $\tilde{D}$. The client then trains on the pseudo dataset and transmits the trained model parameters to the server for aggregation.
In the second stage, the server performs aggregation by collecting the model parameters received from the clients. It then aggregates these parameters to update the model on the server. Subsequently, the updated model parameters are distributed back to each client for the next round of training.
In the third stage, the client performs its update. The client receives the global model $w_{global}$ and updates its local model parameters accordingly. If the termination condition of the federated learning framework set in the initial stage is satisfied, such as reaching the maximum number of training rounds, training stops. Otherwise, the client prepares for the next round of federated learning.
3.3. Defense Scenario
With the rapid development of the Internet of Medical Things (IoMT) [26], cross-institutional data sharing has become crucial to improving diagnostic accuracy and treatment outcomes. However, data privacy protection is a significant challenge in achieving this goal, and the solution proposed in this paper can be applied to this scenario.
When data involve cross-institutional sharing, GAN is used to generate synthetic data samples. These samples retain the key statistical features of the original data but do not contain any personal information of real patients. Only these synthetic data are used for federated learning model training across institutions, while the true sensitive data remain stored locally, effectively protecting data privacy.
3.4. Overall Process
First, each client uses its local dataset $D$ to train the local ADP_GAN network model. The pseudo dataset $\tilde{D}$ is generated through ADP_GAN. The client trains on the pseudo dataset, generates the pseudo gradient $\nabla \tilde{w}$, and outputs the pseudo model $\tilde{w}$. The ADP_GAN algorithm process is as follows (Figure 4).

In each iteration, random noise $\epsilon$ is sampled from a normal distribution with a mean of 0 and a standard deviation of $\delta_{t}$ and added to the generated image $x'$, that is, $\tilde{x} = x' + \epsilon$ with $\epsilon \sim \mathcal{N}(0, \delta_{t}^{2})$. The dynamic adjustment formula for the noise standard deviation is:

$$\delta_{t+1} = \begin{cases} \max\left(\tau \cdot \delta_{t},\ \delta_{min}\right), & L_{G}^{(t)} < L_{G}^{(t-1)} \\ \delta_{t}, & \text{otherwise} \end{cases}$$

where $\tau$ is the attenuation factor, $\delta_{min}$ is the minimum standard deviation, and $L_{G}$ is the generator loss. After training, the pseudo dataset $\tilde{D}$ is output.
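A condensed sketch of this pseudo-dataset generation loop is shown below. It is illustrative only: the generator G, a discriminator D that also provides class predictions, the per-step adversarial training routine, and all hyperparameters are assumptions rather than the paper's exact implementation.

```python
import torch

def generate_pseudo_dataset(G, D, num_samples, z_dim=100,
                            sigma=0.5, sigma_min=0.05, tau=0.9,
                            train_step=None, epochs=50):
    prev_loss = None
    for _ in range(epochs):
        # One adversarial training step on real data (returns the generator loss).
        gen_loss = train_step(G, D) if train_step else 0.0
        # Decay the noise scale while the generator keeps improving.
        if prev_loss is not None and gen_loss < prev_loss and sigma > sigma_min:
            sigma = max(tau * sigma, sigma_min)
        prev_loss = gen_loss

    with torch.no_grad():
        z = torch.randn(num_samples, z_dim)
        fake = G(z)                                    # generated images x'
        noisy = fake + torch.randn_like(fake) * sigma  # x~ = x' + N(0, sigma^2)
        labels = D(noisy).argmax(dim=1)                # pseudo labels y~ from D's classifier head
    return noisy, labels                               # pseudo dataset D~
```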
Subsequently, federated learning training begins. In the i-th training round, each selected client randomly initializes the noise vector $z$ and obtains the pseudo dataset $\tilde{D} = \{(\tilde{x}, \tilde{y})\}$ through the ADP_GAN algorithm, where $\tilde{y}$ is the pseudo label assigned to $\tilde{x}$ by the discriminator D in ADP_GAN. Each client can verify that the pseudo gradient $\nabla \tilde{w}$ can effectively replace the original gradient $\nabla w$ by comparing the difference between them; if the difference is too large, the pseudo dataset $\tilde{D}$ may need to be regenerated.
Then, local training of the client is performed using the obtained $\tilde{D}$. The original gradient generated in the model by the real data $x$ is $\nabla w$, while the pseudo gradient generated by the client training with the pseudo data $\tilde{x}$ is $\nabla \tilde{w}$. The client uses the pseudo gradient $\nabla \tilde{w}$ in place of the original gradient $\nabla w$ to obtain the pseudo model parameters $\tilde{w}$, which are then sent to the server. The server aggregates them through the federated averaging (FedAvg) algorithm [27] and distributes the aggregated model parameters $w_{global}$ to each client for the next round of training.
The FedADG algorithm is shown as Algorithm 1:

Algorithm 1. Algorithm of FedADG
Input: Communication rounds K; number of clients N; number of local epochs E; learning rate η; noise standard deviations $\delta$ and $\delta_{min}$; decay factor $\tau$; initial model parameters $w_{0}$
Output: Final model $w_{K}$
1: Server executes:
2:   Initialize model parameters $w_{0}$ and learning rate η
3:   For k = 1, 2, ..., K:
4:     Send global model $w_{k-1}$ to each client n
5:     $\tilde{w}_{k}^{n}$ ← LocalTraining(n, $w_{k-1}$)
6:     Receive the $\tilde{w}_{k}^{n}$ from each worker and update the global model parameters:
7:     $w_{k} \leftarrow \frac{1}{N}\sum_{n=1}^{N} \tilde{w}_{k}^{n}$
8:   Return final model $w_{K}$
9: Local training on clients:
10:   Initialize the local model with the received global parameters
11:   For epoch e = 1, 2, ..., E:
12:     For each batch b ∈ B (batches of the pseudo dataset $\tilde{D}$):
13:       Compute the pseudo gradient $\nabla \tilde{w}$ on batch b and update $\tilde{w} \leftarrow \tilde{w} - \eta \nabla \tilde{w}$
14:   Update the local model parameters $\tilde{w}$
15:   Send the pseudo model parameters $\tilde{w}$ to the server
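The following PyTorch-style sketch mirrors Algorithm 1 at a high level for one communication round. The client object, its pseudo_loader() method returning batches of $(\tilde{x}, \tilde{y})$, and the unweighted averaging are illustrative assumptions rather than the paper's exact implementation.

```python
import copy
import torch

def fedadg_round(global_model, clients, lr=0.01, local_epochs=1):
    """One FedADG communication round: each client trains on its ADP_GAN
    pseudo dataset and the server averages the returned pseudo parameters."""
    local_states = []
    for client in clients:
        local_model = copy.deepcopy(global_model)          # receive global model
        optimizer = torch.optim.SGD(local_model.parameters(), lr=lr)
        loss_fn = torch.nn.CrossEntropyLoss()
        for _ in range(local_epochs):
            for x_fake, y_fake in client.pseudo_loader():   # batches of (x~, y~)
                optimizer.zero_grad()
                loss = loss_fn(local_model(x_fake), y_fake)
                loss.backward()                              # pseudo gradient ∇w~
                optimizer.step()
        local_states.append(local_model.state_dict())        # pseudo parameters w~

    # FedAvg-style aggregation: average the clients' pseudo parameters.
    new_state = copy.deepcopy(local_states[0])
    for key in new_state:
        new_state[key] = torch.stack(
            [s[key].float() for s in local_states]).mean(dim=0)
    global_model.load_state_dict(new_state)
    return global_model
```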
3.5. Security Analysis
Theorem 3. For any iteration $t$, the pseudo data generated in the $t$-th iteration of the ADP_GAN algorithm satisfy $\varepsilon_{t}$-differential privacy.

Proof. In the process of generating pseudo data with ADP_GAN, the noise is adaptively adjusted according to the change in the generator's loss. Gaussian noise with a standard deviation calibrated to the differential privacy requirement is added to the data $x'$ generated by the generator G to obtain the pseudo data $\tilde{x}$. According to their privacy requirements, local clients are divided into two categories, high privacy requirements and low privacy requirements, and each local client adds Gaussian noise with the correspondingly calibrated standard deviation to the classified and clipped gradient. According to Definition 1, each iteration $t$ therefore satisfies $\varepsilon_{t}$-differential privacy. Given any two datasets $D$ and $D'$ that differ in one record, the outputs obtained through ADP_GAN over $T$ iterations are $M_{1}(D), M_{2}(D), \ldots, M_{T}(D)$, each of which satisfies $\varepsilon_{t}$-differential privacy. From Theorem 1, we can obtain:

$$\varepsilon = \sum_{t=1}^{T} \varepsilon_{t}$$

Therefore, the data generated in the different training stages of the algorithm each satisfy $\varepsilon_{t}$-differential privacy, and by the sequential composition property of differential privacy, ADP_GAN satisfies $\left(\sum_{t=1}^{T} \varepsilon_{t}\right)$-differential privacy. In addition, according to the post-processing invariance of differential privacy, any subsequent processing of these outputs still satisfies $\varepsilon$-differential privacy. Consequently, every iteration of the ADP_GAN algorithm preserves differential privacy. □
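As a toy illustration of the budget accounting used in this proof, assuming a fixed per-iteration budget $\varepsilon_{t}$ (the values below are illustrative, not from the paper):

```python
# Sequential composition (Theorem 1): the total budget is the sum of the
# per-iteration budgets spent by ADP_GAN.
per_iteration_eps = [0.1] * 50          # epsilon_t for T = 50 iterations (assumed values)
total_eps = sum(per_iteration_eps)      # epsilon = sum over t of epsilon_t = 5.0
print(f"ADP_GAN satisfies {total_eps:.1f}-differential privacy overall")
```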
The FedADG scheme safeguards the federated learning framework by combining an adaptive differential privacy mechanism with the GAN algorithm. Its main idea is to generate pseudo data from the local data via the GAN and then adaptively add noise to the pseudo data based on the loss change of the generator G, so that third-party attackers cannot recover local data by stealing gradient information, effectively protecting the data privacy of clients.
The scheme generates pseudo data through the ADP_GAN algorithm, and then the client replaces the real data with pseudo data for training. These pseudo data can capture the distribution characteristics of the original data but do not directly expose the specific information of any individual. This mechanism safeguards against attempts by attackers to extract the original data from the model. Because the ADP_GAN model acts on the client of federated learning, this scheme can resist gradient inference attacks initiated by the server or other semi-honest adversaries.
4. Experimental Results and Analysis
The experiments were conducted on a 64-bit Windows 11 Professional operating system with an Intel(R) Core(TM) i5-12450 CPU @ 3200 MHz, an NVIDIA GeForce RTX 4060 graphics card manufactured by TSINGHUA TONGFANG (Beijing, China), and 8 GB of RAM. The software environment is Python 3.11.
The dataset used in the experiments is the MNIST (Modified National Institute of Standards and Technology) dataset [28]. MNIST contains 60,000 training samples and 10,000 test samples; each sample is a 28 × 28 grayscale handwritten digit image with a label from 0 to 9.
4.1. Effectiveness of the FedADG Scheme
In order to show more intuitively the effect of this method on dataset reconstruction, the MNIST dataset is used as input and the corresponding pseudo dataset $\tilde{D}$ is output by ADP_GAN. Figure 5 intuitively shows the reconstruction effect of the ADP_GAN algorithm on the MNIST dataset.
Since the clients in the FedADG algorithm train on the pseudo dataset $\tilde{D}$, three evaluation indicators, namely MSE, FMSE, and PSNR, are used to quantify the difference between the pseudo data generated by ADP_GAN and the original data $D$. The obtained results are shown in Table 1.
According to the MSE, FMSE, and PSNR values, the pseudo data $\tilde{D}$ generated by the ADP_GAN algorithm differ considerably from the original data $D$. Therefore, even if the scheme in this paper is attacked by the DLG algorithm and $\tilde{D}$ is leaked, the security of the original data $D$ is still ensured and the privacy of the real data is effectively protected.
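For reference, the snippet below sketches how such image-similarity metrics can be computed between an original image and its pseudo counterpart. The FMSE variant shown here (MSE between FFT magnitudes) is an assumption about the paper's definition, and the example images are random placeholders.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2))

def fmse(a, b):
    # Assumed definition: mean squared error between FFT magnitude spectra.
    return float(np.mean((np.abs(np.fft.fft2(a)) - np.abs(np.fft.fft2(b))) ** 2))

def psnr(a, b, max_val=255.0):
    m = mse(a, b)
    return float("inf") if m == 0 else 10.0 * np.log10(max_val ** 2 / m)

# Example on two 28x28 grayscale images (original vs. pseudo).
original = np.random.randint(0, 256, (28, 28))
pseudo = np.random.randint(0, 256, (28, 28))
print(mse(original, pseudo), fmse(original, pseudo), psnr(original, pseudo))
```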
Figure 6 illustrates the convergence of the FedADG scheme on the MNIST dataset. The curve gradually levels off and stabilizes, indicating that the algorithm exhibits good convergence behavior.
Experiments were conducted with different values of the hyperparameter τ in the FedADG scheme to investigate its impact on the scheme’s performance. The values of τ selected were 0.5, 0.6, 0.7, 0.8, 0.9, and 1.0.
Figure 7 shows that when τ is set to 0.5, the corresponding performance metrics are relatively high. As τ increases from 0.5 to 0.8, the metrics gradually decrease. When τ is further increased to 0.9, the model accuracy rises somewhat. A value of τ = 1.0 means that the noise intensity remains constant, which results in a lower accuracy than when τ = 0.9 and significantly lower than when τ = 0.5.
This suggests that a value of τ around 0.5 may be relatively optimal, allowing the relevant metrics to reach a better level. Within the range of 0.5 to 0.8, as the hyperparameter increases, the model performance may gradually deteriorate. Although the metrics rebound when τ = 1.0, the overall performance is still not as good as when τ = 0.5.
We conducted tests on the convergence time of models for two federated learning frameworks: FedADG and FedAvg. The results showed that the average convergence time for FedADG was 56.95 s, while that for FedAvg was 17.89 s. Although FedADG’s convergence time is significantly longer than FedAvg’s, it can effectively resist gradient leakage attacks, ensuring the protection of client privacy information.
4.2. Comparison with Defense Baselines
First, the client is trained locally based on the ADP_GAN algorithm: the client locally generates data $x'$ with the generator, outputs pseudo data $\tilde{x}$ by adding noise, obtains the corresponding label $\tilde{y}$ through the discriminator, and thus forms the pseudo dataset $\tilde{D}$. The client combines the pseudo dataset $\tilde{D}$ with the downloaded model for local training and then uploads the pseudo model parameters to the server. FedAvg, DP-FL, and GC-FL are used as comparison schemes. For the non-IID (non-independent and identically distributed) scenario, the 60,000 training samples are divided into 20 segments of 3000 samples each, and each client randomly selects 2 different segments. For the IID (independent and identically distributed) scenario, the 60,000 training samples are randomly divided into 1200 segments of 50 samples each, which are then randomly allocated to 10 clients, each receiving 1 to 30 segments (a sketch of both partitioning procedures is given below). The experimental results are shown in Table 2.
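The partitioning sketch referenced above follows the numbers in the text (20 segments of 3000 for non-IID, 1200 segments of 50 for IID, 10 clients). The random seed, the assumption that non-IID segments are not shared across clients, and the sampling details are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
indices = rng.permutation(60000)          # indices of the MNIST training samples

# Non-IID: 20 segments of 3000 samples; each of 10 clients takes 2 distinct segments.
segments_noniid = np.array_split(indices, 20)
picks = rng.choice(20, size=(10, 2), replace=False)   # 10 clients x 2 segments
noniid_clients = [np.concatenate([segments_noniid[i] for i in row]) for row in picks]

# IID: 1200 segments of 50 samples, randomly allocated to 10 clients,
# each client receiving between 1 and 30 segments.
segments_iid = np.array_split(indices, 1200)
iid_clients = []
pool = list(range(1200))
for _ in range(10):
    k = int(rng.integers(1, 31))                       # 1 to 30 segments
    chosen = [pool.pop(rng.integers(len(pool))) for _ in range(k)]
    iid_clients.append(np.concatenate([segments_iid[i] for i in chosen]))
```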
The performance of various federated learning algorithms is significantly affected by data distribution, as evidenced by the accuracy tests conducted on both non-IID and IID data. The federated averaging algorithm (FedAvg) generally performs well, achieving an accuracy of 97.62% on IID data; however, its accuracy drops to 88.99% on non-IID data. Differential privacy federated learning (DP-FL) and gradient compression-based federated learning (GC-FL) both have relatively lower accuracy under both data distributions, ranging from 80.54% to 85.72% and from 81.80% to 86.41%, respectively. In contrast, the FedADG algorithm demonstrates outstanding performance, achieving an accuracy of 88.51% on non-IID data and 97.65% on IID data, showing strong adaptability and an accuracy comparable to FedAvg; under the non-IID setting, FedADG differs from FedAvg by only 0.48%. Next, to further examine the defense effect of the FedADG algorithm, gradient leakage attacks are launched against schemes using the original gradient, gradient compression, noise addition, and the scheme proposed in this paper.
Figure 8 shows the defense effects of different schemes on the MNIST dataset.
For the four schemes, namely original gradient (NO), gradient compression (GC), differential privacy (DP), and GAN_ADP, the adversary reconstructs data from the real gradient $\nabla w$ generated by the real data $x$, continuously obtaining the final dummy data by minimizing the distance between the dummy gradient $\nabla w'$ and the real gradient $\nabla w$. As can be seen from Figure 8, the FedADG scheme has a good defense effect.
5. Conclusions
To protect the data security of clients, a federated learning privacy protection algorithm based on generative models, FedADG, is proposed, and simulation experiments are carried out on the MNIST dataset. The experimental results show that FedADG achieves higher accuracy on MNIST than the federated learning algorithm based on differential privacy. On this basis, gradient leakage attacks are used to evaluate different defense schemes, and the results show that the FedADG scheme protects data security well. Moreover, since FedADG trains on the pseudo dataset generated by ADP_GAN, it provides enhanced privacy protection. According to the MSE, FMSE, and PSNR evaluation indicators, FedADG can effectively protect the privacy and security of the original dataset while maintaining classification accuracy.