Article

Ultra-Reliable and Low-Latency Wireless Hierarchical Federated Learning: Performance Analysis †

1 School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China
2 Chongqing Key Laboratory of Mobile Communications Technology, Chongqing 400065, China
3 Provincial Key Lab of Information Coding and Transmission, Southwest Jiaotong University, Chengdu 611756, China
4 School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
* Author to whom correspondence should be addressed.
† An earlier version of this paper was presented in part at the IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), New York, NY, USA, 17–20 May 2023 and in part at the IEEE International Conference on Communications Workshops (ICC Workshops), Roma, Italy, 28 May–1 June 2023.
Entropy 2024, 26(10), 827; https://doi.org/10.3390/e26100827
Submission received: 25 August 2024 / Revised: 24 September 2024 / Accepted: 27 September 2024 / Published: 29 September 2024

Abstract

Wireless hierarchical federated learning (WHFL) implements wireless federated learning (WFL) on a cloud–edge–client hierarchical architecture, which accelerates model training and achieves more favorable trade-offs between communication and computation. However, due to the broadcast nature of wireless communication, WHFL is susceptible to eavesdropping during the training process. In addition, ultra-reliable and low-latency communication (URLLC) has recently received much attention as a critical communication service in current 5G and upcoming 6G networks, which motivates us to study URLLC-WHFL under physical layer security (PLS) requirements. In this paper, we propose a secure finite block-length (FBL) approach for multi-antenna URLLC-WHFL and characterize the relationship between privacy, utility, and PLS of the proposed scheme. Simulation results show that when the eavesdropper's channel state information (CSI) is perfectly known by the edge server, the proposed FBL approach almost achieves perfect secrecy without affecting learning performance. The simulations further show that the scheme is robust to imperfect CSI of the eavesdropper's channel. This paper thus provides a new method for URLLC-WHFL in the presence of PLS.

1. Introduction

Wireless federated learning (WFL), which allows the training of machine learning (ML) models on a large corpus of decentralized data stored on mobile devices [1], has attracted significant research interest. Current work on WFL primarily aims to enhance communication efficiency [2,3,4,5,6], improve privacy and security [7,8,9], balance privacy against utility [10,11,12,13,14], manage power control for wireless devices [15,16], and design effective beamforming strategies [17]. To better exploit the processing power of edge and cloud servers, a hierarchical federated learning (HFL) system involving clients, edge servers, and cloud servers was proposed in [18]. Compared to FL systems relying on a single server, HFL reduces the computational load [19,20,21], lowers user-to-cloud server communication costs [18,19,20,21,22,23], decreases FL processing time [18], and improves privacy and security in FL [22,23]. Specifically, the convergence of the HFL system was proved theoretically in [18], and joint user scheduling and wireless resource allocation schemes were then developed to improve both communication and energy efficiency [19,20,21]. To enhance privacy in wireless hierarchical federated learning (WHFL), ref. [22] introduced a method based on local differential privacy (LDP) that injects artificial noise into the shared model parameters at two stages. Additionally, ref. [24] considered the influence of device mobility on the learning performance of WHFL systems.
The broadcast nature of wireless communication renders WFL susceptible to eavesdropping. As a result, WFL in the presence of physical layer security (PLS) is a significant issue. The privacy requirement of FL does not force the information leakage between users and servers to be arbitrarily small, since the servers need sufficiently accurate data for analysis. In contrast, the PLS requirement demands that the information leakage to the eavesdropper vanish [25]. Current research on WFL in the presence of PLS [26,27,28,29] primarily focuses on enhancing the security of data through resource allocation and artificial jamming techniques. Specifically, ref. [26] optimizes the power control of drones to enhance the secrecy rate of the WFL system under constraints such as the WFL training time and the battery capacity of the drone. In [27], secrecy in WFL is achieved via cooperative jamming, in which users cooperatively provide jamming signals to counteract eavesdropping attempts. Ref. [28] proposed using conventional wireless devices to form a non-orthogonal multiple access (NOMA) transmission group with an edge device for secrecy-enhanced mobile edge computing, where the devices provide cooperative jamming to an eavesdropper while transmitting data to a cellular base station. In [29], a power allocation algorithm is proposed for WFL, in which the transmit power is divided proportionally between the transmitted signal and artificial noise to maximize the secrecy rate while satisfying the model performance requirement. Apart from this, ref. [30] proposed a PLS measure that accounts for the privacy-utility constraints in WFL.
Very recently, ultra-reliable and low-latency communication (URLLC) has attracted significant attention, as it serves as a critical communication service in fifth-generation (5G) and sixth-generation (6G) cellular networks. One essential technology for URLLC is short-packet communication [31], which requires the coding block length to be finite, and finite block-length (FBL) coding [32] provides an effective tool for this scenario. Existing studies of WFL combined with URLLC include the design of a multi-level architecture to satisfy URLLC requirements [33] and the application of WFL in vehicular networks under URLLC constraints [34]. To the best of the authors' knowledge, a practical FBL scheme for WFL remains unknown. It is then natural to ask: is there a practical FBL scheme for WHFL in the presence of PLS and, if so, what is the relationship between PLS, privacy, and utility in WHFL systems under URLLC requirements?
One possible solution to the aforementioned question is a channel feedback coding scheme. The study of channel feedback schemes started with [35], where an elegant feedback coding scheme called the Schalkwijk–Kailath (SK) scheme was proposed for the additive white Gaussian noise (AWGN) channel with noiseless feedback. In this scheme, the transmitter sends the original message only in the initial transmission. In each subsequent round, the receiver sends its current estimate of the original message to the transmitter via a noiseless feedback channel, the transmitter sends an amplified version of the estimation error back to the receiver, and the receiver estimates this error using the minimum mean square error (MMSE) criterion and refines its estimate of the message. After a predetermined number of rounds, the receiver uses the minimum distance rule to decode the message. It was shown that the SK scheme [35] is capacity-achieving and that its decoding error probability decays doubly exponentially to zero as the coding block length increases, which indicates that the SK scheme requires only an extremely short coding block length to achieve a desired decoding error probability. Furthermore, ref. [36] showed that the SK scheme achieves perfect weak secrecy by itself, i.e., the SK scheme satisfies the PLS requirement without additional secrecy coding. Recently, ref. [37] showed that the SK scheme [35] is almost the optimal FBL scheme for the AWGN channel with feedback, which indicates that it may be a good choice for URLLC.
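The round structure described above can be illustrated with a short simulation. The following is a minimal sketch, assuming a real-valued AWGN channel, a noiseless feedback link, and message points placed at the midpoints of equal sub-intervals of [-sqrt(3), sqrt(3)]; the function name `sk_simulate` and all parameter defaults are hypothetical and are not taken from [35].

```python
import math
import random


def sk_simulate(num_messages=16, rounds=8, power=1.0, noise_var=1.0, seed=0):
    """Run one SK-style transmission; return (sent, decoded, final_error).

    Illustrative sketch: real AWGN channel, noiseless feedback.
    """
    rng = random.Random(seed)
    # Message points: midpoints of equal sub-intervals of [-sqrt(3), sqrt(3)].
    width = 2.0 * math.sqrt(3.0) / num_messages
    points = [-math.sqrt(3.0) + (m + 0.5) * width for m in range(num_messages)]
    sent = rng.randrange(num_messages)
    theta = points[sent]

    # Round 1: send the scaled message point; the receiver forms a first estimate.
    y = math.sqrt(power) * theta + rng.gauss(0.0, math.sqrt(noise_var))
    estimate = y / math.sqrt(power)
    err_var = noise_var / power  # variance of (estimate - theta)

    for _ in range(rounds - 1):
        # Noiseless feedback lets the transmitter compute the exact error.
        error = estimate - theta
        x = math.sqrt(power / err_var) * error  # error amplified to power P
        y = x + rng.gauss(0.0, math.sqrt(noise_var))
        # Linear MMSE estimate of the error from the channel output.
        error_hat = math.sqrt(power * err_var) / (power + noise_var) * y
        estimate -= error_hat
        # Each round shrinks the error variance by the factor 1 / (1 + SNR).
        err_var *= noise_var / (power + noise_var)

    # Minimum-distance decoding over the message points.
    decoded = min(range(num_messages), key=lambda m: abs(points[m] - estimate))
    return sent, decoded, estimate - theta
```

Because the error variance contracts geometrically with the round index, a handful of rounds is enough for the residual error to fall far below the spacing between message points, which is the intuition behind the short block lengths mentioned above.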
However, note that the application of the SK scheme to the wireless fading channel still has a long way to go since it is based on the assumption that the feedback channel is a noiseless channel. Apart from this, in wireless communication, the channel feedback is often utilized to transmit channel state information (CSI) back to the device for each uplink transmission [17], and this allows the device to adjust its transmission parameters based on the received feedback. Then it is natural to ask: Can we utilize the channel feedback not only for CSI transmission but also for designing an FBL approach for the multi-antenna URLLC-WHFL in the presence of PLS, i.e., is it possible to extend the classical SK scheme to the multi-antenna URLLC-WHFL in the presence of PLS?
In this paper, we answer the aforementioned questions by studying WHFL in the presence of PLS. Figure 1 illustrates the collaborative training of a learning model by users, edge servers, and cloud servers. To preserve privacy, a local differential privacy (LDP) mechanism [38] adds Gaussian noise to each user's gradient before the gradients are aggregated at the edge servers. Furthermore, each edge server communicates with the cloud server over a quasi-static fading duplex channel which, due to the inherent broadcast characteristics of wireless communication, is overheard by an external eavesdropper. Our primary objective is to ensure that the polluted gradient data retains a certain amount of utility while minimizing privacy leakage to the cloud server and protecting the gradient data transmitted from edge servers to the cloud server from eavesdropping. A straightforward way to achieve this goal is for the edge servers to securely encode the polluted data gradients as codewords and transmit them over the wireless duplex fading channels, so that the cloud server can successfully decode the polluted data gradients while the eavesdropper obtains no information about them. In this way, the PLS and the privacy of the data are guaranteed simultaneously, since the real data gradients are protected by the LDP mechanism.
Our key contributions to this paper are summarized as follows:
  • We propose an FBL approach for multi-antenna WHFL in the presence of PLS. In this approach, the feedback link is used not only for CSI transmission but also to send the cloud server's MMSE estimate of the transmitted polluted data gradient back to the edge server. The key idea of the proposed scheme is to apply the modulo-lattice operation (MLO) [39] to eliminate the impact of feedback channel noise on the performance of the SK scheme [35], and to extend the SK-type scheme to a two-dimensional setting, which performs well over the SISO fading channel. Applying pre-coding, beamforming, and singular value decomposition (SVD) techniques to this extended SISO scheme then yields the FBL coding scheme for multi-antenna WHFL.
  • We derive the achievable secrecy rate of our proposed scheme and characterize the relationship between PLS, privacy, and utility of our scheme. Moreover, given fixed decoding error probability and coding block length, we establish lower and upper bounds on the LDP noise variance that ensure certain privacy, utility, and secrecy levels of PLS.
To obtain a better understanding of the contribution of this paper and the related works studied in the literature, the following Table 1 summarizes the study of WFL in the presence of privacy, utility, PLS, and URLLC in the literature.
The remainder of this paper is organized as follows. In Section 2, the definitions, system model, and main results are given. The FBL approach for the MIMO case is shown in Section 3. FBL approaches for the SIMO/MISO cases are proposed in Section 4. Simulation results are shown in Section 5. Section 6 summarizes all results in this paper and discusses future work.

2. Definitions, System Model and Main Results

2.1. WHFL System

Figure 1 illustrates a system composed of $K_{\mathrm{tot}}$ users, $L$ edge servers indexed by $\ell$, and a cloud server. The disjoint user sets are denoted by $\{\mathcal{C}_\ell\}_{\ell=1}^{L}$, with $K_\ell = |\mathcal{C}_\ell|$ representing the number of users of edge server $\ell$. The distributed datasets are represented by $\{\mathcal{S}_{\ell,k}\}_{k=1}^{|\mathcal{C}_\ell|}$, where $S_{\ell,k} = |\mathcal{S}_{\ell,k}|$ is the size of $\mathcal{S}_{\ell,k}$. Each dataset $\mathcal{S}_{\ell,k}$ is defined as $\{(u_{k,j}, v_{k,j})\}_{j=1}^{|\mathcal{S}_{\ell,k}|}$, where $u_{k,j}$ represents the $j$-th input sample and $v_{k,j}$ is the corresponding label. $\mathcal{S}_\ell$ is the aggregated dataset of edge server $\ell$, and the gradients from each user are aggregated by their corresponding edge server. The global loss function $F(m)$ is defined as follows:
$$F(m) = \frac{1}{S} \sum_{\ell=1}^{L} \sum_{k=1}^{|\mathcal{C}_\ell|} S_{\ell,k} F_{\ell,k}(m), \tag{1}$$
where the model vector $m \in \mathbb{R}^q$ and $S = \sum_{\ell} \sum_{k} S_{\ell,k}$. The local loss function is given by the following:
$$F_{\ell,k}(m) = \frac{1}{S_{\ell,k}} \sum_{(u_{k,j}, v_{k,j}) \in \mathcal{S}_{\ell,k}} f(m; u_{k,j}, v_{k,j}), \tag{2}$$
where $f(m; u_{k,j}, v_{k,j})$ represents the sample-wise loss function. The goal of model training is to minimize the global loss function, as follows:
$$m^{*} = \arg\min_{m} F(m). \tag{3}$$
To achieve this, we employ a distributed gradient descent iterative algorithm. Specifically, in the $t$-th ($t \in \{1, 2, \dots, T\}$) communication round, the cloud server broadcasts the current global model vector $m_t$ to all users, and every user has perfect knowledge of $m_t$. Each user $k$ then computes its local gradient $\nabla F_{\ell,k}(m_t)$ using its dataset $\mathcal{S}_{\ell,k}$ and the current model $m_t$. Once the edge server receives all the noisy local gradients from its users, which have been perturbed by Gaussian noise for LDP, it computes an estimation of the partial gradient as follows:
$$\nabla F_{\ell}(m_t) = \frac{1}{S_\ell} \sum_{k \in \mathcal{C}_\ell} S_{\ell,k} \nabla F_{\ell,k}(m_t), \tag{4}$$
where $S_\ell = |\mathcal{S}_\ell|$ denotes the size of $\mathcal{S}_\ell$. Then, the cloud server aggregates the partial gradient estimates from all edge servers to compute the estimation $\nabla \hat{F}(m_t)$ of the global gradient, as follows:
$$\nabla \hat{F}(m_t) = \frac{1}{S} \sum_{\ell=1}^{L} S_\ell \nabla F_{\ell}(m_t), \tag{5}$$
and updates the global model $m_{t+1}$ by the following:
$$m_{t+1} = m_t - \mu \nabla \hat{F}(m_t), \tag{6}$$
where $\mu$ denotes the learning rate.
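The training loop described in this subsection can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not the authors' implementation: the model is a scalar, the sample loss is the quadratic 0.5 (m u - v)^2, the perturbation is added to each user's averaged gradient, and the helper names `local_gradient`, `hfl_round`, `make_edges`, and `train` are hypothetical.

```python
import random


def local_gradient(m, data):
    """Gradient of the average loss 0.5 * (m * u - v) ** 2 over one user's data."""
    return sum((m * u - v) * u for u, v in data) / len(data)


def hfl_round(m, edges, lr, noise_std, rng):
    """One cloud-edge-client round: users -> edge averages -> cloud average -> update.

    `edges` is a list of edge servers, each a list of user datasets.
    """
    total = sum(len(data) for users in edges for data in users)
    global_grad = 0.0
    for users in edges:
        edge_size = sum(len(data) for data in users)
        # Edge server: size-weighted average of its users' perturbed gradients.
        edge_grad = sum(
            len(data) * (local_gradient(m, data) + rng.gauss(0.0, noise_std))
            for data in users
        ) / edge_size
        # Cloud server: size-weighted average of the edge-level gradients.
        global_grad += edge_size * edge_grad / total
    return m - lr * global_grad


def make_edges(rng, true_m=2.0, num_edges=3, users_per_edge=4, samples=20):
    """Synthetic data v = true_m * u, split across edge servers and users."""
    edges = []
    for _ in range(num_edges):
        users = []
        for _ in range(users_per_edge):
            inputs = [rng.uniform(-1.0, 1.0) for _ in range(samples)]
            users.append([(u, true_m * u) for u in inputs])
        edges.append(users)
    return edges


def train(rounds=200, lr=0.5, noise_std=0.0, seed=0):
    """Run several rounds and return the final scalar model."""
    rng = random.Random(seed)
    edges = make_edges(rng)
    m = 0.0
    for _ in range(rounds):
        m = hfl_round(m, edges, lr, noise_std, rng)
    return m
```

With `noise_std=0.0` the two-level weighted average coincides with plain gradient descent on the pooled data, so `train()` converges to the generating parameter; a positive `noise_std` shows how the perturbation limits the attainable accuracy.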

2.2. Model Formulation

An information-theoretic model of WHFL system is shown in Figure 2. Without loss of generality, we adopt the following assumptions:
Assumption 1.
The communication of any individual edge server to the cloud server is not affected by other edge servers, and the downlink transmission from the cloud server to the edge servers is reliable [16]. Furthermore, we consider that an external eavesdropper targets the information transmitted during the uplink communication from the edge servers to the cloud server. Consequently, this paper primarily focuses on the PLS of the T rounds of uplink communication from one edge server to the cloud server.
Assumption 2.
The channels are quasi-static fading.
Assumption 3.
Following similar arguments in [3,9,16,17], we assume that the perfect CSI of the feedforward and feedback channels is known by both the cloud server and the edge server. Here note that this assumption is well-justified from a practical standpoint. For the feedforward channel, the channel training for estimating CSI at the cloud server can be achieved by transmitting pilot sequences from the edge servers, and the channel estimation is perfect when the length of the pilot sequences is sufficiently large [40]. On the other hand, when the cloud server transmits the perfectly estimated CSI to the edge server through the feedback channel, only a few feedback bits are required. By using a code with a low coding rate and high error-correcting capability, the probability of feedback error can be negligible [41] and, hence, the CSI of the feedforward channel is perfectly known by the transceiver. For the feedback channel, the perfect CSI sharing between transceivers can be realized in a similar way.

2.2.1. Privacy-Utility

In Figure 2, let $\mathbf{W}_{t,k} = \sum_{j=1}^{S_{\ell,k}} \nabla f(m_t; u_{k,j}, v_{k,j}) = (W_{t,k,1}, \dots, W_{t,k,q})^T \in \mathbb{R}^q$ represent the overall local gradient vector for user $k$ ($k \in \{1, 2, \dots, K_\ell\}$) during the $t$-th ($t \in \{1, 2, \dots, T\}$) communication round, where $\nabla f(m_t; u_{k,j}, v_{k,j}) = (\nabla f_1(m_t; u_{k,j}, v_{k,j}), \dots, \nabla f_q(m_t; u_{k,j}, v_{k,j}))^T$ and $W_{t,k,i} = \sum_{j=1}^{S_{\ell,k}} \nabla f_i(m_t; u_{k,j}, v_{k,j})$ ($i \in \{1, 2, \dots, q\}$). Following [42], assume that $\nabla f(m_t; u_{k,j}, v_{k,j})$ is independent and identically distributed (i.i.d.) with $\nabla f(m_t; u_{k,j}, v_{k,j}) \sim \mathcal{N}(0, \sigma_{w,t}^2 I)$, which indicates that $\mathbf{W}_{t,k} \sim \mathcal{N}(0, S_{\ell,k} \sigma_{w,t}^2 I)$. The i.i.d. generated local Gaussian noise $\boldsymbol{\eta}_{t,k} = (\eta_{t,k,1}, \dots, \eta_{t,k,q})^T$ follows the distribution $\mathcal{N}(0, \sigma^2 I)$ and is independent of $\mathbf{W}_{t,k}$. The edge server aggregates the corrupted local gradient, which is defined as follows:
$$\overline{\mathbf{W}}_{t,k} = \mathbf{W}_{t,k} + \boldsymbol{\eta}_{t,k}, \tag{7}$$
where $\overline{\mathbf{W}}_{t,k} \sim \mathcal{N}(0, (S_{\ell,k} \sigma_{w,t}^2 + \sigma^2) I)$. The overall local gradients and noise for the $t$-th round are $\mathbf{W}_t = (W_{t,1}, \dots, W_{t,q})^T$ and $\boldsymbol{\eta}_t = (\eta_{t,1}, \dots, \eta_{t,q})^T$, respectively, where $W_{t,i} = \sum_{k=1}^{K_\ell} W_{t,k,i}$, $\eta_{t,i} = \sum_{k=1}^{K_\ell} \eta_{t,k,i}$ and $i \in \{1, 2, \dots, q\}$. Consequently, from (7), the overall corrupted local gradients for the $t$-th round are $\overline{\mathbf{W}}_t = (\overline{W}_{t,1}, \dots, \overline{W}_{t,q})^T$, where $\overline{W}_{t,i} = \sum_{k=1}^{K_\ell} \overline{W}_{t,k,i}$ and $i \in \{1, 2, \dots, q\}$. Because $\mathbf{W}_{t,k}$ and $\boldsymbol{\eta}_{t,k}$ are i.i.d. and independent of each other, the overall corrupted gradients $\overline{\mathbf{W}}_t$ are i.i.d. and distributed as $\mathcal{N}(0, (S_\ell \sigma_{w,t}^2 + K_\ell \sigma^2) I)$, where $S_\ell = \sum_{k=1}^{K_\ell} S_{\ell,k}$.
Definition 1
(Mutual information privacy [43]). For every $t \in \{1, \dots, T\}$, if the mutual information $\frac{1}{q} I(\mathbf{W}_t; \overline{\mathbf{W}}_t)$ during the $t$-th round is upper bounded by $\epsilon$, namely, $\max_{t \in \{1, \dots, T\}} \frac{1}{q} I(\mathbf{W}_t; \overline{\mathbf{W}}_t) \le \epsilon$, the LDP mechanism is said to satisfy $\epsilon$-mutual information privacy for $\epsilon > 0$.
Definition 2
(Utility [44]). The utility of $\overline{\mathbf{W}}_t$ is defined by the distortion between $\mathbf{W}_t$ and $\overline{\mathbf{W}}_t$, and in this paper, we consider the quadratic distortion $d(\mathbf{W}_t, \overline{\mathbf{W}}_t) = \|\mathbf{W}_t - \overline{\mathbf{W}}_t\|^2$. If $\frac{1}{qT} \sum_{t=1}^{T} E(d(\mathbf{W}_t, \overline{\mathbf{W}}_t)) \le \upsilon$, the utility of $\overline{\mathbf{W}}_t$ is determined by $\upsilon$, where the utility and the distortion have an inverse relationship with each other, i.e., smaller $\upsilon$ corresponds to larger utility.
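For the Gaussian model of this subsection, both quantities in Definitions 1 and 2 have closed forms: the per-dimension mutual information between a Gaussian gradient and its noisy version is half the logarithm of one plus the signal-to-noise variance ratio, and the per-dimension quadratic distortion is the variance of the summed LDP noise. The sketch below evaluates them under these assumptions; the function and argument names are hypothetical, and logarithms are taken to base 2.

```python
import math


def mi_per_dimension(total_samples, grad_var, num_users, ldp_var):
    """(1/q) I(W_t; noisy W_t) in bits for independent Gaussian gradient and noise.

    Gradient variance per dimension: total_samples * grad_var.
    Summed LDP noise variance per dimension: num_users * ldp_var.
    """
    return 0.5 * math.log2(1.0 + total_samples * grad_var / (num_users * ldp_var))


def distortion_per_dimension(num_users, ldp_var):
    """(1/q) E||noisy W_t - W_t||^2, i.e. the variance of the summed LDP noise."""
    return num_users * ldp_var


def min_ldp_var_for_privacy(total_samples, grad_var, num_users, eps):
    """Smallest LDP noise variance for which mi_per_dimension(...) <= eps."""
    return total_samples * grad_var / (num_users * (2.0 ** (2.0 * eps) - 1.0))
```

Raising the noise variance lowers the mutual information (better privacy) but raises the distortion (lower utility), which is the tension that the main results quantify.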

2.2.2. Gradient Compression

We employ lossy Gaussian source coding characterized by a quadratic distortion metric, defined as $d(\overline{\mathbf{W}}_t, \widehat{\mathbf{W}}_t) = \|\overline{\mathbf{W}}_t - \widehat{\mathbf{W}}_t\|^2$ (the source encoder and decoder are located at the edge server and the cloud server, respectively), where $\widehat{\mathbf{W}}_t$ is the output of the source decoder at the cloud server. Following [45] (Chapter 3.8, pp. 64–65), the edge server's source encoder maps $\overline{\mathbf{W}}_t$ to $\{1, 2, \dots, 2^{q R_t(D)}\}$ and compresses $\overline{\mathbf{W}}_t$ into an index $W_t$ that is uniformly distributed over $\mathcal{W}_t = \{1, 2, \dots, 2^{q R_t(D)}\}$. The rate-distortion function $R_t(D)$ is defined as follows:
$$R_t(D) = \begin{cases} \dfrac{1}{2} \log \dfrac{K_\ell \sigma^2 + S_\ell \sigma_{w,t}^2}{D}, & 0 \le D < K_\ell \sigma^2 + S_\ell \sigma_{w,t}^2, \\ 0, & D \ge K_\ell \sigma^2 + S_\ell \sigma_{w,t}^2, \end{cases} \tag{8}$$
where $\frac{1}{qT} \sum_{t=1}^{T} E(d(\overline{\mathbf{W}}_t, \widehat{\mathbf{W}}_t)) \le D$. For the cloud server's source decoder, the decoding mapping transforms the indices $\{1, 2, \dots, 2^{q R_t(D)}\}$ into $\widehat{\mathbf{W}}_t$. Note that when $R_t(D) = 0$, no message is transmitted, and $\widehat{\mathbf{W}}_t$ is set to 0.
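The rate-distortion function above is the standard one for a Gaussian source whose per-dimension variance is the sum of the LDP noise variance and the gradient variance. The following minimal sketch evaluates it and the resulting size of the index set; the function names are hypothetical, base-2 logarithms are assumed, and rounding the index count up to an integer is an illustrative choice.

```python
import math


def rate_distortion(distortion, num_users, ldp_var, total_samples, grad_var):
    """Bits per dimension needed to describe the noisy gradient within `distortion`."""
    if distortion <= 0.0:
        raise ValueError("distortion must be positive for a finite rate")
    source_var = num_users * ldp_var + total_samples * grad_var
    if distortion >= source_var:
        return 0.0  # nothing is sent; the decoder outputs the all-zero vector
    return 0.5 * math.log2(source_var / distortion)


def index_count(q, rate):
    """Number of source-code indices for a q-dimensional gradient at `rate` bits/dim."""
    return math.ceil(2.0 ** (q * rate))
```

For example, a source variance of 2.0 and a target distortion of 0.5 give one bit per dimension, so a 4-dimensional gradient is compressed to one of 16 indices.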

2.2.3. Communication Model

At the $t$-th round, the channel input-output relationships are expressed as follows:
$$Y_i(t) = h X_i(t) + \eta_{1,i}(t), \quad 1 \le i \le N_t, \tag{9}$$
$$\tilde{Y}_i(t) = \tilde{h} \tilde{X}_i(t) + \eta_{2,i}(t), \quad 1 \le i \le N_t - 1, \tag{10}$$
$$Z_i(t) = g X_i(t) + \tilde{g} \tilde{X}_i(t) + \eta_{e,i}(t), \quad 1 \le i \le N_t, \tag{11}$$
where the input and output of the feedforward channel are denoted by $X_i(t)$ and $Y_i(t)$, respectively, the feedback channel's input and output are $\tilde{X}_i(t)$ and $\tilde{Y}_i(t)$, respectively, and the eavesdropping channel's output is $Z_i(t)$. Note that $X_i(t), \tilde{Y}_i(t) \in \mathbb{C}^{A \times 1}$, $\tilde{X}_i(t), Y_i(t) \in \mathbb{C}^{B \times 1}$ and $Z_i(t) \in \mathbb{C}^{C \times 1}$. The average power constraint for the input of the edge server $X_i(t)$ is $\frac{1}{N_t} \sum_{i=1}^{N_t} E[X_i^H(t) X_i(t)] \le P$, and the input of the cloud server $\tilde{X}_i(t)$ is constrained by $\frac{1}{N_t - 1} \sum_{i=1}^{N_t - 1} E[\tilde{X}_i^H(t) \tilde{X}_i(t)] \le \tilde{P}$. The matrices $h \in \mathbb{C}^{B \times A}$, $\tilde{h} \in \mathbb{C}^{A \times B}$, $g \in \mathbb{C}^{C \times A}$, and $\tilde{g} \in \mathbb{C}^{C \times B}$ represent the CSI of the feedforward, feedback, and eavesdropping channels, respectively. The elements of the channel noises $\eta_{e,i}(t) \in \mathbb{C}^{C \times 1}$, $\eta_{2,i}(t) \in \mathbb{C}^{A \times 1}$ and $\eta_{1,i}(t) \in \mathbb{C}^{B \times 1}$ are i.i.d. and distributed as $\mathcal{CN}(0, \sigma_e^2)$, $\mathcal{CN}(0, \sigma_2^2)$ and $\mathcal{CN}(0, \sigma_1^2)$, respectively. The input message $W_t$ of the edge server is uniformly drawn from the set $\mathcal{W}_t$, and it is encoded as a codeword of length $N_t$. Furthermore, the input of the edge server is defined as $X_i(t) = f_{t,i}(W_t, h, \tilde{h}, \tilde{Y}_1^{i-1}(t))$, where $f_{t,i}(\cdot)$ is an encoding function and $\tilde{Y}_1^{i-1}(t) = (\tilde{Y}_1(t), \dots, \tilde{Y}_{i-1}(t))$. The cloud server estimates the message as $\hat{w}_t = \varphi(h, \tilde{h}, Y^{N_t})$ using the decoding function $\varphi$. The input of the cloud server is defined as $\tilde{X}_i(t) = \tilde{f}_{t,i}(h, Y_1^{i}(t), \tilde{h})$, where $\tilde{f}_{t,i}(\cdot)$ is an encoding function and $Y_1^{i}(t) = (Y_1(t), \dots, Y_i(t))$. The average decoding error probability $P_{e,t}$ is given by the following:
$$P_{e,t} = \frac{1}{|\mathcal{W}_t|} \sum_{w_t \in \mathcal{W}_t} \Pr\{\varphi(h, Y^{N_t}, \tilde{h}) \ne w_t \mid w_t \ \text{sent}\}. \tag{12}$$
Definition 3.
According to [46,47], the CSIs $g$ and $\tilde{g}$ of the eavesdropping channels are defined as follows:
$$g = \hat{g} + \Delta g, \quad \|\Delta g\|_F \le \omega, \qquad \tilde{g} = \hat{\tilde{g}} + \Delta \tilde{g}, \quad \|\Delta \tilde{g}\|_F \le \tilde{\omega}, \tag{13}$$
where $\hat{g} \in \mathbb{C}^{C \times A}$ and $\hat{\tilde{g}} \in \mathbb{C}^{C \times B}$ are the estimated CSI of $g$ and $\tilde{g}$, respectively. $\Delta g$ and $\Delta \tilde{g}$ represent the legal parties' estimation errors about the perfect CSI of the eavesdropper's channel, and these errors are respectively bounded by the parameters $\omega > 0$ and $\tilde{\omega} > 0$. Note that $\Delta g = \Delta \tilde{g} = 0$ corresponds to the situation in which the legal parties obtain perfect CSI of the eavesdropper's channel.
Definition 4.
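The secrecy level of PLS [48] (the normalized uncertainty of the eavesdropper) is given by
$$\Delta = \frac{H(W_1, \dots, W_T \mid Z^{N_1}, \dots, Z^{N_T}, h, \tilde{h}, g, \tilde{g})}{H(W_1, \dots, W_T)}, \quad 0 \le \Delta \le 1. \tag{14}$$
A transmission rate $R$ is said to be $(\tau, N, \delta, D, \upsilon, \epsilon)$ achievable if, for given decoding error probability $\tau$, block length $N$ ($N = \sum_{t=1}^{T} N_t$), secrecy level $\delta$, $\frac{1}{qT} \sum_{t=1}^{T} E(d(\overline{\mathbf{W}}_t, \widehat{\mathbf{W}}_t)) \le D$, $\max_{t \in \{1, \dots, T\}} \frac{1}{q} I(\mathbf{W}_t; \overline{\mathbf{W}}_t) \le \epsilon$ and $\frac{1}{qT} \sum_{t=1}^{T} E(d(\mathbf{W}_t, \overline{\mathbf{W}}_t)) \le \upsilon$, there exists a channel code described above such that we have the following:
$$\frac{H(W_1, \dots, W_T)}{N} = R, \quad \frac{1}{T} \sum_{t=1}^{T} P_{e,t} \le \tau, \quad \Delta \ge \delta, \tag{15}$$
where $\delta \in [0, 1]$, and $\delta = 1$ represents perfect secrecy. For the WHFL in the SISO/SIMO/MISO/MIMO cases, the achievable secrecy transmission rates are respectively denoted by $R_{\mathrm{siso}/\mathrm{simo}/\mathrm{miso}/\mathrm{mimo}}(\tau, N, \delta, D, \upsilon, \epsilon)$, the channel gains are respectively defined by $h_{\mathrm{siso}}, \tilde{h}_{\mathrm{siso}}, g_{\mathrm{siso}}, \tilde{g}_{\mathrm{siso}}$, $h_{\mathrm{simo}}, \tilde{h}_{\mathrm{simo}}, g_{\mathrm{simo}}, \tilde{g}_{\mathrm{simo}}$, $h_{\mathrm{miso}}, \tilde{h}_{\mathrm{miso}}, g_{\mathrm{miso}}, \tilde{g}_{\mathrm{miso}}$, $h_{\mathrm{mimo}}, \tilde{h}_{\mathrm{mimo}}, g_{\mathrm{mimo}}, \tilde{g}_{\mathrm{mimo}}$, and the CSI estimation errors are defined by $\Delta g_{\mathrm{siso}}, \Delta \tilde{g}_{\mathrm{siso}}$, $\Delta g_{\mathrm{simo}}, \Delta \tilde{g}_{\mathrm{simo}}$, $\Delta g_{\mathrm{miso}}, \Delta \tilde{g}_{\mathrm{miso}}$, $\Delta g_{\mathrm{mimo}}$ and $\Delta \tilde{g}_{\mathrm{mimo}}$.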

2.3. Main Results

Theorem 1.
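For the MIMO WHFL with $K_\ell$ users and $T$ iterations, given $N$, $\tau$, $\upsilon$, $D$, $\epsilon$, $\delta$, and applying the FBL approach in Section 3, the relationship between PLS, privacy, utility, and the noise variance of LDP is characterized by the following:
$$\max\Bigg\{ \underbrace{\max_{t \in \{1, \dots, T\},\, \Delta g_{\mathrm{mimo}}} \frac{D \cdot 2^{\frac{2}{q(1-\delta)} \log\det\left(I + \frac{\hat{g}_{\mathrm{mimo}} K_{x_1} \hat{g}_{\mathrm{mimo}}^{H}}{\sigma_e^2}\right)} - S_\ell \sigma_{w,t}^2}{K_\ell}}_{\text{Secrecy level of PLS}},\ \underbrace{\max_{t \in \{1, \dots, T\}} \frac{S_\ell \sigma_{w,t}^2}{K_\ell (2^{2\epsilon} - 1)}}_{\text{Privacy term}} \Bigg\} \le \underbrace{\sigma^2}_{\text{LDP noise variance}} \le \underbrace{\frac{\upsilon}{K_\ell}}_{\text{Utility term}}, \tag{16}$$
where $\hat{g}_{\mathrm{mimo}} = g_{\mathrm{mimo}} - \Delta g_{\mathrm{mimo}}$ and $K_{x_1} = E(X_1(t) X_1^H(t))$. In addition, an achievable transmission rate $R_{\mathrm{mimo}}(\tau, N, \delta, D, \upsilon, \epsilon)$ of our proposed FBL approach is given by the following:
$$R_{\mathrm{mimo}}(\tau, N, \delta, D, \upsilon, \epsilon) = \frac{\sum_{t=1}^{T} N_t R_t}{N}, \tag{17}$$
where
$$N = \sum_{t=1}^{T} N_t, \quad R_t = \max_{\sum_{j=1}^{J} P_j = P,\ \sum_{j=1}^{J} \tilde{P}_j = \tilde{P}} \sum_{j=1}^{J} \frac{1}{N_t} \log\left( \frac{3\, \mathrm{SNR}_j d_j^2}{\left[Q^{-1}\left(\frac{\tau}{8J}\right)\right]^2} \left(1 + \frac{\mathrm{SNR}_j d_j^2}{\Psi_1 \Psi_2}\right)^{N_t - 1} \right), \tag{18}$$
$$\Psi_1 = 1 + \frac{\xi d_j^2\, \mathrm{SNR}_j}{\tilde{d}_j^2\, \widetilde{\mathrm{SNR}}_j}, \quad \Psi_2 = \left(1 - \frac{\xi}{\tilde{d}_j^2\, \widetilde{\mathrm{SNR}}_j}\right)^{-1}, \quad \xi = \frac{1}{3} \left[Q^{-1}\left(\frac{\tau}{8J(N_t - 1)}\right)\right]^2, \tag{19}$$
and $\mathrm{SNR}_j = \frac{P_j}{\sigma_1^2}$, $\widetilde{\mathrm{SNR}}_j = \frac{\tilde{P}_j}{\sigma_2^2}$; $d_j$, $\tilde{d}_j$, $P_j$, and $\tilde{P}_j$ are defined in Section 3.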
Proof of Theorem 1. 
Our FBL approach for the MIMO WHFL is an extension of the classical SK scheme for the AWGN channel with noiseless feedback. The key to this extension is composed of three parts:
  • The two-dimensional message mapping method, which maps the message to a complex codeword transmitted over the fading channels.
  • An SVD-based pre-coding strategy that divides the MIMO channel into several parallel SISO channels.
  • The two-dimensional modulo-lattice operation (MLO) that eliminates the impact of feedback channel noise on the performance of the SK scheme.
Details about the above tools and how to combine these tools to show our FBL approach for the MIMO WHFL are given in the next section, and the formal proof of Theorem 1 is in Appendix A. □
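The second ingredient of the proof, splitting a MIMO channel into parallel SISO channels, is a standard consequence of the singular value decomposition. The sketch below illustrates it with NumPy for a noiseless channel; it assumes that the per-stream gains are the singular values of the channel matrix, and the function name `parallel_channels` and the antenna counts are hypothetical.

```python
import numpy as np


def parallel_channels(h, symbols):
    """Send one symbol per SVD sub-channel of a noiseless MIMO channel.

    h: (B, A) complex channel matrix; symbols: length-min(A, B) complex vector.
    Returns (received, gains), where received[j] = gains[j] * symbols[j].
    """
    # h = U @ diag(gains) @ Vh, with gains sorted in decreasing order.
    u, gains, vh = np.linalg.svd(h, full_matrices=False)
    x = vh.conj().T @ symbols      # pre-coding at the transmitter with V
    y = h @ x                      # channel output (noise omitted in this sketch)
    received = u.conj().T @ y      # combining at the receiver with U^H
    return received, gains


def random_channel(rx, tx, seed=0):
    """Illustrative Rayleigh-like channel with unit-variance complex entries."""
    rng = np.random.default_rng(seed)
    real = rng.standard_normal((rx, tx))
    imag = rng.standard_normal((rx, tx))
    return (real + 1j * imag) / np.sqrt(2.0)
```

Since the pre-coding and combining matrices have orthonormal columns, they preserve both the transmit power and the statistics of white Gaussian noise, so each stream can be treated as an independent scalar fading channel on which a scalar feedback scheme is run.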
Remark 1.
Here note that in the FBL approach for the MIMO WHFL, we apply an SVD-based pre-coding strategy to divide the MIMO channel into several parallel SISO channels, which indicates that the FBL approach for the SISO WHFL can be directly obtained since it is a special case of the approach for the MIMO WHFL. The following Corollary 1 proposes an FBL approach for the SISO WHFL and characterizes the relationship between PLS, privacy, utility, and the noise variance of LDP. Since Corollary 1 can be directly obtained from Theorem 1, we omit the detailed proof here.
Corollary 1.
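For the SISO WHFL with $K_\ell$ users and $T$ iterations, given $N$, $\tau$, $\upsilon$, $D$, $\epsilon$, $\delta$, and using a similar FBL approach to that of Theorem 1, the relationship between PLS, privacy, utility, and the noise variance of LDP is characterized by the following:
$$\max\Bigg\{ \underbrace{\max_{t \in \{1, \dots, T\},\, \Delta g_{\mathrm{siso}}} \frac{D \cdot 2^{\frac{2}{q(1-\delta)} \log\left(1 + \frac{|\hat{g}_{\mathrm{siso}}|^2 P}{\sigma_e^2}\right)} - S_\ell \sigma_{w,t}^2}{K_\ell}}_{\text{Secrecy level of PLS}},\ \underbrace{\max_{t \in \{1, \dots, T\}} \frac{S_\ell \sigma_{w,t}^2}{K_\ell (2^{2\epsilon} - 1)}}_{\text{Privacy term}} \Bigg\} \le \underbrace{\sigma^2}_{\text{LDP noise variance}} \le \underbrace{\frac{\upsilon}{K_\ell}}_{\text{Utility term}}, \tag{20}$$
where $\hat{g}_{\mathrm{siso}} = g_{\mathrm{siso}} - \Delta g_{\mathrm{siso}}$. Furthermore, an achievable transmission rate $R_{\mathrm{siso}}(\tau, N, \delta, D, \upsilon, \epsilon)$ of our proposed FBL approach is given by the following:
$$R_{\mathrm{siso}}(\tau, N, \delta, D, \upsilon, \epsilon) = \frac{\sum_{t=1}^{T} N_t R_t}{N}, \quad R_t = \frac{1}{N_t} \log\left( \frac{3\, \mathrm{SNR}\, |h_{\mathrm{siso}}|^2}{\left[Q^{-1}\left(\frac{\tau}{8}\right)\right]^2} \left(1 + \frac{\mathrm{SNR}\, |h_{\mathrm{siso}}|^2}{\Psi_3 \Psi_4}\right)^{N_t - 1} \right), \tag{21}$$
where $N = \sum_{t=1}^{T} N_t$, $\Psi_3 = 1 + \frac{\xi |h_{\mathrm{siso}}|^2\, \mathrm{SNR}}{|\tilde{h}_{\mathrm{siso}}|^2\, \widetilde{\mathrm{SNR}}}$, $\Psi_4 = \left(1 - \frac{\xi}{|\tilde{h}_{\mathrm{siso}}|^2\, \widetilde{\mathrm{SNR}}}\right)^{-1}$, $\xi = \frac{1}{3} \left[Q^{-1}\left(\frac{\tau}{8(N_t - 1)}\right)\right]^2$, $\mathrm{SNR} = \frac{P}{\sigma_1^2}$, $\widetilde{\mathrm{SNR}} = \frac{\tilde{P}}{\sigma_2^2}$, and $|h_{\mathrm{siso}}|$, $|\tilde{h}_{\mathrm{siso}}|$, $|\hat{g}_{\mathrm{siso}}|$ represent the moduli of $h_{\mathrm{siso}}$, $\tilde{h}_{\mathrm{siso}}$ and $\hat{g}_{\mathrm{siso}}$, respectively.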
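To make the interplay in Corollary 1 concrete, the sketch below computes the interval of admissible LDP noise variances for a single round and a fixed estimate of the eavesdropper's channel. It rests on one reading of the corollary, stated here as an assumption: the secrecy term equals (D 2^{2 L / (q (1 - δ))} - S σ_w^2) / K with leakage L = log2(1 + |ĝ|^2 P / σ_e^2), the privacy term equals S σ_w^2 / (K (2^{2ε} - 1)), and the utility term equals υ / K. The maximization over rounds and CSI errors is omitted, and all names are hypothetical.

```python
import math


def ldp_variance_interval(D, q, delta, g_hat_abs2, P, eve_noise_var,
                          S, grad_var, K, eps, upsilon):
    """Return (lower, upper) bounds on the LDP noise variance (single round).

    Assumes 0 <= delta < 1 and eps > 0; the interval is empty if lower > upper.
    """
    # Eavesdropper's leakage in bits (assumed form of the log term).
    leak_bits = math.log2(1.0 + g_hat_abs2 * P / eve_noise_var)
    # Secrecy: the source entropy must exceed leak_bits / (1 - delta).
    secrecy = (D * 2.0 ** (2.0 * leak_bits / (q * (1.0 - delta))) - S * grad_var) / K
    # Privacy: per-dimension mutual information at most eps bits.
    privacy = S * grad_var / (K * (2.0 ** (2.0 * eps) - 1.0))
    # Utility: per-dimension distortion K * sigma^2 at most upsilon.
    upper = upsilon / K
    return max(secrecy, privacy), upper
```

A non-empty interval means that some noise variance meets the secrecy, privacy, and utility targets at once; tightening δ or ε raises the lower end, while tightening υ lowers the upper end.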
Theorem 2.
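For the SIMO WHFL with $K_\ell$ users and $T$ iterations, given $N$, $\tau$, $\upsilon$, $D$, $\epsilon$, $\delta$, and using the FBL approach in Section 4, the relationship between PLS, privacy, utility, and the noise variance of LDP is characterized by the following:
$$\max\Bigg\{ \underbrace{\max_{t \in \{1, \dots, T\},\, \Delta g_{\mathrm{simo}}} \frac{D \cdot 2^{\frac{2}{q(1-\delta)} \log\left(1 + \frac{\|\hat{g}_{\mathrm{simo}}\|^2 P}{\sigma_e^2}\right)} - S_\ell \sigma_{w,t}^2}{K_\ell}}_{\text{Secrecy level of PLS}},\ \underbrace{\max_{t \in \{1, \dots, T\}} \frac{S_\ell \sigma_{w,t}^2}{K_\ell (2^{2\epsilon} - 1)}}_{\text{Privacy term}} \Bigg\} \le \underbrace{\sigma^2}_{\text{LDP noise variance}} \le \underbrace{\frac{\upsilon}{K_\ell}}_{\text{Utility term}}, \tag{22}$$
where $\hat{g}_{\mathrm{simo}} = g_{\mathrm{simo}} - \Delta g_{\mathrm{simo}}$. In addition, an achievable transmission rate $R_{\mathrm{simo}}(\tau, N, \delta, D, \upsilon, \epsilon)$ of our proposed FBL approach is given by the following:
$$R_{\mathrm{simo}}(\tau, N, \delta, D, \upsilon, \epsilon) = \frac{\sum_{t=1}^{T} N_t R_t}{N}, \quad R_t = \frac{1}{N_t} \log\left( \frac{3\, \mathrm{SNR}\, \|h_{\mathrm{simo}}\|^2}{\left[Q^{-1}\left(\frac{\tau}{8}\right)\right]^2} \left(1 + \frac{\mathrm{SNR}\, \|h_{\mathrm{simo}}\|^2}{\Psi_5 \Psi_6}\right)^{N_t - 1} \right), \tag{23}$$
where $\Psi_5 = 1 + \frac{\xi \|h_{\mathrm{simo}}\|^2\, \mathrm{SNR}}{\|\tilde{h}_{\mathrm{simo}}\|^2\, \widetilde{\mathrm{SNR}}}$, $\Psi_6 = \left(1 - \frac{\xi}{\|\tilde{h}_{\mathrm{simo}}\|^2\, \widetilde{\mathrm{SNR}}}\right)^{-1}$, and $\mathrm{SNR}$, $\widetilde{\mathrm{SNR}}$, $N$, and $\xi$ are given in Corollary 1.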
Proof of Theorem 2. 
The difference between the approaches in Theorems 1 and 2 is that for the SIMO case, we use a beamforming strategy together with a new pre-coding strategy instead of the SVD-based pre-coding strategy used for the MIMO case. Here the beamforming and new pre-coding strategies respectively transform the feedforward and feedback channels into SISO channels. Then, along the lines of the encoding-decoding procedure in Section 3.1.3, the FBL approach for the SIMO WHFL is obtained, and the detail about this approach is in Section 4. Finally, since the proof of Theorem 2 is included in that of Theorem 1, we omit the formal proof here. □
Remark 2.
Here, note that in the SIMO WHFL, a beamforming strategy transforms the SIMO feedforward channel into the SISO feedforward channel, while a new pre-coding strategy transforms the MISO feedback channel into the SISO feedback channel. Analogously, for the MISO WHFL, first, we apply the pre-coding strategy of the SIMO WHFL to transform the MISO feedforward channel into the SISO feedforward channel, and the beamforming strategy of the SIMO WHFL to transform the SIMO feedback channel into the SISO feedback channel, then along the lines of the encoding-decoding procedure in Theorem 2, the following Corollary 2 for the MISO WHFL is obtained. As the proof follows a similar way to that of Theorem 2, the detailed proof is omitted here.
Corollary 2.
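For the MISO WHFL with $K_\ell$ users and $T$ iterations, given $N$, $\tau$, $\upsilon$, $D$, $\epsilon$ and $\delta$, and using a similar FBL approach to that of Theorem 2, the relationship between PLS, privacy, utility, and the noise variance of LDP is characterized by the following:
$$\max\Bigg\{ \underbrace{\max_{t \in \{1, \dots, T\},\, \Delta g_{\mathrm{miso}}} \frac{D \cdot 2^{\frac{2}{q(1-\delta)} \log\left(1 + \frac{\|\hat{g}_{\mathrm{miso}}\|^2 P}{\sigma_e^2}\right)} - S_\ell \sigma_{w,t}^2}{K_\ell}}_{\text{Secrecy level of PLS}},\ \underbrace{\max_{t \in \{1, \dots, T\}} \frac{S_\ell \sigma_{w,t}^2}{K_\ell (2^{2\epsilon} - 1)}}_{\text{Privacy term}} \Bigg\} \le \underbrace{\sigma^2}_{\text{LDP noise variance}} \le \underbrace{\frac{\upsilon}{K_\ell}}_{\text{Utility term}}, \tag{24}$$
where $\hat{g}_{\mathrm{miso}} = g_{\mathrm{miso}} - \Delta g_{\mathrm{miso}}$. In addition, an achievable transmission rate $R_{\mathrm{miso}}(\tau, N, \delta, D, \upsilon, \epsilon)$ of our proposed FBL approach is given by the following:
$$R_{\mathrm{miso}}(\tau, N, \delta, D, \upsilon, \epsilon) = \frac{\sum_{t=1}^{T} N_t R_t}{N}, \quad R_t = \frac{1}{N_t} \log\left( \frac{3\, \mathrm{SNR}\, \|h_{\mathrm{miso}}\|^2}{\left[Q^{-1}\left(\frac{\tau}{8}\right)\right]^2} \left(1 + \frac{\mathrm{SNR}\, \|h_{\mathrm{miso}}\|^2}{\Psi_7 \Psi_8}\right)^{N_t - 1} \right), \tag{25}$$
where $\Psi_7 = 1 + \frac{\xi \|h_{\mathrm{miso}}\|^2\, \mathrm{SNR}}{\|\tilde{h}_{\mathrm{miso}}\|^2\, \widetilde{\mathrm{SNR}}}$, $\Psi_8 = \left(1 - \frac{\xi}{\|\tilde{h}_{\mathrm{miso}}\|^2\, \widetilde{\mathrm{SNR}}}\right)^{-1}$, and $\mathrm{SNR}$, $\widetilde{\mathrm{SNR}}$, $N$ and $\xi$ are given in Corollary 1.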

3. An FBL Approach for the MIMO WHFL

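For the WHFL in the MIMO case, (9)–(11) can be re-written as follows:
$$Y_i(t) = h_{\mathrm{mimo}} X_i(t) + \eta_{1,i}(t), \quad 1 \le i \le N_t, \tag{26}$$
$$\tilde{Y}_i(t) = \tilde{h}_{\mathrm{mimo}} \tilde{X}_i(t) + \eta_{2,i}(t), \quad 1 \le i \le N_t - 1, \tag{27}$$
$$Z_i(t) = g_{\mathrm{mimo}} X_i(t) + \tilde{g}_{\mathrm{mimo}} \tilde{X}_i(t) + \eta_{e,i}(t), \quad 1 \le i \le N_t, \tag{28}$$
where $h_{\mathrm{mimo}} \in \mathbb{C}^{B \times A}$, $\tilde{h}_{\mathrm{mimo}} \in \mathbb{C}^{A \times B}$, $g_{\mathrm{mimo}} \in \mathbb{C}^{C \times A}$, $\tilde{g}_{\mathrm{mimo}} \in \mathbb{C}^{C \times B}$, $X_i(t) \in \mathbb{C}^{A \times 1}$, $\tilde{X}_i(t) \in \mathbb{C}^{B \times 1}$, and the elements of $\eta_{1,i}(t) \in \mathbb{C}^{B \times 1}$, $\eta_{2,i}(t) \in \mathbb{C}^{A \times 1}$ and $\eta_{e,i}(t) \in \mathbb{C}^{C \times 1}$ are i.i.d. as $\mathcal{CN}(0, \sigma_1^2)$, $\mathcal{CN}(0, \sigma_2^2)$ and $\mathcal{CN}(0, \sigma_e^2)$, respectively. Note that the feedforward channel (26) and the feedback channel (27) are both MIMO channels.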
In this section, for the MIMO WHFL system, an FBL approach is proposed, which combines the two-dimensional message mapping method, the two-dimensional MLO, and the SVD technique, see the following Figure 3. To facilitate a better understanding of Figure 3, we introduce the two-dimensional message mapping method and the two-dimensional MLO below.
The two-dimensional message mapping method: We first review the message mapping in the classical SK scheme [35] (see Figure 4a). Specifically, for a given codeword length $n$, let the message $W \in \mathcal{W} = \{1, 2, \ldots, 2^{nR}\}$ with $|\mathcal{W}| = 2^{nR}$, where $R$ is the transmission rate. Partition the interval $[-\sqrt{3}, \sqrt{3}]$ into $2^{nR}$ equal sub-intervals, with each sub-interval’s midpoint corresponding to a message in $\mathcal{W}$. Let $\theta$ denote the midpoint associated with message $W$, where the variance of $\theta$ is approximately 1. This one-dimensional mapping method is shown to be optimal for AWGN channels with real signals. To address the complexity of fading channels, we introduce a two-dimensional message mapping method, detailed as follows:
For a given codeword length $n$, let the message $W = (W_R, W_I)$, where $W$, $W_R$ and $W_I$ are uniformly distributed in $\mathcal{W} = \{1, \ldots, 2^{nR}\}$, $\mathcal{W}_R = \{1, \ldots, 2^{nR_R}\}$ and $\mathcal{W}_I = \{1, \ldots, 2^{nR_I}\}$, respectively, and $R_R + R_I = R$. Since the message $W$ is composed of two parts, we place the points $(W_R, W_I)$ in a complex square grid with corners located at $(\pm\sqrt{3}, \pm \mathrm{j}\sqrt{3})$ (see Figure 4b). Divide the entire square grid into $2^{n(R_R + R_I)}$ equally spaced sub-grids, and map the center point of each sub-grid to a pair of values in $W = (W_R, W_I)$. Let $\theta = \theta_R + \mathrm{j}\theta_I$ be the center point of the sub-grid with respect to (w.r.t.) the message $W = (W_R, W_I)$, where $\theta_R$ and $\theta_I$ represent the real and imaginary components of $\theta$, respectively, and the variance of $\theta$ approximately equals 2.
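As an illustration of the mapping just described, the following minimal sketch places each message pair at the center of its sub-grid and checks that the resulting constellation has variance close to 2. The function names and the assumption that $nR_R$ and $nR_I$ are integers (passed as `bits_r` and `bits_i`) are ours and are not taken from the paper.

```python
import math

SQRT3 = math.sqrt(3.0)


def grid_center(w_r, w_i, bits_r, bits_i):
    """Return theta for the 1-indexed message pair (w_r, w_i).

    bits_r and bits_i play the role of n*R_R and n*R_I (assumed integers here).
    """
    cells_r, cells_i = 2 ** bits_r, 2 ** bits_i
    if not (1 <= w_r <= cells_r and 1 <= w_i <= cells_i):
        raise ValueError("message index outside the grid")
    step_r = 2.0 * SQRT3 / cells_r  # sub-grid width along the real axis
    step_i = 2.0 * SQRT3 / cells_i  # sub-grid width along the imaginary axis
    return complex(-SQRT3 + (w_r - 0.5) * step_r,
                   -SQRT3 + (w_i - 0.5) * step_i)


def constellation_variance(bits_r, bits_i):
    """Average |theta|^2 over all message pairs (the mean is zero by symmetry)."""
    points = [grid_center(a, b, bits_r, bits_i)
              for a in range(1, 2 ** bits_r + 1)
              for b in range(1, 2 ** bits_i + 1)]
    return sum(abs(p) ** 2 for p in points) / len(points)
```

For a grid with $M_R \times M_I$ cells, the variance of the center points is $2 - 1/M_R^2 - 1/M_I^2$, which approaches 2 as the grid becomes finer.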
The two-dimensional MLO: The two-dimensional MLO is given by the following:
$$M_\Lambda[x] \overset{\mathrm{def}}{=} x - Q[x], \tag{29}$$
where the two-dimensional lattice $\Lambda = \Lambda_R + \mathrm{j}\Lambda_I$ is a complex plane with $\Lambda_R \in [-\frac{d}{2}, \frac{d}{2}]$, $\Lambda_I \in [-\frac{d}{2}, \frac{d}{2}]$, $d > 0$, $\mathrm{j} = \sqrt{-1}$, $Q[x]$ is the nearest-neighbor quantization of $x$ w.r.t. $\Lambda$, and $x$ is a complex-valued number. Some basic properties of the two-dimensional MLO [39] are listed below.
Proposition 1. 
(1). The distributive law $M_\Lambda[M_\Lambda[x] + y] = M_\Lambda[x + y]$.
(2). If $x + y \in \Lambda$, then $M_\Lambda[x + y] = x + y$; otherwise, a modulo-aliasing error occurs.
(3). Let the dither signal $\nu$ be uniformly distributed on $\Lambda$; then $M_\Lambda[x + \nu]$ is uniformly distributed on $\Lambda$, where $\mathrm{Var}(M_\Lambda[x + \nu]) = \frac{d^2}{12} + \frac{d^2}{12} = \frac{d^2}{6}$.
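The properties in Proposition 1 can be checked numerically. The sketch below is one possible implementation of the two-dimensional MLO, folding the real and imaginary parts separately into $[-d/2, d/2)$; the helper names, the sample distribution of $x$ and $y$, and the trial count are illustrative assumptions.

```python
import math
import random


def mod_lattice(x, d):
    """Two-dimensional MLO: wrap the real and imaginary parts of x into [-d/2, d/2)."""
    def fold(v):
        return v - d * math.floor(v / d + 0.5)
    return complex(fold(x.real), fold(x.imag))


def check_properties(d=2.0, trials=20000, seed=0):
    """Return (largest violation of the distributive law, average dithered power)."""
    rng = random.Random(seed)
    max_gap = 0.0
    power = 0.0
    for _ in range(trials):
        x = complex(rng.gauss(0.0, 3.0), rng.gauss(0.0, 3.0))
        y = complex(rng.gauss(0.0, 3.0), rng.gauss(0.0, 3.0))
        lhs = mod_lattice(mod_lattice(x, d) + y, d)
        rhs = mod_lattice(x + y, d)
        # Compare modulo the lattice so a rounding flip at the cell edge is not counted.
        max_gap = max(max_gap, abs(mod_lattice(lhs - rhs, d)))
        nu = complex(rng.uniform(-d / 2, d / 2), rng.uniform(-d / 2, d / 2))
        power += abs(mod_lattice(x + nu, d)) ** 2
    return max_gap, power / trials
```

With $d = 2$, the average power of the dithered output should be close to $d^2/6 \approx 0.667$, independently of the distribution of $x$.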
The classical SK scheme does not work in the noisy feedback case because the transmitter can no longer obtain the receiver’s estimation error accurately. We show that by applying the two-dimensional MLO to both the feedforward and feedback encoders, the adverse effects of feedback channel noise on the SK scheme’s performance can be mitigated, which allows the SK-type scheme to remain effective even in the presence of noisy feedback. Figure 5a,b illustrates the differences between the classical SK scheme and the modified SK-type scheme utilizing the two-dimensional MLO.

3.1. An FBL Approach for the MIMO WHFL

3.1.1. Channel Decomposition by SVD

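Based on the SVD technique, the matrices $h_{mimo}$ and $\tilde{h}_{mimo}$ can be expressed as follows:
$$h_{mimo} = U \Lambda V^H, \quad \tilde{h}_{mimo} = \tilde{U} \tilde{\Lambda} \tilde{V}^H, \tag{30}$$
where $U, \tilde{V}^H \in \mathbb{C}^{B \times B}$ and $\tilde{U}, V^H \in \mathbb{C}^{A \times A}$ are unitary matrices. The diagonal matrices $\Lambda \in \mathbb{C}^{B \times A}$ and $\tilde{\Lambda} \in \mathbb{C}^{A \times B}$ have non-negative real diagonal elements $(d_1, \ldots, d_J)$ and $(\tilde{d}_1, \ldots, \tilde{d}_J)$ [49], respectively, and
$$J = \min(A, B). \tag{31}$$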
According to (26) and (30), we have the following:
$$U^H Y_i(t) = \Lambda V^H X_i(t) + U^H \eta_{1,i}(t) \;\Longrightarrow\; Y'_i(t) = \Lambda X'_i(t) + \eta'_{1,i}(t), \tag{32}$$
where $Y'_i(t) = U^H Y_i(t)$, $\eta'_{1,i}(t) = U^H \eta_{1,i}(t) \in \mathbb{C}^{B \times 1}$ and $X'_i(t) = V^H X_i(t) \in \mathbb{C}^{A \times 1}$. It is noted that $E(X_i'^H(t) X'_i(t)) = E(X_i^H(t) X_i(t))$ and $E(\eta_{1,i}'^H(t) \eta'_{1,i}(t)) = E(\eta_{1,i}^H(t) \eta_{1,i}(t))$, ensuring that the power constraint of $X'_i(t)$ is equal to that of $X_i(t)$, and the distributions of $\eta'_{1,i}(t)$ and $\eta_{1,i}(t)$ remain the same. As $\Lambda$ is a diagonal matrix, (32) can be decomposed as follows:
$$Y'_{j,i}(t) = d_j X'_{j,i}(t) + \eta'_{j,1,i}(t), \quad 1 \le j \le J, \; 1 \le i \le N_t, \tag{33}$$
where $Y'_{j,i}(t)$, $X'_{j,i}(t)$ and $\eta'_{j,1,i}(t)$ denote the $j$-th components of $Y'_i(t)$, $X'_i(t)$ and $\eta'_{1,i}(t)$, respectively.
Similarly, from (27) and (30), (27) can be decomposed as follows:
$$\tilde{Y}'_{j,i}(t) = \tilde{d}_j \tilde{X}'_{j,i}(t) + \eta'_{j,2,i}(t), \quad 1 \le j \le J, \; 1 \le i \le N_t - 1, \tag{34}$$
where $\tilde{Y}'_{j,i}(t)$, $\tilde{X}'_{j,i}(t)$ and $\eta'_{j,2,i}(t)$ denote the $j$-th components of $\tilde{Y}'_i(t)$, $\tilde{X}'_i(t)$ and $\eta'_{2,i}(t)$, respectively, and $\tilde{Y}'_i(t) = \tilde{U}^H \tilde{Y}_i(t)$, $\eta'_{2,i}(t) = \tilde{U}^H \eta_{2,i}(t) \in \mathbb{C}^{A \times 1}$ and $\tilde{X}'_i(t) = \tilde{V}^H \tilde{X}_i(t) \in \mathbb{C}^{B \times 1}$. As shown in (32)–(34), by applying the SVD technique, the feedforward and feedback MIMO channels can be effectively transformed into $J$ parallel SISO sub-channels.
Power allocation: The edge server assigns power $P_1, \ldots, P_J$ to the $J$ parallel sub-channels of the feedforward channel, where $\sum_{j=1}^{J} P_j = P$. Similarly, the cloud server distributes power $\tilde{P}_1, \ldots, \tilde{P}_J$ across the $J$ parallel sub-channels of the feedback channel, where $\sum_{j=1}^{J} \tilde{P}_j = \tilde{P}$.

3.1.2. Message Splitting

For given $\tau$, $N_t$, $\upsilon$, $D$ and $\epsilon$, we define the following:
$$|\mathcal{W}_t| = 2^{N_t R_t} = 2^{q R_t(D)}, \quad R_t = \frac{H(W_t)}{N_t}. \tag{35}$$
Next, the message $W_t$ is divided into $J$ independent components $(W_{t,1}, \ldots, W_{t,J})$, where $W_{t,j}$ is uniformly distributed over the set $\mathcal{W}_{t,j} = \{1, 2, \ldots, 2^{N_t R_{t,j}}\}$ and $j = 1, \ldots, J$. Then, we divide each sub-message $W_{t,j}$ into $W_{t,j} = (W_{t,j,R}, W_{t,j,I})$, where $W_{t,j,R}$ and $W_{t,j,I}$ are uniformly distributed over the sets $\mathcal{W}_{t,j,R} = \{1, 2, \ldots, 2^{N_t R_{t,j,R}}\}$ and $\mathcal{W}_{t,j,I} = \{1, 2, \ldots, 2^{N_t R_{t,j,I}}\}$, respectively. The rate for each parallel sub-channel is defined as $R_{t,j} = R_{t,j,R} + R_{t,j,I}$. Consequently, the total rate $R_t$ for all $J$ parallel sub-channels during the $t$-th communication round is as follows:
$$R_t = \sum_{j=1}^{J} (R_{t,j,R} + R_{t,j,I}). \tag{36}$$

3.1.3. An FBL Scheme of Each Parallel Sub-Channel

By using the two-dimensional message mapping method introduced at the beginning of Section 3, the message $W_{t,j}$ is mapped to the center point $\theta_j$ of its corresponding sub-grid.
Initialization: At time instant 1, the edge server maps the message $W_{t,j}$ to $\theta_j = \theta_{R,j} + \mathrm{j}\theta_{I,j}$, and sends the following:
$$X'_{j,1}(t) = \sqrt{\frac{P_j}{2}}\, \theta_j. \tag{37}$$
Then, the cloud server computes the first estimate $\hat{\theta}_{j,1}$ of $\theta_j$ by the following:
$$\hat{\theta}_{j,1} = \frac{Y'_{j,1}(t)}{d_j \sqrt{P_j/2}} = \theta_j + \frac{\eta'_{j,1,1}(t)}{d_j \sqrt{P_j/2}} = \theta_j + \varepsilon_1, \tag{38}$$
where $\varepsilon_1 = \varepsilon_{R,1} + \mathrm{j}\varepsilon_{I,1} = \hat{\theta}_{j,1} - \theta_j$ is the estimation error of the cloud server at time instant 1. Define $\alpha_1 = \mathrm{Var}(\varepsilon_1) = \frac{2\sigma_1^2}{d_j^2 P_j}$, $\alpha_{R,1} = \mathrm{Var}(\varepsilon_{R,1}) = \frac{\sigma_1^2}{d_j^2 P_j}$ and $\alpha_{I,1} = \mathrm{Var}(\varepsilon_{I,1}) = \frac{\sigma_1^2}{d_j^2 P_j}$.
Iteration: First, we introduce a shared i.i.d. dither sequence $\nu^{N_t - 1} = (\nu_1, \ldots, \nu_{N_t - 1})$, which is perfectly known by both the edge server and the cloud server and is uniformly distributed on $\Lambda$ ($\Lambda = \Lambda_R + \mathrm{j}\Lambda_I$ is a complex plane with $\Lambda_R \in [-\frac{d}{2}, \frac{d}{2}]$, $\Lambda_I \in [-\frac{d}{2}, \frac{d}{2}]$), with $d = \sqrt{6\tilde{P}_j}$. Here $\nu^{N_t - 1}$ is independent of all signals transmitted over the channels. At time instant $i$ ($2 \le i \le N_t$), using the two-dimensional MLO introduced at the beginning of Section 3, the cloud server sends the following:
$$\tilde{X}'_{j,i-1}(t) = M_\Lambda[\gamma_{i-1} \hat{\theta}_{j,i-1} + \nu_{i-1}], \tag{39}$$
where $\gamma_{i-1}$ is a modulation coefficient. From Property (3) of Proposition 1, we have $E(\tilde{X}_{j,i-1}'^H(t) \tilde{X}'_{j,i-1}(t)) = \tilde{P}_j$ (the dither signals guarantee that the codeword transmitted by the cloud server meets the power constraint). Then the edge server computes a noisy version of the estimation error $\varepsilon_{i-1} = \hat{\theta}_{j,i-1} - \theta_j$ by the following:
$$\tilde{\varepsilon}_{i-1} = \frac{1}{\gamma_{i-1}} M_\Lambda\!\left[\frac{\tilde{Y}'_{j,i-1}(t)}{\tilde{d}_j} - \gamma_{i-1}\theta_j - \nu_{i-1}\right] \overset{(a)}{=} \frac{1}{\gamma_{i-1}} M_\Lambda\!\left[\gamma_{i-1}\varepsilon_{i-1} + \frac{\eta'_{j,2,i-1}(t)}{\tilde{d}_j}\right], \tag{40}$$
where (a) is due to the modulo distributive law in Property (1) of Proposition 1. No modulo-aliasing error occurs at the edge server if $\gamma_{i-1}\varepsilon_{i-1} + \frac{\eta'_{j,2,i-1}(t)}{\tilde{d}_j} \in \Lambda$. In that case, the edge server obtains $\tilde{\varepsilon}_{i-1} = \varepsilon_{i-1} + \frac{\eta'_{j,2,i-1}(t)}{\gamma_{i-1}\tilde{d}_j}$. Then, the edge server sends the following:
$$X'_{j,i}(t) = \lambda_{i-1}\gamma_{i-1}\tilde{\varepsilon}_{i-1}, \tag{41}$$
where $\lambda_{i-1}$ is chosen to satisfy the transmitter’s power constraint $P_j$. Then, the cloud server updates $\hat{\theta}_{j,i}$ by computing the following:
$$\hat{\theta}_{j,i} = \hat{\theta}_{j,i-1} - \hat{\varepsilon}_{i-1} = \hat{\theta}_{j,i-1} - \beta_i \frac{Y'_{j,i}(t)}{d_j}, \tag{42}$$
where $\hat{\varepsilon}_{i-1} = \beta_i \frac{Y'_{j,i}(t)}{d_j}$, and the MMSE estimation coefficient $\beta_i$ is given by the following:
$$\beta_i = \frac{E\!\left(\varepsilon_{i-1} \frac{Y'_{j,i}(t)^H}{d_j}\right)}{E\!\left(\frac{Y'_{j,i}(t) Y'_{j,i}(t)^H}{d_j^2}\right)}, \tag{43}$$
which ensures that $\varepsilon_{i-1}$ is correctly estimated from $Y'_{j,i}(t)$. Defining $\varepsilon_i = \varepsilon_{R,i} + \mathrm{j}\varepsilon_{I,i} = \hat{\theta}_{j,i} - \theta_j$, (42) yields the following:
$$\varepsilon_i = \varepsilon_{i-1} - \beta_i \frac{Y'_{j,i}(t)}{d_j}. \tag{44}$$
Further define $\alpha_i = \mathrm{Var}(\varepsilon_i)$, $\alpha_{R,i} = \mathrm{Var}(\varepsilon_{R,i})$, $\alpha_{I,i} = \mathrm{Var}(\varepsilon_{I,i})$. Since the estimation error $\varepsilon_i$ is CSCG-distributed, we conclude that $\alpha_{R,i} = \alpha_{I,i} = \frac{\alpha_i}{2}$.
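Decoding: At time instant $N_t$, the final estimate obtained by the cloud server is $\hat{\theta}_{j,N_t} = \theta_j + \varepsilon_{N_t}$, where $\varepsilon_{N_t} = \varepsilon_{R,N_t} + \mathrm{j}\varepsilon_{I,N_t}$. The cloud server successfully decodes the message $W_{t,j}$ if $\hat{\theta}_{j,N_t}$ is closest to the message point $\theta_j$, i.e., $\varepsilon_{R,N_t} \in \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}\right)$ and $\varepsilon_{I,N_t} \in \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,I}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,I}}}\right)$.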
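To make the initialization, iteration, and decoding steps concrete, the following sketch simulates one parallel sub-channel with noisy feedback. It is a minimal illustration under stated assumptions, not the authors’ implementation: the coefficients $\xi$, $\gamma_i$ and $\lambda_i$ are chosen as in (A12), (A14) and (A16) of Appendix A, $\beta_i$ follows the MMSE rule (43), the error variance is updated with the generic linear MMSE formula, and all function names and numerical values (powers, noise variances, block length, $\tau$) are hypothetical.

```python
import math
import random
from statistics import NormalDist


def q_inv(p):
    """Inverse Gaussian tail function: Q(q_inv(p)) = p."""
    return -NormalDist().inv_cdf(p)


def fold(x, d):
    """Two-dimensional MLO: wrap real and imaginary parts into [-d/2, d/2)."""
    return complex(x.real - d * math.floor(x.real / d + 0.5),
                   x.imag - d * math.floor(x.imag / d + 0.5))


def cgauss(var, rng):
    """One sample of CN(0, var)."""
    s = math.sqrt(var / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def run_subchannel(theta, P, P_fb, var1, var2, d_j, d_fb, n_uses, tau, rng, J=1):
    """Run the scheme once; return (final estimate, predicted error variance)."""
    xi = q_inv(tau / (8.0 * J * (n_uses - 1))) ** 2 / 3.0  # as in (A12)
    if P_fb / xi <= var2 / d_fb ** 2:
        raise ValueError("feedback SNR too low for this tau")
    d = math.sqrt(6.0 * P_fb)       # lattice side length
    lam = math.sqrt(xi * P / P_fb)  # as in (A16)
    amp = math.sqrt(P / 2.0)
    # Initialization (time instant 1).
    est = (d_j * amp * theta + cgauss(var1, rng)) / (d_j * amp)
    alpha = 2.0 * var1 / (d_j ** 2 * P)
    for _ in range(n_uses - 1):
        gamma = math.sqrt((P_fb / xi - var2 / d_fb ** 2) / alpha)  # as in (A14)
        nu = complex(rng.uniform(-d / 2, d / 2), rng.uniform(-d / 2, d / 2))
        # Cloud server: dithered, folded feedback of its current estimate.
        y_fb = d_fb * fold(gamma * est + nu, d) + cgauss(var2, rng)
        # Edge server: noisy copy of the estimation error, then forward it.
        err = fold(y_fb / d_fb - gamma * theta - nu, d) / gamma
        y = d_j * lam * gamma * err + cgauss(var1, rng)
        # Cloud server: linear MMSE correction of the estimate.
        beta = lam * gamma * alpha / (P + var1 / d_j ** 2)
        est -= beta * y / d_j
        alpha -= beta * lam * gamma * alpha  # generic MMSE variance update
    return est, alpha


def monte_carlo(trials=2000, seed=1):
    """Return (empirical mean-square error, predicted variance) after the last use."""
    rng = random.Random(seed)
    theta = complex(0.3, -0.7)
    total = 0.0
    for _ in range(trials):
        est, alpha = run_subchannel(theta, P=10.0, P_fb=100.0, var1=1.0, var2=1.0,
                                    d_j=1.0, d_fb=1.0, n_uses=8, tau=1e-9, rng=rng)
        total += abs(est - theta) ** 2
    return total / trials, alpha
```

In this sketch the empirical mean-square error of the final estimate should track the predicted variance, which shrinks by a constant factor per channel use as long as no modulo-aliasing error occurs.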
The formal proof of Theorem 1 is provided in Appendix A.

4. An FBL Approach for the SIMO WHFL

For the SIMO WHFL, (9)–(11) can be rewritten as follows:
$$Y_i(t) = h_{simo} X_i(t) + \eta_{1,i}(t), \quad 1 \le i \le N_t, \tag{45}$$
$$\tilde{Y}_i(t) = \tilde{h}_{simo} \tilde{X}_i(t) + \eta_{2,i}(t), \quad 1 \le i \le N_t - 1, \tag{46}$$
$$Z_i(t) = g_{simo} X_i(t) + \tilde{g}_{simo} \tilde{X}_i(t) + \eta_{e,i}(t), \quad 1 \le i \le N_t, \tag{47}$$
where $h_{simo} \in \mathbb{C}^{B \times 1}$, $\tilde{h}_{simo} \in \mathbb{C}^{1 \times B}$, $g_{simo} \in \mathbb{C}^{C \times 1}$, $\tilde{g}_{simo} \in \mathbb{C}^{C \times B}$, $X_i(t) \in \mathbb{C}^{1 \times 1}$, $\tilde{X}_i(t) \in \mathbb{C}^{B \times 1}$, the elements of $\eta_{1,i}(t) \in \mathbb{C}^{B \times 1}$ and $\eta_{e,i}(t) \in \mathbb{C}^{C \times 1}$ are i.i.d. as $\mathcal{CN}(0, \sigma_1^2)$ and $\mathcal{CN}(0, \sigma_e^2)$, respectively, and $\eta_{2,i}(t) \in \mathbb{C}^{1 \times 1} \sim \mathcal{CN}(0, \sigma_2^2)$. Note that the feedforward channel (45) is a SIMO channel, while the feedback channel (46) is a MISO channel. Unlike the SVD technique used for the MIMO WHFL, which decomposes the MIMO channel into several parallel SISO channels, we use a beamforming strategy to transform the feedforward SIMO channel into a SISO channel, and a new pre-coding strategy to transform the feedback MISO channel into a SISO channel (see Figure 6). Further applying the approach for each SISO channel (see Section 3.1.3), the FBL approach for the SIMO WHFL is obtained, and the details are given below.
Beamforming strategy: The signal received by the cloud server in (45) can be processed as follows:
$$h_{simo}^H Y_i(t) = h_{simo}^H h_{simo} X_i(t) + h_{simo}^H \eta_{1,i}(t) = \|h_{simo}\|^2 X_i(t) + h_{simo}^H \eta_{1,i}(t) \;\Longrightarrow\; \bar{Y}_i(t) = \|h_{simo}\|^2 X_i(t) + \bar{\eta}_{1,i}(t), \tag{48}$$
where $\bar{Y}_i(t) = h_{simo}^H Y_i(t) \in \mathbb{C}^{1 \times 1}$ and $\bar{\eta}_{1,i}(t) = h_{simo}^H \eta_{1,i}(t) \in \mathbb{C}^{1 \times 1}$. Applying (48), the feedforward SIMO channel is transformed into a SISO channel.
A new pre-coding strategy: For the feedback channel (46), let
$$\tilde{X}_i(t) = \frac{\tilde{h}_{simo}^H}{\|\tilde{h}_{simo}\|} \tilde{X}'_i(t), \tag{49}$$
where $\tilde{X}'_i(t) \in \mathbb{C}^{1 \times 1}$ and $E(\tilde{X}_i^H(t) \tilde{X}_i(t)) = E\!\left(\tilde{X}_i'^H(t) \frac{\tilde{h}_{simo}}{\|\tilde{h}_{simo}\|} \frac{\tilde{h}_{simo}^H}{\|\tilde{h}_{simo}\|} \tilde{X}'_i(t)\right) = E(\tilde{X}_i'^H(t) \tilde{X}'_i(t)) = \tilde{P}$, which indicates that the power constraint of $\tilde{X}'_i(t)$ is equal to that of $\tilde{X}_i(t)$. Hence, substituting (49) into (46), we have the following:
$$\tilde{Y}_i(t) = \tilde{h}_{simo} \frac{\tilde{h}_{simo}^H}{\|\tilde{h}_{simo}\|} \tilde{X}'_i(t) + \eta_{2,i}(t) = \|\tilde{h}_{simo}\| \tilde{X}'_i(t) + \eta_{2,i}(t), \tag{50}$$
which indicates that the feedback MISO channel is transformed into a SISO channel. Hence, along the lines of the encoding-decoding procedure in Section 3.1.3, the FBL approach for the SIMO WHFL is obtained.
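The two reductions above can be verified with a few lines of code. The sketch below, with hypothetical helper names and noise omitted for clarity, checks that receive combining with $h_{simo}^H$ yields the scalar gain $\|h_{simo}\|^2$, and that the pre-coder in (49) preserves the transmit power while yielding the scalar gain $\|\tilde{h}_{simo}\|$.

```python
import math


def norm(v):
    """Euclidean norm of a complex vector given as a list."""
    return math.sqrt(sum(abs(a) ** 2 for a in v))


def beamform(h, y):
    """Receive combining h^H y for the feedforward SIMO channel."""
    return sum(a.conjugate() * b for a, b in zip(h, y))


def precode(h_fb, symbol):
    """Transmit vector (h_fb^H / ||h_fb||) * symbol for the feedback MISO channel."""
    n = norm(h_fb)
    return [a.conjugate() * symbol / n for a in h_fb]


def miso_output(h_fb, x_vec):
    """Noise-free MISO channel output h_fb x for a row vector h_fb."""
    return sum(a * b for a, b in zip(h_fb, x_vec))
```

For example, with `h = [1+2j, -0.5j, 0.3-1j]` and a scalar symbol `x`, `beamform(h, [a * x for a in h])` equals `norm(h) ** 2 * x`, and `miso_output(h, precode(h, x))` equals `norm(h) * x`, up to floating-point rounding.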
Since the proof of Theorem 2 is included in the proof of Theorem 1, we omit the detailed proof here.

5. Simulation Results

5.1. Experimental Settings

The simulation results are derived by averaging 2000 independent channel realizations (i.e., Monte-Carlo simulations). We consider a WHFL system consisting of 10 users, an edge server, and a cloud server, with each user having the same amount of training data. We assume that the channel matrix elements are i.i.d. as $\mathcal{CN}(0, 1)$ [5,6,17]. Following [47], the maximum normalized estimation errors of the eavesdropper’s channel are defined as $\Omega = \frac{\omega}{\|g\|_F}$ and $\tilde{\Omega} = \frac{\tilde{\omega}}{\|\tilde{g}\|_F}$, where $\omega$ and $\tilde{\omega}$ are defined in (13). The edge server employs Lempel–Ziv–Welch (LZW) source coding [50] to compress the quantized gradients, and the total transmitted data are $M$ bits. The transmission latency for the edge server to upload data is $T_{comm} = \frac{M}{R_{eg}}$ [5], where $R_{eg}$ represents the edge server’s transmission rate.
To evaluate the effectiveness of the proposed FBL scheme under real-world conditions, we train a neural network using the MNIST dataset (http://yann.lecun.com/exdb/mnist/, accessed on 20 March 2024), which contains 60,000 training samples and 10,000 test samples of 10 different handwritten digits. The network architecture includes 784 input nodes, a hidden layer containing 20 nodes, and an output layer with 10 nodes. The loss function is cross-entropy, with the hidden and output layers utilizing the ReLU and softmax activation functions, respectively. The neural network contains a total of q = 15,910 parameters, and the learning rate is set at μ = 0.1 . In the experiments, the following three schemes are compared.
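The stated parameter count can be reproduced with a short calculation. The sketch below assumes fully connected layers with one bias per output node, which is our reading of the architecture; under that assumption the 784-20-10 network has exactly 15,910 parameters.

```python
def dense_param_count(layer_sizes):
    """Number of weights and biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# 784*20 + 20 weights and biases in the hidden layer, 20*10 + 10 in the output layer.
q = dense_param_count([784, 20, 10])
```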
  • Benchmark (Perfect HFL): The perfectly aggregated HFL system can be achieved through error-free transmission, which serves as the benchmark accuracy in ideal settings.
  • Baseline 1 (Random binning coding scheme (RBCS)-based WHFL [26,28]): The gradient data from the edge servers is uploaded using the RBCS, which is based on traditional low-density parity-check (LDPC) codes with a target bit error rate of $10^{-6}$.
  • Baseline 2 (Frequency division multiple access (FDMA)-based WHFL with artificial noise (AN) [29]): In the FDMA-based WHFL system with AN, FDMA is employed to transmit gradient data from edge servers to the cloud server, targeting a bit error rate of $10^{-6}$. Additionally, AN is added to the transmitted signals to prevent eavesdroppers from obtaining the true gradient data.

5.2. Experimental Results

Figures 7 and 8 show the test accuracy and the cross-entropy loss, respectively, versus the communication round for the SISO/SIMO/MISO/MIMO cases. From Figures 7 and 8, we see that if the legitimate parties have perfect CSI of the eavesdropper’s channel, our proposed FBL scheme, Baseline 1, and Baseline 2 all leave the learning performance of HFL almost unaffected. This is because all three schemes are capable of transmitting gradient data with a sufficiently low decoding error probability. On the other hand, in our proposed FBL schemes, if the legitimate parties have only imperfect CSI of the eavesdropper’s channel, the test accuracy of HFL decreases and the training loss of HFL increases as the maximum normalized estimation error of the eavesdropper’s channel increases. However, note that in this imperfect CSI case, our proposed FBL schemes still provide the same level of secrecy as in the perfect CSI case, which shows the robustness of our schemes against imperfect CSI of the eavesdropper’s channel. Furthermore, Figures 7 and 8 demonstrate that the eavesdropper cannot obtain the real gradient data when our FBL scheme is applied, which indicates that our FBL schemes effectively ensure the PLS of the data.
As depicted in Figure 9, the transmission latency of our FBL scheme is approximately 2 to 5 times lower than that of Baseline 1 and Baseline 2, due to the gain from introducing feedback. Additionally, the transmission latency of our scheme decreases as the number of antennas increases. Furthermore, the transmission latency of Baseline 2 is lower than that of Baseline 1, owing to the gain from introducing AN to counter eavesdropping attacks in Baseline 2. Moreover, Figure 9 shows that the transmission latency of our FBL scheme increases as the maximum normalized estimation error of the eavesdropper’s channel increases. This is because, to support the same level of performance, the worse the estimate of the eavesdropper’s CSI, the more bits need to be transmitted, which increases the transmission latency.
Table 2 shows that the achievable secrecy transmission rates of our FBL schemes increase with the number of antennas and are significantly higher than those of Baseline 1 and Baseline 2, due to the gain introduced by feedback in our scheme. Additionally, due to the gain from introducing AN in Baseline 2, its achievable secrecy transmission rate is higher than that of Baseline 1. Furthermore, Table 2 shows that the achievable secrecy transmission rates of our proposed FBL scheme decrease as the maximum normalized estimation errors of the eavesdropper’s channel increase, which can be viewed as the price of worse estimation. From Table 3, we conclude that the achievable secrecy transmission rates of the FBL schemes increase as the SNR of the feedback channel increases. Moreover, Figure 10 shows that the transmission latency of the FBL schemes increases as the SNR of the feedback channel decreases. Therefore, in our schemes, poorer feedback channel conditions lead to lower achievable secrecy transmission rates and increased transmission latency. However, poorer feedback channel conditions do not directly affect learning performance, which is primarily determined by the distortion $D$ of lossy source coding, the average decoding error probability $\tau$ of channel coding, and the variance of the noise introduced by LDP mechanisms.
Figure 11 shows the relationship between PLS (measured by the secrecy level), privacy, utility, and the LDP noise variance of the proposed FBL schemes. From Figure 11, we conclude that the secrecy level increases as the LDP noise variance increases, and a higher secrecy level leads to a more stringent relationship between privacy and utility (with a smaller $\epsilon$ and a larger $\upsilon$). Apart from this, for a given secrecy level, increasing the maximum normalized estimation error of the eavesdropper’s channel results in an increase in the variance of the LDP noise, which can also be viewed as the price of worse estimation.

6. Conclusions and Future Work

In this paper, a practical FBL approach, which is an extension of the classical SK scheme, is proposed for multi-antenna URLLC-WHFL systems in the presence of PLS. We characterize the relationship between PLS, privacy, and the utility of these WHFL systems, and derive achievable transmission rates of the proposed FBL approach. Simulation results demonstrate that when the edge server has perfect knowledge of the eavesdropper’s CSI, our proposed FBL approach not only almost achieves perfect secrecy but also does not affect learning performance. Additionally, simulation results demonstrate that the proposed schemes remain robust even when the edge server has imperfect CSI of the eavesdropper’s channel. Apart from this, the transmission latency of our proposed FBL approach is shown to be significantly lower than that of the traditional RBCS.
Furthermore, this paper focuses on proposing and analyzing a theoretical scheme. The application of this approach in real-world systems still faces practical challenges, such as hardware constraints, power consumption, and synchronization issues. Future work should aim to optimize energy efficiency and address synchronization in more complex multi-antenna systems using the proposed FBL scheme. On the other hand, as the computational complexity of techniques like precoding, beamforming, and SVD increases with the number of devices and communication channels, particularly in multi-antenna systems, further research and optimization are needed to extend our proposed approach to more complex and large-scale networks. For instance, distributed or hierarchical architectures can allocate the computational load across multiple servers or devices, reducing the burden on individual components. Additionally, low-complexity approximation methods for precoding and beamforming could help lower overall system complexity. Future work will extend our approach to more complex multi-edge scenarios, exploring the impact of interference among edge servers on the WHFL in the presence of PLS.

Author Contributions

H.Z. did the theoretical work, performed the experiments, analyzed the data and drafted the work; B.D. designed the work, performed the theoretical work, interpreted the data for the work and revised the work; and P.X. interpreted the data for the work and revised the work. All authors have read and agreed to the published version of the manuscript.

Funding

This paper was supported in part by the National Key R&D Program of China under grant no. 2022YFA1005000, in part by the National Natural Science Foundation of China under grant no. 62071392, and in part by Chongqing Key Laboratory of Mobile Communications Technology under grant no. cqupt-mct-202302.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. The Formal Proof of Theorem 1

Appendix A.1. Utility and Privacy Analysis

First, note that since $W_t$, $\eta_t$, and $W'_t$ are i.i.d. generated, from Definition 1, we conclude the following:
$$\max_{t \in \{1, \ldots, T\}} \frac{1}{q} I(W_t; W'_t) = \max_{t \in \{1, \ldots, T\}} \frac{1}{2} \log\!\left(1 + \frac{S \sigma_{w,t}^2}{K \sigma^2}\right) \le \epsilon. \tag{A1}$$
On the other hand, from Definition 2, we conclude the following:
$$\frac{1}{qT} \sum_{t=1}^{T} E(d(W_t, W'_t)) = \frac{1}{qT} \sum_{t=1}^{T} E(\|\eta_t\|^2) = K \sigma^2 \le \upsilon. \tag{A2}$$
From (A1) and (A2), the relationship between privacy, utility, and the noise variance of LDP is characterized by the following:
$$\underbrace{\max_{t \in \{1, \ldots, T\}} \frac{S \sigma_{w,t}^2}{K (2^{2\epsilon} - 1)}}_{\text{Privacy term}} \le \underbrace{\sigma^2}_{\text{Noise variance of LDP}} \le \underbrace{\frac{\upsilon}{K}}_{\text{Utility term}}, \tag{A3}$$
which indicates that by selecting an appropriate LDP noise to satisfy (A3), both privacy and utility can be ensured.
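The admissible range of the LDP noise variance in (A3) is easy to evaluate numerically. The following sketch computes the privacy and utility bounds and reports whether the interval is non-empty; the function name is ours, the logarithm in (A1) is taken to base 2 (consistent with the factor $2^{2\epsilon}$), and the sample values in the usage note are purely illustrative.

```python
def ldp_noise_interval(S, K, sigma_w_sq, eps, upsilon):
    """Return (lower, upper, feasible) for the LDP noise variance in (A3).

    sigma_w_sq is a list with one value of sigma_{w,t}^2 per communication round.
    """
    lower = max(S * s / (K * (2.0 ** (2.0 * eps) - 1.0)) for s in sigma_w_sq)
    upper = upsilon / K
    return lower, upper, lower <= upper
```

For instance, `ldp_noise_interval(1.0, 10.0, [1.0, 2.0], 1.0, 1.0)` gives a lower bound of $2/30$ and an upper bound of $0.1$, so a feasible noise variance exists; tightening the utility requirement to `upsilon=0.5` makes the interval empty.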

Appendix A.2. Decoding Error Probability and Convergence Analysis

First, we bound the decoding error probability $P_{e,t}$ of $W_t$ transmitted over all parallel sub-SISO channels as follows:
$$P_{e,t} \le P_{e,t}^{(1)} + P_{e,t}^{(2)} + \cdots + P_{e,t}^{(J)}, \tag{A4}$$
where $P_{e,t}^{(j)}$ ($j = 1, \ldots, J$) represents the decoding error probability of message $W_{t,j}$, and $J$ is the number of parallel sub-SISO channels, which is defined in (31). Next, we analyze the error events of message $W_{t,j}$, which consist of the following:
(1)
A modulo-aliasing error occurs at the edge server at time instant $i + 1$ ($1 \le i \le N_t - 1$), and it is defined as follows:
$$E_i = \left\{\gamma_i \varepsilon_i + \frac{\eta'_{j,2,i}(t)}{\tilde{d}_j} \notin \Lambda\right\} = \left\{\left[\gamma_i \varepsilon_{R,i} + \frac{\eta'_{R,j,2,i}(t)}{\tilde{d}_j} \notin \left[-\frac{d}{2}, \frac{d}{2}\right)\right] \cup \left[\gamma_i \varepsilon_{I,i} + \frac{\eta'_{I,j,2,i}(t)}{\tilde{d}_j} \notin \left[-\frac{d}{2}, \frac{d}{2}\right)\right]\right\}, \tag{A5}$$
where $\varepsilon_{R,i}$ and $\varepsilon_{I,i}$ are the real and imaginary parts of $\varepsilon_i$, respectively, and $\eta'_{R,j,2,i}(t)$ and $\eta'_{I,j,2,i}(t)$ are the real and imaginary parts of $\eta'_{j,2,i}(t)$, respectively.
(2)
A decoding error occurs at the cloud server at time instant $N_t$, and it is defined as follows:
$$E_{N_t} = \left\{\varepsilon_{R,N_t} \notin \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}\right) \cup \varepsilon_{I,N_t} \notin \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,I}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,I}}}\right)\right\}, \tag{A6}$$
where $\varepsilon_{R,N_t}$ and $\varepsilon_{I,N_t}$ are the real and imaginary parts of $\varepsilon_{N_t}$, respectively.
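Thus, the error probability $P_{e,t}^{(j)}$ is bounded by the following:
$$\begin{aligned} P_{e,t}^{(j)} &\le \Pr\!\left(\bigcup_{i=1}^{N_t} E_i\right) = \Pr\!\left(\bigcup_{i=1}^{N_t - 1} E_i\right) + \Pr\!\left(\bigcap_{i=1}^{N_t - 1} E_i^c \cap E_{N_t}\right) \\ &= \sum_{i=1}^{N_t - 1} \Pr\!\left(\bigcap_{k=1}^{i-1} E_k^c \cap E_i\right) + \Pr\!\left(\bigcap_{i=1}^{N_t - 1} E_i^c \cap E_{N_t}\right) = \sum_{i=1}^{N_t - 1} \Pr(\tilde{E}_i) + \Pr(\tilde{E}_{N_t}), \end{aligned} \tag{A7}$$
where $E^c$ is the complement of the set $E$, $\tilde{E}_i = \bigcap_{k=1}^{i-1} E_k^c \cap E_i$ for $i \in \{1, \ldots, N_t - 1\}$, and $\tilde{E}_{N_t} = \bigcap_{i=1}^{N_t - 1} E_i^c \cap E_{N_t}$. Here, note that $\Pr(\tilde{E}_i)$ ($i \in \{1, \ldots, N_t - 1\}$) is the probability that a demodulation error occurs at time instant $i + 1$ and no error occurs at any previous time, and $\Pr(\tilde{E}_{N_t})$ is the probability that the final decoding fails while no demodulation error occurs at any time. We assume that $P_{e,t} \le \tau$, which guarantees that $\frac{1}{T} \sum_{t=1}^{T} P_{e,t} \le \tau$. Then, we choose $P_{e,t}^{(j)} \le \frac{\tau}{J}$, and we have the following:
$$\Pr(\tilde{E}_{N_t}) = \sum_{i=1}^{N_t - 1} \Pr(\tilde{E}_i) = \frac{\tau}{2J}. \tag{A8}$$
For simplification, we define the following:
$$\Pr(\tilde{E}_1) = \cdots = \Pr(\tilde{E}_{N_t - 1}) = p_m. \tag{A9}$$
Substituting (A9) into (A8), we have the following:
$$p_m = \frac{\tau}{2J(N_t - 1)}. \tag{A10}$$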
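For the error event $\tilde{E}_i$, since no demodulation error occurs before time instant $i + 1$, according to (43), (44), and the fact that $\eta'_{j,2,i}(t)$ is CSCG-distributed, we can conclude the following: $\gamma_i \varepsilon_i + \frac{\eta'_{j,2,i}(t)}{\tilde{d}_j} \sim \mathcal{CN}\!\left(0, \gamma_i^2 \alpha_i + \frac{\sigma_2^2}{\tilde{d}_j^2}\right)$. Hence, we have $\gamma_i \varepsilon_{k,i} + \frac{\eta'_{k,j,2,i}(t)}{\tilde{d}_j} \sim \mathcal{N}\!\left(0, \frac{\gamma_i^2 \alpha_i}{2} + \frac{\sigma_2^2}{2\tilde{d}_j^2}\right)$, where $k \in \{R, I\}$. From (A5) and $d = \sqrt{6\tilde{P}_j}$, we have the following:
$$\begin{aligned} \Pr(E_i) &= \Pr\!\left(\left\{\left[\gamma_i \varepsilon_{R,i} + \frac{\eta'_{R,j,2,i}(t)}{\tilde{d}_j} \notin \left[-\frac{d}{2}, \frac{d}{2}\right)\right] \cup \left[\gamma_i \varepsilon_{I,i} + \frac{\eta'_{I,j,2,i}(t)}{\tilde{d}_j} \notin \left[-\frac{d}{2}, \frac{d}{2}\right)\right]\right\}\right) \\ &\le \Pr\!\left(\gamma_i \varepsilon_{R,i} + \frac{\eta'_{R,j,2,i}(t)}{\tilde{d}_j} \notin \left[-\frac{d}{2}, \frac{d}{2}\right)\right) + \Pr\!\left(\gamma_i \varepsilon_{I,i} + \frac{\eta'_{I,j,2,i}(t)}{\tilde{d}_j} \notin \left[-\frac{d}{2}, \frac{d}{2}\right)\right) \\ &= 2Q\!\left(\frac{\sqrt{3\tilde{P}_j/2}}{\sqrt{E\big(\gamma_i \varepsilon_{R,i} + \eta'_{R,j,2,i}(t)/\tilde{d}_j\big)^2}}\right) + 2Q\!\left(\frac{\sqrt{3\tilde{P}_j/2}}{\sqrt{E\big(\gamma_i \varepsilon_{I,i} + \eta'_{I,j,2,i}(t)/\tilde{d}_j\big)^2}}\right) \\ &= 4Q\!\left(\frac{\sqrt{3\tilde{P}_j/2}}{\sqrt{\frac{\gamma_i^2 \alpha_i}{2} + \frac{\sigma_2^2}{2\tilde{d}_j^2}}}\right) = p_m. \end{aligned} \tag{A11}$$
For simplification, let
$$\xi = \frac{1}{3}\left[Q^{-1}\!\left(\frac{p_m}{4}\right)\right]^2 = \frac{1}{3}\left[Q^{-1}\!\left(\frac{\tau}{8J(N_t - 1)}\right)\right]^2. \tag{A12}$$
Substituting (A11) into (A12), we have the following:
$$\frac{\gamma_i^2 \alpha_i}{2} + \frac{\sigma_2^2}{2\tilde{d}_j^2} = \frac{\tilde{P}_j}{2\xi}. \tag{A13}$$
From (A13), we can conclude the following:
$$\gamma_i = \sqrt{\frac{1}{\alpha_i}\left(\frac{\tilde{P}_j}{\xi} - \frac{\sigma_2^2}{\tilde{d}_j^2}\right)}, \tag{A14}$$
and note that $X'_{j,i+1}(t)$ is subject to the power constraint $P_j$; hence, we have the following:
$$E\big[X'_{j,i+1}(t)^H X'_{j,i+1}(t)\big] = \lambda_i^2 E\left|\gamma_i \varepsilon_i + \frac{\eta'_{j,2,i}(t)}{\tilde{d}_j}\right|^2 = P_j. \tag{A15}$$
According to (A13) and (A15), we conclude the following:
$$\lambda_i = \sqrt{\frac{\xi \cdot P_j}{\tilde{P}_j}}. \tag{A16}$$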
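From (43), (A14), and (A16), we have the following:
$$Y'_{j,i+1}(t) = d_j \lambda_i \left(\gamma_i \varepsilon_i + \frac{\eta'_{j,2,i}(t)}{\tilde{d}_j}\right) + \eta'_{j,1,i+1}(t), \tag{A17}$$
and hence
$$\beta_{i+1} = \frac{\lambda_i \gamma_i \alpha_i}{P_j + \frac{\sigma_1^2}{d_j^2}} = \frac{\sqrt{\alpha_i}}{\sigma_1} \cdot \frac{\sqrt{\mathrm{SNR}_j \left(1 - \xi \cdot \widetilde{\mathrm{SNR}}_j^{-1} \tilde{d}_j^{-2}\right)}}{\mathrm{SNR}_j + d_j^{-2}}. \tag{A18}$$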
According to (44), (A14), (A16), (A17), and (A18), we have the following:
α i + 1 = α i 1 + SNR j d j 2 1 ξ · SNR ˜ j 1 d ˜ j 2 1 + ξ · SNR j · SNR ˜ j 1 d j 2 d ˜ j 2 1 = 2 d j 2 SNR j 1 1 + SNR j d j 2 Ψ 1 Ψ 2 i ,
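where $\Psi_1$, $\Psi_2$, and $\xi$ are defined in (19). From (A6)–(A8), we have the following:
$$\Pr(\tilde{E}_{N_t}) \le \Pr\!\left\{\varepsilon_{R,N_t} \notin \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}\right)\right\} + \Pr\!\left\{\varepsilon_{I,N_t} \notin \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,I}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,I}}}\right)\right\} \le \frac{\tau}{2J}, \tag{A20}$$
and let
$$\Pr\!\left\{\varepsilon_{R,N_t} \notin \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}\right)\right\} = \Pr\!\left\{\varepsilon_{I,N_t} \notin \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,I}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,I}}}\right)\right\} = \frac{\tau}{4J}. \tag{A21}$$
Next, we first analyze the term for the real part, i.e.,
$$\Pr\!\left\{\varepsilon_{R,N_t} \notin \left[-\frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}, \frac{\sqrt{3}}{2^{N_t R_{t,j,R}}}\right)\right\} = 2Q\!\left(\frac{\sqrt{3}}{2^{N_t R_{t,j,R}}} \cdot \frac{1}{\sqrt{\alpha_{R,N_t}}}\right) = 2Q\!\left(\frac{\sqrt{3}}{2^{N_t R_{t,j,R}}} \cdot \frac{1}{\sqrt{\alpha_{N_t}/2}}\right) = \frac{\tau}{4J}. \tag{A22}$$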
Substituting (A19) into (A22), we have the following:
R t , j , R = 1 2 N t log 3 SNR j d j 2 Q 1 ( τ 8 J ) 2 1 + SNR j d j 2 Ψ 1 Ψ 2 N t 1 .
Analogously, we can show that $R_{t,j,I} = R_{t,j,R}$. The transmission rate of the $j$-th ($j = 1, \ldots, J$) parallel sub-channel in the $t$-th round is given by the following:
R t , j = R t , j , R + R t , j , I = 1 N t log 3 SNR j d j 2 Q 1 ( τ 8 J ) 2 1 + SNR j d j 2 Ψ 1 Ψ 2 N t 1 .
Combining (A24) and (36) with the power allocation in Section 3.1.1, (18) in Theorem 1 is obtained. Then, according to $R = \frac{H(W_1, \ldots, W_T)}{N}$ in (15), the transmission rate $R_{mimo}(\tau, N, \delta, D, \upsilon, \epsilon)$ is given by the following:
$$R_{mimo}(\tau, N, \delta, D, \upsilon, \epsilon) = \frac{H(W_1, \ldots, W_T)}{N} \overset{(b)}{=} \frac{\sum_{t=1}^{T} H(W_t)}{N} = \frac{\sum_{t=1}^{T} N_t R_t}{N}, \tag{A25}$$
where $N = \sum_{t=1}^{T} N_t$, and (b) is due to the fact that the data of each communication round is mapped into the uniformly distributed index $W_t$, which indicates that $(W_1, \ldots, W_T)$ are independent of each other; hence, (17) in Theorem 1 is obtained. A brief convergence analysis of our FBL approach is given below.
Convergence analysis: Following the convergence proof in [18,22], we can prove that the WHFL system with our FBL scheme converges when the decoding error probability of our FBL scheme is sufficiently small. Furthermore, from (A19) and (A22), we conclude that the variance of the estimation error in our FBL approach decays exponentially in the number of channel uses, so that the decoding error probability decays doubly exponentially. This indicates that the final estimate $\hat{\theta}_{j,N_t}$ converges to $\theta_j$, and that the coding block length required to achieve the desired decoding error probability is short.
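The link between the final error variance, the rate, and the decoding error probability in (A22) can be explored numerically. The sketch below solves (A22) for the per-dimension rate and evaluates the per-dimension error probability for growing block lengths. It assumes, for illustration only, that the error variance shrinks by a constant factor per channel use (the hypothetical parameter `contraction`); the function names and sample values are ours.

```python
import math
from statistics import NormalDist


def q_func(x):
    """Gaussian tail function Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def q_inv(p):
    """Inverse Gaussian tail function."""
    return -NormalDist().inv_cdf(p)


def per_dimension_rate(alpha_final, tau, J, n_uses):
    """Rate R (bits per channel use) for which (A22) holds with equality."""
    margin = q_inv(tau / (8.0 * J))
    return math.log2(math.sqrt(3.0) / (margin * math.sqrt(alpha_final / 2.0))) / n_uses


def per_dimension_error(rate, alpha_final, n_uses):
    """Left-hand side of (A22) for a given rate and final error variance."""
    half_width = math.sqrt(3.0) / 2.0 ** (n_uses * rate)
    return 2.0 * q_func(half_width / math.sqrt(alpha_final / 2.0))


def error_vs_blocklength(rate, alpha_1, contraction, max_uses):
    """Error of (A22) for N_t = 1..max_uses, assuming alpha_N = alpha_1 * contraction**(N - 1)."""
    return [per_dimension_error(rate, alpha_1 * contraction ** (n - 1), n)
            for n in range(1, max_uses + 1)]
```

With a fixed rate below the value supported by the contraction factor, the error probabilities returned by `error_vs_blocklength` fall off very quickly with the block length, which is the behavior the convergence argument relies on.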

Appendix A.3. Security Analysis

First, note that the eavesdropper’s equivocation rate, Δ , can be re-written as follows:
Δ = H ( W 1 , , W T | Z N 1 , , Z N T , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o ) H ( W 1 , , W T ) = t = 1 T H ( W t | Z N 1 , , Z N T , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o , W 1 , , W t 1 ) H ( W 1 , , W T ) = ( c ) t = 1 T H ( W t | Z N t , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o ) H ( W 1 , , W T ) = t = 1 T H ( W t | Z N t , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o ) t = 1 T H ( W t ) ,
where (c) follows from the Markov chain W t ( Z N t , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o ) ( Z N 1 , ,   Z N t 1 , Z N t + 1 , , Z N T , W 1 , , W t 1 ) . The term H ( W t | Z N t , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o ) in (A26) is given by the following:
H ( W t | Z N t , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o ) ( e ) H ( W t | g m i m o X 1 ( t ) + g ˜ m i m o X ˜ 1 ( t ) + η e , 1 ( t ) Z 1 ( t ) , , g m i m o X N t 1 ( t ) + g ˜ m i m o X ˜ N t 1 ( t ) + η e , N t 1 ( t ) Z N t 1 ( t ) , g m i m o X N t ( t ) + η e , N t ( t ) Z N t ( t ) , η 1 , 1 ( t ) , , η 1 , N t ( t ) , η 2 , 1 ( t ) , , η 2 , N t ( t ) , η e , 2 ( t ) , , η e , N t ( t ) , X ˜ 1 ( t ) , , X ˜ N t 1 ( t ) , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o , U , Λ , V , U ˜ , Λ ˜ , V ˜ ) = ( f ) H ( W t | g m i m o V X 1 ( t ) + η e , 1 ( t ) , η 1 , 1 ( t ) , , η 1 , N t ( t ) , η 2 , 1 ( t ) , , η 2 , N t ( t ) , η e , 2 ( t ) , , η e , N t ( t ) , X ˜ 1 ( t ) , , X ˜ N t 1 ( t ) , h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o , U , Λ , V , U ˜ , Λ ˜ , V ˜ ) = ( g ) H ( W t | g m i m o V X 1 ( t ) + η e , 1 ( t ) ) = ( h ) H ( W t ) + h ( η e , 1 ( t ) ) h ( g m i m o V X 1 ( t ) + η e , 1 ( t ) ) = ( i ) H ( W t ) log det I + g m i m o K x 1 g m i m o H σ e 2 Information leakage at time 1 ,
where
(e) follows from the fact that conditioning reduces entropy, as shown in (28),
(f) follows from X i ( t ) = V H X i ( t ) , X ˜ i ( t ) = V ˜ H X ˜ i ( t ) and X i ( t ) ( i = 2 , , N t ) is a function of h m i m o , h ˜ m i m o , η 1 , 1 ( t ) , , η 1 , i 1 ( t ) , η 2 , 1 ( t ) , , η 2 , i 1 ( t ) ,
(g) follows from the fact that X ˜ i ( t ) ( i = 1 , , N t 1 ) is only related to ν i [39] [Chapter 4.1, pp. 61–63], and h m i m o , h ˜ m i m o , g m i m o , g ˜ m i m o , U , Λ , V , U ˜ , Λ ˜ , V ˜ , η 1 , 1 ( t ) , , η 1 , N t ( t ) , η 2 , 1 ( t ) , , η 2 , N t ( t ) , η e , 2 ( t ) , , η e , N t ( t ) , ν 1 , , ν N t 1 are independent of W t , X 1 ( t ) , η e , 1 ( t ) ,
(h) is due to the fact that (33) and X j , 1 ( t ) = P j 2 θ j , and W t = ( W t , 1 , , W t , J ) are mapped into ( θ 1 , , θ J ) , respectively,
(i) follows from the following:
h ( g m i m o V X 1 ( t ) + η e , 1 ( t ) ) h ( η e , 1 ( t ) ) log det I + g m i m o V E ( X 1 ( t ) X 1 H ( t ) ) V H g m i m o H σ e 2 = log det I + g m i m o E ( X 1 ( t ) X 1 H ( t ) ) g m i m o H σ e 2 ,
and K x 1 = E ( X 1 ( t ) X 1 H ( t ) ) . Substituting (A27) into (A26), we have the following:
Δ t = 1 T H ( W t ) 1 log det I + g m i m o K x 1 g m i m o H σ e 2 H ( W t ) t = 1 T H ( W t ) min t { 1 , , T } 1 log det I + g m i m o K x 1 g m i m o H σ e 2 H ( W t ) .
From (8), (35) and (A29), Δ δ in (15) is guaranteed if we have the following:
σ 2 max t { 1 , , T } D · 2 2 q ( 1 δ ) log det I + g m i m o K x 1 g m i m o H σ e 2 S σ w , t 2 K Secrecy level of PLS .
With the assumption that the edge server has an imperfect CSI of the eavesdropper’s channel, combining (A30) and Definition 3, (A30) can be re-written by the following:
σ 2 max t { 1 , , T } , Δ g m i m o D · 2 2 q ( 1 δ ) log   det I + g ^ m i m o K x 1 g ^ m i m o H σ e 2 S σ w , t 2 K Secrecy level of PLS ,
where $\hat{g}_{\mathrm{mimo}} = g_{\mathrm{mimo}} - \Delta g_{\mathrm{mimo}}$. Then, combining (A3) and (A31), (16) in Theorem 1 is obtained.
The proof of Theorem 1 is completed.
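As a numerical illustration of the quantities in this proof, the following minimal sketch evaluates the information-leakage term $\log\det\left(I + g_{\mathrm{mimo}} K_{x_1} g_{\mathrm{mimo}}^{H}/\sigma_e^2\right)$ and the lower bound on $\Delta$ in (A29). It is not the authors' simulation code. The function names, the base-2 logarithm (chosen to match the bits/symbol units used in the tables), and the toy channel, covariance and entropy values are all assumptions made for illustration.

```python
import numpy as np


def leakage_bits(g, K_x, sigma_e2):
    """Return log2 det(I + g K_x g^H / sigma_e2), the leakage term in bits.

    g        : (C, A) eavesdropper channel matrix (hypothetical values)
    K_x      : (A, A) covariance of the first transmitted codeword
    sigma_e2 : eavesdropper noise variance
    """
    C = g.shape[0]
    M = np.eye(C) + (g @ K_x @ g.conj().T) / sigma_e2
    # slogdet returns (sign, ln|det|); M is Hermitian positive definite here.
    _, logdet_nat = np.linalg.slogdet(M)
    return float(logdet_nat / np.log(2.0))


def secrecy_level_lower_bound(g, K_x, sigma_e2, H_w):
    """Lower bound on Delta from (A29): min over t of 1 - leakage / H(W_t).

    H_w : iterable of per-round entropies H(W_t), in bits (toy values below)
    """
    leak = leakage_bits(g, K_x, sigma_e2)
    return min(1.0 - leak / h for h in H_w)


# Toy example: 2x2 identity channel, unit-power inputs, unit noise variance.
g_toy = np.eye(2)
K_toy = np.eye(2)
leak_toy = leakage_bits(g_toy, K_toy, 1.0)            # det(2I) = 4, so 2 bits
bound_toy = secrecy_level_lower_bound(g_toy, K_toy, 1.0, [100.0, 200.0])
print(leak_toy, bound_toy)
```

In this toy setting the bound is set by the round with the smallest entropy, which mirrors the minimum over $t$ in (A29): increasing $H(W_t)$, for instance through a larger LDP noise variance, pushes the bound toward 1.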

References

  1. Zhu, G.; Liu, D.; Du, Y.; You, C.; Zhang, J.; Huang, K. Toward an Intelligent Edge: Wireless Communication Meets Machine Learning. IEEE Commun. Mag. 2020, 58, 19–25.
  2. Yang, Z.; Chen, M.; Saad, W.; Hong, C.S.; Shikh-Bahaei, M. Energy Efficient Federated Learning Over Wireless Communication Networks. IEEE Trans. Wireless Commun. 2021, 20, 1935–1949.
  3. Amiri, M.M.; Gündüz, D. Federated Learning Over Wireless Fading Channels. IEEE Trans. Wireless Commun. 2020, 19, 3546–3557.
  4. Jin, R.; He, X.; Dai, H. Communication Efficient Federated Learning with Energy Awareness Over Wireless Networks. IEEE Trans. Wireless Commun. 2022, 21, 5204–5219.
  5. Zhu, G.; Wang, Y.; Huang, K. Broadband Analog Aggregation for Low-Latency Federated Edge Learning. IEEE Trans. Wireless Commun. 2020, 19, 491–506.
  6. Zhu, G.; Du, Y.; Gündüz, D.; Huang, K. One-Bit Over-the-Air Aggregation for Communication-Efficient Federated Edge Learning: Design and Convergence Analysis. IEEE Trans. Wireless Commun. 2021, 20, 2120–2135.
  7. Elgabli, A.; Park, J.; Issaid, C.B.; Bennis, M. Harnessing Wireless Channels for Scalable and Privacy-Preserving Federated Learning. IEEE Trans. Commun. 2021, 69, 5194–5208.
  8. Wen, H.; Wu, Y.; Yang, C.; Duan, H.; Yu, S. A Unified Federated Learning Framework for Wireless Communications: Towards Privacy, Efficiency, and Security. In Proceedings of the 2020 IEEE INFOCOM Computer Communications Workshops (INFOCOM WKSHPS), Toronto, ON, Canada, 6–9 July 2020; pp. 653–658.
  9. Seif, M.; Tandon, R.; Li, M. Wireless Federated Learning with Local Differential Privacy. In Proceedings of the 2020 IEEE International Symposium on Information Theory (ISIT), Los Angeles, CA, USA, 21–26 June 2020; pp. 2604–2609.
  10. Yuan, X.; Ni, W.; Ding, M.; Wei, K.; Li, J.; Poor, H.V. Amplitude-Varying Perturbation for Balancing Privacy and Utility in Federated Learning. IEEE Trans. Inf. Forensics Secur. 2023, 18, 1884–1897.
  11. Kim, M.; Günlü, O.; Schaefer, R.F. Federated Learning with Local Differential Privacy: Trade-Offs Between Privacy, Utility, and Communication. In Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 2650–2654.
  12. Zhou, J.; Su, Z.; Ni, J.; Wang, Y.; Pan, Y.; Xing, R. Personalized Privacy-Preserving Federated Learning: Optimized Trade-off Between Utility and Privacy. In Proceedings of the 2022 IEEE Global Communications Conference (GLOBECOM), Rio de Janeiro, Brazil, 4–8 December 2022; pp. 4872–4877.
  13. Guo, S.; Su, Z.; Tian, Z.; Yu, S. Utility-Aware Privacy-Preserving Federated Learning through Information Bottleneck. In Proceedings of the 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Wuhan, China, 9–11 December 2022; pp. 680–686.
  14. Wang, B.; Chen, Y.; Jiang, H.; Zhao, Z. PPeFL: Privacy-Preserving Edge Federated Learning with Local Differential Privacy. IEEE Internet Things J. 2023, 10, 15488–15500.
  15. Zhang, N.; Tao, M. Gradient Statistics Aware Power Control for Over-the-Air Federated Learning. IEEE Trans. Wireless Commun. 2021, 20, 5115–5128.
  16. Liu, D.; Simeone, O. Privacy for Free: Wireless Federated Learning via Uncoded Transmission with Adaptive Power Control. IEEE J. Sel. Areas Commun. 2021, 39, 170–185.
  17. Yang, K.; Jiang, T.; Shi, Y.; Ding, Z. Federated Learning via Over-the-Air Computation. IEEE Trans. Wireless Commun. 2020, 19, 2022–2035.
  18. Liu, L.; Zhang, J.; Song, S.H.; Letaief, K.B. Client-Edge-Cloud Hierarchical Federated Learning. In Proceedings of the 2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; pp. 1–6.
  19. Luo, S.; Chen, X.; Wu, Q.; Zhou, Z.; Yu, S. HFEL: Joint Edge Association and Resource Allocation for Cost-Efficient Hierarchical Federated Edge Learning. IEEE Trans. Wireless Commun. 2020, 19, 6535–6548.
  20. Liu, S.; Yu, G.; Chen, X.; Bennis, M. Joint User Association and Resource Allocation for Wireless Hierarchical Federated Learning with IID and Non-IID Data. IEEE Trans. Wireless Commun. 2022, 21, 7852–7866.
  21. Wen, W.; Chen, Z.; Yang, H.H.; Xia, W.; Quek, T.Q.S. Joint Scheduling and Resource Allocation for Hierarchical Federated Edge Learning. IEEE Trans. Wireless Commun. 2022, 21, 5857–5872.
  22. Shi, L.; Shu, J.; Zhang, W.; Liu, Y. HFL-DP: Hierarchical Federated Learning with Differential Privacy. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–7.
  23. Wainakh, A.; Guinea, A.S.; Grube, T.; Mühlhäuser, M. Enhancing Privacy via Hierarchical Federated Learning. In Proceedings of the 2020 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), Genoa, Italy, 7–11 September 2020; pp. 344–347.
  24. Feng, C.; Yang, H.H.; Hu, D.; Zhao, Z.; Quek, T.Q.S.; Min, G. Mobility-Aware Cluster Federated Learning in Hierarchical Wireless Networks. IEEE Trans. Wireless Commun. 2022, 21, 8441–8458.
  25. Wyner, A.D. The Wire-Tap Channel. Bell Syst. Tech. J. 1975, 54, 1355–1387.
  26. Yao, J.; Ansari, N. Secure Federated Learning by Power Control for Internet of Drones. IEEE Trans. Cognitive Commun. Netw. 2021, 7, 1021–1031.
  27. Wang, T.; Li, Y.; Wu, Y.; Quek, T.Q.S. Secrecy driven Federated Learning via Cooperative Jamming: An Approach of Latency Minimization. IEEE Trans. Emerg. Topics Comput. 2021, 10, 1687–1703.
  28. Qian, L.; Wu, W.; Lu, W.; Wu, Y.; Lin, B.; Quek, T.Q.S. Secrecy-Based Energy-Efficient Mobile Edge Computing via Cooperative Non-Orthogonal Multiple Access Transmission. IEEE Trans. Commun. 2021, 69, 4659–4677.
  29. Yan, Z.; Li, D.; Zhang, Z.; He, J. Accuracy-Security Tradeoff with Balanced Aggregation and Artificial Noise for Wireless Federated Learning. IEEE Internet Things J. 2023, 10, 18154–18167.
  30. Zhang, H.; Yang, C.; Dai, B. When Wireless Federated Learning Meets Physical Layer Security: The Fundamental Limits. In Proceedings of the IEEE INFOCOM Computer Communications Workshops (INFOCOM WKSHPS), New York, NY, USA, 2–5 May 2022; pp. 1–6.
  31. Durisi, G.; Koch, T.; Popovski, P. Toward Massive, Ultrareliable, and Low-Latency Wireless Communication with Short Packets. Proc. IEEE 2016, 104, 1711–1726.
  32. Polyanskiy, Y.; Poor, H.V.; Verdu, S. Channel Coding Rate in the Finite Blocklength Regime. IEEE Trans. Inf. Theory 2010, 56, 2307–2359.
  33. She, C.; Dong, R.; Gu, Z.; Hou, Z.; Li, Y.; Hardjawana, W.; Vucetic, B.; Song, L.; Yang, C. Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks. IEEE Netw. 2020, 34, 219–225.
  34. Samarakoon, S.; Bennis, M.; Saad, W.; Debbah, M. Distributed Federated Learning for Ultra-Reliable Low-Latency Vehicular Communications. IEEE Trans. Commun. 2020, 68, 1146–1159.
  35. Schalkwijk, J.; Kailath, T. A coding scheme for additive noise channels with feedback–I: No bandwidth constraint. IEEE Trans. Inf. Theory 1966, 12, 172–182.
  36. Gunduz, D.; Brown, D.R.; Poor, H.V. Secret communication with feedback. In Proceedings of the 2008 International Symposium on Information Theory and Its Applications (ISITA), Auckland, New Zealand, 7–10 December 2008; pp. 1–6.
  37. Truong, L.V.; Fong, S.L.; Tan, V.Y.F. On Gaussian Channels with Feedback Under Expected Power Constraints and with Non-Vanishing Error Probabilities. IEEE Trans. Inf. Theory 2017, 63, 1746–1765.
  38. Abadi, M.; Chu, A.; Goodfellow, I.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. Deep Learning with Differential Privacy. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 303–318.
  39. Zamir, R. Lattice Coding for Signals and Networks; Cambridge University Press: Cambridge, UK, 2014.
  40. Zhi, K.; Pan, C.; Ren, H.; Wang, K.; Elkashlan, M.; Di Renzo, M.; Hanzo, L.; Schober, R.; Wang, J. Two-Timescale Design for Reconfigurable Intelligent Surface-Aided Massive MIMO Systems with Imperfect CSI. IEEE Trans. Inf. Theory 2022, 69, 3001–3033.
  41. Schiessl, S.; Al-Zubaidy, H.; Skoglund, M.; Gross, J. Delay Performance of Wireless Communications with Imperfect CSI and Finite-Length Coding. IEEE Trans. Commun. 2018, 66, 6527–6541.
  42. Chen, Z.J.; Hernandez, E.E.; Huang, Y.C.; Rini, S. DNN gradient lossless compression: Can GenNorm be the answer? In Proceedings of the 2022 IEEE International Conference on Communications (ICC), Seoul, Republic of Korea, 16–20 May 2022; pp. 407–412.
  43. Wang, W.; Ying, L.; Zhang, J. On the Relation Between Identifiability, Differential Privacy, and Mutual-Information Privacy. IEEE Trans. Inf. Theory 2016, 62, 5018–5029.
  44. Sankar, L.; Rajagopalan, S.R.; Poor, H.V. Utility-Privacy Tradeoffs in Databases: An Information-Theoretic Approach. IEEE Trans. Inf. Forensics Secur. 2013, 8, 838–852.
  45. Gamal, A.A.E.; Kim, Y.-H. Network Information Theory; Cambridge University Press: Cambridge, UK, 2011.
  46. Han, S.; Xu, X.; Fang, S.; Sun, Y.; Cao, Y.; Tao, X.; Zhang, P. Energy Efficient Secure Computation Offloading in NOMA-Based mMTC Networks for IoT. IEEE Internet Things J. 2019, 6, 5674–5690.
  47. Ng, D.W.K.; Lo, E.S.; Schober, R. Robust Beamforming for Secure Communication in Systems with Wireless Information and Power Transfer. IEEE Trans. Wireless Commun. 2014, 13, 4599–4615.
  48. Tekin, E.; Yener, A. The Gaussian Multiple Access Wire-Tap Channel. IEEE Trans. Inf. Theory 2008, 54, 5747–5755.
  49. Tse, D.; Viswanath, P. Fundamentals of Wireless Communication; Cambridge University Press: Cambridge, UK, 2005.
  50. Welch, T.A. A technique of high-performance data compression. IEEE Comput. 1984, 17, 8–19.
Figure 1. The multi-antenna WHFL in the presence of PLS.
Figure 2. An information-theoretic model of the WHFL system, where the edge server, cloud server and eavesdroppers are equipped with $A$, $B$, and $C$ antennas, respectively ($A \geq 1$, $B \geq 1$, $C \geq 1$).
Figure 3. A schematic diagram of the FBL approach for the WHFL over the MIMO channel.
Figure 4. Comparison of the message mapping methods between the classical SK scheme and the scheme in this paper. (a) Message mapping of classical SK scheme. (b) Message mapping in this paper.
Figure 5. Comparing the mechanisms between the classical SK scheme and the two-dimensional MLO-based SK-type scheme, where $\hat{\theta}_{i-1}$ represents the estimation of the transmitted message $\theta$ at time $i-1$. (a) The classical SK scheme in a certain round $i$. (b) The two-dimensional MLO-based SK-type scheme in a certain round $i$.
Figure 6. A schematic diagram of the FBL approach for the SIMO WHFL.
Figure 7. Performance comparison between the different schemes on the MNIST dataset ($K = 10$, $S = 60{,}000$, $q = 15{,}910$, $D = 10^{-4}$, $\widetilde{\mathrm{SNR}} = 15$ dB, $P = 10$, $\tau = 10^{-6}$, $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_e^2 = 2$). (a) $A = B = C = 4$, $\delta = 0.99994$. (b) $A = 1$, $B = C = 4$, $\delta = 0.99997$. (c) $A = 4$, $B = C = 1$, $\delta = 0.99997$. (d) $A = B = C = 1$, $\delta = 0.99998$.
Figure 8. Performance comparison between the different schemes on the MNIST dataset ($K = 10$, $S = 60{,}000$, $q = 15{,}910$, $D = 10^{-4}$, $\widetilde{\mathrm{SNR}} = 15$ dB, $P = 10$, $\tau = 10^{-6}$, $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_e^2 = 2$). (a) $A = B = C = 4$, $\delta = 0.99994$. (b) $A = 1$, $B = C = 4$, $\delta = 0.99997$. (c) $A = 4$, $B = C = 1$, $\delta = 0.99997$. (d) $A = B = C = 1$, $\delta = 0.99998$.
Figure 9. Transmission latency (200 rounds) of the different schemes on the MNIST dataset ($K = 10$, $S = 60{,}000$, $q = 15{,}910$, $D = 10^{-4}$, $\tau = 10^{-6}$, $\widetilde{\mathrm{SNR}} = 15$ dB, $\sigma_1^2 = 1$, $\sigma_2^2 = 1$, $\sigma_e^2 = 2$, $T = 200$). (a) $A = B = C = 4$, $\delta = 0.99994$. (b) $A = 1$, $B = C = 4$, $\delta = 0.99997$. (c) $A = 4$, $B = C = 1$, $\delta = 0.99997$. (d) $A = B = C = 1$, $\delta = 0.99998$.
Figure 10. Transmission latency (200 rounds) of our schemes under different feedback channel SNR and perfect CSI on the MNIST dataset ($K = 10$, $S = 60{,}000$, $q = 15{,}910$, $P = 10$, $D = 10^{-4}$, $\tau = 10^{-6}$, $\sigma_1^2 = 1$, $\sigma_2^2 = 1$, $\sigma_e^2 = 2$, $T = 200$).
Figure 11. The relationship between the PLS (secrecy level), the privacy-utility, and LDP noise variance of proposed FBL schemes on the MNIST dataset ($K = 10$, $S = 60{,}000$, $q = 15{,}910$, $D = 10^{-4}$, $\widetilde{\mathrm{SNR}} = 15$ dB, $P = 10$, $\tau = 10^{-6}$, $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_e^2 = 2$). (a)–(c) $A = B = C = 4$. (d)–(f) $A = 1$, $B = C = 4$. (g)–(i) $A = 4$, $B = C = 1$. (j)–(l) $A = B = C = 1$.
Table 1. Summarizing all results in WFL in the presence of privacy, utility, PLS and URLLC.

| Related Work | Privacy | Utility | PLS | Relationship between PLS, Privacy, and Utility | URLLC |
|---|---|---|---|---|---|
| [7,8,9,16,23] | | | | | |
| [10,11,12,13,14,22] | | | | Relationship between Privacy and Utility | |
| [26,27,28] | | | | | |
| [29] | | | | Relationship between PLS and Utility | |
| [30] | | | | Relationship between PLS-Privacy-Utility | |
| [33,34] | | | | | |
| This Work | | | | Relationship between PLS-Privacy-Utility | |
Table 2. Achievable secrecy rates of the different schemes on the MNIST dataset ($K = 10$, $S = 60{,}000$, $q = 15{,}910$, $\widetilde{\mathrm{SNR}} = 15$ dB, $P = 10$, $\tau = 10^{-6}$, $T = 200$, $D = 10^{-4}$, $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_e^2 = 2$). All rates are in bits/symbol.

| Number of Antennas | $A = B = C = 4$ (MIMO) | $A = 1$, $B = C = 4$ (SIMO) | $A = 4$, $B = C = 1$ (MISO) | $A = B = C = 1$ (SISO) |
|---|---|---|---|---|
| Our scheme (Perfect CSI) | 10.3951 ($\epsilon = 0.04$, $\upsilon = 3$, $\sigma^2 = 0.3$, $\delta = 0.99994$) | 4.9928 ($\epsilon = 0.025$, $\upsilon = 5$, $\sigma^2 = 0.5$, $\delta = 0.99997$) | 5.1346 ($\epsilon = 0.018$, $\upsilon = 6.5$, $\sigma^2 = 0.65$, $\delta = 0.99997$) | 2.6718 ($\epsilon = 0.032$, $\upsilon = 4$, $\sigma^2 = 0.4$, $\delta = 0.99998$) |
| Our scheme (Imperfect CSI, $\Omega = 0.2$) | 10.3941 ($\epsilon = 0.025$, $\upsilon = 5$, $\sigma^2 = 0.5$, $\delta = 0.99994$) | 4.9922 ($\epsilon = 0.012$, $\upsilon = 10$, $\sigma^2 = 1$, $\delta = 0.99997$) | 5.1343 ($\epsilon = 0.013$, $\upsilon = 9$, $\sigma^2 = 0.9$, $\delta = 0.99997$) | 2.6708 ($\epsilon = 0.01$, $\upsilon = 12$, $\sigma^2 = 1.2$, $\delta = 0.99998$) |
| Our scheme (Imperfect CSI, $\Omega = 0.4$) | 10.3898 ($\epsilon = 0.0025$, $\upsilon = 50$, $\sigma^2 = 5$, $\delta = 0.99994$) | 4.9914 ($\epsilon = 0.005$, $\upsilon = 24$, $\sigma^2 = 2.4$, $\delta = 0.99997$) | 5.1338 ($\epsilon = 0.0065$, $\upsilon = 18.5$, $\sigma^2 = 1.85$, $\delta = 0.99997$) | 2.6689 ($\epsilon = 8.2 \times 10^{-4}$, $\upsilon = 150$, $\sigma^2 = 15$, $\delta = 0.99998$) |
| Baseline 1 [26,28] (Perfect CSI) | 4.2827 ($\epsilon = 0.04$, $\upsilon = 3$, $\sigma^2 = 0.3$) | 2.1537 ($\epsilon = 0.025$, $\upsilon = 5$, $\sigma^2 = 0.5$) | 2.2124 ($\epsilon = 0.018$, $\upsilon = 6.5$, $\sigma^2 = 0.65$) | 0.8046 ($\epsilon = 0.032$, $\upsilon = 4$, $\sigma^2 = 0.4$) |
| Baseline 2 [29] (Perfect CSI) | 5.2228 ($\epsilon = 0.04$, $\upsilon = 3$, $\sigma^2 = 0.3$) | 2.6265 ($\epsilon = 0.025$, $\upsilon = 5$, $\sigma^2 = 0.5$) | 2.6981 ($\epsilon = 0.018$, $\upsilon = 6.5$, $\sigma^2 = 0.65$) | 0.9812 ($\epsilon = 0.032$, $\upsilon = 4$, $\sigma^2 = 0.4$) |
Table 3. Achievable secrecy rates of our schemes under different feedback channel SNR on the MNIST dataset ($K = 10$, $S = 60{,}000$, $q = 15{,}910$, $P = 10$, $\tau = 10^{-6}$, $T = 200$, $D = 10^{-4}$, $\sigma_1^2 = \sigma_2^2 = 1$, $\sigma_e^2 = 2$). All rates are in bits/symbol.

| Number of Antennas | $A = B = C = 4$ (MIMO) ($\epsilon = 0.04$, $\upsilon = 3$, $\sigma^2 = 0.3$, $\delta = 0.99994$) | $A = 1$, $B = C = 4$ (SIMO) ($\epsilon = 0.025$, $\upsilon = 5$, $\sigma^2 = 0.5$, $\delta = 0.99997$) | $A = 4$, $B = C = 1$ (MISO) ($\epsilon = 0.018$, $\upsilon = 6.5$, $\sigma^2 = 0.65$, $\delta = 0.99997$) | $A = B = C = 1$ (SISO) ($\epsilon = 0.032$, $\upsilon = 4$, $\sigma^2 = 0.4$, $\delta = 0.99998$) |
|---|---|---|---|---|
| $\widetilde{\mathrm{SNR}} = 10$ dB (Perfect CSI) | 8.0911 | 3.6612 | 3.7589 | 2.2885 |
| $\widetilde{\mathrm{SNR}} = 15$ dB (Perfect CSI) | 10.3951 | 4.9928 | 5.1346 | 2.6718 |
| $\widetilde{\mathrm{SNR}} = 20$ dB (Perfect CSI) | 12.5581 | 6.1146 | 6.3177 | 3.0622 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Zhang, H.; Xu, P.; Dai, B. Ultra-Reliable and Low-Latency Wireless Hierarchical Federated Learning: Performance Analysis. Entropy 2024, 26, 827. https://doi.org/10.3390/e26100827
