Article

Privacy-Preserving Byzantine-Resilient Swarm Learning for E-Healthcare

1 School of Information and Control Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China
2 School of Cyber Engineering, Xidian University, Xi’an 710126, China
* Authors to whom correspondence should be addressed.
Appl. Sci. 2024, 14(12), 5247; https://doi.org/10.3390/app14125247
Submission received: 1 April 2024 / Revised: 8 June 2024 / Accepted: 10 June 2024 / Published: 17 June 2024
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

An automatic medical diagnosis service based on deep learning has been introduced into e-healthcare, bringing great convenience to human life. However, privacy regulations restrict data sharing among medical centers, and the resulting data scarcity creates severe challenges for automated medical diagnostic services, including degraded diagnostic accuracy. To address these problems, swarm learning (SL), a blockchain-based federated learning (BCFL), has been proposed. Although SL avoids single-point-of-failure attacks and offers an incentive mechanism, it remains vulnerable to privacy breaches and poisoning attacks. In this paper, we propose a new privacy-preserving Byzantine-resilient swarm learning (PBSL) scheme that resists poisoning attacks while protecting data privacy. Specifically, we adopt threshold fully homomorphic encryption (TFHE) to protect data privacy and provide secure aggregation, and we use cosine similarity to detect malicious gradients uploaded by dishonest medical centers. Security analysis shows that PBSL can defend against a variety of known attacks. Finally, PBSL is implemented by uniting deep learning with a blockchain-based smart contract platform. Experiments on different datasets show that the PBSL algorithm is practical and efficient.

1. Introduction

In recent years, online medical diagnosis services [1] have attracted considerable interest due to their ability to overcome geographical restrictions and reduce the waiting time for seeing doctors [2,3]. To discover hidden diseases in collected medical data, deep learning (DL) has been applied to e-healthcare systems [4,5,6]. DL can process huge amounts of medical data and automatically find valid patterns that are too complex for human-led analysis. DL intrinsically relies on large amounts of training data. Unfortunately, in traditional healthcare systems, medical data tend to be distributed among different hospitals, and the small amount of medical data collected by an individual hospital is often insufficient to train reliable models [7,8]. Data centralization is therefore an effective way to address the lack of training data [9]. Although the centralized model has brought many benefits, it still faces many challenges, including surges in data traffic, disclosure of local data, confidentiality of global models, and data monopolies [10].
As shown in Figure 1a, Google’s federated learning (FL) [11,12] is implemented with the help of a centralized server that aggregates all local model gradients and takes an overall average to produce a global model update. Although the problems of local data storage and local data confidentiality have been solved, the global model parameters are still processed and stored by the centralized server. Furthermore, such centralized architectures decrease fault tolerance. To solve these problems, Warnat-Herresthal et al. [13] proposed swarm learning (SL), in which multiple medical centers share their local gradients through a blockchain peer-to-peer network, and the medical center that wins the consensus mechanism replaces the centralized parameter server, updating the global parameters according to the collected gradients.
As shown in Figure 1b, despite the move away from centralized servers, SL still faces data security and privacy breaches. First, SL protects neither the privacy of hospital data nor the confidentiality of the global model. Although local model updates and global models are recorded on the blockchain, the blockchain itself does not guarantee the privacy of local data or the confidentiality of the global model. For example, any curious medical center participating in swarm learning can infer the original training data of other medical centers from the gradients and model parameters stored on the blockchain [14,15]. Second, during gradient collection and parameter updating, the malicious behavior of dishonest medical centers can disrupt the swarm learning process and poison the federated model. In particular, malicious medical centers (e.g., Byzantine attackers) can easily disrupt swarm learning systems by spreading false information over the network [16]. For example, Bagdasaryan et al. demonstrated that dishonest edge devices can poison collaborative models by replacing updated models with carefully crafted ones [17].
Therefore, we propose a privacy-preserving Byzantine-robust swarm learning (PBSL) scheme, which ensures the confidentiality of the local gradients, the privacy of the global model, and robustness against Byzantine attacks. Specifically, after the blockchain-based peer-to-peer network is established, each medical center computes its local gradients on its local training data. Then, the local gradients, encrypted with the fully homomorphic encryption (FHE) scheme CKKS, are broadcast to all other medical centers on the blockchain-based network. Once the medical centers on the blockchain reach a consensus, the winning medical center applies our proposed Byzantine-attack-tolerant aggregation scheme based on secure cosine similarity, which supports lossless collaborative model learning while protecting the privacy of the medical centers.
In short, the main contributions of this paper include the following three aspects:
  • PBSL can not only protect the privacy of local sensitive data, but also ensure the confidentiality of the global collaborative model. Using the fully homomorphic encryption (FHE) scheme CKKS, PBSL encrypts the local gradients submitted by the medical centers and aggregates them to generate a global collaborative model without decryption.
  • PBSL achieves swarm learning (SL) that resists Byzantine attacks. To this end, we propose a privacy-preserving Byzantine-robust aggregation method, which uses secure cosine similarity to punish malicious MCs.
  • A PBSL prototype is implemented by integrating deep learning with blockchain-based smart contracts. Moreover, PBSL is tested on real medical datasets to evaluate its effectiveness. Experimental results show that PBSL is practical and effective.
The remainder of this paper is organized as follows. We review related works in Section 2. We review some primitives in Section 3, and introduce our system model and design goals in Section 4. Then, we introduce our scheme in Section 5. Next, we present the security and performance analyses in Section 6 and Section 7, respectively. Finally, we conduct a summary of our work in Section 8.

2. Related Work

In this section, we focus on related works about privacy-preserving FL, Byzantine-robust FL, blockchain-based FL, and swarm learning.

2.1. Privacy-Preserving Federated Learning

Although the training dataset is divided and stored separately, federated learning (FL) cannot protect training data privacy, because the exchanged global model and local gradients still contain a large amount of sensitive information. Three popular privacy-preserving federated learning (PPFL) approaches, i.e., differential privacy (DP), homomorphic encryption (HE), and secure multi-party computation (SMC), have been proposed to address this problem.
DP-based PPFL protects local updates against inference attacks by adding random noise [18,19,20]; however, it entails a trade-off in the form of accuracy loss. SMC-based PPFL computes the aggregated values over multiple parties' inputs collaboratively through secure multi-party computation [21,22,23,24,25,26], and has been instantiated with one [27], two [28], three [29], and four [30] servers. However, although SMC, as a cryptographic primitive, offers strong privacy, it is inefficient. HE-based PPFL uses homomorphic encryption to support aggregation on cipher-texts, protecting training data privacy from curious parameter servers without additional accuracy loss [31]. EaSTFLy [32] uses $(T, N)$-Shamir secret sharing and Paillier partially homomorphic encryption (PHE). BatchCrypt [33] encodes quantized gradients into signed integers and then encrypts them, achieving a training speedup compared with full-precision encryption. However, most of these schemes involve massive, complicated arithmetic operations, which incurs considerable computation overhead.
In this paper, threshold fully homomorphic encryption (TFHE), which combines $(T, N)$-Shamir secret sharing with the CKKS fully homomorphic encryption (FHE) scheme [34], is used as the privacy protection mechanism, which greatly reduces the computation and communication overheads.

2.2. Byzantine-Robust Federated Learning

2.2.1. Byzantine Attack

It is well known that federated learning is vulnerable to Byzantine attacks [35,36,37]. To map a specified feature to a target class, a Byzantine attacker tries to modify the classification boundary of the trained model. It has been proven that an attacker can control the entire training process and steal user privacy [16].
According to the goals of the attackers, Byzantine attacks can be divided into targeted attacks [17] and untargeted attacks [35]. Targeted attacks target only one or a few data categories in the dataset while maintaining the accuracy of the other categories. Untargeted attacks are undifferentiated attacks that aim to reduce the accuracy of all data categories. According to the abilities of the adversaries, Byzantine attacks can also be divided into data attacks [37] and model attacks [35]. In data attacks, the attackers indirectly poison the global model by poisoning the clients' local data. For example, the label-flipping attack flips the labels of normal features to a target class, and the backdoor attack seeks a set of parameters that establishes a strong link between a trigger and a target label while minimizing the impact on benign input classification. In model attacks, the attackers can directly manipulate and control the model updates communicated between the clients and the server, which directly affects the accuracy of the global model.

2.2.2. Byzantine-Robust Federated Learning

Typical FL schemes, such as FedSGD [38] and FedAvg, are not resistant to poisoning attacks, resulting in untrustworthy and inaccurate global models. By comparing the local gradients received from different parties, Byzantine-robust federated learning (BRFL) [16,39,40,41] excludes abnormal gradients to protect the training model from poisoning attacks. Based on the Euclidean distance between any two gradients, Krum [16] excludes outlier gradients; specifically, in each training round, a candidate model is selected. Similarly, Trim-mean [39] first sorts the received local gradients, then removes the largest and smallest among them, and finally takes the mean of the remaining gradients as the global gradient. GeoMed [42] updates the FL model by selecting a gradient based on the geometric median to realize gradient aggregation. Bulyan [43] first applies Krum-based aggregation and then updates the global model by averaging the gradients closest to the median value. Meanwhile, based on the cosine similarity between the parties' historical gradients, FoolsGold [35] sets per-party weights to ward off Sybil attacks in FL.
However, these schemes do not consider privacy threats, as they aggregate the global model directly in plaintext. To solve this problem, the FL scheme of [40], which combines DP and SMC, can not only prevent inference attacks during training but also improve model accuracy. However, because it is difficult to compute the similarity between encrypted gradients, this scheme remains vulnerable to malicious gradients. To this end, privacy-enhanced federated learning (PEFL) [41], which employs HE as the underlying technology, punishes malicious parties via a logarithmic function of the gradient data. However, PEFL still does not prevent malicious behavior of the servers or inference attacks against clients.
To develop a federated learning framework that is both Byzantine-resilient and privacy-preserving, we use TFHE to protect user privacy, cosine similarity to remove malicious gradients, and blockchain to avoid malicious server behavior.

2.3. Blockchain-Based Federated Learning and Swarm Learning

Because it relies heavily on a single server, traditional FL can neither avoid a single point of failure nor resist malicious attacks. As the infrastructure of Bitcoin, blockchain has attracted much attention because of its decentralization and reliability. Several blockchain-based FL protocols [44,45,46,47] replace servers with blockchain, enabling privacy protection and enhancing trust among parties. BAFFLE [44] uses blockchain to avoid single points of failure and aggregates local models using smart contracts (SCs). Similarly, through blockchain smart contracts, BlockFL [46] exchanges and validates local model updates; by providing incentives tied to training samples, BlockFL can integrate more devices and more training samples. Meanwhile, as a state-of-the-art BCFL, swarm learning (SL) [13,48] combines edge computing, blockchain networking, and federated learning while maintaining confidentiality without a central coordinator. In the e-health field, SL has been successfully applied to the diagnosis of COVID-19, tuberculosis, leukemia, and lung diseases [13], skin lesion classification fairness [49], genomics data sharing [50], and risk prediction of cardiovascular events [51].
However, the above schemes impose heavy computation and communication costs on the blockchain nodes and cannot distinguish the malicious gradients. To address these issues, the blockchain-based federated learning framework with committee consensus (BFLC) [45] can effectively reduce consensus computing and prevent malicious attacks through innovative committee consensus mechanisms.
Finally, Table 1 compares our PBSL scheme with the various FL schemes described above. The comparison shows that PBSL can not only resist Byzantine attacks and tolerate parties dropping out, but also protect privacy and improve computational efficiency.

3. Preliminaries

In this section, we review the building blocks of PBSL, namely swarm learning and fully homomorphic encryption.

3.1. Swarm Learning

As a state-of-the-art BCFL, swarm learning (SL) discards the centralized server, shares local gradients through the swarm network, and independently builds models on local data at the swarm edge nodes. SL ensures the confidentiality, privacy, and security of sensitive data with the help of blockchain. Unlike clients in traditional FL, edge nodes not only perform training tasks but also execute transactions. The workflow of SL is as follows. A new edge node first registers with the smart contract (SC) pre-deployed on the blockchain. Then, the swarm edge nodes perform a new training round, including downloading the global model and performing localized training. Next, all swarm edge nodes commit their local gradients to the SC through the swarm network. Finally, the SC aggregates all local gradients to create an updated model.
Suppose there are $K$ parties $\{P_1, P_2, \ldots, P_K\}$ that constitute a collaborative group, in which each party $P_k$ holds its local dataset $D_k$ ($k = 1, 2, \ldots, K$), and $D = \{D_1, D_2, \ldots, D_K\}$ denotes the joint dataset. Without disclosing their local sensitive data, the $K$ parties aim to cooperatively train a global classifier $f(x; w)$ over the complete dataset $D$. As illustrated in Figure 1b, in the $t$th iteration, party $P_k$ downloads the global model $w_{t-1}$ from the blockchain and trains the local model on the local dataset $D_k$. Equation (1) describes the objective function optimized by training:

$$F(x_i, w, y_i) = \min_{w} \mathbb{E}_{(x_i, y_i) \sim \tilde{D}}\, L(f(x_i; w), y_i) \tag{1}$$

where $(x_i, y_i)$ is a training sample, $L(\cdot, \cdot)$ is the empirical error function, and $\tilde{D}$ is the training sample distribution. We adopt stochastic gradient descent (SGD) to solve this minimization problem. Using $w_{t-1}$ and $D_k$, each party $P_k$ computes its local gradient $g_k(w_t)$ via Equation (2):

$$g_k(w_t) = \sum_{i=1}^{|D_k|} \frac{d}{d w_t} L(f(x_i^k; w_{t-1}), y_i^k) \tag{2}$$

Through the consensus mechanism, the winning party performs the model update operation with Equation (3):

$$w_t = w_{t-1} - \gamma \sum_{k=1}^{K} \zeta_k\, g_k(w_t) \tag{3}$$

where $\zeta_k = \frac{|D_k|}{|D|}$, $\sum_{k=1}^{K} \zeta_k = 1$, and $\gamma$ is the local learning rate. Once the update operation is completed, the winning party sends the updated results through transactions to all parties.
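To make Equations (2) and (3) concrete, the following minimal NumPy sketch shows one SL round; the per-sample gradient function loss_grad is an illustrative placeholder, not an interface from the paper.

```python
import numpy as np

def local_gradient(w, dataset, loss_grad):
    # Equation (2): sum of per-sample loss gradients over the local dataset D_k
    return sum(loss_grad(w, x, y) for x, y in dataset)

def winner_update(w_prev, local_grads, local_sizes, gamma):
    # Equation (3): dataset-size-weighted aggregation (zeta_k = |D_k| / |D|)
    # followed by a gradient step, performed by the consensus winner
    total = sum(local_sizes)
    agg = sum((n / total) * np.asarray(g)
              for n, g in zip(local_sizes, local_grads))
    return w_prev - gamma * agg
```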

3.2. The CKKS Scheme

As a typical fully homomorphic encryption (FHE) scheme, the Cheon–Kim–Kim–Song (CKKS) scheme [34] is a cryptographic technique in which addition and multiplication on plaintexts correspond to the same operations on cipher-texts. With its unique encoding, decoding, and rescaling mechanisms, CKKS can encrypt floating-point numbers and vectors. CKKS consists of the following functions:
(1) Key Generation: $(sk, pk, evk) \leftarrow KeyGen(1^\lambda)$. This procedure generates a public key $pk$ for encryption, a corresponding secret key $sk$ for decryption, and a key $evk$ for evaluation.
(2) Encoding: $m \leftarrow Ecd(z, s)$. Taking an $N/2$-dimensional vector $z = (z_j)_{j \in T}$ and a scaling factor $s$ as inputs, the procedure transforms $z$ into a polynomial $m$ by encoding, i.e., $m = \sigma^{-1} \circ \pi^{-1}(z \cdot s)$, where $\sigma^{-1}$ and $\pi^{-1}$ are the inverses of $\sigma$ and $\pi$, respectively.
(3) Encryption: $c \leftarrow Enc(m, pk)$. Taking the given polynomial $m \in R$ and public key $pk$ as inputs, the procedure outputs the cipher-text $c \in R_{q_L}^k$.
(4) Addition: $\hat{c} \leftarrow Add(c_1, c_2)$. Taking the cipher-texts $c_1$ and $c_2$ as inputs, the procedure outputs $\hat{c} = c_1 \oplus c_2$.
(5) Multiplication: $\tilde{c} \leftarrow Mult(c_1, c_2, evk)$. Taking the cipher-texts $c_1$ and $c_2$ and the evaluation key $evk$ as inputs, the procedure outputs $\tilde{c} = c_1 \otimes c_2 \in R_{q_l}^k$.
(6) Rescaling: $c' \leftarrow RS(c)$. Taking a cipher-text $c \in R_{q_l}^k$ at level $l$ as input, the procedure outputs the cipher-text $c' \leftarrow \lfloor \frac{q_{l'}}{q_l} c \rceil$ in $R_{q_{l'}}^k$, where $\lfloor x \rceil$ denotes the integer closest to the real number $x$.
(7) Decryption: $m \leftarrow Dec(c, sk)$. Taking the cipher-text $c$ and the secret key $sk$ as inputs, the procedure outputs the polynomial $m$.
(8) Decoding: $z \leftarrow Dcd(m, s)$. Taking the plaintext polynomial $m \in R$ as input, the procedure transforms $m$ into a vector $z$ using $z = \pi \circ \sigma(s^{-1} \cdot m)$.
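The following TenSEAL sketch illustrates the encode/encrypt/evaluate/decrypt cycle above; TenSEAL is the CKKS library used in our implementation (Section 7), but the parameter choices here are illustrative assumptions only.

```python
import tenseal as ts

# CKKS context; polynomial degree and modulus chain are toy parameters
ctx = ts.context(ts.SCHEME_TYPE.CKKS, poly_modulus_degree=8192,
                 coeff_mod_bit_sizes=[60, 40, 40, 60])
ctx.global_scale = 2 ** 40   # scaling factor s used by Ecd/Dcd
ctx.generate_galois_keys()   # evaluation keys (needed for rotations/dot)

c1 = ts.ckks_vector(ctx, [1.0, 2.0, 3.0])  # Ecd + Enc
c2 = ts.ckks_vector(ctx, [4.0, 5.0, 6.0])

c_add = c1 + c2         # Add: homomorphic element-wise addition
c_mul = c1 * c2         # Mult followed by implicit rescaling
print(c_add.decrypt())  # Dec + Dcd: approximately [5.0, 7.0, 9.0]
print(c_mul.decrypt())  # approximately [4.0, 10.0, 18.0]
```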

3.3. Threshold Fully Homomorphic Encryption

To construct distributed FHE, threshold fully homomorphic encryption (TFHE) [52] has been proposed, in which each participant has a share of a secret key corresponding to an FHE public key.
Let $P = \{P_1, \ldots, P_N\}$ be a set of parties. A threshold fully homomorphic encryption (TFHE) scheme is a tuple of PPT algorithms $TFHE = (TFHE.Setup, TFHE.Encrypt, TFHE.Eval, TFHE.PartDec, TFHE.FinDec)$ satisfying the following specifications:
  • $(pk, sk_1, \ldots, sk_N) \leftarrow TFHE.Setup(1^\lambda, 1^d, \mathbb{A})$: The algorithm takes as input a security parameter $\lambda$, a circuit depth $d$, and an access structure $\mathbb{A}$, and outputs a public key $pk$ and secret key shares $sk_1, \ldots, sk_N$.
  • $ct \leftarrow TFHE.Encrypt(pk, \mu)$: The algorithm takes as input a public key $pk$ and a single-bit plaintext $\mu \in \{0, 1\}$, and outputs a cipher-text $ct$.
  • $\hat{ct} \leftarrow TFHE.Eval(pk, C, ct_1, \ldots, ct_k)$: The algorithm takes as input a public key $pk$, a boolean circuit $C: \{0,1\}^k \rightarrow \{0,1\}$ of depth at most $d$, and cipher-texts $ct_1, \ldots, ct_k$ encrypted under the same public key, and outputs an evaluation cipher-text $\hat{ct}$.
  • $p_i \leftarrow TFHE.PartDec(pk, ct, sk_i)$: The algorithm takes as input a public key $pk$, a cipher-text $ct$, and a secret key share $sk_i$, and outputs a partial decryption $p_i$ associated with the party $P_i$.
  • $\hat{\mu} \leftarrow TFHE.FinDec(pk, B)$: The algorithm takes as input a public key $pk$ and a set $B = \{p_i\}_{i \in S}$ for some $S \subseteq \{P_1, \ldots, P_N\}$ (recall that we identify a party $P_i$ with its index $i$), and deterministically outputs a plaintext $\hat{\mu} \in \{0, 1, \bot\}$.
We construct $TFHE$ from the schemes $FHE = (FHE.Setup, FHE.Encrypt, FHE.Eval, FHE.Decrypt)$ and $SS = (SS.Share, SS.Combine)$ as follows:
  • $TFHE.Setup(1^\lambda, 1^d, \mathbb{A}_t)$: The algorithm takes as input $\lambda$, $d$, and $\mathbb{A}_t \in TAS$, and generates $(fhepk, fhesk) \leftarrow FHE.Setup(1^\lambda, 1^d)$. Then, $fhesk$ is divided into shares using $(fhesk_1, \ldots, fhesk_N) \leftarrow SS.Share(fhesk, \mathbb{A}_t)$.
  • $TFHE.Encrypt(pk, \mu)$: The algorithm takes as input $pk$ and $\mu \in \{0, 1\}$, computes $ct \leftarrow FHE.Encrypt(pk, \mu)$, and outputs $ct$.
  • $TFHE.Eval(pk, C, ct_1, \ldots, ct_k)$: The algorithm takes as input $pk$, $C$, and $ct_1, \ldots, ct_k$, computes $\hat{ct} \leftarrow FHE.Eval(C, ct_1, \ldots, ct_k)$, and outputs $\hat{ct}$.
  • $TFHE.PartDec(pk, ct, sk_i)$: The algorithm takes as input $pk$, $ct$, and $sk_i \in \mathbb{Z}_q^n$, computes $p_i = FHE.Decode_0(sk_i, ct) + (N!)^2 \cdot e \in \mathbb{Z}_p$, and outputs $p_i$.
  • $TFHE.FinDec(pk, B)$: The algorithm takes as input $pk$ and $\{p_i\}_{i \in S}$ and checks whether $S \in \mathbb{A}$. If not, it outputs $\bot$. Otherwise, it arbitrarily chooses a satisfying subset $S' \subseteq S$ of size $t$, computes the Lagrange coefficients $\lambda_{i,0}^{S'}$ for all $i \in S'$, computes $\mu \leftarrow FHE.Decode_1(\sum_{i \in S'} \lambda_{i,0}^{S'} p_i)$, and outputs $\mu$.
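As a concrete illustration of the SS.Share/SS.Combine building block used above, here is a minimal (t, N)-Shamir secret sharing sketch over a prime field; the field modulus and the integer encoding of the secret are toy assumptions, not the parameters of the scheme.

```python
import random

PRIME = 2 ** 127 - 1  # toy field modulus (assumption); must exceed the secret

def ss_share(secret, t, n):
    # SS.Share: random degree-(t-1) polynomial f with f(0) = secret;
    # party i receives the share (i, f(i) mod PRIME)
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    f = lambda x: sum(c * pow(x, j, PRIME) for j, c in enumerate(coeffs)) % PRIME
    return [(i, f(i)) for i in range(1, n + 1)]

def ss_combine(shares):
    # SS.Combine: Lagrange interpolation at x = 0 recovers f(0) = secret
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = ss_share(secret=123456789, t=3, n=5)
assert ss_combine(shares[:3]) == 123456789  # any 3 of the 5 shares suffice
```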

4. Problem Statement

In this section, we formulate our system model, threat model, and design goals.

4.1. System Model

As shown in Figure 2, our PBSL involves three entities: a trusted authority (TA), medical centers (MCs), and a blockchain (BC)-based peer-to-peer network. Each MC is equipped with a PC server, and the MCs are interconnected through the BC-based peer-to-peer network. The three entities are described below.
  • TA stands for the Trusted Authority, which is responsible for generating the initial parameters and public/secret key pairs and distributing them to the MCs.
  • MCs $= \{MC_1, MC_2, \ldots, MC_K\}$ represents the $K$ medical centers participating in swarm learning. In our system, each $MC_k \in$ MCs trains the model on its local medical dataset, encrypts its gradient, shares the encrypted gradient on the blockchain, and trains cooperatively with the other MCs to achieve a more accurate global model.
  • BC represents the blockchain platform, a self-organizing network of interconnected medical centers (MCs). Every node (i.e., MC) on the blockchain has the opportunity to aggregate the global model, so BC can replace a central aggregator. A smart contract (SC) deployed on the blockchain homomorphically aggregates the encrypted gradients to obtain the global collaborative model.

4.2. Threat Model

TA is a trusted third party. Considering the autonomy of MCs in swarm learning, we adopt the “curious-and-malicious” threat model, which is stronger than the “honest-and-curious” threat model. Specifically, curious MCs honestly upload true gradients trained on their local datasets while attempting to extract more information from the gradients shared by other MCs on the blockchain, thus compromising users’ data privacy. To reduce the global model accuracy, malicious MCs can perform malicious operations or deliberately abandon local training. The potential threats to our PBSL are as follows:
  • Privacy: Since local gradients contain MCs’ private information, if the MCs upload their local gradients directly in plaintext, adversaries can infer the private information of honest MCs, which leads to data leakage.
  • Byzantine Attacks: A Byzantine attack refers to arbitrary behavior by a system participant who does not follow the designed protocol. For example, a Byzantine MC may send a random vector in place of its local gradient, or incorrectly aggregate the local gradients of all MCs.
  • Poisoning Attacks: A poisoning attack refers to various malicious behaviors launched by the MCs. For example, an MC may flip the labels of its samples and upload the resulting toxic gradient.

4.3. Design Goals

Under the system and threat models described above, the proposed privacy-preserving swarm learning (PBSL) scheme should provide confidentiality and privacy guarantees, resist Byzantine attacks, and reduce computational overheads. Moreover, our scheme should achieve the same precision as classical swarm learning (SL). Specifically, the following goals should be achieved:
  • Confidentiality: Guaranteeing the confidentiality of the global model stored on the blockchain. Specifically, after swarm learning, only MCs can access the global model stored on the blockchain.
  • Privacy: Guaranteeing the privacy of each MC’s local sensitive information. Specifically, during swarm learning, no MC’s local training data can be leaked to other MCs.
  • Robustness: Detecting malicious MCs and discarding their local gradients during swarm learning. Specifically, a mechanism is constructed to identify malicious medical centers and discard their submitted local model gradients.
  • Accuracy: While protecting data privacy and resisting malicious attacks, the model accuracy of PBSL training should match that of the original SL training.
  • Efficiency: Reducing the high computational and communication costs of the encryption operations in a privacy-preserving SL scheme.
  • Tolerance to MC drop-out: Guaranteeing privacy and convergence even if up to $D$ MCs are delayed or drop out.

5. Our Privacy-Preserving Swarm Learning

As shown in Figure 3, our PBSL described in this section mainly includes four stages: (I) system initialization, (II) local computation, (III) model aggregation, and (IV) global model decryption and updating.
Firstly, TA sets the initialization parameters, generates the key pair $\langle sk_{FHE}, pk_{FHE} \rangle$, divides $sk_{FHE}$ into $K$ shares through Shamir’s secret sharing, and distributes the $K$ shares to the $K$ MCs, respectively.
Then, each MC trains its model on its local dataset and encrypts its gradients with the CKKS fully homomorphic encryption scheme [34]; the encrypted gradients are broadcast to the blockchain-based network.
Furthermore, the MC that wins the consensus mechanism uses a cosine-similarity-based strategy to punish malicious gradient vectors, obtains the encrypted global model by homomorphically aggregating the encrypted local gradients, and stores the encrypted collaborative learning results on the blockchain.
Finally, the MCs obtain the encrypted global model from the blockchain and cooperatively decrypt it to recover the global model. We summarize PBSL in Algorithm 1 and describe each stage in detail below. Table 2 lists the notations used in our PBSL.
Algorithm 1: PBSL

5.1. System Initialization

5.1.1. Blockchain P2P Network Construction

MCs $= \{MC_1, MC_2, \ldots, MC_K\}$ form the blockchain-based peer-to-peer network. The MCs are pseudonymous, i.e., they hold pseudo public keys $pk_1^{psu}, pk_2^{psu}, \ldots, pk_K^{psu}$, whose corresponding secret keys $sk_1, sk_2, \ldots, sk_K$ are kept private, respectively.

5.1.2. Cryptosystem Initialization

Firstly, TA generates a key pair $\langle sk_{FHE}, pk_{FHE} \rangle \leftarrow KeyGen(1^\lambda)$. TA sets a security parameter $\lambda$ and chooses a power-of-two $M = M(\lambda, q_L)$, an integer $h = h(\lambda, q_L)$, an integer $P = P(\lambda, q_L)$, and a real value $\sigma = \sigma(\lambda, q_L)$. Then, TA samples $s \leftarrow \mathcal{HWT}(h)$, $a \leftarrow R_{q_L}$, and $e \leftarrow \mathcal{DG}(\sigma^2)$, where $\mathcal{HWT}(h)$ is the set of signed binary vectors whose Hamming weight is exactly $h$, and $\mathcal{DG}(\sigma^2)$ samples a vector in $\mathbb{Z}^N$ by drawing each coefficient independently from the discrete Gaussian distribution of variance $\sigma^2$. The secret key is set as $sk_{FHE} \leftarrow (1, s)$ and the public key as $pk_{FHE} \leftarrow (b, a) \in R_{q_L}^2$, where $b \leftarrow -as + e \ (\mathrm{mod}\ q_L)$.
Secondly, TA randomly divides $sk_{FHE}$ into $K$ parts. TA computes $\delta = \eta \cdot \lambda$ s.t. $\eta \cdot \lambda \equiv 1 \bmod N^2$, sets a threshold $m$ s.t. $(m-1) \leq N$, and defines the polynomial of the secret sharing protocol $f(x) = \lambda + \sum_{i=1}^{m-1} a_i x^i$, where $a_1, a_2, \ldots, a_{m-1} \in_R \mathbb{Z}_\lambda^*$. TA divides $sk_{FHE}$ into $K$ parts by calculating $f(i)$, and broadcasts $\langle g, pk_{FHE}, \eta \rangle$ to all MCs through the blockchain-based network.
Finally, each $MC_k \in$ MCs asks TA for its own private key share $sk_{FHE}^k = f(k)$.

5.1.3. Registration

To participate in swarm learning, each $MC_k \in$ MCs needs to register with the smart contract (SC) published on the blockchain, which specifies the entire deep learning task. Each $MC_k$ constructs a zero-knowledge proof $proof_{sk_k}$ of its private key $sk_k$. The $proof_{sk_k}$ is defined as $\langle c, r \rangle$, where $v \in_R \mathbb{Z}_q$, $t = g^v$, $c = H(g, pk_k^{psu}, t)$, and $r = (v - c \cdot sk_k) \bmod q$. Then, $MC_k$ submits the “Registration” transaction $\langle pk_k^{psu}, proof_{sk_k} \rangle$ to the blockchain. Each MC on the blockchain then verifies $proof_{sk_k}$, i.e., checks $c \stackrel{?}{=} H(g, pk_k^{psu}, g^r (pk_k^{psu})^c)$, to prevent an adversary from registering with a $pk_k^{psu}$ copied from an honest $MC_l$. If the verification passes, $MC_k$ is registered as a member of the learning group and its $pk_k^{psu}$ is stored on the blockchain.
To protect the public broadcast channels, we can construct the symmetric encryption key $key_{kl} = (pk_k^{psu})^{sk_l} = (pk_l^{psu})^{sk_k}$ and use a symmetric encryption algorithm $Enc_{key_{kl}}(\cdot)$ to encrypt the traffic between $MC_k$ and $MC_l$.
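For clarity, a minimal sketch of this Schnorr-style registration proof is given below; the group parameters g and q are toy assumptions, not the ones deployed in PBSL.

```python
import hashlib
import secrets

# toy Schnorr group (assumption): q prime, g a generator of a subgroup of Z_q^*
q = 2 ** 255 - 19
g = 5

def H(*parts):
    data = b"|".join(str(p).encode() for p in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def register(sk):
    pk = pow(g, sk, q)
    v = secrets.randbelow(q - 1)           # ephemeral nonce v
    t = pow(g, v, q)                       # commitment t = g^v
    c = H(g, pk, t)                        # challenge c = H(g, pk, t)
    r = (v - c * sk) % (q - 1)             # response, mod the group order
    return pk, (c, r)

def verify(pk, proof):
    c, r = proof
    t = pow(g, r, q) * pow(pk, c, q) % q   # g^r * (pk)^c = g^v = t
    return c == H(g, pk, t)

pk, proof = register(secrets.randbelow(q - 1))
assert verify(pk, proof)
```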

5.1.4. Parameter Initialization

To initialize the system parameters, a randomly selected $MC_l$ sets the initial parameters $p = (\eta, b, w_0)$, where $\eta$ is the learning rate, $b$ is the batch size, and $w_0$ is the initial model parameters. Then, $MC_l$ broadcasts the encrypted parameters $\overline{p_{lk}} = Enc_{key_{lk}}(p)$ to every other $MC_k$ ($k \neq l$). Upon receiving $\overline{p_{lk}}$ from $MC_l$, $MC_k$ decrypts it to obtain $p = Dec_{key_{lk}}(\overline{p_{lk}})$. Finally, all MCs hold the same initial parameters $p$.

5.2. Local Computation

Algorithm 2 describes the Local_Computation procedure in detail. The local computation procedure consists of four parts: local model training, normalization, encryption, and sharing, described in detail below.

5.2.1. Local Model Training

In the $t$th iteration, each $MC_k$ obtains the encrypted global model $[[w_{t-1}]]_{pk_{FHE}}$ from the latest block $B_{t-1}$ on the blockchain. Then, each $MC_k$ calls Collab_Decryption$(\cdot)$ to obtain the local model $w_{t-1}$. Finally, using $D_k$ and $w_{t-1}^k$, each MC computes the local gradient $g_t^k$, as shown in Equation (4):

$$g_t^k = \nabla L(w_{t-1}^k, D_k) \tag{4}$$

where $L(w_{t-1}^k, D_k)$ denotes the empirical loss function and $\nabla$ denotes the derivation operation.
Algorithm 2: Local_Computation

5.2.2. Normalization

To directly apply our cosine-similarity-based aggregation rule to cipher-texts and to mitigate the effect of malicious gradients with larger magnitudes, the local gradients $\{g_t^k\}_{k=1}^{K}$ need to be normalized. For each $MC_k$, we use Equation (5) to normalize the local gradient $g_t^k$:

$$\tilde{g}_t^k = \frac{g_t^k}{\|g_t^k\|} \tag{5}$$

where $\tilde{g}_t^k$ is a unit vector.

5.2.3. Encryption

To ensure the confidentiality of local model updates, as shown in Algorithm 3, each $MC_k$ encrypts $\tilde{g}_t^k$ using $pk_{FHE}$. Because gradients are signed floating-point vectors, they are encrypted using the CKKS scheme [34] instead of the Paillier scheme [53], which incurs high computation overheads. $\tilde{g}_t^k$ is treated as a vector and encrypted using $pk_{FHE}$.
Algorithm 3: Global_Aggregation
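A minimal sketch of this normalize-then-encrypt step on one MC is shown below, again using TenSEAL; the shared CKKS context ctx standing in for $pk_{FHE}$ is an assumption for illustration.

```python
import numpy as np
import tenseal as ts

def encrypt_local_gradient(ctx, g_t_k):
    # Equation (5): normalize the local gradient to a unit vector so that
    # the later encrypted inner product equals the cosine similarity
    g_unit = np.asarray(g_t_k) / np.linalg.norm(g_t_k)
    # CKKS encryption of the signed floating-point vector under pk_FHE
    return ts.ckks_vector(ctx, g_unit.tolist())
```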

5.2.4. Message Encapsulation and Sharing

Each $MC_k$ encapsulates $pk_k^{psu}$, $[[\tilde{g}_t^k]]_{pk}$, and a timestamp $t$ into the following message:

$$M_t^k = (pk_k^{psu}, [[\tilde{g}_t^k]]_{pk}, t)$$

and broadcasts $M_t^k$ to the other MCs.

5.3. Global Aggregation

5.3.1. PoW Competition

To compete for the right to append blocks to the blockchain, all MCs that make up the blockchain execute a consensus algorithm (e.g., PoW). The single winner $MC_j$ obtains the right to construct a block and add it to the blockchain.

5.3.2. Homomorphic Aggregation

In the $t$th iteration, the winner $MC_j$ is responsible for aggregating all the encrypted gradients stored in the memory pool. $MC_j$ decodes $\{M_t^1, M_t^2, \ldots, M_t^K\}$ and homomorphically aggregates the encrypted gradients to generate the encrypted global gradient $[[\tilde{g}_t]]_{pk_{FHE}}$ without any decryption, according to the following equation:

$$[[\tilde{g}_t]]_{pk_{FHE}} = \frac{1}{K} \sum_{k=1}^{K} [[\tilde{g}_t^k]]_{pk_{FHE}}$$

5.3.3. Secure Cosine Similarity Computation

Then, we compute the cosine similarity $cs_t^k$ between the two gradients $\tilde{g}_t = [p_1, p_2, \ldots, p_n]$ and $\tilde{g}_t^k = [q_1, q_2, \ldots, q_n]$. The encrypted gradients are obtained as $[[\tilde{g}_t]]_{pk_{FHE}} = Enc_{pk_{FHE}}([p_1, p_2, \ldots, p_n])$ and $[[\tilde{g}_t^k]]_{pk_{FHE}} = Enc_{pk_{FHE}}([q_1, q_2, \ldots, q_n])$. Since the gradients are normalized, the encrypted cosine similarity $[[cs_t^k]]_{pk_{FHE}}$ between $[[\tilde{g}_t]]_{pk_{FHE}}$ and $[[\tilde{g}_t^k]]_{pk_{FHE}}$ reduces to the inner product:

$$[[cs_t^k]]_{pk_{FHE}} = [[\tilde{g}_t]]_{pk_{FHE}} \odot [[\tilde{g}_t^k]]_{pk_{FHE}} = [[p_1 q_1 + p_2 q_2 + \cdots + p_n q_n]]_{pk_{FHE}} \tag{6}$$

If $cs_t^k < 0$, the angle between $\tilde{g}_t^k$ and $\tilde{g}_t$ is greater than $90°$, i.e., $\tilde{g}_t^k$ has a negative impact on $\tilde{g}_t$. The trust score $S_t^k$ of $MC_k$ is defined by Equation (7) to punish malicious gradients:

$$S_t^k = ReLU(cs_t^k) \tag{7}$$

Equation (8) gives the definition of the $ReLU$ function:

$$ReLU(a) = \begin{cases} a, & \text{if } a > 0 \\ 0, & \text{if } a \leq 0 \end{cases} \tag{8}$$

Following previous work [54], the cipher-text homomorphic evaluation of the $ReLU$ function is given below:

$$[[S_t^k]]_{pk_{FHE}} = ReLU([[cs_t^k]]_{pk_{FHE}}) = \begin{cases} \frac{1}{2}([[cs_t^k]]_{pk_{FHE}} + [[1]]_{pk_{FHE}}), & \text{if } \frac{1}{2}([[cs_t^k]]_{pk_{FHE}} + [[1]]_{pk_{FHE}}) > [[\frac{1}{2}]]_{pk_{FHE}} \\ [[\frac{1}{2}]]_{pk_{FHE}}, & \text{if } \frac{1}{2}([[cs_t^k]]_{pk_{FHE}} + [[1]]_{pk_{FHE}}) \leq [[\frac{1}{2}]]_{pk_{FHE}} \end{cases}$$

To compare the cipher-texts $\frac{1}{2}([[cs_t^k]]_{pk_{FHE}} + [[1]]_{pk_{FHE}})$ and $[[\frac{1}{2}]]_{pk_{FHE}}$, the numerical method of [54] for comparing homomorphic ciphers in the range $[0, 1]$ is invoked. The encrypted score is then rescaled as

$$[[S_t^k]]_{pk_{FHE}} \leftarrow 2 \cdot [[S_t^k]]_{pk_{FHE}} - [[1]]_{pk_{FHE}}$$
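As an illustration of Equations (6)-(8), the sketch below computes the encrypted inner product with TenSEAL; the homomorphic comparison of [54] is not reproduced here, so the ReLU clipping is shown on a plaintext value purely for clarity.

```python
import tenseal as ts

def encrypted_cosine_similarity(enc_g, enc_gk):
    # Equation (6): for unit vectors, the homomorphic inner product of two
    # CKKS vectors is the encrypted cosine similarity [[cs_t^k]]
    return enc_g.dot(enc_gk)

def trust_score(cs):
    # Equations (7)-(8): ReLU clips negative similarities to zero; in PBSL
    # this step is evaluated homomorphically following [54]
    return max(cs, 0.0)
```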

5.3.4. Model Aggregation and Updating

In the $t$th epoch, the winner $MC_j$ homomorphically aggregates the local gradients weighted by their trust scores according to the following equation:

$$[[\tilde{g}_t]]_{pk_{FHE}} = \frac{\sum_{k=1}^{K} [[S_t^k]]_{pk_{FHE}} \cdot [[\tilde{g}_t^k]]_{pk_{FHE}}}{\sum_{k=1}^{K} [[S_t^k]]_{pk_{FHE}}} = \frac{\sum_{k=1}^{K} ReLU([[\tilde{g}_t]]_{pk_{FHE}} \odot [[\tilde{g}_t^k]]_{pk_{FHE}}) \cdot [[\tilde{g}_t^k]]_{pk_{FHE}}}{\sum_{k=1}^{K} [[S_t^k]]_{pk_{FHE}}}$$

and subsequently updates the global model:

$$[[w_t]]_{pk_{FHE}} = [[w_{t-1}]]_{pk_{FHE}} - \gamma [[\tilde{g}_t]]_{pk_{FHE}}$$

5.3.5. Construction of New Block

$MC_j$ constructs the new block $B_t$, which contains the following:

$$B_t = ([[w_t]]_{pk_{FHE}}, [M_t^1, M_t^2, \ldots, M_t^K])$$

and appends $B_t$ to the blockchain.

5.4. Collaborative Global Model Decryption

As described in Algorithm 4, each $MC_k$ downloads the newly generated block $B_t$ from the blockchain and executes the partial decryption algorithm to obtain $w_t^k$, i.e.,

$$w_t^k = [[w_t]]_{pk_{FHE}} \cdot sk_{FHE}^k$$

Then, each $MC_k$ broadcasts the encrypted

$$\overline{(w_t^k)_{kj}} = Enc_{key_{kj}}(w_t^k)$$

to every other $MC_j$ ($j \neq k$). Upon receiving $\overline{(w_t^k)_{kj}}$ from $MC_k$, $MC_j$ computes $w_t^k = Dec_{key_{kj}}(\overline{(w_t^k)_{kj}})$. Furthermore, $MC_j$ randomly selects $v$ ($v \geq m$) numbers from the set $\{1, 2, \ldots, K\}$ to form a decryption set $DS$. Finally, $MC_j$ executes the final decryption algorithm to obtain $w_t$, i.e.,

$$w_t = \sum_{k \in DS} \lambda_k w_t^k$$

where $\lambda_k$ ($k \in DS$) is the Lagrange coefficient.
Algorithm 4: Collab_Decryption

6. Security Analysis of PBSL

In this section, we revisit the security requirements of PBSL described in Section 4 and analyze the security of PBSL.

6.1. Privacy and Confidentiality

Theorem 1.
PBSL achieves the privacy of the local model gradient $g_t^k$ and the confidentiality of the global model $w_t$.
Proof.
We show that $g_t^k$ and $w_t$ are protected at the different stages of swarm learning.
  • In the local computation stage, $g_t^k$ is encrypted with $pk_{FHE}$, i.e.,

$$[[g_t^k]]_{pk_{FHE}} = Enc(g_t^k, pk_{FHE})$$

where $t \in \{1, 2, \ldots, T\}$ and $k \in \{1, 2, \ldots, K\}$, and TA splits $sk_{FHE}$ into $K$ distributed secret keys $sk_{FHE}^1, sk_{FHE}^2, \ldots, sk_{FHE}^K$ for the MCs. Because any $MC_k$ holding only a single $sk_{FHE}^k$ cannot decrypt $[[g_t^k]]_{pk_{FHE}}$, PBSL protects the privacy of $g_t^k$ at this stage.
  • In the global model aggregation stage, the winning $MC_j$ homomorphically aggregates $\{[[g_t^k]]_{pk_{FHE}}\}_{k=1,2,\ldots,K}$ to generate $[[\tilde{g}_t]]_{pk_{FHE}}$, then homomorphically calculates $[[S_t^k]]_{pk_{FHE}}$ via $[[S_t^k]]_{pk_{FHE}} = 2 \cdot ReLU([[\tilde{g}_t]]_{pk_{FHE}} \odot [[\tilde{g}_t^k]]_{pk_{FHE}}) - [[1]]_{pk_{FHE}}$, and updates

$$[[w_t]]_{pk_{FHE}} = [[w_{t-1}]]_{pk_{FHE}} - \gamma \frac{\sum_{k=1}^{K} [[S_t^k]]_{pk_{FHE}} \cdot [[\tilde{g}_t^k]]_{pk_{FHE}}}{\sum_{k=1}^{K} [[S_t^k]]_{pk_{FHE}}}$$

The whole process is performed on cipher-texts, and an $MC_k$ with only a single $sk_{FHE}^k$ cannot decrypt $[[w_t]]_{pk_{FHE}}$. Therefore, PBSL protects the confidentiality of $w_t$ at this stage.
  • In the collaborative global model decryption stage, all MCs decrypt $[[w_t]]_{pk_{FHE}}$ distributively, and each $MC_k$ shares its partial decryption result $w_t^k$. Because any $MC_k$ with only a single $sk_{FHE}^k$ cannot decrypt $[[w_t]]_{pk_{FHE}}$, PBSL protects the confidentiality of $w_t$ at this stage. □

6.2. Byzantine Robustness

Theorem 2.
For an arbitrary number of malicious MCs, in the $t$th iteration, the loss between the learned model $w_t$ and the optimal model $\hat{w}$ under no attacks is bounded, i.e.,

$$\|w_t - \hat{w}\| \leq (1 - \rho)^t \|w_0 - \hat{w}\| + 12\beta\Delta_1/\rho + error$$

with probability at least $1 - \delta$, where $error$ is caused by CKKS.
Proof.
Our PBSL improves the encryption algorithm of FLTrust [55] and extends FLTrust to the blockchain, so we can directly apply FLTrust’s security analysis to PBSL. According to FLTrust, in our PBSL, the loss between the learned model $w_t$ in the $t$th iteration and the optimal model $\hat{w}$ satisfies

$$\|w_t - \hat{w}\| \leq (1 - \rho)^t \|w_0 - \hat{w}\| + 12\beta\Delta_1/\rho + error$$

with probability at least $1 - \delta$, where $error$ is caused by CKKS. As described in CKKS [34], Equation (19) gives the bound of the encoding and encryption errors:

$$B = 8\sqrt{2}\sigma N + 6\sigma\sqrt{N} + 16\sigma\sqrt{hN} \tag{19}$$

Therefore, the $error$ generated by the cipher-text operations of our PBSL can be expressed as follows:

$$error = B + B_1 + B_2 + v_1 B_2 + v_2 B_1 + B_1 B_2 + B_{mult}(l)$$

where the $error$ from $\oplus$ is $B_1 + B_2$, the $error$ from $\otimes$ is $v_1 B_2 + v_2 B_1 + B_1 B_2 + B_{mult}(l)$, $B_{mult}(l) = P^{-1} \cdot q_l \cdot B_{ks} + B_{scale}$, $B_{ks} = 8\sigma N / \sqrt{3}$, $P = P(\lambda, q_L)$, and $B_{scale} = \sqrt{N/3} \cdot (3 + 8\sqrt{h})$. Then, when $|1 - \rho| < 1$, we have $\lim_{t \to \infty} \|w_t - \hat{w}\| \leq 12\beta\Delta_1/\rho + error$. □
Therefore, the loss between the learned model $w_t$ and the optimal model $\hat{w}$ is bounded.

7. Evaluation

In this section, we evaluate our scheme and present experimental results analyzing the performance of PBSL from two perspectives: defense effectiveness against attacks and computational resource overhead.

7.1. Experimental Setup

7.1.1. Experimental Environment

Implementation Settings: To evaluate the integrated performance, we implement the PBSL scheme on a private Ethereum blockchain with real medical datasets. In our experiments, a smart contract performs the secure global gradient aggregation algorithm instead of a centralized server.
  • The blockchain system uses Ethereum, a blockchain-based smart contract platform [56,57,58]. Smart contracts developed in the Solidity programming language are deployed to the private blockchain using Truffle v5.10. Ethereum is deployed on a desktop computer with a 3.3 GHz Intel(R) Xeon(R) CPU, 16 GB of memory, and Windows 10.
  • The CKKS scheme is deployed using the TenSEAL library, a library for homomorphic encryption operations on tensors, whose code is available on GitHub (https://github.com/OpenMined/TenSEAL, accessed on 9 June 2024).
Dataset: Our PBSL scheme was evaluated using two widely used datasets, the MNIST dataset and the FashionMNIST dataset.
  • The MNIST dataset consists of handwritten digit images from 250 different people. Each sample is a grayscale image of 28 × 28 pixels. The dataset is divided into a training set of 60,000 samples and a test set of 10,000 samples. We distributed the entire dataset evenly across 10 MCs.
  • The FashionMNIST dataset consists of images in 10 categories, with 60,000 training samples and 10,000 test samples. The dataset was also distributed evenly across 10 MCs.

7.1.2. Byzantine Attacks

In our experiments, the following Byzantine attacks were considered:
  • Label Flipping Attack: By changing the labels of training samples to another category, the label-flipping attacker misleads the classifier. To simulate this attack, the labels of the samples on each malicious $MC_k$ are modified from $l$ to $N - l - 1$, where $N$ is the total number of labels and $l \in \{0, 1, \ldots, N-1\}$.
  • Backdoor Attack: Backdoor attackers force classifiers to make wrong decisions on samples with specific characteristics. To simulate this attack, we choose a 6 × 6 pixel patch as the trigger, randomly select a number of training images on each malicious MC, overwrite the 6 × 6 pixels at the top left of each selected image, and reset the labels of all the selected images to the target class (see the sketch after this list).
  • Arbitrary Model Attack: The model attacker arbitrarily tampers with some of the local model parameters. To reproduce this attack, a malicious $MC_k$ selects a random number as its $j$th local model parameter.
  • Untargeted Model Poisoning: The model attacker constructs a toxic local model that is similar to benign local models but reverses the correct global gradient, or is much larger (or smaller) than the maximum (or minimum) of the benign local models.
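The sketch below reproduces the two data poisoning attacks on a NumPy image batch; the trigger value and target class are illustrative assumptions.

```python
import numpy as np

def label_flip(labels, num_classes):
    # label-flipping attack: relabel class l as N - l - 1
    return num_classes - labels - 1

def plant_backdoor(images, labels, target_class, trigger_value=1.0):
    # backdoor attack: overwrite the 6x6 top-left patch of each selected
    # image with the trigger and relabel the sample to the target class
    poisoned = images.copy()
    poisoned[:, :6, :6] = trigger_value
    return poisoned, np.full_like(labels, target_class)
```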

7.1.3. Evaluation Metrics

To verify that our PBSL scheme improves model accuracy, we use the test accuracy and test error rate as evaluation metrics on the different training datasets. For different numbers of malicious MCs, the results of our PBSL scheme were verified using the no-attack FedSGD scheme [38] as the baseline, with Krum [16], Bulyan [43], and Trim-mean [39] as controls.

7.1.4. SL Settings

In our experiments, the total number $K$ of MCs is set to 10, and all MCs are selected in each iteration during training. We selected models derived from convolutional neural networks (CNNs) with the structure Input → Conv → FC → Output, trained on the different datasets using Python, NumPy, and TensorFlow; a minimal sketch of this model is given below. The training data are evenly distributed to each MC, i.e., for both the MNIST and FashionMNIST datasets, the size of each local dataset is 6000. We considered proportions of malicious MCs from 10% to 50%. Our PBSL scheme is compared with SL and Krum, and the batch size $b$ is set to 128.
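The following Keras sketch instantiates the Input → Conv → FC → Output structure; the filter count, kernel size, and hidden width are assumptions, since the paper fixes only the layer types, the 28 × 28 inputs, and the batch size of 128.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),  # Conv layer
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),     # FC layer
    tf.keras.layers.Dense(10, activation="softmax"),   # Output layer
])
model.compile(optimizer="sgd",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```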

7.2. Experimental Results

To demonstrate that the PBSL scheme meets our goals of accuracy, robustness, privacy, efficiency, and reliability, we conduct the following experiments. The experimental results show that PBSL can achieve the expected goals, which are discussed below.

7.2.1. Experimental Results on Defense Performance

Impact of the Number of MCs: The test accuracy was measured while increasing the number of MCs involved in collaborative training, from 1 to 10. Figure 4 shows that the more MCs participate, the higher the training accuracy.
Impact of Different Proportions of Malicious MCs: To check the robustness of our PBSL scheme, we evaluate the accuracy loss caused by changing the ratio of malicious MCs. Figure 5 and Figure 6 show the results of PBSL training on the two datasets in the presence of different proportions of malicious MCs. The experiments show that the proposed scheme is reasonable and robust.
Impact of Different Attacks: To prove Byzantine robustness against typical Byzantine attacks, we launched targeted and untargeted attacks against our PBSL scheme. Table 3 shows the training results on the MNIST and FashionMNIST datasets in the presence of different proportions of malicious MCs under targeted and untargeted attacks. The results show that PBSL maintains the same accuracy as the baseline under these attacks, achieving the goals of robustness and accuracy.

7.2.2. Efficiency Comparison with Related FL Approaches

Comparison of PBSL with alternative privacy protection schemes. In the PBSL scheme, all operations, including similarity calculations and model aggregation, are computed on cipher-texts, and no participant can recover the MCs’ original plaintexts from the cipher-texts. Therefore, the PBSL scheme meets the MCs’ requirements for data privacy. To verify the efficiency of PBSL, we evaluated PBSL and BatchCrypt [33], an HE-based federated learning scheme, using the Paillier-based federated learning scheme PEFL [41] as a benchmark. For each MC, we use the time required for encryption and decryption in each iteration as the criterion. As shown in Figure 7, PBSL greatly improves computational efficiency and reduces computational costs compared with the baseline and BatchCrypt. PBSL adopts a novel encryption scheme, CKKS [34], which has unique advantages in encrypting large-scale floating-point vectors and network models with many parameters.
Comparison of PBSL with alternative blockchain-based schemes. Table 4 compares the model accuracy and mining overhead on the FashionMNIST dataset among LearningChain [59], BCFL [45], and PBSL, where α denotes the training time per iteration and β denotes the mining time per block. First, we observe that the proposed PBSL achieves the highest model accuracy compared with LearningChain and BCFL. This is because the privacy protection of PBSL uses fully homomorphic encryption (FHE) instead of differential privacy, which improves model accuracy. Second, the mining overhead of PBSL is less than that of LearningChain but more than that of BCFL. This is because LearningChain defends against Byzantine attacks at the cost of mining resource consumption, while BCFL reduces the mining overhead at the expense of model robustness.
Model aggregation is implemented as functions of the smart contract, and the computational cost of a smart contract function is assessed in gas units. Therefore, we measure the computational consumption of PBSL in gas. Using the CNN model, gas costs are measured in each iteration with 10 MCs. After all the MCs have submitted their local updates in cipher-text, the SC obtains the global gradient at 30,500,242 gas per transaction by homomorphically aggregating all local updates.

8. Conclusions

In this article, we review the shortcomings of swarm learning (SL) and propose a new privacy-preserving Byzantine-resilient swarm learning scheme for e-health, called PBSL, which ensures the privacy of local gradients and tolerance toward Byzantine attacks. With the CKKS cryptosystem based on threshold decryption and decentralized computing based on blockchain, each MC can safely aggregate the other MCs’ gradients to generate a more accurate global model, while the confidentiality of the final global model is ensured. In the process of global model aggregation, our proposed Byzantine-robust aggregation method uses secure cosine similarity to punish malicious MCs. Detailed security analysis and extensive experiments show that PBSL provides both privacy protection and efficiency.
In this paper, our attack detection strategy is based on the cosine similarity between the local model gradients and the global model gradient. In future work, to counter new attack vectors, we will continue to study attack detection strategies that suit blockchain platforms and take privacy protection into account, so as to deal with unforeseen threats and future-proof the system.

Author Contributions

Conceptualization, X.Z.; Methodology, X.Z.; Software, T.L.; Writing—original draft, X.Z.; Writing—review & editing, T.L.; Supervision, H.L.; Project administration, H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: [http://yann.lecun.com/exdb/mnist/, https://github.com/zalandoresearch/fashion-mnist].

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Meier, C.; Fitzgerald, M.C.; Smith, J.M. Ehealth: Extending, enhancing, and evolving health care. Annu. Rev. Biomed. Eng. 2013, 15, 359–382. [Google Scholar] [CrossRef] [PubMed]
  2. Liu, X.; Lu, R.; Ma, J.; Chen, L.; Qin, B. Privacy-preserving patient-centric clinical decision support system on naive bayesian classification. IEEE J. Biomed. Health Inform. 2016, 20, 655–668. [Google Scholar] [CrossRef] [PubMed]
  3. Rahulamathavan, Y.; Veluru, S.; Phan, C.W.; Chambers, J.A.; Rajarajan, M. Privacy-preserving clinical decision support system using gaussian kernel-based classification. IEEE J. Biomed. Health Inform. 2014, 18, 56–66. [Google Scholar] [CrossRef] [PubMed]
  4. Wiens, J.; Saria, S.; Sendak, M.; Ghassemi, M.; Liu, V.X.; Doshi-Velez, F.; Jung, K.; Heller, K.; Kale, D.; Saeed, M.; et al. Do no harm: A roadmap for responsible machine learning for healthcare. Nat Med. 2019, 2019, 1337–1340. [Google Scholar] [CrossRef] [PubMed]
  5. Courtiol, P.; Maussion, C.; Moarii, M.; Pronier, E.; Pilcer, S.; Sefta, M.; Manceron, P.; Toldo, S.; Zaslavskiy, M.; Le Stang, N.; et al. Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat. Med. 2019, 25, 1519–1525. [Google Scholar] [CrossRef] [PubMed]
  6. Warnat-Herresthal, S.; Perrakis, K.; Taschler, B.; Becker, M.; Baßler, K.; Beyer, M.; Günther, P.; Schulte-Schrepping, J.; Seep, L.; Klee, K.; et al. Scalable Prediction of Acute Myeloid Leukemia Using High-Dimensional Machine Learning and Blood Transcriptomics. iScience 2020, 23, 100780. [Google Scholar] [CrossRef] [PubMed]
  7. Rajkomar, A.; Dean, J.; Kohane, I. Machine learning in medicine. N. Engl. J. Med. 2019, 380, 1347–1358. [Google Scholar] [CrossRef] [PubMed]
  8. Savage, N. Calculating disease. Nature 2017, 550, 115–117. [Google Scholar] [CrossRef] [PubMed]
  9. Ping, P.; Hermjakob, H.; Polson, J.S.; Benos, P.V.; Wang, W. Biomedical informatics on the cloud: A treasure hunt for advancing cardiovascular medicine. Circ. Res. 2018, 122, 1290–1301. [Google Scholar] [CrossRef]
  10. Kaissis, G.A.; Makowski, M.R.; Ruckert, D.; Braren, R.F. Secure privacy-preserving and federated machine learning in medical imaging. Nat. Mach. Intell. 2020, 2, 305–311. [Google Scholar] [CrossRef]
  11. Konecny, J.; McMahan, H.B.; Yu, F.X.; Richtárik, P.; Suresh, A.T.; Bacon, D. Federated learning: Strategies for improving communication efficiency. arXiv 2016, arXiv:1610.05492. [Google Scholar]
  12. Mcmahan, H.B.; Moore, E.; Ramage, D.; Arcas, B. Federated learning of deep networks using model averaging. arXiv 2016, arXiv:1602.05629v1. [Google Scholar]
  13. Warnat-Herresthal, S.; Schultze, H.; Shastry, K.L.; Manamohan, S.; Mukherjee, S.; Garg, V.; Sarveswara, R.; Händler, K.; Pickkers, P.; Aziz, N.A.; et al. Swarm Learning for decentralized and confidential clinical machine learning. Nature 2021, 594, 265–270. [Google Scholar] [CrossRef] [PubMed]
  14. Song, C.; Ristenpart, T.; Shmatikov, V. Machine learning models that remember too much. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; ACM: New York, NY, USA, 2017; pp. 587–601. [Google Scholar]
  15. Melis, L.; Song, C.; De Cristofaro, E.; Shmatikov, V. Inference attacks against collaborative learning. arXiv 2018, arXiv:1805.04049. [Google Scholar]
  16. Blanchard, P.; El Mhamdi, E.M.; Guerraoui, R.; Stainer, J. Machine learning with adversaries: Byzantine tolerant gradient descent. Adv. Neural Inf. Process. Syst. 2017, 30, 119–129. [Google Scholar]
  17. Bagdasaryan, E.; Veit, A.; Hua, Y.; Estrin, D.; Shmatikov, V. How to backdoor federated learning. arXiv 2018, arXiv:1807.00459. [Google Scholar]
  18. Shokri, R.; Shmatikov, V. Privacy-preserving deep learning. In Proceedings of the ACM Conference on Computer and Communications Security, Denver, CO, USA, 12–16 October 2015; pp. 1310–1321. [Google Scholar]
  19. Heikkila, M.A.; Koskela, A.; Shimizu, K.; Kaski, S.; Honkela, A. Differentially private cross-silo federated learning. arXiv 2020, arXiv:2007.05553. [Google Scholar]
  20. Zhao, L.; Wang, Q.; Zou, Q.; Zhang, Y.; Chen, Y. Privacy-preserving collaborative deep learning with unreliable participants. IEEE Trans. Inf. Forensics Secur. 2020, 15, 1486–1500. [Google Scholar] [CrossRef]
  21. Xu, R.; Baracaldo, N.; Zhou, Y.; Anwar, A.; Ludwig, H. HybridALpha: An efficient approach for privacy-preserving federated learning. In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, London, UK, 15 November 2019; pp. 13–23. [Google Scholar]
  22. Hinton, G.; Deng, L.; Yu, D.; Dahl, G.E.; Mohamed, A.R.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T.N.; et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Process. Mag. 2012, 29, 82–97. [Google Scholar] [CrossRef]
  23. Chan, T.-H.; Jia, K.; Gao, S.; Lu, J.; Zeng, Z.; Ma, Y. Pcanet: A simple deep learning baseline for image classification? IEEE Trans. Image Process. 2015, 24, 5017–5032. [Google Scholar] [CrossRef]
  24. Aliper, A.; Plis, S.; Artemov, A.; Ulloa, A.; Mamoshina, P.; Zhavoronkov, A. Deep learning applications for predicting pharmacological properties of drugs and drug repurposing using transcriptomic data. Mol. Pharm. 2016, 13, 2524–2530. [Google Scholar] [CrossRef]
  25. Budaher, J.; Almasri, M.; Goeuriot, L. Comparison of several word embedding sources for medical information retrieval. In Proceedings of the Working Notes of CLEF 2016—Conference and Labs of the Evaluation Forum, Évora, Portugal, 5–8 September 2016; pp. 43–46. [Google Scholar]
  26. Rav, D.; Wong, C.; Deligianni, F.; Berthelot, M.; Andreu-Perez, J.; Lo, B.; Yang, G.Z. Deep learning for health informatics. IEEE J. Biomed. Health Inf. 2017, 21, 4–21. [Google Scholar] [CrossRef]
  27. Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.B.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 1175–1191. [Google Scholar]
  28. Mohassel, P.; Zhang, Y. Secureml: A system for scalable privacy-preserving machine learning. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 25 May 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 19–38. [Google Scholar]
  29. Wagh, S.; Gupta, D.; Chandran, N. Securenn: 3-party secure computation for neural network training. Proc. Priv. Enhanc. Technol. 2019, 3, 26–49. [Google Scholar] [CrossRef]
30. Chaudhari, H.; Rachuri, R.; Suresh, A. Trident: Efficient 4PC framework for privacy preserving machine learning. In Proceedings of the 27th Annual Network and Distributed System Security Symposium, NDSS, San Diego, CA, USA, 23–26 February 2020. [Google Scholar]
  31. Phong, L.T.; Aono, Y.; Hayashi, T.; Wang, L.; Moriai, S. Privacy-preserving deep learning via additively homomorphic encryption. IEEE Trans. Inf. Forensics Secur. 2018, 13, 1333–1345. [Google Scholar] [CrossRef]
  32. Dong, Y.; Chen, X.; Shen, L.; Wang, D. EaSTFLy: Efficient and secure ternary federated learning. Comput. Secur. 2020, 94, 101824. [Google Scholar] [CrossRef]
33. Zhang, C.; Li, S.; Xia, J.; Wang, W. BatchCrypt: Efficient homomorphic encryption for cross-silo federated learning. In Proceedings of the USENIX Annual Technical Conference (USENIX ATC 20), Online, 15–17 July 2020; pp. 493–506. [Google Scholar]
  34. Cheon, J.; Kim, A.; Kim, M.; Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Advances in Cryptology, Proceedings of the ASIACRYPT 2017: 23rd International Conference on the Theory and Applications of Cryptology and Information Security, Hong Kong, China, 3–7 December 2017; Springer: Cham, Switzerland, 2017; pp. 409–437. [Google Scholar]
  35. Fung, C.; Yoon, C.J.M.; Beschastnikh, I. Mitigating sybils in federated learning poisoning. arXiv 2018, arXiv:1808.04866. [Google Scholar]
  36. Fang, M.; Cao, X.; Jia, J.; Gong, N. Local model poisoning attacks to byzantine-robust federated learning. In Proceedings of the 29th USENIX Security Symposium (USENIX Security), Boston, MA, USA, 12–14 August 2020; pp. 1605–1622. [Google Scholar]
  37. Tolpegin, V.; Truex, S.; Gursoy, M.E.; Liu, L. Data poisoning attacks against federated learning systems. In Proceedings of the 25th European Symposium on Research in Computer Security, ESORICS 2020, Guildford, UK, 14–18 September 2020; Springer: Cham, Switzerland, 2020; pp. 480–501. [Google Scholar]
  38. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B.A.Y. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282. [Google Scholar]
  39. Yin, D.; Chen, Y.; Ramchandran, K.; Bartlett, P.L. Byzantine-robust distributed learning: Towards optimal statistical rates. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 5636–5645. [Google Scholar]
  40. Truex, S.; Baracaldo, N.; Anwar, A.; Steinke, T.; Ludwig, H.; Zhang, R.; Zhou, Y. A hybrid approach to privacy-preserving federated learning. In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, London, UK, 15 November 2019; pp. 1–11. [Google Scholar]
  41. Liu, X.; Li, H.; Xu, G.; Chen, Z.; Huang, X.; Lu, R. Privacy-enhanced federated learning against poisoning adversaries. IEEE Trans. Inf. Forensics Secur. 2021, 16, 4574–4588. [Google Scholar] [CrossRef]
  42. Chen, Y.; Su, L.; Xu, J. Distributed statistical machine learning in adversarial settings: Byzantine gradient descent. Proc. ACM Meas. Anal. Comput. Syst. 2017, 1, 1–25. [Google Scholar] [CrossRef]
43. El Mhamdi, E.M.; Guerraoui, R.; Rouault, S. The hidden vulnerability of distributed learning in Byzantium. In Proceedings of the International Conference on Machine Learning (ICML 2018), Stockholm, Sweden, 10–15 July 2018; pp. 3521–3530. [Google Scholar]
  44. Ramanan, P.; Nakayama, K. BAFFLE: Blockchain based aggregator free federated learning. In Proceedings of the International Conference on Blockchain (Blockchain), Virtual Event, 2–6 November 2020; pp. 72–81. [Google Scholar]
  45. Li, Y.; Chen, C.; Liu, N.; Huang, H.; Zheng, Z.; Yan, Q. A blockchain-based decentralized federated learning framework with committee consensus. IEEE Netw. 2021, 35, 234–241. [Google Scholar] [CrossRef]
  46. Kim, H.; Park, J.; Bennis, M.; Kim, S.L. Blockchained on-device federated learning. IEEE Commun. Lett. 2020, 24, 1279–1283. [Google Scholar] [CrossRef]
  47. Weng, J.; Zhang, J.; Li, M.; Zhang, Y.; Luo, W. DeepChain: Auditable and Privacy-Preserving Deep Learning with Blockchain-Based Incentive. IEEE Trans. Dependable Secur. Comput. 2021, 18, 2438–2455. [Google Scholar] [CrossRef]
  48. Warnat-Herresthal, S.; Schultze, H.; Shastry, K.; Manamohan, S.; Mukherjee, S.; Garg, V.; Sarveswara, R.; Händler, K.; Pickkers, P.; Aziz, N.A.; et al. Swarm Learning as a privacy-preserving machine learning approach for disease classification. bioRxiv 2020. [Google Scholar] [CrossRef]
49. Fan, D.; Wu, Y.; Li, X. On the Fairness of Swarm Learning in Skin Lesion Classification. In Clinical Image-Based Procedures, Distributed and Collaborative Learning, Artificial Intelligence for Combating COVID-19 and Secure and Privacy-Preserving Machine Learning, Proceedings of the 10th Workshop, CLIP 2021, Second Workshop, DCL 2021, First Workshop, LL-COVID19 2021, and First Workshop and Tutorial, PPML 2021, Strasbourg, France, 27 September–1 October 2021; arXiv:2109.12176; Springer: Berlin/Heidelberg, Germany, 2021. [Google Scholar]
50. Oestreich, M.; Chen, D.; Schultze, J.; Fritz, M.; Becker, M. Privacy considerations for sharing genomics data. EXCLI J. 2021, 20, 1243. [Google Scholar]
  51. Westerlund, A.; Hawe, J.; Heinig, M.; Schunkert, H. Risk Prediction of Cardiovascular Events by Exploration of Molecular Data with Explainable Artificial Intelligence. Int. J. Mol. Sci. 2021, 22, 10291. [Google Scholar] [CrossRef] [PubMed]
  52. Jain, A.; Rasmussen, P.M.R.; Sahai, A. Threshold Fully Homomorphic Encryption. Cryptology ePrint Archive, Report 2017/257. 2017. Available online: http://eprint.iacr.org/2017/257 (accessed on 9 June 2024).
53. Paillier, P. Public-key cryptosystems based on composite degree residuosity classes. In Proceedings of the International Conference on the Theory and Application of Cryptographic Techniques, Prague, Czech Republic, 2–6 May 1999; pp. 223–238. [Google Scholar]
  54. Cheon, J.; Kim, D.; Lee, H.; Lee, K. Numerical method for comparison on homomorphically encrypted numbers. In Proceedings of the ASIACRYPT 2019 25th International Conference on the Theory and Application of Cryptology and Information Security, Kobe, Japan, 8–12 December 2019; pp. 415–445. [Google Scholar]
  55. Cao, X.; Fang, M.; Liu, J.; Gong, N.Z. FLTrust: Byzantine-robust federated learning via trust bootstrapping. In Proceedings of the Network and Distributed System Security Symposium, Virtual, 21–25 February 2021; pp. 1–18. [Google Scholar] [CrossRef]
  56. Li, Z.; Kang, J.; Yu, R.; Ye, D.; Deng, Q.; Zhang, Y. Consortium blockchain for secure energy trading in industrial internet of things. IEEE Trans. Ind. Inform. 2018, 14, 3690–3700. [Google Scholar] [CrossRef]
  57. Hu, S.; Cai, C.; Wang, Q.; Wang, C.; Luo, X.; Ren, K. Searching an encrypted cloud meets blockchain: A decentralized, reliable and fair realization. In Proceedings of the IEEE INFOCOM 2018-IEEE Conference on Computer Communications, Honolulu, HI, USA, 16–19 April 2018. [Google Scholar]
  58. Wang, S.; Zhang, Y.; Zhang, Y. A blockchain-based framework for data sharing with fine-grained access control in decentralized storage systems. IEEE Access 2018, 6, 437–450. [Google Scholar] [CrossRef]
  59. Chen, X.; Luo, J.J.; Liao, C.W.; Li, P. When machine learning meets blockchain: A decentralized, privacy-preserving and secure design. In Proceedings of the 2018 IEEE International Conference on Big Data (Big Data), Seattle, WA, USA, 10–13 December 2018; pp. 1178–1187. [Google Scholar]
Figure 1. (a) Traditional federated learning (FL) architecture and (b) swarm learning (SL) architecture.
Figure 2. System model of PBSL.
Figure 3. An overview of the proposed PBSL.
Figure 4. The impact of the number of MCs.
Figure 5. Comparison of accuracy with different numbers of malicious MCs under different attacks on different datasets.
Figure 6. Comparison of accuracy with different epochs under different attacks on different datasets.
Figure 7. Comparison of the time cost of encryption and decryption in baseline, BatchCrypt, and PBSL.
Table 1. A comparative summary between our scheme and previous schemes.

| Schemes | Privacy Protection | Poisoning Defense | Central Server or Blockchain | Robust to Off-Line | Lightweight Computation |
|---|---|---|---|---|---|
| SecProbe [20] | DP-based | × | Central server | ✓ | ✓ |
| PPDL [31] | HE-based | × | Central server | × | × |
| PPML [27] | MPC-based | × | Central server | ✓ | × |
| Krum [16] | × | ✓ | Central server | ✓ | × |
| Trim-mean [39] | × | ✓ | Central server | ✓ | × |
| FoolsGold [35] | × | ✓ | Central server | ✓ | × |
| PEFL [41] | HE-based (Paillier) | ✓ | Central server | ✓ | × |
| BAFFLE [44] | × | × | Blockchain | × | × |
| BlockFL [46] | × | × | Blockchain | × | × |
| Swarm Learning [13] | × | × | Blockchain | × | × |
| BFLC [45] | × | ✓ | Blockchain | ✓ | × |
| PBSL (ours) | HE-based (CKKS) | ✓ | Blockchain | ✓ | ✓ |
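The "HE-based (CKKS)" entry for PBSL above refers to aggregating encrypted gradients under the CKKS cryptosystem. The snippet below is a minimal sketch of that idea using the open-source TenSEAL library, which is our illustrative choice rather than the paper's stated implementation; the encryption parameters and gradient values are toy examples, and this single-key setup omits the threshold splitting of sk_FHE listed in Table 2.

```python
# Illustrative only: CKKS gradient encryption and ciphertext aggregation
# with TenSEAL (assumed library; parameters are toy values, not PBSL's).
import tenseal as ts

# Public CKKS context shared by all medical centers.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40

# Each medical center MC_k encrypts its local gradient under pk_FHE.
local_gradients = [
    [0.12, -0.05, 0.33],  # MC_1
    [0.10, -0.07, 0.29],  # MC_2
    [0.11, -0.06, 0.31],  # MC_3
]
ciphertexts = [ts.ckks_vector(context, g) for g in local_gradients]

# Gradients are summed homomorphically; no plaintext gradient is revealed.
aggregate = ciphertexts[0]
for ct in ciphertexts[1:]:
    aggregate = aggregate + ct

# Decryption; in the threshold scheme this would require >= m key shares.
print(aggregate.decrypt())  # approx. [0.33, -0.18, 0.93]
```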
Table 2. Summary of notations in PBSL.

| Notation | Implications |
|---|---|
| K | the number of medical centers |
| m | the threshold of Shamir secret sharing |
| MC_1, …, MC_K | a set of K medical centers |
| pk_k^{psu}, sk_k | pseudo public key and secret key of medical center MC_k |
| α | the security parameter |
| pk_FHE, sk_FHE | the keys of the CKKS cryptosystem |
| sk_FHE^k | the distributed secret key shares split from sk_FHE |
| key_kl | the symmetric encryption key between medical centers MC_k and MC_l |
| Enc_{key_kl}(·), Dec_{key_kl}(·) | the symmetric encryption and decryption algorithms using key_kl |
| w_t | the weight of the global model after the (t−1)-th training iteration |
| [[·]]_{pk_FHE} | the ciphertext generated by the CKKS algorithm |
| v | the size of the decryption set DS |
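Table 2's threshold m and decryption set DS of size v come from Shamir secret sharing of the CKKS secret key sk_FHE into shares sk_FHE^k. The sketch below shows plain m-out-of-K Shamir sharing over a prime field; the helper names, field size, and toy integer secret are ours for illustration, and a real TFHE deployment shares structured key material rather than a single integer.

```python
# A small sketch of m-out-of-K Shamir secret sharing over a prime field.
# Toy values only; in PBSL the shared secret is the CKKS key sk_FHE.
import random

PRIME = 2 ** 127 - 1  # a Mersenne prime, large enough for the toy secret

def make_shares(secret: int, m: int, k: int):
    """Split `secret` into k shares; any m of them reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(m - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, k + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = make_shares(123456789, m=3, k=5)
print(reconstruct(shares[:3]))  # any 3 of the 5 shares suffice: 123456789
```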
Table 3. Accuracy (%) of PBSL and Bulyan with different percentages of malicious MCs under untargeted attack and targeted attack on different datasets.

| Dataset | Scheme (Attack) | 10% | 20% | 30% | 40% | 50% |
|---|---|---|---|---|---|---|
| MNIST | PBSL (Untargeted) | 98.0 | 98.0 | 97.0 | 97.0 | 96.0 |
| MNIST | Bulyan (Untargeted) | 97.0 | 95.0 | 94.0 | 80.0 | 58.0 |
| MNIST | PBSL (Targeted) | 98.0 | 97.0 | 97.0 | 97.0 | 96.0 |
| MNIST | Bulyan (Targeted) | 97.0 | 96.0 | 94.0 | 84.0 | 78.0 |
| FashionMNIST | PBSL (Untargeted) | 87.0 | 87.0 | 86.0 | 85.0 | 85.0 |
| FashionMNIST | Bulyan (Untargeted) | 85.0 | 84.0 | 83.0 | 79.0 | 54.0 |
| FashionMNIST | PBSL (Targeted) | 88.0 | 88.0 | 87.0 | 87.0 | 86.0 |
| FashionMNIST | Bulyan (Targeted) | 87.0 | 87.0 | 85.0 | 80.0 | 72.0 |
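The poisoning resilience reported in Table 3 rests on the cosine-similarity test applied to uploaded gradients. The following NumPy sketch illustrates that screening rule in the clear, with an assumed zero threshold and a reference gradient of our choosing; in PBSL the comparison is carried out over encrypted gradients, which this plaintext sketch does not capture.

```python
# Plaintext sketch of cosine-similarity screening of uploaded gradients.
# Threshold and reference gradient are illustrative assumptions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_gradients(gradients, reference, threshold=0.0):
    """Keep only gradients whose direction agrees with the reference."""
    return [g for g in gradients
            if cosine_similarity(g, reference) > threshold]

# Honest gradients point in similar directions; a poisoned one is flipped.
honest = [np.array([0.5, 1.0, -0.2]), np.array([0.4, 1.1, -0.3])]
poisoned = [np.array([-0.5, -1.0, 0.2])]
reference = np.mean(honest, axis=0)  # e.g., a trusted baseline gradient

accepted = filter_gradients(honest + poisoned, reference)
print(len(accepted))  # 2: the poisoned update is rejected
```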
Table 4. Comparison of accuracy and mining overhead on FashionMNIST dataset.

| (α, β) | LearningChain Accuracy | LearningChain Overhead | BCFL Accuracy | BCFL Overhead | PBSL Accuracy | PBSL Overhead |
|---|---|---|---|---|---|---|
| (1, 10) | 73.3% | 100 | 76.5% | 70 | 86.3% | 90 |
| (1, 8) | 72.8% | 94 | 75.7% | 70 | 85.2% | 84 |
| (1, 6) | 76.4% | 98 | 77.5% | 74 | 87.4% | 88 |
| (2, 10) | 67.9% | 90 | 72.2% | 52 | 82.1% | 80 |
| (5, 10) | 51.8% | 70 | 56.5% | 46 | 74.2% | 60 |