Article

Differentially Private Client Selection and Resource Allocation in Federated Learning for Medical Applications Using Graph Neural Networks

by Sotirios C. Messinis 1,*, Nicholas E. Protonotarios 2, and Nikolaos Doulamis 1

1 Institute of Communication and Computer Systems, National Technical University of Athens, 15773 Athens, Greece
2 Mathematics Research Center, Academy of Athens, 11527 Athens, Greece
* Author to whom correspondence should be addressed.
Sensors 2024, 24(16), 5142; https://doi.org/10.3390/s24165142
Submission received: 3 July 2024 / Revised: 30 July 2024 / Accepted: 5 August 2024 / Published: 8 August 2024
(This article belongs to the Section Internet of Things)

Abstract

Federated learning (FL) has emerged as a pivotal paradigm for training machine learning models across decentralized devices while maintaining data privacy. In the healthcare domain, FL enables collaborative training among diverse medical devices and institutions, enhancing model robustness and generalizability without compromising patient privacy. In this paper, we propose DPS-GAT, a novel approach integrating graph attention networks (GATs) with differentially private client selection and resource allocation strategies in FL. Our methodology addresses the challenges of data heterogeneity and limited communication resources inherent in medical applications. By employing graph neural networks (GNNs), we effectively capture the relational structures among clients, optimizing the selection process and ensuring efficient resource distribution. Differential privacy mechanisms are incorporated to safeguard sensitive information throughout the training process. Our extensive experiments, based on the Regensburg pediatric appendicitis open dataset, demonstrated the superiority of our approach in terms of model accuracy, privacy preservation, and resource efficiency compared to traditional FL methods. The ability of DPS-GAT to maintain a high and stable number of client selections across various rounds and differential privacy budgets has significant practical implications, indicating that FL systems can achieve strong privacy guarantees without compromising client engagement and model performance. This balance is essential for real-world applications where both privacy and performance are paramount. This study suggests a promising direction for more secure and efficient FL medical applications, which could improve patient care through enhanced predictive models and collaborative data utilization.

1. Introduction

Federated learning (FL) is a distributed learning paradigm, in which models are trained across multiple devices without direct transfer of raw data to a centralized server [1]. In this paradigm, training involves collaboration between multiple clients, such as research institutions, hospitals, and medical devices, enhancing privacy among the involved parties [2]. FL components facilitate the scalability of machine learning applications by ensuring the protection of personal data at the edge [3].
In the context of medical devices, FL enables the training of machine learning models by using data from a vast array of devices [4,5]. To this end, each device contributes its data to the training process, allowing the model to learn from a diverse range of patient populations and clinical scenarios [6,7]. Furthermore, FL can aid in tackling issues associated with data heterogeneity, given that medical devices often produce data of varying characteristics and formats. However, the data transfer required by distributed learning algorithms significantly consumes radio resources, including energy and bandwidth, which are frequently limited in real-world scenarios [8]. Bandwidth allocation, a subject of extensive study in the resource allocation problem for FL (see, for example, [9]), is often integrated with various approaches, such as agent selection and power control, to improve FL's communication efficiency. Simultaneously, achieving satisfactory FL performance under energy limitations is also essential. Moreover, the loss functions of agents depend on their private data, which may include sensitive information, such as health or financial records. Even though direct exchange of raw data is avoided in distributed learning settings, the information shared between agents and servers can be vulnerable to interception by malicious entities, potentially allowing them to infer the agents' private data [10].
Although traditional FL relies on a centralized server to coordinate model training across distributed devices, there is an increasing need for serverless FL settings, usually referred to as decentralized federated learning (DFL) [11]. DFL relies on decentralized model exchange between clients, offering resilience against single-point failures and reducing network congestion through direct device-to-device (D2D) communications [12] (see Figure 1). Despite these advantages, DFL faces several challenges, including long delays due to frequent decentralized model updates, which potentially increase communication and computation resource usage. Heterogeneity among devices and the decentralized structure may result in stragglers with slower updates, reducing aggregation frequency and impacting convergence rates.
Cross-device FL has been widely applied in diverse fields, including mobile phones, internet of things (IoT) devices, and mobile edge computing. In these settings, DFL client devices exhibit considerable variation in data characteristics and system configurations; inadequately addressing this diversity may affect performance [13]. Therefore, managing client heterogeneity has become a key priority. DFL client selection, also referred to as participant selection or device sampling, plays a pivotal role, as it determines the client devices participating in each training round. Effective client selection strategies can greatly improve model performance in terms of accuracy, fairness, robustness, and efficiency by reducing training overheads [14]. Nonetheless, the risk of revealing private information persists through the analysis of uploaded client parameters, such as the weights of deep neural networks [15].
Furthermore, in both FL and DFL, ensuring privacy is critical for users. This need has led to the adoption of the differential privacy (DP) technique, which provides a rigorous mathematical framework for quantifying and minimizing the privacy risks associated with sharing sensitive data during the training process [16]. DP protects the privacy of individuals in large datasets by adding random noise to the data, in such a way that statistical analysis of the data remains accurate without revealing the identity of individual data points [10]. It is particularly useful in healthcare and medical devices, where it helps collect, analyze, and safeguard sensitive health and medical information from unauthorized access [17]. Furthermore, DP is essential in FL applications, to protect individual data, build user trust, and comply with privacy regulations [18]. By introducing controlled noise into the learning process, DP enhances protection against attempts to infer membership or re-identify individuals, thereby complicating efforts by adversaries to obtain sensitive information. This privacy-preserving framework ensures a balance between model utility and user privacy, thus encouraging broader participation in FL, while future-proofing against evolving privacy concerns [19].
In FL environments, clients are typically interconnected, and there is interest in exploiting these connections to enhance performance [20]. In this regard, graph neural networks (GNNs) have become increasingly popular, due to their ability to effectively process data within a graph structure [21]. As data privacy becomes essential, GNNs are expected to meet certain privacy requirements [22]. GNN techniques enhance FL training by leveraging connections within two types of graph information: graphs between clients and layer connectivity within deep neural networks [23]. By addressing variations in local model architectures among clients and by considering the neural network as a graph, GNNs offer a novel approach to balancing interclient variations [24]. For example, devices in close proximity experiencing similar conditions can be modeled as interclient graphs, where each device is represented by a node.
In the present study, we present DPS-GAT (differentially private selection-graph attention networks), a GNN-based architecture that jointly provides optimal resource allocation and client selection with DP guarantees in FL for healthcare. In particular, we solve the underlying optimization problem of client selection and resource allocation by implementing graph attention networks [25] together with differential privacy, extending our initial work [26]. Our solution achieves an optimal trade-off between client selection and resource allocation, while employing appropriate DP techniques. To the best of our knowledge, this is the first study to provide a GNN-based solution considering client selection, resource allocation, and DP in a DFL setting.
The paper is organized as follows: in Section 2 we present related work, while in Section 3 we proceed with our system model and the corresponding problem formulation. In Section 4 we present our contribution to the joint resource allocation and client selection with DP guarantees; in Section 5 we demonstrate the feasibility and performance of our architecture, and in Section 6 we provide the discussion of our results. Finally, in Section 7 we give our concluding remarks.

2. Related Work

2.1. Client Selection and Resource Allocation in Federated Learning

In [8], the authors highlight the significance of distributed learning in next-generation intelligent networks, emphasizing collaborative model training among intelligent agents without centralized raw data processing for privacy preservation. They also highlight the challenge of high communication overhead in wireless systems with limited resources, prompting the need for communication-efficient distributed learning algorithms. Study [27] addresses resource efficiency in FL through intelligent participant selection and the incorporation of updates from straggling participants, resulting in improved model quality and optimized resource utilization, thus enhancing the effectiveness of the training process. In [28], the authors address the resource allocation problem in FL workflows where computational power, network bandwidth, and energy resources are limited. The study explores various resource allocation methods, considering factors such as node capabilities, dataset complexity, and FL workflow characteristics. The authors of [29] propose an innovative framework integrating edge computing with parallel split learning, to overcome challenges regarding privacy requirements on resource-constrained devices while reducing training latency and maintaining accuracy. In [30], the authors provide a comprehensive study of training FL algorithms in realistic wireless network scenarios, solving the joint optimization problem involving learning, wireless resource allocation, and user selection. The authors of [31] propose FEDACS, a novel personalized FL algorithm that introduces an attention-based client selection mechanism, to mitigate challenges arising from non-IID data and data scarcity by prioritizing collaboration among clients with similar data distributions. The authors of [32] introduce a distributed cross-device FL framework, aiming to address limitations in existing centralized client selection approaches. In [33], the authors address the challenge of improving the learning performance of FL in over-the-air computation scenarios, particularly focusing on energy-harvesting clients. In [34], the authors introduce an FL framework augmented with critical learning periods (CLPs) [35]; by adaptively incorporating CLPs into existing FL methods, their approach significantly improves model accuracy while achieving better communication efficiency than state-of-the-art methods.

2.2. Graph Neural Networks for Client Selection and Resource Allocation

The authors of [36] address the distributed power allocation problem in interference-limited wireless networks with dense transceiver pairs, through the design of low-signaling-overhead schemes using GNNs. In [37], the authors introduce a GNN designed to enhance bandwidth allocation for multiple legitimate wireless users transmitting to a base station in the presence of an eavesdropper. Furthermore, the authors of [38] introduce a GNN-based approach for DFL in D2D wireless networks, aiming to minimize total training delay and enhance learning performance. The proposed method employs a multi-head graph attention mechanism to capture diverse client and channel features, and it incorporates a neighbor selection module for clients to choose participating neighbors in model aggregation. Similarly, the study of [39] addresses the challenges of dynamic bandwidth allocation in wireless vehicular networks, namely mobility and heterogeneity, with the first algorithm employing GNNs to predict vehicular network connection topology, prioritizing vehicles for bandwidth allocation based on quality of service (QoS) requirements and proximity. The authors of [40] introduce a novel graph-based client selection framework tailored to the heterogeneity of FL settings, addressing the challenge of diverse hardware configurations and data distributions among mobile devices.

2.3. Differential Privacy in Federated Learning

DP deployment in FL environments has already been documented in the recent literature. Specifically, the authors of [41] introduce a solution to privacy concerns arising in FL by incorporating the concept of user-level differential privacy. In [42], the authors address the challenge of minimizing FL training delay over wireless channels while considering imbalanced data distributions, privacy constraints, and transmission quality. Study [43] introduces an iterative DP algorithm for client selection in FL, addressing scenarios where clients coordinate with a central server to complete tasks while deciding to participate based on local computation and probabilistic intent. The algorithm does not rely on client-to-client information exchange, and it ensures near-optimal values to clients over long-term average participation with a certain differential privacy guarantee. Following a different approach, the authors of [44] present a novel DP-enhanced FL platform that treats privacy as a resource and addresses the challenge of accumulating privacy leakage in multiple FL jobs. In [45], the authors propose a unified FL framework that integrates both horizontal and vertical FL approaches. The framework aims to balance privacy preservation, model utility, and system efficiency, crucial for large-scale model training and deployment. The paper formulates and quantifies trade-offs between privacy leakage, utility loss, and efficiency reduction.

3. System Model

We consider a D2D wireless network with N clients in a given area. The DFL process iterates for R training rounds. To account for the potential mobility of the clients, we model the network as a time-varying graph $G^r(V, E^r)$, where $V$ and $E^r$ denote the set of clients (nodes) and the set of communication links (edges) between nodes in the r-th training round, with $r = 1, 2, \ldots, R$, respectively. Furthermore, we assume that a communication link exists between two clients when their signal-to-interference-plus-noise ratio (SINR) is greater than or equal to a certain threshold, denoted by $\Omega$ [38]. We define the adjacency matrix in the r-th training round, denoted by $A^r$, in the following manner:

$$A^r = \left[ a_{i,j}^r \right], \quad a_{i,j}^r \in \{0, 1\}, \tag{1}$$

where $a_{i,j}^r$ represents the link (or absence thereof) between client i and client j. The set of neighboring nodes of client i in the r-th training round is represented by $N_i^r$. Each client has a local dataset $D_i$, with $|D_i|$ denoting the number of its data samples. Let $\theta_i^R \in \mathbb{R}^d$ denote the local model of client i after R training rounds. The loss function associated with the local model of client i, denoted by $L_i$, is defined as in [38], namely,

$$L_i(\theta_i^R) = \frac{1}{|D_i|} \sum_{z \in D_i} \ell(z; \theta_i^R), \tag{2}$$

where $\ell(z; \theta_i^R)$ denotes the loss on data sample $z \in D_i$ with model $\theta_i^R$.
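To make the graph construction concrete, the following minimal sketch (in Python with NumPy) builds the adjacency matrix $A^r$ of Equation (1) from a pairwise SINR matrix. The SINR input, the function name, and the threshold value are hypothetical stand-ins; the paper derives SINR from transmit powers and channel gains.

```python
import numpy as np

def build_adjacency(sinr: np.ndarray, omega: float) -> np.ndarray:
    """Sketch: adjacency matrix A^r (Equation (1)) for one round.

    `sinr` is an (N, N) matrix of pairwise SINR values (hypothetical input).
    A link (i, j) exists iff SINR_{i,j} >= Omega, as in Section 3.
    """
    A = (sinr >= omega).astype(int)
    np.fill_diagonal(A, 0)            # no self-links
    return A

# Example: 4 clients, random SINR values, threshold Omega = 3
rng = np.random.default_rng(0)
A_r = build_adjacency(rng.uniform(0, 10, size=(4, 4)), omega=3.0)
```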
Furthermore, in the r-th training round, client i trains its model locally with K stochastic gradient descent (SGD) steps. As in [37], we define client i's model at the $\kappa$-th step as $\theta_i^{r,\kappa}$, $1 \le \kappa \le K$, updated as follows:

$$\theta_i^{r,\kappa} = \theta_i^{r,\kappa-1} - \eta^r \, \nabla L_i(\theta_i^{r,\kappa-1}), \tag{3}$$

where $\eta^r$ and $\nabla L_i(\theta_i^{r,\kappa-1})$ denote the learning rate and the gradient of the loss function of client i at the $(\kappa-1)$-th SGD step in the r-th training round, respectively.
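The local update of Equation (3) corresponds to a few lines of standard PyTorch. The sketch below assumes a classification loss (cross-entropy), which matches the imaging tasks considered later but is otherwise our own choice, as are the function and argument names.

```python
import torch
import torch.nn.functional as F
from itertools import cycle

def local_update(model, loader, eta_r: float, K: int):
    """Sketch of client i's K local SGD steps in round r (Equation (3)).

    On entry `model` holds theta_i^{r,0}; on return it holds theta_i^{r,K}.
    The loss follows Equation (2): an average of per-sample losses over D_i.
    """
    opt = torch.optim.SGD(model.parameters(), lr=eta_r)   # eta^r
    batches = cycle(loader)                               # one mini-batch per SGD step
    for _ in range(K):
        x, y = next(batches)
        opt.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()   # gradient of L_i at theta^{r,kappa-1}
        opt.step()        # theta^{r,kappa} = theta^{r,kappa-1} - eta^r * grad
    return model
```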
In this work, each client selects a subset of its neighboring client devices to participate in model aggregation. We define the neighbor selection decision as $s_{i,j}^r \in \{0, 1\}$, satisfying

$$s_{i,j}^r \le a_{i,j}^r, \quad \forall i, j \in V. \tag{4}$$

The neighbor selection matrix for aggregation in the r-th training round is represented by $S^r \in \mathbb{R}^{N \times N}$, with $s_i^r \in \mathbb{R}^{1 \times N}$ denoting the neighbor selection row vector of client i. We define the set of neighbors that client i has selected for aggregation in the r-th training round as $\hat{N}_i^r$, corresponding to the non-zero elements of vector $s_i^r$. Client i updates its local model by aggregating the local models of its selected neighbors in the r-th training round, as follows:

$$\theta_i^{r+1}(s_i^r) = \sum_{j \in \hat{N}_i^r} p_{j,i}^r \, \theta_j^{r,K}, \qquad p_{j,i}^r = \frac{|D_j|}{\sum_{n \in \hat{N}_i^r} |D_n|}, \tag{5}$$

where $p_{j,i}^r$ represents the weight of the model of client j when client i performs aggregation at the end of the r-th training round. We note that $\theta_i^{r+1}$ will then be used as client i's model for local training in the $(r+1)$-th training round.
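A minimal sketch of the aggregation rule of Equation (5), operating on PyTorch state dicts; the helper name and the dictionary-based interface are illustrative assumptions.

```python
import torch

def aggregate(models: dict, dataset_sizes: dict, selected: list) -> dict:
    """Sketch of Equation (5): client i averages the models theta_j^{r,K} of
    its selected neighbors j in N_hat_i^r, weighted by
    p_{j,i}^r = |D_j| / sum_n |D_n|. `models[j]` is a PyTorch state_dict and
    `dataset_sizes[j]` is |D_j| (assumed known).
    """
    total = sum(dataset_sizes[j] for j in selected)
    agg = {k: torch.zeros_like(v, dtype=torch.float32)
           for k, v in models[selected[0]].items()}
    for j in selected:
        w = dataset_sizes[j] / total                  # p_{j,i}^r
        for k, v in models[j].items():
            agg[k] += w * v.float()
    return agg                                        # theta_i^{r+1}
```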
In DFL, each client selects its neighbors, CPU frequency, and transmit power. The chosen values determine the computation and communication delay. The computation delay of a client is the time the client spends on local training. We denote the CPU frequency of client i by $f_i^r \in [f_i^{min}, f_i^{max}]$, where $f_i^{min}$ and $f_i^{max}$ represent the minimum and maximum CPU frequency of client i, respectively. Denoting by $c_i$ the number of CPU cycles client i requires to process one bit of data, we define the computation delay as

$$T_i^{c,r} = \frac{c_i \, |D_i| \, K}{f_i^r}. \tag{6}$$

In terms of communication delays, we define the transmit power of client i in the r-th training round as $p_i^r \in [0, p_i^{max}]$, with $p_i^{max}$ denoting the maximum transmit power of client i. Let $g_{i,j}^r$ denote the channel gain from client i to client j in the r-th training round. At time t, we define the set of clients that transmit their updated models as $M(t) \subseteq V$. The corresponding achievable data rate at time t in the r-th training round is denoted by $R_{i,j}^r(t)$ and is defined as [42]

$$R_{i,j}^r(t) = B \log_2 \left( 1 + \frac{p_i^r \, g_{i,j}^r(t)}{\sum_{\kappa \in M(t)} |g_{\kappa,j}^r|^2 \, p_\kappa^r + \sigma^2} \right), \tag{7}$$

where $\sigma^2$ is the received noise power. Consequently, if we denote the communication delay by $T_{j,i}^{t,r}$, then, considering the different transmit powers and channel gains of the selected clients and the time-varying interference, the delay satisfies

$$\int_{t^r}^{t^r + T_{j,i}^{t,r}} R_{i,j}^r(t) \, dt = \xi_m, \tag{8}$$

where $t^r$ denotes the time at which client i starts receiving the model from client j, and where $\xi_m$ denotes the model size. Finally, $T_i^{t,r} = \max_j T_{j,i}^{t,r}$ represents the total time client i needs to receive the models from its selected neighbors. The total delay of the r-th training round is then expressed as the maximum over clients of the sum of communication and computation delays, which are functions of the neighbor selection matrix, transmit power, and CPU frequency of all the clients, i.e.,

$$T^r = \max_{i \in V} \left( T_i^{c,r} + T_i^{t,r} \right). \tag{9}$$

Due to time-varying topologies, the neighbor selection and resource allocation decisions are independent across training rounds. Here, we define $d^{max}$ as the maximal time interval of each communication round, used to avoid an endless waiting time caused by possible stragglers. The time that clients with available channels require in order to jointly complete an update of their respective FL models in the r-th communication round is given by

$$d(r) = \max_{i \in V} \, \min \left( T^r, d^{max} \right). \tag{10}$$
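The delay model of Equations (6)-(10) can be sketched as follows, under a simplifying assumption of ours (not the paper's): channel conditions are static within a round, so the integral in Equation (8) collapses to $\xi_m / R_{i,j}^r$.

```python
import numpy as np

def round_duration(c, D, K, f, rate, xi_m, d_max):
    """Sketch of Equations (6)-(10) with static within-round channels.

    c, D, f : per-client arrays (CPU cycles per bit, |D_i|, CPU frequency)
    rate    : rate[j, i] from selected neighbor j to client i (0 if unselected)
    xi_m    : model size in bits; d_max : straggler cut-off of Equation (10)
    """
    T_comp = c * D * K / f                            # Equation (6)
    with np.errstate(divide="ignore"):
        T_link = np.where(rate > 0, xi_m / rate, 0.0)  # per-link transfer time
    T_comm = T_link.max(axis=0)                        # T_i^{t,r} = max_j T_{j,i}^{t,r}
    T_r = (T_comp + T_comm).max()                      # Equation (9)
    return min(T_r, d_max)                             # Equation (10)
```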
In order to satisfy the underlying privacy requirements of DFL, we consider a DP mechanism with parameters $\epsilon$ and $\delta$. In our study, the DP budget $\epsilon > 0$ bounds the ratio of output probabilities on neighboring datasets, namely $D_i$ and $D_j$, while $\delta$ represents the probability of the event that this ratio for the two adjacent datasets cannot be bounded by $e^\epsilon$ after adding a privacy-preserving mechanism. We employ local differential privacy (LDP), as in [46], by applying perturbation mechanisms to the users' datasets. We adopt the Gaussian mechanism, which has been widely used in privacy-preserving SGD algorithms: this mechanism satisfies LDP for client i when we properly select the value of the standard deviation (STD), $\sigma_i$, for each client [41]. As in [47], the selected composition of leakage, $\hat{\epsilon}_i$, based on local privacy leakage, is given by
$$\hat{\epsilon}_i = \sqrt{E_i \ln \left( \frac{1}{\delta_i} \right) \ln \left( \frac{2}{\delta_i} \right)} \; \epsilon_i, \tag{11}$$
where $E_i$ is the number of model uploads of client i. Thus, it is crucial to control the largest possible delay among all the clients. Having defined the system model, the next step is to design the dynamic channel assignment mechanism of this work, to minimize the time delay required to complete training. Following the approach of [42], the problem is formulated as follows:
$$\min_{s_i^r \in S^r} \; \sum_{r=1}^{R} d(r), \tag{12}$$

subject to

$$s_{i,j}^r \le a_{i,j}^r, \quad \forall i, j \in V, \tag{12a}$$

$$0 \le p_i^r \le p_i^{max}, \quad \forall i \in V, \tag{12b}$$

$$f_i^{min} \le f_i^r \le f_i^{max}, \quad \forall i \in V, \tag{12c}$$

$$\limsup_{R \to \infty} \frac{1}{R} \sum_{r=1}^{R} I_{i,j}(r) \le \beta_i, \tag{12d}$$
where $\beta_i$ denotes the participation ratio of client i, determined by its DP requirement $(\epsilon_i, \delta_i)$ and its local training model, and $I_{i,j}(r)$ denotes an indicator function of whether the local training model of client j is successfully received by client i in the r-th training round. Constraint (12a) implies that the time used for updating the local models is determined by the neighbor selection matrix $s_{i,j}^r$. It is worth mentioning that even when a minimal positive $\epsilon_i$ is considered, the clients' local training models respond differently, and, as a consequence, their potential selection is affected [42]. In practice, the local training models for imaging classification tasks can be deployed across the medical institutions and devices participating in the FL process. The computational requirements primarily depend on the complexity of the selected neural network models and the volume of data processed by each client. Typically, clients require moderate computational capabilities, including multi-core CPUs and GPUs, to handle local model training. The DPS-GAT model can be trained offline, with periodic retraining when new complex graph structures or considerable scalability differences arise.
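For illustration, the following sketch shows a generic Gaussian-mechanism perturbation of a local update, using the classical noise calibration $\sigma = C\sqrt{2\ln(1.25/\delta)}/\epsilon$, which yields $(\epsilon, \delta)$-DP for $\epsilon < 1$. The paper's per-client selection of $\sigma_i$ follows [41], so this is only a representative stand-in, not the exact mechanism; the function name and clipping interface are assumptions.

```python
import torch

def gaussian_perturb(update: torch.Tensor, clip_c: float,
                     eps: float, delta: float) -> torch.Tensor:
    """Generic Gaussian-mechanism sketch for LDP: clip the local update to an
    L2 sensitivity bound clip_c, then add calibrated Gaussian noise."""
    norm = update.norm(2)
    update = update * min(1.0, clip_c / (norm.item() + 1e-12))  # enforce sensitivity
    sigma = clip_c * (2.0 * torch.log(torch.tensor(1.25 / delta))).sqrt() / eps
    return update + torch.randn_like(update) * sigma
```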

4. Graph Neural Networks for Differentially Private Client Selection and Resource Allocation

For the purposes of our study, we propose a multi-head graph attention network (GAT) [25]. The attention mechanism can help each client learn the features of its neighboring clients and determine their importance score for selection. The DPS-GAT algorithm consists of an encoder, a selection features network, and a decoder (see Figure 2). In order to capture the intrinsic properties of D2D networks, the encoder encodes client features, including the maximum transmit power, the maximum and minimum CPU frequencies, the differential privacy budget, and the edge features that represent channel gains. The neighboring selection network determines the set of neighboring clients for model aggregation. The decoder determines the final allocation decisions of the selected clients. Let
$$x_i^{f,r} = \left( p_i^{max}, \, f_i^{min}, \, f_i^{max}, \, \epsilon_i^{min} \right) \tag{13}$$

denote the feature vector of client i. Similarly, the edge feature, denoted by $x^{e,r}$, represents the absolute value of the channel gain of link $(i, j)$.

4.1. Encoder

We consider both node and edge features. Node features represent the communication and computation capabilities of devices, while edge features characterize the channel gain affecting the communication delay. The client encoder utilizes these two feature types in order to capture the intrinsic properties of the network topology. We employ z-score normalization to stabilize the gradients of the GNN [48]. Since each client aggregates features of different magnitudes from its neighbors, local normalization is applied to ensure that the mean and standard deviation of the features are zero and one, respectively [38]. In this work, we deploy a long short-term memory (LSTM) network as the encoder for the client features. The encoded features of client i at training round r, denoted by $r^r(i)$, are defined as

$$r^r(i) = \mathrm{LSTM}\left( x_i^{f,r} \right). \tag{14}$$
The features of all clients are then gathered into a set, denoted by $R^r$, namely,

$$R^r = \left\{ r^r(1), \ldots, r^r(N) \right\}. \tag{15}$$

Furthermore, we denote by $S^r$ the selection decisions of the clients, i.e.,

$$S^r = \left\{ s^r(1), s^r(2), \ldots, s^r(N) \right\}, \tag{16}$$

where $s^r(i)$ represents the selection feature of client i at training round r. Our GAT-based algorithm models the features of all clients simultaneously, in the sense that

$$S^r = \mathrm{GAT}\left( R^r, E^r \right), \tag{17}$$

where $E^r$ represents the set containing the edge features at training round r.
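A minimal sketch of the encoder of Equations (13)-(15): each client's 4-dimensional feature vector is passed through an LSTM as a length-one sequence. The hidden size and the class interface are hypothetical choices.

```python
import torch
import torch.nn as nn

class ClientEncoder(nn.Module):
    """Sketch of the LSTM encoder of Equation (14): maps each client's
    feature vector x_i^{f,r} = (p_max, f_min, f_max, eps_min) to an
    embedding r^r(i)."""
    def __init__(self, in_dim: int = 4, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (N, 4)
        out, _ = self.lstm(x.unsqueeze(1))                 # (N, 1, hidden)
        return out.squeeze(1)                              # R^r: (N, hidden)

# Example: encode z-score-normalized features of N = 50 clients
R_r = ClientEncoder()(torch.randn(50, 4))
```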

4.2. Selection Features Network

We proceed with the neighbor selection process by considering the client features $R^r$, defined in Equation (15). The selection representation is produced by a GNN that handles all nodes, edges, and continuous edge features. Several GAT layers are employed, each updating the node features, denoted by $h_i$, $i = 1, 2, \ldots, N$, by aggregating information from neighbors. A layer first transforms the node and edge features accordingly, and then aggregates neighboring node features with a multi-head attention mechanism. This combination allows GAT layers to capture dependencies between nodes, providing updated node features based on the information received from each node's neighbors. A concatenated feature vector $e_{ij}^r$, defined by

$$e_{ij}^r = [e_{ij}] \, \Vert \, [h_i] \, \Vert \, [h_j], \tag{18}$$

where $\Vert$ denotes concatenation, represents the edge feature of node j from node i's point of view, which is then forwarded to a shared attention mechanism. The attention mechanism, $a^T$, is a single-layer feed-forward neural network combining a LeakyReLU nonlinearity with softmax normalization; the sigmoid serves as the output nonlinearity. Its attention coefficients, $a_{ij}$, indicate the importance of node j to node i. Subsequently, we compute the node attention scores, in order to generate the node embeddings, namely,

$$a_{ij} = \frac{\exp \left( \mathrm{LeakyReLU} \left( a^T \left[ h_i \, \Vert \, e_{ij}^r \right] \right) \right)}{\sum_{k \in \hat{N}_i} \exp \left( \mathrm{LeakyReLU} \left( a^T \left[ h_i \, \Vert \, e_{ik}^r \right] \right) \right)}. \tag{19}$$
We implement independent attention mechanisms per node. Specifically, the multi-head attention mechanism comprises K independent attention mechanisms whose features are concatenated, resulting in the following output feature representation:

$$h_i^{new} = \sigma \left( \bigg\Vert_{k=1}^{K} \, \sum_{j \in \hat{N}_i} a_{ij}^k \, W_h^k \, e_{ij}^r \right), \tag{20}$$

with $\sigma(\cdot)$ denoting the sigmoid function, $W_h^k$ the weight matrix that computes the weighted sum of node features over the neighborhood in the k-th head, and K the number of attention heads. When performing training in DFL, the full participation scheme may incur high communication delays. Thus, we propose a neighbor selection module that determines which neighbors are selected for each client. The inputs to the module consist of the concatenated embedding of a client and the embeddings of its neighbors; the concatenated embedding of client i and its neighbor j is $h_{i,j}^{new,r} = [h_i^{new,r}] \, \Vert \, [h_j^{new,r}]$. We apply a decoder for the neighbor selection decisions and the resource allocation matrix. For each neighboring client $j \in \hat{N}_i^r$, client i determines the selection decision $s_{i,j}$ based on a predefined threshold $\gamma \in (0, 1)$. Since bi-directional model exchanges are required for model aggregation in DFL, we require

$$e_{i,j}^r \ge \gamma \quad \text{and} \quad e_{j,i}^r \ge \gamma. \tag{21}$$
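The following single-head sketch illustrates Equations (18)-(21) in plain PyTorch: edge and node features are concatenated, scored by a shared attention vector through a LeakyReLU/softmax pipeline, and aggregated over neighbors; a separate helper applies the bidirectional threshold $\gamma$ to a hypothetical pairwise score matrix. Dimensions, names, and the single-head simplification are our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeGATLayer(nn.Module):
    """Single-head sketch of Equations (18)-(20); a multi-head version would
    concatenate K such heads."""
    def __init__(self, node_dim: int, edge_dim: int, out_dim: int):
        super().__init__()
        self.W = nn.Linear(edge_dim + 2 * node_dim, out_dim, bias=False)
        self.a = nn.Linear(node_dim + out_dim, 1, bias=False)  # shared a^T

    def forward(self, h, e, adj):
        # e_ij^r = [e_ij] || [h_i] || [h_j]   (Equation (18))
        N = h.size(0)
        h_i = h.unsqueeze(1).expand(N, N, -1)
        h_j = h.unsqueeze(0).expand(N, N, -1)
        msg = self.W(torch.cat([e, h_i, h_j], dim=-1))        # W_h e_ij^r
        logits = F.leaky_relu(self.a(torch.cat([h_i, msg], dim=-1))).squeeze(-1)
        logits = logits.masked_fill(adj == 0, float("-inf"))  # neighbors only
        att = torch.softmax(logits, dim=-1).nan_to_num()      # a_ij (Equation (19))
        return torch.sigmoid((att.unsqueeze(-1) * msg).sum(dim=1))  # Equation (20)

def select(scores: torch.Tensor, gamma: float) -> torch.Tensor:
    """Bidirectional selection of Equation (21): keep (i, j) only if the
    pairwise scores pass gamma in both directions. `scores` is a hypothetical
    (N, N) matrix, e.g., derived from the concatenated embeddings h_{i,j}."""
    return ((scores >= gamma) & (scores.T >= gamma)).int()
```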

4.3. Decoder

We deploy an LSTM decoder as
$$f^r(i) = \mathrm{LSTM}\left( [r^r(i)] \, \Vert \, [s^r(i)] \right), \tag{22}$$

with $[r^r(i)] \, \Vert \, [s^r(i)]$ representing the concatenation of the corresponding features and $f^r$ denoting the selection features of the nodes. The outcome includes only the features of the selected nodes for resource allocation. We further employ a full GNN block containing a global block, a node block, and an edge block. Adopting a set-to-set mapping approach similar to the one in [49], we define the resource allocation GNN, denoted by RAG, as

$$\mathrm{RAG} = \left( Q_{enc}^u, \, Q_{enc}^e, \, Q_{dec}^u \right), \tag{23}$$

where we adopt the following notation: $Q_{enc}^u : \mathbb{R}^2 \to \mathbb{R}^{n_u}$ encodes the node features (transmit power, CPU frequencies, privacy constraints), $Q_{enc}^e : \mathbb{R}^2 \to \mathbb{R}^{n_e}$ encodes the channel gains between the clients, $Q_{dec}^u : \mathbb{R}^{n_u} \to \mathbb{R}$ decodes the final state for the assignment of resources, and $(Q^e, Q^u, Q^b) : \mathbb{R}^{n_u + n_e + n_b} \to \mathbb{R}^{n_e}, \mathbb{R}^{n_u},$ or $\mathbb{R}^{n_b}$ represent the updates of the graph's edges, nodes, and global values, respectively. Furthermore, $n_u$, $n_e$, and $n_b$ denote the hyperparameters representing the sizes of the node encodings, the edge encodings, and the final resource assignments, respectively.
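A pared-down sketch of the RAG block of Equation (23), keeping only the node-update path $Q^u$ (the edge and global updates $Q^e$, $Q^b$ are omitted for brevity); all sizes are hypothetical hyperparameter values.

```python
import torch
import torch.nn as nn

class RAG(nn.Module):
    """Sketch of Equation (23): node/edge encoders, one node-update step
    aggregating incident-edge encodings plus a global state, and the
    decoder Q_dec^u emitting one allocation value per client."""
    def __init__(self, n_u: int = 16, n_e: int = 16, n_b: int = 8):
        super().__init__()
        self.enc_u = nn.Linear(2, n_u)                 # Q_enc^u
        self.enc_e = nn.Linear(2, n_e)                 # Q_enc^e
        self.upd_u = nn.Linear(n_u + n_e + n_b, n_u)   # Q^u
        self.dec_u = nn.Linear(n_u, 1)                 # Q_dec^u
        self.n_b = n_b

    def forward(self, x_u, x_e):                       # x_u: (N, 2), x_e: (N, N, 2)
        u, e = self.enc_u(x_u), self.enc_e(x_e)
        agg_e = e.mean(dim=1)                          # aggregate incident edges
        b = torch.zeros(u.size(0), self.n_b)           # global state, broadcast
        u = torch.relu(self.upd_u(torch.cat([u, agg_e, b], dim=-1)))
        return torch.sigmoid(self.dec_u(u))            # allocation per client in [0, 1]
```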
To ensure that the proposed algorithm can adapt to different network settings, we train our GNN offline in an unsupervised manner. We generate a set of D2D network scenarios with topology changes and time-varying channel conditions. Owing to the small size of the neighbors' node features, compressed model parameters, and channel conditions, each client collects these features in each scenario with negligible communication delay; it then makes its selection decision, sends it to its neighbors, and the loss of each client can be calculated.
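The offline, unsupervised training described above could look roughly as follows, reusing the RAG sketch from the previous block; the scenario sampling and the differentiable delay surrogate in the loss are placeholders for the paper's actual objective.

```python
import torch

model = RAG()                          # from the sketch above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):
    x_u = torch.rand(50, 2)            # random client features per scenario
    x_e = torch.rand(50, 50, 2)        # random time-varying channel conditions
    alloc = model(x_u, x_e)
    # placeholder surrogate: larger allocations -> smaller delay proxy; the
    # real objective would be a differentiable surrogate of d(r)
    loss = (1.0 / (alloc + 1e-3)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```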

5. Results

For the evaluation of our model, we employed the Regensburg pediatric appendicitis dataset [50], the Breast Ultrasound Image dataset [51], and the Maternal-Fetal US dataset [52]. These datasets were used for image classification at the clients. We considered four distinct scenarios with different differential privacy budgets $\epsilon_i$. We ensured that each client's local dataset contained an equal number of data samples [53]. The clients were randomly positioned in an area following the uniform distribution. We assumed that each client, i.e., a certain hospital or medical institution, was stationary. In our approach, we did not consider the distances between the clients, as the underlying transmit power, bandwidth allocation, and DP budgets completely defined the client selection process and the exchange of model weights. In this regard, we considered the transmit power of each device to follow a uniform distribution over $[8, 15]$ mW. The frequency of each client device was set to $f_i^{min} \in [0.1, 0.2]$ GHz and $f_i^{max} \in [1.5, 2.5]$ GHz. To evaluate the trade-offs between training loss and privacy strength, the privacy budget values $\epsilon_i$ investigated were
  • the same for every client, with $\epsilon_i \in \{0.1, 0.5, 0.9\}$;
  • different among the client population, with $\epsilon_i$ values in the interval $(0.01, 1)$, following a uniform distribution.
We initialized the transmit power and CPU frequency of each client to the maximum value and then ran our trained GNN. Each client could independently make its own decisions and determine the delay by recording the wall-clock time. Our experiments were deployed on a Windows 11 Home HP machine with an AMD Ryzen 7 5825U, utilizing Python 3.8 and the PyTorch libraries. Indicatively, for testing the model performance we considered 50 clients, while for the total delay we considered scenarios with between 5 and 50 clients. Furthermore, we adopted a supervised learning approach by implementing the framework of [54] within our DFL setting. In order to test how bandwidth affected both the model performance and the client selection process, we omitted any distance constraints among the clients, and the overall bandwidth was set to 2000 Mbps.
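For reference, the experimental parameter ranges stated above can be sampled as follows; the random seed and variable names are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)            # arbitrary seed
N = 50                                     # clients used for model testing
p_max = rng.uniform(8e-3, 15e-3, N)        # transmit power in [8, 15] mW
f_min = rng.uniform(0.1e9, 0.2e9, N)       # f_i^min in [0.1, 0.2] GHz
f_max = rng.uniform(1.5e9, 2.5e9, N)       # f_i^max in [1.5, 2.5] GHz
eps_het = rng.uniform(0.01, 1.0, N)        # heterogeneous DP budgets
BANDWIDTH = 2000e6                         # overall bandwidth, 2000 Mbps (bit/s)
```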
In Figure 3a, we present the number of selected clients per communication round in the case of DP heterogeneity. On the other hand, in Figure 3b, we present the number of selected clients under the same ϵ values in the first communication round. To test and validate our proposed algorithm, we made performance comparisons (accuracy and loss computations) with three advanced differentially private client selection algorithms from the literature (see Figure 4 and Figure 5):
i. DP-FedAvg (differentially private federated averaging) [55] was selected due to its foundational role in FL with differential privacy, serving as a standard benchmark for comparisons.
ii. FL-PATE (private aggregation of teacher ensembles for federated learning) [56] offers an innovative approach by ensuring privacy through aggregating teacher model outputs. This method differs from DP-FedAvg and is particularly suited to scenarios requiring stringent privacy guarantees.
iii. D2P-Fed [57] was chosen for its recent advancements, integrating advanced mechanisms for privacy-preserving client selection and update aggregation.
We present our results on the Breast Ultrasound Image dataset [51] and the Maternal-Fetal US dataset [52], in terms of average performance per round, total delay, and number of selected clients per round, in Table 1, Table 2 and Table 3, respectively.
Furthermore, we computed the total communication and computation delay for several numbers of participating clients with different DP budget values (see Figure 6), as well as the given trade-offs between learning performance, differential privacy, and client selection, as depicted in Figure 7.

6. Discussion

In this study, we investigated the performance of our proposed DPS-GAT algorithm in the context of client selection for FL, considering resource constraints with differential privacy guarantees in medical applications. Our goal was to assess the efficacy of DPS-GAT in selecting an optimal number of clients per training round while ensuring privacy preservation without degrading training performance. The evaluation of different DP budgets (spanning from 0.1 to 0.9) reveals that DPS-GAT effectively adjusts to diverse privacy needs. Across different $\epsilon$ values, the number of selected clients remains relatively stable, especially after the 15th communication round. This stability is crucial, as it guarantees consistent model performance and convergence throughout extended training periods, even as privacy constraints become more stringent.
Across all three datasets and in benchmarking terms, DPS-GAT matched and outperformed the baseline algorithms in training performance relative to the number of selected clients per communication round. DPS-GAT showed a stable client selection range as the $\epsilon$ value increased, maintaining a higher client selection rate than DP-FedAvg, FL-PATE, and D2P-Fed, while also maintaining a balanced bandwidth allocation ratio per round, as depicted in Figure 8. DPS-GAT demonstrated high training accuracy compared to the other differentially private algorithms. Specifically, in the case of the Regensburg pediatric appendicitis dataset [50], it consistently achieved accuracy of over 0.7 in every DP budget scenario, reaching approximately 0.82 in cases of DP budget heterogeneity among clients. This trend persisted throughout all the communication rounds of our experiments. Furthermore, the accuracy trade-off between the number of selected clients and the DP budget indicated a specific interval of reference within which high training accuracy can be maintained without compromising the average DP budget. This balance also ensured that resource consumption remained low, with an optimal number of selected clients (N = 27 clients with an average $\epsilon = 0.6$). In terms of the total communication and computation delay for several numbers of participating clients, DPS-GAT was the second best among its three counterpart algorithms in the $\epsilon$-heterogeneity case.
Compared to DP-FedAvg, FL-PATE, and D2P-Fed, which showed moderate fluctuations and a slightly lower client selection rate, DPS-GAT demonstrated superior stability and higher average client selection. This highlights the efficiency of our algorithm in managing privacy-preserving constraints while maximizing client participation. The ability of DPS-GAT to maintain a high and stable number of client selections across various rounds and DP budgets has significant practical implications. It indicates that FL systems can achieve strong privacy guarantees without compromising client engagement and model performance. This balance is essential for real-world applications where both privacy and performance are paramount. However, in practice, there are cases where the network experiences certain perturbations. In these cases, where loss of communication with a client occurs, our DPS-GAT algorithm excludes the client from the training round. If communication is interrupted during training, the updates from the affected client are excluded, and training proceeds with the remaining clients, maintaining model robustness. In this regard, small-world network topologies are ideal, due to their short path lengths and high clustering, enhancing resilience and communication efficiency [58]. Scale-free networks ensure robustness, with highly connected hub nodes that maintain connectivity despite node failures [59]. These topologies can collectively ensure efficient, robust, and scalable FL, making them ideal for our system.
The scalability of our DPS-GAT method is vital for DFL in medical applications. Utilizing GNNs for client selection and resource allocation ensures efficient handling of large, complex networks. Our differential privacy mechanisms scale with the number of clients, maintaining privacy and performance. The multi-head attention mechanism in GNNs allows parallel processing, enhancing scalability. Our experimental results confirm that DPS-GAT maintains high accuracy and low latency as the network size grows, making it ideal for large-scale healthcare deployments. Furthermore, our proposed DPS-GAT algorithm demonstrated a robust ability to manage the inherent trade-offs in FL with differential privacy. The results underscore its superiority over existing baseline algorithms, making it a promising solution for privacy-preserving FL-based medical applications.
Our research demonstrates that the DPS-GAT algorithm significantly improves the robustness and efficiency of DFL in medical applications by ensuring high model accuracy and strong privacy preservation. This advancement has the potential to enhance patient care through more accurate predictive models while maintaining stringent privacy standards, paving the way for broader adoption of FL in sensitive domains.

7. Conclusions

In this work, we present a novel and effective solution to the challenges of DFL for healthcare applications. By introducing a GNN-based approach, our study simultaneously addresses efficient client selection and resource allocation with DP guarantees. Our results highlight the importance of incorporating privacy-preserving techniques in FL for healthcare. Furthermore, GNN methods demonstrate a highly promising direction for considering privacy requirements in modern DFL environments without compromising the underlying training performance. Future work could explore the scalability of DPS-GAT in more diverse and larger FL environments, particularly within medical settings where data diversity and volume are substantial.

Author Contributions

Conceptualization, S.C.M., N.E.P. and N.D.; methodology, S.C.M. and N.E.P.; software and validation, S.C.M. and N.E.P.; writing—original draft preparation, S.C.M. and N.E.P.; writing—review and editing, S.C.M., N.E.P. and N.D.; visualization, S.C.M.; supervision, N.E.P. and N.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101094901, the SEPTON project (Security Protection Tools for Networked Medical Devices).

Data Availability Statement

The data used for the experiments of this study are available from the Regensburg pediatric appendicitis dataset [50], the Breast Ultrasound Image dataset [51], and the Maternal-Fetal US dataset [52], which are open datasets.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and open problems in federated learning. Found. Trends Mach. Learn. 2021, 14, 1–210.
  2. Soltani, B.; Zhou, Y.; Haghighi, V.; Lui, J.C.S. A Survey of Federated Evaluation in Federated Learning. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, Macao, 19–25 August 2023; pp. 6769–6777.
  3. Wen, J.; Zhang, Z.; Lan, Y.; Cui, Z.; Cai, J.; Zhang, W. A survey on federated learning: Challenges and applications. Int. J. Mach. Learn. Cybern. 2023, 14, 513–535.
  4. Nguyen, D.C.; Pham, Q.V.; Pathirana, P.N.; Ding, M.; Seneviratne, A.; Lin, Z.; Dobre, O.; Hwang, W.J. Federated learning for smart healthcare: A survey. ACM Comput. Surv. (CSUR) 2022, 55, 60.
  5. Rieke, N.; Hancox, J.; Li, W.; Milletari, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.N.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. NPJ Digit. Med. 2020, 3, 119.
  6. Sachin, D.; Annappa, B.; Hegde, S.; Abhijit, C.S.; Ambesange, S. FedCure: A Heterogeneity-Aware Personalized Federated Learning Framework for Intelligent Healthcare Applications in IoMT Environments. IEEE Access 2024, 12, 15867–15883.
  7. Dayan, I.; Roth, H.R.; Zhong, A.; Harouni, A.; Gentili, A.; Abidin, A.Z.; Liu, A.; Costa, A.B.; Wood, B.J.; Tsai, C.S.; et al. Federated learning for predicting clinical outcomes in patients with COVID-19. Nat. Med. 2021, 27, 1735–1743.
  8. Cao, X.; Başar, T.; Diggavi, S.; Eldar, Y.C.; Letaief, K.B.; Poor, H.V.; Zhang, J. Communication-Efficient Distributed Learning: An Overview. IEEE J. Sel. Areas Commun. 2023, 41, 851–873.
  9. Mao, W.; Lu, X.; Jiang, Y.; Zheng, H. Joint Client Selection and Bandwidth Allocation of Wireless Federated Learning by Deep Reinforcement Learning. IEEE Trans. Serv. Comput. 2024, 17, 336–348.
  10. Ouadrhiri, A.E.; Abdelhadi, A. Differential Privacy for Deep and Federated Learning: A Survey. IEEE Access 2022, 10, 22359–22380.
  11. Qi, P.; Chiaro, D.; Guzzo, A.; Ianni, M.; Fortino, G.; Piccialli, F. Model aggregation techniques in federated learning: A comprehensive survey. Future Gener. Comput. Syst. 2024, 150, 272–293.
  12. Kalra, S.; Wen, J.; Cresswell, J.C.; Volkovs, M.; Tizhoosh, H.R. Decentralized federated learning through proxy model sharing. Nat. Commun. 2023, 14, 2899.
  13. Fu, L.; Zhang, H.; Gao, G.; Zhang, M.; Liu, X. Client selection in federated learning: Principles, challenges, and opportunities. IEEE Internet Things J. 2023, 10, 21811–21819.
  14. Lyu, L.; Yu, H.; Ma, X.; Chen, C.; Sun, L.; Zhao, J.; Yang, Q.; Yu, P.S. Privacy and Robustness in Federated Learning: Attacks and Defenses. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 8726–8746.
  15. Hu, R.; Guo, Y.; Gong, Y. Federated learning with sparsified model perturbation: Improving accuracy under client-level differential privacy. IEEE Trans. Mob. Comput. 2023, 23, 8242–8255.
  16. Baek, C.; Kim, S.; Nam, D.; Park, J. Enhancing Differential Privacy for Federated Learning at Scale. IEEE Access 2021, 9, 148090–148103.
  17. Dyda, A.; Purcell, M.; Curtis, S.; Field, E.; Pillai, P.; Ricardo, K.; Weng, H.; Moore, J.C.; Hewett, M.; Williams, G.; et al. Differential privacy for public health data: An innovative tool to optimize information sharing while protecting data confidentiality. Patterns 2021, 2, 100366.
  18. Mothukuri, V.; Parizi, R.M.; Pouriyeh, S.; Huang, Y.; Dehghantanha, A.; Srivastava, G. A survey on security and privacy of federated learning. Future Gener. Comput. Syst. 2021, 115, 619–640.
  19. Williamson, S.M.; Prybutok, V. Balancing Privacy and Progress: A Review of Privacy Challenges, Systemic Oversight, and Patient Perceptions in AI-Driven Healthcare. Appl. Sci. 2024, 14, 675.
  20. Liu, R.; Xing, P.; Deng, Z.; Li, A.; Guan, C.; Yu, H. Federated Graph Neural Networks: Overview, Techniques, and Challenges. IEEE Trans. Neural Netw. Learn. Syst. 2024; early access.
  21. Yang, C.; Lin, Y.; Liu, Z.; Sun, M. Graph Representation Learning. In Representation Learning for Natural Language Processing; Springer Nature: Singapore, 2023; pp. 169–210.
  22. Zhang, H.F.; Zhang, F.; Wang, H.; Ma, C.; Zhu, P.C. A novel privacy-preserving graph convolutional network via secure matrix multiplication. Inf. Sci. 2024, 657, 119897.
  23. Zhang, L.; Liu, P.; Gulla, J.A. Recommending on graphs: A comprehensive review from a data perspective. User Model. User-Adapt. Interact. 2023, 33, 803–888.
  24. Sajadmanesh, S.; Gatica-Perez, D. ProGAP: Progressive graph neural networks with differential privacy guarantees. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining, Merida, Mexico, 4–8 March 2024; pp. 596–605.
  25. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
  26. Messinis, S.C.; Protonotarios, N.E.; Arapidis, E.; Doulamis, N. Client Selection and Resource Allocation via Graph Neural Networks for Efficient Federated Learning in Healthcare Environments. In Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments, Crete, Greece, 26–28 June 2024; pp. 606–612.
  27. Abdelmoniem, A.M.; Sahu, A.N.; Canini, M.; Fahmy, S.A. REFL: Resource-Efficient Federated Learning. In Proceedings of the Eighteenth European Conference on Computer Systems, EuroSys '23, Rome, Italy, 8–12 May 2023; pp. 215–232.
  28. Nikolaidis, F.; Symeonides, M.; Trihinas, D. Towards Efficient Resource Allocation for Federated Learning in Virtualized Managed Environments. Future Internet 2023, 15, 261.
  29. Lin, Z.; Zhu, G.; Deng, Y.; Chen, X.; Gao, Y.; Huang, K.; Fang, Y. Efficient parallel split learning over resource-constrained wireless edge networks. IEEE Trans. Mob. Comput. 2024; early access.
  30. Chen, M.; Yang, Z.; Saad, W.; Yin, C.; Poor, H.V.; Cui, S. A Joint Learning and Communications Framework for Federated Learning Over Wireless Networks. IEEE Trans. Wirel. Commun. 2021, 20, 269–283.
  31. Chen, Z.; Li, J.; Shen, C. Personalized Federated Learning with Attention-based Client Selection. arXiv 2023, arXiv:2312.15148.
  32. Panigrahi, M.; Bharti, S.; Sharma, A. FedDCS: A distributed client selection framework for cross device federated learning. Future Gener. Comput. Syst. 2023, 144, 24–36.
  33. Chen, C.; Chiang, Y.H.; Lin, H.; Lui, J.C.; Ji, Y. Joint Client Selection and Receive Beamforming for Over-the-Air Federated Learning With Energy Harvesting. IEEE Open J. Commun. Soc. 2023, 4, 1127–1140.
  34. Yan, G.; Wang, H.; Yuan, X.; Li, J. CriticalFL: A critical learning periods augmented client selection framework for efficient federated learning. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Long Beach, CA, USA, 6–10 August 2023; pp. 2898–2907.
  35. Gong, S.; Xing, K.; Cichocki, A.; Li, J. Deep Learning in EEG: Advance of the Last Ten-Year Critical Period. IEEE Trans. Cogn. Dev. Syst. 2022, 14, 348–365.
  36. Gu, Y.; She, C.; Quan, Z.; Qiu, C.; Xu, X. Graph neural networks for distributed power allocation in wireless networks: Aggregation over-the-air. IEEE Trans. Wirel. Commun. 2023, 22, 7551–7564.
  37. Hao, X.; Yeoh, P.L.; Liu, Y.; She, C.; Vucetic, B.; Li, Y. Graph Neural Network-Based Bandwidth Allocation for Secure Wireless Communications. In Proceedings of the 2023 IEEE International Conference on Communications Workshops (ICC Workshops), Rome, Italy, 28 May–1 June 2023; pp. 332–337.
  38. Meng, C.; Tang, M.; Setayesh, M.; Wong, V.W. GNN-Based Neighbor Selection and Resource Allocation for Decentralized Federated Learning. In Proceedings of the GLOBECOM 2023—2023 IEEE Global Communications Conference, Kuala Lumpur, Malaysia, 4–8 December 2023; pp. 1223–1228.
  39. Li, X.; Chen, M.; Liu, Y.; Zhang, Z.; Liu, D.; Mao, S. Graph Neural Networks for Joint Communication and Sensing Optimization in Vehicular Networks. IEEE J. Sel. Areas Commun. 2023, 41, 3893–3907.
  40. Chang, T.; Li, L.; Wu, M.; Yu, W.; Wang, X.; Xu, C. GraphCS: Graph-based client selection for heterogeneity in federated learning. J. Parallel Distrib. Comput. 2023, 177, 131–143.
  41. Wei, K.; Li, J.; Ding, M.; Ma, C.; Su, H.; Zhang, B.; Poor, H. User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization. IEEE Trans. Mob. Comput. 2022, 21, 3388–3401.
  42. Wei, K.; Li, J.; Ma, C.; Ding, M.; Chen, C.; Jin, S.; Han, Z.; Poor, H.V. Low-Latency Federated Learning Over Wireless Channels With Differential Privacy. IEEE J. Sel. Areas Commun. 2022, 40, 290–307.
  43. Eqbal Alam, S.; Shukla, D.; Rao, S. Near-optimal Differentially Private Client Selection in Federated Settings. arXiv 2023, arXiv:2310.09370.
  44. Yuan, J.; Wang, S.; Wang, S.; Li, Y.; Ma, X.; Zhou, A.; Xu, M. Privacy as a Resource in Differentially Private Federated Learning. In Proceedings of the IEEE INFOCOM 2023—IEEE Conference on Computer Communications, New York City, NY, USA, 17–20 May 2023; pp. 1–10.
  45. Zhang, X.; Kang, Y.; Chen, K.; Fan, L.; Yang, Q. Trading Off Privacy, Utility, and Efficiency in Federated Learning. ACM Trans. Intell. Syst. Technol. 2023, 14, 98.
  46. Wang, S.; Huang, L.; Nie, Y.; Zhang, X.; Wang, P.; Xu, H.; Yang, W. Local Differential Private Data Aggregation for Discrete Distribution Estimation. IEEE Trans. Parallel Distrib. Syst. 2019, 30, 2046–2059.
  47. Huang, Z.; Hu, R.; Guo, Y.; Chan-Tin, E.; Gong, Y. DP-ADMM: ADMM-Based Distributed Learning With Differential Privacy. IEEE Trans. Inf. Forensics Secur. 2020, 15, 1002–1012.
  48. Fei, N.; Gao, Y.; Lu, Z.; Xiang, T. Z-score normalization, hubness, and few-shot learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 142–151.
  49. Cranmer, M.; Melchior, P.; Nord, B. Unsupervised resource allocation with graph neural networks. In Proceedings of the NeurIPS 2020 Workshop on Pre-Registration in Machine Learning, Virtual, 11 December 2021; pp. 272–284.
  50. Marcinkevičs, R.; Reis Wolfertstetter, P.; Klimiene, U.; Ozkan, E.; Chin-Cheong, K.; Paschke, A.; Zerres, J.; Denzinger, M.; Niederberger, D.; Wellmann, S.; et al. Regensburg Pediatric Appendicitis Dataset. 2023. Available online: https://zenodo.org/records/7669442 (accessed on 1 July 2024).
  51. Deb, S.D.; Jha, R.K. Breast UltraSound Image classification using fuzzy-rank-based ensemble network. Biomed. Signal Process. Control 2023, 85, 104871.
  52. Burgos-Artizzu, X.P.; Coronado-Gutiérrez, D.; Valenzuela-Alcaraz, B.; Bonet-Carne, E.; Eixarch, E.; Crispi, F.; Gratacós, E. Evaluation of deep convolutional neural networks for automatic classification of common maternal fetal ultrasound planes. Sci. Rep. 2020, 10, 10200.
  53. Marcinkevičs, R.; Reis Wolfertstetter, P.; Klimiene, U.; Chin-Cheong, K.; Paschke, A.; Zerres, J.; Denzinger, M.; Niederberger, D.; Wellmann, S.; Ozkan, E.; et al. Interpretable and intervenable ultrasonography-based machine learning models for pediatric appendicitis. Med. Image Anal. 2024, 91, 103042.
  54. Guijas, P. P2PFL: Peer-to-Peer Federated Learning. 2024. Available online: https://github.com/pguijas/p2pfl (accessed on 1 July 2024).
  55. Cheng, A.; Wang, P.; Zhang, X.S.; Cheng, J. Differentially private federated learning with local regularization and sparsification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 10122–10131.
  56. Pan, Y.; Ni, J.; Su, Z. FL-PATE: Differentially Private Federated Learning with Knowledge Transfer. In Proceedings of the 2021 IEEE Global Communications Conference (GLOBECOM), Madrid, Spain, 7–11 December 2021; pp. 1–6.
  57. Wang, L.; Jia, R.; Song, D. D2P-Fed: Differentially private federated learning with efficient communication. arXiv 2020, arXiv:2006.13039.
  58. Amaral, L.A.N.; Scala, A.; Barthelemy, M.; Stanley, H.E. Classes of small-world networks. Proc. Natl. Acad. Sci. USA 2000, 97, 11149–11152.
  59. Barabási, A.L.; Bonabeau, E. Scale-free networks. Sci. Am. 2003, 288, 50–59.
Figure 1. Decentralized federated learning environment in healthcare.
Figure 2. DPS-GAT approach.
Figure 3. Number of selected clients. (a) Number of selected clients per communication round under DP heterogeneity. (b) Number of selected clients per DP budget in the first round.
Figure 4. Comparison of accuracy and loss for DP budget values $\epsilon = 0.1$ and $\epsilon = 0.5$.
Figure 5. Comparison of accuracy and loss for a DP value $\epsilon = 0.9$ and for DP budgets under heterogeneity.
Figure 6. Total delay.
Figure 7. Learning–differential privacy–selected clients' trade-offs.
Figure 8. Bandwidth allocation ratio (round 1, N = 41 clients).
Table 1. Performance of DPS-GAT for the datasets [51,52]. Values are the average accuracy (loss) per round over 20 rounds.

| Dataset | Algorithm | DP = 0.1 | DP = 0.5 | DP = 0.9 | DP Heterogeneity |
|---------|-----------|---------------|---------------|---------------|------------------|
| [51] | DPS-GAT | 0.821 (0.219) | 0.826 (0.217) | 0.828 (0.216) | 0.834 (0.210) |
| [51] | DP-FedAvg | 0.817 (0.231) | 0.822 (0.225) | 0.824 (0.224) | 0.829 (1.112) |
| [51] | FL-PATE | 0.812 (0.220) | 0.819 (0.223) | 0.821 (0.222) | 0.816 (0.245) |
| [51] | D2P-Fed | 0.805 (0.214) | 0.816 (0.210) | 0.818 (0.209) | 0.807 (0.187) |
| [52] | DPS-GAT | 0.810 (0.122) | 0.812 (0.111) | 0.814 (0.113) | 0.816 (0.116) |
| [52] | DP-FedAvg | 0.805 (0.129) | 0.807 (0.128) | 0.809 (0.127) | 0.802 (1.254) |
| [52] | FL-PATE | 0.800 (0.125) | 0.802 (0.126) | 0.804 (0.125) | 0.795 (0.238) |
| [52] | D2P-Fed | 0.795 (0.117) | 0.797 (0.113) | 0.799 (0.112) | 0.809 (0.139) |
Table 2. Total delay (in seconds) per algorithm for the datasets [51,52] versus the number of clients.

| Dataset | Algorithm | 10 Clients | 20 Clients | 30 Clients | 40 Clients | 50 Clients |
|---------|-----------|-------|-------|-------|-------|-------|
| [51] | DPS-GAT | 0.243 | 0.291 | 0.344 | 0.390 | 0.447 |
| [51] | DP-FedAvg | 0.752 | 0.807 | 0.855 | 0.864 | 0.872 |
| [51] | FL-PATE | 0.220 | 0.273 | 0.329 | 0.378 | 0.421 |
| [51] | D2P-Fed | 0.236 | 0.284 | 0.332 | 0.385 | 0.431 |
| [52] | DPS-GAT | 0.245 | 0.292 | 0.346 | 0.397 | 0.444 |
| [52] | DP-FedAvg | 0.759 | 0.809 | 0.851 | 0.862 | 0.876 |
| [52] | FL-PATE | 0.227 | 0.274 | 0.325 | 0.377 | 0.423 |
| [52] | D2P-Fed | 0.235 | 0.285 | 0.334 | 0.388 | 0.430 |
Table 3. Total number of selected clients per algorithm for the datasets [51,52], at the 5th, 10th, 15th, and 20th communication rounds.

| Dataset | Algorithm | 5th | 10th | 15th | 20th |
|---------|-----------|-----|------|------|------|
| [51] | DPS-GAT | 12 | 15 | 17 | 20 |
| [51] | DP-FedAvg | 38 | 40 | 43 | 43 |
| [51] | FL-PATE | 11 | 14 | 16 | 19 |
| [51] | D2P-Fed | 12 | 14 | 17 | 19 |
| [52] | DPS-GAT | 12 | 15 | 17 | 20 |
| [52] | DP-FedAvg | 38 | 40 | 43 | 43 |
| [52] | FL-PATE | 11 | 14 | 16 | 19 |
| [52] | D2P-Fed | 12 | 14 | 17 | 19 |