Article

Fed-RHLP: Enhancing Federated Learning with Random High-Local Performance Client Selection for Improved Convergence and Accuracy

by
Pramote Sittijuk
1 and
Kreangsak Tamee
1,2,*
1
Department of Computer Science and Information Technology, Faculty of Science, Naresuan University, Phitsanulok 65000, Thailand
2
Center of Excellence in Nonlinear Analysis and Optimization, Faculty of Science, Naresuan University, Phitsanulok 65000, Thailand
*
Author to whom correspondence should be addressed.
Symmetry 2024, 16(9), 1181; https://doi.org/10.3390/sym16091181
Submission received: 23 July 2024 / Revised: 24 August 2024 / Accepted: 27 August 2024 / Published: 9 September 2024

Abstract

We introduce the random high-local performance client selection strategy, termed Fed-RHLP. This approach gives higher-performance clients a greater opportunity to contribute by updating and sharing their local models for global aggregation, while still enabling lower-performance clients to participate in proportion to their local performance, expressed as selection probabilities on a roulette wheel (RW). Symmetry in federated learning depends on the data distribution: with IID data, symmetry is naturally present, making model updates easier to aggregate, whereas with Non-IID data, asymmetries can impair performance and fairness; common remedies include data balancing, adaptive algorithms, and robust aggregation methods. By allowing lower-performance clients to contribute in proportion to their local performance, Fed-RHLP fosters inclusivity and collaboration in both IID and Non-IID scenarios. Through experiments, we demonstrate that Fed-RHLP accelerates convergence and improves the accuracy of the aggregated final global model, effectively mitigating the challenges posed by both IID and Non-IID data distributions.

1. Introduction

The federated learning (FL) mechanism, which employs the concept of decentralized machine learning (ML), has seen significant development in artificial intelligence (AI). This approach distributes the responsibility of training local models on local datasets and shares the trained local models from clients to aggregate a global model on the server [1]. FL has gained widespread use, particularly in enhancing confidence and security in storing and processing participants’ private medical information [2]. By keeping sensitive data on local devices and sharing only aggregated and anonymized model updates, FL significantly reduces the risk of data breaches [3].
FL enables collaborative model training across multiple institutions or devices without centralizing data, thereby mitigating the risks associated with personal data leakage during data transmission between the server and clients. Furthermore, FL helps alleviate the strain on processing and storage resources on the central server, a common issue in centralized machine learning approaches. In a centralized model, network bottleneck problems often arise during training and mathematical calculations, especially when dealing with large-scale models that are active on the server simultaneously [4,5].
The client selection process in FL plays a critical role in choosing an efficient, smaller subset of clients for training and aggregating the global model. This reduces communication overhead, speeds up updates, and conserves computational resources [6]. It also ensures a balanced approach that optimizes the global model’s accuracy while maintaining crucial communication efficiency. Client selection is classified into two main methods: unbiased or random client selection and biased client selection.
Federated Averaging (FedAvg) uses a random client selection method for local updates and global model aggregation. It is one of the most popular FL algorithms because of its simplicity [7,8]. However, FedAvg suffers when an improper subset of clients with poor quality and statistically heterogeneous data is selected, resulting in an inefficiently trained model. Using the random client selection method in FedAvg can lead to client drift, where local models diverge due to data heterogeneity among clients. This divergence necessitates more training epochs to achieve effective convergence of the global model. FedAvg’s performance declines with increased data heterogeneity, resulting in lower test accuracy and slower convergence compared with other client selection strategies or methods that better address data heterogeneity [9,10]. This issue also leads to problems such as delayed convergence and unsatisfactory final global accuracy, especially with non-independent and identically distributed (Non-IID) data [11,12,13].
Biased client selection methods intentionally select clients based on specific criteria, such as the quantity and quality of the client’s dataset and the client’s loss values from local model training. These biased client selection approaches aim to address statistical heterogeneity. In [14], the Power-of-Choice algorithm is proposed, which selects an expected set of clients with higher local loss. This approach can potentially accelerate convergence but risks deviating from the global loss function’s optimal value, thereby impacting overall model accuracy. This issue is addressed in [15] with the FedChoice algorithm, which selects a set of random higher-loss clients. Additionally, [16] proposed the FedLA algorithm, which selects clients by determining biased weights based on both dataset size and the variety of labels per client. However, FedLA encounters challenges when clients have similar sample sizes and labels, leading to potential duplicate client selection and impacting the algorithm’s effectiveness.
To achieve an accurate global model while tuning the number of selected clients and servers in reduced communication rounds [17], effective client selection is a crucial step in FL [18]. It focuses on enhancing performance by selecting a set of expected clients, which can promote global model aggregation and overcome the limitations of random client selection and certain biased client selection approaches that choose specific sets of clients. Effective client selection also helps address client drift and ensures diverse data representation for effectively training local models. Additionally, selecting the most robust and high-quality clients to actively participate in the FL process increases the convergence rate [19] and helps achieve an effective global model for accurately predicting problem domains [20]. Conversely, selecting the weakest clients with lower performance and lacking diverse classes will negatively affect data aggregation, resulting in a poor final global model. Focusing solely on selecting high-performance clients for participation is unfair, as weaker clients might be consistently excluded. This approach could lead to negative consequences, such as data omission and reduced data diversity [21]. Moreover, these weakest clients would not be able to tune their local model parameters effectively for contributing to the global model.
These problems led to the development of the Fed-RHLP which uses random high-local performance client selection on a roulette wheel for local updates and global model aggregations. Fed-RHLP aims to strike a balance between prioritizing high-performance clients and providing opportunities for lower-performing ones to contribute. This approach addresses the problem of slow convergence in the FedAvg and the biased client selection algorithm. Fed-RHLP can also be applied to all machine learning algorithms operating on the FL mechanism, effectively predicting the problem domains in the real world.

2. Concept and Federated Averaging (FedAvg) Algorithm

2.1. FL Concept

FL is a novel approach for effectively aggregating different clients’ local statistical models into a global model that represents all clients when predicting problem domains. Instead of sending clients’ raw data to be collected and processed on a central server, FL sends only updated local models for secure processing.
FL operates in a distributed resources environment by selecting a small subset of clients for local updates using an optimization algorithm on their local dataset in each communication round. Then, updated local models from these clients are uploaded to the server for aggregation into the global model. This process is repeated by adjusting the local models’ parameters and sharing updated models until a sufficiently accurate final global model is achieved [22].
FL defines the local objective function of client k, F_k(w), as the average local loss, where f_i(w) is the loss of the prediction on sample i of the local dataset (d_k) made with the model’s parameters (w). The local objectives F_k(w) are aggregated into the global objective f(w), weighting each selected client by its fraction of the data, n_k/n, where n_k is the number of local samples of client k and n is the total number of samples across the selected clients, as in Equation (1):

$$ f(w) = \sum_{k=1}^{K} \frac{n_k}{n} F_k(w), \quad \text{where} \quad F_k(w) = \frac{1}{n_k} \sum_{i \in d_k} f_i(w) \tag{1} $$
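Equation (1) can be checked numerically with a minimal NumPy sketch; this is illustrative only, and the per-sample losses below are made-up toy values rather than outputs of a trained model:

```python
import numpy as np

def local_objective(losses):
    """F_k(w): average prediction loss over a client's local dataset."""
    return float(np.mean(losses))

def global_objective(client_losses):
    """f(w): local objectives weighted by n_k / n, as in Equation (1)."""
    sizes = np.array([len(l) for l in client_losses], dtype=float)
    n = sizes.sum()
    return float(sum((n_k / n) * local_objective(l)
                     for n_k, l in zip(sizes, client_losses)))

# Toy per-sample losses for three clients of unequal dataset size.
clients = [np.array([0.2, 0.4]), np.array([0.1, 0.1, 0.1]), np.array([0.5])]
f_w = global_objective(clients)   # (2/6)*0.3 + (3/6)*0.1 + (1/6)*0.5
```

Note that because each F_k(w) is itself an average over n_k samples, f(w) reduces to the plain average loss over all n samples pooled together.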

2.2. Federated Averaging (FedAvg) Algorithm

The FedAvg algorithm was developed by [23]. It randomly generates initial ML model parameters (W0) and uploads them to a randomly selected set of clients (St), whose size is m = N·C, where m is the size of the selected subset, N is the number of all clients, and C is the fraction of clients selected for local model updates. For example, with N = 100 clients and C = 0.1, m = 100 × 0.1 = 10 clients participate in each communication round (t = 1 to T). Each client in St locally updates the current global model (W) in the current communication round on its local dataset (d_k), which is split into batches of size B, using the stochastic gradient descent (SGD) technique. Over the local epochs (i = 1 to E), SGD adjusts the local model’s old weights (w_old) by subtracting the computed gradient values to obtain the new weights (w_new). The gradient of the local loss (ℓ) on a batch (b) with respect to the client’s local model (w) is multiplied by the learning rate (α) to adjust the local model’s parameters in each local epoch. Then, the updated local models (w_k) from St are sent to the server to be aggregated into the global model (W_{t+1}). The FedAvg concept is shown in Figure 1 and Algorithm 1.
Algorithm 1: FedAvg algorithm
1: ServerExecute:
2:   W0 = initial ML model parameters
3:   N = number of all clients
4:   C = fraction of selected clients
5:   For t = 1, 2, …, T do
6:     m = N·C
7:     St = {random set of m clients}
8:     For k ∈ St in parallel do
9:       w_{t+1}^k = ClientUpdate(k, Wt)
10:    End for
11:    W_{t+1} = Σ_{k∈St} (n_k/n_t)·w_{t+1}^k
12:  End for
13: ClientUpdate(k, w):
14:   B = (split d_k into batches of size B)
15:   For i = 1, 2, …, E do
16:     For b ∈ B do
17:       w ← w − α·∇ℓ(w; b)
18:     End for
19:   End for
20: Return w to the server
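The server-side aggregation step of Algorithm 1 (line 11) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation; the toy parameter vectors and sample counts are assumptions:

```python
import numpy as np

def fedavg_aggregate(local_models, sample_counts):
    """Weight each client's parameters by its share of the selected
    clients' data, n_k / n_t, and sum (Algorithm 1, line 11)."""
    counts = np.asarray(sample_counts, dtype=float)
    weights = counts / counts.sum()
    stacked = np.stack(local_models)          # shape: (num_clients, num_params)
    return (weights[:, None] * stacked).sum(axis=0)

# Two toy "models" (flattened parameter vectors) with 100 and 300 samples.
w_a = np.array([1.0, 0.0])
w_b = np.array([0.0, 1.0])
W_next = fedavg_aggregate([w_a, w_b], [100, 300])   # 0.25*w_a + 0.75*w_b
```

The client holding more data thus pulls the global model proportionally harder toward its own parameters.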

3. Proposed Federated Random High Local Performance Client Selection (Fed-RHLP)

3.1. Random High-Local Performance Client Selection on the Roulette Wheel

The proposed Fed-RHLP uses roulette wheel selection by incorporating biased values from measured local model performance to select a set of clients with high local performance for updating and aggregating the global model in each communication round. The roulette wheel selection probabilistically chooses clients from the set of all clients based on their fitness values [24]. The process involves computing fitness probabilities by normalizing individual fitness, creating a virtual roulette wheel whose segments correspond to each client’s fitness, and randomly selecting individuals by spinning the wheel. This approach balances favoring higher-fitness clients while still giving less fit clients a chance of being selected for local model updates and global model aggregation, thereby increasing the variety of clients’ data points used in subsequent rounds [25,26].
In Fed-RHLP, clients with higher local model performance receive a proportionally higher probability on the roulette wheel. This technique helps to select an effective set of clients and addresses the uncertain client selection problem [27,28]. To build the roulette wheel in the Fed-RHLP algorithm, each client’s local model performance (LP_k) is calculated by sending the current global model (W) to all clients (N), which evaluate it on their local datasets (d_k). LP_k is converted into the weight of local model performance (WLP_k) by dividing LP_k by the total of all clients’ local model performance (TotalLP), as in Equations (2)–(4). The WLP_k values of the N clients then define each client’s probability on the roulette wheel.
$$ LP_k = f(W; d_k) \tag{2} $$
$$ TotalLP = \sum_{k=1}^{N} LP_k \tag{3} $$
$$ WLP_k = \frac{LP_k}{TotalLP} \tag{4} $$
In each communication round, a set of m clients, where 0 < m ≤ N, is randomly selected on the roulette wheel according to their WLP_k for local model updates and global model aggregation. The strategy of using a roulette wheel on FedAvg is illustrated in Figure 2.
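A minimal sketch of this roulette wheel selection, using NumPy's Generator API; the client IDs and performance scores below are made-up values for illustration:

```python
import numpy as np

def roulette_wheel_select(client_ids, local_performance, m, rng=None):
    """Sample m distinct clients with probability proportional to their
    local model performance LP_k (Eqs. (2)-(4))."""
    rng = np.random.default_rng(rng)
    lp = np.asarray(local_performance, dtype=float)
    wlp = lp / lp.sum()                  # WLP_k = LP_k / TotalLP
    return rng.choice(client_ids, size=m, replace=False, p=wlp)

# Four clients; clients 0 and 3 have the highest measured performance,
# so they are most likely (but not guaranteed) to be drawn.
selected = roulette_wheel_select([0, 1, 2, 3], [0.9, 0.5, 0.2, 0.95], m=2, rng=0)
```

Sampling with `replace=False` matches the paper's requirement that the selected set St contain no repeated clients.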

3.2. Insight of Fed-RHLP Algorithm

The Fed-RHLP procedure improves on FedAvg through the roulette wheel strategy, implemented in the Roulettewheel_Selection function. Fed-RHLP measures the local model performance of all clients (N) and calculates the weight of local model performance (WLP_k). In each communication round, clients are randomly drawn according to their WLP_k, without repetition, into the set of selected clients (St), using the np.random.choice function from the NumPy library in Python. The clients in St then perform local updates, and their local model parameters are averaged into the global model, as shown in Algorithm 2.
Algorithm 2: Strategy of using a roulette wheel on Fed-RHLP
1: ServerExecute:
2:   W0 = initial ML model parameters
3:   N = number of all clients
4:   C = fraction of selected clients
5:   K = set of all clients
6:   For t = 1, 2, …, T do
7:     m = N·C
8:     St = Roulettewheel_Selection(K, Wt)
9:     For k ∈ St in parallel do
10:      w_{t+1}^k = ClientUpdate(k, Wt)
11:    End for
12:    W_{t+1} = Σ_{k∈St} (n_k/n_t)·w_{t+1}^k
13:  End for
14: Roulettewheel_Selection(K, W):
15:   For k ∈ K do
16:     LP_k = f(W; d_k)
17:     TotalLP += LP_k
18:   End for
19:   For k ∈ K do
20:     WLP_k = LP_k / TotalLP
21:   End for
22:   St = random(K, m, {WLP_k | k ∈ K}, replace = False)
23:   Return St to the server
24: ClientUpdate(k, w):
25:   B = (split d_k into batches of size B)
26:   For i = 1, 2, …, E do
27:     For b ∈ B do
28:       w ← w − α·∇ℓ(w; b)
29:     End for
30:   End for
31:   Return w to the server
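The ClientUpdate routine (Algorithm 2, lines 24–31) can be illustrated with mini-batch SGD on a toy least-squares loss. The loss function, synthetic data, and dimensions here are illustrative assumptions, not the paper's CNN setup; only the hyperparameters (α = 0.01, E = 5, B = 64) follow the experimental configuration in Section 4.1.2:

```python
import numpy as np

def client_update(w, data, lr=0.01, epochs=5, batch_size=64, seed=0):
    """ClientUpdate sketch: mini-batch SGD on the toy loss
    l(w; b) = mean((X_b @ w - y_b)^2)."""
    rng = np.random.default_rng(seed)
    X, y = data
    n = len(y)
    for _ in range(epochs):                       # i = 1, ..., E
        order = rng.permutation(n)                # reshuffle batches each epoch
        for start in range(0, n, batch_size):     # b in B
            b = order[start:start + batch_size]
            grad = 2.0 * X[b].T @ (X[b] @ w - y[b]) / len(b)
            w = w - lr * grad                     # w <- w - alpha * grad l(w; b)
    return w

# Toy local dataset whose exact optimum is w* = [2.0, -1.0].
rng = np.random.default_rng(1)
X = rng.normal(size=(256, 2))
y = X @ np.array([2.0, -1.0])
w_local = client_update(np.zeros(2), (X, y))      # moves toward w*
```

In the actual algorithm, the returned w_local would be uploaded to the server for the weighted aggregation on line 12.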

4. Experimental Setup and Results

4.1. Experimental Setup

The experimental setup comprises two steps: data preparation (Section 4.1.1) and the configuration of the FL algorithm, the CNN model parameters (our experiments use CNNs because they are well suited to image classification, and the selected test datasets consist of image data), and the hardware specifications used to execute the FL algorithms (Section 4.1.2), as follows.

4.1.1. Data Preparation

In this experiment, the performance of the proposed Fed-RHLP was tested by comparing it with that of FedAvg and Power-of-Choice on four public datasets: MNIST, Fashion-MNIST (FMNIST), QMNIST, and CIFAR-10 (CIFAR). The structures of these datasets are described as follows:
  • The MNIST dataset contains 70,000 handwritten images of the digits 0–9, split into 60,000 training samples and 10,000 test samples. Each example is a 28 × 28 grayscale image [29].
  • FMNIST is a dataset created by Zalando that consists of images of various fashion items. It includes a training set of 60,000 images and a test set of 10,000 images. Each image is a 28 × 28 grayscale image labeled with one of 10 classes [30].
  • QMNIST is a dataset derived from the NIST Special Database 19, split into 60,000 training samples and 10,000 test samples. It is designed to closely follow the preprocessing steps of the original MNIST dataset and is distributed under a BSD-style license [31].
  • CIFAR is a dataset of 60,000 tiny images collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton. It consists of a training set of 50,000 examples and a test set of 10,000 examples. Each example is a 32 × 32 color image in one of 10 classes [32].
These online datasets are shown in Figure 3.
These datasets were presented in both IID and Non-IID forms, with unequal numbers of samples distributed across 100 clients. Each client held a varied number of training samples, ranging from 10% to 30%. In the IID scenario, the 100 clients received independent and identically distributed random samples for training, ensuring that each client had a smooth and balanced distribution of samples across all classes. In contrast, in the Non-IID scenario, the training samples were not independent and identically distributed: each client received training samples from only certain classes during local training, so some clients could receive an incomplete set of classes for training their local models. The Non-IID problem presents a significant challenge in FL, as it can lead to the aggregation of an inaccurately trained global model. Moreover, Non-IID distributions often result in higher local training loss and slower convergence, and require more communication rounds to effectively aggregate the global model [33,34,35].
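The two partitioning schemes can be sketched as follows; the label counts, client count, and the `classes_per_client` value are illustrative assumptions, not the paper's exact partitioning procedure:

```python
import numpy as np

def partition(labels, num_clients, iid=True, classes_per_client=2, seed=0):
    """Split sample indices across clients. IID: uniform random shards.
    Non-IID: each client only draws samples from a few label classes."""
    rng = np.random.default_rng(seed)
    idx = np.arange(len(labels))
    if iid:
        rng.shuffle(idx)
        return np.array_split(idx, num_clients)
    parts = []
    classes = np.unique(labels)
    for _ in range(num_clients):
        own = rng.choice(classes, classes_per_client, replace=False)
        pool = idx[np.isin(labels, own)]
        parts.append(rng.choice(pool, size=len(labels) // num_clients,
                                replace=False))
    return parts

labels = np.repeat(np.arange(10), 100)      # toy labels: 10 classes x 100 samples
noniid = partition(labels, num_clients=5, iid=False)
```

Under the Non-IID branch, every client's shard covers at most `classes_per_client` classes, reproducing the "incomplete set of classes" situation described above.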

4.1.2. Setting FL Algorithm and CNN Model’s Parameters

In each communication round, a subset of m clients from all clients (N = 100) was selected based on a defined fraction of 10% (C = 0.1). Therefore, 10 clients (m = 100 × 0.1) were selected for local model updates in each communication round [36], using the Convolutional Neural Network (CNN) algorithm on the MNIST and CIFAR datasets. The CNN model structures used for the MNIST and CIFAR datasets are shown in Table 1.
The SGD optimizer was used to adjust the local models’ parameters in five local epochs (E) [37] by defining the suitable batch data (B) size of 64 [38,39]. The learning rate of SGD was set at 0.01 [40].
The clients and server communicated over t = 200 communication rounds for global model aggregation. The global model was evaluated five times, and the average accuracy from these evaluations was used to ensure stable reporting of its performance, which was measured by prediction accuracy on 10,000 global test samples for each of the MNIST and CIFAR datasets.
The hardware components’ specification was set for executing the FL algorithms, as shown in Table 2.

4.2. Experimental Results

To evaluate the accuracy of Fed-RHLP, we compare the global model’s accuracy with two baseline algorithms on MNIST, FMNIST, QMNIST, and CIFAR datasets under both IID and Non-IID scenarios. The results are shown in Figure 4a–h.
From Figure 4a–h, it can be seen that Fed-RHLP achieves faster convergence and a more accurate global model than FedAvg and Power-of-Choice on the MNIST, FMNIST, QMNIST, and CIFAR datasets under both IID and Non-IID scenarios.
As shown in Table 3, Table 4, Table 5 and Table 6, Fed-RHLP demonstrated the highest global model accuracy, the fastest convergence, and the shortest execution time on the MNIST, FMNIST, QMNIST, and CIFAR datasets under both IID and Non-IID cases, compared with FedAvg and Power-of-Choice. The results of the three algorithms are summarized as follows.
For the MNIST IID case, Fed-RHLP was 0.53% better than FedAvg (99.08% vs. 98.55%) and 0.55% better than Power-of-Choice (99.08% vs. 98.53%). To reach 90% accuracy, Fed-RHLP converged 77.78% faster than FedAvg ((9 − 2) × 100/9) and 75% faster than Power-of-Choice ((8 − 2) × 100/8). Additionally, Fed-RHLP reduced execution time by 52.81% relative to FedAvg ((597.6 − 282) × 100/597.6) and by 54.55% relative to Power-of-Choice ((620.4 − 282) × 100/620.4).
For the MNIST Non-IID case, Fed-RHLP was 3.35% better than FedAvg (98.71% vs. 95.36%) and 2.35% better than Power-of-Choice (98.71% vs. 96.36%). To reach 90% accuracy, Fed-RHLP converged 76.09% faster than FedAvg ((46 − 11) × 100/46) and 76.60% faster than Power-of-Choice ((47 − 11) × 100/47). Additionally, Fed-RHLP reduced execution time by 41.13% relative to FedAvg ((2365.2 − 1385.4) × 100/2365.2) and by 57.94% relative to Power-of-Choice ((3294.6 − 1385.4) × 100/3294.6).
For the FMNIST IID case, Fed-RHLP was 0.52% better than FedAvg (91.56% vs. 91.04%) and 0.36% better than Power-of-Choice (91.56% vs. 91.20%). To reach 90% accuracy, Fed-RHLP converged 77.78% faster than FedAvg ((54 − 12) × 100/54) and 70.73% faster than Power-of-Choice ((41 − 12) × 100/41). Additionally, Fed-RHLP reduced execution time by 67.51% relative to FedAvg ((8974.8 − 2916) × 100/8974.8) and by 70.73% relative to Power-of-Choice ((9963 − 2916) × 100/9963).
For the FMNIST Non-IID case, Fed-RHLP was 1.41% better than FedAvg (88.96% vs. 87.55%) and 0.94% better than Power-of-Choice (88.96% vs. 88.02%). To reach 80% accuracy, Fed-RHLP converged 28% faster than FedAvg ((25 − 18) × 100/25) and 10% faster than Power-of-Choice ((20 − 18) × 100/20). Additionally, Fed-RHLP reduced execution time by 11.09% relative to FedAvg ((7580 − 6739.2) × 100/7580) and by 1.33% relative to Power-of-Choice ((6830 − 6739.2) × 100/6830).
For the QMNIST IID case, Fed-RHLP was 0.25% better than FedAvg (99.32% vs. 99.07%) and 0.28% better than Power-of-Choice (99.32% vs. 99.04%). To reach 90% accuracy, Fed-RHLP converged 66.67% faster than both FedAvg and Power-of-Choice ((3 − 1) × 100/3). Additionally, Fed-RHLP reduced execution time by 50.67% relative to FedAvg ((401.4 − 198) × 100/401.4) and by 66.67% relative to Power-of-Choice ((594 − 198) × 100/594).
For the QMNIST Non-IID case, Fed-RHLP was 0.27% better than FedAvg (98.90% vs. 98.63%) and 0.21% better than Power-of-Choice (98.90% vs. 98.69%). To reach 90% accuracy, Fed-RHLP converged 33.33% faster than FedAvg ((15 − 10) × 100/15) and 23.08% faster than Power-of-Choice ((13 − 10) × 100/13). Additionally, Fed-RHLP reduced execution time by 10.71% relative to FedAvg ((2520 − 2250) × 100/2520) and by 22.87% relative to Power-of-Choice ((2917.2 − 2250) × 100/2917.2).
For the CIFAR IID case, Fed-RHLP was 12.21% better than FedAvg (74.21% vs. 62.0%) and 12.24% better than Power-of-Choice (74.21% vs. 61.97%). To reach 60% accuracy, Fed-RHLP converged 79.78% faster than FedAvg ((178 − 36) × 100/178) and 80.11% faster than Power-of-Choice ((181 − 36) × 100/181). Additionally, Fed-RHLP reduced execution time by 76.93% relative to FedAvg ((14,135.4 − 3261.6) × 100/14,135.4) and by 84.75% relative to Power-of-Choice ((21,394.2 − 3261.6) × 100/21,394.2).
For the CIFAR Non-IID case, Fed-RHLP was 9.37% better than FedAvg (66.11% vs. 56.74%) and 6.37% better than Power-of-Choice (66.11% vs. 59.74%). To reach 50% accuracy, Fed-RHLP converged 37.74% faster than FedAvg ((159 − 99) × 100/159) and 32.19% faster than Power-of-Choice ((146 − 99) × 100/146). Additionally, Fed-RHLP reduced execution time by 7.88% relative to FedAvg ((25,662.6 − 23,641.2) × 100/25,662.6) and by 31.85% relative to Power-of-Choice ((34,689.6 − 23,641.2) × 100/34,689.6).
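The percentage figures above all follow two simple formulas; here is a short sketch reproducing them for the MNIST IID case (the helper names are ours, chosen for illustration):

```python
def accuracy_gain(ours, baseline):
    """Accuracy difference in percentage points, e.g. 99.08 - 98.55 = 0.53."""
    return round(ours - baseline, 2)

def relative_reduction(baseline, ours):
    """Percent reduction in rounds or seconds: (base - ours) * 100 / base."""
    return round((baseline - ours) * 100 / baseline, 2)

# MNIST IID case from Section 4.2, Fed-RHLP vs. FedAvg:
gain = accuracy_gain(99.08, 98.55)            # final global model accuracy
rounds_speedup = relative_reduction(9, 2)     # rounds needed to reach 90%
time_saved = relative_reduction(597.6, 282)   # execution time in seconds
```

The same two formulas generate every gain, speedup, and execution-time figure reported for the other dataset cases.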

5. Conclusions

The proposed Fed-RHLP algorithm is enhanced by the strategy of using a roulette wheel to increase the chance of selecting clients with higher local model performance for effective global model aggregation. The roulette wheel strategy in federated learning is employed to promote fairness and ensure more equitable and balanced client participation. Fed-RHLP also allows lower-performance clients to adjust their local model parameters to align with the global model. Compared with FedAvg and Power-of-Choice, Fed-RHLP consistently achieves higher local performance for the selected clients, resulting in faster convergence and increased accuracy of the final global model. Importantly, Fed-RHLP can mitigate the Non-IID problem that clients in the real world often encounter.

Author Contributions

Conceptualization, P.S. and K.T.; methodology, P.S.; software, P.S.; validation, P.S. and K.T.; formal analysis, P.S. and K.T.; investigation, P.S. and K.T.; resources, P.S. and K.T.; data curation, P.S. and K.T.; writing—original draft preparation, P.S.; writing—review and editing, P.S. and K.T.; visualization, P.S. and K.T.; supervision, P.S. and K.T.; project administration, P.S. and K.T.; funding acquisition, K.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Naresuan University (NU) and National Science, Research and Innovation Fund (NSRF), grant number 180613.

Data Availability Statement

Data are contained within the article.

Acknowledgments

We thank Olalekan Israel Aiikulola, Lecturer (Special Knowledge and Abilities), Faculty of Medical Science, Naresuan University, Thailand for carefully checking the general format of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhang, C.; Xie, Y.; Bai, H.; Yu, B.; Li, W.; Gao, Y. A survey on federated learning. Knowl.-Based Syst. 2021, 216, 106775. [Google Scholar] [CrossRef]
  2. Bebortta, S.; Tripathy, S.S.; Basheer, S.; Chowdhary, C.L. FedEHR: A Federated Learning Approach towards the Prediction of Heart Diseases in IoT-Based Electronic Health Records. Diagnostics 2023, 13, 3166. [Google Scholar] [CrossRef] [PubMed]
  3. Moshawrab, M.; Adda, M.; Bouzouane, A.; Ibrahim, H.; Raad, A. Reviewing Federated Learning Aggregation Algorithms; Strategies, Contributions, Limitations and Future Perspectives. Electronics 2023, 12, 2287. [Google Scholar] [CrossRef]
  4. Rafi, T.H.; Noor, F.A.; Hussain, T.; Chae, D.K.; Yang, Z. A Generalized Look at Federated Learning: Survey and Perspectives. arXiv 2023, arXiv:2303.14787. [Google Scholar]
  5. Wang, R.; Lai, J.; Zhang, Z.; Li, X.; Vijayakumar, P.; Karuppiah, M. Privacy-preserving federated learning for internet of medical things under edge computing. IEEE J. Biomed. Health Inform. 2022, 27, 854–865. [Google Scholar] [CrossRef]
  6. Jung, J.P.; Ko, Y.B.; Lim, S.H. Federated Learning with Pareto Optimality for Resource Efficiency and Fast Model Convergence in Mobile Environments. Sensors 2024, 24, 2476. [Google Scholar] [CrossRef]
  7. Nilsson, A.; Smith, S.; Ulm, G.; Gustavsson, E.; Jirstrand, M. A performance evaluation of federated learning algorithms. In Proceedings of the Second Workshop on Distributed Infrastructures for Deep Learning, Rennes, France, 10–11 December 2018; pp. 1–8. [Google Scholar]
  8. Zhu, H.; Jin, Y. Multi-objective evolutionary federated learning. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 1310–1322. [Google Scholar] [CrossRef]
  9. Putra, M.A.; Putri, A.R.; Zainudin, A.; Kim, D.S.; Lee, J.M. Acs: Accuracy-based client selection mechanism for federated industrial iot. Internet Things 2023, 21, 100657. [Google Scholar] [CrossRef]
  10. Khajehali, N.; Yan, J.; Chow, Y.W.; Fahmideh, M. A Comprehensive Overview of IoT-Based Federated Learning: Focusing on Client Selection Methods. Sensors 2023, 23, 7235. [Google Scholar] [CrossRef]
  11. Zhang, S.Q.; Lin, J.; Zhang, Q. A multi-agentreinforcement learning approach for efficient client selection in federated learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 22 February–1 March 2022; Volume 36, pp. 9091–9099. [Google Scholar]
  12. Zhou, H.; Lan, T.; Venkataramani, G.; Ding, W. On the Convergence of Heterogeneous Federated Learning with Arbitrary Adaptive Online Model Pruning. arXiv 2022, arXiv:2201.11803. [Google Scholar]
  13. Mu, X.; Shen, Y.; Cheng, K.; Geng, X.; Fu, J.; Zhang, T.; Zhang, Z. Fedproc: Prototypical contrastive federated learning on non-iid data. Future Gener. Comput. Syst. 2023, 143, 93–104. [Google Scholar] [CrossRef]
  14. Cho, Y.J.; Wang, J.; Joshi, G. Client selection in federated learning: Convergence analysis and power-of-choice selection strategies. arXiv 2020, arXiv:2010.01243. [Google Scholar]
  15. Zeng, Y.; Teng, S.; Xiang, T.; Zhang, J.; Mu, Y.; Ren, Y.; Wan, J. A Client Selection Method Based on Loss Function Optimization for Federated Learning. Comput. Model. Eng. Sci. 2023, 137, 1047–1064. [Google Scholar] [CrossRef]
Figure 1. FedAvg processing diagram.
Figure 2. Strategy of using a roulette wheel in Fed-RHLP.
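The roulette-wheel (RW) selection strategy in Figure 2 gives each client a slice of the wheel proportional to its local performance, so higher-performance clients are favored while lower-performance clients still have a chance to participate. The following minimal Python sketch illustrates the general technique only; the helper name `roulette_wheel_select`, the example scores, and sampling without replacement are illustrative assumptions, not the authors' implementation.

```python
import random

def roulette_wheel_select(scores, k, rng=random):
    """Pick k distinct clients; selection probability is proportional
    to each client's local-performance score (roulette wheel)."""
    pool = dict(scores)  # copy so repeated calls do not mutate the input
    chosen = []
    for _ in range(min(k, len(pool))):
        total = sum(pool.values())
        spin = rng.uniform(0, total)   # spin the wheel
        cum = 0.0
        for client, score in pool.items():
            cum += score
            if spin <= cum:            # the slice the pointer landed on
                chosen.append(client)
                del pool[client]       # sample without replacement
                break
    return chosen

# Hypothetical local-performance scores (e.g., local model accuracy)
scores = {"c1": 0.95, "c2": 0.60, "c3": 0.90, "c4": 0.30}
picked = roulette_wheel_select(scores, k=2, rng=random.Random(42))
```

Over many rounds, a client such as `c1` (score 0.95) is selected roughly three times as often as `c4` (score 0.30), matching the proportional-representation idea described in the abstract.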
Figure 3. Online datasets used in this experiment.
Figure 4. Accuracy of the global model trained with different algorithms on the MNIST (a,b), FMNIST (c,d), QMNIST (e,f), and CIFAR (g,h) datasets.
Table 1. CNN model structures used for the MNIST, FMNIST, QMNIST, and CIFAR datasets.
| Dataset | Layer | Input Channel | Output Channel | Kernel | Stride | Padding |
|---|---|---|---|---|---|---|
| MNIST | Convolutional layer 1 | 1 | 10 | 5 × 5 | - | - |
| | Convolutional layer 2 | 10 | 20 | 5 × 5 | - | - |
| | Convolutional layer 2 dropout = 0.5 | - | - | - | - | - |
| | Fully connected layer 1 | 320 | 50 | - | - | - |
| | Fully connected layer 2 | 50 | 10 Classes | - | - | - |
| FMNIST | Convolutional layer 1 | 1 | 32 | 3 × 3 | - | 1 |
| | Max pooling layer | - | - | 2 × 2 | 2 | - |
| | Convolutional layer 2 | 32 | 64 | 3 × 3 | - | - |
| | Max pooling layer | - | - | 2 × 2 | 2 | - |
| | Fully connected layer 1 | 2304 | 600 | - | - | - |
| | Dropout layer = 0.25 | - | - | - | - | - |
| | Fully connected layer 2 | 600 | 120 | - | - | - |
| | Fully connected layer 3 | 120 | 10 Classes | - | - | - |
| QMNIST | Convolutional layer 1 | 1 | 16 | 3 × 3 | - | 1 |
| | Convolutional layer 2 | 16 | 32 | 3 × 3 | - | 1 |
| | Convolutional layer 3 | 32 | 64 | 3 × 3 | - | 1 |
| | Max pooling layer | - | - | 2 × 2 | 2 | - |
| | Fully connected layer 1 | 576 | 120 | - | - | - |
| | Fully connected layer 2 | 120 | 84 | - | - | - |
| | Fully connected layer 3 | 84 | 47 Classes | - | - | - |
| CIFAR | Convolutional layer 1 | 3 | 16 | 3 × 3 | 1 | 1 |
| | Convolutional layer 2 | 16 | 32 | 3 × 3 | 1 | 1 |
| | Convolutional layer 3 | 32 | 64 | 3 × 3 | 1 | 1 |
| | Max pooling layer | - | - | 2 × 2 | 2 | - |
| | Fully connected layer 1 | 1024 | 512 | - | - | - |
| | Fully connected layer 2 | 512 | 64 | - | - | - |
| | Fully connected layer 3 | 64 | 10 Classes | - | - | - |
| | Dropout layer = 0.5 | - | - | - | - | - |
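The flattened input sizes of the first fully connected layers in Table 1 (320, 2304, 576, and 1024) follow from standard convolution output-size arithmetic. The sketch below is illustrative only: the helper names are ours, and it assumes a 2 × 2 max-pooling step after every convolutional layer (the table lists pooling explicitly only in places, as is common when pooling is applied functionally in the forward pass) plus stride 1 and zero padding wherever a cell is blank.

```python
def conv_out(n, k, s=1, p=0):
    """Spatial output size of a convolution: floor((n + 2p - k)/s) + 1."""
    return (n + 2 * p - k) // s + 1

def flat_size(n, convs, pool=2):
    """Apply (conv -> assumed 2x2 max-pool) blocks to an n x n input and
    return the flattened feature count. convs: list of (out_ch, kernel, pad)."""
    ch = None
    for out_ch, k, p in convs:
        n = conv_out(n, k, p=p) // pool  # convolution, then assumed pooling
        ch = out_ch
    return ch * n * n

# Layer settings taken from Table 1 (28x28 MNIST-family inputs, 32x32 CIFAR)
mnist  = flat_size(28, [(10, 5, 0), (20, 5, 0)])              # -> 320
fmnist = flat_size(28, [(32, 3, 1), (64, 3, 0)])              # -> 2304
qmnist = flat_size(28, [(16, 3, 1), (32, 3, 1), (64, 3, 1)])  # -> 576
cifar  = flat_size(32, [(16, 3, 1), (32, 3, 1), (64, 3, 1)])  # -> 1024
```

Under these assumptions the computed sizes reproduce all four fully-connected input dimensions listed in Table 1.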
Table 2. Hardware specifications for executing the FL algorithms.

| Component Name | Description |
|---|---|
| Processor | 12th Gen Intel(R) Core(TM) i7-12650H, 2.30 GHz |
| Graphics card | Nvidia GeForce RTX 4060 Laptop GPU |
| Random access memory | DDR5, 40.0 GB |
| Operating system | Windows 11 |
| Environment | Python 3.7.2 with the PyTorch framework |
Table 3. Accuracy of global model and convergence rounds and time (seconds) on MNIST IID and Non-IID.

| Algorithm | Accuracy of Global Model | Dataset | Accuracy 50% (Round / Time) | Accuracy 60% (Round / Time) | Accuracy 70% (Round / Time) | Accuracy 80% (Round / Time) | Accuracy 90% (Round / Time) |
|---|---|---|---|---|---|---|---|
| Proposed Fed-RHLP | 99.08% | MNIST IID | 1 / 139.8 | 1 / 139.8 | 1 / 139.8 | 1 / 139.8 | 2 / 282 |
| | 98.71% | MNIST Non-IID | 4 / 526.8 | 5 / 648.6 | 5 / 648.6 | 9 / 1140.6 | 11 / 1385.4 |
| Fed-Avg | 98.55% | MNIST IID | 2 / 130.8 | 3 / 196.8 | 3 / 196.8 | 4 / 261 | 9 / 597.6 |
| | 95.36% | MNIST Non-IID | 8 / 412.2 | 10 / 515.4 | 12 / 621 | 20 / 1018.8 | 46 / 2365.2 |
| Power-of-Choice | 98.53% | MNIST IID | 2 / 183 | 3 / 259.2 | 3 / 259.2 | 4 / 331.2 | 8 / 620.4 |
| | 96.36% | MNIST Non-IID | 8 / 560.4 | 12 / 822.6 | 13 / 887.4 | 17 / 1152.6 | 47 / 3294.6 |
Note: Convergence speed (%) = (higher round − lower round) × 100/higher round; reduced execution time (%) = (higher execution time − lower execution time) × 100/higher execution time.
Table 4. Accuracy of global model and convergence rounds and time (seconds) on FMNIST IID and Non-IID.

| Algorithm | Accuracy of Global Model | Dataset | Accuracy 50% (Round / Time) | Accuracy 60% (Round / Time) | Accuracy 70% (Round / Time) | Accuracy 80% (Round / Time) | Accuracy 90% (Round / Time) |
|---|---|---|---|---|---|---|---|
| Proposed Fed-RHLP | 91.56% | FMNIST IID | 1 / 243 | 1 / 243 | 1 / 243 | 1 / 243 | 12 / 2916 |
| | 88.96% | FMNIST Non-IID | 3 / 1123.2 | 5 / 1872 | 9 / 3369.6 | 18 / 6739.2 | - / - |
| Fed-Avg | 91.04% | FMNIST IID | 1 / 166.2 | 1 / 166.2 | 1 / 166.2 | 2 / 332.4 | 54 / 8974.8 |
| | 87.55% | FMNIST Non-IID | 4 / 1212.8 | 6 / 1819.2 | 8 / 2425.6 | 25 / 7580 | - / - |
| Power-of-Choice | 91.2% | FMNIST IID | 1 / 243 | 1 / 243 | 1 / 243 | 3 / 729 | 41 / 9963 |
| | 88.02% | FMNIST Non-IID | 6 / 2049 | 6 / 2049 | 12 / 4098 | 20 / 6830 | - / - |
Table 5. Accuracy of global model and convergence rounds and time (seconds) on QMNIST IID and Non-IID.

| Algorithm | Accuracy of Global Model | Dataset | Accuracy 50% (Round / Time) | Accuracy 60% (Round / Time) | Accuracy 70% (Round / Time) | Accuracy 80% (Round / Time) | Accuracy 90% (Round / Time) |
|---|---|---|---|---|---|---|---|
| Proposed Fed-RHLP | 99.32% | QMNIST IID | 1 / 198 | 1 / 198 | 1 / 198 | 1 / 198 | 1 / 198 |
| | 98.90% | QMNIST Non-IID | 3 / 675 | 3 / 675 | 4 / 900 | 6 / 1350 | 10 / 2250 |
| Fed-Avg | 99.07% | QMNIST IID | 1 / 133.8 | 1 / 133.8 | 2 / 267.6 | 2 / 267.6 | 3 / 401.4 |
| | 98.63% | QMNIST Non-IID | 5 / 840 | 6 / 1008 | 6 / 1008 | 14 / 2352 | 15 / 2520 |
| Power-of-Choice | 99.04% | QMNIST IID | 1 / 198 | 1 / 198 | 2 / 396 | 2 / 396 | 3 / 594 |
| | 98.69% | QMNIST Non-IID | 7 / 1570.8 | 7 / 1570.8 | 7 / 1570.8 | 9 / 2019.6 | 13 / 2917.2 |
Table 6. Accuracy of global model and convergence rounds and time (seconds) on CIFAR IID and Non-IID.

| Algorithm | Accuracy of Global Model | Dataset | Accuracy 30% (Round / Time) | Accuracy 40% (Round / Time) | Accuracy 50% (Round / Time) | Accuracy 60% (Round / Time) | Accuracy 70% (Round / Time) |
|---|---|---|---|---|---|---|---|
| Proposed Fed-RHLP | 74.21% | CIFAR IID | 7 / 634.2 | 11 / 996.6 | 20 / 1812 | 36 / 3261.6 | 83 / 7519.8 |
| | 66.11% | CIFAR Non-IID | 29 / 6925.2 | 36 / 8596.8 | 99 / 23,641.2 | 121 / 28,894.8 | - / - |
| Fed-Avg | 62.0% | CIFAR IID | 42 / 3288 | 61 / 4783.8 | 101 / 7967.4 | 178 / 14,135.4 | - / - |
| | 56.74% | CIFAR Non-IID | 77 / 12,427.8 | 115 / 18,561 | 159 / 25,662.6 | - / - | - / - |
| Power-of-Choice | 61.97% | CIFAR IID | 39 / 4609.8 | 59 / 6973.8 | 103 / 12,174.6 | 181 / 21,394.2 | - / - |
| | 59.74% | CIFAR Non-IID | 72 / 17,107.2 | 102 / 24,235.2 | 146 / 34,689.6 | - / - | - / - |
Note: The calculation method applied to the values in Tables 3–6 is as follows: convergence speed (%) = (higher round − lower round) × 100/higher round; reduced execution time (%) = (higher execution time − lower execution time) × 100/higher execution time.
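As a worked example of the note's formulas, take the MNIST Non-IID rows of Table 3 at the 90% accuracy threshold: Fed-RHLP needs 11 rounds / 1385.4 s versus 46 rounds / 2365.2 s for Fed-Avg. The function names below are illustrative; the formulas are those stated in the note.

```python
def convergence_speedup_pct(higher_round, lower_round):
    # Convergence speed (%) = (higher round - lower round) * 100 / higher round
    return (higher_round - lower_round) * 100 / higher_round

def reduced_time_pct(higher_time, lower_time):
    # Reduced execution time (%) = (higher time - lower time) * 100 / higher time
    return (higher_time - lower_time) * 100 / higher_time

# Table 3, MNIST Non-IID, 90% accuracy threshold (Fed-Avg vs. Fed-RHLP)
speed = convergence_speedup_pct(46, 11)        # rounds: 46 vs. 11
time_red = reduced_time_pct(2365.2, 1385.4)    # seconds: 2365.2 vs. 1385.4
print(f"{speed:.2f}% faster convergence, {time_red:.2f}% less time")
```

On these figures Fed-RHLP converges about 76% faster in rounds and saves about 41% of the execution time relative to Fed-Avg.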
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sittijuk, P.; Tamee, K. Fed-RHLP: Enhancing Federated Learning with Random High-Local Performance Client Selection for Improved Convergence and Accuracy. Symmetry 2024, 16, 1181. https://doi.org/10.3390/sym16091181
