A Federated Learning Framework against Data Poisoning Attacks on the Basis of the Genetic Algorithm
Abstract
1. Introduction
- (1)
- This paper applies the genetic algorithm to the federated learning model: in the participation stage, the genetic algorithm is used to find the optimal combination of data, so that poor-quality data does not degrade the training process of the federated learning model;
- (2)
- It is verified on the fashion-MNIST and cifar10 data sets that the training accuracy of GAFL is 7.45% higher than that of the federated learning model on fashion-MNIST and 8.18% higher on cifar10;
- (3)
- It is verified on the fashion-MNIST and cifar10 data sets that GAFL achieves the highest comprehensive score.
2. Related Work
3. Basic Theory
3.1. Federated Learning
- (1)
- The parameter server sends the initial weights to the participants;
- (2)
- Participants use their own data sets to train locally;
- (3)
- Participants send the training parameters to the parameter server;
- (4)
- The parameter server updates the parameters of the global model;
- (5)
- The parameter server returns the parameters of the global model to participants;
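The five steps above can be sketched as a toy FedAvg-style round in Python. This is a minimal illustration under simplifying assumptions, not the paper's implementation: `local_train` is a hypothetical stand-in for real gradient-descent training, and all participants are weighted equally in the server's average.

```python
def local_train(weights, data, lr=0.1):
    # Hypothetical local update: nudge each weight toward the mean of this
    # participant's data (a stand-in for real gradient-descent training).
    target = sum(data) / len(data)
    return [w + lr * (target - w) for w in weights]

def federated_round(global_weights, participant_data):
    # Steps (2)-(5): participants train locally, send parameters to the
    # server, which averages them and returns the updated global model.
    updates = [local_train(list(global_weights), d) for d in participant_data]
    n = len(updates)
    return [sum(u[i] for u in updates) / n for i in range(len(global_weights))]

global_w = [0.0, 0.0]                    # step (1): initial weights
data = [[1.0, 2.0], [3.0], [2.0, 2.0]]   # one private data set per participant
for _ in range(5):                       # rounds of steps (1)-(5)
    global_w = federated_round(global_w, data)
```

Note that the raw data never leaves the participants; only the weight vectors travel between client and server.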
3.2. The Genetic Algorithm
- (1)
- Initialize a population randomly;
- (2)
- Assess the fitness of each individual;
- (3)
- Individuals with high fitness are selected for chromosome crossover and mutation;
- (4)
- Repeat the above process until the best individual is found.
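The four steps above can be illustrated with a minimal genetic algorithm in Python. The fitness function here is the classic OneMax toy problem (count of 1-bits), a placeholder: in GAFL the fitness would instead score a participant's data combination. All parameter values are illustrative.

```python
import random

random.seed(0)

def fitness(bits):
    # Toy fitness: count of 1-bits (OneMax); a stand-in for scoring
    # a data combination as in the paper.
    return sum(bits)

def evolve(pop_size=20, n_genes=16, p_c=0.8, p_m=0.05, generations=40):
    # (1) initialize a random population
    pop = [[random.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        # (2) assess fitness, (3) select the fitter half as parents
        pop.sort(key=fitness, reverse=True)
        parents, children = pop[: pop_size // 2], []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_genes)
            # crossover with probability p_c, then point mutation with p_m
            child = a[:cut] + b[cut:] if random.random() < p_c else a[:]
            if random.random() < p_m:
                child[random.randrange(n_genes)] ^= 1
            children.append(child)
        pop = children
    # (4) best individual found in the final generation
    return max(pop, key=fitness)

best = evolve()
```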
4. Federated Learning Model on the Basis of the Genetic Algorithm
4.1. Algorithm Description
- Step 1: Participants train original data locally;
- Step 2: Determine the weight between data availability and accuracy;
- Step 3: Use the genetic algorithm to find the optimal data combination according to the accuracy of the data;
- Step 4: All participants dynamically join the model;
- Step 5: The parameter server sends the initial model to the participants;
- Step 6: Participants’ nodes use local data to train in the model locally;
- Step 7: The trained model parameters are sent to the parameter server;
- Step 8: The parameter server integrates model parameters;
- Step 9: The parameter server sends the updated parameters to participants;
- Step 10: Participants use the new model weight parameters to conduct a new round of iterative training.
4.2. Algorithmic Pseudocode
Algorithm 1. Pseudocode of the federated learning model based on the genetic algorithm.
Input: fashion-MNIST data set. Output: optimal training results.
1: Initialize the connection weights and connection thresholds of the neural network
2: do {
3:   Calculate the output of each sample with the forward formula of the neural network
4:   Calculate the gradients of the neurons in the output layer
5:   Calculate the gradients of the neurons in the hidden layer
6: } while (the termination condition is not reached)
7: Output the accuracy a_i // a_i is the single-point training accuracy of the i-th of the n participating nodes
8: Initialize p_c, p_m, M, g, F // p_c: crossover probability; p_m: mutation probability; M: population size; g: generation at which evolution terminates; F: fitness threshold at which evolution terminates
9: Generate the first-generation population pop randomly
10: repeat until (the score of some chromosome exceeds F, or the reproduction generation exceeds g)
11:   Calculate the fitness F(i) of each individual i in the population pop
12:   Initialize an empty population newpop
13:   while newpop is not yet full
14:     Select two individuals from pop by fitness-proportional selection
15:     if rand() < p_c
16:       Perform the crossover operation on the two selected individuals
17:     if rand() < p_m
18:       Perform the mutation operation on the two individuals
19:     Add the two new individuals to the population newpop
20:   Replace pop with newpop
21: for i = 1 to m // m is the number of data in the data set
22:   for j = 1 to n // n is the number of participating nodes
23:     Build the initial model
24:     Train the models
25:     Exchange parameters
26:     Update the models
27:   end for
28: end for
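As a concrete, runnable illustration of the genetic data-selection phase of Algorithm 1, the sketch below evolves a bit string marking which samples a participant keeps. The `predicted_accuracy` values, the bit-string encoding, and all parameter values are hypothetical stand-ins, not the authors' implementation.

```python
import random

random.seed(1)

# Hypothetical per-sample accuracies predicted by a participant's local
# network; poisoned samples score low.
predicted_accuracy = [0.95, 0.91, 0.15, 0.88, 0.10, 0.93, 0.89, 0.20]

def fitness(bits):
    # Fitness of a data combination: mean predicted accuracy of the
    # selected samples (0 if nothing is selected).
    chosen = [a for b, a in zip(bits, predicted_accuracy) if b]
    return sum(chosen) / len(chosen) if chosen else 0.0

def ga_select(p_c=0.8, p_m=0.1, pop_size=20, generations=30):
    # Evolve a bit string: 1 = keep the sample for federated training.
    n = len(predicted_accuracy)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents, newpop = pop[: pop_size // 2], []
        while len(newpop) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n)
            child = a[:cut] + b[cut:] if random.random() < p_c else a[:]
            if random.random() < p_m:
                child[random.randrange(n)] ^= 1
            newpop.append(child)
        pop = newpop
    return max(pop, key=fitness)

best = ga_select()
```

In the full model, the samples selected this way would then feed the federated training loop at the end of Algorithm 1.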
4.3. Model Description
4.3.1. Technical Challenge
4.3.2. Technical Details
- (1)
- Encoding: select a number of candidate data combinations to initialize the population, where each chromosome represents a combination and each gene a single data item;
- (2)
- Decoding: map each chromosome back to the value ranges of the variables it encodes;
- (3)
- Fitness calculation: since the purpose is to maximize the objective function, the larger the objective function value, the higher the fitness;
- (4)
- Selection: select individuals with higher fitness;
- (5)
- Crossover and mutation: flip a binary bit;
- (1)
- Participants train data locally;
- (2)
- Participants use the neural network to predict the accuracy of the image data;
- (3)
- Participants select data with higher accuracy than the set threshold for training;
- (4)
- Participants obtain the neural network global model and weight parameters from the parameter server;
- (5)
- Participants use the neural network algorithm to train original data locally;
- (6)
- Participants return the updated global model weight parameters to the parameter server.
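Steps (2) and (3) of the participant workflow amount to a simple threshold filter on the predicted per-sample accuracy. A minimal sketch, with hypothetical names and values:

```python
def select_clean_data(samples, predicted_acc, threshold=0.8):
    # Keep only samples whose predicted accuracy exceeds the threshold,
    # screening out likely-poisoned data before local training.
    return [s for s, a in zip(samples, predicted_acc) if a > threshold]

kept = select_clean_data(["a", "b", "c"], [0.95, 0.30, 0.85])
# kept == ["a", "c"]: the low-accuracy sample "b" is excluded from training
```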
- (1)
- The parameter server averages global model weight parameters uploaded by participants;
- (2)
- The parameter server returns the updated global model weight parameters to participants;
- (3)
- Repeat the process until the final iteration condition is reached.
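The server-side steps above reduce to an element-wise mean of the uploaded weight vectors. A sketch of step (1), assuming equal weighting of participants:

```python
def server_average(client_weights):
    # Step (1): element-wise mean of the weight vectors uploaded by the
    # participants (plain FedAvg-style averaging, equal weighting assumed).
    n = len(client_weights)
    return [sum(w[i] for w in client_weights) / n
            for i in range(len(client_weights[0]))]

avg = server_average([[1.0, 2.0], [3.0, 4.0]])
# avg == [2.0, 3.0]; step (2) broadcasts this back to the participants
```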
5. Model Analysis
5.1. Time Complexity
5.2. Algorithm Security
6. Experimental
6.1. Experimental Environment and Data Set
6.2. Experimental Procedure
6.3. Experimental Results
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
SMC | Secure Multi-Party Computation |
GAFL | Genetic Algorithm Federated Learning |
FL | Federated Learning |
PDGAN | Poisoning Defense Generative Adversarial Network |
References
- McMahan, B.; Moore, E.; Ramage, D.; Hampson, S. Communication-efficient learning of deep networks from decentralized data. Artificial intelligence and statistics. PMLR 2017, 54, 1273–1282. [Google Scholar]
- Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and open problems in federated learning. Found. Trends® Mach. Learn. 2021, 14, 1–210. [Google Scholar] [CrossRef]
- Gao, L.; Fu, H.; Li, L.; Chen, Y.; Xu, M.; Xu, C. FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 10102–10111. [Google Scholar]
- Chen, F.; Li, P.; Miyazaki, T.; Wu, C. FedGraph: Federated Graph Learning With Intelligent Sampling. IEEE Trans. Parallel Distrib. Syst. 2021, 33, 1775–1786. [Google Scholar] [CrossRef]
- Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated learning: Challenges, methods, and future directions. IEEE Signal Process. Mag. 2020, 37, 50–60. [Google Scholar] [CrossRef]
- Li, X.; Jiang, M.; Zhang, X.; Kamp, M.; Dou, Q. FedBN: Federated Learning on Non-IID Features via Local Batch Normalization. arXiv 2021, arXiv:2102.07623. [Google Scholar]
- Li, L.; Fan, Y.; Tse, M.; Li, K.-Y. A review of applications in federated learning. Comput. Ind. Eng. 2020, 149, 106854. [Google Scholar] [CrossRef]
- Niknam, S.; Harpreet, S.D.; Jeffrey, H.R. Federated learning for wireless communications: Motivation, opportunities, and challenges. IEEE Commun. Mag. 2020, 58, 46–51. [Google Scholar] [CrossRef]
- Rieke, N.; Hancox, J.; Li, W.; Milletarì, F.; Roth, H.R.; Albarqouni, S.; Bakas, S.; Galtier, M.N.; Landman, B.A.; Maier-Hein, K.; et al. The future of digital health with federated learning. NPJ Digit. Med. 2020, 3, 1–7. [Google Scholar] [CrossRef] [PubMed]
- Luo, X.; Zhao, Z.; Peng, M. Tradeoff between Model Accuracy and Cost for Federated Learning in the Mobile Edge Computing Systems. In Proceedings of the 2021 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), Nanjing, China, 29 March 2021. [Google Scholar]
- He, C.; Ceyani, E.; Balasubramanian, K.; Annavaram, M.; Avestimehr, S. SpreadGNN: Serverless Multi-task Federated Learning for Graph Neural Networks. arXiv 2021, arXiv:2106.02743. [Google Scholar]
- Jannatul, F.; Mondol, G.; Prapti, A.P.; Begum, M.; Sheikh, M.N.A.; Galib, S.M. An enhanced image encryption technique combining genetic algorithm and particle swarm optimization with chaotic function. Int. J. Comput. Appl. 2021, 43, 960–967. [Google Scholar]
- Kang, J.; Xiong, Z.; Niyato, D.; Zou, Y.; Zhang, Y.; Guizani, M. Reliable Federated Learning for Mobile Networks. IEEE Wirel. Commun. 2020, 27, 72–80. [Google Scholar] [CrossRef] [Green Version]
- Jiang, C.; Xu, C.; Zhang, Y. PFLM: Privacy-preserving federated learning with membership proof. Inf. Sci. 2021, 576, 288–311. [Google Scholar] [CrossRef]
- Tran, N.H.; Bao, W.; Zomaya, A.; Nguyen, M.N.H.; Hong, C.S. Federated learning over wireless networks: Optimization model design and analysis. In Proceedings of the IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, Paris, France, 29 April–2 May 2019; pp. 1387–1395. [Google Scholar]
- Truex, S.; Baracaldo, N.; Anwar, A.; Steinke, T.; Ludwig, H.; Zhang, R.; Zhou, Y. A hybrid approach to privacy-preserving federated learning. In Proceedings of the 12th ACM workshop on artificial intelligence and security, London, UK, 11 November 2019; pp. 1–11. [Google Scholar]
- Xu, R.; Baracaldo, N.; Zhou, Y.; Anwar, A.; Ludwig, H. Hybridalpha: An efficient approach for privacy-preserving federated learning. In Proceedings of the 12th ACM Workshop on Artificial Intelligence and Security, London, UK, 11 November 2019; pp. 13–23. [Google Scholar]
- Zhao, Y.; Chen, J.; Zhang, J.; Wu, D.; Teng, J.; Yu, S. PDGAN: A Novel Poisoning Defense Method in Federated Learning Using Generative Adversarial Network. In Algorithms and Architectures for Parallel Processing; Springer: Cham, Switzerland, 2020; pp. 595–609. [Google Scholar]
- Mendieta, M.; Yang, T.; Wang, P.; Lee, M.; Ding, Z.; Chen, C. Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 8387–8396. [Google Scholar]
- Shen, Y.; Zhou, Y.; Yu, L. CD2pFed: Cyclic Distillation-guided Channel Decoupling for Model Personalization in Federated Learning. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
- Lambora, A.; Gupta, K.; Chopra, K. Genetic algorithm-A literature review. In Proceedings of the 2019 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COMITCon), Faridabad, India, 14–16 February 2019; pp. 380–384. [Google Scholar]
- Mirjalili, S. Genetic Algorithm. In Evolutionary Algorithms and Neural Networks; Springer: Cham, Switzerland, 2019; pp. 43–55. [Google Scholar]
- Katoch, S.; Chauhan, S.S.; Kumar, V. A review on genetic algorithm: Past, present, and future. Multimed. Tools Appl. 2021, 80, 8091–8126. [Google Scholar]
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhai, R.; Chen, X.; Pei, L.; Ma, Z. A Federated Learning Framework against Data Poisoning Attacks on the Basis of the Genetic Algorithm. Electronics 2023, 12, 560. https://doi.org/10.3390/electronics12030560