Secure Data Sharing in Federated Learning through Blockchain-Based Aggregation
Abstract
1. Introduction
Contribution and Organization
2. Preliminary
2.1. Federated Learning Approach
2.2. Blockchain Overview
- Democracy and decentralized control: In permissionless systems that use proof of work (PoW) as the consensus mechanism, anyone can act as a miner, with equal privileges to generate and approve blocks for the blockchain. Although details vary between systems, the overarching principle remains: blockchain technology removes the need for a single, fully trusted entity and thereby avoids a single point of failure.
- Integrity and immutability: As long as no attacker or coalition of attackers dominates the consensus process, for example when more than 51% of the computing power in the Bitcoin blockchain is controlled by semi-honest miners, it is infeasible to modify blocks that have already been agreed upon.
- Consistency: Under the assumptions above, the chain maintains a single, consistent view even when attacked by strong adversaries. However, nodes that deviate from the predefined rules can cause forks, which leave participants with different views of the chain, as has been observed in Ethereum.
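The integrity property listed above can be illustrated with a minimal hash chain. This is only a sketch under stated assumptions: the block layout and the helper names `block_hash`, `append_block`, and `is_valid` are hypothetical and do not come from any particular blockchain, and consensus (PoW, miners, forks) is not modeled at all. It shows only why altering an agreed-upon block is detectable: every block commits to the hash of its predecessor.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # assumed placeholder for the predecessor of the first block


def block_hash(index: int, prev_hash: str, payload: str) -> str:
    """Hash the block contents together with the previous block's hash."""
    data = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append_block(chain: list, payload: str) -> None:
    """Append a block that commits to the current tip of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append(
        {
            "index": index,
            "prev": prev_hash,
            "payload": payload,
            "hash": block_hash(index, prev_hash, payload),
        }
    )


def is_valid(chain: list) -> bool:
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Appending a few blocks and then editing the `payload` of an earlier one makes `is_valid` return `False`, because the stored hash no longer matches the recomputed one. In a real blockchain, an attacker who recomputes all later hashes must additionally win the consensus process, which is where the honest-majority assumption enters.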
3. EIFFeL Framework and Security Analysis
3.1. Threat Model
- Malicious clients: Multiple malicious clients can deviate arbitrarily from the protocol. They may (1) compromise the aggregate by submitting malformed updates; (2) cause the honest clients to complete an integrity check; or (3) violate the privacy of honest clients, possibly in collusion with the server.
- Malicious server: It aims to violate the privacy of clients by trying to recover their raw updates. A malicious server may (1) mark the inputs from honest clients as invalid or (2) mark the inputs from malicious clients as valid, and thereby control which inputs are aggregated.
3.2. Scheme Architecture and Workflow
- Shamir’s t-out-of-n secret sharing scheme [22]: This cryptographic method distributes a secret among n participants such that at least t participants (the threshold) are required to jointly reconstruct the original secret. Two algorithms are defined: one generates the secret shares, and the other reconstructs the secret. Assuming m malicious clients, the threshold t is set as in EIFFeL.
- Reed–Solomon error correcting code [23]: It is an error-correcting code used in digital communication and data storage. Reed–Solomon codes add redundant symbols to the original data, allowing the receiver to detect and correct errors introduced during transmission. EIFFeL uses the Reed–Solomon error correcting code so that the secret can still be recovered from a set of shares that contains malicious shares, provided the number of malicious shares is sufficiently small.
- Key agreement protocol: It enables two or more parties to agree upon a shared secret key. It involves three algorithms: one generates the parameters, one generates a public/private key pair, and one derives the common secret key.
- Authenticated encryption: A cryptographic process that combines encryption and message authentication to guarantee the integrity and confidentiality of transmitted data. It comprises algorithms for key generation, encryption, and decryption.
- Secret-shared non-interactive proofs (SNIPs) [24]: These are cryptographic protocols that allow multiple parties to jointly prove the truth of a statement without revealing their individual inputs. These proofs are constructed in such a way that the validity of the statement can be verified without requiring interaction between the parties. A public validation predicate is defined to conduct the integrity check.
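Of the building blocks listed above, Shamir's t-out-of-n secret sharing is compact enough to sketch directly. The following is a minimal illustration, not the implementation used in EIFFeL or in this paper: the field modulus `PRIME`, the function names `share` and `reconstruct`, and the choice of evaluation points 1..n are all assumptions made for the example. It shows no error correction, so it assumes every share passed to `reconstruct` is honest.

```python
import secrets

# Assumed field modulus (a Mersenne prime); the text does not fix one.
PRIME = 2**127 - 1


def share(secret: int, n: int, t: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares such that any t of them reconstruct it."""
    if not 1 <= t <= n < PRIME:
        raise ValueError("require 1 <= t <= n < PRIME")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be a field element")
    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares: list[tuple[int, int]]) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For example, `reconstruct(share(s, n=5, t=3)[:3])` returns `s` for any field element `s`, and any other subset of three shares works equally well. Shares are also additively homomorphic: adding two clients' shares point by point (modulo `PRIME`) yields shares of the sum of their secrets, which is the property that makes the scheme useful for aggregation.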
3.3. Analysis of the EIFFeL Framework
- It should be corruption-resistant against all entities, including the server, clients, and other attackers. The data should not be changed, deleted, or manipulated by malicious entities in any manner.
- It should be able to validate the identities of clients (and the server) according to some public key infrastructure (PKI). By default, some PKI information should be stored on the bulletin board so that it can validate the identity claims of its users.
- It should be able to establish a secure channel with its users so that the integrity and authenticity of the users’ data can be guaranteed during the transmission.
4. The Enhanced Scheme
4.1. Security Analysis
4.2. Performance Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Davies, H. Ted Cruz Using Firm That Harvested Data on Millions of Unwitting Facebook Users. Available online: https://www.theguardian.com/us-news/2015/dec/11/senator-ted-cruz-president-campaign-facebook-user-data (accessed on 4 January 2024).
- European Parliament; Council of the European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council. Available online: https://data.europa.eu/eli/reg/2016/679/oj (accessed on 4 May 2016).
- Krishnan, S.; Anand, A.J.; Srinivasan, R.; Kavitha, R.; Suresh, S. Federated Learning; CRC Press: Boca Raton, FL, USA, 2024.
- Boenisch, F.; Dziedzic, A.; Schuster, R.; Shamsabadi, A.S.; Shumailov, I.; Papernot, N. Reconstructing Individual Data Points in Federated Learning Hardened with Differential Privacy and Secure Aggregation. In Proceedings of the 2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P), Delft, The Netherlands, 3–7 July 2023; IEEE Computer Society: Piscataway, NJ, USA, 2023; pp. 241–257.
- Melis, L.; Song, C.; De Cristofaro, E.; Shmatikov, V. Exploiting unintended feature leakage in collaborative learning. In Proceedings of the 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 19–23 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 691–706.
- Yin, H.; Mallya, A.; Vahdat, A.; Alvarez, J.M.; Kautz, J.; Molchanov, P. See through gradients: Image batch recovery via GradInversion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 16337–16346.
- Lyu, L.; Yu, H.; Ma, X.; Chen, C.; Sun, L.; Zhao, J.; Yang, Q.; Yu, P.S. Privacy and Robustness in Federated Learning: Attacks and Defenses. IEEE Trans. Neural Netw. Learn. Syst. 2022, 1–21.
- Adilova, L.; Böttinger, K.; Danos, V.; Jacob, S.; Langer, F.; Markert, T.; Poretschkin, M.; Rosenzweig, J.; Schulze, J.P.; Sperl, P. Security of AI-Systems: Fundamentals. Available online: https://doi.org/10.24406/publica-1503 (accessed on 15 March 2024).
- Blanchard, P.; El Mhamdi, E.M.; Guerraoui, R.; Stainer, J. Machine learning with adversaries: Byzantine tolerant gradient descent. Adv. Neural Inf. Process. Syst. 2017, 30.
- Fang, M.; Cao, X.; Jia, J.; Gong, N. Local model poisoning attacks to Byzantine-robust federated learning. In Proceedings of the 29th USENIX Security Symposium (USENIX Security 20), Boston, MA, USA, 12–14 August 2020; pp. 1605–1622.
- Kairouz, P.; McMahan, H.B.; Avent, B.; Bellet, A.; Bennis, M.; Bhagoji, A.N.; Bonawitz, K.; Charles, Z.; Cormode, G.; Cummings, R.; et al. Advances and Open Problems in Federated Learning. Found. Trends Mach. Learn. 2021, 14, 1–210.
- Bell, J.H.; Bonawitz, K.A.; Gascón, A.; Lepoint, T.; Raykova, M. Secure single-server aggregation with (poly) logarithmic overhead. In Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security, Virtual, 9–13 November 2020; pp. 1253–1269.
- Bonawitz, K.; Ivanov, V.; Kreuter, B.; Marcedone, A.; McMahan, H.B.; Patel, S.; Ramage, D.; Segal, A.; Seth, K. Practical secure aggregation for privacy-preserving machine learning. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, Dallas, TX, USA, 30 October–3 November 2017; pp. 1175–1191.
- Kairouz, P.; Liu, Z.; Steinke, T. The distributed discrete Gaussian mechanism for federated learning with secure aggregation. In Proceedings of the International Conference on Machine Learning, PMLR, Virtual, 18–24 July 2021; pp. 5201–5212.
- Liu, B.; Pejó, B.; Tang, Q. Privacy-Preserving Federated Singular Value Decomposition. Appl. Sci. 2023, 13, 7373.
- Roy Chowdhury, A.; Guo, C.; Jha, S.; van der Maaten, L. EIFFeL: Ensuring integrity for federated learning. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, Los Angeles, CA, USA, 7–11 November 2022; pp. 2535–2549.
- Diedrich, H. Ethereum: Blockchains, Digital Assets, Smart Contracts, Decentralized Autonomous Organizations; Wildfire Publishing: Sydney, Australia, 2016.
- Narayanan, A.; Bonneau, J.; Felten, E.; Miller, A.; Goldfeder, S. Bitcoin and Cryptocurrency Technologies: A Comprehensive Introduction; Princeton University Press: Princeton, NJ, USA, 2016.
- Swan, M. Blockchain: Blueprint for a New Economy; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2015.
- Qammar, A.; Karim, A.; Ning, H.; Ding, J. Securing federated learning with blockchain: A systematic literature review. Artif. Intell. Rev. 2023, 56, 3951–3985.
- Yu, F.; Lin, H.; Wang, X.; Yassine, A.; Hossain, M.S. Blockchain-empowered secure federated learning system: Architecture and applications. Comput. Commun. 2022, 196, 55–65.
- Shamir, A. How to share a secret. Commun. ACM 1979, 22, 612–613.
- Lin, S.; Costello, D.J. Error Control Coding: Fundamentals and Applications; Pearson/Prentice Hall: Upper Saddle River, NJ, USA, 2004.
- Corrigan-Gibbs, H.; Boneh, D. Prio: Private, robust, and scalable computation of aggregate statistics. In Proceedings of the 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), Boston, MA, USA, 27–29 March 2017; pp. 259–282.
- Suwito, M.H.; Tama, B.A.; Santoso, B.; Dutta, S.; Tan, H.; Ueshige, Y.; Sakurai, K. A Systematic Study of Bulletin Board and Its Application. In Proceedings of the ASIA CCS ’22: ACM Asia Conference on Computer and Communications Security, Nagasaki, Japan, 30 May–3 June 2022; Suga, Y., Sakurai, K., Ding, X., Sako, K., Eds.; ACM: New York, NY, USA, 2022; pp. 1213–1215.
- Tramèr, F.; Shokri, R.; Joaquin, A.S.; Le, H.; Jagielski, M.; Hong, S.; Carlini, N. Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, CCS 2022, Los Angeles, CA, USA, 7–11 November 2022; Yin, H., Stavrou, A., Cremers, C., Shi, E., Eds.; ACM: New York, NY, USA, 2022; pp. 2779–2792.
- Burmester, M.; Desmedt, Y. A secure and scalable Group Key Exchange system. Inf. Process. Lett. 2005, 94, 137–143.
- Oliphant, T.; Contributors Community. Python Library NumPy. Available online: https://numpy.org/ (accessed on 13 February 2024).
- Python Cryptographic Authority. Python Library Cryptography. Available online: https://cryptography.io/en/latest/ (accessed on 13 February 2024).
- Samarakoon, S.; Siriwardhana, Y.; Porambage, P.; Liyanage, M.; Chang, S.Y.; Kim, J.; Kim, J.; Ylianttila, M. 5G-NIDD: A Comprehensive Network Intrusion Detection Dataset Generated over 5G Wireless Network. arXiv 2022, arXiv:2212.01298.
Category | Number of Records |
---|---|
Benign | 477,737 |
ICMPFlood | 1155 |
HTTPFlood | 140,812 |
SlowrateDos | 73,124 |
SYNFlood | 9721 |
SYNScan | 20,043 |
TCPConnectScan | 20,052 |
UDPFlood | 457,340 |
UDPScan | 15,906 |
Step | Participant | Runtime (ms), n = 50 | Runtime (ms), n = 100 | Runtime (ms), n = 150 | Runtime (ms), n = 200 |
---|---|---|---|---|---|
2.(a) | Client | 491.37 | 974.21 | 1477.88 | 1990.13 |
2.(a) | Blockchain | - | - | - | - |
2.(b) | Client | 1073.43 | 2106.11 | 3204.75 | 4283.49 |
2.(b) | Blockchain | 995.14 | 1997.64 | 3091.71 | 4107.93 |
2.(c) | Client | 2079.58 | 4215.38 | 6314.49 | 8284.27 |
2.(c) | Blockchain | - | - | - | - |
3 | Client | 171.21 | 354.14 | 498.87 | 637.61 |
3 | Blockchain | - | - | - | - |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, B.; Tang, Q. Secure Data Sharing in Federated Learning through Blockchain-Based Aggregation. Future Internet 2024, 16, 133. https://doi.org/10.3390/fi16040133