Hierarchical Clustering via Single and Complete Linkage Using Fully Homomorphic Encryption
Abstract
1. Introduction
2. Preliminaries
2.1. Agglomerative Hierarchical Clustering
2.2. Homomorphic Encryption
2.3. Privacy-Preserving Clustering
3. Proposed Approach
Algorithm 1: Computation of
Input: A ciphertext
Output:
1: for to do
2:
3:
4:     append to
5: end for
6: return
Algorithm 2: Computation of Euclidean distance
Input: Ciphertext , : the th rotation of
Output: Euclidean distance list
1:
2:
3: for to do
4:
5:
6: end for
7: initialize as a plaintext of s with length
8: for to do
9:
10: end for
11:
12: return
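As a plaintext illustration of the rotation idea behind Algorithm 2, the following Python sketch simulates slot rotation with `np.roll`. It is a sketch under stated assumptions, not the paper's encrypted procedure: the point-major packing, the function names `pairwise_sq_distances_by_rotation` and `sorted_index_pairs`, and the use of squared distances (which preserve the distance ordering) are all hypothetical choices, and no homomorphic encryption library is involved.

```python
import numpy as np


def pairwise_sq_distances_by_rotation(points):
    """Return {(i, j): squared Euclidean distance} for all i < j.

    Assumption: points is an (n, d) array packed point-major into one
    vector, so slot i*d + f holds feature f of point i.
    """
    n, d = points.shape
    packed = points.reshape(-1)
    dists = {}
    for r in range(1, n):
        # Left-rotate by r*d slots: slot i*d + f now holds
        # feature f of point (i + r) mod n.
        rotated = np.roll(packed, -r * d)
        squared = (packed - rotated) ** 2
        # Sum the d feature slots that belong to each point.
        per_point = squared.reshape(n, d).sum(axis=1)
        for i in range(n):
            j = (i + r) % n
            if i < j:  # record each unordered pair once
                dists[(i, j)] = float(per_point[i])
    return dists


def sorted_index_pairs(dists):
    """Index pairs ordered by ascending distance (ties broken by index)."""
    return sorted(dists, key=lambda pair: (dists[pair], pair))


points = np.array([[0.0, 0.0], [3.0, 4.0], [9.0, 12.0]])
dists = pairwise_sq_distances_by_rotation(points)
pairs = sorted_index_pairs(dists)
```

Sorting the resulting distances gives a list of index pairs, which is the kind of input the clustering procedures in Sections 3.1 and 3.2 take.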
3.1. Single Linkage
Algorithm 3: Clustering via single linkage
Input: : a list of sorted index pairs, number of data points, : desired number of clusters
Output: : a list of clusters
1:
2:
3:
4: while do
5:
6:
7:
8:     if then
9:
10:         append to
11:
12:
13:
14:         empty
15:         empty
16:
17:
18:     end if
19:
20: end while
21: initialize as an empty list
22: for each in do
23:     if is not empty then
24:         append to
25:     end if
26: end for
27: return
28: function
29:     while do
30:
31:     end while
32:     return
33: end function
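Single linkage over a list of index pairs sorted by ascending distance can be sketched in plaintext Python with a union-find structure. This is an illustrative sketch rather than a reconstruction of Algorithm 3: the names `find_root`, `single_linkage`, `parent`, and `members` are hypothetical, and the sketch assumes the pair list covers all pairs and is sorted from nearest to farthest.

```python
def find_root(parent, i):
    """Follow parent links until reaching a cluster representative."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path halving keeps chains short
        i = parent[i]
    return i


def single_linkage(sorted_pairs, n, k):
    """Merge clusters along the nearest pairs until k clusters remain.

    Assumption: sorted_pairs lists index pairs (i, j) in ascending
    order of distance; n is the number of points.
    """
    parent = list(range(n))
    members = [[i] for i in range(n)]  # members[r]: points whose root is r
    num_clusters = n
    for i, j in sorted_pairs:
        if num_clusters <= k:
            break
        root_i = find_root(parent, i)
        root_j = find_root(parent, j)
        if root_i != root_j:
            # The nearest pair spanning two clusters joins them.
            parent[root_j] = root_i
            members[root_i].extend(members[root_j])
            members[root_j] = []
            num_clusters -= 1
    return [sorted(cluster) for cluster in members if cluster]


# Points at x = 0, 1, 3, 7 on a line, pairs sorted by distance.
pairs = [(0, 1), (1, 2), (0, 2), (2, 3), (1, 3), (0, 3)]
two_clusters = single_linkage(pairs, n=4, k=2)
```

Because the pairs arrive in ascending order, the first pair whose endpoints lie in different clusters is always the closest cross-cluster pair, which is exactly the single-linkage merge criterion.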
3.2. Complete Linkage
Algorithm 4: Clustering via complete linkage
Input: : a list of sorted index pairs, number of data points, : desired number of clusters
Output: : a list of clusters
1:
2:
3: while do
4:
5:
6:     append to
7:     empty
8:     empty
9:
10:
11:
12:
13: end while
14: initialize as an empty list
15: for each in do
16:     if is not empty then
17:         append to
18:     end if
19: end for
20: return
21: function
22:
23:     length of list
24:     if then
25:
26:         return
27:     else
28:         initialize as an empty list
29:         initialize as a list of s with length
30:         for to do
31:
32:             if or then
33:                 if then
34:                     append to
35:                 else
36:
37:
38:                 end if
39:             else if or then
40:                 if then
41:                     append to
42:                 else
43:
44:
45:                 end if
46:             end if
47:         end for
48:         initialize as an empty list
49:         for to do
50:             if not in then
51:                 append to
52:             end if
53:         end for
54:         return
55:     end if
56: end function
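Complete linkage can also be driven by a list of index pairs sorted by ascending distance. The plaintext Python sketch below uses a pair-counting technique that is swapped in for illustration and is not claimed to match Algorithm 4: two clusters are merged at the moment every cross pair between them has been seen, because the pair that completes the count is their farthest pair. The names `complete_linkage`, `label`, `members`, and `seen` are hypothetical, and the sketch assumes the list contains every pair exactly once.

```python
def complete_linkage(sorted_pairs, n, k):
    """Merge clusters by smallest farthest-pair distance until k remain.

    Assumption: sorted_pairs lists every index pair (i, j) once, in
    ascending order of distance; n is the number of points.
    """
    label = list(range(n))                # label[p]: cluster id of point p
    members = {i: [i] for i in range(n)}  # cluster id -> its points
    seen = {}  # frozenset({a, b}) -> cross pairs seen between a and b
    for i, j in sorted_pairs:
        if len(members) <= k:
            break
        a, b = label[i], label[j]
        if a == b:
            continue  # guard only; a complete pair list never hits this
        key = frozenset((a, b))
        seen[key] = seen.get(key, 0) + 1
        if seen[key] == len(members[a]) * len(members[b]):
            # All cross pairs seen: this pair is the farthest one, and
            # no other two clusters were completed earlier, so merge.
            del seen[key]
            for c in members:
                if c in (a, b):
                    continue
                moved = seen.pop(frozenset((b, c)), 0)
                if moved:
                    # Counts against b carry over to the merged cluster.
                    merged_key = frozenset((a, c))
                    seen[merged_key] = seen.get(merged_key, 0) + moved
            for p in members[b]:
                label[p] = a
            members[a].extend(members.pop(b))
    return [sorted(cluster) for cluster in members.values()]


# Points at x = 0, 3, 5, 8.5 on a line, pairs sorted by distance.
pairs = [(1, 2), (0, 1), (2, 3), (0, 2), (1, 3), (0, 3)]
two_clusters = complete_linkage(pairs, n=4, k=2)
```

In this example points 1 and 2 merge first; point 0 then joins them only once both (0, 1) and (0, 2) have been processed, that is, at the distance of the farthest pair between the two clusters.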
4. Implementation
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Sokhonn, L.; Park, Y.-S.; Lee, M.-K. Hierarchical Clustering via Single and Complete Linkage Using Fully Homomorphic Encryption. Sensors 2024, 24, 4826. https://doi.org/10.3390/s24154826