Privacy Preserving Machine Learning

A special issue of Algorithms (ISSN 1999-4893). This special issue belongs to the section "Evolutionary Algorithms and Machine Learning".

Deadline for manuscript submissions: closed (15 November 2022) | Viewed by 12162

Special Issue Editor


Prof. Dr. Quan Qian
Guest Editor
Department of Machine Intelligence, School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
Interests: machine learning; materials informatics; AI-related engineering applications; data security; privacy-preserving machine learning

Special Issue Information

Dear Colleagues,

We invite you to submit your latest research in the area of “Privacy Preserving Machine Learning (PPML)” to this Special Issue. We are looking for new and innovative approaches to solving security and privacy problems in machine learning. High-quality papers are solicited that address both theoretical and practical issues of state-of-the-art research, related challenges, and the roadmap for future research in the PPML area. Potential topics include, but are not limited to, adversarial attacks against datasets and algorithms, secure and privacy-preserving algorithms, and PPML-related applications. Your contributions will benefit multiple research communities, including machine learning, distributed systems, information security, and privacy protection.

Prof. Dr. Quan Qian
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • privacy attacks
  • attacks against the dataset
  • attacks against the algorithm
  • secure and privacy preserving methods
  • privacy preserving data publishing
  • privacy preserving synthetic data generation
  • privacy preserving machine learning
  • federated machine learning
  • differential privacy
  • homomorphic encryption
  • secure multi-party computation
  • hardware security implementation
  • privacy issues in real applications.

Published Papers (4 papers)


Research

12 pages, 584 KiB  
Article
Generating Higher-Fidelity Synthetic Datasets with Privacy Guarantees
by Aleksei Triastcyn and Boi Faltings
Algorithms 2022, 15(7), 232; https://doi.org/10.3390/a15070232 - 1 Jul 2022
Cited by 2 | Viewed by 1583
Abstract
We consider the problem of enhancing user privacy in common data analysis and machine learning development tasks, such as data annotation and inspection, by substituting the real data with samples from a generative adversarial network. We propose employing Bayesian differential privacy as the means to achieve a rigorous theoretical guarantee while providing a better privacy-utility trade-off. We demonstrate experimentally that our approach produces higher-fidelity samples compared to prior work, allowing us to (1) detect more subtle data errors and biases, and (2) reduce the need for real data labelling by achieving high accuracy when training directly on artificial samples.
(This article belongs to the Special Issue Privacy Preserving Machine Learning)
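As background, the clip-and-noise gradient step that underlies differentially private training of generative models can be sketched as follows. This is only an illustrative NumPy sketch under assumed parameters: the clipping norm, noise multiplier, and function name are hypothetical, and the Bayesian differential privacy accounting that the paper actually proposes is not reproduced here.

```python
import numpy as np

def dp_gradient_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One clip-and-noise gradient update in the style of differentially private SGD."""
    rng = rng or np.random.default_rng()
    # Clip each per-example gradient so its L2 norm is at most clip_norm (bounds sensitivity).
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Add Gaussian noise calibrated to the clipping norm, then average over the batch.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=per_example_grads.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_example_grads)

# Toy usage: a batch of 32 per-example gradients over 10 parameters.
grads = np.random.default_rng(0).normal(size=(32, 10))
print(dp_gradient_step(grads))
```

A privacy accountant (such as the Bayesian differential privacy accountant used in the paper) would then track the cumulative privacy loss over all such updates, for example to the GAN discriminator.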

18 pages, 679 KiB  
Article
Privacy-Preserving Feature Selection with Fully Homomorphic Encryption
by Shinji Ono, Jun Takata, Masaharu Kataoka, Tomohiro I, Kilho Shin and Hiroshi Sakamoto
Algorithms 2022, 15(7), 229; https://doi.org/10.3390/a15070229 - 30 Jun 2022
Viewed by 1919
Abstract
For the feature selection problem, we propose an efficient privacy-preserving algorithm. Let D, F, and C be data, feature, and class sets, respectively, where the feature value x(F_i) and the class label x(C) are given for each x ∈ D and F_i ∈ F. For a triple (D, F, C), the feature selection problem is to find a consistent and minimal subset F′ ⊆ F, where ‘consistent’ means that, for any x, y ∈ D, x(C) = y(C) if x(F_i) = y(F_i) for all F_i ∈ F′, and ‘minimal’ means that any proper subset of F′ is no longer consistent. On distributed datasets, we consider feature selection as a privacy-preserving problem: assume that semi-honest parties A and B have their own personal datasets D_A and D_B. The goal is to solve the feature selection problem for D_A ∪ D_B without sacrificing their privacy. In this paper, we propose a secure and efficient algorithm based on fully homomorphic encryption, and we implement our algorithm to show its effectiveness on various practical datasets. The proposed algorithm is the first that can directly simulate the CWC (Combination of Weakest Components) algorithm on ciphertext, CWC being one of the best performers for the feature selection problem on plaintext.
(This article belongs to the Special Issue Privacy Preserving Machine Learning)
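To make the consistency criterion from the abstract concrete, the following plaintext sketch finds a consistent and minimal feature subset by greedy backward elimination. It is illustrative only: the function names are hypothetical, and it does not reproduce CWC or the fully homomorphic encryption protocol that is the paper's actual contribution.

```python
def is_consistent(data, labels, features):
    """A feature subset is consistent if no two samples agree on all selected
    features yet carry different class labels."""
    seen = {}
    for row, label in zip(data, labels):
        key = tuple(row[f] for f in features)
        if key in seen and seen[key] != label:
            return False
        seen[key] = label
    return True

def minimal_consistent_subset(data, labels):
    """Greedy backward elimination: drop a feature whenever the remaining set
    stays consistent, yielding a minimal (not necessarily minimum) subset."""
    selected = list(range(len(data[0])))
    for f in list(selected):
        candidate = [g for g in selected if g != f]
        if is_consistent(data, labels, candidate):
            selected = candidate
    return selected

# Toy example: feature 0 alone already determines the label.
X = [[0, 1], [0, 0], [1, 1], [1, 0]]
y = [0, 0, 1, 1]
print(minimal_consistent_subset(X, y))  # -> [0]
```

The privacy-preserving version in the paper evaluates comparisons such as these on ciphertexts, so neither party ever sees the other's feature values in the clear.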

15 pages, 567 KiB  
Article
MAC Address Anonymization for Crowd Counting
by Jean-François Determe, Sophia Azzagnuni, François Horlin and Philippe De Doncker
Algorithms 2022, 15(5), 135; https://doi.org/10.3390/a15050135 - 20 Apr 2022
Viewed by 2044
Abstract
Research has shown that counting WiFi packets called probe requests (PRs) implicitly provides a proxy for the number of people in an area. In this paper, we discuss a crowd counting system involving WiFi sensors detecting PRs over the air, then extracting and anonymizing their media access control (MAC) addresses using a hash-based approach. This paper discusses an anonymization procedure and shows time-synchronization inaccuracies among sensors and hashing collision rates to be low enough to prevent anonymization from interfering with counting algorithms. In particular, we derive an approximation of the collision rate of uniformly distributed identifiers, with analytical error bounds.
(This article belongs to the Special Issue Privacy Preserving Machine Learning)
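For illustration, a salted-hash anonymization step and the standard birthday-bound approximation of the collision rate can be sketched as follows. This is a hedged sketch: the salt handling, digest truncation length, and function names are assumptions rather than the exact procedure used by the authors, whose paper also derives error bounds for the approximation.

```python
import hashlib
import math
import secrets

SALT = secrets.token_bytes(16)   # in a real deployment the salt would be rotated periodically
ID_BITS = 64                     # illustrative length of the anonymized identifier

def anonymize_mac(mac: str, salt: bytes = SALT, bits: int = ID_BITS) -> int:
    """Map a MAC address to a truncated, salted SHA-256 digest."""
    digest = hashlib.sha256(salt + mac.encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)

def collision_probability(n: int, bits: int = ID_BITS) -> float:
    """Birthday-bound approximation: probability that at least two of n
    uniformly distributed identifiers collide in a space of size 2**bits."""
    space = 2.0 ** bits
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))

print(hex(anonymize_mac("aa:bb:cc:dd:ee:ff")))
print(collision_probability(10_000))  # negligible for 64-bit identifiers
```

With identifiers this long, collisions remain rare even for tens of thousands of devices, which is why hashing does not noticeably distort the resulting counts.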

14 pages, 694 KiB  
Article
Federated Learning for Intrusion Detection in the Critical Infrastructures: Vertically Partitioned Data Use Case
by Evgenia Novikova, Elena Doynikova and Sergey Golubev
Algorithms 2022, 15(4), 104; https://doi.org/10.3390/a15040104 - 23 Mar 2022
Cited by 10 | Viewed by 5530
Abstract
One of the challenges in Internet of Things systems is the security of critical data, for example, data used for intrusion detection. This paper investigates the construction of an intrusion detection system that ensures the confidentiality of critical data at a given level of intrusion detection accuracy. For this goal, federated learning is used to train an intrusion detection model. Federated learning is a computational model for distributed machine learning that allows different collaborating entities to train one global model without sharing data. This paper considers the case when entities have data that differ in their attributes. The authors believe this is a common situation for critical systems constructed using Internet of Things (IoT) technology, where industrial objects are monitored by different sets of sensors. To evaluate the applicability of federated learning to this case, the authors developed an approach and an architecture for an intrusion detection system for vertically partitioned data that follow the principles of federated learning, and conducted a series of experiments. To model vertically partitioned data, the authors used the Secure Water Treatment (SWaT) dataset, which describes the functioning of a water treatment facility. The conducted experiments demonstrate that the accuracy of the intrusion detection model trained using federated learning is comparable to that of a model trained using centralized machine learning. However, the computational efficiency of the learning and inference process is currently extremely low. This is explained by the application of homomorphic encryption to protect the input data coming from different data owners or data sources. This motivates the development of techniques for generating attributes that could model horizontally partitioned data even when the collaborating entities hold datasets that differ in their attributes.
(This article belongs to the Special Issue Privacy Preserving Machine Learning)
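As a rough illustration of inference over vertically partitioned data, each party can compute a partial score from its own feature columns and a coordinator can aggregate the contributions. The sketch below is plaintext and uses a simple linear model; the class names, weights, and aggregation are hypothetical assumptions, and the homomorphic encryption the authors apply to protect the exchanged values (and their actual detection model) is not reproduced here.

```python
import numpy as np

class VerticalParty:
    """Holds a disjoint subset of the feature columns and its local model weights."""
    def __init__(self, features: np.ndarray, weights: np.ndarray):
        self.features = features   # shape (n_samples, n_local_features)
        self.weights = weights     # shape (n_local_features,)

    def partial_logit(self, idx: int) -> float:
        # In the encrypted protocol this value would be exchanged as a ciphertext.
        return float(self.features[idx] @ self.weights)

def federated_score(parties, idx, bias=0.0):
    """Coordinator sums the partial logits and maps them to an anomaly score."""
    z = bias + sum(p.partial_logit(idx) for p in parties)
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid -> probability of intrusion

# Toy usage: two parties monitor the same 5 samples with different sensor sets.
rng = np.random.default_rng(0)
party_a = VerticalParty(rng.normal(size=(5, 3)), rng.normal(size=3))
party_b = VerticalParty(rng.normal(size=(5, 2)), rng.normal(size=2))
print(federated_score([party_a, party_b], idx=0))
```

Encrypting the partial logits before aggregation is what keeps each party's sensor readings confidential, and it is also the source of the computational overhead the abstract reports.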
