Peer-Review Record

Differential Privacy High-Dimensional Data Publishing Based on Feature Selection and Clustering

Electronics 2023, 12(9), 1959; https://doi.org/10.3390/electronics12091959
by Zhiguang Chu 1,2, Jingsha He 1, Xiaolei Zhang 2, Xing Zhang 2 and Nafei Zhu 1,*
Submission received: 13 March 2023 / Revised: 17 April 2023 / Accepted: 19 April 2023 / Published: 23 April 2023

Round 1

Reviewer 1 Report

The submitted paper presents "Differential privacy high-dimensional data publishing based on feature selection and clustering". Although the paper might have some novelties, some points need clarification:

1) You have not cited any references for the exponential and Laplace mechanisms.

2) Figures and tables have no captions.

3) No results other than accuracy are reported.

4) How many models (base learners) did you train before aggregation in the random forest?

5) What is the split between the cross-validation, test, and training sets?

6) How did you optimise the random forest parameters, and which parameters did you optimise?
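
For readers unfamiliar with the two mechanisms named in point 1, the following is a minimal sketch of their textbook definitions. It is not taken from the manuscript: the function names, parameters, and the score dictionary are hypothetical, and the authors' implementation may differ.

```python
import math
import random


def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Return true_value plus Laplace noise with scale sensitivity / epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scale = sensitivity / epsilon
    # The difference of two independent Exp(1) draws follows Laplace(0, 1).
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_value + noise


def exponential_mechanism(scores, sensitivity, epsilon, rng=random):
    """Pick a key of `scores` with probability proportional to
    exp(epsilon * score / (2 * sensitivity))."""
    if not scores:
        raise ValueError("scores must not be empty")
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    candidates = list(scores)
    best = max(scores.values())
    # Subtracting the best score avoids overflow; the probabilities are unchanged.
    weights = [
        math.exp(epsilon * (scores[c] - best) / (2.0 * sensitivity))
        for c in candidates
    ]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

A smaller `epsilon` gives a larger noise scale in the first function and a flatter selection distribution in the second, which is the privacy and utility trade-off the review asks the authors to document.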

 

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

This work introduces an innovative hybrid feature selection and clustering algorithm aimed at overcoming the challenges associated with high-dimensional data publishing. The proposed method efficiently removes irrelevant and weakly correlated features, enhancing the time efficiency and accuracy of feature selection. The rapid correlation feature clustering approach lowers computational cost and eliminates the need to predetermine the number of clusters. Moreover, the adaptive privacy budget allocation strategy strikes a balance between data utility and privacy in the published data.

 

 

Overall, while the authors try to make a contribution, they are encouraged to address the following issues before the paper is accepted for publication.

 

 

**Point**:

Would it be beneficial to incorporate the overview of various research studies on random forest algorithms, feature selection, and clustering techniques into the introduction section to provide a broader context and background on the existing methods and their applications?

 

 

**Point**:

Are there any parameter tuning and sensitivity analysis performed during the development of the algorithm that could be discussed to provide insight into its robustness and performance under various conditions?

 

**Point**:

Would the authors please provide the data and code necessary to reproduce the results presented in the paper for review?

 

 

**Point**:

Can the authors please include comparisons with other state-of-the-art privacy-preserving feature selection methods to provide a more comprehensive evaluation of the proposed algorithm's performance?

 

**Point**:

Can the authors provide a more critical analysis of differential privacy methods and their potential limitations, as well as discuss the challenges and drawbacks of using symmetric uncertainty and random forest in the context of privacy preservation?
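
As background for the last point, symmetric uncertainty is commonly defined as SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), where H is Shannon entropy and I is mutual information. The sketch below computes it for discrete features under that standard definition. It is an illustration only: the function names are hypothetical, the review record does not show the authors' formula, and returning 0 for two constant features is a convention assumed here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    h_x = entropy(x)
    h_y = entropy(y)
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    if h_x + h_y == 0:
        # Both features are constant; assumed convention: no shared information.
        return 0.0
    mutual_information = h_x + h_y - h_xy
    return 2.0 * mutual_information / (h_x + h_y)
```

Because the estimate relies on empirical counts, it can be unstable for features with many distinct values and few samples, which is one of the drawbacks the reviewer asks the authors to discuss.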

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The comments have been addressed. The manuscript can now be accepted.

Author Response

Thank you for your positive comment. The article has been further refined based on your suggestions. Thank you again.

Reviewer 2 Report

I appreciate the authors' diligent efforts in addressing the issues and concerns raised during the previous review. The revisions and additions have significantly improved the quality of the paper. However, I would like to kindly request further information to ensure the reproducibility and scientific rigor of the research.

 

While the authors have provided scripts for the proposed algorithms, it would be beneficial if they could also share the following information:

 

**Point**: Could the authors please provide the complete code used to generate the results presented in the manuscript? This would enable the reviewer to verify the reproducibility of the study and the rigor of the research.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 3

Reviewer 2 Report

Thank you very much for promptly providing the code functions and dataset as requested.

 

I attempted to use the code and a portion of the data to replicate the results presented in your paper. At this time, I have not been able to fully verify the conversion due to some challenges in adapting the dataset conversion process to my computer.

However, I did observe that the 30-feature data have been provided as described in your work, and that the relevant functions and packages are used.

I feel more confident in the scientific rigor of your research.

I recommend accepting your work so that the editorial office can proceed with further processing your submission. Thank you again.
