Article
Peer-Review Record

Intelligent Scene-Adaptive Desensitization: A Machine Learning Approach for Dynamic Data Privacy in Virtual Power Plants

Electronics 2024, 13(6), 1051; https://doi.org/10.3390/electronics13061051
by Ruxia Yang 1,2,*, Hongchao Gao 3, Fangyuan Si 3 and Jun Wang 4
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3:
Submission received: 8 January 2024 / Revised: 5 March 2024 / Accepted: 6 March 2024 / Published: 12 March 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

I read the article "Intelligent Scene-Adaptive Desensitization: A Machine Learning Approach for Dynamic Data Privacy in Virtual Power Plants" with interest and I have the following comments to make:

1. Of the 24 bibliographic references presented at the end of the article, only references [1], [2], [3], [10] and [14] are cited and used in the article. Under these conditions, I propose to the authors to eliminate the other unused references in the article.

2. The information from Table 2 is not presented and discussed in the article. I propose to the authors to remove this table from the article in this case.

3. The notations used in equations (1), (2), (3), (4) and (10) are not explained.

4. In Algorithm 1, the notations Θ, X are not explained.

 

Author Response

  1. Comment: Of the 24 bibliographic references presented at the end of the article, only references [1], [2], [3], [10] and [14] are cited and used in the article. Under these conditions, I propose to the authors to eliminate the other unused references in the article.

Response:

Thank you for your comment.

We regret the oversight encountered during the compilation of our paper. To rectify this, we have updated and accurately annotated the references in line with the most current information available.

 

  2. Comment: The information from Table 2 is not presented and discussed in the article. I propose to the authors to remove this table from the article in this case.

Response:

Thank you for your comment.

The section in question details the inclusion of specific scenarios within our database. We have ensured this information is clearly annotated within the text for clarity and precision.

 

  3. Comment: The notations used in equations (1), (2), (3), (4) and (10) are not explained.

Response:

Thank you for your comment.

We have undertaken a comprehensive review and expansion of the explanations surrounding our formulas, providing a more in-depth and clear understanding of these mathematical components.

 

  4. Comment: In Algorithm 1, the notations Θ, X are not explained.

Response:

Thank you for your comment.

We have addressed the previously missing symbols by providing detailed supplemental explanations, ensuring the completeness and coherence of the mathematical expressions and equations within our paper.

 

 

 

 

 

Reviewer 2 Report

Comments and Suggestions for Authors

The research is current. The authors really have a good result, which, unfortunately, was presented very poorly.

1) The introduction does not explain what the research problem is and why it is relevant

2) I recommend making a separate section for analyzing research in the subject area. Now this is part of the introduction. And significantly expand the scope of the literature review

3) The brief descriptions of the sections in the introduction are not accurate.

4) The method used for the research is very poorly described.

5) It is not explained why and how the proposed method compares with existing ones

6) The calculation results are not sufficiently explained

7) Figures 6 and 7 are not sufficiently explained; in addition, the graphs in figure 7 need to be enlarged since it is impossible to see them

8) The conclusions are very superficial and do not reflect the real result of the study

Author Response

  1. Comment: The introduction does not explain what the research problem is and why it is relevant

Response:

Thank you for your comment.

In order to better guide readers through our work, we have expanded upon the introduction section, providing a more comprehensive foundation for understanding the subsequent chapters.

 

  2. Comment: I recommend making a separate section for analyzing research in the subject area. Now this is part of the introduction. And significantly expand the scope of the literature review

Response:

Thank you for your comment.

To elucidate the state of research in this field, we have segregated 'Related Work' into a standalone chapter, coupled with an updated and more extensive literature review.

 

  3. Comment: The brief descriptions of the sections in the introduction are not accurate.

Response:

Thank you for your comment.

The introductory segment now includes a more detailed narrative, ensuring a clearer context and rationale for the study.

 

  4. Comment: The method used for the research is very poorly described.

Response:

Thank you for your comment.

Previously simplified sections have been elaborately revisited and expanded, particularly within Chapters 5 and 6, enriching the depth of our methodological explanations.

 

  5. Comment: It is not explained why and how the proposed method compares with existing ones

Response:

Thank you for your comment.

In Chapters 2 "Related Work" and 7 "System Design and Evaluation," we have incorporated additional information, drawing comparisons between our approach and existing methodologies in the field.

 

  6. Comment: The calculation results are not sufficiently explained

Response:

Thank you for your comment.

We have meticulously re-evaluated our computational work and incorporated further explanations, thereby enhancing the interpretative value of our experimental findings.

 

  7. Comment: Figures 6 and 7 are not sufficiently explained; in addition, the graphs in figure 7 need to be enlarged since it is impossible to see them

Response:

Thank you for your comment.

Figures 6 and 7 have received extended descriptions to more effectively convey our experimental results. Additionally, we've adjusted the size of images throughout the document for improved visual consistency.

 

  8. Comment: The conclusions are very superficial and do not reflect the real result of the study

Response:

Thank you for your comment.

The conclusion has been augmented to more accurately reflect the outcomes of our study, aiming to provide a comprehensive summary of our research findings and implications.

Reviewer 3 Report

Comments and Suggestions for Authors

The paper entitled "Intelligent Scene-Adaptive Desensitization: A Machine Learning Approach for Dynamic Data Privacy in Virtual Power Plants" presents a dynamic desensitization method for Virtual Power Plants (VPPs), using machine learning for adaptive scene recognition to adjust data privacy levels intelligently. A novel similarity utility function and a Gaussian processes-based differential privacy algorithm are introduced, achieving an 87.5% accuracy in scene recognition. These contributions provide a nuanced approach to data protection, addressing the specific needs of complex VPP environments.

Nevertheless, the paper seems to rely on a single source of data from a provincial State Grid Corporation, which may limit the generalizability of the findings. There is also a lack of information on how the machine learning models were validated beyond the accuracy rate, such as precision, recall, or F1-score. Moreover, it is not clear if the impact of different privacy budgets on actual data utility was considered beyond theoretical accuracy measurements. 
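As a point of reference for the metrics named above, the following minimal Python sketch computes precision, recall, and F1-score for one class from true and predicted scene labels. The labels and the function name `precision_recall_f1` are illustrative assumptions; they are not taken from the manuscript or its dataset.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for the class named `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical scene labels, for illustration only.
y_true = ["generation", "generation", "consumption", "consumption", "generation"]
y_pred = ["generation", "consumption", "consumption", "generation", "generation"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="generation")
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Reporting these per class alongside accuracy shows whether a high overall accuracy hides poor performance on a less frequent scene.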

I have the following comments and suggestions for the authors to improve the paper:

1. The methods, particularly the machine learning techniques and the Gaussian processes-based differential privacy algorithm, are well-documented. However, the paper lacks a detailed discussion on the limitations of the proposed models and potential biases that could affect the findings.

2. While the machine learning approach is well within the capacity to assess, the specific nuances of Gaussian processes in differential privacy might require further expertise for a comprehensive evaluation.

3. The authors might want to incorporate data from multiple sources to enhance the robustness of the findings.

4. Expanding on the validation of the machine learning model with additional metrics might be useful.

5. The authors might want to evaluate the practical impact of different privacy budgets on data utility in real-world scenarios.

6. A section discussing the limitations of the study and potential biases needs to be included.

7. It would be beneficial to compare the proposed methods with existing state-of-the-art techniques to showcase the relative advantages or disadvantages.

8. Enhancing the robustness of the data and providing a thorough comparative analysis would significantly strengthen the paper's contribution to the field.
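To make comment 5 concrete, the sketch below shows one way the effect of the privacy budget on data utility could be measured empirically. It uses the standard Laplace mechanism as a stand-in, not the Gaussian processes-based algorithm of the manuscript, and the load values, sensitivity, and function names are assumptions chosen for illustration.

```python
import random
from typing import List


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mean_abs_error(
    values: List[float], epsilon: float, sensitivity: float, rng: random.Random
) -> float:
    """Perturb each value with the Laplace mechanism and report the mean absolute error."""
    scale = sensitivity / epsilon  # smaller privacy budget -> more noise
    noisy = [v + laplace_noise(scale, rng) for v in values]
    return sum(abs(n - v) for n, v in zip(noisy, values)) / len(values)


rng = random.Random(0)           # fixed seed so the run is repeatable
loads_kwh = [12.0] * 5000        # hypothetical load readings

error_strict = mean_abs_error(loads_kwh, epsilon=0.1, sensitivity=1.0, rng=rng)
error_loose = mean_abs_error(loads_kwh, epsilon=1.0, sensitivity=1.0, rng=rng)
```

Under these assumptions the mean absolute error is close to `sensitivity / epsilon`, so a tenfold smaller budget costs roughly tenfold more error; a utility study on real data would replace the error measure with the downstream task of interest.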

Comments on the Quality of English Language

Minor revisions/proofreading of English might be required

Author Response

  1. Comment: The methods, particularly the machine learning techniques and the Gaussian processes-based differential privacy algorithm, are well-documented. However, the paper lacks a detailed discussion on the limitations of the proposed models and potential biases that could affect the findings.

Response:

Thank you for your comment.

In our approach, we primarily focus on adaptive scene recognition for dynamic desensitization. Chapter 7 now includes a thorough discussion on the merits and drawbacks of our method, alongside an exploration of the model's limitations and potential biases.

 

  2. Comment: While the machine learning approach is well within the capacity to assess, the specific nuances of Gaussian processes in differential privacy might require further expertise for a comprehensive evaluation.

Response:

Thank you for your comment.

Supplementary experiments have been integrated into Chapter 7, laying the groundwork for future, more extensive explorations.

 

  3. Comment: The authors might want to incorporate data from multiple sources to enhance the robustness of the findings.

Response:

Thank you for your comment.

Our model's training and evaluation are grounded in data from the State Grid Corporation of China, a major player in China's power system, serving a vast population and overseeing extensive power operations. The diversity and breadth of the State Grid's dataset enrich our research, offering a comprehensive view of user behaviors, electricity patterns, and generation scenarios. Despite potential limitations of a single data source, the unique scope and variety of this dataset substantially offset these concerns, enhancing the model's applicability and generalizability. Detailed information about this data source is also provided in Chapter 7.

 

  4. Comment: Expanding on the validation of the machine learning model with additional metrics might be useful.

Response:

Thank you for your comment.

Chapter 7 includes additional experiments, bolstering our research findings.

 

  5. Comment: The authors might want to evaluate the practical impact of different privacy budgets on data utility in real-world scenarios.

Response:

Thank you for your comment.

Related experiments in Chapter 7 have been expanded with more descriptive details to facilitate easier understanding and readability.

 

  6. Comment: A section discussing the limitations of the study and potential biases needs to be included.

Response:

Thank you for your comment.

A critical examination of the research limitations and potential biases is now part of Chapter 7's discourse.

 

  7. Comment: It would be beneficial to compare the proposed methods with existing state-of-the-art techniques to showcase the relative advantages or disadvantages.

Response:

Thank you for your comment.

Comparative discussions on the technological strengths and weaknesses have been incorporated into Chapters 2 and 7.

 

  8. Comment: Enhancing the robustness of the data and providing a thorough comparative analysis would significantly strengthen the paper's contribution to the field.

Response:

Thank you for your comment.

The dataset has received a more comprehensive description, confirming its robustness and suitability for theoretical analysis. A detailed explanation regarding this aspect has been provided in response to comment 3.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I have read the new improved version of the article "Intelligent Scene-Adaptive Desensitization: A Machine Learning Approach for Dynamic Data Privacy in Virtual Power Plants" and I can say that:

1. The article has been corrected and completed, which leads to an increase in its quality.

2. There is still a problem to be solved related to the use of bibliographic references: references [13] and [18] are not used in the paper. For this reason, I propose their removal from the list of bibliographic references.

Author Response

Thank you for your feedback! We have reorganized the references once again.

Reviewer 2 Report

Comments and Suggestions for Authors

The authors made some changes that improved the paper. However, there were still many problems that were not resolved:
1) In the literature review, the authors refer to articles that cannot be found according to the bibliographic descriptions provided. Therefore, it is impossible to verify the reliability of the analysis of previous studies and related works
2) Section 3. Basic background knowledge contains a description of known methods. There are no links to sources describing these methods. There is also no decoding of the symbols used in the formulas. It is unclear why the authors additionally introduced these descriptions since there is no explanation of how they plan to use them in this study.
3) Figure 2 is not clear and requires a more detailed explanation. The phrase “Our focus extends to the analysis of business service data flow in virtual power plants, segregating it into two pivotal scenarios: power generation and consumption, as shown in Figure 2” does not make the understanding any clearer.
4) It is not clear how the recognition accuracy shown in Figure 4 was measured or calculated. It is also unclear in what units the amount of data is measured
5) It is not clear by what methods additional data were synthesized. How justified is the use of such methods?
6) It would be better to explain how and why the methods are combined in the Scenario Classification Model. Why does Table 3. Comparison of Scenario Classification Model Performance compare exactly the scenarios given, and not some others?
7) High accuracy for all algorithms for different industry types in Figure 5. Comparison of load recognition accuracy under different algorithms suggests that the initial data for analysis was prepared incorrectly. If this statement is wrong, please justify the contrary.
8) All graphs in Figure 7. Differential privacy anonymization capability under different virtual power plant business scenarios have absolutely the same axes, scales, and legend. It is not clear why the lines on them are different. Perhaps some designations that should distinguish these cases are missing.
9) It is not clear from the article why the authors call the proposed method adaptive

Author Response

  1. Comment: In the literature review, the authors refer to articles that cannot be found according to the bibliographic descriptions provided. Therefore, it is impossible to verify the reliability of the analysis of previous studies and related works

Response:

Thank you for your comment.

We have meticulously reviewed and organized the bibliography, incorporating your feedback. Should there remain any issues, please specify the particular reference in question.

 

  2. Comment: Section 3. Basic background knowledge contains a description of known methods. There are no links to sources describing these methods. There is also no decoding of the symbols used in the formulas. It is unclear why the authors additionally introduced these descriptions since there is no explanation of how they plan to use them in this study.

Response:

Thank you for your comment.

Following your recommendations, we enriched the methodological section by incorporating additional literature, offering detailed descriptions and explanations for each symbol and formula encountered. These elucidations are intimately linked to our study, shedding light on the theoretical underpinnings and practical applications of the algorithms employed.

 

  3. Comment: Figure 2 is not clear and requires a more detailed explanation. The phrase “Our focus extends to the analysis of business service data flow in virtual power plants, segregating it into two pivotal scenarios: power generation and consumption, as shown in Figure 2” does not make the understanding any clearer.

Response:

Thank you for your comment.

Virtual power plants are conceptualized as holistic platforms that optimize and control dispersed energy resources to enhance energy efficiency. Our analytical framework distinctly categorizes the application scenarios of business services into power generation and consumption, facilitating a deeper comprehension of how virtual power plants navigate the management of various private data types. This bifurcation enables more precise classification during feature extraction and scenario discrimination, with subsequent adjustments made to the original manuscript to reflect these insights.

 

  4. Comment: It is not clear how the recognition accuracy shown in Figure 4 was measured or calculated. It is also unclear in what units the amount of data is measured

Response:

Thank you for your comment.

The figure displays three curves, each representing the recognition accuracy of model training after K-means clustering with real data, synthetic data, and mixed data, respectively. All curves exhibit an upward trend in recognition accuracy as the volume of data increases, stabilizing after a certain point and ultimately approaching a 97% accuracy rate. This indicates that the model performs consistently across different datasets, and the use of synthetic data to augment the training set is effective in this scenario.

  5. Comment: It is not clear by what methods additional data were synthesized. How justified is the use of such methods?

Response:

Thank you for your comment.

The electrical data for the expanded dataset is primarily synthesized through interpolation, a method used to estimate the values of unknown points within a given set of data points. Interpolation is utilized for dataset expansion to generate new samples between known data points, making it suitable for scenarios involving the statistical analysis of electricity usage. Categorical data, such as names and numbers, are expanded through random generation, serving merely to differentiate between datasets. As demonstrated in the figure, when equal amounts of experimental data are selected, there are no significant differences between real data, synthetic data, and mixed data. This enhances the model's training accuracy and generalization ability, indicating that synthetic data can be used as a substitute for real data in subsequent experiments.
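The interpolation step described above can be illustrated with a minimal Python sketch. It assumes simple linear interpolation at the midpoint between consecutive readings; the response does not state which interpolation scheme was used, and the function name and sample values are hypothetical.

```python
from typing import List


def interpolate_midpoints(readings: List[float]) -> List[float]:
    """Expand a series by inserting the linear midpoint between each pair of neighbours."""
    if len(readings) < 2:
        return list(readings)  # nothing to interpolate between
    expanded: List[float] = []
    for left, right in zip(readings, readings[1:]):
        expanded.append(left)
        expanded.append((left + right) / 2.0)  # synthetic sample between two real ones
    expanded.append(readings[-1])
    return expanded


# Hypothetical hourly load readings in kWh.
hourly_load_kwh = [10.0, 14.0, 12.0]
synthetic_series = interpolate_midpoints(hourly_load_kwh)
```

A series of n real readings becomes 2n - 1 points. Because every synthetic point lies on a line between two real ones, the expanded set adds no variability beyond the original data, which is worth keeping in mind when judging how far synthetic data can substitute for real data.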

 

  6. Comment: It would be better to explain how and why the methods are combined in the Scenario Classification Model. Why does Table 3. Comparison of Scenario Classification Model Performance compare exactly the scenarios given, and not some others?

Response:

Thank you for your comment.

This scheme is designed as an ablation study, with a particular focus on the performance of the K-means algorithm combined with adaptive algorithms. The purpose is to demonstrate how and why this combination excels in data preprocessing and feature extraction, and maintains high adaptability across various data recognition scenarios. The choice of K-means plus adaptive algorithms is motivated by the combination's ability to effectively process diverse data scenarios, optimize the data processing flow, enhance recognition accuracy, and reduce computational costs and time consumption. This aspect is especially critical in practical applications, as it directly relates to the model's efficiency and feasibility.

We specifically compare this combination, excluding other potential algorithm combinations, because this part of the research aims to prove the unique advantages of using K-means and adaptive algorithms together. This does not imply that other combinations lack value, but rather, given our research objectives and presuppositions, the K-means plus adaptive algorithm combination best represents the core issue we aim to explore—namely, how to ensure high recognition accuracy while efficiently adapting to diverse data scenarios.

 

  7. Comment: High accuracy for all algorithms for different industry types in Figure 5. Comparison of load recognition accuracy under different algorithms suggests that the initial data for analysis was prepared incorrectly. If this statement is wrong, please justify the contrary.

Response:

Thank you for your comment.

Figure 5 compares the algorithms with each tuned to its optimal state. In scenarios with limited data volume, or where the other algorithms are unsuitable for classification, the adaptive recognition algorithm generalizes better in load recognition and achieves better recognition outcomes. This explanation has also been added to the manuscript.

 

  8. Comment: All graphs in Figure 7. Differential privacy anonymization capability under different virtual power plant business scenarios have absolutely the same axes, scales, and legend. It is not clear why the lines on them are different. Perhaps some designations that should distinguish these cases are missing.

Response:

Thank you for your comment.

As you noted, the descriptions that distinguish the panels were missing; each panel corresponds to the dataset of a different scenario. The figure descriptions have now been updated, based on the content of the article, to differentiate the four scenarios.

 

  9. Comment: It is not clear from the article why the authors call the proposed method adaptive

Response:

Thank you for your comment.

Our method, dubbed the 'adaptive method', is so named for its capacity to autonomously adjust its processing strategies in response to the evolving characteristics and conditions of the data and environment. At its core, the adaptive method is characterized by its flexibility and intelligence, enabling the autonomous optimization of algorithm parameters and strategies across different datasets and application scenarios without the need for human intervention.

This method, through the analysis of the structure and features of virtual power grid user data, automatically adjusts data preprocessing, feature extraction, and model training processes, culminating in the achievement of multi-scenario recognition. Capable of effectively adapting to both large-scale dataset processing and pattern identification in dynamic environments, this method ensures the accuracy and efficiency of analysis and recognition processes.

The approach shows three kinds of adaptability: parameter adaptability, allowing the algorithm to automatically fine-tune its parameters in response to user data changes; scenario adaptability, facilitating scenario classification based on different power scenarios and user data characteristics; and desensitization strategy adaptability, enabling dynamic desensitization processing through adaptive adjustments based on scenario similarity assessments.

By harnessing this adaptability, our method not only elevates the flexibility and intelligence of data processing but also significantly amplifies the model's application value and performance across a spectrum of data scenarios. Hence, the 'adaptive method' aptly captures its foundational characteristics, highlighting its extensive applicability and efficiency in practical applications. This narrative has been incorporated into Chapter 5 to enhance clarity and comprehension.

Reviewer 3 Report

Comments and Suggestions for Authors

The paper has been thoroughly revised. Thank you very much for that!

Comments on the Quality of English Language

The English appears to be fine; some minor proofreading (in the newly inserted parts) may be needed.

Author Response

Thank you for your feedback! 

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

I am grateful to the authors for the changes made. Now, the course of the research, premises, and results are clear. I also ask you to make a few changes that, in my opinion, will improve the perception of the article.

1) The word "desensitization" is mentioned several times in various parts of the article, but its explanation is missing in the introduction. Also, in the introduction, there is no explanation of what scenes we are talking about. This should be added.

2) Figure 2 is not clear enough. There are two dotted rectangles in green and blue. You should sign in Figure 2 what is highlighted by these rectangles.

3) The References include articles from publications that are not in the public domain, and many of them are in Chinese, but there is no abstract in English. This makes it difficult to analyze the literature. Please, if possible, use references that are more accessible to everyone.

Author Response

  1. Comment: The word "desensitization" is mentioned several times in various parts of the article, but its explanation is missing in the introduction. Also, in the introduction, there is no explanation of what scenes we are talking about. This should be added.

Response:

Thank you for your comment.

We have expanded the introduction section to include an explanation of the term "desensitization," along with additional background to set the stage for our subsequent discussion and analysis in the specific scenarios addressed.

 

  2. Comment: Figure 2 is not clear enough. There are two dotted rectangles in green and blue. You should sign in Figure 2 what is highlighted by these rectangles.

Response:

Thank you for your comment.

Regarding the two rectangles that appeared in our initial submission, these were intended to represent data flows within Virtual Power Plant (VPP) business services, distinguishing between generation and consumption scenarios. The error in the image export has been corrected, and the correct image is now included in the revised manuscript.

 

  3. Comment: The References include articles from publications that are not in the public domain, and many of them are in Chinese, but there is no abstract in English. This makes it difficult to analyze the literature. Please, if possible, use references that are more accessible to everyone.

Response:

Thank you for your comment.

We have also revisited and reorganized our references, opting to include some publications with higher impact to replace previous ones. Furthermore, we have updated the section on related work, aiming for a more comprehensive literature analysis.
