Article
Peer-Review Record

Machine Learning Models for Solar Power Generation Forecasting in Microgrid Application Implications for Smart Cities

Sustainability 2024, 16(14), 6087; https://doi.org/10.3390/su16146087
by Pannee Suanpang 1,* and Pitchaya Jamjuntr 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 18 June 2024 / Revised: 7 July 2024 / Accepted: 8 July 2024 / Published: 17 July 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

In this manuscript, a method is proposed to forecast solar power generation in microgrids by comparing the performance of LGBM and KNN models. Specific comments are as follows:

(1) In the Abstract section, including numerical results such as accuracy is crucial for a clear understanding of the performance of the proposed method.

(2) Acronyms used in the text, such as LGBM and ARIMA in section 2.1, should be explained in full the first time they appear, and then the acronyms should be used throughout the rest of the manuscript. It is recommended to review and revise the entire manuscript for this issue.

(3) In Figure 2, there are issues with letter spacing and spelling errors such as "foreest." It is recommended to check and correct spelling errors throughout the text and figures. The manuscript mentions that Figure 2 shows the LGBM model, but in Figure 2, only part c is the LGBM model structure, while parts a and b are related to feature engineering. Consider whether this figure is appropriate and revise accordingly.

(4) In Figure 3, "point" is misspelled as "poInt." On the right side of the image, what is "Category 1"? It should be "Category A." The right-side image labeled "Before KNN" should be "After KNN." Overall, there are many issues with the figures. Please thoroughly check and correct all figures in the manuscript; this comment will not detail all specific issues.

(5) The appearance of section 2.5 feels repetitive or should be placed earlier in Chapter 2. Its current placement at the end of the chapter is not appropriate. Consider reorganizing the content of Chapter 2 to highlight the key points.

(6) Section 3.2 "Smart Cities" seems to introduce a project about smart cities, which appears unrelated to the methodology in Chapter 3. Consider condensing this section or integrating it into another section for better coherence.

(7) In section 3.3, the data collection and preprocessing section should provide a brief introduction to the basic structure and content of the data, and describe the preprocessing methods, including at least basic mathematical descriptions. Providing more detailed descriptions or graphical representations of the means and other statistics mentioned would be preferable.

(8) In section 3.4, are "time" and "hour" two different features? These two variables essentially describe the same concept and should not both be used for feature analysis. Their perfect correlation (1.00) is inevitable and should not be summarized as a finding. Consider performing basic preprocessing on the variables before conducting correlation analysis, and present this part of the content in a more rigorous and clear manner, including the relationship between month and time.

(9) In the methodology and mathematical descriptions of the models, there is no mention of any variables, which is inappropriate. Consider adding descriptions of the variables used.

(10) In sections 3.5 and 3.6, it is recommended to describe or tabulate the hyperparameter settings, provide a brief explanation of how the dataset is divided, and clearly describe the loss functions, etc., to enable readers to replicate the study. Add necessary mathematical descriptions and analyses where needed.

(11) More recent work on machine learning models should be included in this manuscript, such as “A novel machine learning method for multiaxial fatigue life prediction: Improved adaptive neuro-fuzzy inference system. International Journal of Fatigue, 2024, 178: 108007.”

(12) The manuscript should choose the model most suitable for the task and select similar methods and models for comparison. It is unclear why the manuscript includes a lengthy discussion of a comparative model. Additionally, only two old models are mentioned, with no other comparisons, which is not persuasive in demonstrating the superiority of the proposed method.

(13) There is no significant innovation observed in the manuscript. The experimental section is vague and disorganized. Try reorganizing it to highlight the innovations and the reproducibility of the experiments.

(14) Chapters 5 and 6 are repetitive and not concise. It is recommended to refine and combine these sections.

(15) The seasonal stability mentioned in the abstract is not reflected in detailed experiments and descriptions in the experimental section.

Author Response

In this manuscript, a method is proposed to forecast solar power generation in microgrids by comparing the performance of LGBM and KNN models. Specific comments are as follows:

 

  • In the Abstract section, including numerical results such as accuracy is crucial for a clear understanding of the performance of the proposed method.

Answer:  We have included the numerical results of accuracy in the abstract by comparing the LGBM and KNN models using various metrics, including R-squared, RMSE, MAE, and training time. The results show that the LGBM model outperforms the KNN model in terms of accuracy (R-squared: 0.84 vs. 0.77), Root Mean Squared Error (RMSE: 5.77 vs. 6.93) and Mean Absolute Error (MAE: 3.93 vs. 4.34). However, the LGBM model takes longer to train (120 seconds vs. 90 seconds) and uses more memory (500 MB vs. 300 MB). (Line 18-26)
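For readers comparing the figures quoted above, the following is a minimal sketch of how R-squared, RMSE, and MAE are conventionally computed from paired observations and predictions. The function name `regression_metrics` and the sample values are illustrative assumptions; they are not taken from the manuscript and do not reproduce its results.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R-squared, RMSE, MAE) for paired observations and predictions."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n            # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when y_true is constant")
    r2 = 1.0 - ss_res / ss_tot
    return r2, rmse, mae


# Illustrative values only (hypothetical, not data from the manuscript).
observed = [0.0, 10.0, 20.0, 30.0]
predicted = [2.0, 8.0, 22.0, 28.0]
r2, rmse, mae = regression_metrics(observed, predicted)
```

A higher R-squared and lower RMSE and MAE indicate a closer fit, which is the sense in which the response reports LGBM ahead of KNN.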

 

  • Acronyms used in the text, such as LGBM and ARIMA in section 2.1, should be explained in full the first time they appear, and then the acronyms should be used throughout the rest of the manuscript. It is recommended to review and revise the entire manuscript for this issue.

Answer: We used the full names of the acronyms LGBM and ARIMA when they first appeared, along with their definitions. Additionally, we ensured the consistent use of these acronyms throughout the entire paper. (Line 85-89, 186-190)

 

  • In Figure 2, there are issues with letter spacing and spelling errors such as "foreest." It is recommended to check and correct spelling errors throughout the text and figures. The manuscript mentions that Figure 2 shows the LGBM model, but in Figure 2, only part c is the LGBM model structure, while parts a and b are related to feature engineering. Consider whether this figure is appropriate and revise accordingly.

Answer: We have revised the information about Figure 2 and rechecked the letter spacing. We have conducted a thorough review of the entire figure and the manuscript for academic English usage and spelling accuracy. Additionally, we rechecked the content within the figures and their captions to ensure consistency with the manuscript. We also removed the references to parts a-c from Figure 2 to make the content clearer. (Line 245-247)

 

 

  • In Figure 3, "point" is misspelled as "poInt." On the right side of the image, what is "Category 1"? It should be "Category A." The right-side image labeled "Before KNN" should be "After KNN." Overall, there are many issues with the figures. Please thoroughly check and correct all figures in the manuscript; this comment will not detail all specific issues.

Answer: We rechecked and redrew Figure 3 and corrected its spelling. We changed "Category 1" to "Category A" and updated "Before KNN" to "After KNN" in the image. Additionally, we reviewed all figures in the paper to ensure they contain correct content, spelling, and explanations consistent with the manuscript. (Line 289-294, 303-305)

 

  • The appearance of section 2.5 feels repetitive or should be placed earlier in Chapter 2. Its current placement at the end of the chapter is not appropriate. Consider reorganizing the content of Chapter 2 to highlight the key points.

Answer: We have removed the duplicate content from Section 2.5 and reorganized the entire Literature Review section in accordance with the reviewer's recommendations. Additionally, we moved Section 3.2 from the Methodology, which illustrates information about the context of the research areas, to Section 2.6. This adjustment clarifies the background information of the research areas and ensures it is presented in the appropriate section. The revised structure of Section 2 is as follows:

2.1 Solar Power Generation and Microgrids (Line 139-177)

2.2 Solar Power Generation Forecasting Techniques (Line 181-228)

2.3 Light Gradient Boosting Machine (LGBM) (Line 231-283)

2.4 K Nearest Neighbors (KNN) (Line 286-322)

2.5 Comparative Studies on Solar Power Generation Forecasting (Line 325-356)

2.6 Rayong Smart Cities, Thailand (Line 358-386)

 

  • Section 3.2 "Smart Cities" seems to introduce a project about smart cities, which appears unrelated to the methodology in Chapter 3. Consider condensing this section or integrating it into another section for better coherence.

Answer: We have moved Section 3.2, Smart Cities, into Section 2.6, Rayong Smart Cities, Thailand, in the Literature Review section. This reorganization was done because the information in this section describes the context of Rayong Smart Cities, Thailand, which presents information relevant to the research areas where this study is being implemented. (Line 358-386)

 

  • In section 3.3, the data collection and preprocessing section should provide a brief introduction to the basic structure and content of the data, and describe the preprocessing methods, including at least basic mathematical descriptions. Providing more detailed descriptions or graphical representations of the means and other statistics mentioned would be preferable.

Answer:  Thank you for your recommendation to include more details in Section 3.2, Data Collection and Preprocessing, to enhance clarity and understanding. We have revised and added the following details:

 

  • We provide a brief introduction to the basic structure and content of the collected data (Lines 419-422).
  • We have added a new section, “3.2.1 Preprocessing Methods,” to explain the data collection methods and preprocessing phase, ensuring the dataset's reliability and quality (Lines 428-437).
  • We have introduced another new section, “3.2.2 Mathematical Descriptions,” where we present mathematical equations (1-3) to illustrate how the dataset is manipulated (Lines 439-447).
  • We include details of graphical representations (Figure 6) to explain the evaluation of power output (Lines 449-453).

 

  • In section 3.4, are "time" and "hour" two different features? These two variables essentially describe the same concept and should not both be used for feature analysis. Their perfect correlation (1.00) is inevitable and should not be summarized as a finding. Consider performing basic preprocessing on the variables before conducting correlation analysis, and present this part of the content in a more rigorous and clear manner, including the relationship between month and time.

Answer: We have rechecked the variables "time" and "hour" and confirmed that they are similar but distinct, as shown in Figure 7. The correlation matrix describes the relationships between the variables we use. Hour exhibits a strong positive correlation with Time (1.00), indicating that the two variables are highly correlated.

However, we performed basic preprocessing on these variables before conducting the correlation analysis and presenting the experimental results in the Results section. By preprocessing the data and focusing on non-redundant and meaningful features, our analysis provides a clearer understanding of the factors influencing solar power generation. This rigorous approach ensures the reliability and accuracy of our subsequent modeling efforts. (Line 465-470, 495-498)
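The reviewer's point about the 1.00 correlation can be illustrated with a small sketch. It assumes, purely for illustration, that "time" is stored as minutes since midnight and is therefore a linear function of "hour"; the manuscript's actual encoding is not stated in this record. Under that assumption the Pearson coefficient is 1 by construction, so one of the two columns is redundant.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def redundant_pairs(features, threshold=0.999):
    """List feature-name pairs whose absolute correlation reaches the threshold."""
    names = sorted(features)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(pearson(features[a], features[b])) >= threshold
    ]


# Hypothetical hourly samples: "time" is assumed to be 60 * "hour".
hours = list(range(6, 19))
features = {
    "hour": hours,
    "time": [60 * h for h in hours],
    "month": [1, 1, 2, 3, 3, 4, 6, 7, 7, 9, 10, 12, 12],
}
r_hour_time = pearson(features["hour"], features["time"])
flagged = redundant_pairs(features)
```

Screening pairs this way before building the correlation matrix is one possible form of the basic preprocessing the reviewer requests; only the hour/time pair is flagged here.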

 

  • In the methodology and mathematical descriptions of the models, there is no mention of any variables, which is inappropriate. Consider adding descriptions of the variables used.

Answer:  Thank you very much for your suggestion to include more information about the mathematical description and variable description.

  • We have added a new section, “3.4.1 Model Implementation and Hyperparameter Tuning,” to explain the mathematical formulation and description. Additionally, we provide an explanation of the variables used in this study. (Line 515-520)
  • We have added a new section, “3.4.2 Dataset Division,” to explain the dataset division by providing mathematical formulation and description. (Line 521-565)

 

 

 

 

  • In sections 3.5 and 3.6, it is recommended to describe or tabulate the hyperparameter settings, provide a brief explanation of how the dataset is divided, and clearly describe the loss functions, etc., to enable readers to replicate the study. Add necessary mathematical descriptions and analyses where needed.

Answer:  Thank you so much for your recommendation. We have added the description of the hyperparameter settings and provided a brief explanation of how the dataset is divided, to enable readers to replicate the study. We added necessary mathematical descriptions and analyses where needed in the following revisions:

  • We added information in section “3.5.1 Algorithm and Implementation Details” to provide details about how to develop and implement the algorithm (Lines 664-684).
  • We added information in section “3.5.2 Hyperparameter Settings and Dataset Division” to provide details about hyperparameter settings, dataset division, and information about loss functions and mathematical descriptions for readers to replicate this study (Lines 687-700).
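As an illustration of what a reproducible description of dataset division might look like, here is a minimal sketch of a chronological train/test split. The 80/20 ratio, the chronological ordering, and the helper name `chronological_split` are assumptions made for this example; the record does not state how the manuscript actually divides its data.

```python
def chronological_split(records, train_fraction=0.8):
    """Split (timestamp, value) records into earlier (train) and later (test) parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    ordered = sorted(records, key=lambda record: record[0])  # oldest first
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical samples: (hour index, power output); not data from the manuscript.
samples = [(hour, 10.0 * hour) for hour in range(10)]
train, test = chronological_split(samples)
```

Keeping the test set strictly later than the training set avoids leaking future observations into training, which matters for time-ordered forecasting data.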

 

  • More recent work on machine learning models should be included in this manuscript, such as “A novel machine learning method for multiaxial fatigue life prediction: Improved adaptive neuro-fuzzy inference system. International Journal of Fatigue, 2024, 178: 108007.”

Answer: Once again, thank you very much for your recommendation to add more literature related to the ‘machine learning model’ in this study (Reference number 49) (Lines 953, 959, 1038, 1040, 1258-1277). We have also added more literature about solar power forecasting to guide our future studies (References 71-80) (Lines 1072, 1087-1092, 1258-1277).

 

  • The manuscript should choose the model most suitable for the task and select similar methods and models for comparison. It is unclear why the manuscript includes a lengthy discussion of a comparative model. Additionally, only two old models are mentioned, with no other comparisons, which is not persuasive in demonstrating the superiority of the proposed method.

Answer: We have added more information on how the model suitable for this study was chosen, in the Results and Discussion sections.

  • We have included information to support the comparison of the two models, LGBM and KNN, based on several performance metrics (Lines 848-863).
  • We have revised and rewritten the discussion section to analyze the results, highlighting key findings from this analysis. The selection of LGBM and KNN for comparison was deliberate and based on their suitability for the task at hand. The discussion has been succinctly summarized, emphasizing performance metrics and model comparisons, and the comparative analysis highlights the distinct strengths and limitations of each model (Lines 929-946).
  • We added section “5.2 Implications” to provide information on implementing solar power generation forecasting models in microgrid operations (Lines 978-994).
  • Moreover, considering the limitations of this study, we have added section “5.3 Limitations and Future Research” to state the limitations of our study and provide information about future research directions. This section aims to address existing limitations and foster innovation in solar power generation forecasting, ultimately enhancing prospects and practical applications within microgrid planning and operation (Lines 1034-1072).

 

  • There is no significant innovation observed in the manuscript. The experimental section is vague and disorganized. Try reorganizing it to highlight the innovations and the reproducibility of the experiments.

Answer: We have reorganized and emphasized the significant innovations of this paper in the Results, Discussion, and Conclusion sections to clearly present the paper's contributions in the following revisions:

  • We have reorganized Section 3, ‘Methodology,’ to enhance clarity and reproducibility of our experiments, including the following subsections:

3.1 Research Framework (Lines 387-412)

3.2 Data Collection and Preprocessing (Lines 413-426)

3.3 Data Collection and Preprocessing (Lines 457-498)

3.4 Light Gradient Boosting Machine (LGBM) Model (Lines 500-653)

3.5 K Nearest Neighbors (KNN) Model (Lines 655-792)

  • We have reorganized Sections 5 and 6 into a combined section titled ‘Discussion and Conclusion,’ which includes the following subsections:

5.1 Discussion of Results (Lines 928-964)

5.2 Implications (Lines 967-994)

5.3 Limitations and Future Research (Lines 988-1072)

5.4 Conclusion (Lines 1075-1103)

 

  • Chapters 5 and 6 are repetitive and not concise. It is recommended to refine and combine these sections.

Answer: We have revised and combined Section 5, Discussion, and Section 6, Conclusion, and also deleted repetitive information. The new subsections are arranged as follows:

5.1 Discussion of Results (Lines 928-964)

5.2 Implications (Lines 967-994)

5.3 Limitations and Future Research (Lines 988-1072)

5.4 Conclusion (Lines 1075-1103)

 

  • The seasonal stability mentioned in the abstract is not reflected in detailed experiments and descriptions in the experimental section.

Answer:  Finally, we have rewritten the abstract to reflect the details of our experiment (Lines 18-26) and included the description of the experimental results in the Results section (Line 840-869), following your recommendations.

 

We would like to thank you for your valuable suggestions, which have helped improve the quality of our paper and enhance its contribution for publication.

Reviewer 2 Report

Comments and Suggestions for Authors

The presented study compares the performance of the LGBM and KNN models in forecasting solar power generation within a microgrid context. The area of forecasting renewable power generation is still being developed. This issue is crucial for making energy systems based on RES reliable and flexible.

The main question addressed by the research is what is the effective and accurate method for solar generation forecasting in microgrid applications. To address this issue the authors compared two methods using machine learning – Light Gradient Boosting Machine (LGBM) and K-Nearest Neighbours (KNN).

The references are appropriate. In contrast to the scientific papers published so far, the presented comparative analysis assesses the environmental benefits and implications of accurate solar power generation forecasting for reducing carbon emissions and advancing sustainability goals, which is strongly connected with the aims and scope of the Sustainability Journal.

The comprehensive comparative study of LGBM and KNN algorithms is the original contribution of the authors, according to the reviewer. The comparison considers, for example, R-squared, RMSE, MAE, training time, memory usage, ability to capture complex patterns, handling nonlinear relationships, adaptability to changing conditions, strengths and limitations.

The study was correctly performed and the authors formulated valuable findings, which should be interesting for the scientific environment, industry stakeholders and policymakers. Additionally, the authors broadly discussed the limitations of the study, as well as possible future research directions.

The presented results enable more conscious energy planning and management, reducing costs and increasing reliability for microgrid operators. The consideration of environmental aspects provides a comprehensive insight into the planning and decision process.

However, I have a few suggestions for the authors to consider for improving the article:
- The novelty of the research and the filled research gap, referring to the literature review, should be more highlighted.
- The referencing style needs to be corrected. Instead of [1][2], it should be [1,2] (line 34), and instead of [5][6][7], it should be [5-7] (line 58), etc.
- Figures 3 and 4 are difficult to read and should be improved for better legibility.

 

Comments on the Quality of English Language

There are some language mistakes, for example: To bridge this gap there are a number of different forecasting models that can be used to predict solar power generation. I suggest a professional language editing service to correct mistakes. 

Author Response

The authors express profound gratitude to the reviewers for their kindness, valuable suggestions, and constructive comments, which have greatly contributed to the improvement of our paper. In response to your recommendations, we have revised and amended the manuscript accordingly. The changes are highlighted in yellow, with additional information marked in red within the yellow highlights.

 

  • The presented study compares the performance of the LGBM and KNN models in forecasting solar power generation within a microgrid context. The area of forecasting renewable power generation is still being developed. This issue is crucial for making energy systems based on RES reliable and flexible.

Answer: Thank you very much for your kindness and support towards our research team in developing a study comparing the performance of the LGBM and KNN models in forecasting solar power generation within a microgrid context.

 

  • The main question addressed by the research is what is the effective and accurate method for solar generation forecasting in microgrid applications. To address this issue the authors compared two methods using machine learning – Light Gradient Boosting Machine (LGBM) and K-Nearest Neighbors (KNN).

Answer: Once again, we would like to express our gratitude for your comments. Your feedback has helped us clearly identify the main research question within our study.

 

  • The references are appropriate. In contrast to the scientific papers published so far, the presented comparative analysis assesses the environmental benefits and implications of accurate solar power generation forecasting for reducing carbon emissions and advancing sustainability goals, which is strongly connected with the aims and scope of the Sustainability Journal.

Answer: The authors would like to thank the reviewer for evaluating the references in this paper. In response, we have provided additional related references to enhance the quality of this paper. This ensures a comprehensive presentation of the knowledge regarding solar power generation forecasting, its role in reducing carbon emissions, and its contribution to advancing sustainability goals, aligning closely with the aims and scope of the Sustainability Journal.

 

  • The comprehensive comparative study of LGBM and KNN algorithms is the original contribution of the authors, according to the reviewer. The comparison considers, for example, R-squared, RMSE, MAE, training time, memory usage, ability to capture complex patterns, handling nonlinear relationships, adaptability to changing conditions, strengths and limitations.

Answer: The authors would like to thank the reviewer for understanding the significance of this paper. The comprehensive comparative study of LGBM and KNN algorithms is indeed the original contribution of the authors. The comparison considers several metrics, including R-squared, RMSE, MAE, training time, memory usage, ability to capture complex patterns, handling nonlinear relationships, adaptability to changing conditions, and an assessment of their strengths and limitations.

 

  • The study was correctly performed and the authors formulated valuable findings, which should be interesting for the scientific environment, industry stakeholders and policymakers. Additionally, the authors broadly discussed the limitations of the study, as well as possible future research directions.

Answer: The authors would like to thank the reviewer for their valuable contribution and thorough review of our paper. We appreciate the acknowledgment that the study was correctly performed and that our findings are valuable and of interest to the scientific community, industry stakeholders, and policymakers. Additionally, we are pleased to note that the discussion on the limitations of the study and possible future research directions was well-received.

 

  • The presented results enable more conscious energy planning and management, reducing costs and increasing reliability for microgrid operators. The consideration of environmental aspects provides a comprehensive insight into the planning and decision process.

Answer: The authors would like to thank the reviewer for their valuable contribution and review of our paper. We appreciate your recognition that the presented results enable more conscious energy planning and management, reduce costs, and increase reliability for microgrid operators. Additionally, the consideration of environmental aspects indeed provides a comprehensive insight into the planning and decision-making process.

 

  • However, I have a few suggestions for the authors to consider for improving the article:
    (1) The novelty of the research and the filled research gap, referring to the literature review, should be more highlighted.

Answer: Thank you very much for your suggestion. We have revised and added more information as follows:

  • We have included additional information on the research gap, specifically addressing the effectiveness of LGBM in improving forecast accuracy by incorporating meteorological variables and historical solar power generation data. To highlight the significance of our study, we have added a new subsection, ‘1.1 Research Gap’ (Lines 83-109).
  • We have also reviewed more related literature on the ‘machine learning model’ in this study (Reference number 49) (Lines 953, 959, 1038, 1040, 1258-1277). Furthermore, we have added additional literature on solar power forecasting to guide our future studies (References 71-80) (Lines 1072, 1087-1092, 1258-1277).

 

 

 

 

(2) The referencing style needs to be corrected. Instead of [1][2], it should be [1,2] (line 34), and instead of [5][6][7], it should be [5-7] (line 58), etc.

Answer: We have changed the referencing style from [1][2] to [1,2] and from [1][2][3] to [1-3], in accordance with the Journal template, and revised it throughout the entire paper. These changes are highlighted in yellow. (Lines 39, 63, 68, 75, 104, 177, 228, 294, 959, 964, 1010, 1087-1088, 1092, 1099)
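The restyling described above can also be checked mechanically. The following is a minimal sketch, assuming citations appear as runs of adjacent bracketed integers; the helper name `compact_citations` is hypothetical and this is not a tool used by the authors or the journal. Two numbers are joined with a comma and runs of three or more consecutive numbers are collapsed to a range, using the hyphen form quoted in this record.

```python
import re

# Two or more adjacent bracketed integers, e.g. "[1][2]" or "[5][6][7]".
_CITATION_RUN = re.compile(r"(?:\[\d+\]){2,}")


def compact_citations(text):
    """Rewrite runs such as [1][2] as [1,2] and [5][6][7] as [5-7]."""

    def merge(match):
        numbers = sorted({int(n) for n in re.findall(r"\d+", match.group(0))})
        parts = []
        start = prev = numbers[0]
        # The trailing None is a sentinel that flushes the final run.
        for n in numbers[1:] + [None]:
            if n is not None and n == prev + 1:
                prev = n
                continue
            if prev - start >= 2:
                parts.append(f"{start}-{prev}")        # three or more in a row
            elif prev - start == 1:
                parts.extend([str(start), str(prev)])  # exactly two in a row
            else:
                parts.append(str(start))               # isolated number
            start = prev = n
        return "[" + ",".join(parts) + "]"

    return _CITATION_RUN.sub(merge, text)


pair = compact_citations("as reported in [1][2].")
triple = compact_citations("see [5][6][7] for details")
gap = compact_citations("[1][3]")
single = compact_citations("only [4] here")
```

A single citation such as [4] is left untouched because the pattern requires at least two adjacent brackets.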

 

(3) Figures 3 and 4 are difficult to read and should be improved for better legibility.

Answer: We have enhanced the clarity and resolution of Figure 3 (Lines 245-246) and Figure 4 (Lines 302-305) to provide better quality.

 

We would like to thank you for your valuable suggestions, which have helped improve the quality of our paper and enhance its contribution for publication.
