Article
Peer-Review Record

Regression-Based Methods for Daily Peak Load Forecasting in South Korea

Sustainability 2022, 14(7), 3984; https://doi.org/10.3390/su14073984
by Geun-Cheol Lee
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 10 February 2022 / Revised: 24 March 2022 / Accepted: 25 March 2022 / Published: 28 March 2022

Round 1

Reviewer 1 Report

Hi,

The detailed comments are provided in the attached file. 

Comments for author File: Comments.pdf

Author Response

Response to Reviewer 1 Comments

 

Thank you very much for your detailed review and valuable comments. We have revised the paper as far as possible in line with your comments, and we believe it has improved significantly as a result. Our responses to each comment are given below, and the manuscript has been revised accordingly. We used the ‘Track Changes’ function of MS Word so that you can easily identify the changes in the revised manuscript.

 

Point 1: Abstract: The abstract section should provide brief information about the methodology, results and future potentials. The abstract is missing some important information, where the authors should point out quantitative improvements in terms of some evaluation metrics.

 

Response 1: Following your direction, the abstract has been modified (page 1).

Abstract: This study examines the daily peak load forecasting problem in South Korea. This problem has become increasingly important due to the continually changing energy environment. As such, it has been studied by many researchers over the decades. South Korea is geographically located such that it experiences four distinct seasons. Seasonal changes are among the main factors affecting electricity demand. In addition, much of the electricity consumption in a strong manufacturing country like South Korea is driven by industry rather than residential customers. In this study, to forecast daily peak loads of South Korea, we proposed multiple linear regression-based methods where several season-specific regression models (i.e., summer, winter, and all-season models) were included. The most appropriate of the three models was selected considering the characteristics of the electricity demand. It was then applied to the daily forecasting. Performances of the proposed methods were evaluated through computational experiments. Forecasts obtained by the proposed methods were compared with those obtained by the existing forecasting methods, including a machine learning method. Results showed that the proposed methods had mean absolute percentage errors around 1.95% and outperformed all benchmarks.
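For readers unfamiliar with the metric quoted in the revised abstract, the mean absolute percentage error (MAPE) can be computed as in the minimal sketch below. The function name `mape` and the load values are illustrative assumptions, not data or code from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    # Average the absolute relative errors, then scale to percent.
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Made-up daily peak loads (MW) and forecasts, for illustration only.
actual_loads = [80000.0, 82000.0, 79000.0]
forecast_loads = [78400.0, 83640.0, 79000.0]
print(f"MAPE: {mape(actual_loads, forecast_loads):.2f}%")
```

A lower value means the forecasts track the observed peak loads more closely; the metric is undefined when an actual value is zero, which is not a concern for national peak loads.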

 

 

Point 2: Introduction: On page 2, a nice graph depicting the association between power generation and GDP of South Korea over a period of 40 years is shown. However, appropriate references are missing, which is very critical; they must be provided for such sensitive and important issues.

 

Response 2: We added a reference to the data source in the caption of the figure.

 

 

Point 3: On lines 90-95, page 3, several assumptions are made about forecasting power and weather. Usually, in forecasting-related papers, models are developed for forecasting these physical phenomena. The reviewer suggests including these in the analysis rather than assuming them, which will help in finding promising relationships between the different varying phenomena.

 

Response 3: Although we assume that future weather information, such as the mean temperature of the forecast day, is known, we carried out the related analysis and present its results. For example, Figures 6 and 8 visually show the relationship between the peak load and the mean temperature.

We also moved the passage explaining the assumption to the last paragraph of Section 3.3.1, where the relevant reference is cited (page 8).

“In this study, we assume that the weather information of the forecast day is known. Although this is a future datum at the time of the prediction, forecasting tomorrow’s weather information is not regarded as difficult. Especially, temperature forecasts are known as being quite reliable in the short term [7]. Thus, only considering temperature is often sufficient for daily load forecasting”

 

 

Point 4: For the daily peak load trend in Figure 2 and Figure 3, proper reference should be provided, which is usually required for reader’s convenience.

 

Response 4: At the bottom of the first paragraph of section 3, we mentioned the source of the data used in the section. (page 4)

“To analyze the characteristics of electricity demand, we used nine years (2010~2018) of data on South Korea’s daily peak loads available from the Electric Power Statistics Information System (http://epsis.kpx.or.kr/).”

 

 

Point 5: The scientific gaps are not clear and not straightforward. The last part of the introduction portrays the focus of the work, which is not convincing without mentioning the actual challenges related to the topic that remain unaddressed. Additionally, the contributions of the work should be highlighted at the end of this section.

 

Response 5: The last part of the introduction section has been thoroughly changed. Unique characteristics of the proposed methods were additionally described. (page 2)

“In this study, a comprehensive analysis of short-term electricity demand of South Korea is conducted to obtain such independent variables. Furthermore, the proposed methods have procedures where different forecasting models are applied depending on the season of the forecast day. Due to this seasonally adaptive characteristic, the proposed methods are expected to show excellent forecasting performances.”
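The seasonally adaptive procedure quoted above might be pictured as a simple dispatch on the forecast date. The sketch below is only illustrative: the month boundaries and the names `SUMMER_MONTHS`, `WINTER_MONTHS`, and `select_model` are assumptions, since this record does not spell out the selection rule used in the paper.

```python
from datetime import date

# Assumed season boundaries, for illustration only.
SUMMER_MONTHS = {6, 7, 8}
WINTER_MONTHS = {12, 1, 2}


def select_model(forecast_day: date) -> str:
    """Pick which season-specific regression model to apply."""
    if forecast_day.month in SUMMER_MONTHS:
        return "summer"
    if forecast_day.month in WINTER_MONTHS:
        return "winter"
    # Spring and autumn days fall back to the all-season model.
    return "all-season"


for day in (date(2018, 7, 15), date(2018, 1, 10), date(2018, 4, 3)):
    print(day.isoformat(), "->", select_model(day))
```

Under this assumed rule, each forecast day is routed to exactly one of the three models named in the abstract.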

 

Point 6: It is better to add a "literature review" section after the introduction.

 

Response 6: Following your suggestion, we added a ‘Literature Review’ section after the introduction section.

 

 

Point 7: It seems that the focus of the study is the “industrial sector” rather than the residential sector, which is OK but should be explicitly stated in the text, possibly supported by some graphs. If this is true, then some results should be reviewed again, as the industrial sector will mostly be off during weekends, when the load should be very low compared to weekdays (e.g., Figures 4 and 5).

 

Response 7: We consider the nation’s overall electricity demand, not only the demand of the industrial sector. To avoid misunderstanding, we added sentences explaining that nationwide demand is considered in the paper.

“In this study, we consider the daily peak load forecasting problem in South Korea. The nation’s overall electricity demand needs to be predicted every day.” (page 2)

“Note that we forecast nationwide electricity demand, not just industrial electricity demand even if the effect of the day of the week mainly comes from the nation’s industrial characteristics.” (page 6)

 

 

Point 8: My overall comment to the authors: please read through the entire paper word by word to make sure there are no grammar, spelling, or logic errors. I noticed many editorial mistakes while reading this manuscript, and these sometimes prevented me from comprehending the analysis of results. For example, on lines 40-41, page 1, it is written that “electric load can be generated”, which is not appropriate. A more suitable statement would be “electricity can be generated…”.

 

Response 8: Including the specific error that you indicated, many editorial errors were found and corrected throughout the entire paper. Additionally, final proofreading has been done by a professional English editing company.

 

 

Point 9: Equations (1), (2), and (3) should be properly formatted according to the guidelines of the journal.

 

Response 9: The format of the equations was corrected according to the journal guidelines.

 

 

Point 10: Research originality and contribution are not clear. Justifications are necessary to verify the significance of the study.

 

Response 10: In section 1, the significance of the considered problem and unique characteristics of the proposed methods were additionally described. (page 2)

 

 

Point 11: The reviewer notices several grammatical and formatting issues. Checking the punctuation and English proofreading are highly recommended, and consistent spelling should be used. All of this needs to be thoroughly checked and corrected.

Response 11: The manuscript has been thoroughly reviewed according to your comments, and many of the editorial issues were corrected. Again, we used a professional English editing service for the final proofreading.

 

 

Point 12: The comparison should include some previous works from the state of the art, which will help in identifying the improvement gained through the proposed method.

 

Response 12: Each of the regression-based methods using the all-season, summer, and winter models is a benchmark from previous work. Because all three methods have already been shown to outperform various other previous methods, showing that the proposed methods outperform these three regression-based methods establishes the proposed methods as the best by far. We mentioned this in Section 5.1.

“Because each of these three models showed its superiority over various other previous methods in their studies [16-18], the demonstrated superiority of the proposed methods over these three models guarantees that the proposed methods are by far the best methods.” (page 15)

Reviewer 2 Report

The author presents regression relations for the daily electrical peak in South Korea. The work needs significant revision for several reasons. My comments are the following:

  • What is the significance of the proposed regression equations? What is their extrapolation behaviour?
  • Does the author impose the mathematical form of the regression equation during the fitting process? The author did not mention what regression method he used? And why did he use this specific regression method?
  • The author must include symbolic regression in the introduction, for example by including the following references: Koza, John R. Genetic programming II. Vol. 17. Cambridge: MIT press, 1994., and El Hasadi, Yousef MF, and Johan T. Padding. "Solving fluid flow problems using semi-supervised symbolic regression on sparse data." AIP Advances 9, no. 11 (2019): 115218.

Author Response

Response to Reviewer 2 Comments

 

Thank you very much for your review and valuable comments. We have revised the paper as far as possible in line with your comments. Our responses to each comment are given below, and the manuscript has been revised accordingly. We used the ‘Track Changes’ function of MS Word so that you can easily identify the changes in the revised manuscript.

 

Point 1: What is the significance of the proposed regression equations? What is their extrapolation behaviour?

 

Response 1: Each of the three regression models used in the proposed methods was proposed in previous work. The statistical significance of each model, such as ANOVA results, is presented in its paper. We added a note to this effect at the end of the explanation of each model.

Regarding the extrapolation behavior: Values of the independent variables on a forecasting day are input into the regression model for forecasting. Since values such as the day of the week, month, and mean temperature do not exceed the range of the corresponding sample values of the independent variables, extrapolation does not occur. Although electricity demand is on the rise in the long run, no significant change is expected within a day. Therefore, extrapolation is not a big problem in the problem considered. This may be the reason why the forecast results showed relatively small forecast errors.
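The argument above, that forecast-day inputs stay within the range of the training sample, can be checked mechanically. The following is a minimal sketch under assumed variable names (`mean_temp`, `month`) and made-up sample values; `fit_ranges` and `extrapolated_variables` are hypothetical helpers, not part of the paper.

```python
def fit_ranges(samples):
    """Record the min and max of each variable seen in the training sample."""
    ranges = {}
    for row in samples:
        for name, value in row.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


def extrapolated_variables(ranges, inputs):
    """Return the names of inputs that fall outside the training range."""
    return [name for name, value in inputs.items()
            if not ranges[name][0] <= value <= ranges[name][1]]


# Made-up training rows, for illustration only.
training = [
    {"mean_temp": -5.0, "month": 1},
    {"mean_temp": 12.0, "month": 4},
    {"mean_temp": 29.5, "month": 8},
]
ranges = fit_ranges(training)
print(extrapolated_variables(ranges, {"mean_temp": 20.0, "month": 6}))
print(extrapolated_variables(ranges, {"mean_temp": 35.0, "month": 6}))
```

An empty list means the regression model is being asked to interpolate; a non-empty list flags the variables for which the forecast would rely on extrapolation.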

 

 

Point 2: Does the author impose the mathematical form of the regression equation during the fitting process? The author did not mention what regression method he used? And why did he use this specific regression method?

 

Response 2: We made the mistake of omitting the description of which method we used. An explanation that the multiple regression model was used has been added to the abstract and Section 4.1 of the revised manuscript. The various factors affecting electricity demand that were identified in the data analysis were expressed intuitively as linear relations. Since there are many independent variables to be considered, relationships of quadratic or higher order were not considered. Instead, an interaction effect that can be intuitively explained was added. We think the proposed multiple linear regression model alone achieves sufficiently good forecasting performance.
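To make the idea of a multiple linear regression with an interaction effect concrete, the sketch below fits such a model by ordinary least squares using only the standard library. The two predictors (temperature and a weekend indicator), the coefficients, and all function names are assumptions chosen for illustration; the paper's actual variables and interaction term are not given in this record.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def design_row(temp, weekend):
    """Intercept, two main effects, and their interaction (temp * weekend)."""
    return [1.0, temp, float(weekend), temp * float(weekend)]


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data generated from known (made-up) coefficients.
true_beta = [50.0, 2.0, -5.0, -0.5]
rows = [design_row(t, w) for t in (20.0, 25.0, 30.0, 35.0) for w in (0, 1)]
y = [sum(b * x for b, x in zip(true_beta, r)) for r in rows]
beta = fit_ols(rows, y)
print([round(b, 6) for b in beta])
```

The interaction coefficient lets the temperature slope differ between weekdays and weekends while keeping every term readable, which is the explainability advantage the response appeals to.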

 

 

Point 3: The author must include symbolic regression in the introduction, for example by including the following references: Koza, John R. Genetic programming II. Vol. 17. Cambridge: MIT press, 1994., and El Hasadi, Yousef MF, and Johan T. Padding. "Solving fluid flow problems using semi-supervised symbolic regression on sparse data." AIP Advances 9, no. 11 (2019): 115218.

 

Response 3: The reviewer’s suggestion that we consider symbolic regression is very meaningful. However, this study focuses not on how to construct a regression model but on how to select a regression model from among the three given models. Moreover, since the number of independent variables used is considerably large, exploring relationships built from various mathematical combinations can complicate the model significantly. Such a complex model can lose the great advantage of a regression model, namely that it is explainable.

Round 2

Reviewer 1 Report

I am satisfied with the revision. 

Author Response

Thank you very much.

Reviewer 2 Report

I read the author's response, and I agree with most of the points. However, I cannot entirely agree with him on the last point about symbolic regression. The number of input variables (dimensions) for the current problem is 12, which is small compared to the millions of dimensions that neural networks can handle. Symbolic regression can handle up to 30 dimensions, especially if some transfer knowledge (previous knowledge) is used with the algorithm. It may identify the critical input variables (dimensions) and derive compact equations, particularly with commercial software such as Eureqa or the open-source gplearn to handle the equations' complexity effectively. I did not ask the authors to use symbolic regression in their paper. However, I asked for a brief description of the symbolic regression algorithm and for the following references to be added. This will enrich the introduction, since it will give the audience information about a potential method that they can use to obtain their regression models. Also, the second reference describes how symbolic regression can be used in a semi-supervised learning mode with sparse data. For this reason, I request that the author add a description of symbolic regression to the introduction.

 

  • Koza, John R. Genetic programming II. Vol. 17. Cambridge: MIT press, 1994.
  • El Hasadi, Yousef MF, and Johan T. Padding. "Solving fluid flow problems using semi-supervised symbolic regression on sparse data." AIP Advances 9, no. 11 (2019): 115218.
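As a rough illustration of what symbolic regression does, the toy sketch below searches over small expressions instead of fitting coefficients of a fixed form. It is a brute-force enumeration, not the genetic programming used by tools such as Eureqa or gplearn, and the target formula, data points, and all names are assumptions made up for this example.

```python
import itertools
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, point):
    """Evaluate an expression tree: a variable name or (op, left, right)."""
    if isinstance(expr, str):
        return point[expr]
    op, left, right = expr
    return OPS[op](evaluate(left, point), evaluate(right, point))


def size(expr):
    """Number of nodes in the tree, used to prefer compact equations."""
    return 1 if isinstance(expr, str) else 1 + size(expr[1]) + size(expr[2])


def candidates(variables, depth):
    """All expression trees over the variables up to the given depth."""
    exprs = list(variables)
    for _ in range(depth):
        grown = [(op, a, b) for op in OPS
                 for a, b in itertools.product(exprs, repeat=2)]
        exprs = list(dict.fromkeys(exprs + grown))  # drop duplicates
    return exprs


def sse(expr, data):
    """Sum of squared errors of the expression on (point, target) pairs."""
    return sum((evaluate(expr, p) - t) ** 2 for p, t in data)


# Made-up data generated from the hidden formula x0 * x1 + x0.
points = [(1, 2), (2, 3), (3, 1), (4, 5), (0, 2), (2, 0), (-1, 4)]
data = [({"x0": a, "x1": b}, a * b + a) for a, b in points]

# Pick the best-fitting expression, breaking ties by smallest size.
best = min(candidates(["x0", "x1"], depth=2),
           key=lambda e: (sse(e, data), size(e)))
print(best, sse(best, data))
```

Unlike the linear regression in the manuscript, the functional form here is an output of the search rather than an input, which is the property the reviewer wants the introduction to mention.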

 

 

Author Response

Thank you very much for your comments. According to your suggestion, we added a brief description of the symbolic regression algorithm in the introduction section (page 2). Also, the two references that you suggested were added.
