*Article* **Research on Regional Short-Term Power Load Forecasting Model and Case Analysis**

**Kang Qian <sup>1</sup>, Xinyi Wang <sup>2</sup> and Yue Yuan <sup>1,</sup>\***


**Abstract:** Integrated energy services will have multiple values and far-reaching significance in promoting energy transformation and serving "carbon peak and carbon neutralization". In order to balance the supply and demand of power system in integrated energy, it is necessary to establish a scientific model for power load forecasting. Different algorithms for short-term electric load forecasting considering meteorological factors are presented in this paper. The correlation between electric load and meteorological factors is first analyzed. After the principal component analysis (PCA) of meteorological factors and autocorrelation analysis of the electric load, the daily load forecasting model is established by optimal support vector machine (OPT-SVM), Elman neural network (ENN), as well as their combinations through linear weighted average, geometric weighted average, and harmonic weighted average method, respectively. Based on the actual data of an industrial park of Nantong in China, the prediction performance in the four seasons with the different models is evaluated. The main contribution of this paper is to compare the effectiveness of different models for short-term electric load forecasting and to give a guideline to build the proper methods for load forecasting.

**Keywords:** short-term electric load forecasting; meteorological factors; optimized support vector machine; Elman neural network; combined model

### **1. Introduction**

Integrated energy services will have multiple values and far-reaching significance in promoting energy transformation and serving "carbon peak and carbon neutralization". Before the implementation of integrated energy services, in order to balance the supply and demand of the power system, it is necessary to establish a scientific model for power load forecasting. The balance of supply and demand in the power system plays an important role in regional economic and social development. To improve the quality of power supply, it is important to analyze various characteristics of regional electric load and establish a proper model for the short-term electrical load forecasting [1].

Many factors affect electric load forecasting, including socioeconomic factors, such as population and industrial structure, and meteorological conditions [2,3]. Among these, complex and changeable meteorological factors are essential in load forecasting and have the greatest impact on forecast results [2]. Taking meteorological factors into account can therefore improve forecasting accuracy.

According to relevant statistics, in large cities with large population, residential power consumption in summer can reach 50% or even higher of the total power load. The prediction of residential power load plays an important role in regional energy allocation, energy-saving control, and power grid reliability. At the same time, residential power load shows the characteristics of complexity and uncertainty, which is closely related to temperature, humidity, wind speed, week type, building function, building area, industrial structure, and other factors.

**Citation:** Qian, K.; Wang, X.; Yuan, Y. Research on Regional Short-Term Power Load Forecasting Model and Case Analysis. *Processes* **2021**, *9*, 1617. https://doi.org/10.3390/pr9091617

Academic Editors: Pei Liu, Ming Liu and Xiao Wu

Received: 1 July 2021 Accepted: 31 August 2021 Published: 8 September 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In short-term load forecasting, meteorological factors are mainly handled in one of the following ways: (1) ignoring the influence of meteorological factors; (2) considering the influence of a single meteorological factor only; (3) considering the influence of multiple meteorological factors without coupling effects; (4) considering the influence of a meteorological index formed by coupling multiple meteorological factors; (5) considering the influence of real-time meteorological factors. In practice, the appropriate scheme is selected in light of the actual local weather and load conditions.

Load forecasting can be divided into ultra-short-term, short-term, medium-term, and long-term according to different purposes. Short-term load forecasting refers to daily load forecasting and weekly load forecasting. It is necessary to fully study the law of power grid load changes and analyze the relevant factors of load changes for arrangement of daily scheduling plan or weekly scheduling plan.

Extensive research has been done on load forecasting, and the algorithms can be divided into two groups. Traditional statistical algorithms, including the time series model [4–6] and the linear regression model [7], are the early and widely developed methods for load forecasting. They are built only on the load time series, which makes them computationally fast. However, short-term load forecasting is affected by many factors, as mentioned above. When the weather changes greatly or holidays occur, these models fail to forecast accurately [8]. Therefore, a second group of load forecasting algorithms that consider meteorological factors came into being. They take weather and calendar factors into account to bring the forecast results closer to the actual situation. These modern intelligent algorithms include the artificial neural network (ANN) [9–11], the expert system [12], the support vector machine (SVM) [13–16], etc. Though these individual forecasting models obtain satisfactory performance, each model has unavoidable drawbacks, and their performance varies under different conditions, which results in risks and limitations in practical applications [17].

Thus, it is possible to consider combining different individual forecasting models. In 1969, Bates and Granger made the first systematic study of the combination forecasting method, which aroused great interest in combination forecasting models [18]. Given the pros and cons of individual algorithms, combining them can reduce the risks of any single method and achieve better forecasting performance. Various combinations have been proposed, such as the BP neural network model combined with the gray theory model based on variable weights [19], different types of neural network combinations based on non-positive constraint theory [20], SVM combined with chaos theory [21], etc. These studies of combination models mainly focus on determining the optimal weights to improve forecast accuracy while avoiding local optima, slow convergence, and increased modeling complexity [22–24].

In addition, several recent papers have proposed short-term load forecasting methods. A novel hybrid forecasting algorithm is proposed in [25], based on locally weighted support vector regression (LWSVR) and the modified grasshopper optimization algorithm (MGOA). Because obtaining appropriate values of the LWSVR parameters is vital to achieving satisfactory forecasting accuracy, the MGOA is used in [25] to select the LWSVR parameters optimally. The MGOA is derived from the conventional GOA through two modifications, chaotic initialization and a sigmoid decreasing criterion, which address the drawbacks of the conventional GOA. There are also other related papers, e.g., the work in [26] proposes an ultra-short-term power load forecasting method based on a two-layer XGBoost (extreme gradient boosting) algorithm. The work in [27] proposes a short-term power load forecasting model based on similar day selection and multi-integration combination. In [28], particle swarm optimization (PSO) is used to optimize a quantum weighted gated recurrent unit neural network model for short-term power load prediction. The work in [29] introduces K-means clustering analysis technology based on cosine similarity to predict short-term power load. The work in [30] proposes a load forecasting method based on an improved deep belief network algorithm.

It can be seen that there are many short-term load forecasting methods, such as time series model, multiple linear regression, support vector machine, and so on. Compared with other traditional algorithms, support vector machine has better prediction effect for small samples and has the advantages of strong learning ability and high precision in processing high-dimensional data. However, the prediction effect of support vector machine is closely related to the quality of parameter estimation. At present, the commonly used grid parameter optimization methods have a large amount of calculation and limited parameter estimation accuracy. Using an intelligent optimization algorithm to find the global optimal parameter combination has gradually become an effective way to improve the application effect of support vector machine. In light of this, by coupling a simulated annealing particle swarm optimization algorithm and support vector machine, an optimized support vector machine model is constructed to study the short-term load forecasting effect under different meteorological variable input conditions and analyze the influence of meteorological factors on load forecasting.

In the field of power load forecasting, static neural networks, mainly represented by the BP neural network, are the most widely used. This kind of model actually transforms dynamic time modeling into static space modeling, which limits its ability to capture the highly complex time-varying characteristics of load. Therefore, the Elman neural network, which adds a context (receiving) layer to the BP neural network and thus has dynamic, adaptive time-varying characteristics, is applied.

For the short-term load forecasting of regional power system, the problems existing in the current research include the following.


This paper presents some combined models for short-term load forecasting. It integrates simulated annealing OPT-SVM and ENN through time-varying weights selection. Compared with other traditional algorithms, support vector machines have better prediction effects for small samples. The advantages of strong learning ability and high accuracy when processing high-dimensional data are also very prominent. The Elman neural network has better generalization and prediction ability for nonlinear and high-frequency time series data. The proposed models can take into account regional power load characteristics and their correlation with meteorological factors such as temperature, humidity, wind speed, and sunshine. On the one hand, they can improve the adaptability of the power dispatching department to regional power load changes; on the other hand, they can improve the accuracy of load forecasting to meet the demand of regional power load. The accuracy of short-term electrical load forecasting based on single OPT-SVM and ENN varies in different periods. The weighted combination of the results of different forecasting models in each period could result in higher accuracy of load forecasting. To this end, the time history of the contribution of each single model to the combined forecasting is studied. The effectiveness of the combination forecasting model is analyzed from the statistical accuracy so as to provide a useful reference for deepening the application of the time-varying weight combination model in short-term load forecasting.

The innovation of our paper is to find out a time-varying weight combination model which takes the results of two single models of OPT-SVM and Elman NN as input, studies the time history changes of each single model's contribution to the combined forecasting, and uses statistics methods to analyze the effectiveness of the combined model, providing a useful reference for deepening the application of time-varying weight combination models in short-term load forecasting.

The rest of this paper is organized as follows. The experimental data are described in Section 2. In Section 3, the mathematical principles and processes of the proposed models are described in detail. Furthermore, Section 4 presents and discusses the forecasting results. Finally, the proposed models are summarized in Section 5.

### **2. Data Preparation**

Under the influence of meteorological and other factors, regional power load often presents complex time-varying characteristics of continuity, nonlinearity, and multifrequency nesting. Therefore, in order to establish a scientific and accurate power load forecasting model, it is necessary to understand the time-varying characteristics of regional power load, analyze the correlation between power load and meteorological factors, and provide a priori knowledge for selecting the key influencing factors of power load forecasting.

The basic data of this paper come from the actual load demand data of an industrial park in Nantong, Jiangsu, China from 1 March 2016 to 28 February 2017. These data contain the following.


**Figure 1.** Daily load curve of the studied park.

The studied industrial park covers about 170 km<sup>2</sup>, with a relatively concentrated distribution, simple terrain, and little climatic variation, so one weather station can effectively cover the area.

The first three-quarters of the remaining load sequence length after excluding the pre-ordered autocorrelation load is used as the training sample, and the last one-quarter as the prediction verification sample; the modeling period and forecasting period are determined according to the four seasons and the load autocorrelation time lag as follows:


4. Winter: December 6 to February 6 (2017) as modeling period, totaling 63 days, and February 7 to February 28 as forecasting period, totaling 21 days.

During the preprocessing of the raw data, horizontal and vertical processing methods are used to deal with abnormal data [31], and the least squares method is applied for curve fitting to fill in the missing data. The power load forecasting model should consider the influence of meteorological factors. However, the meteorological factors (temperature, humidity, precipitation, sunshine days, and wind speed) and the electricity load have different dimensions and units and cannot be used directly. Therefore, it is necessary to standardize all relevant factors so that all data are in the same order of magnitude.

Furthermore, the different types of meteorological data are normalized as follows:

$$Y = \frac{I_i - I_{\min}}{I_{\max} - I_{\min}} \tag{1}$$

where *Y* is the normalized data, and *I<sub>i</sub>*, *I*<sub>min</sub>, and *I*<sub>max</sub> are the current sample value, minimum, and maximum of the training data, respectively.
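As an illustration of Equation (1), the following minimal sketch scales a series with the minimum and maximum taken from the training data only, and then reuses those bounds for the forecasting period. The function names and the temperature values are hypothetical and are not taken from the studied data set.

```python
def fit_min_max(train_values):
    """Return (I_min, I_max) computed from the training data only."""
    return min(train_values), max(train_values)


def min_max_normalize(values, i_min, i_max):
    """Apply Equation (1): Y = (I_i - I_min) / (I_max - I_min)."""
    span = i_max - i_min
    if span == 0:
        raise ValueError("training data are constant; Equation (1) is undefined")
    return [(v - i_min) / span for v in values]


# Hypothetical daily mean temperatures in degrees Celsius (illustrative only).
train_temperature = [10.0, 15.0, 20.0, 30.0]
forecast_temperature = [12.0, 25.0]

t_min, t_max = fit_min_max(train_temperature)
train_scaled = min_max_normalize(train_temperature, t_min, t_max)
forecast_scaled = min_max_normalize(forecast_temperature, t_min, t_max)
```

Because the bounds come from the training data, values in the forecasting period can fall slightly outside the interval [0, 1] when they exceed the training range.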

### **3. Optimal Combined Model Considering Meteorological Factors**

This paper improves the results of short-term load forecasting by the steps as shown in Figure 2.

**Figure 2.** Framework of the proposed method.

### *3.1. Data Optimization*

The pre-ordered autocorrelation load and comprehensive meteorological factors are taken as the input of the forecasting model. Therefore, the processing of the data is introduced as follows.

### 3.1.1. Correlation Analysis

First of all, the correlation analysis between electric load and meteorological factors is necessary. Pearson correlation coefficient is a measure of linear correlation between two sets of data, which has a value between −1 and 1. It is defined as

$$\rho_{X,Y} = \frac{\text{cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{E[(X-\mu_X)(Y-\mu_Y)]}{\sigma_X \sigma_Y} \tag{2}$$

where *ρ<sub>X,Y</sub>* represents the overall correlation coefficient, *X* and *Y* are two variables, cov(*X*,*Y*) is the covariance of the variables *X* and *Y*, *µ<sub>X</sub>* and *µ<sub>Y</sub>* are the expectations of the variables *X* and *Y*, and *σ<sub>X</sub>* and *σ<sub>Y</sub>* are the standard deviations of the variables *X* and *Y*.
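Equation (2) can be evaluated on samples by replacing the expectations with sample means. The sketch below is one plain implementation under that assumption; the function name `pearson` and the sample values are illustrative.

```python
import math


def pearson(x, y):
    """Sample estimate of Equation (2) for two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products; the 1/n factors cancel in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily temperature (deg C) and daily load (MW), illustrative only.
temperature = [20.0, 25.0, 30.0, 35.0]
daily_load = [100.0, 118.0, 141.0, 160.0]
rho = pearson(temperature, daily_load)
```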

### 3.1.2. Principal Component Analysis (PCA)

There could exist correlations between different meteorological factors so that multiple meteorological variables may contain redundant information. PCA can be used to extract main feature factors from multiple meteorological factors for dimensionality reduction. It replaces the original variables with fewer new variables and enables these new variables to retain the original full variable information as much as possible. The detailed steps of PCA can be found in [32].
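To make the dimensionality reduction concrete, the sketch below extracts only the first principal component and its contribution (share of total variance) from a small data matrix. It uses power iteration on the sample covariance matrix, which is a simple stand-in chosen for this illustration and not necessarily the procedure of [32]. All names and values are hypothetical.

```python
import math


def first_principal_component(rows, iterations=500):
    """Return (coefficients, contribution, scores) of the first principal component.

    rows: list of samples, each a list of d already-normalized variables.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[j] * c[k] for c in centered) / (n - 1) for k in range(d)]
           for j in range(d)]
    total_variance = sum(cov[j][j] for j in range(d))
    if total_variance == 0:
        raise ValueError("all variables are constant")

    # Power iteration; the non-uniform start lowers the chance of beginning
    # exactly orthogonal to the leading eigenvector.
    v = [1.0 + 0.1 * j for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[j][k] * v[k] for k in range(d)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]

    eigenvalue = sum(v[j] * sum(cov[j][k] * v[k] for k in range(d))
                     for j in range(d))
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, eigenvalue / total_variance, scores


# Two perfectly correlated hypothetical variables: one component explains everything.
coefficients, contribution, scores = first_principal_component(
    [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]])
```

The returned `scores` play the role of a single comprehensive factor that replaces the original correlated variables.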

### 3.1.3. Autocorrelation Analysis of Electrical Load

The pre-ordered autocorrelation load is taken as the input of the proposed load forecasting model in this paper, and the autocorrelation time lag needs to be determined. The pre-ordered autocorrelation load time lag is determined according to the autocorrelation function (ACF) in [33].
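For reference, a common sample estimator of the autocorrelation function is sketched below. It is assumed here for illustration; the exact estimator used in [33] is not reproduced in this paper. The function name and the toy series are hypothetical.

```python
def autocorrelation(series, max_lag):
    """Return the sample ACF at lags 1..max_lag.

    Uses r_k = sum_t (x_t - mean)(x_{t+k} - mean) / sum_t (x_t - mean)^2.
    """
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    acf = []
    for lag in range(1, max_lag + 1):
        num = sum((series[t] - mean) * (series[t + lag] - mean)
                  for t in range(n - lag))
        acf.append(num / denom)
    return acf


# Toy series (illustrative only, not load data).
toy_acf = autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0], max_lag=2)
```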

### *3.2. Model Description*

The proposed combined model integrates two individual models (i.e., OPT-SVM and ENN) based on the time-varying optimal weight selection. OPT-SVM and ENN both have good forecasting performance.

### 3.2.1. Optimal Support Vector Machine (OPT-SVM)

The basic idea of the SVM algorithm is to construct an optimal hyperplane in the sample space or feature space that makes the margin between different types of data points as large as possible. Given the sample set {*x<sub>i</sub>*, *y<sub>i</sub>*}, *i* = 1, 2, . . . , *n*, *x<sub>i</sub>* is the input sample, *y<sub>i</sub>* is the output sample, and *n* is the total number of samples. The nonlinear load forecasting SVM model is expressed as

$$f(x) = \omega^T \varphi(x) + b \tag{3}$$

where *ϕ*(*x*) denotes the nonlinear mapping from the input space to a high-dimensional feature space, *ω* is the weight vector, and *b* is a constant. They can be obtained by the principle of structural risk minimization, and the optimization target can be expressed as follows:

$$\frac{1}{2}\left\|\omega\right\|^2 + c\sum_{i=1}^n \left| y_i - \omega^T \varphi(x_i) - b \right|_\varepsilon \tag{4}$$

with *c* as penalty factor.

Introducing slack variables *ξ<sub>i</sub>* and *ξ<sub>i</sub>*<sup>∗</sup> into (4), it turns into

$$\frac{1}{2} \left\| \omega \right\|^2 + c \sum_{i=1}^n (\xi_i + \xi_i^*), \quad \text{s.t.} \begin{cases} y_i - \left( \omega^T \varphi(x_i) + b \right) \le \varepsilon + \xi_i^* \\ \left( \omega^T \varphi(x_i) + b \right) - y_i \le \varepsilon + \xi_i \\ \xi_i, \xi_i^* \ge 0, \; i = 1, \dots, n \end{cases} \tag{5}$$

By introducing Lagrange multipliers *α<sub>i</sub>* and *α<sub>i</sub>*<sup>∗</sup>, we transform Equation (5) into

$$f(x, \alpha) = \sum_{i=1}^{n} (\alpha_i^* - \alpha_i) K(x_i, x) + b \tag{6}$$

where *K*(*x<sub>i</sub>*, *x*) is the kernel function. Here, we choose the Gauss radial basis function (GRBF) as the kernel function for its good analytical performance. GRBF is expressed as

$$K(x_i, x_j) = e^{-\gamma \left\| x_i - x_j \right\|^{2}} \tag{7}$$

where *γ* is a tunable parameter.
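The following sketch evaluates Equations (6) and (7) directly, assuming the dual coefficients (*α<sub>i</sub>*<sup>∗</sup> − *α<sub>i</sub>*), the samples *x<sub>i</sub>*, and *b* have already been obtained by training. It does not perform the training itself, and every name and number in it is hypothetical.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """Equation (7): K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


def svr_predict(x, samples, dual_coefs, b, gamma):
    """Equation (6): f(x) = sum_i (alpha_i^* - alpha_i) K(x_i, x) + b.

    dual_coefs[i] holds the difference (alpha_i^* - alpha_i) for samples[i].
    """
    return sum(coef * rbf_kernel(x_i, x, gamma)
               for x_i, coef in zip(samples, dual_coefs)) + b


# Hypothetical trained quantities (illustrative only).
samples = [[0.0], [1.0]]
dual_coefs = [2.0, -1.0]
prediction = svr_predict([0.0], samples, dual_coefs, b=0.5, gamma=1.0)
```

A larger *γ* makes the kernel decay faster with distance, so each sample influences only its close neighbourhood.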

The error penalty parameter *c* and the kernel function parameter *γ* are the two key parameters of the SVM. Compared with other traditional algorithms, support vector machines predict better from small samples, and their strong learning ability and high accuracy on high-dimensional data are also prominent advantages. However, the prediction performance of a support vector machine depends closely on the quality of parameter estimation. The commonly used grid search for parameter optimization is computationally expensive and offers limited estimation accuracy. Using intelligent optimization algorithms to find the globally optimal parameter combination has gradually become an effective way to improve the application of support vector machines. Particle swarm optimization (PSO), a representative algorithm with clear principles and simple calculations, has been widely used to solve optimization problems. PSO is a swarm intelligence evolutionary computation technique based on iterative optimization. Furthermore, the simulated annealing algorithm is introduced into the PSO algorithm to increase its global optimization ability. The specific steps of the algorithm are given in [34]. Here, we use PSO to solve for the optimal SVM parameters and avoid lengthy calculations. Accordingly, this section builds an optimized support vector machine model by coupling the simulated annealing particle swarm optimization algorithm with the support vector machine, in order to study short-term load forecasting under different meteorological variable input conditions and to analyze the influence of meteorological factors on load forecasting.

According to the description of simulated annealing PSO and SVM above, the algorithm of OPT-SVM for electric load forecasting proceeds as follows:


### 3.2.2. Elman Neural Network (ENN)

ENN is based on the basic structure of the BP neural network and adds a feedback (context) layer. Such internal feedback increases the network's ability to deal with dynamic information, which means the system can adapt to time-varying characteristics and is more suitable for building forecast models of time series.

ENN includes an input layer, hidden layer, context layer, and output layer. The connection of the input layer, the hidden layer, and the output layer is similar to the BP neural network. The input layer performs signal transmission and the output layer plays a weighting role. The context layer is used to memorize the output value of the hidden layer unit at the previous moment, and is usually considered as a delay operator network. The structure diagram is shown in Figure 3.

**Figure 3.** The structure of Elman Neural Network.

The mathematical model of ENN is

$$x(k) = f(\omega_1 x_c(k) + \omega_2 i(k-1)) \tag{8}$$

$$x_c(k) = a \cdot x_c(k-1) + x(k-1) \tag{9}$$

$$y(k) = g(\omega_3 x(k)) \tag{10}$$

where *y* is the *m*-dimensional output vector, *x* is the *n*-dimensional hidden layer vector, *i* is the *r*-dimensional input vector, *x<sub>c</sub>* is the *n*-dimensional feedback state vector, *a* is the self-connected feedback gain factor, *ω*<sub>1</sub> is the connection weight between the context layer and the hidden layer, *ω*<sub>2</sub> is the connection weight between the input layer and the hidden layer, *ω*<sub>3</sub> is the connection weight between the hidden layer and the output layer, *g*(·) is the transfer function of the output neuron, and *f*(·) is the transfer function of the hidden layer neuron.
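A forward pass through Equations (8)–(10) can be sketched as follows. The choices of tanh for *f*, the identity for *g*, zero initial states, and the weight values are assumptions made for this illustration only; training of the weights is not shown.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def elman_step(i_prev, x_prev, xc_prev, w1, w2, w3, a):
    """Advance the network by one time step k.

    i_prev is i(k-1), x_prev is x(k-1), xc_prev is x_c(k-1).
    """
    # Equation (9): context state from the previous context and hidden states.
    xc = [a * c + h for c, h in zip(xc_prev, x_prev)]
    # Equation (8): hidden state, with f assumed to be tanh.
    pre = [p + q for p, q in zip(mat_vec(w1, xc), mat_vec(w2, i_prev))]
    x = [math.tanh(v) for v in pre]
    # Equation (10): output, with g assumed to be the identity.
    y = mat_vec(w3, x)
    return y, x, xc


def elman_forward(inputs, w1, w2, w3, a):
    """Run the network over an input sequence, starting from zero states."""
    n_hidden = len(w1)
    x = [0.0] * n_hidden
    xc = [0.0] * n_hidden
    outputs = []
    for i_prev in inputs:
        y, x, xc = elman_step(i_prev, x, xc, w1, w2, w3, a)
        outputs.append(y)
    return outputs


# One input, one hidden neuron, one output; hypothetical weights.
outputs = elman_forward([[0.3], [0.0]], w1=[[0.5]], w2=[[1.0]], w3=[[2.0]], a=0.0)
```

In the second step the input is zero, yet the output is nonzero because the context layer feeds the previous hidden state back into the hidden layer.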

Like the BP neural network, the number of hidden neurons is the core parameter to determine the model structure. A trial and error process is needed to decide the optimal number of hidden neurons. Once the model structure is settled, the connection weights between different layers are determined by self-learning using the gradient descent method. The training and prediction process by ENN is shown in Figure 4.

**Figure 4.** The flowchart of electrical load forecast by ENN.

### 3.2.3. Combined Forecasting Model

For the same prediction problem, the optimized combination of multiple different prediction models can effectively improve the prediction accuracy of the model under certain conditions. The combination model development includes three key steps: (1) selection of proper individual models, (2) construction of the mathematical expression of the combination from multiple individual models, and (3) solving the weight of each single model through optimization algorithms.

As the OPT-SVM and ENN are selected here as the candidates for the combined model, the mathematical expressions of the combination model are then to be determined. We focus on three forms of expression: linear weighted average, geometric weighted average, and harmonic weighted average; they are expressed as follows:

$$f_{c,t} = \alpha_1(t) f_{1,t} + \alpha_2(t) f_{2,t} + \dots + \alpha_m(t) f_{m,t} \tag{11}$$

$$f_{c,t} = f_{1,t}^{\alpha_1(t)} f_{2,t}^{\alpha_2(t)} \cdots f_{m,t}^{\alpha_m(t)} \tag{12}$$

$$f_{c,t} = \frac{1}{\frac{\alpha_1(t)}{f_{1,t}} + \frac{\alpha_2(t)}{f_{2,t}} + \dots + \frac{\alpha_m(t)}{f_{m,t}}} \tag{13}$$

where *t* represents the *t*-th forecast period; *f<sub>c,t</sub>* is the combined prediction result; *f*<sub>1,*t*</sub>, · · · , *f<sub>m,t</sub>* are the forecast values of the *m* individual prediction models; *α*<sub>1</sub>(*t*), · · · , *α<sub>m</sub>*(*t*) are the weights of the individual models; and *α*<sub>1</sub> + *α*<sub>2</sub> + · · · + *α<sub>m</sub>* = 1, *α<sub>i</sub>* ≥ 0, *i* = 1, 2, . . . , *m*.
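The three combination forms in Equations (11)–(13) can be compared on a single forecast period with the sketch below. The function names and the two forecast values are hypothetical; positive forecasts are assumed, as required by the geometric and harmonic forms.

```python
def check_weights(weights):
    """Enforce the constraints: non-negative weights that sum to one."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")


def linear_combination(forecasts, weights):
    """Equation (11): weighted arithmetic mean."""
    check_weights(weights)
    return sum(w * f for w, f in zip(weights, forecasts))


def geometric_combination(forecasts, weights):
    """Equation (12): weighted geometric mean (forecasts must be positive)."""
    check_weights(weights)
    result = 1.0
    for w, f in zip(weights, forecasts):
        result *= f ** w
    return result


def harmonic_combination(forecasts, weights):
    """Equation (13): weighted harmonic mean (forecasts must be nonzero)."""
    check_weights(weights)
    return 1.0 / sum(w / f for w, f in zip(weights, forecasts))


# Hypothetical forecasts (MW) from two individual models for one period.
forecasts = [100.0, 120.0]
weights = [0.5, 0.5]
linear = linear_combination(forecasts, weights)
geometric = geometric_combination(forecasts, weights)
harmonic = harmonic_combination(forecasts, weights)
```

For positive forecasts and identical weights, the harmonic result never exceeds the geometric one, which never exceeds the linear one.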

Solving the time-varying weights can be regarded as an optimization problem as follows:

$$\min Q = Q(\alpha_1, \alpha_2, \dots, \alpha_m), \quad \text{s.t.} \begin{cases} \alpha_1 + \alpha_2 + \dots + \alpha_m = 1 \\ \alpha_i \ge 0, \; i = 1, 2, \dots, m \end{cases} \tag{14}$$

where *Q* is the objective function that indicates the combined forecast accuracy corresponding to a certain set of weights. In this paper, the mean absolute percentage error (MAPE) is used as the optimization objective function.

In the *t*-th forecast period, assign all the data before the *t*-th period, including simulated values of the modeling period, the predicted values of the forecast period, and the actual measured values of the corresponding period, to Equation (14). Equation (14) is then solved by the sequential quadratic programming method. In the (*t* + 1)-th period, the predicted value and measured value of (*t* + 1)-th period are incorporated into the known knowledge of the weight solution problem by real-time learning of the latest prediction experience, thereby updating the weights of individual models.
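The updating scheme can be illustrated for two models and the linear form of Equation (11). Note that this sketch replaces sequential quadratic programming with a plain grid search over *α*<sub>1</sub> in [0, 1] (with *α*<sub>2</sub> = 1 − *α*<sub>1</sub>), which is enough for a one-dimensional problem but is not the solver used in this paper. All names and data are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    return 100.0 / len(actual) * sum(abs((a - p) / a)
                                     for a, p in zip(actual, predicted))


def best_linear_weight(actual, f1, f2, steps=1000):
    """Grid search for alpha_1 minimizing MAPE of alpha_1*f1 + (1-alpha_1)*f2."""
    best_alpha, best_error = 0.0, float("inf")
    for s in range(steps + 1):
        alpha = s / steps
        combined = [alpha * u + (1.0 - alpha) * v for u, v in zip(f1, f2)]
        error = mape(actual, combined)
        if error < best_error:
            best_alpha, best_error = alpha, error
    return best_alpha, best_error


def time_varying_weights(actual, f1, f2, first_period):
    """For each period t >= first_period, fit alpha_1 on all data before t."""
    weights = []
    for t in range(first_period, len(actual)):
        alpha, _ = best_linear_weight(actual[:t], f1[:t], f2[:t])
        weights.append(alpha)
    return weights


# Hypothetical loads (MW): model 1 is 10 MW too low, model 2 is 10 MW too high.
actual = [100.0, 110.0, 120.0, 130.0]
model_1 = [90.0, 100.0, 110.0, 120.0]
model_2 = [110.0, 120.0, 130.0, 140.0]
alpha_path = time_varying_weights(actual, model_1, model_2, first_period=2)
```

Each new period extends the fitting window by one observation, so the weights follow the most recent relative performance of the two models.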

### **4. Results and Discussion**

### *4.1. Correlation Analysis Results of Meteorological Factors and Daily Load in Four Seasons*

Figure 5 shows the correlation analysis results between electric load and meteorological variables. The dependent variable is the electric load, and the independent variables are the meteorological factors. The color bars in the right of each figure represent the range of the correlation coefficient, and the numbers in the figure represent the correlation coefficient. When the correlation coefficient is positive, it shows a trend toward the upper right side; when the correlation coefficient is 1, it is a straight line; when the correlation coefficient is negative, it shows a trend toward the lower left side. The smaller the correlation coefficient, the closer the graph is to a circle. The first row in the figure shows the correlation coefficients between electric load and various meteorological factors. The other rows in the figure are the correlations between various meteorological factors.

It can be seen from the correlation analysis results that the correlation between summer load and meteorological factors is the most obvious, followed by autumn and winter loads, and the correlation between spring load and meteorological factors is slightly weaker. In general, air temperature (surface temperature) has the most significant impact on the seasonal load. Besides, among meteorological factors, the correlation between air temperature and surface temperature reaches 0.9, and the correlation between relative humidity and evaporation in spring, summer, and autumn is also relatively high. This would cause multicollinearity, which should be carefully treated.

Version August 23, 2021 submitted to *Processes* 11 of 22

**Figure 5.** Correlation analysis of daily load and meteorological factors in different seasons. (**a**) Spring (March 1 to June 3); (**b**) Summer (June 4 to September 3); (**c**) Autumn (September 4 to December 3); (**d**) Winter (December 4 to February 28 of the following year).

### *4.2. Analysis of Time-Varying Characteristics of the Resident Load*

The total regional power load is composed of three parts: commercial area, industrial area, and residential area, among which the hourly power load of commercial area and industrial area has obvious periodicity, and the influence of environmental factors is relatively small. Resident electric load is difficult to predict due to the influence of people's lifestyle and weather factors. This paper mainly studies the daily load forecasting, and tries to forecast the hourly residential load with an additional example.

The hourly residential load on typical days of the four seasons in the region is shown in Figure 6 (April 15 in spring, August 21 in summer, October 22 in autumn, and February 14 in winter are selected as typical days). It can be seen from the figure that the hourly residential load characteristics and trends on typical days in different seasons are relatively similar, and the peak load is mainly concentrated at 20:00.

**Figure 6.** Hourly load changes of typical daily residents in an industrial park in different seasons.

The hourly load trend of residents on a typical day is similar. The peak electricity consumption is mainly concentrated from 17:00 to 22:00. As this period is the time when residents are off work, the load in the residential area reaches the peak of the day.

Due to weather changes, residential areas use more air conditioning in summer and more air conditioning and heating in winter, so the hourly peak and valley loads of residents in summer and winter are higher than those in spring and autumn. The peak difference between summer and spring is about 15 MW. The peak difference in autumn is nearly 20 MW.

### *4.3. PCA Results of the Meteorological Factors*

PCA is conducted on five meteorological variables of temperature, wind speed, relative humidity, evaporation, and surface temperature on a daily scale in the four seasons. The results are presented in Table 1.

Taking winter as an example, the comprehensive meteorological factor (the first principal component) contains 41.27% of the information. The normalized variables temperature, relative humidity, and surface temperature have similar coefficient values greater than 0.5, indicating that these three variables are closely related to the meteorological conditions. At the same time, the new factor contains most of the main information of each representative variable, so it can be used as a new meteorological feature to represent the meteorological factors. Only the first principal component is used as the comprehensive meteorological factor input to the proposed model.


**Table 1.** The first principal component coefficients and contribution of meteorological factors on daily scales. <sup>358</sup> of each representative variable, which can be used as a new meteorological feature to <sup>359</sup> represent the meteorological factor. Only the first principal component is used as the


Version August 23, 2021 submitted to *Processes* 13 of 22


### *4.4. Pre-Ordered Autocorrelation Load Time Lag Results*

The autocorrelation function (ACF) values of the daily load in the four seasons are shown in Figure 7. The blue line in the figure marks the critical value of the autocorrelation function at the 95% confidence level (the critical value decreases as the time series gets longer). Based on the length of the load sequence in each of the four seasons, the critical values were determined from the critical value table of the correlation coefficient to be 0.205, 0.209, 0.210, and 0.211. When the autocorrelation function value is greater than the critical value, the lagged load is significantly correlated with the load of the target period; otherwise, the autocorrelation is not significant. The maximum time lag at which the positive autocorrelation function value exceeds the critical value is taken as the pre-ordered autocorrelation load time lag. As a result, the pre-ordered autocorrelation load time lag is four days in spring, seven days in summer and autumn, and three days in winter.
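The lag selection rule described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the paper: the function names are hypothetical, the rule is read literally as "the largest lag whose ACF value exceeds the critical value", and when no critical value is supplied the code falls back on the common large-sample approximation 1.96/√n, which is an assumption (the paper reads its critical values from a table).

```python
import math

def acf(series, max_lag):
    """Sample autocorrelation of `series` for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]

def preorder_lag(series, max_lag, critical=None):
    """Largest lag whose ACF value exceeds the critical value (0 if none does)."""
    if critical is None:
        # Assumed approximation; a tabulated value can be passed instead.
        critical = 1.96 / math.sqrt(len(series))
    values = acf(series, max_lag)
    significant = [k for k, r in enumerate(values, start=1) if r > critical]
    return max(significant, default=0)
```

For a seasonal daily load series, `preorder_lag(load, max_lag=10, critical=0.205)` would return the number of preceding days to use as model inputs under these assumptions; the choice of `max_lag` is left to the user.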

**Figure 7.** ACF of daily load in four seasons.

### *4.5. Evaluation Criteria*

In this paper, the mean absolute percentage error (MAPE) and the maximum absolute percentage error (maxAPE) are selected to evaluate the predictive performance of the models. Because of its good stability, MAPE can serve as a benchmark evaluation criterion [35]. They are defined as

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|\hat{y}_i - y_i|}{y_i} \times 100\% \tag{15}$$

$$\mathrm{maxAPE} = \max_{i = 1, 2, \ldots, n} \left( \frac{|\hat{y}_i - y_i|}{y_i} \times 100\% \right) \tag{16}$$

where *n* is the number of samples, $\hat{y}_i$ is the predicted value, and $y_i$ is the actual value.
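The two criteria in Equations (15) and (16) translate directly into code. The sketch below uses hypothetical function names and assumes that all actual values are positive, as is the case for load data, since the formulas divide by the actual value itself.

```python
def ape(predicted, actual):
    """Absolute percentage error of each sample, in percent."""
    return [abs(p - a) / a * 100.0 for p, a in zip(predicted, actual)]

def mape(predicted, actual):
    """Mean absolute percentage error, Equation (15)."""
    errors = ape(predicted, actual)
    return sum(errors) / len(errors)

def max_ape(predicted, actual):
    """Maximum absolute percentage error, Equation (16)."""
    return max(ape(predicted, actual))
```

For example, with actual loads of 100, 200, and 400 and predictions of 110, 190, and 400, the absolute percentage errors are 10%, 5%, and 0%, so the MAPE is 5% and the maxAPE is 10%.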

### *4.6. Load Forecasting Results*

First, the forecasting results of the individual models are summarized and compared in Table 2 and Figure 8. Figure 8 shows the error statistics of the OPT-SVM and ENN models for the daily load of the four seasons in the simulation and forecast periods. In the figure, the cross (×) marks the mean, the horizontal line in the box marks the median, the upper and lower edges of the box are the 75% and 25% quantiles, and the ends of the whiskers are the maximum and minimum values excluding outliers. For the daily load in spring, summer, and autumn, the mean error and the 75% and 25% quantiles of ENN are significantly smaller than those of OPT-SVM, whereas in winter the mean error and the 75% quantile of ENN are larger than those of OPT-SVM. Therefore, ENN forecasts the daily load better in spring, summer, and autumn, while OPT-SVM is more suitable for daily load forecasting in winter.


**Table 2.** Forecasting accuracy comparison of OPT-SVM and ENN.

For the hourly residential load forecasts in winter (Figure 9) and summer (Figure 10), adding meteorological factors to the pre-order autocorrelation load helps to improve the prediction accuracy. Three support vector machine models are compared: (1) OPT-SVM, which does not consider meteorological variables and takes only the pre-order autocorrelation load as input; (2) OPT-SVM-T, which takes the pre-order autocorrelation load and air temperature as input; and (3) OPT-SVM-PCA, which takes the pre-order autocorrelation load and the comprehensive meteorological factor as input (this paper uses the first principal component of the meteorological variables as the comprehensive meteorological factor). Among them, OPT-SVM-PCA gives the better forecast of the summer residential load, while OPT-SVM-T is more effective in improving the accuracy of the winter residential load forecast.
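The difference between the three model variants lies only in their inputs, which can be sketched as below. The helper `build_samples` is hypothetical, and appending the meteorological value of the target period itself (rather than of an earlier period) is an assumption made for illustration; the paper does not state the alignment in this section.

```python
def build_samples(load, lag, extra=None):
    """Build (inputs, targets) for one-step-ahead load forecasting.

    Each input row holds the `lag` loads that precede the target period.
    If `extra` is given (e.g. temperature for OPT-SVM-T, or first principal
    component scores for OPT-SVM-PCA, aligned with `load`), its value for the
    target period is appended to the row. This alignment is an assumption.
    """
    inputs, targets = [], []
    for t in range(lag, len(load)):
        row = list(load[t - lag:t])
        if extra is not None:
            row.append(extra[t])
        inputs.append(row)
        targets.append(load[t])
    return inputs, targets
```

Calling `build_samples(load, lag)` gives the inputs of the load-only variant, while passing a temperature series or a principal component series as `extra` gives the inputs of the other two variants; the resulting rows could then be fed to any regression model.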


**Figure 8.** Statistical chart of total load error in four seasons. (**a**) Spring (March 1 to June 3); (**b**) Summer (June 4 to September 3); (**c**) Autumn (September 4 to December 3); (**d**) Winter (December 4 to February 28 of the following year).

The forecast accuracy of the individual models changes over time. Even a model with lower average forecasting accuracy contains useful information in some periods that can help improve the forecasting accuracy. The prediction accuracy of each individual model is positively correlated with its weight, and the weights also reflect the contribution of each individual model's predicted value to the combined forecasting model. Based on the results above, we therefore analyze the time-varying weights of each single model.

**Figure 9.** Optimized support vector machine model to predict the hourly load of residents in winter: (**a**) OPT-SVM (**b**) OPT-SVM-T (**c**) OPT-SVM-PCA.

**Figure 10.** Hourly power load forecasting in residential area in summer: (**a**) OPT-SVM (**b**) OPT-SVM-T (**c**) OPT-SVM-PCA.

Figure 11 shows the time-varying weights of the individual models in the linear weighted average daily load forecasts for the four seasons. The ENN weights are higher than the OPT-SVM weights in every forecast period in spring, summer, and autumn, indicating that the forecast accuracy of ENN is generally higher than that of OPT-SVM in these three seasons. In winter, the performance of the two models reverses: OPT-SVM predicts better and has correspondingly greater weights than ENN. In addition, the time-varying characteristics of the OPT-SVM and ENN weights differ between seasons: (1) Before the 13th day in spring, the weights of each single model remain basically stable; after that, the ENN weights drop slightly while the OPT-SVM weights increase. (2) In summer, the weights of the two models in the combined model change little; in the late forecast period in particular, the ENN weights stay stable and close to 1. (3) In the autumn forecast period, as forecast experience accumulates, the ENN weights slowly decrease while the OPT-SVM weights increase accordingly. (4) In the winter forecast period, the ENN weights lie between 0.4 and 0.5 and the OPT-SVM weights between 0.5 and 0.6. According to their time-varying characteristics, the winter weights of ENN and OPT-SVM can be divided into three segments, and the two weights move closer to each other in the middle segment.

**Figure 11.** Time-varying weights of individual models through linear weighted average daily load forecasting.

Figure 12 shows the time-varying weights of the individual models in the harmonic weighted average daily load forecasts for the four seasons. Comparison with Figure 11 shows that, although the mathematical expressions of the combined predictions differ, the time-varying characteristics of the ENN and OPT-SVM weights do not change significantly. Due to space limitations, the time-varying weights of the geometric weighted average method are not shown.
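To make the difference between the three combination rules concrete, the sketch below combines the forecasts of several models for a single period using the textbook forms of the linear, geometric, and harmonic weighted averages. The function names are hypothetical, the weights are taken as given and assumed to sum to 1, and the exact expressions and weight-update scheme used in the paper may differ from this illustration.

```python
import math

def _check_weights(weights):
    """Reject weight vectors that do not sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

def linear_combination(forecasts, weights):
    """Linear weighted average: sum of w_i * f_i."""
    _check_weights(weights)
    return sum(w * f for w, f in zip(weights, forecasts))

def geometric_combination(forecasts, weights):
    """Geometric weighted average: product of f_i ** w_i (positive forecasts)."""
    _check_weights(weights)
    return math.exp(sum(w * math.log(f) for w, f in zip(weights, forecasts)))

def harmonic_combination(forecasts, weights):
    """Harmonic weighted average: 1 / sum of (w_i / f_i) (positive forecasts)."""
    _check_weights(weights)
    return 1.0 / sum(w / f for w, f in zip(weights, forecasts))
```

With time-varying weights, each function would be called once per forecast period with that period's weights. For the same positive forecasts and weights, the harmonic result never exceeds the geometric result, which in turn never exceeds the linear result.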

Finally, the prediction accuracy of the individual and combined models is compared in Table 3. The following observations can be made. (1) For the daily load forecasting in spring, the MAPEs and maxAPEs of the three combined forecasting models are generally close to each other; their MAPEs are lower than those of OPT-SVM and ENN, and their maxAPEs lie between those of the two individual models. (2) For the daily load forecasting in summer, the MAPEs and maxAPEs of the three combined models are close to or slightly lower than those of OPT-SVM and ENN. (3) For the daily load forecasting in autumn, both the MAPEs and the maxAPEs of the combined models are greater than those of the individual models. (4) In winter, the combined forecasting models significantly decrease the MAPEs, while their maxAPEs lie between those of OPT-SVM and ENN.

**Figure 12.** Time-varying weights of individual models through harmonic weighted average daily load forecasting.


**Table 3.** Prediction result comparison of the individual models and combined models.

We further rank the prediction errors of the five models in each prediction period. Figure 13 shows the ranking results; the smaller the blue circle in the figure, the smaller the absolute percentage error. As shown in Figure 13, the harmonic weighted average method attains the smallest forecast error in significantly more periods than the other two combined forecasting models. Based on the above analysis, the harmonic weighted average is the best of the three combined models for daily load forecasting in the four seasons. For the daily load forecasting in spring and winter, the combined model can achieve better performance than a single model; for summer, the accuracy of the combined models is basically the same as that of the individual models; for autumn, the prediction accuracy of the combined models lies between that of OPT-SVM and ENN.
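A per-period ranking of this kind can be reproduced with a few lines of code. The sketch below is an illustration with hypothetical names and made-up error values, not the authors' procedure; ties are broken by the order in which the models are listed, which is an arbitrary choice.

```python
def rank_models(ape_by_model):
    """Rank models in each period by absolute percentage error (1 = smallest).

    `ape_by_model` maps a model name to its list of per-period errors.
    """
    names = list(ape_by_model)
    periods = len(next(iter(ape_by_model.values())))
    ranks = {name: [] for name in names}
    for t in range(periods):
        ordered = sorted(names, key=lambda name: ape_by_model[name][t])
        for position, name in enumerate(ordered, start=1):
            ranks[name].append(position)
    return ranks

def count_best(ranks):
    """Number of periods in which each model has the smallest error."""
    return {name: r.count(1) for name, r in ranks.items()}

# Made-up errors for three periods, for illustration only.
example = {
    "OPT-SVM": [3.0, 1.0, 2.0],
    "ENN": [2.0, 2.0, 1.0],
    "Harmonic": [1.0, 3.0, 3.0],
}
example_ranks = rank_models(example)
```

Counting the periods in which a model is ranked first, as `count_best` does, corresponds to the comparison of the combined models made in the text.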

**Figure 13.** Prediction error ranking results for each period with different models. (**a**) Spring (March 1 to June 3); (**b**) Summer (June 4 to September 3); (**c**) Autumn (September 4 to December 3); (**d**) Winter (December 4 to February 28 of the following year).

Examining how much the combined model improves the forecasting accuracy in each time period deepens the understanding of the effectiveness of combined forecasting. Figure 14 compares the forecast accuracy of the combined model with that of the single models for the total daily load in the four seasons. As can be seen from Figure 14, when predicting the total daily load in spring, the harmonic weighted average has a positive error reduction relative to OPT-SVM in many periods, and its average prediction accuracy is considerably better than that of OPT-SVM. Relative to ENN, the errors are slightly reduced in only a few periods, while in some periods the error increases slightly under the influence of OPT-SVM. The net effect is that the average error of the harmonic weighted average is smaller than that of OPT-SVM. For the total load in autumn, although the forecast error is effectively reduced in some periods, the errors in two periods increase significantly, so the average accuracy of the harmonic weighted average forecast is worse than that of OPT-SVM. This analysis shows that the improvement in the average forecast accuracy of the combined forecasting model roughly reflects the overall error reduction across periods. When the average forecasting accuracy is improved, the combined forecasting model can be expected to be closer to the actual measurements in more time periods. When the average prediction accuracy is worse than that of the best single model, it is worth considering how to make full use of the error reductions achieved in individual periods, so as to provide useful guidance for further improving the prediction accuracy.
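The per-period comparison discussed above can be expressed as a simple difference of errors. In the sketch below, the error reduction is assumed to be the single model's absolute percentage error minus that of the combined model, so that a positive value means the combined forecast is closer to the measurement; the function names are hypothetical and the sign convention is an assumption.

```python
def error_reduction(ape_single, ape_combined):
    """Per-period error reduction; positive values favor the combined model."""
    return [s - c for s, c in zip(ape_single, ape_combined)]

def summarize(reductions):
    """Return (periods improved, periods worsened, mean reduction)."""
    improved = sum(1 for r in reductions if r > 0)
    worsened = sum(1 for r in reductions if r < 0)
    mean_reduction = sum(reductions) / len(reductions)
    return improved, worsened, mean_reduction
```

A positive mean reduction together with a large count of improved periods matches the situation described for spring, whereas a few periods with large negative values can outweigh many small gains, as described for autumn.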


**Figure 14.** The forecast error reduction effects of the combined model by harmonic weighted average compared with the single models. (**a**) Spring (March 1 to June 3); (**b**) Summer (June 4 to September 3); (**c**) Autumn (September 4 to December 3); (**d**) Winter (December 4 to February 28 of the following year).

### **5. Conclusions**

Electric load forecasting plays an important role in the power system. To balance the supply and demand of the power system in integrated energy, it is necessary to establish a scientific model for power load forecasting. This paper takes an industrial park in Nantong as an example and establishes forecasting models that consider meteorological factors.

This paper establishes prediction models that consider meteorological factors, including an optimized support vector machine model, an Elman neural network model, and time-varying weight combination models, studies the impact of meteorological factors on power load forecasting, and evaluates the applicability of the models. Based on an analysis of the variation characteristics of regional power load in different seasons and on typical days, the correlation between the load and meteorological factors such as temperature, relative humidity, wind speed, evaporation, and surface temperature is analyzed to provide a basis for selecting key meteorological factors. The analysis shows that temperature has the greatest impact on the load in all four seasons.

When the average forecasting accuracy is improved, the combined forecasting models can obtain prediction results closer to the actual measured values in more periods. The results show that the harmonic weighted average method is recommended as the optimal combined forecasting model. The combined model performs better than the single models, especially for the total load in spring and winter, where the mean absolute percentage error (MAPE) of its forecast is lower than that of the single models.

Improving the accuracy of short-term regional power load forecasting is a long-term task. The forecasting models are diverse and the influencing factors are very complex. This paper only analyzes the impact of some meteorological factors on short-term load forecasting. In the field of power load forecasting, whether in the exploration of load forecasting models or in the comprehensive consideration of influencing factors, many research topics and methods remain worthy of further testing. They can be studied and analyzed from the following perspectives:

1. This paper considers only the influence of meteorological factors. Follow-up research can also take into account the plot type, week type, and cultural activities, and establish a joint model with multiple influencing factors to investigate their combined impact on regional short-term power load.
2. Follow-up research can consider prediction models and methods that make intelligent predictions for more regions, utilization types, and environmental factors.



**Author Contributions:** K.Q. developed the concept, conceived the experiments, designed the study, and wrote the original manuscript. X.W., Y.Y. and K.Q. reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work is supported by the research project "Research on typical mode and key technology of multi station integration based on resource intensive sharing" (CEEC2020-KJ07) of China energy Engineering group.

**Acknowledgments:** This work is supported by the research project "Research on typical mode and key technology of multi station integration based on resource intensive sharing" (CEEC2020-KJ07) of China energy construction group. At the same time, this project is also a part of the research project "Research on integrated energy system planning and business model" (GSKJ2-X03-2021) of China energy Engineering group planning and Design Co., Ltd.

**Conflicts of Interest:** The authors declare no conflict of interest.

### **References**

