1. Introduction
With the rapid development of social economy and the rising power load of the park, the risk of heavy overload borne by the distribution transformer equipment in the park is becoming more and more serious, which lays a hidden danger for the safe operation of the power grid. If the power flow changing trend of distribution transformer could be forecast accurately in the short term, it is of great significance for power companies to forecast the heavy overload of distribution transformer accurately and take timely measures.
There is a great deal of researches on improving the forecasting accuracy, generalization ability, and adaptive ability of the short-term load forecasting model. The traditional load forecasting methods include ARIMA [
1], the grey model method [
2,
3], the support vector machine [
4,
5], and the similar day method [
6,
7]. In recent years, with the continuous improvement of the degree of power grid information, the frequency and accuracy of power grid operation data collection by terminal devices such as on-line monitoring devices also continue to improve [
8]. The artificial intelligence technology represented by deep learning has made the research of load forecasting develop rapidly. The most representative technologies, such as LSTM [
8,
9] and neural network [
10,
11], form a hybrid neural network by increasing the depth of the network, which further improves the ability of high-dimensional feature learning.
The existing researches on load forecasting with strong randomness and high complexity are mainly based on load decomposition and the multi-model fusion forecasting method. The widely used load decomposition methods include the filtering analysis method, the modal empirical decomposition method, and its derivative algorithm [
12,
13,
14]. By introducing the load decomposition method into the forecasting, separating the local characteristic signals of the time series, and selecting the forecasting method separately, the learning ability of the local characteristic signals could be enhanced and the comprehensive forecasting accuracy could be improved. In [
14], the variational mode decomposition is used to effectively reduce the instability of the original sequence, and then the depth gated network is used to model and forecast each sub-component respectively, so that the model has stronger time-series-dependent fitting ability. In [
13], the load data are decomposed by the method of ensemble empirical mode decomposition, and different Elman recurrent neural network models are established for each time-series component, but the process of adjusting parameters of the model is complicated in actual forecasting. In [
12], the GRU neural network and multiple linear regression method are used to forecast the time-series components of different frequencies respectively, and better forecasting results are obtained when the data contain more noise. A method for one-day-ahead forecast of electricity prices combines deterministic components and the residual component [
15]. A hybrid algorithm for EPF combines the local forecasting paradigm, GRNN and has; this algorithm overcame the demerits of the global algorithms and the parameters were optimized by HAS [
16]. A method is proposed to improve prediction accuracy by different machine learning algorithms. the deterministic component and the residual component were combined to get the specific properties of electricity prices [
17]. A framework was proposed for forecasting electricity prices by using DNN, LSTM, GRU and the CNN model [
18]. An approach was proposed for forecasting electricity demand by using linear regression-based models, spline function-based models, and traditional time-series models [
19].
Currently, there are two aspect deficiencies. Firstly, the above researches are usually oriented to the medium and short-term forecasting of wide-area total load, and few studies are specifically aimed at park distribution transformer load forecasting. The reason is that the wide-area total load is superimposed by the distribution transformer load of each park. High-frequency components and noise cancel each other internally, resulting in a smoother load curve, while the park distribution transformer load has obvious periodic characteristics in the long time scale, but there are a large number of high-frequency components and noises in the short time scale, and the random characteristics are more obvious, which greatly increases the fitting difficulty of the forecasting model. Therefore, it is necessary to seek a more accurate forecasting technology in line with the short-term load characteristics of the park.
Secondly, most of the high-frequency component forecasting algorithms in the above literature for high randomness are single, leading to a poor generalization of forecasting models, and boosting the comprehensive forecast ability to load different frequency feature components by multi-model fusion becomes an effective means. In view of the fact that the distribution transformer load in the park is affected by many complex factors, on the basis of its regular stationary characteristic load with strong regularity, the local part shows strong random characteristics, and the traditional single forecasting method makes it difficult to learn the rules accurately.
In this paper, a short-term park load combination forecasting method is proposed based on complementary ensemble empirical mode decomposition (CEEMD). The LSSVR parameters are optimized by SBO. The load data of the historical park is decomposed into series components by CEEMD, and the MLR algorithm is used in the obvious periodic low-frequency component to reduce the overall complexity of the forecasting model. The LSSVR load forecasting model is built for the high-frequency part, which is difficult to forecast. The regularization parameters and kernel function width of the LSSVR model are optimized by the SBO algorithm, and the forecast values of each component are superimposed to get the final forecast value. The effectiveness of the proposed method is proved through simulation results.
A novel satin bower bird optimization algorithm is introduced, and the regularization parameters and kernel function width of the least squares support vector regression model are optimized by its excellent optimization performance. The short-term park load power forecasting simulation results show that the SBO algorithm with dynamic step size and mutation has better convergence and forecasting accuracy compared with the traditional algorithms.
In order to solve the problem of random fluctuation of park load power series, a combined forecasting method is proposed based on complementary ensemble empirical mode decomposition, the satin bower bird optimization algorithm (SBO), and least squares support vector regression (LSSVR) parameters. The original park load time series are decomposed into different time-scale series components by complementary ensemble empirical mode decomposition. As a result, the overall calculation of the model and the complex parameter adjustment work are reduced in the training process.
The simulation results of numerical examples show that the proposed method not only solves the defect of large reconstruction error of ensemble empirical mode decomposition and improves the forecasting accuracy effectively, but reduces the difficulty and complexity of forecasting greatly.
The remainder of this paper is organized as follows:
Section 1 proposes a combined method forecasting strategy.
Section 2 presents the algorithmic details of CEEMD and satin bower bird optimization algorithm.
Section 3 describes the short-term forecasting model.
Section 4 provides experimental results and discussion. Finally, the paper is concluded in
Section 5.
4. Short-Term Forecasting Model of Park Load Based on CEEMD-MLR-LSSVR-SBO
This section may be divided by subheadings. It should provide a concise and precise description of the experimental results, their interpretation, and the experimental conclusions that can be drawn.
4.1. Least Squares Support Vector Regression (LSSVR) Model
LSSVR is suitable for the training and forecasting of small sample data. Its basic idea is to use the nonlinear mapping function
to map the given sample set
to the high-dimensional feature space for fitting. According to the structural risk principle, the error quadratic term is selected as the experience loss of the training set, and the minimized objective function is built. LSSVR regression model is an optimization problem with constraints:
where
represents the equality constraints on the objective function,
is the weight vector,
is the error variable,
is the regularization parameter, and
is the bias vector.
The mathematical derivation of LSSVR can be found in the Ref. [
23], and the final regression function is as follows:
where
is the Lagrange multiplier, and
is the kernel function. In this paper, the radial basis kernel function is selected as follows:
where
is the width of the kernel function.
The regularization parameter
is used to measure the smoothness of the fitting curve and minimize the fitting error [
24]. If its value is too large or too small, the generalization ability of the model would become worse. The value of kernel function width
is too small, which can easily lead to overfitting; when the value of the parameter
is too large, it will lead to the deterioration of the generalization ability of the model [
25]. Regularization parameter
and kernel function width
have a great influence on the learning ability and forecasting accuracy of the LSSVR regression model. In order to improve the forecasting effect of LSSVR, it is necessary to find two suitable parameters for it. The traditional parameter selection method is the cross-validation method.
4.2. Regression Model of Optimizing LSSVR Parameters Based on SBO
In this paper, the SBO optimization algorithm is introduced to optimize two parameters of the LSSVR model. LSSVR modeling usually needs to construct the mapping relationship
between input
and output
, where
is the embedded dimension. Before the regression model training, the original park load data
is normalized, as shown in Formula (17).
The normalized processed data were divided into training and test sets. The training set learning was utilized within a set maximum number of iterations, resulting in the best combination of parameters. The objective function for the two-parameter optimization was as follows:
where
and
are the park load Max and Min, respectively, in the selected data set;
and
are the output power true and forecast values, respectively, for the
i-th sample.
According to the selected training data set and the set objective function, a regression model based on SBO to optimize LSSVR parameters is constructed, and the flow chart is shown in
Figure 2.
The steps of using the SBO algorithm to optimize the regularization parameter and kernel function width of LSSVR are as follows:
Step 1: The value range of LSSVR parameters and is set.
Step 2: The relevant parameters of SBO are set, including the number of nests for initializing SBO, the maximum number of iterations, the upper limit of step size, the probability of variation, the proportional coefficient, and the dimension of variables to be optimized.
Step 3: The location of the initial nest is generated randomly, and the location of each nest represents a set of parameters .
Step 4: The Formula (15) is used to calculate the cost function value of all the nest individuals. According to a comparison of these, the current best nest position is obtained and is kept to the next generation.
Step 5: The probability of all selected nest individuals is calculated by the Formulas (7) and (8).
Step 6: The target nest is determined through the roulette selection mechanism, and the nest location is updated with Formulas (9) and (10).
Step 7: Formulas (11)–(13) are used to mutate all nest individuals randomly.
Step 8: All the populations are combined to obtain the optimal nest. If the end condition of the algorithm is satisfied, the location of the optimal nest is the parameter value of the optimal selection; on the contrary, return to step 4 to continue the iteration.
Step 9: The optimal nest location is taken as the optimal parameter of LSSVR. Thus, the LSSVR regression model is built.
5. Numerical Example Analysis
In order to verify the scientific nature and reliability of the method proposed in this paper, the experimental samples in this paper are the distribution transformer load data of a 10 kV park in a commercial and residential mixed area in Liaoning Province. The time span is from 1 January 2020 to 31 December 2020, and the sampling interval is 15 min (
Figure 3). From the local magnification diagram of the load curve in
Figure 3, it can be seen that the load curve of the distribution transformer in the park has an obvious long periodicity but fluctuates greatly in a short time between the peak and valley of electricity consumption. In this experiment, the data of the first 11 months are divided into training sets, and the data of December are divided into test sets. This method can be widely applied to any region of the world, and we simply use this case in China to demonstrate its implementation.
Firstly, the long-period characteristics and short-term fluctuation characteristics of the load curve are separated by empirical mode decomposition, and the appropriate forecasting method is selected according to the complexity of different components. Then the multiple linear regression is proposed to forecast the low-frequency IMF curve, and the CEEMD-MLR-LSSVR-SBO ensemble learning forecasting model is used to forecast the high-frequency IMF curve. Finally, the low-frequency component and high-frequency component are superimposed and reconstructed, and the final load forecasting curve of the distribution transformer in the park is obtained. Relative mean square error (RMSE) and mean absolute percentage error (MAPE) are used in the forecasting and evaluation indicators, as follows:
5.1. Empirical Mode Decomposition
Fourteen groups of IMF with frequency from high to low are obtained by empirical mode decomposition of distribution and transformer load data, as shown in
Figure 4. In this paper, the reconstruction curve is obtained by superposing each IMF component, and the reconstruction error of EMD load decomposition is calculated by using absolute error (AE) and mean absolute error (MAE). The formula is as follows:
Figure 5 shows the time point error of the reconstruction curve and its box diagram. It can be seen from the figure that the reconstruction error of each sequence point is generally small, concentrated in
, the maximum reconstruction error is only
, and the mean absolute error is
. On the experimental surface, EMD load decomposition can effectively retain the information of the original load curve, and the reconstruction error is of a small order of magnitude, so the influence of reconstruction error can be ignored in the subsequent load forecasting.
Using the 14 groups of IMF time series data decomposed by the above experiment, the sample entropy of each group is calculated. The results are shown in
Figure 6. And the high- and low-frequency groups of IMF are completed according to the sample entropy. In this paper, through the analysis of the entropy distribution of each component in
Figure 5, it can be seen that the sample entropy is gradually close to 0 from the sample IMF9 = 0.02, and the curve changes gently and smoothly, so IMF9~14 are defined as low-frequency components, and the accurate forecasting results can be obtained quickly and efficiently by directly using multiple linear regression to forecast them. IMF1~IMF8 are defined as high-frequency components. The CEEMD-MLR-LSSVR-SBO forecasting model is used to mine the rules to the maximum extent, so as to improve the forecasting accuracy. Although IMF1 to IMF5 are high-frequency components, the numerical variation range is small and the proportion of the original load is small, that is, the superposition of the component forecasting results has little impact on the accuracy of the final forecasting results. The other components have a great influence on the accuracy of the final forecasting results, among which IMF6, IMF7, and IMF14 have the characteristics of a large proportion of the original load, so it is necessary to improve the forecasting accuracy of the above components as far as possible.
5.2. Studies on Time Step of MLR Forecast for Low-Frequency IMFs
After the empirical mode decomposition of the distribution and transformer load data, each component does not have a strong correlation between the power load and the weather type, hour, week, month, and other characteristics in the prior experience, so the input feature is the historical component data of the first 8 time points after the forecasting time point plus the forecasting step.
Because the factors affecting the forecast time point component data in the MLR model are only historical component data, the forecast performance of the model for different time spans needs to be contrasted according to its forecast properties. In this paper, single-step and multi-step forecasting experiments are done for IMF9 and IMF14, respectively. and the RMSE and MAPE are calculated for the forecast outcomes at different forecast step sizes. The results are shown in
Table 1. Among them, the single-step forecasting is forecast 15 min in advance, and the multi-step forecasting is forecast 30 min, 1 h, and 1 day in advance. It was found by the experiment results that when the component frequencies were high, the MLR forecasting method is affected by the forecasting step greatly. For IMF9, forecasting more than 30 min in advance would lead to a large decrease in forecasting accuracy and cannot meet the requirement of MAPE less than 10 for the result of low-frequency, single-component forecasting. Therefore, the unified forecasting step size of each algorithm is set as 30 min in this paper.
5.3. Studies of Different Forecast Methods for High-Frequency IMFs
In order to optimize the performance of CEEMD-MLR-LSSVR-SBO, it is necessary to analyze the forecasting effect of each high-frequency component of each model based on a common forecasting model. Firstly, this paper designs an experiment to compare and analyze the individual forecasting results of each common forecasting model (LSTM, BP, SVM, XGBoost algorithm) on the high-frequency components from IMF1 to IMF8. Then the error difference degree of each model of each high-frequency component is calculated by using the Pearson correlation coefficient.
In the paper, the component data IMF1~IMF8 are brought into the algorithm training of each forecasting model to comprise the effects of forecasting individual forecasting models, and the forecasting accuracy is evaluated by MAPE. The results are shown in
Table 2.
Table 2 shows that when each algorithm carries on the forecasting separately, the IMF7 and IMF8 data in the high-frequency component have obvious continuity characteristics, so compared with other algorithms, the LSTM algorithm uses the cumulative information of previous training effectively, and the forecasting performance is also superior.
As the change frequency from IMF1 to IMF6 is higher and the complexity of the time series increases, the forecasting effect of the algorithm decreases obviously. Among them, the individual forecasting error of XGBoost for the above high-frequency components is lower generally than that of other algorithms, because XGBoost is good at mining and learning deep-feature relationships, which makes the model training fuller. For the high-frequency part of IMF, according to the evaluation index MAPE, CEEMD-MLR-LSSVR-SBO model compared with other general forecasting methods has a forecasting accuracy that is not the best. However, the MAPE value is generally lower than the MAPE average forecast by the individual basic learner. Therefore, it is proved that the CEEMD-MLR-LSSVR-SBO model has higher general applicability than the individual model for different data sets.
5.4. Forecasting Effect of Load Curve
By adding the low-frequency components of IMF9~14 directly forecast by multiple linear regression and the high-frequency components of IMF1~8 forecast by the CEEMD-MLR-LSSVR-SBO ensemble learning model, the final load forecasting curve of the distribution transformer is obtained. In order to continue to verify whether the CEEMD-MLR-LSSVR-SBO forecasting model is able to maintain a high prediction accuracy for consecutive days, six models are used to forecast the load situation for consecutive seven days, respectively. The error analysis is shown in
Table 3.
Comparing the data in
Table 3, it can be seen that the average error of SVR and XGBoost forecasting results for one consecutive week is still lower than that of LSTM and BP. The error of SVR and XGBoost decreases by 0.29% and 0.81% compared with LSTM and BP, while XGBoost decreases by 0.07% and 0.59%. It can be seen that the XGBoost forecasting model is still better than the traditional heuristic algorithm forecasting model. The mean error of SVR also decreased by 0.22% compared to XGBoost. CEEMD-MLR-LSSVR-SBO has the highest accuracy in the continuous week forecasting, and the error fluctuation is not large. The mean value of weekly error decreases by 0.16%, 0.31%, 0.04%, 0.13% and 0.26% compared with the other five models, respectively. The mean root errors also decreased by 0.49%, 0.58, 0.52%, 0.41% and 0.5%, indicating that the CEEMD-MLR-LSSVR-SBO combination method can improve the accuracy and stability of model forecasting.
Among them, the EMD superposition forecasting method selected well by each component model is to select the forecasting model with the best forecasting effect of each IMF component according to the individual forecasting error of the previous basic learner. Finally, the high-frequency and low-frequency components are superimposed to form the load curve of the forecasting period. In the experiment of the simple forecasting method, the historical original load data are used as input, and the load curve of the output forecasting period is forecast directly. The experimental results show that the forecasting results of simple forecasting methods have the problem of peak lag to a certain extent, because the forecasting results are greatly affected by the historical load of input. The forecasting accuracy based on CEEMD-MLR-LSSVR-SBO is obviously higher than that of the simple forecasting method, and the problem of peak lag is obviously improved. Although its forecasting accuracy is slightly lower than that of the EMD superposition forecasting method, the EMD superposition forecasting method based on each component model selects the component forecasting model according to the forecasting error, that is, it is difficult to realize in practical application, while the forecasting based on CEEMD-MLR-LSSVR-SBO overcomes the shortcomings of re-selection of each component forecasting model and has higher practicability.