1. Introduction
Electricity load forecasting is of great importance to the development of modern power systems. Stable and efficient management and scheduling strategies for power systems rely heavily on accurate forecasts of future loads at different times [
1]. Accurate short-term load forecasting can help national grids and energy suppliers cope with the increasing complexity of pricing strategies in future smart grids, further increase the utilization of renewable energy and meet the challenges posed by the development of electricity [
2].
In recent years, the research on electricity load forecasting can be divided into traditional forecasting methods based on mathematical statistics and forecasting methods based on artificial intelligence (AI). Traditional forecasting models can be classified into exponential smoothing [
3], Kalman filtering [
4] and multiple linear regression models [
5]. Traditional forecasting methods rely on statistical models to analyze the regularity of electrical loads during stochastic variations and cannot effectively solve complex problems of non-linearity. In order to better solve the problem of complex nonlinear time series, AI-based forecasting methods have been widely discussed and applied. AI-based prediction methods include artificial neural networks (ANNs) [
6,
7], support vector machines (SVM) [
8,
9] and fuzzy prediction methods [
10]. For example, [
11] combined a real number coded genetic algorithm (GA) with a BP neural network (BPNN) for short-term gas load forecasting. In addition, the feasibility of combining the GA algorithm with the BPNN and the superiority of the prediction model performance were verified by simulation experiments. Another study [
12] proposed a hybrid annual electricity load forecasting model combining a fruit fly optimization algorithm (FOA) and a generalized regression neural network (GRNN). The FOA algorithm was used to determine the parameter values in the GRNN model to improve the forecasting accuracy of the model. A third study [
13] proposed a novel evolutionary algorithm based on the “leader-following” behavior of sheep. This algorithm was combined with an ANN for electricity load forecasting. Simulation experiments demonstrated that this approach can achieve the desired results satisfactorily. A fourth study [
14] combined a particle swarm optimization (PSO) algorithm with an ANN. The PSO algorithm was used to find the optimal network weights and thereby improve the prediction performance of the model.
In addition, the least squares support vector machine (LSSVM) is well suited to handling complex nonlinear power load sequences because of its strong self-learning and self-adaptive capabilities. However, the performance of the LSSVM is highly sensitive to the penalty factor
gam and the RBF kernel parameter
sig. Therefore, researchers used artificial intelligence algorithms to find the optimal parameters. For example, [
15] proposed an electricity load forecasting model that combines the K-means algorithm, a reinforcement-learning-improved Q-PSO algorithm and the LSSVM algorithm. The reinforcement learning strategy improved the PSO search for the LSSVM parameters. The production patterns of industrial customers were identified using the K-means algorithm. The data-specific prediction accuracy was improved at the expense of generality across data sets. Another study [
16] proposed an electricity load forecasting model combining variational modal decomposition (VMD), maximum relevance minimum redundancy (MRMR), a BPNN neural network and the LSSVM model. VMD was used to decompose the original data series into several high-, medium- and low-frequency components. High-frequency variables were predicted using the BPNN model. Medium- and low-frequency variables were predicted using the LSSVM model. Another study [
17] proposed a combined load prediction model based on a singular spectrum analysis data preprocessing approach, a neural network model and the LSSVM model. The artificial intelligence algorithm was used to find the optimal parameters of the model. The prediction results were obtained using a weighted combination of computational methods.
At the same time, researchers have found that standard intelligent optimization algorithms have difficulties with achieving their theoretical optimality due to their own structural deficiencies. Therefore, a multi-strategy optimization approach was used to improve and optimize the intelligent optimization algorithm to address the shortcomings of the algorithm. For example, [
18] proposed a new electricity load forecasting model combining the grey wolf optimizer (GWO) and an Elman neural network. Chaotic sequences and a chaotic cosine inertia weighting strategy were introduced into the GWO algorithm to remedy its insufficient global optimization ability. Another study [
19] proposed a chicken flock optimizer based on a nonlinear dynamic convergence factor algorithm. In addition, the nonlinear dynamic inertia weights and Levy variation strategy were introduced into the optimizer to improve the convergence speed of the algorithm. The improved algorithm was used to find the optimal initial weights and thresholds of the ELM neural network in order to build a new electricity load forecasting model. Another study [
20] proposed a sparrow search algorithm (SSA) and ELM neural network based electricity load forecasting model. The tent chaos strategy and firefly perturbation strategy were used to improve a problem with the SSA algorithm, namely that it had insufficient global search capability and easily fell into a local optimum. Comparative experiments with several competing models also confirmed that the method was effective in improving the prediction accuracy of the model. Another study [
21] proposed a storm flood loss estimation model based on the SSA algorithm, mean impact value (MIV) and LSSVM model for economic losses caused by flooding in metro stations. The paper demonstrated not only the good prediction accuracy of the model but also the reliability of the LSSVM model when dealing with large data time series.
The sparrow search algorithm is a new type of swarm intelligence optimization algorithm proposed in 2020 and is widely used in various fields [
22]. The SSA algorithm has a strong global optimization capability and stability, but it still suffers from an insufficient optimization capability and slow convergence speed, and it easily falls into local optimality when encountering complex problems. Researchers have proposed a number of solutions to address the shortcomings of the SSA algorithm. For example, [
23] proposed a fused cross-variant sparrow search algorithm. The algorithm used tent chaotic mapping to initialize the population to increase the population diversity. The crossover and variation ideas of the genetic algorithm were used to improve the position update equation of the SSA algorithm and help the algorithm to jump out of the local optimum. The chaotic flying sparrow search algorithm was proposed in [
24]. The improved algorithm was optimized mainly in the position update phase of the sparrow. In the search discovery phase, dynamic adaptive weights and levy flight mechanisms were combined to improve the search range and flexibility of the algorithm. The backward learning strategy based on lens imaging was introduced into the follower’s position update process to help the algorithm balance local and global search. Another study [
25] presented an improved sparrow search algorithm applied to the field of photovoltaic microgrids. The improved algorithm used a gravity inverse learning mechanism to initialize the population. Learning coefficients were introduced into the searcher position update to improve the global optimization capability, and a variation operator was introduced into the follower position update to help the algorithm jump out of local optima.
In addition, it was found that data pre-processing techniques can effectively reduce the effect of noise in the raw data on the prediction results. For example, [
26] proposed a combined forecasting model based on improved empirical modal decomposition (IEMD), autoregressive integrated moving average (ARIMA) and wavelet neural network (WNN) optimized based on the FOA algorithm. IEMD was used to reduce the noise of the original data. Simulation experiments not only verified the excellent prediction performance of the model, but also confirmed that data pre-processing has a positive impact on the prediction results. Another study [
27] proposed a novel electricity load forecasting model in which data preprocessing is combined with a GRNN optimized by a multi-objective cuckoo search algorithm based on non-dominated sorting. Fast ensemble empirical mode decomposition (FEEMD) was used to reduce the noise in the raw data. Another study [
28] used ensemble empirical mode decomposition (EEMD) to decompose the raw load data and then used the Elman neural network to make predictions. Although empirical mode decomposition (EMD) and EEMD can automatically decompose the modal components based on the data, the addition of white noise during the decomposition process can create endpoint effects and cause distortion. VMD enables the effective separation of the intrinsic modal components and the division of the frequency domain of the signal, avoiding the distortions caused by the endpoint effect.
Table 1 shows a further summary and analysis of the above literature. The following conclusions are obtained from
Table 1: the use of the idea of combined models to construct prediction models, a reasonable signal noise reduction approach and a multi-strategy optimization approach can effectively improve the prediction accuracy of power load prediction models. Studies [
11,
12,
13,
14,
21] used standard intelligent optimization algorithms to optimize the network weights of neural networks in order to construct prediction models. Although such prediction models also had high prediction accuracy, the authors did not take into account the impact of the standard intelligent optimization algorithm’s own shortcomings on the optimization process and the impact of non-linear fluctuations in the original data on the prediction results. While [
15,
18,
19,
20] considered the impact of multi-strategy optimization approaches on intelligent optimization algorithms, the authors ignored the improvement of the prediction accuracy by data pre-processing methods. In addition, although [
26,
27,
28] integrated the idea of combinatorial modeling, multi-strategy optimization and data pre-processing, the authors did not consider the endpoint effects inherent in EMD denoising that can also affect the final prediction results.
In summary, a new combined power load forecasting model is proposed, in which an LSSVM model optimized by an improved chaotic sparrow search algorithm (CISSA) is applied to the components obtained by variational modal decomposition (VMD). First, we address the problem that the standard sparrow search algorithm is prone to falling into local extremes as the population diversity decreases in the late iterations: based on an analysis of the SSA algorithm, the CISSA algorithm is proposed. An improved tent mapping strategy, the random following strategy from the chicken flock optimization algorithm and the Levy flight strategy from the cuckoo algorithm are applied to the population initialization phase, the iteration phase and the global optimization search phase of the algorithm, respectively. Second, the original load sequence is decomposed into several modal components of different frequencies by VMD. The CISSA algorithm is used to determine the two parameters of the LSSVM model, the penalty factor gam and the RBF kernel parameter sig. The CISSA-LSSVM prediction model is then trained on, and used to predict, the components at different frequencies separately. Finally, the predicted values of all components are combined to produce the final prediction result.
To verify the performance of the CISSA algorithm proposed in this paper, eight benchmark test functions are used to evaluate its optimization capability. A comparison with two improved SSA algorithms and three basic algorithms verifies that the CISSA algorithm has better search accuracy, convergence performance and stability. Finally, simulation experiments using real historical load data are conducted to verify the prediction accuracy and stability of the model. The simulation results, compared with those of several competing models, also demonstrate the prediction accuracy and performance of the VMD–CISSA–LSSVM prediction model.
2. Theory and Methods
This section presents the mathematical theory and models of the variational modal decomposition, the LSSVM model, the sparrow search algorithm, the improved chaotic sparrow search algorithm and the VMD–CISSA–LSSVM model.
2.1. Variational Modal Decomposition
VMD is an adaptive decomposition method for non-stationary signals in which the number of modes can be chosen according to the characteristics of the sequence. During the solution process, the central frequency and finite bandwidth of each mode are matched adaptively to obtain the optimal solution. The specific mathematical model of VMD is shown in [
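29]. The VMD procedure is as follows:
(1) The Hilbert transform is applied to each sub-mode $u_n(t)$ and the one-sided spectrum of the analytic signal is obtained;
(2) The spectrum is shifted to baseband by multiplying the analytic signal by an exponential tuned to the estimated central frequency: $\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_n(t)\right] e^{-j\omega_n t}$;
(3) The bandwidth is estimated from the demodulated signal, and the resulting constrained variational problem can be expressed as Equation (1):
$$\min_{\{u_n\},\{\omega_n\}} \left\{ \sum_{n=1}^{N} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_n(t) \right] e^{-j\omega_n t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{n=1}^{N} u_n(t) = f(t) \tag{1}$$
(4) The quadratic penalty factor $\alpha$ and the Lagrange multiplier $\lambda$ are introduced to turn it into an unconstrained variational problem to be solved:
$$L(\{u_n\},\{\omega_n\},\lambda) = \alpha \sum_{n=1}^{N} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_n(t) \right] e^{-j\omega_n t} \right\|_2^2 + \left\| f(t) - \sum_{n=1}^{N} u_n(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{n=1}^{N} u_n(t) \right\rangle \tag{2}$$
(5) The alternating direction method of multipliers is used to update the values of $\hat{u}_n^{k+1}$ and $\omega_n^{k+1}$, as shown in Equation (3):
$$\hat{u}_n^{k+1}(\omega) = \frac{\hat{f}(\omega) - \sum_{i \neq n} \hat{u}_i(\omega) + \hat{\lambda}(\omega)/2}{1 + 2\alpha(\omega - \omega_n)^2}, \qquad \omega_n^{k+1} = \frac{\int_0^{\infty} \omega \left| \hat{u}_n^{k+1}(\omega) \right|^2 d\omega}{\int_0^{\infty} \left| \hat{u}_n^{k+1}(\omega) \right|^2 d\omega} \tag{3}$$
where $\delta(t)$ is the unit impulse signal; $f(t)$ is the original signal; $n$ indexes the $n$-th modal component obtained after the signal decomposition; $N$ is the total number of modal components; $k$ is the number of iterations; $\partial_t$ denotes the partial derivative with respect to $t$; $\alpha$ is the penalty factor; $j$ is the imaginary unit; $*$ is the convolution operator; $\lambda$ is the Lagrange multiplier; $\hat{u}_n(\omega)$, $\hat{f}(\omega)$ and $\hat{\lambda}(\omega)$ are the Fourier transforms of $u_n(t)$, $f(t)$ and $\lambda(t)$, respectively; $u_n$ is the $n$-th finite-bandwidth component; and $\omega_n$ is the central frequency of that component.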
2.2. Least Square Support Vector Machines
The LSSVM replaces the inequality constraints of the SVM algorithm with equality constraints and uses the sum of squared errors as the empirical loss. In addition, the choice of the penalty factor and the kernel function parameters directly affects the LSSVM's anti-interference ability and generalization ability. The specific mathematical model of the LSSVM is shown in [
30].
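For a given training set $\{(x_i, y_i)\}_{i=1}^{m}$, its regression function can be defined as Equation (4):
$$y(x) = \omega^{T} \varphi(x) + b \tag{4}$$
where $x$ is the sample input, $y$ is the sample output, $\varphi(\cdot)$ maps the input into the higher dimensional space, and $\omega$ and $b$ are the normal vector and intercept of the hyperplane in the higher dimensional space, respectively. According to the risk minimization principle, the regression problem can be transformed into a constrained problem:
$$\min_{\omega, b, e} J(\omega, e) = \frac{1}{2} \omega^{T} \omega + \frac{\gamma}{2} \sum_{i=1}^{m} e_i^2 \quad \text{s.t.} \quad y_i = \omega^{T} \varphi(x_i) + b + e_i, \quad i = 1, \dots, m \tag{5}$$
where $e_i$ is the relaxation variable and $\gamma$ is the regularization factor. By introducing the Lagrange multipliers $\alpha_i$, the above problem is transformed into Equation (6):
$$L(\omega, b, e, \alpha) = J(\omega, e) - \sum_{i=1}^{m} \alpha_i \left[ \omega^{T} \varphi(x_i) + b + e_i - y_i \right] \tag{6}$$
The optimal values are obtained by setting the partial derivatives of $L$ with respect to $\omega$, $b$, $e$ and $\alpha$ to zero, and the regression function is then established:
$$f(x) = \sum_{i=1}^{m} \alpha_i K(x, x_i) + b \tag{7}$$
where $K(x, x_i)$ is the kernel function; the RBF kernel function is used in this paper, as shown in Equation (8):
$$K(x, x_i) = \exp\left( -\frac{\| x - x_i \|^2}{2\sigma^2} \right) \tag{8}$$
where $\sigma$ is the RBF kernel parameter.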
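As an illustration of Equations (4)–(8), the following minimal Python sketch trains an LSSVM regressor by solving the linear system that follows from the optimality conditions, in which the first row enforces $\sum_i \alpha_i = 0$ and the remaining rows read $(K + I/\text{gam})\,\alpha + b\mathbf{1} = y$. The kernel form $\exp(-\|x - x_i\|^2 / (2\,\text{sig}^2))$, the helper names (`rbf`, `solve`, `lssvm_fit`, `lssvm_predict`) and the noisy sine data are assumptions made for illustration only; they are not the implementation or the data used in this paper.

```python
import math
import random


def rbf(u, v, sig):
    """RBF kernel exp(-||u - v||^2 / (2 * sig^2)) for two equal-length vectors."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * sig ** 2))


def solve(a, rhs):
    """Solve a @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(a, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def lssvm_fit(xs, ys, gam, sig):
    """Return (alpha, b) from [[0, 1^T], [1, K + I/gam]] [b; alpha] = [0; y]."""
    n = len(xs)
    a = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(xs[i], xs[j], sig) for j in range(n)]
        row[i + 1] += 1.0 / gam  # regularization on the diagonal of K
        a.append(row)
    sol = solve(a, [0.0] + list(ys))
    return sol[1:], sol[0]


def lssvm_predict(xs, alpha, b, x_new, sig):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b."""
    return sum(a_i * rbf(x_new, x_i, sig) for a_i, x_i in zip(alpha, xs)) + b


# Toy data (assumption): a noisy sine wave instead of real load data
random.seed(0)
xs = [[2.0 * math.pi * i / 39] for i in range(40)]
ys = [math.sin(x[0]) + random.gauss(0.0, 0.05) for x in xs]
alpha, b = lssvm_fit(xs, ys, gam=100.0, sig=0.8)
max_err = max(
    abs(lssvm_predict(xs, alpha, b, [t], 0.8) - math.sin(t))
    for t in [0.5 + 0.25 * k for k in range(21)]
)
```

In this sketch a larger gam fits the training data more tightly and a larger sig gives a smoother function, which is why the two parameters are natural targets for the optimization described in the following sections.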
2.3. Sparrow Search Algorithm
The sparrow search algorithm [
31] is a new swarm intelligence optimization algorithm proposed by Xue in 2020. In this paper, the SSA algorithm is analyzed in order to develop a reasonable optimization scheme.
The initial sparrow individuals in the sparrow search algorithm are randomly generated in the search space and gradually aggregated during the iterative process, making it difficult to obtain a good population diversity and maintain it at a certain level. This leads to a poor convergence performance and an inconsistency between the global search capability and local exploitation performance of the algorithm.
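Sparrow populations are divided into searchers, followers and vigilantes, depending on their individual capabilities. The searcher’s position is updated by Equation (9):
$$X_{i,d}^{t+1} = \begin{cases} X_{i,d}^{t} \cdot \exp\left( \dfrac{-i}{\alpha \cdot T} \right), & R_2 < ST \\ X_{i,d}^{t} + Q \cdot L, & R_2 \geq ST \end{cases} \tag{9}$$
where $t$ is the current number of iterations; $T$ is the maximum number of iterations; $\alpha$ is a random number in $(0, 1]$; $R_2$ and $ST$ represent the warning value and the safety threshold, respectively; $L$ is a $1 \times d$ matrix whose elements are all 1; and $Q$ is a random number subject to a normal distribution.
The equation for updating the position of a follower is as follows:
$$X_{i,d}^{t+1} = \begin{cases} Q \cdot \exp\left( \dfrac{X_{worst}^{t} - X_{i,d}^{t}}{i^2} \right), & i > n/2 \\ X_{P}^{t+1} + \left| X_{i,d}^{t} - X_{P}^{t+1} \right| \cdot A^{+} \cdot L, & \text{otherwise} \end{cases} \tag{10}$$
where $X_{worst}^{t}$ denotes the worst position of the sparrow in the $d$-th dimension in the $t$-th iteration of the population, $X_{P}^{t+1}$ denotes the optimal position of the sparrow in the $d$-th dimension in the $(t+1)$-th iteration of the population, $n$ is the population size, $A$ is a $1 \times d$ matrix whose elements are randomly 1 or $-1$ with $A^{+} = A^{T}(AA^{T})^{-1}$, and $L$ is a $1 \times d$ matrix whose elements are all 1.
The equation for updating the position of the vigilantes is as follows:
$$X_{i,d}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left| X_{i,d}^{t} - X_{best}^{t} \right|, & f_i > f_g \\ X_{i,d}^{t} + K \cdot \left( \dfrac{\left| X_{i,d}^{t} - X_{worst}^{t} \right|}{(f_i - f_w) + \varepsilon} \right), & f_i = f_g \end{cases} \tag{11}$$
where $X_{best}^{t}$ is the current global optimal position; $\varepsilon$ is a very small constant that avoids division by zero; $K$ is a random number within $[-1, 1]$; $f_i$, $f_g$ and $f_w$ are the current fitness, the best fitness and the worst fitness, respectively; and $\beta$ is the step-size control parameter. When $f_i > f_g$, the sparrow is at the edge of the population and is vulnerable to predators; when $f_i = f_g$, the sparrow is in the middle of the population, is aware of the threat of predators and adjusts its search strategy by moving closer to other sparrows in time to avoid being attacked.
From Equation (10), it can be seen that the follower position update process is mainly guided by $X_{P}^{t+1}$ and $X_{worst}^{t}$. This also shows that the SSA algorithm does not take full advantage of the information carried by most common individuals in the population. As a result, the effective exploration area for sparrows is small and the global search capability of the algorithm is weak.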
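The three update rules in Equations (9)–(11) can be sketched in Python as follows. This is a minimal illustration of the standard SSA under stated assumptions, not the code used in this paper: the sphere objective, the function name `ssa_minimize` and the default proportions of searchers (`pd`) and vigilantes (`sd`) and the safety threshold (`st`) are hypothetical choices. For a $1 \times d$ matrix $A$ with entries $\pm 1$, $A^{+} = A^{T}/d$, so the term $|X - X_P| \cdot A^{+} \cdot L$ reduces to one scalar step added to every dimension.

```python
import math
import random


def sphere(x):
    """Toy objective used only for this illustration: f(x) = sum(x_d^2)."""
    return sum(v * v for v in x)


def ssa_minimize(f, dim, lb, ub, n=30, iters=100, pd=0.2, sd=0.1, st=0.8, seed=0):
    """Minimize f with a plain sparrow search; returns (best_x, best_f)."""
    rng = random.Random(seed)

    def clip(v):
        return min(max(v, lb), ub)

    pop = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n)]
    best = min(pop, key=f)[:]
    n_search = max(1, int(pd * n))   # number of searchers
    n_vigil = max(1, int(sd * n))    # number of vigilantes
    eps = 1e-50                      # tiny constant, avoids division by zero

    for _ in range(iters):
        pop.sort(key=f)              # fittest individuals act as searchers
        worst = pop[-1][:]
        r2 = rng.random()            # warning value R2

        # Searchers, Equation (9)
        for i in range(n_search):
            if r2 < st:
                alpha = 1.0 - rng.random()          # uniform in (0, 1]
                shrink = math.exp(-(i + 1) / (alpha * iters))
                pop[i] = [clip(v * shrink) for v in pop[i]]
            else:
                q = rng.gauss(0.0, 1.0)
                pop[i] = [clip(v + q) for v in pop[i]]
        x_p = min(pop[:n_search], key=f)[:]         # best searcher position

        # Followers, Equation (10)
        for i in range(n_search, n):
            if i + 1 > n / 2:
                q = rng.gauss(0.0, 1.0)
                pop[i] = [clip(q * math.exp((w - v) / (i + 1) ** 2))
                          for v, w in zip(pop[i], worst)]
            else:
                # A is 1 x d with random +/-1 entries, so A^+ = A^T / d
                step = sum(abs(v - p) * rng.choice((-1.0, 1.0))
                           for v, p in zip(pop[i], x_p)) / dim
                pop[i] = [clip(p + step) for p in x_p]

        # Vigilantes, Equation (11)
        f_g, f_w = f(best), f(worst)
        for i in rng.sample(range(n), n_vigil):
            f_i = f(pop[i])
            if f_i > f_g:
                beta = rng.gauss(0.0, 1.0)
                pop[i] = [clip(b + beta * abs(v - b)) for v, b in zip(pop[i], best)]
            elif f_i == f_g:
                k = rng.uniform(-1.0, 1.0)
                pop[i] = [clip(v + k * abs(v - w) / ((f_i - f_w) + eps))
                          for v, w in zip(pop[i], worst)]

        candidate = min(pop, key=f)
        if f(candidate) < f(best):
            best = candidate[:]
    return best, f(best)


best_x, best_f = ssa_minimize(sphere, dim=5, lb=-10.0, ub=10.0)
```

Note that the first follower branch always generates points near the origin, which favours objectives such as the sphere function whose optimum lies there; this is one reason why results on such benchmark functions should be read with care.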
The operation flow of the standard SSA algorithm is shown in
Figure 1. The iterative search process for individual sparrows shows that the strength of the sparrow search algorithm is influenced by the quality of the individuals in the population and the location update parameters. The individual sparrow position updating relies on the inter-individual following and interactions. Due to the lack of variation in the iterative update process of individuals, once the local optimum stagnation is reached it is difficult for sparrows to jump out of the current local space.
2.4. Improved Chaotic Sparrow Search Algorithm
In this paper, the CISSA algorithm is proposed based on the analysis of the SSA algorithm. First, an improved tent chaotic mapping is used to generate the initial population to improve the quality of the initial solution and lay the foundation for global optimization. Second, in the iteration of the algorithm the random following strategy of the chicken flock algorithm is used to optimize the position update process of the followers in the SSA algorithm, thus balancing the local exploitation performance and global search capability of the algorithm. Finally, the Levy flight strategy of the cuckoo algorithm is introduced to improve the global search capability of the algorithm and help it to jump out of local constraints. The multi-strategy fusion approach helps the algorithm to balance local exploitation and global search capabilities, while improving the algorithm’s local extreme value escape capability.
2.4.1. Improved tent Mapping Strategy
Chaos is a nonlinear system between deterministic and stochastic systems [
32,
33]. Chaotic mappings are capable of traversing all states without repetition within a certain range.
Figure 2 shows the bifurcation diagrams of four common chaotic mappings. From
Figure 2, it is clear that the tent chaotic map covers a larger area and is more uniformly distributed. Therefore, the tent chaos mapping is chosen to initialize the sparrow population distribution and help the sparrow population to be uniformly distributed in the mapping space. In addition, random variables are introduced into the tent chaos mapping to improve the diversity and randomness of the population.
The tent chaotic mapping can be expressed by Equation (12):
Adding random variables
to Equation (12), Equation (13) is obtained:
Finally, the improved tent chaotic mapping is obtained after the Bernoulli shift transformation of
. The initial position of the sparrow population in the feasible domain is obtained by Equation (14):
where
q is a random number within
;
and
represent the upper and lower bounds of the feasible solution interval, respectively; and
is the individual after mapping. The process can be expressed as follows: a
d-dimensional vector is randomly generated in
as the initial individual. Then
N-1 new individuals are generated by iterating over each dimension of the vector in Equation (13). Finally, Equation (14) is used to map the values of the variables generated by the modified tent chaos mapping onto the sparrow individuals.
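The initialization procedure described above can be sketched in Python as follows. Because the exact forms of Equations (13) and (14) are not reproduced here, the sketch assumes a commonly used variant: the Bernoulli-shift form of the tent map, $z \leftarrow (2z) \bmod 1$, perturbed by a small random term rand/N and wrapped back into $[0, 1)$, followed by the linear mapping $x = lb + (ub - lb) \cdot z$. The function name `tent_init` and all parameter values are hypothetical.

```python
import random


def tent_init(n, dim, lb, ub, seed=0):
    """Generate n individuals in [lb, ub)^dim from a perturbed tent sequence."""
    rng = random.Random(seed)
    z = [rng.random() for _ in range(dim)]   # initial chaotic vector in [0, 1)
    chaos = [z]
    for _ in range(n - 1):
        # Assumed map: Bernoulli-shift tent map plus a small random term,
        # wrapped into [0, 1) so the mapped positions stay feasible
        z = [((2.0 * v) % 1.0 + rng.random() / n) % 1.0 for v in z]
        chaos.append(z)
    # Map the chaotic values onto the feasible interval
    return [[lb + (ub - lb) * v for v in row] for row in chaos]


population = tent_init(n=30, dim=5, lb=-10.0, ub=10.0)
```

One practical reason for the random term in such a sketch is that a pure doubling map collapses to zero in binary floating-point arithmetic after a few dozen iterations; the perturbation keeps the sequence from settling at that fixed point.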
2.4.2. Random Following Strategy
The followers in the SSA algorithm are prone to rapid population clustering in a short period of time as they move towards the optimal position. Although a fast convergence can be achieved, the probability of the algorithm falling into a local optimum is greatly increased by the sudden drop in population diversity. Therefore, the random following strategy of the chicken flock optimization algorithm is used to improve the position update of the followers in the SSA algorithm. The mathematical model of the chicken swarm optimization is shown in [
34]. The random following strategy of the chicken swarm optimization algorithm moves the hens closer to the roosters with a certain probability. This ensures convergence without reducing the diversity of the population and provides a good balance between local exploitation and global search. The equation for updating the position of the hen is as follows:
where
r denotes any
r-th rooster as the hen’s mate and
s denotes any
s-th rooster or hen in the flock,
;
is the fitness of a randomly selected rooster
s;
is the fitness value of the
i-th sparrow.
The improved follower position update formula can be expressed as:
where
.
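The hen update on which the random following strategy is based can be illustrated with a short Python sketch. It assumes the form in which the rule is commonly stated for chicken swarm optimization, $x_i \leftarrow x_i + S_1 \cdot rand \cdot (x_r - x_i) + S_2 \cdot rand \cdot (x_s - x_i)$ with $S_1 = \exp((f_i - f_r)/(|f_i| + \varepsilon))$ and $S_2 = \exp(f_s - f_i)$; it does not reproduce the improved follower formula of this paper. The function name `hen_update`, the exponent cap and the example values are assumptions of this sketch.

```python
import math
import random


def hen_update(x_i, f_i, x_r, f_r, x_s, f_s, rng, eps=1e-12):
    """Move x_i towards its mate x_r and a randomly chosen flock member x_s."""
    # Exponents are capped at 50 to avoid overflow (a guard added in this sketch)
    s1 = math.exp(min((f_i - f_r) / (abs(f_i) + eps), 50.0))
    s2 = math.exp(min(f_s - f_i, 50.0))
    return [
        v + s1 * rng.random() * (r - v) + s2 * rng.random() * (s - v)
        for v, r, s in zip(x_i, x_r, x_s)
    ]


# Hypothetical example: a follower with fitness 25 follows a better mate
# (fitness 1) and another flock member (fitness 8)
rng = random.Random(0)
moved = hen_update([4.0, -3.0], 25.0, [1.0, 0.0], 1.0, [2.0, 2.0], 8.0, rng)
```

Because $S_2$ decays quickly when the second individual is worse than the current one, the move is dominated by the better of the two guides, while the random factors keep the followers from collapsing onto a single point.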
2.4.3. Levy Flight Strategy
In the late iterations of the SSA algorithm, individual sparrows have already completed their position updates and are prone to local optimum stagnation at this point. In order to solve this problem, the Levy flight strategy in the cuckoo algorithm is used to update and mutate the population after the SSA algorithm position update [
35].
The Levy flight strategy is based on a combination of long-term small-step searches and short-term large-step jumps. The short-distance search ensures that a small area around the individual is carefully searched during foraging. Longer walks ensure that the individual is able to move into another area and search over a wider area. Currently, the Mantegna method is commonly used to generate random step sizes that obey the Levy distribution. The formula proposed by Mantegna for simulating the Levy flight path can be expressed as:
$$s = \frac{\mu}{|\nu|^{1/\beta}} \tag{19}$$
where $s$ is the Levy flight step; $\beta$ is a constant, usually taken as 1.5; and $\mu$ and $\nu$ are normally distributed random numbers, which obey the normal distributions of Equation (20):
$$\mu \sim N(0, \sigma_{\mu}^2), \qquad \nu \sim N(0, \sigma_{\nu}^2) \tag{20}$$
The standard deviations $\sigma_{\mu}$ and $\sigma_{\nu}$ of the corresponding normal distributions in Equation (20) take values that satisfy Equation (21):
$$\sigma_{\mu} = \left\{ \frac{\Gamma(1 + \beta) \sin(\pi \beta / 2)}{\Gamma[(1 + \beta)/2] \, \beta \, 2^{(\beta - 1)/2}} \right\}^{1/\beta}, \qquad \sigma_{\nu} = 1 \tag{21}$$
The position update formula for Levy’s flight can be expressed as:
where
denotes the
i-th solution at generation
t,
denotes the optimal solution at this point,
denotes the weight of the control step and
denotes the point multiplication.
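The Mantegna procedure can be sketched in Python as follows. The step generator follows the usual statement of the method, $s = \mu / |\nu|^{1/\beta}$ with $\sigma_{\nu} = 1$. The position update in `levy_mutate`, which scales the step by a weight of 0.01 and by the distance to the best solution, is a form often used in cuckoo search implementations and is only an assumption here, as are the function names and example values.

```python
import math
import random


def mantegna_sigma(beta):
    """Standard deviation of mu in Mantegna's method (sigma_nu is 1)."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta, rng):
    """Draw one Levy-distributed step s = mu / |nu|^(1/beta)."""
    mu = rng.gauss(0.0, mantegna_sigma(beta))
    nu = rng.gauss(0.0, 1.0)
    return mu / abs(nu) ** (1.0 / beta)


def levy_mutate(x, x_best, rng, beta=1.5, step_weight=0.01):
    """Perturb x by a Levy step scaled by its distance to x_best (assumed form)."""
    return [v + step_weight * levy_step(beta, rng) * (v - b)
            for v, b in zip(x, x_best)]


rng = random.Random(1)
steps = [levy_step(1.5, rng) for _ in range(1000)]
mutated = levy_mutate([3.0, -2.0], [0.5, 0.1], rng)
```

Most of the generated steps are small, but the heavy tail of the distribution occasionally produces a very large one, which is the behaviour the strategy relies on to leave a local optimum.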
Figure 3 illustrates the two-dimensional plane-based Levy flight path generated using the Mantegna method. It is clear from
Figure 3 that the Mantegna method alternates long runs of small steps with occasional large jumps in the search for the optimal solution. The occasional large steps expand the search space and allow the individual to escape from local stagnation, while the small-step search enhances the local search capability; together they mitigate the problem of individuals falling into local optima. In the standard SSA algorithm, once the sparrow’s position is updated the algorithm enters the next cycle or ends, at which point it tends to fall into a local optimum. By applying Levy flight variation to the sparrow population in the global search phase, the positions are updated once more, helping the population to move away from the current local optimum.
This paper accomplishes a selective variation update of sparrows after a position update by comparing the size of
rand with the inertia weight factor
:
where
is the current iteration number.
is the maximum iteration number and
rand is a random number within
. If the selected random number
rand is greater than
, the selected sparrow is subjected to Levy flight variation according to Equation (22). If the selected random number
rand is less than
, the variation is skipped and the next step is carried out.
2.4.4. The CISSA Algorithm
As shown in Figure 4, the operational flow of the CISSA algorithm can be summarized as follows:
(1) Initialize the relevant parameters of the SSA algorithm;
(2) Initialize the sparrow population using the tent chaotic mapping with added random variables. The ergodicity and randomness of the improved tent chaotic mapping increase the diversity of the sparrow population, thus providing a basis for the global optimization of the algorithm. A d-dimensional vector is generated in the initial space as the initial individual. Then, N-1 new individuals are generated by iterating over each of its dimensions with Equation (13). Finally, the values of the variables generated by the chaotic mapping are mapped onto individual sparrows by Equation (14);
(3) Calculate and rank the fitness values of the sparrows and record the current best and worst positions;
(4) Update the positions of the searchers according to Equation (9);
(5) Update the positions of the followers according to the improved follower position update formula of Section 2.4.2. The random following strategy balances the local exploitation performance and global search capability of the algorithm;
(6) Update the positions of the vigilantes according to Equation (11);
(7) Recalculate and rank the fitness values of the sparrows, recording the current best and worst positions;
(8) Calculate the inertia weight factor. Whether a sparrow undergoes Levy variation is determined by comparing rand with the inertia weight factor. If the selected random number is greater than the inertia weight factor, the selected sparrow is subjected to Levy flight variation according to Equation (22). The Levy flight strategy from the cuckoo algorithm is used to improve the global search ability of the algorithm and help it to jump out of local restrictions;
(9) Recalculate the fitness values and record the current optimal and worst positions of the sparrows;
(10) Determine whether the stop condition is met. If it is met, output the result; otherwise, repeat steps 3–9.
2.5. VMD–CISSA–LSSVM Electricity Load Forecasting Model
In summary, a new combined power load forecasting model based on VMD, the CISSA algorithm and the LSSVM model is proposed.
The VMD algorithm is used to decompose the original data into multiple IMF components and a residual (Res) component. The raw load data is then denoised by means of modal reconstruction. Feeding the sub-series directly into an LSSVM model with untuned parameters reduces the prediction accuracy, because the penalty factor gam and the RBF kernel parameter sig of the LSSVM have a significant impact on the prediction results. To improve the prediction accuracy, the CISSA algorithm proposed in this paper is used to find the optimal values of these two parameters, which are then passed to the LSSVM model for load prediction.
The flow of the VMD–CISSA–LSSVM power load forecasting model is shown in
Figure 5. The specific operational flow can be expressed as follows:
(1) The load data is decomposed into modal components using the VMD algorithm;
(2) The input sub-series differ widely in their peak values, which can have a significant impact on the prediction results if they are entered directly without processing. Therefore, the data needs to be normalized before the individual sub-series are fed into the LSSVM. The normalization formula can be expressed as $x^{*} = (x - x_{\min}) / (x_{\max} - x_{\min})$, where $x$ represents the original data and $x_{\min}$ and $x_{\max}$ represent the minimum and maximum values in the original data;
(3) The kernel function width and penalty factor of the LSSVM are optimized using the CISSA algorithm proposed above;
(4) The decomposed sub-series of the original load data are fed into the LSSVM prediction model optimized by the CISSA algorithm;
(5) The prediction results of each sub-series are summed to obtain the final prediction result.
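The normalization and recombination steps of this flow can be illustrated with a short Python sketch. It shows only min-max scaling of a sub-series, the inverse transform and the element-wise summation of component forecasts; the VMD decomposition and the CISSA-LSSVM predictor are not included. The function names and the numerical values are hypothetical.

```python
def min_max_normalize(series):
    """Scale a series to [0, 1]; also return the bounds needed to invert it."""
    lo, hi = min(series), max(series)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in series], lo, hi  # constant series: map to zero
    return [(x - lo) / span for x in series], lo, hi


def denormalize(values, lo, hi):
    """Invert min_max_normalize for predicted values."""
    return [v * (hi - lo) + lo for v in values]


def combine_forecasts(component_forecasts):
    """Sum the forecasts of all components element-wise."""
    return [sum(vals) for vals in zip(*component_forecasts)]


# Hypothetical sub-series values, for illustration only
sub_series = [320.0, 410.0, 365.0, 500.0]
scaled, lo, hi = min_max_normalize(sub_series)
restored = denormalize(scaled, lo, hi)
total = combine_forecasts([[1.0, 2.0], [0.5, -0.5]])
```

In such a pipeline the bounds of each sub-series would be taken from its training portion and reused for the inverse transform, so that the component forecasts are back on the original scale before they are summed.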
In addition, the proposed combined prediction model can be applied in a real power transmission environment, as shown in
Figure 6. The VMD–CISSA–LSSVM model is applied in the first conversion phase. The model’s forecasting performance is continuously improved by continuous learning from historical electricity load data from previous years. Highly accurate forecasting results are used to give effective feedback to the power sector, helping decision makers to develop reasonable power supply and production plans and reduce unnecessary losses and waste in the supply–consumption process.