Article

Study on Short-Term Electricity Load Forecasting Based on the Modified Simplex Approach Sparrow Search Algorithm Mixed with a Bidirectional Long- and Short-Term Memory Network

1 School of Intelligent Manufacturing, Luoyang Institute of Science and Technology, Luoyang 471023, China
2 School of Agriculture Engineering, Jiangsu University, Zhenjiang 212013, China
3 School of Electrical Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450045, China
* Authors to whom correspondence should be addressed.
Processes 2024, 12(9), 1796; https://doi.org/10.3390/pr12091796
Submission received: 18 July 2024 / Revised: 15 August 2024 / Accepted: 21 August 2024 / Published: 23 August 2024

Abstract:
Short-term power load forecasting is a key component of power system planning and operation: balancing power supply and demand is crucial for the safe and effective functioning of power systems. This paper addresses the low prediction accuracy caused by the volatility and nonlinearity of the power load. It proposes using an improved sparrow search algorithm to optimize the number of hidden layer nodes, the number of iterations, and the learning rate of a bidirectional long- and short-term memory network, and applying the resulting load prediction model to actual load data. The model is evaluated on real power load data from Wuxi, Jiangsu Province, China. The results show that the bidirectional long- and short-term memory network optimized by the enhanced sparrow algorithm predicts the power load with a relative error of only 2%, outperforming the other models considered in the paper.

1. Introduction

Electricity is a special kind of secondary energy that is produced, transmitted, distributed and consumed simultaneously. This means that power generation on the supply side and power consumption on the user side are in a dynamic equilibrium relationship. When power generation and consumption are not balanced, it can cause safety accidents and lead to large economic losses for the power system and users. Given the characteristics of electric energy and the randomness of load changes, the accurate forecasting of power loads and predicting trends in advance can help the power sector make timely and scientific decisions, guaranteeing the electrical grid’s continuous, secure, and reliable functioning.
Power load and past climatic data are the foundations of power load forecasting, which analyzes the historical behavior of the load and predicts the future load on that basis. Based on the time horizon, power load forecasting is separated into four categories: ultra-short-term, short-term, medium-term, and long-term. Long-term power load forecasting covers the next three to five years and provides a basis for grid planning departments in their long-range planning [1,2]. Medium-term power load forecasting covers the next one month to one year and provides a reference for equipment maintenance plans. Short-term power load forecasting covers the next few hours to a few days and can be used for grid arrangement and scheduling plans [3,4]. Ultra-short-term power load forecasting covers up to one hour ahead and can be used for preventive emergency control and the handling of emergencies.
The goal of current research on short-term power load prediction has been to increase prediction accuracy. Numerous conventional techniques, including regression analysis, time series, and comparable day approaches, have been employed in earlier research. But these methods usually have some limitations, such as low accuracy, weak ability to integrate influencing factors and low sensitivity to input data. With the fast growth of computer science and artificial intelligence, machine learning approaches have been brought into the field of short-term power demand forecasting. These methods comprise random forests (RF), support vector machines (SVM), artificial neural networks (ANN), and long- and short-term memory networks (LSTM). These approaches make up for the drawbacks of standard load forecasting methods, increase the accuracy of short-term power load prediction, and present fresh concepts for short-term power load forecasting.
A long- and short-term memory (LSTM) network is a form of recurrent neural network (RNN) architecture used in machine learning that is well suited to the classification, processing, and prediction of time series data. In recent years, researchers in China and abroad have applied LSTM to load forecasting by analyzing the influence of important factors such as meteorology and the regional economy and by constructing and improving forecasting models [5,6,7]. Reference [8] proposed a short-term power load forecasting method based on a convolutional neural network (CNN) and an LSTM, in which the CNN extracts feature vectors from the input data and these vectors serve as inputs to the LSTM forecasting model. Good forecasting results were achieved on the power load of Bangladesh, but both CNN and LSTM are computationally demanding, requiring strong hardware support, and the large number of parameters to be set is also a drawback. Reference [9] adopts the variational modal decomposition (VMD) method to decompose the original power load signal; the decomposed sub-signals and the original signal then form a new dataset that is fed into the neural network for training, so the network can learn more information and predict more accurately. However, the sub-signals produced by VMD usually lack an intuitive interpretation, which can make the prediction results difficult to understand.
Reference [10] proposed a short-term power load prediction model in which a long- and short-term memory network (LSTM) is optimized by discrete particle swarm optimization (DPSO); the DPSO technique tunes the parameters of the LSTM model, and in the case study, real data from four provinces in central China are used to test the efficiency of the suggested model. However, DPSO may encounter the curse of dimensionality.
LSTM can learn historical knowledge and has a memory function for historical information. However, it can only acquire forward information and cannot make good use of backward information. To make each moment contain contextual information, some scholars have proposed the bidirectional long- and short-term memory network (BiLSTM) by combining the bidirectional recurrent neural network (BiRNN) model with the LSTM unit. BiLSTM is an upgrade of LSTM, which not only absorbs forward information but also employs backward information efficiently [11]. Reference [12] proposed an improved prediction model for BiLSTM: the authors used improved particle swarm optimization (IPSO) to tune the settings of BiLSTM, and the experimental findings demonstrate that the IPSO-BiLSTM model has superior prediction outcomes to the LSTM and BiLSTM models. Reference [13] developed a short-term power load forecasting approach based on an attention mechanism and a bidirectional long- and short-term memory network. It leverages the attention mechanism to emphasize the input elements that play a crucial role in load forecasting and then combines them with the BiLSTM model for forecasting; this achieved better forecasting results in the example simulation, but its complexity is too high.
In this study, we propose a novel short-term power load forecasting approach based on the modified simplex approach sparrow search algorithm (SMSSA) combined with a bidirectional long- and short-term memory network (BiLSTM). The approach leverages the SMSSA algorithm to optimize the number of hidden layer nodes, the number of iterations, and the learning rate of the BiLSTM, reducing the uncertainty of manual trial-and-error tuning. Finally, a simulation analysis is carried out using data from the Wuxi region in Jiangsu Province, China, and the findings suggest that the approach described in this research has strong prediction performance.

2. Long- and Short-Term Memory Networks

Long- and short-term memory (LSTM) is an upgraded recurrent neural network proposed by Hochreiter and Schmidhuber in 1997 [14]. LSTM avoids the gradient-vanishing problem of ordinary RNNs when confronting long-term dependencies and is an efficient RNN design. Figure 1 demonstrates the basic framework of LSTM.
LSTM introduces the concepts of gates and memory cells in each hidden layer. The forgetting gate deletes old input data and resets the memory unit; the input gate adds fresh valid information to the memory unit; and the output gate activates the valid information for filtering and output to the succeeding network. This approach of managing the memory units by three gating units overcomes the problem that RNNs cannot acquire long-distance temporal relationships and successfully mitigates the gradient vanishing problem [15,16,17]. This makes LSTM an appropriate neural network topology for tackling the long-term dependence problem. The state update equations for the basic units of LSTM are shown in Equations (1)–(6):
f_t = σ(W_f·[h_{t−1}, x_t] + b_f)   (1)
i_t = σ(W_i·[h_{t−1}, x_t] + b_i)   (2)
Ĉ_t = tanh(W_c·[h_{t−1}, x_t] + b_c)   (3)
C_t = f_t ⊙ C_{t−1} + i_t ⊙ Ĉ_t   (4)
o_t = σ(W_o·[h_{t−1}, x_t] + b_o)   (5)
h_t = o_t ⊙ tanh(C_t)   (6)
where x_t is the input at time t; f_t, i_t and o_t denote the forgetting gate, the input gate, and the output gate, respectively; Ĉ_t is the candidate cell state at time t; C_{t−1} and C_t are the cell states at times t − 1 and t, respectively; h_{t−1} and h_t are the outputs at times t − 1 and t, respectively; σ is the sigmoid activation function; tanh is the hyperbolic tangent function; W_f, W_i, W_c, W_o and b_f, b_i, b_c, b_o are the corresponding weight matrices and bias vectors; and ⊙ denotes the Hadamard product.
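As a concrete illustration of Equations (1)–(6), the following minimal NumPy sketch performs one LSTM step. The weight shapes, dictionary layout, and random initialization are illustrative assumptions, not part of the original model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step following Equations (1)-(6). W maps the concatenated
    [h_prev, x_t] to the four gate pre-activations; b holds the biases."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W["f"] @ z + b["f"])        # forget gate, Eq. (1)
    i_t = sigmoid(W["i"] @ z + b["i"])        # input gate, Eq. (2)
    c_hat = np.tanh(W["c"] @ z + b["c"])      # candidate state, Eq. (3)
    c_t = f_t * c_prev + i_t * c_hat          # cell update (Hadamard), Eq. (4)
    o_t = sigmoid(W["o"] @ z + b["o"])        # output gate, Eq. (5)
    h_t = o_t * np.tanh(c_t)                  # hidden output, Eq. (6)
    return h_t, c_t

# Tiny smoke test with random weights (hidden size 3, input size 2).
rng = np.random.default_rng(0)
n_h, n_x = 3, 2
W = {k: rng.standard_normal((n_h, n_h + n_x)) for k in "fico"}
b = {k: np.zeros(n_h) for k in "fico"}
h, c = lstm_step(rng.standard_normal(n_x), np.zeros(n_h), np.zeros(n_h), W, b)
print(h.shape, c.shape)  # (3,) (3,)
```

Because h_t is a gated tanh of the cell state, every component of h stays in [−1, 1], which is one reason the gating scheme keeps gradients well behaved.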

3. Bidirectional Long- and Short-Term Memory Networks

LSTM can learn historical knowledge and has a memory function for historical information. However, it only has the ability to acquire the forward knowledge and cannot acquire the backward information well. To learn contextual information at each moment, several academics have merged the bidirectional recurrent neural network (BiRNN) model with LSTM units for gathering contextual information and have suggested the bidirectional long- and short-term memory network model.
The bidirectional long- and short-term memory (BiLSTM) network model is an improvement of the LSTM network, which not only learns the forward information but also utilizes the backward information effectively [18]. The framework of BiLSTM is given in Figure 2.
As shown in Figure 2, the BiLSTM neural network has hidden layers in two directions: a forward layer that trains on the time series in order, and a backward layer that trains on it in reverse; both layers are connected to the output layer. x is the model input, h is the hidden layer state, and y is the output. The BiLSTM operates as follows:
h_t^r = f(w_r·x_t + v_r·h_{t−1}^r + b_r)   (7)
h_t^s = f(w_s·x_t + v_s·h_{t−1}^s + b_s)   (8)
y_t = g(U·[h_t^r, h_t^s] + c)   (9)
where h_t^r and h_{t−1}^r denote the forward LSTM hidden layer states at moments t and t − 1, respectively; h_t^s and h_{t−1}^s denote the reverse LSTM hidden layer states at moments t and t − 1, respectively; w_r, v_r, w_s and v_s are the corresponding weights; b_r and b_s are the corresponding biases; x_t is the input at moment t; and y_t is the output at moment t.
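The forward/backward recursion of Equations (7)–(9) can be sketched as follows. For brevity each direction uses the simple recurrent cell written in the equations (in a real BiLSTM each direction is a full LSTM cell as in Section 2), and all weight shapes are illustrative.

```python
import numpy as np

def bilstm_layer(X, w_r, v_r, b_r, w_s, v_s, b_s, U, c,
                 f=np.tanh, g=lambda z: z):
    """Bidirectional pass per Equations (7)-(9): a forward chain over
    t = 1..T, a backward chain over t = T..1, and outputs formed from
    the concatenated hidden states of both directions."""
    T = len(X)
    n_h = b_r.shape[0]
    h_r = np.zeros((T, n_h))            # forward hidden states
    h_s = np.zeros((T, n_h))            # backward hidden states
    hr_prev = np.zeros(n_h)
    for t in range(T):                  # Eq. (7): forward direction
        hr_prev = f(w_r @ X[t] + v_r @ hr_prev + b_r)
        h_r[t] = hr_prev
    hs_prev = np.zeros(n_h)
    for t in range(T - 1, -1, -1):      # Eq. (8): backward direction
        hs_prev = f(w_s @ X[t] + v_s @ hs_prev + b_s)
        h_s[t] = hs_prev
    # Eq. (9): combine both directions at every step
    return np.array([g(U @ np.concatenate([h_r[t], h_s[t]]) + c)
                     for t in range(T)])

rng = np.random.default_rng(0)
T, n_x, n_h, n_y = 5, 2, 3, 1
Y = bilstm_layer(rng.standard_normal((T, n_x)),
                 w_r=rng.standard_normal((n_h, n_x)),
                 v_r=rng.standard_normal((n_h, n_h)), b_r=np.zeros(n_h),
                 w_s=rng.standard_normal((n_h, n_x)),
                 v_s=rng.standard_normal((n_h, n_h)), b_s=np.zeros(n_h),
                 U=rng.standard_normal((n_y, 2 * n_h)), c=np.zeros(n_y))
print(Y.shape)  # (5, 1)
```

Note that the output at every step t depends on the whole sequence: the forward state summarizes x_1..x_t and the backward state summarizes x_t..x_T, which is exactly the contextual property motivating BiLSTM.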

4. Improving Sparrow Search Algorithm to Optimize Bidirectional Long- and Short-Term Memory Networks

The power load sequence is fundamentally time-series data in which earlier and later values are interrelated, which makes a bidirectional long- and short-term memory network well suited to short-term power load forecasting. When employing the BiLSTM model for short-term power load prediction, the learning rate and the number of training epochs have a strong influence on prediction performance. The learning rate determines whether the model can converge to the optimal value: too large a learning rate makes the model oscillate without converging at the optimum, while too small a learning rate makes training very slow and reduces the efficiency of the model [19]. The number of training epochs has a major impact on model fitting: too many epochs eventually produce overfitting, whereas too few may yield a poor curve fit. Furthermore, the performance of the model is also influenced by the neural network structure: the BiLSTM model’s structure can be optimized by choosing the right number of hidden layers and nodes, allowing the model to reach its best performance [20]. Therefore, in this study, an enhanced sparrow search algorithm (SMSSA) is used to automate the parameter selection of the BiLSTM model, increasing its prediction accuracy and generalization.

4.1. Principles of Sparrow Search Algorithm

The sparrow search algorithm (SSA) is a novel optimization method introduced by Jiankai Xue in 2020, which gives a new way to handle complicated global optimization problems [21]. Sparrows are flocking birds and have strong memories compared with other birds. There is a clear division of labor within a sparrow population: one portion is termed the discoverers (finders), and the other is termed the followers. The finders are responsible for locating food and providing the population with foraging places and guidance, while the followers obtain food through the finders. The behavior of sparrows can be described as follows:
Step 1: Finders usually have large energy reserves and supply foraging locations and search routes for all searchers.
Step 2: When a sparrow discovers a predator, it emits a warning signal. When the warning signal exceeds a safety threshold, the finder moves all followers into a safe region.
Step 3: Every sparrow has the potential to become a finder, provided it can supply a good food source. The ratio of finders to followers remains constant throughout the population.
Step 4: Sparrows with greater access to food act as producers. To obtain more food, hungry followers are likely to fly elsewhere to forage.
Step 5: Followers follow the best finder in search of food. Additionally, some predators may continually observe the finders and compete for food.
Step 6: Sparrows at the edge of the group move swiftly to a safe region to obtain a better position when they sense danger, whereas those in the middle of the group move randomly toward other sparrows.
Based on this behavioral description, the sparrow search algorithm can be formulated as a mathematical model. Assuming a population of n sparrows, the locations of the sparrows may be represented by the following matrix:
X = [ x_{1,1} ⋯ x_{1,d} ; ⋮ ⋱ ⋮ ; x_{n,1} ⋯ x_{n,d} ]   (10)
where n denotes the population size and d denotes the dimension of the optimization variable.
The fitness values in the population may be represented by the following matrix:
F_X = [ f(x_{1,1}, …, x_{1,d}) ; ⋮ ; f(x_{n,1}, …, x_{n,d}) ]   (11)
where n denotes the population size, and the value in each row of F_X denotes the fitness value of an individual.
In the colony, the discoverer has a high level of energy reserves and supplies foraging locations and search routes for the followers. During each iteration, the position of the discoverer is updated by the following formula:
X_{i,j}^{t+1} = { X_{i,j}^t · exp(−i / (α·iter_max)),   if R_2 < ST
                  X_{i,j}^t + Q·L,                      if R_2 ≥ ST    (12)
where t is the current iteration number; X_{i,j}^t is the value of the jth dimension of the ith sparrow at iteration t; iter_max is the maximum number of iterations; α ∈ (0, 1] is a random number; R_2 is the alarm value; ST is the safety threshold, whereby R_2 < ST denotes that there are no predators around and foraging can be carried out widely, while R_2 ≥ ST indicates that some sparrows in the population have discovered a predator and have moved to other secure locations to feed; Q is a random number drawn from a normal distribution; and L is a 1 × d matrix of ones.
Followers forage under the leadership of the finder; if they see the finder finding good food, they go to the finder to compete for food, thus changing their status to become a finder, and the ones that do not get food fly elsewhere. The formula for updating followers is shown below:
X_{i,j}^{t+1} = { Q · exp((X_worst^t − X_{i,j}^t) / i²),          if i > n/2
                  X_P^{t+1} + |X_{i,j}^t − X_P^{t+1}| · A⁺ · L,   otherwise    (13)
where X_P^{t+1} is the best position currently occupied by a finder; X_worst^t is the current worst position; A is a 1 × d matrix in which each entry is randomly assigned 1 or −1, and A⁺ = Aᵀ(AAᵀ)⁻¹; and i > n/2 indicates that the ith follower, which has not been fed, is likely to starve and must fly elsewhere to search for food.
Scouts make up 10–20% of the entire population, and their initial locations are randomly assigned within the population. Their position update is given by the following model:
X_{i,j}^{t+1} = { X_best^t + β · |X_{i,j}^t − X_best^t|,                         if f_i > f_g
                  X_{i,j}^t + K · (|X_{i,j}^t − X_worst^t| / ((f_i − f_w) + ε)),  if f_i = f_g    (14)
In the formula, X_best^t is the current global optimum position; β is the step control parameter, which follows a normal distribution; K ∈ [−1, 1] is a random value; f_i is the fitness value of the current individual; f_g is the current global best fitness value; f_w is the current worst fitness value; and ε is a small constant that prevents the denominator from being zero. f_i > f_g implies that the sparrow is at the edge of the population; X_best^t denotes the current center of the population, around which it is safest; and f_i = f_g signifies that a sparrow in the center of the population is aware of the danger and needs to move closer to other sparrows.
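A compact Python sketch of the finder and follower updates of Equations (12) and (13) is given below; the scout update of Equation (14) follows the same pattern. Population size, the alarm value passed in, and the random-number choices are illustrative assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)

def update_finders(X, iter_max, alarm, ST=0.8):
    """Eq. (12): when the alarm value R2 < ST, finders forage widely by
    exponential shrinkage; otherwise they jump by a normal step Q * L."""
    n, d = X.shape
    X = X.copy()
    for i in range(n):
        if alarm < ST:
            alpha = rng.uniform(1e-6, 1.0)              # random alpha in (0, 1]
            X[i] *= np.exp(-(i + 1) / (alpha * iter_max))
        else:
            X[i] += rng.normal() * np.ones(d)           # Q * L
    return X

def update_followers(X, fitness_vals, X_best):
    """Eq. (13): the worse half of the followers fly elsewhere; the rest
    move around the best finder position X_best."""
    n, d = X.shape
    X = X.copy()
    order = np.argsort(fitness_vals)                    # ascending: best first
    for rank, i in enumerate(order, start=1):
        if rank > n / 2:
            X_worst = X[order[-1]]
            X[i] = rng.normal() * np.exp((X_worst - X[i]) / rank ** 2)
        else:
            A = rng.choice([-1.0, 1.0], size=d)
            A_plus = A / d                              # A^T (A A^T)^(-1) for 1xd A
            X[i] = X_best + np.abs(X[i] - X_best) @ A_plus * np.ones(d)
    return X

X = rng.uniform(-5, 5, size=(6, 3))
X1 = update_finders(X, iter_max=100, alarm=0.3)
X2 = update_followers(X1, fitness_vals=(X1 ** 2).sum(axis=1), X_best=X1[0])
print(X2.shape)  # (6, 3)
```

The sketch treats the first individual as the best finder for simplicity; in a full SSA loop the roles are reassigned from the fitness ranking at every iteration.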

4.2. Improved Sparrow Search Algorithm

The traditional sparrow search algorithm suffers from disadvantages such as slow convergence and an easy fall into local optima. For this reason, some scholars have proposed a simplex-guided sparrow search algorithm (SMSSA), which enhances the search mechanism of the finders and the scouting mechanism of the scouts and, at the same time, applies the simplex method to update the positions of the poorly adapted sparrows at each iteration, improving the algorithm's performance [22]. The specific improvements are as follows:
(1)
Improving the search mechanism
As the core of the whole population's operation, the discoverer provides rich food areas and search directions for the population, so the behavior of the discoverer has a large effect on the convergence of the complete algorithm. From Equation (12), it is known that when R_2 < ST, that is, when no predator is found by the population, the discoverer can carry out extensive foraging; however, some scholars have found that this foraging behavior involves a certain degree of randomness, which somewhat slows the rate of convergence. To solve this problem, an improved discoverer search mechanism has been proposed. The improved mathematical model is given in the following equation:
X_{i,j}^{t+1} = { X_{i,j}^t · exp(((iter_max − t) / iter_max) · (1 − e^(1−k))),   if R_2 < ST
                  X_{i,j}^t + Q·L,                                               if R_2 ≥ ST    (15)
where t is the current iteration number and k denotes the regulation factor.
Equation (15) improves on Equation (12): by adjusting the value of k, it adaptively regulates the rate at which the finder's position update operator decays, and the rate falls off further as k grows. As Equation (15) shows, the finder's position update formula now incorporates the iteration number t, which makes the update nonlinear and enhances the diversity of the method. In addition, removing the random factor from Equation (15) increases the stability of the algorithm.
(2)
Improving the detection mechanism
The scouting mechanism of the standard sparrow search algorithm has limitations. Mainly, when the population detects danger, the scouts in the middle of the colony move closer to their neighbors to reduce the risk of predation, which causes the algorithm to fall into a local optimum, while the scouts at the edge of the population approach the safe region with a random step, which slows the convergence of the algorithm.
Therefore, the scouting factor φ ( t ) is introduced to improve the scouting mechanism, using φ 1 ( t ) as an alternative to the population center scout’s step factor, where φ 1 ( t ) represents K in Equation (14), and φ 2 ( t ) as an alternative to the population edge scout’s step factor, where φ 2 ( t ) represents β in Equation (14). The specific formulas of φ 1 ( t ) and φ 2 ( t ) are as follows:
φ_1(t) = φ_i + (φ_f − φ_i) · (1 − t/iter_max)^n
φ_2(t) = φ_i − (φ_f − φ_i) · (1 − t/iter_max)^n    (16)
where φ_i denotes the initial value of φ; φ_f denotes the final value of φ; t is the current iteration number; iter_max is the maximum number of iterations; and n denotes the nonlinear adjustment factor.
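Equation (16) is a simple deterministic schedule. A small sketch (with illustrative values for φ_i, φ_f and n) shows how both factors start apart and converge to φ_i as t approaches iter_max:

```python
def scout_factors(t, iter_max, phi_i, phi_f, n=2):
    """Eq. (16): nonlinear schedules replacing the random step factors
    K and beta of Eq. (14). phi_i / phi_f are the initial / final values,
    n is the nonlinear adjustment factor."""
    decay = (1.0 - t / iter_max) ** n
    phi1 = phi_i + (phi_f - phi_i) * decay   # center scouts (replaces K)
    phi2 = phi_i - (phi_f - phi_i) * decay   # edge scouts (replaces beta)
    return phi1, phi2

# At t = 0 the factors are phi_f and 2*phi_i - phi_f; by t = iter_max
# both have settled at phi_i, so the step sizes shrink over the run.
print(scout_factors(0, 100, phi_i=0.5, phi_f=1.0))    # (1.0, 0.0)
print(scout_factors(100, 100, phi_i=0.5, phi_f=1.0))  # (0.5, 0.5)
```

Large early steps encourage exploration; the shrinking late steps encourage fine convergence, which is the stated purpose of replacing the random factors K and β.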
(3)
Introduction of simplex mechanism
At each iteration, the poorly adapted individuals are updated using the simplex method, which mainly consists of reflection, expansion, and compression operations. The reflection operation lets an individual reverse its search direction, enlarging the search space. The expansion operation moves an individual further along the reflected direction, preventing the program from slipping into local minima. The compression operation brings an individual closer to the optimal position. Applying the simplex operations greatly increases the search capability of the sparrow search algorithm.
The particular implementation phases of the simplex mechanism are as follows:
Step 1: Establish the population, compute and rank the fitness of the individuals, and record the globally optimal individual X_b with its fitness value f_b and the suboptimal individual X_t with its fitness value f_t. Define X_c = (X_b + X_t)/2.
Step 2: Perform the reflection operation on the m worst-positioned individuals X_w: X_r = X_c + α(X_c − X_w), where α is the reflection coefficient.
Step 3: If f_r < f_b, perform the expansion operation X_y = X_c + β(X_r − X_c), where β is the expansion coefficient; if f_y < f_b, then X_w = X_y, and otherwise X_w = X_r.
Step 4: If f_r < f_w, perform the compression operation X_z = X_c − γ(X_r − X_c), where γ is the compression coefficient; if f_z < f_w, then X_w = X_z, and otherwise X_w = X_r.
Step 5: If f_w > f_r > f_t, perform the contraction operation X_s = X_c − σ(X_w − X_c), where σ is the contraction coefficient and σ = γ; if f_s < f_w, then X_w = X_s, and otherwise X_w = X_r.
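Steps 1–4 above can be sketched as a single update for one poorly adapted individual. The coefficient values below are illustrative defaults, and the contraction case of Step 5 is analogous to the compression branch.

```python
import numpy as np

def simplex_update(X_w, X_b, X_t, fitness, alpha=1.0, beta=2.0, gamma=0.5):
    """One simplex pass for a poorly adapted individual X_w (Steps 1-4);
    fitness is the objective being minimized. X_b / X_t are the best and
    suboptimal individuals; coefficients are illustrative."""
    X_c = (X_b + X_t) / 2.0                 # centroid of best and second best
    X_r = X_c + alpha * (X_c - X_w)         # Step 2: reflection
    f_r, f_b, f_w = fitness(X_r), fitness(X_b), fitness(X_w)
    if f_r < f_b:                           # Step 3: expansion
        X_y = X_c + beta * (X_r - X_c)
        return X_y if fitness(X_y) < f_b else X_r
    if f_r < f_w:                           # Step 4: compression
        X_z = X_c - gamma * (X_r - X_c)
        return X_z if fitness(X_z) < f_w else X_r
    return X_r

def sphere(x):
    return float(np.sum(x ** 2))

new_w = simplex_update(np.array([1.0, 1.0]), np.array([0.1, 0.0]),
                       np.array([0.0, 0.1]), sphere)
print(sphere(new_w) < sphere(np.array([1.0, 1.0])))  # True
```

On this toy sphere objective, the reflected point triggers the compression branch and the worst individual is replaced by a strictly better position, which is the role the simplex step plays inside SMSSA.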
(4)
Improving Sparrow Search Algorithm Flow
Step 1: Establish the population and the SMSSA parameters: the maximum number of iterations T_max, the population size N, the upper and lower search bounds ub and lb, the dimension d, and the proportion of discoverers P.
Step 2: Calculate and rank individual fitness, and record the optimal, suboptimal, and worst individual positions, X_b, X_t and X_w, and their fitness values, f_b, f_t and f_w.
Step 3: Update the discoverer positions according to Equation (15).
Step 4: Update the follower positions according to Equation (13).
Step 5: Update the scout positions according to Equation (16).
Step 6: Define X_c = (X_b + X_t)/2 and perform the reflection operation X_r = X_c + α(X_c − X_w) on the m worst-positioned individuals X_w, obtaining reflected individuals with fitness f_r.
Step 7: Judge the size relationship between f_r, f_b, f_w and f_t according to the simplex method, and then decide whether to perform the expansion, compression, or contraction operation on the m individuals with the worst fitness.
Step 8: Loop Steps 2–7; check whether the termination condition is met, and if so, exit the loop.
Step 9: End; return the optimal position and fitness.

4.3. Improved Sparrow Search Algorithm Performance Test Simulation

The previous sections described the sparrow search algorithm and the modified sparrow search algorithm; to evaluate their optimization-seeking abilities, three typical benchmark test functions are selected in this section to test the performance of the two methods.
(1)
Test Functions
1.   Sphere function
f_1(x) = Σ_{i=1}^{N} x_i²,   x_i ∈ [−100, 100]   (17)
This function is a nonlinear, symmetric, single-peaked function with only one global extreme point; it achieves its global minimum f_1(x) = 0 at x = (0, 0, …, 0). This function tests the algorithm's optimization-seeking accuracy.
2.   Griewank function
f_2(x) = (1/4000) Σ_{i=1}^{N} x_i² − Π_{i=1}^{N} cos(x_i/√i) + 1,   x_i ∈ [−600, 600]   (18)
This function contains a large number of local extrema over the entire range, with a global minimum f_2(0) = 0. Using this function tests the algorithm's capacity to escape local optima and continue the search.
3.   Rastrigin function
f_3(x) = Σ_{i=1}^{N} [x_i² − 10·cos(2πx_i) + 10],   x_i ∈ [−5.12, 5.12]   (19)
This function is a multi-peaked function with a large number of local extreme points; there is a global minimum at x = (0, 0, …, 0), which is difficult to locate, so this function tests the algorithm's global optimization-seeking ability.
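The three benchmark functions of Equations (17)–(19) are straightforward to implement, which also makes the claimed minima easy to check:

```python
import numpy as np

def sphere(x):                         # Eq. (17)
    x = np.asarray(x, dtype=float)
    return np.sum(x ** 2)

def griewank(x):                       # Eq. (18)
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    return np.sum(x ** 2) / 4000 - np.prod(np.cos(x / np.sqrt(i))) + 1

def rastrigin(x):                      # Eq. (19)
    x = np.asarray(x, dtype=float)
    return np.sum(x ** 2 - 10 * np.cos(2 * np.pi * x) + 10)

zero = np.zeros(10)
print(sphere(zero), griewank(zero), rastrigin(zero))  # 0.0 0.0 0.0
```

All three return 0 at the origin, confirming the global minima stated above; the cosine terms of Griewank and Rastrigin are what create the many local extrema away from it.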
(2)
Comparison of algorithms
To ensure fairness and control the variables in this study, the population size is set to 30 and the maximum number of iterations iter_max is set to 100. The parameter configurations of the three test functions are provided in Table 1:
The results of testing SSA and SMSSA with the three test functions are shown in Figure 3, Figure 4 and Figure 5:
Figure 3, Figure 4 and Figure 5 show the performance curves of the two optimization techniques on the three test functions, from which it can be seen that the convergence speed of SMSSA is substantially quicker than that of SSA. To better assess the convergence outcomes of the two algorithms, each function was run 30 times and the average and standard deviation of the two methods were calculated; the experimental findings are provided in Table 2.
Table 2 shows that the average SSA value on the sphere test function is 1.18 × 10−25, while the average value of SMSSA is 6.23 × 10−34; compared to SSA, the accuracy of SMSSA is nine orders of magnitude greater. On the Griewank function, the average value of SSA is 3.76 × 10−12, while the average value of SMSSA is 7.22 × 10−15, making SMSSA three orders of magnitude more accurate than SSA; on the Rastrigin function, SMSSA is two orders of magnitude more accurate than SSA. From the above, it can be concluded that SMSSA has a better searching ability than SSA: SMSSA finds function values closer to the optimum during the optimization search, indicating better optimization results.

4.4. Construction of SMSSA-BiLSTM Models

According to deep learning theory and the characteristics of the prediction model [23,24], the BiLSTM network model is optimized using the SMSSA method, which is largely divided into the following three phases:
(1)
Establish the structure of the BiLSTM network model, including the number of nodes in the input layer, the number of hidden layers, and the number of nodes in the output layer. The number of nodes in each hidden layer is determined by the optimization algorithm.
(2)
The SMSSA method is used to optimize the learning rate, the number of training epochs, and the number of nodes in the BiLSTM network's hidden layers.
(3)
The optimized model is used to estimate the real load and assess the performance of the model.
To evaluate the quality of the individual positions of the SMSSA population, this study employs the error between the output value of the BiLSTM model and the real value as the fitness function of the SMSSA algorithm. The MAE function is chosen as the fitness function, as given in Equation (20).
MAE = (1/n) Σ_{i=1}^{n} |X̂_i − X_i|   (20)
where X̂_i denotes the predicted value; X_i denotes the true value; and n denotes the number of samples.

4.5. Prediction Process of SMSSA-BiLSTM Model

The specific steps and flowchart of the optimized BiLSTM network prediction model based on SMSSA algorithm are shown in Figure 6:
(1)
Determine the initial parameters of the model: the sparrow population size N, the maximum number of iterations T_max, the upper bound ub and lower bound lb of the population search, the dimension d, and the proportion of discoverers P. Use a random function to set each sparrow's starting location. Set the number of nodes in the BiLSTM network's input and output layers as well as the number of hidden layers.
(2)
Compute each sparrow's fitness and record the positions of the best, suboptimal, and worst individuals, X_b, X_t and X_w, as well as their fitness values, f_b, f_t and f_w.
(3)
Update the sparrow positions with the finder, follower, and scout formulas, and pass the hyperparameters, constrained by the boundary function, into the BiLSTM prediction model to obtain the fitness values. A replacement is made if the fitness of the current sparrow position is better than that of the stored optimal position; otherwise, nothing changes.
(4)
Determine whether the algorithm has finished. If the number of iterations reaches the maximum T_max or the model accuracy requirement is met, output the optimal population position, feed the obtained parameters back into the BiLSTM prediction model, and run the trained, optimized model on the original data to obtain the prediction result.
Figure 6. SMSSA-BiLSTM model prediction flow chart.

5. Simulation Analysis

In this study, the proposed SMSSA-BiLSTM short-term load forecasting model is verified using real load data from the Wuxi region of Jiangsu Province, China, for 2018. The simulation analysis is carried out in three steps: first, the data and simulation environment required for the experiments are determined; then, the established model is simulated on the examples; and finally, the performance of the model is compared and analyzed through the load curves, relative error plots, and indicators such as RMSE, MAE, and MAPE.

5.1. Data Selection and Simulation Environment

(1)
Selection of data
This article uses the actual electric load data of the region in 2018. The load data are sampled at fifteen-minute intervals; that is, there are 4 sampling points per hour and 96 sampling points per day. Since the electric load exhibits different characteristics in different seasons, and spring, fall, and winter behave similarly, these seasons are grouped as the first type of day; summer is the most distinctive and is treated as the second type of day. Accordingly, the power load data from 0:00 on 1 March 2018 to 24:00 on 31 March 2018 serve as the first research object, describing the model's prediction performance in spring, fall, and winter: the data collected before 31 March form the training set, and the data collected on 31 March form the test set. The power load data from 0:00 on 1 July to 24:00 on 31 July serve as the second research object, describing the model's prediction performance in summer: the data collected before 31 July form the training set, and the data collected on 31 July form the test set.
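The daily windowing described above (96 points per day at 15-min resolution, with the last day held out as the test set) can be sketched as follows; this is an illustrative helper, not the authors' code.

```python
def split_days(load_series, points_per_day=96):
    """Group a flat 15-min load series into whole days; the last complete
    day serves as the test set, the preceding days as the training set.
    A trailing partial day, if any, is dropped."""
    days = [load_series[i:i + points_per_day]
            for i in range(0, len(load_series), points_per_day)]
    days = [d for d in days if len(d) == points_per_day]
    return days[:-1], days[-1]
```

For the March experiment, for example, the series would cover 1 to 31 March, leaving 31 March as the held-out test day.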
(2)
Simulation environment
In this paper, MATLAB (version R2021a) is chosen as the simulation and programming environment. The simulations were run on a computer with an Intel Core i7-10875H processor under Windows 11.
(3)
Error evaluation metrics
To assess the performance of the proposed SMSSA-BiLSTM model, the following three error evaluation metrics are used.
$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}$$
$$\mathrm{MAE}=\frac{1}{n}\sum_{i=1}^{n}\left|y_i-\hat{y}_i\right|$$
$$\mathrm{MAPE}=\frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right|\times 100\%$$
where n is the sample size, y_i is the actual value, and ŷ_i is the predicted value.
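For reference, the three metrics follow directly from their definitions; the Python sketch below mirrors the formulas above (the paper's simulations use MATLAB).

```python
import math

def rmse(y, y_hat):
    # Root mean square error over n samples
    n = len(y)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / n)

def mae(y, y_hat):
    # Mean absolute error
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)

def mape(y, y_hat):
    # Mean absolute percentage error, in percent
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(y, y_hat)) / len(y)
```

For example, with actual values [100, 200] and predictions [110, 190], RMSE and MAE are both 10 and MAPE is 7.5%.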

5.2. Simulation Process and Analysis

In this part, a multiple-input, multiple-output model forecasts the 96 load points of the following day from the 96 load points of the previous day and the highest and lowest temperatures of the following day. The BiLSTM model has two hidden layers: hidden layer 1 has 200 nodes and hidden layer 2 has 20 nodes; the learning rate is set to 0.005; the model has 98 input nodes and 96 output nodes. For the optimization search, the number of nodes in each hidden layer is searched over [1, 300], the number of training epochs over [10, 300], and the learning rate over [0.001, 0.01]; the sparrow population size is set to 5 and the maximum number of iterations to 10. In the SSA and SMSSA optimization algorithms, the warning value is set to 0.6, the proportion of discoverers to 0.7, and the proportion of danger-aware sparrows to 0.2; the number of worst sparrows used in the SMSSA simplex search is set to 5. In PSO, the inertia weight is set to 0.8 and both learning factors to 1.5. The simulation results of each combined model follow.
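The 98-feature input described above (the 96 load points of the previous day plus the next day's maximum and minimum temperature) can be assembled as in this illustrative helper; `build_sample` is a hypothetical name, not from the paper.

```python
def build_sample(prev_day_load, t_max, t_min):
    """One multiple-input sample: yesterday's 96 load points plus
    tomorrow's forecast max/min temperature -> 98 input features.
    The corresponding training target is tomorrow's 96 load points."""
    if len(prev_day_load) != 96:
        raise ValueError("expected 96 load points (15-min sampling over one day)")
    return list(prev_day_load) + [t_max, t_min]
```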
Figure 7 and Figure 8 rank the overlap between the real and predicted values of the SMSSA-BiLSTM, SSA-BiLSTM, and PSO-BiLSTM models from high to low. Owing to the sparrow search algorithm's advantage in optimization time, the SSA-BiLSTM and SMSSA-BiLSTM load curves exhibit smaller and more stable fluctuations than the PSO-BiLSTM curves. SMSSA-BiLSTM achieves the best fit, thanks to the stronger optimization-seeking ability of the improved sparrow search algorithm. Their relative-error curves are plotted in Figure 9.
Figure 9 and Figure 10 display the relative errors of the three models for the first and second types of day. The SMSSA-BiLSTM model shows the highest forecast stability and the smallest variation in relative error, with most of its relative errors concentrated near 2%; the SSA-BiLSTM model performs slightly worse; the PSO-BiLSTM model has the largest fluctuation in relative error, with errors concentrated near 5%. This indicates that the SMSSA-BiLSTM model's predictions are the most accurate. To further quantify the prediction errors of each model, Table 3 lists the predicted values of the three models on 31 March.
Table 3 lists the true and predicted values for 31 March at 15 min intervals, from which the error evaluation metrics can be calculated. To quantify each model's inaccuracy, this paper computes the RMSE, MAE, and MAPE evaluation indexes; the resulting errors for each model are shown in Table 4.
As can be seen in Table 4, SMSSA-BiLSTM yields the smallest MAPE for both types of day. For the first type, the MAPE of SMSSA-BiLSTM is 1.44 and 0.62 percentage points lower than that of PSO-BiLSTM and SSA-BiLSTM, respectively; for the second type, it is 0.82 and 0.57 percentage points lower. Compared with the other two algorithms, SMSSA-BiLSTM therefore predicts the load more accurately.

6. Conclusions

This paper first introduced the bidirectional long short-term memory network (BiLSTM), an improvement on the long short-term memory network (LSTM) that learns backward information in addition to forward information and can therefore exploit the load data fully. An enhanced sparrow search algorithm was then presented, which modifies the classical sparrow search algorithm's population initialization, search mechanism, and detection mechanism, addressing the classical algorithm's slow convergence and tendency to fall into local optima. On this foundation, a short-term power load forecasting model was established in which the improved sparrow search algorithm optimizes the BiLSTM's hyperparameters, including the number of hidden-layer nodes, the learning rate, and the number of training epochs, and the model was compared against SSA-BiLSTM and PSO-BiLSTM. Real 2018 load data from Wuxi, Jiangsu Province, China, were chosen for example analysis to verify the prediction performance of the proposed model. The results demonstrate the effectiveness of the SMSSA-BiLSTM model, with small errors and a good fitting effect.

Author Contributions

Conceptualization, C.Z. and F.Z.; methodology, F.G. and W.C.; software, F.Z. and W.C.; validation, C.Z. and F.G.; formal analysis, C.Z. and F.Z.; investigation, F.Z. and C.Z.; resources, F.G.; data curation, C.Z.; writing—original draft preparation, F.Z.; writing—review and editing, C.Z. and F.Z.; visualization, C.Z. and F.G.; supervision, C.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Scientific and Technological Research Projects of Henan Province (No. 232102110286).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors thank the Electric Power Scientific Research Institute of Henan for their collaboration in this research.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wang, K.; Zhang, J.; Li, X.; Zhang, Y. Long-Term Power Load Forecasting Using LSTM-Informer with Ensemble Learning. Electronics 2023, 12, 2175. [Google Scholar] [CrossRef]
  2. Li, J.; Lei, Y.; Yang, S. Mid-long term load forecasting model based on support vector machine optimized by improved sparrow search algorithm. Energy Rep. 2022, 8, 491–497. [Google Scholar] [CrossRef]
  3. Ciechulski, T.; Osowski, S. High Precision LSTM Model for Short-Time Load Forecasting in Power Systems. Energies 2021, 11, 2983. [Google Scholar] [CrossRef]
  4. Cui, C.; He, M.; Di, F.; Lu, Y.; Dai, Y.; Lv, F. Research on power load forecasting method based on LSTM model. In Proceedings of the 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 12–14 June 2020. [Google Scholar]
  5. Butt, F.M.; Hussain, L.; Jafri, S.H.M.; Alshahrani, H.M.; Al-Wesabi, F.N.; Lone, K.J.; El Din, E.M.T.; Duhayyim, M.A. Intelligence based Accurate Medium and Long Term Load Forecasting System. Appl. Artif. Intell. 2022, 36, 2089. [Google Scholar] [CrossRef]
  6. Jin, Y.; Guo, H.; Wang, J.; Song, A. A hybrid system based on LSTM for short-term power load forecasting. Energies 2020, 13, 6241. [Google Scholar] [CrossRef]
  7. Kwon, B.; Park, R.; Song, K. Short-term load forecasting based on deep neural networks using LSTM layer. J. Electr. Eng. Technol. 2020, 15, 1501–1509. [Google Scholar] [CrossRef]
  8. Rafi, S.H.; Masood, N.A.; Deeba, S.R.; Hossain, E. A short-term load forecasting method using integrated CNN and LSTM network. IEEE Access 2020, 9, 32436–32448. [Google Scholar] [CrossRef]
  9. Chao, H.; Lin, F.; Pan, J.; Chien, W.; Lai, C. Power Load Forecasting Based on VMD and Attention-LSTM. In Proceedings of the 3rd International Conference on Data Science and Information Technology, Xiamen, China, 24–26 July 2020. [Google Scholar]
  10. Yang, J.; Zhang, X.; Bao, Y. Short-term Load Forecasting of Central China based on DPSO-LSTM. In Proceedings of the 2021 IEEE 4th International Electrical and Energy Conference (CIEEC), Wuhan, China, 28–30 May 2021. [Google Scholar]
  11. Siami-Namini, S.; Tavakoli, N.; Namin, A.S. The performance of LSTM and BiLSTM in forecasting time series. In Proceedings of the 2019 IEEE International Conference on Big Data (Big Data), Los Angeles, CA, USA, 9–12 December 2019. [Google Scholar]
  12. Yan, L.; Zhang, H. A Variant Model Based on BiLSTM for Electricity Load Prediction. In Proceedings of the 2021 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS), Shenyang, China, 29–31 July 2021. [Google Scholar]
  13. Wang, Z.; Jia, L.; Ren, C. Attention-Bidirectional LSTM Based Short Term Power Load Forecasting. In Proceedings of the 2021 Power System and Green Energy Conference (PSGEC), Shanghai, China, 20–22 August 2021. [Google Scholar]
  14. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  15. Graves, A. Connectionist temporal classification. In Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012; pp. 61–93. [Google Scholar]
  16. Wang, Y.; Sun, S.; Cai, Z. Daily Peak-Valley Electric-Load Forecasting Based on an SSA-LSTM-RF Algorithm. Energies 2023, 16, 7964. [Google Scholar] [CrossRef]
  17. Zhong, B. Deep learning integration optimization of electric energy load forecasting and market price based on the ANN–LSTM–transformer method. Front. Energy Res. 2023, 11, 1292204. [Google Scholar] [CrossRef]
  18. Liu, G.; Guo, J. Bidirectional LSTM with attention mechanism and convolutional layer for text classification. Neurocomputing 2019, 337, 325–338. [Google Scholar] [CrossRef]
  19. Li, Z.; Hu, R.; Liu, X.; Deng, Y.; Tang, P.; Wang, Y. Multi-factor short-term load forecasting model based on PCA-DBILSTM. Proc. CSU-EPSA 2020, 32, 32–39. [Google Scholar]
  20. Gong, P.; Luo, Y.; Fang, Z.; Dou, F. Short-term power load forecasting method based on Attention-BiLSTM-LSTM neural network. J. Comput. Appl. 2021, 41, 81–86. [Google Scholar]
  21. Xue, J.; Shen, B. A novel swarm intelligence optimization approach: Sparrow search algorithm. Syst. Sci. Control Eng. 2020, 8, 22–34. [Google Scholar] [CrossRef]
  22. Liu, C.; He, Q. Simplex-guided sparrow search algorithm with improved search mechanism. Comput. Eng. Sci. 2021, 44, 2238–2245. [Google Scholar]
  23. Yaprakdal, F.; Arısoy, M.V. A Multivariate Time Series Analysis of Electrical Load Forecasting Based on a Hybrid Feature Selection Approach and Explainable Deep Learning. Appl. Sci. 2023, 13, 12946. [Google Scholar] [CrossRef]
  24. Alghamdi, H.; Hafeez, G.; Ali, S.; Ullah, S.; Khan, M.I.; Murawwat, S.; Hua, L. An Integrated Model of Deep Learning and Heuristic Algorithm for Load Forecasting in Smart Grid. Mathematics 2023, 11, 4561. [Google Scholar] [CrossRef]
Figure 1. Structural diagram of the long short-term memory network. × denotes the Hadamard product and + denotes element-wise addition.
Figure 2. BiLSTM structure diagram.
Figure 3. Iterative change diagram of sphere function.
Figure 4. Iterative change diagram of Griewank function.
Figure 5. Iterative change diagram of Rastrigin function.
Figure 7. Forecast results for 31 March.
Figure 8. Forecast results for 31 July.
Figure 9. Prediction errors of different network models on 31 March.
Figure 10. Prediction errors of different network models on 31 July.
Table 1. Test function parameter settings.

Test Function | Dimension | Search Space | Target Value
Sphere | 20 | (−100, 100) | 0.0
Griewank | 20 | (−600, 600) | 0.0
Rastrigin | 20 | (−5.12, 5.12) | 0.0
Table 2. Comparison of performance between the SSA algorithm and the SMSSA algorithm.

Test Function | Algorithm | Theoretical Optimum | Average Value | Standard Deviation
Sphere | SSA | 0 | 1.18 × 10−25 | 3.76 × 10−25
Sphere | SMSSA | 0 | 6.23 × 10−34 | 3.19 × 10−34
Griewank | SSA | 0 | 3.76 × 10−12 | 3.72 × 10−12
Griewank | SMSSA | 0 | 7.22 × 10−15 | 6.38 × 10−15
Rastrigin | SSA | 0 | 1.58 × 10−11 | 5.00 × 10−11
Rastrigin | SMSSA | 0 | 3.29 × 10−13 | 1.28 × 10−13
Table 3. Predicted values of different load forecasting models.

Sampling Point | Actual Value (MW) | PSO-BiLSTM (MW) | SSA-BiLSTM (MW) | SMSSA-BiLSTM (MW)
1 | 3048 | 2747.11 | 2903.35 | 3123.14
2 | 3067 | 2909.45 | 3066.95 | 3131.26
3 | 3052 | 2958.33 | 3046.46 | 3103.63
4 | 3029 | 2961.12 | 3046.35 | 3121.46
5 | 3025 | 2988.31 | 2972.88 | 3099.32
6 | 3016 | 2997.53 | 2895.25 | 3066.20
7 | 3019 | 2947.23 | 2831.53 | 3070.89
8 | 3001 | 2985.50 | 2817.96 | 2947.12
9 | 2959 | 2992.92 | 2832.80 | 2897.79
10 | 2927 | 3012.80 | 2922.70 | 2936.87
11 | 2953 | 3010.87 | 2898.15 | 2939.44
12 | 2929 | 2972.52 | 2945.42 | 3005.58
13 | 2927 | 2922.96 | 3011.40 | 3026.32
14 | 2911 | 2886.00 | 3046.37 | 3001.41
15 | 2887 | 2804.23 | 2924.19 | 2917.03
16 | 2906 | 2678.05 | 3046.82 | 2864.42
17 | 2904 | 2657.74 | 3010.11 | 2817.48
18 | 2917 | 2686.11 | 3012.36 | 2849.18
19 | 2908 | 2737.43 | 3018.44 | 2828.40
20 | 2918 | 2821.99 | 3058.16 | 2832.48
21 | 2937 | 2954.65 | 3027.39 | 2852.91
22 | 2944 | 3076.14 | 3102.34 | 2903.58
23 | 2981 | 3164.35 | 3022.83 | 2896.01
24 | 2998 | 3272.54 | 2969.08 | 2988.25
25 | 3095 | 3374.65 | 3029.18 | 3046.53
26 | 3198 | 3411.10 | 3018.59 | 3155.55
27 | 3230 | 3393.57 | 2966.55 | 3207.54
28 | 3252 | 3395.72 | 3109.89 | 3293.16
29 | 3296 | 3430.21 | 3191.35 | 3327.88
30 | 3300 | 3493.47 | 3338.41 | 3458.48
31 | 3437 | 3588.89 | 3435.89 | 3555.30
32 | 3669 | 3704.28 | 3627.80 | 3677.18
33 | 3723 | 3803.90 | 3771.55 | 3800.62
34 | 3824 | 3935.53 | 3948.94 | 3936.64
35 | 3956 | 3982.81 | 3947.52 | 4063.60
36 | 4035 | 4090.46 | 4048.83 | 4103.73
37 | 4057 | 4139.46 | 4077.96 | 4159.73
38 | 4105 | 4241.07 | 4037.80 | 4220.50
39 | 4180 | 4271.11 | 4014.56 | 4246.69
40 | 4164 | 4248.36 | 4015.98 | 4186.14
41 | 4201 | 4198.59 | 4028.03 | 4188.56
42 | 4160 | 4220.04 | 4017.73 | 4096.38
43 | 4103 | 4035.01 | 3918.28 | 3974.41
44 | 4067 | 3901.14 | 3855.39 | 3917.36
45 | 3827 | 3874.28 | 3721.49 | 3776.09
46 | 3685 | 3744.66 | 3581.93 | 3679.55
47 | 3509 | 3686.84 | 3535.65 | 3632.12
48 | 3595 | 3688.05 | 3610.30 | 3684.94
49 | 3681 | 3673.98 | 3564.26 | 3709.10
50 | 3847 | 3639.41 | 3682.40 | 3817.39
51 | 3879 | 3655.12 | 3787.28 | 3823.51
52 | 3946 | 3666.19 | 3774.13 | 3929.55
53 | 3980 | 3699.21 | 3724.60 | 3915.93
54 | 3977 | 3726.61 | 3789.13 | 3910.37
55 | 3922 | 3726.48 | 3804.39 | 3984.26
56 | 3896 | 3857.93 | 3850.61 | 3997.45
57 | 3967 | 3897.54 | 3906.38 | 4019.75
58 | 3910 | 3927.59 | 4012.46 | 4080.65
59 | 3993 | 3909.16 | 4010.41 | 4093.94
60 | 3953 | 3938.80 | 3957.31 | 4060.62
61 | 3886 | 3860.73 | 3859.73 | 4104.93
62 | 3941 | 3917.16 | 3803.81 | 4027.03
63 | 3973 | 3940.04 | 3749.71 | 4032.20
64 | 3963 | 4004.98 | 3794.97 | 4019.75
65 | 3986 | 3937.14 | 3801.11 | 4021.05
66 | 3939 | 3949.48 | 3821.72 | 4057.08
67 | 3874 | 3880.77 | 3839.87 | 4058.12
68 | 3892 | 3825.82 | 3749.89 | 3982.33
69 | 3732 | 3740.79 | 3623.82 | 3896.62
70 | 3589 | 3814.75 | 3573.08 | 3857.26
71 | 3575 | 3826.02 | 3542.31 | 3734.95
72 | 3623 | 3734.69 | 3554.83 | 3645.18
73 | 3665 | 3770.85 | 3650.56 | 3618.00
74 | 3626 | 3769.68 | 3695.80 | 3600.15
75 | 3664 | 3665.84 | 3784.51 | 3613.52
76 | 3753 | 3550.13 | 3781.90 | 3620.93
77 | 3717 | 3568.19 | 3805.00 | 3692.27
78 | 3743 | 3495.66 | 3771.43 | 3729.30
79 | 3705 | 3424.45 | 3715.54 | 3775.04
80 | 3673 | 3432.33 | 3625.07 | 3739.84
81 | 3599 | 3457.64 | 3639.64 | 3709.07
82 | 3645 | 3490.29 | 3515.21 | 3651.94
83 | 3605 | 3521.31 | 3415.00 | 3642.90
84 | 3537 | 3623.05 | 3404.88 | 3587.46
85 | 3544 | 3643.25 | 3371.99 | 3510.71
86 | 3581 | 3692.53 | 3364.86 | 3492.24
87 | 3521 | 3703.73 | 3410.67 | 3504.95
88 | 3492 | 3701.34 | 3469.24 | 3393.38
89 | 3425 | 3610.68 | 3431.17 | 3383.56
90 | 3342 | 3620.40 | 3487.35 | 3323.01
91 | 3319 | 3472.42 | 3410.15 | 3233.84
92 | 3233 | 3354.25 | 3247.19 | 3119.12
93 | 3155 | 3242.35 | 3129.31 | 3051.70
94 | 3126 | 3240.47 | 3105.42 | 3027.70
95 | 3064 | 3198.01 | 2933.46 | 3059.66
96 | 3019 | 3280.52 | 3041.63 | 3259.66
Table 4. Comparison of prediction errors of different models.

Date Type | Error | PSO-BiLSTM | SSA-BiLSTM | SMSSA-BiLSTM
First type | RMSE | 147.7229 | 114.5868 | 90.1895
First type | MAE | 121.1514 | 93.2343 | 74.3971
First type | MAPE | 3.54% | 2.72% | 2.10%
Second type | RMSE | 156.6372 | 143.8687 | 115.0086
Second type | MAE | 119.8673 | 115.3176 | 92.1552
Second type | MAPE | 3.10% | 2.85% | 2.28%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


