Article

A Hybrid Grey System Model Based on Stacked Long Short-Term Memory Layers and Its Application in Energy Consumption Forecasting

School of Science, Southwest University of Science and Technology, Mianyang 621010, China
*
Author to whom correspondence should be addressed.
Processes 2024, 12(8), 1749; https://doi.org/10.3390/pr12081749
Submission received: 10 July 2024 / Revised: 7 August 2024 / Accepted: 14 August 2024 / Published: 20 August 2024

Abstract

Accurate energy consumption prediction is crucial for addressing energy scheduling problems. Traditional machine learning models often struggle with small-scale datasets and nonlinear data patterns. To address these challenges, this paper proposes a hybrid grey model based on stacked LSTM layers. This approach leverages neural network structures to enhance feature learning and harnesses the strengths of grey models in handling small-scale data. The model is trained using the Adam algorithm, with hyperparameters tuned by the grid search algorithm. We use the latest annual data on coal, electricity, and gasoline consumption in Henan Province as the application background. The model’s performance is evaluated against nine machine learning models and fifteen grey models based on four kinds of performance metrics. Our results show that, during the testing phase, the proposed model achieves the smallest prediction errors on all reported indicators (RMSE, MAE, MAPE, TIC, U1, U2) compared with the 15 grey system models and 9 machine learning models, indicating higher prediction accuracy and stronger generalization performance. Additionally, the study investigates the impact of the number of LSTM layers on the model’s prediction performance, concluding that while increasing the number of layers initially improves prediction performance, too many layers lead to overfitting.

1. Introduction

Energy consumption reflects a region’s energy demand, industrial development, and economic growth, serving as a critical metric for devising effective energy scheduling strategies. Therefore, accurate prediction of energy consumption is necessary. However, predicting energy consumption with a small sample size and nonlinear dataset is highly challenging. The grey system model, grounded in the grey differential equation, features a simpler structure than machine learning models. It effectively learns and utilizes the characteristics of limited data. This capability has spurred the extensive exploration and adoption of the grey model in the energy sector. The grey model (GM) was initially proposed by Deng in 1983 [1]. In 1984, Deng further advanced this concept by introducing practical grey forecasting models known as GM(1,1) and GM(1,N), which were notably applied to forecast long-term grain output in China [2]. Building upon Deng’s foundational work, subsequent models such as DGM(1,1) [3] and DGM(1,N) [3] were developed as derivatives of GM(1,1) and GM(1,N), respectively.
After decades of advancement, grey modeling techniques have attained a level of maturity. Predominantly linear GMs can be categorized into univariate and multivariate models. Additionally, both types can be further distinguished based on whether they are continuous or discrete models. The majority of univariate grey models stem from the foundational GM(1,1) model. In 2020, Wang introduced a seasonal grey model, DSGM(1,1), which incorporates dynamic seasonal adjustment factors, significantly enhancing prediction accuracy [4]. Concurrently, Wu proposed the CFNGM(1,1,k,c) model in 2020, utilizing novel concepts of conformable fractional accumulation and differentiation, applied specifically to carbon dioxide prediction [5]. Following this, in 2021, Liu introduced the GM(1,1) power model, leveraging the principle of adjacent accumulation and validating its efficacy through case studies involving four central European countries [6]. In 2019, Luo introduced the DGMP(1,1,N) model, demonstrating its robust fitting and forecasting accuracy [7]. In 2020, Zhou developed an innovative discrete grey model, DGMNF(1,1), integrating considerations of nonlinearity and fluctuation. Two empirical examples were presented to validate the efficacy and reliability of that model [8]. Additionally, in 2021, Qian proposed the SADGM(1,1) model, an innovative adaptive discrete grey system model. Its performance was benchmarked against various grey and non-grey prediction methods using three real-world cases, affirming its feasibility and comparative superiority [9]. Similarly, the development of multivariate grey models originates from the GM(1,N) framework. In 2018, Ding proposed the grey DFCGM(1,N) model, employing dummy variables to effectively capture future trends influenced by these variables [10]. 
Concurrently, in 2018, another GM(1,N) model was introduced for mixed-frequency data, addressing challenges arising from inconsistent statistical frequencies in system feature and correlative factor series, especially under small-sample conditions [11]. Subsequently, in 2021, Luo introduced the TDAGM(1,N) model, applying it to analyze food production dynamics [12]. In 2013, He introduced the D-GMC(1,N) model to address boundary effects associated with bi-dimensional empirical mode decomposition (BEMD), demonstrating its effectiveness [13]. In 2019, Ding developed the CDGM(1,N) model, an enhanced discrete grey multivariable model utilized for forecasting output values in eastern high-tech industries [14]. Lastly, in 2020, Ding proposed a novel discrete grey system model incorporating grey power indices within its structural framework [15].
As research progresses in GM methodologies, it becomes increasingly evident that linearly structured GMs encounter challenges in effectively forecasting nonlinear data. As a result, there has been an increasing interest in developing nonlinear GMs utilizing various approaches. Nonlinear GMs typically explore three primary methodologies: kernel methods, integration of nonlinear mathematical constructs such as $y^{2}(t)$ and $y^{\gamma}(t)$, and incorporation of neural networks. In 2018, Ma introduced a nonlinear multivariate grey system model known as the kernel-based GM(1,N) or KGM(1,N), leveraging the kernel method. It was demonstrated that the KGM(1,N) model achieved higher efficiency compared to existing linear multivariate grey models and the Least-Squares Support Vector Machine (LSSVM) [16]. In 2020, Duna introduced an enhanced version of the KGM(1,N) model, incorporating a Gaussian vector basis kernel function and a global polynomial kernel function, which demonstrated improved capability in handling nonlinearity [17]. In 2024, Ma presented the GMW-KRGM, a kernel ridge grey system model integrating an expanded parametric Morlet wavelet. Through analysis across six real-world examples, Ma illustrated the model’s superior accuracy in managing nonlinear data [18]. In 2017, Shaikh proposed the grey Verhulst model and conducted a forecast analysis focusing on the gas consumption of China [19]. In 2020, Xiao developed the grey Riccati–Bernoulli model (GRBM(1,1)) by transforming a differential equation based on the concept of differential information. This model was validated through four examples, showcasing its effectiveness compared to existing models [20]. Also in 2020, Mao utilized the Lotka–Volterra model to measure and forecast the influence of commercial banks’ online payment systems on the growth of third-party online payment systems [21].
In 2021, Ma introduced the neural grey system model, highlighting its superior performance relative to other models and emphasizing its robust applicability across different scenarios [22]. In 2023, Liu developed an advanced conformable fractional-order grey forecasting model, integrating a pioneering accumulation mechanism grounded in the generalized conformable fractional calculus. This model showcased enhanced predictive capabilities, surpassing existing models in precision [23]. Also in 2023, Xie constructed a nonlinear grey multivariate model for energy structure forecasting by leveraging differential equations and grey differential data, successfully predicting China’s energy consumption trends [24]. In the same year, Wei introduced an innovative nonlinear grey Bernoulli model employing a physics-preserving Cusum operator. This method was utilized for extracting intrinsic dynamics from short-term traffic flow data, with outcomes confirming its efficacy [25].
Nowadays, grey system models have been extensively used in the energy field, for example in power load forecasting, energy consumption prediction, and energy price forecasting. In 2016, Zhao successfully applied the Rolling-ALO-GM(1,1) model to predict annual power load with significant results [26]. In 2019, Jin proposed a novel grey model incorporating grey correlation and applied it to short-term power load forecasting, demonstrating its superior forecasting accuracy compared to existing methods [27]. In 2017, Zeng introduced the NSGM(1,1) model for predicting the trend of China’s total energy consumption [28]. In 2021, Guo utilized an enhanced GM(1,1) model to forecast energy usage of residential air source heat pump water heaters, validating the effectiveness of the predictions [29]. In 2022, Li developed a nonlinear grey system model utilizing grey difference information to forecast energy price, energy consumption, and economic growth. This model successfully predicted coal prices and consumption in China from 2021 to 2025 [30]. Also in 2022, Lei constructed the PGM(1,2,a,b) model to predict electricity price, demonstrating its efficient and accurate short-term forecasting capabilities [31]. In 2023, Duan integrated the logistic model of energy structure into the systemic framework, devising a novel grey prediction model. That model, when applied to the case study of China’s electricity consumption, exhibited commendable predictive performance [32]. Also within that year, Pandey utilized the grey forecasting model DGM(1,1,$\alpha$) to predict non-renewable and renewable energy from diverse sources, including hydro, solar, wind, and bioenergy [33]. In 2023, Zhao combined a fractional-order cumulative operator with a new information-priority accumulation method to create a hybrid grey univariate model, which was used to predict energy consumption in southwestern China [34].
In 2024, Yuan integrated a grey system model with Gaussian process residual uncertainty analysis and seasonal trend decomposition using LOESS to forecast carbon emissions in developed countries [35]. That same year, He used the vector-valued Bernoulli equation to establish a nonlinear multivariable grey Bernoulli model for predicting fuel and crude oil prices [36]. Despite their relative maturity in model structure and application, existing grey models still exhibit limitations when confronted with nonlinear data, hindering their ability to effectively capture data characteristics.
Long Short-Term Memory (LSTM) is a type of recurrent neural network employing a gating mechanism that enhances its ability to learn features in data, particularly effective for processing long time series. Initially introduced by Hochreiter in 1997 [37], LSTM has since evolved with various extensions, including bidirectional LSTM-CRF [38], multiplicative LSTM [39], and convolutional LSTM [40]. LSTM models have been widely used in the field of energy forecasting. In 2023, Wang deployed the GA-LSTM model for forecasting ship fuel consumption, successfully predicting fuel usage across various conditions [41]. In the same year, Lu introduced an innovative multi-source transfer learning model for short-term energy forecasting, leveraging LSTM networks combined with multi-kernel maximum mean discrepancy for domain adaptation [42]. This model addressed the challenge of limited historical data in predicting energy usage for diverse building types. In the same year, Lu proposed the Prophet-EEMD-LSTM model for workshop power consumption forecasting, demonstrating its high predictive accuracy [43]. However, the LSTM model shares common drawbacks with neural networks, where its predictive performance relies heavily on large-scale datasets. When handling small-scale datasets, it frequently encounters challenges such as overfitting and convergence to local optima.
Considering the strengths and weaknesses of both the grey model and LSTM, a natural consideration is to integrate the LSTM model into the grey model framework to leverage their respective advantages. Thus, in this paper, we propose a hybrid grey system model based on stacked LSTM layers, aiming to synergize the strengths of both approaches and enhance the handling of nonlinear and small-scale data efficiently. The idea for this combined framework originates from reference [22], which focuses on using a more complex optimization algorithm to train the model and employs only a simple neural network. This paper embeds stacked LSTM layers to develop a new hybrid grey model, using a simpler and more user-friendly algorithm to train the model, ultimately resulting in an effective model. Additionally, since the neural grey model framework in reference [22] has not been extensively studied, we chose to use this framework to build our model in order to verify its effectiveness and fill the research gap. Furthermore, the application of the neural grey model in predicting the annual consumption of electricity, coal, and gasoline energy in Henan Province is still unexplored, with most research on Henan Province focusing on agriculture. To address this gap, we apply the proposed hybrid grey model to predict the annual consumption of electricity, coal, and gasoline energy in Henan Province and verify the model’s prediction performance.
The rest of the paper is organized as follows. Section 2 presents the methodology, including the generic formulation of the grey system model, the proposed GreySLstm model, and its solution; Section 3 presents applications to three Henan energy consumption datasets; Section 4 gives the conclusion.

2. Methodology

2.1. The General Formulation of a Grey System Model

Given the original sequences $I_i^{(0)}(p)$ and $T^{(0)}(p)$ ($p = 1, 2, \dots, N$), the first-order accumulations $I_i^{(1)}(p)$ and $T^{(1)}(p)$ can be obtained:
$$I_i^{(1)}(p) = \sum_{t=1}^{p} I_i^{(0)}(t), \qquad T^{(1)}(p) = \sum_{t=1}^{p} T^{(0)}(t), \qquad p = 1, 2, \dots, N, \tag{1}$$
where Equation (1) is called 1-AGO [44], and the structure of 1-AGO is shown in Figure 1.
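As a minimal sketch of Equation (1), the 1-AGO is a running sum, and taking first differences undoes it. The function names `ago` and `iago` and the sample values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ago(x0):
    """First-order accumulated generating operation (1-AGO): running sum of x0."""
    return np.cumsum(np.asarray(x0, dtype=float), axis=0)

def iago(x1):
    """Inverse operation (1-IAGO): first differences, keeping the first value."""
    x1 = np.asarray(x1, dtype=float)
    return np.concatenate([x1[:1], np.diff(x1, axis=0)], axis=0)

# Hypothetical raw series (illustrative values only)
t0 = np.array([2.0, 3.0, 5.0, 4.0])
t1 = ago(t0)          # accumulated series T^(1)
restored = iago(t1)   # recovers the raw series T^(0)
```

Both helpers operate along the first axis, so the same calls would work for a multivariate input stored as an array of shape (N, n).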
The general whitening formula can be explained as follows:
$$\frac{dT^{(1)}(p)}{dp} + a\,T^{(1)}(p) = f\left(I^{(1)}(p); \theta\right), \tag{2}$$
Here, $I^{(1)}(p) = (I_1^{(1)}(p), I_2^{(1)}(p), \dots, I_n^{(1)}(p))$, $a$ denotes the development factor, and the vector $\theta$ consists of parameters associated with the input series. The function $f(\cdot)$ depends on the variable $p$.
Discretizing the whitening equation gives the discrete version of Equation (2):
$$T^{(0)}(k) + a\,z^{(1)}(k) = f\left(\tfrac{1}{2}\left(I^{(1)}(k-1) + I^{(1)}(k)\right); \theta\right), \tag{3}$$
where $z^{(1)}(k) = 0.5\,[T^{(1)}(k-1) + T^{(1)}(k)]$ is called the background value.
For convenience, we let $v_k = \frac{1}{2}(I^{(1)}(k-1) + I^{(1)}(k))$, so Equation (3) can be written as follows:
$$T^{(0)}(k) + a\,z^{(1)}(k) = f(v_k; \theta), \tag{4}$$
In a grey system model, Equation (4) is used to compute the values of $a$ and $\theta$. After obtaining the parameters of the GM model, we need to construct the prediction equation. Solving the whitening equation (Equation (2)) with the initial condition $T^{(1)}(1) = T^{(0)}(1)$ gives the response function in continuous form:
$$T^{(1)}(p) = T^{(0)}(1)\,e^{-a(p-1)} + \int_{1}^{p} e^{-a(p-r)} f\left(I^{(1)}(r); \theta\right) dr, \tag{5}$$
Because the inputs are only available at discrete points, we need a numerical approximation of this integral. By discretizing the integral term on the right side of Equation (5), we obtain its discrete form:
$$\hat{T}^{(1)}(k) = T^{(0)}(1)\,e^{-a(k-1)} + \sum_{r=2}^{k} e^{-a\left(k-r+\frac{1}{2}\right)} \cdot f\left(\tfrac{1}{2}\left(I^{(1)}(r-1) + I^{(1)}(r)\right); \theta\right), \tag{6}$$
Using the definition of $v_k$ above, evaluated at index $r$, Equation (6) can also be written as follows:
$$\hat{T}^{(1)}(k) = T^{(0)}(1)\,e^{-a(k-1)} + \sum_{r=2}^{k} e^{-a\left(k-r+\frac{1}{2}\right)} \cdot f\left(v_r; \theta\right), \tag{7}$$
Thus, according to Equation (7), we can calculate the forecasting value $\hat{T}^{(1)}(k)$. Then, by using the inverse AGO (1-IAGO), we obtain the value of $\hat{T}^{(0)}(k)$. The expression of 1-IAGO can be written as follows:
$$\hat{T}^{(0)}(k) = \hat{T}^{(1)}(k) - \hat{T}^{(1)}(k-1). \tag{8}$$
The general form of the grey model summarized in Equations (2)–(8) follows Ma’s paper [22].
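The forecasting step can be sketched in a few lines: evaluate the discrete response function (Equation (7)) and then difference the result with 1-IAGO (Equation (8)). This is a sketch under stated assumptions, not the authors' implementation: the function name `grey_forecast`, its argument layout, and the demo values are hypothetical, and `f` is any user-supplied callable evaluated at the midpoint of the accumulated inputs at steps $r-1$ and $r$.

```python
import numpy as np

def grey_forecast(t0_first, i1, a, f):
    """Discrete response function (Equation (7)) followed by 1-IAGO (Equation (8)).

    t0_first : first raw target value T^(0)(1)
    i1       : accumulated inputs I^(1)(1..K), shape (K, n)
    a        : development factor
    f        : callable mapping one midpoint vector to a scalar
    """
    i1 = np.asarray(i1, dtype=float)
    K = i1.shape[0]
    # Midpoints (I^(1)(r-1) + I^(1)(r)) / 2 for r = 2..K
    v = 0.5 * (i1[:-1] + i1[1:])
    fv = np.array([f(vr) for vr in v])
    t1_hat = np.empty(K)
    t1_hat[0] = t0_first
    for k in range(2, K + 1):
        r = np.arange(2, k + 1)
        weights = np.exp(-a * (k - r + 0.5))
        t1_hat[k - 1] = t0_first * np.exp(-a * (k - 1)) + np.sum(weights * fv[: k - 1])
    # 1-IAGO: difference the accumulated forecasts
    t0_hat = np.concatenate([t1_hat[:1], np.diff(t1_hat)])
    return t1_hat, t0_hat

# Sanity case (assumed values): with a = 0 and a constant f, each step adds f
i1_demo = np.cumsum(np.ones((5, 2)), axis=0)
t1_hat, t0_hat = grey_forecast(3.0, i1_demo, a=0.0, f=lambda v: 2.0)
```

In the sanity case the exponential weights all equal one, so the accumulated forecast grows by the constant value of `f` at every step, which makes the behaviour easy to check by hand.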

2.2. The Proposed Hybrid Grey System Model Based on Stacked LSTM Layers

The preceding section outlined a conventional grey system model and its solutions. It is evident that current grey system models encounter challenges in predicting nonlinear time series data. This difficulty arises from the traditional parameter estimation of $a$ and $\theta$ in the function $f(\cdot)$ using the least-squares method, which results in the formation of a linear function.
In this section, we construct stacked LSTM (SLSTM) layers as the function $f(\cdot)$ to more effectively capture features from nonlinear data. The architecture of $f(\cdot)$ is depicted in Figure 2. As shown in Figure 2, we develop a neural network comprising multiple LSTM layers and linear layers.
First, the input $I^{(1)}(p)$ is fed into the first LSTM layer. Within that layer, the data pass through three crucial gates: the input gate, forget gate, and output gate. The output value of the input gate is calculated as follows:
$$i^{(1)}(p) = \sigma\left(W_i^{(1)} I^{(1)}(p) + U_i^{(1)} h^{(1)}(p-1) + b_i^{(1)}\right), \tag{9}$$
where $\sigma$ refers to the sigmoid function, which can be expressed as follows:
$$\sigma(x) = \left(1 + e^{-x}\right)^{-1}. \tag{10}$$
Next, the output of the forget gate is obtained as follows:
$$f^{(1)}(p) = \sigma\left(W_f^{(1)} I^{(1)}(p) + U_f^{(1)} h^{(1)}(p-1) + b_f^{(1)}\right), \tag{11}$$
Then, the value of the output gate is calculated as follows:
$$o^{(1)}(p) = \sigma\left(W_o^{(1)} I^{(1)}(p) + U_o^{(1)} h^{(1)}(p-1) + b_o^{(1)}\right), \tag{12}$$
Based on the outputs of the aforementioned gates, the output of the first LSTM layer is obtained as follows:
$$\begin{aligned} c^{(1)}(p) &= f^{(1)}(p) \odot c^{(1)}(p-1) + i^{(1)}(p) \odot \tanh\left(W_c^{(1)} I^{(1)}(p) + U_c^{(1)} h^{(1)}(p-1) + b_c^{(1)}\right), \\ h^{(1)}(p) &= o^{(1)}(p) \odot \tanh\left(c^{(1)}(p)\right), \end{aligned} \tag{13}$$
where $h^{(1)}(p)$ is the output of the first LSTM layer. The symbol $\odot$ denotes element-wise multiplication, and $\tanh$ represents the hyperbolic tangent function, which is defined as follows:
$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}. \tag{14}$$
Then, it passes through the first linear layer:
$$s^{(1)}(p) = h^{(1)}(p)\,W^{(1)} + b^{(1)}. \tag{15}$$
Each time, we take the output of the linear layer as the input for the next LSTM layer. We obtain the output of the last LSTM layer:
$$h^{(z)}(p) = o^{(z)}(p) \odot \tanh\left(c^{(z)}(p)\right), \tag{16}$$
where $z$ indicates that the last LSTM layer is the $z$th LSTM layer. Finally, after mapping, we obtain the output of the stacked LSTM layers:
$$s^{(z)}(p) = h^{(z)}(p)\,W^{(z)} + b^{(z)}, \tag{17}$$
where $s^{(z)}(p)$ also represents the value of $f(I^{(1)}(p))$. In the formulations above, $W_i^{(q)}, U_i^{(q)}, b_i^{(q)}, W_f^{(q)}, U_f^{(q)}, b_f^{(q)}, W_o^{(q)}, U_o^{(q)}, b_o^{(q)}, W_c^{(q)}, U_c^{(q)}, b_c^{(q)}$ ($q = 1, 2, \dots, z$) are the parameters of the LSTM layers, and $W^{(q)}, b^{(q)}$ are the parameters of the linear layers. For ease of representation, we collectively denote the parameters in stacked LSTM layers as $\Theta$. Furthermore, we refer to Gers’s paper [45] to establish Equations (9)–(17).
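The forward pass in Equations (9)–(17) can be sketched in plain NumPy as below. This is a minimal illustration rather than the authors' code: the helper names (`init_layer`, `lstm_linear_layer`, `stacked_lstm`), the random initialization, the zero initial states, and the layer sizes are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_layer(n_in, n_hidden, n_out, rng):
    """Random parameters for one LSTM layer plus its linear layer (illustrative init)."""
    p = {}
    for g in "ifoc":
        p["W" + g] = rng.normal(0.0, 0.1, (n_hidden, n_in))
        p["U" + g] = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        p["b" + g] = np.zeros(n_hidden)
    p["W"] = rng.normal(0.0, 0.1, (n_hidden, n_out))
    p["b"] = np.zeros(n_out)
    return p

def lstm_linear_layer(x_seq, p):
    """Run one LSTM layer over a sequence, then apply its linear layer."""
    n_hidden = p["bi"].shape[0]
    h = np.zeros(n_hidden)   # assumed zero initial hidden state
    c = np.zeros(n_hidden)   # assumed zero initial cell state
    out = []
    for x in x_seq:
        i = sigmoid(p["Wi"] @ x + p["Ui"] @ h + p["bi"])   # input gate, Eq. (9)
        f = sigmoid(p["Wf"] @ x + p["Uf"] @ h + p["bf"])   # forget gate, Eq. (11)
        o = sigmoid(p["Wo"] @ x + p["Uo"] @ h + p["bo"])   # output gate, Eq. (12)
        g = np.tanh(p["Wc"] @ x + p["Uc"] @ h + p["bc"])   # candidate state
        c = f * c + i * g                                   # cell state, Eq. (13)
        h = o * np.tanh(c)                                  # layer output, Eq. (13)
        out.append(h @ p["W"] + p["b"])                     # linear layer, Eq. (15)
    return np.array(out)

def stacked_lstm(x_seq, layers):
    """Feed each linear-layer output into the next LSTM layer (Eqs. (16)-(17))."""
    s = np.asarray(x_seq, dtype=float)
    for p in layers:
        s = lstm_linear_layer(s, p)
    return s

rng = np.random.default_rng(0)
layers = [init_layer(2, 8, 8, rng), init_layer(8, 8, 1, rng)]  # z = 2 layers
x_demo = rng.normal(size=(6, 2))        # six time steps, two input series
s_demo = stacked_lstm(x_demo, layers)   # one scalar output per time step
```

In practice a deep learning framework would supply the LSTM layers and their gradients; the explicit loop is only meant to make the correspondence with the equations visible.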
Thus, Equation (2) can be rewritten as follows:
$$\frac{dT^{(1)}(p)}{dp} + a\,T^{(1)}(p) = f\left(I^{(1)}(p); \Theta\right), \tag{18}$$
where the right-hand side is equal to $s^{(z)}(p)$.
To determine the parameter values, we discretize Equation (18) to obtain its discrete form:
$$T^{(0)}(k) + a\,z^{(1)}(k) = f(v_k; \Theta), \tag{19}$$
where
$$v_k = \frac{1}{2}\left(I^{(1)}(k-1) + I^{(1)}(k)\right). \tag{20}$$
The structure of Equation (19) is shown in Figure 3.
We can derive the response function from Equation (18) under the initial condition $T^{(1)}(1) = T^{(0)}(1)$:
$$T^{(1)}(p) = T^{(0)}(1)\,e^{-a(p-1)} + \int_{1}^{p} e^{-a(p-r)} f\left(I^{(1)}(r); \Theta\right) dr. \tag{21}$$
Similarly, we obtain the discrete form of Equation (21):
$$\hat{T}^{(1)}(k) = T^{(0)}(1)\,e^{-a(k-1)} + \sum_{r=2}^{k} e^{-a\left(k-r+\frac{1}{2}\right)} \cdot f\left(v_r; \Theta\right), \tag{22}$$
where $v_r$ is given by Equation (20) with $k$ replaced by $r$.
Finally, after computing the value of $\hat{T}^{(1)}(k)$, we obtain the predicted value $\hat{T}^{(0)}(k)$ using 1-IAGO (Equation (8)).
The synergy between the stacked LSTM layers and the grey system model in the hybrid grey model is illustrated in Figure 4.

2.3. Adam Algorithm for Training the Proposed Model

In general, neural networks often lack closed-form analytical solutions, necessitating the use of optimization algorithms for iterative parameter updates. Among the various algorithms available, such as Gradient Descent (GD) [46], Stochastic Gradient Descent (SGD) [47], Adaptive Moment Estimation (Adam) [48], Momentum Gradient Descent (MGD) [49], and other algorithms [50], the Adam algorithm combines momentum and second-order moment estimation, enhancing the stability of the optimization process and accelerating convergence. Numerous studies have demonstrated its effectiveness and stability. To optimize the stability and ease of use, we employed the Adam algorithm for model optimization.
First, we need to define the training error $e_k$ at each point ($I^{(1)}(k)$, $T^{(0)}(k)$):
$$e_k = T^{(0)}(k) + a\,z^{(1)}(k) - f(v_k; \Theta), \tag{23}$$
Then, we obtain the total training error $E$:
$$E(a, \Theta) = \frac{1}{N} \sum_{k=2}^{N} e_k^{2} = \frac{1}{N}\, e^{T} e, \tag{24}$$
where $e = (e_2, e_3, \dots, e_N)^{T}$.
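Next, we need to calculate the gradient of the total training error:
$$\nabla E = \left[\frac{\partial E}{\partial a}, \frac{\partial E}{\partial \Theta}\right], \tag{25}$$
where
$$\frac{\partial E}{\partial a} = \frac{2}{N} \sum_{k=2}^{N} \left(T^{(0)}(k) + a\,z^{(1)}(k) - f(v_k; \Theta)\right) z^{(1)}(k), \tag{26}$$
$$\frac{\partial E}{\partial \Theta} = -\frac{2}{N} \sum_{k=2}^{N} \left(T^{(0)}(k) + a\,z^{(1)}(k) - f(v_k; \Theta)\right) \nabla_{\Theta} f(v_k; \Theta). \tag{27}$$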
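In the GD algorithm, each iteration directly uses the gradient:
$$\begin{bmatrix} a_{k+1} \\ \Theta_{k+1} \end{bmatrix} = \begin{bmatrix} a_{k} \\ \Theta_{k} \end{bmatrix} - l \cdot \nabla E, \tag{28}$$
where $l$ is the learning rate of the algorithm.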
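To make the training error and its gradient concrete, the sketch below evaluates them and takes one gradient-descent step. It swaps in a linear stand-in $f(v; \theta) = v\,\theta$ so that $\nabla_{\theta} f = v$ is available in closed form; that stand-in, the function name `loss_and_grad`, and the random demo data are assumptions for illustration only, whereas the proposed model uses stacked LSTM layers for $f$.

```python
import numpy as np

def loss_and_grad(a, theta, t0, z1, v):
    """Training error E and its gradient with respect to a and theta.

    Stand-in assumption: f(v; theta) = v @ theta (linear), so grad_theta f = v.
    t0, z1 : arrays holding T^(0)(k) and z^(1)(k) for k = 2..N
    v      : midpoints v_k for k = 2..N, shape (N-1, n)
    """
    n_total = len(t0) + 1                     # N, since k starts at 2
    e = t0 + a * z1 - v @ theta               # residuals e_k
    loss = np.sum(e ** 2) / n_total
    grad_a = 2.0 / n_total * np.sum(e * z1)
    grad_theta = -2.0 / n_total * (v.T @ e)   # minus sign comes from -f in e_k
    return loss, grad_a, grad_theta

rng = np.random.default_rng(1)
t0, z1, v = rng.normal(size=5), rng.normal(size=5), rng.normal(size=(5, 2))
a, theta = 0.3, np.array([0.5, -0.2])

loss, g_a, g_theta = loss_and_grad(a, theta, t0, z1, v)

# Central finite difference as a sanity check on dE/da
h = 1e-6
fd_a = (loss_and_grad(a + h, theta, t0, z1, v)[0]
        - loss_and_grad(a - h, theta, t0, z1, v)[0]) / (2 * h)

# One gradient-descent step with learning rate l
l = 0.01
a_new, theta_new = a - l * g_a, theta - l * g_theta
new_loss = loss_and_grad(a_new, theta_new, t0, z1, v)[0]
```

The finite-difference check is a cheap way to catch sign errors in a hand-derived gradient before handing it to an optimizer.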
Unlike plain gradient descent, the Adam algorithm additionally computes the bias-corrected first-order moment estimate $\hat{m}_k$ and the bias-corrected second-order raw moment estimate $\hat{\eta}_k$. Before calculating $\hat{m}_k$ and $\hat{\eta}_k$, we need the biased first and second moment estimates, which are denoted by $m_k$ and $\eta_k$, respectively:
$$\begin{aligned} m_k &= \mu_1 \cdot m_{k-1} + (1 - \mu_1) \cdot \nabla E, \\ \eta_k &= \mu_2 \cdot \eta_{k-1} + (1 - \mu_2) \cdot (\nabla E)^{2}, \end{aligned} \tag{29}$$
where $\mu_1$ and $\mu_2$ are the decay rates, and the square is taken element-wise. Then, we can compute the values of $\hat{m}_k$ and $\hat{\eta}_k$:
$$\hat{m}_k = \frac{m_k}{1 - \mu_1^{k}}, \qquad \hat{\eta}_k = \frac{\eta_k}{1 - \mu_2^{k}}. \tag{30}$$
Thus, we can update the parameters based on $\hat{m}_k$ and $\hat{\eta}_k$:
$$\begin{bmatrix} a_{k+1} \\ \Theta_{k+1} \end{bmatrix} = \begin{bmatrix} a_{k} \\ \Theta_{k} \end{bmatrix} - l \cdot \frac{\hat{m}_k}{\sqrt{\hat{\eta}_k} + \epsilon}, \tag{31}$$
where $\epsilon$ is a very small constant. The complete Adam algorithm is shown in Algorithm 1:
Algorithm 1: Adam algorithm for training the hybrid grey system model
 Input: $E(a, \Theta)$ (Equation (24)), learning rate $l$, max_iteration
 Initialize:
  $[a_0, \Theta_0] \leftarrow \mathrm{random}()$; (Initialize the parameter set)
  $\mu_1 \leftarrow 0.9$; (Initialize the exponential decay rate)
  $\mu_2 \leftarrow 0.999$; (Initialize the exponential decay rate)
  $\epsilon \leftarrow 10^{-8}$; (Initialize the small constant)
  $m_0 \leftarrow 0$; (Initialize the 1st moment)
  $\eta_0 \leftarrow 0$; (Initialize the 2nd moment)
  $iteration \leftarrow 0$; (Initialize the number of iterations)
 while $iteration < max\_iteration$ do
   $iteration \leftarrow iteration + 1$;
   $\nabla E \leftarrow$ Equations (25)–(27); (Calculate the objective function gradient)
   $m_k, \eta_k \leftarrow$ Equation (29); (Calculate the first and second moment)
   $\hat{m}_k, \hat{\eta}_k \leftarrow$ Equation (30); (Calculate the bias-corrected first and second moment)
   $[a_{k+1}, \Theta_{k+1}] \leftarrow$ Equation (31); (Update the parameters)
 end
 return $[a_{k+1}, \Theta_{k+1}]$ from the last iteration (Resulting parameters)
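The update loop of Algorithm 1 can be sketched for a generic gradient function as follows. The function name `adam`, its signature, and the quadratic toy objective are illustrative assumptions; the default decay rates and $\epsilon$ are the values listed in Algorithm 1.

```python
import numpy as np

def adam(grad_fn, x0, lr=0.01, mu1=0.9, mu2=0.999, eps=1e-8, max_iteration=1000):
    """Adam updates following Equations (29)-(31) for a generic gradient function."""
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)      # biased first-moment estimate
    eta = np.zeros_like(x)    # biased second-moment estimate
    for k in range(1, max_iteration + 1):
        g = grad_fn(x)
        m = mu1 * m + (1 - mu1) * g
        eta = mu2 * eta + (1 - mu2) * g ** 2
        m_hat = m / (1 - mu1 ** k)        # bias-corrected first moment
        eta_hat = eta / (1 - mu2 ** k)    # bias-corrected second moment
        x = x - lr * m_hat / (np.sqrt(eta_hat) + eps)
    return x

# Toy objective (assumption for illustration): E(x) = ||x - target||^2
target = np.array([1.0, -2.0])
x_star = adam(lambda x: 2.0 * (x - target), x0=np.zeros(2), lr=0.05, max_iteration=2000)
```

For the hybrid grey model, `x` would hold the development factor $a$ together with the flattened network parameters $\Theta$, and `grad_fn` would return the gradient of the training error.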

2.4. Grid Search Algorithm for Tuning Parameters of the Proposed Model

In the preceding section, the Adam algorithm was employed to optimize the model parameters. However, achieving optimal model performance also necessitates tuning hyperparameters such as the number of neurons $L$, the learning rate $l$, and the number of LSTM layers $z$.
Let the model parameter space be denoted as $\Phi$, where each parameter combination is represented by a vector $\phi$ consisting of $L$, $l$, and $z$. A grid search aims to identify the optimal parameter combination $\phi^{*}$ from $\Phi$ by training on the training dataset $D_{train}$, with the objective of optimizing model performance on the validation dataset $D_{val}$. This approach adheres to the mathematical principle:
$$\phi^{*} = \arg\min_{\phi \in \Phi} F(\phi, D_{train}, D_{val}), \tag{32}$$
where the term $F(\phi, D_{train}, D_{val})$ represents the performance metric obtained by training the model with parameter combination $\phi$ on $D_{train}$ and evaluating it on the validation set $D_{val}$. Here, we use the Mean Absolute Percentage Error (MAPE) as the metric:
$$\mathrm{MAPE} = \frac{1}{|D_{val}|} \sum_{i \in D_{val}} \left| \frac{T_i - \hat{T}_i}{T_i} \right| \times 100\%. \tag{33}$$
In our approach, we employ two distinct methods to compute $\hat{T}_i$ during the grid search. Firstly, we utilize the response function (Equation (22)) in conjunction with 1-IAGO (Equation (8)) to derive $\hat{T}_i$. This method is denoted as GreySLstm-M1 for convenience. Secondly, leveraging the previously defined function $f(\cdot)$, we directly use the output of the stacked LSTM layers as $\hat{T}_i$ during the grid search process. This approach is referred to as GreySLstm-M2.
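The selection rule in Equation (32) amounts to an exhaustive loop over the hyperparameter grid, scored by validation MAPE. The sketch below assumes a user-supplied callable `fit_predict` that trains a model and returns validation predictions; that callable, the toy stand-in `toy_fit_predict`, and the grid values are hypothetical and do not reproduce the settings used in the paper.

```python
import itertools
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error in percent (Equation (33))."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def grid_search(fit_predict, grid, train, val_actual):
    """Return the combination in the grid with the lowest validation MAPE.

    fit_predict(params, train) is a user-supplied (hypothetical) callable that
    trains a model with `params` and returns predictions for the validation set.
    """
    best_params, best_score = None, np.inf
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = mape(val_actual, fit_predict(params, train))
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy stand-in for model training (assumption): error shrinks as z nears 2 and l shrinks
def toy_fit_predict(params, train):
    bias = abs(params["z"] - 2) + params["l"]
    return np.array([100.0, 200.0]) * (1.0 + 0.01 * bias)

grid = {"L": [16, 32], "l": [0.001, 0.01], "z": [1, 2, 3]}
best, score = grid_search(toy_fit_predict, grid, train=None, val_actual=[100.0, 200.0])
```

Under this sketch, GreySLstm-M1 and GreySLstm-M2 would differ only in what `fit_predict` returns: forecasts from the response function plus 1-IAGO, or the raw output of the stacked LSTM layers.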

2.5. The Proposed Complete Forecasting Process

Taking into account the previous algorithms and formula derivations, the complete forecasting process of the proposed hybrid grey model is shown in Algorithm 2.
Algorithm 2: Complete forecasting process of the proposed hybrid grey model
 Input: Training input: $I^{(0)}(p)$, $T^{(0)}(p)$, $p = 1, \dots, N$;
  Test input: $I^{(0)}(p)$, $p = N+1, \dots, N+T$;
  Number of neurons $L$, learning rate $l$, and max_iteration;
  Number of LSTM layers $z$;
 Initialize:
  $T^{(1)}(p) \leftarrow \sum_{r=1}^{p} T^{(0)}(r)$; $I_i^{(1)}(p) \leftarrow \sum_{r=1}^{p} I_i^{(0)}(r)$; (Compute the 1-AGO series, Equation (1))
  $iteration = 0$;
 $L, l, z \leftarrow$ Equation (32); (Use GreySLstm-M1 or M2 to select the best $L$, $l$, $z$)
 while $iteration < max\_iteration$ do
   $[a^{*}, \Theta^{*}] \leftarrow$ Algorithm 1; (Use Algorithm 1 to train the model)
 end
 for $p = 2$ to $N+T$ do
   $\hat{T}^{(0)}(p) \leftarrow$ Equations (22) and (8); (Forecast by using the response function and 1-IAGO)
 end
 Result: $\hat{T}^{(0)}(p)$, $p = 1, \dots, N+T$;

3. Application

3.1. Data Collection

To validate the model’s efficacy and evaluate its predictive precision, we applied it to three real-world datasets (as shown in Figure 5). This paper collected annual data on coal consumption in Henan Province from 1995 to 2019, electricity consumption from 1995 to 2022, and gasoline consumption from 1995 to 2019. These data were obtained from the latest available statistics on the official website of the National Bureau of Statistics of China (https://data.stats.gov.cn/index.htm (accessed on 10 July 2024)).

3.2. Selection of Comparison Models and Assessment Criteria

To better evaluate the performance of the proposed model, we compared it with 9 machine learning models and 15 grey system models. Four kinds of metrics were used to quantify prediction accuracy. The detailed information on the grey system models is shown in Table 1, the details of the machine learning models are shown in Table 2, and the metrics used for evaluation are shown in Table 3. In Table 3, D denotes the set of training or testing data, and d denotes the length of D.

3.3. Case 1: Henan’s Coal Consumption

Coal remains a cornerstone of global energy supply, playing a crucial role in power production, industrial processes, and economic stability. In Henan Province, coal production constitutes a significant economic pillar, and coal is one of the primary energy sources. Therefore, predicting coal consumption in Henan Province is of paramount importance. Accurate forecasts can provide a theoretical foundation for the government to formulate relevant energy and economic strategies.
In this paper, we collected the latest annual coal consumption data in Henan Province from 1995 to 2019, totaling 25 data points. The first 20 data points were used for training the model, while the remaining 5 points were reserved for testing. The metrics for model performance are presented in Table 4, and the prediction curves are shown in Figure 6. As seen in Table 4, the proposed GreySLstm-M1 model achieved the best RMSE, MAE, MAPE, TIC, U1, and U2 values in both training and testing phases. During training, the MAE and MAPE of the rf model, the RMSE and U2 of the lstm model, and the TIC and U1 of the convlstm model ranked second. However, in testing, their performance was significantly inferior to that of the GreySLstm-M1 model. Furthermore, the GreySLstm-M2 model performed mediocrely based on the calculated indicators. Figure 6 illustrates that while GreySLstm-M1, GreySLstm-M2, gru, lstm, cnnlstm, and convlstm fitted the training data well, only GreySLstm-M1 maintained a close alignment between predicted and actual points during testing. It is also evident that all grey system models, except the proposed one, performed poorly.

3.4. Case 2: Henan’s Electricity Consumption

Electricity is essential for residents’ daily lives, industrial production, and scientific and technological research. Henan Province, with its large population, has a substantial demand for electricity. Therefore, accurately predicting electricity consumption in Henan is crucial for relevant authorities to formulate effective power distribution strategies.
In this paper, we collected annual electricity consumption data for Henan Province from 1995 to 2022, totaling 28 data points. The first 23 points were used for model training, and the last 5 points were used for prediction analysis. The results of the indicator calculations are presented in Table 5, and the prediction curves of the models are shown in Figure 7. From Table 5, it is evident that the proposed GreySLstm-M1 model achieved the best results for all indicators during testing, while the rf model performed best during training. However, the GreySLstm-M1 model’s training performance was close to that of the rf model, ranking second in all training indicators. During testing, the cnnlstm model ranked second in MAE and MAPE, and the BernoulliGM model ranked second in TIC, U1, and U2. However, their performance on the training set was significantly weaker than that of the GreySLstm-M1 model. The GreySLstm-M2 model showed average performance across the indicators. Figure 7 demonstrates that the prediction curve of GreySLstm-M1 closely aligned with the actual data. Among the comparison models, cnnlstm and convlstm effectively captured the nonlinearity of the real data. In contrast, most grey models produced prediction curves that resembled a straight line or arc, failing to accurately fit nonlinear data.

3.5. Case 3: Henan’s Gasoline Consumption

Gasoline is essential for transportation, powering vehicles, and facilitating the movement of goods and people, thereby driving economic activity and connectivity. Forecasting gasoline consumption in Henan Province can aid in the effective planning and allocation of resources, providing a scientific basis for developing strategies to reduce emissions and transition to alternative fuels.
In this paper, we collected annual gasoline consumption data for Henan Province from 1995 to 2019. The first 18 data points were used for training, and the remaining 7 points were used for testing. The related metrics are shown in Table 6, and the prediction curves are shown in Figure 8. The analysis of Table 6 reveals that the proposed GreySLstm-M1 model achieved the lowest values across all metrics in both training and testing phases. Additionally, the GreySLstm-M2 model demonstrated the second-best performance in all metrics, except for RMSE and U2 during testing. Figure 8 shows that while GreySLstm-M1, GreySLstm-M2, rf, lstm, and svr fitted the actual curve well during training, rf and svr failed to maintain this trend during testing. Furthermore, all grey system models exhibited poor performance in both training and testing, with their prediction curves resembling arcs.

3.6. Discussions

Analysis of the Performance of the Models on Real-World Cases

In the three real-world cases examined, the proposed GreySLstm-M1 model consistently demonstrated superior prediction performance with small-scale, nonlinear datasets. Compared to other models, GreySLstm-M1 maintained the best results across all scenarios, highlighting its versatility and stability. Conversely, the GreySLstm-M2 model showed mediocre performance, likely because it relied too heavily on the role of stacked LSTM layers and did not adequately leverage the strengths of the grey model component. This suggests that solely optimizing neural networks may not ensure superior prediction results for GreySLstm models.
Among the comparison models, the rf model often exhibited excellent fitting during the training phase but performed poorly during the prediction phase, primarily due to overfitting. Other machine learning models also suffered from similar overfitting issues to varying degrees. Moreover, the grey models in the comparison generally performed inadequately across all cases, indicating their inability to effectively handle nonlinear, small-scale datasets.

3.7. Analysis of the Indicator Optimization of the Proposed Model

To further analyze the performance of GreySLstm-M1 and quantify the improvement in prediction accuracy compared to other models, we used the following formula to calculate the degree of optimization:
$$x_i = \frac{M_i - M_{\text{GreySLstm-M1}}}{M_i} \times 100\%$$
where $x_i$ represents the optimization percentage of the GreySLstm-M1 model relative to model $i$ on a certain indicator, $M_i$ represents the value of that indicator for model $i$, and $M_{\text{GreySLstm-M1}}$ represents the value of the same indicator for the GreySLstm-M1 model.
The specific results for each case are listed in Table 7, Table 8 and Table 9, respectively. In case 1, the prediction accuracy of GreySLstm-M1 improved significantly: relative to every comparison model, the RMSE, MAE, MAPE, TIC, U1, and U2 indicators in the testing phase improved by at least 43.7827%, 22.3263%, 25.9755%, 41.4517%, 41.4517%, and 43.7827%, respectively. In case 2, these indicators improved by at least 0.9034%, 8.0892%, 6.8578%, 1.0207%, 1.0207%, and 0.9034%. In case 3, the improvements were at least 18.9155%, 15.4864%, 14.1890%, 18.1348%, 18.1348%, and 18.9155%, respectively. In cases 1 and 3, where the data exhibited high nonlinearity, the proposed model significantly enhanced prediction performance. However, in case 2, where the data were less nonlinear, the performance improvement compared to other models was not as substantial. Therefore, we conclude that the proposed GreySLstm-M1 model is more suitable for small-scale datasets with strong nonlinearity.
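As a worked illustration of the optimization percentage, the sketch below applies the formula to three of the test-phase RMSE values reported for case 1 in Table 4. The function name `improvement_percent` is ours and is not part of the original study; only a subset of the comparison models is included for brevity.

```python
def improvement_percent(m_other, m_proposed):
    """Optimization percentage x_i = (M_i - M_proposed) / M_i * 100."""
    if m_other == 0:
        raise ValueError("the comparison indicator must be non-zero")
    return (m_other - m_proposed) / m_other * 100.0


# Test-phase RMSE values for case 1, copied from Table 4.
proposed_rmse = 928.6467  # GreySLstm-M1
comparison_rmse = {
    "lstm": 1651.8883,
    "xgb": 1661.8960,
    "BernoulliGM": 1671.9496,
}

improvements = {
    name: improvement_percent(value, proposed_rmse)
    for name, value in comparison_rmse.items()
}

# The smallest improvement is obtained against the strongest competitor.
closest_model = min(improvements, key=improvements.get)
print(closest_model, f"{improvements[closest_model]:.4f}")  # lstm 43.7827
```

The minimum over the comparison models corresponds to the "at least" figures quoted above, since it measures the gain over the best competing model.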

Evaluating GreySLstm Performance with Different Numbers of LSTM Layers

To better understand the performance of the proposed GreySLstm model, we conducted a detailed analysis of its prediction performance based on different numbers of LSTM layers. Given that the GreySLstm-M2 model performed significantly worse than the GreySLstm-M1 model in the previous section, we focused exclusively on the GreySLstm-M1 model for the remainder of this study. Hereafter, we refer to GreySLstm-M1 simply as GreySLstm.
To ensure that our conclusions are more representative, we ran the GreySLstm model five times for each LSTM layer configuration and calculated the RMSE, MAE, MAPE, TIC, U1, and U2 indicators for each run. The results for each case are listed in Table 10, and the visualizations are shown in Figure 9, Figure 10 and Figure 11.
In cases 1 and 2, we observe that during both the training and test phases, the indicators initially decreased, then increased, and eventually stabilized. When the test set indicators stabilized, they were significantly larger than those of the training set. In case 3, the indicators fluctuated with an increasing number of LSTM layers but remained relatively stable overall, with test set indicators consistently larger than those of the training set.
From these observations, we conclude that the models in cases 1 and 2 experienced overfitting with too many layers. This occurred because each LSTM layer contained multiple neurons, and when there were too many layers, the total number of neurons could exceed 2–3 times the number of training data points. A model that is too complex not only learns the underlying patterns but also captures noise and irrelevant features, reducing its generalization ability on new data. Furthermore, the performance in case 3 suggests overfitting from the outset, with its indicator trends resembling the stable phase observed in cases 1 and 2. This was likely due to the smaller training set size (18 data points) compared to cases 1 and 2 (20 and 23 data points, respectively).
In summary, the prediction performance of the GreySLstm model generally improves with an increasing number of layers but may decline after reaching an optimal point. If overfitting persists from the beginning, increasing the training set size should be considered to mitigate this issue.
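The repeated-run procedure for choosing the number of LSTM layers can be sketched as below. This is an assumption-laden outline, not the authors' implementation: `select_layer_count` and `toy_run` are hypothetical names, and `toy_run` is a synthetic stand-in whose error first falls and then rises with depth. It does not reproduce any value from Table 10.

```python
from statistics import mean


def select_layer_count(run_model, layer_options, n_runs=5):
    """Average a test metric over repeated runs and return the best layer count.

    run_model(n_layers, seed) is expected to train a model and return a
    test error (lower is better).
    """
    averaged = {}
    for n_layers in layer_options:
        scores = [run_model(n_layers, seed) for seed in range(n_runs)]
        averaged[n_layers] = mean(scores)
    best = min(averaged, key=averaged.get)
    return best, averaged


def toy_run(n_layers, seed):
    # Hypothetical stand-in for training: a U-shaped error curve plus a
    # small seed-dependent offset. Replace with a real training routine.
    return (n_layers - 3) ** 2 + 1.0 + 0.01 * seed


best_layers, mean_errors = select_layer_count(toy_run, layer_options=range(1, 8))
print(best_layers)  # 3
```

Averaging over several runs reduces the influence of random weight initialization, which matters when the training set contains only about twenty points.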

4. Conclusions

4.1. Paper Structure Overview

In this paper, we conducted a literature review on grey models and LSTM models in Section 1, identifying gaps in current research. In Section 2, we introduced the general form of the grey model and then proposed the GreySLstm model, explaining its construction and prediction principles. We also described the parameter optimization process using the Adam and grid search algorithms and summarized the complete prediction process. In Section 3, we collected the latest annual coal, electricity, and gasoline consumption data from Henan Province. We used these data to test the proposed model and quantified the prediction error with six indicators (RMSE, MAE, MAPE, TIC, U1, and U2). We compared our model with 24 other models, demonstrating its superior generalization and prediction performance. We also discussed the results of the three cases and analyzed the impact of the number of LSTM layers on the prediction accuracy of the hybrid grey model.

4.2. Main Findings and Contributions

The results in Section 3 showed that the proposed model outperformed many grey and machine learning models across multiple cases and evaluation indicators. The model demonstrated strong generalization performance in various scenarios, indicating high reliability and applicability in practical applications. This finding broadens current research ideas, proposing a novel integration of grey models with neural networks. It proves that this model framework effectively leverages the strengths of both approaches, resulting in a hybrid grey model capable of handling nonlinear and small-scale data.
The analysis of the impact of the number of LSTM layers on the prediction performance of the GreySLstm model in Section 3 revealed a practical rule: increasing the number of layers initially improves the model’s prediction performance, but excessive layers lead to overfitting. This finding provides a clear guideline for future researchers, helping them identify the optimal layer configuration and avoid overcomplicating the model. It also supports the theoretical assumption that overly complex deep learning models, while performing well on training data, often underperform on test data, reducing generalization performance. Additionally, this insight suggests an important research direction: investigating methods to mitigate overfitting caused by increasing LSTM layers, such as regularization techniques and early stopping.
Compared with other studies, this paper proposed a novel framework combining stacked LSTM layers and grey models, resulting in a hybrid grey model with superior performance. This neural grey model was applied for the first time to predict the annual energy consumption of electricity, coal, and gasoline in Henan Province, demonstrating its effectiveness. This research enhances the combination model, fills a gap in the literature, and provides a reference for future research.

4.3. Analysis of Potential Limitations of the Model

Grey models typically include accumulation operations (1-AGO), which can be time-consuming when handling large-scale data. However, this is a common limitation of grey models, not a specific issue of the proposed framework. Additionally, while the GreySLstm model can process nonlinear and small-scale data, embedding too many LSTM layers is not advisable. Excessive layers can lead to overfitting due to the model’s complexity. Therefore, we recommend setting the number of LSTM layers as a hyperparameter to be adjusted for optimal prediction results.
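For readers unfamiliar with the accumulation step, the sketch below shows the standard first-order accumulated generating operation (1-AGO), which is a running sum, together with its inverse. The function names `ago` and `inverse_ago` and the sample series are illustrative only and are not taken from the paper.

```python
from itertools import accumulate


def ago(series):
    """1-AGO: T1[k] is the sum of the original series up to and including k."""
    return list(accumulate(series))


def inverse_ago(accumulated):
    """Inverse 1-AGO: recover the original series by first differences."""
    if not accumulated:
        return []
    differences = [b - a for a, b in zip(accumulated, accumulated[1:])]
    return [accumulated[0]] + differences


original = [2.0, 3.0, 5.0, 4.0]  # illustrative values only
accumulated = ago(original)
print(accumulated)               # [2.0, 5.0, 10.0, 14.0]
print(inverse_ago(accumulated))  # [2.0, 3.0, 5.0, 4.0]
```

In a grey model the fitted values are produced on the accumulated scale, so the inverse operation is what returns forecasts on the scale of the original data.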

4.4. Recommendations for Model Enhancement

This paper presents the mathematical principles and training methods of the GreySLstm model, but the model could be optimized further. To enhance its robustness, we recommend screening the input data with outlier detection techniques such as isolation forest, Local Outlier Factor (LOF), and density-based spatial clustering of applications with noise (DBSCAN). To improve its generalization ability, we suggest regularization, early stopping, and related techniques that limit overfitting.
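One of the suggested remedies, early stopping, can be outlined as follows. This is a generic patience-based sketch and not part of the GreySLstm training procedure described in this paper: `train_with_early_stopping`, `step`, and `toy_step` are hypothetical names, and the validation loss is a synthetic curve used only to exercise the logic.

```python
def train_with_early_stopping(step, max_epochs=200, patience=10, min_delta=0.0):
    """Stop training once the validation loss has not improved for `patience` epochs.

    step(epoch) is expected to run one training epoch and return the
    validation loss for that epoch.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        val_loss = step(epoch)
        if val_loss < best_loss - min_delta:
            best_loss = val_loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_epoch, best_loss


def toy_step(epoch):
    # Hypothetical validation curve with its minimum at epoch 20.
    return (epoch - 20) ** 2 / 100.0 + 0.5


best_epoch, best_loss = train_with_early_stopping(toy_step)
print(best_epoch, best_loss)  # 20 0.5
```

With series as short as those studied here, holding out a validation subset reduces the already small training set, so the patience and the size of the held-out part would need to be chosen with care.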

4.5. Future Research Directions

The results of this paper demonstrate that embedding neural networks into grey models is effective, leveraging the strengths of neural networks in feature capture and grey models in handling small-scale data. Future research will delve deeper into this model framework to gain a richer and more scientific understanding of its mechanisms.

Author Contributions

Conceptualization, X.M.; methodology, X.M.; validation, Y.H.; formal analysis, X.M.; investigation, Y.H.; resources, Y.H.; data curation, Y.H.; writing—original draft preparation, Y.H.; writing—review and editing, X.M.; visualization, X.M.; supervision, X.M.; project administration, X.M.; funding acquisition, X.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Humanities and Social Science Fund of the Ministry of Education of China, grant number 19YJCZH119.

Data Availability Statement

The data were obtained from the National Bureau of Statistics of China (https://data.stats.gov.cn/index.htm) (accessed on 10 July 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Deng, J. Grey fuzzy forecast and control for grain. J. Huazhong Univ. Sci. Technol. Med. Sci. 1983, 2, 1–8.
  2. Deng, J. Grey dynamic model and its application in the long-term forecasting output of grain. Discov. Nat. 1984, 3, 37–45. (In Chinese)
  3. Xie, N.-M.; Liu, S.-F.; Yang, Y.-J.; Yuan, C.-Q. On novel grey forecasting model based on non-homogeneous index sequence. Appl. Math. Model. 2013, 37, 5059–5068.
  4. Wang, Z.X.; Wang, Z.W.; Li, Q. Forecasting the industrial solar energy consumption using a novel seasonal GM (1, 1) model with dynamic seasonal adjustment factors. Energy 2020, 200, 117460.
  5. Wu, W.; Ma, X.; Zhang, Y.; Li, W.; Wang, Y. A novel conformable fractional non-homogeneous grey model for forecasting carbon dioxide emissions of BRICS countries. Sci. Total Environ. 2020, 707, 135447.
  6. Liu, L.; Wu, L. Forecasting the renewable energy consumption of the European countries by an adjacent non-homogeneous grey model. Appl. Math. Model. 2021, 89, 1932–1948.
  7. Luo, D.; Wei, B.L. A unified treatment approach for a class of discrete grey forecasting models and its application. Syst. Eng.-Theory Pract. 2019, 39, 451–462.
  8. Zhou, W.; Wu, X.; Ding, S.; Pan, J. Application of a novel discrete grey model for forecasting natural gas consumption: A case study of Jiangsu Province in China. Energy 2020, 200, 117443.
  9. Qian, W.; Sui, A. A novel structural adaptive discrete grey prediction model and its application in forecasting renewable energy generation. Expert Syst. Appl. 2021, 186, 115761.
  10. Ding, S.; Dang, Y.G.; Xu, H. Construction and application of GM (1, N) based on control of dummy variables. Control Decis. 2018, 33, 309–315.
  11. Wang, J. The GM (1, N) Model for Mixed-frequency Data and Its Application in Pollutant Discharge Prediction. J. Grey Syst. 2018, 30, 97.
  12. Luo, D.; An, Y.M.; Wang, X.L. Time-delayed accumulative TDAGM (1, N, t) model and its application in grain production. Control Decis. 2021, 36, 2002–2012.
  13. He, Z.; Wang, Q.; Shen, Y.; Wang, Y. Discrete multivariate gray model based boundary extension for bi-dimensional empirical mode decomposition. Signal Process. 2013, 93, 124–138.
  14. Ding, S. A novel discrete grey multivariable model and its application in forecasting the output value of China’s high-tech industries. Comput. Ind. Eng. 2019, 127, 749–760.
  15. Ding, S.; Xu, N.; Ye, J.; Zhou, W.; Zhang, X. Estimating Chinese energy-related CO2 emissions by employing a novel discrete grey prediction model. J. Clean. Prod. 2020, 259, 120793.
  16. Ma, X.; Liu, Z. The kernel-based nonlinear multivariate grey model. Appl. Math. Model. 2018, 56, 217–238.
  17. Duan, H.; Wang, D.; Pang, X.; Liu, Y.; Zeng, S. A novel forecasting approach based on multi-kernel nonlinear multivariable grey model: A case report. J. Clean. Prod. 2020, 260, 120929.
  18. Ma, X.; Deng, Y.; Ma, M. A novel kernel ridge grey system model with generalized Morlet wavelet and its application in forecasting natural gas production and consumption. Energy 2024, 287, 129630.
  19. Shaikh, F.; Ji, Q.; Shaikh, P.H.; Mirjat, N.H.; Uqaili, M.A. Forecasting China’s natural gas demand based on optimised nonlinear grey models. Energy 2017, 140, 941–951.
  20. Xiao, Q.; Gao, M.; Xiao, X.; Goh, M. A novel grey Riccati–Bernoulli model and its application for the clean energy consumption prediction. Eng. Appl. Artif. Intell. 2020, 95, 103863.
  21. Mao, S.; Zhu, M.; Wang, X.; Xiao, X. Grey–Lotka–Volterra model for the competition and cooperation between third-party online payment systems and online banking in China. Appl. Soft Comput. 2020, 95, 106501.
  22. Ma, X.; Xie, M.; Suykens, J.A.K. A novel neural grey system model with Bayesian regularization and its applications. Neurocomputing 2021, 456, 61–75.
  23. Liu, C.; Xu, Z.; Zhao, K.; Xie, W. Forecasting education expenditure with a generalized conformable fractional-order nonlinear grey system model. Heliyon 2023, 9, e16499.
  24. Xie, D.; Li, X.; Duan, H. A novel nonlinear grey multivariate prediction model based on energy structure and its application to energy consumption. Chaos Solitons Fractals 2023, 173, 113767.
  25. Wei, B.; Yang, L.; Xie, N. Nonlinear grey Bernoulli model with physics-preserving Cusum operator. Expert Syst. Appl. 2023, 229, 120466.
  26. Zhao, H.; Guo, S. An optimized grey model for annual power load forecasting. Energy 2016, 107, 272–286.
  27. Jin, M.; Zhou, X.; Zhang, Z.M.; Tentzeris, M.M. Short-term power load forecasting using grey correlation contest modeling. Expert Syst. Appl. 2012, 39, 773–779.
  28. Zeng, B.; Luo, C. Forecasting the total energy consumption in China using a new-structure grey system model. Grey Syst. Theory Appl. 2017, 7, 194–217.
  29. Guo, J.J.; Wu, J.Y.; Wang, R.Z. A new approach to energy consumption prediction of domestic heat pump water heater based on grey system theory. Energy Build. 2011, 43, 1273–1279.
  30. Li, H.; Wu, Z.; Yuan, X.; Yang, Y.; He, X.; Duan, H. The research on modeling and application of dynamic grey forecasting model based on energy price-energy consumption-economic growth. Energy 2022, 257, 124801.
  31. Lei, M.; Feng, Z. A proposed grey model for short-term electricity price forecasting in competitive power markets. Int. J. Electr. Power Energy Syst. 2012, 43, 531–538.
  32. Duan, H.; Pang, X. A novel grey prediction model with system structure based on energy background: A case study of Chinese electricity. J. Clean. Prod. 2023, 390, 136099.
  33. Pandey, A.K.; Singh, P.K.; Nawaz, M.; Kushwaha, A.K. Forecasting of non-renewable and renewable energy production in India using optimized discrete grey model. Environ. Sci. Pollut. Res. 2023, 30, 8188–8206.
  34. Zhao, X.; Ma, X.; Cai, Y.; Yuan, H.; Deng, Y. Application of a novel hybrid accumulation grey model to forecast total energy consumption of Southwest Provinces in China. Grey Syst. Theory Appl. 2023, 13, 629–656.
  35. Yuan, H.; Ma, X.; Ma, M.; Ma, J. Hybrid framework combining grey system model with Gaussian process and STL for CO2 emissions forecasting in developed countries. Appl. Energy 2024, 360, 122824.
  36. He, Q.; Ma, X.; Zhang, L.; Li, W.; Li, T. The nonlinear multi-variable grey Bernoulli model and its applications. Appl. Math. Model. 2024, 134, 635–655.
  37. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  38. Huang, Z.; Xu, W.; Yu, K. Bidirectional LSTM-CRF models for sequence tagging. arXiv 2015, arXiv:1508.01991.
  39. Krause, B.; Lu, L.; Murray, I.; Renals, S. Multiplicative LSTM for sequence modelling. arXiv 2016, arXiv:1609.07959.
  40. Shi, X.; Chen, Z.; Wang, H.; Yeung, D.-Y.; Wong, W.-K.; Woo, W.-C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28.
  41. Wang, K.; Hua, Y.; Huang, L.; Guo, X.; Liu, X.; Ma, Z.; Ma, R.; Jiang, X. A novel GA-LSTM-based prediction method of ship energy usage based on the characteristics analysis of operational data. Energy 2023, 282, 128910.
  42. Lu, H.; Wu, J.; Ruan, Y.; Qian, F.; Meng, H.; Gao, Y.; Xu, T. A multi-source transfer learning model based on LSTM and domain adaptation for building energy prediction. Int. J. Electr. Power Energy Syst. 2023, 149, 109024.
  43. Lu, Y.; Sheng, B.; Fu, G.; Luo, R.; Chen, G.; Huang, Y. Prophet-EEMD-LSTM based method for predicting energy consumption in the paint workshop. Appl. Soft Comput. 2023, 143, 110447.
  44. Deng, T.F.; Gui, Y.; Yan, J.Y. Prediction and analysis of tunnel crown settlement based on grey system theory. Adv. Mater. Res. 2012, 490, 423–427.
  45. Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to forget: Continual prediction with LSTM. Neural Comput. 2000, 12, 2451–2471.
  46. Andrychowicz, M.; Denil, M.; Gomez, S.; Hoffman, M.W.; Pfau, D.; Schaul, T.; Shillingford, B.; De Freitas, N. Learning to learn by gradient descent by gradient descent. Adv. Neural Inf. Process. Syst. 2016, 29.
  47. Amari, S. Backpropagation and stochastic gradient descent method. Neurocomputing 1993, 5, 185–196.
  48. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
  49. Wang, S.; Lu, T.; Hao, R.; Wang, F.; Ding, T.; Li, J.; He, X.; Guo, Y.; Han, X. An Identification Method for Anomaly Types of Active Distribution Network Based on Data Mining. IEEE Trans. Power Syst. 2023, 39, 5548–5560.
  50. Duan, Y.; Zhao, Y.; Hu, J. An initialization-free distributed algorithm for dynamic economic dispatch problems in microgrid: Modeling, optimization and analysis. Sustain. Energy Grids Netw. 2023, 34, 101004.
  51. Liu, S.; Forrest, J.Y.L. Grey Systems: Theory and Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2010.
  52. Chen, P.Y.; Yu, H.M. Foundation settlement prediction based on a novel NGM model. Math. Probl. Eng. 2014, 2014, 242809.
  53. Xie, N.; Wang, R.; Chen, N. Measurement of shock effect following change of one-child policy based on grey forecasting approach. Kybernetes 2018, 47, 559–586.
  54. Chen, C.I.; Chen, H.L.; Chen, S.P. Forecasting of foreign exchange rates of Taiwan’s major trading partners by novel nonlinear Grey Bernoulli model NGBM (1, 1). Commun. Nonlinear Sci. Numer. Simul. 2008, 13, 1194–1204.
  55. Wu, L.; Liu, S.; Yao, L.; Yan, S.; Liu, D. Grey system model with the fractional order accumulation. Commun. Nonlinear Sci. Numer. Simul. 2013, 18, 1775–1785.
  56. Duan, H.; Lei, G.R.; Shao, K. Forecasting crude oil consumption in China using a grey prediction model with an optimal fractional-order accumulating operator. Complexity 2018, 2018, 3869619.
  57. Ding, Y.; Dang, Y. Forecasting renewable energy generation with a novel flexible nonlinear multivariable discrete grey prediction model. Energy 2023, 277, 127664.
  58. Wu, L.-F.; Liu, S.-F.; Cui, W.; Liu, D.-L.; Yao, T.-X. Non-homogenous discrete grey model with fractional-order accumulation. Neural Comput. Appl. 2014, 25, 1215–1221.
  59. Wu, W.; Ma, X.; Zeng, B.; Wang, Y.; Cai, W. Forecasting short-term renewable energy consumption of China using a novel fractional nonlinear grey Bernoulli model. Renew. Energy 2019, 140, 70–87.
  60. Wu, L.; Liu, S.; Chen, H.; Zhang, N. Using a novel grey system model to forecast natural gas consumption in China. Math. Probl. Eng. 2015, 2015, 686501.
  61. Xia, J.; Ma, X.; Wu, W.; Huang, B.; Li, W. Application of a new information priority accumulated grey model with time power to predict short-term wind turbine capacity. J. Clean. Prod. 2020, 244, 118573.
  62. Zhou, W.; Zhang, H.; Dang, Y.; Wang, Z. New information priority accumulated grey discrete model and its application. Chin. J. Manag. Sci. 2017, 25, 140–148.
  63. Xie, N.M.; Liu, S.F. Research on the non-homogenous discrete grey model and its parameter’s properties. Syst. Eng. Electron. 2008, 5, 863–867.
  64. Xiang, X.; Liu, L.; Cao, J.; Zhang, P. Forecasting the installed wind capacity using a new information priority accumulated nonlinear grey Bernoulli model. In IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2020; Volume 467, p. 012088.
  65. Drucker, H.; Burges, C.J.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. Adv. Neural Inf. Process. Syst. 1996, 9.
  66. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  67. Popescu, M.-C.; Balas, V.E.; Perescu-Popescu, L.; Mastorakis, N. Multilayer perceptron and neural networks. WSEAS Trans. Circuits Syst. 2009, 8, 579–588.
  68. Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y.; Cho, H.; Chen, K.; Mitchell, R.; Cano, I.; Zhou, T. Xgboost: Extreme gradient boosting. R Package Version 0.4-2. 2015. Available online: https://cran.r-project.org/web/packages/xgboost/vignettes/xgboost.pdf (accessed on 14 August 2024).
  69. O’Shea, K.; Nash, R. An introduction to convolutional neural networks. arXiv 2015, arXiv:1511.08458.
  70. Dey, R.; Salem, F.M. Gate-variants of gated recurrent unit (GRU) neural networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; pp. 1597–1600.
  71. Kim, S.; Hong, S.; Joh, M.; Song, S.-K. Deeprain: Convlstm network for precipitation prediction using multichannel radar data. arXiv 2017, arXiv:1711.02316.
  72. Kim, T.Y.; Cho, S.B. Predicting residential energy consumption using CNN-LSTM neural networks. Energy 2019, 182, 72–81.
Figure 1. The structure of 1-AGO.
Figure 2. The structure of function $f(\cdot)$.
Figure 3. Structure of the hybrid grey system model’s training equation (Equation (19)).
Figure 4. The synergy between stacked LSTM layers and grey system model in hybrid model.
Figure 5. Collected data on coal, electricity and gasoline consumption in Henan.
Figure 6. Predicted values of all models in case 1.
Figure 7. Predicted values of all models in case 2.
Figure 8. Predicted values of all models in case 3.
Figure 9. Performance of GreySLstm with different numbers of LSTM layers in case 1.
Figure 10. Performance of GreySLstm with different numbers of LSTM layers in case 2.
Figure 11. Performance of GreySLstm with different numbers of LSTM layers in case 3.
Table 1. Information on the grey system models used for comparison.

| Name | Reference | Year | Model Structure | Parameter |
| --- | --- | --- | --- | --- |
| GM | [51] | 2010 | $T^{(0)}(k) + a z^{(1)}(k) = b$ | / |
| NGM | [52] | 2014 | $T^{(0)}(k) + a z^{(1)}(k) = b k + c$ | / |
| DGM | [53] | 2018 | $T^{(1)}(k+1) = \beta_1 T^{(1)}(k) + \beta_2$ | / |
| NDGM | [3] | 2013 | $T^{(1)}(k) = \beta_1 T^{(1)}(k-1) + \beta_2 k + \beta_3$ | / |
| BernoulliGM | [54] | 2008 | $T^{(0)}(k) + a z^{(1)}(k) = b \left(z^{(1)}(k)\right)^{n}$ | $b$ |
| FGM | [55] | 2013 | $T^{(r)}(k) - T^{(r)}(k-1) + \frac{a}{2}\left(T^{(r)}(k) + T^{(r)}(k-1)\right) = b$ | $\alpha$ |
| FNGM | [56] | 2018 | $I^{(r)}(t) - I^{(r)}(t-1) + \alpha z^{(r)}(t) = \beta t + \gamma$ | $\beta$ |
| FNDGM | [57] | 2023 | $I^{(r)}(k+1) = b_1 I^{(r)}(k) + b_2 k + b_3$ | $b_1$ |
| FDGM | [58] | 2014 | $T^{(r)}(k+1) = \beta_1 T^{(r)}(k) + \beta_2$ | $\beta_1$ |
| FBernoulliGM | [59] | 2019 | $\frac{dX^{(r)}(k)}{dt} + a X^{(r)}(k) = b \left(X^{(r)}(k)\right)^{\gamma}$ | $b$ |
| NIPGM | [60] | 2015 | $I^{(\lambda)}(t) - I^{(\lambda)}(t-1) + \alpha z^{(\lambda)}(t) = \beta$ | $\beta$ |
| NIPNGM | [61] | 2020 | $I^{(\lambda)}(t) - I^{(\lambda)}(t-1) + \alpha z^{(\lambda)}(t) = \beta t + \gamma$ | $\beta$ |
| NIPDGM | [62] | 2017 | $I^{(\lambda)}(k+1) = b_1 I^{(\lambda)}(k) + b_2$ | $b_1$ |
| NIPNDGM | [63] | 2008 | $I^{(\lambda)}(k+1) = b_1 I^{(\lambda)}(k) + b_2 k + b_3$ | $b_1$ |
| NIPBernoulliGM | [64] | 2020 | $\frac{dT(t)}{dt} + a T(t) = b \left(T(t)\right)^{r}$ | $b$ |
Table 2. Information on the machine learning models used for comparison.

| Full Name | Abbreviation | Reference | Year |
| --- | --- | --- | --- |
| Support vector regression | svr | [65] | 1996 |
| Long Short-Term Memory | lstm | [45] | 2000 |
| Random forest regression | rf | [66] | 2001 |
| Multilayer perceptron | mlp | [67] | 2009 |
| Extreme gradient boosting | xgb | [68] | 2015 |
| Convolution neural network | cnn | [69] | 2015 |
| Gated recurrent unit | gru | [70] | 2017 |
| Convolutional LSTM | convlstm | [71] | 2017 |
| CNN-LSTM | cnnlstm | [72] | 2019 |
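Table 3. Metrics used for evaluation.

| Full Name | Metric | Equation |
| --- | --- | --- |
| Root-mean-square error | RMSE | $\sqrt{\frac{1}{d}\sum_{k \in D}\left(T^{(0)}(k) - \hat{T}^{(0)}(k)\right)^{2}}$ |
| Mean absolute error | MAE | $\frac{1}{d}\sum_{k \in D}\left\lvert T^{(0)}(k) - \hat{T}^{(0)}(k)\right\rvert$ |
| Mean Absolute Percentage Error | MAPE | $\frac{1}{d}\sum_{k \in D}\frac{\left\lvert T^{(0)}(k) - \hat{T}^{(0)}(k)\right\rvert}{\left\lvert T^{(0)}(k)\right\rvert} \times 100$ |
| Theil’s inequality coefficient | TIC | $\frac{\sqrt{\frac{1}{d}\sum_{k \in D}\left(T^{(0)}(k) - \hat{T}^{(0)}(k)\right)^{2}}}{\sqrt{\frac{1}{d}\sum_{k \in D}\left(T^{(0)}(k)\right)^{2}} + \sqrt{\frac{1}{d}\sum_{k \in D}\left(\hat{T}^{(0)}(k)\right)^{2}}}$ |
| Theil’s U1 statistic | U1 | $\frac{\sqrt{\frac{1}{d}\sum_{k \in D}\left(T^{(0)}(k) - \hat{T}^{(0)}(k)\right)^{2}}}{\sqrt{\frac{1}{d}\sum_{k \in D}\left(T^{(0)}(k)\right)^{2}} + \sqrt{\frac{1}{d}\sum_{k \in D}\left(\hat{T}^{(0)}(k)\right)^{2}}}$ |
| Theil’s U2 statistic | U2 | $\frac{\sqrt{\sum_{k \in D}\left(T^{(0)}(k) - \hat{T}^{(0)}(k)\right)^{2}}}{\sqrt{\sum_{k \in D}\left(T^{(0)}(k)\right)^{2}}}$ |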
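The six metrics in Table 3 can be computed along the following lines. This is a sketch under stated assumptions rather than the evaluation code used in the study: the function name `forecast_metrics` is hypothetical, $d$ is taken to be the number of points in $D$, TIC and U1 share the same expression as printed in the table, and U2 is read as the root of the summed squared errors divided by the root of the summed squared actual values. The sample numbers are illustrative only.

```python
import math


def forecast_metrics(actual, predicted):
    """Return RMSE, MAE, MAPE, TIC, U1 and U2 for two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    d = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    squared_error_sum = sum(e * e for e in errors)
    actual_square_sum = sum(a * a for a in actual)
    predicted_square_sum = sum(p * p for p in predicted)

    rmse = math.sqrt(squared_error_sum / d)
    mae = sum(abs(e) for e in errors) / d
    mape = sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / d * 100.0
    tic = rmse / (math.sqrt(actual_square_sum / d) + math.sqrt(predicted_square_sum / d))
    u1 = tic  # same expression as TIC in Table 3
    u2 = math.sqrt(squared_error_sum) / math.sqrt(actual_square_sum)  # assumed reading
    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "TIC": tic, "U1": u1, "U2": u2}


# Illustrative values only, not data from the study.
metrics = forecast_metrics(actual=[100.0, 200.0], predicted=[110.0, 190.0])
print({name: round(value, 4) for name, value in metrics.items()})
```

When predictions are close to the actual values, the two terms in the TIC denominator are nearly equal, so U2 is roughly twice TIC, which matches the pattern in the reported tables.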
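Table 4. The metrics of the models in case 1.

| Phase | Metric | GreySLstm-M1 | GreySLstm-M2 | gru | rf | xgb | lstm | svr |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 490.3060 | 960.1925 | 928.6231 | 870.6966 | 1574.2322 | 811.1070 | 1566.8574 |
| Training | MAE | 339.5045 | 609.2220 | 658.7882 | 518.2737 | 1352.1988 | 529.3345 | 1367.6741 |
| Training | MAPE | 1.8857 | 2.9844 | 4.6797 | 2.7050 | 11.1550 | 3.6196 | 11.9076 |
| Training | TIC | 0.0133 | 0.0260 | 0.0252 | 0.0238 | 0.0432 | 0.0221 | 0.0425 |
| Training | U1 | 0.0133 | 0.0260 | 0.0252 | 0.0238 | 0.0432 | 0.0221 | 0.0425 |
| Training | U2 | 0.0266 | 0.0522 | 0.0505 | 0.0473 | 0.0855 | 0.0441 | 0.0851 |
| Test | RMSE | 928.6467 | 5323.7461 | 2507.0439 | 3073.1162 | 1661.8960 | 1651.8883 | 2308.4474 |
| Test | MAE | 910.8103 | 5108.5371 | 2329.0432 | 2799.0660 | 1172.6112 | 1436.7536 | 2166.6322 |
| Test | MAPE | 4.1222 | 23.2868 | 10.6819 | 12.8810 | 5.5687 | 6.6503 | 9.7389 |
| Test | TIC | 0.0209 | 0.1066 | 0.0532 | 0.0645 | 0.0362 | 0.0357 | 0.0540 |
| Test | U1 | 0.0209 | 0.1066 | 0.0532 | 0.0645 | 0.0362 | 0.0357 | 0.0540 |
| Test | U2 | 0.0414 | 0.2373 | 0.1117 | 0.1370 | 0.0741 | 0.0736 | 0.1029 |

| Phase | Metric | cnn | mlp | cnnlstm | convlstm | GM | NGM | DGM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 2430.8773 | 3915.1753 | 861.2289 | 812.6766 | 3013.4610 | 2491.0085 | 3017.3331 |
| Training | MAE | 2159.5672 | 3154.0362 | 597.2083 | 566.9696 | 2640.8209 | 2156.8366 | 2655.5472 |
| Training | MAPE | 14.2518 | 17.5495 | 3.1941 | 3.5721 | 17.3731 | 14.6886 | 17.6296 |
| Training | TIC | 0.0665 | 0.1152 | 0.0236 | 0.0221 | 0.0815 | 0.0691 | 0.0815 |
| Training | U1 | 0.0665 | 0.1152 | 0.0236 | 0.0221 | 0.0815 | 0.0691 | 0.0815 |
| Training | U2 | 0.1321 | 0.2127 | 0.0468 | 0.0442 | 0.1637 | 0.1353 | 0.1639 |
| Test | RMSE | 11,761.7003 | 4033.4668 | 3381.0791 | 5542.6099 | 16,836.7866 | 9023.2598 | 16,728.2120 |
| Test | MAE | 11,319.1136 | 3203.2006 | 3175.0204 | 5061.8062 | 16,073.4176 | 8607.2024 | 15,971.1954 |
| Test | MAPE | 51.5327 | 14.9912 | 14.5368 | 23.2690 | 73.2872 | 39.2627 | 72.8201 |
| Test | TIC | 0.2092 | 0.0839 | 0.0704 | 0.1110 | 0.2756 | 0.1687 | 0.2743 |
| Test | U1 | 0.2092 | 0.0839 | 0.0704 | 0.1110 | 0.2756 | 0.1687 | 0.2743 |
| Test | U2 | 0.5243 | 0.1798 | 0.1507 | 0.2471 | 0.7505 | 0.4022 | 0.7456 |

| Phase | Metric | NDGM | BernoulliGM | FGM | FNGM | FNDGM | FDGM | FBernoulliGM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 2403.2727 | 5532.4287 | 4248.0164 | 4070.0333 | 2318.3493 | 4003.4205 | 6202.7378 |
| Training | MAE | 2120.1498 | 4954.7318 | 3574.8920 | 3599.8241 | 1948.7597 | 3384.7620 | 5537.3872 |
| Training | MAPE | 15.0650 | 36.3972 | 30.5036 | 26.0696 | 12.7429 | 28.3588 | 41.8137 |
| Training | TIC | 0.0656 | 0.1669 | 0.1146 | 0.1150 | 0.0623 | 0.1086 | 0.1903 |
| Training | U1 | 0.0656 | 0.1669 | 0.1146 | 0.1150 | 0.0623 | 0.1086 | 0.1903 |
| Training | U2 | 0.1306 | 0.3006 | 0.2308 | 0.2211 | 0.1260 | 0.2175 | 0.3370 |
| Test | RMSE | 9081.4995 | 1671.9496 | 3807.3586 | 6217.6065 | 11,715.2465 | 3689.2512 | 2269.5763 |
| Test | MAE | 8708.0001 | 1383.7527 | 3298.5991 | 5447.6971 | 11,347.7484 | 3197.6934 | 2226.4167 |
| Test | MAPE | 39.6817 | 6.0014 | 15.2810 | 25.1836 | 51.5896 | 14.8132 | 9.9161 |
| Test | TIC | 0.1695 | 0.0363 | 0.0791 | 0.1235 | 0.2084 | 0.0768 | 0.0482 |
| Test | U1 | 0.1695 | 0.0363 | 0.0791 | 0.1235 | 0.2084 | 0.0768 | 0.0482 |
| Test | U2 | 0.4048 | 0.0745 | 0.1697 | 0.2771 | 0.5222 | 0.1644 | 0.1012 |

| Phase | Metric | NIPGM | NIPNGM | NIPDGM | NIPNDGM | NIPBernoulliGM |
| --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 4317.6857 | 3141.6617 | 4127.4823 | 2429.1336 | 3457.6183 |
| Training | MAE | 3704.4585 | 2639.5239 | 3531.0537 | 1970.3929 | 2963.5155 |
| Training | MAPE | 31.4827 | 16.0038 | 29.7322 | 11.6506 | 23.9604 |
| Training | TIC | 0.1171 | 0.0886 | 0.1121 | 0.0656 | 0.0996 |
| Training | U1 | 0.1171 | 0.0886 | 0.1121 | 0.0656 | 0.0996 |
| Training | U2 | 0.2346 | 0.1707 | 0.2243 | 0.1320 | 0.1879 |
| Test | RMSE | 5153.0724 | 11,050.5598 | 3907.2315 | 13,174.3111 | 23,210.8592 |
| Test | MAE | 4600.4453 | 10,461.0367 | 3395.0718 | 12,732.8356 | 18,689.1563 |
| Test | MAPE | 21.2189 | 47.7791 | 15.7211 | 57.9115 | 87.2516 |
| Test | TIC | 0.1042 | 0.1995 | 0.0810 | 0.2286 | 0.6128 |
| Test | U1 | 0.1042 | 0.1995 | 0.0810 | 0.2286 | 0.6128 |
| Test | U2 | 0.2297 | 0.4926 | 0.1742 | 0.5872 | 1.0346 |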
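Table 5. The metrics of the models in case 2.

| Phase | Metric | GreySLstm-M1 | GreySLstm-M2 | gru | rf | xgb | lstm | svr |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 37.4968 | 63.6828 | 488.3888 | 34.8961 | 145.2803 | 295.4851 | 181.7269 |
| Training | MAE | 28.8925 | 42.0938 | 220.5750 | 25.8150 | 121.5278 | 150.3027 | 156.6424 |
| Training | MAPE | 1.6761 | 2.0229 | 32.1378 | 1.4577 | 11.4320 | 20.5718 | 14.6458 |
| Training | TIC | 0.0096 | 0.0163 | 0.1259 | 0.0090 | 0.0378 | 0.0763 | 0.0468 |
| Training | U1 | 0.0096 | 0.0163 | 0.1259 | 0.0090 | 0.0378 | 0.0763 | 0.0468 |
| Training | U2 | 0.0193 | 0.0328 | 0.2514 | 0.0180 | 0.0748 | 0.1521 | 0.0935 |
| Test | RMSE | 115.3096 | 200.4789 | 421.2057 | 523.4443 | 772.2424 | 457.9253 | 129.5320 |
| Test | MAE | 95.7251 | 148.0640 | 377.4322 | 480.7118 | 743.9410 | 414.9213 | 108.6905 |
| Test | MAPE | 2.7584 | 3.9858 | 10.3856 | 13.2748 | 20.7229 | 11.4350 | 3.1498 |
| Test | TIC | 0.0162 | 0.0288 | 0.0627 | 0.0791 | 0.1215 | 0.0685 | 0.0180 |
| Test | U1 | 0.0162 | 0.0288 | 0.0627 | 0.0791 | 0.1215 | 0.0685 | 0.0180 |
| Test | U2 | 0.0325 | 0.0564 | 0.1186 | 0.1474 | 0.2174 | 0.1289 | 0.0365 |

| Phase | Metric | cnn | mlp | cnnlstm | convlstm | GM | NGM | DGM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 189.6547 | 118.7663 | 105.7004 | 47.3233 | 231.5212 | 165.9904 | 232.7358 |
| Training | MAE | 142.9024 | 93.4915 | 75.3073 | 35.2646 | 192.6853 | 132.2427 | 193.8591 |
| Training | MAPE | 14.3022 | 6.1338 | 4.0891 | 2.3983 | 14.3749 | 9.9145 | 14.5544 |
| Training | TIC | 0.0490 | 0.0304 | 0.0267 | 0.0122 | 0.0586 | 0.0435 | 0.0589 |
| Training | U1 | 0.0490 | 0.0304 | 0.0267 | 0.0122 | 0.0586 | 0.0435 | 0.0589 |
| Training | U2 | 0.0976 | 0.0611 | 0.0544 | 0.0244 | 0.1192 | 0.0854 | 0.1198 |
| Test | RMSE | 293.7241 | 331.2288 | 124.8711 | 448.6507 | 1213.7337 | 277.8937 | 1215.1447 |
| Test | MAE | 270.8711 | 310.8389 | 104.1500 | 392.9850 | 1163.1175 | 246.7314 | 1164.8605 |
| Test | MAPE | 7.6876 | 8.8156 | 2.9615 | 10.9248 | 32.5157 | 6.9568 | 32.5664 |
| Test | TIC | 0.0398 | 0.0447 | 0.0175 | 0.0598 | 0.1464 | 0.0378 | 0.1466 |
| Test | U1 | 0.0398 | 0.0447 | 0.0175 | 0.0598 | 0.1464 | 0.0378 | 0.1466 |
| Test | U2 | 0.0827 | 0.0933 | 0.0352 | 0.1263 | 0.3417 | 0.0782 | 0.3421 |

| Phase | Metric | NDGM | BernoulliGM | FGM | FNGM | FNDGM | FDGM | FBernoulliGM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 152.1839 | 301.4826 | 268.4222 | 147.0146 | 136.9840 | 472.5171 | 263.7041 |
| Training | MAE | 131.1087 | 266.4801 | 225.9895 | 110.9666 | 104.1567 | 376.2808 | 232.4938 |
| Training | MAPE | 10.3028 | 21.8860 | 19.2908 | 5.8802 | 6.6519 | 37.5129 | 18.9998 |
| Training | TIC | 0.0392 | 0.0822 | 0.0662 | 0.0381 | 0.0348 | 0.1151 | 0.0713 |
| Training | U1 | 0.0392 | 0.0822 | 0.0662 | 0.0381 | 0.0348 | 0.1151 | 0.0713 |
| Training | U2 | 0.0783 | 0.1552 | 0.1382 | 0.0757 | 0.0705 | 0.2432 | 0.1357 |
| Test | RMSE | 340.3275 | 116.3608 | 773.9608 | 480.4346 | 473.8145 | 301.5636 | 149.3024 |
| Test | MAE | 315.9793 | 107.1247 | 751.6470 | 459.6381 | 457.4457 | 270.4788 | 123.7991 |
| Test | MAPE | 8.9202 | 3.0525 | 21.1297 | 12.9606 | 12.9383 | 7.4586 | 3.3959 |
| Test | TIC | 0.0459 | 0.0163 | 0.0984 | 0.0635 | 0.0627 | 0.0442 | 0.0212 |
| Test | U1 | 0.0459 | 0.0163 | 0.0984 | 0.0635 | 0.0627 | 0.0442 | 0.0212 |
| Test | U2 | 0.0958 | 0.0328 | 0.2179 | 0.1353 | 0.1334 | 0.0849 | 0.0420 |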
ModelNIPGMNIPNGMNIPDGMNIPNDGMNIPBernoulliGM
TrainingRMSE214.9696138.8300232.3187125.2973272.0905
MAE169.2131109.0311189.868390.8680238.6010
MAPE12.61666.004915.03775.274320.0758
TIC0.05370.03610.05770.03190.0736
U10.05370.03610.05770.03190.0736
U20.11070.07150.11960.06450.1401
TestRMSE862.1933404.1645790.7576429.9505197.6275
MAE836.4000384.1215768.0924412.9963150.0556
MAPE23.487410.857121.590211.69044.0554
TIC0.10850.05400.10040.05720.0283
U10.10850.05400.10040.05720.0283
U20.24270.11380.22260.12110.0556
Table 6. The metrics of models in case 3.
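| Phase | Metric | GreySLstm-M1 | GreySLstm-M2 | gru | rf | xgb | lstm | svr |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 5.5247 | 8.7590 | 30.3996 | 9.4015 | 29.0164 | 17.2873 | 24.9984 |
| | MAE | 3.3122 | 5.5920 | 19.1020 | 6.4205 | 25.1756 | 13.5572 | 23.0586 |
| | MAPE | 1.4747 | 2.2650 | 12.0591 | 2.9254 | 15.7324 | 7.8731 | 13.2535 |
| | TIC | 0.0128 | 0.0202 | 0.0711 | 0.0221 | 0.0666 | 0.0405 | 0.0593 |
| | U1 | 0.0128 | 0.0202 | 0.0711 | 0.0221 | 0.0666 | 0.0405 | 0.0593 |
| | U2 | 0.0257 | 0.0407 | 0.1413 | 0.0437 | 0.1348 | 0.0803 | 0.1162 |
| Test | RMSE | 76.9153 | 96.7842 | 141.9118 | 278.8786 | 318.5326 | 94.8583 | 287.8411 |
| | MAE | 65.5358 | 77.5446 | 126.7438 | 265.2736 | 306.6917 | 85.9361 | 254.5826 |
| | MAPE | 9.5421 | 11.1199 | 18.0005 | 38.6305 | 44.9456 | 12.3212 | 36.1240 |
| | TIC | 0.0566 | 0.0691 | 0.1168 | 0.2592 | 0.3079 | 0.0751 | 0.2640 |
| | U1 | 0.0566 | 0.0691 | 0.1168 | 0.2592 | 0.3079 | 0.0751 | 0.2640 |
| | U2 | 0.1142 | 0.1437 | 0.2108 | 0.4142 | 0.4731 | 0.1409 | 0.4275 |

| Phase | Metric | cnn | mlp | cnnlstm | convlstm | GM | NGM | DGM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 89.2788 | 39.4255 | 35.1136 | 34.6697 | 40.1285 | 177.1090 | 40.0240 |
| | MAE | 73.5071 | 29.2930 | 26.1866 | 26.6884 | 32.8573 | 122.6439 | 32.7378 |
| | MAPE | 43.4380 | 14.4692 | 13.2231 | 13.9887 | 17.4882 | 54.4627 | 17.4793 |
| | TIC | 0.2081 | 0.0922 | 0.0822 | 0.0811 | 0.0954 | 0.2982 | 0.0949 |
| | U1 | 0.2081 | 0.0922 | 0.0822 | 0.0811 | 0.0954 | 0.2982 | 0.0949 |
| | U2 | 0.4149 | 0.1832 | 0.1632 | 0.1611 | 0.1865 | 0.8230 | 0.1860 |
| Test | RMSE | 462.0330 | 249.8680 | 104.4885 | 221.5902 | 168.8275 | 1536.1246 | 170.6696 |
| | MAE | 453.9507 | 244.4299 | 95.2879 | 210.2770 | 164.6643 | 1349.0963 | 166.6187 |
| | MAPE | 67.3986 | 36.2082 | 14.6379 | 30.5910 | 24.8887 | 193.0584 | 25.1626 |
| | TIC | 0.5208 | 0.2275 | 0.0833 | 0.1959 | 0.1426 | 0.5397 | 0.1444 |
| | U1 | 0.5208 | 0.2275 | 0.0833 | 0.1959 | 0.1426 | 0.5397 | 0.1444 |
| | U2 | 0.6862 | 0.3711 | 0.1552 | 0.3291 | 0.2508 | 2.2815 | 0.2535 |

| Phase | Metric | NDGM | BernoulliGM | FGM | FNGM | FNDGM | FDGM | FBernoulliGM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 34.0744 | 43.1875 | 37.6800 | 41.5449 | 42.8570 | 50.4337 | 36.0213 |
| | MAE | 27.1748 | 33.1724 | 29.2709 | 34.4108 | 33.4238 | 40.2398 | 28.0154 |
| | MAPE | 14.6721 | 15.7012 | 16.7802 | 18.6444 | 16.0451 | 22.6073 | 16.1310 |
| | TIC | 0.0798 | 0.1050 | 0.0849 | 0.0978 | 0.1039 | 0.1174 | 0.0817 |
| | U1 | 0.0798 | 0.1050 | 0.0849 | 0.0978 | 0.1039 | 0.1174 | 0.0817 |
| | U2 | 0.1583 | 0.2007 | 0.1751 | 0.1931 | 0.1992 | 0.2344 | 0.1674 |
| Test | RMSE | 197.0548 | 232.0179 | 217.0482 | 193.0278 | 241.6070 | 311.1789 | 362.4447 |
| | MAE | 144.3595 | 229.0458 | 159.5841 | 189.3324 | 238.4823 | 305.2174 | 260.6160 |
| | MAPE | 20.1963 | 34.4236 | 22.2412 | 28.5918 | 35.7358 | 45.2670 | 35.9641 |
| | TIC | 0.1337 | 0.2076 | 0.1436 | 0.1666 | 0.2182 | 0.3001 | 0.2202 |
| | U1 | 0.1337 | 0.2076 | 0.1436 | 0.1666 | 0.2182 | 0.3001 | 0.2202 |
| | U2 | 0.2927 | 0.3446 | 0.3224 | 0.2867 | 0.3588 | 0.4622 | 0.5383 |

| Phase | Metric | NIPGM | NIPNGM | NIPDGM | NIPNDGM | NIPBernoulliGM |
| --- | --- | --- | --- | --- | --- | --- |
| Training | RMSE | 40.0408 | 59.9252 | 33.8145 | 32.3140 | 37.7739 |
| | MAE | 31.1999 | 42.8442 | 25.8625 | 28.0125 | 29.7247 |
| | MAPE | 18.2299 | 24.0522 | 14.8879 | 16.1756 | 17.5189 |
| | TIC | 0.0895 | 0.1439 | 0.0790 | 0.0760 | 0.0850 |
| | U1 | 0.0895 | 0.1439 | 0.0790 | 0.0760 | 0.0850 |
| | U2 | 0.1861 | 0.2785 | 0.1571 | 0.1502 | 0.1755 |
| Test | RMSE | 187.9752 | 107.3958 | 946.1340 | 7738.0291 | 443.3291 |
| | MAE | 139.3251 | 87.2050 | 690.9050 | 4938.1195 | 311.9407 |
| | MAPE | 19.4785 | 12.5335 | 94.8007 | 664.1561 | 42.8325 |
| | TIC | 0.1268 | 0.0757 | 0.4286 | 0.8693 | 0.2582 |
| | U1 | 0.1268 | 0.0757 | 0.4286 | 0.8693 | 0.2582 |
| | U2 | 0.2792 | 0.1595 | 1.4052 | 11.4929 | 0.6585 |
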
Table 7. Optimization of GreySLstm-M1 compared with other models in the test phase of case 1.
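| Metric | vs. GreySLstm-M2 | vs. gru | vs. rf | vs. xgb | vs. lstm | vs. svr | vs. cnn |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 82.5565 | 62.9585 | 69.7816 | 44.1213 | 43.7827 | 59.7718 | 92.1045 |
| MAE | 82.1708 | 60.8934 | 67.4602 | 22.3263 | 36.6064 | 57.9619 | 91.9533 |
| MAPE | 82.2981 | 61.4092 | 67.9978 | 25.9755 | 38.0148 | 57.6727 | 92.0008 |
| TIC | 80.3937 | 60.6812 | 67.6062 | 42.2685 | 41.4517 | 61.3125 | 90.0117 |
| U1 | 80.3937 | 60.6812 | 67.6062 | 42.2685 | 41.4517 | 61.3125 | 90.0117 |
| U2 | 82.5565 | 62.9585 | 69.7816 | 44.1213 | 43.7827 | 59.7718 | 92.1045 |

| Metric | vs. mlp | vs. cnnlstm | vs. convlstm | vs. GM | vs. NGM | vs. DGM | vs. NDGM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 76.9765 | 72.5340 | 83.2453 | 94.4844 | 89.7083 | 94.4486 | 89.7743 |
| MAE | 71.5656 | 71.3132 | 82.0062 | 94.3334 | 89.4180 | 94.2972 | 89.5405 |
| MAPE | 72.5025 | 71.6429 | 82.2846 | 94.3753 | 89.5010 | 94.3392 | 89.6118 |
| TIC | 75.0942 | 70.3242 | 81.1786 | 92.4163 | 87.6138 | 92.3802 | 87.6714 |
| U1 | 75.0942 | 70.3242 | 81.1786 | 92.4163 | 87.6138 | 92.3802 | 87.6714 |
| U2 | 76.9765 | 72.5340 | 83.2453 | 94.4844 | 89.7083 | 94.4486 | 89.7743 |

| Metric | vs. BernoulliGM | vs. FGM | vs. FNGM | vs. FNDGM | vs. FDGM | vs. FBernoulliGM | vs. NIPGM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 44.4573 | 75.6092 | 85.0642 | 92.0732 | 74.8283 | 59.0828 | 81.9788 |
| MAE | 34.1782 | 72.3880 | 83.2808 | 91.9736 | 71.5166 | 59.0908 | 80.2017 |
| MAPE | 31.3125 | 73.0239 | 83.6314 | 92.0096 | 72.1720 | 58.4291 | 80.5729 |
| TIC | 42.4376 | 73.5743 | 83.0796 | 89.9698 | 72.7863 | 56.6203 | 79.9418 |
| U1 | 42.4376 | 73.5743 | 83.0796 | 89.9698 | 72.7863 | 56.6203 | 79.9418 |
| U2 | 44.4573 | 75.6092 | 85.0642 | 92.0732 | 74.8283 | 59.0828 | 81.9788 |

| Metric | vs. NIPNGM | vs. NIPDGM | vs. NIPNDGM | vs. NIPBernoulliGM |
| --- | --- | --- | --- | --- |
| RMSE | 91.5964 | 76.2326 | 92.9511 | 95.9991 |
| MAE | 91.2933 | 73.1726 | 92.8468 | 95.1265 |
| MAPE | 91.3724 | 73.7791 | 92.8819 | 95.2755 |
| TIC | 89.5263 | 74.1977 | 90.8570 | 96.5897 |
| U1 | 89.5263 | 74.1977 | 90.8570 | 96.5897 |
| U2 | 91.5964 | 76.2326 | 92.9511 | 95.9991 |
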
Table 8. Optimization of GreySLstm-M1 compared with other models in the test phase of case 2.
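| Metric | vs. GreySLstm-M2 | vs. gru | vs. rf | vs. xgb | vs. lstm | vs. svr | vs. cnn |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 42.4830 | 72.6239 | 77.9710 | 85.0682 | 74.8191 | 10.9799 | 60.7422 |
| MAE | 35.3488 | 74.6378 | 80.0868 | 87.1327 | 76.9293 | 11.9287 | 64.6603 |
| MAPE | 30.7929 | 73.4399 | 79.2205 | 86.6889 | 75.8773 | 12.4241 | 64.1183 |
| TIC | 43.9111 | 74.2029 | 79.5609 | 86.6971 | 76.4041 | 10.4043 | 59.4026 |
| U1 | 43.9111 | 74.2029 | 79.5609 | 86.6971 | 76.4041 | 10.4043 | 59.4026 |
| U2 | 42.4830 | 72.6239 | 77.9710 | 85.0682 | 74.8191 | 10.9799 | 60.7422 |

| Metric | vs. mlp | vs. cnnlstm | vs. convlstm | vs. GM | vs. NGM | vs. DGM | vs. NDGM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 65.1873 | 7.6571 | 74.2986 | 90.4996 | 58.5059 | 90.5106 | 66.1181 |
| MAE | 69.2043 | 8.0892 | 75.6415 | 91.7700 | 61.2027 | 91.7823 | 69.7053 |
| MAPE | 68.7097 | 6.8578 | 74.7508 | 91.5166 | 60.3493 | 91.5298 | 69.0766 |
| TIC | 63.8041 | 7.4294 | 72.9519 | 88.9578 | 57.2180 | 88.9685 | 64.7390 |
| U1 | 63.8041 | 7.4294 | 72.9519 | 88.9578 | 57.2180 | 88.9685 | 64.7390 |
| U2 | 65.1873 | 7.6571 | 74.2986 | 90.4996 | 58.5059 | 90.5106 | 66.1181 |

| Metric | vs. BernoulliGM | vs. FGM | vs. FNGM | vs. FNDGM | vs. FDGM | vs. FBernoulliGM | vs. NIPGM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 0.9034 | 85.1014 | 75.9989 | 75.6636 | 61.7628 | 22.7678 | 86.6260 |
| MAE | 10.6414 | 87.2646 | 79.1738 | 79.0740 | 64.6090 | 22.6771 | 88.5551 |
| MAPE | 9.6337 | 86.9452 | 78.7168 | 78.6801 | 63.0165 | 18.7709 | 88.2557 |
| TIC | 1.0207 | 83.5747 | 74.5354 | 74.1933 | 63.3875 | 23.8799 | 85.0928 |
| U1 | 1.0207 | 83.5747 | 74.5354 | 74.1933 | 63.3875 | 23.8799 | 85.0928 |
| U2 | 0.9034 | 85.1014 | 75.9989 | 75.6636 | 61.7628 | 22.7678 | 86.6260 |

| Metric | vs. NIPNGM | vs. NIPDGM | vs. NIPNDGM | vs. NIPBernoulliGM |
| --- | --- | --- | --- | --- |
| RMSE | 71.4696 | 85.4178 | 73.1807 | 41.6531 |
| MAE | 75.0795 | 87.5373 | 76.8218 | 36.2069 |
| MAPE | 74.5934 | 87.2237 | 76.4042 | 31.9820 |
| TIC | 70.0377 | 83.8895 | 71.7290 | 42.8803 |
| U1 | 70.0377 | 83.8895 | 71.7290 | 42.8803 |
| U2 | 71.4696 | 85.4178 | 73.1807 | 41.6531 |
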
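The percentages in Table 8 appear to be relative error reductions of GreySLstm-M1 with respect to each competing model, computed from the test-phase metrics in Table 5. The following Python sketch illustrates that reading; the formula `(baseline - proposed) / baseline * 100` is an assumption inferred from the tabulated values rather than stated in this section, and the names `improvement_pct` and `improvement_table` are hypothetical helpers introduced for illustration.

```python
def improvement_pct(proposed: float, baseline: float) -> float:
    """Relative reduction of an error metric, in percent (assumed formula)."""
    if baseline == 0:
        raise ValueError("baseline metric must be non-zero")
    return (baseline - proposed) / baseline * 100.0


# Test-phase metrics for case 2, copied from Table 5.
GREYSLSTM_M1 = {"RMSE": 115.3096, "MAE": 95.7251, "MAPE": 2.7584}
BASELINES = {
    "GreySLstm-M2": {"RMSE": 200.4789, "MAE": 148.0640, "MAPE": 3.9858},
    "svr": {"RMSE": 129.5320, "MAE": 108.6905, "MAPE": 3.1498},
}


def improvement_table(proposed, baselines):
    """Map each baseline name to {metric: improvement in percent}."""
    return {
        name: {m: improvement_pct(proposed[m], vals[m]) for m in proposed}
        for name, vals in baselines.items()
    }


if __name__ == "__main__":
    for name, row in improvement_table(GREYSLSTM_M1, BASELINES).items():
        cells = ", ".join(f"{m}: {v:.2f}%" for m, v in row.items())
        print(f"vs. {name}: {cells}")
```

Because the inputs are the four-decimal values printed in Table 5, the results agree with Table 8 only to about two decimal places; the published percentages were presumably computed from unrounded metrics.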
Table 9. Optimization of GreySLstm-M1 compared with other models in the test phase of case 3.
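| Metric | vs. GreySLstm-M2 | vs. gru | vs. rf | vs. xgb | vs. lstm | vs. svr | vs. cnn |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 20.5290 | 45.8006 | 72.4198 | 75.8532 | 18.9155 | 73.2785 | 83.3528 |
| MAE | 15.4864 | 48.2927 | 75.2950 | 78.6314 | 23.7390 | 74.2575 | 85.5632 |
| MAPE | 14.1890 | 46.9899 | 75.2991 | 78.7697 | 22.5557 | 73.5852 | 85.8423 |
| TIC | 18.1348 | 51.5610 | 78.1745 | 81.6272 | 24.6273 | 78.5708 | 89.1368 |
| U1 | 18.1348 | 51.5610 | 78.1745 | 81.6272 | 24.6273 | 78.5708 | 89.1368 |
| U2 | 20.5290 | 45.8006 | 72.4198 | 75.8532 | 18.9155 | 73.2785 | 83.3528 |

| Metric | vs. mlp | vs. cnnlstm | vs. convlstm | vs. GM | vs. NGM | vs. DGM | vs. NDGM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 69.2176 | 26.3887 | 65.2894 | 54.4415 | 94.9929 | 54.9332 | 60.9676 |
| MAE | 73.1883 | 31.2234 | 68.8336 | 60.2004 | 95.1422 | 60.6672 | 54.6024 |
| MAPE | 73.6466 | 34.8124 | 68.8075 | 61.6611 | 95.0574 | 62.0784 | 52.7533 |
| TIC | 75.1257 | 32.0472 | 71.1178 | 60.3303 | 89.5159 | 60.8297 | 57.6863 |
| U1 | 75.1257 | 32.0472 | 71.1178 | 60.3303 | 89.5159 | 60.8297 | 57.6863 |
| U2 | 69.2176 | 26.3887 | 65.2894 | 54.4415 | 94.9929 | 54.9332 | 60.9676 |

| Metric | vs. BernoulliGM | vs. FGM | vs. FNGM | vs. FNDGM | vs. FDGM | vs. FBernoulliGM | vs. NIPGM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| RMSE | 66.8494 | 64.5630 | 60.1532 | 68.1651 | 75.2826 | 78.7787 | 59.0822 |
| MAE | 71.3875 | 58.9334 | 65.3859 | 72.5196 | 78.5282 | 74.8535 | 52.9620 |
| MAPE | 72.2804 | 57.0973 | 66.6266 | 73.2983 | 78.9205 | 73.4678 | 51.0122 |
| TIC | 72.7439 | 60.6075 | 66.0292 | 74.0691 | 81.1451 | 74.3007 | 55.3960 |
| U1 | 72.7439 | 60.6075 | 66.0292 | 74.0691 | 81.1451 | 74.3007 | 55.3960 |
| U2 | 66.8494 | 64.5630 | 60.1532 | 68.1651 | 75.2826 | 78.7787 | 59.0822 |

| Metric | vs. NIPNGM | vs. NIPDGM | vs. NIPNDGM | vs. NIPBernoulliGM |
| --- | --- | --- | --- | --- |
| RMSE | 28.3814 | 91.8706 | 99.0060 | 82.6505 |
| MAE | 24.8486 | 90.5145 | 98.6729 | 78.9909 |
| MAPE | 23.8675 | 89.9346 | 98.5633 | 77.7223 |
| TIC | 25.3015 | 86.7980 | 93.4915 | 78.0911 |
| U1 | 25.3015 | 86.7980 | 93.4915 | 78.0911 |
| U2 | 28.3814 | 91.8706 | 99.0060 | 82.6505 |
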
Table 10. Metrics of the GreySLstm model with different numbers of LSTM layers.
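Columns 1 to 9 give the LSTM layer number.

| Case | Metric | Phase | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Case 1 | RMSE | Train | 937.9942 | 851.0898 | 486.6038 | 844.6278 | 963.4058 | 3109.3818 | 10,351.4960 | 11,148.5194 | 11,152.6367 |
| | | Test | 5132.2100 | 3611.1851 | 1627.7841 | 3030.8841 | 5331.7715 | 19,477.6190 | 64,645.9305 | 68,870.8128 | 68,889.6010 |
| | MAE | Train | 625.4415 | 521.0091 | 288.6944 | 547.3042 | 546.0693 | 1982.2757 | 6818.2956 | 7408.2859 | 7411.4835 |
| | | Test | 4841.1397 | 3463.4200 | 1190.5096 | 2618.6103 | 5128.7763 | 18,726.4847 | 62,154.8233 | 66,233.5773 | 66,251.8598 |
| | MAPE | Train | 3.1277 | 2.4315 | 1.3231 | 2.6728 | 2.4374 | 10.8775 | 39.6334 | 42.5158 | 42.5315 |
| | | Test | 22.1315 | 15.7899 | 5.6178 | 12.0999 | 23.3665 | 85.2647 | 282.9392 | 301.4895 | 301.5725 |
| | TIC | Train | 0.0254 | 0.0231 | 0.0132 | 0.0231 | 0.0260 | 0.0724 | 0.2276 | 0.2411 | 0.2412 |
| | | Test | 0.1027 | 0.0747 | 0.0354 | 0.0635 | 0.1066 | 0.2361 | 0.5946 | 0.6105 | 0.6106 |
| | U1 | Train | 0.0254 | 0.0231 | 0.0132 | 0.0231 | 0.0260 | 0.0724 | 0.2276 | 0.2411 | 0.2412 |
| | | Test | 0.1027 | 0.0747 | 0.0354 | 0.0635 | 0.1066 | 0.2361 | 0.5946 | 0.6105 | 0.6106 |
| | U2 | Train | 0.0510 | 0.0462 | 0.0264 | 0.0459 | 0.0523 | 0.1689 | 0.5624 | 0.6057 | 0.6060 |
| | | Test | 0.2288 | 0.1610 | 0.0726 | 0.1351 | 0.2377 | 0.8682 | 2.8815 | 3.0698 | 3.0707 |
| Case 2 | RMSE | Train | 101.5587 | 94.1466 | 85.9299 | 118.1945 | 65.8725 | 343.2804 | 861.3843 | 1104.3757 | 1103.5302 |
| | | Test | 317.5203 | 192.0910 | 149.4493 | 360.5283 | 177.3545 | 2044.3780 | 5160.0670 | 6546.0611 | 6542.0924 |
| | MAE | Train | 76.2756 | 69.4719 | 65.2312 | 85.1072 | 47.0495 | 202.2991 | 549.3752 | 699.6323 | 699.0246 |
| | | Test | 286.1201 | 168.4210 | 128.8550 | 314.3367 | 148.3387 | 1973.5665 | 4974.8283 | 6322.1106 | 6318.2444 |
| | MAPE | Train | 4.1573 | 4.0826 | 3.7143 | 4.5947 | 2.5382 | 10.6773 | 30.1723 | 36.3884 | 36.3614 |
| | | Test | 8.0050 | 4.6901 | 3.6203 | 8.6849 | 4.1085 | 55.1810 | 138.8855 | 176.5324 | 176.4243 |
| | TIC | Train | 0.0265 | 0.0245 | 0.0225 | 0.0313 | 0.0169 | 0.0807 | 0.1833 | 0.2274 | 0.2272 |
| | | Test | 0.0429 | 0.0276 | 0.0213 | 0.0493 | 0.0249 | 0.2058 | 0.4046 | 0.4814 | 0.4813 |
| | U1 | Train | 0.0265 | 0.0245 | 0.0225 | 0.0313 | 0.0169 | 0.0807 | 0.1833 | 0.2274 | 0.2272 |
| | | Test | 0.0429 | 0.0276 | 0.0213 | 0.0493 | 0.0249 | 0.2058 | 0.4046 | 0.4814 | 0.4813 |
| | U2 | Train | 0.0523 | 0.0485 | 0.0442 | 0.0608 | 0.0339 | 0.1767 | 0.4434 | 0.5685 | 0.5681 |
| | | Test | 0.0894 | 0.0541 | 0.0421 | 0.1015 | 0.0499 | 0.5756 | 1.4528 | 1.8430 | 1.8419 |
| Case 3 | RMSE | Train | 38.5742 | 38.1353 | 31.8485 | 38.4768 | 38.4039 | 38.7443 | 38.8287 | 38.8505 | 38.6034 |
| | | Test | 208.8081 | 235.0953 | 206.5127 | 207.9963 | 206.1128 | 210.7499 | 211.8525 | 212.1576 | 208.7974 |
| | MAE | Train | 30.9613 | 30.4737 | 25.0073 | 30.9819 | 30.8845 | 31.1706 | 31.2248 | 31.2383 | 31.0690 |
| | | Test | 152.8418 | 171.6049 | 150.9447 | 152.2497 | 150.8776 | 154.2628 | 155.0598 | 155.2788 | 152.8170 |
| | MAPE | Train | 18.5792 | 18.1695 | 14.8266 | 18.5924 | 18.5317 | 18.7202 | 18.7552 | 18.7637 | 18.6545 |
| | | Test | 21.2923 | 23.8549 | 21.0333 | 21.2110 | 21.0236 | 21.4869 | 21.5960 | 21.6260 | 21.2890 |
| | TIC | Train | 0.0867 | 0.0859 | 0.0720 | 0.0866 | 0.0864 | 0.0870 | 0.0872 | 0.0872 | 0.0868 |
| | | Test | 0.1395 | 0.1535 | 0.1381 | 0.1391 | 0.1380 | 0.1406 | 0.1412 | 0.1414 | 0.1395 |
| | U1 | Train | 0.0867 | 0.0859 | 0.0720 | 0.0866 | 0.0864 | 0.0870 | 0.0872 | 0.0872 | 0.0868 |
| | | Test | 0.1395 | 0.1535 | 0.1381 | 0.1391 | 0.1380 | 0.1406 | 0.1412 | 0.1414 | 0.1395 |
| | U2 | Train | 0.1793 | 0.1772 | 0.1480 | 0.1788 | 0.1785 | 0.1800 | 0.1804 | 0.1805 | 0.1794 |
| | | Test | 0.3101 | 0.3492 | 0.3067 | 0.3089 | 0.3061 | 0.3130 | 0.3147 | 0.3151 | 0.3101 |
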
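One way to read Table 10 is to pick, for each case, the layer count with the smallest test RMSE. The Python sketch below does this for the three test-RMSE rows copied from the table; the selection rule (argmin of test RMSE) and the helper name `best_layer_count` are illustrative assumptions, not the authors' stated procedure.

```python
# Test-phase RMSE for 1..9 stacked LSTM layers, copied from Table 10.
TEST_RMSE = {
    "Case 1": [5132.2100, 3611.1851, 1627.7841, 3030.8841, 5331.7715,
               19477.6190, 64645.9305, 68870.8128, 68889.6010],
    "Case 2": [317.5203, 192.0910, 149.4493, 360.5283, 177.3545,
               2044.3780, 5160.0670, 6546.0611, 6542.0924],
    "Case 3": [208.8081, 235.0953, 206.5127, 207.9963, 206.1128,
               210.7499, 211.8525, 212.1576, 208.7974],
}


def best_layer_count(errors):
    """Return the 1-based layer count whose error is smallest."""
    best_index = min(range(len(errors)), key=errors.__getitem__)
    return best_index + 1  # position i in the list holds the (i + 1)-layer model


if __name__ == "__main__":
    for case, errors in TEST_RMSE.items():
        print(case, "-> layers with lowest test RMSE:", best_layer_count(errors))
```

In cases 1 and 2 the error rises sharply from six layers onward, which is consistent with the overfitting noted in the abstract; in case 3 the differences between layer counts are small, so the minimum is less decisive.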
Hao, Y.; Ma, X. A Hybrid Grey System Model Based on Stacked Long Short-Term Memory Layers and Its Application in Energy Consumption Forecasting. Processes 2024, 12, 1749. https://doi.org/10.3390/pr12081749
