Article

Application of Informer Model Based on SPEI for Drought Forecasting

1 National Supercomputing Center in Zhengzhou, Zhengzhou University, Zhengzhou 450000, China
2 The School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
3 Department of Mathematics, Zhengzhou University of Aeronautics, Zhengzhou 450046, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Atmosphere 2023, 14(6), 951; https://doi.org/10.3390/atmos14060951
Submission received: 28 April 2023 / Revised: 22 May 2023 / Accepted: 24 May 2023 / Published: 29 May 2023

Abstract

To increase the accuracy of drought prediction, this study proposes a drought forecasting method based on the Informer model. Taking the Yellow River Basin as an example, the forecasting accuracies of the Autoregressive Integrated Moving Average (ARIMA), Long Short-Term Memory (LSTM), and Informer models on multiple timescales of the Standardized Precipitation Evapotranspiration Index (SPEI) were compared and analyzed. The results indicate that, with an increasing timescale, the forecasting accuracies of the ARIMA, LSTM, and Informer models improved gradually, reaching the best accuracy on the 24-month timescale. However, the predicted values of ARIMA, as well as those of LSTM, were significantly different from the true SPEI values on the 1-month timescale. The Informer model was more accurate than the ARIMA and LSTM models on all timescales, indicating that Informer can widely capture the information of the input series over time and is more effective in long-term prediction problems. Furthermore, Informer can significantly enhance the precision of SPEI prediction. The predicted values of the Informer model were closer to the true SPEI values, and the forecasted SPEI trends complied with the actual trends. The Informer model can model different timescales adaptively and, therefore, better capture relevance on different timescales. The NSE values of the Informer model for the four meteorological stations on SPEI24 were 0.968, 0.974, 0.972, and 0.986.

1. Introduction

From a global perspective, anthropogenic climate change, carbon emissions, deforestation, and urbanization have increased the frequency of drought [1]. The World Meteorological Organization (WMO) classifies drought according to the affected domain as meteorological, agricultural, hydrological, and socio-economic. In the world, few natural hazards are as devastating as drought [2]. The frequent and persistent occurrence of drought can lead to substantial losses in the socio-economic sphere, particularly in agriculture, and it can cause various detrimental ecological and environmental impacts, such as water scarcity, desertification, and frequent occurrences of sand and dust storms [3]. Drought prediction is a crucial field in addressing climate change and effectively managing water resources. Drought, characterized by prolonged water scarcity, has severe impacts on global ecosystems, agriculture, economies, and societies. The ability to accurately forecast drought events and their spatiotemporal patterns is of paramount importance for taking proactive measures and minimizing adverse impacts [4,5,6,7]. Conducting a series of studies on drought monitoring, assessment, and prediction has become a hot issue of great global concern and is of great practical significance [8]. Monitoring drought and issuing timely warnings are essential precursors for disaster mitigation and prevention. Accurately predicting the occurrence of drought offers useful resources for risk management and pre-warning, helping to reduce disaster damage to the greatest extent possible [3].
The use of a drought index is crucial for the quantitative assessment of drought severity and impacts [9]. Several meteorological drought indexes have been developed over the last few decades, such as SPI [10], SPEI [11], PDSI [12], and SMDI [13], which are used extensively at global, regional, national, and basin scales [14]. Vicente-Serrano et al. [15] proposed SPEI, which builds on the algorithms used in both SPI and PDSI [16] and incorporates multi-scale features to evaluate the effects of temperature variations on drought conditions [3]. At present, two potential evapotranspiration models are commonly used in the SPEI calculation process in China: Thornthwaite and Penman–Monteith. Because the occurrence and evolution of drought usually form a multi-timescale process, the choice of SPEI timescale is important for drought research. This study therefore selected SPEI timescales of 1 month (SPEI1), 3 months (SPEI3), 6 months (SPEI6), 9 months (SPEI9), 12 months (SPEI12), and 24 months (SPEI24).
At present, drought prediction methods can be classified into two types: numerical prediction and statistical prediction. Numerical prediction [17,18] builds on meteorological principles to predict drought conditions by solving atmospheric dynamics equations. Its effectiveness relies on the precision of model parameters, the stability of driving variables, and the support of a large amount of meteorological data [2,3]. Statistical prediction uses mathematical modeling techniques, such as regression prediction and grey system prediction, to model meteorological data [18]. However, statistical prediction has difficulty in accurately predicting future drought conditions when meteorological conditions change abruptly [18]. With the rapid development of artificial intelligence [19], new intelligent drought prediction models have emerged and become the mainstream methods for drought prediction. Hu et al. [20] adopted the LSTM model for SPEI spatiotemporal prediction on multiple timescales, and the results suggested that the forecasting efficiency of LSTM gradually improved as the SPEI timescale increased. Xu et al. [2] introduced a hybrid deep learning model that combines ARIMA and LSTM for drought prediction, and the results suggested that the hybrid model predicts SPEI with high precision on long timescales and with lower precision on short timescales. Zhang et al. [21] utilized two integration methods, Bagging and Boosting, which integrate multiple single models into a more powerful model with predictions on different timescales. By comparing the forecasting results of various models with actual observations, the study found that the models based on the integration methods have higher accuracy and stability than the single models. Xu et al. [22] combined Complementary Ensemble Empirical Mode Decomposition (CEEMD) and ARIMA, and they showed that the CEEMD-ARIMA model was applicable to drought prediction; the model could also identify multiple modalities of drought variability on diverse timescales [23], improving the comprehensiveness and accuracy of the prediction.
Currently, most of the machine learning methods applied to drought prediction on multiple timescales are based on recurrent neural networks, which can solve the sequence prediction problem better than other deterministic and traditional models [24]. To handle long series data and increase the precision of prediction, newer methods have been proposed, such as the Transformer [25] model and the Informer [26] model.
The Informer model used in this paper is an effective improvement on the Transformer model. Transformer, a sequence-to-sequence model proposed by a Google team in 2017, adopts a self-attention mechanism to handle sequential information as a whole; it avoids recurrence while still attending to local information with strong relevance [27]. By modifying the structure of Transformer and applying probabilistic sparsification to the original self-attention mechanism, Informer speeds up the computation of Transformer and effectively improves the precision of sequence prediction.
The accuracy of SPEI prediction on short timescales is still low in existing studies. Thus, this paper adopts the multi-layer Transformer structure of the Informer model, together with its position encoding method, which captures the long-term and short-term dependencies in time series, and its attention mechanism, which effectively improves the accuracy of short-timescale SPEI prediction. In this article, a drought prediction model is constructed using the Informer algorithm, validated with four meteorological stations in the Yellow River Basin, and compared with the LSTM and ARIMA models to demonstrate its higher prediction precision.

2. Materials and Methods

2.1. Study Area

The Yellow River Basin is a major watershed in China, and the Yellow River is known as the Mother River (Figure 1). The basin is a major agricultural and economic region of China. It is located at 90°33′–122°25′ E and 24°30′–35°45′ N and has a mainly temperate monsoon climate [8]. The temperature difference throughout the year is extremely large [19]. Environmental problems, such as severe land desertification and water shortage, also exist in the basin.

2.2. Data Source

The meteorological data in this paper were obtained from the monthly value dataset of the terrestrial climate information from the China Meteorological Data Network (https://www.data.cma.cn/ accessed on 6 April 2022), and they include precipitation (mm), maximum temperature (°C), minimum temperature (°C), average temperature (°C), wind speed (m·s−1), sunshine hours (h), latitude (°), longitude (°), and altitude (m) for the period of 1960–2019. This study selected 4 meteorological stations in the Yellow River Basin for validation. Table 1 shows the information of the 4 representative stations.

2.3. Methods

2.3.1. Standardized Precipitation Evapotranspiration Index

This study uses the Penman–Monteith model to estimate potential evapotranspiration and calculates multi-scale SPEI values for four meteorological stations located within the study area for the period of 1960 to 2019 [15]. This approach accounts for the influence of precipitation, temperature, and evapotranspiration on drought in an integrated manner and has the advantages of multiple timescales and a clear mechanism. The procedure for calculating SPEI_PM is as follows [16]:
(1) The Penman–Monteith model is used to compute the reference crop evapotranspiration $ET_0$ with the following equation:
$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,U_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,U_2)}$$
where $ET_0$ is the reference crop evapotranspiration (mm/d); $\Delta$ is the slope of the saturation vapor pressure curve (kPa/°C) [6]; $\gamma$ is the psychrometric constant (kPa/°C); $R_n$ is the net radiation (MJ·m−2·d−1); $G$ is the soil heat flux (MJ·m−2·d−1) [11]; $T$ is the mean temperature for the calculation period (°C); $U_2$ is the mean wind speed at a height of 2 m above the ground (m·s−1); $e_s$ is the saturation vapor pressure (kPa); and $e_a$ is the actual vapor pressure (kPa) [15].
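The reference evapotranspiration formula in step (1) can be evaluated directly once its inputs are known. The following is a minimal sketch, not the authors' code: the function name `reference_et0` and the sample input values are illustrative assumptions, and the inputs ($\Delta$, $\gamma$, $R_n$, and so on) are assumed to have been derived from the station data beforehand.

```python
def reference_et0(delta, gamma, rn, g, t_mean, u2, es, ea):
    """Reference crop evapotranspiration ET0 (mm/d), Penman-Monteith form.

    delta  : slope of the saturation vapor pressure curve (kPa/degC)
    gamma  : psychrometric constant (kPa/degC)
    rn, g  : net radiation and soil heat flux (MJ m-2 d-1)
    t_mean : mean air temperature (degC)
    u2     : wind speed at 2 m height (m/s)
    es, ea : saturation and actual vapor pressure (kPa)
    """
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * (900.0 / (t_mean + 273.0)) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (not taken from the stations in this study).
et0 = reference_et0(delta=0.122, gamma=0.0666, rn=13.28, g=0.14,
                    t_mean=16.9, u2=2.078, es=1.997, ea=1.409)
print(round(et0, 2))  # roughly 3.85 mm/d for these inputs
```

For monthly SPEI work, the daily value would still have to be aggregated to a monthly total; that step is not shown here.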
(2) The monthly difference between precipitation and evapotranspiration is calculated:
$$D_i = P_i - ET_0$$
where $D_i$ is the difference between precipitation and evapotranspiration in month $i$; $P_i$ is the monthly precipitation; and $ET_0$ is the monthly reference crop evapotranspiration from step (1) [15].
(3) The data series of $D_i$ is normalized. $D_i$ is fitted with the cumulative probability distribution function $F(x)$, and the corresponding SPEI value for each $D_i$ is calculated [15].
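$$F(x) = \left[1 + \left(\frac{\alpha}{x - \gamma}\right)^{\beta}\right]^{-1}$$
where $F(x)$ is the probability distribution function (a three-parameter log-logistic distribution). Here $\alpha$, $\beta$, and $\gamma$ are the scale, shape, and origin parameters of the distribution ($\gamma$ is not the psychrometric constant of step (1)), and they are obtained as follows:
$$\alpha = \frac{(a_0 - 2a_1)\,\beta}{\Gamma(1 + 1/\beta)\,\Gamma(1 - 1/\beta)}$$
$$\beta = \frac{2a_1 - a_0}{6a_1 - a_0 - 6a_2}$$
$$\gamma = a_0 - \alpha\,\Gamma(1 + 1/\beta)\,\Gamma(1 - 1/\beta)$$
where $\Gamma$ is the gamma function, which generalizes the factorial; $a_0$, $a_1$, and $a_2$ are the probability-weighted moments of the data series $D_i$ [15].
The probability of exceeding a certain value of $D_i$ can be written as $P = 1 - F(x)$. Then, SPEI can be written as a function of $P$ as follows: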
$$\mathrm{SPEI} = \begin{cases} w - \dfrac{g_0 + g_1 w + g_2 w^2}{1 + e_1 w + e_2 w^2 + e_3 w^3}, & w = \sqrt{-2\ln P}, & P \le 0.5 \\[2ex] -\left(w - \dfrac{g_0 + g_1 w + g_2 w^2}{1 + e_1 w + e_2 w^2 + e_3 w^3}\right), & w = \sqrt{-2\ln(1 - P)}, & P > 0.5 \end{cases} \tag{7}$$
The constants in Equation (7) are $e_1 = 1.432788$, $e_2 = 0.189269$, $e_3 = 0.001308$, $g_0 = 2.515517$, $g_1 = 0.802853$, and $g_2 = 0.010328$. Following the drought grading in the national standard for meteorological drought grades (GB/T20481-2017), the drought categories classified according to the SPEI values are shown in Table 2.
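Step (3) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the moments $a_0$, $a_1$, $a_2$ are taken as already estimated, the example moment values are made up, and the denominator uses the constants in the order $e_1 w + e_2 w^2 + e_3 w^3$.

```python
import math

G0, G1, G2 = 2.515517, 0.802853, 0.010328
E1, E2, E3 = 1.432788, 0.189269, 0.001308


def loglogistic_params(a0, a1, a2):
    """Scale (alpha), shape (beta), and origin (gamma) from the weighted moments."""
    beta = (2.0 * a1 - a0) / (6.0 * a1 - a0 - 6.0 * a2)
    gg = math.gamma(1.0 + 1.0 / beta) * math.gamma(1.0 - 1.0 / beta)
    alpha = (a0 - 2.0 * a1) * beta / gg
    gamma = a0 - alpha * gg
    return alpha, beta, gamma


def loglogistic_cdf(x, alpha, beta, gamma):
    """F(x) = [1 + (alpha / (x - gamma))^beta]^-1, set to 0 at or below the origin."""
    if x <= gamma:
        return 0.0
    return 1.0 / (1.0 + (alpha / (x - gamma)) ** beta)


def spei_from_exceedance(p):
    """Map the exceedance probability P = 1 - F(x) to a standardized SPEI value."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep the logarithm finite
    sign = 1.0
    if p > 0.5:
        p, sign = 1.0 - p, -1.0
    w = math.sqrt(-2.0 * math.log(p))
    num = G0 + G1 * w + G2 * w * w
    den = 1.0 + E1 * w + E2 * w * w + E3 * w ** 3
    return sign * (w - num / den)


def spei(x, alpha, beta, gamma):
    return spei_from_exceedance(1.0 - loglogistic_cdf(x, alpha, beta, gamma))


# Made-up moments, chosen only so that the parameters are easy to check by hand.
alpha, beta, gamma = loglogistic_params(a0=10.0, a1=3.0, a2=1.6)
median_spei = spei(gamma + alpha, alpha, beta, gamma)  # F = 0.5 at x = gamma + alpha
```

At the median of the fitted distribution the index is approximately zero, and an exceedance probability of 0.025 maps to roughly 1.96, as expected for a standardized variable.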

2.3.2. Informer

Informer is a supervised learning model built on the attention mechanism and consists of two components: an encoder and a decoder [26]. It is a Transformer-based time series prediction model that better captures the long-term dependencies of time series by adding processing steps, such as position encoding, the block attention mechanism, and adaptive length sequence sampling. The encoder extracts robust long-range dependencies from the original input sequence, and the decoder produces the sequence prediction. The structure of the Informer model is illustrated in Figure 2. The encoder (left) receives long sequence inputs and replaces the conventional self-attention mechanism [28] with sparse self-attention [27]. The trapezoidal component refers to the self-attention distilling operation, which dramatically reduces the size of the network, while the stacking of multiple layers further enhances the model's robustness [28]. The decoder (right) receives a long sequence input in which the target elements are padded with zeros, computes the attention-weighted composition of the feature map, and then generates the output elements in a single generative step [29].

Informer Model Inputs

The input data at time t are as follows:
$$x^t = \{x_1^t, \ldots, x_{L_x}^t \mid x_i^t \in \mathbb{R}^{d_x}\}$$
and the output is the corresponding sequence of predictions:
$$y^t = \{y_1^t, \ldots, y_{L_y}^t \mid y_i^t \in \mathbb{R}^{d_y}\}$$
where $L_x$ and $L_y$ are the input length and output length, respectively; $d_x$ and $d_y$ are the feature dimensions.
For time series prediction problems, the order of the data is particularly important. To keep the order structure of the series from being lost after the data are input to the model, Informer encodes the position information $PE_{(pos,\,2j)}$ and $PE_{(pos,\,2j+1)}$ for each set of input data with the following formulae:
$$PE_{(pos,\,2j)} = \sin\left(\frac{pos}{(2L)^{2j/d_{model}}}\right)$$
$$PE_{(pos,\,2j+1)} = \cos\left(\frac{pos}{(2L)^{2j/d_{model}}}\right)$$
where $pos$ is the position (sequence order); the index $j = 1, 2, \cdots, d_{model}/2$ indicates the dimension; $d_{model}$ is the dimensionality of the input representation; and $L$ is the length of the input sequence.
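The position encoding can be illustrated with a short, dependency-free sketch. It assumes an even $d_{model}$ and uses a zero-based pair index `j` (so the first sine/cosine pair has frequency 1), which is an implementation convention chosen here and not something stated in the text; the function name is hypothetical.

```python
import math


def positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position encodings.

    The base of the geometric frequency progression is 2 * seq_len,
    matching the (2L) term in the formulae above.
    """
    base = 2 * seq_len
    table = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for j in range(d_model // 2):  # zero-based pair index (assumption)
            angle = pos / base ** (2 * j / d_model)
            table[pos][2 * j] = math.sin(angle)      # even dimensions
            table[pos][2 * j + 1] = math.cos(angle)  # odd dimensions
    return table


pe = positional_encoding(seq_len=96, d_model=8)
```

Each row of `pe` would be added to the embedded input at the same time step, so that identical values at different positions remain distinguishable to the attention layers.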

Self-Attention Mechanism of Informer Model

In probability form, the attention coefficient $A(q_i, K, V)$ for the $i$-th query is as follows:
$$A(q_i, K, V) = \sum_j \frac{k(q_i, k_j)}{\sum_l k(q_i, k_l)}\, v_j = \mathbb{E}_{p(k_j \mid q_i)}[v_j]$$
where $p(k_j \mid q_i) = k(q_i, k_j) / \sum_l k(q_i, k_l)$, and $k(q_i, k_j)$ is chosen as the asymmetric exponential kernel $\exp(q_i k_j^{T} / \sqrt{d})$ [23].
To measure the sparsity of a query, Informer uses the Kullback–Leibler divergence. Ignoring the constant, the sparsity measure for the $i$-th query is as follows:
$$M(q_i, K) = \ln \sum_{j=1}^{L_K} \exp\left(\frac{q_i k_j^{T}}{\sqrt{d}}\right) - \frac{1}{L_K} \sum_{j=1}^{L_K} \frac{q_i k_j^{T}}{\sqrt{d}}$$
where the first term is the Log-Sum-Exp (LSE) of $q_i$ over all the keys, and the second term is their arithmetic mean [26].
According to the proposed measurement, ProbSparse self-attention can be written as follows:
$$A(Q, K, V) = \mathrm{Softmax}\left(\frac{\bar{Q} K^{T}}{\sqrt{d}}\right) V$$
where $\bar{Q}$ is a sparse matrix of the same size as $Q$, which only contains the Top-$u$ queries under the sparsity measurement $M(q, K)$ [26].
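The sparsity measure and the Top-$u$ selection can be sketched as follows. This is a simplified illustration with hypothetical function names: it evaluates $M(q_i, K)$ exactly over all keys, whereas a practical implementation would normally approximate it on a sample of keys to avoid the quadratic cost.

```python
import math


def sparsity_measure(query, keys):
    """M(q, K): Log-Sum-Exp of the scaled dot products minus their mean."""
    d = len(query)
    scores = [sum(a * b for a, b in zip(query, k)) / math.sqrt(d) for k in keys]
    top = max(scores)  # subtract the maximum for a numerically stable LSE
    lse = top + math.log(sum(math.exp(s - top) for s in scores))
    return lse - sum(scores) / len(scores)


def top_u_queries(queries, keys, u):
    """Indices of the u queries with the largest sparsity measure."""
    ranked = sorted(range(len(queries)),
                    key=lambda i: sparsity_measure(queries[i], keys),
                    reverse=True)
    return sorted(ranked[:u])


keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
queries = [[0.0, 0.0], [4.0, 0.0], [0.1, 0.0]]
active = top_u_queries(queries, keys, u=1)
```

A query whose attention distribution is close to uniform (such as the zero vector here) scores the minimum value $\ln L_K$ and is skipped, while a query that singles out a few keys scores high and is kept in $\bar{Q}$.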

Encoder for Informer Model

The aim of the encoder is to capture the robust long-range dependencies of the long input sequence [26]. A sketch of the encoder is shown in Figure 3. The distillation operation from the $j$-th layer to the $(j+1)$-th layer is as follows:
$$X_{j+1}^{t} = \mathrm{MaxPool}\left(\mathrm{ELU}\left(\mathrm{Conv1d}\left([X_j^{t}]_{AB}\right)\right)\right)$$
where $[\cdot]_{AB}$ represents the attention block, which includes the multi-head ProbSparse self-attention and the basic operations. Conv1d represents a one-dimensional convolution over the time dimension, followed by the ELU activation function [30].
The self-attention distillation mechanism proposed by Informer enables each encoder layer to reduce the input sequence length by half, which dramatically reduces the memory usage and computational time of the encoder [26].
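The halving effect of the distillation step can be seen in a single-channel sketch. The kernel weights, the pooling window of 3 with stride 2 and padding 1, and the function names are illustrative assumptions; the attention block itself is omitted and its output is represented by a plain list.

```python
import math


def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def conv1d_same(seq, kernel):
    """1-D convolution over time with zero padding (odd kernel length assumed)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(seq) + [0.0] * half
    return [sum(w * padded[i + k] for k, w in enumerate(kernel))
            for i in range(len(seq))]


def max_pool(seq, width=3, stride=2, pad=1):
    """Max pooling with stride 2, which halves an even-length sequence."""
    padded = [-math.inf] * pad + list(seq) + [-math.inf] * pad
    return [max(padded[i:i + width])
            for i in range(0, len(padded) - width + 1, stride)]


def distill(attention_output, kernel=(0.25, 0.5, 0.25)):
    """MaxPool(ELU(Conv1d(x))) applied to one feature channel."""
    return max_pool([elu(v) for v in conv1d_same(attention_output, kernel)])


layer_in = [math.sin(0.1 * t) for t in range(96)]
lengths = [len(layer_in)]
for _ in range(3):  # three stacked encoder layers
    layer_in = distill(layer_in)
    lengths.append(len(layer_in))
```

With a 96-step input, the sequence length after each of three stacked layers is 48, 24, and 12, which is the source of the memory saving described above.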

Decoder for Informer Model

The decoder uses the standard decoder structure proposed by Vaswani et al. in 2017 [27], which is composed of two identical multi-head attention layers. The decoder input $X_{de}^{t}$ is built from the following vectors:
$$X_{de}^{t} = \mathrm{Concat}\left(X_{token}^{t}, X_{0}^{t}\right) \in \mathbb{R}^{(L_{token} + L_y) \times d_{model}}$$
where $X_{token}^{t} \in \mathbb{R}^{L_{token} \times d_{model}}$ is the start token; $X_{0}^{t} \in \mathbb{R}^{L_y \times d_{model}}$ is a placeholder for the target sequence [30].
The decoder applies masked multi-head ProbSparse self-attention. A fully connected layer produces the final output, and its output dimension determines whether the prediction is univariate or multivariate. The generative structure shortens the prediction decoding time.
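The construction of the decoder input can be sketched as a simple concatenation. Taking the start token from the last $L_{token}$ steps of the known sequence and filling the placeholder with zeros are assumptions of this sketch, and the function name is hypothetical.

```python
def build_decoder_input(known_sequence, l_token, l_y):
    """Concatenate a start token with a zero placeholder for the targets.

    known_sequence : list of time steps, each a list of d_model features
    l_token        : number of trailing known steps used as the start token (> 0)
    l_y            : number of future steps to predict
    """
    d_model = len(known_sequence[0])
    start_token = [row[:] for row in known_sequence[-l_token:]]
    placeholder = [[0.0] * d_model for _ in range(l_y)]
    return start_token + placeholder


history = [[float(t), float(t) * 0.5] for t in range(10)]  # 10 steps, d_model = 2
decoder_input = build_decoder_input(history, l_token=4, l_y=3)
```

Because the whole target segment is present as a placeholder, the decoder can fill in all $L_y$ predictions in one forward pass instead of producing them step by step.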

2.3.3. Long Short-Term Memory

Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture for handling sequential data; it was developed as an improvement over traditional RNN [31], and it effectively resolves the problem of prolonged dependence by using three gating mechanisms and a memory unit. By contrast with the ordinary RNN, LSTM incorporates a memory cell to determine whether the information is available [32]. The cell state is the key of LSTM. To protect and control the state of a memory cell, three control gates are placed in a memory cell, called the input gate, forget gate, and output gate [33]. Each control gate consists of a neural network layer containing a sigmoid function and a dot product operation [34]. The LSTM memory cell structure is illustrated in Figure 4.
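The interaction of the three gates with the cell state can be made concrete with a scalar LSTM step. This is a didactic sketch, not the network used in the study: the weight dictionary, its key names, and the example values are all hypothetical, and real layers operate on vectors with learned weight matrices.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell.

    w maps names such as "wf", "uf", "bf" to the input weight, recurrent
    weight, and bias of each gate (f: forget, i: input, o: output, g: candidate).
    """
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # forget gate
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # input gate
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate state
    c = f * c_prev + i * g   # keep part of the old memory, add new content
    h = o * math.tanh(c)     # expose part of the memory as the output
    return h, c


zero_weights = {name: 0.0 for name in
                ("wf", "uf", "bf", "wi", "ui", "bi", "wo", "uo", "bo", "wg", "ug", "bg")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=zero_weights)
```

With all weights at zero, every gate outputs 0.5, so the cell keeps half of its previous state; training adjusts the weights so that the gates open or close depending on the input.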

2.3.4. Autoregressive Integrated Moving Average

ARIMA combines autoregression (AR), moving average (MA), and differencing (Diff): it converts a non-stationary time series into a stationary series by differencing it one or more times and then fits the resulting series [35]. Its composition is as follows:
$$\mathrm{ARIMA}(p, d, q) = \mathrm{AR}(p) + \mathrm{Diff}(d) + \mathrm{MA}(q)$$
where $\mathrm{AR}(p)$ represents the autoregressive model; $\mathrm{Diff}(d)$ indicates the difference model; $\mathrm{MA}(q)$ indicates the moving-average model; $p$, $d$, and $q$ are the parameters corresponding to the three models. The ARIMA model prediction equation for $C(t)$ is as follows:
$$C(t) = \varphi_0 + \sum_{i=1}^{p} \varphi_i C_{t-i} + \varepsilon_t + \sum_{i=1}^{q} \gamma_i \varepsilon_{t-i}$$
where $C(t)$ represents the reconstructed component time series formed after the SE algorithm; $\varepsilon_t$ represents the random error in the current period; $\varphi_i$ and $\gamma_i$ represent model parameters; $p$ denotes the number of autoregressive terms; $d$ indicates the number of differences needed to make the series stationary; $q$ denotes the number of moving-average terms [36].
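The prediction equation can be traced with a small sketch that evaluates one forecast step on an already differenced series. The function names and the coefficient values are illustrative assumptions; in practice the coefficients are estimated from data, and the future error $\varepsilon_t$ is replaced by its expected value of zero.

```python
def difference(series, d=1):
    """Apply first-order differencing d times (the Diff(d) part)."""
    for _ in range(d):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


def arma_one_step(history, residuals, phi0, phis, gammas):
    """One-step forecast: phi0 + sum(phi_i * C[t-i]) + sum(gamma_i * eps[t-i]).

    history   : past values of the stationary series, most recent last
    residuals : past forecast errors, most recent last
    """
    ar_part = sum(phi * history[-i] for i, phi in enumerate(phis, start=1))
    ma_part = sum(g * residuals[-i] for i, g in enumerate(gammas, start=1))
    return phi0 + ar_part + ma_part


stationary = difference([1.0, 3.0, 6.0])  # -> [2.0, 3.0]
forecast = arma_one_step(history=[1.0, 2.0], residuals=[0.5],
                         phi0=0.1, phis=[0.5, 0.2], gammas=[0.3])
# 0.1 + 0.5*2.0 + 0.2*1.0 + 0.3*0.5 = 1.45
```

A forecast on the original scale is obtained by undoing the differencing, that is, by adding the predicted change to the last observed value.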

2.3.5. Evaluation Metrics

To evaluate the performance of the compared models more reasonably, NSE, RMSE, and MAE were used in this paper. The formulas used to calculate these metrics are shown below.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - \tilde{y}_i)^2}$$
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{N} (y_i - \tilde{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$$
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \tilde{y}_i \right|$$
where $y_i$ indicates the true value; $\tilde{y}_i$ indicates the forecasted value; $\bar{y}$ represents the average value of $y_i$; and $N$ indicates the total number of data points.
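The three metrics are straightforward to compute. The sketch below uses hypothetical function names and a toy series that is not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def nse(y_true, y_pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    residual = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    variance = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - residual / variance


observed = [1.0, 2.0, 3.0]
mean_forecast = [2.0, 2.0, 2.0]  # always predicting the observed mean
scores = (rmse(observed, mean_forecast),
          mae(observed, mean_forecast),
          nse(observed, mean_forecast))
```

Predicting the observed mean at every step gives an NSE of exactly 0, which is why NSE values close to 1 indicate a forecast that is far better than this baseline.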

3. Results

3.1. SPEI Values on Different Timescales

The 1-, 3-, 6-, 12-, and 24-month timescale SPEI values of Hangjinhouqi, Huanxian, Taian, and Maqin were calculated using monthly meteorological data. The results are shown in Figure 5. Combined with the Mann–Kendall trend test (Table 3), it can be observed that the SPEI1, SPEI3, SPEI6, SPEI12, and SPEI24 of the four stations show a decreasing trend. In particular, the following show a significant decreasing trend: SPEI9, SPEI12, and SPEI24 of the Hangjinhouqi site; SPEI3, SPEI6, SPEI12, and SPEI24 of the Huanxian site; SPEI9, SPEI12, and SPEI24 of the Taian site; and SPEI24 of the Maqin site. The four stations show a high frequency of extreme droughts. In the past decade, the temperature of the Yellow River Basin has been increasing, and the runoff of the main and tributary streams has been decreasing since 1960 [8], which has caused the SPEI values to decrease.
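For readers unfamiliar with the Mann–Kendall trend test, a minimal sketch of its statistic follows. It assumes a series without tied values (no tie correction in the variance) and uses a hypothetical function name; it is not the code behind Table 3.

```python
import math


def mann_kendall(series):
    """Return the Mann-Kendall statistic S and the standardized statistic Z."""
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0  # variance without tie correction
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z


s_stat, z_stat = mann_kendall([5.0, 4.0, 3.0, 2.0, 1.0])
```

A negative Z indicates a decreasing trend, and an absolute value above 1.96 is significant at the 5% level under the usual normal approximation.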

3.2. Analysis of Model Prediction Results

Using multi-scale SPEI data from 1960–2007 as training data, the SPEI values of the four meteorological stations on multiple timescales were predicted using the LSTM, ARIMA, and Informer models for 2008–2020. A comparison of the prediction performance of the three models and the prediction evaluation indexes are shown in Figure 6, Figure 7, Figure 8 and Figure 9 and Table 4. The results suggest that the predicted values of the Informer model fit the true values more closely than those of the ARIMA and LSTM models and that the Informer model effectively captured the variations in the SPEI values.
The predicted values of ARIMA, as well as those of LSTM, for the four meteorological stations were significantly different from the true SPEI values on the 1-month timescale. In particular, LSTM lost prediction ability in predicting SPEI1 for Hangjinhouqi. The differences between the predicted and actual values of ARIMA and LSTM decreased when predicting SPEI3, SPEI6, SPEI9, SPEI12, and SPEI24. In this study, the data of SPEI1 changed relatively fast and fluctuated more, which required more complex modeling methods to predict, and, therefore, the prediction was the worst on this timescale.
The Informer model predictions were more similar to the true SPEI values, and the predicted SPEI trends were consistent with the actual trends. In Figure 6, Figure 7, Figure 8 and Figure 9, the Informer model shows better prediction results on SPEI3, SPEI6, SPEI9, SPEI12, and SPEI24. The Informer model is able to handle long sequences, and it performs better when dealing with long-term dependencies. It can model different timescales adaptively and, therefore, better capture relevance on different timescales. As a result, Informer has good performance in predicting SPEI for each meteorological station.
As the timescale becomes smaller, the prediction abilities of the Informer, ARIMA, and LSTM models decrease, but Informer still performs better than ARIMA and LSTM, indicating that Informer can widely capture the information of the input series over time and is more effective in long-term prediction problems. In this paper, to assess the prediction performance of the ARIMA, LSTM, and Informer models, three evaluation metrics, MAE, RMSE, and NSE, are utilized (Table 4). The MAE values of ARIMA and LSTM are both above 0.7 at SPEI1 and below 0.2 at SPEI24. The MAE and RMSE values tend to decrease with an increasing timescale, while the values of NSE show the reverse trend. These trends suggest that the prediction accuracy of the ARIMA, LSTM, and Informer models improves with increasing timescales. The prediction performance of the Informer model is superior to that of the ARIMA and LSTM models on different timescales, indicating that the Informer model can significantly enhance the prediction accuracy of SPEI. The NSE values of the Informer model for the four meteorological stations on SPEI24 are 0.968, 0.974, 0.972, and 0.986. On all timescales, the Informer model is superior to the ARIMA and LSTM models in evaluating metric data for prediction results.
Informer addresses the problem that the dependencies between the output and input are not well captured over long distances when predicting long time series. Moreover, the Informer model reduces the time and space complexity of the attention mechanism in the Transformer model, so that Informer can obtain higher prediction accuracy. From the analysis, the LSTM and ARIMA models have lower prediction accuracies due to their own structural limitations.

4. Discussion

Drought forecasting is crucial for mitigating risks and preparing measures to alleviate its impact [37]. In this paper, we used the newest time series prediction model, namely, Informer, to predict the drought in the Yellow River Basin, and we compared the prediction results with those of the ARIMA and LSTM models, which showed that the Informer model exhibits superior prediction accuracy compared to both the ARIMA and LSTM models on multiple timescales. Because the data of SPEI1 changed relatively fast and fluctuated more, the predicted values of ARIMA, as well as those of LSTM, for the four meteorological stations were significantly different from the true SPEI values on the 1-month timescale, which is consistent with the conclusion reached by Xu et al. [22]. In particular, LSTM lost prediction ability in predicting SPEI1 for Hangjinhouqi. As the timescale increased, the data series tended to be smooth, and the prediction accuracy of ARIMA and LSTM gradually improved. Xu et al. [2] found that the prediction accuracy was related to the timescale based on the ARIMA-SVR model for multi-scale SPI prediction, and the prediction precision gradually improved with an increasing timescale. Hinge et al. [37] found that the hybrid WPT-MLR model has the potential to be employed for drought warnings in the study region, but the prediction accuracy decreased as the timescale increased. The predicted values of the Informer model were closer to the measured SPEI values, and the predicted SPEI trends aligned with the actual trends. The Informer model can model different timescales adaptively and, therefore, better capture relevance on different timescales. The NSE values of the Informer model for the four meteorological stations on SPEI24 were 0.968, 0.974, 0.972, and 0.986.
The Informer model provides various advantages for capturing long-term dependencies in time series data using a self-attentive mechanism [38], which enables the prediction of droughts over a longer term. In addition, the Informer model adopts the adaptive length idea, which can automatically adapt to different timescales and data features with high flexibility and adaptability [39]. The Informer model is also able to process multiple time series in parallel using the multi-headed self-attentive mechanism, which improves the training and prediction efficiency of the model, and there is no need to manually perform feature engineering, which can automatically extract important features in time series with better generalizability and interpretability. Applying Informer to drought prediction in the Yellow River Basin can improve the accuracy and reliability of drought prediction [40], which, in turn, can improve the efficiency and quality of water resources management and agricultural production [24].
Although the Informer model in this study is more accurate than the models in existing studies for short-timescale SPEI prediction, the fit of the short-timescale predictions is still not as good as that of the long-timescale predictions. In the future, the predictive capability of Informer for different timescales can be improved by combining it with the multi-scale method. In addition, multi-source data and deep learning techniques can be brought in to build deep drought prediction models to better predict the evolution and trends of drought [41]. These measures are expected to improve Informer's performance across timescales in drought prediction and further refine its role in practical applications.
There are some aspects of the Informer model that can still be improved to further enhance prediction precision. Future improvements of the Informer model for drought prediction in the Yellow River Basin include adding multi-scale mechanisms to better capture multiple patterns and periodicity in the time series; integrating domain knowledge, such as meteorological and hydrological data, to improve prediction accuracy and interpretability; combining other traditional time series models, such as LSTM and GRU, to build a powerful integrated model; and integrating multiple target prediction problems to deal with multiple indicators and factors in drought prediction to improve prediction accuracy and comprehensiveness. The next steps in research on using the Informer model to predict small-scale SPEI drought could include exploring the use of additional data sources to improve prediction accuracy, such as combining meteorological or remote sensing data. In addition, further investigation into the model’s limitations on larger timescales could be carried out to improve its performance. Other areas of future research could include expanding the model’s application to other meteorological forecasting domains, studying prediction uncertainty, and improving the model’s overall reliability and accuracy.

5. Conclusions

In this paper, multi-scale SPEI was calculated using meteorological station monitoring data in the Yellow River Basin; the SPEI values were predicted using the Informer, ARIMA, and LSTM models; and the following conclusions were obtained from a comparative analysis of the prediction results:
(1) Because the data of SPEI1 changed relatively fast and fluctuated, the predicted values of ARIMA, as well as those of LSTM, for the four meteorological stations were significantly different from the true SPEI values on the 1-month timescale. The differences between the predicted and actual values of ARIMA and LSTM decreased when predicting SPEI3, SPEI6, SPEI9, SPEI12, and SPEI24. The Informer model showed better prediction results on SPEI3, SPEI6, SPEI9, SPEI12, and SPEI24. This indicates that the Informer model is able to handle long sequences and performs better when dealing with long-term dependencies.
(2) The predicted values of the Informer model were closer to the measured SPEI values, and the predicted SPEI trends were consistent with the actual trends. The Informer model can model different timescales adaptively and, therefore, better capture relevance on different timescales, and it can capture sudden changes in SPEI values in a timely and effective manner.
(3) As the timescale became smaller, the prediction ability of the Informer, ARIMA, and LSTM models decreased, but Informer still performed better than ARIMA and LSTM, indicating that Informer can widely capture the information of the input series over time, that it is more effective in long-term prediction problems, and that it can be efficient in improving the prediction precision of SPEI. As a result, Informer has good performance in predicting SPEI for each meteorological station.
Drought prediction not only enables the assessment of drought risks but also guides water resource management, agricultural planning, and ecosystem management and facilitates climate change research. The accuracy and timeliness of drought forecasts empower decision-makers to take appropriate measures, mitigating the adverse impacts of drought on society, economy, and the environment and ensuring sustainable development and resource utilization goals.

Author Contributions

All authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Major Science and Technology Project of Henan Province, China, grant numbers “221100210600”, “201400211000”, “201400211300”, and “201400210600”.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets from 1960 to 2019 were obtained from the National Meteorological Data Center (http://data.cma.cn/, accessed on 2 April 2021).

Acknowledgments

All model code in this paper was run on the computing platform of the National Supercomputing Center in Zhengzhou.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
MDPI: Multidisciplinary Digital Publishing Institute
DOAJ: Directory of open-access journals
TLA: Three-letter acronym
LD: Linear dichroism

References

  1. Rusca, M.; Savelli, E.; Di Baldassarre, G. Unprecedented droughts are expected to exacerbate urban inequalities in Southern Africa. Nat. Clim. Chang. 2023, 13, 98–105. [Google Scholar] [CrossRef]
  2. Xu, D.; Zhang, Q.; Ding, Y.; Zhang, D. Application of a hybrid ARIMA-LSTM model based on the SPEI for drought forecasting. Environ. Sci. Pollut. Res. 2022, 29, 4128–4144. [Google Scholar] [CrossRef]
  3. Yang, R.; Geng, G.; Zhou, H.; Wang, T. Spatial-temporal Evolution of Meteorological Drought in the Wei River Basin Based on SPEI_PM. Chin. J. Agrometeorol. 2021, 42, 962–974. [Google Scholar]
  4. Abbas, A.; Waseem, M.; Ahmad, R.; Khan, K.A.; Zhao, C.; Zhu, J. Sensitivity analysis of greenhouse gas emissions at farm level: Case study of grain and cash crops. Environ. Sci. Pollut. Res. 2022, 29, 82559–82573. [Google Scholar] [CrossRef]
  5. Abbas, A.; Zhao, C.; Waseem, M.; Ahmed, K.K.; Ahmad, R. Analysis of Energy Input–Output of Farms and Assessment of Greenhouse Gas Emissions: A Case Study of Cotton Growers. Front. Environ. Sci. 2022, 9, 826838. [Google Scholar] [CrossRef]
  6. Elahi, E.; Khalid, Z.; Zhang, Z. Understanding farmers’intention and willingness to install renewable energy technology: A solution to reduce the environmental emissions of agriculture. Appl. Energy 2022, 309, 118459. [Google Scholar] [CrossRef]
  7. Elahi, E.; Khalid, Z.; Tauni, M.Z.; Zhang, H.; Xing, L. Extreme weather events risk to crop-production and the adaptation of innovative management strategies to mitigate the risk: A retrospective survey of rural Punjab, Pakistan. Technovation 2021, 117, 102255. [Google Scholar] [CrossRef]
  8. Ren, Y.; Liu, J.; Shalamzari, M.J.; Arshad, A.; Liu, S.; Liu, T.; Tao, H. Monitoring Recent Changes in Drought and Wetness in the Source Region of the Yellow River Basin, China. Water 2022, 14, 861. [Google Scholar] [CrossRef]
  9. Alahacoon, N.; Edirisinghe, M. A comprehensive assessment of remote sensing and traditional based drought monitoring indices at global and regional scale. Geomat. Nat. Hazards Risk 2022, 13, 762–799. [Google Scholar] [CrossRef]
  10. Saeed, S.; Mohammadi, G.M.; Saviz, S. Spatial and temporal analysis of drought in various climates across Iran using the Standardized Precipitation Index (SPI). Arab. J. Geosci. 2022, 15, 1279. [Google Scholar]
  11. Sergio, M.V.; Santiago, B.; Ji, L.-M. A Multiscalar Drought Index Sensitive to Global Warming: The Standardized Precipitation Evapotranspiration Index. J. Clim. 2010, 23, 1696–1718. [Google Scholar]
  12. Palmer, W.C. Meteorological Drought; U.S. Department of Commerce Weather Bureau Research Paper: San Diego, CA, USA, 1965.
  13. Narasimhan, B.; Srinivasan, R. Development and evaluation of Soil Moisture Deficit Index (SMDI) and Evapotranspiration Deficit Index (ETDI) for agricultural drought monitoring. Agric. For. Meteorol. 2005, 133, 69–88. [Google Scholar] [CrossRef]
  14. Chen, H.; Sun, J. Changes in Drought Characteristics over China Using the Standardized Precipitation Evapotranspiration Index. J. Clim. 2015, 28, 5430–5447. [Google Scholar] [CrossRef]
  15. Wei, J.; Wang, Z.; Han, L.; Shang, J.; Zhao, B. Analysis of Spatio-Temporal Evolution Characteristics of Drought and Its Driving Factors in Yangtze River Basin Based on SPEI. Atmosphere 2022, 13, 1986. [Google Scholar] [CrossRef]
  16. Ma, X.; Zhu, X.; Zhao, J.; Zhao, N.; Shi, Y. Analysis of Drought Characteristics and Driving Forces in the Urban Belt Along the Yellow River in Ningxia Based on SPEI. Res. Soil Water Conserv. 2022, 29, 1986. [Google Scholar]
  17. Wu, Z.; Lu, G.; Guo, H.; Kuang, Y. Drought monitoring technology based on simulation of soil moisture. J. Hohai Univ. (Nat. Sci.) 2012, 40, 28–32. [Google Scholar]
  18. Li, Y.; Chang, J.; Fan, J.; Yu, B. Agricultural drought evolution characteristics and driving mechanisms in the Yellow River Basin under climate and land use changes. Trans. Chin. Soc. Agric. Eng. 2021, 37, 84–93. [Google Scholar]
  19. Liu, J.; Ren, Y.; Tao, H.; Shalamzari, M. Spatial and Temporal Variation Characteristics of Heatwaves in Recent Decades over China. Remote. Sens. 2021, 13, 3824. [Google Scholar] [CrossRef]
  20. Hu, X.; Zhao, A.; Xiang, K.; Zhang, X. Evaluating the application of LSTM model for drought forecasting in Beijing-Tianjin-Hebei region. J. Xi’An Univ. Technol. 2022, 38, 356–365. [Google Scholar]
  21. Zhang, X.; Sun, C.; Wang, H.; Li, M. Assessment of the effectiveness of ensemble-based drought forecasting models in the Yellow River Basin, China. Nat. Hazards 2019, 95, 347–363. [Google Scholar]
  22. Xu, D.; Ding, Y.; Liu, H.; Zhang, Q.; Zhang, D. Applicability of a CEEMD—ARIMA Combined Model for Drought Forecasting: A Case Study in the Ningxia Hui Autonomous Region. Atmosphere 2022, 13, 1109. [Google Scholar] [CrossRef]
  23. Liu, J.; Zhang, W. Climate changes and associated multi-scale impacts on watershed discharge over the upper reach of Yarlung Zangbo River Basin, China. Adv. Meteorol. 2018, 2018, 4851645. [Google Scholar] [CrossRef]
  24. Khan, M.M.H.; Muhammad, N.S.; El-Shafie, A. Wavelet Based Hybrid ANN-ARIMA Models for Meteorological Drought Forecasting. J. Hydrol. 2020, 590, 125380. [Google Scholar] [CrossRef]
  25. Xiang, D.; Zhang, P.; Xiang, S.; Pan, C. Multi-modal Meteorological Forecasting Based on Transformer. Comput. Eng. Appl. 2023, 59, 94–103. [Google Scholar]
  26. Zhou, H.; Zhang, S.; Peng, J.; Zhang, S.; Li, J.; Xiong, H.; Zhang, W. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Proc. AAAI 2021, 35, 11106–11115. [Google Scholar] [CrossRef]
  27. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, L. Attention Is All You Need. In Proceedings of the Annual Conference on Neural Information Processing Systems 2017, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008. [Google Scholar]
  28. Dong, H.; Sun, L.; Ouyang, F. Prediction of PM2.5 Concentration Based on Informer. Environ. Eng. 2022, 40, 48–54. [Google Scholar]
  29. Yu, F.; Koltun, V.; Funkhouser, T. Dilated residual networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 636–644. [Google Scholar]
  30. Clevert, D.; Unterthiner, T.; Hochreiter, S. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). In Proceedings of the 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, 2–4 May 2016. [Google Scholar]
  31. Hochreiter, S.; Schmidhuber, J. Long Short-Term Memory. Neural Comput 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  32. Ko, M.S.; Lee, K.; Kim, J.K.; Hong, C.W.; Dong, Z.Y.; Hur, K. Deep Concatenated Residual Network with Bidirectional LSTM for One-Hour-Ahead Wind Power Forecasting. IEEE Trans. Sustain. Energy 2021, 12, 1321–1335. [Google Scholar] [CrossRef]
  33. Ding, Y.; Yu, G.; Tian, R.; Sun, Y. Application of a Hybrid CEEMD-LSTM Model Based on the Standardized Precipitation Index for Drought Forecasting: The Case of the Xinjiang Uygur Autonomous Region, China. Atmosphere 2022, 13, 1504. [Google Scholar] [CrossRef]
  34. Li, Z.; Wang, X.; Zhang, E.; Yu, J. Research on the Drought Prediction Model of Large Irrigation Areas Based on VMD-GRU. China Rural. Water Hydropower 2023, 3, 130–137. [Google Scholar]
  35. Li, Z.; Zou, H.; Qi, B.; Guo, J. A fitting model of annual precipitation prediction based on eemd-arima. Comput. Appl. Softw. 2020, 37, 46–50. [Google Scholar]
  36. Wu, X.; Chen, Y.; Guan, Y.; Tian, X.; Hua, Y. A hybrid CEEMDAN-SE-ARIMA model and its application to summer precipitation forecast over Northeast China. Trans. Atmos. Sci. 2023, 46, 205–216. [Google Scholar]
  37. Hinge, G.; Piplodiya, J.; Sharma, A.; Hamouda, M.A.; Mohamed, M.M. Evaluation of Hybrid Wavelet Models for Regional Drought Forecasting. Remote. Sens. 2022, 14, 6381. [Google Scholar] [CrossRef]
  38. Liu, F.; Dong, T.; Liu, Y. An Improved Informer Model for Short-Term Load Forecasting by Considering Periodic Property of Load Profiles. Front. Energy Res. 2022, 10, 950912. [Google Scholar] [CrossRef]
  39. Pande, C.B.; AIAnsari, N.; Kushwaha, N.L.; Srivastava, A.; Noor, R.; Kumar, M.; Moharir, N.; Elbeltagi, A. Forecasting of SPI and Meteorological Drought Based on the Artificial Neural Network and M5P Model Tree. Land 2022, 11, 2040. [Google Scholar] [CrossRef]
  40. Banadkooki, F.B.; Singh, V.P.; Ehteram, M. Multi-timescale drought prediction using new hybrid artificial neural network models. Nat. Hazards 2021, 106, 2461–2478. [Google Scholar] [CrossRef]
  41. Zhang, Y.; Yang, H.; Cui, H.; Chen, Q. Comparison of the Ability of ARIMA, WNN and SVM Models for Drought Forecasting in the Sanjiang Plain, China. Nat. Resour. Res. 2020, 29, 1469–1470. [Google Scholar] [CrossRef]
Figure 1. Study area.
Figure 2. Informer model structure.
Figure 3. The single stack in Informer’s encoder.
Figure 4. The structure of the LSTM memory cell.
Figure 5. Observed SPEI values on different timescales of the example stations.
Figure 6. Prediction results of multi-timescale SPEI values of the ARIMA, LSTM, and Informer models at Hangjinhouqi: (a) 1-month timescale; (b) 3-month timescale; (c) 6-month timescale; (d) 9-month timescale; (e) 12-month timescale; (f) 24-month timescale.
Figure 7. Prediction results of multi-timescale SPEI values of the ARIMA, LSTM, and Informer models at Huanxian: (a) 1-month timescale; (b) 3-month timescale; (c) 6-month timescale; (d) 9-month timescale; (e) 12-month timescale; (f) 24-month timescale.
Figure 8. Prediction results of multi-timescale SPEI values of the ARIMA, LSTM, and Informer models at Taian: (a) 1-month timescale; (b) 3-month timescale; (c) 6-month timescale; (d) 9-month timescale; (e) 12-month timescale; (f) 24-month timescale.
Figure 9. Prediction results of multi-timescale SPEI values of the ARIMA, LSTM, and Informer models at Maqin: (a) 1-month timescale; (b) 3-month timescale; (c) 6-month timescale; (d) 9-month timescale; (e) 12-month timescale; (f) 24-month timescale.
Table 1. Profile of the representative meteorological stations.
| Station ID | Station Name | Longitude (°E) | Latitude (°N) | Altitude (m) |
|---|---|---|---|---|
| 53420 | Hangjinhouqi | 107.12 | 40.85 | 1024 |
| 53821 | Huanxian | 107.3 | 36.57 | 1255.6 |
| 54827 | Taian | 117.15 | 36.17 | 129.8 |
| 56043 | Maqin | 100.23 | 34.48 | 3719 |
Table 2. Drought classification based on SPEI.
| Level | Type | SPEI |
|---|---|---|
| 1 | No drought | SPEI ≥ −0.5 |
| 2 | Mild drought | −1.0 ≤ SPEI < −0.5 |
| 3 | Moderate drought | −1.5 ≤ SPEI < −1.0 |
| 4 | Severe drought | −2.0 ≤ SPEI < −1.5 |
| 5 | Extreme drought | SPEI < −2.0 |
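The classification in Table 2 can be expressed as a simple threshold lookup. The following is a minimal sketch, not the authors' code: the function name `classify_spei` is hypothetical, the thresholds are assumed to be −0.5, −1.0, −1.5, and −2.0, and each class is assumed to include its lower bound (so a value of exactly −2.0 is treated as severe drought).

```python
from typing import Tuple


def classify_spei(spei: float) -> Tuple[int, str]:
    """Map one SPEI value to a (level, type) drought class.

    Assumption of this sketch: each class includes its lower boundary,
    e.g. SPEI = -1.0 is classified as mild drought.
    """
    if spei >= -0.5:
        return 1, "No drought"
    if spei >= -1.0:
        return 2, "Mild drought"
    if spei >= -1.5:
        return 3, "Moderate drought"
    if spei >= -2.0:
        return 4, "Severe drought"
    return 5, "Extreme drought"


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    for value in (0.3, -0.75, -1.2, -1.7, -2.4):
        print(value, classify_spei(value))
```

Applied to a monthly SPEI series, such a function turns the continuous index into the five drought levels used in the analysis.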
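Table 3. Mann–Kendall trend test for SPEI.
| Example Station | SPEI Series | p Value | Trend |
|---|---|---|---|
| Hangjinhouqi | SPEI1 | 0.00055 | decreasing |
| Hangjinhouqi | SPEI3 | 1.31 × 10⁻⁶ | decreasing |
| Hangjinhouqi | SPEI6 | 5.973 × 10⁻¹⁴ | decreasing |
| Hangjinhouqi | SPEI9 | 0 | decreasing |
| Hangjinhouqi | SPEI12 | 0 | decreasing |
| Hangjinhouqi | SPEI24 | 0 | decreasing |
| Huanxian | SPEI1 | 1.349 × 10⁻⁷ | decreasing |
| Huanxian | SPEI3 | 0 | decreasing |
| Huanxian | SPEI6 | 0 | decreasing |
| Huanxian | SPEI9 | 0 | decreasing |
| Huanxian | SPEI12 | 0 | decreasing |
| Huanxian | SPEI24 | 0 | decreasing |
| Taian | SPEI1 | 5.975 × 10⁻⁵ | decreasing |
| Taian | SPEI3 | 1.372 × 10⁻¹⁰ | decreasing |
| Taian | SPEI6 | 2.22 × 10⁻¹⁶ | decreasing |
| Taian | SPEI9 | 0 | decreasing |
| Taian | SPEI12 | 0 | decreasing |
| Taian | SPEI24 | 0 | decreasing |
| Maqin | SPEI1 | 3.162 × 10⁻⁵ | decreasing |
| Maqin | SPEI3 | 1.086 × 10⁻⁹ | decreasing |
| Maqin | SPEI6 | 6.443 × 10⁻¹³ | decreasing |
| Maqin | SPEI9 | 2.44 × 10⁻¹⁵ | decreasing |
| Maqin | SPEI12 | 2.22 × 10⁻¹⁶ | decreasing |
| Maqin | SPEI24 | 0 | decreasing |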
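Table 3 reports Mann–Kendall p values and trend directions for each SPEI series. As a minimal sketch of how such a test can be computed (not the authors' code), the function below implements the standard Mann–Kendall statistic S, its tie-corrected variance, and a two-sided p value from the normal approximation. The function name `mann_kendall`, the significance level `alpha=0.05`, and the continuity correction are assumptions of this sketch; the paper does not state which implementation or settings were used.

```python
import math
from itertools import combinations


def mann_kendall(series, alpha=0.05):
    """Return (z, p, trend) for a two-sided Mann-Kendall trend test."""
    x = list(series)
    n = len(x)

    # S: sum of the signs of all later-minus-earlier pairwise differences.
    s = sum((xj > xi) - (xj < xi) for xi, xj in combinations(x, 2))

    # Variance of S, corrected for groups of tied values.
    counts = {}
    for v in x:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in counts.values() if t > 1)
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0

    # Continuity-corrected standard normal test statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))

    if p < alpha:
        trend = "increasing" if z > 0 else "decreasing"
    else:
        trend = "no trend"
    return z, p, trend


if __name__ == "__main__":
    # Synthetic, steadily declining series used only for illustration.
    demo = [1.0 - 0.1 * i for i in range(20)]
    print(mann_kendall(demo))
```

For long, strongly trending series the p value can fall below floating-point or printing precision, which is presumably why several entries in Table 3 are reported as 0.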
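Table 4. The statistical criteria of the ARIMA, LSTM, and Informer models.
| Example Station | SPEI Series | Model | MAE | RMSE | NSE |
|---|---|---|---|---|---|
| Hangjinhouqi | SPEI1 | ARIMA | 0.800 | 1.027 | 0.022 |
| Hangjinhouqi | SPEI1 | LSTM | 0.799 | 1.021 | 0.032 |
| Hangjinhouqi | SPEI1 | Informer | 0.531 | 0.688 | 0.561 |
| Hangjinhouqi | SPEI3 | ARIMA | 0.633 | 0.827 | 0.371 |
| Hangjinhouqi | SPEI3 | LSTM | 0.635 | 0.824 | 0.375 |
| Hangjinhouqi | SPEI3 | Informer | 0.388 | 0.521 | 0.434 |
| Hangjinhouqi | SPEI6 | ARIMA | 0.455 | 0.655 | 0.573 |
| Hangjinhouqi | SPEI6 | LSTM | 0.452 | 0.642 | 0.590 |
| Hangjinhouqi | SPEI6 | Informer | 0.277 | 0.416 | 0.828 |
| Hangjinhouqi | SPEI9 | ARIMA | 0.279 | 0.397 | 0.821 |
| Hangjinhouqi | SPEI9 | LSTM | 0.291 | 0.402 | 0.817 |
| Hangjinhouqi | SPEI9 | Informer | 0.271 | 0.382 | 0.835 |
| Hangjinhouqi | SPEI12 | ARIMA | 0.166 | 0.279 | 0.910 |
| Hangjinhouqi | SPEI12 | LSTM | 0.187 | 0.296 | 0.899 |
| Hangjinhouqi | SPEI12 | Informer | 0.182 | 0.287 | 0.905 |
| Hangjinhouqi | SPEI24 | ARIMA | 0.124 | 0.201 | 0.940 |
| Hangjinhouqi | SPEI24 | LSTM | 0.145 | 0.214 | 0.932 |
| Hangjinhouqi | SPEI24 | Informer | 0.123 | 0.190 | 0.968 |
| Huanxian | SPEI1 | ARIMA | 0.804 | 1.006 | −0.049 |
| Huanxian | SPEI1 | LSTM | 0.804 | 1.003 | −0.042 |
| Huanxian | SPEI1 | Informer | 0.666 | 0.842 | 0.264 |
| Huanxian | SPEI3 | ARIMA | 0.628 | 0.826 | 0.250 |
| Huanxian | SPEI3 | LSTM | 0.617 | 0.812 | 0.276 |
| Huanxian | SPEI3 | Informer | 0.271 | 0.402 | 0.822 |
| Huanxian | SPEI6 | ARIMA | 0.423 | 0.594 | 0.580 |
| Huanxian | SPEI6 | LSTM | 0.415 | 0.581 | 0.598 |
| Huanxian | SPEI6 | Informer | 0.211 | 0.271 | 0.912 |
| Huanxian | SPEI9 | ARIMA | 0.243 | 0.354 | 0.842 |
| Huanxian | SPEI9 | LSTM | 0.254 | 0.361 | 0.836 |
| Huanxian | SPEI9 | Informer | 0.191 | 0.286 | 0.896 |
| Huanxian | SPEI12 | ARIMA | 0.166 | 0.255 | 0.915 |
| Huanxian | SPEI12 | LSTM | 0.176 | 0.272 | 0.904 |
| Huanxian | SPEI12 | Informer | 0.096 | 0.133 | 0.977 |
| Huanxian | SPEI24 | ARIMA | 0.109 | 0.177 | 0.945 |
| Huanxian | SPEI24 | LSTM | 0.127 | 0.193 | 0.936 |
| Huanxian | SPEI24 | Informer | 0.086 | 0.123 | 0.974 |
| Taian | SPEI1 | ARIMA | 0.844 | 1.007 | −0.013 |
| Taian | SPEI1 | LSTM | 0.835 | 0.994 | 0.014 |
| Taian | SPEI1 | Informer | 0.507 | 0.672 | 0.548 |
| Taian | SPEI3 | ARIMA | 0.619 | 0.791 | 0.289 |
| Taian | SPEI3 | LSTM | 0.620 | 0.792 | 0.288 |
| Taian | SPEI3 | Informer | 0.508 | 0.699 | 0.445 |
| Taian | SPEI6 | ARIMA | 0.401 | 0.552 | 0.575 |
| Taian | SPEI6 | LSTM | 0.413 | 0.554 | 0.573 |
| Taian | SPEI6 | Informer | 0.391 | 0.542 | 0.591 |
| Taian | SPEI9 | ARIMA | 0.270 | 0.387 | 0.789 |
| Taian | SPEI9 | LSTM | 0.277 | 0.397 | 0.777 |
| Taian | SPEI9 | Informer | 0.201 | 0.283 | 0.887 |
| Taian | SPEI12 | ARIMA | 0.193 | 0.295 | 0.876 |
| Taian | SPEI12 | LSTM | 0.202 | 0.316 | 0.858 |
| Taian | SPEI12 | Informer | 0.133 | 0.192 | 0.948 |
| Taian | SPEI24 | ARIMA | 0.137 | 0.202 | 0.909 |
| Taian | SPEI24 | LSTM | 0.148 | 0.216 | 0.897 |
| Taian | SPEI24 | Informer | 0.131 | 0.192 | 0.972 |
| Maqin | SPEI1 | ARIMA | 0.846 | 1.052 | −0.047 |
| Maqin | SPEI1 | LSTM | 0.857 | 1.059 | −0.061 |
| Maqin | SPEI1 | Informer | 0.543 | 0.738 | 0.484 |
| Maqin | SPEI3 | ARIMA | 0.592 | 0.753 | 0.418 |
| Maqin | SPEI3 | LSTM | 0.635 | 0.788 | 0.363 |
| Maqin | SPEI3 | Informer | 0.245 | 0.335 | 0.884 |
| Maqin | SPEI6 | ARIMA | 0.389 | 0.555 | 0.655 |
| Maqin | SPEI6 | LSTM | 0.382 | 0.550 | 0.661 |
| Maqin | SPEI6 | Informer | 0.162 | 0.329 | 0.879 |
| Maqin | SPEI9 | ARIMA | 0.231 | 0.334 | 0.858 |
| Maqin | SPEI9 | LSTM | 0.235 | 0.338 | 0.855 |
| Maqin | SPEI9 | Informer | 0.124 | 0.193 | 0.952 |
| Maqin | SPEI12 | ARIMA | 0.162 | 0.247 | 0.920 |
| Maqin | SPEI12 | LSTM | 0.172 | 0.261 | 0.911 |
| Maqin | SPEI12 | Informer | 0.101 | 0.145 | 0.972 |
| Maqin | SPEI24 | ARIMA | 0.102 | 0.159 | 0.959 |
| Maqin | SPEI24 | LSTM | 0.117 | 0.169 | 0.954 |
| Maqin | SPEI24 | Informer | 0.064 | 0.092 | 0.986 |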
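The three criteria in Table 4 can be computed from paired observed and predicted SPEI values. The sketch below uses the standard definitions of the mean absolute error (MAE), root mean square error (RMSE), and Nash–Sutcliffe efficiency (NSE); it is assumed, not confirmed here, that these match the formulas used by the authors. The function names and the sample values are illustrative only.

```python
import math


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared
    prediction error to the variance of the observations about their mean."""
    mean_obs = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - num / den


if __name__ == "__main__":
    # Made-up values for illustration; not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 3.0, 4.0, 5.0]
    print(mae(observed, predicted), rmse(observed, predicted), nse(observed, predicted))
```

An NSE of 1 indicates a perfect fit, and an NSE of 0 means the model is no better than predicting the observed mean. A negative NSE, such as the −0.049 reported for ARIMA on SPEI1 at Huanxian, means the model performs worse than that mean baseline.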
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Shang, J.; Zhao, B.; Hua, H.; Wei, J.; Qin, G.; Chen, G. Application of Informer Model Based on SPEI for Drought Forecasting. Atmosphere 2023, 14, 951. https://doi.org/10.3390/atmos14060951

AMA Style

Shang J, Zhao B, Hua H, Wei J, Qin G, Chen G. Application of Informer Model Based on SPEI for Drought Forecasting. Atmosphere. 2023; 14(6):951. https://doi.org/10.3390/atmos14060951

Chicago/Turabian Style

Shang, Jiandong, Bei Zhao, Haobo Hua, Jieru Wei, Guoyong Qin, and Gongji Chen. 2023. "Application of Informer Model Based on SPEI for Drought Forecasting" Atmosphere 14, no. 6: 951. https://doi.org/10.3390/atmos14060951
