A Daily Air Pollutant Concentration Prediction Framework Combining Successive Variational Mode Decomposition and Bidirectional Long Short-Term Memory Network

Huang, Zhong; Li, Linna; Ding, Guorong

doi:10.3390/su151310660

by Zhong Huang ¹, Linna Li ¹,²,* and Guorong Ding ³,*

¹ College of Science, Wuhan University of Science and Technology, Wuhan 430065, China

² Hubei Province Key Laboratory of Systems Science in Metallurgical Process, Wuhan 430065, China

³ Statistics Bureau of Maiji District, Tianshui 741020, China

* Authors to whom correspondence should be addressed.

Sustainability 2023, 15(13), 10660; https://doi.org/10.3390/su151310660

Submission received: 6 June 2023 / Revised: 29 June 2023 / Accepted: 30 June 2023 / Published: 6 July 2023

(This article belongs to the Section Air, Climate Change and Sustainability)


Abstract:

Precise and efficient air quality prediction plays a vital role in safeguarding public health and informing policy-making. Fine particulate matter, specifically PM_2.5 and PM₁₀, serves as a crucial indicator for assessing and managing air pollution levels. In this paper, a daily pollution concentration prediction model combining successive variational mode decomposition (SVMD) and a bidirectional long short-term memory (BiLSTM) neural network is proposed. Firstly, SVMD is used as an unsupervised feature-learning method to divide data into intrinsic mode functions (IMFs) and to extract frequency features and improve short-term trend prediction. Secondly, the BiLSTM network is introduced for supervised learning to capture small changes in the air pollutant sequence and perform prediction of the decomposed sequence. Furthermore, the Bayesian optimization (BO) algorithm is employed to identify the optimal key parameters of the BiLSTM model. Lastly, the predicted values are reconstructed to generate the final prediction results for the daily PM_2.5 and PM₁₀ datasets. The prediction performance of the proposed model is validated using the daily PM_2.5 and PM₁₀ datasets collected from the China Environmental Monitoring Center in Tianshui, Gansu, and Wuhan, Hubei. The results show that SVMD can smooth the original series more effectively than other decomposition methods, and that the BO-BiLSTM method is better than other LSTM-based models, thereby proving that the proposed model has excellent feasibility and accuracy.

Keywords:

air pollutant concentration prediction; successive VMD; BiLSTM; Bayesian optimization

1. Introduction

Environmental pollution has become a major global problem, especially in terms of air pollution. The haze problem has had a serious impact on social development and human health. Even at relatively low concentrations, hazardous atmospheric particulate matter can cause serious harm to human health and ecosystems. Particles with an aerodynamic diameter of less than 2.5 and 10 microns are a prevalent type of air pollutant. These particles often contain a variety of toxic and harmful substances and have the ability to penetrate deep into the human respiratory system, causing various adverse health effects such as respiratory diseases, heart failure and cardiovascular diseases [1,2]. In addition, the concentration of air pollutants in most major Chinese cities greatly exceeds the WHO’s recommended standards, and is considered the fourth biggest threat to Chinese people’s health after heart disease, poor diet, and smoking [3].

The concentration of PM_2.5 and PM₁₀ in the air has always been the focus of global attention. To master the rules of change and predict the concentration change trends of the future, we must implement a series of measures and analytical actions. In addition to recommending air quality regulatory requirements to reduce air pollution, it is also crucial to provide guidance for people’s daily activities. This is of great practical significance, as it serves to warn and protect individuals from the harmful effects of air pollutants [4].

However, many factors affect the concentration of air pollutants. Varying meteorological conditions and pollutant components make the concentration a non-stationary and nonlinear time series, so the concentration series is difficult to predict. At present, prediction models for air pollutant concentration mainly comprise chemical transport models (CTMs), statistical learning methods, and hybrid methods [5].

Given the intricate nature of atmospheric chemical and transport processes, and considering that the prediction accuracy of CTMs depends on a full understanding of pollutant sources and an accurate description of physicochemical processes, the performance of CTMs is not sufficient [6]. Statistical methods have often been employed to predict air pollution concentrations because they are relatively simple and easy to understand. Among them, time series forecasting and multi-factor forecasting are the main strategies. Multi-factor forecasting has two obvious drawbacks. First, the prediction of the target variable depends on the predictions of other exogenous factors, which can result in cumulative errors. Second, multicollinearity among the selected variables may lead to overfitting [7]. Time series forecasting, by contrast, needs no external variables: future trends are inferred from the inherent properties of the series' own past data. For these reasons, time series forecasting is chosen here in place of multi-factor forecasting. An autoregressive moving average (ARMA) model and multiple linear regression (MLR) have been employed to predict the PM_2.5 concentration [8,9]. However, because they are linear models, they were unable to correctly forecast the nonlinear sequence of air pollution concentrations. It is important to remember that there is a nonlinear relationship between the series of air pollutant concentrations and other variables, so the accuracy of linear statistical methods is limited. Nonlinear models, such as artificial neural network (ANN) [10,11], support vector machine (SVM), least-squares support vector machine (LSSVM) [7,12], and extreme learning machine (ELM) [12], are increasingly being proposed as alternatives.
Extensive research has demonstrated the wide adoption of long short-term memory (LSTM) networks for air pollution prediction, enabling significant improvements in timely and accurate predictions [13,14]. However, no single model is perfect, and the optimization of initialization parameters remains a problem [15]. Because air pollutant sequences are nonlinear, non-stationary and relatively complex dynamic sequences, the overall performance of a single prediction model is limited. Such models cannot easily be applied to the many types of time series with unknown characteristics, and poor initial parameters in some deep learning techniques may also lead to poor prediction accuracy [16]. In addition, empirical studies in other prediction fields show that hybrid models perform better than single models [17,18]. Hence, it is reasonable to infer that the machine learning models mentioned earlier, despite their good performance, may not effectively capture sudden fluctuations in air pollutant concentrations.

Hybrid models that combine different techniques, such as prediction models, optimization algorithms, and data decomposition techniques [22,23,24], may be more suitable [19,20,21]. Hybrid prediction models are increasingly proposed and used for air pollutant prediction [25]. They not only weaken the nonlinear and non-stationary characteristics of the data, but also effectively improve the prediction accuracy [26]. Decomposition of the original signal can be carried out using empirical mode decomposition (EMD), which has been successfully applied in the prediction of power systems, groundwater, and water quality; the prediction of monthly flow has been achieved with high forecasting accuracy [27,28,29]. However, EMD is vulnerable to mode aliasing. Ensemble empirical mode decomposition (EEMD) was developed to alleviate the mode aliasing issue by adding white noise to the data. However, the noise added by EEMD increases the reconstruction error [30]. Complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) adds adaptive white noise at each decomposition stage, which makes the reconstruction error almost zero and keeps the complexity of each component low; it has been successfully applied to wind speed and PM_2.5 prediction, further improving prediction accuracy [17,31]. However, EMD, EEMD, and CEEMDAN all suffer from varying degrees of mode aliasing and lack a solid mathematical foundation, which yields poor prediction results [32]. Variational mode decomposition (VMD) has a sound theoretical basis and noise robustness; it effectively inhibits mode aliasing and can improve the accuracy of the prediction model in the extraction of complex sequence fluctuation patterns. It can better reflect the inherent characteristics of the data, and has been successfully applied to the prediction of wind speed, PM_2.5, and blood glucose, with good results [33,34].
Nevertheless, the manual selection of the number of modal components $K$ and the penalty factor $α$ in VMD introduces subjectivity and lacks convincing evidence [35]. Although many swarm intelligence algorithms have been used to optimize the VMD parameters, they require considerable time and memory during training, which leaves room for improvement [36,37]. In contrast to VMD, the successive variational mode decomposition (SVMD) method does not require the number of mode components to be specified in advance. SVMD identifies and extracts the components successively, which improves convergence speed and reduces computation time. It has been applied successfully in multiple domains, such as wind speed prediction [38].

Therefore, to overcome these limitations, a semi-supervised model named SVMD-BO-BiLSTM is proposed in this paper to forecast the concentrations of PM_2.5 and PM₁₀. The model only requires the air pollutant concentration time series dataset as input. The contributions of this paper are as follows.

(1): During the unsupervised phase, SVMD is employed for feature learning without the need for labeled data. It segments the sequence into multiple intrinsic mode functions (IMF), which capture distinct frequency and magnitude characteristics, thereby representing diverse features of the data.
(2): In the supervised stage, the BO-BiLSTM neural network extracts time-dependent features more efficiently than linear networks.
(3): This paper develops a novel hybrid model based on the SVMD-BO-BiLSTM approach. Its ability to forecast PM_2.5 and PM₁₀ concentrations in Tianshui, Gansu and Wuhan, Hubei is examined using datasets from these locations.
(4): We present a scientific and reasonable model evaluation system involving multiple experiments, seven model performance indicators, four air pollutant datasets, and a systematic evaluation of the proposed mixed model using multiple comparison model tests and stability tests. Furthermore, the superior performance of the proposed model indicates that the proposed hybrid model not only provides a new option for proposing air quality regulatory requirements to reduce air pollution, but will also help to guide people’s daily activities, and protect them from harmful air pollutants.

This paper is organized as follows: Section 2 provides an overview of fundamental methods, while Section 3 elaborates on the basic methods and the proposed prediction models. The predictive performance of various methods, along with a time comparison and detailed analysis, is discussed in Section 4. Finally, Section 5 and Section 6 summarize the key contributions of this study.

2. Preliminaries

2.1. SVMD Method

Successive variational mode decomposition (SVMD) is a new signal decomposition technique based on VMD. It identifies and extracts the components successively, which improves convergence speed and reduces computation time [39].

In order to represent the method mathematically, we assume that the original input signal $f (t)$ can be decomposed into two parts: the $L$th mode component $u_{L} (t)$ and the residual component $f_{r} (t)$:

$f (t) = u_{L} (t) + f_{r} (t)$

(1)

The residual signal $f_{r} (t)$ represents the remaining part of the input signal outside of $u_{L} (t)$. It is composed of two components: $\sum_{i = 1}^{L - 1} u_{i} (t)$ from the previously obtained modes, and $f_{u} (t)$ from the unprocessed part of the signal, i.e.,

$f_{r} (t) = \sum_{i = 1}^{L - 1} u_{i} (t) + f_{u} (t)$

(2)

Obviously, when seeking the first mode $u_{1} (t)$, the first component of $f_{r} (t)$ is null. The specific steps of the SVMD algorithm are as follows.

Each mode should be compact around its central frequency. Therefore, the $L$th mode is chosen to minimize the following criterion:

$J_{1} = {‖\partial_{t} [(δ (t) + \frac{j}{π t}) * u_{L} (t)] e^{- j ω_{L} t}‖}_{2}^{2}$

(3)
The spectral overlap between the residual signal $f_{r} (t)$ and the mode $u_{L} (t)$ should be minimized, meaning that the energy of the residual mode signal is minimized in the frequency band wherein the desired mode is located. To ensure stable implementation of this constraint, an appropriate filter is selected with the following frequency response:

${\hat{β}}_{L} (ω) = \frac{1}{α {(ω - ω_{L})}^{2}}$

(4)

Thus, the constraints that are established are the following:

$J_{2} = ∥ β_{L} (t) * f_{r} (t) ∥_{2}^{2}$

(5)

where $β_{L} (t)$ is the impulse response of the filter in Formula (4).
Minimizing $J_{1}$ and $J_{2}$ alone may not be enough to distinguish the $L$th mode from the $(L - 1)$th mode effectively. Therefore, following the idea behind the $J_{2}$ constraint, filters with the following frequency response are used:

${\hat{β}}_{i} (ω) = \frac{1}{α {(ω - ω_{i})}^{2}}, 1 \leq i \leq L - 1$

(6)

Therefore, the established constraints are as follows:

$J_{3} = \sum_{i = 1}^{L - 1} ∥ β_{i} (t) * u_{L} (t) ∥_{2}^{2}$

(7)

where $β_{i} (t)$ represents the impulse response of the filter in Formula (6).
During the signal decomposition, the following constraints are established to ensure that the signal can be completely reconstructed.

$f (t) = u_{L} (t) + f_{u} (t) + \sum_{i = 1}^{L - 1} u_{i} (t)$

(8)

Thus, the extraction of the modal components can be expressed as a constrained minimization problem:

$\begin{cases} \min_{u_{L}, ω_{L}, f_{r}} \{α J_{1} + J_{2} + J_{3}\} \\ \text{s.t.} \quad u_{L} (t) + f_{r} (t) = f (t) \end{cases}$

(9)

where $α$ is a parameter for balancing $J_{1}$, $J_{2}$ and $J_{3}$.

In order to achieve better convergence and to reconstruct the original signal as completely as possible in the presence of noise, an augmented Lagrangian function is constructed from a quadratic penalty term and a Lagrange multiplier, as shown below:

$\begin{array}{l} L (u_{L}, ω_{L}, λ) := α J_{1} + J_{2} + J_{3} \\ + {‖f (t) - (u_{L} (t) + f_{u} (t) + \sum_{i = 1}^{L - 1} u_{i} (t))‖}_{2}^{2} \\ + \langle λ (t), f (t) - (u_{L} (t) + f_{u} (t) + \sum_{i = 1}^{L - 1} u_{i} (t)) \rangle \end{array}$

(10)

where $λ$ is the Lagrange multiplier. The minimization problem is then solved iteratively with the alternating direction method of multipliers (ADMM).

The effect of SVMD is illustrated below with a PM_2.5 sequence. Figure 1a plots each IMF component obtained after SVMD processing, and Figure 1b shows the corresponding spectra. After the PM_2.5 sequence is decomposed by SVMD, the components are clearly separated and the center frequency of each component is distinct.

Following the unsupervised frequency extraction performed by SVMD, the original PM_2.5 time series data are decomposed into multiple components, each representing distinct frequency characteristics. Notably, IMF1 exhibits the highest frequency, while the frequencies of the remaining components gradually decrease.
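The reconstruction constraint in Formula (8) can be illustrated with a short Python sketch. It is not an implementation of SVMD: the hypothetical helper `band_split` simply masks disjoint FFT bands, and the input series is synthetic. The sketch only shows that when the extracted components partition the signal, their sum recovers the input.

```python
import numpy as np

def band_split(signal, n_bands):
    """Split a real signal into n_bands components with disjoint frequency bands.

    NOT SVMD: a plain FFT mask (an assumption made for illustration) that
    mimics the property required by Formula (8), i.e. the components add
    up to the original signal.
    """
    spectrum = np.fft.rfft(signal)
    # Band edges over the rfft bins; every bin belongs to exactly one band.
    edges = np.linspace(0, spectrum.size, n_bands + 1).astype(int)
    components = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        masked = np.zeros_like(spectrum)
        masked[lo:hi] = spectrum[lo:hi]
        components.append(np.fft.irfft(masked, n=signal.size))
    return np.array(components)

rng = np.random.default_rng(0)
t = np.arange(512)
# Synthetic stand-in for a daily concentration series (hypothetical data).
series = (50 + 10 * np.sin(2 * np.pi * t / 365)
          + 5 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 2, t.size))

modes = band_split(series, n_bands=4)
# Largest pointwise gap between the summed components and the input.
reconstruction_error = np.max(np.abs(modes.sum(axis=0) - series))
```

Note that this stand-in orders components from low to high frequency, whereas the IMFs described above are ordered from high to low frequency.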

2.2. BiLSTM Model

LSTM, depicted in Figure 2, is a type of recurrent neural network (RNN) with unique characteristics; LSTM is used to capture long-term trends due to the different gate and cell compositions. Differing from LSTM, the BiLSTM neural network is composed of forward LSTM and backward LSTM, so it can process sequences in both forward and backward directions. Both directions have independent hidden layers, and each hidden layer can simultaneously capture past (forward) and future (backward) information within a specific step size, so as to extract more comprehensive sequence information and improve the prediction performance of the network. Figure 3 shows the network flow diagram of BiLSTM.

Assuming that the network input is $(x_{1}, x_{2}, \dots, x_{T})$ and the hidden layer state is $(h_{1}, h_{2}, \dots, h_{T})$, the calculation of each unit and gate at time $t$ is shown in Equations (11)–(16):

$f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})$

(11)

$i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})$

(12)

${\tilde{C}}_{t} = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C})$

(13)

$C_{t} = f_{t} * C_{t - 1} + i_{t} * {\tilde{C}}_{t}$

(14)

$o_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})$

(15)

$h_{t} = o_{t} * \tanh (C_{t})$

(16)

where $i_{t}$, $f_{t}$, and $o_{t}$ are the input gate, the forget gate, and the output gate, respectively; $h_{t}$ and $h_{t - 1}$ are the outputs of the network at the current and previous time steps, respectively; $W_{i}$, $W_{f}$, $W_{C}$, $W_{o}$ and $b_{i}$, $b_{f}$, $b_{C}$, $b_{o}$ are the weight matrices and biases of the three gates and the cell state, respectively; and σ(·) and tanh(·) are the activation functions.
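As a minimal sketch of Equations (11)–(16), the NumPy code below runs one LSTM step and then a forward and a backward pass over a short sequence. The helper names, the random weights, and the choice to concatenate the two directions are assumptions made for illustration; the paper does not state how the forward and backward states are merged.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step following Equations (11)-(16)."""
    z = np.concatenate([h_prev, x_t])             # [h_{t-1}, x_t]
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])        # forget gate, Eq. (11)
    i_t = sigmoid(p["W_i"] @ z + p["b_i"])        # input gate, Eq. (12)
    c_tilde = np.tanh(p["W_C"] @ z + p["b_C"])    # candidate state, Eq. (13)
    c_t = f_t * c_prev + i_t * c_tilde            # cell state, Eq. (14)
    o_t = sigmoid(p["W_o"] @ z + p["b_o"])        # output gate, Eq. (15)
    h_t = o_t * np.tanh(c_t)                      # hidden state, Eq. (16)
    return h_t, c_t

def init_params(input_size, hidden_size, rng):
    """Random weights for illustration only (no training is performed)."""
    p = {}
    for gate in ("f", "i", "C", "o"):
        p[f"W_{gate}"] = rng.normal(0, 0.1, (hidden_size, hidden_size + input_size))
        p[f"b_{gate}"] = np.zeros(hidden_size)
    return p

def run_lstm(xs, p, hidden_size):
    h = np.zeros(hidden_size)
    c = np.zeros(hidden_size)
    outputs = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, p)
        outputs.append(h)
    return np.array(outputs)

def bilstm(xs, fwd, bwd, hidden_size):
    """Forward pass plus a pass over the reversed sequence, re-aligned in time."""
    forward = run_lstm(xs, fwd, hidden_size)
    backward = run_lstm(xs[::-1], bwd, hidden_size)[::-1]
    # Concatenating both directions is an assumed merge rule.
    return np.concatenate([forward, backward], axis=1)

rng = np.random.default_rng(0)
T, input_size, hidden_size = 3, 1, 4
xs = rng.normal(size=(T, input_size))
states = bilstm(xs, init_params(input_size, hidden_size, rng),
                init_params(input_size, hidden_size, rng), hidden_size)
```

Each row of `states` holds the forward hidden state followed by the backward hidden state for one time step, so every step sees both past and future context.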

2.3. Bayesian Optimization Algorithm

Gradient-based optimization is a powerful technique for finding the extremum of a function. However, when tuning the BiLSTM model, the functional relationship between the hyperparameters and the model loss cannot be written explicitly, so traditional optimization algorithms cannot be used to tune the hyperparameters. The Bayesian optimization (BO) algorithm, a black-box optimization method, offers an effective way to search for the global optimum without requiring explicit knowledge of the objective function. This makes it particularly well suited for hyperparameter tuning of the BiLSTM model [40].

The idea of the Bayesian optimization algorithm is to use the prior probability distribution of the objective function and the known observation points to update the posterior probability distribution, and then find the next minimum point according to the posterior probability distribution so that the minimum value decreases continuously; finally, the optimal hyperparameter can be obtained. The objective of Bayesian optimization is defined as

$x_{\min} = \arg\min_{x \in X} f (x)$

(17)

where $x_{\min}$ denotes the optimal value of the hyperparameters being optimized, and $f (x)$ denotes the objective function.

Assuming that the hyperparameters to be optimized are $X = \{x_{1}, x_{2}, \dots, x_{t}\}$, the dataset obtained via Bayesian optimization iteration is

$D_{t} = \{(x_{1}, f (x_{1})), (x_{2}, f (x_{2})), \dots, (x_{t}, f (x_{t}))\}$

(18)

The Gaussian process assumes that the observation points follow a joint Gaussian distribution:

$f (x_{1 : t}) \sim \mathrm{GP} (μ (x_{1 : t}), Σ (x_{1 : t}, x_{1 : t}))$

(19)

where $Σ (x_{1 : t}, x_{1 : t})$ is the covariance matrix built from the kernel function $k$,

$Σ (x_{1 : t}, x_{1 : t}) = \begin{bmatrix} k (x_{1}, x_{1}) & \cdots & k (x_{1}, x_{t}) \\ \vdots & \ddots & \vdots \\ k (x_{t}, x_{1}) & \cdots & k (x_{t}, x_{t}) \end{bmatrix}$

(20)

According to Bayes’ theorem,

$P (f (x_{t + 1}) \mid f (x_{1 : t})) \propto P (f (x_{1 : t}) \mid f (x_{t + 1})) \, P (f (x_{t + 1}))$

(21)

Through continuous iteration and updating of $x_{\min}^{t + 1}$, the optimal hyperparameters are finally obtained.
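The update described by Equations (19)–(21) can be sketched in one dimension with NumPy. Several choices below are assumptions, since the text does not specify them: the RBF kernel and its length scale, the lower-confidence-bound rule used to pick the next point, and the toy `objective`, which stands in for the validation loss of a trained network.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Kernel k(x, x') used to fill the covariance matrix of Eq. (20)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(x_obs.size)
    K_s = rbf_kernel(x_obs, x_new)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    v = np.linalg.solve(K, K_s)
    var = 1.0 - np.sum(K_s * v, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def objective(x):
    # Hypothetical toy objective with its minimum at x = 0.3.
    return (x - 0.3) ** 2

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 201)       # candidate points in the search space
x_obs = rng.uniform(0.0, 1.0, 3)        # random initial observations
y_obs = objective(x_obs)

for _ in range(15):
    offset = y_obs.mean()               # center targets for the zero-mean prior
    mean, std = gp_posterior(x_obs, y_obs - offset, grid)
    lcb = mean + offset - 2.0 * std     # assumed acquisition rule
    x_next = grid[np.argmin(lcb)]       # next point to evaluate
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))

x_min = x_obs[np.argmin(y_obs)]         # best point found, as in Eq. (17)
```

Early iterations mostly explore regions with high posterior uncertainty; once the uncertainty shrinks, the posterior mean dominates and the search concentrates near the minimum.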

3. SVMD-BO-BiLSTM Prediction Model

3.1. BO-BiLSTM

In this study, a prediction model based on the BiLSTM network was developed to tackle the challenges posed by air pollution time series data, which are characterized by instability, nonlinearity, and periodic uncertainty. To address the nonlinearity and non-stationarity of the data, the BO algorithm was combined with the BiLSTM model, allowing the hyperparameters to adapt to the characteristics of the data. The framework of the BO-BiLSTM model is illustrated in Figure 4. It is widely recognized that the structure and parameters of artificial neural networks significantly affect model performance. Therefore, the BO algorithm was employed to optimize the key hyperparameters: the number of neurons in the BiLSTM layer, the number of hidden layers, and the learning rate of the optimizer [41]. The mean square error (MSE) was adopted as the loss function, and the hyperparameters were optimized within the following ranges: 50 to 200 neurons in the BiLSTM layer, 1 to 4 hidden layers, and an optimizer learning rate of 0.001 to 0.1.

The procedure of the BO-BiLSTM algorithm is as follows:

(1): Obtain the time series data to be predicted, and divide them into a training set and a test set according to the chosen proportion.
(2): Take the number of neurons in the BiLSTM network, the learning rate of the optimizer, and the number of hidden layers as the optimization objects, and initialize the BO algorithm.
(3): Compute the current distribution of the objective function from randomly chosen initial points.
(4): Update the current function distribution according to the point selected by the acquisition function.
(5): Determine whether the termination conditions are met. If yes, return the optimal hyperparameter values. Otherwise, return to Step 4.
(6): Construct the BiLSTM network model with the optimal hyperparameters.
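The search space stated in Section 3.1 can be written down as a small Python sketch. The `decode` helper, the log-scale treatment of the learning rate, and `placeholder_loss` are assumptions for illustration; in particular, random sampling stands in here for the surrogate-guided selection of steps (3)–(5), and the placeholder replaces training a BiLSTM and measuring its MSE.

```python
import numpy as np

# Ranges taken from the text: neurons 50-200, hidden layers 1-4,
# learning rate 0.001-0.1.
SEARCH_SPACE = {
    "neurons": (50, 200),
    "hidden_layers": (1, 4),
    "learning_rate": (0.001, 0.1),
}

def decode(u):
    """Map a point u in the unit cube [0, 1]^3 to a hyperparameter dict."""
    n_lo, n_hi = SEARCH_SPACE["neurons"]
    l_lo, l_hi = SEARCH_SPACE["hidden_layers"]
    lr_lo, lr_hi = SEARCH_SPACE["learning_rate"]
    return {
        "neurons": int(round(n_lo + u[0] * (n_hi - n_lo))),
        "hidden_layers": int(round(l_lo + u[1] * (l_hi - l_lo))),
        # Log-uniform interpolation is an assumption, not stated in the paper.
        "learning_rate": float(lr_lo * (lr_hi / lr_lo) ** u[2]),
    }

def placeholder_loss(hp):
    """Hypothetical stand-in for 'train the BiLSTM and return its MSE'."""
    return ((hp["neurons"] - 120) ** 2 / 1e4
            + (hp["hidden_layers"] - 2) ** 2
            + (np.log10(hp["learning_rate"]) + 2) ** 2)

rng = np.random.default_rng(0)
best_hp, best_loss = None, np.inf
for _ in range(30):
    hp = decode(rng.uniform(size=3))    # random proposal (stand-in for BO)
    loss = placeholder_loss(hp)
    if loss < best_loss:                # keep the best configuration so far
        best_hp, best_loss = hp, loss
```

Working in the unit cube keeps integer and continuous hyperparameters on a common scale, which is convenient when a Gaussian-process surrogate replaces the random proposals.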

3.2. PM_2.5 and PM₁₀ Concentration Prediction Framework

The hybrid model proposed in this paper has three stages, and the related process is as follows:

Step 1: Data preprocessing. SVMD is utilized to filter out high-frequency noise and extract prominent features in the frequency domain of the time series.

Step 2: The purpose of this stage is to optimize the BiLSTM network via Bayesian optimization. The BO algorithm is initialized with the number of neurons of the BiLSTM network, the learning rate of the optimizer, and the number of hidden layers as the optimization objects. The current distribution of the objective function is first computed from random initial points and then updated according to the point selected by the acquisition function. When the termination conditions are satisfied, the optimal hyperparameter values are returned and the BiLSTM network model is constructed with them.

Step 3: The developed hybrid BO-BiLSTM model is used to predict each mode obtained from the SVMD. The final prediction is obtained by combining the prediction results of all sub-series. The flow chart of the hybrid forecasting model is shown in Figure 5.
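The three stages can be sketched end to end in Python with simple stand-ins. Everything below is an assumption for illustration: a moving-average split replaces SVMD, a least-squares autoregression replaces the BO-BiLSTM predictor, the series is synthetic, and the final forecast is taken as the plain sum of the component forecasts.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def decompose(series, window=7):
    """Stand-in decomposition (NOT SVMD): smooth trend plus remainder."""
    kernel = np.ones(window) / window
    trend = np.convolve(series, kernel, mode="same")
    return [trend, series - trend]

def fit_predict(component, n_train, time_step=3):
    """Stand-in predictor (NOT a BiLSTM): linear autoregression on lagged values."""
    rows = sliding_window_view(component, time_step + 1)
    X, y = rows[:, :-1], rows[:, -1]        # inputs: time_step lags, target: next value
    split = n_train - time_step             # windows whose target is in the training part
    A_train = np.column_stack([X[:split], np.ones(split)])
    coef, *_ = np.linalg.lstsq(A_train, y[:split], rcond=None)
    A_test = np.column_stack([X[split:], np.ones(X.shape[0] - split)])
    return A_test @ coef                    # one-step-ahead forecasts on the test part

rng = np.random.default_rng(1)
t = np.arange(400)
# Synthetic stand-in for a daily concentration series (hypothetical data).
series = 60 + 15 * np.sin(2 * np.pi * t / 30) + rng.normal(0, 3, t.size)
n_train = int(0.7 * series.size)            # chronological 70/30 split

# Stage 1: decompose. For brevity the full series is decomposed at once,
# which lets test-period values influence the components.
components = decompose(series)
# Stage 2/3: forecast every component separately, then add the forecasts.
component_forecasts = [fit_predict(c, n_train) for c in components]
forecast = np.sum(component_forecasts, axis=0)
actual = series[n_train:]
```

Because the components add up to the original series, adding their forecasts gives a forecast on the scale of the original concentrations.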

4. Experiment

4.1. Data Source

Changes in urban air quality are the result of a combination of socioeconomic and natural factors, among which the main factors are industrial structure, energy consumption, transportation, population, economy, meteorology, and vegetation [42]. Both Tianshui in Gansu and Wuhan in Hubei are logistically well connected and have an important position as integrated transportation hubs, but they differ in other aspects. Tianshui is an important node city of the Northwest Silk Road Economic Belt and Guanzhong Plain City Cluster, while Wuhan is the economic and geographical center of China, south of the Qinling and Huai Rivers. This paper compares the air pollutants in the central economic city of Northwest China and the central city of Central China in order to be the most representative. The descriptive data of the two cities, such as maximum value, minimum value, mean value, and standard deviation, are used to make empirical predictions, respectively.

The experimental data included the PM_2.5 and PM₁₀ data of Wuhan, Hubei and Tianshui, Gansu province, sourced from four datasets of daily concentrations of both PM_2.5 and PM₁₀ from 1 January 2017 to 30 September 2022 (https://www.aqistudy.cn/historydata/, accessed on 1 October 2022). The two study areas and their information are shown in Figure 6. After sorting (directly removing the data with concentration values greater than 400 μg/m³), there were 2057 valid data in total (as shown in Figure 7). The dataset was split into a training set consisting of the first 70% of the sequence data (1439) and a testing set consisting of the remaining 30% (616). Table 1 presents the descriptive statistics of the four datasets.

4.2. Parameter Setting

The BiLSTM model can set the time step of the input parameters, which then has a certain influence on the prediction results. Therefore, it is necessary to select the optimal time step for the training set to obtain the best prediction results. To select the optimal time step, the autocorrelation function (ACF) and partial autocorrelation function (PACF) were introduced to determine the time step of the training set [43].

The ACF and PACF results of PM_2.5 in Tianshui (training set) are shown in Figure 8. At a lag of 3, the ACF value is 0.6625 and the PACF value approaches zero. Repeated validation and comparison confirmed that the best prediction results are obtained with a time step of 3; thus, this value is used as the time step for the experiment [44].
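For readers who want to reproduce this kind of lag analysis, the sketch below computes sample ACF and PACF values with NumPy. The estimators are assumptions (the paper does not say how its ACF and PACF were computed): the PACF at lag k is taken as the last coefficient of a least-squares autoregression of order k, and the series is a synthetic AR(1) process rather than the Tianshui data.

```python
import numpy as np

def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = series - series.mean()
    denom = np.sum(x ** 2)
    return np.array([np.sum(x[lag:] * x[:x.size - lag]) / denom
                     for lag in range(max_lag + 1)])

def pacf(series, max_lag):
    """Partial autocorrelation via successive least-squares AR(k) fits."""
    x = series - series.mean()
    values = [1.0]
    for k in range(1, max_lag + 1):
        # Columns hold x_{t-1}, ..., x_{t-k}; the target is x_t.
        lagged = np.column_stack([x[k - j:x.size - j] for j in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(lagged, x[k:], rcond=None)
        values.append(coef[-1])             # coefficient on the longest lag
    return np.array(values)

rng = np.random.default_rng(0)
n = 4000
noise = rng.normal(size=n)
series = np.empty(n)
series[0] = noise[0]
for k in range(1, n):
    # Hypothetical AR(1) process with coefficient 0.8.
    series[k] = 0.8 * series[k - 1] + noise[k]

rho = acf(series, max_lag=5)
phi = pacf(series, max_lag=5)
```

For this AR(1) example the ACF decays gradually while the PACF is close to zero beyond lag 1, which is the kind of cut-off used above to choose a time step.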

When training the BiLSTM network, the optimizer is used to reduce the value of the loss function of the model and find the optimal solution to obtain the appropriate model parameters. Adaptive moment estimation (adam), stochastic gradient descent with momentum (sgdm), and root mean square propagation (rmsprop) are widely used optimization algorithms in deep learning. Figure 9 illustrates how the loss value evolves during training for the different optimizers. The traditional sgdm optimizer performs poorly at the beginning and end of the iteration. As one of the most popular optimizers in neural networks, the adam optimizer is widely used because of its powerful adaptive tuning capability. In the current experiments, the rmsprop optimizer converged relatively fast in the early stage, but after 70 rounds of training the adam optimizer converged faster and performed well, so adam was chosen.

The model training process was optimized using the adam algorithm. For comparison, this paper uses kernel extreme learning machine (KELM), gated recurrent unit (GRU), bidirectional gated recurrent unit (BiGRU), and LSTM. The experimental platform was MATLAB 2020, running on a Windows 10 system (Microsoft, Redmond, WA, USA) with an Intel Core i5 processor at 1.80 GHz and 8 GB of memory.

4.3. Comparison of the Proposed Predictor with Other Prediction Methods

4.3.1. Evaluation Indexes

In several recent studies, many model evaluation criteria have been widely used to validate the performance of predictive models. In this study, mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), correlation index (R²), relative accuracy (RA), and the Theil inequality coefficient (TIC) were used to measure the superiority between models.

$\mathrm{MAE} = \frac{1}{n} \sum_{i = 1}^{n} | y_{i} - \hat{y}_{i} |$

(22)

$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}$

(23)

$\mathrm{MAPE} = \frac{1}{n} \sum_{i = 1}^{n} \left| \frac{y_{i} - \hat{y}_{i}}{y_{i}} \right| \times 100\%$

(24)

$R^{2} = 1 - \frac{\sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i = 1}^{n} (y_{i} - \bar{y})^{2}}$

(25)

$\mathrm{RA} = 1 - \frac{| y_{i} - \hat{y}_{i} |}{y_{i}}$

(26)

$\mathrm{TIC} = \frac{\sqrt{\frac{1}{n} \sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}}{\sqrt{\frac{1}{n} \sum_{i = 1}^{n} y_{i}^{2}} + \sqrt{\frac{1}{n} \sum_{i = 1}^{n} \hat{y}_{i}^{2}}}$

(27)

where $y_{i}$ and $\hat{y}_{i}$ are the observed and predicted values, respectively, and $\bar{y}$ is the average value of the $n$ observed samples.
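The indicators in Equations (22)–(27) translate directly into NumPy. The function name `evaluate` and the toy numbers are assumptions for illustration. Equation (26) defines RA per sample, so reporting its mean as a single score is also an assumption.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute the indicators of Equations (22)-(27) for two equal-length arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    return {
        "MAE": np.mean(np.abs(err)),                                  # Eq. (22)
        "RMSE": rmse,                                                 # Eq. (23)
        "MAPE": np.mean(np.abs(err / y_true)) * 100,                  # Eq. (24), in percent
        "R2": 1 - np.sum(err ** 2)
              / np.sum((y_true - y_true.mean()) ** 2),                # Eq. (25)
        # Eq. (26) is per sample; averaging it is an assumed convention.
        "RA": np.mean(1 - np.abs(err) / y_true),
        "TIC": rmse / (np.sqrt(np.mean(y_true ** 2))
                       + np.sqrt(np.mean(y_pred ** 2))),              # Eq. (27)
    }

# Toy values for illustration only (not from the paper's datasets).
scores = evaluate([10, 20, 30, 40], [12, 18, 33, 39])
```

MAPE and RA divide by the observed value, so both are undefined when an observation equals zero.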

4.3.2. Prediction Result of the Proposed Model

Considering previous findings in the literature that highlight the limitations of a single predictive model [45], single prediction algorithms are not listed in this paper. To assess the effectiveness of the hybrid model developed in this study, multiple datasets of air pollutant concentrations were collected, and several comparative models were constructed. Two experiments were conducted (Experiment I: PM_2.5 prediction; Experiment II: PM₁₀ prediction), with EMD, CEEMDAN, and VMD used as comparison decomposition methods to verify the effectiveness of the proposed model.

Figure 10 illustrates the prediction results of PM_2.5 and PM₁₀ concentrations in Tianshui and Wuhan using different models. The four subplots show that the predictions obtained with the method proposed in this paper are closer to the actual values. Several statistical indicators are employed in this study to evaluate the model’s performance.

The prediction accuracy of the hybrid BiLSTM models varies with the signal processing technique used, as demonstrated in Table 2 and Table 3. For the Tianshui PM₁₀ concentration, the RMSE values of the EMD-BiLSTM, CEEMDAN-BiLSTM, VMD-BiLSTM, SVMD-BiLSTM and SVMD-BO-BiLSTM models were 10.8604, 9.1445, 2.8760, 2.7702, and 2.6950, respectively. These differences may be due to the greater contribution of SVMD to the prediction performance of the proposed hybrid model compared with the hybrid BiLSTM models based on EMD, CEEMDAN, and VMD, which indicates the necessity and importance of using data decomposition techniques. For the Wuhan PM_2.5 concentration, the MAE values of the EMD-BiLSTM, CEEMDAN-BiLSTM, VMD-BiLSTM, SVMD-BiLSTM and SVMD-BO-BiLSTM models were 5.0676, 1.3756, 0.6975, 0.4551, and 0.3846, respectively. In addition, the prediction errors of the same hybrid model may differ significantly across PM_2.5 and PM₁₀ time series. Taking the SVMD-BO-BiLSTM model as an example, the MAE values for PM₁₀ in Tianshui and Wuhan are 0.8360 and 0.4677, respectively. For PM_2.5 in Tianshui, however, the MAE of the proposed model is 0.4414, while that of the SVMD-BiLSTM model is 0.4134, which is slightly lower. Such a difference is within an acceptable range for different datasets in a complex environment. Across the four datasets, the proposed method remains robust and gives relatively accurate predictions.
Figure 11 shows the statistical indicators of PM_2.5 and PM₁₀ for the two cities. It can be seen that the smaller values of these indicators indicate a more accurate prediction. The superiority of the proposed method in this paper is also more clearly demonstrated.
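
The indicators reported in Table 2 and Table 3 can be computed along the following lines. This is a minimal sketch that assumes the standard textbook definitions of MAE, RMSE, MAPE, R², and the Theil inequality coefficient (TIC); the exact formulas used in this study are those given in Section 4.3.1, and RA is omitted because its definition is not restated here. The function name `evaluate` and the sample values are hypothetical.

```python
import math

def evaluate(y_true, y_pred):
    """Return common forecast error indicators (standard definitions assumed)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE in percent; assumes no true value is zero.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    # Theil inequality coefficient: 0 means a perfect forecast, 1 the worst case.
    tic = rmse / (
        math.sqrt(sum(t * t for t in y_true) / n)
        + math.sqrt(sum(p * p for p in y_pred) / n)
    )
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2, "TIC": tic}

# Hypothetical daily concentrations in μg/m³, for illustration only.
observed = [30.0, 42.0, 55.0, 38.0]
predicted = [31.0, 40.0, 56.0, 38.0]
scores = evaluate(observed, predicted)
```

All five indicators except R² are smaller-is-better, which is why lower bars in a chart of MAE, RMSE, MAPE, and TIC correspond to a more accurate model.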

4.3.3. Comparison of Forecasting Results

To highlight the advantages of the proposed BO-BiLSTM model, several classical machine learning algorithms are compared using the same decomposition method. As shown in Table 4 and Table 5, the MAE of SVMD-KELM exceeds 2.5 μg/m³ on every dataset, ranging from 2.5 to 6.8 μg/m³, whereas the MAE of SVMD-BO-BiLSTM ranges from 0.38 to 0.84 μg/m³; this clearly shows the superiority of the proposed model. The MAPE of SVMD-LSTM ranges from 1.6% to 9.6%, whereas that of SVMD-BO-BiLSTM ranges from 0.80% to 2.10%, which shows the robustness of the proposed model across different data.
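
The ranges quoted above can be reproduced from the MAE and MAPE columns of Table 4 and Table 5. The short check below copies those table entries; the dictionary layout and the helper name `value_range` are illustrative only.

```python
# MAE (μg/m³) copied from Table 4 and Table 5, in the order:
# Tianshui PM2.5, Wuhan PM2.5, Tianshui PM10, Wuhan PM10.
mae = {
    "SVMD-KELM": [2.5244, 4.3520, 6.7510, 6.5128],
    "SVMD-BO-BiLSTM": [0.4414, 0.3846, 0.8360, 0.4677],
}
# MAPE (%) copied from the same tables, same order.
mape = {
    "SVMD-LSTM": [9.60, 1.91, 3.51, 1.55],
    "SVMD-BO-BiLSTM": [2.08, 1.05, 1.45, 0.82],
}

def value_range(values):
    """Return the (minimum, maximum) of a list of metric values."""
    return min(values), max(values)

mae_ranges = {model: value_range(v) for model, v in mae.items()}
mape_ranges = {model: value_range(v) for model, v in mape.items()}

for model, (low, high) in mae_ranges.items():
    print(f"{model}: MAE from {low:.4f} to {high:.4f} μg/m³")
for model, (low, high) in mape_ranges.items():
    print(f"{model}: MAPE from {low:.2f}% to {high:.2f}%")
```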

To assess the advantage of BiLSTM in capturing time series behavior, it is compared with four other methods, KELM, GRU, BiGRU, and LSTM; the prediction RMSE for each dataset is shown in Figure 12. The errors of the model proposed in this paper are relatively small, and its prediction accuracy is correspondingly high.

Figure 13 displays box plots of the absolute prediction errors and the relative error percentages for the nine models. The absolute prediction error, i.e., the absolute difference between the true and predicted values, is closest to zero for the SVMD-BO-BiLSTM model. Furthermore, this model exhibits a smaller range of variation and fewer outliers than the other models used for comparison. These findings indicate that the model proposed in this study produces large errors less frequently and performs better.
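
The quantities summarized by a box plot of absolute errors can be sketched as follows. The example assumes the common Tukey convention (whiskers at 1.5 times the interquartile range, with points beyond them drawn as outliers); the plotting convention actually used for Figure 13 is not stated here, and the function name and error values are hypothetical.

```python
import statistics

def box_summary(abs_errors):
    """Quartiles, interquartile range, and Tukey outliers of absolute errors."""
    q1, median, q3 = statistics.quantiles(abs_errors, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [e for e in abs_errors if e < lower or e > upper]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}

# Hypothetical absolute errors |y_true - y_pred| in μg/m³.
abs_errors = [0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.6, 3.0]
summary = box_summary(abs_errors)
```

A model with a box close to zero, a small interquartile range, and an empty outlier list corresponds to the behavior described above for SVMD-BO-BiLSTM.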

To visually describe the relationship between standard deviation, RMSE, and correlation coefficient, Taylor diagrams are employed in this paper to illustrate the strengths of the proposed prediction model. In the Taylor diagram (Figure 14), point J (the proposed method) lies closest to point A (the original data), indicating that point J outperforms the others in terms of these performance metrics. In other words, the SVMD-BO-BiLSTM model has the best prediction performance.
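
A Taylor diagram can place the three statistics on one plot because they are linked by a law-of-cosines relation: the squared centered RMSE equals σ_f² + σ_r² − 2·σ_f·σ_r·R, where σ_f and σ_r are the standard deviations of the forecast and reference series and R is their correlation coefficient. The sketch below checks this identity numerically on hypothetical data. It applies to the centered (bias-removed) RMSE, and whether Figure 14 uses that variant is an assumption.

```python
import math

def taylor_statistics(reference, forecast):
    """Population standard deviations, correlation, and centered RMSE."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_f = sum(forecast) / n
    dev_r = [x - mean_r for x in reference]
    dev_f = [x - mean_f for x in forecast]
    sigma_r = math.sqrt(sum(d * d for d in dev_r) / n)
    sigma_f = math.sqrt(sum(d * d for d in dev_f) / n)
    corr = sum(a * b for a, b in zip(dev_r, dev_f)) / (n * sigma_r * sigma_f)
    # Centered RMSE: RMSE after removing each series' mean.
    crmse = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_r, dev_f)) / n)
    return sigma_r, sigma_f, corr, crmse

# Hypothetical observed and predicted concentrations (μg/m³).
reference = [30.0, 42.0, 55.0, 38.0, 47.0]
forecast = [32.0, 40.0, 53.0, 39.0, 49.0]
sigma_r, sigma_f, corr, crmse = taylor_statistics(reference, forecast)

# Law-of-cosines identity underlying the diagram.
identity_rhs = sigma_f**2 + sigma_r**2 - 2.0 * sigma_f * sigma_r * corr
```

Because of this identity, a model point that sits near the reference point has a standard deviation similar to that of the observations, a correlation close to 1, and a small centered RMSE at the same time.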

5. Discussion

The SVMD-BO-BiLSTM model proposed in this paper has several advantages over the comparison models, which lead to a marked improvement in the prediction accuracy of air pollutants. Its advantages and limitations are summarized as follows.

The SVMD algorithm plays a pivotal role in data preprocessing, which improves PM_2.5 and PM₁₀ prediction accuracy in both Tianshui, Gansu, and Wuhan, Hubei. As discussed above, the hybrid model has a more desirable prediction effect than the single-factor model, so only the prediction effect of the hybrid model is discussed. In the prediction of PM_2.5 in Wuhan, SVMD-BiLSTM improves R² by about 0.0439, 0.0030, and 0.0021 and decreases the RMSE by 3.8592, 0.6869, and 0.3817 compared with EMD-BiLSTM, CEEMDAN-BiLSTM, and VMD-BiLSTM, respectively (Table 2); moreover, the SVMD-BO-BiLSTM model proposed in this paper improves R² by a further 0.0034 over SVMD-BiLSTM and reduces the RMSE by 0.5953. SVMD can thus effectively improve prediction accuracy: the algorithm performs adaptive time-frequency analysis and detects the local transient characteristics of the signal. Its biggest advantage as a new signal processing algorithm is that the number of decomposed modes can be customized and that it requires a low number of operations.

The hyperparameter optimization of BO-BiLSTM can effectively improve the prediction accuracy of the model. For PM_2.5 in Wuhan (Table 4), compared with the standard LSTM (SVMD-LSTM), the SVMD-BiLSTM model increases R² by about 0.0045, reduces RMSE by about 0.5353, reduces MAE by about 0.2699, and reduces MAPE by about 0.63 percentage points. The SVMD-BO-BiLSTM model proposed in this paper achieves the best values of these metrics: relative to SVMD-BiLSTM, it increases R² by about 0.0034 and reduces RMSE by about 0.5953, MAE by about 0.0705, and MAPE by about 0.23 percentage points. This supports the validity, reliability, and stability of the proposed model.
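
The improvements quoted in this section follow directly from the Wuhan PM_2.5 rows of Table 2 and Table 4. The following check recomputes them; the values are copied from those tables, and the helper name `delta` is illustrative.

```python
# Wuhan PM2.5 metrics copied from Table 2 and Table 4:
# (MAE, RMSE, MAPE in %, R²).
wuhan_pm25 = {
    "EMD-BiLSTM": (5.0676, 5.4234, 22.47, 0.9506),
    "CEEMDAN-BiLSTM": (1.3756, 2.2511, 5.14, 0.9915),
    "VMD-BiLSTM": (0.6975, 1.9459, 2.09, 0.9924),
    "SVMD-LSTM": (0.7250, 2.0995, 1.91, 0.9900),
    "SVMD-BiLSTM": (0.4551, 1.5642, 1.28, 0.9945),
    "SVMD-BO-BiLSTM": (0.3846, 0.9689, 1.05, 0.9979),
}

def delta(baseline, improved):
    """Error reductions (MAE, RMSE, MAPE) and R² gain of `improved` over `baseline`."""
    b_mae, b_rmse, b_mape, b_r2 = wuhan_pm25[baseline]
    i_mae, i_rmse, i_mape, i_r2 = wuhan_pm25[improved]
    return {
        "MAE": round(b_mae - i_mae, 4),
        "RMSE": round(b_rmse - i_rmse, 4),
        "MAPE": round(b_mape - i_mape, 2),
        "R2": round(i_r2 - b_r2, 4),
    }

for baseline in ("EMD-BiLSTM", "CEEMDAN-BiLSTM", "VMD-BiLSTM", "SVMD-LSTM"):
    print(baseline, "->", "SVMD-BiLSTM", delta(baseline, "SVMD-BiLSTM"))
print("SVMD-BiLSTM -> SVMD-BO-BiLSTM", delta("SVMD-BiLSTM", "SVMD-BO-BiLSTM"))
```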

Of course, in addition to the PM_2.5 and PM₁₀ concentration values, weather factors (such as temperature, humidity, wind speed, and precipitation) and other pollutant indicators (e.g., CO, NO_x, and SO₂) may also affect air pollutant concentrations. Therefore, the next step will be to add data on weather factors and other pollutant indicators to improve the accuracy of air quality prediction. The SVMD-BO-BiLSTM air quality prediction model proposed in this paper combines decomposition techniques, feature analysis, and optimization algorithms. The proposed model exhibits superior performance compared to both single decomposition-based models and decomposition-based hybrid models. Furthermore, its potential applications can be extended to other domains such as wind speed prediction, rainfall prediction, power load prediction, and financial risk assessment.

6. Conclusions

As urbanization, industrialization, and energy consumption continue to accelerate, air pollution has emerged as a pressing global issue. The detrimental effects of high levels of air pollution on the environment and human health are of growing concern. Hence, the development of accurate and dependable air pollutant prediction models holds immense significance. These models can aid in mitigating air pollution and providing valuable guidance for daily activities.

To tackle these challenges, a novel hybrid prediction model is introduced to enhance prediction accuracy and stability. The model employs a robust data preprocessing technique that decomposes the original time series into modes ordered by frequency, from low to high. This study proposes a semi-supervised model, SVMD-BO-BiLSTM, to predict air pollutant concentrations. The method, which combines SVMD and BiLSTM, requires only PM_2.5 and PM₁₀ time series datasets as input. As a result, the model structure is simpler than that of multi-factor prediction methods, yet it delivers accurate predictions of PM_2.5 and PM₁₀ concentrations. The key contributions of this study are summarized as follows.

(1): The SVMD-BO-BiLSTM model considers the impact of implicit time-invariant factors on the prediction results and requires minimal auxiliary data. As a result, it simplifies the model structure, reduces computational complexity, and achieves exceptional accuracy and stability in air quality prediction.
(2): Air quality time series data are processed as signal data using SVMD, allowing unsupervised feature learning to extract frequency features. This significantly enhances the accuracy of short-term trend prediction, particularly for sudden and abrupt changes in the data. Incorporating unsupervised feature learning strengthens the prediction capability of the supervised LSTM model, particularly for the high-frequency and irregular fluctuations observed in the time series. In the experimental case of this study, the BiLSTM model is more suitable than other prediction models, such as GRU, LSTM, and KELM, for predicting the SVMD components of PM_2.5 and PM₁₀ concentrations. The new hybrid model shows good spatio-temporal generalization and robustness for PM_2.5 and PM₁₀ concentration prediction at different times and in different regions. It can serve as an effective tool for predicting trends in concentration changes at monitoring points, which is of practical significance.
(3): Several evaluation indexes were used for model evaluation. The SVMD-BO-BiLSTM model demonstrates remarkable accuracy and stability in predicting PM_2.5 and PM₁₀ concentrations across various time scales.
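
The decompose, predict, and reconstruct structure behind contribution (2) can be sketched in a few lines. This is not the SVMD-BO-BiLSTM implementation: a moving-average split stands in for SVMD and a persistence forecast stands in for the BiLSTM predictor, purely to show how per-component forecasts are summed into the final prediction. All function names and the sample series are hypothetical.

```python
def moving_average(series, window):
    """Trailing moving average; early points average over the samples available."""
    smoothed = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        chunk = series[start : i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def decompose(series, window=3):
    """Stand-in for SVMD: split a series into a smooth part and a residual."""
    trend = moving_average(series, window)
    residual = [x - t for x, t in zip(series, trend)]
    return [trend, residual]  # the components sum back to the original series

def persistence_forecast(component):
    """Stand-in for the BiLSTM predictor: next value equals the last value."""
    return component[-1]

def predict_next(series):
    """Forecast each component separately, then sum the component forecasts."""
    components = decompose(series)
    return sum(persistence_forecast(c) for c in components)

# Hypothetical daily PM2.5 concentrations (μg/m³).
history = [30.0, 34.0, 41.0, 38.0, 45.0]
components = decompose(history)
reconstructed = [sum(values) for values in zip(*components)]
next_day = predict_next(history)
```

With persistence on every component, the summed forecast simply returns the last observation; any benefit of the approach comes from replacing the stand-ins with a real decomposition and a learned predictor for each mode.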

Nevertheless, it is important to acknowledge the limitations of this study. Firstly, it considers only two factors, namely PM_2.5 and PM₁₀, and neglects other influential factors such as SO₂, NO_x, and O₃ concentrations and meteorological variables. Secondly, the predictions were limited to single-step forecasting, without considering multi-step predictions. Additionally, due to the scope of this paper, other deep learning techniques and multi-objective optimization algorithms were not explored.
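
Single-step forecasting, as used in this study, amounts to training on input windows that are each paired with the next observation, whereas a multi-step setup pairs each window with several future values. A minimal sketch of that sample construction follows; the function name, the window length of 3, and the series are illustrative assumptions and do not reflect the settings used in the experiments.

```python
def make_windows(series, lookback, horizon=1):
    """Build (input window, target) pairs for supervised forecasting.

    horizon=1 gives single-step targets; horizon>1 gives multi-step targets.
    """
    samples = []
    for start in range(len(series) - lookback - horizon + 1):
        inputs = series[start : start + lookback]
        targets = series[start + lookback : start + lookback + horizon]
        samples.append((inputs, targets))
    return samples

# Hypothetical daily concentrations (μg/m³).
series = [30.0, 34.0, 41.0, 38.0, 45.0, 43.0]
single_step = make_windows(series, lookback=3)
multi_step = make_windows(series, lookback=3, horizon=2)
```

Extending the study to multi-step prediction would change only the target side of each pair, at the cost of fewer training samples for a series of the same length.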

Author Contributions

Z.H.: Data curation, Methodology, Conceptualization, Writing—Original Draft. L.L.: Supervision, Funding acquisition, Writing—review and editing. G.D.: Data curation, Formal analysis, Validation, Writing—Original Draft. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Hubei Key Laboratory of Blasting Engineering Foundation, under grant no. HKLBEF202009.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figure 1. The SVMD method. (a) Each component obtained via SVMD. (b) Each component obtained via SVMD.

Figure 2. The signal LSTM cell structure.

Figure 3. The BiLSTM network structure.

Figure 4. The BO-BiLSTM flow chart.

Figure 5. The SVMD-BO-BiLSTM model.

Figure 6. Infographic of the two study areas.

Figure 7. PM_2.5 and PM₁₀ concentration time series. (a) PM_2.5 concentration in Tianshui and Wuhan. (b) PM₁₀ concentration in Tianshui and Wuhan.

Figure 8. The ACF and PACF of the training set of Tianshui levels of PM_2.5.

Figure 9. Loss on the training set.

Figure 10. Comparison of the different models using 70% datasets. (a) Comparison of the different model prediction results with daily Tianshui PM_2.5 concentrations using 70% datasets. (b) Comparison of the different model prediction results with daily Wuhan PM_2.5 concentrations using 70% datasets. (c) Comparison of the different model prediction results with daily Tianshui PM₁₀ concentrations using 70% datasets. (d) Comparison of the different model prediction results with daily Wuhan PM₁₀ concentrations using 70% datasets.

Figure 11. Daily PM_2.5 and PM₁₀ forecast results index using various models. (a) TIC for different models. (b) MAE, RMSE, and MAPE for different models.

Figure 12. Daily PM_2.5 and PM₁₀ forecast results index using various models.

Figure 13. Box diagram of forecasting error absolute values (A: EMD-BiLSTM; B: CEEMDAN-BiLSTM; C: VMD-BiLSTM; D: SVMD-KELM; E: SVMD-GRU; F: SVMD-BiGRU; G: SVMD-LSTM; H: SVMD-BiLSTM; I: SVMD-BO-BiLSTM).

Figure 14. Taylor plot of forecasting value (A: Original point; B: EMD-BiLSTM; C: CEEMDAN-BiLSTM; D: VMD-BiLSTM; E: SVMD-KELM; F: SVMD-GRU; G: SVMD-BiGRU; H: SVMD-LSTM; I: SVMD-BiLSTM; J: Proposed Method).

Table 1. Descriptive statistics of air pollutants.

		Number	Minimum (μg/m³)	Maximum (μg/m³)	Average (μg/m³)	Standard Deviation (μg/m³)
PM_2.5	Tianshui	2057	4	148	29.80	20.67
PM_2.5	Wuhan	2057	4	216	42.31	28.80
PM₁₀	Tianshui	2057	6	395	62.50	46.27
PM₁₀	Wuhan	2057	3	289	68.14	41.28

Table 2. Evaluation index of PM_2.5 based on LSTM combined with different decomposition methods in Tianshui and Wuhan.

Area	Models	MAE (μg/m³)	RMSE (μg/m³)	MAPE (%)	R²	RA	TIC
Tianshui	EMD-BiLSTM	1.0694	1.9381	6.11	0.9892	0.9669	0.1096
	CEEMDAN-BiLSTM	1.6332	2.4282	9.59	0.983	0.9519	0.1123
	VMD-BiLSTM	0.5551	1.179	2.79	0.9953	0.9806	0.0602
	SVMD-BiLSTM	0.4131	1.4278	1.29	0.9927	0.9800	0.0237
	SVMD-BO-BiLSTM	0.4414	1.0951	2.08	0.9957	0.9841	0.0183
Wuhan	EMD-BiLSTM	5.0676	5.4234	22.47	0.9506	0.9024	0.1309
	CEEMDAN-BiLSTM	1.3756	2.2511	5.14	0.9915	0.9663	0.1240
	VMD-BiLSTM	0.6975	1.9459	2.09	0.9924	0.9806	0.0747
	SVMD-BiLSTM	0.4551	1.5642	1.28	0.9945	0.9864	0.0259
	SVMD-BO-BiLSTM	0.3846	0.9689	1.05	0.9979	0.9878	0.0178

Table 3. Evaluation index of PM₁₀ based on LSTM combined with different decomposition methods in Tianshui and Wuhan.

Area	Models	MAE (μg/m³)	RMSE (μg/m³)	MAPE (%)	R²	RA	TIC
Tianshui	EMD-BiLSTM	8.4403	10.8604	19.92	0.9258	0.8760	0.1587
	CEEMDAN-BiLSTM	5.4991	9.1445	12.85	0.9474	0.9203	0.1616
	VMD-BiLSTM	1.4000	2.8760	4.15	0.9918	0.9762	0.0931
	SVMD-BiLSTM	0.8818	2.7702	7.88	0.9924	0.9814	0.0217
	SVMD-BO-BiLSTM	0.8360	2.6950	1.45	0.9936	0.9837	0.0227
Wuhan	EMD-BiLSTM	4.7955	5.3231	11.21	0.9735	0.9333	0.1262
	CEEMDAN-BiLSTM	2.2813	3.3890	5.18	0.9893	0.9666	0.1266
	VMD-BiLSTM	0.9735	2.6349	1.55	0.9898	0.9813	0.0738
	SVMD-BiLSTM	0.7912	1.6438	1.47	0.9967	0.9861	0.0209
	SVMD-BO-BiLSTM	0.4677	1.2177	0.82	0.9978	0.9915	0.0096

Table 4. Evaluation index of PM_2.5 based on SVMD combined with different prediction methods in Tianshui and Wuhan.

PM_2.5	Tianshui					Wuhan
PM_2.5	MAE (μg/m³)	RMSE (μg/m³)	MAPE (%)	R²	RA	MAE (μg/m³)	RMSE (μg/m³)	MAPE (%)	R²	RA
SVMD-KELM	2.5244	3.9106	10.15	0.9449	0.8996	4.3520	6.2202	14.56	0.9125	0.8803
SVMD-GRU	0.4822	1.7069	1.89	0.9895	0.9786	0.6286	1.5367	2.43	0.9947	0.9851
SVMD-BiGRU	0.7049	2.5878	2.22	0.9759	0.9646	1.3756	2.2511	5.14	0.9915	0.9663
SVMD-LSTM	1.6333	2.4282	9.60	0.9831	0.9520	0.7250	2.0995	1.91	0.9900	0.9771
SVMD-BiLSTM	0.4131	1.4278	1.29	0.9927	0.9800	0.4551	1.5642	1.28	0.9945	0.9864
Proposed Method	0.4414	1.0951	2.08	0.9957	0.9841	0.3846	0.9689	1.05	0.9979	0.9878

Table 5. Evaluation index of PM₁₀ based on SVMD combined with different prediction methods in Tianshui and Wuhan.

PM₁₀	Tianshui					Wuhan
PM₁₀	MAE (μg/m³)	RMSE (μg/m³)	MAPE (%)	R²	RA	MAE (μg/m³)	RMSE (μg/m³)	MAPE (%)	R²	RA
SVMD-KELM	6.7510	10.8193	34.28	0.8844	0.8793	6.5128	8.8460	12.08	0.8847	0.8896
SVMD-GRU	1.2227	3.0350	3.83	0.9909	0.9775	0.9672	1.6762	1.76	0.9959	0.9836
SVMD-BiGRU	0.9777	3.1997	6.01	0.9899	0.9815	1.0565	3.0781	1.69	0.9860	0.9799
SVMD-LSTM	2.0968	4.1838	3.51	0.9845	0.9590	0.9735	2.6349	1.55	0.9898	0.9813
SVMD-BiLSTM	0.8818	2.7702	7.88	0.9924	0.9814	0.7912	1.6438	1.47	0.9967	0.9861
Proposed Method	0.8360	2.6950	1.45	0.9936	0.9837	0.4677	1.2177	0.82	0.9978	0.9915

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
