Application of CNN-LSTM Algorithm for PM2.5 Concentration Forecasting in the Beijing-Tianjin-Hebei Metropolitan Area

Su, Yuxuan; Li, Junyu; Liu, Lilong; Guo, Xi; Huang, Liangke; Hu, Mingyun

doi:10.3390/atmos14091392


by Yuxuan Su ^1,2, Junyu Li ^1,2,*, Lilong Liu ^1,2, Xi Guo ^1,2, Liangke Huang ^1,2 and Mingyun Hu ^1,2

¹ College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541006, China

² Guangxi Key Laboratory of Spatial Information and Geomatics, Guilin 541006, China

* Author to whom correspondence should be addressed.

Atmosphere 2023, 14(9), 1392; https://doi.org/10.3390/atmos14091392

Submission received: 22 July 2023 / Revised: 25 August 2023 / Accepted: 1 September 2023 / Published: 3 September 2023

(This article belongs to the Special Issue How AI/ML Improve Our Understanding of the Magnetosphere-Ionosphere-Theromosphere-Troposphere?)


Abstract:

Prolonged exposure to high concentrations of suspended particulate matter (SPM), especially aerodynamic fine particulate matter that is ≤2.5 μm in diameter (PM_2.5), can cause serious harm to human health and life via the induction of respiratory diseases and lung cancer. Therefore, accurate prediction of PM_2.5 concentrations is important for human health management and governmental environmental management decisions. However, the time-series processing of PM_2.5 concentration based only on a single region and a special time period is less explanatory, and thus, the spatial-temporal applicability of the model is more restricted. To address this problem, this paper constructs a PM_2.5 concentration prediction optimization model based on Convolutional Neural Networks-Long Short-Term Memory (CNN-LSTM). Hourly data of atmospheric pollutants, meteorological parameters, and Precipitable Water Vapor (PWV) of 10 cities in the Beijing-Tianjin-Hebei metropolitan area during the period of 1–30 September 2021/2022 were used as the training set, and the PM_2.5 data of 1–7 October 2021/2022 were used for validation. The experimental results show that the CNN-LSTM model optimizes the average root mean square error (RMSE) by 25.52% and 14.30%, the average mean absolute error (MAE) by 26.23% and 15.01%, and the average mean absolute percentage error (MAPE) by 35.64% and 16.98%, as compared to the widely used Back Propagation Neural Network (BPNN) and Long Short-Term Memory (LSTM) models. In summary, the CNN-LSTM model is superior in terms of applicability and has the highest prediction accuracy in the Beijing-Tianjin-Hebei metropolitan area. The results of this study can provide a reference for the relevant departments in the Beijing-Tianjin-Hebei metropolitan area to predict PM_2.5 concentration and its trend in specific time periods.

Keywords:

PM_2.5; machine learning; Convolutional Neural Network; LSTM time series forecasting; Precipitable Water Vapor; Beijing-Tianjin-Hebei metropolitan area; holidays

1. Introduction

The rapid development of industrialization and the economy has made it difficult to solve the problem of air pollution in some cities in China, such as the Beijing-Tianjin-Hebei region (Beijing, Tianjin, and Hebei areas) [1,2,3]. In addition, prolonged exposure to high concentrations of suspended fine particulate matter in the air, especially fine particulate matter with a diameter of less than 2.5 μm (PM_2.5), can cause serious harm to human health. For example, PM_2.5 has been strongly associated with an increased risk of asthma, bronchitis and other chronic obstructive pulmonary diseases [4]. More alarmingly, prolonged exposure to such pollutants can lead to more serious health problems such as lung cancer and adversely affect children’s lung health and development [5]. The problems that air pollution poses to cities, however, are not limited to health. Economically, increased health costs and reduced labor productivity caused by air pollution have caused huge economic losses [6]. Environmentally, air contamination deteriorates the city’s ecological harmony, leading to diminished biodiversity and degradation of both soil and water quality [7]. A more latent concern is the potential exodus of residents due to consistently poor air quality; such demographic shifts could profoundly challenge the city’s social structure and economic equilibrium.

Given the significance of the year 2022 for China’s comprehensive construction of a modernized socialist country and the impact of the goal of carbon peak and carbon neutrality (dual-carbon), the Chinese government is highly concerned about controlling pollutant emissions in China’s core urban agglomerations. The National Day is an important peak period for travel. On this day, a large number of people gather in the Beijing-Tianjin-Hebei region in a short period of time because of the unique political, economic and cultural status of this region, which is followed by a sharp increase in carbon emissions of vehicles, leading to the occurrence of air pollution, and significant changes in PM_2.5 concentration [8].

Therefore, accurate forecasts of PM_2.5 concentrations in the Beijing-Tianjin-Hebei metropolitan area during National Day can provide a basis for the relevant authorities to formulate timely and effective policies to deal with pollution and secondary disasters. In addition, accurate forecasts of PM_2.5 concentrations during National Day will help the residents of this region to adjust their travel plans and take appropriate preventive measures to better protect their health.

Currently, there are two types of PM_2.5 prediction models: deterministic models and empirical models. Empirical models contain time series-based methods, regression-based methods, and machine learning-based methods [9]. Time series-based methods aim to reveal the potential relationship between historical and future values in the time series of PM_2.5 concentration. The Autoregressive Integrated Moving Average model (ARIMA), the more popular time series model, has been widely used to predict PM_2.5 concentration [10]. Zhang et al. [11] used the ARIMA model to predict Fuzhou City’s monthly PM_2.5 concentration values from August 2014 to July 2016, and explored its seasonal variation rules. Good prediction results were achieved, and it was found that the largest prediction errors occurred in months of winter and spring. Cheng et al. [12] combined wavelet analysis and the ARIMA model to decompose one-dimensional daily average PM_2.5 concentration data from five Chinese cities (Beijing, Taiyuan, Shanghai, Chengdu, and Guangzhou) into multidimensional information for short-term prediction, and its accuracy was improved compared with that of ARIMA model. However, the time series approach has limited interpretability because it relies heavily on historical data and does not comprehensively consider the effects of other factors, including air pollutants, meteorological variables, and other relevant environmental determinants.

The regression-based approach aims to explore the linear relationship among multiple variables and thus establish a model that overcomes the shortcomings of the time series model [9]. Zhao et al. [13] utilized the Multivariate Linear Regression (MLR) model to fit the linear relationship between PM_2.5 and seven predictors: three meteorological factors (wind speed, temperature, and relative humidity) and four air pollution factors (SO₂, NO₂, CO, and O₃). Models to predict PM_2.5 concentration in Beijing were constructed at the 2015 yearly scale, the spring scale, and the summer scale, with coefficients of determination (R²) of 0.766, 0.852, and 0.874, respectively. Upadhyay et al. [14] used an MLR model to predict the annual average PM_2.5 concentration in Indian regions from 2010 to 2012 with an R² of more than 0.9. However, regression-based methods mainly focus on the linear relationship between the variables, while ignoring the complex nonlinear relationships among various factors [15]. Therefore, they fail to solve the nonlinearity problem and cannot be used for long-term prediction [16].

In summary, although time series and regression methods play an important role in PM_2.5 concentration prediction, they have some limitations. In order to predict PM_2.5 concentrations more accurately, prediction models that can handle nonlinear relationships and integrate multiple factors are demanded.

Machine learning has a strong advantage in dealing with nonlinear problems and is gradually used to construct PM_2.5 concentration prediction models [17,18,19]. Support Vector Machines (SVM) is a machine learning classification algorithm based on statistical learning theory, which can deal with both linear and nonlinear data and has good generalization ability. The SVM model was used by Deters et al. [20] to analyze the pattern of PM_2.5 concentration changes in Ecuador’s capital city by combining it with meteorological parameters and daily average PM_2.5 concentrations. However, the method performed poorly on the mean squared error (MSE) and the mean absolute percentage error (MAPE) compared to Neural Networks (NN). To improve the prediction performance, Sun et al. [21] combined Principal Component Analysis (PCA) and Least Squares Support Vector Machine (LSSVM) to fit with PM_2.5 concentration and constructed a prediction model for daily PM_2.5 concentration in Baoding City, Hebei Province, China, whose accuracy is better than the LSSVM model. However, the SVM model requires a large amount of memory to process high-dimensional data and is prone to overfitting problems [22,23].

Another commonly used nonlinear model is the Artificial Neural Network (ANN), which has a variety of modeling structures. Given its nonlinear processing capability and highly complex characterization properties, the ANN can be more effective in capturing the complex and nonlinear relationships that may exist between the relevant factors and PM_2.5 concentrations [24]. He et al. [25] used an ANN model based on Bayesian regularization (BR), in combination with air pollution factors, to predict the monthly average PM_2.5 concentration in Liaocheng, China, and achieved good prediction results, with a root mean square error (RMSE) of 6.6 μg∙m⁻³ and a mean absolute error (MAE) of 4.6 μg∙m⁻³. However, the study did not consider other related factors such as meteorological factors. Maleki et al. [26] applied an ANN model to predict PM_2.5 at a one-hour resolution in Ahvaz, Iran; air pollutants and time-date information were used as inputs, together with five meteorological parameters: wind speed (WS), ambient temperature (T), dew point temperature (Td), rainfall amount (R), and atmospheric pressure (P). Their study also investigated the temporal distribution patterns of air pollutants in the city. The model yielded a correlation coefficient (R) of 0.87 and an RMSE of 59.9 μg∙m⁻³, suggesting satisfactory performance. However, no comparisons with traditional models were conducted.

Although PM_2.5 prediction models based on the ANN algorithm have achieved relatively good results, the shallow structure of the ANN prevents it from extracting all the hidden features of the correlation factors. Deep learning, an important branch of machine learning, has demonstrated excellent feature extraction ability and has made significant breakthroughs in many fields [27,28,29]. Hence, deep learning techniques may present a promising approach for predicting PM_2.5 concentrations. Xie [30] used a manifold learning method to optimize the Deep Belief Network (DBN) model and, with this optimized model, predicted daily PM_2.5 concentrations in Chongqing, China, achieving reductions in RMSE of 22.92% and 3.59% compared to the Back Propagation Neural Network (BPNN) and the original DBN model, respectively. Although this improved model demonstrates good prediction results, its adaptability to different regions and generalizability to specific periods are still not clearly demonstrated. Recurrent Neural Networks (RNNs) and their variant, Long Short-Term Memory (LSTM), have also been applied to the modeling and prediction of PM_2.5 concentration. In addition, complex hybrid deep learning models have come into use for PM_2.5 concentration prediction as the volume of data has increased. Liu et al. [31] constructed PM_2.5 concentration prediction models at different time scales based on the Self-Organizing-LSTM algorithm, with better accuracy than the MLR and ANN models. However, the best fit was found for the 1 h prediction, with R² reaching 0.99, and the worst for the 12 h prediction, with R² dropping to 0.918. This indicates that the effective prediction horizon of the model is short, which limits its applicability.
Deep learning-based PM_2.5 concentration prediction models (e.g., CNN and LSTM) are more advantageous than traditional models, but each has its limitations: the CNN is good at capturing spatial features and complex relationships but may ignore long-term time dependence, while the LSTM is superior at processing time series but is not responsive enough to short-time domain features. To maximize the advantages of both, the two can be combined to improve the prediction results. Qin et al. [32] constructed an hourly-scale CNN-LSTM model of PM_2.5 concentration by combining a Convolutional Neural Network (CNN) with an LSTM network. In Shanghai, the prediction results of the CNN-LSTM model were more in line with the actual values than those of the BPNN and the LSTM model, and its RMSE reached 14.3 μg∙m⁻³. Despite the relatively good predictions, the experiments were conducted on a single region and thus failed to adequately demonstrate the adaptability of the model in different regions. Huang and Kuo [33] constructed a CNN-LSTM prediction model in Beijing, China. The MAE and RMSE of this model are better than those of the SVM, MLP, and LSTM models, reaching 14.634 μg∙m⁻³ and 24.228 μg∙m⁻³, respectively, which shows that the PM_2.5 concentration prediction model based on deep learning techniques exceeds the performance of traditional models. Notwithstanding the model's excellent performance, it is based on only three input factors and a 1 h prediction in a specific region, and the factors that may affect its applicability in a wider range of scenarios are not explored. Although the accuracy of PM_2.5 concentration prediction models established by previous researchers has been significantly improved, further enhancing the prediction accuracy for PM_2.5 concentrations still holds research value for human health, urban ecological balance, and local government management. This is particularly true for China's political, economic, and cultural center, the Beijing-Tianjin-Hebei metropolitan area, as well as for the National Day, one of the most significant holidays of the year.

Given the intimate association between PM_2.5 and meteorological factors, and the fact that Precipitable Water Vapor (PWV) is a critical element in meteorological research, a significant relationship between these entities is to be expected. PWV, with its crucial role in weather systems and the hydrological cycle, exerts substantial influences on climate, weather, and the variation in air pollutant concentrations. Recent studies have shown that there is a high correlation between PWV, an important meteorological factor, and PM_2.5 concentration. Xin et al. [34] showed a high agreement between aerosol optical depth (AOD) and PM_2.5 at each station by ranging PWV concentrations at regional stations in China. Guo et al. [35,36] demonstrated the potential of PWV-assisted haze detection by analyzing the hour-by-hour serial correlation between PWV and PM_2.5, using data from a relatively continuous haze period without heavy precipitation in the Beijing area in the past three years. They also successfully predicted short-term changes in PM_2.5 concentration based on the MLP model in combination with meteorological parameters. The findings show that the stronger the correlation between PWV and PM_2.5 concentration is, the shorter the prediction time is, and the closer the predicted value of PM_2.5 concentration is to the true value. Liu [37] showed that there was a significant positive correlation between PWV and PM_2.5 concentration distribution in Northeast China, with larger PWV corresponding to lower PM_2.5 concentrations.

The above studies show that a certain degree of PM_2.5 concentration forecasting can be carried out based on PWV and air pollution information, meteorological factors, and other factors. However, most of the studies only forecast PM_2.5 concentrations in a single city, without considering the applicability of the model to different cities and specific time periods. Therefore, we focus on ten cities within the Beijing-Tianjin-Hebei metropolitan area, encompassing two municipalities directly under the Central Government (Beijing and Tianjin), and eight other cities (Shijiazhuang, Baoding, Qinhuangdao, Langfang, Cangzhou, Chengde, Zhangjiakou, and Tangshan). The hour-by-hour data of air pollutants, meteorological factors, and PWV in the region from 1 September to 30 September 2021/2022 are selected to construct an optimization model for PM_2.5 concentration prediction based on the CNN-LSTM model. Secondly, we use the new model for PM_2.5 concentration prediction from 1 October to 7 October 2021/2022, and then compare the accuracy with the models constructed using other classical methods to verify the accuracy and reliability of the new model. A reference for personal health management and governmental environmental management decisions is established.

2. Study Area, Data, and Correlation Analysis

2.1. Experimental Area

The Beijing-Tianjin-Hebei metropolitan area is one of the most important regions in China, not only in terms of economic, political, and cultural importance but also as a key area for air pollution control in China. In a nutshell, the region is strategically important and has a profound impact on China’s political, economic, and social development. Accurate forecasting and effective regulation of PM_2.5 concentrations are crucial for environmental protection, security, and stability in the region. This region mainly includes Beijing, Tianjin, Shijiazhuang, Baoding, Qinhuangdao, Langfang, Cangzhou, Chengde, Zhangjiakou, and Tangshan (these data are from the official website of the People’s Government of Beijing Municipality (Beijing.gov.cn, accessed on 29 August 2023)); the distribution of these cities is shown in Figure 1.

2.2. Experimental Data

Two main types of data were used in this study. Air quality data sourced from the China Environmental Monitoring General Station (www.cnemc.cn, accessed on 29 August 2023) were used for modeling and validation. The period is from 1 September to 7 October 2021/2022. The air quality data include PM_2.5, PM₁₀, SO₂, NO₂, O₃, and CO. Hourly meteorological data for the same period, sourced from the fifth-generation reanalysis dataset (ERA5) provided by the European Centre for Medium-Range Weather Forecasts (ECMWF) (https://cds.climate.copernicus.eu/cdsapp, accessed on 29 August 2023), were used for the modeling. These data include Boundary Layer Height (BLH), 10 m v-component of Wind (V10), 10 m u-component of Wind (U10), 2 m Temperature (T2m), Surface Pressure (P), Relative Humidity (RH) and Precipitable Water Vapor (PWV). To ensure data matching, we matched the ancillary data obtained from ERA5 to the corresponding city sites through the inverse distance weighting (IDW) method. Previous studies [31,33,38,39] mainly constructed models based on the data of the past 24–72 h and then predicted PM_2.5 concentrations 1–24 h ahead. In contrast, this study built its models on the hourly data of the month before the National Day and predicted PM_2.5 concentrations for the following 168 h. Moreover, since this study focuses on the unique National Day period, data from periods further away from the National Day were not used for modeling, to avoid interfering with the extraction of PM_2.5 data features during the National Day period.
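As an illustration of the IDW matching step, the following minimal sketch interpolates a gridded value to a city site. The function name, the power parameter of 2, the use of four surrounding grid nodes, the Euclidean distance in degrees, and all sample values are assumptions made for this example; the paper does not state these details.

```python
import math

def idw_interpolate(grid_points, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` (lat, lon).

    grid_points: list of (lat, lon, value) tuples from the reanalysis grid.
    Distances are plain Euclidean distances in degrees, a simplification
    that is only reasonable over short ranges.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for lat, lon, value in grid_points:
        dist = math.hypot(lat - target[0], lon - target[1])
        if dist == 0.0:
            # The site coincides with a grid node: use its value directly.
            return value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical PWV values (mm) at the four grid nodes around a site.
nodes = [
    (39.75, 116.25, 18.0),
    (39.75, 116.50, 20.0),
    (40.00, 116.25, 22.0),
    (40.00, 116.50, 24.0),
]
site_pwv = idw_interpolate(nodes, target=(39.875, 116.375))
print(round(site_pwv, 2))
```

Because the hypothetical site sits at the center of the four nodes, all weights are equal and the estimate reduces to the plain average of the node values; nodes closer to the site would otherwise dominate.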

2.3. Correlation Analysis of PM_2.5 with Various Factors

To determine the correlation between each factor and PM_2.5, the correlation analysis of PM_2.5 concentration values with atmospheric pollutants, meteorological factors, and ERA5-PWV data during the National Day of the year 2022 in 10 cities was performed using IBM SPSS Statistics 27 software. Their corresponding Spearman correlation coefficients were calculated, which are shown in Table 1 and Table 2.
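For readers without access to SPSS, the Spearman coefficient can be sketched in a few lines: it is the Pearson correlation of the rank vectors, with tied values given their average rank. The function names and the sample values below are hypothetical and are not taken from Table 1 or Table 2; this is an illustration, not the authors' procedure.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman coefficient = Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical hourly samples, not values from the paper.
pm25 = [35.0, 42.0, 58.0, 61.0, 47.0, 39.0]
no2 = [20.0, 26.0, 30.0, 41.0, 33.0, 22.0]
rho = spearman_rho(pm25, no2)
print(round(rho, 3))
```

A rank-based coefficient is a natural choice here because it only assumes a monotonic relationship between PM_2.5 and each factor, not a linear one.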

Table 1 shows that the PM_2.5 concentration values of 10 cities during the National Day in 2022 suggest some degree of correlation with air pollutants. All the correlation coefficients for Langfang, Shijiazhuang, Qinhuangdao, Zhangjiakou, Chengde, and Cangzhou are positive, implying that PM_2.5 in these cities exhibits a positive correlation with these air pollutants. This indicates that as PM_2.5 concentration increases, these air pollutants increase accordingly. PM_2.5 and SO₂ in Beijing are negatively correlated, while the correlation coefficients between PM_2.5 and O₃ in Tianjin, Baoding, and Tangshan are also negative, and this negative correlation is presumably due to different atmospheric chemical reactions: when the concentration of PM_2.5 rises, the increase in atmospheric particulate matter promotes the process of oxidizing SO₂ into sulfuric acid particulate matter, which leads to a decrease in the concentration of SO₂. Meanwhile, because of the property of absorption of fine particulate matter, PM_2.5 absorbs a portion of solar radiation, lowering the atmospheric temperature and ultraviolet radiation, thus reducing the rate of O₃ production.

Table 2 shows the positive correlations of RH, T2m, V10 and PM_2.5 in the 10 cities. This is probably due to the fact that when the relative humidity increases, the moisture content in the atmosphere increases, and PM_2.5 combines with water vapor more easily to become larger particles, which leads to an increase in the concentration of PM_2.5. In addition, the increase in temperature leads to higher air pressure, which suppresses the rise of pollutants, thus exacerbating the pollution level. Large-scale population movement leads to increased traffic pressure, especially during National Day. In this case, the increase in vehicle emissions may lead to higher levels of air pollution [40]. Moreover, due to the increased activities of people during the festive season, the emissions of domestic waste and other sources of pollution may also increase, which may further deteriorate air quality. Furthermore, the positive correlation between V10 and PM_2.5 may be due to the relatively large amount of pollutant emissions in the region, which leads to the limited diffusion of pollutants into the atmosphere; the changes in longitudinal winds do not have much effect on the degree of diffusion and dilution of pollutants; thus, longitudinal winds show a positive correlation with PM_2.5. Also, the negative correlation between ERA5-PWV and PM_2.5 concentration in Tianjin may be because the meteorological factors (e.g., temperature and humidity) have a greater effect on PM_2.5 concentration than the water vapor content has on it during that time period, resulting in a negative correlation between the two [41]. In the rest of the cities, ERA5-PWV showed a positive correlation with PM_2.5, which means that when the water vapor content in the air increases, the PM_2.5 concentration in these cities also increases. 
This positive correlation may be because atmospheric water vapor undergoes a process of hygroscopicity and condensation with PM_2.5 particles, thus increasing their concentration in the air [42]. Lastly, high water vapor content may lead to higher local temperatures and increased atmospheric stability, thus making the dispersion of PM_2.5 more difficult.

Additionally, BLH is negatively correlated with PM_2.5 concentration, except in four cities, namely Tianjin, Baoding, Tangshan, and Cangzhou. This phenomenon is perhaps influenced by the changes in near-surface air momentum caused by friction and perturbation of the horizontal wind field of the atmosphere, which is accompanied by a decrease in BLH and a weakening of air mixing, diffusion, and transmission, leading to an increase in PM_2.5 concentration [43]. In cities other than Tianjin, U10 is negatively correlated with PM_2.5 concentrations, possibly because lateral winds, carrying pollutants out of the city and introducing fresh air, reduce the accumulation of pollutants within the city [44]. However, in Tianjin, lateral winds, carrying pollutants to the center of the city, contribute to higher PM_2.5 concentrations. In cities other than Qinhuangdao, P is negatively correlated with PM_2.5 concentrations. This may be due to the fact that high atmospheric pressure usually increases atmospheric stability, thus limiting vertical mixing and dispersion of air pollutants and reducing the accumulation of pollutants in localized areas, which leads to a decrease in PM_2.5 concentrations. In Qinhuangdao, however, low pressure may enhance atmospheric stability due to specific meteorological conditions (e.g., inversion phenomenon), which prevents air pollutants from effective diffusion and dilution, and thus leads to an increase in PM_2.5 concentrations.

In summary, significant correlations exist among PM_2.5, air pollutants, meteorological factors, and ERA5-PWV. Hence, it is imperative to incorporate air quality data (PM_2.5, PM₁₀, SO₂, NO₂, O₃, and CO), meteorological factors (BLH, P, RH, T2m, U10, and V10), and ERA5-PWV as input variables when modeling PM_2.5 concentrations.

2.4. Variation Trend of PM_2.5 Concentration in 10 Cities

To study the trend of PM_2.5 concentration in 10 cities in the Beijing-Tianjin-Hebei metropolitan area before and after the National Day in 2022 (4 days before and after, totaling 15 days and 360 h), the PM_2.5 concentration data in the region were integrated into a continuous time series and compared with the trend during the same period in 2021, as shown in Figure 2.

Figure 2 indicates that, generally, the PM_2.5 concentration in the region was notably higher from 28 September to 11 October 2022, compared to the same period in 2021. In 2022, an increase in PM_2.5 concentration occurred from late September to early October, peaking around the morning hours of 27 and 28 September; the concentration experienced fluctuations, first decreasing and then rebounding from 1 to 7 October, with a peak on 6 October. Compared to 2021, the PM_2.5 concentration exhibited a low-to-high trend during this period, particularly peaking from 3 to 4 October, and the concentration in early October was significantly higher than that in late September. Notably, there were significant disparities between daytime and nighttime PM_2.5 concentrations over the two-year span, with higher levels typically observed during the day. Hence, we conclude that the trends of PM_2.5 concentration during the same period across different years are not consistent. The necessity to establish mid-to-long-term prediction models for PM_2.5 concentration is highlighted.

Of particular note, during the National Day in 2022, Beijing has the highest PM_2.5 concentration among the 10 cities in the Beijing-Tianjin-Hebei metropolitan area. Specifically, the highest PM_2.5 concentration in Beijing during the National Day period is 46 μg·m⁻³ in 2021, while it increases to 111 μg·m⁻³ in 2022, suggesting that the air pollution during the National Day period in 2022 is more severe than that in 2021. This is attributed to Beijing’s popularity as a tourist city in China and the sharp increase in population density and traffic during the National Day in 2022, as a result of adjustments in epidemic prevention and control policies, the resurgence of tourism, and the accelerated movement of people. In addition, Beijing is located in the North China Plain and is surrounded by mountain ranges, making it easier for pollutants to accumulate in the area. The combined effect of these two reasons has led to a relatively severe level of air pollution in Beijing in 2022 compared to 2021 and other regions. This suggests that there is still a need to forecast and monitor haze-related indices.

3. Methods

3.1. Convolutional Neural Network (CNN)

The CNN is a deep neural network model [45]. Its distinguishing features are local receptive fields and weight sharing. A local receptive field means that each neuron in a layer is connected only to the neurons in a corresponding neighborhood of its input layer. This local connectivity allows the network to better capture local features, improving the sensitivity and effectiveness of the model. Through weight sharing, CNN models drastically reduce the number of training parameters, lowering the complexity of the model and reducing the risk of overfitting. This is because weight sharing lets the neurons in the network use the same set of weights and thus play the same role in feature extraction at different locations. As a result, the number of samples required to train the model is also relatively small. The 2D-CNN model deals directly with 2D matrices through weight sharing and convolution operations, avoiding the complex feature extraction and data reconstruction processes in traditional pattern recognition algorithms [46]. This direct processing of 2D matrices greatly simplifies the computational process of the model and improves operational efficiency and accuracy.
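The effect of local connectivity and weight sharing can be seen in a minimal sketch of a 2D convolution without padding and with stride 1. The input layout (rows as hours, columns as factors), the kernel values, and the function name are assumptions chosen for illustration only.

```python
def conv2d_valid(matrix, kernel):
    """Slide one shared kernel over a 2D input (no padding, stride 1)."""
    rows, cols = len(matrix), len(matrix[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for r in range(rows - k_rows + 1):
        out_row = []
        for c in range(cols - k_cols + 1):
            # Each output value sees only a k_rows x k_cols neighbourhood
            # and reuses the same kernel weights as every other position.
            total = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += matrix[r + i][c + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output

# Hypothetical 4 x 3 input (rows = hours, columns = factors) and 2 x 2 kernel.
window = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
]
kernel = [[1.0, 0.0], [0.0, -1.0]]
feature_map = conv2d_valid(window, kernel)

# Trainable weights: one shared kernel versus a fully connected mapping
# from all 12 inputs to all 6 outputs (biases ignored).
shared_params = len(kernel) * len(kernel[0])
dense_params = (len(window) * len(window[0])) * (len(feature_map) * len(feature_map[0]))
print(feature_map, shared_params, dense_params)
```

In this toy case the shared kernel needs 4 weights where a fully connected layer of the same input and output size would need 72, which is the parameter saving described above.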

3.2. LSTM Model

RNNs have achieved better results in time series prediction. However, one of the main problems faced by RNNs is the difficulty in modeling long-term dependencies, which is due to the gradient vanishing or explosion problem [47]. The LSTM network structure was introduced to solve this problem. LSTM networks are able to store and process long-term dependencies efficiently by introducing a new structure, the memory unit. The memory unit consists of three core components: input gates, forget gates, and output gates. Input gates are used to control the writing of external information to the memory cell, while forget gates and output gates determine whether information is retained or released at each time step. Figure 3 shows the overall structure of the LSTM model.

f_{t} = \sigma(W_{f} \times [h_{t-1}, x_{t}] + b_{f})    (1)

i_{t} = \sigma(W_{i} \times [h_{t-1}, x_{t}] + b_{i})    (2)

\tilde{C}_{t} = \tanh(W_{c} \times [h_{t-1}, x_{t}] + b_{c})    (3)

C_{t} = f_{t} \times C_{t-1} + i_{t} \times \tilde{C}_{t}    (4)

o_{t} = \sigma(W_{o} \times [h_{t-1}, x_{t}] + b_{o})    (5)

h_{t} = o_{t} \times \tanh(C_{t})    (6)

where W and b denote the weights and bias vectors, respectively; x_{t}, h_{t} and C_{t} denote the input, output, and memory cell at moment t, respectively; h_{t-1} and C_{t-1} denote the output and memory cell at moment t-1, respectively; and i_{t}, o_{t} and f_{t} represent the input, output, and forget gates, respectively. Among them, \sigma(\cdot) is the sigmoid function, which can be represented using Equation (7); \tanh(\cdot) is the tanh function, which can be represented with Equation (8).

\sigma(x) = \frac{1}{1 + e^{-x}}    (7)

\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}    (8)
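To make the flow of Equations (1)–(6) concrete, the following minimal sketch performs a single LSTM time step in plain Python. The parameter dictionary layout, the toy sizes (2 hidden units, 3 input features), and the all-zero weights are hypothetical choices that make the result easy to check by hand; they are not the settings used in this study.

```python
import math

def sigmoid(x):
    """Equation (7)."""
    return 1.0 / (1.0 + math.exp(-x))

def affine(weights, bias, vector):
    """Compute W x vector + b for a matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Equations (1)-(6)."""
    concat = h_prev + x_t  # [h_{t-1}, x_t] as one concatenated vector
    f_t = [sigmoid(z) for z in affine(params["W_f"], params["b_f"], concat)]  # (1)
    i_t = [sigmoid(z) for z in affine(params["W_i"], params["b_i"], concat)]  # (2)
    c_tilde = [math.tanh(z)
               for z in affine(params["W_c"], params["b_c"], concat)]         # (3)
    c_t = [f * c + i * g
           for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]                  # (4)
    o_t = [sigmoid(z) for z in affine(params["W_o"], params["b_o"], concat)]  # (5)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]                        # (6)
    return h_t, c_t

# Toy setup (assumption): zero weights and biases, so every gate equals 0.5
# and the candidate memory is 0.
hidden, features = 2, 3
zeros_w = [[0.0] * (hidden + features) for _ in range(hidden)]
zeros_b = [0.0] * hidden
params = {name: zeros_w for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: zeros_b for name in ("b_f", "b_i", "b_c", "b_o")})

h_t, c_t = lstm_step([0.5, -1.0, 2.0], [0.0, 0.0], [1.0, -1.0], params)
print(h_t, c_t)
```

With these zero parameters the forget gate halves the previous memory cell and the input gate adds nothing, so C_t is half of C_{t-1} and h_t is 0.5·tanh(C_t); trained weights would instead let the gates decide, per time step, how much history to keep.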

3.3. CNN-LSTM Model

The CNN-LSTM model combines the feature extraction ability of the CNN with the ability of the LSTM to handle long-range dependencies in time series [49], and is used here to model PM_2.5 concentrations in 10 cities in the Beijing-Tianjin-Hebei metropolitan area. In the CNN-LSTM model, the first half is a CNN network for feature extraction and the second half is an LSTM network for PM_2.5 concentration prediction. The input data include the past 720 h of air quality data (PM_2.5, PM₁₀, SO₂, NO₂, O₃, and CO) and auxiliary data (meteorological factors and ERA5-PWV). In this study, after passing through one convolutional layer, the data are normalized via batch normalization. The ReLU activation function is employed to mitigate vanishing gradients, and the output forecasts PM_2.5 concentrations for the next 168 h. The specific CNN-LSTM model structure is illustrated in Figure 4.
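The relationship between the 720 h input and the 168 h output can be sketched as a windowing step over the hourly records. How the authors actually arrange their samples is not stated, so the function below, its name, the convention that column 0 holds PM_2.5, and the placeholder data are all assumptions for illustration.

```python
def make_windows(series, input_len=720, horizon=168):
    """Split a list of hourly records into (input, target) pairs.

    series: list of per-hour feature rows; by assumption column 0 is PM2.5.
    Returns a list of (past_rows, future_pm25) tuples.
    """
    samples = []
    last_start = len(series) - input_len - horizon
    for start in range(last_start + 1):
        past = series[start:start + input_len]
        split = start + input_len
        future = [row[0] for row in series[split:split + horizon]]
        samples.append((past, future))
    return samples

# 30 days of September (720 h) followed by 7 days of October (168 h),
# filled with placeholder rows: [pm25, one auxiliary factor].
hours = 720 + 168
series = [[float(h), float(h) * 0.1] for h in range(hours)]
samples = make_windows(series)
past, future = samples[0]
print(len(samples), len(past), len(future))
```

With exactly 888 hourly records, this arrangement yields a single pair: all of September as input and the seven National Day holiday days as the target. A shorter series yields no sample at all, which is why the input length and forecast horizon jointly determine how much data is needed.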

Combining the presentations of Livieris et al. [50], Huang and Kuo [33], Kim and Cho [51], Li et al. [39], and a large number of experiments, the model parameters determined in this paper are shown in Table 3.

It should be noted that this paper uses an iterative approach to determine the final model for several reasons: ① Achieving the optimal solution: Iterative methods allow the model to refine itself based on feedback from each iteration, thereby gradually converging to the optimal solution. ② Addressing non-linear issues: Many environmental processes, especially those related to PM_2.5 concentrations, are inherently non-linear. Iterative methods offer flexibility in capturing and modeling these non-linear factors more effectively than one-off methods. ③ Model robustness: By continuously refining and testing the model against new data or different data subsets, we ensure that the model is robust and does not overfit to a specific subset of data. ④ Adaptive nature of the prediction model: The iterative nature of the process also offers the flexibility to incorporate new information or feedback at any stage, leading to a more informed and accurate model.

The process of building PM_2.5 concentration prediction models based on different algorithms is represented in Figure 5.

4. Results and Analysis

4.1. Determination of Variables

We selected the factors that had a significant correlation with PM_2.5 concentration at the 0.01 level in each city from Table 1 and Table 2, and input them into CNN-LSTM, LSTM, and BPNN for PM_2.5 concentration modeling. It should be noted that BPNN has been shown to have superior performance in several fields [52,53,54], while LSTM models have likewise been widely used with good results in many fields [55,56,57,58]. The unique geographic, industrial, and meteorological conditions of the Beijing-Tianjin-Hebei metropolitan cluster provided us with a valuable dataset during the National Day. Therefore, we purposely chose the LSTM and BPNN models to compare with the new model to evaluate and accurately determine the best modeling algorithm.
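The selection rule can be illustrated with the Tianjin row of Table 1. The sketch below is a hypothetical helper, not the procedure used by the authors: it assumes that the "**" marker in Tables 1 and 2 is the only criterion and that factors without it (here O₃) are dropped for that city.

```python
# Tianjin row of Table 1: factor -> (correlation with PM2.5, significance marker)
tianjin = {
    "PM10": (0.963, "**"),
    "SO2": (0.590, "**"),
    "NO2": (0.643, "**"),
    "O3": (-0.046, ""),
    "CO": (0.730, "**"),
}

def select_significant(correlations, marker="**"):
    """Keep the factors whose marker denotes significance at the 0.01 level."""
    return [name for name, (_, stars) in correlations.items() if stars == marker]

selected = select_significant(tianjin)
print(selected)
```

Because the significant factors differ from city to city, the set of model inputs under this rule also differs between cities.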

For delineation of the dataset, considering that our study focuses on the specificity of the National Day period in the Beijing-Tianjin-Hebei metropolitan cluster, we adopted the following strategy: From the sample dataset of 1 September to 7 October 2021/2022, we selected the air quality data and auxiliary data of the first 720 h as the training set, which helps the model to learn the PM_2.5 concentration characteristics of the regular daily routine before the National Day. The subsequent 168 h PM_2.5 concentration data were used as the validation set. This division ensures that the model is trained independently of the data characteristics far away from the National Day time period, and is able to better extract the PM_2.5 data characteristics during the National Day period. After model training, we compare its 168 h PM_2.5 concentration prediction with the actual observed values to verify the prediction performance of different models.
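The chronological split described above can be sketched as follows. The period from 1 September to 7 October covers 37 days, i.e. 888 hourly records, of which the first 720 form the training set and the last 168 the validation set. The function name and the assumption of a complete hourly series with no missing records are ours.

```python
TRAIN_HOURS = 30 * 24  # 1-30 September: 720 h
VALID_HOURS = 7 * 24   # 1-7 October: 168 h

def split_by_time(records):
    """Split a chronologically ordered hourly series without shuffling."""
    expected = TRAIN_HOURS + VALID_HOURS
    if len(records) != expected:
        raise ValueError(f"expected {expected} hourly records, got {len(records)}")
    return records[:TRAIN_HOURS], records[TRAIN_HOURS:]

# Placeholder series: hour indices 0..887 stand in for the real observations.
hours = list(range(888))
train, valid = split_by_time(hours)
print(len(train), len(valid))
```

Keeping the records in time order ensures that no observation from the National Day period is used during training.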

4.2. Evaluation and Discussion of Forecast Results

In order to verify the model performance, RMSE, MAE, and MAPE indicators are introduced to judge the applicability of the model and compare the prediction accuracy of the model. These indicators can be calculated in the following way:

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(x_{i} - {\hat{x}}_{i})}^{2}}

MAE = \frac{1}{n} \sum_{i = 1}^{n} | x_{i} - {\hat{x}}_{i} |

(9)

MAPE = \frac{1}{n} \sum_{i = 1}^{n} | \frac{x_{i} - {\hat{x}}_{i}}{x_{i}} |

where x_{i} is the actual observed value and {\hat{x}}_{i} is the model predicted value.
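A minimal sketch of the three indicators is given below. The function names and the sample values are illustrative assumptions. MAPE is returned as a fraction, exactly as in the formula above, and is multiplied by 100 to obtain a percentage such as those in Table 4; it is undefined when an observed value equals zero.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error."""
    n = len(actual)
    return sum(abs(x - p) for x, p in zip(actual, predicted)) / n

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (observed values must be non-zero)."""
    n = len(actual)
    return sum(abs((x - p) / x) for x, p in zip(actual, predicted)) / n

# Hypothetical observed and predicted concentrations in μg∙m⁻³.
observed = [10.0, 20.0, 40.0]
forecast = [12.0, 18.0, 44.0]
print(rmse(observed, forecast), mae(observed, forecast), 100 * mape(observed, forecast))
```

Because MAPE divides by the observed value, the same absolute error weighs more heavily at low concentrations, which is why MAPE and MAE can rank the models differently.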

4.2.1. Overall Accuracy

We have compiled the prediction results and the absolute values of prediction relative errors for three models, as shown in Figure 6 and Figure 7. Figure 6 indicates that, compared to the LSTM model and BPNN, the overall trend of the CNN-LSTM model’s prediction results aligns more closely with the actual PM_2.5 values. As can be seen from Figure 7, the CNN-LSTM model exhibits smaller absolute values of relative errors in each city compared to the other two models, with a particularly notable advantage in Beijing.

In addition, the PM_2.5 concentrations in Shijiazhuang and Tangshan fluctuate only slightly, which makes them easier to predict and results in lower relative errors for each model. The detailed statistical results are shown in Table 4.

Table 4 shows that the CNN-LSTM model and the LSTM model exhibit better accuracy metrics compared to the BPNN. In terms of the RMSE metrics, the CNN-LSTM model has the best performance, except for Zhangjiakou, where the RMSE of the CNN-LSTM model is slightly larger than that of the LSTM model. Overall, the CNN-LSTM model has an average RMSE of 7.55 μg∙m⁻³, an average MAE of 5.94 μg∙m⁻³, and an average MAPE of 35.31%. Compared to the BPNN, the CNN-LSTM model has improved the average RMSE by 25.52%, the average MAE by 26.23%, and the average MAPE by 35.64%. Relative to the LSTM model, the CNN-LSTM model has improved the average RMSE by 14.30%, the average MAE by 15.01%, and the average MAPE by 16.98%. Among them, the CNN-LSTM model especially improves the prediction accuracy in Beijing, Tangshan, Langfang, and Qinhuangdao compared with the LSTM model, and its RMSE is optimized by 23.44%, 23.13%, 27.03%, and 21.09%, respectively. In addition, compared with the BPNN model, the prediction accuracy of the CNN-LSTM model in Baoding, Chengde, and Cangzhou is also significantly improved, with the RMSE optimized by 26.94%, 36.14%, and 31.71%, respectively.
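The improvement percentages can be checked against the area averages in Table 4. The sketch below assumes that "improved by" means the relative reduction (baseline − model) / baseline, which the text does not state explicitly. Because it starts from the rounded table values, some results differ from the reported figures in the second decimal place (for example 25.54% rather than 25.52% for the RMSE relative to BPNN).

```python
def improvement(baseline, model):
    """Relative error reduction of `model` with respect to `baseline`, in percent."""
    return (baseline - model) / baseline * 100.0

# Area averages from Table 4: (RMSE, MAE, MAPE)
averages = {
    "BPNN": (10.14, 8.05, 54.86),
    "LSTM": (8.81, 6.99, 42.53),
    "CNN-LSTM": (7.55, 5.94, 35.31),
}

for baseline in ("BPNN", "LSTM"):
    gains = [round(improvement(b, m), 2)
             for b, m in zip(averages[baseline], averages["CNN-LSTM"])]
    print(baseline, gains)
```

The same function applied to a single city row of Table 4 gives the per-city figures quoted in the text, again up to rounding.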

In particular, except for Zhangjiakou and Tangshan, the MAE and MAPE of the CNN-LSTM model are both reduced compared to the LSTM model: by 28.14% and 31.61% in Beijing; 12.91% and 35.61% in Tianjin; 28.27% and 43% in Langfang; 14.92% and 27.78% in Qinhuangdao; 9.45% and 5.92% in Baoding; 10.64% and 3.35% in Shijiazhuang; 11.41% and 3.34% in Chengde; and 7.81% and 12.91% in Cangzhou. In Tangshan, the MAE of the CNN-LSTM model improved by 12.14%, while its MAPE deteriorated by 46.06%. Combined with Figure 6, the relative error of the CNN-LSTM model starts to increase and exceeds that of the LSTM model from the 120th h to the 168th h, which leads to the deterioration of its MAPE. Therefore, the forecast horizon for PM_2.5 concentration using the CNN-LSTM model in Tangshan should not exceed 120 h to achieve the best forecasting effect. Nevertheless, the accuracy of the CNN-LSTM model in forward forecasting from 0 to 168 h still has a significant advantage over the LSTM model and BPNN in the other nine cities, especially in Beijing and Langfang.

4.2.2. Spatial and Temporal Applicability Tests

To further demonstrate the spatial-temporal adaptability of the model, the same period of 2021 in the study area is selected for the prediction experiment; the experimental process is as described in the previous section, and the accuracy of the model prediction results is shown in Figure 8.

As indicated by Figure 8, during the same period in 2021, the CNN-LSTM model generally shows a clear advantage over the other two models. Specifically, the CNN-LSTM model has an average RMSE of 6.16 μg∙m⁻³, an average MAE of 4.52 μg∙m⁻³, and an average MAPE of 25%. Compared to the LSTM model, the RMSE, MAE, and MAPE improved by 5.89%, 9.92%, and 25.55%, respectively. Compared to the BPNN model, they improved by 17.51%, 24.66%, and 46.61%, respectively. Additionally, a comparison with Figure 9 shows that the accuracy of the CNN-LSTM model differs little between the two years, and that its advantage over the other two models is still most obvious in Beijing and least obvious in Zhangjiakou, which indicates the temporal stability of the model in the Beijing-Tianjin-Hebei region during the period before and after the National Day.

The superior performance of the CNN-LSTM model over the LSTM and BPNN is attributed to its effective combination of CNN’s feature extraction capabilities with LSTM’s strength in handling long time-dependent time series. Despite the influence of various factors, its predictions adeptly match the actual observed values of PM_2.5 concentrations, showcasing enhanced applicability and accurately reflecting the variations in PM_2.5 concentrations during the National Day period in the Beijing-Tianjin-Hebei metropolitan area.

We acknowledge that every model prediction inevitably entails potential errors and uncertainties. For instance, all environmental measurement data might be influenced by the precision of instruments, calibration, and environmental variables. The non-linear characteristics of the model may result in predictive biases under certain extreme conditions. Special socio-economic activities during the National Day period, such as large-scale gatherings or firework displays, might have short-term significant impacts on air quality. These impacts might not have been fully captured in the training data. Indeed, anomalous PM_2.5 concentration values in the observational data, even after processing, could still potentially affect prediction outcomes. However, it is essential to note that this does not undermine the significance and value of this study.

5. Conclusions

(1): In this paper, a PM_2.5 concentration prediction model based on CNN-LSTM as the core algorithm is proposed. Hourly data of atmospheric pollutants, ERA5 meteorological factors, and ERA5-PWV of 10 cities in the Beijing-Tianjin-Hebei metropolitan area in September 2022 are used for model training, followed by the prediction of PM_2.5 concentration during the National Day, which is compared and analyzed with the prediction models based on BPNN and LSTM. Verified against the observed PM_2.5 concentration data of 2022, the average RMSE of the CNN-LSTM model is 7.55 μg∙m⁻³, the average MAE is 5.94 μg∙m⁻³, and the average MAPE is 35.31%. Compared with BPNN and LSTM, the average RMSE is optimized by 25.52% and 14.30%, the average MAE is optimized by 26.23% and 15.01%, and the average MAPE is optimized by 35.64% and 16.98%, respectively. Meanwhile, prediction experiments were also conducted with the same three models in the same region for the same period in another year (2021) to verify the regional applicability of the models. Overall, the CNN-LSTM model exhibits the highest accuracy and the best spatial-temporal applicability among the ten cities within the Beijing-Tianjin-Hebei metropolitan area.
(2): Previous studies have demonstrated that prolonged exposure to high PM_2.5 concentrations increases the risk of respiratory and cardiovascular diseases. By providing high-precision warnings of future PM_2.5 concentrations, the CNN-LSTM model can offer health advice to the public and a decision-making reference for government departments in urban planning and traffic management. To apply the model more effectively, we are considering integrating it with the existing air quality monitoring system, e.g., in terms of data integration, model deployment, and prediction and feedback. Such a system would predict potential changes in PM_2.5 concentrations in real time, help the government take timely measures such as restricting certain industrial activities, and provide the public with suggestions on activities and guidance on health protection.
(3): PM_2.5 concentrations result from complex interactions of spatial and temporal factors. A key advantage of our CNN-LSTM model is its ability to handle complex nonlinear relationships and interactions between multivariate factors, which greatly improves its predictive performance. In future research, we plan to accumulate more experimental data to broaden the scope of our study. We also plan to further improve the prediction accuracy and spatio-temporal applicability of the CNN-LSTM-based PM_2.5 model by considering additional influencing factors such as population density, industrial emissions, vegetation index, and AOD.
(4): This study has provided new perspectives and tools for air quality prediction in the Beijing-Tianjin-Hebei metropolitan area. Overall, we have successfully demonstrated the efficiency and stability of the CNN-LSTM model in predicting PM_2.5 concentrations during the National Day in the Beijing-Tianjin-Hebei region, which provides strong support for future environmental monitoring and prediction tasks.

Author Contributions

Conceptualization, J.L. and Y.S.; methodology, Y.S.; software, J.L. and Y.S.; validation, J.L. and Y.S.; formal analysis, X.G. and J.L.; investigation, J.L., Y.S., L.L. and L.H.; resources, Y.S. and L.H.; data curation, J.L. and Y.S.; writing—original draft preparation, J.L. and Y.S.; writing—review and editing, J.L., Y.S. and M.H.; visualization, J.L. and Y.S.; supervision, J.L., Y.S., L.L. and M.H.; project administration, J.L., L.L., X.G. and L.H.; funding acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Guangxi Natural Science Foundation of China (2020GXNSFBA297145, GuikeAD23026177), the Foundation of Guilin University of Technology (GUTQDJJ6616032), Guangxi Key Laboratory of Spatial Information and Geomatics (21-238-21-05), the National Natural Science Foundation of China (42064002, 42004025, 42074035, 42104040), the Innovative Training Program Foundation (202210596015, 202210596402) and the Innovation Project of Guangxi Graduate Education (YCSW2023341).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

This study did not involve humans. The air quality data used in this paper can be downloaded from the China Environmental Monitoring General Station website: www.cnemc.cn (accessed on 25 August 2023), and the meteorological data can be downloaded from the European Centre for Medium-Range Weather Forecasts website: https://cds.climate.copernicus.eu/cdsapp (accessed on 25 August 2023). The data are also available upon request by contacting the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Zhang, H.; Wang, S.; Hao, J.; Wang, X.; Wang, S.; Chai, F.; Li, M. Air pollution and control action in Beijing. J. Clean. Prod. 2016, 112, 1519–1527.
2. Zheng, S.; Yi, H.; Li, H. The impacts of provincial energy and environmental policies on air pollution control in China. Renew. Sustain. Energy Rev. 2015, 49, 386–394.
3. Wang, Y.; Zhang, Y.-S. Air quality assessment by contingent valuation in Ji’nan, China. J. Environ. Manag. 2009, 90, 1022–1029.
4. Liu, S.; Zhou, Y.; Liu, S.; Chen, X.; Zou, W.; Zhao, D.; Li, X.; Pu, J.; Huang, L.; Chen, J.; et al. Association between exposure to ambient particulate matter and chronic obstructive pulmonary disease: Results from a cross-sectional study in China. Thorax 2017, 72, 788–795.
5. Kumar, P.; Singh, A.B.; Arora, T.; Singh, S.; Singh, R. Critical review on emerging health effects associated with the indoor air quality and its sustainable management. Sci. Total Environ. 2023, 872, 162163.
6. Graff Zivin, J.; Neidell, M. The impact of pollution on worker productivity. Am. Econ. Rev. 2012, 102, 3652–3673.
7. Singh, R.L.; Singh, P.K. Global environmental problems. In Principles and Applications of Environmental Biotechnology for a Sustainable Future; Singh, R.L., Ed.; Springer: Singapore, 2017; pp. 13–41.
8. Rajabov, B.; Liu, L.; Rajabov, J. Multiple-Factor influence on air quality of road motor vehicles tail number limit in administrative area of Beijing, China. J. Adv. Transp. 2020, 2020, 8853180.
9. Sun, W.; Li, Z. Hourly PM_2.5 concentration forecasting based on feature extraction and stacking-driven ensemble model for the winter of the Beijing-Tianjin-Hebei area. Atmos. Pollut. Res. 2020, 11, 110–121.
10. Shao, X.; Soo Kim, C. Accurate multi-site daily-ahead multi-step PM_2.5 concentrations forecasting using space-shared CNN-LSTM. Comput. Mater. Contin. 2022, 70, 5143–5160.
11. Zhang, L.; Lin, J.; Qiu, R.; Hu, X.; Zhang, H.; Chen, Q.; Tan, H.; Lin, D.; Wang, J. Trend analysis and forecast of PM_2.5 in Fuzhou, China using the ARIMA model. Ecol. Indic. 2018, 95, 702–710.
12. Cheng, Y.; Zhang, H.; Liu, Z.; Chen, L.; Wang, P. Hybrid algorithm for short-term forecasting of PM_2.5 in China. Atmos. Environ. 2019, 200, 264–279.
13. Zhao, R.; Gu, X.; Xue, B.; Zhang, J.; Ren, W. Short period PM_2.5 prediction based on multivariate linear regression model. PLoS ONE 2018, 13, e0201011.
14. Upadhyay, A.; Dey, S.; Goyal, P.; Dash, S.K. Projection of near-future anthropogenic PM_2.5 over India using statistical approach. Atmos. Environ. 2018, 186, 178–188.
15. Shao, X.; Kim, C.-S.; Sontakke, P. Accurate deep model for electricity consumption forecasting using multi-channel and multi-scale feature fusion CNN–LSTM. Energies 2020, 13, 1881.
16. Wang, J.; Li, H.; Lu, H. Application of a novel early warning system based on fuzzy time series in urban air quality forecasting in China. Appl. Soft Comput. 2018, 71, 783–799.
17. Niu, H.; Zhang, Z.; Xiao, Y.; Luo, M.; Chen, Y. A study of carbon emission efficiency in Chinese provinces based on a three-stage SBM-undesirable model and an LSTM model. Int. J. Environ. Res. Public Health 2022, 19, 5395.
18. Hamrani, A.; Akbarzadeh, A.; Madramootoo, C.A. Machine learning for predicting greenhouse gas emissions from agricultural soils. Sci. Total Environ. 2020, 741, 140338.
19. Aamir, M.; Bhatti, M.A.; Bazai, S.U.; Marjan, S.; Mirza, A.M.; Wahid, A.; Hasnain, A.; Bhatti, U.A. Predicting the environmental change of carbon emission patterns in South Asia: A deep learning approach using BiLSTM. Atmosphere 2022, 13, 2011.
20. Kleine Deters, J.; Zalakeviciute, R.; Gonzalez, M.; Rybarczyk, Y. Modeling PM_2.5 urban pollution using machine learning and selected meteorological parameters. J. Electr. Comput. Eng. 2017, 2017, 5106045.
21. Sun, W.; Sun, J. Daily PM_2.5 concentration prediction based on principal component analysis and LSSVM optimized by cuckoo search algorithm. J. Environ. Manag. 2017, 188, 144–152.
22. Shao, X.; Pu, C.; Zhang, Y.; Kim, C.S. Domain fusion CNN-LSTM for short-term power consumption forecasting. IEEE Access 2020, 8, 188352–188362.
23. Weng, Y.; Wang, X.; Hua, J.; Wang, H.; Kang, M.; Wang, F.Y. Forecasting horticultural products price using ARIMA model and neural network based on a large-scale data set collected by web crawler. IEEE Trans. Comput. Soc. Syst. 2019, 6, 547–553.
24. Feng, X.; Li, Q.; Zhu, Y.; Hou, J.; Jin, L.; Wang, J. Artificial neural networks forecasting of PM_2.5 pollution using air mass trajectory based geographic model and wavelet transformation. Atmos. Environ. 2015, 107, 118–128.
25. He, Z.; Guo, Q.; Wang, Z.; Li, X. Prediction of monthly PM_2.5 concentration in Liaocheng in China employing artificial neural network. Atmosphere 2022, 13, 1221.
26. Maleki, H.; Sorooshian, A.; Goudarzi, G.; Baboli, Z.; Tahmasebi Birgani, Y.; Rahmati, M. Air pollution prediction by using an artificial neural network model. Clean Technol. Environ. Policy 2019, 21, 1341–1352.
27. Reichstein, M.; Camps-Valls, G.; Stevens, B.; Jung, M.; Denzler, J.; Carvalhais, N.; Prabhat, F. Deep learning and process understanding for data-driven Earth system science. Nature 2019, 566, 195–204.
28. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
29. Guo, Z.; Zhao, W.; Lu, H.; Wang, J. Multi-step forecasting for wind speed using a modified EMD-based artificial neural network model. Renew. Energy 2012, 37, 241–249.
30. Xie, J. Deep neural network for PM_2.5 pollution forecasting based on manifold learning. In Proceedings of the 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC), Shanghai, China, 16–18 August 2017.
31. Liu, X.; Liu, Q.; Zou, Y.; Wang, G. A self-organizing LSTM-based approach to PM_2.5 forecast. In Cloud Computing and Security: 4th International Conference, ICCCS 2018, Haikou, China, June 8–10, 2018, Revised Selected Papers, Part IV; Springer International Publishing: Berlin/Heidelberg, Germany, 2018; pp. 683–693.
32. Qin, D.; Yu, J.; Zou, G.; Yong, R.; Zhao, Q.; Zhang, B. A novel combined prediction scheme based on CNN and LSTM for urban PM_2.5 concentration. IEEE Access 2019, 7, 20050–20059.
33. Huang, C.-J.; Kuo, P.-H. A deep CNN-LSTM model for particulate matter (PM_2.5) forecasting in smart cities. Sensors 2018, 18, 2220.
34. Xin, J.; Gong, C.; Liu, Z.; Cong, Z.; Gao, W.; Song, T.; Pan, Y.; Sun, Y.; Ji, D.; Wang, L.; et al. The observation-based relationships between PM_2.5 and AOD over China: The functions of PM_2.5 and AOD over China. J. Geophys. Res. Atmos. 2016, 121, 10701–10716.
35. Guo, M.; Xia, P.; Li, P.; Zhang, H. Global navigation satellite system precipitable water vapour combined with other atmospheric factors to predict the short-term change of PM_2.5 mass concentration. Meteorol. Z. 2021, 30, 429–444.
36. Guo, M.; Zhang, H.; Xia, P. Exploration and analysis of the factors influencing GNSS PWV for nowcasting applications. Adv. Space Res. 2021, 67, 3960–3978.
37. Liu, Y. Relationships of wind speed and precipitable water vapor with regional PM_2.5 based on WRF-Chem model. Nat. Resour. Model. 2021, 34, e12306.
38. Li, J.; Li, X.; Wang, K. Atmospheric PM_2.5 concentration prediction based on time series and interactive multiple model approach. Adv. Meteorol. 2019, 2019, e1279565.
39. Li, T.; Hua, M.; Wu, X. A hybrid CNN-LSTM model for forecasting particulate matter (PM_2.5). IEEE Access 2020, 8, 26933–26940.
40. Fenger, J. Urban air quality. Atmos. Environ. 1999, 33, 4877–4900.
41. Tai, A.P.K.; Mickley, L.J.; Jacob, D.J. Correlations between fine particulate matter (PM_2.5) and meteorological variables in the United States: Implications for the sensitivity of PM_2.5 to climate change. Atmos. Environ. 2010, 44, 3976–3984.
42. Zhang, Y.; Seigneur, C.; Seinfeld, J.H.; Jacobson, M.Z.; Binkowski, F.S. Simulation of aerosol dynamics: A comparative review of algorithms used in air quality models. Aerosol Sci. Technol. 1999, 31, 487–514.
43. Yang, Y.; Fan, S.; Wang, L.; Gao, Z.; Zhang, Y.; Zou, H.; Miao, S.; Li, Y.; Huang, M.; Yim, S.H.; et al. Diurnal evolution of the wintertime boundary layer in urban Beijing, China: Insights from Doppler lidar and a 325-m meteorological tower. Remote Sens. 2020, 12, 3935.
44. Li, Z.; Zhang, H.; Wen, C.Y.; Yang, A.S.; Juan, Y.H. The effects of lateral entrainment on pollutant dispersion inside a street canyon and the corresponding optimal urban design strategies. Build. Environ. 2021, 195, 107740.
45. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
46. Wen, C.; Liu, S.; Yao, X.; Peng, L.; Li, X.; Hu, Y.; Chi, T. A novel spatiotemporal convolutional long short-term neural network for air pollution prediction. Sci. Total Environ. 2019, 654, 1091–1099.
47. Hochreiter, S. The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int. J. Uncertain. Fuzziness Knowl. Based Syst. 1998, 06, 107–116.
48. Zaremba, W.; Sutskever, I. Learning to execute. arXiv 2015, arXiv:1410.4615.
49. Ren, L.; Dong, J.; Wang, X.; Meng, Z.; Zhao, L.; Deen, M.J. A data-driven auto-CNN-LSTM prediction model for lithium-ion battery remaining useful life. IEEE Trans. Ind. Inform. 2021, 17, 3478–3487.
50. Livieris, I.E.; Pintelas, E.; Pintelas, P. A CNN–LSTM model for gold price time-series forecasting. Neural Comput. Appl. 2020, 32, 17351–17360.
51. Kim, T.-Y.; Cho, S.-B. Predicting residential energy consumption using CNN-LSTM neural networks. Energy 2019, 182, 72–81.
52. Chen, L.; Wu, T.; Wang, Z.; Lin, X.; Cai, Y. A novel hybrid BPNN model based on adaptive evolutionary artificial bee colony algorithm for water quality index prediction. Ecol. Indic. 2023, 146, 109882.
53. Kumar Tipu, R.; Panchal, V.R.; Pandya, K.S. An ensemble approach to improve BPNN model precision for predicting compressive strength of high-performance concrete. Structures 2022, 45, 500–508.
54. Zhang, L.; Sun, Z.; Zhang, C.; Dong, F.; Wei, P. Numerical investigation of the dynamic responses of long-span bridges with consideration of the random traffic flow based on the intelligent ACO-BPNN model. IEEE Access 2018, 6, 28520–28529.
55. Cui, Z.; Ke, R.; Pu, Z.; Wang, Y. Deep bidirectional and unidirectional LSTM recurrent neural network for network-wide traffic speed prediction. arXiv 2019, arXiv:1801.02143.
56. Kim, H.Y.; Won, C.H. Forecasting the volatility of stock price index: A hybrid model integrating LSTM with multiple GARCH-type models. Expert Syst. Appl. 2018, 103, 25–37.
57. Kratzert, F.; Klotz, D.; Brenner, C.; Schulz, K.; Herrnegger, M. Rainfall–runoff modelling using long short-term memory (LSTM) networks. Hydrol. Earth Syst. Sci. 2018, 22, 6005–6022.
58. Wu, Y.; Yuan, M.; Dong, S.; Lin, L.; Liu, Y. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing 2018, 275, 167–179.

Figure 1. Distribution map of 10 major cities in the Beijing-Tianjin-Hebei metropolitan area.

Figure 2. The variation trend of PM_2.5 concentration during the National Day in 2021/2022.

Figure 3. The network structure of the LSTM where each gate, memory cell, and output of the hidden layer, can be computed using the following formula (Zaremba and Sutskever [48]; Wen et al. [46]).

Figure 4. The model structure of the CNN-LSTM. The left column enumerates the input variables, while the right column displays the output variable, namely the PM_2.5 concentration. The structure of the CNN-LSTM algorithm is outlined in between.

Figure 5. Flow chart of the experiment.

Figure 6. Actual PM_2.5 concentration (blue dot line) and predicted PM_2.5 concentration from CNN-LSTM (red dot line), LSTM (green dot line), BPNN (yellow dot line) from 1 to 7 October 2022.

Figure 7. Relative errors of the model from 1st to 7th October 2022. Most of the absolute values of relative errors fall within the coordinate range of each subplot. The portions that exceed the coordinate range are relatively few and do not affect the comparison of the three models. The dashed line at 100% represents a reasonable threshold for error.

Figure 8. Comparison of prediction accuracy of each model from 1 October 2021 to 7 October 2021.

Figure 9. Comparison of prediction accuracy of each model from 1 October 2022 to 7 October 2022.

Table 1. Correlation of PM_2.5 and atmospheric pollution.

City	PM₁₀	SO₂	NO₂	O₃	CO
Beijing	0.974 **	−0.128 **	0.720 **	0.173 **	0.924 **
Tianjin	0.963 **	0.590 **	0.643 **	−0.046	0.730 **
Baoding	0.924 **	0.640 **	0.659 **	−0.084 **	0.897 **
Tangshan	0.936 **	0.544 **	0.723 **	−0.010	0.723 **
Langfang	0.943 **	0.672 **	0.599 **	0.036	0.894 **
Shijiazhuang	0.904 **	0.483 **	0.554 **	0.006	0.874 **
Qinhuangdao	0.925 **	0.650 **	0.617 **	0.249 **	0.795 **
Zhangjiakou	0.873 **	0.403 **	0.712 **	0.463 **	0.798 **
Chengde	0.912 **	0.148 **	0.628 **	0.120 **	0.816 **
Cangzhou	0.873 **	0.504 **	0.557 **	0.027	0.745 **

Note: ** indicates that the correlation is statistically significant at the 0.01 level (two-tailed).

Table 2. Correlation of PM_2.5 and meteorological factors and ERA5-PWV.

City	BLH	P	RH	T2m	U10	V10	ERA5-PWV
Beijing	−0.117 **	−0.235 **	0.364 **	0.266 **	−0.354 **	0.415 **	0.446 **
Tianjin	0.104 **	−0.094 **	0.263 **	0.159 **	0.257 **	0.259 **	−0.186 **
Baoding	0.038	−0.095 **	0.094 **	0.228 **	−0.173 **	0.301 **	0.175 **
Tangshan	0.023	−0.121 **	0.306 **	0.272 **	−0.156 **	0.380 **	0.338 **
Langfang	−0.068 *	−0.157 **	0.262 **	0.253 **	−0.359 **	0.302 **	0.389 **
Shijiazhuang	−0.035	−0.215 **	0.217 **	0.258 **	−0.106 **	0.136 **	0.302 **
Qinhuangdao	−0.059	0.042	0.240 **	0.069 *	0.129 **	0.242 **	0.112 **
Zhangjiakou	−0.089 **	−0.323 **	0.388 **	0.251 **	−0.079 **	0.282 **	0.509 **
Chengde	−0.046	−0.193 **	0.413 **	0.228 **	−0.309 **	0.337 **	0.365 **
Cangzhou	0.095 **	−0.033	0.218 **	0.166 **	−0.131 **	0.199 **	0.185 **

Note: ** indicates that the correlation is statistically significant at the 0.01 level (two-tailed); * indicates that the correlation is statistically significant at the 0.05 level (two-tailed). The same applies below.

Table 3. Parameters of CNN-LSTM and their adjustment range.

Parameter Type	Initial Value	Adjustment Range
InputSize	11	Typically dependent on feature dimensions
FilterSize	[1,2]	[1 × 1] − [inputSize × 1]
Stride	[1,1]	[1 × 1] − [inputSize × 1]
Number of Convolution Kernels	10	5–50
Learning Rate	0.001	0.0001–0.1
MaxEpochs	20	10–30
LearnRateDropPeriod	2	0.5–1 × MaxEpochs
LearnRateDropFactor	0.5	0.1–0.9
numhidden_units1	50	30–100
numhidden_units2	100	50–200
numhidden_units3	150	70–300
Dropout Rate	0.3	0–0.8
Lag Length (k)	1	1–24
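To make the learning-rate entries in Table 3 concrete, the sketch below evaluates the initial values (learning rate 0.001, LearnRateDropPeriod 2, LearnRateDropFactor 0.5, MaxEpochs 20). It assumes a piecewise schedule in which the learning rate is multiplied by the drop factor after every drop period; the paper does not state the schedule type, and the function name is hypothetical.

```python
def learning_rate(epoch, initial=0.001, drop_period=2, drop_factor=0.5):
    """Learning rate at a 1-based epoch under a piecewise drop schedule."""
    if epoch < 1:
        raise ValueError("epoch must be >= 1")
    # Number of completed drop periods before this epoch.
    drops = (epoch - 1) // drop_period
    return initial * drop_factor ** drops

MAX_EPOCHS = 20
schedule = [learning_rate(e) for e in range(1, MAX_EPOCHS + 1)]
print(schedule[0], schedule[2], schedule[-1])
```

Under this assumption the rate is halved nine times over 20 epochs, so the final epochs are trained with a rate of roughly 2 × 10⁻⁶.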

Table 4. Comparison of prediction accuracy of each model.

City	Model	RMSE (μg∙m⁻³)	MAE (μg∙m⁻³)	MAPE (%)
Beijing	BP	8.47	7.08	79.757
	LSTM	8.79	6.65	49.625
	CNN-LSTM	6.73	4.78	33.94
Tianjin	BP	12.96	10.47	111.36
	LSTM	10.68	9.31	84.32
	CNN-LSTM	9.71	8.11	54.29
Baoding	BP	12.01	8.88	42.56
	LSTM	9.35	7.33	37.93
	CNN-LSTM	8.77	6.64	35.68
Tangshan	BP	12.32	9.57	47.07
	LSTM	12.49	8.09	31.59
	CNN-LSTM	9.60	7.11	46.14
Langfang	BP	11.63	8.62	41.15
	LSTM	11.64	9.01	48.21
	CNN-LSTM	8.56	6.46	27.48
Shijiazhuang	BP	8.13	6.65	36.97
	LSTM	7.59	6.57	28.63
	CNN-LSTM	6.84	5.87	27.67
Qinhuangdao	BP	8.68	8.62	54.05
	LSTM	8.32	6.56	48.82
	CNN-LSTM	6.57	5.58	35.26
Zhangjiakou	BP	6.67	4.10	40.89
	LSTM	4.40	3.01	31.29
	CNN-LSTM	4.99	3.30	33.96
Chengde	BP	6.15	4.51	32.11
	LSTM	4.23	3.35	22.43
	CNN-LSTM	3.93	2.97	21.68
Cangzhou	BP	14.38	11.95	62.77
	LSTM	10.63	8.23	42.51
	CNN-LSTM	9.82	7.59	37.02
Area average accuracy	BP	10.14	8.05	54.86
	LSTM	8.81	6.99	42.53
	CNN-LSTM	7.55	5.94	35.31

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
