1. Introduction
PM2.5 stands for particulate matter dynamically suspended in air with a diameter less than or equal to 2.5 microns. These fine particles not only penetrate deep into the lungs, but also have the potential to enter the blood system, posing a significant threat to human health. Studies have found that prolonged exposure to high levels of PM2.5 increases the risk of respiratory and cardiovascular diseases [1]. In particular, PM2.5 may affect other organs through blood circulation, causing long-term negative impacts on the health of developing children [2]. In addition, atmospheric pollution, especially high concentrations of PM2.5, poses a direct threat to ecosystems and human health and quality of life [3]. Therefore, accurate prediction of changes in PM2.5 concentrations is essential for developing effective air pollution prevention and control strategies and protecting public health [4].
Existing PM2.5 forecasting methods mainly include statistical forecasting, numerical forecasting, and machine learning methods [5,6]. Statistical methods establish mathematical relationships between PM2.5 concentrations and influencing factors from historical and related data, through regression analysis, time series analysis, and similar techniques, without considering physicochemical processes. They are easy to operate and can be combined with machine learning to improve forecast accuracy. Their limitation is that they rely on large amounts of data and on assumptions, which makes it difficult to capture the dynamic changes in PM2.5. Numerical methods are based on atmospheric dynamics and atmospheric environmental chemistry: they construct a mathematical model as a system of equations from air pollution emission source data and meteorological data, and solve it by computer to obtain the spatial and temporal distribution of pollutants. Numerical methods can integrate physicochemical processes, simulate pollutant behavior in detail, suit multi-scale forecasting, and provide high-resolution results. However, they rely on high-performance computing resources and are sensitive to input data and model parameters, which may introduce uncertainties and errors. In recent years, with the rapid development of artificial intelligence technology, the application of machine learning, especially deep learning, to air quality forecasting has gradually become a new research direction [7,8].
Several researchers have applied artificial neural network (ANN) techniques to air quality forecasting and achieved high forecasting accuracy. For example, Mao et al. forecasted the change in PM2.5 concentration in 12 h. They trained their model on the PM2.5 concentration of the previous day together with meteorological data, including wind direction, wind speed, air temperature, and humidity, and achieved good forecasting results [9]. Shishegaran et al. compared four predictive models for daily air quality in Tehran: ARIMA, Principal Component Regression (PCR), an ARIMA-PCR hybrid model, and a combined model of ARIMA and Gene Expression Programming (GEP) [10]. Some researchers have also combined different models to make full use of their respective advantages. For example, considering that air quality monitoring data involve not only temporal but also spatial features, Ma et al. constructed an LSTM–GCN model for predicting PM2.5 concentration in the next hour by fusing a Graph Convolutional Network (GCN) and a Long Short-Term Memory Network (LSTM). The model was applied in the Hunnan District of Shenyang and demonstrated higher accuracy than the traditional method [11]. Ali Kamali Mohammadzadeh et al. proposed a spatio-temporal deep neural structure combining GCN and an exogenous Long Short-Term Memory Network (E-LSTM) for predicting the PM2.5 air quality index (AQI). Their results showed that this framework predicts PM2.5 AQI significantly more accurately than the traditional LSTM and E-LSTM methods, and that it is robust to the network structure of EPA stations [12]. Li et al. constructed a hybrid model combining a convolutional neural network (CNN) and LSTM, which performed well in nonlinear time series forecasting [13].
Combined weighted prediction has been a hot topic in air pollutant concentration prediction research in recent years. Moghram and Rahman pointed out that there is no universally optimal strategy [14]. Bates and Granger recommended a combined weighted prediction method, which does not rely on a single model but improves the accuracy of the prediction ensemble through weight assignment [15]. However, the choice of weights remains a difficult part of combined prediction, and many scholars have studied the selection of weights for combined models. For example, Yan et al. constructed a self-varying weighted combination of CNN and LSTM models, and the experimental results showed that this model outperformed other benchmark models [16]. Yang's and Xiao's teams used Differential Evolution (DE) and the Cuckoo Search Algorithm (CSO), respectively, to optimize the weights and enhance forecasting accuracy [17,18]. These results show that combined models can improve forecasting performance to some extent.
The above prediction methods are limited in that they usually rely on a single model or, in multi-model combined weighted prediction, use a single evaluation criterion, such as accuracy alone or stability alone, neglecting the need for PM2.5 forecasts to be both accurate and stable. To further improve ensemble forecasting performance by accounting for both accuracy and stability, this study proposes an “ensemble forecast” model. The model predicts the PM2.5 concentration at time t from the PM2.5 concentration data for the 24 consecutive hours before time t and the site characteristics data at time t-1. It employs a combined weighted prediction method that combines two models: the LSTM–GCN model (based on LSTM and GCN techniques) and the Transformer–GraphSAGE model (combining Transformer and GraphSAGE techniques). Both models can effectively capture temporal and spatial properties and their interactions, providing a multi-dimensional perspective for PM2.5 concentration forecasting. Bias and variance are introduced as multi-objective optimization metrics to optimize accuracy and stability simultaneously, and the optimal weighting coefficients of the two models are determined by the MOEA/D algorithm to achieve more accurate and stable forecasts.
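The trade-off between the two objectives can be illustrated with a minimal Python sketch. It is not the implementation used in this study: it assumes that bias is measured as the absolute mean forecast error and variance as the variance of the forecast errors, it replaces MOEA/D with a plain grid scan over a single weight w, and all function names and toy values are hypothetical.

```python
from statistics import mean, pvariance

def combine(pred_a, pred_b, w):
    """Weighted combination: w * model A + (1 - w) * model B."""
    return [w * a + (1.0 - w) * b for a, b in zip(pred_a, pred_b)]

def bias_and_variance(pred, obs):
    """Return (|mean error|, variance of errors); assumed objective definitions."""
    errors = [p - o for p, o in zip(pred, obs)]
    return abs(mean(errors)), pvariance(errors)

def pareto_weights(pred_a, pred_b, obs, steps=100):
    """Grid-scan w in [0, 1] and keep the non-dominated (bias, variance) pairs.

    This is a brute-force stand-in for MOEA/D, used only for illustration.
    """
    candidates = []
    for i in range(steps + 1):
        w = i / steps
        b, v = bias_and_variance(combine(pred_a, pred_b, w), obs)
        candidates.append((w, b, v))
    front = []
    for w, b, v in candidates:
        # A candidate is dominated if another one is no worse in both
        # objectives and strictly better in at least one.
        dominated = any(
            (b2 <= b and v2 <= v) and (b2 < b or v2 < v)
            for _, b2, v2 in candidates
        )
        if not dominated:
            front.append((w, b, v))
    return front

# Toy values (hypothetical, not taken from the study's dataset).
obs = [80.0, 85.0, 90.0, 88.0, 86.0]
pred_lstm_gcn = [82.0, 83.0, 93.0, 90.0, 85.0]
pred_trans_sage = [79.0, 86.0, 88.0, 87.0, 88.0]

front = pareto_weights(pred_lstm_gcn, pred_trans_sage, obs, steps=20)
for w, b, v in front:
    print(f"w={w:.2f}  bias={b:.3f}  variance={v:.3f}")
```

Each weight on the resulting front represents a different compromise between bias and variance; a single weight then has to be chosen from the front according to the relative importance given to accuracy and stability.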
The novelty of the “ensemble forecast” model is that it combines the advantages of LSTM–GCN and Transformer–GraphSAGE, and obtains the optimal weight coefficients through the multi-objective optimization algorithm to improve the accuracy and stability of the prediction at the same time, which is rarely seen in previous studies. Experimental validation shows that the ensemble forecast model outperforms the comparative benchmark models in terms of forecast accuracy and stability. This advancement not only provides new theoretical support for PM2.5 forecasting, but also provides a solid scientific basis for the development of effective air quality management strategies.
3. Experimental Results and Discussion
3.1. Overall Evaluation Results
The deep learning models in this study were developed on a Windows 10 operating system through the Google Colab platform, using the PyTorch framework. NVIDIA T4 GPUs were used to accelerate computation and improve the efficiency of model training.
Based on the observed values at the sites and the predicted values of the seven models, the average MAE, RMSE, and MAPE over the 36 monitoring sites were calculated, and the results are shown in Table 4. These data are visualized through bar charts and line charts, as shown in Figure 3.
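For reference, the three error metrics and their averaging over sites can be sketched as follows. This is a minimal illustration using the standard definitions of MAE, RMSE, and MAPE; it assumes that MAPE is expressed in percent and that the metrics are computed per site and then averaged. The site names and values are hypothetical.

```python
import math

def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)

def rmse(obs, pred):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mape(obs, pred):
    """Mean absolute percentage error in percent; observations must be nonzero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)

def site_average(metric, obs_by_site, pred_by_site):
    """Compute a metric separately for each site, then average over sites."""
    scores = [metric(obs_by_site[s], pred_by_site[s]) for s in obs_by_site]
    return sum(scores) / len(scores)

# Two hypothetical sites with two hourly values each.
obs_by_site = {"site_1": [10.0, 20.0], "site_2": [40.0, 50.0]}
pred_by_site = {"site_1": [12.0, 17.0], "site_2": [44.0, 45.0]}

for name, metric in (("MAE", mae), ("RMSE", rmse), ("MAPE", mape)):
    print(name, round(site_average(metric, obs_by_site, pred_by_site), 3))
```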
It can be seen from Table 4 and Figure 3 that the ensemble prediction model proposed in this study outperforms the LSTM–GCN, LSTM, GCN, Transformer–GraphSAGE, Transformer, and GraphSAGE models on the evaluation metrics. Among the four single models, LSTM has the smallest MAE and MAPE values, and its RMSE is smaller than those of Transformer and GraphSAGE and only slightly higher than that of GCN. LSTM therefore showed the best prediction performance among the single models, which may be because the specific features and patterns of the dataset are better suited to LSTM. Compared with GCN, LSTM–GCN reduces MAE, RMSE, and MAPE by 3.2%, 3.8%, and 3.6%, respectively. Compared with LSTM, LSTM–GCN has advantages only in MAE and RMSE, which are reduced by 1.2% and 5.1%, respectively. Unlike LSTM–GCN, Transformer–GraphSAGE improves on both of its component models in all three metrics: its MAE, RMSE, and MAPE are 2.514, 4.055, and 2.562, respectively, substantially lower than those of Transformer (3.389, 5.292, 3.484) and GraphSAGE (3.280, 4.971, 3.372). These data show that the combined models have certain advantages over the single models, and that the size of the advantage differs between models, which may be related to the characteristics of the models themselves and of the data. Comparing the two combined models, Transformer–GraphSAGE leads LSTM–GCN on MAE, RMSE, and MAPE, with reductions of 17.5%, 12.9%, and 17.8%, respectively. This suggests that Transformer–GraphSAGE makes better use of the features and patterns of the data. Compared with LSTM–GCN, Ensemble Forecast reduces the three indicators by 22%, 15.4%, and 21%, respectively; compared with Transformer–GraphSAGE, it reduces them by 5.6%, 2.8%, and 3.8%, respectively. Ensemble Forecast, which uses the MOEA/D algorithm to assign weights, is thus able to exploit the advantages of both LSTM–GCN and Transformer–GraphSAGE to make better predictions.
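The percentage reductions quoted in this section are relative reductions with respect to the baseline model. As a worked illustration, the short Python sketch below applies the formula to the only raw values stated in the text, namely those of Transformer–GraphSAGE and its two component models. The function and variable names are hypothetical.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline

metrics = ("MAE", "RMSE", "MAPE")

# Values quoted in the text above.
transformer_graphsage = (2.514, 4.055, 2.562)
baselines = {
    "Transformer": (3.389, 5.292, 3.484),
    "GraphSAGE": (3.280, 4.971, 3.372),
}

# reductions[model][metric] = reduction achieved by Transformer-GraphSAGE.
reductions = {
    model: {
        name: relative_reduction(base, new)
        for name, base, new in zip(metrics, values, transformer_graphsage)
    }
    for model, values in baselines.items()
}

for model, per_metric in reductions.items():
    summary = ", ".join(f"{name} {value:.1f}%" for name, value in per_metric.items())
    print(f"Transformer-GraphSAGE vs {model}: {summary}")
```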
3.2. Analysis of Predicted and Observed Values
Site No. 11 was randomly selected as the research object. To better demonstrate the forecasting performance of the models in the time dimension, the observed values for one week in February, a period of more serious pollution, and the forecast values of all models were extracted and plotted as curves in Figure 4.
It can be observed from Figure 4 that the forecasts of all models fluctuate with the actual values, and this fluctuation shows a similar trend across the models. Among them, the forecast trend of Ensemble Forecast is consistent with the actual values, demonstrating its advantage in forecast performance.
A specific moment with a high observed PM2.5 concentration was randomly selected, and the PM2.5 observations of the 36 monitoring stations and the forecast values of the models at that moment were extracted. To visualize the forecasting ability of the models in the spatial dimension, these values are plotted in Figure 5.
As can be seen from Figure 5, in the spatial dimension, the fluctuation range and evolution trend of the predicted value of the Ensemble Forecast model are relatively consistent with the actual value, and the prediction effect is good.
The correlation between the predictions of all models and the observations at site No. 11 is shown in Figure 6. The black line represents the y = x line, the red line represents the regression line of the model, and the different colored areas represent the density level of the data points.
Figure 6 presents the prediction performance of the seven models on PM2.5 concentrations as density scatter plots. A density scatter plot reveals where the data points concentrate and so visually identifies high-density areas (shown as red areas in the figure), demonstrating the correlation between the predicted and observed values. In Figure 6, the density scatter plot of each model shows the distributional relationship between the predictions and the actual data; the regions where the data points are concentrated indicate a high degree of consistency between the predicted and observed values. In addition, the fitted linear equations and R² coefficients in each subplot provide a quantitative assessment of the accuracy of the model predictions.
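The fitted line and R² shown in each subplot can be obtained from an ordinary least-squares regression of predicted on observed values. The sketch below is a minimal illustration under that assumption; the text does not state which R² definition was used, and the function names and toy data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept

def r_squared(x, y):
    """Coefficient of determination of the least-squares line through (x, y)."""
    slope, intercept = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted concentrations.
observed = [60.0, 75.0, 85.0, 90.0, 110.0]
predicted = [63.0, 72.0, 86.0, 88.0, 105.0]

slope, intercept = fit_line(observed, predicted)
print(f"y = {slope:.3f}x + {intercept:.3f}, R2 = {r_squared(observed, predicted):.4f}")
```

A slope close to 1 and an intercept close to 0 mean that the regression line lies near the y = x line, which is the comparison made visually in the figure.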
In all the subplots, PM2.5 concentrations are mostly concentrated between 85 and 90 μg/m³, which falls in the range of light pollution. Sorted in ascending order of R² values, the models are: Transformer (0.7471), GraphSAGE (0.8332), LSTM–GCN (0.8443), LSTM (0.8493), GCN (0.8667), Transformer–GraphSAGE (0.8793), and Ensemble Forecast (0.8831). The R² value of Transformer–GraphSAGE is higher than those of Transformer and GraphSAGE, but the R² value of LSTM–GCN is lower than those of LSTM and GCN. This may be because the time series features and graph structure features in the data are not pronounced enough, which affects the prediction accuracy of the LSTM–GCN model.
The Ensemble Forecast model has the highest R² value of all models, and its regression line is closer to the y = x line, indicating the highest prediction accuracy of all models. This shows that the Ensemble Forecast model integrates the advantages of LSTM–GCN and Transformer–GraphSAGE well and achieves a better forecasting effect.
3.3. Limitations and Prospects
The experimental results above show that the Ensemble Forecast model performs well in terms of accuracy and stability. Specifically, the proposed model exhibits lower MAE, RMSE, and MAPE values, as well as higher R² values, than the other compared models. The improvement in these indicators not only reflects the improved performance of the model, but also lays a foundation for its future application and development. Although the proposed ensemble forecast model has shown good performance, this study still has some limitations, which need to be explored further in the future. For example, the model could be tested in more regions to further evaluate its generalization ability. In addition, its predictive power could be further enhanced by integrating more data sources, such as emission inventories and satellite observations. The potential of the model in multi-pollutant prediction and long-term prediction is also worth exploring.
4. Conclusions
Accurate and stable prediction of PM2.5 concentration is of great significance for air pollution prevention and control. Existing studies usually rely on a single model or use a single evaluation criterion in multi-model combination weighted forecasting, ignoring the dual needs of accuracy and stability for PM2.5 forecasting. In order to make up for the above shortcomings, this paper proposes a PM2.5 forecasting model, called Ensemble Forecast, which integrates multi-model and multi-objective optimization algorithms.
The proposed model combines the LSTM–GCN and Transformer–GraphSAGE models, which can comprehensively consider temporal and spatial features, and introduces bias and variance as multi-objective optimization indicators. The MOEA/D algorithm is used to determine the optimal weight coefficients of the LSTM–GCN and Transformer–GraphSAGE models for prediction. The study selects Beijing as a case area to predict the PM2.5 concentration in the next hour. A detailed evaluation and comparative analysis is carried out on the test set against the benchmark models (LSTM, GCN, Transformer, GraphSAGE, LSTM–GCN, Transformer–GraphSAGE). The results show that the proposed Ensemble Forecast model has smaller MAE, RMSE, and MAPE values and a larger R² value, and that its prediction effect is better than that of the other comparison models. In conclusion, the Ensemble Forecast model is feasible for predicting PM2.5 concentration and can serve as an effective PM2.5 prediction tool. It is expected that the model will be further improved in the future to provide more powerful support for public health guidance and the government’s air pollution control work.