Article

A Novel Air Pollutant Concentration Prediction System Based on Decomposition-Ensemble Mode and Multi-Objective Optimization for Environmental System Management

1
Business School, Shandong Normal University, Jinan 250014, China
2
School of Statistics, Dongbei University of Finance and Economics, Dalian 116025, China
3
Institute of Systems Engineering, Macau University of Science and Technology, Macao 999078, China
*
Author to whom correspondence should be addressed.
Systems 2022, 10(5), 139; https://doi.org/10.3390/systems10050139
Submission received: 28 July 2022 / Revised: 19 August 2022 / Accepted: 31 August 2022 / Published: 3 September 2022
(This article belongs to the Section Systems Engineering)

Abstract

With the continuous expansion of industrial production and rapid urbanization, increasingly serious air pollution threatens people’s lives and social development. To reduce the losses caused by polluted weather, pollutant concentrations need to be predicted in a timely and accurate manner, which is also a research hotspot and a challenging issue in the field of systems engineering. However, most studies pursue only higher prediction accuracy and ignore robustness. To make up for this defect, a novel air pollutant concentration prediction (APCP) system is proposed for environmental system management. It consists of four modules: time series reconstruction, submodel simulation, weight search, and integration. The system filters and reconstructs redundant series based on the decomposition-ensemble mode, and its weight search mechanism is designed to trade off precision and stability. Taking the hourly concentration of PM2.5 in Guangzhou, Shanghai, and Chengdu, China as an example, the simulation results show that the APCP system has strong prediction capacity and superior stability, so it can serve as an effective tool to guide early-warning decision-making in the management of environmental engineering.

1. Introduction

With the continuous advance of global industrialization and urbanization, the environmental problems accumulated by rapid development, especially air pollution, have become increasingly prominent. Air pollution not only seriously affects human health [1,2] but also causes substantial economic losses to all countries [3]. Specifically, a rise in PM2.5 concentration of 10 μg/m³ can increase mortality by 2% [4], and in 2016 PM2.5 pollution brought a total loss of 101.39 billion dollars to 338 cities in China [5]. Therefore, it is imperative to establish an accurate, reliable, and effective air pollutant concentration prediction (APCP) system that announces the expected pollutant concentration for the next few periods to the public, which is also a research hotspot and challenging issue in the field of systems engineering.
The state-of-the-art work for the prediction of air pollutants (APs) mainly includes deterministic models [6] and data-driven models [7]. The deterministic model relies on the emission source data and various historical meteorological data to simulate the formation of pollutants through the physical and chemical processes of the generation, conversion, and diffusion of APs [8]. For example, CMaq [9,10,11], Wrf-Chem [12,13,14], and LOtos-EUros [15] are all popular deterministic models. W. Wei et al. [16] adopted the Wrf-Chem model to effectively simulate the concentration of APs in Beijing. Based on this model, the impact of pollution emissions on air quality is quantified. However, the complexity of climate, land use, and emission sources is difficult to capture, which limits the development of such models to a certain extent [17].
Relying on the continuous in-depth study of algorithm models and the continuous improvement of computer hardware performance, data-driven models [18] have become the preferred method of prediction, including linear models [19] and nonlinear models [20]. Classical linear models, such as autoregressive integrated moving average (ARima) models [21] and seasonal autoregressive integrated moving average (SARima) models [22], are based on the assumption that the studied sequence is linear. In addition, quantile regression (Qr) [23] and other models are also widely used in the field of AP concentration prediction. T. Liu et al. [24] integrated the ARima model with each of three deterministic models (ARimaX) to predict APs. Taking the results of the CMaq-ARima model as an example, the mean daily root mean squared error (RMSE) at the three stations decreased by $\overline{\mathrm{RMSE}}_{\mathrm{NO_2}} = (46.3\% + 43.4\% + 41.2\%)/3 = 43.6\%$. S. Abdullah et al. [25] fitted the multiple linear regression (MLR) model to study the cross-border impact of APs on Malaysia. Based on the MLR model, the values of the next one, two, and three time points in the input sequence of PM10 concentration were predicted, that is, 1–3 step prediction was performed in the experiment. The 1-step prediction gave the best result, with $R^2 = 0.638$ and $\mathrm{RMSE} = 126.728\ \mu\mathrm{g/m^3}$. Because multicollinearity is prevalent, N. Mohd Napi et al. [26] applied principal component analysis (PCA) to reduce the high correlation between the data and then used MLR to predict the ozone concentration. Compared with the MLR model, the value of $R^2$ was improved by $(R^2_{\mathrm{PCA\text{-}MLR}} - R^2_{\mathrm{MLR}})/R^2_{\mathrm{MLR}} = (0.405 - 0.325)/0.325 \times 100\% = 24.615\%$.
Although the linear model is relatively simple to apply, in practice the AP series is nonlinear and time-varying. Thus, nonlinear models are widely utilized. Y. Bai et al. [27] combined wavelet transform technology with the back propagation (BP) model, named the W-BPnn model, to improve the nonlinear mapping performance of the prediction model. Compared with the BP model, the value of RMSE was reduced by $(\mathrm{RMSE}_{\mathrm{BP}}^{\mathrm{SO_2}} - \mathrm{RMSE}_{\mathrm{W\text{-}BPnn}}^{\mathrm{SO_2}})/\mathrm{RMSE}_{\mathrm{BP}}^{\mathrm{SO_2}} = (12.716 - 8.269)/12.716 \times 100\% = 34.972\%$. Because the sequence of AP concentration has long-term correlation characteristics, scientifically determining the lag order can avoid the vanishing gradient problem of artificial neural networks (ANNs). Considering the relationship between space and time, X. Li et al. [28] extended the Lstm model (LstmE) to predict multi-scale APs, which successfully realized the scientific identification of lag order. The comparison results showed that the required lag order gradually increased as the prediction horizon was extended. Although the error also increased, it was still within the acceptable interval of long-term prediction results. In order to reduce the noise of input data, X. Jiang et al. [29] split the original sequence with the complete ensemble empirical mode decomposition with adaptive noise (CEemdan) strategy and an entropy method, and then input each subsequence into the bidirectional Lstm (Bilstm) model for prediction. This method greatly improved the efficiency of the prediction model.
The individual ANN model lacks flexibility in parameter adjustment, which limits the improvement of prediction accuracy [30]. The optimization algorithm provides a new solution for improving the prediction performance [31,32]. J. Murillo-Escobar et al. [33] designed the Svr-Pso model to estimate the concentration of APs in five regions. Compared with the feedforward artificial neural network (FAnn) model used at the San Buenaventura University in Bello (BEL-USBV), with $\mathrm{RMSE}_{\mathrm{TMVTPC},\,\mathrm{FAnn}}^{\mathrm{BEL\text{-}USBV},\,\mathrm{O_3}} = 11.77$ ppm, Pso improved the prediction performance of the hybrid model, with $\mathrm{RMSE}_{\mathrm{TMVTPC},\,\mathrm{Svr\text{-}Pso}}^{\mathrm{BEL\text{-}USBV},\,\mathrm{O_3}} = 9.43$ ppm. Q. Wu et al. [34] constructed the Ba-LSsvm model to predict the decomposed low-frequency data, while the high-frequency data were simulated after secondary decomposition. The Ba-LSsvm model not only produced better prediction results than Pso optimization (for example, the mean absolute percentage error (MAPE) was reduced by $\mathrm{MAPE}_{\mathrm{Pso}}^{\mathrm{Beijing}} - \mathrm{MAPE}_{\mathrm{Ba}}^{\mathrm{Beijing}} = 0.1273 - 0.0877 = 0.0396$) but also accounted for external variables. Similarly, G. Li et al. [35] proposed the CEemdan-Dse-BVmd-Csa-KElm model to decompose highly complex data and simplify the input sequence. The Csa mechanism searched for the optimal adjustable parameters of the KElm model, which was conducive to configuring the model on demand and realizing effective prediction in multiple fields. However, such optimization mechanisms focus only on improving the precision of model prediction and ignore the importance of model robustness. To balance various performances, a series of multi-objective optimization mechanisms [36,37] represented by the multi-objective dragonfly algorithm (Moda) [38], multi-objective chameleon swarm algorithm (MOcsa) [39], and multi-objective salp swarm algorithm (MSsa) [40] have achieved good results in energy prediction.
A multi-objective optimization mechanism can therefore be developed to balance the prediction performance for APs. In addition, some of the latest research models are evaluated in Table 1.
According to the analysis of existing literature, the shortcomings of different prediction models can be summarized as follows: (i) Deterministic models are highly professional and expensive, which limits the extendability of the model. (ii) The data-driven linear model is limited by linear assumptions, while the actual data is a mixture of linear and nonlinear characteristics. Only a single feature is extracted, which causes the loss of input information. (iii) Although the optimization mechanism avoids the randomness of parameter setting, in most cases, the accuracy of the model is only set as the unique objective function, which lacks the measurement of the stability of the prediction model. Therefore, to solve these drawbacks, a novel air pollutant concentration prediction (APCP) system is proposed for environmental system management in this paper. It is mainly established by four modules, namely time series reconstruction, submodel simulation, weight search, and integration. Specifically, in the time series reconstruction module, the decomposition-ensemble (DE) mode is constructed to filter the redundancy of the original sequence and then merge similar features. The obtained reconstruction sequence is trained by four prediction submodels (convolutional neural network (Cnn), long short-term memory (Lstm), extreme learning machine (Elm), and Elman neural network (Enn)) as input in the submodel simulation module. The outputs of four independent prediction sequences are assigned appropriate weights applying the weight search (Ws) mechanism. Based on this optimal weight, the final prediction results of the APCP system are integrated.
The main contributions of this paper are introduced as follows:
(1) DE mode is designed in the APCP system to filter out the interference of high-frequency noise signals. The sequence of APs contains rich information and has strong volatility. The DE mode is applied to reconstruct the original sequence, which not only eliminates the interference of high-frequency noise signals but also integrates the key information.
(2) A novel weight search mechanism, Ws, is developed in the weight search module. This mechanism makes up for the limitations of previous optimization mechanisms, such as a single objective function, a tendency to fall into local optima, and long running time. The comparative experiments show that the weight search mechanism used in this paper is superior to previous optimization mechanisms in both search ability and stability.
(3) The construction of the four prediction submodels in the APCP system covers the inherent modes of sequence as much as possible, including three ANN models (Cnn, Elm, and Enn) and a deep learning model called Lstm. The four models make use of each other’s advantages to fill in the deficiencies and comprehensively capture sequence modes. Based on the extracted sequence modes, the prediction performance of the APCP system is improved.
(4) The proposed APCP system achieves a trade-off between two objectives, prediction precision and robustness. It comprises four modules: time series reconstruction, submodel simulation, weight search, and integration. Based on hourly PM2.5 concentration data in Guangzhou, Shanghai, and Chengdu, China, the superiority of the APCP system is verified by three experiments and three discussions. It can also provide scientific data support for air pollution control and policy-making.
The remainder of the paper is arranged as follows. The next section introduces the design of the APCP system, including the construction principles of decomposition-ensemble mode and weight search mechanism, and the structure of the APCP system. Section 3 describes three experiments to verify the superiority of the proposed APCP system. Three discussions, which are significance analysis, correlation analysis, and sensitivity analysis, are carried out in Section 4. Section 5 summarizes the conclusions and prospects for the further development of the APCP system.

2. Design of the APCP System

The APCP system is established by four modules, which are time series reconstruction, submodel simulation, weight search, and integration. As depicted visually in Figure 1, firstly, the original sequence is split into subsequences of different frequencies by DE mode, and noise is filtered out. After recombining the subsequences according to the similarity, the multi-step prediction is realized in four individual prediction submodels, respectively. Next, under the constraint of the multi-objective function, the weight search mechanism is constructed to determine the appropriate weights of the four submodels. Finally, the multi-step prediction results are output based on the obtained weights.
In this section, before the structure of the APCP system is elaborated in detail, the DE mode and weight search mechanism applied to the system are designed as follows.

2.1. Decomposition-Ensemble Mode

The concentration series of AP has a complex structure and frequent fluctuations, which will cause a computational burden if directly input into the prediction model. Therefore, the paper designs the DE mode to divide the sequence structure and clarify the modal characteristics.
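First, noise $\tilde{\Lambda}_v^1 \sim N(0,1)$, $v = 1, 2, \ldots, V$, is added to the original concentration sequence $C_{AP}$ to form a new fluctuation sequence $C_v^1 = C_{AP} + \iota_1 \tilde{\Lambda}_v^1$, where $N(0,1)$ denotes the standard normal distribution and $\iota_1$ is a random number. Then, the upper boundary $\overline{C}_v^1 = \Delta_3(\overline{E}(C_v^1))$ and the lower boundary $\underline{C}_v^1 = \Delta_3(\underline{E}(C_v^1))$ of $C_v^1$ are calculated, where $\Delta_3(\cdot)$ is the cubic spline interpolation function, and $\overline{E}(C_v^1)$ and $\underline{E}(C_v^1)$ represent the local maximum and minimum value sequences of $C_v^1$, respectively. The first component $\tilde{C}_v^1 = C_v^1 - (\overline{C}_v^1 + \underline{C}_v^1)/2$ is obtained. If $\tilde{C}_v^1$ satisfies Equation (1), then the first mode component $\tilde{T}_v^1$ can be expressed as $\tilde{T}_v^1 = \tilde{C}_v^1$; otherwise, let $C_v^1 = \tilde{C}_v^1$ and repeat the above process.
$$\begin{cases} \left| N\!\left(\overline{E}(\tilde{C}_v^1)\right) + N\!\left(\underline{E}(\tilde{C}_v^1)\right) - N\!\left(Z(\tilde{C}_v^1)\right) \right| \le 1 \\ \left(\overline{\tilde{C}}_{v,c}^1 + \underline{\tilde{C}}_{v,c}^1\right)/2 = 0, \quad c = 1, 2, \ldots, C \end{cases} \tag{1}$$
where $N(\cdot)$ stands for the counting function, $Z(\tilde{C}_v^1)$ represents the set of points with zero value in the sequence $\tilde{C}_v^1$, $\overline{\tilde{C}}_{v,c}^1$ and $\underline{\tilde{C}}_{v,c}^1$ indicate the upper and lower boundaries of $\tilde{C}_v^1$ at time point $c$, and $C$ is the total number of time points in $\tilde{C}_v^1$.
Based on this, the first intrinsic mode function (IMF) $\overline{\tilde{T}_1}$ and the residual sequence $\xi_1$ are listed in Equations (2) and (3).
$$\begin{aligned} \overline{\tilde{T}_1} &= \sum_{v=1}^{V} \tilde{T}_v^1 / V = \sum_{v=1}^{V} \tilde{C}_v^1 / V \\ &= \sum_{v=1}^{V} \left[ C_v^1 - \left(\overline{C}_v^1 + \underline{C}_v^1\right)/2 \right] / V \\ &= \sum_{v=1}^{V} \left[ C_v^1 - \left( \Delta_3(\overline{E}(C_v^1)) + \Delta_3(\underline{E}(C_v^1)) \right)/2 \right] / V \\ &= \sum_{v=1}^{V} \left[ C_{AP} + \iota_1 \tilde{\Lambda}_v^1 - \left( \Delta_3(\overline{E}(C_{AP} + \iota_1 \tilde{\Lambda}_v^1)) + \Delta_3(\underline{E}(C_{AP} + \iota_1 \tilde{\Lambda}_v^1)) \right)/2 \right] / V \\ &= C_{AP} + \sum_{v=1}^{V} \left[ \iota_1 \tilde{\Lambda}_v^1 - \left( \Delta_3(\overline{E}(C_{AP} + \iota_1 \tilde{\Lambda}_v^1)) + \Delta_3(\underline{E}(C_{AP} + \iota_1 \tilde{\Lambda}_v^1)) \right)/2 \right] / V \end{aligned} \tag{2}$$
$$\begin{aligned} \xi_1 &= C_{AP} - \overline{\tilde{T}_1} \\ &= -\sum_{v=1}^{V} \left[ \iota_1 \tilde{\Lambda}_v^1 - \left( \Delta_3(\overline{E}(C_{AP} + \iota_1 \tilde{\Lambda}_v^1)) + \Delta_3(\underline{E}(C_{AP} + \iota_1 \tilde{\Lambda}_v^1)) \right)/2 \right] / V \\ &= \sum_{v=1}^{V} \left[ \left( \Delta_3(\overline{E}(C_{AP} + \iota_1 \tilde{\Lambda}_v^1)) + \Delta_3(\underline{E}(C_{AP} + \iota_1 \tilde{\Lambda}_v^1)) \right)/2 - \iota_1 \tilde{\Lambda}_v^1 \right] / V \end{aligned} \tag{3}$$
Similarly, let $C_v^2 = \xi_1 + \iota_2 \tilde{\Lambda}_v^2$, $\tilde{\Lambda}_v^2 \sim N(0,1)$, and execute all the above steps again. The results $\overline{\tilde{T}_2}$ and $\xi_2$ are expressed in Equations (4) and (5).
$$\overline{\tilde{T}_2} = \xi_1 + \sum_{v=1}^{V} \left[ \iota_2 \tilde{\Lambda}_v^2 - \left( \Delta_3(\overline{E}(\xi_1 + \iota_2 \tilde{\Lambda}_v^2)) + \Delta_3(\underline{E}(\xi_1 + \iota_2 \tilde{\Lambda}_v^2)) \right)/2 \right] / V \tag{4}$$
$$\xi_2 = \sum_{v=1}^{V} \left[ \left( \Delta_3(\overline{E}(\xi_1 + \iota_2 \tilde{\Lambda}_v^2)) + \Delta_3(\underline{E}(\xi_1 + \iota_2 \tilde{\Lambda}_v^2)) \right)/2 - \iota_2 \tilde{\Lambda}_v^2 \right] / V \tag{5}$$
The process continues in the same way until $\xi$ is a monotonic function, at which point the DE mode is completed. All output results ($\overline{\tilde{T}_l}$, $l = 1, 2, \ldots, L$, and $\xi$) satisfy Equation (6).
$$C_{AP} = \sum_{l=1}^{L} \overline{\tilde{T}_l} + \xi \tag{6}$$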
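The additive structure of Equation (6) can be illustrated with a minimal Python sketch. It is not the DE mode itself: a centered moving average stands in for the spline-envelope mean, no noise realizations are added, and the names `moving_average`, `decompose`, `n_modes`, and `window` are hypothetical. The sketch only shows that when each mode is extracted as "current residual minus its local mean", the modes and the final residual sum back to the original sequence.

```python
def moving_average(series, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def decompose(series, n_modes=3, window=3):
    """Split `series` into `n_modes` detail components plus a residual.

    Stand-in for envelope-mean sifting (an assumption, not the paper's
    procedure): each mode is the current residual minus its local mean,
    and the local mean becomes the next residual.
    """
    modes = []
    residual = list(series)
    for _ in range(n_modes):
        local_mean = moving_average(residual, window)
        modes.append([r - m for r, m in zip(residual, local_mean)])
        residual = local_mean
    return modes, residual


# Toy hourly concentrations (made-up values, not from the paper's dataset).
toy = [35.0, 42.0, 38.0, 51.0, 47.0, 60.0, 55.0, 49.0, 44.0, 40.0]
modes, residual = decompose(toy)

# Counterpart of Equation (6): modes plus residual rebuild the input.
rebuilt = [sum(parts) + r for parts, r in zip(zip(*modes), residual)]
```

Because each step only moves the local mean from one component to the next, the reconstruction holds up to floating-point rounding regardless of the smoother used.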

2.2. Weight Search Mechanism

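To improve the comprehensive ability of the APCP system, the weight allocation of four individual submodels is the key to the successful application under the constraints of multi-objective functions. As an effective weight search mechanism, the multi-objective grasshopper optimization algorithm (Mogoa) is developed in the APCP system, and its structure is constructed as follows. Initially, the grasshopper population $\Gamma$ with $\alpha$ individuals and $\beta$ dimensions is listed in Equation (7).
$$\Gamma = \begin{bmatrix} \tau_1^1 & \tau_2^1 & \cdots & \tau_\beta^1 \\ \tau_1^2 & \tau_2^2 & \cdots & \tau_\beta^2 \\ \vdots & \vdots & \ddots & \vdots \\ \tau_1^\alpha & \tau_2^\alpha & \cdots & \tau_\beta^\alpha \end{bmatrix} \tag{7}$$
According to [52], the location of $\Gamma^g$, $g = 1, 2, \ldots, \alpha$, is not only affected by the other individuals $\Gamma^{g'}$, $g' \ne g$, but also related to gravity $\Theta^g$ and wind direction $\Xi^g$, as shown in Equation (8).
$$\Gamma^g = \lambda_1 \sum_{\substack{g'=1 \\ g' \ne g}}^{\alpha} \left[ \varepsilon_1 \exp\!\left( -\frac{|\Gamma^{gg'}|}{\varepsilon_2} \right) - \exp\!\left( -|\Gamma^{gg'}| \right) \right] \frac{\Gamma^{gg'}}{|\Gamma^{gg'}|} - \lambda_2 \Theta^g + \lambda_3 \Xi^g \tag{8}$$
where $\Gamma^{gg'} = \Gamma^{g'} - \Gamma^{g}$, $\varepsilon_1$ and $\varepsilon_2$ are two attraction parameters, and $\lambda_k = \mathrm{rand}(1)$, $k = 1, 2, 3$.
Due to insufficient convergence, Equation (8) cannot realize the search for optimal weight. An improved convergence function is developed in the APCP system, as described in Equation (9). It is assumed that the change of $\Gamma_h^g$, $g = 1, 2, \ldots, \alpha$; $h = 1, 2, \ldots, \beta$, is not affected by $\Theta^g$, and $\Xi^g$ remains towards the food source $\Omega_h^F$.
$$\Gamma_h^g = \delta \left( \sum_{\substack{g'=1 \\ g' \ne g}}^{\alpha} \delta \, \frac{\overline{\Gamma}_h - \underline{\Gamma}_h}{2} \left[ \varepsilon_1 \exp\!\left( -\frac{|\Gamma_h^{gg'}|}{\varepsilon_2} \right) - \exp\!\left( -|\Gamma_h^{gg'}| \right) \right] \frac{\Gamma^{gg'}}{|\Gamma^{gg'}|} \right) + \Omega_h^F \tag{9}$$
where $\overline{\Gamma}_h$ and $\underline{\Gamma}_h$ are the upper and lower bounds of the inequality constraint range of $\Gamma_h^g$, $\Gamma_h^{gg'} = \Gamma_h^{g'} - \Gamma_h^{g}$, $h = 1, 2, \ldots, \beta$, and $\delta$ satisfies Equation (10).
$$\delta = \frac{(ITer - \eta)\,\overline{\delta} + \eta\,\underline{\delta}}{ITer} \tag{10}$$
where $\overline{\delta} = \max(\delta)$, $\underline{\delta} = \min(\delta)$, $\eta$ indicates the current iteration, and $ITer$ expresses the total number of cycles.
If the location of $\Gamma_h^g$ moves beyond the boundaries ($\overline{\Gamma}_h$ and $\underline{\Gamma}_h$), $\Gamma_h^g$ is reset by Equation (11).
$$\Gamma_h^g = \begin{cases} \underline{\Gamma}_h & \text{if } \Gamma_h^g < \underline{\Gamma}_h \\ \overline{\Gamma}_h & \text{if } \Gamma_h^g > \overline{\Gamma}_h \\ \Gamma_h^g & \text{if } \underline{\Gamma}_h \le \Gamma_h^g \le \overline{\Gamma}_h \end{cases} \tag{11}$$
When $\eta = ITer$, the output results are the suitable weights $\Pi$ of the APCP system. The pseudo-code of the weight search mechanism is listed in Algorithm 1.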
Algorithm 1. Weight search mechanism
Input:
$\Psi = (\phi_1, \phi_2, \ldots, \phi_{N_{TRain}})$
Outputs:
F: the best fitness results
$\boldsymbol{\pi}$: the suitable weights
Parameters:
ITer: the total number of iterations
$\eta$: the current iteration number
$\Gamma^g$: the location of the $g$-th grasshopper
$\alpha$: the number of $\Gamma$
$\beta$: the dimension of $\Gamma$
ARchIvemax: the archive size
ARchIvenum: the number of repositories
/* Initialize $\Gamma$. */
/* Set $\overline{\delta}$, $\underline{\delta}$, and ITer. */
/* Calculate F of each search agent. */
/* $\boldsymbol{\pi}$ = BestSearch_Agent. */
WHILE ($\eta <$ ITer) DO
    /* Update coefficient $\delta$. */
    $\delta = \left[ (ITer - \eta)\,\overline{\delta} + \eta\,\underline{\delta} \right] / ITer$
    FOR (each search agent) DO
        /* Normalize the distances $|\Gamma^{g} - \Gamma^{g'}|$ into $[1, 4]$. */
        /* Update $\Gamma_h^g$. */
        $\Gamma_h^g = \delta \left( \sum_{g'=1,\, g' \ne g}^{\alpha} \delta \, \frac{\overline{\Gamma}_h - \underline{\Gamma}_h}{2} \left[ \varepsilon_1 \exp\!\left( -|\Gamma_h^{gg'}| / \varepsilon_2 \right) - \exp\!\left( -|\Gamma_h^{gg'}| \right) \right] \frac{\Gamma^{gg'}}{|\Gamma^{gg'}|} \right) + \Omega_h^F$
        /* Reset $\Gamma_h^g$ if it moves beyond the boundaries. */
        $\Gamma_h^g = \underline{\Gamma}_h$ if $\Gamma_h^g < \underline{\Gamma}_h$; $\overline{\Gamma}_h$ if $\Gamma_h^g > \overline{\Gamma}_h$; $\Gamma_h^g$ if $\underline{\Gamma}_h \le \Gamma_h^g \le \overline{\Gamma}_h$
        /* Update $\boldsymbol{\pi}$ if a better solution is produced. */
        /* Calculate F of each search agent. */
        /* Identify the non-dominated solutions. */
        /* Extend the repository based on the non-dominated solutions. */
        IF ARchIvenum > ARchIvemax THEN
            /* Start the repository maintainer to remove one repository resident. */
            /* Put the new non-dominated solution into it. */
        END IF
    END FOR
    $\eta = \eta + 1$
END WHILE
RETURN $\boldsymbol{\pi}$
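The core update loop of Algorithm 1 can be sketched in plain Python as follows. This is a simplified, single-objective illustration under stated assumptions, not the authors' implementation: the archive and non-dominated sorting are omitted, the distance normalization into [1, 4] is skipped, the food source is simply the best agent found so far, and the function names, the toy objective, and the parameter values (`eps1=0.5`, `eps2=1.5`, `d_max=1.0`, `d_min=1e-5`, ten agents, fifty iterations) are illustrative choices.

```python
import math
import random


def social_force(r, eps1=0.5, eps2=1.5):
    """Bracketed term of Equation (9): eps1 * exp(-r / eps2) - exp(-r)."""
    return eps1 * math.exp(-r / eps2) - math.exp(-r)


def delta_schedule(eta, n_iter, d_max=1.0, d_min=1e-5):
    """Linear decrease of the coefficient delta, Equation (10)."""
    return ((n_iter - eta) * d_max + eta * d_min) / n_iter


def clip(x, lower, upper):
    """Boundary reset, Equation (11)."""
    return max(lower, min(upper, x))


def update_swarm(swarm, food, delta, lower, upper):
    """Apply Equations (9) and (11) once to every agent (synchronous update)."""
    new_swarm = []
    for g, agent in enumerate(swarm):
        new_agent = []
        for h in range(len(agent)):
            total = 0.0
            for g2, other in enumerate(swarm):
                if g2 == g:
                    continue
                dist = math.dist(agent, other)
                if dist == 0.0:
                    continue  # coincident agents exert no directed force
                unit = (other[h] - agent[h]) / dist  # h-th component of the unit vector
                gap = abs(other[h] - agent[h])       # per-dimension distance
                total += delta * (upper - lower) / 2 * social_force(gap) * unit
            new_agent.append(clip(delta * total + food[h], lower, upper))
        new_swarm.append(new_agent)
    return new_swarm


def toy_error(weights):
    """Hypothetical single objective: squared distance from equal weights."""
    return sum((w - 0.25) ** 2 for w in weights)


random.seed(0)
LOWER, UPPER, N_ITER = 0.0, 1.0, 50
swarm = [[random.uniform(LOWER, UPPER) for _ in range(4)] for _ in range(10)]
best = min(swarm, key=toy_error)
initial_error = toy_error(best)
for eta in range(1, N_ITER + 1):
    delta = delta_schedule(eta, N_ITER)
    swarm = update_swarm(swarm, best, delta, LOWER, UPPER)
    candidate = min(swarm, key=toy_error)
    if toy_error(candidate) < toy_error(best):
        best = candidate
```

As the coefficient shrinks over the iterations, the interaction term fades and the agents contract around the food source, which is the convergence behavior that Equation (9) is designed to provide.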

2.3. Framework of the APCP System

The APCP system is designed into four modules, as described in Figure 1, which are independent and coordinated. In this subsection, the structural design of each module is carefully explained to prove the scientific nature and rationality of the proposed system.

2.3.1. Time Series Reconstruction

Since only the influence of the historical AP series is considered, extracting as much information as possible from the original series is the premise for enhancing the performance of the APCP system. Frequent fluctuations and disordered noise place a burden on the prediction model. Therefore, the DE mode is designed to split the AP concentration series into a larger number of simpler subsequences, which clarifies its structure. The subsequences are then recombined according to the principle of similarity to reconstruct the AP series. This not only reduces redundant interference but also preserves the integrity of the series information.

2.3.2. Submodel Simulation

Using the PM2.5 data of Guangzhou, Shanghai, and Chengdu in January 2015, a pre-experiment including the Cnn, Bilstm, least square support vector machine (Lssvm), Gaussian process (Gp), Lstm, gated recurrent unit (Gru), Elm, and Enn models is carried out to select the appropriate prediction submodels. From the results of the four evaluation indicators in Table 2, the prediction ability of the Cnn, Lstm, Elm, and Enn models is similar, with $(\mathrm{MAPE}_{\mathrm{Cnn}}^{\mathrm{Ave.}}, \mathrm{MAPE}_{\mathrm{Lstm}}^{\mathrm{Ave.}}, \mathrm{MAPE}_{\mathrm{Elm}}^{\mathrm{Ave.}}, \mathrm{MAPE}_{\mathrm{Enn}}^{\mathrm{Ave.}}) = (7.751\%, 7.630\%, 7.654\%, 7.815\%)$, and superior to that of the other models. Thus, Cnn, Lstm, Elm, and Enn are applied as effective prediction tools in the submodel simulation module.

2.3.3. Weight Search

The prediction ability of the individual submodel is limited, and it fails to fit all the characteristics of the AP series. The weight search mechanism can provide a new solution for identifying information from multiple perspectives by integrating all submodels. The APCP system aims to balance the prediction precision and robustness, so the objective functions of the weight search mechanism are shown in Equation (12). After the iteration, the optimal weights $\Pi$ of the four submodels are output.
$$\min \begin{cases} obj_{I} = \mathrm{mean}\left( \left| \dfrac{O_\sigma(596\text{th–}744\text{th}) - Fit_\sigma(596\text{th–}744\text{th})}{O_\sigma(596\text{th–}744\text{th})} \right| \right) \\[2ex] obj_{II} = \mathrm{std}\left( O_\sigma(596\text{th–}744\text{th}) - Fit_\sigma(596\text{th–}744\text{th}) \right) \end{cases} \tag{12}$$
where $O_\sigma(\cdot)$ means the original series, $Fit_\sigma(\cdot)$ is the fitting results, and 596th–744th denotes the points of the AP series used for testing.
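As an illustration of how the two objectives in Equation (12) can be evaluated and compared, the following Python sketch computes an accuracy objective and a stability objective for a fitted series and filters a set of candidate objective pairs down to the non-dominated ones. The absolute value in the first objective and the use of the sample standard deviation are assumptions, and all function names and numbers are hypothetical.

```python
import statistics


def objectives(observed, fitted):
    """Return (obj_I, obj_II): mean relative absolute error and std of the errors."""
    errors = [o - f for o, f in zip(observed, fitted)]
    obj_i = statistics.mean(abs(e) / o for e, o in zip(errors, observed))
    obj_ii = statistics.stdev(errors)  # sample standard deviation (assumption)
    return obj_i, obj_ii


def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Keep the objective pairs that no other pair dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy test window (made-up values, not from the paper's dataset).
observed = [50.0, 60.0, 40.0, 80.0]
fitted = [48.0, 63.0, 41.0, 78.0]
obj_i, obj_ii = objectives(observed, fitted)

# Three hypothetical candidates: the third is worse in both objectives.
candidates = [(0.035, 2.45), (0.050, 2.00), (0.060, 3.00)]
front = non_dominated(candidates)
```

The first two candidates trade accuracy against stability, so neither dominates the other and both stay on the front; this is the situation in which a multi-objective search has to keep an archive rather than a single best solution.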

2.3.4. Integration

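According to Equation (13), the fitting values $\boldsymbol{\vartheta}$ of the four submodels are integrated with the corresponding optimal weights $\boldsymbol{\pi}$ to obtain the final prediction results $\hat{\Phi}_{APCP}$.
$$\hat{\Phi}_{APCP} = \hat{\varphi}(\boldsymbol{\vartheta}, \boldsymbol{\pi}) = \sum_{s=1}^{4} \pi_s \vartheta_s \tag{13}$$
where $\sum_{s=1}^{4} \pi_s = 1$.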
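The weighted integration of Equation (13) amounts to a convex combination of the submodel outputs. A minimal Python sketch is given below; rescaling the weights so that they sum to one is one possible way to enforce the constraint and is an assumption here, as are the function name `integrate` and the toy numbers.

```python
def integrate(submodel_fits, weights):
    """Weighted sum of submodel outputs after rescaling weights to sum to 1."""
    if len(submodel_fits) != len(weights):
        raise ValueError("one weight per submodel is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    normalized = [w / total for w in weights]
    horizon = len(submodel_fits[0])
    return [
        sum(w * fit[t] for w, fit in zip(normalized, submodel_fits))
        for t in range(horizon)
    ]


# Toy outputs of four submodels for two time points (made-up values).
fits = [
    [50.0, 60.0],  # e.g. Cnn
    [52.0, 58.0],  # e.g. Lstm
    [49.0, 61.0],  # e.g. Elm
    [51.0, 59.0],  # e.g. Enn
]
combined = integrate(fits, [0.4, 0.3, 0.2, 0.1])
same_combined = integrate(fits, [4, 3, 2, 1])  # rescaled to the same weights
```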
Furthermore, to ensure the best performance of the APCP system, the parameters of all models included in the system are set to the values listed in Table 3, which were determined by trial and error.

3. Establishment and Evaluation of Experiment

Taking the hourly PM2.5 concentration data of Guangzhou, Shanghai, and Chengdu as an example, this section makes a comparative analysis of three experiments on the APCP system, which aims to explain the performance of the proposed system from multiple perspectives.

3.1. Description of Datasets

This paper adopts hourly PM2.5 concentration data from three cities in China, namely Guangzhou (Gz), Shanghai (Sh), and Chengdu (Cd), in January 2015, which are from http://archive.ics.uci.edu/ (accessed on 8 July 2022). The division of datasets and basic information such as maximum (Max.), minimum (Min.), mean (Mean), standard deviation (Std.), and maximum Lyapunov exponent (Mlye) are shown in Table 4.
According to the results of statistical evaluation, in this month the average PM2.5 concentration in Chengdu is the highest, $(\mathrm{Mean}_{\mathrm{Total}}^{\mathrm{Gz}}, \mathrm{Mean}_{\mathrm{Total}}^{\mathrm{Sh}}, \mathrm{Mean}_{\mathrm{Total}}^{\mathrm{Cd}}) = (34.155, 55.563, 64.711)$, and it has a large fluctuation range, $(\mathrm{Std.}_{\mathrm{Total}}^{\mathrm{Gz}}, \mathrm{Std.}_{\mathrm{Total}}^{\mathrm{Sh}}, \mathrm{Std.}_{\mathrm{Total}}^{\mathrm{Cd}}) = (69.609, 84.036, 141.839)$. Besides, the Mlye values of the three cities are greater than zero, suggesting that the experimental data are chaotic. This shows the necessity of the time series reconstruction module in the APCP system.

3.2. Evaluation Indexes

To make the experimental results easy to compare, four evaluation criteria are utilized: MAPE, MRE, MSE, and R2. The expressions are given in Table 5.
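For orientation, three of these criteria can be computed as in the Python sketch below. The formulas are the common textbook definitions and are an assumption here, since the exact expressions are those of Table 5; MRE is left out because its precise form depends on that table. The function names and sample values are hypothetical.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy values (made up, not from the paper's dataset).
observed = [50.0, 60.0, 40.0, 80.0]
predicted = [48.0, 63.0, 41.0, 78.0]
scores = {
    "MAPE": mape(observed, predicted),
    "MSE": mse(observed, predicted),
    "R2": r_squared(observed, predicted),
}
```

MAPE and MSE are negative indicators (smaller is better), while R2 is a positive indicator (larger is better), which matters when improvements are compared across criteria.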

3.3. Experiment I: Comparison with the Individual Models

Experiment I analyzes the multi-step prediction results of the APCP system compared with well-known models, which are Bilstm, Gru, Gp, and Lssvm models, and individual submodels, including Cnn, Lstm, Elm, and Enn models, as revealed in Table 6 and Figure 2. In order to ensure fairness, the data of the first four hours are used as input to predict the value of the fifth time point in all models. Moreover, the parameter settings of all models are shown in Table 3.
(a) For the dataset in Gz, the APCP system produces the best results in all measurement indicators. For the 1-step prediction, the MSE value of the APCP system is only $\mathrm{MSE}_{\mathrm{APCP}}^{\mathrm{Gz},\,1\text{-step}} = 20.861$, while the minimum value among the other individual models is $\mathrm{MSE}_{\mathrm{Lstm}}^{\mathrm{Gz},\,1\text{-step}} = 22.180$. Similar results are also found in the MRE and MAPE values. For the positive indicator, the R2 value of the APCP system is $R_{\mathrm{APCP}}^{2} = 0.965$, which is higher than the average result of the four well-known individual models, $\overline{R^{2}} = (R_{\mathrm{Bilstm}}^{2} + R_{\mathrm{Gru}}^{2} + R_{\mathrm{Gp}}^{2} + R_{\mathrm{Lssvm}}^{2})/4 = (0.947 + 0.943 + 0.832 + 0.950)/4 = 0.918$. In the 2-step prediction, the MRE value of the APCP system, $\mathrm{MRE}_{\mathrm{APCP}}^{\mathrm{Gz},\,2\text{-step}} = 0.014$, is even better than that of the Gru and Gp models in the 1-step prediction ($\mathrm{MRE}_{\mathrm{Gru}}^{\mathrm{Gz},\,1\text{-step}} = 0.029$ and $\mathrm{MRE}_{\mathrm{Gp}}^{\mathrm{Gz},\,1\text{-step}} = 0.057$). Besides, the performance of the 3-step prediction in the APCP system, $\mathrm{MAPE}_{\mathrm{APCP}}^{\mathrm{Gz},\,3\text{-step}} = 9.710\%$, differs little from that of the 2-step prediction, $\mathrm{MAPE}_{\mathrm{APCP}}^{\mathrm{Gz},\,2\text{-step}} = 9.552\%$, which indicates that the proposed system has better stability.
(b) For the dataset in Sh, although the overall performance of the APCP system is not as good as that for the dataset in Gz, it is still superior to the other individual models. For the 1-step prediction, the advantage of the APCP system is not outstanding: $(R_{\mathrm{APCP}}^{2}, R_{\mathrm{Cnn}}^{2}, R_{\mathrm{Lstm}}^{2}, R_{\mathrm{Elm}}^{2}, R_{\mathrm{Enn}}^{2}) = (0.976, 0.973, 0.975, 0.974, 0.974)$. Because the individual submodels already show good prediction capacity, the room for improvement of the APCP system is limited. However, as the number of prediction steps increases, the superiority of the APCP system becomes more and more obvious. In the 2-step prediction, the MSE of the APCP system is $\mathrm{MSE}_{\mathrm{Elm}}^{\mathrm{Sh},\,2\text{-step}} - \mathrm{MSE}_{\mathrm{APCP}}^{\mathrm{Sh},\,2\text{-step}} = 159.395 - 143.385 = 16.010$ lower than that of the Elm model, which has the best result among all individual models. In the 3-step prediction, the MSE of the APCP system is $\mathrm{MSE}_{\mathrm{Cnn}}^{\mathrm{Sh},\,3\text{-step}} - \mathrm{MSE}_{\mathrm{APCP}}^{\mathrm{Sh},\,3\text{-step}} = 160.774 - 147.137 = 13.637$ lower than that of the Cnn model, which has the best result. It is worth noting that the MAPE of the APCP system in the 3-step prediction, $\mathrm{MAPE}_{\mathrm{APCP}}^{\mathrm{Sh},\,3\text{-step}} = 14.994\%$, is less than that in the 2-step prediction, $\mathrm{MAPE}_{\mathrm{APCP}}^{\mathrm{Sh},\,2\text{-step}} = 15.477\%$.
(c) For the dataset in Cd, the R2 value is the largest, which indicates that the historical data have the strongest ability to explain the predicted value at the next time point. For example, in the 3-step prediction of the APCP system, R2 in Cd, $R_{\mathrm{APCP}}^{2}\big|_{\mathrm{Cd},\,3\text{-step}} = 0.952$, is much higher than that in Gz, $R_{\mathrm{APCP}}^{2}\big|_{\mathrm{Gz},\,3\text{-step}} = 0.854$, and Sh, $R_{\mathrm{APCP}}^{2}\big|_{\mathrm{Sh},\,3\text{-step}} = 0.894$. Despite the frequent fluctuations of the dataset in Cd, which make the measurement errors of the models large, the APCP system still achieves accurate prediction. Compared with the average MSE of the four well-known models, $\overline{\mathrm{MSE}^{\mathrm{Cd},\,2\text{-step}}} = (\mathrm{MSE}_{\mathrm{Bilstm}}^{\mathrm{Cd},\,2\text{-step}} + \mathrm{MSE}_{\mathrm{Gru}}^{\mathrm{Cd},\,2\text{-step}} + \mathrm{MSE}_{\mathrm{Gp}}^{\mathrm{Cd},\,2\text{-step}} + \mathrm{MSE}_{\mathrm{Lssvm}}^{\mathrm{Cd},\,2\text{-step}})/4 = (203.926 + 520.354 + 368.929 + 183.649)/4 = 319.2145$, the MSE of the APCP system is only $\mathrm{MSE}_{\mathrm{APCP}}^{\mathrm{Cd},\,2\text{-step}} = 157.888$. For the 3-step prediction results, though the Cnn model performs the best of all individual models, $(\mathrm{MRE}_{\mathrm{Cnn}}^{\mathrm{Cd},\,3\text{-step}}, \mathrm{MAPE}_{\mathrm{Cnn}}^{\mathrm{Cd},\,3\text{-step}}) = (0.062, 14.993\%)$, it is inferior to the APCP system, $(\mathrm{MRE}_{\mathrm{APCP}}^{\mathrm{Cd},\,3\text{-step}}, \mathrm{MAPE}_{\mathrm{APCP}}^{\mathrm{Cd},\,3\text{-step}}) = (0.027, 13.810\%)$.
Remark 1.
Compared with all individual models, including the Bilstm, Gru, Gp, Lssvm, Cnn, Lstm, Elm, and Enn models, the APCP system has the best prediction performance. Even in multi-step prediction, the proposed system remains robust.
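The input construction used in Experiment I, in which four consecutive hourly values predict a later value, can be sketched as a sliding window. The Python example below builds the training pairs directly for a chosen step; whether the paper predicts multiple steps directly or recursively is not stated in this section, so the direct form, the function name `make_windows`, and the parameter names are assumptions.

```python
def make_windows(series, n_inputs=4, step=1):
    """Pair each run of `n_inputs` values with the value `step` points later."""
    samples = []
    for start in range(len(series) - n_inputs - step + 1):
        inputs = series[start:start + n_inputs]
        target = series[start + n_inputs + step - 1]
        samples.append((inputs, target))
    return samples


# Toy series 1..10 so that each value equals its position in the series.
series = list(range(1, 11))
one_step = make_windows(series, n_inputs=4, step=1)
three_step = make_windows(series, n_inputs=4, step=3)
```

With `step=1`, the first pair uses hours 1–4 to predict hour 5, matching the setup described above; larger steps shift the target further ahead and leave fewer usable pairs.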

3.4. Experiment II: Test the Superiority of the APCP System

Based on the results of Experiment I, Experiment II mainly analyzes the percentage improvement of the APCP system compared with four well-known models, which are listed in Table 7. The influence of the magnitude is eliminated, and the superiority of the APCP system can be objectively evaluated. In addition, the average percentage improvement of the three regions is also calculated to illustrate the generality of the APCP system performance.
(a) For the dataset in Gz, the performance of the APCP system is greatly improved in the 1-step and 3-step predictions, while the improvement is relatively small in the 2-step prediction. For example, the MRE value of the APCP system is improved by at least $mre_{\mathrm{vs.\,Lssvm}}^{\mathrm{Gz},\,1\text{-step}} = 98.38\%$ in the 1-step prediction. In contrast, in the 2-step prediction, it is improved by $mre_{\mathrm{vs.\,Gp}}^{\mathrm{Gz},\,2\text{-step}} = 82.07\%$ at most. The APCP system also has a significant advantage in the 3-step prediction, with the MSE improvement ranging from $mse_{\mathrm{vs.\,Gp}}^{\mathrm{Gz},\,3\text{-step}} = 58.46\%$ to $mse_{\mathrm{vs.\,Gru}}^{\mathrm{Gz},\,3\text{-step}} = 76.50\%$, which is also better than in the 2-step prediction, $(mse_{\mathrm{vs.\,Gru}}^{\mathrm{Gz},\,2\text{-step}}, mse_{\mathrm{vs.\,Gp}}^{\mathrm{Gz},\,2\text{-step}}) = (13.91\%, 51.58\%)$.
(b) For the dataset in Sh, the improvement of the APCP system over some models is even higher than that for the dataset in Gz. From the results of Experiment I, the fitting result for the dataset in Gz is significantly better than that for the dataset in Sh, possibly because the dataset in Sh is noisier. However, in terms of improvement, the APCP system can also accurately predict noisy data. Taking the 2-step and 3-step predictions as examples, the values of MSE and MRE in the APCP system are improved by at least $mse_{\mathrm{vs.\,Bilstm}}^{\mathrm{Sh},\,2\text{-step}} = 21.90\%$ and $mre_{\mathrm{vs.\,Bilstm}}^{\mathrm{Sh},\,3\text{-step}} = 81.84\%$, respectively, while for the dataset in Gz, the minimum values of $mse^{\mathrm{Gz},\,2\text{-step}}$ and $mre^{\mathrm{Gz},\,3\text{-step}}$ are only $mse_{\mathrm{vs.\,Gru}}^{\mathrm{Gz},\,2\text{-step}} = 13.91\%$ and $mre_{\mathrm{vs.\,Lssvm}}^{\mathrm{Gz},\,3\text{-step}} = 76.44\%$, respectively.
(c) For the dataset in Cd, as the number of prediction steps increases, the performance advantage of the APCP system becomes more and more significant. From the comparison of MSE results, the percentage improvement of the APCP system gradually increases: $(mse_{\mathrm{vs.\,Bilstm}}^{\mathrm{Cd},\,1\text{-step}}, mse_{\mathrm{vs.\,Bilstm}}^{\mathrm{Cd},\,2\text{-step}}, mse_{\mathrm{vs.\,Bilstm}}^{\mathrm{Cd},\,3\text{-step}}) = (17.90\%, 22.58\%, 67.41\%)$. More prediction steps increase the complexity, so the well-known individual models struggle to maintain stable predictions, which may explain the growing percentage improvement of the APCP system. The same conclusion can be drawn from $mre$ and $mape$.
Remark 2.
In the comparative analysis with the individual models above, the superiority of the APCP system is systematically verified. Although the results differ slightly between regions, the average effect remains outstanding.

3.5. Experiment III: Comparison of Different Module Strategies

Experiment III mainly compares the combined models using different module strategies. Specifically, by changing the strategies of the time series reconstruction module and weight search module, the effectiveness of the DE mode and weight search mechanism in the APCP system is proved. The experimental results of the combined model are depicted in Table 8 and Figure 3. To ensure the comparability of the experiment, the settings of all model parameters remain the same as before, as listed in Table 3.
(a) For the dataset in Gz, the time series reconstruction strategy applied by the APCP system achieves the best prediction effect. The experiments above show that the individual models perform poorly as the number of prediction steps increases. To reduce the difficulty faced by the prediction models, two alternative time series reconstruction strategies are tested: empirical mode decomposition (Emd) and ensemble Emd (Eemd). The Eemd strategy does not produce the expected effect, $(MSE_{Eemd-Ws}^{Gz-1step}, MSE_{Eemd-Ws}^{Gz-2step}, MSE_{Eemd-Ws}^{Gz-3step}) = (83.250, 194.027, 215.869)$, while the Emd strategy helps to improve the prediction accuracy, $(MSE_{Emd-Ws}^{Gz-1step}, MSE_{Emd-Ws}^{Gz-2step}, MSE_{Emd-Ws}^{Gz-3step}) = (47.653, 108.801, 137.224)$. Although the Emd-Ws model attains a higher R2 value in the 3-step prediction ($R^{2,Gz-3step}_{Emd-Ws} = 0.901$) than the APCP system ($R^{2,Gz-3step}_{APCP} = 0.854$), it trails the APCP system in all other cases, $(MSE_{APCP}^{Gz-1step}, MSE_{APCP}^{Gz-2step}, MSE_{APCP}^{Gz-3step}) = (20.861, 76.568, 85.532)$.
(b) For the dataset in Sh, with the same DE mode, the weight search mechanism of the APCP system retains the advantages of all individual models as far as possible. To evaluate the weight search capacity, the multi-objective optimization mechanisms Moda and Mogwo are used as alternatives. These two mechanisms perform almost identically, $(MAPE_{DE-Moda}^{Sh-1step}, MAPE_{DE-Mogwo}^{Sh-1step}) = (12.558\%, 12.356\%)$. Although they outperform the APCP system on individual indicators ($MRE_{DE-Mogwo}^{Sh-2step} = 0.005$ versus $MRE_{APCP}^{Sh-2step} = 0.025$), their overall prediction performance is far inferior to that of the APCP system, $(MAPE_{APCP}^{Sh-1step}, MAPE_{APCP}^{Sh-2step}, MAPE_{APCP}^{Sh-3step}) = (9.139\%, 15.477\%, 14.994\%)$.
(c) For the dataset in Cd, the APCP system keeps a clear advantage regardless of how the module strategies change. Taking the 1-step prediction as an example, the MSE of the APCP system is $\left(MSE_{DE-Moda}^{Cd-1step} - MSE_{APCP}^{Cd-1step}\right) / MSE_{DE-Moda}^{Cd-1step} = (138.789 - 49.914) / 138.789 \times 100\% = 64.036\%$ lower than that of the DE-Moda model. The MRE of the DE-Mogwo model is $MRE_{DE-Mogwo}^{Cd-1step} = 0.048$, whereas that of the APCP system is only $MRE_{APCP}^{Cd-1step} = 0.012$. Similarly, the MAPE of the APCP system is $MAPE_{Emd-Ws}^{Cd-1step} - MAPE_{APCP}^{Cd-1step} = 10.261\% - 7.560\% = 2.701\%$ lower than that of the Emd-Ws model. The R2 value of the APCP system is the highest: $(R^2_{APCP}, R^2_{DE-Moda}, R^2_{DE-Mogwo}, R^2_{Eemd-Ws}, R^2_{Emd-Ws}) = (0.985, 0.959, 0.957, 0.960, 0.977)$.
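The relative and absolute reductions quoted for the Cd dataset can be checked with a few lines of Python. This is only a sketch: the helper name `improvement_pct` is introduced here for illustration, and the input values are the 1-step Cd figures quoted in item (c).

```python
def improvement_pct(baseline: float, proposed: float) -> float:
    """Relative reduction of an error metric, in percent of the baseline value."""
    return (baseline - proposed) / baseline * 100.0


# MSE of DE-Moda vs. APCP, Cd dataset, 1-step prediction (values quoted in the text)
print(round(improvement_pct(138.789, 49.914), 3))  # 64.036

# MAPE of Emd-Ws vs. APCP is compared as a plain difference in percentage points
print(round(10.261 - 7.560, 3))  # 2.701
```

Note that the MSE comparison is a relative reduction, whereas the MAPE comparison is an absolute difference between two percentages.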
Remark 3.
The DE mode and weight search mechanism adopted by the APCP system can minimize the complexity of input data and integrate the advantages of the individual models, which is the key to improving the prediction performance of the APCP system.
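To make the idea of weighted integration concrete, the sketch below combines submodel forecasts with a weight vector and scores the result on two objectives. It is an illustration under stated assumptions, not the system's actual weight search: the function names are hypothetical, and using MSE as the accuracy objective and the standard deviation of the errors as the stability objective is an assumption made here.

```python
import numpy as np


def combine(preds, weights):
    """Weighted sum of submodel forecasts; preds has shape (n_models, n_times)."""
    return np.asarray(weights, dtype=float) @ np.asarray(preds, dtype=float)


def objectives(actual, preds, weights):
    """Return (accuracy, stability) for one weight vector.

    Assumed objectives: MSE of the combined forecast and the standard
    deviation of its errors. A multi-objective optimizer would search for
    weights that trade these two values off against each other.
    """
    err = np.asarray(actual, dtype=float) - combine(preds, weights)
    return float(np.mean(err ** 2)), float(np.std(err))


actual = [1.0, 2.0, 3.0]
preds = [[1.0, 2.0, 3.0],   # submodel 1 (toy values)
         [3.0, 4.0, 5.0]]   # submodel 2 (toy values)
print(objectives(actual, preds, [0.5, 0.5]))
print(objectives(actual, preds, [1.0, 0.0]))
```

Each candidate weight vector maps to a point in the (accuracy, stability) plane, so a search mechanism can keep the non-dominated candidates in an archive and pick a compromise among them.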

4. Discussions

In this section, three analyses (significance analysis, correlation analysis, and sensitivity analysis) are carried out to demonstrate the performance of the APCP system more comprehensively.

4.1. Significance Analysis

Significance analysis examines whether the differences between the APCP system and the models involved in Experiments I–III are significant. The Diebold-Mariano (DM) statistic $\Upsilon_{DM}$ is a general indicator for comparing the predictive accuracy of two forecast sequences [53]. Based on $\Upsilon_{DM}$, this subsection compares the APCP system with eight models, and the results are presented in Table 9.
Compared with the four well-known individual models, the predictions of the APCP system are significantly better at the significance level $\alpha = 0.10$, for example $(\Upsilon_{Bilstm-Gz}^{DM-2step}, \Upsilon_{Bilstm-Sh}^{DM-2step}, \Upsilon_{Bilstm-Cd}^{DM-2step}) = (1.76, 1.77, 1.66)$. The exceptions are the 2-step predictions of the Gru model ($\Upsilon_{Gru-Gz}^{DM-2step} = 0.30$) and the Lssvm model ($(\Upsilon_{Lssvm-Sh}^{DM-2step}, \Upsilon_{Lssvm-Cd}^{DM-2step}) = (1.59, 1.22)$), which do not differ significantly from those of the APCP system. For the combined models with different module strategies, the Emd-Ws model accepts the null hypothesis $H_0$ (equal expected loss $H(\cdot)$ of the forecast errors $\varphi_\iota - \hat{\varphi}_\iota$ of the two compared models) in the 2-step and 3-step predictions of the Sh dataset, $(\Upsilon_{Emd-Ws-Sh}^{DM-2step}, \Upsilon_{Emd-Ws-Sh}^{DM-3step}) = (1.04, 0.52)$. In the other cases (DE-Moda, DE-Mogwo, and Eemd-Ws models), $H_0$ is rejected at the significance level $\alpha = 0.05$, for example $\Upsilon_{DE-Moda-Sh}^{DM-1step} = 2.72$. In other words, these models differ significantly from the APCP system. In summary, the APCP system has significant advantages in improving prediction performance.
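As a rough illustration of the DM comparison, the sketch below computes a DM-type statistic for two forecast series. The squared-error loss, the function name `dm_statistic`, and the simple autocovariance-based variance estimate are assumptions made here; the loss $H$ and the settings behind Table 9 are not restated in this section.

```python
import numpy as np


def dm_statistic(actual, pred_a, pred_b, h=1):
    """DM-type statistic for h-step forecasts under squared-error loss (assumed).

    Positive values mean pred_a has the larger average loss, i.e. pred_b is
    the more accurate forecast.
    """
    y = np.asarray(actual, dtype=float)
    loss_a = (y - np.asarray(pred_a, dtype=float)) ** 2
    loss_b = (y - np.asarray(pred_b, dtype=float)) ** 2
    d = loss_a - loss_b                      # loss differential series
    n = d.size
    dc = d - d.mean()
    var = np.sum(dc * dc) / n                # lag-0 autocovariance
    for k in range(1, h):                    # add autocovariances up to lag h-1
        var += 2.0 * np.sum(dc[k:] * dc[:-k]) / n
    if var <= 0.0:
        raise ValueError("non-positive variance estimate; statistic undefined")
    return d.mean() / np.sqrt(var / n)


rng = np.random.default_rng(0)
truth = rng.normal(size=200)
weak = truth + rng.normal(scale=2.0, size=200)    # synthetic, less accurate forecast
strong = truth + rng.normal(scale=0.5, size=200)  # synthetic, more accurate forecast
print(dm_statistic(truth, weak, strong))
```

Under the usual normal approximation, an absolute value above 1.645 or 1.96 corresponds to two-sided significance at $\alpha = 0.10$ or $\alpha = 0.05$, which matches how the values quoted above are read (1.66 is significant at 0.10, 1.59 is not).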

4.2. Correlation Analysis

This subsection adopts grey relational degree analysis [54] to calculate the correlation $\varsigma_{\upsilon}$ between the predictions $\hat{\varphi}_{\tau}^{\upsilon}$ of each model $\upsilon$ and the real values $\varphi_{\tau}$. The comparison results are shown in Table 10.
Among all models, the APCP system has the strongest correlation with the real values $\varphi_{\tau}$, $(\varsigma_{APCP}^{Gz-1step}, \varsigma_{APCP}^{Gz-2step}, \varsigma_{APCP}^{Gz-3step}) = (0.875, 0.847, 0.883)$. The correlation of the APCP predictions does not decline steadily as the number of prediction steps increases and in some cases even recovers or improves, $(\varsigma_{APCP}^{Sh-1step}, \varsigma_{APCP}^{Sh-2step}, \varsigma_{APCP}^{Sh-3step}) = (0.888, 0.857, 0.876)$, which also confirms the stability of the APCP system. For example, for the dataset in Cd, the value for the 1-step prediction is $\varsigma_{APCP}^{Cd-1step} = 0.840$, while the value for the 3-step prediction is $\varsigma_{APCP}^{Cd-3step} = 0.850$.
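A minimal sketch of how a grey relational degree can be computed is given below. It assumes Deng's classical formulation with distinguishing coefficient $\rho = 0.5$ applied to un-normalised series; the function name and these choices are assumptions, and the exact variant from [54] behind Table 10 is not restated here.

```python
import numpy as np


def grey_relational_degree(reference, candidates, rho=0.5):
    """Grey relational degree of each candidate series to the reference series.

    candidates has shape (n_models, n_times). The minimum and maximum
    deviations are taken over all models and all time points.
    """
    ref = np.asarray(reference, dtype=float)
    cand = np.atleast_2d(np.asarray(candidates, dtype=float))
    delta = np.abs(cand - ref)                 # pointwise absolute deviations
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0.0:                           # every series equals the reference
        return np.ones(cand.shape[0])
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff.mean(axis=1)                  # average coefficient per model


real = [1.0, 2.0, 3.0]
forecasts = [[1.0, 2.0, 3.0],   # toy model that matches the real values
             [2.0, 3.0, 5.0]]   # toy model with larger deviations
print(grey_relational_degree(real, forecasts))
```

A degree closer to 1 indicates that the predicted curve follows the real values more closely; because the extreme deviations are shared across models, the degrees are only comparable within one set of candidates.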

4.3. Sensitivity Analysis

To verify the generality of the APCP system, this section discusses the stability of the weight search mechanism. Three important parameters are varied to evaluate changes in APCP system performance: the iteration number $ITer = \{ITer_1, ITer_2, ITer_3, ITer_4, ITer_5\} = \{300, 350, 400, 450, 500\}$, the population number $\alpha = \{\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5\} = \{20, 40, 60, 80, 100\}$, and the archive size $ARchIve_{max} = \{ARchIve_{max}^{1}, ARchIve_{max}^{2}, ARchIve_{max}^{3}, ARchIve_{max}^{4}, ARchIve_{max}^{5}\} = \{100, 200, 300, 400, 500\}$. The results are depicted in Table 11 and Figure 4.
(a) The performance of the APCP system does not fluctuate greatly with changes in ITer. The sensitivity indicators $\bar{\chi}_{MRE}^{ITer}$ and $\bar{\chi}_{R^2}^{ITer}$ show the volatility of the MRE and R2 values of the APCP system when ITer changes. For all three cities, $\bar{\chi}_{MRE}^{ITer}$ and $\bar{\chi}_{R^2}^{ITer}$ remain stable and close to 0.
(b) The change in $\alpha$ has no obvious influence on the evaluation indices of the APCP system. Although $\bar{\chi}_{MSE}^{\alpha-Cd-2step} = 6.765$ indicates a noticeable fluctuation in the 2-step prediction, this is an isolated case, which may be caused by an insufficient number of experimental runs. In most cases, the four evaluation indicators of the APCP system reflect good stability, for example $(\bar{\chi}_{MSE}^{\alpha-Gz-1step}, \bar{\chi}_{MRE}^{\alpha-Gz-1step}, \bar{\chi}_{MAPE}^{\alpha-Gz-1step}, \bar{\chi}_{R^2}^{\alpha-Gz-1step}) = (0.070, 0.000, 0.010, 0.000)$.
(c) Adjusting $ARchIve_{max}$ also confirms the stability of the APCP system. Except for $\bar{\chi}_{MSE}^{ARchIve_{max}-Cd-2step} = 6.412$ in the 2-step prediction of the dataset in Cd, the error evaluation indices of the APCP system show no significant fluctuation. Even the 3-step prediction results are stable, $(\bar{\chi}_{MSE}^{ARchIve_{max}-Cd-3step}, \bar{\chi}_{MRE}^{ARchIve_{max}-Cd-3step}, \bar{\chi}_{MAPE}^{ARchIve_{max}-Cd-3step}, \bar{\chi}_{R^2}^{ARchIve_{max}-Cd-3step}) = (1.671, 0.001, 0.062, 0.001)$.
Remark 4.
The parameter adjustment of the weight search mechanism has no significant impact on the APCP system, which demonstrates that the system has excellent stability.

5. Conclusions and Prospect

With widespread urbanization and the rapid increase in energy consumption, cities face many air pollution problems. To support the timely and effective formulation of protective measures, a novel APCP system is proposed for environmental system management in this paper. First, based on the DE mode, the AP sequence is reconstructed to reduce redundant interference. Then, four individual models with superior performance are selected to realize AP prediction. Next, the weight search mechanism is designed to balance the precision and robustness of the proposed system. Finally, appropriate weights are adopted to integrate the advantages of all models and produce the final prediction value.
From Experiments I–III, it can be concluded that the multi-step prediction performance of the APCP system is superior. Taking the dataset in Gz as an example, in the 1-step prediction the MSE of the APCP system attains the lowest value, $MSE_{APCP}^{Gz-1step} = 20.861$. Compared with the Lssvm model, the MRE is reduced by $mre_{vs.Lssvm}^{Gz-1step} = 98.38\%$. In the 2-step prediction, the MAPE is $MAPE_{Emd-Ws}^{Gz-2step} - MAPE_{APCP}^{Gz-2step} = 1.441\%$ lower than that of the Emd-Ws model. It is worth noting that the R2 value of the APCP system in the 3-step prediction is even better than that of the DE-Moda model in the 2-step prediction, $(R^{2,Gz-3step}_{APCP}, R^{2,Gz-2step}_{DE-Moda}) = (0.854, 0.668)$. To sum up, the comprehensive capacity of the APCP system is outstanding, and it can be used as an effective tool in actual AP prediction.
Although this paper assumes that the AP concentration sequence is only related to historical values, many external factors, such as wind speed, affect the concentration changes of AP in real life. Based on this consideration, the APCP system can be further expanded in future research to improve its prediction performance.

Author Contributions

Y.H.: Conceptualization, software, writing—original draft preparation. Y.Z.: methodology, visualization, writing—reviewing and editing. J.G.: writing—original draft preparation, software validation. J.W.: formal analysis, software validation. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Major Program of National Social Science Foundation of China (Grant No. 17ZDA093).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the contribution of the anonymous reviewers and the editors.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Liu, H.Y.; Dunea, D.; Iordache, S.; Pohoata, A. A Review of Airborne Particulate Matter Effects on Young Children’s Respiratory Symptoms and Diseases. Atmosphere 2018, 9, 150.
  2. Yang, S.; Fang, D.; Chen, B. Human Health Impact and Economic Effect for PM2.5 Exposure in Typical Cities. Appl. Energy 2019, 249, 316–325.
  3. Hao, Y.; Niu, X.; Wang, J. Impacts of Haze Pollution on China’s Tourism Industry: A System of Economic Loss Analysis. J. Environ. Manag. 2021, 295, 113051.
  4. Kim, K.H.; Kabir, E.; Kabir, S. A Review on the Human Health Impact of Airborne Particulate Matter. Environ. Int. 2015, 74, 136–143.
  5. Maji, K.J.; Ye, W.F.; Arora, M.; Shiva Nagendra, S.M. PM2.5-Related Health and Economic Loss Assessment for 338 Chinese Cities. Environ. Int. 2018, 121, 392–403.
  6. Zhang, Y.; Bocquet, M.; Mallet, V.; Seigneur, C.; Baklanov, A. Real-Time Air Quality Forecasting, Part I: History, Techniques, and Current Status. Atmos. Environ. 2012, 60, 632–655.
  7. Wang, J.; Wang, R.; Li, Z. A Combined Forecasting System Based on Multi-Objective Optimization and Feature Extraction Strategy for Hourly PM2.5 Concentration. Appl. Soft Comput. 2022, 114, 108034.
  8. Djalalova, I.; Delle Monache, L.; Wilczak, J. PM2.5 Analog Forecast and Kalman Filter Post-Processing for the Community Multiscale Air Quality (CMAQ) Model. Atmos. Environ. 2015, 108, 76–87.
  9. Baker, K.R.; Woody, M.C.; Valin, L.; Szykman, J.; Yates, E.L.; Iraci, L.T.; Choi, H.D.; Soja, A.J.; Koplitz, S.N.; Zhou, L.; et al. Photochemical Model Evaluation of 2013 California Wild Fire Air Quality Impacts Using Surface, Aircraft, and Satellite Data. Sci. Total Environ. 2018, 637–638, 1137–1149.
  10. Zhang, Q.; Xue, D.; Liu, X.; Gong, X.; Gao, H. Process Analysis of PM2.5 Pollution Events in a Coastal City of China Using CMAQ. J. Environ. Sci. 2019, 79, 225–238.
  11. Lee, K.; Yu, J.; Lee, S.; Park, M.; Hong, H.; Park, S.Y.; Choi, M.; Kim, J.; Kim, Y.; Woo, J.H.; et al. Development of Korean Air Quality Prediction System Version 1 (KAQPS v1) with Focuses on Practical Issues. Geosci. Model Dev. 2020, 13, 1055–1073.
  12. Ryu, Y.H.; Hodzic, A.; Barre, J.; Descombes, G.; Minnis, P. Quantifying Errors in Surface Ozone Predictions Associated with Clouds over the CONUS: A WRF-Chem Modeling Study Using Satellite Cloud Retrievals. Atmos. Chem. Phys. 2018, 18, 7509–7525.
  13. Cheng, X.; Liu, Y.; Xu, X.; You, W.; Zang, Z.; Gao, L.; Chen, Y.; Su, D.; Yan, P. Lidar Data Assimilation Method Based on CRTM and WRF-Chem Models and Its Application in PM2.5 Forecasts in Beijing. Sci. Total Environ. 2019, 682, 541–552.
  14. Abdi-Oskouei, M.; Carmichael, G.; Christiansen, M.; Ferrada, G.; Roozitalab, B.; Sobhani, N.; Wade, K.; Czarnetzki, A.; Pierce, R.B.; Wagner, T.; et al. Sensitivity of Meteorological Skill to Selection of WRF-Chem Physical Parameterizations and Impact on Ozone Prediction During the Lake Michigan Ozone Study (LMOS). J. Geophys. Res. Atmos. 2020, 125, e2019JD031971.
  15. Lopez-Restrepo, S.; Yarce, A.; Pinel, N.; Quintero, O.L.; Segers, A.; Heemink, A.W. Forecasting PM10 and PM2.5 in the Aburrá Valley (Medellín, Colombia) via EnKF Based Data Assimilation. Atmos. Environ. 2020, 232, 117507.
  16. Wei, W.; Lv, Z.F.; Li, Y.; Wang, L.T.; Cheng, S.; Liu, H. A WRF-Chem Model Study of the Impact of VOCs Emission of a Huge Petro-Chemical Industrial Zone on the Summertime Ozone in Beijing, China. Atmos. Environ. 2018, 175, 44–53.
  17. Chen, Q.; Taylor, D. Transboundary Atmospheric Pollution in Southeast Asia: Current Methods, Limitations and Future Developments. Crit. Rev. Environ. Sci. Technol. 2018, 48, 997–1029.
  18. De Mattos Neto, P.S.G.; Madeiro, F.; Ferreira, T.A.E.; Cavalcanti, G.D.C. Hybrid Intelligent System for Air Quality Forecasting Using Phase Adjustment. Eng. Appl. Artif. Intell. 2014, 32, 185–191.
  19. Baptista, M.; Sankararaman, S.; de Medeiros, I.P.; Nascimento, C.; Prendinger, H.; Henriques, E.M.P. Forecasting Fault Events for Predictive Maintenance Using Data-Driven Techniques and ARMA Modeling. Comput. Ind. Eng. 2018, 115, 41–53.
  20. Wang, J.; Lei, C.; Guo, M. Daily Natural Gas Price Forecasting by a Weighted Hybrid Data-Driven Model. J. Pet. Sci. Eng. 2020, 192, 107240.
  21. Aladağ, E. Forecasting of Particulate Matter with a Hybrid ARIMA Model Based on Wavelet Transformation and Seasonal Adjustment. Urban Clim. 2021, 39, 100930.
  22. Bhatti, U.A.; Yan, Y.; Zhou, M.; Ali, S.; Hussain, A.; Qingsong, H.; Yu, Z.; Yuan, L. Time Series Analysis and Forecasting of Air Pollution Particulate Matter (PM2.5): An SARIMA and Factor Analysis Approach. IEEE Access 2021, 9, 41019–41031.
  23. Shaziayani, W.N.; Ul-Saufie, A.Z.; Ahmat, H.; Al-Jumeily, D. Coupling of Quantile Regression into Boosted Regression Trees (BRT) Technique in Forecasting Emission Model of PM10 Concentration. Air Qual. Atmos. Health 2021, 14, 1647–1663.
  24. Liu, T.; Lau, A.K.H.; Sandbrink, K.; Fung, J.C.H. Time Series Forecasting of Air Quality Based On Regional Numerical Modeling in Hong Kong. J. Geophys. Res. Atmos. 2018, 123, 4175–4196.
  25. Abdullah, S.; Napi, N.N.L.M.; Ahmed, A.N.; Mansor, W.N.W.; Mansor, A.A.; Ismail, M.; Abdullah, A.M.; Ramly, Z.T.A. Development of Multiple Linear Regression for Particulate Matter (PM10) Forecasting during Episodic Transboundary Haze Event in Malaysia. Atmosphere 2020, 11, 289.
  26. Mohd Napi, N.N.L.; Noor Mohamed, M.S.; Abdullah, S.; Mansor, A.A.; Ahmed, A.N.; Ismail, M. Multiple Linear Regression (MLR) and Principal Component Regression (PCR) for Ozone (O3) Concentrations Prediction. IOP Conf. Ser. Earth Environ. Sci. 2020, 616, 012004.
  27. Bai, Y.; Li, Y.; Wang, X.; Xie, J.; Li, C. Air Pollutants Concentrations Forecasting Using Back Propagation Neural Network Based on Wavelet Decomposition with Meteorological Conditions. Atmos. Pollut. Res. 2016, 7, 557–566.
  28. Li, X.; Peng, L.; Yao, X.; Cui, S.; Hu, Y.; You, C.; Chi, T. Long Short-Term Memory Neural Network for Air Pollutant Concentration Predictions: Method Development and Evaluation. Environ. Pollut. 2017, 231, 997–1004.
  29. Jiang, X.; Wei, P.; Luo, Y.; Li, Y. Air Pollutant Concentration Prediction Based on a CEEMDAN-FE-BiLSTM Model. Atmosphere 2021, 12, 1452.
  30. Weihong, W.; Shuangshuang, N. The Performance of Several Combining Forecasts for Stock Index. In Proceedings of the 2008 International Seminar on Future Information Technology and Management Engineering, Leicestershire, UK, 20 November 2008; pp. 450–455.
  31. Wang, J.; Li, J.; Li, Z. Prediction of Air Pollution Interval Based on Data Preprocessing and Multi-Objective Dragonfly Optimization Algorithm. Front. Ecol. Evol. 2022, 10, 855606.
  32. Yang, W.; Sun, S.; Hao, Y.; Wang, S. A Novel Machine Learning-Based Electricity Price Forecasting Model Based on Optimal Model Selection Strategy. Energy 2022, 238, 121989.
  33. Murillo-Escobar, J.; Sepulveda-Suescun, J.P.; Correa, M.A.; Orrego-Metaute, D. Forecasting Concentrations of Air Pollutants Using Support Vector Regression Improved with Particle Swarm Optimization: Case Study in Aburrá Valley, Colombia. Urban Clim. 2019, 29, 100473.
  34. Wu, Q.; Lin, H. A Novel Optimal-Hybrid Model for Daily Air Quality Index Prediction Considering Air Pollutant Factors. Sci. Total Environ. 2019, 683, 808–821.
  35. Li, G.; Chen, L.; Yang, H. Prediction of PM2.5 Concentration Based on Improved Secondary Decomposition and CSA-KELM. Atmos. Pollut. Res. 2022, 13, 101455.
  36. Gan, R.; Guo, Q.; Chang, H.; Yi, Y. Improved Ant Colony Optimization Algorithm for the Traveling Salesman Problems. J. Syst. Eng. Electron. 2010, 21, 329–333.
  37. Dhiman, G.; Kumar, V. Seagull Optimization Algorithm: Theory and Its Applications for Large-Scale Industrial Engineering Problems. Knowledge-Based Syst. 2019, 165, 169–196.
  38. Zhou, Q.; Lv, Q.; Zhang, G. A Combined Forecasting System Based on Modified Multi-Objective Optimization for Short-Term Wind Speed and Wind Power Forecasting. Appl. Sci. 2021, 11, 9383.
  39. Zhou, Y.; Wang, J.; Li, Z.; Lu, H. Short-Term Photovoltaic Power Forecasting Based on Signal Decomposition and Machine Learning Optimization. Energy Convers. Manag. 2022, 267, 115944.
  40. Wang, J.; Gao, J.; Wei, D. Electric Load Prediction Based on a Novel Combined Interval Forecasting System. Appl. Energy 2022, 322, 119420.
  41. Wu, Z.; Zhang, S. Study on the Spatial–Temporal Change Characteristics and Influence Factors of Fog and Haze Pollution Based on GAM. Neural Comput. Appl. 2019, 31, 1619–1631.
  42. Zakaria, N.N.; Othman, M.; Sokkalingam, R.; Daud, H.; Abdullah, L.; Kadir, E.A. Markov Chain Model Development for Forecasting Air Pollution Index of Miri, Sarawak. Sustainability 2019, 11, 5190.
  43. Zhou, W.; Wu, X.; Ding, S.; Cheng, Y. Predictive Analysis of the Air Quality Indicators in the Yangtze River Delta in China: An Application of a Novel Seasonal Grey Model. Sci. Total Environ. 2020, 748, 141428.
  44. Kim, J.; Wang, X.; Kang, C.; Yu, J.; Li, P. Forecasting Air Pollutant Concentration Using a Novel Spatiotemporal Deep Learning Model Based on Clustering, Feature Selection and Empirical Wavelet Transform. Sci. Total Environ. 2021, 801, 149654.
  45. Dai, H.; Huang, G.; Zeng, H.; Zhou, F. PM2.5 Volatility Prediction by XGBoost-MLP Based on GARCH Models. J. Clean. Prod. 2022, 356, 131898.
  46. Liu, B.; Yu, X.; Chen, J.; Wang, Q. Air Pollution Concentration Forecasting Based on Wavelet Transform and Combined Weighting Forecasting Model. Atmos. Pollut. Res. 2021, 12, 101144.
  47. Sayeed, A.; Choi, Y.; Eslami, E.; Lops, Y. Using a Deep Convolutional Neural Network to Predict 2017 Ozone Concentrations, 24 Hours in Advance. Neural Netw. 2019, 121, 396–408.
  48. Mo, Y.; Li, Q.; Karimian, H.; Fang, S.; Tang, B.; Chen, G. A Novel Framework for Daily Forecasting of Ozone Mass Concentrations Based on Cycle Reservoir with Regular Jumps Neural Networks. Atmos. Environ. 2020, 220, 117072.
  49. Chen, S.; Wang, J.; Zhang, H. A Hybrid PSO-SVM Model Based on Clustering Algorithm for Short-Term Atmospheric Pollutant Concentration Forecasting. Technol. Forecast. Soc. Chang. 2019, 146, 41–54.
  50. Huang, G.; Li, X.; Zhang, B.; Ren, J. PM2.5 Concentration Forecasting at Surface Monitoring Sites Using GRU Neural Network Based on Empirical Mode Decomposition. Sci. Total Environ. 2021, 768, 144516.
  51. Wang, B.; Jiang, Q.; Jiang, P. A Combined Forecasting Structure Based on the L 1 Norm: Application to The. J. Environ. Manag. 2019, 246, 299–313.
  52. Saremi, S.; Mirjalili, S.; Lewis, A. Grasshopper Optimisation Algorithm: Theory and Application. Adv. Eng. Softw. 2017, 105, 30–47.
  53. Diebold, F.X.; Mariano, R.S. Comparing Predictive Accuracy. J. Bus. Econ. Stat. 1995, 13, 253–263.
  54. Liu, S.; Lin, Y. Introduction to Grey Systems Theory. In Understanding Complex Systems; Springer: Berlin/Heidelberg, Germany, 2010; Volume 68, pp. 1–399.
Figure 1. The flowchart of the APCP system.
Figure 2. The prediction results of the APCP system and four well-known models in Experiment I. The three-line charts represent the fitting results of the 1-step prediction in the three regions. The bar chart shows the difference in MSE values for the dataset in Gz. The radar chart describes the changes in MAPE values in well-known models and the APCP system for the dataset in Sh. The horizontal bar chart depicts the comparison results of MRE values in different models for the dataset in Cd.
Figure 3. Performance comparison of different combined models for the dataset in Gz. The six subfigures on the left side of the dotted line indicate how close the predicted results of all models in Experiment III are to the real values. The three subfigures on the right compare the differences in MAPE, R2, and MSE values between the APCP system and the other combined models.
Figure 4. Sensitivity assessment of average results in three cities.
Table 1. Evaluation of data-driven prediction models for AP concentration.
| Models | Ref. | Dataset | Conclusions | Strengths | Limitations |
|---|---|---|---|---|---|
| GAM | [41] | PM2.5 in Beijing | The lag order and climatic conditions have the most significant influence on the change in PM2.5 concentration. | The GAM model intuitively explains the reasons for the change and diffusion of PM2.5 concentration. | The prediction accuracy of this model is limited. |
| Markov chain model | [42] | API in Malaysia | Markov chain model can be used as an effective tool in haze pollution prediction. | The model is simple in structure and easy to operate. | The higher-order extended form of Markov chain is not considered. |
| SNgbn (1,1) model | [43] | AQI, PM10, PM2.5, SO2, NO2, CO, and O3 in the Yangtze River Delta | For data with seasonal periodic fluctuations, the model provides stable prediction results. | The SNgbn (1,1) model simulates the seasonal characteristics of APs to a great extent. | External factors are not added to the model. |
| 3D-CBLstm | [44] | PM2.5 in Beijing | The application of clustering analysis and feature selection strategy is conducive to the improvement of the prediction effect. | The 3D-CBLstm model not only realizes the efficient extraction of important features but also considers the long-term correlation in the sequence. | The selection of prediction model parameters is subjective. |
| XGBoost-Garch-MLP | [45] | PM2.5 in Shanxi | This model can effectively predict the fluctuation range of PM2.5 concentration, which is helpful to identify the moving direction of PM2.5. | The quality of input data is improved based on feature selection and four Garch extended models comprehensively cover the fluctuation interval of PM2.5 concentration. | The selection of input variables and prediction models needs to be further optimized. |
| CWfm | [46] | PM10, PM2.5, NO2, SO2, O3, and CO in Beijing | The CWfm model is scientific and efficient for predicting the concentration of APs. | The proposed combined model has better fitting results than its submodel. | The influencing factors considered are not comprehensive enough. |
| Dcnn | [47] | Meteorological and AP (NOX and O3) data in Texas | Compared with the deterministic models and linear models, the prediction results of this model are significantly improved. | Predictions can be successfully achieved even when there are fewer input dimensions. | The model has poor accuracy in estimating extreme values. |
| CEemd-CRJ-MLR model | [48] | Meteorological and AP (NO2, CO, and O3) data in Beijing | The improved CRJ model is effectively applied to the prediction of AP concentration, and the prediction performance of the hybrid model is improved. | The hybrid model also has accurate results for the long-term prediction of the concentration of APs. | The structural design of the model is complex, which reduces the universality. |
| Pso-Svm model | [49] | Meteorological and AP (AQI, PM10, PM2.5, SO2, NO2, CO, and O3) data in Beijing | The hybrid model is superior to the benchmark models in both fitting accuracy and simulation speed. | As the amount of data is reduced, the running time of the model is shortened. | The influence of holidays, seasons, and other relevant information is not included. |
| Emd-Gru model | [50] | PM2.5 in Beijing | Compared with the Gru model, the proposed combined model shows the best results in all error measurement indicators. | The problem of time lag is perfectly solved. | The fitting results of different spaces are lacking, and the regional versatility of the model is limited. |
| Combined model based on the L1 norm | [51] | PM10, PM2.5, SO2, NO2, CO, and O3 in Baoding, Tianjin, and Shijiazhuang | The proposed model can accurately evaluate future air quality and has broad application prospects. | The model parameters are adjusted based on the optimization algorithm, which enhances the scientificity and feasibility. | The model structure is complex. |
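Table 2. Comparison results of pre-experiment.
| Models | Guangzhou MSE | Guangzhou MRE | Guangzhou MAPE | Guangzhou R2 | Shanghai MSE | Shanghai MRE | Shanghai MAPE | Shanghai R2 | Chengdu MSE | Chengdu MRE | Chengdu MAPE | Chengdu R2 | Average MSE | Average MRE | Average MAPE | Average R2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cnn | 25.477 | 0.009 | 5.465% | 0.958 | 43.353 | 0.039 | 9.564% | 0.973 | 55.457 | 0.034 | 8.224% | 0.983 | 41.429 | 0.027 | 7.751% | 0.971 |
| Bilstm | 31.704 | 0.014 | 6.119% | 0.947 | 47.755 | 0.028 | 10.225% | 0.970 | 60.800 | 0.041 | 8.414% | 0.982 | 46.753 | 0.028 | 8.252% | 0.966 |
| Lssvm | 29.878 | 0.013 | 5.752% | 0.950 | 51.454 | 0.058 | 10.610% | 0.968 | 61.730 | 0.049 | 9.123% | 0.982 | 47.687 | 0.040 | 8.495% | 0.967 |
| Gp | 100.916 | 0.057 | 12.038% | 0.832 | 270.369 | 0.352 | 39.540% | 0.830 | 156.823 | 0.175 | 20.018% | 0.953 | 176.036 | 0.195 | 23.865% | 0.872 |
| Lstm | 22.180 | 0.011 | 5.043% | 0.963 | 39.016 | 0.030 | 9.876% | 0.975 | 51.518 | 0.031 | 7.970% | 0.985 | 37.571 | 0.024 | 7.630% | 0.974 |
| Gru | 34.112 | 0.029 | 7.025% | 0.943 | 68.920 | 0.120 | 17.239% | 0.957 | 145.464 | 0.132 | 17.194% | 0.957 | 82.832 | 0.094 | 13.819% | 0.952 |
| Elm | 23.160 | 0.012 | 5.282% | 0.962 | 41.545 | 0.030 | 9.563% | 0.974 | 52.633 | 0.033 | 8.117% | 0.984 | 39.113 | 0.025 | 7.654% | 0.973 |
| Enn | 23.465 | 0.012 | 5.330% | 0.961 | 41.920 | 0.035 | 9.796% | 0.974 | 53.782 | 0.036 | 8.320% | 0.984 | 39.722 | 0.028 | 7.815% | 0.973 |

Note: This table reports the results of submodel simulations. The assessment metrics employed in this simulation include MSE, MAPE, MRE and R2. Bold numbers indicate that the average simulation results in all submodels are better.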
Table 3. The explanation and the corresponding value of the APCP system.
| Systems | Symbol | Explanation | Value | Systems | Symbol | Explanation | Value |
|---|---|---|---|---|---|---|---|
| Bilstm | $m_{\varepsilon}$ | Max epochs number | 400 | Gp | $G_{l}$ | Gaussian likelihood | −1 |
| | $\eta_{\nu}$ | Hidden layer node numbers | 20 | | $\iota_{\nu}$ | Input layer node number | 4 |
| BPnn, Elm, Enn | $\iota_{\nu}$ | Input layer node numbers | 4 | Lstm | $\varepsilon_{\tau}$ | Epochs of training | 500 |
| | $o_{\nu}$ | Output layer node numbers | 1 | Emd | $S_{r}$ | Stopping rule of sifting | wave |
| | $\eta_{\nu}$ | Hidden layer node numbers | 20 | | $B_{d}$ | Boundary | type 5 |
| Cnn | $\nu_{\kappa}$ | Number of kernels in convolutional layer | 3 | Eemd | $N_{\sigma\tau\delta}$ | Signal-to-noise ratio | 0.1 |
| | $\sigma_{\kappa}$ | Kernel size of the convolutional layer | 40 | DE | $\tilde{M}_{I}$ | Maximum iteration number | 500 |
| | $h_{n}$ | Hidden layer node numbers | [384,384] | | $N_{\varepsilon}$ | Number of noise additions | 50 |
| Gru | $m_{\varepsilon}$ | Max epochs number | 2000 | | $N_{\sigma\tau\delta}$ | Signal-to-noise ratio | 0.1 |
| | $\mu_{\beta\sigma}$ | Mini batch size | 256 | Ws, Moda, Mogwo | $\tilde{M}_{I}$ | Maximum iteration number | 500 |
| Lssvm | $K_{\phi}$ | Kernel function parameter | 5 | | $\tilde{S}_{A}$ | Archive size | 400 |
| | $X_{\lambda}$ | Penalty parameter | 5 | | $\tilde{C}_{N}$ | Chameleon number | 60 |

Note: Mogwo means the multi-objective grey wolf optimization.
Table 4. The statistical assessments of the dataset.
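| Datasets | Subset | No. | Max. | Min. | Mean | Std. | Mlye |
|---|---|---|---|---|---|---|---|
| Guangzhou | Total | 744 | 176 | 8 | 34.16 | 69.61 | 0.23 |
| Guangzhou | Train | 595 | 176 | 8 | 35.79 | 69.79 | 0.17 |
| Guangzhou | Test | 149 | 142 | 31 | 26.77 | 68.89 | 0.37 |
| Shanghai | Total | 744 | 255 | 11 | 55.56 | 84.04 | 0.31 |
| Shanghai | Train | 595 | 255 | 11 | 55.99 | 90.41 | 0.28 |
| Shanghai | Test | 149 | 203 | 13 | 45.86 | 58.60 | 0.04 |
| Chengdu | Total | 744 | 335 | 15 | 64.71 | 141.84 | 0.27 |
| Chengdu | Train | 595 | 335 | 15 | 59.47 | 154.84 | 0.28 |
| Chengdu | Test | 149 | 233 | 23 | 58.59 | 89.91 | 0.14 |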
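Note: This table summarizes the main statistical information of the three datasets. The mathematical formulas of Mean, Std., and Mlye are $Mean = \frac{1}{N}\sum \tilde{O}_{\sigma}$, $Std. = \sqrt{\sum \left(\tilde{O}_{\sigma} - \bar{O}_{\sigma}\right)^{2} / (N - 1)}$, and $Mlye = \frac{1}{T_{l} - T_{0}} \sum \log_{2}\left(\delta_{\iota} / \delta_{\iota - 1}\right)$, respectively, where $N$ is the number of observations. In addition, the unit of the PM2.5 concentration data is $\mu g/m^{3}$.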
Table 5. The mathematical formula of four evaluation metrics.
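| Metrics | Mathematical Formula |
|---|---|
| Mean Absolute Percentage Error | $MAPE = \frac{1}{N}\sum_{\iota=1}^{N} \left\lvert \frac{\hat{\varphi}_{\iota} - \varphi_{\iota}}{\varphi_{\iota}} \right\rvert \times 100\%$ |
| Mean Relative Error | $MRE = \frac{1}{N}\sum_{\iota=1}^{N} \frac{\hat{\varphi}_{\iota} - \varphi_{\iota}}{\varphi_{\iota}}$ |
| Mean Squared Error | $MSE = \frac{1}{N}\sum_{\iota=1}^{N} \left(\varphi_{\iota} - \hat{\varphi}_{\iota}\right)^{2}$ |
| R-squared score | $R^{2} = 1 - \sum_{\iota=1}^{N} \left(\hat{\varphi}_{\iota} - \varphi_{\iota}\right)^{2} / \sum_{\iota=1}^{N} \left(\varphi_{\iota} - \bar{\varphi}\right)^{2}$ |

Note: $\varphi_{\iota}$ is the $\iota$-th actual value of the AP concentration series, $\hat{\varphi}_{\iota}$ is the $\iota$-th output result of the APCP system, $\bar{\varphi}$ is the mean of the actual values, and $N$ is the number of samples.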
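As a rough illustration of the four metrics in Table 5, the following Python sketch computes them with the standard library only. The function names and the sample values are hypothetical, and MRE is assumed to be the signed relative error, since negative MRE values are reported in the result tables.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    n = len(actual)
    return 100.0 * sum(abs((p - a) / a) for a, p in zip(actual, predicted)) / n


def mre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean relative error (assumed signed, so over- and under-prediction can cancel)."""
    n = len(actual)
    return sum((p - a) / a for a, p in zip(actual, predicted)) / n


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n


def r2(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """R-squared: one minus residual sum of squares over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical PM2.5 values in ug/m^3, for illustration only.
    observed = [10.0, 20.0, 40.0]
    forecast = [11.0, 18.0, 40.0]
    print(mape(observed, forecast), mre(observed, forecast))
    print(mse(observed, forecast), r2(observed, forecast))
```

Because MAPE and MRE divide by the actual value, both are undefined when an observation is zero; the sketch does not guard against that case.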
Table 6. Comparison results with the individual models.
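| Models | Horizon | Guangzhou MSE | Guangzhou MRE | Guangzhou MAPE | Guangzhou R2 | Shanghai MSE | Shanghai MRE | Shanghai MAPE | Shanghai R2 | Chengdu MSE | Chengdu MRE | Chengdu MAPE | Chengdu R2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| APCP | 1-step | 20.861 | 0.000 | 4.883% | 0.965 | 38.445 | 0.006 | 9.139% | 0.976 | 49.914 | 0.012 | 7.560% | 0.985 |
| APCP | 2-step | 76.568 | −0.014 | 9.552% | 0.871 | 143.385 | −0.025 | 15.477% | 0.903 | 157.888 | 0.021 | 13.515% | 0.953 |
| APCP | 3-step | 85.532 | −0.014 | 9.710% | 0.854 | 147.137 | −0.028 | 14.994% | 0.894 | 158.789 | 0.027 | 13.810% | 0.952 |
| Bilstm | 1-step | 31.704 | 0.014 | 6.119% | 0.947 | 47.755 | 0.028 | 10.225% | 0.970 | 60.800 | 0.041 | 8.414% | 0.982 |
| Bilstm | 2-step | 97.906 | 0.031 | 10.704% | 0.835 | 183.587 | 0.078 | 20.620% | 0.876 | 203.926 | 0.110 | 16.432% | 0.939 |
| Bilstm | 3-step | 209.020 | 0.060 | 16.410% | 0.644 | 416.976 | 0.156 | 31.115% | 0.699 | 487.225 | 0.217 | 27.335% | 0.853 |
| Gru | 1-step | 34.112 | 0.029 | 7.025% | 0.943 | 68.920 | 0.120 | 17.239% | 0.957 | 145.464 | 0.132 | 17.194% | 0.957 |
| Gru | 2-step | 88.941 | 0.046 | 11.036% | 0.850 | 222.910 | 0.277 | 34.932% | 0.849 | 520.354 | 0.300 | 34.546% | 0.844 |
| Gru | 3-step | 364.031 | 0.064 | 19.808% | 0.381 | 585.365 | 0.511 | 59.411% | 0.577 | 1425.353 | 0.570 | 61.621% | 0.571 |
| Gp | 1-step | 100.916 | 0.057 | 12.038% | 0.832 | 270.369 | 0.352 | 39.540% | 0.830 | 156.823 | 0.175 | 20.018% | 0.953 |
| Gp | 2-step | 158.148 | 0.078 | 15.547% | 0.733 | 465.399 | 0.489 | 54.192% | 0.685 | 368.929 | 0.278 | 31.152% | 0.889 |
| Gp | 3-step | 205.923 | 0.098 | 18.482% | 0.650 | 681.216 | 0.632 | 69.090% | 0.508 | 640.693 | 0.373 | 41.128% | 0.807 |
| Lssvm | 1-step | 29.878 | 0.013 | 5.752% | 0.950 | 51.454 | 0.058 | 10.610% | 0.968 | 61.730 | 0.049 | 9.123% | 0.982 |
| Lssvm | 2-step | 101.226 | 0.031 | 10.821% | 0.829 | 185.359 | 0.131 | 20.207% | 0.874 | 183.649 | 0.101 | 16.178% | 0.945 |
| Lssvm | 3-step | 209.427 | 0.058 | 16.510% | 0.644 | 408.764 | 0.219 | 29.946% | 0.705 | 371.823 | 0.160 | 23.578% | 0.888 |
| Cnn | 1-step | 25.477 | 0.009 | 5.465% | 0.958 | 43.353 | 0.039 | 9.564% | 0.973 | 55.457 | 0.034 | 8.224% | 0.983 |
| Cnn | 2-step | 95.182 | 0.022 | 10.614% | 0.839 | 165.475 | 0.091 | 17.971% | 0.888 | 173.737 | 0.092 | 15.444% | 0.948 |
| Cnn | 3-step | 94.856 | 0.023 | 10.537% | 0.839 | 160.774 | 0.081 | 18.188% | 0.884 | 166.672 | 0.062 | 14.993% | 0.950 |
| Lstm | 1-step | 22.180 | 0.011 | 5.043% | 0.963 | 39.016 | 0.030 | 9.876% | 0.975 | 51.518 | 0.031 | 7.970% | 0.985 |
| Lstm | 2-step | 92.552 | 0.019 | 10.346% | 0.844 | 179.377 | 0.066 | 23.206% | 0.878 | 180.829 | 0.071 | 15.373% | 0.946 |
| Lstm | 3-step | 94.432 | 0.020 | 10.303% | 0.839 | 175.869 | 0.069 | 22.962% | 0.873 | 181.508 | 0.066 | 15.209% | 0.945 |
| Elm | 1-step | 23.160 | 0.012 | 5.282% | 0.962 | 41.545 | 0.030 | 9.563% | 0.974 | 52.633 | 0.033 | 8.117% | 0.984 |
| Elm | 2-step | 86.027 | 0.029 | 10.312% | 0.855 | 159.395 | 0.081 | 18.464% | 0.892 | 166.130 | 0.087 | 15.212% | 0.950 |
| Elm | 3-step | 92.910 | 0.018 | 10.404% | 0.842 | 165.485 | 0.061 | 19.901% | 0.880 | 176.767 | 0.064 | 15.206% | 0.947 |
| Enn | 1-step | 23.465 | 0.012 | 5.330% | 0.961 | 41.920 | 0.035 | 9.796% | 0.974 | 53.782 | 0.036 | 8.320% | 0.984 |
| Enn | 2-step | 87.895 | 0.031 | 10.345% | 0.852 | 162.584 | 0.096 | 19.111% | 0.890 | 167.932 | 0.094 | 15.414% | 0.950 |
| Enn | 3-step | 93.483 | 0.019 | 10.452% | 0.841 | 169.585 | 0.069 | 20.610% | 0.877 | 180.407 | 0.066 | 15.460% | 0.946 |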
Table 7. The improvement percentage of the APCP system.
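| Comparison | Horizon | Guangzhou MSE | Guangzhou MRE | Guangzhou MAPE | Shanghai MSE | Shanghai MRE | Shanghai MAPE | Chengdu MSE | Chengdu MRE | Chengdu MAPE | Average MSE | Average MRE | Average MAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| APCP vs. Bilstm | 1-step | 34.20% | 98.40% | 20.20% | 19.50% | 79.32% | 10.62% | 17.90% | 71.02% | 10.16% | 23.87% | 82.91% | 13.66% |
| APCP vs. Bilstm | 2-step | 21.79% | 55.72% | 10.77% | 21.90% | 67.66% | 24.94% | 22.58% | 80.96% | 17.75% | 22.09% | 68.11% | 17.82% |
| APCP vs. Bilstm | 3-step | 59.08% | 76.97% | 40.83% | 64.71% | 81.84% | 51.81% | 67.41% | 87.61% | 49.48% | 63.73% | 82.14% | 47.37% |
| APCP vs. Gru | 1-step | 38.85% | 99.23% | 30.49% | 44.22% | 95.16% | 46.99% | 65.69% | 90.98% | 56.03% | 49.58% | 95.12% | 44.50% |
| APCP vs. Gru | 2-step | 13.91% | 69.42% | 13.45% | 35.68% | 90.90% | 55.69% | 69.66% | 92.99% | 60.88% | 39.75% | 84.44% | 43.34% |
| APCP vs. Gru | 3-step | 76.50% | 78.41% | 50.98% | 74.86% | 94.44% | 74.76% | 88.86% | 95.28% | 77.59% | 80.08% | 89.38% | 67.78% |
| APCP vs. Gp | 1-step | 79.33% | 99.62% | 59.44% | 85.78% | 98.35% | 76.89% | 68.17% | 93.18% | 62.24% | 77.76% | 97.05% | 66.19% |
| APCP vs. Gp | 2-step | 51.58% | 82.07% | 38.56% | 69.19% | 94.85% | 71.44% | 57.20% | 92.43% | 56.62% | 59.33% | 89.78% | 55.54% |
| APCP vs. Gp | 3-step | 58.46% | 86.04% | 47.46% | 78.40% | 95.51% | 78.30% | 75.22% | 92.79% | 66.42% | 70.69% | 91.45% | 64.06% |
| APCP vs. Lssvm | 1-step | 30.18% | 98.38% | 15.11% | 25.28% | 90.04% | 13.87% | 19.14% | 75.60% | 17.14% | 24.87% | 88.01% | 15.37% |
| APCP vs. Lssvm | 2-step | 24.36% | 55.64% | 11.73% | 22.64% | 80.69% | 23.41% | 14.03% | 79.16% | 16.46% | 20.34% | 71.83% | 17.20% |
| APCP vs. Lssvm | 3-step | 59.16% | 76.44% | 41.19% | 64.00% | 87.06% | 49.93% | 57.29% | 83.17% | 41.43% | 60.15% | 82.22% | 44.18% |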
Note: This table reports the improvement percentage of the APCP system. The assessment metrics employed in this experiment include MSE, MRE, and MAPE, and the improvement percentage of a metric $\Phi$ is defined as $\Phi = \left(\Phi_{comparison} - \Phi_{APCP}\right) / \Phi_{comparison}$.
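A short Python sketch of the improvement percentage defined in the note above, checked against one entry of Table 7. The function name is hypothetical; the two MSE values are the 1-step Guangzhou results for Bilstm and APCP reported in Table 6.

```python
def improvement_percentage(comparison_error: float, apcp_error: float) -> float:
    """Relative reduction of an error metric achieved by APCP, in percent."""
    return (comparison_error - apcp_error) / comparison_error * 100.0


if __name__ == "__main__":
    bilstm_mse = 31.704  # Table 6, Bilstm, Guangzhou, 1-step
    apcp_mse = 20.861    # Table 6, APCP, Guangzhou, 1-step
    print(f"{improvement_percentage(bilstm_mse, apcp_mse):.2f}%")  # prints 34.20%
```

The printed value matches the MSE entry for APCP vs. Bilstm, Guangzhou, 1-step in Table 7.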
Table 8. Error evaluation based on different module strategies.
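| Models | Horizon | Guangzhou MSE | Guangzhou MRE | Guangzhou MAPE | Guangzhou R2 | Shanghai MSE | Shanghai MRE | Shanghai MAPE | Shanghai R2 | Chengdu MSE | Chengdu MRE | Chengdu MAPE | Chengdu R2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| APCP | 1-step | 20.861 | 0.000 | 4.883% | 0.965 | 38.445 | 0.006 | 9.139% | 0.976 | 49.914 | 0.012 | 7.560% | 0.985 |
| APCP | 2-step | 76.568 | −0.014 | 9.552% | 0.871 | 143.385 | −0.025 | 15.477% | 0.903 | 157.888 | 0.021 | 13.515% | 0.953 |
| APCP | 3-step | 85.532 | −0.014 | 9.710% | 0.854 | 147.137 | −0.028 | 14.994% | 0.894 | 158.789 | 0.027 | 13.810% | 0.952 |
| DE-Moda | 1-step | 44.299 | −0.018 | 6.346% | 0.926 | 78.536 | −0.016 | 12.558% | 0.951 | 138.789 | 0.055 | 12.368% | 0.959 |
| DE-Moda | 2-step | 196.907 | −0.062 | 13.260% | 0.668 | 291.930 | −0.040 | 20.260% | 0.802 | 324.790 | 0.022 | 16.480% | 0.902 |
| DE-Moda | 3-step | 236.687 | −0.083 | 14.610% | 0.597 | 322.913 | −0.071 | 20.461% | 0.767 | 388.779 | 0.179 | 23.553% | 0.883 |
| DE-Mogwo | 1-step | 45.530 | −0.008 | 6.333% | 0.924 | 80.208 | 0.007 | 12.356% | 0.949 | 144.490 | 0.048 | 12.091% | 0.957 |
| DE-Mogwo | 2-step | 190.901 | −0.029 | 13.188% | 0.678 | 303.205 | 0.005 | 19.997% | 0.795 | 312.185 | 0.101 | 18.881% | 0.906 |
| DE-Mogwo | 3-step | 255.447 | −0.060 | 14.885% | 0.565 | 316.368 | −0.023 | 19.873% | 0.771 | 478.242 | 0.007 | 17.896% | 0.856 |
| Eemd-Ws | 1-step | 83.250 | −0.020 | 8.714% | 0.862 | 125.311 | −0.006 | 13.821% | 0.921 | 135.219 | 0.048 | 11.996% | 0.960 |
| Eemd-Ws | 2-step | 194.027 | −0.054 | 13.224% | 0.673 | 292.060 | −0.042 | 20.282% | 0.802 | 309.633 | 0.071 | 17.753% | 0.907 |
| Eemd-Ws | 3-step | 215.869 | −0.052 | 13.967% | 0.633 | 311.389 | −0.043 | 19.927% | 0.775 | 381.970 | 0.144 | 21.816% | 0.885 |
| Emd-Ws | 1-step | 47.653 | −0.026 | 6.535% | 0.921 | 79.500 | 0.010 | 12.204% | 0.950 | 76.215 | −0.088 | 10.261% | 0.977 |
| Emd-Ws | 2-step | 108.801 | −0.067 | 10.993% | 0.816 | 116.014 | −0.009 | 16.138% | 0.921 | 879.255 | −0.381 | 39.738% | 0.736 |
| Emd-Ws | 3-step | 137.224 | 0.123 | 20.357% | 0.901 | 162.807 | 0.011 | 17.037% | 0.882 | 294.961 | −0.132 | 18.134% | 0.911 |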
Note: This table reports the comparison between the APCP system and four alternative module strategies: DE-Moda, DE-Mogwo, Eemd-Ws, and Emd-Ws, where Ws denotes the weight search mechanism of the APCP system.
Table 9. Significance comparison results of the APCP model.
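| Models | Guangzhou 1-Step | Guangzhou 2-Step | Guangzhou 3-Step | Shanghai 1-Step | Shanghai 2-Step | Shanghai 3-Step | Chengdu 1-Step | Chengdu 2-Step | Chengdu 3-Step |
|---|---|---|---|---|---|---|---|---|---|
| Bilstm | 2.62 ** | 1.76 * | 4.21 ** | 1.87 * | 1.77 * | 3.70 ** | 1.96 ** | 1.66 * | 5.04 ** |
| Gru | 1.68 * | 0.30 | 3.48 ** | 3.22 ** | 2.33 ** | 6.42 ** | 7.50 ** | 8.29 ** | 11.90 ** |
| Gp | 5.15 ** | 3.45 ** | 3.88 ** | 6.99 ** | 5.90 ** | 6.96 ** | 8.98 ** | 6.22 ** | 8.20 ** |
| Lssvm | 2.83 ** | 2.02 ** | 4.22 ** | 2.02 ** | 1.59 | 3.32 ** | 2.52 ** | 1.22 | 4.71 ** |
| DE-Moda | 3.35 ** | 3.28 ** | 3.87 ** | 2.72 ** | 3.57 ** | 2.72 ** | 4.68 ** | 4.22 ** | 5.04 ** |
| DE-Mogwo | 3.39 ** | 3.26 ** | 3.92 ** | 2.64 ** | 3.19 ** | 3.07 ** | 4.64 ** | 4.63 ** | 3.57 ** |
| Eemd-Ws | 4.49 ** | 3.22 ** | 3.11 ** | 3.75 ** | 3.60 ** | 2.96 ** | 4.62 ** | 4.50 ** | 4.54 ** |
| Emd-Ws | 3.41 ** | 1.93 ** | 4.45 ** | 2.71 ** | −1.04 | −0.52 | 3.60 ** | 12.53 ** | 3.05 ** |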
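Note: This table gives the results of the DM statistic $\Upsilon_{DM}$, which can be expressed as $\Upsilon_{DM} = \frac{\sum_{\iota=1}^{N} \left( H(\varphi_{\iota} - \hat{\varphi}_{\iota}^{(1)}) - H(\varphi_{\iota} - \hat{\varphi}_{\iota}^{(2)}) \right) / N}{\sqrt{\hat{\sigma}^{2} / N}}$, where $\hat{\sigma}^{2} = \frac{1}{N}\sum_{\iota=1}^{N} \left( H(\varphi_{\iota} - \hat{\varphi}_{\iota}^{(1)}) - H(\varphi_{\iota} - \hat{\varphi}_{\iota}^{(2)}) - \overline{H(\varphi_{\iota} - \hat{\varphi}_{\iota}^{(1)}) - H(\varphi_{\iota} - \hat{\varphi}_{\iota}^{(2)})} \right)^{2}$, $H(\cdot)$ is the loss function, $\hat{\varphi}_{\iota}^{(1)}$ and $\hat{\varphi}_{\iota}^{(2)}$ are the forecasts of the two models being compared, and $N$ is the number of forecasts. In addition, if $\Upsilon_{DM} \geq 1.65$ (marked with *) or $\Upsilon_{DM} \geq 1.96$ (marked with **), the null hypothesis $H_{0}: E\left[H(\varphi_{\iota} - \hat{\varphi}_{\iota}^{(1)})\right] = E\left[H(\varphi_{\iota} - \hat{\varphi}_{\iota}^{(2)})\right]$ is rejected at the significance level of $\alpha = 0.10$ or $\alpha = 0.05$, respectively.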
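The DM statistic in the note above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the function and argument names are hypothetical, the loss is assumed to be the squared error, the variance is the plain sample variance of the loss differential (divided by N), and the autocovariance correction that the Diebold-Mariano test normally applies to multi-step forecasts is omitted.

```python
import math
from typing import Callable, Sequence


def dm_statistic(
    actual: Sequence[float],
    pred_benchmark: Sequence[float],
    pred_proposed: Sequence[float],
    loss: Callable[[float], float] = lambda e: e * e,  # assumed squared-error loss
) -> float:
    """Simplified DM statistic; positive values favour the proposed model."""
    # Loss differential: benchmark loss minus proposed-model loss at each step.
    d = [
        loss(a - b) - loss(a - p)
        for a, b, p in zip(actual, pred_benchmark, pred_proposed)
    ]
    n = len(d)
    d_bar = sum(d) / n
    # Sample variance of the differential (no autocovariance terms).
    variance = sum((x - d_bar) ** 2 for x in d) / n
    return d_bar / math.sqrt(variance / n)


if __name__ == "__main__":
    # Hypothetical toy series, for illustration only.
    observed = [0.0, 0.0, 0.0, 0.0]
    benchmark = [1.0, 2.0, 1.0, 2.0]
    proposed = [0.0, 1.0, 0.0, 1.0]
    stat = dm_statistic(observed, benchmark, proposed)
    print(stat, stat >= 1.96)
```

The statistic is undefined when the loss differential is constant (zero variance), so a production version would need to handle that case explicitly.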
Table 10. Discussion results of correlation analysis.
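| Models | Guangzhou 1-Step | Guangzhou 2-Step | Guangzhou 3-Step | Shanghai 1-Step | Shanghai 2-Step | Shanghai 3-Step | Chengdu 1-Step | Chengdu 2-Step | Chengdu 3-Step |
|---|---|---|---|---|---|---|---|---|---|
| APCP | 0.875 | 0.847 | 0.883 | 0.888 | 0.857 | 0.876 | 0.840 | 0.807 | 0.850 |
| Bilstm | 0.857 | 0.835 | 0.826 | 0.876 | 0.831 | 0.802 | 0.825 | 0.791 | 0.768 |
| Gru | 0.862 | 0.845 | 0.801 | 0.857 | 0.791 | 0.725 | 0.762 | 0.702 | 0.633 |
| Gp | 0.757 | 0.794 | 0.821 | 0.711 | 0.711 | 0.699 | 0.721 | 0.712 | 0.723 |
| Lssvm | 0.856 | 0.833 | 0.822 | 0.876 | 0.838 | 0.809 | 0.820 | 0.799 | 0.793 |
| DE-Moda | 0.839 | 0.799 | 0.831 | 0.857 | 0.817 | 0.836 | 0.777 | 0.782 | 0.786 |
| DE-Mogwo | 0.839 | 0.806 | 0.831 | 0.860 | 0.820 | 0.842 | 0.776 | 0.776 | 0.810 |
| Eemd-Ws | 0.799 | 0.802 | 0.839 | 0.834 | 0.817 | 0.840 | 0.778 | 0.782 | 0.805 |
| Emd-Ws | 0.837 | 0.825 | 0.816 | 0.859 | 0.861 | 0.867 | 0.794 | 0.621 | 0.818 |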
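Note: This table shows the correlation between all models and the real series. The expression of $\varsigma_{\upsilon}$ is $\varsigma_{\upsilon} = \frac{1}{N}\sum_{\tau=1}^{N} \frac{\min_{\upsilon}\min_{\tau} \left\lvert \varphi_{\tau} - \hat{\varphi}_{\tau}^{\upsilon} \right\rvert + \mu \max_{\upsilon}\max_{\tau} \left\lvert \varphi_{\tau} - \hat{\varphi}_{\tau}^{\upsilon} \right\rvert}{\left\lvert \varphi_{\tau} - \hat{\varphi}_{\tau}^{\upsilon} \right\rvert + \mu \max_{\upsilon}\max_{\tau} \left\lvert \varphi_{\tau} - \hat{\varphi}_{\tau}^{\upsilon} \right\rvert}$, where $\hat{\varphi}_{\tau}^{\upsilon}$ is the $\tau$-th forecast of model $\upsilon$, $\mu$ is the distinguishing coefficient, and $N$ is the number of forecasts.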
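The correlation measure in the note above has the form of a grey relational grade, which can be sketched in Python as below. The function name, the model names, and the toy series are hypothetical, and the distinguishing coefficient is assumed to be 0.5, a common default that the text here does not confirm.

```python
from typing import Dict, Sequence


def grey_relational_grades(
    actual: Sequence[float],
    forecasts: Dict[str, Sequence[float]],
    mu: float = 0.5,  # assumed distinguishing coefficient
) -> Dict[str, float]:
    """Grey relational grade of each forecast series against the actual series."""
    # Absolute deviations of every model at every time step.
    deltas = {
        name: [abs(a - p) for a, p in zip(actual, preds)]
        for name, preds in forecasts.items()
    }
    all_deltas = [d for series in deltas.values() for d in series]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0.0:
        # Every model reproduces the series exactly.
        return {name: 1.0 for name in deltas}
    return {
        name: sum((d_min + mu * d_max) / (d + mu * d_max) for d in series) / len(series)
        for name, series in deltas.items()
    }


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    observed = [1.0, 2.0, 3.0]
    models = {"model_a": [1.0, 2.0, 3.0], "model_b": [2.0, 3.0, 4.0]}
    print(grey_relational_grades(observed, models))
```

Because the minimum and maximum deviations are taken over all models jointly, the grade of one model depends on which other models are included in the comparison.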
Table 11. Sensitivity analysis results of the APCP system.
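| Metric | Guangzhou 1-Step | Guangzhou 2-Step | Guangzhou 3-Step | Shanghai 1-Step | Shanghai 2-Step | Shanghai 3-Step | Chengdu 1-Step | Chengdu 2-Step | Chengdu 3-Step |
|---|---|---|---|---|---|---|---|---|---|
| MSE | 0.066 | 1.632 | 3.202 | 0.609 | 1.856 | 0.420 | 0.111 | 4.351 | 1.818 |
| MRE | 0.000 | 0.001 | 0.002 | 0.002 | 0.003 | 0.001 | 0.001 | 0.001 | 0.001 |
| MAPE | 0.015 | 0.057 | 0.167 | 0.176 | 0.344 | 0.075 | 0.054 | 0.126 | 0.118 |
| R2 | 0.000 | 0.003 | 0.005 | 0.000 | 0.001 | 0.000 | 0.000 | 0.001 | 0.001 |
| MSE | 0.070 | 1.422 | 3.138 | 0.461 | 1.342 | 0.413 | 0.322 | 6.765 | 2.122 |
| MRE | 0.000 | 0.001 | 0.002 | 0.001 | 0.003 | 0.001 | 0.000 | 0.003 | 0.001 |
| MAPE | 0.010 | 0.079 | 0.189 | 0.135 | 0.315 | 0.132 | 0.046 | 0.265 | 0.119 |
| R2 | 0.000 | 0.002 | 0.005 | 0.000 | 0.001 | 0.000 | 0.000 | 0.002 | 0.001 |
| MSE | 0.036 | 1.839 | 3.587 | 0.660 | 1.678 | 1.045 | 0.816 | 6.412 | 1.671 |
| MRE | 0.000 | 0.001 | 0.002 | 0.001 | 0.003 | 0.001 | 0.000 | 0.003 | 0.001 |
| MAPE | 0.009 | 0.092 | 0.174 | 0.140 | 0.282 | 0.111 | 0.048 | 0.300 | 0.062 |
| R2 | 0.000 | 0.003 | 0.006 | 0.000 | 0.001 | 0.001 | 0.000 | 0.002 | 0.001 |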
Note: When one parameter is changed, the other parameters retain their default values. The sensitivity indicator is $\chi_{I_{P}} = \sqrt{\sum_{a=1}^{Q} \left( I_{P}^{a} - \bar{I_{P}} \right)^{2} / Q}$, where $Q = 5$ is the number of values tested for parameter $P$, $I_{P}^{a}$ is the error indicator $I$ under the $a$-th value of parameter $P$, and $\bar{I_{P}}$ is the mean of $I_{P}^{a}$. In addition, the results $\bar{\chi}_{I_{P}} = \sum_{i=1}^{10} \chi_{I_{P}}^{i} / 10$ reported in this table are averages over 10 runs.
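A minimal Python sketch of the sensitivity indicator described in the note above. It assumes the indicator is the population standard deviation of an error metric across the tested parameter values, averaged over repeated runs; the function names and the numbers are hypothetical.

```python
import math
from typing import Sequence


def sensitivity_indicator(errors: Sequence[float]) -> float:
    """Spread of one error metric across the Q tested values of a parameter."""
    q = len(errors)
    mean_error = sum(errors) / q
    return math.sqrt(sum((e - mean_error) ** 2 for e in errors) / q)


def averaged_sensitivity(runs: Sequence[Sequence[float]]) -> float:
    """Average the indicator over repeated runs (10 runs in Table 11)."""
    return sum(sensitivity_indicator(run) for run in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical MSE values for Q = 5 settings of one parameter, two runs.
    runs = [
        [20.8, 20.9, 21.0, 20.7, 20.9],
        [20.9, 20.8, 21.1, 20.8, 20.9],
    ]
    print(averaged_sensitivity(runs))
```

Under this reading, a smaller value means the chosen error metric changes less as the parameter varies.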
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Hao, Y.; Zhou, Y.; Gao, J.; Wang, J. A Novel Air Pollutant Concentration Prediction System Based on Decomposition-Ensemble Mode and Multi-Objective Optimization for Environmental System Management. Systems 2022, 10, 139. https://doi.org/10.3390/systems10050139

