#### *3.2. Data Completion*

The collected PM2.5, PM10, O3, CO, NO2, and SO2 concentration data totaled 26,120 records. Missing values in the time series were filled with the mean of the concentration data at the previous and next time points, as shown in Formula (8). The final dataset contains valid PM2.5, PM10, O3, CO, NO2, and SO2 concentration data for 27,380 time points.

$$X\_t = \frac{1}{2}(X\_{t-1} + X\_{t+1})\tag{8}$$

In (8), *X*<sub>*t*</sub> represents the missing concentration data at the current time point, *X*<sub>*t*−1</sub> represents the concentration data at the previous time point, and *X*<sub>*t*+1</sub> represents the concentration data at the next time point. This completion can add noise to the dataset, since the missing points are filled with roughly estimated values rather than measurements. However, only a small part of the dataset had to be completed: the final dataset is only 3% larger than the original one, which is tolerable for machine learning tasks.
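As an illustration of Formula (8), the following is a minimal sketch rather than the implementation used in this paper. It assumes that each pollutant series has already been placed on a regular time grid with missing time points marked as `NaN`; the function name `fill_missing_with_neighbor_mean` and the sample values are hypothetical. How runs of consecutive missing points are treated is not specified above, so the sketch simply leaves them unfilled.

```python
import numpy as np

def fill_missing_with_neighbor_mean(series):
    """Fill isolated NaN entries with the mean of their two neighbours (Formula (8)).

    Assumption: `series` is sampled on a regular time grid and missing
    time points are encoded as NaN. Gaps at the series boundaries or gaps
    longer than one time point are left as NaN.
    """
    x = np.asarray(series, dtype=float).copy()
    missing = np.isnan(x)
    for t in np.flatnonzero(missing):
        is_interior = 0 < t < len(x) - 1
        # Only fill when both neighbours were actually observed.
        if is_interior and not missing[t - 1] and not missing[t + 1]:
            x[t] = 0.5 * (x[t - 1] + x[t + 1])
    return x

# Hypothetical PM2.5 readings with one isolated gap and one two-point gap.
pm25 = [35.0, np.nan, 41.0, 40.0, np.nan, np.nan, 52.0]
filled = fill_missing_with_neighbor_mean(pm25)
```

In this example the isolated gap at index 1 becomes 38.0, the mean of 35.0 and 41.0, while the two-point gap stays missing because neither of its points has two observed neighbours.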

#### *3.3. Standardized Processing*

In a neural network, features with large values tend to dominate the model, so that the characteristics of features with small values are lost. Therefore, to avoid errors caused by different numerical ranges, we rescale all historical concentration data to the range −1 to 1, as shown in Formula (9).

$$X' = \frac{X - \bar{X}}{X\_{\text{max}} - X\_{\text{min}}} \tag{9}$$

*X′* denotes the concentration data after the standardized processing, *X* represents the original concentration data, *X̄* denotes the mean of the concentration data, *X*<sub>max</sub> denotes the maximum value, and *X*<sub>min</sub> represents the minimum value.
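The standardization described by Formula (9), subtracting the mean and dividing by the range, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used in this paper: the function name `mean_normalize` and the sample values are hypothetical, and the statistics are computed over whatever array is passed in, since the text does not state which subset of the data they are computed on.

```python
import numpy as np

def mean_normalize(x):
    """Rescale a 1-D series as (x - mean) / (max - min), following Formula (9).

    Returns the rescaled series together with the statistics used, so the
    same transform can be applied to other data or inverted later.
    """
    x = np.asarray(x, dtype=float)
    x_mean, x_min, x_max = x.mean(), x.min(), x.max()
    if x_max == x_min:
        raise ValueError("constant series: the range is zero, cannot rescale")
    return (x - x_mean) / (x_max - x_min), (x_mean, x_min, x_max)

# Hypothetical SO2 concentrations.
so2 = np.array([4.0, 6.0, 10.0, 20.0])
scaled, stats = mean_normalize(so2)
```

Because the mean always lies between the minimum and the maximum, every rescaled value falls inside the interval from −1 to 1, the rescaled series has zero mean, and its maximum and minimum differ by exactly 1.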

In this paper, we assume that the PM2.5 or PM10 concentration at the next time point is related to its own short-term historical data and to the O3, CO, NO2, and SO2 concentration values at the same time point. Therefore, we reconstructed the dataset: the PM2.5 concentrations of the past 24 h and the current PM10, O3, CO, NO2, and SO2 concentration values were used as the training input, and the corresponding ground truth was the current PM2.5 concentration. Similarly, we constructed a dataset for predicting the PM10 concentration: the PM10 concentrations of the past 24 h and the current PM2.5, O3, CO, NO2, and SO2 concentration values were used as the training input, and the ground truth was the current PM10 concentration. Finally, we divided the reorganized dataset into a training set, a validation set, and a test set in proportions of 80%, 10%, and 10%.
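The dataset reconstruction for PM2.5 can be sketched as below; the PM10 dataset would be built the same way with the roles of PM2.5 and PM10 swapped. This is a minimal sketch with several assumptions that are not taken from the paper: the data are hourly, so that 24 h corresponds to 24 samples; the helper names `build_pm25_samples` and `chronological_split` are hypothetical; and the 80%/10%/10% split is done in chronological order, since the text does not say whether the split is chronological or random.

```python
import numpy as np

def build_pm25_samples(pm25, others, window=24):
    """Build (input, target) pairs for PM2.5 prediction.

    pm25   : shape (T,), PM2.5 series (assumed hourly).
    others : shape (T, 5), PM10, O3, CO, NO2, SO2 at each time point.
    Each input holds the previous `window` PM2.5 values followed by the
    five other pollutant values at the current time; the target is the
    current PM2.5 value.
    """
    pm25 = np.asarray(pm25, dtype=float)
    others = np.asarray(others, dtype=float)
    features, targets = [], []
    for t in range(window, len(pm25)):
        history = pm25[t - window:t]  # PM2.5 over the past 24 time points
        features.append(np.concatenate([history, others[t]]))
        targets.append(pm25[t])
    return np.array(features), np.array(targets)

def chronological_split(features, targets, train=0.8, val=0.1):
    """Split into training, validation, and test sets without shuffling (assumption)."""
    n = len(features)
    i = int(n * train)
    j = i + int(n * val)
    return ((features[:i], targets[:i]),
            (features[i:j], targets[i:j]),
            (features[j:], targets[j:]))

# Synthetic placeholder data, only to show the resulting shapes.
T = 124
pm25 = np.arange(T, dtype=float)
others = np.ones((T, 5))
X, y = build_pm25_samples(pm25, others)
train_set, val_set, test_set = chronological_split(X, y)
```

With 124 synthetic time points, the sketch yields 100 samples of 29 features each (24 historical PM2.5 values plus 5 current pollutant values), which are split into 80, 10, and 10 samples.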
