#### *2.2. Haze Level Transformation*

The haze-level prediction model outputs a haze level for each region, so we divide the PM2.5 concentrations collected in Beijing into 10 levels. Level 1: 0–35 μg/m³, good air quality, essentially no pollution; level 2: 36–70 μg/m³, acceptable; level 3: 71–105 μg/m³, mild pollution; level 4: 106–140 μg/m³, moderate pollution; levels 5–7: 141–245 μg/m³, severe pollution; levels 8–10: 245–500 μg/m³, severe pollution. We label the training set with a one-hot vector, as in (1).

$$y_i = [p, c_1, c_2, \dots, c_n] \tag{1}$$

The first element of the vector, *p*, indicates whether cloud cover prevents haze characteristics from being extracted. If *p* = 1, the haze characteristics are undetectable; in this case, the parameter optimization ignores the subsequent elements so that they do not affect the parameter updates during training. If *p* = 0, the haze characteristics can be detected. Here *n* is the number of haze levels, and the subsequent elements *c1* to *cn* mark the corresponding haze level. For example, *c4* = 1 indicates that the haze level of the input data is 4. The index *i* is the serial number of the region in which the PM2.5 concentration is located, and *i* ranges from 1 to 9. Hence, the ground truth corresponding to each satellite image is as shown in (2).

$$y = \begin{bmatrix} p^1 & c_1^1 & \dots & c_{10}^1 & \dots & p^9 & c_1^9 & \dots & c_{10}^9 \end{bmatrix} \tag{2}$$
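
As a minimal sketch of the labelling in (1) and (2), the Python snippet below maps a PM2.5 concentration to a level and assembles the 99-element ground-truth vector. The function names, the equal-width split of the combined ranges for levels 5–7 and 8–10 into individual levels, and the choice to leave the level entries at zero when *p* = 1 are assumptions made for illustration; the text gives only the combined ranges and states that the level entries are ignored in that case.

```python
import bisect
import numpy as np

# Upper bound (in μg/m³) of each level. Levels 1-4 follow the text; the
# individual bounds inside levels 5-7 and 8-10 are ASSUMED equal-width splits.
LEVEL_UPPER_BOUNDS = [35, 70, 105, 140, 175, 210, 245, 330, 415, 500]
N_LEVELS = len(LEVEL_UPPER_BOUNDS)


def pm25_to_level(concentration):
    """Return the haze level (1..10) for a PM2.5 concentration in μg/m³."""
    if not 0 <= concentration <= LEVEL_UPPER_BOUNDS[-1]:
        raise ValueError("concentration outside the 0-500 range")
    # Index of the first upper bound that is >= concentration, shifted to 1-based.
    return bisect.bisect_left(LEVEL_UPPER_BOUNDS, concentration) + 1


def encode_region(level, cloudy):
    """Build y_i = [p, c_1, ..., c_n] for one region."""
    y = np.zeros(1 + N_LEVELS, dtype=np.float32)
    if cloudy:
        y[0] = 1.0      # p = 1: haze undetectable; level entries stay 0 (assumption)
    else:
        y[level] = 1.0  # c_level = 1 (index 0 holds p, so c_k sits at index k)
    return y


def encode_image(levels, cloudy_flags):
    """Concatenate the nine region vectors into the 99-element label of (2)."""
    return np.concatenate(
        [encode_region(lv, cl) for lv, cl in zip(levels, cloudy_flags)]
    )
```

For instance, `encode_image([4] * 9, [False] * 8 + [True])` yields a vector of length 99 in which the first eight regions have *c4* = 1 and the ninth region has *p* = 1.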

#### **3. Method**

#### *3.1. Joint Structure of Multi-Convolution Neural Networks*

To identify haze levels at finer spatial scales and to study the temporal and spatial evolution of haze in different regions of Beijing, we use a multi-convolution network structure that first segments the haze data and then classifies it. The multi-convolution neural network structure includes an input layer, block layer, convolution layer, pooling layer, locally full connection layer, and classification layer [37], as shown in Figure 2. The network uses a unified input and a unified output.

**Figure 2.** The structure of the multi-convolution neural network.

The input layer accepts the satellite images processed in Section 2 as inputs to obtain more spatio-temporal data.

The block layer is a sliding window of size 60 × 60 × 3 with a sliding step of 60, so each original 180 × 180 × 3 image is divided into 9 blocks, from top to bottom and from left to right, and each block is then input into a different convolution neural network.
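
The block layer can be sketched with a NumPy reshape, as below. The function name `split_into_blocks` is hypothetical, and reading "from top to bottom, from left to right" as row-major order (Block 1 is the top-left block, Block 3 the top-right, Block 9 the bottom-right) is an assumption.

```python
import numpy as np


def split_into_blocks(image, block=60):
    """Split an (H, W, C) image into non-overlapping (block, block, C) tiles.

    Tiles are returned in row-major order (assumed block numbering).
    """
    h, w, c = image.shape
    if h % block or w % block:
        raise ValueError("image size must be a multiple of the block size")
    rows, cols = h // block, w // block
    # (rows, block, cols, block, C) -> (rows, cols, block, block, C)
    tiles = image.reshape(rows, block, cols, block, c).swapaxes(1, 2)
    return tiles.reshape(rows * cols, block, block, c)
```

Because the step equals the window size, the windows do not overlap, so a 180 × 180 × 3 input gives an array of shape (9, 60, 60, 3).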

Nine separate convolution layers acquire the image features of the nine regions. Because different regions have different background information, such as geographical environment, the separate convolution layers can distinguish the fine-grained differences between regions and provide a haze level for each region.

The pooling layer helps reduce the size of the model and increase the speed of computation. First, we set padding with 0 and choose the max pooling function. Max pooling replaces all the elements in a pooling region with the maximum value of that region.
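
A minimal NumPy sketch of max pooling on one feature map follows. The 2 × 2 window, the stride equal to the window size, and the reading of the padding setting as "no padding" are assumptions; the text does not state the pooling size.

```python
import numpy as np


def max_pool2d(x, size=2):
    """Non-overlapping max pooling over an (H, W, C) feature map.

    Rows and columns that do not fill a complete window are dropped
    (no padding is added).
    """
    h, w, c = x.shape
    h_out, w_out = h // size, w // size
    x = x[: h_out * size, : w_out * size]
    # Group each size x size window on its own pair of axes, then reduce.
    return x.reshape(h_out, size, w_out, size, c).max(axis=(1, 3))
```

With these settings each spatial dimension is halved, which is where the reduction in model size and computation comes from.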

The locally full connection layer prevents the outputs of the different regions from cross-contaminating one another at the fully connected layer, preserving the characteristics of each region. Each locally full connection layer corresponds to a soft-max classification layer. The classification layer consists of 99 nodes corresponding to the nine regions, that is, 11 nodes per region (one *p* element and 10 haze-level elements). The data are marked and represented as in Equation (2).
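
The idea of keeping one fully connected head per region can be sketched as follows. This is an illustration only, not the network of Figure 2: the feature dimension, the parameter shapes, the function names, and the use of a single soft-max over the 11 nodes of each region are assumptions.

```python
import numpy as np


def softmax(z):
    """Numerically stable soft-max over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def local_heads(features, weights, biases):
    """Apply one independent fully connected + soft-max head per region.

    features: (9, d)      flattened features of each region's branch
    weights:  (9, d, 11)  a separate weight matrix for each region
    biases:   (9, 11)
    Returns a 99-element vector laid out as in Equation (2).
    """
    # Region r only ever sees features[r], so regions cannot mix.
    logits = np.einsum("rd,rdk->rk", features, weights) + biases
    return softmax(logits).reshape(-1)
```

Because each head reads only its own region's features, changing the input of one region leaves the other 88 outputs unchanged, which is the property the locally full connection layer is meant to preserve.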

#### *3.2. Spatial Autocorrelation Analysis of Haze Concentration*

Global spatial autocorrelation analysis reflects the relationship between haze concentration and its spatial distribution. If a variable is related to geographic location, its values are also correlated across geographic space, and the degree of correlation is inversely proportional to the distance between regions. The resulting spatial pattern may be clustered, random, or regular. Spatial autocorrelation analysis of haze helps to understand this phenomenon better: in essence, it analyzes how haze is distributed across geographic space and how different regions are correlated. In this section, the nine blocks are labelled in matrix notation as Block 1 to Block 9, from top to bottom and from left to right.
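
One widely used statistic for global spatial autocorrelation is global Moran's I. The passage above does not name the statistic used, so the choice of Moran's I, the rook (edge-sharing) adjacency weights on the 3 × 3 grid, and the function names below are assumptions made for illustration.

```python
import numpy as np


def rook_weights(rows=3, cols=3):
    """Binary spatial weights: blocks sharing an edge are neighbours."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c  # row-major block index (Block 1 -> 0)
            if c + 1 < cols:
                w[i, i + 1] = w[i + 1, i] = 1.0        # right neighbour
            if r + 1 < rows:
                w[i, i + cols] = w[i + cols, i] = 1.0  # lower neighbour
    return w


def morans_i(values, w):
    """Global Moran's I: (n / S0) * (z' W z) / (z' z), with z = x - mean(x)."""
    x = np.asarray(values, dtype=float).ravel()
    z = x - x.mean()
    denom = (z ** 2).sum()
    if denom == 0:
        raise ValueError("Moran's I is undefined for constant values")
    return len(x) / w.sum() * (z @ w @ z) / denom
```

Positive values indicate clustering of similar haze levels in neighbouring blocks, values near zero a random arrangement, and negative values a regular (alternating) arrangement.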
