#### *2.2. Data Sources and Validity*

This study collected hourly PM2.5 concentration data from 347 automatic air quality monitoring stations in the study area for the period from 1 January 2015 to 31 December 2019. The data were obtained from the Urban Air Quality Distribution platform of the National Environmental Monitoring Center (http://www.moc.cma.gov.cn, accessed on 9 October 2021). From the hourly data, the annual PM2.5 concentration of each city from 2015 to 2019 was calculated as an arithmetic mean. To ensure data validity, we handled missing values according to the provisions of the Ambient Air Quality Standard (GB 3095-2012). A daily average concentration was considered valid only if more than 20 hourly average concentrations (or hours of sampling time) were available for that day. A monthly average concentration required at least 27 daily average concentrations (25 in February), and an annual average concentration required at least 324 daily average concentrations; averages that did not meet these thresholds were considered invalid.
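The validity thresholds described above can be illustrated with a minimal Python sketch. The function names and the use of `None` for a missing or invalid value are assumptions made for illustration; the source does not describe the software used for this step.

```python
import math
from statistics import fmean

# Thresholds taken from the text above.
MIN_DAILY_VALUES_PER_MONTH = 27      # at least 27 daily averages per month
MIN_DAILY_VALUES_FEBRUARY = 25       # at least 25 daily averages in February
MIN_DAILY_VALUES_PER_YEAR = 324      # at least 324 daily averages per year


def _valid(values):
    """Keep only values that are neither None nor NaN."""
    return [v for v in values if v is not None and not math.isnan(v)]


def daily_mean(hourly_values):
    """Daily average, or None if the day has 20 or fewer valid hourly values."""
    valid = _valid(hourly_values)
    # The text requires "more than 20" hourly values, hence the strict comparison.
    return fmean(valid) if len(valid) > 20 else None


def monthly_mean(daily_values, is_february=False):
    """Monthly average, or None if too few valid daily averages exist."""
    valid = _valid(daily_values)
    required = MIN_DAILY_VALUES_FEBRUARY if is_february else MIN_DAILY_VALUES_PER_MONTH
    return fmean(valid) if len(valid) >= required else None


def annual_mean(daily_values):
    """Annual average, or None if fewer than 324 valid daily averages exist."""
    valid = _valid(daily_values)
    return fmean(valid) if len(valid) >= MIN_DAILY_VALUES_PER_YEAR else None
```

Returning `None` for an invalid average lets the next aggregation level skip it in the same way as a missing measurement.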

The potential impact of socioeconomic indicators on PM2.5 pollution has been widely discussed. Based on previous studies and the availability of socioeconomic data, we selected seven indicators (Table 1): Population (POP), Gross Domestic Product (GDP), Green Ratio of Built-up Area (GR), Output of Second Industry (SI), Proportion of Urban Population (UP), Roads Density (RD), and Proportion of Built-up Area (BA). Among them, POP, GDP, and GR represent population size, economic development level, and urban greening, respectively; SI and RD represent industrial structure and traffic factors, respectively; and UP and BA represent population urbanization and spatial urbanization, respectively. The annual statistical data for POP, GDP, SI, and RD were acquired from the Social and Economic Development Bulletin and Statistical Yearbook of each city in the study area, while those for GR and BA were obtained from the China Urban Statistical Yearbook. The time span of all socioeconomic indicators was consistent with that of the PM2.5 data in this study. Figure S4 provides detailed statistical information on these socioeconomic factors for each city.


**Table 1.** Socioeconomic indicators with their abbreviations and units.

#### *2.3. Statistical Methods*

#### 2.3.1. Moran's I Test

Air pollution usually shows clear spatial distribution characteristics with regional aggregation, and Moran's I is commonly used to test the spatial correlation of variables. In this study, we used the Global Moran's I to test the overall spatial effect of PM2.5 concentrations in 58 cities, from 2015 to 2019. The Global Moran's I is defined as follows [17]:

$$\text{Global Moran's } I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left( y_i - \overline{y} \right) \left( y_j - \overline{y} \right)}{S_0 \sum_{i=1}^{n} \left( y_i - \overline{y} \right)^2} \tag{1}$$

$$Z = \frac{I - E[I]}{\sqrt{V[I]}}\tag{2}$$

$$E[I] = -1/\left(n-1\right)\tag{3}$$

$$V[I] = E\left[I^2\right] - E[I]^2\tag{4}$$

where $y_i$ is the PM2.5 concentration of city $i$, $y_j$ is the PM2.5 concentration of city $j$, and $\overline{y}$ is the average PM2.5 concentration of the study area. $w_{ij}$ is the element of the spatial weight matrix for cities $i$ and $j$: if the two cities share a common boundary, the weight is 1; otherwise, it is 0. $S_0 = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}$ is the sum of all spatial weights, and $n = 56$ is the number of cities. The $Z$ score and $p$ value are used to judge the significance level of Moran's I: when $|Z| > 1.96$ or $p < 0.05$, the result is considered significant at the 95% confidence level; when $|Z| > 2.58$ or $p < 0.01$, the result is considered significant at the 99% confidence level. In this paper, the Global Moran's I was calculated using ArcGIS software.
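As an illustration of how the Global Moran's I and its Z score can be computed from a vector of concentrations and a binary contiguity matrix, a minimal pure-Python sketch follows. It is not the ArcGIS implementation used in this paper. The function name is hypothetical, and the closed form for $E[I^2]$ assumes the normality assumption, which the source does not specify; a tool that uses the randomization assumption would give a slightly different variance.

```python
import math


def global_morans_i(y, w):
    """Return (I, E[I], V[I], Z) for values y and spatial weight matrix w.

    y: list of n values (e.g. annual PM2.5 concentration per city).
    w: n x n list of lists; w[i][j] = 1 if cities i and j share a boundary, else 0.
    The variance uses the normality assumption (an assumption of this sketch).
    """
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]

    # S0: sum of all spatial weights.
    s0 = sum(sum(row) for row in w)

    # Numerator cross-product and denominator sum of squares of Moran's I.
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    i_stat = n * cross / (s0 * sum(d * d for d in dev))

    # Expected value under the null hypothesis of no spatial autocorrelation.
    expected = -1.0 / (n - 1)

    # E[I^2] under the normality assumption, then V[I] = E[I^2] - E[I]^2.
    s1 = 0.5 * sum((w[i][j] + w[j][i]) ** 2 for i in range(n) for j in range(n))
    row_sums = [sum(w[i]) for i in range(n)]
    col_sums = [sum(w[i][j] for i in range(n)) for j in range(n)]
    s2 = sum((row_sums[i] + col_sums[i]) ** 2 for i in range(n))
    e_i2 = (n * n * s1 - n * s2 + 3 * s0 ** 2) / ((n * n - 1) * s0 ** 2)
    variance = e_i2 - expected ** 2

    z = (i_stat - expected) / math.sqrt(variance)
    return i_stat, expected, variance, z


# Hypothetical example: four cities in a row, each bordering only its neighbours.
values = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_stat, expected, variance, z = global_morans_i(values, weights)
print(i_stat, expected, variance, z)
```

A positive $I$ with $|Z| > 1.96$ would indicate significant spatial clustering at the 95% confidence level, following the thresholds given above.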
