*4.4. Propensity Score Matching Results and Descriptive Statistics*

#### 4.4.1. Counterfactual Matches with the Equation Estimates

Rosenbaum and Rubin proposed the propensity score method [41]. Simulation experiments show that it yields unbiased estimates of the ATT under a series of assumptions. It can be defined as "an algorithm that matches the treatment group and the control group based on the conditional probability of participants, namely the propensity score, under the condition of given observable characteristics". The propensity score is defined as:

$$P(X\_i) = \Pr\left(\text{exp}\_i = 1 \mid X\_i\right) \tag{3}$$

According to Equation (3), cities in the treatment group are matched to cities in the control group with similar propensity scores. The validity of this matching depends on two preconditions: conditional independence and common support. Conditional independence means that, after controlling for the common influencing factors *X*, whether a city is an ETS pilot is independent of its carbon intensity. Common support ensures that each city in the treatment group can be matched to cities in the control group through propensity score matching. The average treatment effect (ATE) of city *i* can be expressed as

$$E[\Delta\_i] = E\left[\ln y\_i^1 \left(ny\_i^1, fy\_i^1\right) \mid \text{exp}\_i = 1, P(X\_i)\right] - E\left[\ln y\_i^0 \left(ny\_i^0, fy\_i^0\right) \mid \text{exp}\_i = 1, P(X\_i)\right] \tag{4}$$

Estimating *P*(*X*) means estimating the probability that a city is an ETS pilot. A Probit or Logit binary choice model is most commonly used. In this paper, a Probit model is used to obtain the predicted probability *p<sub>i</sub>* of city *i* in the treatment group and *p<sub>j</sub>* of city *j* in the control group. The average treatment effect on the treated (ATT) of ETS on carbon intensity is as follows:

$$\beta = \frac{1}{M} \sum\_{i \in (\text{exp} = 1)} \left[ \ln y\_i \left(ny\_i, fy\_i\right) - \sum\_{j \in (\text{exp} = 0)} Y(NY, FY)\left(p\_i, p\_j\right) \ln y\_j \left(ny\_j, fy\_j\right) \right] \tag{5}$$

where *M* is the number of cities in which ETS was piloted. *Y*(*NY*, *FY*)(*p<sub>i</sub>*, *p<sub>j</sub>*) is the weight assigned to ln *y<sub>j</sub>*(*ny<sub>j</sub>*, *fy<sub>j</sub>*) of control city *j* when it is used in place of the unobserved ln *y*<sup>0</sup><sub>*i*</sub>(*ny*<sup>0</sup><sub>*i*</sub>, *fy*<sup>0</sup><sub>*i*</sub>) of treated city *i*. When the corresponding assumptions are met, especially when the mean values of the variables do not differ between the treatment group and the control group, the propensity score matching method can recover the ATT, and a "clean" policy treatment effect can be obtained. Of course, eliminating this noise completely requires controlling for all variables that may affect both selection and outcome when matching. The weight function differs by matching method (radius matching, caliper matching, local linear regression matching, etc.). This study first uses nearest neighbor matching with k = 4 and then applies other matching methods in the robustness test.
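The matching estimator in Equation (5) can be illustrated with a minimal sketch, assuming the propensity scores have already been estimated. Under nearest neighbor matching, the weight is taken here as 1/k for the k control cities closest in propensity score and 0 otherwise, with replacement; the function name `att_nearest_neighbor` and the toy data are hypothetical and are not taken from the paper.

```python
def att_nearest_neighbor(scores, outcomes, treated, k=4):
    """ATT by k-nearest-neighbor matching on the propensity score.

    scores   : estimated propensity scores (assumed already fitted, e.g. by a Probit)
    outcomes : outcome per city (e.g. log carbon intensity)
    treated  : 1 for pilot cities, 0 for non-pilot cities
    """
    controls = [(p, y) for p, y, d in zip(scores, outcomes, treated) if d == 0]
    pilots = [(p, y) for p, y, d in zip(scores, outcomes, treated) if d == 1]
    if not pilots or len(controls) < k:
        raise ValueError("need at least one treated unit and k control units")

    total_gap = 0.0
    for p_i, y_i in pilots:
        # The k controls closest in propensity score each get weight 1/k;
        # all other controls get weight 0 (matching with replacement).
        nearest = sorted(controls, key=lambda c: abs(c[0] - p_i))[:k]
        counterfactual = sum(y for _, y in nearest) / k
        total_gap += y_i - counterfactual
    return total_gap / len(pilots)  # 1/M times the sum over treated cities


# Hypothetical toy data, for illustration only.
scores = [0.80, 0.60, 0.79, 0.81, 0.78, 0.82, 0.61, 0.59, 0.62, 0.58, 0.10]
outcomes = [1.0, 2.0, 1.5, 1.5, 1.5, 1.5, 2.5, 2.5, 2.5, 2.5, 9.0]
treated = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
att = att_nearest_neighbor(scores, outcomes, treated, k=4)
```

In this toy example the control city with score 0.10 is never used, which is the role of the weight function: controls far from a treated city in propensity score do not contribute to its counterfactual.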

#### 4.4.2. Plot of Propensity Score Matching Kernel Density Function

The quality of propensity score matching can be examined with a plot of kernel density functions. The more the densities of the treatment group and the control group overlap, the better the matching quality. Figure 2 shows the kernel density functions of the two groups of cities before and after propensity score matching. The solid line represents the cities in the treatment group, and the dashed line represents the cities in the control group. As shown in Figure 2, prior to PSM, the two groups differed considerably in both skewness and kurtosis. After PSM, the two curves follow the same trend and largely coincide. This indicates that the propensity score matching is effective and provides a good data basis for the use of the DID method in the empirical part of this paper.

*Int. J. Environ. Res. Public Health* **2022**, *19*, x FOR PEER REVIEW 9 of 20

**Figure 2.** Kernel density function before and after PSM in treatment group and control group.

#### 4.4.3. Balance Test of Propensity Score Matching and Variables of Descriptive Statistics

For the PSM results to be robust, the two groups of cities should show no obvious difference in any matching variable after matching. Whether PSM is effective is generally judged with a balance test, which examines the absolute value of the standardized deviation of each matching variable: the smaller the absolute value, the better the matching effect. Table 2 shows that the absolute value of the standardized deviation decreases substantially after PSM for most matching variables. The *t*-tests that were significant before matching become insignificant after matching. This indicates that the null hypothesis that the mean of each variable is the same in both groups after matching cannot be rejected, so the propensity score matching is valid. Table 3 shows the descriptive statistics of variables after PSM.

**Table 2.** Balance test of propensity score matching.

| Variable | Sample | Mean (Treat) | Mean (Control) | Standardized deviation (%) | Deviation reduction (%) | *t*-test *p* > \|*t*\| |
| --- | --- | --- | --- | --- | --- | --- |
| PGDP | Before | 10.576 | 10.432 | 20 | 99.3 | 0.000 |
| | After | 10.576 | 10.575 | 0.1 | | 0.982 |
| IND | Before | 3.818 | 3.844 | −11.2 | 99.3 | 0.026 |
| | After | 3.818 | 3.836 | 7.8 | | 0.203 |
| PP | Before | 6.084 | 5.962 | 19 | 97 | 0.000 |
| | After | 6.084 | 6.088 | −0.6 | | 0.926 |
| OPEN | Before | 0.003 | 0.003 | 23.7 | 95.6 | 0.000 |
| | After | 0.003 | 0.003 | 1 | | 0.877 |
| FS | Before | 0.725 | 0.727 | −1.2 | −550.3 | 0.824 |
| | After | 0.725 | 0.706 | 7.8 | | 0.182 |
| FD | Before | 0.820 | 0.812 | 1.8 | −236.8 | 0.694 |
| | After | 0.820 | 0.794 | 5.9 | | 0.323 |
| WODK | Before | 0.003 | 0.003 | 11.5 | 78.1 | 0.017 |
| | After | 0.003 | 0.004 | −2.5 | | 0.692 |
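As a rough illustration of the balance statistics, the sketch below assumes the common definition of the standardized percentage deviation (difference in group means divided by the square root of the average of the two group variances) and defines the deviation reduction as the percentage fall in its absolute value. The paper does not state its exact formulas, so both definitions and the function names are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(treated_values, control_values):
    """Standardized difference in means between two groups, in percent."""
    pooled_sd = sqrt((variance(treated_values) + variance(control_values)) / 2)
    return 100 * (mean(treated_values) - mean(control_values)) / pooled_sd


def bias_reduction(bias_before, bias_after):
    """Percentage reduction in absolute bias; negative if imbalance grew."""
    return 100 * (1 - abs(bias_after) / abs(bias_before))


# Hypothetical samples, for illustration only.
example_bias = standardized_bias([2, 4, 6], [1, 3, 5])

# Rounded FS deviations before and after matching (-1.2 and 7.8).
fs_reduction = bias_reduction(-1.2, 7.8)
```

With the rounded FS values, this definition gives a reduction of about −550%, close to the reported −550.3; a negative reduction means the absolute deviation increased after matching, even though the difference remains statistically insignificant.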

**Table 3.** Descriptive statistics.


#### **5. Empirical Analysis**

*5.1. Results of Dual Difference Regression*

To identify the causal impact of ETS on urban carbon intensity more clearly, this section introduces the control variables described above. A model combining city fixed effects (Id) and year fixed effects (Year) is used for further analysis, and the results are shown in Table 4. Column (1) shows that, without any control variables, the coefficient of Treated ∗ time is significantly negative at the 1% level. Columns (2)–(4) show that the coefficient of Treated ∗ time remains significantly negative at the 1% level after other control variables are introduced. This indicates that ETS can effectively reduce urban carbon intensity. To make the results more reliable, column (5) reports a dynamic instrumental-variable estimation by the generalized method of moments (GMM). The coefficient of Treated ∗ time is still significantly negative at the 1% level, which further verifies the conclusion of this paper. These results indicate that the introduction of pilot carbon emission trading policies can effectively reduce urban carbon intensity, which supports Hypothesis 1.
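The role of the two sets of fixed effects can be sketched as follows. This is a minimal illustration for a balanced panel with no control variables, in which removing city means and year means (the within transformation) and regressing the transformed outcome on the transformed Treated ∗ time term gives the two-way fixed effects coefficient. The function name `twfe_did` and the simulated data are hypothetical.

```python
def twfe_did(y, d):
    """Two-way fixed effects DID coefficient for a balanced panel.

    y[i][t] : outcome of city i in year t
    d[i][t] : Treated * time term (1 if city i is a pilot and year t is post-policy)
    """
    n, periods = len(y), len(y[0])

    def demean(x):
        # Within transformation: subtract city and year means, add back the grand mean.
        row = [sum(r) / periods for r in x]
        col = [sum(x[i][t] for i in range(n)) / n for t in range(periods)]
        grand = sum(row) / n
        return [[x[i][t] - row[i] - col[t] + grand for t in range(periods)]
                for i in range(n)]

    y_w, d_w = demean(y), demean(d)
    num = sum(y_w[i][t] * d_w[i][t] for i in range(n) for t in range(periods))
    den = sum(d_w[i][t] ** 2 for i in range(n) for t in range(periods))
    return num / den


# Hypothetical noise-free panel: 4 cities, 4 years, cities 0 and 1 treated from year 2.
city_effect = [1.0, 2.0, 3.0, 4.0]
year_effect = [0.0, 0.1, 0.2, 0.3]
d = [[1 if (i < 2 and t >= 2) else 0 for t in range(4)] for i in range(4)]
y = [[city_effect[i] + year_effect[t] - 0.15 * d[i][t] for t in range(4)]
     for i in range(4)]
beta_hat = twfe_did(y, d)
```

Because the simulated outcome contains no noise, the sketch recovers the assumed effect of −0.15 regardless of the city and year effects, which is what the fixed effects are meant to absorb.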


**Table 4.** Dual difference regression.

Note: \*, \*\*, and \*\*\* represent the significance levels of 10%, 5%, and 1% respectively. The clustering standard error is shown in brackets.

#### *5.2. Heterogeneity Analysis*

#### 5.2.1. Regional Heterogeneity Test

On the whole, ETS can effectively reduce the carbon intensity of cities. However, cities in China face different external environments, which leads to obvious differences across regions. In particular, the pilot carbon emission trading policy is closely related to the energy environment, and the economically developed eastern region differs markedly from the less-developed central and western regions in infrastructure and other conditions. Table 5 shows the regional heterogeneity results. Columns (1)–(3) show that the estimate for the eastern region is weaker than those for the central and western regions in both coefficient size and significance level. This indicates that the ETS policy is less effective at reducing carbon intensity in the eastern region. Relative to the central and western regions, the eastern region has a large population and a more developed economy, with a concentration of various industries. As a result, industrial and residential use of electricity and heat is large. Because this consumption is tied to normal economic and social activity, ETS cannot reduce the carbon intensity of eastern cities in a short time. In China's central and western regions, by contrast, the population is smaller and the level of economic development is lower. In those regions, the establishment of carbon emission trading pilots can effectively reduce carbon intensity.



Note: \*, \*\*, and \*\*\* represent the significance levels of 10%, 5%, and 1% respectively. The clustering standard error is shown in brackets.

#### 5.2.2. Quantile Regression Test

The regional heterogeneity of the effect of ETS on urban carbon intensity has been analyzed above. This part analyzes the quantile heterogeneity, that is, the policy effect of ETS on cities with high and low carbon intensity. Table 6 shows that, at every quantile M, ETS has a dampening effect on carbon intensity. Moreover, the size of the effect changes considerably across quantiles. Specifically, the emission reduction effect of ETS is more pronounced for cities with higher carbon intensity. Figure 3 shows the trend of the quantile regression coefficients. The horizontal axis shows the quantiles of urban carbon intensity, and the vertical axis shows the regression coefficients of each variable. The dashed line represents the OLS regression estimate of the corresponding explanatory variable, and the region between the two dotted lines is its 95% confidence interval. The solid lines are the quantile regression estimates of each explanatory variable, and the shaded area is their 95% confidence interval. Figure 3 further shows that the emission reduction effect of ETS on cities with higher carbon intensity is more obvious.
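For readers unfamiliar with quantile regression, the sketch below shows the check (pinball) loss that it minimizes in place of squared error. It is an intercept-only illustration, so it returns a sample quantile rather than regression coefficients; the function names and the toy values are hypothetical.

```python
def check_loss(u, tau):
    """Quantile (check) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return u * (tau - (1 if u < 0 else 0))


def sample_quantile(values, tau):
    """Value in the sample that minimizes the total check loss at quantile tau."""
    return min(values, key=lambda q: sum(check_loss(y - q, tau) for y in values))


# Hypothetical carbon intensity values 0, 1, ..., 10, for illustration only.
intensity = list(range(11))
q50 = sample_quantile(intensity, 0.5)
q90 = sample_quantile(intensity, 0.9)
```

Raising the quantile from 0.5 to 0.9 shifts the fitted value toward the high-intensity end of the distribution, which is why coefficients estimated at M = 0.9 describe the policy effect for cities with high carbon intensity.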


**Table 6.** Quantile regression.

| Variable | M = 0.1 | M = 0.3 | M = 0.5 | M = 0.7 | M = 0.9 |
| --- | --- | --- | --- | --- | --- |
| Treated ∗ time | −0.203 \*\*\* | −0.220 \*\*\* | −0.328 \*\*\* | −0.435 \*\*\* | −0.520 \*\*\* |
| | (0.055) | (0.033) | (0.021) | (0.028) | (0.045) |
| Constant | 0.049 | 0.041 | 0.255 | 0.459 | 1.513 \*\*\* |
| | (0.355) | (0.232) | (0.345) | (0.283) | (0.359) |
| Control | YES | YES | YES | YES | YES |
| Id | YES | YES | YES | YES | YES |
| Year | YES | YES | YES | YES | YES |
| Sample size | 3432 | 3432 | 3432 | 3432 | 3432 |

Note: \*\*\* represents significance at the 1% level. Clustered standard errors are shown in parentheses.

**Figure 3.** Quantile regression trend of carbon intensity of cities under ETS.

#### *5.3. Robustness Test*

#### 5.3.1. Parallel Trend Test

This section presents the results of the parallel trend test. The test equation is specified as:

$$\text{Carbon}\_{\text{it}} = \alpha\_0 + \sum\_{d=-5}^{5} \omega\_d \text{treated}\_{\text{it}} \* \text{time}\_{\text{it}} + \sum\_{j=1}^{N} \beta\_j \text{control}\_{\text{it}} + \mu\_i + \gamma\_t + \varepsilon\_{\text{it}} \tag{6}$$

The main variables in the above formula have the same meaning as in Formula (1), where d\_5 represents the 5th year before the introduction of the ETS policy and d5 represents the 5th year after it. The coefficient *ω<sub>d</sub>* is the focus of this test. If the coefficient estimates are insignificant before ETS, significantly negative after ETS, and differ in their marginal effects, then the parallel trend assumption is satisfied. As shown in Figure 4, before the ETS, the effect on carbon intensity is not significant. After the establishment of the ETS, the coefficient is significantly negative. The marginal effect of ETS on carbon intensity mitigation strengthens over time, showing a long-term emission reduction effect. This supports the use of the PSM-DID method in this paper.
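The relative-time indicators used in the parallel trend test can be constructed as in the sketch below. The naming follows the text (d\_5 for the 5th year before the policy, d5 for the 5th year after), while the policy year in the example, the binning of years outside the window into the end points, and the function name are assumptions made purely for illustration. In practice one period is usually dropped as the reference category; the text does not say which.

```python
def event_time_dummies(year, policy_year, is_pilot, window=5):
    """Relative-time dummies d_5, ..., d_1, d0, d1, ..., d5 for one city-year.

    d_k marks the k-th year before the policy and dk the k-th year after it.
    """
    rel = year - policy_year
    # Assumption: years outside the window are binned into the end points.
    rel = max(-window, min(window, rel))
    dummies = {}
    for k in range(-window, window + 1):
        name = f"d_{-k}" if k < 0 else f"d{k}"
        # Only pilot cities ever switch a dummy on.
        dummies[name] = int(bool(is_pilot) and rel == k)
    return dummies


# Hypothetical policy year, for illustration only.
pilot_2011 = event_time_dummies(2011, 2013, True)
control_2011 = event_time_dummies(2011, 2013, False)
```

Each pilot city-year activates exactly one indicator, so the estimated coefficient on each indicator traces the gap between pilot and non-pilot cities in that relative year.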

**Figure 4.** Parallel trend test.

#### 5.3.2. Change the Sample-Matching Method

The nearest neighbor matching method with K = 4 was used above for data matching and processing. To make the above conclusion more robust, this part re-matches the data with alternative methods: Mahalanobis distance matching, caliper matching, radius matching, and kernel matching. Table 7 shows the difference-in-differences estimates under each matching method. After changing the matching method, the estimated results are close to the regression results above. This indicates that the above regression results are reliable, verifying that ETS can effectively reduce urban carbon intensity.

**Table 7.** Results of replacing the matched DID.

| Variable | Mahalanobis distance matching | Caliper matching | Radius matching | Kernel matching |
| --- | --- | --- | --- | --- |
| Treated ∗ time | −0.168 \*\*\* | −0.141 \*\*\* | −0.168 \*\*\* | −0.141 \*\*\* |
| | (0.026) | (0.024) | (0.026) | (0.024) |
| Constant | 4.530 \*\*\* | 4.932 \*\*\* | 4.530 \*\*\* | 4.861 \*\*\* |
| | (0.469) | (0.488) | (0.469) | (0.493) |
| Control | YES | YES | YES | YES |
| Id | YES | YES | YES | YES |
| Year | YES | YES | YES | YES |
| R<sup>2</sup> | 0.872 | 0.905 | 0.872 | 0.904 |
| Sample size | 3934 | 3432 | 3934 | 3434 |

Note: \*\*\* represents significance at the 1% level. Clustered standard errors are shown in parentheses.

#### 5.3.3. Placebo Test

The cities in which ETS was piloted may have been chosen as pilots because of their relatively complete infrastructure and high economic development potential. Therefore, to rule out interference from other unobservable factors, a placebo test was used to further check the reliability of the previous conclusions. In this part, the interaction term is randomly generated 1000 times to check whether the resulting coefficients differ significantly from the benchmark estimate. The results are shown in Figure 5. The dashed line marks the actual coefficient estimated by PSM-DID, −0.142. This estimate lies below the coefficients obtained from the 1000 random draws, which indicates that the benchmark result is unlikely to be driven by unobservable factors. This supports the reliability of the conclusions of this paper.
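The logic of the placebo test can be sketched as a permutation exercise. The example below is a simplified illustration that uses a two-period difference-in-differences of group means rather than the full PSM-DID model, and it reshuffles the pilot labels while keeping the number of pseudo-pilot cities fixed. The function names, the seed, and the data are hypothetical.

```python
import random


def did_estimate(pre, post, treated):
    """Two-period DID: mean change of treated cities minus mean change of controls."""
    change_t = [b - a for a, b, d in zip(pre, post, treated) if d == 1]
    change_c = [b - a for a, b, d in zip(pre, post, treated) if d == 0]
    return sum(change_t) / len(change_t) - sum(change_c) / len(change_c)


def placebo_distribution(pre, post, treated, draws=1000, seed=0):
    """DID estimates under randomly reassigned (placebo) treatment labels."""
    rng = random.Random(seed)
    labels = list(treated)
    estimates = []
    for _ in range(draws):
        rng.shuffle(labels)  # keeps the number of pseudo-treated cities fixed
        estimates.append(did_estimate(pre, post, labels))
    return estimates


# Hypothetical data: 20 cities, the first 5 are pilots whose outcome falls by 0.5.
treated = [1] * 5 + [0] * 15
pre = [1.0] * 20
post = [pre[i] + (-0.5 if treated[i] else ((i % 3) - 1) * 0.05) for i in range(20)]

actual = did_estimate(pre, post, treated)
placebo = placebo_distribution(pre, post, treated)
share_below = sum(e <= actual for e in placebo) / len(placebo)
```

If the actual estimate sits in the far tail of the placebo distribution, as `share_below` measures here, the effect is unlikely to arise from an arbitrary assignment of pilot status.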

**Figure 5.** Placebo test.


#### **6. Further Analysis**

#### *6.1. The Mediation Effect Test*

The above empirical results show that introducing the carbon emission trading pilot policy has alleviated the carbon intensity of cities. Then, how does ETS affect the carbon intensity, and what is the specific mechanism? According to the above theoretical analysis, this paper argues that the pilot carbon emissions trading policy acts through green technology innovation and environmental governance. Therefore, this paper will examine the intermediary mechanism from the two channels of green technology innovation and environmental governance.

$$M\_{\text{it}} = \beta\_0 + \delta\_2 \text{treated}\_{\text{it}} \* \text{time}\_{\text{it}} + \sum\_{j=1}^{N} w\_j \text{control}\_{\text{it}} + \mu\_i + \gamma\_t + \varepsilon\_{\text{1it}} \tag{7}$$

$$\text{Carbon}\_{\text{it}} = \theta\_1 + \delta\_3 \text{treated}\_{\text{it}} \* \text{time}\_{\text{it}} + \delta\_4 M\_{\text{it}} + \sum\_{j=1}^{N} e\_j \text{control}\_{\text{it}} + \mu\_i + \gamma\_t + \varepsilon\_{\text{2it}} \tag{8}$$

In the above equations, M represents the mediating variables, namely green technological innovation (Inno) and environmental governance (Trash). Green technological innovation is represented by the number of green invention patents and green utility model patents granted per capita in cities [42]; a larger value indicates a higher level of green technology innovation. The environmental governance index is measured by the sum of waste water, waste gas, and solid waste generated by the city; a smaller value indicates a higher level of environmental governance.

Traditional parameter estimation methods assume that the data are normally distributed, and the use of stepwise regression may affect the assessed policy effects. Therefore, the Sobel test and the Bootstrap method were used to test the mediating effect in this part. The Bootstrap test uses the mixed effects hypothesis. In this paper, the original sample was randomly resampled 1000 times (*n* = 1000), and the asymmetry in the distribution of the indicators was corrected. This can significantly improve the accuracy of model testing under a complex mediation structure.
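The Sobel statistic mentioned above can be computed from the two mediation regressions, as in the following sketch. Here `a` stands for the coefficient of Treated ∗ time in Equation (7) and `b` for the coefficient of the mediator in Equation (8), with their standard errors; the numerical inputs are hypothetical and are not the paper's estimates.

```python
from math import erfc, sqrt


def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided p-value for the indirect effect a * b."""
    z = (a * b) / sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value under the standard normal
    return z, p_value


# Hypothetical coefficients and standard errors, for illustration only.
z_stat, p_val = sobel_test(a=0.5, se_a=0.1, b=-0.4, se_b=0.1)
```

The Sobel test relies on the normal approximation for the product of the two coefficients, which is one reason a Bootstrap interval is often reported alongside it.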

Table 8 shows the mediation test results. When Inno is used as the mediating variable, the coefficient of Treated ∗ time is significantly positive at the 1% level. This indicates that the introduction of the pilot policy of carbon emission trading has raised the level of urban green technological innovation. In the regression on urban carbon intensity, the coefficient of the mediating variable is significantly negative at the 1% level. This shows that the improvement of green technology innovation alleviates urban carbon intensity, and the path of "carbon emission trading pilot policy-green technology innovation-urban carbon intensity" is established. This supports Hypothesis 2.

When the level of environmental governance is used as the mediating variable, the coefficient of Treated ∗ time is significantly negative at the 1% level. Because a smaller value of the index means better governance, this shows that the introduction of the pilot policy of carbon emission trading has improved the level of urban environmental governance. In the regression on urban carbon intensity, the coefficient of the mediating variable is significantly positive at the 1% level. This indicates that the improvement of environmental governance alleviates urban carbon intensity. The path of "carbon emission trading pilot policy-environmental governance level-urban carbon intensity" is established. This supports Hypothesis 3.


**Table 8.** Results of mediating effect test.



Note: \*\* and \*\*\* represent the significance levels of 5%, and 1% respectively. The clustering standard error is shown in brackets.

#### *6.2. Spatial Spillover Effect Test*

#### 6.2.1. Model Set and Related Analysis

According to the above analysis, the impact of the pilot carbon emission trading policy on carbon intensity may have a spatial spillover effect, which needs further analysis. Therefore, this paper establishes a spatial econometric model based on the equation:

$$\text{Carbon}\_{\text{it}} = \beta\_0 + \beta\_1 W \times \text{Carbon}\_{\text{it}} + \beta\_2 CD\_{\text{it}} + \beta\_3 W \times CD\_{\text{it}} + \sum\_{j=1}^{N} q\_j \text{control}\_{\text{it}} + \mu\_i + \gamma\_t + \varepsilon\_{\text{it}} \tag{9}$$

where *W* is the spatial weight matrix. Equation (9) adds the spatial interaction term (*W* × *CD*) of the core explanatory variable (*CD*) to the equation, so the model estimates the spatial spillover effects of both the explained variable and the core explanatory variable. For the spatial weight matrix, this paper uses the geographical inverse distance matrix to study the possible spatial spillover effect.

Before the spatial econometric analysis, it is necessary to determine whether urban carbon intensity is spatially correlated. In this paper, the global Moran's I index is used to test for spatial correlation. Table 9 reports the results for each year. For 2006–2019, Moran's I is significant at the 1% level, which indicates spatial correlation in urban carbon intensity.
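The global Moran's I with an inverse distance weight matrix can be computed as in the sketch below. It assumes planar coordinates and Euclidean distance and does not row-standardize the weights; the paper does not specify these details, so they are assumptions, and the function names and coordinates are hypothetical.

```python
from math import dist


def inverse_distance_weights(coords):
    """Spatial weight matrix with w_ij = 1 / distance(i, j) and zeros on the diagonal."""
    n = len(coords)
    return [[0.0 if i == j else 1.0 / dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]


def morans_i(values, weights):
    """Global Moran's I for one cross-section of values."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Hypothetical layout: two pairs of neighboring cities that are far apart.
coords = [(0, 0), (1, 0), (10, 0), (11, 0)]
w = inverse_distance_weights(coords)
clustered = morans_i([1, 1, 5, 5], w)    # similar values sit next to each other
alternating = morans_i([1, 5, 1, 5], w)  # neighbors have dissimilar values
```

A positive index, as in the clustered case, means that cities with similar carbon intensity tend to be located near one another, which is the pattern that motivates a spatial model.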


**Table 9.** Results of spatial correlation test.

Note: \*\*\* represent the significance levels of 1%. The clustering standard error is shown in brackets.

#### 6.2.2. Analysis of Regression Results

Table 10 shows the regression results of the spatial Durbin model with double fixed effects. Columns (1)–(3) report the direct effect, indirect effect, and total effect after coefficient decomposition, respectively. Judging from R<sup>2</sup>, Sigma<sup>2</sup>, and the log-likelihood statistic, the model fits well and the overall regression is reliable. As column (1) shows, the Treated ∗ time coefficient is −0.141 and is significant at the 1% level. This means that the establishment of carbon emissions trading pilot cities can alleviate local urban carbon intensity, which is consistent with the benchmark regression results above. Column (2) shows that the Treated ∗ time coefficient is 0.168 and is significant at the 1% level. This means that the establishment of pilot emissions trading can increase the carbon intensity in surrounding areas. The coefficient of W ∗ Treated ∗ time is 0.399 and is significant at the 1% level. This means that, when pilot emissions trading was set up in a region, the ETS produced a spatial spillover effect, increasing the carbon intensity of the surrounding area. Because of the region's strict carbon trading controls, polluting companies that cannot afford the high prices of carbon credits move to surrounding areas. As the above results show, the establishment of a pilot emissions trading city not only reduces the carbon intensity of that city but also affects surrounding cities. Environmental regulation in the pilot region relies on the power of the government through the strict design of the carbon trading system. Expanding tertiary industries such as services, accelerating the upgrading of the industrial structure, and improving the efficiency of energy utilization can alleviate the carbon intensity of the region, but may cause enterprises to relocate, which can increase the carbon intensity in the surrounding areas. This supports Hypothesis 4.


**Table 10.** Regression results of spatial Durbin model.

Note: \*\*\* represent the significance levels of 1%. The clustering standard error is shown in brackets.

#### **7. Conclusions and Recommendations**

#### *7.1. Conclusions*

This paper regards the carbon emission trading system as a quasi-natural experiment. Using the panel data of 281 cities in China from 2006 to 2019, this paper empirically examined the policy effect and spatial spillover effect of ETS on urban carbon intensity in China by using PSM-DID and spatial Durbin models and analyzed it from multiple perspectives.

First, ETS helps mitigate urban carbon intensity. However, this effect has heterogeneous characteristics. The mitigation effect of the carbon emission trading system on the carbon intensity in the eastern region is not significant. By contrast, the mitigation effect on carbon intensity in the central and western regions is very significant. Compared with the central and western regions, the eastern region has a large population, a developed economy, and various industries. Industrial and residential consumption of electricity and heat energy is huge. Setting up pilot carbon emission trading in the eastern region, while also promoting economic activities in that region, cannot significantly reduce the carbon intensity of cities in a short period of time. In the central and western regions of China, the population is small, and the level of economic development is weak. Setting up carbon emission trading pilots in those regions can effectively reduce the carbon intensity of the regions. The results of the quantile test show that the emission reduction effect of ETS is more obvious for cities with higher carbon intensity, and the marginal effect of emission reduction is larger for cities with higher carbon intensity, so there is more room for emission reduction.

Second, the parallel trend test shows that the longer the carbon emission trading system has been in place, the more obvious its mitigation effect on urban carbon intensity. The longer the system has been in place, the more time the pilot enterprises have to carry out technological innovation, and the more obvious the effect of mitigating urban carbon intensity will be.

Third, this paper further analyzes the influence mechanism of ETS from the two aspects of green technological innovation and environmental governance. The results show that the carbon emission trading system can encourage enterprises to carry out technological innovation to reduce emissions, thus alleviating urban carbon intensity. By improving the level of environmental governance and reducing the emission of all kinds of pollutants, this also reduces the corresponding carbon emissions, which then alleviates the carbon intensity. This is consistent with most of the literature results.

Fourth, spatial spillovers show that the ETS, although able to mitigate the carbon intensity of the pilot cities, causes the carbon emissions of the surrounding non-pilot cities to rise. This is because the penalty mechanism of ETS leads to high environmental costs that cause enterprises to transfer to surrounding non-pilot areas. As a result, the carbon intensity of surrounding areas increases.

#### *7.2. Recommendations*

First, the development of ETS should always adhere to the combination of "market determination" and "government regulation". On the one hand, policy makers should continue to insist on the decisive role of the market in the allocation of carbon emission rights, and should use supply and demand mechanisms, competition mechanisms, price mechanisms, and other means to promote the effective operation of carbon trading markets. This requires constantly adjusting the incentives of enterprises through surplus carbon emission rights and adjusting the penalties imposed on enterprises with insufficient carbon emission rights through market means. Thus, the cost of carbon emissions is internalized into the cost-benefit analysis of the enterprise and becomes an important variable for maximizing corporate profits, thus promoting carbon emission reduction. On the other hand, policy makers should give full play to the regulating and supporting role of the government. The government should formulate laws and regulations suitable for the healthy and effective operation of the market in order to make up for market failures such as monopoly, information asymmetry, and externalities caused by market limitations, thereby constantly improving the market environment.

Second, the carbon reduction effect of ETS is regionally heterogeneous. There are significant differences between different regions due to their local level of economic development, industrial structure, energy structure, and other factors. Therefore, each transaction pilot cannot adopt a "one size fits all" attitude when formulating policies. The construction of carbon trading markets should be carried out according to local conditions. In this way, achieving carbon reduction targets also can promote high-quality economic development.

Third, scientific and technological research and innovation of enterprises is the key to the carbon reduction effect of the ETS. The government, enterprises, and society should pay special attention to the important role of scientific and technological R&D and innovation in carbon trading policies. It is recommended to continuously increase the R&D investment of all enterprises and encourage them to carry out technological innovation in order to constantly update the production process. This will promote the green development of enterprises. The government should also effectively improve its own environmental governance level in order to improve its ability to prevent and control urban pollution. Through the development of a series of laws and regulations to assist the operation of the ETS system, strict penalties should be imposed on enterprises that violate the system.

Fourth, the spatial spillover effect among cities increases the carbon intensity of surrounding cities. On the one hand, local governments in non-pilot areas are encouraged to actively learn from the experiences of pilot areas in order to reduce the carbon intensity of their own regions. On the other hand, carbon leakage through spatial spillovers undermines the goal of reducing emissions, so new mechanisms should be considered to prevent companies from avoiding emission rules. A hybrid mechanism that combines the carbon ETS with other environmental regulation tools is recommended. For example, a carbon tax could be imposed on emissions in non-pilot areas to discourage firms from leaving pilot areas.

The above is the main content of this paper, but the research of this paper still has limitations. This study uses an econometric approach based on historical data from a pilot carbon trading program in China. We also believe that carbon tax is one of the effective ways to reduce the carbon intensity of cities. If carbon tax projects are implemented in these cities, a more reasonable conclusion can be obtained by comparing the effects of carbon trading and carbon tax. Since China has no plan to implement carbon tax at present, such data cannot be obtained to reconstruct the regression model of carbon tax cases. This prevents more reasonable conclusions from being drawn. If China has some concrete practice in carbon tax, the author will study it.

**Author Contributions:** Writing, J.L.; Methodology, J.L.; Providing case and idea: W.Y.; Revising and editing: J.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data in this paper were collected through publicly available data. Data about cities are based on the related data from the National Bureau of Statistics (https://data.stats.gov.cn/, accessed on 15 August 2021). Patent data is from the State Intellectual Property Office (https://www.cnipa.gov.cn/, accessed on 20 August 2021). The relevant data can be downloaded from the relevant website. Data can be obtained from the corresponding author on request.

**Acknowledgments:** The author thanks the anonymous reviewer for critical corrections and Cyndi Berck for professional language editing.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

