#### *3.3. CO2 Emission Accounting*

Because carbon emissions from fossil fuel consumption account for more than 90% of total carbon emissions, this study calculates only the carbon emissions from urban fossil energy consumption. In addition, because China has no energy balance tables at the city level, we follow Shan et al. [39,40] and scale the provincial energy balance tables down to the city level using GDP, demographic data, and industrial output. The calculation of CO2 emissions follows the Intergovernmental Panel on Climate Change (IPCC) approach. The calculation formula for carbon emissions is as follows:

$$\text{CE}_{\text{mn}}^{\text{c}} = \sum_{\text{m}=1}^{17} \sum_{\text{n}=1}^{7} \text{AD}_{\text{mn}}^{\text{c}} \times \text{EF}_{\text{m}}, \tag{9}$$

where CE<sup>c</sup><sub>mn</sub> represents CO2 emissions from 17 fuel types, m is the energy type, n is the main industry sector, AD<sup>c</sup><sub>mn</sub> is the consumption of fuel m in sector n, and EF<sub>m</sub> is the emission factor of fuel m. The default IPCC emission factors are used in this study.
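The accounting in Eq. (9) can be illustrated with a minimal Python sketch. It rests on assumptions that are not part of this study: the function names are hypothetical, the downscaling is reduced to a single per-sector share of a provincial indicator (the actual allocation follows Shan et al. [39,40]), and the toy arrays use 2 fuels and 3 sectors instead of 17 and 7, with made-up numbers rather than IPCC emission factors.

```python
import numpy as np


def downscale_activity(prov_ad, city_share):
    """Allocate provincial fuel consumption to one city.

    prov_ad:    (n_fuels, n_sectors) provincial consumption.
    city_share: (n_sectors,) assumed share of the city in the provincial
                scaling indicator for each sector (illustrative only).
    """
    return prov_ad * city_share[np.newaxis, :]


def city_emissions(ad, ef):
    """Eq. (9): sum over fuels m and sectors n of AD_mn * EF_m.

    ad: (n_fuels, n_sectors) city-level consumption.
    ef: (n_fuels,) emission factor per fuel.
    """
    return float(np.sum(ad * ef[:, np.newaxis]))


# Toy inputs: 2 fuels x 3 sectors, hypothetical values.
prov_ad = np.array([[100.0, 50.0, 10.0],
                    [40.0, 20.0, 0.0]])
city_share = np.array([0.10, 0.20, 0.50])
ef = np.array([2.0, 3.0])

city_ad = downscale_activity(prov_ad, city_share)
total_emissions = city_emissions(city_ad, ef)
```

The emission factor depends only on the fuel, so it is broadcast across sectors before the double sum is taken.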

#### *3.4. STIRPAT Model*

STIRPAT is widely used to examine the factors affecting CO2 emissions. It is based on Ehrlich and Holdren's (1971) IPAT (impact of population, affluence, and technology) model, which has long been applied to study the effects of human activity on environmental change. The STIRPAT formula is as follows:

$$\mathbf{I} = \mathbf{a} \mathbf{P}^{\mathbf{b}} \mathbf{A}^{\mathbf{c}} \mathbf{T}^{\mathbf{d}} \boldsymbol{\varepsilon}, \tag{10}$$

where I is environmental impact, P is population, A is affluence, and T is technology level; a is the constant term of the model; b, c, and d are the parameters to be estimated; and ε denotes the random error. Taking the natural logarithm of both sides, the STIRPAT model can be expressed as follows:

$$\ln \mathbf{I} = \ln \mathbf{a} + \mathbf{b} \ln \mathbf{P} + \mathbf{c} \ln \mathbf{A} + \mathbf{d} \ln \mathbf{T} + \ln \boldsymbol{\varepsilon}, \tag{11}$$

where ln(·) is the natural logarithm, and b, c, and d are equivalent to elasticity coefficients in economics: each can be regarded as the percentage change in environmental impact caused by a 1% change in the corresponding influencing factor, holding the other factors constant.
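The link between the multiplicative form in Eq. (10) and the log-linear form in Eq. (11) can be checked numerically. The sketch below is only an illustration on synthetic data: the parameter values, the sample size, and the noise level are arbitrary assumptions, and ordinary least squares via NumPy stands in for whatever estimator is used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic drivers (hypothetical units).
P = rng.uniform(1.0, 10.0, n)
A = rng.uniform(1.0, 10.0, n)
T = rng.uniform(1.0, 10.0, n)

# Assumed "true" parameters of Eq. (10), chosen only for illustration.
a, b, c, d = 2.0, 1.2, 0.8, -0.5
eps = np.exp(rng.normal(0.0, 0.05, n))  # multiplicative error term
I = a * P**b * A**c * T**d * eps

# Eq. (11): regress ln I on a constant, ln P, ln A and ln T.
X = np.column_stack([np.ones(n), np.log(P), np.log(A), np.log(T)])
coef, *_ = np.linalg.lstsq(X, np.log(I), rcond=None)
ln_a_hat, b_hat, c_hat, d_hat = coef

# Elasticity reading: effect on I (in %) of a 1% rise in P, others fixed.
pct_change_in_I = (1.01**b_hat - 1.0) * 100.0
```

With these settings the estimated exponents land close to the assumed values, and `pct_change_in_I` is close to `b_hat`, which is the sense in which the coefficients act as elasticities.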

The STIRPAT model allows additional explanatory factors to be added. In this study, referring to Cai et al. [16] and Wang et al. [41], we augment STIRPAT by adding industrial structure. The augmented model is given as

$$\ln \mathbf{I} = \ln \mathbf{a} + \beta_1 \ln \mathbf{P}_{\text{it}} + \beta_2 \ln \mathbf{A}_{\text{it}} + \beta_3 \ln \mathbf{T}_{\text{it}} + \beta_4 \ln \mathbf{Q}_{\text{it}} + \ln \varepsilon, \tag{12}$$

where the subscripts i and t denote the city and the year, I represents CO2 emissions, ln a is the intercept term, P is the total population, A is affluence (the economic level), and T is the technological level, which is represented by the reciprocal of energy intensity, that is, the ratio of GDP to energy consumption. Q is the industrial structure, represented by the share of tertiary industry in total output value; β<sub>1</sub>, β<sub>2</sub>, β<sub>3</sub>, and β<sub>4</sub> are the elasticity coefficients of population, economy, technology, and industrial structure, respectively.
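As a rough illustration of how the right-hand side of Eq. (12) could be assembled, the sketch below builds the four log regressors from raw series. The function name, the argument names, and the demo values are hypothetical, and the affluence series is passed in as given because its exact measure is not specified in this section.

```python
import numpy as np


def stirpat_regressors(population, affluence, gdp, energy, tertiary_share):
    """Return columns [ln P, ln A, ln T, ln Q] for Eq. (12).

    T is the reciprocal of energy intensity, i.e. GDP / energy consumption.
    Q is the share of tertiary industry in output value (a positive fraction).
    All inputs are 1-D arrays of equal length with strictly positive values.
    """
    tech = gdp / energy
    return np.column_stack([
        np.log(population),
        np.log(affluence),
        np.log(tech),
        np.log(tertiary_share),
    ])


# Hypothetical values for two city-year observations.
X_demo = stirpat_regressors(
    population=np.array([100.0, 250.0]),
    affluence=np.array([5.0, 8.0]),
    gdp=np.array([200.0, 900.0]),
    energy=np.array([50.0, 150.0]),
    tertiary_share=np.array([0.40, 0.55]),
)
```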

In this study, a panel data method based on the extended STIRPAT model was used to empirically investigate the relationship between the influencing factors and CO2 emissions. Panel data analysis is a statistical method for analyzing two-dimensional observations collected from multiple entities over multiple time periods. Compared with conventional models that use only time-series or only cross-sectional data, this method provides more degrees of freedom and reduces the effects of multicollinearity [18]. Three models are commonly used in panel data analysis: the pooled (mixed) regression model, the fixed-effects model, and the random-effects model. An F-test and a Hausman test are required to determine which model to use for regression: the F-test compares the pooled model with the fixed-effects model, and the Hausman test compares the fixed-effects model with the random-effects model. We run separate regressions on the four groups of cities to test whether the factors affecting carbon emissions differ across groups. After applying the F-test and the Hausman test to the four groups, a panel fixed-effects model was selected for regression (see Section 4.3 for the empirical results).
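The choice between a pooled model and a fixed-effects model can be illustrated with a small NumPy sketch on synthetic data. This is not the estimation code of this study: the function names are hypothetical, the panel is simulated with an entity effect that is correlated with the regressor, and only the F-test is shown (the Hausman test is omitted for brevity).

```python
import numpy as np


def pooled_ols(y, X):
    """Pooled regression with a common intercept; returns slopes and RSS."""
    Z = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    return beta[1:], float(resid @ resid)


def within_ols(y, X, ids):
    """Fixed-effects (within) estimator: demean y and X by entity."""
    _, inv = np.unique(ids, return_inverse=True)
    counts = np.bincount(inv)
    y_mean = np.bincount(inv, weights=y) / counts
    X_mean = np.column_stack(
        [np.bincount(inv, weights=X[:, k]) / counts for k in range(X.shape[1])]
    )
    y_d = y - y_mean[inv]
    X_d = X - X_mean[inv]
    beta, *_ = np.linalg.lstsq(X_d, y_d, rcond=None)
    resid = y_d - X_d @ beta
    return beta, float(resid @ resid)


def f_stat_fixed_vs_pooled(rss_pooled, rss_fe, n_entities, n_obs, n_regressors):
    """F statistic for H0: all entity effects are equal (pooled model holds)."""
    df1 = n_entities - 1
    df2 = n_obs - n_entities - n_regressors
    return ((rss_pooled - rss_fe) / df1) / (rss_fe / df2)


# Simulated balanced panel: 20 entities, 10 periods, true slope 0.7.
rng = np.random.default_rng(1)
n_entities, n_periods = 20, 10
ids = np.repeat(np.arange(n_entities), n_periods)
alpha = rng.normal(0.0, 2.0, n_entities)            # entity effects
x = alpha[ids] + rng.normal(0.0, 1.0, ids.size)     # correlated with effects
y = alpha[ids] + 0.7 * x + rng.normal(0.0, 0.1, ids.size)
X = x[:, np.newaxis]

beta_pooled, rss_pooled = pooled_ols(y, X)
beta_fe, rss_fe = within_ols(y, X, ids)
F = f_stat_fixed_vs_pooled(rss_pooled, rss_fe, n_entities, ids.size, X.shape[1])
```

In this simulated setting the pooled slope is biased by the omitted entity effects, the within estimator recovers a slope near 0.7, and the large F statistic favors the fixed-effects specification over the pooled one.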

#### *3.5. Research Objects and Data Sources*

Research objects: The research objects of this paper are cities at the prefecture level and above. Because data are missing for some cities, 280 cities were finally selected; together they account for more than 80% of China's economy and population. They also cover all provinces in China and are widely distributed.

Data sources: The data used to construct UDD (i.e., registered population, population density, natural population growth rate, per capita GDP, per capita fiscal expenditure, GDP growth rate, built-up area, population fiscal expenditure, and total retail sales of consumer goods) are from The Statistical Yearbook of Chinese Cities for the two years 2010 and 2019. The industry-level GDP and population data used for scaling are taken from the statistical yearbooks of the cities and of their corresponding provinces for 2010 to 2019. The provincial energy balance tables are from the China Energy Statistical Yearbook for 2010–2019. Missing population density data for 2018 were calculated as the ratio of the registered population to the land area, both obtained from the China Urban Statistical Yearbook. Some other missing data are supplemented with data for adjacent years.
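The two gap-filling steps described above can be sketched as follows. The exact rule for using adjacent years is not specified here, so the nearest-year rule below (with ties resolved toward the earlier year) is an assumption, as are the function names and the toy values.

```python
import numpy as np


def population_density(registered_population, land_area):
    """Density as the ratio of registered population to land area."""
    return np.asarray(registered_population, dtype=float) / np.asarray(
        land_area, dtype=float
    )


def fill_from_adjacent_years(values):
    """Replace NaNs in a yearly series with the nearest available year.

    Assumed rule: take the closest non-missing year; if two years are
    equally close, use the earlier one. A series with no data is returned
    unchanged.
    """
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    known = np.flatnonzero(~np.isnan(values))
    if known.size == 0:
        return filled
    for i in np.flatnonzero(np.isnan(values)):
        nearest = known[np.argmin(np.abs(known - i))]
        filled[i] = values[nearest]
    return filled


# Toy yearly series with two gaps (hypothetical values).
series = np.array([1.0, np.nan, 3.0, np.nan])
series_filled = fill_from_adjacent_years(series)
density_demo = population_density([500.0], [250.0])
```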
