1. Introduction
With rapid societal development, economic growth, and urbanization, China’s cumulative building area expanded dramatically from 4.818 billion square meters in 2007 to 15.75 billion square meters by 2021. This surge in large-scale construction has resulted in substantial energy use and a sharp increase in CE [1]. CE constitutes 28% to 34% of China’s total emissions [2]. High energy consumption and CE significantly impact China’s carbon reduction process [3]. In response, the Chinese government issued the Action Plan for Carbon Peaking Before 2030 in 2021, emphasizing progress toward a green energy transition and a decrease in CE. The plan seeks to expedite the shift toward greener production and lifestyles, ensuring the timely realization of the carbon peak target. Several countries and regions worldwide have made commitments to reduce carbon emissions as part of the international pathway toward “carbon neutrality” [4]. In its official work report, the Chinese government emphasized achieving a carbon peak by 2030 and carbon neutrality by 2060 [5].
The swift growth of China’s construction industry serves as a crucial foundation for the national economy; however, the industry is also one of the largest contributors to carbon emissions. As a global infrastructure leader, China faces an urgent imperative to address the challenges posed by CE. Amid the ongoing social and economic development of the industry, China has become a leading source of global greenhouse gas emissions. The expansion of construction activities has accelerated urbanization, and the widespread use of energy-intensive materials has made China the largest global source of CE. Achieving energy-saving and carbon-reduction targets, reducing fossil energy consumption and carbon emissions in the building materials industry, and promoting energy efficiency are identified as key priorities in China’s 14th Five-Year Plan.
According to the 2022 China Building Energy Consumption and Carbon Emission Report, the total life cycle CE in China reached 5.08 billion tCO2, representing 50.9% of the nation’s total emissions. The material production phase contributed the most, with emissions totaling 2.82 billion tCO2, or 28.2% of the national total. CE at the material production stage is therefore crucial to whole-life-cycle CE [6]. A study of 78 office buildings in China revealed that concrete, steel, wall materials, and mortar are the primary contributors to CE [7]. Construction is a key area of China’s economic and social development. Economic growth is accompanied by substantial consumption of energy and materials, with primary energy consumption accounting for approximately 40% of the total [8].
Building on this foundation, this study seeks to quantify and track CE, aiming to identify and effectively manage high-emission stages. Investigating the factors influencing CE provides a thorough and structured method for assessing and addressing the challenges related to carbon emissions in this field. Further projections of China’s construction sector will allow for a more precise setting and clarification of emission-reduction targets, increase public awareness of carbon reduction efforts, and provide a robust basis for policy formulation.
The issue of CE has garnered considerable attention from academia. The factors influencing CE are crucial to the CE process and directly affect overall carbon emission levels. In previous studies, most researchers have employed single models to analyze the factors influencing CE, for instance, SDA modeling to analyze CE intensity [9]. Other scholars have used an improved STIRPAT-based model to analyze multiple factors [10]. Most of these studies relied on single models without exploring combined models. The single SDA model depends strongly on variable selection and has limitations in handling nonlinear relationships. Meanwhile, the STIRPAT model suffers from an inadequate treatment of causality, difficulty in addressing collinearity, and shortcomings in analyzing spatiotemporal effects. In contrast, this study employs a hybrid approach that integrates the STIRPAT model with the LMDI method, a combination rarely applied in existing research and not yet observed in studies on CE. The combined STIRPAT-LMDI model leverages the strengths of both models, enabling a more detailed decomposition analysis of the influencing factors. This approach facilitates the visual identification of key factors and quantifies their contributions to CE. It extends beyond traditional single-model approaches and is crucial for accurately identifying the drivers of CE.
CE forecasting plays a crucial role in energy saving and emission reduction and promotes sustainable development strategies [11]. Carbon footprint research is pivotal for advancing the construction of green buildings. Additionally, CE forecasting can indirectly facilitate effective control of CE at all stages of the construction process, which is crucial for achieving sustainable development. Furthermore, predicting CE will provide strong support for government policymaking and guide the future direction of the industry [12]. Finally, CE projections have a positive impact on the industrial structure, increasing the growth rate of output in high-tech industries and the value of industrial output. The resulting increase in returns will contribute to the advancement of digital technologies [13]; indirectly, it also supports the development of low-carbon technologies, which play a critical role in reducing CE. Many researchers have applied various predictive models to estimate CE, such as an improved gray prediction model [11]. Additionally, many scholars have employed classical models, such as the BP model and ARIMA, to predict CE [14]. However, the accuracy of predictive methods still requires improvement. This study compares the performance of BP neural networks, gray prediction models, support vector machines, and the GA-BP model. The best-performing model, GA-BP, is then selected for multi-scenario forecasting of CE. In previous studies, few scholars have performed comparative analyses of predictive model performance, and the use of an optimized combined model, like GA-BP, for multi-scenario forecasting of CE has not been explored. This study enhances the existing body of work and offers strong support for CE reduction through the application of more precise predictive models.
2. Literature Review
2.1. CE Accounting
For carbon accounting, numerous scholars have quantified CE from various perspectives, actively exploring CE accounting frameworks and driving continuous innovation to propel the field forward [15]. CE accounting denotes the total quantity of carbon dioxide generated at different stages, including material production, construction, operation and maintenance, and dismantling and disposal [16]. The IPCC emission factor method is commonly applied for carbon emission calculations. Different carbon accounting methods can greatly assist governments in formulating relevant carbon reduction policies. Accurate carbon emission accounting ensures the effectiveness and precision of carbon reduction measures. Some researchers have employed the energy balance split method to calculate CE [17,18]. Most researchers utilize the IPCC emission factor method to estimate CE [12,19]. Emission factor and mass balance methods have also been employed to determine the carbon emission equivalents of activities such as nitrification, denitrification, anaerobic ammonia oxidation, and microalgal assimilation [20]. A small group of researchers have used process-oriented methodologies to quantify the embodied CE during the design phase [21]. However, accounting for CE over the full life cycle remains limited. The multiple stages of CE involve more complex influencing factors; therefore, more comprehensive and detailed research is needed to accurately identify the high-emission stages and to support policymaking and the improvement of industry standards. Based on this, the following hypothesis is proposed:
H1. In terms of the time dimension, carbon emissions throughout the entire life cycle of the construction industry exhibit an upward trend, with the material production stage contributing the highest percentage of emissions. Spatially, the eastern region exhibits the highest carbon emissions from the construction industry.
2.2. Comprehensive Analysis of the Determining Factors
Both domestic and international scholars have proposed various models to study these factors, with the LMDI decomposition method, modified STIRPAT model, and Kaya identity being the most widely adopted frameworks. Some researchers have applied the SDM to examine the influencing factors, and the results indicated that population size has a significant effect on carbon emissions [22]. Several scholars have performed correlation analyses of explanatory variables using the enhanced STIRPAT model [23,24]. The STIRPAT model is widely used in studies on CE drivers because of its exceptional flexibility and scalability, which allows it to effectively capture the underlying mechanisms of CE. Additionally, in studies utilizing the LMDI model, certain researchers have decomposed the five determinants of carbon emissions, with results indicating that economic growth is the primary driver of increased CE, followed by population size and the energy mix [25]. A few researchers have integrated the Tapio and STIRPAT models to assess the influence of explanatory variables on carbon emissions [26]. However, the traditional Kaya identity fails to achieve the expected outcomes when applied to the LMDI decomposition, as some newly introduced variables lack economic significance, leading to certain limitations in the model. To tackle this problem, this study employs an improved STIRPAT model for variable decomposition, avoiding the introduction of additional variables and thereby achieving the desired objectives.
2.3. Carbon Emission Forecast
To evaluate whether China can fulfill its carbon reduction commitments and the feasibility of reaching peak carbon emissions within a short timeframe, it is essential to conduct a scientific forecast of carbon emission trends. The crux of this research lies in assessing the practicality of achieving the peak emission target [27]. Some researchers have predicted the effects of demographic factors on carbon emissions [28]. Many scholars have used the BP model to predict future CE trends [29,30]. While this model offers broad applicability and ease of operation, it may suffer from limitations such as lower prediction accuracy or overfitting issues. Some researchers have also utilized SVR modeling to predict CE from residential buildings in their studies [31]. While the model successfully addresses accuracy concerns after training, SVR, as a general-purpose machine learning regression algorithm, is designed primarily to solve regression problems rather than environmental issues. It is sensitive to parameters and outliers, which can result in the loss of crucial information. Additionally, with the continuous advancement of research, machine learning methods have become increasingly prevalent. For instance, the gray prediction model is widely applied; however, it is primarily suited for small sample sizes and often suffers from lower accuracy [32]. In this context, an increasing number of scholars are focusing on the development of hybrid models to improve on traditional models. An improved wavelet transform multivariate gray model was proposed for carbon emission prediction, addressing the limitations of the traditional GM (1, N) model [33]. Other researchers have proposed the SSA-FAGM-SVR integrated model for carbon emission prediction, aimed at forecasting future carbon emissions for the G20 countries [34]. Compared to other forecasting methods, this model significantly enhances prediction accuracy, demonstrating strong adaptability and a high level of precision. These hybrid models offer superior generalization capabilities, enabling faster model convergence, reduced computational time, and quicker achievement of training and prediction outcomes.
In summary, while there has been considerable research on CE accounting, factors influencing emissions, and CE forecasting, most scholars have approached these topics from a single perspective. Few studies have examined CE from a life cycle perspective. With respect to influencing factors, previous research primarily employed single models. In contrast, this paper uses the combined STIRPAT-LMDI model, and no research applying this combined model to the construction industry has been found. With respect to CE forecasting, the optimized GA-BP model used in this study provides higher accuracy than the models widely explored by other researchers. Moreover, it has not been widely applied in this field, so its use broadens the scope of research on CE. Therefore, this study provides an in-depth analysis of CE accounting, influencing factors, and forecasting, offering a theoretical foundation to support China’s goals of carbon peaking by 2030 and carbon neutrality by 2060. Based on this, the following hypothesis is proposed:
H2. From the perspective of the forecasting model, GA-BP is identified as the optimal model. From the perspective of the results of CE forecasting, it is expected that the carbon peak target will be reached by 2030 under the low-carbon scenario.
3. Carbon Emission Accounting Methods for China’s Construction Industry
3.1. Definition of Objectives and Scope
This paper follows [1], which divides CE into several stages: material production, transportation, construction, operation, and dismantling. The material production phase encompasses the extraction and processing of raw materials, with aluminum, glass, steel, wood, and cement as the primary accounting components. The transportation phase primarily involves CE from the transportation of construction materials. Based on reports from Chinese government departments («China Building Energy Consumption Report 2020»), the construction phase combines the building and demolition stages, covering the energy consumption of construction machinery and electricity used for site lighting. The operation phase encompasses indoor energy consumption activities, including domestic hot water supply, lighting, elevator operation, HVAC systems, and cooking. From a macro perspective, energy consumption in this phase primarily focuses on daily energy use in the tertiary sector and non-transportation residential activities [35].
Table 1 presents the CE factors used in this paper. The coal conversion rate represents the weight of the standard coal produced per unit of energy consumed for each type of energy. The CE factor for energy indicates the amount of CE produced per unit of standard coal. The CE factor for construction materials reflects the amount of CE emitted per unit of production of various construction materials.
3.2. Life Cycle CE Calculation
CE is calculated using the life cycle assessment method, the energy balance table decomposition method, and the emission factor method. The total CE is derived by adding the emissions from each phase. Equation (1) is the formula for calculating total CE:
In this formula, LCCO2 denotes the total CE, and CP, CT, CB, and CO denote the CE during the material production, transportation, construction, and operation stages, respectively.
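Given that the total is described as the sum of the phase emissions, Equation (1) presumably takes the following additive form. This is a reconstruction from the symbol definitions in the text, with the subscript notation chosen here for readability:

```latex
LC_{CO_2} = C_P + C_T + C_B + C_O \tag{1}
```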
The equation for calculating CE during the material production phase is as follows:
In this formula, Mi denotes the annual consumption of the ith construction material, i is the type of material in the production process, and CPi represents the CE factor for the ith material.
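With these definitions, the material production emissions are most naturally read as consumption multiplied by the emission factor, summed over materials. The expression below is a sketch under that assumption; the per-material subscripts are inferred from the wording "the ith material":

```latex
C_P = \sum_{i} M_i \times CP_i
```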
The equation for calculating CE during the transportation phase is as follows:
In this formula, Di represents the average transportation distance (km) for material i, and CTi denotes the CE of material i for a specific mode of transportation (kgCO2/t∙km). Based on the default values specified in the Standard for Carbon Emission Calculation of Buildings (GB/T 51366-2019), Di = 500 km and CTi = 0.129 kgCO2/t∙km [36].
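As an illustration of how the default values enter the calculation, the sketch below assumes the transportation emissions are the sum over materials of mass × distance × factor. The function name and the material tonnages are hypothetical and serve only to show the arithmetic and the units (tonnes, km, kgCO2/t∙km).

```python
DEFAULT_DISTANCE_KM = 500.0   # Di default cited from GB/T 51366-2019
DEFAULT_FACTOR = 0.129        # CTi default, kgCO2 per tonne-kilometre


def transport_emissions_kg(material_tonnes,
                           distance_km=DEFAULT_DISTANCE_KM,
                           factor=DEFAULT_FACTOR):
    """Sum Mi * Di * CTi over all materials; the result is in kgCO2.

    Assumes one common distance and factor for every material, as the
    default values quoted in the text suggest.
    """
    return sum(mass * distance_km * factor for mass in material_tonnes.values())


# Hypothetical annual consumption in tonnes (illustrative values only).
example_materials = {"cement": 1000.0, "steel": 200.0}
total_kg = transport_emissions_kg(example_materials)
print(f"Transport-phase emissions: {total_kg:.1f} kgCO2")
```

With 1,200 t of material moved 500 km, the sketch yields 77,400 kgCO2, that is, 64.5 kgCO2 per tonne of material.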
The equation for calculating CE during the construction phase is as follows:
In this formula, Ej,i and Ed,i represent the aggregate consumption of the ith energy source during the construction and demolition phases (in kWh or kg), respectively, and CBi denotes the CE factor of the ith energy source.
The equation for calculating CE during the operation phase is as follows:
In this formula, Ei represents the energy usage associated with the ith building type, which includes household living activities and energy consumption from the tertiary sector, and CM denotes CE from transportation.
3.3. Dagum Gini Coefficient and Its Decomposition (DGC)
The Dagum Gini coefficient, based on the research of economist Camilo Dagum, is a key indicator for assessing the degree of inequality in the distribution of income and wealth [37]. It refines the conventional Gini index by allowing overall inequality to be decomposed by subgroup, which enables a more accurate measurement of the sources of inequality. According to the decomposition principle of the DGC, the overall inequality G can be broken down into within-region inequality Gw, between-region inequality Gb, and transvariation intensity (super-density inequality) Gt, such that G = Gw + Gb + Gt. The specific calculation formula is as follows:
where G denotes the overall Gini coefficient, k denotes the total count of districts, yij and yhr represent the coordinated development levels of any province or city in the ith (hth) region, μ denotes the coordinated development level of DG and CE, n represents the total count of provinces and cities, and ni and nh refer to the count of provinces and cities in the ith (hth) district, respectively.
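The decomposition can be sketched as follows. The code uses the standard Dagum subgroup decomposition (within-region Gini weighted by population and value shares, and between-region terms split by relative affluence), which is assumed here to match the formulas used in the paper. The region names and values are hypothetical, and all values are assumed to be positive.

```python
from itertools import combinations


def mean(values):
    return sum(values) / len(values)


def mean_abs_diff(a, b):
    """Average |x - y| over all pairs with x taken from a and y from b."""
    return sum(abs(x - y) for x in a for y in b) / (len(a) * len(b))


def dagum_decomposition(groups):
    """Return (G, Gw, Gb, Gt) for a dict {region: [positive values]}."""
    everything = [v for vals in groups.values() for v in vals]
    n, mu = len(everything), mean(everything)
    g_total = mean_abs_diff(everything, everything) / (2 * mu)

    # p_j: share of observations; s_j: share of the total value
    share_n = {k: len(v) / n for k, v in groups.items()}
    share_y = {k: len(v) * mean(v) / (n * mu) for k, v in groups.items()}

    # Within-region contribution: sum of G_jj * p_j * s_j
    g_within = sum(
        mean_abs_diff(v, v) / (2 * mean(v)) * share_n[k] * share_y[k]
        for k, v in groups.items()
    )

    g_between = g_trans = 0.0
    for j, h in combinations(groups, 2):
        a, b = groups[j], groups[h]
        delta = mean_abs_diff(a, b)
        g_jh = delta / (mean(a) + mean(b))          # between-region Gini
        weight = share_n[j] * share_y[h] + share_n[h] * share_y[j]
        # Relative affluence D_jh = |mu_j - mu_h| / mean absolute difference
        d_jh = abs(mean(a) - mean(b)) / delta if delta > 0 else 0.0
        g_between += g_jh * weight * d_jh            # net between-region part
        g_trans += g_jh * weight * (1 - d_jh)        # transvariation part
    return g_total, g_within, g_between, g_trans


# Hypothetical provincial CE values grouped by region (illustrative only).
regions = {"East": [5.0, 7.0, 9.0], "Central": [3.0, 6.0], "West": [1.0, 2.0, 4.0]}
g, gw, gb, gt = dagum_decomposition(regions)
print(f"G={g:.4f}  Gw={gw:.4f}  Gb={gb:.4f}  Gt={gt:.4f}")
```

The three components add up to the overall coefficient by construction, so comparing Gw, Gb, and Gt shows directly which source dominates the spatial disparity.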
3.4. Indicators for Analyzing the Impact of CE
To study CE in different provinces, it is necessary to establish corresponding indicators, and this paper establishes horizontal analysis indicators to evaluate the CE of different provinces [
38].
(1) The CE per unit of building area is used as a key metric for comparing the carbon emission impacts across different provinces.
where E0 represents the CE per unit area, CE0 denotes the total CE of each province, and A0 is the building land area of each province.
(2) The annual per capita CE is used as a second indicator for evaluating the carbon impact.
where E1 represents per capita annual CE and P represents the population size.
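From the symbol definitions, the two indicators appear to be simple ratios of provincial CE to building area and to population. The expressions below are reconstructed under that assumption:

```latex
E_0 = \frac{CE_0}{A_0}, \qquad E_1 = \frac{CE_0}{P}
```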
4. STIRPAT-LMDI Model
The STIRPAT model is a flexible stochastic Environmental Impact Assessment (EIA) model based on the IPAT equation. This paper applies the STIRPAT model to explore the connection between CE and explanatory variables [39].
The standard form of the STIRPAT model is as follows:
where a denotes the coefficient, b, c, and d denote the exponents of the respective variables, and e is the error term. Equation (10) is obtained by taking the logarithm of Equation (9):
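For reference, the commonly used form of the model and its logarithm are given below. The symbols I (environmental impact), P (population), A (affluence), and T (technology) follow the usual IPAT convention and are assumed here, since the text defines only a, b, c, d, and e:

```latex
I = a \, P^{b} A^{c} T^{d} e \tag{9}
\ln I = \ln a + b \ln P + c \ln A + d \ln T + \ln e \tag{10}
```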
As a method for factor decomposition analysis, the LMDI model has been widely applied in the areas of economics, environment, and energy. It was originally proposed by Ang [40] and is based on the Divisia index method. The model numerically evaluates the effects of various determinants on variations in the target variable. Integrating the STIRPAT model with the LMDI model first involves analyzing the influencing factors with the STIRPAT model to generate regression coefficients and related parameters for each factor. The factors are subsequently decomposed based on the STIRPAT results to quantify the contribution of each factor to CE. The integrated model therefore not only identifies the degree of influence of each factor but also quantifies its independent contribution to CE. This approach provides a clearer analytical framework for examining the factors that influence CE. Based on the STIRPAT model, CEt and CE0 represent the CE at time node t and time node 0, respectively. The LMDI additive decomposition equation is as follows:
The decomposition of various influencing factors is carried out, and the basic calculation equations are shown in Equations (12)–(15):
where ΔCP, ΔCA, and ΔCT represent the carbon emission changes attributed to each influencing factor.
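One way to combine the two models can be sketched as follows. It is an assumption of this sketch, not a statement of the paper's exact equations, that each factor's additive effect equals the logarithmic mean of CEt and CE0 multiplied by the factor's STIRPAT exponent and by the log change of the factor. The coefficient values and factor levels are hypothetical.

```python
import math


def log_mean(a, b):
    """Logarithmic mean L(a, b) = (a - b) / (ln a - ln b), with L(a, a) = a."""
    return a if math.isclose(a, b) else (a - b) / (math.log(a) - math.log(b))


def stirpat(a, factors, exponents):
    """Deterministic STIRPAT value a * prod(X ** exponent), error term omitted."""
    return a * math.prod(factors[k] ** exponents[k] for k in exponents)


def lmdi_contributions(ce_0, ce_t, factors_0, factors_t, exponents):
    """Additive LMDI effect of each factor, weighted by its STIRPAT exponent."""
    weight = log_mean(ce_t, ce_0)
    return {
        name: weight * exponents[name] * math.log(factors_t[name] / factors_0[name])
        for name in exponents
    }


# Hypothetical exponents and factor levels (illustrative values only).
exponents = {"P": 1.2, "A": 0.8, "T": -0.3}
factors_0 = {"P": 100.0, "A": 50.0, "T": 10.0}
factors_t = {"P": 110.0, "A": 65.0, "T": 9.0}
ce_0 = stirpat(1.0, factors_0, exponents)
ce_t = stirpat(1.0, factors_t, exponents)

effects = lmdi_contributions(ce_0, ce_t, factors_0, factors_t, exponents)
for name, value in effects.items():
    print(f"delta C_{name} = {value:.2f}")
print(f"sum of effects = {sum(effects.values()):.2f}, total change = {ce_t - ce_0:.2f}")
```

When CE follows the STIRPAT form exactly, the effects sum to CEt − CE0 with no residual, which is the property that makes the additive LMDI form convenient for attributing the change to individual factors.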
4.1. BP Neural Network Algorithm
The BP neural network, introduced by Rumelhart and colleagues in 1986, is a multilayer feedforward network trained with the error backpropagation algorithm. It has strong approximation and generalization capabilities. The algorithm is based on the gradient descent method, where the weights are adjusted iteratively to minimize the output error until the desired accuracy is achieved [41]. The BP neural network consists of an input layer, a hidden layer, and an output layer [42]. The computation process involves both forward and backward propagation. In the forward pass, the signal flows from the input layer through the hidden layer to the output layer. If the output does not match the desired result, the error is propagated backward and the weights are updated. This model offers several advantages, including improved generalization and the ability to perform nonlinear mapping [43]. The basic structure is shown in Figure 1.
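The forward and backward passes can be illustrated with a minimal pure-Python network. The architecture (one tanh hidden layer, linear output), the learning rate, and the toy data are assumptions chosen for brevity and are not the configuration used in this study.

```python
import math
import random


def init_params(n_in, n_hidden, rng):
    """Small random weights for a network with one hidden layer and one output."""
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return w1, b1, w2, 0.0


def forward(params, x):
    """Forward pass: input layer -> tanh hidden layer -> linear output."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2


def mse(params, xs, ys):
    return sum((forward(params, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(params, xs, ys, lr=0.05, epochs=2000):
    """Stochastic gradient descent with error backpropagation."""
    w1 = [row[:] for row in params[0]]
    b1, w2, b2 = params[1][:], params[2][:], params[3]
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            hidden, out = forward((w1, b1, w2, b2), x)
            err = out - y  # gradient of 0.5 * (out - y)^2 w.r.t. the output
            # Hidden-layer deltas are computed before w2 is updated.
            deltas = [err * w2[j] * (1 - hidden[j] ** 2) for j in range(len(hidden))]
            for j, h in enumerate(hidden):
                w2[j] -= lr * err * h
            b2 -= lr * err
            for j, d in enumerate(deltas):
                for i, xi in enumerate(x):
                    w1[j][i] -= lr * d * xi
                b1[j] -= lr * d
    return w1, b1, w2, b2


# Toy nonlinear target y = x^2 on [0, 1] (illustrative data only).
xs = [[0.0], [0.25], [0.5], [0.75], [1.0]]
ys = [x[0] ** 2 for x in xs]
initial = init_params(1, 4, random.Random(0))
loss_before = mse(initial, xs, ys)
trained = train(initial, xs, ys)
loss_after = mse(trained, xs, ys)
print(f"MSE before: {loss_before:.4f}, after: {loss_after:.4f}")
```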
4.2. Selection of Variables
The neural network model requires the selection of appropriate input variables to predict the output. For the construction industry system, the selection of predictive variables is based on three main considerations: ① the variables must be representative to comprehensively cover the key sectors within the construction industry system; ② the variables should be readily available and credible to improve the reliability of the prediction results, while also providing a valuable reference for similar studies; ③ the number of variables should be kept limited, as the data sample size for construction industry carbon emissions is relatively small, and too many variables may lead to underfitting or overfitting in the model. Following the relevant literature [15,18], seven indicators were selected for analysis: year-end resident population (Y), urbanization level (R), value generated by the secondary sector (V), technical apparatus ratio of construction firms (T), labor productivity in the construction sector (P), value added of the service sector (S), and total output value of the construction industry (L).
4.3. Gray System Theory Model (GM)
The gray model was introduced by Professor Deng Julong in the 1980s [44] to address issues of uncertainty and data scarcity. Its core function is to handle limited data and insufficient information by creating precise predictive models that allow quantitative analysis and control of future trends. The most representative model is GM (1,1), which accumulates the original data and describes them using a first-order differential equation. By solving for the parameters, it forecasts future trends, and the final predictions are obtained through inverse accumulation. The GM model can be classified into two types: univariate and multivariate [45]. The univariate model is simpler to implement and is, therefore, more widely used. However, it has limitations, such as its inability to capture the influence of external factors on the system. In contrast, the multivariate model addresses these shortcomings by considering the effects of multiple factors, thereby providing a more comprehensive analysis of the system [46].
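The GM (1,1) steps described above (accumulation, least-squares estimation of the first-order equation, and inverse accumulation) can be sketched as follows. The function names and the synthetic series are hypothetical, and the background value is taken as the mean of adjacent accumulated values, which is the usual textbook choice.

```python
import math


def gm11_fit(x0):
    """Estimate the development coefficient a and gray input b of GM(1,1)."""
    x1, total = [], 0.0
    for value in x0:                      # accumulated generating operation
        total += value
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]  # background values
    y = x0[1:]
    u = [-zk for zk in z]                 # model: y = a * u + b
    m = len(y)
    su, sy = sum(u), sum(y)
    suu = sum(v * v for v in u)
    suy = sum(ui * yi for ui, yi in zip(u, y))
    det = m * suu - su * su               # closed-form two-parameter least squares
    a = (m * suy - su * sy) / det
    b = (suu * sy - su * suy) / det
    return a, b


def gm11_predict(x0, steps):
    """Forecast the next `steps` values by inverse accumulation."""
    a, b = gm11_fit(x0)

    def x1_hat(k):                        # k = 0 is the first observation
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    n = len(x0)
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Synthetic series growing 10% per period (illustrative only).
series = [100 * 1.1 ** k for k in range(8)]
forecast = gm11_predict(series, 2)
print([round(v, 2) for v in forecast])
```

On a smoothly growing short series such as this one, the forecast stays close to the true continuation, which reflects the small-sample setting for which the model is intended.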
4.4. Support Vector Machine (SVM)
The SVM model was introduced by Cortes and Vapnik in 1995 [47]. SVM, an efficient supervised classifier, offers superior generalization capabilities compared to simpler classifiers [48]. As an emerging method, SVM or SVR offers powerful advantages in machine learning algorithms for classifying multivariate and complex carbon emission samples by constructing hyperplanes. This process aims to maximize the margin of the support vectors from the decision boundary, thereby enhancing the model’s robustness and generalization capability. The realization formula is given by (16). For nonlinear problems, SVM can map the original features to a high-dimensional space for processing, providing reliable support for the accuracy of CE predictions.
where wT denotes the transpose of the normal vector, b is the bias, and x is the input variable.
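Based on the three symbols defined here, Equation (16) is presumably the linear decision function of the separating hyperplane. The form below is the standard one and is given as an assumption:

```latex
f(x) = w^{T} x + b \tag{16}
```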
4.5. GA-BP Model
The Genetic Algorithm (GA) was first introduced by the renowned scientist Professor Holland in 1962. The algorithm draws on the natural law of “survival of the fittest” from evolutionary theory and the principles of gene recombination, hybridization, and mutation from genetics. By iteratively exploring the solution space, the GA seeks an optimal solution or a solution that is close to optimal [49]. The GA-BP model organically combines the GA model with the BP model. The traditional GA model exhibits strong global optimization capabilities but suffers from slow processing. The GA-BP model addresses this issue by integrating the BP algorithm with GA optimization, thereby enhancing the model’s efficiency and prediction accuracy. This combination significantly improves both the operational speed and the overall performance of the model [50]. The traditional BP model is prone to falling into local optima, and its training process can be unstable [51]. In contrast, the GA-BP model effectively mitigates these issues, leading to a more stable and reliable performance. By employing a genetic algorithm, this model effectively reduces the risk of becoming trapped in local optima, while the BP algorithm further refines the output precision, significantly enhancing the accuracy of predictions and classifications. The specific flowchart is shown in Figure 2.
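The division of labor between the two algorithms can be illustrated with a compact sketch: the GA searches globally for good initial weights, and backpropagation then refines them locally. The population size, mutation settings, network size, and toy data are all assumptions chosen for illustration and do not reproduce the configuration of this study.

```python
import math
import random

N_IN, N_HID = 1, 3
N_W = N_HID * (N_IN + 2) + 1   # hidden weights and biases, output weights, output bias
OFF = N_HID * (N_IN + 1)       # index where the output-layer weights start


def predict(theta, x):
    """Forward pass of a tanh network stored as one flat weight list."""
    hidden = []
    for j in range(N_HID):
        base = j * (N_IN + 1)
        z = theta[base + N_IN] + sum(theta[base + i] * x[i] for i in range(N_IN))
        hidden.append(math.tanh(z))
    out = theta[OFF + N_HID] + sum(theta[OFF + j] * hidden[j] for j in range(N_HID))
    return hidden, out


def mse(theta, xs, ys):
    return sum((predict(theta, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def ga_search(xs, ys, rng, pop_size=30, generations=40, mut_rate=0.1):
    """GA with truncation selection, one-point crossover, and Gaussian mutation."""
    pop = [[rng.uniform(-1, 1) for _ in range(N_W)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: mse(t, xs, ys))
        elite = pop[: pop_size // 2]            # the fitter half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            cut = rng.randrange(1, N_W)
            child = p1[:cut] + p2[cut:]
            child = [w + rng.gauss(0, 0.3) if rng.random() < mut_rate else w
                     for w in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda t: mse(t, xs, ys))


def bp_refine(theta, xs, ys, lr=0.05, epochs=1000):
    """Batch gradient descent from the GA solution; returns the best weights seen."""
    theta = list(theta)
    best, best_loss = list(theta), mse(theta, xs, ys)
    for _ in range(epochs):
        grad = [0.0] * N_W
        for x, y in zip(xs, ys):
            hidden, out = predict(theta, x)
            err = out - y
            grad[OFF + N_HID] += err
            for j in range(N_HID):
                grad[OFF + j] += err * hidden[j]
                delta = err * theta[OFF + j] * (1 - hidden[j] ** 2)
                base = j * (N_IN + 1)
                for i in range(N_IN):
                    grad[base + i] += delta * x[i]
                grad[base + N_IN] += delta
        theta = [w - lr * g / len(xs) for w, g in zip(theta, grad)]
        loss = mse(theta, xs, ys)
        if loss < best_loss:
            best, best_loss = list(theta), loss
    return best


# Toy nonlinear target (illustrative data only).
xs = [[0.0], [0.25], [0.5], [0.75], [1.0]]
ys = [x[0] ** 2 for x in xs]
ga_best = ga_search(xs, ys, random.Random(42))
refined = bp_refine(ga_best, xs, ys)
print(f"MSE after GA: {mse(ga_best, xs, ys):.5f}, after BP refinement: {mse(refined, xs, ys):.5f}")
```

Because the refinement keeps the best weights encountered, the final error is never worse than that of the GA solution it starts from.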
4.6. Data Sources
The data are sourced from the China Statistical Yearbook, China Energy Statistical Yearbook, China Construction Industry Statistical Yearbook, and China Electric Power Statistical Yearbook for the years 2007 to 2021. The CE factors refer to data promulgated by Chinese government departments («Guidelines for the Preparation of Provincial Greenhouse Gas Inventories») and relevant references and literature [52].
5. Results and Discussion
5.1. CE Accounting Results
This study accounts for CE based on 30 provinces in China (excluding Hong Kong, Macao, Taiwan, and Tibet) from 2007 to 2021, categorizing regions based on the per capita GDP in 2021.
Figure 3 and Figure 4 illustrate the construction carbon emission data for the 30 provinces and the comparison of carbon emissions across different periods. It is evident that during the period from 2010 to 2014, provinces such as Jiangsu, Hebei, Shandong, Liaoning, Sichuan, Jilin, and Henan experienced a significant increase in carbon emissions compared to other years. This increase is primarily attributed to the increased CE during the material production phase. The results are generally consistent with those of previous studies [52,53]. From 2010 to 2014, CE from the material production stage accounted for 66.16%, 73.99%, 76.85%, 68.95%, and 73.17% of the total CE, respectively. Additionally, carbon emissions in most provinces remain relatively stable throughout their life cycles. The nation’s overall carbon emissions experienced a marked surge, with CE amounting to 1.868 billion tons and 5.646 billion tons in 2007 and 2021, respectively, representing an average annual growth rate of 9.81%. The growth rates in 2011 and 2012 were 56.16% and 27.54%, respectively. This significant increase is closely correlated with the substantial escalation of CE across various provinces during the timeframe spanning from 2010 to 2014.
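For orientation, the two endpoint values can be converted into a compound annual growth rate. The short calculation below gives roughly 8.2% per year over the 14 intervals, so the reported 9.81% presumably refers to the arithmetic mean of the year-on-year growth rates, which the large increases in 2011 and 2012 would raise. That reading is an assumption, since the averaging method is not stated.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly rate that turns start_value into end_value over `years`."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Endpoint values from the text, in billion tons.
cagr = compound_annual_growth_rate(1.868, 5.646, 2021 - 2007)
print(f"Compound annual growth rate 2007-2021: {cagr:.2%}")
```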
5.2. Analysis of Carbon Emission Impact Indicators
The literature was referenced to develop two cross-sectional assessment indicators [38]. This paper utilizes cross-sectional evaluation indicators to offer a comprehensive assessment of the impact of CE. This methodology enables a comparison of carbon emissions from the construction industry across different regions and residential lifestyles, offering a clear quantitative foundation for assessing carbon emissions throughout the entire life cycle of the construction industry. As illustrated in Figure 5, based on CE accounting, Zhejiang Province has the highest cumulative annual per capita CE, peaking at 10.16 tons in 2017, indicating a significant contribution to CE and robust construction demand in the province. Qinghai Province exhibits the highest CE per unit of building area, followed by Heilongjiang Province, with the maximum annual CE per unit area recorded at 2.27 t CO2/m2 and 3.14 t CO2/m2, respectively. This is closely linked to the carbon footprint of building materials and the high energy consumption associated with buildings. Furthermore, Jilin Province recorded an annual per capita CE of 42.52 tons in 2012, with CE per unit of building area reaching 8.90 t CO2/m2, which was the highest among all provinces. These factors have significantly boosted the need for building materials, which has resulted in a substantial increase in CE at all stages, thus raising the per capita CE level for that year. Therefore, future development should focus on measures that promote low-carbon and high-efficiency building materials, strengthen energy efficiency management, and encourage the application of intelligent control systems.
5.3. Spatial Disparity Analysis
China’s provinces are divided into four regions: Eastern, Central, Western, and Northeastern [54]. As shown in Figure 6, which depicts the spatial distribution of CE, CE has been steadily increasing. The Eastern region exhibits significantly higher emissions than the other regions, while the Western region has the lowest emissions. Thus, Hypothesis 1 is supported. However, within the Western region, Sichuan Province shows a relatively high construction-related CE value. In 2012, Jilin Province had the highest CE, primarily because of the widespread utilization of construction materials. Given the substantial CE, implementing effective measures to reduce its environmental impact is crucial for achieving the goal of sustainable development at the earliest opportunity. This study further applies the Dagum Gini coefficient decomposition model to examine CE across different regions. As shown in Figure 7, the decomposition indicates that inter-regional disparities are the main source of spatial variation in CE.
5.4. Variable Correlation Analysis
The Spearman correlation coefficient method is employed to calculate the correlation coefficients and significance of the seven explanatory variables selected in Section 4.2 with CE. The results are presented in Table 2.
As shown in Table 2, the total CE is significantly correlated with the six variables (Y), (R), (V), (P), (S), and (L) at the 1% level, which is a strong correlation. In contrast, the correlation coefficient with (T) is 0.286, indicating a weak correlation with the total CE. Although (T) does not show a significant correlation, it is retained in the study due to its potential impact on CE in practice, given the limited number of variables selected in this study [55]. After identifying the influencing factors, the STIRPAT model can be expanded, leading to the following equation:
The regression coefficients of the model are estimated using statistical regression methods. Constructing the extended STIRPAT model requires taking the logarithm of Equation (17). The formula is as follows:
To avoid spurious regression, the data are checked for collinearity. Table 3 presents the variance inflation factor (VIF) of each factor variable; a VIF greater than 10 indicates severe multicollinearity. The results show that, except for (T), whose VIF is below 10, all factors have VIF values well above 10, indicating a high degree of collinearity [56]. Ridge regression analysis is therefore required to address the collinearity issue, as shown in
Figure 8.
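The VIF check can be sketched in a few lines. The example below relies on the fact that the VIF of each variable equals the corresponding diagonal element of the inverse of the predictors' correlation matrix. The function names and the two illustrative columns are hypothetical, not the study's data.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length columns."""
    n = len(columns[0])
    z = []
    for col in columns:
        m = sum(col) / n
        norm = sum((v - m) ** 2 for v in col) ** 0.5
        z.append([(v - m) / norm for v in col])
    return [[sum(a * b for a, b in zip(zi, zj)) for zj in z] for zi in z]

def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [v / pv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF_j = 1 / (1 - R_j^2), read off the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Two nearly collinear illustrative predictors (made-up values).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1]
print([round(v, 1) for v in vif([x1, x2])])  # both far above the threshold of 10
```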
5.5. Ridge Regression Analysis
Figure 8a,b shows R² as a function of the ridge parameter K and the ridge trace plot, respectively. Smaller K values correspond to higher R², so the choice of K is critical: it should be the smallest value at which the coefficients stabilize. As shown in Figure 8b, the coefficients of all variables stabilize once K ≥ 0.19.
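The ridge trace idea can be illustrated with a small sketch that computes standardized ridge coefficients (Z'Z + K·I)⁻¹ Z'y for several K values. This assumes the common correlation-form ridge estimator; the function names and data are hypothetical and do not reproduce Table 4.

```python
def standardize(col):
    """Center a column and scale it to unit length."""
    n = len(col)
    m = sum(col) / n
    norm = sum((v - m) ** 2 for v in col) ** 0.5
    return [(v - m) / norm for v in col]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def ridge_coefficients(columns, y, k):
    """Standardized ridge coefficients for ridge parameter k."""
    z = [standardize(c) for c in columns]
    zy = standardize(y)
    gram = [[sum(a * b for a, b in zip(zi, zj)) + (k if i == j else 0.0)
             for j, zj in enumerate(z)] for i, zi in enumerate(z)]
    rhs = [sum(a * b for a, b in zip(zi, zy)) for zi in z]
    return solve(gram, rhs)

def ridge_trace(columns, y, ks):
    """Coefficients for each k; plotting them against k gives the ridge trace."""
    return {k: ridge_coefficients(columns, y, k) for k in ks}

# Made-up, nearly collinear predictors and response.
cols = [[1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.9, 5.1]]
resp = [2.0, 4.1, 6.2, 7.9, 10.1]
for k, beta in ridge_trace(cols, resp, [0.0, 0.05, 0.19, 0.5]).items():
    print(k, [round(b, 3) for b in beta])
```

Reading the printed rows from top to bottom mimics a ridge trace: the K at which the coefficients stop changing noticeably is the candidate value.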
According to the ridge regression analysis results in Table 4, when K = 0.19, the R² value is 0.913 and the F-statistic is 8.005. With a sigF value of 0.010, the regression equation as a whole is significant at the 1% level. Furthermore, all variables are significant at the 1% level, except for LnR, LnP, and LnS, which are significant at the 5% level. The variables can be ranked by impact as follows: (Y) > (T) > (R) > (V) > (P) > (L) > (S).
The regression model is as follows:
Substituting the ridge regression coefficients into the LMDI decomposition formula gives the contribution of each explanatory variable to CE, as shown in
Figure 9.
According to Figure 9, over the entire life cycle, the contribution rates of (V) and (P) are substantial, revealing the dominance of energy-intensive industries in the current economic landscape. This indicates that the rise in CE is primarily caused by heavy energy consumption in secondary industry. The increase in (P) partially reflects technological advancement and improved efficiency; at the same time, it also signifies growth in construction scale and greater use of building materials, which exacerbates CE.
5.6. Training of the CE Prediction Models
The BP neural network is configured with 7, 7, and 1 nodes in the input, hidden, and output layers, respectively, and its learning rate and number of iterations are set to 0.01 and 1000. In the GM-BP model, the output of the GM serves as an additional input, so the GM-BP neural network has 8, 7, and 1 nodes in the input, hidden, and output layers; the learning rate and maximum number of iterations are the same as for the BP model. The SVM model performs best after training when its radial basis function parameter and penalty factor are set to 3.5 and 10, respectively. In the GA-BP model, the input, hidden, and output layers have 7, 5, and 1 nodes, respectively, with the same learning rate and maximum number of iterations as the BP model. The dataset is divided into a training set and a test set in an 8:2 ratio.
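As an illustration of the plain BP configuration described above (7-7-1 layout, learning rate 0.01, 1000 iterations, 8:2 split), a minimal sketch follows. The sigmoid hidden activation, linear output, per-sample gradient updates, weight initialization, and synthetic data are all assumptions made for this example; the study's actual implementation details are not stated here.

```python
import math
import random

class BPNetwork:
    """One hidden layer, sigmoid hidden units, linear output (assumed design)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [1 / (1 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w2, self._hidden(x))) + self.b2

    def train(self, xs, ys, lr=0.01, epochs=1000):
        """Backpropagation with per-sample gradient descent on squared error."""
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h = self._hidden(x)
                err = sum(w * v for w, v in zip(self.w2, h)) + self.b2 - y
                for j, hj in enumerate(h):
                    # Gradient w.r.t. hidden pre-activation, using w2 before its update.
                    grad_h = err * self.w2[j] * hj * (1 - hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * grad_h
                    self.w1[j] = [w - lr * grad_h * v for w, v in zip(self.w1[j], x)]
                self.b2 -= lr * err

def mse(net, xs, ys):
    return sum((net.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic data only: 20 samples with 7 features; the target is their mean.
rng = random.Random(1)
xs = [[rng.random() for _ in range(7)] for _ in range(20)]
ys = [sum(x) / 7 for x in xs]
split = int(0.8 * len(xs))  # 8:2 train/test split
train_x, test_x = xs[:split], xs[split:]
train_y, test_y = ys[:split], ys[split:]

net = BPNetwork(n_in=7, n_hidden=7)
before = mse(net, train_x, train_y)
net.train(train_x, train_y, lr=0.01, epochs=1000)
after = mse(net, train_x, train_y)
print(f"train MSE {before:.4f} -> {after:.4f}, test MSE {mse(net, test_x, test_y):.4f}")
```

In a GA-BP variant, a genetic algorithm would search for the initial weights and biases before this training loop runs; that step is omitted from the sketch.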
Figure 10 compares the actual and predicted values for each model. The BP and GM-BP predictions fluctuate around the actual values in both the training and test sets, while the SVM predictions deviate somewhat in the test set. In contrast, the GA-BP model fits better than the other three models.
Table 5 shows that the regression models rank in order of performance as GA-BP, SVM, GM-BP, and BP. The R² of GA-BP is higher than that of the other models by 0.0435–0.0981, while its MAE is lower by 63–76% and its MAPE is lower by 23–68%. In summary, this paper selects the best-performing model, GA-BP, to predict CE. This study uses R², MAE, and MAPE as evaluation metrics, with the following calculation formulas:
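$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$$

$$MAPE = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|$$

where $y_i$ represents the actual value, $\hat{y}_i$ denotes the predicted value, $\bar{y}$ is the mean value of the sample, and n indicates the total number of samples. The closer R² is to 1, the better the model fit, while smaller values of MAE and MAPE indicate better model performance.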
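The three metrics are straightforward to compute. The sketch below uses their standard definitions; the function names and the sample values are illustrative only.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative values only.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(r2_score(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```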
5.7. Scenario Prediction Analysis of CE
In constructing the CE prediction model, and with reference to the relevant literature [18], three CE projection scenarios are defined: low carbon, baseline, and high carbon. This classification aims to comprehensively assess potential trends in future CE and the underlying driving factors. Comparing the three scenarios gives a deeper understanding of how different policy directions and development pathways may affect CE. Based on the relevant policies of the national “14th Five-Year Plan”, the rates of change for the three scenarios are set using the parameter variation rates from the past five years: the low-carbon, baseline, and high-carbon scenarios use the lowest, average, and highest rate of change of each parameter over that period, respectively. Detailed information on the scenario parameter settings is provided in
Table 6.
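The scenario rule described above can be sketched as follows: for each parameter, take the annual rates of change over the last five years and use their minimum, mean, and maximum as the low-carbon, baseline, and high-carbon rates. The function names and the sample series are hypothetical, and the constant-rate projection is a simplification for illustration; the projected parameters would then be fed to the prediction model.

```python
def annual_change_rates(series):
    """Year-over-year relative change of a parameter series."""
    return [(b - a) / a for a, b in zip(series, series[1:])]

def scenario_rates(series, window=5):
    """Lowest, average, and highest rate of change over the last `window` years."""
    rates = annual_change_rates(series)[-window:]
    return {
        "low": min(rates),
        "baseline": sum(rates) / len(rates),
        "high": max(rates),
    }

def project(last_value, rate, years):
    """Extend a parameter forward at a constant annual rate."""
    values = []
    for _ in range(years):
        last_value *= 1 + rate
        values.append(last_value)
    return values

# Made-up six-year history of one parameter (gives five annual rates).
history = [100.0, 110.0, 121.0, 121.0, 133.1, 139.755]
rates = scenario_rates(history)
for name, rate in rates.items():
    print(name, round(rate, 3), [round(v, 1) for v in project(history[-1], rate, 3)])
```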
As shown in
Figure 11, CE differs significantly across the scenarios. In the high-carbon scenario, emissions continue to rise, with a notable increase between 2024 and 2025. Although the growth rate slows and fluctuates after 2025, CE does not reach its peak within the forecast period. In the baseline scenario, even with the support of existing national policies, CE continues to rise steadily, suggesting the need for targeted mitigation measures to alleviate the pressure on emissions. In the low-carbon scenario, CE remains stable throughout the forecast period, in line with the objectives outlined in the State Council’s “Action Plan for Reaching Carbon Peak Before 2030”, with emissions expected to peak by 2030. Thus, Hypothesis 2 is verified. To achieve a carbon peak before 2030 across multiple scenarios, it is crucial to accelerate the green, low-carbon transition and high-quality development. Moreover, stricter regulatory and control mechanisms must be enforced to ensure that CE reaches its peak at the earliest opportunity [
57].
6. Limitations and Future Prospects
The research presented in this paper has several shortcomings. First, due to data constraints, climate factors have not been comprehensively incorporated into the life cycle assessment, even though they significantly affect energy consumption across regions and indirectly affect the consistency of CE data. Future research should integrate carbon emissions resulting from climatic variations with technological advancements to improve the precision of assessments. Second, the selection of influencing factors is limited, as the factors considered are not fully comprehensive. Future studies could further incorporate the specific effects of building materials on carbon emissions and employ multi-model comparative analysis methods to enrich the depth and breadth of the research. Third, while there are numerous types of predictive models, this study has validated only four; the GA-BP model could also be integrated with other emerging ensemble algorithms to achieve greater predictive accuracy and generalization capabilities. Lastly, the horizontal evaluation of carbon emission impacts in this study is not exhaustive; future research could conduct more in-depth sensitivity analyses on this foundation to improve the models’ robustness and identify driving factors more efficiently and accurately.
7. Conclusions and Recommendations
This study conducts detailed CE accounting for 30 provinces in China from 2007 to 2021 and analyzes the key driving factors influencing CE. The STIRPAT model is used to identify the weights of the influencing factors, and LMDI decomposition is used to determine their contribution rates. Additionally, multiple predictive models are trained and compared, and the GA-BP model is selected to forecast CE for its superior accuracy and generalization capabilities. Finally, a multidimensional assessment of CE impacts is performed.
(1) The CE accounting results for each province indicate that the life cycle CE remains relatively stable in most regions. Overall, CE has shown an upward trend, increasing from 1868 million tons in 2007 to 5646 million tons in 2021. Regarding the spatial distribution of CE, the eastern region exhibits the highest emissions, while the western region has the lowest. In 2021, the average CE for the eastern and western provinces was 279 million tons and 124 million tons, respectively. Furthermore, the results of the Dagum Gini coefficient decomposition clearly indicate that inter-regional disparities are the main factors driving spatial variations in CE.
(2) From the life cycle perspective, CE shows an overall increase at all stages. By stage, material production accounts for the largest share of CE, followed by the operation stage, while CE from the transportation and construction stages is relatively small. CE from the transportation and construction stages remains relatively stable overall, while CE from the operation stage has gradually increased over time. CE in the material production stage fluctuates significantly, with the most notable fluctuation occurring in 2012, primarily due to a dramatic rise in the use of construction materials.
(3) The seven factors influencing CE are ranked by level of impact as follows: (Y) > (T) > (R) > (V) > (P) > (L) > (S). All of these variables contribute positively to the increase in CE.
(4) Compared to the other predictive models in this study, the GA-BP model exhibits superior generalization ability and prediction accuracy. Predictions of future CE trends based on three scenarios indicate a lower likelihood of achieving peak CE before 2030 under high-carbon and baseline scenarios; in contrast, within the low-carbon scenario, CE is expected to stabilize before 2030, thereby presenting a potential pathway to achieving the peak emission target.
Based on these conclusions, this paper presents the following policy recommendations:
From the perspective of material production and transportation, the production and transportation processes of building materials are the primary sources of CE. To reduce CE at the source, it is essential to upgrade the industrial structure, promote the research and development of green and lightweight building materials, and improve the utilization rate of building materials.
In the construction stage, CE primarily arises from the construction site environment, equipment use, and the energy consumed in processing waste construction materials. It is crucial to implement effective management strategies at construction sites to minimize environmental impact, promote the use of more sustainable building materials, and enhance the recycling rate of discarded materials.
In the operation stage, CE primarily results from energy consumption in the tertiary industry and residential non-transportation activities. It is important to raise environmental awareness among residents and encourage communities to organize more energy-saving and emission-reduction initiatives to actively involve residents. Additionally, encouraging the adoption of clean energy sources, including wind and solar energy, can significantly reduce carbon emissions.
From the government’s perspective, it is essential to actively support the green technology research and development efforts of enterprises and strengthen policy support and incentive mechanisms. The government can provide tax exemptions, R&D subsidies, and other benefits to promote technological innovation. Additionally, in terms of environmental regulation, it is important to implement measures that reward companies with strong environmental performance, penalize those that exceed CE standards, and limit the market access of non-compliant companies. These actions will help raise awareness of corporate environmental responsibility.