Decision Support for Carbon Emission Reduction Strategies in China’s Cement Industry: Prediction and Identification of Influencing Factors

Li, Xiangqian; Li, Keke; Tian, Yaxin; Shen, Siqi; Yu, Yue; Jin, Liwei; Meng, Pengyu; Cao, Jingjing; Zhang, Xiaoxiao

doi:10.3390/su16135475

by Xiangqian Li ¹, Keke Li ¹, Yaxin Tian ², Siqi Shen ¹, Yue Yu ¹, Liwei Jin ¹, Pengyu Meng ¹, Jingjing Cao ¹ and Xiaoxiao Zhang ³,*

¹ School of Statistics, Capital University of Economics and Business, Beijing 100070, China
² School of Finance, Capital University of Economics and Business, Beijing 100070, China
³ School of Statistics and Data Science, Beijing Wuzi University, Beijing 101126, China
* Author to whom correspondence should be addressed.

Sustainability 2024, 16(13), 5475; https://doi.org/10.3390/su16135475

Submission received: 13 May 2024 / Revised: 19 June 2024 / Accepted: 21 June 2024 / Published: 27 June 2024

(This article belongs to the Section Air, Climate Change and Sustainability)

Abstract:

China is one of the world’s largest producers and consumers of cement, making carbon emissions in the cement industry a focal point of current research and practice. This study explores the prediction of cement consumption and its influencing factors across 31 provinces in China using the RF-MLP-LR model. The results show that the RF-MLP-LR model predicts cement consumption accurately, with the Mean Absolute Percentage Error (MAPE) below 10% in most provinces. Specifically, the model outperforms the standalone Random Forest (RF), Multi-Layer Perceptron (MLP), and Linear Regression (LR) models, especially in handling complex scenarios or specific regions. The study also analyzes the key factors influencing cement consumption, highlighting the significant impact of per capita GDP, per capita housing construction area, and urbanization rate. These findings provide important insights for policy formulation, aiding the transition of China’s cement industry towards low-carbon, sustainable development and contributing to carbon neutrality goals.

Keywords:

cement consumption; machine learning; carbon neutrality; prediction model

1. Introduction

China is one of the world’s largest producers and consumers of cement, and the cement industry plays a crucial role in the country’s economic growth and infrastructure development [1,2]. However, amidst increasing global concerns about climate change and carbon neutrality goals, the carbon emissions generated during cement production have become a focal point in current research and practice. The cement industry is recognized as one of the most challenging sectors in which to mitigate carbon emissions, making it imperative to find effective solutions to reduce its carbon footprint [3,4].

Key to addressing the carbon emissions challenge in the cement industry is understanding the trends and influencing factors of cement consumption. Cement consumption is influenced by various factors including economic growth, infrastructure development, and policy directives. Therefore, accurately predicting changes in cement consumption and analyzing its influencing factors will provide a scientific basis and effective pathways for formulating emission reduction policies.

This study aims to explore the prediction of cement consumption and its influencing factors across 31 provinces in China, thus providing support for addressing the carbon emissions challenge in the cement industry. We propose an integrated model, RF-MLP-LR, which combines Random Forest, Multi-Layer Perceptron, and Linear Regression, for predicting cement consumption and analyzing its influencing factors. Through this research, we hope to provide scientific insights for the green and sustainable development of China’s cement industry and contribute to achieving carbon neutrality goals.

Currently, research on the Chinese cement industry has covered multiple aspects, including market analysis, trend forecasting, and policy impact. However, studies on the prediction of cement consumption and its influencing factors in various provinces of China remain relatively scarce, providing further room and potential for research in this field.

From the perspective of prediction methods, time series forecasting models can be roughly divided into three categories: traditional statistical methods, machine learning methods, and neural networks. Each is discussed in turn below.

Linear regression (LR), a traditional statistical method, is a fundamental tool for predictive analysis that models a linear relationship between independent and dependent variables. In recent years, researchers have extended and improved the linear regression model. The Local Linear Ensemble Regression (LLER) method proposed by [5] addresses non-linear problems by decomposing data into local linear regions and applying multiple linear models. Ref. [6] compared Nonlinear Regression (NLR) and Simple Linear Regression (SLR) in regional freight demand forecasting, finding that each has its advantages. Ref. [7] treated the estimation of the linear regression model as an optimization problem and improved the accuracy of the prediction model through optimization principles and methods. These studies demonstrate the flexibility and applicability of the linear regression model in data processing and predictive analysis.

Artificial Neural Networks (ANNs), especially Multi-Layer Perceptron (MLP) models, have become powerful tools in data analysis and prediction tasks across multiple fields. Due to their excellent non-linear mapping and self-learning abilities, ANN models have shown significant advantages in solving complex problems. In the fields of transportation and energy, ANN models are widely used for prediction and policy planning. For example, ref. [8] successfully optimized key parameters in urban transportation energy policy planning using a hybrid algorithm combined with ANN, demonstrating that the ANN-PSO model based on GDP, population, and ton-kilometers has significant advantages in prediction accuracy. This study provides a strong basis for the formulation of urban transportation energy policies.

In the prediction of environmental emissions, ANN models also play an important role. Ref. [9] accurately predicted CO₂ emissions from the Organization of the Petroleum Exporting Countries (OPEC) using ANN combined with advanced optimization algorithms, improving prediction speed while ensuring accuracy. Additionally, ref. [10] predicted CO₂ emissions expenditures, determining the optimal parameters using backpropagation artificial neural network models, and achieving good prediction results. In the management of carbon emissions in the transportation sector, the study by [11] showed that the ANN model performs best in predicting CO₂ emissions from the transportation sector, providing a scientific basis for relevant policy-making. Ref. [12] focused on real-time emission prediction, developing an emission prediction model based on vehicle characteristics using ANN technology. This model showed significantly better accuracy in emission prediction than traditional models, providing technical support for vehicle emission control and regulation.

In summary, MLP or ANN models, with their excellent performance and wide range of applications, provide strong support for data analysis and prediction tasks in multiple industries. With the growth of data volume and the improvement of computing power, ANN models are expected to play a greater role in more fields in the future, providing more accurate and efficient solutions to complex problems.

In the field of machine learning, common models for time series prediction include Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR), among others. However, these models typically require large datasets. In contrast, Random Forest performs well on small datasets. As an ensemble learning algorithm, it has achieved strong predictive performance in areas such as drug design and financial forecasting. Ref. [13] successfully predicted the binding sites of mannose using random forests, while ref. [14] used them to predict the initial return on investment for IPOs. Random forests can handle high-dimensional data and are robust to noise. The algorithm also continues to be refined, for example through post-selection enhanced random forests. Ref. [15] confirmed that random forests also have advantages in regression prediction. In conclusion, random forests have become an important tool for predictive analysis in scenarios with small datasets.

Feature selection, which involves ranking the influencing factor features, is a key step in data analysis that helps us to deepen our understanding of the inherent structure and patterns of data, revealing the key factors influencing the target variable. Among the many feature selection methods, the STIRPAT model, Random Forest (RF), and Principal Component Analysis (PCA) are three commonly used and effective methods.

Firstly, the STIRPAT model has been widely applied to analyze the impact of various factors on carbon emissions due to its flexibility and scalability. Ref. [16] found that analyzing carbon-related news topics can help improve global carbon emissions and transfers, demonstrating the effectiveness of the STIRPAT model in media analysis. Ref. [17] introduced this model to the marketing industry and anticipated its broad application prospects in this field. The STIRPAT model, by considering factors such as population, affluence, and technology comprehensively, provides policymakers with a framework for a thorough understanding of the driving factors of carbon emissions.

Secondly, Principal Component Analysis (PCA), as an effective dimensionality reduction technique, has also played an important role in carbon emission research. Ref. [18] used PCA and grey relational analysis to determine the key factors affecting China’s carbon emissions, providing a scientific basis for policy formulation. Ref. [19] estimated carbon emissions from agricultural production using PCA and revealed a weak decoupling relationship between carbon emissions and output value. PCA simplifies the analysis process and improves computational efficiency by extracting the main features and reducing data dimensions, helping researchers to better understand the driving mechanisms of carbon emissions.

Finally, the Random Forest model has shown significant advantages in feature selection and predicting carbon emissions. The mixed prediction model based on Random Forest proposed by [20] accurately predicts the cooling load of large commercial buildings, demonstrating the superiority of Random Forest in predictive accuracy. Ref. [21] analyzed the influencing factors of carbon emissions in several cities along the Yangtze River Economic Belt using Random Forest, revealing regional differences and emission reduction effects. In addition, the studies by [22,23] fully demonstrate the effectiveness of Random Forest in carbon emission prediction and feature selection. These studies not only emphasize the ability of the Random Forest model to handle complex data and nonlinear relationships but also highlight its advantages in improving prediction accuracy and stability.

Random forests as a feature selection model have significant advantages in handling nonlinear relationships, large numbers of features, overfitting resistance, interpretability, and robustness. In comparison, the STIRPAT model may not perform as well as random forests in handling nonlinear relationships and large numbers of features. Although the STIRPAT model is flexible and scalable, it may be limited in dealing with complex data structures and multidimensional relationships. While PCA serves as a dimensionality reduction technique, it may not be as intuitive and effective as random forests in feature selection, as its primary advantage lies in dimensionality reduction rather than feature selection. Therefore, the superior performance of random forests in feature selection tasks makes it an ideal choice for handling challenging data features.

In summary, the RF-MLP-LR model proposed in this paper combines traditional statistical models, neural network models, and machine learning models, aiming to achieve better results in predicting cement consumption and analyzing its influencing factors across Chinese provinces. The linear regression model from traditional statistics serves as the foundation, and its accuracy and applicability are enhanced through innovation and improvement. The MLP model, with its non-linear mapping and self-learning capabilities, is well suited to complex problems. The RF model particularly excels in small-data scenarios, making it a suitable tool for predictive analysis when samples are limited. By combining these three types of methods, the RF-MLP-LR model is expected to play an important role in predicting cement consumption and analyzing its influencing factors across Chinese provinces, providing a scientific basis for policy formulation and support for industry development.

The innovative points of this paper are as follows:

(1): The application of integrated models: This study adopts the RF-MLP-LR integrated model for the first time to predict cement consumption and analyze its influencing factors across 31 provinces in China. The integrated model combines Random Forest, Multi-Layer Perceptron, and Linear Regression, leveraging their respective strengths to enhance the accuracy and robustness of predictions. Compared to traditional single models, this integrated approach allows for a more comprehensive consideration of influencing factors, thereby providing more precise recommendations for emission reduction policies.
(2): Refinement of cement consumption prediction: Building upon the integrated model, this study conducts a detailed analysis of cement consumption prediction across 31 provinces in China. By delving into factors such as economic development status, infrastructure construction needs, and policy orientations of each province, precise predictions of cement consumption are achieved. This provides crucial evidence for formulating tailored emission reduction policies in different regions, thereby facilitating the achievement of interregional carbon reduction goals.
(3): Policy recommendations: This study not only involves predicting cement consumption and analyzing influencing factors but also, more importantly, proposes targeted emission reduction policy recommendations. Based on in-depth analysis of key factors influencing cement consumption, we put forward a series of feasible emission reduction measures, such as strengthening the formulation of environmental protection policies, promoting energy-saving and environmentally friendly technologies and green materials, and enhancing the establishment of carbon emission trading mechanisms. These policy suggestions provide guidance for promoting the transition of the cement industry towards low-carbon and green development, offering practical action plans for achieving carbon neutrality goals.

The structure of this paper is as follows: Section 2 presents the methodology, focusing on the integrated model RF-MLP-LR. Section 3 discusses the data, detailing its sources. Section 4 covers the results and discussion, providing an in-depth analysis of the findings. Section 5 concludes the paper, summarizing the conclusions drawn and offering corresponding policy recommendations.

2. Methodology

2.1. Random Forest Model

In multivariate time series prediction, we aim to forecast the trends of multiple variables over time. Random Forest, as an ensemble learning method, can be applied to this problem to capture the complex relationships between variables and achieve accurate predictions [24,25]. The steps for applying the Random Forest model to multivariate time series prediction are described below, and the structural diagram of the model is shown in Figure 1.

Step 1, Data Preparation: Organize the multivariate time series data chronologically, using lagged features to transform the series into a supervised learning problem, with past values as features and current values as targets.

Step 2, Data Set Splitting: Divide the data set into a training set (80%) and a test set (20%), with the training set used to build the Random Forest model and the test set to evaluate its performance.

Step 3, Decision Tree Construction: Select nodes for splitting, choose random subsets of features to enhance diversity, determine split criteria, and recursively build child nodes until stopping conditions are met. Pruning may be performed to prevent overfitting and enhance generalization.

Step 4, Model Prediction: Aggregate predictions from multiple decision trees in the Random Forest. The final prediction value is typically the average (or weighted average) of these predictions.
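The data preparation, splitting, and aggregation steps above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the authors' code: the helper names `make_lagged`, `chronological_split`, and `average_predictions`, the lag count, and the toy series are hypothetical, and the split is assumed to be chronological (the text states only the 80/20 proportions).

```python
from typing import List, Sequence, Tuple


def make_lagged(series: Sequence[float], n_lags: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a series into (features, targets): n_lags past values predict the current one."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(list(series[t - n_lags:t]))  # past values, oldest first
        y.append(series[t])                   # current value is the target
    return X, y


def chronological_split(X, y, train_frac: float = 0.8):
    """Assumed chronological split: the earliest 80% of rows train, the rest test."""
    cut = int(len(X) * train_frac)
    return X[:cut], y[:cut], X[cut:], y[cut:]


def average_predictions(tree_predictions: Sequence[Sequence[float]]) -> List[float]:
    """Aggregate per-tree predictions by simple (unweighted) averaging, as in Step 4."""
    n_trees = len(tree_predictions)
    return [sum(column) / n_trees for column in zip(*tree_predictions)]


# Toy series with hypothetical values (not taken from the paper's data).
series = [10.0, 12.0, 13.0, 15.0, 18.0, 21.0, 22.0, 24.0, 27.0, 30.0, 31.0, 33.0]
X, y = make_lagged(series, n_lags=2)
X_train, y_train, X_test, y_test = chronological_split(X, y)
```

Tree construction itself (Step 3) is omitted here; in practice a library implementation of Random Forest would take `X_train` and `y_train` as input.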

2.2. Multilayer Perceptron Model

Multilayer Perceptron (MLP): MLP is a commonly used artificial neural network model suitable for regression tasks [26]. It consists of input, hidden, and output layers, and its structure is shown in Figure 2. In time series forecasting, MLP handles sequential dependencies and predicts future values by capturing complex patterns and trends. It is trained using backpropagation and optimization techniques to minimize prediction errors. Designing an appropriate network architecture and choosing suitable activation functions allows MLP to achieve accurate predictions.

(1) Data Preparation: First, organize the time series data in chronological order. Typically, time series data can be transformed into a supervised learning problem, where the past time steps $X_{t-1}, X_{t-2}, \dots, X_{t-n}$ are used as input features and the current or future time step $X_{t}$ is used as the output label.

(2) Network Architecture Design: Design the network structure of the MLP model. A typical MLP structure includes one or multiple hidden layers, each containing multiple neurons, along with an output layer. The hidden layers usually employ nonlinear activation functions such as ReLU (Rectified Linear Unit) or Sigmoid functions. The input $z_{j}^{l}$ and output $a_{j}^{l}$ of the $j$-th node in the $l$-th layer are computed as follows:

$$z_{j}^{l} = \sum_{i} w_{ij}^{l} a_{i}^{l-1} + b_{j}^{l} \tag{1}$$

$$a_{j}^{l} = \sigma(z_{j}^{l}) \tag{2}$$

where $w_{ij}^{l}$ represents the weight connecting the $i$-th node in layer $l-1$ to the $j$-th node in layer $l$, $b_{j}^{l}$ is the bias of the $j$-th node in layer $l$, and $\sigma(\cdot)$ denotes the activation function.
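As a small worked illustration of Equations (1) and (2), the forward pass through a single layer can be written as follows. The function name `layer_forward` and the weights, biases, and inputs are hypothetical values chosen for the example, not parameters from the paper.

```python
import math
from typing import Callable, List, Sequence


def relu(z: float) -> float:
    """Rectified Linear Unit activation."""
    return max(0.0, z)


def sigmoid(z: float) -> float:
    """Sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(
    a_prev: Sequence[float],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
    activation: Callable[[float], float],
) -> List[float]:
    """Compute the outputs of layer l; weights[i][j] links node i of layer l-1 to node j of layer l."""
    outputs = []
    for j, b_j in enumerate(biases):
        # Eq. (1): weighted sum of the previous layer's outputs plus the bias.
        z_j = sum(weights[i][j] * a_prev[i] for i in range(len(a_prev))) + b_j
        # Eq. (2): apply the activation function.
        outputs.append(activation(z_j))
    return outputs


# Hypothetical two-input, two-node layer.
a_prev = [1.0, 2.0]
weights = [[0.5, -1.0], [0.25, 0.5]]
biases = [0.0, -0.5]
hidden = layer_forward(a_prev, weights, biases, relu)
```

With these values the weighted sums are 1.0 and -0.5, so ReLU passes the first node through unchanged and clips the second to zero.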

(3) Model Training: Train the MLP model using the backpropagation algorithm and optimization techniques (such as stochastic gradient descent). During the training process, the MLP model gradually adjusts its parameters to minimize the loss function between the predicted values $\hat{Y}_{t}$ and the actual values $Y_{t}$. The loss function $L$ is calculated as follows:

$$L = \frac{1}{N} \sum_{i=1}^{N} (Y_{i} - \hat{Y}_{i})^{2} \tag{3}$$

where $N$ is the number of samples.

(4) Model Prediction: After training, use the trained MLP model to predict future time series. By inputting the values of multiple past time steps, the MLP model can output predictions for future time steps.

2.3. Linear Regression Model

Linear Regression (LR) Model: LR is a classical statistical method used to establish a linear relationship between independent variables (features) and dependent variables [27]. It fits a straight line or hyperplane to describe this relationship, using the mathematical expression:

$$Y = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \dots + \beta_{n} X_{n} + \varepsilon \tag{4}$$

where $Y$ is the dependent variable (target variable), $X_{1}, X_{2}, \dots, X_{n}$ are the independent variables (features), $\beta_{0}, \beta_{1}, \beta_{2}, \dots, \beta_{n}$ are the coefficients (parameters) of the model, and $\varepsilon$ is the error term.

The model parameters $\beta_{0}, \beta_{1}, \beta_{2}, \dots, \beta_{n}$ can be estimated by minimizing the loss function. The commonly used loss function is the Mean Squared Error (MSE), calculated as:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (Y_{i} - \hat{Y}_{i})^{2} \tag{5}$$

where $N$ is the number of samples, $Y_{i}$ is the true value of the dependent variable, and $\hat{Y}_{i}$ is the value predicted by the model.

The training process of the linear regression model is to adjust the parameters $\beta_{0}, \beta_{1}, \beta_{2}, \dots, \beta_{n}$ to minimize the loss function. Usually, optimization algorithms such as Ordinary Least Squares (OLS) or gradient descent are used for parameter estimation.

Finally, the trained linear regression model can be used to predict the dependent variable values of new samples, i.e., calculate the corresponding dependent variable value based on the given independent variable values.
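To make the estimation step concrete, the following sketch fits a linear regression with a single feature by the closed-form OLS solution and evaluates it with the MSE defined above. It is a simplified illustration only: the function names `fit_simple_ols` and `mse` and the toy data are assumptions, and the multi-feature case used in the paper would require solving the full normal equations.

```python
from typing import Sequence, Tuple


def fit_simple_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Closed-form OLS for one feature: returns (intercept, slope)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between true and predicted values."""
    return sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / len(y_true)


# Toy data generated from y = 1 + 2x (hypothetical, noise-free).
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
beta0, beta1 = fit_simple_ols(x, y)
y_hat = [beta0 + beta1 * xi for xi in x]
loss = mse(y, y_hat)
```

Because the toy data lie exactly on a line, the fitted coefficients recover the intercept 1 and slope 2, and the loss is zero.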

2.4. RF-MLP-LR Model

In the integration process of the RF-MLP-LR model, we first train the RF, MLP, and LR models separately. The structure of the RF-MLP-LR model is shown in Figure 3. The RF model, by constructing multiple decision trees, effectively handles high-dimensional data and captures complex nonlinear relationships. The MLP model, as a multi-layer neural network, learns deep features of the data through its multi-layer neuron structure and handles nonlinear relationships. The LR model provides a simple and efficient method for modeling linear relationships.

During the integration, the outputs of the RF and MLP models are used as new features and fed into the LR model for the final prediction. The advantage of this approach is that the RF model improves data quality through feature selection and noise reduction, the MLP model further extracts deep features, and the LR model combines these features linearly, enhancing the overall model’s prediction accuracy and stability.

Through this integration method, the RF, MLP, and LR models complement each other in handling various data complexities, allowing the RF-MLP-LR model to better handle complex data relationships and improve prediction performance. The specific principles and processes are described as follows:

Step 1: Data preparation: Data preparation involves extracting features (X) and the target variable (y) from the data, followed by splitting the dataset into training and testing sets.

Step 2: Random forest model training and prediction: Train the random forest model using the training set, setting appropriate hyperparameters (such as the number of trees, maximum depth, etc.), and then make predictions on the testing set to obtain the random forest model’s predictions.

Step 3: Neural network model training and prediction: Train the neural network model using the training set, specifying the network structure and parameters (such as the size of hidden layers, learning rate, etc.), and then make predictions on the testing set to obtain the neural network model’s predictions.

Step 4: Linear regression model training and prediction: Extract predictions from the random forest and neural network models on the testing set and use them as new features. Train the linear regression model using the training set to learn how to combine the predictions from different models. Then, make predictions on the testing set to obtain the final ensemble prediction.
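The combination step can be sketched as follows: the predictions of the two base models are treated as features, and a linear regression learns how to weight them. This is a minimal sketch under stated assumptions rather than the authors' implementation. The base-model outputs are hypothetical numbers standing in for trained RF and MLP models, the helper names `solve_linear_system`, `fit_meta_lr`, and `predict_meta` are invented for the example, and the meta-model is assumed to be fitted on base-model predictions for the training rows.

```python
from typing import List, Sequence


def solve_linear_system(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_meta_lr(base_preds: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Least-squares fit of y on [1, rf_pred, mlp_pred] via the normal equations."""
    rows = [[1.0] + list(p) for p in base_preds]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[c] for r in rows) for c in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve_linear_system(xtx, xty)


def predict_meta(coefs: Sequence[float], base_preds: Sequence[Sequence[float]]) -> List[float]:
    """Final ensemble prediction: intercept plus weighted base-model predictions."""
    return [coefs[0] + sum(c * p for c, p in zip(coefs[1:], row)) for row in base_preds]


# Hypothetical (RF prediction, MLP prediction) pairs and targets for the training rows.
train_preds = [(10.0, 12.0), (20.0, 19.0), (30.0, 33.0), (40.0, 38.0), (50.0, 52.0)]
y_train = [11.0, 19.5, 31.5, 39.0, 51.0]  # constructed as the mean of the two predictions
coefs = fit_meta_lr(train_preds, y_train)
final = predict_meta(coefs, [(60.0, 58.0)])  # base predictions for one test row
```

In this constructed example the targets are the average of the two base predictions, so the fitted meta-model assigns each base model a weight close to 0.5 with an intercept close to zero.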

We selected the RF-MLP-LR model for several reasons: First, the RF (Random Forest) model and the MLP (Multi-Layer Perceptron) model perform exceptionally well in handling nonlinear and complex data relationships. Second, the RF model excels in high-dimensional data environments, capable of processing a large number of features and exhibiting strong noise resistance. The MLP model, with its layered structure, effectively handles high-dimensional data and captures complex feature relationships through nonlinear activation functions, while the LR (Linear Regression) model provides a simple and efficient method for modeling linear relationships. Additionally, by combining the strengths of the RF, MLP, and LR models, the RF-MLP-LR model maintains high prediction accuracy and stability in various data environments, leveraging their individual advantages to reduce potential bias and variance issues associated with single models. Therefore, the choice of the RF-MLP-LR model allows us to fully utilize the advantages of different models in handling complex data relationships, high-dimensional data environments, and improving prediction accuracy and stability, providing more comprehensive and accurate prediction results.

In this paper, we will use Mean Absolute Percentage Error (MAPE) as the evaluation metric for our model. This is a metric that quantifies the average percentage difference between predicted values and actual values. It provides a useful perspective on the magnitude of errors relative to the scale of the actual values. MAPE is often employed to evaluate the accuracy of forecasting models, particularly when the range of possible values is significant. The formula for calculating MAPE is:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{|y_{i} - \hat{y}_{i}|}{|y_{i}|} \right) \times 100\%$$

where $y_{i}$ is the actual value, $\hat{y}_{i}$ is the predicted value, and $n$ is the number of observations.

We selected MAPE as the primary evaluation metric for the following reasons: First, MAPE represents the prediction error as a percentage, making it easy to understand and interpret; second, MAPE is suitable for data of various scales and can effectively measure the accuracy of prediction models; additionally, by expressing errors as percentages, MAPE avoids the issue of comparing different scales, making results comparable across different datasets.
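For reference, the MAPE formula can be computed directly as in the sketch below. The function name `mape` and the sample values are illustrative assumptions; the check for zero actual values is added because the metric is undefined in that case.

```python
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, returned in percent."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if any(y == 0 for y in y_true):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(y - y_hat) / abs(y) for y, y_hat in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical actual and predicted values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mape(actual, predicted)
```

Here the relative errors are 10%, 10%, and 0%, so the score is their mean, about 6.67%.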

3. Data

The data used in this study includes cement consumption data and economic and social development indicators for 31 provinces in China from 2000 to 2021. These data are primarily sourced from the provincial statistical yearbook databases. Specifically, the cement consumption data are derived from the provincial cement consumption statistics, and the economic and social development indicators include per capita GDP, population, urbanization rate, industrial added value, and others.

In the data preprocessing process, we performed the following steps: First, data cleaning was conducted to check and handle outliers and errors in the dataset. Second, for the missing values in the dataset, interpolation and mean imputation methods were used to maintain data integrity. Finally, feature engineering was performed to extract and construct multiple features, including per capita GDP, per capita housing construction area, and urbanization rate, and feature scaling was conducted to ensure consistency in the magnitude of different features.
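The imputation and scaling steps might look like the following sketch. It is an assumption-laden illustration, not the authors' pipeline: the text does not specify which gaps are interpolated and which are mean-imputed, nor which scaling method is used, so here interior gaps are linearly interpolated, gaps at the edges fall back to the series mean, and min-max scaling is chosen. The function names `fill_missing` and `min_max_scale` are hypothetical.

```python
from typing import List, Optional, Sequence


def fill_missing(values: Sequence[Optional[float]]) -> List[float]:
    """Linear interpolation for interior gaps; mean imputation for gaps at the edges."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    mean = sum(values[i] for i in known) / len(known)
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(v)
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None or right is None:
            filled.append(mean)  # no neighbor on one side: use the mean
        else:
            w = (i - left) / (right - left)
            filled.append(values[left] + w * (values[right] - values[left]))
    return filled


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical indicator series with missing observations.
raw = [None, 10.0, None, 30.0, 40.0]
filled = fill_missing(raw)
scaled = min_max_scale(filled)
```

The interior gap between 10 and 30 is filled with 20, while the leading gap receives the mean of the observed values.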

During the analysis, Python 3.7.3 and Jupyter Notebook 6.0.3 were used for code development and analysis. Figure 4 shows the CCPC values of each province, and Table 1 gives the statistical description of CCPC. The statistical description of the other influencing factors is presented in Table 2 and Table 3. The variables are described as follows:

Cement Consumption Per Capita (CCPC, kg/person): As the key indicator in this study, variations in per capita cement consumption directly reflect changes in demand for construction and infrastructure across provinces. Cement is a crucial raw material in the construction and real estate industries, and its demand is closely linked to the level of economic development.
Per Capita GDP (GDP, CNY/person): Reflects economic levels and spending capacities across provinces. Economic growth often entails increased demand for infrastructure and housing, thus boosting the demand for cement.
Urbanization Rate (UR, %): Increasing urbanization levels may lead to higher investments in urban development and infrastructure, thereby driving the demand for cement.
Per Capita Housing Construction Area (SG, m²/person): The development of the real estate sector is a major driver of cement consumption. Increased demand for housing may stimulate construction activities, consequently increasing the demand for cement.
Per Capita Length of Secondary Class Highways (SO2, m/person) and Per Capita Length of Grade External Highways (SO1, m/person): Road construction is a significant factor influencing cement consumption. Economic development and urbanization often entail increased demand for transportation infrastructure, stimulating road construction and, consequently, cement consumption.
Proportion of Tertiary Industry (TR3, %): Reflects the impact of industrial structure on cement consumption. An increase in the proportion of tertiary industry may indicate growth in service industries and urban development, leading to higher cement consumption.
Investment Rate (INV, %): Higher investment rates may accelerate infrastructure construction and industrial production, subsequently increasing cement consumption.

Figure 4. The map of CCPC across 31 provinces in China.

Table 1. Statistical description of CCPC.

Province	Max	Min	Mean	Median
Beijing	804.13	117.89	457.78	486.64
Tianjin	749.59	267.54	507.24	510.84
Hebei	2009.67	703.41	1339.12	1314.39
Shanxi	1634.66	367.62	995.91	1048.84
Inner Mongolia	2631.30	265.61	1461.22	1400.31
Liaoning	1381.44	467.23	943.12	956.38
Jilin	1402.68	282.95	869.50	912.79
Heilongjiang	1157.86	237.37	645.58	645.90
Shanghai	575.91	160.33	294.82	284.68
Jiangsu	2354.36	627.75	1641.93	1806.45
Zhejiang	2189.76	909.53	1836.19	1963.36
Anhui	2453.95	312.75	1426.75	1478.68
Fujian	2419.63	443.88	1502.29	1701.47
Jiangxi	2303.24	352.62	1447.22	1470.00
Shandong	1790.70	727.61	1428.79	1541.61
Henan	1796.94	392.43	1109.14	1135.46
Hubei	2034.63	435.87	1356.60	1610.72
Hunan	1847.50	365.09	1202.01	1377.28
Guangdong	1359.75	678.89	1081.24	1076.16
Guangxi	2490.07	447.04	1557.65	1754.72
Hainan	2354.71	392.70	1379.00	1582.31
Chongqing	2228.08	492.38	1470.46	1652.81
Sichuan	1801.27	332.14	1203.52	1657.81
Guizhou	2987.99	208.70	1518.89	1299.55
Yunnan	2780.67	356.52	1459.56	1363.39
Xizang	2994.32	188.56	1148.56	740.69
Shaanxi	2393.19	271.53	1314.03	1535.44
Gansu	1949.92	287.67	1118.62	1014.36
Qinghai	3256.74	239.28	1632.73	1642.94
Ningxia	3103.80	505.87	1893.38	2252.61
Xinjiang	2271.47	483.80	1228.39	1278.16

Table 2. Statistical description of influencing factors.

Province	GDP			UR			SG			R01
Province	Max	Min	Mean	Max	Min	Mean	Max	Min	Mean	Max	Min	Mean
Beijing	184,921.69	106,696.18	145,808.94	87.55	77.54	82.55	12.93	1.27	7.10	0.33	0.00	0.17
Tianjin	161,450.55	96,154.37	128,802.46	84.88	71.99	78.44	16.41	0.43	8.42	0.58	0.00	0.29
Hebei	60,488.44	44,768.40	52,628.42	61.14	19.60	40.37	10.70	0.39	5.55	2.38	0.01	1.19
Shanxi	65,252.50	29,685.62	47,469.06	63.42	34.91	49.17	9.25	0.39	4.82	4.87	0.27	2.57
Inner Mongolia	106,399.64	34,678.10	70,538.87	68.21	42.68	55.45	11.29	0.42	5.86	19.67	1.10	10.39
Liaoning	95,539.19	62,506.09	79,022.64	72.81	54.24	63.53	16.37	0.37	8.37	5.14	0.05	2.59
Jilin	74,995.97	39,868.07	57,432.02	63.36	43.50	53.43	9.30	0.40	4.85	5.83	0.57	3.20
Heilongjiang	60,695.77	46,258.33	53,477.05	65.69	51.94	58.82	7.83	0.45	4.14	14.59	0.17	7.38
Shanghai	174,527.92	136,581.66	155,554.79	89.60	74.60	82.10	10.01	0.61	5.31	0.30	0.00	0.15
Jiangsu	137,531.24	68,774.59	103,152.92	73.94	41.49	57.72	14.55	0.31	7.43	2.39	0.00	1.20
Zhejiang	112,995.00	75,728.02	94,361.51	72.66	48.67	60.67	16.51	0.49	8.50	1.86	0.00	0.93
Anhui	70,641.20	28,617.15	49,629.18	59.39	28.00	43.70	13.51	0.20	6.86	4.10	0.00	2.05
Fujian	117,183.29	61,865.78	89,524.54	69.70	41.57	55.64	16.66	0.36	8.51	7.98	2.51	5.25
Jiangxi	65,915.39	28,345.33	47,130.36	61.46	27.67	44.57	9.52	0.22	4.87	14.19	1.21	7.70
Shandong	89,311.32	55,739.71	72,525.52	63.94	38.00	50.97	13.10	0.33	6.72	0.63	0.00	0.31
Henan	60,616.61	31,792.12	46,204.36	56.45	23.20	39.83	10.11	0.27	5.19	9.19	0.24	4.71
Hubei	86,232.26	38,747.53	62,489.90	64.09	40.47	52.28	10.47	0.35	5.41	8.07	0.72	4.40
Hunan	69,923.04	30,909.86	50,416.45	59.71	29.75	44.73	15.14	0.30	7.72	17.25	1.65	9.45
Guangdong	98,563.17	65,582.81	82,072.99	74.63	55.00	64.82	10.19	0.44	5.32	3.25	0.02	1.63
Guangxi	49,374.17	25,002.17	37,188.17	55.08	21.30	38.19	9.65	0.22	4.94	8.10	1.46	4.78
Hainan	63,813.02	36,410.22	50,111.62	60.97	40.11	50.54	14.95	0.56	7.75	11.54	0.15	5.84
Chongqing	87,295.44	32,753.16	60,024.30	70.32	35.59	52.96	14.13	0.59	7.36	20.80	2.64	11.72
Sichuan	64,657.55	28,268.80	46,463.17	57.82	29.89	43.86	9.95	0.38	5.16	10.29	1.67	5.98
Guizhou	51,112.21	15,408.85	33,260.53	54.33	23.87	39.10	10.11	0.26	5.18	22.73	2.63	12.68
Yunnan	58,183.80	25,012.58	41,598.19	51.05	23.36	37.21	11.18	0.31	5.75	22.01	1.65	11.83
Xizang	57,132.11	26,730.01	41,931.06	36.61	19.30	27.96	8.97	0.65	4.81	119.24	31.59	75.42
Shaanxi	75,761.83	26,760.80	51,261.32	63.63	32.26	47.95	11.45	0.51	5.98	14.19	1.41	7.80
Gansu	41,352.03	22,830.99	32,091.51	53.33	24.01	38.67	7.40	0.53	3.96	20.72	1.67	11.19
Qinghai	57,447.34	29,934.17	43,690.76	61.02	34.76	47.89	11.84	0.51	6.17	43.31	5.04	24.18
Ningxia	62,701.46	28,144.79	45,423.13	66.04	32.54	49.29	16.85	0.42	8.64	2.98	0.01	1.50
Xinjiang	62,058.15	41,858.80	51,958.48	57.26	33.75	45.51	11.93	0.74	6.34	36.01	1.24	18.62

Table 3. Statistical description of influencing factors.

Province	R02			TR3			INV
Province	Max	Min	Mean	Max	Min	Mean	Max	Min	Mean
Beijing	0.18	0.11	0.14	83.87	58.31	71.09	63.20	36.90	50.05
Tianjin	0.26	0.12	0.19	64.40	35.88	50.14	76.90	49.85	63.38
Hebei	0.30	0.12	0.21	51.73	31.51	41.62	59.00	43.50	51.25
Shanxi	0.46	0.25	0.36	55.45	32.18	43.81	72.80	44.80	58.80
Inner Mongolia	0.87	0.16	0.52	50.48	30.41	40.45	93.40	39.70	66.55
Liaoning	0.45	0.20	0.32	53.47	34.00	43.74	65.00	31.52	48.26
Jilin	0.41	0.15	0.28	53.76	34.16	43.96	79.80	37.02	58.41
Heilongjiang	0.40	0.12	0.26	57.02	29.42	43.22	65.60	30.90	48.2
Shanghai	0.16	0.06	0.11	73.27	47.86	60.56	48.40	37.20	42.80
Jiangsu	0.29	0.10	0.19	52.53	34.87	43.70	51.70	39.80	45.75
Zhejiang	0.17	0.09	0.13	55.76	36.26	46.01	51.10	39.52	45.31
Anhui	0.23	0.10	0.17	51.25	32.52	41.89	57.60	36.00	46.80
Fujian	0.28	0.13	0.21	47.47	38.41	42.94	58.90	44.80	51.85
Jiangxi	0.28	0.11	0.19	48.13	28.76	38.45	54.40	36.24	45.32
Shandong	0.26	0.21	0.24	53.54	32.20	42.87	56.80	46.50	51.65
Henan	0.30	0.10	0.20	49.14	28.62	38.88	78.00	40.80	59.40
Hubei	0.43	0.14	0.28	52.78	34.85	43.82	58.80	39.80	49.30
Hunan	0.25	0.06	0.15	53.23	36.49	44.86	57.90	34.53	46.22
Guangdong	0.19	0.15	0.17	56.46	36.81	46.64	44.50	35.30	39.90
Guangxi	0.31	0.06	0.18	51.87	34.11	42.99	85.20	32.98	59.09
Hainan	0.20	0.08	0.14	61.50	39.07	50.28	74.20	45.60	59.90
Chongqing	0.30	0.10	0.20	53.20	36.04	44.62	62.30	43.23	52.77
Sichuan	0.21	0.09	0.15	52.53	33.36	42.94	54.40	37.04	45.72
Guizhou	0.28	0.02	0.15	50.91	33.69	42.30	69.60	49.91	59.76
Yunnan	0.28	0.04	0.16	52.64	34.56	43.60	94.60	37.07	65.84
Xizang	0.34	0.22	0.28	56.13	45.91	51.02	114.30	35.90	75.10
Shaanxi	0.26	0.09	0.18	47.94	30.84	39.39	68.80	51.28	60.04
Gansu	0.44	0.13	0.29	55.12	33.31	44.22	67.70	41.10	54.40
Qinghai	1.53	0.35	0.94	50.84	32.06	41.45	148.50	63.40	105.95
Ningxia	0.59	0.28	0.43	50.34	33.00	41.67	124.30	63.11	93.71
Xinjiang	0.81	0.29	0.55	51.63	32.49	42.06	99.70	43.29	71.50

By analyzing the relationship between these factors and cement consumption, we can establish predictive models to provide support for carbon emission reduction in the cement industry and contribute to the formulation of relevant policies, facilitating the transition of the cement industry towards a low-carbon and sustainable future.

4. Results and Discussions

4.1. Spatiotemporal Feature Analysis

The differences in per capita cement consumption among China's provinces are influenced not only by the level of economic development but also by regional development strategies and policies. Based on differences in geographical location, economic development status, and urbanization level, the provinces can be grouped into six regions: the eastern coastal, central eastern, northeast, central, western, and northwest regions. A detailed analysis is provided below:

(1): Eastern coastal region: Provinces such as Beijing, Tianjin, Shanghai, and Guangdong showed relatively high per capita cement consumption, which has been declining in recent years. These areas are economically developed and highly urbanized, with significant infrastructure construction demands. However, per capita cement consumption is decreasing due to economic restructuring and urban development transitions.
(2): Central eastern region: Provinces such as Henan, Hunan, and Anhui saw a gradual but slow increase in per capita cement consumption from 2000 to 2021. Despite their large economic output, these provinces lag behind in urbanization and infrastructure development, resulting in slow, steady growth in per capita cement consumption.
(3): Northeast region: Provinces such as Liaoning, Jilin, and Heilongjiang showed relatively stable per capita cement consumption throughout the period. These areas were once important industrial bases in China, but with economic restructuring and industrial upgrading, per capita cement consumption has remained stable.
(4): Central region: Provinces such as Hubei, Henan, and Jiangxi showed an increasing trend in per capita cement consumption, although the growth rate has gradually slowed. These provinces exhibit stable economic development, with increasing demand for urbanization and infrastructure construction.
(5): Western region: Provinces such as Sichuan, Shaanxi, and Gansu experienced significant and rapid growth in per capita cement consumption from 2000 to 2021. These provinces have relatively underdeveloped economies, but accelerated urbanization in recent years has increased the demand for infrastructure construction, leading to fast growth in per capita cement consumption.
(6): Northwest region: Provinces such as Qinghai, Ningxia, and Xinjiang had relatively low and slow-growing per capita cement consumption. These provinces have low population densities and relatively underdeveloped economies, which limits demand for urbanization and infrastructure construction and thus leads to slow growth in per capita cement consumption.

4.2. Prediction Model Comparison

In this study, we developed a new prediction model, RF-MLP-LR, and compared its performance with three common models: Random Forest (RF), Multi-Layer Perceptron (MLP), and Linear Regression (LR). Using data from 31 provinces in China, we trained and tested these models and evaluated their performance with Mean Absolute Percentage Error (MAPE) [28], where a MAPE below 10% indicates very good model performance [29]. The results for each model are shown in Figure 5.
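To make the evaluation criterion concrete, the following minimal sketch computes MAPE in its usual form, MAPE = (100/n) Σ |yᵢ − ŷᵢ| / |yᵢ|, and checks it against the 10% threshold mentioned above. The function name `mape` and the sample values are hypothetical. They are not the study's data, and the authors' own implementation may differ.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

# Hypothetical held-out values for one province (illustrative only).
actual = [1000.0, 1200.0, 1500.0, 1600.0]
predicted = [950.0, 1260.0, 1425.0, 1680.0]

score = mape(actual, predicted)
meets_threshold = score < 10.0  # the 10% criterion cited in the text
print(f"MAPE = {score:.2f}%, below 10%: {meets_threshold}")
```

Because each error is divided by the actual value, MAPE is scale-free, which makes it convenient for comparing provinces whose consumption levels differ widely.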

The RF-MLP-LR model achieved MAPE values below 10% for cement consumption prediction in most provinces, meeting the criterion for very good performance. Only one of the 31 provinces has a MAPE exceeding 10%, and the values for the remaining provinces are relatively low, indicating stable overall performance; the high MAPE in that province may be due to data volatility, external factors, or a small sample size. In contrast, the other three models had markedly higher MAPE values in some provinces, particularly in complex scenarios or specific regions, indicating lower prediction accuracy.

While the RF model performed well in certain provinces, it exhibited higher MAPE values in others, reflecting limitations in handling specific complex relationships. The MLP model showed unstable performance in some provinces, with MAPE values sometimes exceeding the 10% threshold, likely due to its sensitivity to data distribution and issues such as overfitting or underfitting. The LR model failed to achieve MAPE values below 10% in most provinces, highlighting its inadequacy in handling complex nonlinear problems.

The comparative analysis highlights the clear advantages of the RF-MLP-LR model. By integrating the strengths of Random Forest, Multi-Layer Perceptron, and Linear Regression, this model effectively manages nonlinear relationships and feature interactions while maintaining stable performance in high-dimensional data scenarios. This results in high accuracy and stability in cement consumption prediction.

Furthermore, the difficulty of prediction varies among provinces due to differences in industrial structures, energy consumption patterns, and policy environments. Despite these challenges, the RF-MLP-LR model achieved satisfactory prediction results even in the most complex provinces, validating its strong generalization ability and adaptability.

4.3. Influence Factor Analysis

We analyzed the key factors influencing cement consumption using the Random Forest (RF) model, which yields the feature importance ranking shown in Figure 6. Per capita GDP tops the list, highlighting the significant impact of economic development level on cement consumption. As per capita GDP increases, society's investment in construction, infrastructure, and other sectors continues to grow, directly driving the increase in cement demand.
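One model-agnostic way to obtain such a ranking is permutation importance: shuffle one feature at a time and measure how much the prediction error grows. The sketch below illustrates this idea only. It is not necessarily the importance measure behind Figure 6, and `toy_model`, `permutation_importance`, and the synthetic rows are hypothetical stand-ins rather than the fitted Random Forest or the study's data. The feature keys GDP, UR, and TR3 are borrowed from Tables 2 and 3.

```python
import random

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def permutation_importance(predict, rows, targets, n_repeats=20, seed=0):
    """Average increase in MAPE when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mape(targets, [predict(r) for r in rows])
    importances = {}
    for name in rows[0]:
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Replace only the shuffled column; the other features stay intact.
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            score = mape(targets, [predict(r) for r in permuted])
            increases.append(score - baseline)
        importances[name] = sum(increases) / n_repeats
    return importances

def toy_model(row):
    """Hypothetical stand-in for a fitted model; TR3 is deliberately ignored."""
    return 0.01 * row["GDP"] + 5.0 * row["UR"] + 0.0 * row["TR3"]

# Synthetic rows (illustrative only), with ranges loosely inspired by Tables 2 and 3.
data_rng = random.Random(1)
rows = [
    {"GDP": data_rng.uniform(20000, 150000),
     "UR": data_rng.uniform(20, 90),
     "TR3": data_rng.uniform(30, 80)}
    for _ in range(200)
]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets)
ranking = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.2f}")
```

In this toy setup the feature that the model ignores receives an importance of zero, while the feature with the widest effect on the prediction ranks first.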

Closely following per capita GDP is the per capita housing construction area, which reflects a facet of infrastructure development. With the acceleration of urbanization, people’s demand for housing is constantly growing, leading to frequent housing construction activities. This makes cement an indispensable building material. Especially during urban expansion and renovation, the construction of a large number of new residential and commercial buildings directly drives cement consumption.

Urbanization rate, as an important indicator reflecting the level of urbanization, occupies a prominent position among the factors influencing cement consumption. With the increase in urbanization rate, urban population continues to grow, urban scale expands, and the demand for urban infrastructure and public service facilities also increases. The construction and maintenance of these facilities require a large amount of cement as a supporting material.

Although the proportion of the tertiary industry does not directly affect cement consumption, it indirectly reflects the impact of industrial structure changes on cement demand. With the rapid development of the tertiary industry, especially the rise of modern service industries and high-tech industries, the requirements for infrastructure and supporting facilities keep rising, indirectly promoting the use of cement.

In addition, factors such as investment rate, per capita off-grade highway mileage, and per capita secondary-grade highway mileage also have an important impact on cement consumption. An increase in the investment rate means increased investment in infrastructure construction, thus driving the growth of cement demand. Meanwhile, per capita off-grade highway mileage and per capita secondary-grade highway mileage directly reflect the scale and speed of highway construction, which is an important component of infrastructure construction and has a particularly significant demand for cement.

However, from a regional perspective, the significance of these influencing factors may vary. In developed provinces such as Beijing, Shanghai, and Guangdong, due to their high level of economic development, the impact of per capita GDP on cement consumption may be more prominent. These regions have vibrant economies, frequent construction activities, and a large and stable demand for cement. In contrast, in less developed provinces such as Guizhou, Yunnan, and Tibet, where the level of infrastructure development is relatively lagging, factors such as per capita housing construction area, per capita off-grade highway mileage, and per capita secondary-grade highway mileage may have a more significant impact on cement consumption. These regions are in a rapid stage of infrastructure development, with rapidly growing demand for cement.

This difference not only reflects the different characteristics and development stages of provinces in terms of economic and infrastructure development, but also provides a basis for us to formulate targeted policies. For developed provinces, we can reduce cement consumption by optimizing industrial structure, improving construction quality and efficiency, and other means. For less developed provinces, it is necessary to increase investment in infrastructure construction and promote the healthy development of the cement industry.

At the provincial level, Guangdong, one of the most economically developed provinces in China, has seen a gradual decrease in cement consumption in recent years. With the adjustment of the economic structure and the transformation of urban development, the demand for infrastructure construction in Guangdong has gradually declined. This pattern is particularly evident in regions with high income and high urbanization levels. We specifically analyzed the impact of per capita GDP, per capita housing construction area, and urbanization rate on cement consumption in Guangdong Province from 2000 to 2021. The results show that the level of economic development and housing construction demand have a significant impact on cement consumption.

Henan Province is a region with a large economic output but relatively low urbanization levels. From 2000 to 2021, the per capita cement consumption in Henan Province has gradually increased, but the growth rate has been relatively slow. By analyzing the economic development status, infrastructure construction needs, and policy directions of Henan Province, we found that the demand for infrastructure construction is the main factor driving the growth of cement consumption.

Cement consumption in Sichuan Province has significantly increased from 2000 to 2021, with a rapid growth rate. As a relatively underdeveloped region, Sichuan Province has accelerated its urbanization process in recent years, and the demand for infrastructure construction has greatly increased, leading to a rapid growth in cement consumption. By analyzing the urbanization process and infrastructure construction situation in Sichuan Province, we found that the urbanization rate and infrastructure investment have a significant impact on cement consumption.

In summary, the Random Forest model has provided us with an in-depth insight into the factors influencing cement consumption. Through a comprehensive analysis of the importance and interrelationships of these factors, we can better understand the operating mechanism and development trends of the cement market, providing strong support for the formulation and implementation of relevant policies.

5. Conclusions

This study examines cement consumption trends and their underlying influencing factors in 31 provinces of China by constructing an integrated RF-MLP-LR model. The model performs well in predicting cement consumption, capturing key information from complex data relationships, and thus serves as a valuable tool for policy formulation and industrial development. This paper not only demonstrates the superiority of the model in predicting cement consumption but also provides an in-depth analysis of key factors that influence cement consumption, such as per capita GDP, per capita housing construction area, and urbanization rate. These findings support policy formulation, help optimize the structure of the cement industry, promote sustainable development, and contribute to China's efforts to achieve carbon neutrality. The main conclusions of this paper are as follows:

(1): The RF-MLP-LR model has demonstrated excellent performance in predicting cement consumption. Compared with traditional models, its prediction accuracy has improved significantly, with a Mean Absolute Percentage Error (MAPE) below 10% in most provinces. This result demonstrates the model's strength in handling complex data relationships and high-dimensional data environments, providing a powerful tool for accurate prediction of cement consumption.
(2): Through the analysis of influencing factors on cement consumption, this study found that factors such as per capita GDP, per capita housing construction area, and urbanization rate have a significant impact on cement consumption. The importance of these factors differs among provinces, reflecting the varying levels of economic and infrastructure development in different regions. This finding provides an important basis for formulating targeted cement industry policies.
(3): Based on the research results, this paper proposes a series of policy recommendations. For developed provinces, it is essential to optimize industrial structure and improve construction quality to reduce cement consumption. For less developed provinces, increasing investment in infrastructure construction should be a priority to promote the healthy development of the cement industry. These policy recommendations aim to achieve sustainable development of the cement industry and make positive contributions to China’s efforts to achieve carbon neutrality.

In summary, this study not only provides an in-depth analysis of China’s cement consumption trends and influencing factors but also provides strong support for policy formulation. The application of the RF-MLP-LR model will further improve the prediction accuracy of cement consumption, providing a scientific basis for policy making. At the same time, the deep analysis of influencing factors will also provide important references for optimizing the structure of the cement industry and promoting sustainable development. These research results will help promote the healthy development of China’s cement industry and make positive contributions to achieving the carbon neutrality goal.

Although this paper conducted a thorough analysis of cement consumption trends and influencing factors in 31 provinces of China through the innovative integration of the RF-MLP-LR model, there are still some limitations. Specifically, the research data is mainly based on a past period and fails to fully consider the dynamic changes and recent influencing factors in the industry, which may limit the model’s adaptability to future predictions. Additionally, the study does not assess the potential economic, social, and environmental impacts of the proposed policy recommendations. Future research should focus on updating and applying real-time data to optimize the model’s predictive performance and better reflect the real-time changes in the cement industry. Moreover, it should incorporate a comprehensive assessment of the economic, social, and environmental impacts of the proposed policies to provide a more robust evaluation of their effectiveness.

Author Contributions

Software, X.L. and K.L.; Validation, P.M.; Resources, S.S. and Y.Y.; Data curation, L.J.; Writing—original draft, X.L.; Writing—review & editing, P.M.; Visualization, Y.T. and L.J.; Supervision, J.C. and X.Z.; Funding acquisition, X.L. and X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (grant numbers 12301427; 12301069) and the R&D Program of Beijing Municipal Education Commission (KM202210037002).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets and materials used in this study are available upon request.

Acknowledgments

The authors extend their heartfelt appreciation to the reviewers and editors for their invaluable comments and constructive suggestions. Additionally, special thanks go to Marina for providing language assistance.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

1. Qin, J.H.; Gong, N.J. The estimation of the carbon dioxide emission and driving factors in China based on machine learning methods. Sustain. Prod. Consum. 2022, 33, 218–229.
2. Gao, S.S.; Zhang, X.P.; Chen, M.X. Spatiotemporal dynamics and driving forces of city-level CO₂ emissions in China from 2000 to 2019. J. Clean. Prod. 2022, 377, 134358.
3. Du, Z.W.; Wei, J.X.; Cen, K. China’s carbon dioxide emissions from cement production toward 2030 and multivariate statistical analysis of cement consumption and peaking time at provincial levels. Environ. Sci. Pollut. Res. 2019, 26, 28372–28383.
4. Wei, J.X.; Cen, K. Empirical assessing cement CO₂ emissions based on China’s economic and social development during 2001–2030. Sci. Total Environ. 2019, 653, 200–211.
5. Kang, S.; Kang, P. Locally linear ensemble for regression. Inf. Sci. 2018, 432, 199–209.
6. Yang, Y.D. Development of the regional freight transportation demand prediction models based on the regression analysis methods. Neurocomputing 2015, 158, 42–47.
7. Narula, S.C.; Wellington, J.F.; Lewis, S.A. Valuating residential real estate using parametric programming. Eur. J. Oper. Res. 2012, 217, 120–128.
8. Sahraei, M.A.; Çodur, M.K. Prediction of transportation energy demand by novel hybrid meta-heuristic ANN. Energy 2022, 249, 123735.
9. Chiroma, H.; Abdul-kareem, S.; Khan, A. Global warming: Predicting OPEC carbon dioxide emissions from petroleum consumption using neural network and hybrid cuckoo search algorithm. PLoS ONE 2015, 10, e0136140.
10. Saleh, C.; Leuveano, R.A.C.; Ab Rahman, M.N.; Deros, B.M.; Dzakiyullah, N.R. Prediction of CO₂ emissions using an artificial neural network: The case of the sugar industry. Adv. Sci. Lett. 2015, 21, 3079–3083.
11. Cansiz, O.F.; Unsalan, K.; Unes, F. Prediction of CO₂ emission in transportation sector by computational intelligence techniques. Int. J. Glob. Warm. 2022, 27, 271–283.
12. Jaikumar, R.; Nagendra, S.M.S.; Sivanandan, R. Modeling of real time exhaust emissions of passenger cars under heterogeneous traffic conditions. Atmos. Pollut. Res. 2017, 8, 80–88.
13. Khare, H.; Ratnaparkhi, V.; Chavan, S.; Jayraman, V. Prediction of protein-mannose binding sites using random forest. Bioinformation 2012, 8, 1202–1205.
14. Quintana, D.; Sáez, Y.; Isasi, P. Random Forest Prediction of IPO Underpricing. Appl. Sci. 2017, 7, 636.
15. Johansson, U.; Boström, H.; Löfström, T.; Linusson, H. Regression conformal prediction with random forests. Mach. Learn. 2014, 97, 155–176.
16. Zhou, W.W.; Cao, X.M.; Dong, X.F.; Zhen, X. The effects of carbon-related news on carbon emissions and carbon transfer from a global perspective: Evidence from an extended STIRPAT model. J. Clean. Prod. 2023, 425, 138974.
17. Kilbourne, W.E.; Thyroff, A. STIRPAT for marketing: An introduction, expansion, and suggestions for future use. J. Bus. Res. 2020, 108, 351–361.
18. Huang, Y.S.; Shen, L.; Liu, H. Grey relational analysis, principal component analysis and forecasting of carbon emissions based on long short-term memory in China. J. Clean. Prod. 2019, 209, 415–423.
19. Tian, J.X.; Yang, H.L.; Xiang, P.A.; Liu, D.W.; Li, L. Drivers of agricultural carbon emissions in Hunan Province, China. Environ. Earth Sci. 2016, 75, 121.
20. Gao, Z.K.; Yu, J.Q.; Zhao, A.J.; Hu, Q.; Yang, S.Y. A hybrid method of cooling load forecasting for large commercial building based on extreme learning machine. Energy 2022, 238, 122073.
21. Wang, Z.H.; Zhao, Z.J.; Wang, C.X. Random forest analysis of factors affecting urban carbon emissions in cities within the Yangtze River Economic Belt. PLoS ONE 2021, 16, e0252337.
22. Zhao, J.J.; Kou, L.; Jiang, Z.L.; Lu, N.; Wang, B.; Li, Q.S. A novel evaluation model for carbon dioxide emission in the slurry shield tunneling. Tunn. Undergr. Space Technol. 2022, 130, 104757.
23. Niu, D.; Wang, K.; Wu, J.; Sun, L.; Liang, Y.; Xu, X.; Yang, X. Can China achieve its 2030 carbon emissions commitment? Scenario analysis based on an improved general regression neural network. J. Clean. Prod. 2020, 243, 118558.
24. Wang, H.; Wang, G.Z. Improving random forest algorithm by Lasso method. J. Stat. Comput. Simul. 2021, 91, 353–367.
25. Tang, Z.P.; Mei, Z.; Liu, W.D.; Xia, Y. Identification of the key factors affecting Chinese carbon intensity and their historical trends using random forest algorithm. J. Geogr. Sci. 2020, 5, 743–756.
26. Panahi, F.; Ehteram, M.; Ahmed, A.N.; Huang, Y.F.; Mosavi, A.; El-Shafie, A. Streamflow prediction with large climate indices using several hybrid multilayer perceptrons and copula Bayesian model averaging. Ecol. Indic. 2021, 133, 108285.
27. Wang, S.; Ning, Y.F.; Shi, H.M. A new uncertain linear regression model based on equation deformation. Soft Comput. 2021, 25, 12817–12824.
28. Chang, Z.Y.; Jiao, Y.M.; Wang, X.J. Influencing the Variable Selection and Prediction of Carbon Emissions in China. Sustainability 2023, 15, 13848.
29. Wang, D.L.; Gan, J.; Mao, J.Q.; Chen, F.; Yu, L. Forecasting power demand in China with a CNN-LSTM model including multimodal information. Energy 2023, 263, 126012.

Figure 1. Random forest structure diagram.

Figure 2. MLP architecture diagram.

Figure 3. RF-MLP-LR model architecture diagram.

Figure 5. Model comparison of MAPE values.

Figure 6. Impact factor importance graph.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
