Can China Meet Its 2030 Total Energy Consumption Target? Based on an RF-SSA-SVR-KDE Model

Cui, Xiwen; Guan, Xinyu; Wang, Dongyu; Niu, Dongxiao; Xu, Xiaomin

doi:10.3390/en15166019


by Xiwen Cui ¹,²,*, Xinyu Guan ¹,², Dongyu Wang ¹,², Dongxiao Niu ¹,² and Xiaomin Xu ¹,²

¹ School of Economics and Management, North China Electric Power University, Beijing 102206, China
² Beijing Key Laboratory of New Energy and Low-Carbon Development, North China Electric Power University, Beijing 102206, China
* Author to whom correspondence should be addressed.

Energies 2022, 15(16), 6019; https://doi.org/10.3390/en15166019

Submission received: 14 July 2022 / Revised: 9 August 2022 / Accepted: 16 August 2022 / Published: 19 August 2022


Abstract:

To predict China’s future total energy consumption accurately, this article constructs a random forest (RF)–sparrow search algorithm (SSA)–support vector regression machine (SVR)–kernel density estimation (KDE) model, forecasts China’s energy consumption for 2022–2030 and examines whether China can reach the relevant target in 2030. First, a random forest model screens the influencing factors that form the input set of the model. Then, the sparrow search algorithm is applied to optimize the SVR, overcoming the difficulty of setting the SVR parameters. The resulting SSA-SVR model is applied to forecast the future total energy consumption in China, and interval forecasts are then produced using kernel density estimation, which makes the predictions more informative. A comparison of the prediction results and error values with those of RF-PSO-SVR, RF-SVR and RF-BP shows that the combined model proposed in the paper is more accurate, and it is therefore expected to give more reliable future predictions.

Keywords:

energy consumption; sparrow algorithm; support vector regression machine; kernel density estimation

1. Introduction

Energy shortage is one of the major issues facing mankind today. China has made an international commitment to achieve the “30–60 double carbon” goal of carbon peaking by 2030 and carbon neutrality by 2060 [1]. Currently, coal and oil account for more than 70% of total energy consumption. Decarbonization of the energy mix is a key factor in curbing the growth of carbon emissions [2]. China’s demand for energy remains high [3]. The long-term high level of energy consumption has gradually brought China closer to the “red line” of resource and ecological carrying capacity. The increase in fossil energy consumption will lead to environmental degradation and thus hinder sustainable growth [4]. By accurately forecasting China’s total energy consumption, a forecasting model can help China control its consumption of fossil energy and maintain sustainable development, which is of great significance to China’s economic and energy development.

Traditional energy forecasting methods are mostly statistical. Yuan et al. used the ARIMA (autoregressive integrated moving average) model to forecast energy consumption in China [5]. Ediger et al. used ARIMA and seasonal ARIMA methods to estimate future Turkish primary energy demand from 2005 to 2020 [6]. Xie et al. used the elastic coefficient method (ECM) to forecast the total energy consumption demand in China in 2025 [7]. Meng et al. developed a hybrid trend extrapolation model to predict future trends in the growth of household electricity consumption in China up to 2030 [8]. However, with changing policies and continuous economic development, the traditional forecasting methods are no longer able to accurately predict the trend of energy consumption.

Machine learning has gradually made its mark in the field of energy prediction [9,10,11]. Wang et al. predicted the building cooling and heating loads based on an improved BP neural network, and the results were shown [12]. Izadyar et al. used ELM to predict residential heating demand. The results show that the ELM method can make important progress in terms of accuracy compared to ANN and GP [13]. Ding et al. performed effective short-term wind power prediction based on improved KELM [14]. In order to predict electricity consumption in a timely manner to avoid shortage problems and generation overflow, Li et al. proposed a random forest (RF) model for predicting the daily electricity consumption of general businesses [15]. The SVM-based model for electricity load forecasting was proposed by Aasim et al. The results showed that the model provided excellent forecasts [16]. Support vector regression (SVR) is an extension of support vector machine (SVM) proposed by Drucker et al. [17]. SVR is a machine learning method that is suitable for studying nonparametric estimation problems in finite sample situations and is widely used in small sample, nonlinear problems [18]. Liu et al. used SVR to predict the short-term load of an integrated energy system [19].

The parameters of the SVR affect the accuracy of the model, so the choice of parameters for the SVR becomes a key issue [20]. Niu et al. applied the grey wolf optimizer (GWO) to optimize SVR for prediction accuracy. The results showed that the prediction model has good stability and high prediction accuracy [21]. Li et al. applied the cuckoo search (CS) algorithm to optimize the SVR for predicting short-term wind power output, and the results showed that the model can improve the prediction results of the output [22]. However, the CS algorithm is prone to fall into the local optimum, and the GWO algorithm is prone to over-convergence. Although support vector machines have the advantages of high prediction accuracy, strong learning ability and high generalization ability, the parameter setting problem in their practical use is a difficult problem that cannot be ignored [23]. In this paper, we choose to use the sparrow search algorithm (SSA) to optimize the SVR model. SSA is a new swarm intelligence algorithm [24]. This algorithm can be used to solve problems, such as the traditional optimization algorithms tending to fall into local optimality, and it avoids the problem of over-convergence to some extent.

Traditional point forecasts struggle to reflect future data fluctuations, while interval forecasts can provide more information for future planning [25]. Currently, interval forecasting is mostly used in the field of load forecasting and has not been used in energy consumption forecasting. Han et al. used kernel density estimation (KDE) for PV power interval forecasting and provided more reliable forecasting results [26]. Zhang et al. used random forest (RF) combined with KDE to forecast short-term electricity load. The results showed that interval prediction reflects the uncertainty of the forecast better than point prediction [27].

There are many factors that affect energy consumption [28]. Khan et al. took into account various factors, such as changes in weather, holidays and weekends [29]. Dai et al. considered influencing factors such as population and GDP [30]. The influencing factors therefore need to be screened before forecasting. The commonly used methods for screening influencing factors are grey relational analysis (GRA) [31], principal component analysis (PCA) [32,33] and the random forest (RF) method [34]. In this paper, the random forest method, which can calculate the importance of features, is chosen to filter variables.

Both to meet the commitments to the international community and to promote China’s own sustainable development, energy conservation and emission reduction policies and measures need to be vigorously pursued. The projection of total energy consumption will facilitate the future planning of primary energy consumption, thus limiting the amount of carbon emissions and reaching the carbon peak target. Reasonable projections can provide a theoretical basis and reference for the government to plan energy use and help China explore a suitable sustainable path. This is of great relevance to China’s future energy development endeavors and policy implementation.

The Energy Production and Consumption Revolution Strategy (2016–2030) states that by 2030, the total energy consumption should be controlled within 6 billion tons of standard coal. In order to accurately forecast China’s total future energy consumption and predict whether China will be able to meet its energy consumption targets by 2030, we established the RF-SSA-SVR-KDE model. The model uses the global search capability of the recently proposed sparrow search algorithm to optimize the SVR, which overcomes the difficulty of setting the SVR parameters when predicting the total energy consumption in China and supports future energy-related planning. Moreover, this paper uses kernel density estimation to make interval forecasts of China’s total energy consumption, which makes the forecast more useful in practice. This paper also forecasts the total energy consumption in China for 2022–2030 and makes relevant suggestions.

2. Methods

2.1. Random Forest

The random forest (RF) algorithm is an ensemble algorithm based on decision trees [35]. Evaluating the importance of the input features is an important function of the random forest algorithm. The specific steps are as follows:

Step 1: Perform bootstrap sampling. $K$ decision trees are generated from $K$ bootstrap sample datasets, with each decision tree generated independently.

Step 2: Let $k = 1$ and train the decision tree $T_{k}$ on the $k$th dataset; then calculate the accuracy $L_{k}$ on the $k$th out-of-bag dataset.

Step 3: Randomly rearrange the values of feature $f$ in the out-of-bag dataset and calculate the accuracy $L_{k}^{f}$.

Step 4: Repeat steps 2 and 3 for the remaining sample datasets, $k = 2, 3, \ldots, K$.

Step 5: The classification accuracy error after feature rearrangement is calculated using the following formula:

$$e_{k}^{f} = L_{k} - L_{k}^{f}, \quad k = 1, 2, \ldots, K \tag{1}$$

Step 6: The degree of influence of feature $f$ on out-of-bag data accuracy can be obtained from step 5:

$$e^{f} = \frac{1}{K} \sum_{k = 1}^{K} e_{k}^{f} \tag{2}$$

The sample variance of the $e_{k}^{f}$ is

$$S^{2} = \frac{1}{K - 1} \sum_{k = 1}^{K} \left( e_{k}^{f} - e^{f} \right)^{2} \tag{3}$$

Step 7: The importance of feature $f$ is calculated as follows:

$$F = \frac{e^{f}}{S} \tag{4}$$

Step 8: Repeat the calculation for every feature to obtain the importance of all features.
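As an illustration of Equations (1)–(4), the following minimal sketch computes the importance of a single feature from per-tree out-of-bag accuracies. The function name and the accuracy values are hypothetical and serve only to show the arithmetic; they are not taken from the paper's experiments.

```python
from statistics import mean, stdev

def permutation_importance(acc, acc_permuted):
    """Importance F of one feature from per-tree out-of-bag accuracies.

    acc[k]          -- L_k, out-of-bag accuracy of tree k
    acc_permuted[k] -- L_k^f, accuracy after rearranging feature f
    """
    # Eq. (1): accuracy drop for every tree
    drops = [l - lf for l, lf in zip(acc, acc_permuted)]
    # Eq. (2): mean drop over the K trees
    e_f = mean(drops)
    # Eq. (3): sample standard deviation S (divisor K - 1)
    s = stdev(drops)
    # Eq. (4): standardized importance
    return e_f / s

# Hypothetical accuracies for K = 4 trees (illustrative numbers only)
acc = [0.90, 0.88, 0.92, 0.91]
acc_permuted = [0.80, 0.82, 0.81, 0.83]
importance = permutation_importance(acc, acc_permuted)
```

Note that the ratio in Equation (4) is undefined when every tree shows exactly the same accuracy drop, because $S$ is then zero.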

2.2. Sparrow Search Algorithm

In the sparrow search algorithm, individuals are divided into discoverers, followers and vigilantes. The discoverers are the better-adapted individuals (those with better fitness), and the followers are the less-adapted ones. Vigilantes are a random selection of individuals from the whole population.

The position update formula for the discoverers is as follows:

$$X_{i, j}^{t + 1} = \begin{cases} X_{i, j}^{t} \exp \left( - \frac{i}{\alpha T_{\max}} \right), & R_{2} < S_{t} \\ X_{i, j}^{t} + Q L, & R_{2} \geq S_{t} \end{cases} \tag{5}$$

where $t$ is the current iteration and $T_{\max}$ is the maximum number of iterations. $X_{i, j}$ is the position of the $i$th sparrow in the $j$th dimension. $\alpha \in (0, 1]$ is a random number. The early warning parameter $R_{2} \in [0, 1]$, and the safety parameter $S_{t}$ is usually 0.5. $Q$ is a random number drawn from a normal distribution, and $L$ is a row vector whose elements are all 1.

The formula for updating the location of followers is as follows:

$$X_{i, j}^{t + 1} = \begin{cases} Q \exp \left( \frac{X_{worst} - X_{i, j}^{t}}{i^{2}} \right), & i > n / 2 \\ X_{p}^{t + 1} + \left| X_{i, j}^{t} - X_{p}^{t + 1} \right| A^{+} L, & \text{otherwise} \end{cases} \tag{6}$$

where $X_{p}$ is the current optimal position, $X_{worst}$ is the current worst position and $n$ is the population size. $A$ is a random row vector with elements of 1 or −1.

Individuals that perform vigilance are called vigilantes. They generally represent 10–20% of the population, and their location is updated as follows:

$$X_{i, j}^{t + 1} = \begin{cases} X_{best}^{t} + \beta \left| X_{i, j}^{t} - X_{best}^{t} \right|, & f_{i} > f_{best} \\ X_{i, j}^{t} + K \left( \frac{X_{i, j}^{t} - X_{worst}^{t}}{(f_{i} - f_{worst}) + \varepsilon} \right), & f_{i} = f_{best} \end{cases} \tag{7}$$

where $\beta$ and $K$ are step control parameters. $X_{best}^{t}$ indicates the current global optimal position. $f_{worst}$, $f_{best}$ and $f_{i}$ denote the fitness of the worst, best and current individuals, respectively. $\varepsilon$ is a very small constant that avoids a zero in the denominator.
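The three update rules in Equations (5)–(7) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the objective is a toy sphere function rather than the SVR error; the population size, group fractions and search bounds are arbitrary; discoverers are taken to be the best-ranked individuals; $\beta$ is drawn from a standard normal distribution and $K$ uniformly from $[-1, 1]$, as in the usual SSA formulation; $A^{+}$ is read as the pseudo-inverse $A^{T}(AA^{T})^{-1}$, which equals $A^{T}$ divided by the dimension when the entries of $A$ are ±1; and positions are clipped to the bounds.

```python
import math
import random

def sphere(x):
    """Toy objective to minimize (an assumption; not the paper's SVR error)."""
    return sum(v * v for v in x)

def ssa_minimize(f, dim=2, n=20, t_max=50, lo=-5.0, hi=5.0,
                 discoverer_frac=0.2, vigilante_frac=0.2, s_t=0.5, seed=0):
    rng = random.Random(seed)
    clip = lambda v: max(lo, min(hi, v))
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    best = min(pop, key=f)[:]
    history = [f(best)]                      # best fitness found so far
    n_disc = max(1, int(discoverer_frac * n))
    for _ in range(t_max):
        pop.sort(key=f)                      # rank individuals, best first
        worst = pop[-1][:]
        r2 = rng.random()                    # early warning value R_2
        # Eq. (5): discoverers
        for i in range(n_disc):
            if r2 < s_t:
                alpha = 1.0 - rng.random()   # random number in (0, 1]
                scale = math.exp(-(i + 1) / (alpha * t_max))
                pop[i] = [clip(v * scale) for v in pop[i]]
            else:
                q = rng.gauss(0, 1)          # Q * L adds q to every dimension
                pop[i] = [clip(v + q) for v in pop[i]]
        x_p = min(pop[:n_disc], key=f)[:]    # best discoverer position
        # Eq. (6): followers
        for i in range(n_disc, n):
            rank = i + 1
            if rank > n / 2:
                q = rng.gauss(0, 1)
                pop[i] = [clip(q * math.exp((w - v) / rank ** 2))
                          for v, w in zip(pop[i], worst)]
            else:
                a = [rng.choice((-1, 1)) for _ in range(dim)]
                # |X - X_p| A^+ L reduces to one scalar step for all dimensions
                step = sum(abs(v - p) * s
                           for v, p, s in zip(pop[i], x_p, a)) / dim
                pop[i] = [clip(p + step) for p in x_p]
        # Eq. (7): vigilantes, chosen at random from the whole population
        fits = [f(x) for x in pop]
        f_best, f_worst = min(fits), max(fits)
        x_best = pop[fits.index(f_best)][:]
        x_worst = pop[fits.index(f_worst)][:]
        for i in rng.sample(range(n), max(1, int(vigilante_frac * n))):
            if fits[i] > f_best:
                beta = rng.gauss(0, 1)
                pop[i] = [clip(b + beta * abs(v - b))
                          for v, b in zip(pop[i], x_best)]
            else:
                k = rng.uniform(-1, 1)
                denom = (fits[i] - f_worst) + 1e-12
                pop[i] = [clip(v + k * (v - w) / denom)
                          for v, w in zip(pop[i], x_worst)]
        candidate = min(pop, key=f)
        if f(candidate) < f(best):           # keep the best position seen
            best = candidate[:]
        history.append(f(best))
    return best, history

best, history = ssa_minimize(sphere)
```

When the algorithm is used to tune an SVR, `f` would instead map a candidate parameter vector to a validation error, which is the role the SSA plays in the model described in this paper.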

2.3. Support Vector Regression (SVR)

SVR is an extension of the support vector machine, a machine learning method grounded in statistical learning theory, and was proposed by Drucker et al. It uses a kernel function to map the input data to a high-dimensional space through a nonlinear transformation.

Commonly used kernel functions in the SVR model include the sigmoid kernel, the polynomial kernel and the radial basis function (RBF) kernel. Because the RBF kernel corresponds to a nonlinear mapping and is widely used for nonlinear problems, it is chosen in this paper.
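For reference, the RBF kernel is commonly written as $K(x, z) = \exp(-\gamma \lVert x - z \rVert^{2})$, where $\gamma$ is the kernel width parameter. The paper does not print this formula, so the standard form is assumed in the short sketch below; the function name is illustrative.

```python
import math

def rbf_kernel(x, z, gamma):
    """Standard RBF kernel exp(-gamma * ||x - z||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)   # identical inputs
apart = rbf_kernel([0.0], [1.0], gamma=1.0)            # inputs one unit apart
```

The kernel equals 1 for identical inputs and decays toward 0 as the inputs move apart; a larger $\gamma$ makes the decay faster, which is why its value has to be tuned together with the other SVR parameters.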

2.4. Kernel Density Estimation

Non-parametric kernel density estimation (KDE) is a method that does not require prior assumptions about the distribution. The equation for the probability density function obtained using KDE is as follows:

$$\hat{f} (e) = \frac{1}{N h} \sum_{i = 1}^{N} K \left( h^{- 1} (e - e_{i}) \right) \tag{8}$$

where $N$ is the number of samples, $h$ is the window width, $K (u)$ is the kernel function and $u = h^{- 1} (e - e_{i})$. $e_{i}$ is the $i$th sample value of the prediction error. Common choices for the kernel function $K (u)$ include the Gaussian kernel, the triangular kernel and the Epanechnikov kernel. The window width $h$ is the main parameter that affects the smoothness of the KDE.
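Equation (8) with a Gaussian kernel can be sketched as follows. The error samples and the window width are hypothetical, and reading the interval bounds off the quantiles of the estimated distribution is an assumption about how an error interval can be derived from the density, not a description of the paper's exact procedure.

```python
import math

def gaussian_kernel(u):
    """Standard normal density, used as K(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(e, samples, h):
    """Eq. (8): estimated density at e for window width h."""
    n = len(samples)
    return sum(gaussian_kernel((e - e_i) / h) for e_i in samples) / (n * h)

def kde_cdf(e, samples, h):
    """Cumulative distribution of the Gaussian KDE (average of normal CDFs)."""
    return sum(0.5 * (1.0 + math.erf((e - e_i) / (h * math.sqrt(2.0))))
               for e_i in samples) / len(samples)

def kde_quantile(p, samples, h, tol=1e-9):
    """Find e with kde_cdf(e) = p by bisection (the CDF is increasing)."""
    lo = min(samples) - 10.0 * h
    hi = max(samples) + 10.0 * h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical relative errors and window width (illustrative only)
errors = [-0.03, -0.01, 0.0, 0.01, 0.02, 0.04]
h = 0.02
lower = kde_quantile(0.05, errors, h)   # lower error bound, 90% interval
upper = kde_quantile(0.95, errors, h)   # upper error bound, 90% interval
```

Adding `lower` and `upper` to a point forecast (after undoing any scaling of the errors) would give one possible prediction interval at the 90% nominal level.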

2.5. Evaluation Indicators

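In order to compare the predictive performance of the models fairly and objectively, four error evaluation indicators have been chosen in this paper: the mean absolute percentage error (MAPE), the root mean square error (RMSE), the mean absolute error (MAE) and the mean square error (MSE).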
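The paper does not print the formulas for these four indicators, so the sketch below assumes their standard definitions, with MAPE expressed in percent. The function name and the sample values are illustrative.

```python
import math

def point_metrics(y_true, y_pred):
    """Standard MAPE (in %), RMSE, MAE and MSE for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides each error by the true value, so true values must be non-zero
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"MAPE": mape, "RMSE": rmse, "MAE": mae, "MSE": mse}

# Hypothetical true and predicted values
metrics = point_metrics([100.0, 200.0], [110.0, 190.0])
```

For these two sample points the absolute errors are both 10, so MAE and RMSE coincide, while MAPE weights the error on the smaller true value more heavily.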

The PICP (probability of prediction interval (PI) coverage) is the rate at which the true value falls within the forecast interval. The PIAW (PI average width) indicates the average width of the prediction interval.

$$I_{PICP} = \frac{100 \%}{N} \sum_{i = 1}^{N} c_{i}^{(1 - \alpha)} \tag{9}$$

$$I_{PIAW} = \frac{1}{N} \sum_{i = 1}^{N} \left[ U_{i} - L_{i} \right] \tag{10}$$

When $y_{i} \in [L_{i}, U_{i}]$, $c_{i} = 1$; otherwise, $c_{i} = 0$. $L_{i}$ and $U_{i}$ are the lower and upper bounds of the prediction interval for the $i$th data point, respectively. Moreover, $100 (1 - \alpha) \%$ is the nominal confidence level of the prediction interval.
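Equations (9) and (10) translate directly into code. The following sketch uses made-up observations and interval bounds purely to show the calculation.

```python
def picp_piaw(y, lower, upper):
    """Return (PICP in %, PIAW) for observations y and interval bounds."""
    n = len(y)
    # c_i = 1 when the observation lies inside [L_i, U_i], otherwise 0
    covered = sum(1 for y_i, l_i, u_i in zip(y, lower, upper)
                  if l_i <= y_i <= u_i)
    picp = 100.0 * covered / n
    piaw = sum(u_i - l_i for l_i, u_i in zip(lower, upper)) / n
    return picp, piaw

# Hypothetical observations and interval bounds
y = [10.0, 12.0, 15.0, 20.0]
lower = [9.0, 12.5, 14.0, 18.0]
upper = [11.0, 13.5, 16.0, 21.0]
picp, piaw = picp_piaw(y, lower, upper)   # 3 of 4 points are covered
```

A good interval forecast combines a PICP at or above the nominal confidence level with a PIAW that is as small as possible, since a very wide interval achieves high coverage trivially.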

2.6. RF-SSA-SVR-KDE Model Construction

In this paper, an RF-SSA-SVR-KDE model was constructed, as shown in Figure 1, and the specific steps are shown below:

(1): Collect the original data and pre-process the data.
(2): Screen the influencing factors using random forest.
(3): Use the SSA algorithm to find the optimal parameters in the SVR algorithm and obtain the SSA-SVR model.
(4): Calculate the error value and input the SSA-SVR model results and the error value into the KDE model to obtain the interval prediction results.
(5): End.

3. Results

3.1. Data Preprocessing

To eliminate the effect of order-of-magnitude mismatch, the input and output data need to be normalized before model training. We choose z-score standardization for data preprocessing:

$$X^{*} = \frac{x - \mu}{\sigma} \tag{11}$$

where $\mu$ is the mean of all sample data, and $\sigma$ is the standard deviation of all sample data.
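A minimal sketch of Equation (11) is given below. The text does not say whether $\sigma$ is the population or the sample standard deviation; the population form is assumed here, and the input values are hypothetical.

```python
from statistics import mean, pstdev

def standardize(values):
    """Eq. (11): return z-scores together with the mean and standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)   # population standard deviation (an assumption)
    return [(v - mu) / sigma for v in values], mu, sigma

def inverse_standardize(z, mu, sigma):
    """Map standardized values back to the original scale."""
    return [v * sigma + mu for v in z]

# Hypothetical series
z, mu, sigma = standardize([2.0, 4.0, 6.0, 8.0])
restored = inverse_standardize(z, mu, sigma)
```

Keeping `mu` and `sigma` matters in practice because the model output is produced on the standardized scale and has to be transformed back before it can be compared with the energy consumption target.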

3.2. Influencing Factor Screening

The experiments in this paper were run on a Lenovo computer (Beijing, China) equipped with an AMD Ryzen 5 3600 processor (6 cores, 3.6 GHz) and an Nvidia RTX 3060 Ti GPU with 8 GB of video memory. Experiments were conducted using PyCharm version 2021.7.15 Professional in the Python 3.6.5 environment. Matlab 2019b was also used for the experiments.

In this paper, the specific influencing factors are shown in Table 1. Factors from 1990–2021 are selected from the World Bank, National Bureau of Statistics of China, China Carbon Accounting Database and BP Energy Statistical Yearbook. The energy consumption in China is taken as a factor of China’s energy consumption.

In particular, GDP can reflect the macroeconomic level of China, and the economic level can reflect the intensity of demand for energy. The level of population consumption and total import and export can reflect the social level and foreign trade level of the economy, showing the stage of development of China’s economic level, which has positive significance for the demand of energy. The structure of energy consumption visually reflects the restructuring of energy, the gradual increase in demand for new and clean energy sources and the rapid development of new energy-related technologies. The size of the population intuitively affects the demand for energy. The urbanization rate reflects the level of modernization, and the increase in the level of modernization intuitively affects the total energy consumption. Most of the energy consumed generates carbon emissions. With the introduction of China’s “double carbon” target, strict controls on carbon emissions and the high cost of technologies such as carbon capture will drive down primary energy consumption, while carbon reduction technologies will continue to be developed. China’s traditional coal-fired power producers still have a place, and the power generation industry is leading the industry in terms of energy consumption, which is closely related to energy consumption. At the same time, the introduction of the double carbon target has allowed more energy demand to be converted into electricity demand, such as electric vehicles, so power generation has multiple impacts on total energy consumption.

To determine the model input values, random forest is used in this paper, and the results are shown in Table 1 and Figure 2.

As Table 1 and Figure 2 show, population has the largest influence on energy consumption, followed by GDP, while the influence of energy consumption structure and urbanization rate is smaller. Therefore, we keep the influencing factors whose contribution exceeds 0.1 and screen out the two factors of energy consumption structure and urbanization rate.

3.3. Model Comparison and Result Analysis

Based on the influence factor screening in Section 3.2, we apply SSA-SVR for prediction. To prove that our model is more effective, we choose PSO-SVR, SVR and BP models for comparison. Moreover, in this paper, the dataset is divided into training and testing sets according to the ratio of 8:2.

The range of SSA-SVR parameters in this paper is set in Table 2.

The PSO-SVR parameters are set as follows: population size is 50, the maximum number of iterations is 100, the weight ω is 0.2 and the particle position update coefficients are all 2.0. The SVR parameters are set the same as SSA-SVR.

The BP network has two hidden layers and one output layer; the first hidden layer has six neurons, and the second hidden layer has four neurons.

The four model fitting plots are shown in Figure 3. As can be seen from the figure, the RF-SSA-SVR model constructed in this paper fits the actual values the best.

In order to evaluate the performance of the models more objectively and fairly, MAPE, RMSE, MAE and MSE are selected in this paper. As seen in Table 3, RF-SSA-SVR has the smallest values of MAPE, RMSE, MAE and MSE, followed by RF-PSO-SVR, while RF-SVR, whose parameters have not been optimized, has larger errors. The fit of RF-BP is in turn inferior to that of RF-SVR, which indicates the advantage of the SVR model itself.

3.4. Kernel Density Interval Prediction

This section performs KDE interval estimation prediction based on the absolute errors. We choose the error data of the first 22 years as the basis and the data of the last 10 years to verify the results of kernel density estimation. The error calculation formula is shown below.

$$x_{\mathrm{Error}} = \frac{x_{F} - x_{T}}{x_{\min}} \tag{12}$$

where $x_{F}$ is the predicted value, and $x_{T}$ is the true value.

As shown in Figure 4, the curve obtained with the Gaussian function is flatter, so we choose the Gaussian function for kernel density estimation. The default optimal window width is used.

According to the calculation, the results of the error interval are shown in Table 4.

The error interval is calculated with the data of the latter 10 years, and the interval indicator can be calculated by comparing it with the actual value. The specific results are shown in the following Table 5.

As seen from the indicators given in the table, the PICP value is 100% at a 90% confidence level. At an 80% confidence level, the PICP value is 100%. It can be seen that interval estimation is more accurate and realistic for energy consumption forecasting.

4. Forecast of China’s Total Energy Consumption from 2022–2030

The accuracy of the model has been tested above, and we next forecast the future total energy consumption in China. For the influencing factors, we use the gray forecasting method to make predictions. The specific prediction results are shown in Table 6.

Using the interval projections presented in Section 2.4, we calculate the error values for 1990–2021 for kernel density estimation, and the interval results are shown in Table 7.

The results in the table show that China’s future energy consumption continues to trend upward, and the total energy consumption in the year 2030 also exceeds the mark of 7 billion tons of standard coal. The Energy Production and Consumption Revolution Strategy (2016–2030) states that by 2030, the total energy consumption should be controlled within 6 billion tons of standard coal. In addition, according to the table, the total energy consumption will already exceed 6 billion tons of standard coal in 2024. Therefore, relevant agencies should make reasonable plans to accomplish the target as soon as possible.

5. Discussion

We have made projections for China’s future energy consumption. The results show that the RF-SSA-SVR model constructed in the article has the best forecasting results. Compared with RF-PSO-SVR, this model improves MAPE by 11.37%, RMSE by 8.6%, MAE by 11.37% and MSE by 16.5%, demonstrating the effectiveness of SSA compared with PSO for SVR optimization. Compared with RF-SVR, the model in this paper improves MAPE by 45%, RMSE by 38.8%, MAE by 45% and MSE by 62.5%, which shows the effectiveness and significance of optimizing the parameters with SSA.
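The reported MSE improvements are consistent with the reported RMSE improvements. Because MSE is the square of RMSE on the same test set, a relative RMSE reduction of $r$ implies a relative MSE reduction of $1 - (1 - r)^{2}$. The short check below applies this identity to the two RMSE figures quoted above; the function name is illustrative.

```python
def mse_improvement(rmse_improvement):
    """Relative MSE reduction implied by a relative RMSE reduction r."""
    return 1.0 - (1.0 - rmse_improvement) ** 2

vs_pso = mse_improvement(0.086)   # comparison with RF-PSO-SVR
vs_svr = mse_improvement(0.388)   # comparison with RF-SVR
```

The two values come out at roughly 0.165 and 0.625, which matches the 16.5% and 62.5% MSE improvements reported in the text.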

In addition to this, we have performed kernel density estimation interval prediction based on point prediction. Based on the errors in the historical data, the interval model we developed has a high coverage, which is conducive to improving the practicality and relevance of energy consumption forecasting.

The final projection results indicate that China may not meet the 2030 energy consumption target on time. Therefore, relevant organizations should be alert and adjust their strategies to meet the target.

6. Conclusions and Recommendation

This paper proposes an RF-SSA-SVR prediction model, which has favorable forecasting performance and can more accurately describe the future energy consumption trend in China. At the same time, this paper performs interval prediction based on the point prediction of the model, which enhances the realistic significance of the model prediction.

The RF-SSA-SVR-KDE model is applied to forecast the total energy consumption in China, and good results are obtained. The main conclusions of this paper are as follows:

(1): This paper uses the sparrow algorithm to optimize C and gamma in SVR and achieves good results. The prediction performance is more significant compared with the results of PSO-SVR, SVR and BP.
(2): Interval prediction based on kernel density estimation is performed based on the prediction of RF-SSA-SVR. The results show that the interval prediction is good and more realistic.
(3): In this paper, the total energy consumption in China from 2022 to 2030 is forecasted. The results show that China’s energy consumption will gradually increase, while China may not be able to accomplish the target according to the plan. Therefore, policy adjustments by relevant agencies are needed.

Because of the data projected in this paper, China may not meet its target on time. In order to achieve the “carbon peak” target, China needs to limit the total energy consumption and thus the total carbon emissions:

(1): To guide the concept of energy saving and low carbon and encourage distributed power generation. As we can see from the text, the total energy consumption and the population are closely related. First, in terms of total population, China is currently in a low population growth trend, which means that the growth rate of total energy consumption will decrease. However, the per capita energy consumption is still rising at present. In such a state, residents should be guided by low-carbon concepts to slow down the growth rate of per capita energy consumption. Secondly, according to the data of the seventh census, the total population increases, and the number of people per household decreases, which means that the number of households increases. This will lead to a rise in carbon emissions from home appliances, transportation, etc. Therefore, measures such as oil-to-gas and oil-to-electricity conversion should be promoted, and the development of new energy vehicles should be encouraged through relevant policy subsidies to reduce carbon emissions.
(2): Reduce primary energy consumption and increase the application of new energy. In order to develop a low-carbon economy, China’s 14th Five-Year Plan has taken the “double carbon” target as an important goal in the battle against pollution and incorporated it into the overall layout of ecological civilization construction. As the largest share of total energy consumption is primary energy, the application of new energy should be increased to reduce the proportion of primary energy consumption. Meanwhile, according to the regional distribution of population in the seventh census, the proportion of population in the east and west of China has increased, while that in the central and northeastern regions has decreased. The pressure on energy transmission has increased, and the cost of energy has thus increased, which further unbalances energy production and consumption in China, so the development of distributed wind, solar and biomass power generation should also be encouraged to support low-carbon development.
(3): Innovate new energy economic development system and improve management mechanism. Taking the power industry as an example, the power industry is the core of energy transformation, creating a good atmosphere and space for new energy development, increasing government subsidies to ease the cost of new energy research and development and establishing a good market management system, which can ensure the healthy development of new energy enterprises. Therefore, it is suggested that the relevant departments should speed up the favorable policies for new energy development and accelerate the construction of related market mechanisms, such as the electricity market, carbon market and green certificate market mechanisms. In addition, it should increase the policy support for the stable operation of a high proportion of renewable energy into the grid, increase the financial investment in the research and development, demonstration and promotion and application of energy storage technology and study and introduce the market mechanism related to energy storage as soon as possible to realize the large-scale development of energy storage.
(4): Guide the development of the digital economy and realize the digital transformation of enterprises. Digital technology should be used to open up the supply chain of enterprises and to reduce the energy consumption and energy waste of each link. For the whole country, the construction of a new generation of information infrastructure should be promoted to strengthen the emission-reducing effect of the digital economy. At the same time, we should increase the investment in innovation and R&D to reduce the consumption of manufacturing resources. For regions, differentiated governance strategies should be adopted according to the development level of each region. After considering the resource endowment levels of different regions and the development level of the digital economy, the pace of digital economy development in each region should be adjusted to break regional barriers and improve the synergy of the digital economy.

Author Contributions

Conceptualization, X.C.; methodology, X.C. and D.W.; software, X.G.; validation, X.G.; resources, D.N.; writing—original draft preparation, X.C.; writing—review and editing, D.N. and X.X. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Key R&D Program of China, Ministry of Science and Technology of the People’s Republic of China (2020YFB1707801).

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: http://www.stats.gov.cn/ (accessed on 13 July 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

Qiu, S.; Lei, T.; Yao, Y.; Wu, J.; Bi, S. Impact of high-quality-development strategy on energy demand of east china. Energy Strateg. Rev. 2021, 38, 100699. [Google Scholar] [CrossRef]
Pan, C.; Li, S.; Liu, Q. Exploring the drivers of carbon emissions in Chinese provinces from the perspective of consumption. Econ. Manage. 2022, 36, 1–9.
Qiu, S.; Lei, T.; Wu, J.; Bi, S. Energy demand and supply planning of China through 2060. Energy 2021, 234, 121193.
Wang, Z.; Jia, X. Analysis of energy consumption structure on CO₂ emission and economic sustainable growth. Energy Rep. 2022, 8, 1667–1679.
Yuan, C.; Liu, S.; Fang, Z. Comparison of China’s primary energy consumption forecasting by using ARIMA (the autoregressive integrated moving average) model and GM(1,1) model. Energy 2016, 100, 384–390.
Ediger, V.S.; Akar, S. ARIMA forecasting of primary energy demand by fuel in Turkey. Energy Policy 2007, 35, 1701–1708.
Xie, H.; Wu, L.; Zheng, D. China’s energy consumption and coal demand forecast in 2025. J. Coal. 2019, 44, 1949–1960.
Meng, M.; Wang, L.; Shang, W. Decomposition and forecasting analysis of China’s household electricity consumption using three-dimensional decomposition and hybrid trend extrapolation models. Energy 2018, 165, 143–152.
Tan, Z.; De, G.; Li, M.; Lin, H.; Yang, S.; Tan, Q. Combined electricity-heat-cooling-gas load forecasting model for integrated energy system based on multi-task learning and least square support vector machine. J. Clean. Prod. 2019, 248, 119252.
Dietrich, B.; Walther, J.; Weigold, M.; Abele, E. Machine learning based very short term load forecasting of machine tools. Appl. Energ. 2020, 276, 115440.
Khan, P.W.; Byun, Y.C.; Lee, S.J.; Park, N. Machine Learning Based Hybrid System for Imputation and Efficient Energy Demand Forecasting. Energies 2020, 13, 2681.
Wang, H.; Jin, T.; Wang, H.; Su, D. Application of IEHO–BP neural network in forecasting building cooling and heating load. Energy Rep. 2022, 8, 455–465.
Izadyar, N.; Ong, H.C.; Shamshirband, S.; Ghadamian, H.; Tong, C.W. Intelligent forecasting of residential heating demand for the district heating system based on the monthly overall natural gas consumption. Energy Build. 2015, 104, 208–214.
Ding, Y.; Chen, Z.; Zhang, H.; Wang, X.; Guo, Y. A short-term wind power prediction model based on CEEMD and WOA-KELM. Renew. Energ. 2022, 189, 188–198.
Li, C.; Tao, Y.; Ao, W.; Yang, S.; Bai, Y. Improving forecasting accuracy of daily enterprise electricity consumption using a random forest based on ensemble empirical mode decomposition. Energy 2018, 165, 1220–1227.
Aasim; Singh, S.N.; Mohapatra, A. Data driven day-ahead electrical load forecasting through repeated wavelet transform assisted SVM model. Appl. Soft Comput. 2021, 16, 107730.
Drucker, H.; Burges, C.; Kaufman, L.; Smola, A.; Vapnik, V. Support vector regression machines. Adv. Neural. Inform. Pr. Sys. 2003, 9, 779–784.
Guo, L.; Fang, W.; Zhao, Q.; Wang, X. The hybrid PROPHET-SVR approach for forecasting product time series demand with seasonality. Comput. Ind. Eng. 2021, 161, 107598.
Liu, H.; Tang, Y.; Pu, Y.; Mei, F.; Sidorov, D. Short-term Load Forecasting of Multi-Energy in Integrated Energy System Based on Multivariate Phase Space Reconstruction and Support Vector Regression Mode. Electr. Pow. Syst. Res. 2022, 210, 108066.
Wang, X.; Wang, Y. A Hybrid Model of EMD and PSO-SVR for Short-Term Load Forecasting in Residential Quarters. Math. Probl. Eng. 2016, 2016, 1–10.
Niu, D.; Ji, Z.; Li, W.; Xu, X.; Liu, D. Research and application of a hybrid model for mid-term power demand forecasting based on secondary decomposition and interval optimization. Energy 2021, 234, 121145.
Li, L.; Cen, Z.; Tseng, M.; Shen, Q.; Ali, M. Improving short-term wind power prediction using hybrid improved cuckoo search arithmetic - support vector regression machine. J. Clean Prod. 2021, 279, 123739.
Liu, Y.; Guan, S.; Zhao, H.; Sha, Y. Prediction of rifling cutting force by SVR based on GA optimization. J. Arms. Eq. Eng. 2022, 46, 1–8.
Xue, J.; Shen, B. A novel swarm intelligence optimization approach: Sparrow search algorithm. Syst. Sci. Control. Eng. 2020, 8, 22–34.
Li, Y.; Tong, Z.; Tong, S.; Wester, D.A. Data-driven interval forecasting model for building energy prediction using attention-based LSTM and fuzzy information granulation. Sustain. Cities Soc. 2021, 76, 103481.
Han, Y.; Wang, N.; Ma, M.; Zhou, H.; Dai, S.; Zhu, H. A PV power interval forecasting based on seasonal model and nonparametric estimation algorithm. Sol. Energy 2019, 184, 515–526.
Zhang, L.; Lu, S.; Ding, Y.; Duan, D.; Wang, Y.; Wang, P.; Yang, L.; Fan, H.; Cheng, Y. Probability prediction of short-term user-level load based on random forest and kernel density estimation. Energy Rep. 2022, 8, 1130–1138.
Peng, L.; Wang, L.; Xia, D.; Gan, Q. Effective energy consumption forecasting using empirical wavelet transform and long short-term memory. Energy 2022, 238, 121756.
Khan, P.W.; Kim, Y.; Byun, Y.C.; Lee, S.J. Influencing factors evaluation of machine learning-based energy consumption prediction. Energies 2021, 14, 7167.
Dai, S.; Niu, D.; Li, Y. Forecasting of Energy Consumption in China Based on Ensemble Empirical Mode Decomposition and Least Squares Support Vector Machine Optimized by Improved Shuffled Frog Leaping Algorithm. Appl. Sci. 2018, 8, 678.
Huang, Y.; Shen, L.; Liu, H. Grey relational analysis, principal component analysis and forecasting of carbon emissions based on long short-term memory in China. J. Clean Prod. 2019, 209, 415–423.
Bhowmik, C.; Bhowmik, S.; Ray, A. Social acceptance of green energy determinants using principal component analysis. Energy 2018, 160, 1030–1046.
Zhang, C.; Tian, Y.; Fan, Z. Forecasting sales using online review and search engine data: A method based on PCA–DSFOA–BPNN. Int. J. Forecast. 2022, 38, 1005–1024.
Niu, D.; Wang, K.; Sun, L.; Wu, J.; Xu, X. Short-term photovoltaic power generation forecasting based on random forest feature selection and CEEMD: A case study. Appl. Soft. Comput. 2020, 93, 106389.
Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.

Figure 1. RF-SSA-SVR-KDE model.

Figure 2. Random forest results.

Figure 3. Prediction comparison chart.

Figure 4. Kernel density estimation results.

Table 1. Table of random forest results.

Impact Factor	Degree of Importance
GDP (billion CNY)	0.181817
Resident consumption level (CNY)	0.118609
Total import and export (billion CNY)	0.153113
Energy consumption structure (%)	0.016823
Population (million people)	0.182949
Urbanization rate (%)	0.073632
Carbon emissions (billion tons)	0.127944
Electricity generation (billion kilowatt hours)	0.145114
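
The importance scores in Table 1 are easier to compare once sorted. The short sketch below simply ranks the values copied from the table and sums them; the dictionary name `importance` and the shortened factor labels are illustrative choices, not names used by the authors.

```python
# Importance scores copied from Table 1 (factor -> degree of importance).
# Units are dropped from the labels for brevity.
importance = {
    "GDP": 0.181817,
    "Resident consumption level": 0.118609,
    "Total import and export": 0.153113,
    "Energy consumption structure": 0.016823,
    "Population": 0.182949,
    "Urbanization rate": 0.073632,
    "Carbon emissions": 0.127944,
    "Electricity generation": 0.145114,
}

# Sort factors from most to least important.
ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)

# The scores should add up to roughly 1, up to rounding in the table.
total = sum(importance.values())

for name, score in ranked:
    print(f"{name:30s} {score:.6f}")
print(f"sum of scores: {total:.6f}")
```

Read this way, population and GDP carry the largest scores and energy consumption structure the smallest, with the scores summing to approximately 1.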

Table 2. Related parameter settings.

Parameter	Value
Sparrow population size	40
Number of iterations	100
The range of values of C	[0, 100]
The range of values of gamma	[0, 10]

Table 3. Evaluation indicators.

Model	MAPE	RMSE	MAE	MSE
RF-SSA-SVR	0.025968	8786.966	7264.817	7.72 × 10⁷
RF-PSO-SVR	0.029298	9614.264	8196.356	9.24 × 10⁷
RF-SVR	0.047176	14,358.13	13,198.07	2.06 × 10⁸
RF-BP	0.111776	44,779.05	31,270.56	2.01 × 10⁹
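
As a reading aid for Table 3, the four indicators can be sketched in a few lines, assuming the usual textbook definitions (MAPE reported as a fraction rather than a percentage). The function names and the toy `actual`/`predicted` series are hypothetical; only the RMSE and MSE figures in `table3` are taken from the table, and they are used to check that each reported MSE is approximately the square of the reported RMSE.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(actual, predicted))

# Hypothetical toy series, for illustration only (not the paper's data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(actual, predicted), mae(actual, predicted),
      mse(actual, predicted), rmse(actual, predicted))

# (RMSE, MSE) pairs copied from Table 3.
table3 = {
    "RF-SSA-SVR": (8786.966, 7.72e7),
    "RF-PSO-SVR": (9614.264, 9.24e7),
    "RF-SVR": (14358.13, 2.06e8),
    "RF-BP": (44779.05, 2.01e9),
}

# Relative gap between RMSE squared and the rounded MSE for each model.
gaps = {m: abs(r ** 2 - s) / s for m, (r, s) in table3.items()}
for model, gap in gaps.items():
    print(f"{model}: relative gap {gap:.4%}")
```

The gaps stay well below 1% for all four models, which is consistent with the MSE column being the rounded square of the RMSE column.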

Table 4. Error interval of kernel density estimation.

Confidence Level (%)	Error Interval
90	[−0.17, 0.2]
80	[−0.13, 0.17]

Table 5. Evaluation indicators of kernel density estimation.

Confidence Level (%)	PICP	PIAW
90	100%	36,520.11
80	100%	29,610.9
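
The two indicators in Table 5 can be sketched as follows, assuming the common definitions: PICP as the share of observed values that fall inside their prediction interval, and PIAW as the mean interval width. The function names and the toy numbers are hypothetical and are not the paper's data.

```python
def picp(actual, lower, upper):
    """Prediction interval coverage probability: fraction of observations
    lying inside [lower, upper]."""
    inside = sum(1 for y, lo, hi in zip(actual, lower, upper) if lo <= y <= hi)
    return inside / len(actual)

def piaw(lower, upper):
    """Prediction interval average width."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

# Hypothetical toy data, for illustration only.
actual = [10.0, 20.0, 30.0, 40.0]
lower = [8.0, 19.0, 31.0, 35.0]
upper = [12.0, 22.0, 33.0, 45.0]

coverage = picp(actual, lower, upper)  # third point falls outside its interval
width = piaw(lower, upper)
print(f"PICP = {coverage:.2%}, PIAW = {width}")
```

Under these definitions, a PICP of 100% as in Table 5 means every observed value fell inside its interval, and a smaller PIAW at the same coverage indicates a tighter interval.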

Table 6. China’s total energy consumption forecast from 2022–2030.

Year	Total Energy Consumption (Million Tons of Standard Coal)
2022	549,271.5306
2023	568,308.1529
2024	587,590.0974
2025	607,117.3642
2026	626,889.9529
2027	646,907.8639
2028	667,171.0967
2029	687,679.6522
2030	708,433.5293

Table 7. Forecast range of total energy consumption in China from 2022 to 2030 (million tons of standard coal).

Year	Upper Limit	Lower Limit
2022	564,570.4956	531,998.5056
2023	583,607.1179	551,035.1279
2024	602,889.0624	570,317.0724
2025	622,416.3292	589,844.3392
2026	642,188.9179	609,616.9279
2027	662,206.8289	629,634.8389
2028	682,470.0617	649,898.0717
2029	702,978.6172	670,406.6272
2030	723,732.4943	691,160.5043
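
Tables 6 and 7 can be cross-checked with simple arithmetic. The sketch below subtracts the point forecasts in Table 6 from the bounds in Table 7; the variable names are illustrative, and all numbers are copied from the two tables.

```python
# Point forecasts from Table 6, keyed by year.
point = {
    2022: 549271.5306, 2023: 568308.1529, 2024: 587590.0974,
    2025: 607117.3642, 2026: 626889.9529, 2027: 646907.8639,
    2028: 667171.0967, 2029: 687679.6522, 2030: 708433.5293,
}
# Upper and lower limits from Table 7.
upper = {
    2022: 564570.4956, 2023: 583607.1179, 2024: 602889.0624,
    2025: 622416.3292, 2026: 642188.9179, 2027: 662206.8289,
    2028: 682470.0617, 2029: 702978.6172, 2030: 723732.4943,
}
lower = {
    2022: 531998.5056, 2023: 551035.1279, 2024: 570317.0724,
    2025: 589844.3392, 2026: 609616.9279, 2027: 629634.8389,
    2028: 649898.0717, 2029: 670406.6272, 2030: 691160.5043,
}

years = sorted(point)
up_offsets = [upper[y] - point[y] for y in years]    # upper limit minus forecast
down_offsets = [point[y] - lower[y] for y in years]  # forecast minus lower limit
widths = [upper[y] - lower[y] for y in years]        # full interval width

for y, u, d, w in zip(years, up_offsets, down_offsets, widths):
    print(f"{y}: +{u:.3f} / -{d:.3f} (width {w:.2f})")
```

In every year the upper limit sits 15,298.965 above the point forecast and the lower limit 17,273.025 below it, so the tabulated range is the point forecast shifted by a fixed error band of width 32,571.99.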

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

