1. Introduction
As the world economy advances, environmental and climate-related issues have become shared challenges among nations, and emissions reduction has become a key means for governments to pursue global climate goals [1]. The Kyoto Protocol formulated specific emission reduction plans and timetables based on the situation of each country [2]. The European Union Emissions Trading Scheme (EU ETS) was launched to achieve these emissions reduction goals. It allocates carbon trading allowances to emitters under regulation, and emitters that exceed their quota are obliged to buy emission rights, through the carbon trading market, from those that stay below theirs [3].
China, one of the top greenhouse gas emitters, has pledged to achieve a carbon emissions peak before 2030 and carbon neutrality before 2060 for sustainable development [4]. Although China initiated carbon pilot programs in eight cities in 2011, these programs are still under development, and the market mechanism needs further improvement.
Carbon price fluctuations significantly impact the development of industries, energy, agriculture, and stock investments [5]. Accurately predicting the regional carbon price in China can help reduce carbon dioxide emissions and provide an essential basis for revising carbon pricing strategies, thus regulating carbon trading markets and assisting investors in avoiding investment risks. The constant refinement and elucidation of China’s carbon trading market mechanism make studying carbon price prediction in China imperative.
Research methods for carbon price prediction mostly build models on historical data, but accurate prediction is difficult because carbon prices are nonlinear, non-stationary, and highly complex. Recent carbon price prediction research falls into two main categories.
The first primary strategy is to establish models based on the carbon price time series itself. Tsai proposed an efficient and accurate carbon price prediction system based on a radial basis function neural network (RBFNN) [6]. Huang and Liu also proposed an RBFNN-based model, in which the input layer consists of multiple signal source nodes, the hidden layer is the core of the RBFNN, and the output layer responds to the input pattern [7]. Zhu and Wang studied a multi-scale nonlinear integrated learning paradigm for carbon price prediction, which could accurately predict complex carbon price fluctuations and improve prediction accuracy and statistical efficiency in a carbon market with non-stationary and nonlinear characteristics [8]. Zhu et al. proposed several new hybrid prediction models and demonstrated that the autoregressive integrated moving average (ARIMA) model was suitable for capturing linear features [9]. Zhao et al. proposed a new mixed-frequency data sampling (MIDAS) model to improve predictive performance [10]. Fan, working from a chaos perspective, established a multilayer perceptron (MLP) model to capture the nonlinear part of carbon prices [11]. Chai et al. proposed a support vector machine (SVM) model combined with fuzzy information granulation (FIG) and showed, through a study of the EU ETS, that the FIG-SVM model performed better than other methods [12]. Zhang et al. proposed a sequence-to-sequence deep neural network model with temporal convolution to predict carbon prices, which performed significantly better than traditional prediction models in prediction ability and robustness [13]. Yang et al. proposed an integrated prediction model that included feature extraction techniques, a bi-directional LSTM, a CNN, and extreme learning machines (ELMs). The model combined the advantages of each sub-model and provided a more effective and stable prediction of carbon prices [14]. To overcome the limitations of ELMs, Hao et al. developed an optimal kernel-based ELM with good generalization and stability based on the chaos sine cosine algorithm [15].
With the development of digital signal processing technology, its feature extraction methods have been applied in the prediction field through decomposition and reconstruction of the original sequence. Wang proposed a new integrated prediction system for carbon price prediction that uses data decomposition, feature selection, and improved multi-objective optimization techniques for both point and interval prediction [16]. Cheng and Hu proposed a prediction model based on the “decomposition-reconstruction-integration” concept, which was used to predict the carbon prices in four regional carbon markets in China [17]. E proposed a denoising hybrid method for carbon price prediction. The method decomposed the original sequence using empirical mode decomposition (EMD) with pole symmetry, separated the independent components reflecting the internal mechanism by independent component analysis (ICA), and obtained the prediction results using a least squares support vector machine (LSSVM) [18]. Xiong et al. proposed a multi-step hybrid prediction model that uses variational mode decomposition (VMD) to extract the features of the original data and builds a fast multi-output relevance vector regression model [19]. Sun et al. decomposed sequences into several intrinsic mode functions (IMFs) using fast EMD. Each sub-sequence was predicted by a particle-swarm-optimized ELM, which had the highest accuracy and stability among the compared models [20]. Sun et al. used a method combining FEEMD and sample entropy to decompose the original sequence, and a particle-swarm-optimized ELM was used to predict carbon prices [21]. Zhou et al. proposed a new hybrid model for carbon price prediction in which the carbon price was decomposed into multiple IMFs using EMD, and an ELM was used to predict the components [22]. Liu et al. decomposed the original sequence into interval trends and residuals using three interval multiscale decomposition methods. A novel prediction model combining statistical and neural network models was used to predict all components, and the combined prediction result was obtained using the least squares method [23]. Huang et al. proposed a new combination prediction model for the EU ETS market reform context; the sequence was decomposed via VMD into multiple sub-sequences, and the low-frequency and high-frequency components were separately fitted and predicted using an LSTM neural network and GARCH models [24]. Zhu et al. proposed a least squares SVM prediction model based on EMD. EMD decomposed the sequence, and an SVM based on particle swarm optimization was used to predict each mode [25]. Zhou et al. decomposed the Guangzhou Carbon Emission Exchange sequence using CEEMDAN and constructed a prediction model using an LSTM network, which showed stable and reliable results [26].
At the same time, some scholars have found that some sequences remain highly complex after decomposition, which significantly reduces prediction accuracy. Modeling with secondary decomposition strategies has therefore attracted more attention recently. Li et al. decomposed an original sequence through CEEMD and performed VMD secondary decomposition on the IMF with the maximum sample entropy. They predicted all components through a back propagation (BP) neural network and obtained a final prediction result [27]. Zhou et al. decomposed the carbon price using EMD, performed VMD secondary decomposition on the first component sequence, and used a parameter-optimized kernel-based ELM for prediction [28].
The second primary strategy involves establishing a predictive model for the fluctuation of carbon prices over time by analyzing the relevant influencing factors. Chevallier investigated a carbon price interaction model that considered macroeconomic and energy dynamics under the EU emissions trading system [29]. Han et al. proposed a BP model with mixed data sampling to predict carbon prices. Environmental factors were included in the prediction indicators, and the model’s accuracy was improved by 30% and 40% compared to the MIDAS and benchmark models, respectively [30]. Li et al. used a multivariate LSTM network to analyze carbon prices in the domestic carbon trading market while considering the relevant factors affecting carbon prices. According to the results, this model was more appropriate for predicting carbon prices than multivariate least squares, MLP, and recurrent neural network (RNN) models [31]. Zhao et al. collected many factors affecting carbon prices through meta-analysis and proposed a hybrid model to predict carbon prices. The results demonstrated that this model outperformed the other benchmark models in predictive accuracy [32]. Du et al. used a BP model to analyze the factors influencing carbon prices. According to the results, the BP model performed satisfactorily in predicting carbon prices and examining their determinants in Fujian province [33].
The literature review shows that most studies adopt a single strategy, building predictive models from either the price sequence itself or the relevant factors affecting carbon price fluctuations; as yet, no study comprehensively considers the two strategies together. This paper therefore considers both strategies, presenting the CEEMDAN-CNN-LSTM model for the first and the Multi-CNN-LSTM model for the second, and establishes a multi-strategy combination predictive model through the Lp-norm.
The research hypothesis of this article is that the combined predictive model obtained from the two strategies performs better than either single-strategy predictive model. Based on this hypothesis, the two strategy-specific models are combined through the Lp-norm into the final multi-strategy predictive model, with the goal of higher prediction accuracy.
2. Materials and Methods
2.1. Data Characteristics and Sources
Carbon price time series data refers to the historical record of carbon dioxide emission prices in the carbon market. These data show how the price of carbon credits fluctuates over time, thereby reflecting changes in the supply and demand of emission quotas. Because of the influence of market demand, regulations, and other factors, carbon price time series data have characteristics such as trend, seasonality, volatility, potential nonlinearity, non-stationarity, and complexity.
The nonlinearity of carbon price data means that there is no reliable linear relationship between the variables being studied. Non-stationarity means that the statistical properties of time series data vary over time, where the mean, variance, and covariance may vary or show trends in the data. Complexity refers to multiple variable factors interacting with each other in time series data, which make it challenging to analyze and interpret. Nonlinearity, non-stationarity, and complexity all involve the statistical characteristics of time series data, thereby indicating that it is difficult to process such data through traditional linear analysis methods; more advanced and flexible techniques and methods are needed for analysis. Carbon price time series data can be collected from government agencies, financial markets, and independent research institutions.
This article selected data from three carbon exchanges in China (Hubei, Shanghai, and Guangzhou) for empirical analysis; the data were collected from the China Emissions Exchange.
2.2. Complete Ensemble Empirical Mode Decomposition with Adaptive Noise
The CEEMDAN algorithm is a signal processing method developed from EMD and EEMD [34]. This method adaptively incorporates white noise into the decomposition process of IMFs, thereby addressing the problems of mode mixing in EMD and noise residue in EEMD.
Let the original sequence be $x(t)$. The steps of the CEEMDAN algorithm are as follows:
Step 1: Add a Gaussian white noise sequence $\omega^{i}(t)$, which obeys the standard normal distribution, to $x(t)$ to obtain a new sequence $x^{i}(t)$, i.e.,
$$x^{i}(t) = x(t) + \omega^{i}(t), \quad i = 1, 2, \ldots, N \tag{1}$$
In Formula (1), N is the number of times white noise is added.
Step 2: Decompose each $x^{i}(t)$ with EMD to obtain its first IMF component $IMF_{1}^{i}(t)$. Take the average of the N IMF components generated to obtain the first IMF component $\overline{IMF}_{1}(t)$ of CEEMDAN, i.e.,
$$\overline{IMF}_{1}(t) = \frac{1}{N} \sum_{i=1}^{N} IMF_{1}^{i}(t) \tag{2}$$
Step 3: Subtract the first IMF component from $x(t)$ to obtain the residual term $r_{1}(t) = x(t) - \overline{IMF}_{1}(t)$, and treat $r_{1}(t)$ as the initial sequence to repeat steps 1 and 2 and obtain the second IMF component $\overline{IMF}_{2}(t)$.
Step 4: Treat the newly generated residual sequence as the initial sequence and repeat the above steps. The algorithm terminates when the residual sequence becomes a monotonically increasing or decreasing function and is no longer decomposable. If K IMF components are obtained at this time, the initial sequence can be represented as:
$$x(t) = \sum_{k=1}^{K} \overline{IMF}_{k}(t) + r_{K}(t) \tag{3}$$
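The control flow of Steps 1 to 4 can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation used in this paper: `first_mode_stand_in` is a hypothetical placeholder that replaces true EMD sifting with a moving-average detrend, and `noise_std`, `n_realizations`, and `max_modes` are illustrative parameters. Only the noise-ensemble averaging, residual peeling, and stopping rule follow the steps above.

```python
import random


def moving_average(x, window=3):
    """Centered moving average; the window shrinks at the sequence edges."""
    half = window // 2
    out = []
    for t in range(len(x)):
        lo, hi = max(0, t - half), min(len(x), t + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def first_mode_stand_in(x):
    """HYPOTHETICAL stand-in for the first IMF produced by EMD sifting:
    the fast detail left after removing a moving-average trend."""
    trend = moving_average(x)
    return [xi - ti for xi, ti in zip(x, trend)]


def is_monotonic(x):
    """True if the sequence is non-decreasing or non-increasing."""
    diffs = [b - a for a, b in zip(x, x[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)


def ceemdan_like(x, n_realizations=50, noise_std=1.0, max_modes=5, seed=0):
    """Peel off noise-averaged modes until the residual is monotonic."""
    rng = random.Random(seed)
    residual = list(x)
    modes = []
    for _k in range(max_modes):
        if is_monotonic(residual):  # Step 4 stopping rule
            break
        acc = [0.0] * len(residual)
        for _i in range(n_realizations):
            # Step 1: add a fresh Gaussian white noise realization.
            noisy = [r + noise_std * rng.gauss(0.0, 1.0) for r in residual]
            # Step 2: extract the first mode of the noisy copy.
            mode = first_mode_stand_in(noisy)
            acc = [a + m for a, m in zip(acc, mode)]
        mode_avg = [a / n_realizations for a in acc]
        modes.append(mode_avg)
        # Step 3: subtract the averaged mode to get the next residual.
        residual = [r - m for r, m in zip(residual, mode_avg)]
    return modes, residual


prices = [30.1, 31.4, 29.8, 32.0, 33.5, 31.9, 34.2, 35.0, 33.1, 36.4]
modes, residual = ceemdan_like(prices)
reconstructed = [sum(m[t] for m in modes) + residual[t]
                 for t in range(len(prices))]
```

By construction, the modes plus the final residual add back to the input sequence, which mirrors the representation in Step 4. The sample prices are made-up values for illustration only.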
2.3. Convolutional Neural Networks
The CNN is a popular deep learning model that specializes in processing gridded data and has demonstrated exceptional effectiveness in image recognition [35]. In recent years, the CNN has also shown great potential in areas such as time series prediction.
The core modules of the CNN include the convolutional, pooling, and fully connected layers. The convolutional layer uses a set of learnable filters to perform linear transformations on the input data, thus generating feature maps. For time series data, one-dimensional convolution (1D-CNN) is mainly used for feature extraction. The pooling layer reduces the dimension of feature maps by extracting the essential features. The fully connected layer inputs the extracted features to output the predicted values.
The mathematical model of 1D-CNN is as follows:
$$x_{j}^{l} = f\left(\sum_{i=1}^{M} x_{i}^{l-1} * k_{ij}^{l} + b_{j}^{l}\right) \tag{4}$$
In Formula (4), $x_{j}^{l}$ is the j-th feature map in the l-th layer, $f(\cdot)$ is the activation function, $x_{i}^{l-1}$ is the i-th feature map in the (l−1)-th layer, M is the total number of input feature maps, $k_{ij}^{l}$ is the convolutional kernel, ∗ denotes the convolution operation, and $b_{j}^{l}$ is the bias term.
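To make Formula (4) concrete, the following sketch computes one output feature map from M input feature maps. It is an illustration under assumptions rather than the paper's implementation: ReLU is assumed as the activation, no padding is used, and the "convolution" is the sliding dot product (cross-correlation) that deep learning libraries conventionally compute. The function name `conv1d_feature_map` is hypothetical.

```python
def relu(value):
    """Rectified linear unit, assumed here as the activation function."""
    return max(0.0, value)


def conv1d_feature_map(inputs, kernels, bias=0.0, activation=relu):
    """Compute one output feature map from M input feature maps.

    inputs  : list of M equal-length sequences (feature maps of layer l-1)
    kernels : list of M equal-length kernels, one per input feature map
    """
    length = len(inputs[0])
    width = len(kernels[0])
    output = []
    for t in range(length - width + 1):  # "valid" positions only
        total = bias
        for x, k in zip(inputs, kernels):  # sum over the M input maps
            total += sum(x[t + s] * k[s] for s in range(width))
        output.append(activation(total))
    return output


# Two input feature maps of length 3, kernels of width 2.
feature_map = conv1d_feature_map([[1, 2, 3], [0, 1, 0]], [[1, 1], [2, 0]])
```

With these inputs the two sliding sums are 3 and 7, so the resulting feature map has two entries.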
2.4. Long Short-Term Memory Networks
LSTM networks introduce memory cells into the units of their hidden layers, thereby addressing the vanishing gradient problem that arises during the iterations of an RNN [36].
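Figure 1 shows the design of the LSTM network; the gate control operations in the LSTM network are expressed as follows:
$$\begin{aligned}
f_{t} &= \sigma\left(W_{f}\,[h_{t-1}, x_{t}] + b_{f}\right) \\
i_{t} &= \sigma\left(W_{i}\,[h_{t-1}, x_{t}] + b_{i}\right) \\
o_{t} &= \sigma\left(W_{o}\,[h_{t-1}, x_{t}] + b_{o}\right) \\
\tilde{c}_{t} &= \tanh\left(W_{c}\,[h_{t-1}, x_{t}] + b_{c}\right) \\
c_{t} &= f_{t} \odot c_{t-1} + i_{t} \odot \tilde{c}_{t} \\
h_{t} &= o_{t} \odot \tanh\left(c_{t}\right)
\end{aligned} \tag{5}$$
where $f_{t}$, $i_{t}$, and $o_{t}$ correspond to the output of the forget gate, input gate, and output gate, respectively; $x_{t}$ is the input at the current time step; $h_{t-1}$ and $h_{t}$ are the hidden states of the previous and current time steps, respectively; $\tilde{c}_{t}$ is the candidate state; $c_{t-1}$ and $c_{t}$ are the internal states of the previous and current time steps, respectively; $\sigma$ and $\tanh$ represent activation functions; $\odot$ denotes element-wise multiplication; and W and b are the weight matrices and biases of the different modules, respectively.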
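The gate operations described above can be traced with a single scalar LSTM step. This is a minimal sketch under simplifying assumptions: the hidden state and input are scalars (size 1), so each weight matrix reduces to a pair of numbers, and the `params` layout and the function name `lstm_step` are hypothetical choices for illustration.

```python
import math


def sigmoid(z):
    """Logistic activation used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step for scalar input and scalar hidden state.

    params maps a module name ("f", "i", "o", "c") to a tuple
    (w_h, w_x, b): weight on h_prev, weight on x_t, and bias.
    """
    def pre_activation(name):
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre_activation("f"))        # forget gate
    i_t = sigmoid(pre_activation("i"))        # input gate
    o_t = sigmoid(pre_activation("o"))        # output gate
    c_tilde = math.tanh(pre_activation("c"))  # candidate state
    c_t = f_t * c_prev + i_t * c_tilde        # new internal state
    h_t = o_t * math.tanh(c_t)                # new hidden state
    return h_t, c_t


# With all weights and biases at zero, every gate outputs 0.5 and the
# candidate state is 0, so the internal state is simply halved.
zero_params = {name: (0.0, 0.0, 0.0) for name in ("f", "i", "o", "c")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.3, c_prev=2.0, params=zero_params)
```

The zero-weight case is useful as a sanity check because every gate value is known in advance.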
2.5. CNN-LSTM
Figure 2 depicts the design of the CNN-LSTM model. Using the convolutional layers of CNN, the model can learn spatial characteristics in the time series data and automatically extract relevant local patterns, thus resulting in effective feature extraction and noise reduction. Through the LSTM layer, the model can capture long-term dependencies in the sequence. The hybrid architecture of the CNN-LSTM model combines the strengths of both models, thereby enabling the model to learn more complex time-series features and improve prediction accuracy.
2.6. Lp-Norm
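Based on the Lp-norm error, the accuracy of a prediction model is measured by calculating the distance between the original sequence and the predicted sequence, where a smaller value indicates a higher prediction accuracy [37]. Let e be the Lp-norm error, which is expressed as follows:
$$e = \left(\sum_{t=1}^{T} \left|e_{t}\right|^{p}\right)^{1/p} = \left(\sum_{t=1}^{T} \left|x_{t} - \hat{x}_{t}\right|^{p}\right)^{1/p} = \left(\sum_{t=1}^{T} \left|\sum_{i=1}^{m} w_{i} e_{it}\right|^{p}\right)^{1/p} \tag{6}$$
In Formula (6), $\hat{x}_{t}$ is the predicted value, $x_{t}$ is the actual value, $e_{t} = x_{t} - \hat{x}_{t}$ represents the prediction error at time t, $e_{it}$ represents the error of the i-th method at time t, and $w_{i}$ is the weight of the i-th method in the combined prediction model, with $\sum_{i=1}^{m} w_{i} = 1$; T is the number of samples and m is the number of single methods.
The value of p can be selected based on the actual situation. In this paper, p was chosen as 2, and, therefore, the optimal combined prediction model in this paper was constructed based on the L2-norm:
$$\min_{w}\; e = \left(\sum_{t=1}^{T} \left(\sum_{i=1}^{m} w_{i} e_{it}\right)^{2}\right)^{1/2}, \quad \text{s.t.}\ \sum_{i=1}^{m} w_{i} = 1 \tag{7}$$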
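For the special case of two single models, the L2-norm combination described above has a closed-form solution, sketched below. This is an illustration under assumptions, not the paper's optimization procedure: the weights are only required to sum to one (no non-negativity constraint is imposed), and the function names and the sample errors are hypothetical.

```python
def l2_optimal_weights(errors_a, errors_b):
    """Weights (w_a, w_b), w_a + w_b = 1, minimizing the L2-norm of
    w_a * errors_a + w_b * errors_b for two single models."""
    diff = [a - b for a, b in zip(errors_a, errors_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        # Identical error series: any weights give the same result.
        return 0.5, 0.5
    # Setting the derivative of sum((e_b + w_a * diff)^2) to zero.
    w_a = -sum(b * d for b, d in zip(errors_b, diff)) / denom
    return w_a, 1.0 - w_a


def l2_error(errors_a, errors_b, w_a):
    """L2-norm of the combined error series for a given weight w_a."""
    combined = [w_a * a + (1.0 - w_a) * b
                for a, b in zip(errors_a, errors_b)]
    return sum(e * e for e in combined) ** 0.5


# Hypothetical error series of two single-strategy models.
errors_a = [0.5, -0.2, 0.1]
errors_b = [-0.3, 0.4, 0.2]
w_a, w_b = l2_optimal_weights(errors_a, errors_b)
combined_error = l2_error(errors_a, errors_b, w_a)
```

Because the single models correspond to the weights (1, 0) and (0, 1), the combined L2 error at the optimal weights cannot exceed that of either single model on the data used to fit the weights.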
2.7. The Proposed Model
Carbon prices have significant impacts on various industries across the country. To improve the accuracy of carbon price prediction, this paper proposed a multi-strategy combined model based on CNN and LSTM networks. The model design is shown in Figure 3, Figure 4, and Figure 5.
A combination prediction model based on L2-norm was established, and the optimal weight parameters of the model were determined.
Figure 4 depicts the establishment process of the CEEMDAN-CNN-LSTM model. The initial sequence was decomposed into several IMFs using the CEEMDAN method. These IMFs were reconstituted as high- and low-frequency sequences based on their fluctuation characteristics. Two prediction models, CNN-LSTM and LSTM, were created for each sequence type. The predictions of both models were combined to obtain the final prediction of the CEEMDAN-CNN-LSTM model.
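The regrouping of IMFs into high- and low-frequency sequences can be sketched as follows. The text does not state which criterion was used to judge the fluctuation characteristics, so this sketch assumes a zero-crossing rate with a hypothetical threshold of 0.5; both the criterion and the threshold are illustrative assumptions, as are the function names.

```python
def zero_crossing_rate(seq):
    """Fraction of consecutive pairs whose values change sign."""
    crossings = sum(1 for a, b in zip(seq, seq[1:]) if a * b < 0)
    return crossings / (len(seq) - 1)


def regroup_imfs(imfs, threshold=0.5):
    """Sum IMFs into a high-frequency and a low-frequency sequence.

    An IMF is treated as high-frequency when its zero-crossing rate
    reaches the (assumed) threshold, and as low-frequency otherwise.
    """
    n = len(imfs[0])
    high = [0.0] * n
    low = [0.0] * n
    for imf in imfs:
        target = high if zero_crossing_rate(imf) >= threshold else low
        for t, value in enumerate(imf):
            target[t] += value
    return high, low


# Toy components: a fast oscillation, a slow swing, and a trend.
imfs = [
    [1, -1, 1, -1, 1, -1],
    [1, 1, 1, -1, -1, -1],
    [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
]
high, low = regroup_imfs(imfs)
```

Since the decomposition is additive, the high- and low-frequency sequences still sum to the original sequence, so one natural way to combine the two sub-model predictions is to add them; that choice is also an assumption here.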
The Multi-CNN-LSTM model comprehensively considers both economic and technical indicators of carbon trading price sequences, as illustrated in Figure 5. Technical indicators are derived from the economic indicators, and the indicators that are highly correlated with carbon trading prices, as measured by the Pearson correlation coefficient, are chosen as explanatory variables for the model. The Multi-CNN-LSTM model was established based on these variables to predict prices.
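The indicator selection step can be illustrated with a short sketch. The cutoff of 0.8, the use of the absolute value of the coefficient, and the indicator names are all assumptions made for illustration; the text only states that highly correlated indicators were kept.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / (std_x * std_y)


def select_indicators(price, indicators, threshold=0.8):
    """Names of indicators whose |r| with the price meets the threshold."""
    return [name for name, values in indicators.items()
            if abs(pearson(price, values)) >= threshold]


# Hypothetical price series and candidate indicators.
price = [1, 2, 3, 4]
candidates = {
    "moving_average": [2, 4, 6, 8],   # moves with the price
    "oscillator": [1, -1, 1, -1],     # weakly related
    "inverse_signal": [4, 3, 2, 1],   # moves against the price
}
selected = select_indicators(price, candidates)
```

Using the absolute value keeps strongly negative relationships as well as positive ones; if only positively correlated indicators are wanted, the `abs` call can be dropped.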
4. Conclusions
Carbon price time series are characterized by nonlinearity, non-stationarity, and high complexity. Accurate carbon trading price prediction can provide the basis for carbon emission quotas, pricing strategy revision, regulation, and investor decision making, thereby supporting an efficient, well-functioning carbon market.
Based on the traditional prediction models, this paper established an Lp-norm-based Lp-CNN-LSTM combined prediction model for carbon trading prices. Through experimental research on the carbon markets in Hubei, Shanghai, and Guangdong, the following conclusions have been drawn:
Compared with the prediction model built on the initial carbon price series, the prediction model built after CEEMDAN decomposition performed better.
For the high-frequency carbon price series reconstructed after CEEMDAN decomposition, the CNN-LSTM model with a more robust feature extraction capability had the best prediction effect. In contrast, the relatively stable low-frequency carbon price series was more suitable for prediction using a simple LSTM model.
Compared with prediction models that only considered the original carbon price series, models that comprehensively considered economic and technical indicators that are highly related to the carbon price series had better prediction performance.
Compared with the single-strategy CEEMDAN-CNN-LSTM and Multi-CNN-LSTM models, the Lp-CNN-LSTM combined prediction model, which combined the two strategies based on the Lp-norm, had the best prediction performance. The Lp-CNN-LSTM combined prediction model reduced the optimal single-strategy model’s MSE, RMSE, MAE, and MAPE by 7.20%, 3.67%, 5.80%, and 1.92%, respectively.
The empirical results support the hypothesis of this paper: the combined forecasting model obtained by considering the two strategies comprehensively was better than either single-strategy forecasting model. The Lp-CNN-LSTM combined model had good accuracy, effectiveness, and robustness, thereby providing a basis for carbon pricing strategy revisions, carbon trading market regulation, and investor decision making. The Lp-CNN-LSTM combined prediction model may also be applied to predictive research in other fields.
Although the Lp-CNN-LSTM combination forecasting model achieved excellent predictive results, there is still room for further improvement. In strategy one, the first IMF component obtained after the CEEMDAN decomposition may still have high complexity, and it can be further decomposed to reduce its complexity. In strategy two, the modeling process only considered the economic indicators of carbon prices themselves and the technical indicators derived from economic indicators, but it did not consider other factors that affect carbon price fluctuations. Therefore, further research can consider more related influencing factors, such as energy prices (including oil prices, natural gas prices, coal prices, etc.), stock indexes, national policies, and social public opinion.