1. Introduction
Following the Kyoto Protocol, the carbon market was formally established in 2005 in an attempt to reduce global greenhouse gas emissions. The carbon market defines virtual carbon emission rights as scarce, valuable assets, gives them commodity attributes, and pursues resource allocation and emission reduction through market transactions among emission-reducing entities. The signing of the Paris Agreement in December 2015 further highlighted the carbon market's role in allocating capital for emission reduction on a global scale [1]. As a core issue of the carbon market mechanism, accurate forecasting of the carbon price can support an efficient carbon pricing mechanism and help investors avoid market risks and increase returns. As an emerging, policy-based artificial market, the carbon market is highly sensitive to policy shocks, especially carbon dioxide (CO2) reduction policies and carbon quota policies [2]. Therefore, within a given pricing framework, the prediction of the carbon price should not only follow the basic pricing method of general financial assets, but also reflect the special driving mechanism of the carbon price.
Further evidence shows that policy turbulence (such as the implementation of a carbon tax) and uncertainty in the energy and financial markets can easily affect the carbon price through financial channels [3]. In other words, shocks from the energy and financial markets also affect carbon price forecasts and price fluctuations. Consequently, the carbon price is vulnerable to national strategies, government regulations, international agreements, and other policy factors, as well as to asymmetric influences from the behavior of participants in its influential markets. Thus, it is worth exploring a carbon price driving mechanism that reflects the impact of external shocks and market asymmetry.
However, existing research commonly focuses only on the low-order moments of returns (market return and variance), while ignoring the shock of extreme events and market asymmetry originating from the influential markets, which are captured by higher-order moments (market skewness and kurtosis). That is, these factors are not included in the analysis framework of the pricing mechanism. In fact, market skewness and kurtosis have proven to be valuable indicators of market asymmetry information and market extreme factors for explaining portfolio returns [4]. This omission makes the accuracy of existing carbon price predictions questionable.
To overcome this weakness, the multivariate skew-GARCH (Generalized Autoregressive Conditional Heteroskedasticity) model, which incorporates the third-order moment of financial assets, was introduced to explain financial returns and the asymmetric relations among some European financial markets [5]. However, the multivariate skew-GARCH model has many parameters to estimate and requires certain assumptions on the tail distribution of the residuals. Furthermore, Fry et al. [6] developed a binary higher-order moment Capital Asset Pricing Model (CAPM), which uses co-skewness and co-kurtosis to represent the risk that market asymmetry and extreme events pose to the portfolio return. As a result, the higher-order moment CAPM provides a reasonable framework for studying the pricing of carbon assets under the impact of external events and information asymmetry, both of which are important external factors affecting the carbon price. While many scholars use Extreme Value Theory (EVT) to examine extreme risk shocks in financial markets [7], the EVT model considers only the extreme impact generated by the financial market itself and ignores extreme shocks from other relevant markets. Thus, the higher-order moment CAPM is better suited than EVT-based models to describing asset prices affected by extreme event shocks.
However, the higher-order moment CAPM model mentioned above is limited to describing the relationship between two assets, and requires that the asset returns obey a bivariate standard normal distribution, which makes it difficult to capture the non-linear mapping relationship between the carbon price and its multiple pricing factors.
The aim of this article is to construct a novel extended higher-order moment multi-factor framework for forecasting the carbon price that is in line with the non-linear, peaked, and thick-tailed characteristics of the carbon price. To this end, this paper relaxes the higher-order moment CAPM framework mentioned above to multivariate factors, and brings the higher-order moment terms, which indicate the shock of market asymmetry and extreme events originating from the influential markets, into the carbon pricing framework. Thus, based on the arbitrage pricing theory (APT), an extended higher-order moment multi-factor framework (EHM-APT) that conforms to the special driving mechanism of the carbon price is formed. This research design compensates for the inaccurate pricing caused by neglecting the pricing factors of extreme events and market asymmetry information. The proposed framework posits that the carbon price can be theoretically explained by the higher-order moment terms of its influential markets. Further, this article uses a machine learning method to empirically evaluate the accuracy and performance of the framework. Specifically, this paper uses a Long Short-Term Memory network with a designed multi-layer and multi-variable structure (Multi-LSTM) to predict the carbon price and capture the non-linear mapping relationship between the carbon price and its pricing factors.
The results show that the proposed EHM-APT framework can significantly improve the prediction of the carbon price. That is, co-skewness and co-kurtosis, which represent the market asymmetry and extreme events stemming from the pricing factor markets, are indispensable factors for predicting and fitting the carbon price. Carbon investors can therefore benefit from treating market extreme factors and market asymmetric information as pricing factors when allocating funds. The conclusions of this article and the proposed pricing framework will help investors to predict and analyze the carbon price both effectively and accurately.
The remaining structure of this paper is as follows:
Section 2 reviews the related literature.
Section 3 introduces the methodology of this paper.
Section 4 presents the experimental analysis, including the descriptive statistics and the design of the Multi-LSTM model, and offers the results and the discussion.
Section 5 concludes this paper and puts forward ideas for further research.
2. Literature Review
The growing academic literature on carbon price forecasting has focused on two major approaches: traditional econometric models and artificial intelligence models.
Among traditional econometric models, the GARCH model has potential advantages in capturing the non-linear characteristics of the carbon price. Chevallier [8] proposed that the asymmetric threshold GARCH model can effectively capture the characteristics of the carbon futures price and improve prediction accuracy. Byun et al. [9] found that GARCH-type models are superior to the k-nearest neighbor model for predicting the carbon price. Koop and Tole [10] used the dynamic model averaging (DMA) method to forecast the carbon price, and the results showed a high prediction accuracy compared with the Bayesian model and the time-varying parameter regression model. Sanin et al. [11] demonstrated that combining the autoregressive moving average model with exogenous inputs (ARMAX) and the GARCH model with a time-varying jump process predicts the carbon price more accurately than a standard ARMAX–GARCH framework. Benz et al. [12] concluded that the Markov regime-switching GARCH model outperforms other Markov regime-switching or simple GARCH models in forecasting the carbon price. In addition, Chevallier [13] maintained that a nonparametric method of predicting the carbon price can reduce the prediction error by almost 15% compared with linear autoregression models. Studies that integrate the threshold dynamic conditional correlation (DCC) GARCH model and the full Baba, Engle, Kraft, and Kroner (BEKK) GARCH model suggest that volatility spillover from price decreases in the energy market may have a stronger shock on the carbon price, and that GARCH-based models can better reveal the price volatility mechanism of carbon assets through the analysis of volatility spillover [14,15].
However, traditional econometric models usually require that returns satisfy strict assumptions, such as a normal distribution in the tails of carbon returns [16]. In fact, carbon returns have significant non-normal and non-linear characteristics and exhibit a peaked and thick-tailed distribution [8], which causes traditional pricing methods to fail to predict the carbon price accurately.
Artificial intelligence methods, which include the artificial neural network (ANN), support vector machine (SVM), least squares support vector machine (LSSVM), multilayer perceptron (MLP), and Hybrid Fuzzy Neural Network (HFNN), can map non-linear functions and do not need to consider the tail distribution of the return series, and are therefore superior to traditional econometric models in solving forecasting issues [17,18]. To determine the parameters of the LSSVM model, Zhu et al. [19] proposed an integration approach of a group method of data handling (GMDH), particle swarm optimization (PSO), and the LSSVM model for carbon price prediction, in which the prediction accuracy is higher than that of ANN and the autoregressive integrated moving average (ARIMA) model. Fan et al. [20] established an MLP–ANN prediction model, and the results showed that the proposed model performs well compared with the ARIMA, ANN, and LSSVM models. Recently, research has found that an integrated model that combines the empirical mode decomposition (EMD) method with ANN and LSSVM can achieve better performance for forecasting the carbon price than the EMD method alone [21]. Additionally, Atsalakis [22] proposed a computational intelligence-based model with a novel hybrid neuro-fuzzy controller for forecasting the carbon price, which obtained a higher accuracy. Zhu et al. [23] combined variational mode decomposition (VMD) and spiking neural networks (SNNs) to improve forecasting accuracy and reliability.
There are two defects in the above research. Firstly, these studies may ignore the shock of the higher-order moment attributes (skewness and kurtosis) of the pricing factors on the carbon price. The common theoretical basis of the above models is to explore the linear or non-linear regression relationship between the carbon price and its pricing factors under the multi-factor framework, which works only with the low-order moments of returns. Correspondingly, the previous studies implicitly assume that extreme events or information asymmetry stemming from the pricing factor markets will not affect the carbon price. Secondly, the forecasting models used in previous research are limited in their ability to predict the carbon price accurately. Traditional statistical and econometric models require that returns satisfy strict distributional assumptions, while artificial intelligence methods can easily fall into local minima and have difficulty reaching a global optimum.
Therefore, to remedy the shortcomings of the existing research, the contributions of this paper are as follows. Firstly, this article relaxes the framework of the binary higher-order moment CAPM to multivariate factors, that is, it brings the higher-order moment terms that indicate the shock of market asymmetry and extreme events into the carbon pricing framework. Within this framework, the complex non-linear and non-structural relationship between the carbon price and its pricing factors cannot be ignored. Therefore, the second innovation is a Multi-LSTM model with a designed multi-layer and multi-variable structure that captures the non-linear mapping relationship between the carbon price and its pricing factors. The reasons for selecting the Multi-LSTM network are as follows. On the one hand, the Multi-LSTM has an advantage in handling time series data with non-linear and complex relationships, which is consistent with the time lag characteristic of the carbon price sequence. On the other hand, a remarkable feature of the Multi-LSTM in the training process is that the model adjusts adaptively to obtain optimal parameters and structure, and is less prone to falling into local optima [24] than other optimization approaches, such as the genetic algorithm and the backpropagation algorithm used in neural network training [25,26].
3. Methodology
In this section, we theoretically explain the construction process of the novel extended higher-order moment multi-factor pricing framework. Based on this, the Multi-LSTM network for fitting the pricing framework is introduced.
3.1. A Novel Extended Higher-Order Moment Multi-Factor Framework (EHM-APT)
According to the higher-order moment CAPM model proposed by Hwang et al. [27], the return of financial assets is affected not only by systemic risk, but also by irrational behavior and extreme external shock represented by co-skewness and co-kurtosis. Therefore, the higher-order moment CAPM model can reveal more price characteristics than the traditional CAPM model.
The higher-order moment CAPM model, according to Hwang et al. [27], is as follows:

$$E(R_i) - R_f = \alpha_1 \beta_{im} + \alpha_2 \gamma_{im} + \alpha_3 \delta_{im}$$

and

$$\beta_{im} = \frac{E\left[(R_i - E(R_i))(R_m - E(R_m))\right]}{E\left[(R_m - E(R_m))^2\right]}, \quad \gamma_{im} = \frac{E\left[(R_i - E(R_i))(R_m - E(R_m))^2\right]}{E\left[(R_m - E(R_m))^3\right]}, \quad \delta_{im} = \frac{E\left[(R_i - E(R_i))(R_m - E(R_m))^3\right]}{E\left[(R_m - E(R_m))^4\right]}$$

where $E(R_i) - R_f$ is the excess return of the asset portfolio, and $\alpha_1$, $\alpha_2$, and $\alpha_3$ represent the risk premium coefficients of $\beta_{im}$, $\gamma_{im}$, and $\delta_{im}$, respectively. $\beta_{im}$ is the coefficient of co-variance and represents the shock of the first-order centered moment (return) of asset portfolio m on the first-order centered moment (return) of asset i, $\gamma_{im}$ is the coefficient of co-skewness and represents the shock of the second-order centered moment (variance) of asset portfolio m on the first-order centered moment (return) of asset i, and $\delta_{im}$ is the coefficient of co-kurtosis and represents the shock of the third-order centered moment (skewness) of asset portfolio m on the first-order centered moment (return) of asset i. The co-skewness reflects the information asymmetry of market investment, and negative skewness means that the probability of a decline in returns exceeds the probability of a rise, thus increasing the losses for investors [6]. Correspondingly, higher co-kurtosis increases the probability of extreme events, which reflects the impact of external events on market investment.
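The co-moment coefficients described above can be estimated directly from two return series. The following is a minimal sketch, not the authors' implementation: it assumes that each coefficient is the sample cross-moment of asset i with a power of the portfolio return, normalized by the corresponding moment of the portfolio itself, and the function name `comoment_coefficients` is hypothetical.

```python
import numpy as np

def comoment_coefficients(r_i, r_m):
    """Estimate co-variance, co-skewness and co-kurtosis coefficients.

    Assumption (illustrative): each coefficient is a sample cross-moment of
    asset i with the portfolio m, divided by the matching moment of m alone.
    """
    r_i = np.asarray(r_i, dtype=float)
    r_m = np.asarray(r_m, dtype=float)
    e_i = r_i - r_i.mean()  # centered asset return
    e_m = r_m - r_m.mean()  # centered portfolio return

    beta = np.mean(e_i * e_m) / np.mean(e_m ** 2)        # co-variance term
    gamma = np.mean(e_i * e_m ** 2) / np.mean(e_m ** 3)  # co-skewness term
    delta = np.mean(e_i * e_m ** 3) / np.mean(e_m ** 4)  # co-kurtosis term
    return beta, gamma, delta
```

Under this normalization, an asset whose return is exactly twice the portfolio return has all three coefficients equal to 2. The co-skewness estimate becomes unstable when the portfolio's third centered moment is close to zero, so a symmetric return sample should be treated with care.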
To examine the shock of higher-order moment terms on the portfolio return, Fry et al. [6] developed an extended higher-order moment CAPM model by introducing co-skewness and co-kurtosis into the traditional CAPM framework. The suggested model, which incorporates bivariate normal distribution under the shock of co-skewness and co-kurtosis, is shown as
where
and
represents the portfolio return considering the impact of co-skewness and co-kurtosis, respectively,
represents the impact coefficient of co-skewness, that is, the shock of the second-order centered moment (variance) of asset 2 on the first-order centered moment (return) of asset 1,
denotes the impact coefficient of co-kurtosis, that is, the shock of the third-order centered moment (skewness) of asset 2 on the first-order centered moment (return) of asset 1,
,
,
and
,
,
represent the return, mean, and variance of assets 1 and 2, respectively,
refers to the correlation coefficient between assets 1 and 2, and
indicates the residual.
However, the above extended higher-order moment CAPM model is only applicable to the pricing of two assets, and requires the assumption of a bivariate standard normal distribution. To overcome this limitation, this paper proposes an extended higher-order moment multi-factor framework by introducing the co-skewness and co-kurtosis of multivariate pricing factors into the framework. The novel pricing foundation, which considers the shock of market asymmetry and extreme events, is shown as
where
where
represents the carbon return,
,
, …
represents the impact coefficient of the pricing factors,
,
,
and
,
,
represent the return, mean, and variance of the carbon market and its pricing factors, respectively,
denotes the impact coefficient of the second-order centered moment (variance) of carbon pricing factors on the first-order centered moment (return) of carbon return, that is, the shock of market asymmetry, and
denotes the impact coefficient of the third-order centered moment (skewness) of carbon pricing factors on the first-order centered moment (return) of carbon return, that is, the shock of extreme events.
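To make the structure of the framework concrete, the sketch below builds one possible set of regressors from a matrix of pricing-factor returns. It is an illustration under stated assumptions rather than the paper's specification: each factor return is standardized, the level term stands in for the co-variance channel, the squared term for the co-skewness channel (second-order moment of the factor acting on the carbon return), and the cubed term for the co-kurtosis channel (third-order moment). The function names `ehm_apt_features` and `fit_linear_baseline` are hypothetical, and the linear least-squares fit is only a baseline, not the Multi-LSTM used in the paper.

```python
import numpy as np

def ehm_apt_features(factor_returns):
    """Build level, squared and cubed standardized factor returns.

    factor_returns: array of shape (T, K), one column per pricing factor.
    Returns an array of shape (T, 3 * K). Illustrative assumption only.
    """
    x = np.asarray(factor_returns, dtype=float)
    z = (x - x.mean(axis=0)) / x.std(axis=0)  # standardize each factor
    return np.hstack([z, z ** 2, z ** 3])     # [co-variance | co-skewness | co-kurtosis] terms

def fit_linear_baseline(features, carbon_return):
    """Ordinary least squares with an intercept; returns (intercept, coefficients)."""
    a = np.column_stack([np.ones(len(features)), features])
    coef = np.linalg.lstsq(a, np.asarray(carbon_return, dtype=float), rcond=None)[0]
    return coef[0], coef[1:]
```

With K pricing factors this yields 3K input variables per time step, which can then be passed to any regression or sequence model.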
3.2. A Multi-Layer and Multi-Variable LSTM (Multi-LSTM) Model for Predicting the Carbon Price
This paper constructs a multi-layer and multi-variable LSTM model based on the research of Hochreiter et al. [24] for predicting the carbon price and investigating the performance of the proposed EHM-APT framework.
The LSTM network consists of an input layer, an output layer, and a hidden layer. Unlike in the traditional Recurrent Neural Network (RNN), the hidden layer of the LSTM is made up not of ordinary neurons but of LSTM cells with a special memory function. As the core of the LSTM, the cell can suppress or strengthen the transmission of input data through three designed gates: the forget gate, the input gate, and the output gate. Furthermore, the Multi-LSTM model stacks additional cells and hidden layers on top of the single-layer structure. With its multi-layer chain structure and recurrent network features, the Multi-LSTM can extract and mine the features of the input data, thus significantly improving its learning and training ability.
During the training of LSTM, the three gates of LSTM receive two kinds of external information at time t, that is, $X_t$ represents the input of the current state and $H_{t-1}$ denotes the output of the hidden layer at the previous state. In addition, each gate receives internal information, that is, the memory unit $C_{t-1}$ at the previous state. Among the three gates, the forget gate determines the discarded information in the cell, which is equivalent to decay of information in each dimension [24]. The input gate determines the type of updated and input information in the cell. In this process, a sigmoid function is responsible for updating the information and the tanh function creates a new candidate vector of $C_t$ at the current cell state. The output gate determines the output of the current cell through the sigmoid function, while the tanh function is used to convert the current output into a value between -1 and 1, and then the final output of LSTM is obtained after multiplying the output of the above two functions. The training structure of the Long Short-Term Memory network is shown in Figure 1.
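The information filtering of the forget gate is shown as

$$f_t = \sigma\left(W_f \cdot [H_{t-1}, X_t] + b_f\right)$$

The information updating of the input gate is shown as

$$i_t = \sigma\left(W_i \cdot [H_{t-1}, X_t] + b_i\right), \quad \tilde{C}_t = \tanh\left(W_c \cdot [H_{t-1}, X_t] + b_c\right), \quad C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$

The information screening of the output gate is shown as

$$o_t = \sigma\left(W_o \cdot [H_{t-1}, X_t] + b_o\right), \quad H_t = o_t \odot \tanh(C_t)$$

The information updating vector, the candidate vector, and the update vector of the input gate at the current cell are represented by $i_t$, $\tilde{C}_t$, and $C_t$. $f_t$ and $o_t$ are the activations of the forget gate and the output gate, and $H_t$ refers to the final output of the LSTM network. $W_f$, $W_i$, $W_c$, and $W_o$ represent the weight matrices. $b_f$, $b_i$, $b_c$, and $b_o$ show the bias of the training process and $\sigma$ is the sigmoid function.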
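The gate mechanics described in this section can be sketched as a single forward step of one LSTM cell. This is a minimal NumPy illustration of the standard LSTM formulation (without peephole connections), which is an assumption here; the function names and the dictionary layout of the weights are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One forward step of a standard LSTM cell.

    x_t:    input at time t, shape (n_in,)
    h_prev: previous hidden output, shape (n_hidden,)
    c_prev: previous cell memory, shape (n_hidden,)
    W, b:   dicts keyed by 'f', 'i', 'c', 'o'; each W[k] has shape
            (n_hidden, n_hidden + n_in) and each b[k] has shape (n_hidden,).
    """
    z = np.concatenate([h_prev, x_t])          # joint external information
    f_t = sigmoid(W["f"] @ z + b["f"])         # forget gate: what to discard
    i_t = sigmoid(W["i"] @ z + b["i"])         # input gate: what to update
    c_tilde = np.tanh(W["c"] @ z + b["c"])     # candidate memory vector
    c_t = f_t * c_prev + i_t * c_tilde         # updated cell memory
    o_t = sigmoid(W["o"] @ z + b["o"])         # output gate
    h_t = o_t * np.tanh(c_t)                   # final output, each entry in (-1, 1)
    return h_t, c_t
```

A multi-layer network in this style feeds the hidden output of one layer as the input of the next layer at every time step, with a separate set of weights per layer.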
3.3. Evaluation Criteria of Multi-LSTM for Determining the Performance of the Proposed Framework
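This paper uses the following five criteria to evaluate the parameters and performance of the Multi-LSTM model:

$$\mathrm{RMSE} = \sqrt{\frac{1}{T}\sum_{t=1}^{T}\left(y_t - \hat{y}_t\right)^2}$$

$$\mathrm{MAE} = \frac{1}{T}\sum_{t=1}^{T}\left|y_t - \hat{y}_t\right|$$

$$\mathrm{MAPE} = \frac{1}{T}\sum_{t=1}^{T}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$

$$\mathrm{DA} = \frac{1}{T-1}\sum_{t=1}^{T-1} a_t, \quad a_t = \begin{cases} 1, & (y_{t+1} - y_t)(\hat{y}_{t+1} - y_t) \geq 0 \\ 0, & \text{otherwise} \end{cases}$$

$$\mathrm{CORR} = \frac{\sum_{t=1}^{T}\left(y_t - \bar{y}\right)\left(\hat{y}_t - \bar{\hat{y}}\right)}{\sqrt{\sum_{t=1}^{T}\left(y_t - \bar{y}\right)^2}\sqrt{\sum_{t=1}^{T}\left(\hat{y}_t - \bar{\hat{y}}\right)^2}}$$

where $y_t$ represents the time series of actual carbon return, $\hat{y}_t$ means the series of predicted carbon return, $\bar{y}$ and $\bar{\hat{y}}$ are their sample means, and $T$ denotes the number of observations.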
Furthermore, we chose the criteria of Root-Mean-Square Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE) because they are widely used to judge model performance in the existing literature. These indicators are non-negative real numbers; the smaller their values, the closer the predicted value is to the true value, and thus the better the prediction performance of the model. For investors, the criterion of direction accuracy (DA) measures how often the predicted direction of the market agrees with the actual market trend. Investors generally pay close attention to this criterion, as the right investment direction can help them make more valuable decisions. A larger DA value means that the predicted carbon return is closer to the psychological expectation of investors. CORR represents the Pearson correlation coefficient, which measures the correlation between the true carbon return and the predicted return. The value of CORR lies between -1 and 1; the closer it is to 1, the stronger the positive correlation between the true and predicted returns.
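The five criteria can be computed in a few lines. The sketch below is an illustration under stated assumptions: RMSE, MAE, MAPE, and CORR follow their usual textbook definitions, while DA is taken to be the share of periods in which the predicted move from the last actual value has the same sign as the actual move, which is one common convention and may differ from the exact form used in the paper. The function name `evaluate_forecast` is hypothetical.

```python
import numpy as np

def evaluate_forecast(y_true, y_pred):
    """Return RMSE, MAE, MAPE, DA and CORR for two equal-length series."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(y_pred, dtype=float)
    err = p - y

    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    # MAPE divides by the actual value, so it is undefined if y contains zeros.
    mape = np.mean(np.abs(err / y))
    # DA (assumed convention): actual and predicted moves from y[t] share a sign.
    da = np.mean((y[1:] - y[:-1]) * (p[1:] - y[:-1]) >= 0)
    corr = np.corrcoef(y, p)[0, 1]  # Pearson correlation, in [-1, 1]

    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "DA": da, "CORR": corr}
```

Because return series frequently contain values at or near zero, MAPE can become very large or undefined in practice, so it is worth checking the actual series before relying on it.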
5. Conclusions and Prospects
In this paper, we developed an extended higher-order moment multi-factor framework (EHM-APT) for predicting the carbon price by extending the binary higher-order moment CAPM model developed by Fry et al. [6] to multivariate factors. The proposed EHM-APT framework not only helps to promote the construction of an effective carbon pricing mechanism, but also provides a more effective market-oriented means for carbon emission reduction and the sustainability of the low-carbon economy. Compared with the traditional APT model, the EHM-APT framework considers the impact on the carbon price of market asymmetry and extreme events that stem from the pricing factor markets. Furthermore, a multi-layer and multi-variable LSTM (Multi-LSTM) model, whose parameters and structure were determined experimentally, was constructed to test the performance of the proposed EHM-APT framework in predicting the carbon price. The main conclusions are summarized as follows.
Firstly, the proposed EHM-APT framework, which considers the shock of higher-order moments, can significantly improve the prediction accuracy of the carbon price compared with a framework that does not consider the shock of higher-order moments. This conclusion indicates that the shock of market asymmetry and extreme events stemming from the pricing factors is an indispensable factor for predicting and fitting the carbon price, which has been ignored by previous research. In fact, many studies have shown that co-skewness and co-kurtosis, which represent market asymmetry and extreme events, serve as pricing factors of financial assets [4]. Building on this idea, this paper shows that co-skewness and co-kurtosis are also important and indispensable factors for explaining the pricing mechanism of carbon assets.
Secondly, the Multi-LSTM model, whose parameters and structure were determined experimentally in this article (the structure of 18–64–64–1), is a suitable network for predicting the carbon price, and its performance is superior to that of the other benchmark models (that is, the Multi-GRU, RNN, MLP, GARCH, and BP models) on all of the evaluation criteria. This result suggests that the Multi-LSTM model is competitive for predicting the carbon price under the proposed EHM-APT framework. Therefore, we conclude that the model has advantages in effectively capturing and mapping the complex non-linear relationship between the carbon price and its pricing factors.
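As a rough indication of model size, the number of trainable parameters implied by an 18–64–64–1 structure can be worked out by hand. The sketch below assumes that this structure means 18 input variables, two stacked standard LSTM layers of 64 units each (four gates per layer, no peephole connections), and a single fully connected output unit; this reading is an assumption, and the helper names are hypothetical.

```python
def lstm_params(n_in, n_hidden):
    """Parameters of one standard LSTM layer: 4 gates, each with
    a weight matrix over [hidden, input] plus a bias vector."""
    return 4 * (n_hidden * (n_in + n_hidden) + n_hidden)

def dense_params(n_in, n_out):
    """Parameters of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out

layer_1 = lstm_params(18, 64)   # 4 * (64 * 82 + 64)  = 21248
layer_2 = lstm_params(64, 64)   # 4 * (64 * 128 + 64) = 33024
output = dense_params(64, 1)    # 64 + 1              = 65
total = layer_1 + layer_2 + output  # 54337
```

Under these assumptions the network has 54,337 trainable parameters, most of them in the second recurrent layer.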
This study predicts the carbon price based on structured pricing factors whose data are sourced from certain financial markets. As evidenced in this article, the proposed EHM-APT framework has good applicability and can provide a valuable reference for solving pricing problems in other financial markets as well. In addition to the higher-order moment terms, pricing factors that measure market asymmetry information and external shocks can also be explored by means of big data text mining. Future research can apply the proposed multi-factor pricing framework to non-structured pricing factors obtained by text information technology, such as investor sentiment, policy events, and others. This leads to another valuable research prospect: as the amount of data obtained through text mining or other machine learning technologies increases, how to achieve information fusion and the sharing of data features between structured and unstructured pricing factors within the established pricing framework becomes a key issue that remains to be resolved.