Article

Risk Analysis of the Chinese Financial Market with the Application of a Novel Hybrid Volatility Prediction Model

1
School of Economics and Management, Sanming University, Sanming 365004, China
2
School of International Economics and Management, Beijing Technology and Business University, Beijing 100048, China
3
Institute of Digital Economy, Beijing Technology and Business University, Beijing 100048, China
*
Author to whom correspondence should be addressed.
Mathematics 2023, 11(18), 3937; https://doi.org/10.3390/math11183937
Submission received: 22 August 2023 / Revised: 6 September 2023 / Accepted: 13 September 2023 / Published: 16 September 2023

Abstract

This paper endeavors to enhance the prediction of volatility in financial markets by developing a novel hybrid model that integrates generalized autoregressive conditional heteroskedasticity (GARCH) models and long short-term memory (LSTM) neural networks. Using high-frequency data, we first estimate realized volatility as a robust measure of volatility. We then feed the outputs of multiple GARCH models into an LSTM network, creating a hybrid model that leverages the strengths of both approaches. The predicted volatility from the hybrid model is used to generate trading strategy signals, which are subsequently used to build an investment strategy. Empirical analysis using the China Securities Index 300 (CSI300) dataset demonstrates that the hybrid model significantly improves value-at-risk (VaR) prediction performance compared to traditional GARCH models. This study’s findings have broad implications for risk management in financial markets, suggesting that hybrid models incorporating mathematical models and economic mechanisms can enhance derivative pricing, portfolio risk management, hedging transactions, and systemic risk early-warning systems.
MSC:
62P20; 91B05; 91B84

1. Introduction

The volatility of financial asset returns plays a key role in financial practice and is one of the core subjects of modern financial theory. As a barometer of the financial market, the predictable volatility of stock prices is crucial for risk management, financial supervision, portfolio optimization, and financial derivative pricing, and it has long been a research hotspot.
Since 2020, COVID-19 has spread all over the world. The pandemic has had a severe impact on the global economy. The CSI300 Index, which is considered the ‘Blue Chip’ index for the Mainland China stock exchange, has experienced a maximum drawdown of 33.52% since 2021, marking the largest drawdown since 2015. In February and March 2020, the S&P 500 Index plummeted five times, triggering a market meltdown. Many investors incurred losses due to the abnormal volatility in the financial market. Consequently, the tail risk of asset returns under extreme volatility has become the focus of scholarly research. Volatility plays a crucial role in various areas of finance, such as derivative pricing, portfolio risk management, hedging strategies, and systemic risk. Therefore, it is valuable for investors to utilize volatility information effectively in constructing their trading strategies.
Modeling the economic mechanism and pricing of assets has been a crucial task in economics, with various methods employed to estimate the mean and variance of prices. One such approach is the autoregressive conditional heteroskedastic (ARCH) model, introduced by Engle [1]. This model was extended by Bollerslev [19] through the development of the generalized ARCH (GARCH) model, which features a more general lag structure. Later, Nelson [3] proposed the exponential generalized ARCH (EGARCH) model, which models the logarithm of the conditional variance to better capture asymmetric responses to shocks. Despite their contributions, these traditional time series models rely on simplified assumptions that may not always hold in practice. As such, there is a need for alternative or supplementary methodologies that can address these limitations and provide more accurate volatility estimates.
The application of artificial neural networks (ANNs) in finance has gained significant traction in recent years, particularly in the areas of volatility prediction and stock market forecasting. Barunik and Krehlik [4] pioneered the use of ANNs in energy market volatility prediction, demonstrating improved accuracy with high-frequency data. Notably, Hochreiter and Schmidhuber [5] introduced long short-term memory (LSTM) algorithms, a type of recurrent neural network (RNN), which has since become a widely used tool for tackling complex tasks with long time lags. Chen et al. [6] successfully applied LSTM to predict Chinese stock returns, showcasing its potential in stock market prediction and suggesting an alternative to the volatility prediction strategies currently in use. Kim and Won [7] developed a hybrid model combining LSTM with multiple GARCH-type models to improve realized volatility forecasts for the KOSPI 200 index. Their findings indicated that the integrated model outperformed individual GARCH-type models. The increasing availability of high-frequency financial data has fueled research in this area, driving the development of novel techniques and architectures to harness the power of advanced machine learning methods. As data science continues to evolve, the intersection of AI and finance holds great promise for unlocking new insights and improving decision-making processes in the investment industry.
According to Hornik et al. [8], artificial neural network (ANN) models possess the ability to approximate continuous functions without imposing restrictions on the underlying data generation process, as demonstrated in D’Amato et al. [9]. Numerous studies have shown that ANN models outperform traditional GARCH-type models in volatility prediction because they can capture nonlinearity and do not require the series to be stationary (Tapia and Kristjanpoller [10]; Amirshahi and Lahmiri [11]). Notably, hybrid models combining deep learning and GARCH-type models exhibit superior performance compared to single deep learning or time series models (Kristjanpoller and Hernández [12]; Vidal and Kristjanpoller [13]). This paper contributes to the field by highlighting the significance of intelligent algorithms and economic connections in volatility prediction, offering a unique perspective on the interplay between these factors.
Realized volatility, a concept introduced by Andersen and Bollerslev [2], has revolutionized the way we measure and understand volatility in financial markets. By utilizing high-frequency sample data, realized volatility captures ex post volatility and provides a more comprehensive picture of market fluctuations compared to traditional measures. Building upon this concept, Shao and Yin [14] developed a realized volatility model and a realized range model, which were used to compute value at risk (VaR) using intraday high-frequency data. Their work demonstrated that models based on intraday data significantly outperform those relying on daily returns, highlighting the importance of high-frequency data in volatility modeling. Furthermore, Kuster et al. [15] emphasized the critical role of accurate volatility predictions in estimating VaR, underscoring the significance of developing sophisticated models capable of capturing the complexity of modern financial markets.
This paper offers several significant contributions to the field of financial risk management. First, it employs multiple models to study realized volatility (RV), thereby enhancing the accuracy and robustness of predictions. Second, it performs out-of-sample forecasts to evaluate the performance of the developed models. Third, it utilizes estimated value at risk (VaR) to conduct risk analysis. Fourth, it combines artificial intelligence algorithms and traditional volatility models, not only improving model performance but also highlighting the relevance of each variable. Lastly, it provides a reference model for investment and risk management that can contribute to market pricing efficiency and stability.
Previous studies have demonstrated the superiority of hybrid models combining deep learning and GARCH-type models in volatility forecasting for cryptocurrencies (Amirshahi and Lahmiri [11]; Kristjanpoller and Minutolo [16]). In contrast, our study applies this approach to the stock market, which has a larger market value and greater practical significance. While Ramos-Pérez et al. [17] and Liu [18] utilized hybrid models to predict volatility in the S&P 500 and Kim and Won [7] examined the volatility of the Korean stock price index (KOSPI 200), these studies neglected the underlying economic mechanisms driving volatility. Our research addresses this gap by incorporating economic insights into hybrid artificial intelligence algorithms, rendering it the first study to bridge this divide. By doing so, we expand upon existing research and underscore the significance of economic variables and econometric models in volatility analysis.
The organization of this paper is as follows: In Section 2, we conduct a literature review of relevant studies on realized volatility, GARCH-type models, and LSTM. We then propose a hybrid model that combines these approaches to better capture volatility and predict it. Next, we outline the basic models used in our study. Section 3 presents the empirical results of our models and compares them with traditional models. In Section 4, we discuss the potential applications of our models, including systemic risk prediction and portfolio management, as well as robustness tests. Finally, we conclude with a summary of our findings and implications for future research in Section 5.

2. Materials and Methods

2.1. Realized Volatility

To assess the accuracy of our predictions, we compare our forecasted volatility values with the actual realized volatility, which serves as the target variable for our supervised learning algorithm. Our calculation of realized volatility draws on the method introduced by Andersen and Bollerslev [2], defined as
$$RV_t^d = \sum_{i=1}^{M} r_{t,i}^2 \tag{1}$$
where $M$ is the number of observations within a day, $r_{t,i} = 100 \times \ln(P_{t,i} / P_{t,i-1})$, $P_{t,i}$ denotes the i-th close price on the t-th day, and $r_{t,i}$ denotes the i-th return on the t-th day. This formula yields the daily realized volatility measure, RV, which we use for comparison with our predicted volatility values.
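As a minimal sketch of the realized volatility calculation, the following Python function sums the squared intraday log returns (scaled by 100) for one trading day. The function name and the price values are hypothetical and serve only as an illustration; they are not taken from the CSI300 dataset.

```python
import math

def realized_volatility(prices):
    """Daily realized volatility: sum of squared intraday log returns (in percent)."""
    # r_{t,i} = 100 * ln(P_{t,i} / P_{t,i-1}) for consecutive intraday close prices
    returns = [100.0 * math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return sum(r * r for r in returns)

# Hypothetical 5 min close prices for a single day (illustrative values only)
example_prices = [4000.0, 4004.0, 3998.0, 4001.0, 4010.0]
example_rv = realized_volatility(example_prices)
```

In practice, the function would be applied once per trading day to that day's sequence of 5 min close prices.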

2.2. Models

2.2.1. GARCH and EGARCH

Bollerslev [19] introduced the generalized autoregressive conditional heteroscedasticity (GARCH) model, which is mathematically equivalent to the ARCH-infinite model. The standard GARCH (1,1) model is
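$$y_t = \varphi x_t + \mu_t, \quad \mu_t \sim N(0, \sigma_t^2) \tag{2}$$
$$\sigma_t^2 = \mathrm{Var}(y_t \mid I_{t-1}) = \alpha_0 + \alpha_1 \mu_{t-1}^2 + \beta_1 \sigma_{t-1}^2 \tag{3}$$
where $y_t$ is a given stochastic time series and $\mu_t$ is its innovation (error term). $\sigma_t^2$, $I_t$, and $N(0, \sigma_t^2)$ denote the conditional variance at time t, the information available up to time t, and the Gaussian distribution with mean zero and variance $\sigma_t^2$, respectively, and all coefficients are set to be non-negative.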
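The GARCH (1,1) variance recursion can be sketched in a few lines of Python. The sketch assumes the textbook form, in which the conditional variance depends on the previous squared innovation (coefficient alpha1) and the previous conditional variance (coefficient beta1). It also assumes that the recursion starts at the unconditional variance, which is a common convention rather than something stated in this paper. The function name is hypothetical.

```python
def garch11_variance(residuals, alpha0, alpha1, beta1):
    """Conditional variances: sigma2_t = alpha0 + alpha1 * mu_{t-1}^2 + beta1 * sigma2_{t-1}."""
    if alpha1 + beta1 >= 1.0:
        raise ValueError("alpha1 + beta1 must be below 1 for a finite unconditional variance")
    # Assumption: initialize with the unconditional variance alpha0 / (1 - alpha1 - beta1)
    sigma2 = [alpha0 / (1.0 - alpha1 - beta1)]
    for mu_prev in residuals[:-1]:
        sigma2.append(alpha0 + alpha1 * mu_prev ** 2 + beta1 * sigma2[-1])
    return sigma2

# Illustrative parameters only; in practice they are estimated by maximum likelihood
example_sigma2 = garch11_variance([0.5, -1.2, 0.3, 0.8], alpha0=0.1, alpha1=0.1, beta1=0.8)
```

A large innovation on one day raises the next day's conditional variance through alpha1, and beta1 makes that increase decay gradually, which is how the model expresses shock magnitude and persistence.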
The exponential GARCH (EGARCH) model was proposed by Nelson [3]. Because it models the logarithm of the conditional variance, EGARCH allows the coefficients to be negative. The asymmetry of volatility is characterized by the parameter $\gamma$, and the conditional variance equation in the EGARCH (1,1) model is:
$$\ln(\sigma_t^2) = \alpha_0 + \beta_1 \ln(\sigma_{t-1}^2) + \alpha_1 \frac{|\mu_{t-1}|}{\sigma_{t-1}} + \gamma \frac{\mu_{t-1}}{\sigma_{t-1}} \tag{4}$$
Compared to the ARCH and GARCH models, the EGARCH model reduces the constraints on parameters. The EGARCH model is more flexible, and the leverage effect is achieved through $\alpha_1 |\mu_{t-1}| / \sigma_{t-1} + \gamma \mu_{t-1} / \sigma_{t-1}$. Furthermore, we denote the EGARCH model with the generalized error distribution (GED) as the GED-GARCH model.
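To illustrate the leverage effect, the sketch below evaluates one step of the EGARCH (1,1) log-variance equation for a positive and a negative shock of equal size. The function name and the parameter values are hypothetical and chosen only for illustration; with a negative gamma, the negative shock produces the larger conditional variance.

```python
import math

def egarch11_log_variance(mu_prev, sigma2_prev, alpha0, alpha1, beta1, gamma):
    """One step of ln(sigma2_t) for EGARCH (1,1), given the previous innovation and variance."""
    z = mu_prev / math.sqrt(sigma2_prev)  # standardized previous innovation
    return alpha0 + beta1 * math.log(sigma2_prev) + alpha1 * abs(z) + gamma * z

# Illustrative parameters only; gamma < 0 means bad news raises volatility more
params = dict(alpha0=0.0, alpha1=0.1, beta1=0.9, gamma=-0.05)
log_var_after_gain = egarch11_log_variance(1.0, 1.0, **params)
log_var_after_loss = egarch11_log_variance(-1.0, 1.0, **params)
```

Since the recursion is written for the logarithm of the variance, the variance itself stays positive for any sign of the coefficients.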

2.2.2. LSTM

Recurrent neural networks (RNNs) are employed to predict sequential data. They comprise input, hidden, and output layers and can be unrolled to a depth that matches the length of the input sequence. However, classical RNNs suffer from the vanishing gradient problem, which long short-term memory (LSTM) networks address. Unlike feedforward neural networks, RNNs possess feedback connections, enabling them to capture temporal patterns effectively. As a result, RNNs are well suited for modeling sequential data and time series analyses. In fact, studies have shown that LSTM models outperform feedforward neural networks in financial time series forecasting. For instance, Maknickienė and Maknickas [20] utilized an LSTM model to predict exchange rates and foreign exchange trading, demonstrating improved prediction performance compared to feedforward neural networks. Similarly, Chen et al. [6] applied an LSTM model to predict returns in the Chinese stock market, yielding better results than the random prediction method. These findings suggest that LSTM models excel as financial time series models due to their ability to capture complex temporal relationships. The forward pass of LSTM for the input data and hidden state at time step t can be formulated as follows:
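$$
\begin{aligned}
i_t &= \sigma(W_1 X + b_1) \\
f_t &= \sigma(W_2 X + b_2) \\
o_t &= \sigma(W_3 X + b_3) \\
g_t &= \tanh(W_4 X + b_4) \\
c_t &= c_{t-1} \times f_t + g_t \times i_t \\
h_t &= \tanh(c_t) \times o_t
\end{aligned} \tag{5}
$$
where $W_i$ and $b_i$ are weights and bias terms, respectively, and $X = (x_t, h_{t-1})^{\top}$ stacks the input $x_t$ at time step t and the previous hidden state $h_{t-1}$. The functions $\sigma$ and $\tanh$ are defined by $\sigma(x) = 1 / (1 + e^{-x})$ and $\tanh(x) = (e^{x} - e^{-x}) / (e^{x} + e^{-x})$.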
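The gate equations can be sketched as a single forward step in plain Python. The sketch assumes that X is the concatenation of the current input and the previous hidden state, and that the products in the cell and hidden state updates are elementwise. The helper names (`affine`, `lstm_step`) and the layout of `W` and `b` (four parameter sets in the order i, f, o, g) are hypothetical choices for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, X, b):
    """Return W X + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, X)) + bias for row, bias in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM forward step; W[k], b[k] are the parameters of gates i, f, o, g (k = 0..3)."""
    X = list(x_t) + list(h_prev)  # stacked input and previous hidden state
    i_t = [sigmoid(v) for v in affine(W[0], X, b[0])]    # input gate
    f_t = [sigmoid(v) for v in affine(W[1], X, b[1])]    # forget gate
    o_t = [sigmoid(v) for v in affine(W[2], X, b[2])]    # output gate
    g_t = [math.tanh(v) for v in affine(W[3], X, b[3])]  # candidate cell state
    c_t = [c * f + g * i for c, f, g, i in zip(c_prev, f_t, g_t, i_t)]
    h_t = [math.tanh(c) * o for c, o in zip(c_t, o_t)]
    return h_t, c_t

# Tiny example: 2 inputs, 1 hidden unit, all parameters zero (illustrative only)
zero_W = [[[0.0, 0.0, 0.0]] for _ in range(4)]
zero_b = [[0.0] for _ in range(4)]
h_example, c_example = lstm_step([0.3, -0.1], [0.0], [2.0], zero_W, zero_b)
```

With all parameters at zero, every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved. This makes the role of the forget gate in carrying information across time steps easy to see.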

2.2.3. Proposed Hybrid Models

Artificial neural network (ANN) models possess the ability to approximate continuous functions without imposing restrictive assumptions on the underlying data generation process, as demonstrated by Hornik et al. [8] and D’Amato et al. [9]. Furthermore, various studies have explored the integration of ANNs and GARCH-type models to enhance stock market volatility predictions, as shown by Kim and Won [7]. Additionally, it has been shown that utilizing information from multiple GARCH-type models as inputs leads to better performance than relying on a single GARCH model [7]. Building upon these findings, our study proposes a novel approach to combining deep neural networks with econometric models. Our proposed method expands upon previous hybrid models (Roh [21]; Wang [22]; Hajizadeh et al. [23]; Kristjanpoller et al. [24]; Kristjanpoller and Minutolo [25]) by incorporating multiple econometric variables and GARCH-type models with neural networks. We assume that various economic characteristic information, such as volatility shock magnitude, persistence, and direction, can be acquired from GARCH (1,1) and EGARCH models. By inputting this information into a long short-term memory (LSTM) network, we can leverage its ability to learn high-level temporal patterns in time series data, thus improving predictive accuracy.
To validate our hypothesis, we compare the performance of a hybrid model combining economic variables and GARCH-type models versus multiple GARCH models. Our experiments use three evaluation metrics (mean absolute error, root mean square error, and mean absolute percentage error) to assess the models’ performance in predicting the realized volatility of China Securities Index 300 (CSI300) data. Notably, our study employs a more sophisticated LSTM architecture than previous research (Roh [21]; Wang [22]; Hajizadeh et al. [23]; Fuertes et al. [26]), allowing it to learn long-range dependencies and more intricate patterns.
Furthermore, we contribute to the literature by exploring the application of deep neural networks and LSTMs in finance, as recent works have focused primarily on shallow neural networks [12]. Our study’s results are consistent with the findings of Oliveira, Cortez, and Areal [27], who employed sentiment and attention indicators from microblogging data to develop a method for predicting returns, volatility, and trading volume. Similarly, Yao et al. [28] proposed a hybrid model that combined the outputs of an autoregressive neural network and a GARCH-type model, showing superior performance in realized volatility prediction compared to single models. By building upon these studies and integrating deep neural networks and econometric models, our research offers a novel approach to enhancing stock market volatility predictions through the combination of multiple GARCH-type models and cutting-edge machine learning techniques.
This paper’s unique contribution lies in its incorporation of macroeconomic variables into a hybrid model for volatility prediction, thereby extending beyond traditional algorithm-centric approaches. Recognizing the interplay between stock prices and macroeconomic factors such as interest rates, inflation, industrial production indices, and economic growth, we integrate these variables into our model to enhance its accuracy and robustness. By including RATE, CPI, CSI, and GROWTH in our model, we demonstrate the feasibility and effectiveness of considering macroeconomic dynamics in stock price prediction, ultimately contributing to a deeper understanding of the underlying mechanisms. This innovative approach opens up new possibilities for future research and practical applications in the field.

2.2.4. Variables

Table 1 shows the definition of variables in our paper in detail.
As shown in Table 1, $r_{t,i}$ denotes the i-th 5 min return of the CSI300 Index on the t-th day. The dependent variable is RV, the daily realized volatility of the CSI300 Index on the t-th day, calculated from $r_{t,i}$ using Equation (1).
$r_t$ denotes the daily return of the CSI300 Index on the t-th day, based on which we obtain the predicted volatility using GARCH-type models. Macro variables reflect the economic mechanism affecting asset pricing and thus volatility. As a result, we include RATE, CPI, CSI, and GROWTH in our hybrid model to improve its precision and robustness. RATE is the interbank offered (lending) rate, which is observed monthly. CPI is the consumer price index, which reflects the price level. CSI is the consumer sentiment index, which reflects the sentiment of consumers and traders. GROWTH is the national industrial growth rate, which reflects the state of the real economy.
Firstly, we use $r_t$ to obtain volatility predictions (RV-GARCHs) from the GARCH-type models in Equations (2)–(4), and we compare the RV-GARCHs with RV. Secondly, we feed the RV-GARCHs and macro variables into the LSTM model in Equation (5) to obtain a hybrid prediction of volatility (RV-Hybrid). Our hybrid model has 13 input variables: RV-GARCH, based on the GARCH model, and its first and second lags; RV-EGARCH, based on the EGARCH model, and its first and second lags; RV-GED-GARCH, based on the GED-GARCH model, and its first and second lags; RATE; CPI; CSI; and GROWTH. Finally, we discuss the application of our model. Volatility is one of the most important measurements in asset pricing, and predicting the volatility of financial assets plays an important role in risk management, with applications in derivative pricing, portfolio risk management, hedging strategies, and systemic risk alerts.
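The assembly of the 13 inputs can be sketched as follows. This is an illustration under stated assumptions, not the authors' preprocessing code: the function name is hypothetical, the three forecast series are assumed to be daily lists of equal length, and the macro variables are assumed to have already been aligned to the daily index.

```python
def build_features(garch, egarch, ged_garch, macro):
    """Assemble 13 inputs per day: each forecast with its two lags, plus four macro values."""
    rows = []
    for t in range(2, len(garch)):  # the first two days have no second lag
        row = []
        for series in (garch, egarch, ged_garch):
            row += [series[t], series[t - 1], series[t - 2]]
        row += [macro[name][t] for name in ("RATE", "CPI", "CSI", "GROWTH")]
        rows.append(row)
    return rows

# Hypothetical values for five days (illustrative only)
days = 5
example_macro = {name: [1.0] * days for name in ("RATE", "CPI", "CSI", "GROWTH")}
example_rows = build_features([0.1] * days, [0.2] * days, [0.3] * days, example_macro)
```

Each row would then be paired with the realized volatility of the same day as the supervised learning target.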

3. Results

This section describes the data and reports the out-of-sample volatility prediction results of the GARCH-type models and the hybrid model.

3.1. Data

The historical 5 min trading data of the CSI300 Index and macro variables employed in this study were sourced from JoinQuant. The CSI300 Index is designed to mirror the performance of the top 300 stocks listed on the Shanghai Stock Exchange and Shenzhen Stock Exchange. Our dataset encompasses 68,976 5 min data points and 1437 daily data points (48 5 min bars per trading day) spanning the period of 23 August 2016 to 22 July 2022. To train the LSTM model, we used 90% of the training set for model fitting and held out the remaining 10% as a validation set for hyperparameter tuning. The in-sample period ranges from 23 August 2016 to 18 January 2021, while the out-of-sample period covers 21 January 2021 to 22 July 2022.
Table 2 shows the descriptive statistics of the return and realized volatility (RV) of the CSI300 Index. The mean daily RV of the CSI300 Index is 0.0082, with a standard deviation of 0.0035; the minimum RV is 0.0025, and the maximum is 0.0296. The mean daily return of the CSI300 Index is 0.0002, with a standard deviation of 0.0120; the minimum daily return is −0.0821, and the maximum is 0.0578. The ADF test statistic of the return series is −19, which rejects the unit-root null hypothesis at the 1% level, so the return series is stationary.

3.2. Volatility Prediction

This study uses the rolling-time-window technique for volatility forecasting. We first train three individual GARCH-type models: GARCH (1,1), EGARCH (1,1), and GED-EGARCH (1,1). We then integrate these models with LSTM and macro variables, namely the interbank offered rate, the consumer price index (CPI), industrial growth, and consumer sentiment, to create a hybrid model. To accommodate data limitations, we apply a one-period lag to the macro variables. Finally, we evaluate the out-of-sample predictive performance using three loss functions: mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE).
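For reference, the three loss functions can be written in a few lines of Python using their standard definitions. The sketch assumes that MAPE is reported as a fraction rather than a percentage, which is an assumption made here to match the magnitude of the MAPE values reported in the text.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be nonzero."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

Realized volatility is strictly positive in this setting, so the division in `mape` is well defined.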
According to Table 3, the GARCH model exhibits the strongest performance among the GARCH-type models, with a mean absolute error (MAE) of 0.0035, root mean square error (RMSE) of 0.0043, and mean absolute percentage error (MAPE) of 0.4483. In comparison, the EGARCH model has a slightly higher MAE of 0.0041, while the GED-EGARCH model has an MAE of 0.0040, both of which are inferior to the GARCH model’s performance. Similarly, the RMSE and MAPE measurements also indicate that the GARCH model outperforms the other two models.
Our hybrid model, which integrates LSTM and GARCH-type models with macro variables, exhibits superior performance compared to the standalone GARCH model, as demonstrated in Table 3. Specifically, the hybrid model achieves a mean absolute error (MAE) of 0.0020, root mean square error (RMSE) of 0.0027, and mean absolute percentage error (MAPE) of 0.2233. These values represent improvements of 43%, 37%, and 50% over the GARCH model, respectively. The table clearly shows that the inclusion of macro variables in the hybrid model leads to the most accurate predictions.
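The reported improvements can be checked directly from the error values quoted above, computing each improvement as the relative reduction in error with respect to the GARCH model. The dictionary names are hypothetical; the numbers are the ones given in the text.

```python
# Out-of-sample errors quoted in the text for the CSI300 Index
garch_errors = {"MAE": 0.0035, "RMSE": 0.0043, "MAPE": 0.4483}
hybrid_errors = {"MAE": 0.0020, "RMSE": 0.0027, "MAPE": 0.2233}

# Relative reduction in error: (GARCH - hybrid) / GARCH
improvement = {
    name: (garch_errors[name] - hybrid_errors[name]) / garch_errors[name]
    for name in garch_errors
}
improvement_percent = {name: round(100 * value) for name, value in improvement.items()}
```

Rounded to whole percentages, the three reductions are 43, 37, and 50, in line with the figures stated in the text.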
According to Kim and Won [7], the mean absolute error (MAE) of the GEW-LSTM model, which combines an LSTM with the GARCH, EGARCH, and EWMA models, is 0.0107, a 37.2% reduction compared to the E-DFN model (0.017), which combines EGARCH with a deep feedforward network. The GEW-LSTM model also exhibits superior performance in terms of mean square error (MSE), heteroscedasticity-adjusted MAE (HMAE), and heteroscedasticity-adjusted MSE (HMSE), with reductions of 57.3%, 24.7%, and 48%, respectively. Our models, which integrate macro variables, show even lower MAEs than those reported by Kim and Won [7], although the two studies use different indexes and sample periods. Moreover, our hybrid models outperform GARCH-type models, suggesting their superiority in predicting stock market volatility.
Figure 1 compares predicted and realized volatility for the GARCH-type models (part a) and the hybrid model (part b). In both parts, the predictions are plotted against the realized volatility, which serves as the target value in this study.

4. Discussion

4.1. VaR Analysis

Value at risk (VaR) is a widely used metric in risk management, measuring the potential loss of investments within a specified time frame (typically a day) and probability. VaR provides a quantitative assessment of the potential downside risk associated with a portfolio, enabling investors and financial institutions to make informed decisions regarding their exposure to market fluctuations. By estimating the maximum potential loss at a given confidence level, VaR serves as a valuable tool for managing and mitigating risks in various financial contexts. It is defined as follows:
$$P\left(r_{t+1} > VaR_{t+1}^{\alpha}\right) = 1 - \alpha \tag{6}$$
$$VaR_{t+1}^{\alpha} = \mu + t_{\alpha} \sigma_{t+1} \tag{7}$$
in which $\mu$ denotes the mean of the CSI300 Index return, $t_{\alpha}$ denotes the $\alpha$ quantile of the distribution of the return time series, and $\sigma_{t+1}$ is obtained from the model we have built.
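A minimal sketch of the VaR calculation is given below. The text takes the quantile from the distribution of the return series; when no quantile is supplied, the sketch falls back to the standard normal quantile, which is an assumption made here for illustration only. The function name and the input values are hypothetical.

```python
from statistics import NormalDist

def value_at_risk(mu, sigma_next, alpha, quantile=None):
    """VaR = mu + t_alpha * sigma_next, where t_alpha is the alpha quantile."""
    # Assumption: use the standard normal quantile unless an empirical one is supplied
    t_alpha = NormalDist().inv_cdf(alpha) if quantile is None else quantile
    return mu + t_alpha * sigma_next

# Illustrative inputs: mean return 0.0002, predicted volatility 0.0082
var_99 = value_at_risk(0.0002, 0.0082, alpha=0.01)  # 99% confidence level
var_90 = value_at_risk(0.0002, 0.0082, alpha=0.10)  # 90% confidence level
```

For small alpha the quantile is negative, so the resulting VaR is a negative return, and the 99% VaR lies further in the left tail than the 90% VaR.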
This study uses both realized volatility and the volatility predicted by the hybrid model, which combines LSTM and GARCH-type models, to compute value at risk (VaR). To check the reliability of our findings, we consider two confidence levels, 90% and 99%. Using a rolling-window approach with a fixed window size of twenty-two trading days, we generate one-day-ahead VaR forecasts. The estimated VaR serves as the foundation for a trading strategy in which a negative VaR represents a potential loss:
$$
\mathrm{Signal}_{t+1} =
\begin{cases}
0, & VaR_{t+1}^{\alpha} < -0.2 \\
1, & VaR_{t+1}^{\alpha} \geq -0.2
\end{cases} \tag{8}
$$
We utilize a hitting series function to transform the predicted VaR values into binary signals, where a signal of 1 indicates a potential loss of less than 20% and prompts a long position in the CSI300 Index, while a signal of 0 suggests selling positions to hold cash only. Based on this approach, we develop a trading strategy and evaluate its performance using a holdout method. As displayed in Table 4, under a 99% confidence level and leveraging the volatility predictions of our hybrid model, we achieve a cumulative return of −0.0142, outperforming the returns of −0.1898 and −0.228 obtained using RV and CSI300 alone. Similarly, under a 90% confidence level, our hybrid model yields a cumulative return of −0.2075, surpassing the returns of −0.2494 and −0.228 derived from RV and CSI300. Notably, the strategy was tested using out-of-sample data, demonstrating the significant improvement in performance offered by our hybrid model compared to the simple trading strategy.
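The signal rule and the resulting strategy return can be sketched as follows. The sketch assumes a threshold of −0.2, so that a signal of 1 corresponds to a potential loss of less than 20%, and it assumes that daily simple returns are compounded on days when the position is held. The compounding convention and the function names are assumptions made for illustration; the text does not state how the cumulative return is aggregated.

```python
def trading_signal(var_next, threshold=-0.2):
    """Return 1 (hold the index) if the predicted VaR is at or above the threshold, else 0 (cash)."""
    return 1 if var_next >= threshold else 0

def cumulative_return(returns, signals):
    """Compound daily simple returns earned only on days with signal 1."""
    wealth = 1.0
    for r, s in zip(returns, signals):
        wealth *= 1.0 + s * r
    return wealth - 1.0

# Hypothetical predicted VaR values and next-day returns (illustrative only)
example_signals = [trading_signal(v) for v in (-0.1, -0.3, -0.2)]
example_total = cumulative_return([0.1, -0.5, 0.1], example_signals)
```

In this toy example, the strategy moves to cash on the second day and avoids the large loss, which is the mechanism by which a more accurate VaR forecast can improve the cumulative return.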
Figure 2 presents the cumulative returns during the out-of-sample period obtained by applying our trading strategy based on the predicted VaR values. The graph plots the cumulative return, without trading fees, against the date, with the yellow line representing the actual return of the CSI300 Index. The blue line depicts the cumulative return of the strategy employing RV, while the red line illustrates the cumulative return of the strategy based on our hybrid model. Under a 99% confidence level, the hybrid model yields superior performance compared to RV, and the 90% confidence level produces comparable findings. These results support the effectiveness of our approach in mitigating risks and improving investment returns.

4.2. Robustness Tests

In addition, this hybrid model can be applied to different time periods with various sources of outside noise. For example, to explore how it works without the COVID-19 effect, we test our model on a sample that excludes the pandemic period: the in-sample period ranges from 23 August 2016 to 4 July 2018, while the out-of-sample period covers 5 July 2018 to 31 December 2019, during which the financial market in China was not affected by COVID-19. As demonstrated in Table 5, our hybrid model achieves a mean absolute error (MAE) of 0.0025, root mean square error (RMSE) of 0.0031, and mean absolute percentage error (MAPE) of 0.3382. These values represent improvements of 45%, 36%, and 53% over the GARCH model, respectively. Because these improvements are close to those obtained for the sample that includes the COVID-19 period (see Table 3), the predictive performance of our hybrid model appears to be relatively insensitive to such outside noise.
This hybrid model can also be applied to different stock indexes or markets. We show the results of variance prediction models applied in the CSI 50 stock index with the same sample period as Table 3. As demonstrated in Table 6, our hybrid model achieves a mean absolute error (MAE) of 0.0025, root mean square error (RMSE) of 0.0035, and mean absolute percentage error (MAPE) of 0.2495. These values represent improvements of 40%, 36%, and 38% over the GARCH model, respectively. Compared with the results of the CSI300 Index (see Table 3), our hybrid model shows a wide range of applications in different assets.
The results of Table 3, Table 5 and Table 6, collectively, demonstrate the superiority of our hybrid model in yielding the most precise predictions. Our model’s robustness is evident in its consistent performance across various datasets and experimental conditions, lending credence to its reliability and effectiveness in real-world applications. This finding highlights the advantages of integrating multiple approaches, as our hybrid model capitalizes on the strengths of its component models to produce improved forecasts.

5. Conclusions

This study introduces a novel hybrid model that integrates multiple GARCH-type models with long short-term memory (LSTM) networks to capture a wide range of economic characteristics. The GARCH (1,1) and EGARCH models are employed to reflect the magnitude of volatility shocks, the persistence of volatility, and leverage effects. These features are then fed into an LSTM network, which exhibits remarkable capabilities in identifying high-level temporal patterns in time series data. Furthermore, the incorporation of macroeconomic variables, such as interbank lending rates and consumer price indices, provides valuable information for long-term risk assessment. Our comprehensive evaluation of the hybrid model’s performance, conducted using three distinct loss functions and CSI300 Index data, demonstrates its superiority over single GARCH-type models in predicting realized volatility. The hybrid model’s ability to learn from multiple sources of information enhances its predictive accuracy, making it a promising tool for financial risk management.
The hybrid model, which synergistically combines GARCH-type models and LSTM, yields a substantial improvement in prediction performance compared to single GARCH-type models. Additionally, incorporating macro variables, such as interbank lending rates, as inputs to the LSTM model further enhances its predictive accuracy. Statistical comparisons reveal that the hybrid model with the optimal macro variable selection achieves improvements of 43%, 37%, and 50% in mean absolute error (MAE), root mean squared error (RMSE), and mean absolute percentage error (MAPE), respectively, relative to the best-performing single GARCH model. Consequently, the out-of-sample prediction error of the hybrid model is demonstrated to be the lowest across all evaluation metrics, underscoring its superior forecasting capability.
The present study’s empirical findings demonstrate that the proposed hybrid model significantly enhances value-at-risk (VaR) prediction performance for the CSI300 Index. Under a basic trading strategy based on the predicted VaR values, the cumulative return exceeds that of the CSI300 Index at both the 90% and 99% confidence levels (Table 4). These results suggest that the hybrid model offers considerable potential for practical applications in finance, contributing to risk management and investment decision-making. The methodology and conclusions presented in this study also provide a basis for future research that extends this approach.
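To make the link between a volatility forecast and a VaR threshold concrete, the following Python sketch computes a one-day parametric VaR and flags the days on which the realized return breaches it. It assumes normally distributed, zero-mean returns, which is a simplification of our own and not necessarily the specification used in this study; the forecasts and returns are made-up numbers, and the trading rule itself is not reproduced here.

```python
from statistics import NormalDist

def parametric_var(sigma, confidence, mean=0.0):
    """One-day VaR, expressed as a positive loss threshold.

    Assumes normally distributed returns (an illustrative simplification).
    """
    z = NormalDist().inv_cdf(confidence)
    return z * sigma - mean

def violations(returns, var_series):
    """True on days where the realized loss exceeds the VaR threshold."""
    return [r < -v for r, v in zip(returns, var_series)]

# Hypothetical volatility forecasts and realized returns, both in %.
predicted_vol = [0.8, 1.1, 0.9, 1.5]
realized_ret = [-0.5, -2.9, 0.3, -1.0]

for confidence in (0.90, 0.99):
    var_series = [parametric_var(s, confidence) for s in predicted_vol]
    hits = violations(realized_ret, var_series)
    print(confidence, [round(v, 2) for v in var_series], sum(hits))
```

A higher confidence level gives a wider threshold, so fewer days count as violations for the same volatility forecast.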
A limitation of our study is that the models must be estimated separately for each asset, which precludes a universal application. Nevertheless, our findings offer valuable insights for traders and market participants, who can use our framework to evaluate the volatility of their portfolio holdings and determine the critical level at which to adjust their positions, thereby managing risk more effectively. While our analysis has focused on two prominent Chinese stock indexes, the models can also be applied to other assets, including those in the US stock market, which we leave for future research.

Author Contributions

Conceptualization, Y.W. and W.W.; methodology, Y.W.; formal analysis, Y.W.; investigation, W.W.; resources, Y.W. and W.W.; data curation, Y.W.; writing—original draft preparation, Y.W. and W.W.; writing—review and editing, W.W.; supervision, W.W.; funding acquisition, W.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research and the APC were funded by the National Social Science Foundation of China (Youth Program), grant number 22CJY021.

Data Availability Statement

Not applicable.

Acknowledgments

We thank Zeyu Xia for helpful discussions on topics related to this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Engle, R. Autoregressive conditional heteroskedasticity with estimates of the variance of U.K. inflation. Econometrica 1982, 50, 987–1008. [Google Scholar] [CrossRef]
  2. Andersen, T.G.; Bollerslev, T. Answering the Skeptics: Yes, Standard Volatility Models Do Provide Accurate Forecasts. Int. Econ. Rev. 1998, 39, 885. [Google Scholar] [CrossRef]
  3. Nelson, D.B. Conditional heteroskedasticity in asset returns: A new approach. Econom. J. Econom. Soc. 1991, 59, 347–370. [Google Scholar] [CrossRef]
  4. Baruník, J.; Křehlík, T. Combining high frequency data with non-linear models for forecasting energy market volatility. Expert Syst. Appl. 2016, 55, 222–242. [Google Scholar] [CrossRef]
  5. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  6. Chen, K.; Zhou, Y.; Dai, F. A LSTM-based method for stock returns prediction: A case study of China stock market. In Proceedings of the 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 29 October–1 November 2015; pp. 2823–2824. [Google Scholar]
  7. Kim, H.Y.; Won, C.H. Forecasting the volatility of stock price index: A hybrid model integrating LSTM with multiple GARCH-type models. Expert Syst. Appl. 2018, 103, 25–37. [Google Scholar] [CrossRef]
  8. Hornik, K.; Stinchcombe, M.; White, H. Multilayer feedforward networks are universal approximators. Neural Netw. 1989, 2, 359–366. [Google Scholar] [CrossRef]
  9. D’amato, V.; Levantesi, S.; Piscopo, G. Deep learning in predicting cryptocurrency volatility. Phys. A Stat. Mech. Appl. 2022, 596, 127158. [Google Scholar] [CrossRef]
  10. Tapia, S.; Kristjanpoller, W. Framework based on multiplicative error and residual analysis to forecast bitcoin intra-day-volatility. Phys. A Stat. Mech. Appl. 2022, 589, 126613. [Google Scholar] [CrossRef]
  11. Amirshahi, B.; Lahmiri, S. Hybrid deep learning and GARCH-family models for forecasting volatility of cryptocurrencies. Mach. Learn. Appl. 2023, 12, 100465. [Google Scholar] [CrossRef]
  12. Kristjanpoller, R.W.; Hernández, P.E. Volatility of main metals forecasted by a hybrid ANN-GARCH model with regressors. Expert Syst. Appl. 2017, 84, 290–300. [Google Scholar] [CrossRef]
  13. Vidal, A.; Kristjanpoller, W. Gold volatility prediction using a CNN-LSTM approach. Expert Syst. Appl. 2020, 157, 113481. [Google Scholar] [CrossRef]
  14. Shao, X.D.; Yin, L.Q. The study on financial market risk measures in China based on realized range and realized volatility. J. Financ. Res. 2008, 6, 109–121. [Google Scholar]
  15. Kuster, K. Value-at-Risk Prediction: A Comparison of Alternative Strategies. J. Financ. Econom. 2006, 4, 53–89. [Google Scholar] [CrossRef]
  16. Kristjanpoller, W.; Minutolo, M.C. A hybrid volatility forecasting framework integrating GARCH artificial neural network, technical analysis and principal components analysis. Expert Syst. Appl. 2018, 109, 1–11. [Google Scholar] [CrossRef]
  17. Ramos-Pérez, E.; Alonso-González, P.J.; Núñez-Velázquez, J.J. Forecasting volatility with a stacked model based on a hybridized Artificial Neural Network. Expert Syst. Appl. 2019, 129, 1–9. [Google Scholar] [CrossRef]
  18. Liu, Y. Novel volatility forecasting using deep learning–Long Short Term Memory Recurrent Neural Networks. Expert Syst. Appl. 2019, 132, 99–109. [Google Scholar] [CrossRef]
  19. Bollerslev, T. Generalized autoregressive conditional heteroskedasticity. J. Econ. 1986, 31, 307–327. [Google Scholar] [CrossRef]
  20. Maknickienė, N.; Maknickas, A. Application of neural network for forecasting of exchange rates and forex trading. In Proceedings of the 7th International Scientific Conference Business and Management, Vilnius, Lithuania, 10–11 May 2012; pp. 10–11. [Google Scholar]
  21. Roh, T.H. Forecasting the volatility of stock price index. Expert Syst. Appl. 2007, 33, 916–922. [Google Scholar] [CrossRef]
  22. Wang, Y.-H. Nonlinear neural network forecasting model for stock index option price: Hybrid GJR–GARCH approach. Expert Syst. Appl. 2009, 36, 564–570. [Google Scholar] [CrossRef]
  23. Hajizadeh, E.; Seifi, A.; Zarandi, M.F.; Turksen, I.B. A hybrid modeling approach for forecasting the volatility of S&P 500 index return. Expert Syst. Appl. 2012, 39, 431–436. [Google Scholar]
  24. Kristjanpoller, W.; Fadic, A.; Minutolo, M.C. Volatility forecast using hybrid Neural Network models. Expert Syst. Appl. 2014, 41, 2437–2442. [Google Scholar] [CrossRef]
  25. Kristjanpoller, W.; Minutolo, M.C. Forecasting volatility of oil price using an artificial neural network-GARCH model. Expert Syst. Appl. 2016, 65, 233–241. [Google Scholar] [CrossRef]
  26. Fuertes, A.-M.; Izzeldin, M.; Kalotychou, E. On forecasting daily stock volatility: The role of intraday information and market conditions. Int. J. Forecast. 2009, 25, 259–281. [Google Scholar] [CrossRef]
  27. Oliveira, N.; Cortez, P.; Areal, N. The impact of microblogging data for stock market prediction: Using Twitter to predict returns, volatility, trading volume and survey sentiment indices. Expert Syst. Appl. 2017, 73, 125–144. [Google Scholar] [CrossRef]
  28. Yao, Y.; Zhai, J.; Cao, Y.; Ding, X.; Liu, J.; Luo, Y. Data analytics enhanced component volatility model. Expert Syst. Appl. 2017, 84, 232–241. [Google Scholar] [CrossRef]
Figure 1. Out-of-sample prediction of RV with different models: (a) results of GARCH models; (b) results of hybrid model.
Figure 2. Out-of-sample strategy test.
Table 1. Variable definitions.

| Variables | Symbol | Definition |
|---|---|---|
| Dependent variable | RV | See Equation (1). |
| Macro variables | RATE | Interbank offered (lending) rate (monthly). |
| | CPI | Consumer price index (monthly). |
| | CSI | Consumer sentiment index (monthly). |
| | GROWTH | National industrial growth rate (monthly). |
| Market price | $P_{t,i}$ | $P_{t,i}$ denotes the i-th 5 min close price of the CSI300 Index on the t-th day. |
| Market return | $r_t$ | $r_t$ denotes the daily return of the CSI300 Index on the t-th day. |
| | $r_{t,i}$ | $r_{t,i}$ denotes the i-th 5 min return of the CSI300 Index on the t-th day. |
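
The 5 min prices and returns defined in Table 1 are the raw inputs to RV. Since Equation (1) is not reproduced alongside the table, the Python sketch below assumes the common definition: log returns between consecutive 5 min close prices, with daily realized volatility taken as the square root of the sum of squared intraday returns. Both the use of log returns and the square root are assumptions, and the price series is invented for illustration.

```python
import numpy as np

def intraday_log_returns(prices):
    """5 min log returns (in %) from consecutive close prices of one day."""
    prices = np.asarray(prices, dtype=float)
    return 100.0 * np.diff(np.log(prices))

def realized_volatility(prices):
    """Square root of the sum of squared intraday returns (assumed definition)."""
    r = intraday_log_returns(prices)
    return float(np.sqrt(np.sum(r ** 2)))

# Hypothetical 5 min close prices for a single trading day.
prices = [4000.0, 4004.0, 3998.0, 4002.0, 4010.0]
rv = realized_volatility(prices)
print(round(rv, 4))
```

Repeating the calculation for each trading day gives the daily RV series that serves as the prediction target.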
Table 2. Statistics.

| Variables | Mean | Std. | Min. | Max. |
|---|---|---|---|---|
| RV | 0.8167 | 0.3532 | 0.2535 | 2.9588 |
| Return | 0.0166 | 1.2006 | −8.2087 | 5.7774 |

The units are %.
Table 3. Variance prediction.

| Model | MAE | RMSE | MAPE |
|---|---|---|---|
| GARCH | 0.0035 | 0.0043 | 0.4483 |
| EGARCH | 0.0041 | 0.0048 | 0.5183 |
| GED-EGARCH | 0.0040 | 0.0047 | 0.5153 |
| Hybrid model (adding macro variables) | 0.0020 | 0.0027 | 0.2233 |
Table 4. Cumulative returns of the trading strategy based on our model.

| VaR Confidence Level | Our Model | RV | CSI300 |
|---|---|---|---|
| 99% | −0.0142 | −0.1898 | −0.228 |
| 90% | −0.2075 | −0.2494 | −0.228 |
Table 5. Variance prediction before pandemic period.

| Model | MAE | RMSE | MAPE |
|---|---|---|---|
| GARCH | 0.0041 | 0.0048 | 0.5433 |
| EGARCH | 0.0045 | 0.0051 | 0.6015 |
| GED-EGARCH | 0.0045 | 0.0051 | 0.6101 |
| Hybrid model (adding macro variables) | 0.0025 | 0.0031 | 0.3382 |
Table 6. Variance prediction using CSI 50 stock index.

| Model | MAE | RMSE | MAPE |
|---|---|---|---|
| GARCH | 0.0045 | 0.0054 | 0.5341 |
| EGARCH | 0.0048 | 0.0055 | 0.5762 |
| GED-EGARCH | 0.0048 | 0.0054 | 0.5762 |
| Hybrid model (adding macro variables) | 0.0025 | 0.0035 | 0.2495 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, W.; Wu, Y. Risk Analysis of the Chinese Financial Market with the Application of a Novel Hybrid Volatility Prediction Model. Mathematics 2023, 11, 3937. https://doi.org/10.3390/math11183937
