*3.3. Out-of-Sample Validation*

We proceed with an out-of-sample comparison of the risk measures and forecasting ability of the two models, SVCJ and TGARCH, using the RiskMetrics (RM) model of J.P. Morgan (1996) as a benchmark. The risk measures VaR and ES are estimated with a rolling window of *T* − 365 = 1328 daily log-returns, and the remaining 365 days (24 March 2018 to 24 March 2019) are reserved for out-of-sample forecasts and accuracy checks. We then simulate 5000 return paths from both models. For the AR(2)-TGARCH(1,1)∼Skewed *t* model, we use Filtered Historical Simulation: we first extract the standardized residuals by scaling the returns with the fitted conditional volatilities, then resample these residuals to form a new set of innovations, which are combined with the conditional mean to rebuild the returns. These steps are repeated recursively to obtain different simulated paths, with 5000 draws from the standardized residuals generating 1328 (the in-sample size) replicates of the returns.
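The resampling step above can be sketched as follows. This is a minimal illustration under simplifying assumptions: standardized residuals are resampled i.i.d., and the last fitted conditional mean and volatility are held fixed rather than updated through the full AR(2)-TGARCH(1,1) recursion used in the paper. All function and argument names are illustrative, not the paper's code.

```python
import numpy as np

def fhs_paths(returns, cond_mean, cond_vol, horizon, n_paths, seed=42):
    """Filtered Historical Simulation (simplified sketch).

    Standardize the in-sample returns with the fitted conditional
    moments, resample the standardized residuals with replacement,
    and rebuild simulated return paths from the resampled innovations.
    """
    rng = np.random.default_rng(seed)
    r = np.asarray(returns, dtype=float)
    mu = np.asarray(cond_mean, dtype=float)
    sigma = np.asarray(cond_vol, dtype=float)
    z = (r - mu) / sigma                        # standardized residuals
    draws = rng.choice(z, size=(n_paths, horizon), replace=True)
    # Simplification: hold the last fitted mean/volatility fixed instead
    # of running the TGARCH volatility recursion forward.
    return mu[-1] + sigma[-1] * draws
```

In the paper's full procedure the conditional volatility would be updated recursively along each simulated path; the sketch only shows the bootstrap of innovations.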

Table 4 reports the out-of-sample backtesting results. The Christoffersen (1998) conditional coverage test confirms that the two models, SVCJ and TGARCH, accurately forecast the VaR, as the *p*-values are greater than 5%; the exception is XRP, where TGARCH performs better at the 1% VaR level. Although the RiskMetrics model also displays forecasting accuracy, it occasionally fails for the LTC and XLM cryptocurrencies. Speculative investors taking either a long or a short position in a cryptocurrency can therefore generate accurate VaR forecasts using either of these two models.
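For reference, the conditional coverage test combines a Kupiec unconditional coverage statistic with an independence statistic computed from first-order transition counts of the violation sequence; the sum is chi-squared with 2 degrees of freedom under the joint null. A sketch (illustrative implementation, not the paper's code):

```python
import numpy as np
from scipy.stats import chi2

def christoffersen_cc(violations, alpha):
    """Christoffersen (1998) conditional coverage test on a 0/1 VaR
    violation sequence. Returns (LR_cc, p-value); LR_cc ~ chi^2(2)
    under the joint null of correct coverage and independence."""
    v = np.asarray(violations, dtype=int)
    n = v.size
    n1 = int(v.sum()); n0 = n - n1

    def ll(p, zeros, ones):
        # Bernoulli log-likelihood; a degenerate p contributes zero.
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return ones * np.log(p) + zeros * np.log(1.0 - p)

    # Unconditional coverage (Kupiec) component.
    lr_uc = -2.0 * (ll(alpha, n0, n1) - ll(n1 / n, n0, n1))

    # Independence component from first-order transition counts.
    n00 = int(np.sum((v[:-1] == 0) & (v[1:] == 0)))
    n01 = int(np.sum((v[:-1] == 0) & (v[1:] == 1)))
    n10 = int(np.sum((v[:-1] == 1) & (v[1:] == 0)))
    n11 = int(np.sum((v[:-1] == 1) & (v[1:] == 1)))
    pi01 = n01 / (n00 + n01) if n00 + n01 else 0.0
    pi11 = n11 / (n10 + n11) if n10 + n11 else 0.0
    pi = (n01 + n11) / (n - 1)
    lr_ind = -2.0 * (ll(pi, n00 + n10, n01 + n11)
                     - ll(pi01, n00, n01) - ll(pi11, n10, n11))

    lr_cc = lr_uc + lr_ind
    return lr_cc, float(chi2.sf(lr_cc, df=2))
```

A sequence of violations that cluster in time fails the independence component even when the overall violation rate matches the nominal level.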


**Table 4.** Value-at-Risk backtesting results.

Christoffersen's test *p*-values and average values of the VaR and ES forecasts are displayed under SVCJ, AR(2)-TGARCH(1,1)∼Skewed *t*, and RiskMetrics (with a decay factor of 0.94). Bold *p*-values below 5% reject the null hypothesis of correct exceedances and independence of the violation sequence, and hence represent inaccurate VaR estimates.

Given the accuracy of the models, Table 5 reports the zero-mean test of excess loss, applied only when a model first passes the VaR backtest. The results indicate that at the 5% level the predictive power of the SVCJ model is better than that of the TGARCH and RM models, many of whose *p*-values fall below 5%. One possible explanation is that the forecasts of the TGARCH and RM models offer no significant gains over those of the SVCJ model. This evidence supports our prior that accounting for jumps in both returns and volatility is a reason for the SVCJ model's superior predictive power.
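The zero-mean test can be sketched as below. This is an illustrative stand-in (a plain one-sample *t*-test in the spirit of the McNeil–Frey ES backtest), not the paper's exact procedure: on days when the return breaches the VaR forecast, the excess loss (return minus ES forecast) should have mean zero if the ES forecasts are adequate.

```python
import numpy as np
from scipy import stats

def es_zero_mean_test(returns, var_forecast, es_forecast):
    """Zero-mean test of the excess loss (McNeil-Frey style ES
    backtest, simplified). Returns (t statistic, p-value, number of
    VaR breaches); a small p-value flags inadequate ES forecasts."""
    r = np.asarray(returns, dtype=float)
    breach = r < np.asarray(var_forecast, dtype=float)   # long position
    excess = r[breach] - np.asarray(es_forecast, dtype=float)[breach]
    t_stat, p_val = stats.ttest_1samp(excess, popmean=0.0)
    return float(t_stat), float(p_val), int(breach.sum())
```

The test is conditional by construction: it only uses the days on which the VaR was violated, which is why it is reported only for models that already pass the VaR backtest.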


**Table 5.** Expected Shortfall backtesting results.

Results of the zero-mean test for the excess loss, provided that the model generates accurate VaR estimates. *p*-values are reported at the 1% and 5% risk levels for the cryptocurrencies. *p*-values below 5% indicate inadequacy of the model for estimating ES.

Table 6 summarizes the test of the best-performing model with respect to the quantile loss function of Angelidis et al. (2004). For each cryptocurrency and confidence level, we present the loss differential *B* and the *p*-values of the zero-median test of Sarma et al. (2003). A *p*-value below 5% implies that the two competing models differ significantly from each other in terms of estimating risk; otherwise, the two models are not significantly different with respect to the quantile loss function, and regulators and risk managers remain indifferent between them. The results suggest that, at the 5% level, the SVCJ model is better than the TGARCH and RiskMetrics models because it produces lower economic losses. At the 1% level, some of the results show that a risk manager is indifferent between the models for VaR estimation. For instance, for Bitcoin and Stellar, the SVCJ and TGARCH models are not significantly different from each other with respect to the quantile loss function, yet both perform better than the RiskMetrics model. Therefore, as far as loss is concerned, a risk manager would prefer either the SVCJ or the TGARCH model over RiskMetrics.
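The two ingredients of this comparison can be sketched as follows: the check (tick) loss of a VaR forecast, and a sign test on the loss differential as a simple version of the zero-median test. This is an illustrative implementation under our own naming, not the paper's code.

```python
import numpy as np
from scipy.stats import binomtest

def quantile_loss(returns, var_forecast, alpha):
    """Check (tick) loss of a VaR forecast, the criterion behind the
    quantile loss comparison of Angelidis et al. (2004)."""
    r = np.asarray(returns, dtype=float)
    q = np.asarray(var_forecast, dtype=float)
    hit = (r < q).astype(float)          # 1 on violation days
    return (alpha - hit) * (r - q)       # nonnegative by construction

def zero_median_test(loss_a, loss_b):
    """Zero-median test on the loss differential B_t = loss_a - loss_b,
    in the spirit of Sarma et al. (2003), here as a two-sided sign
    (binomial) test. A p-value below 5% means the two models' risk
    estimates differ significantly."""
    d = np.asarray(loss_a) - np.asarray(loss_b)
    d = d[d != 0.0]                      # drop ties
    k = int(np.sum(d > 0.0))
    return binomtest(k, n=d.size, p=0.5).pvalue
```

Because the tick loss penalizes both missed violations and overly conservative forecasts, the model with the lower loss is the one that prices risk more efficiently, which is the sense in which the text says SVCJ "produces lower economic losses".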


**Table 6.** Quantile Loss Function test for the best model for VaR estimates.

*B* statistic and *p*-values at the 1% and 5% risk levels are reported for the cryptocurrencies. *p*-values below 5% indicate that the difference in the performance of the models is significant. If one model fails the previous backtest, we report the other, prevailing model.

Overall, as noted earlier in Table 5, there is a gap between the quantities of risk measured by VaR and ES at the 1% and 5% confidence levels. This suggests that ES gives a more accurate measure of risk than the traditional VaR measure. This finding supports the recommendation of the Basel Committee on Banking Supervision (2013) that banks use ES in lieu of VaR, and that the confidence level be recalibrated for consistency and accuracy of the risk measure. In terms of forecast accuracy, our results show that SVCJ and TGARCH generate better forecasts at the 1% level than RM. This evidence clearly supports the notion that fat-tailed volatility models can predict risk more accurately than non-fat-tailed models. In summary, combining jumps in returns and volatility within a stochastic model yields the most accurate VaR forecasts for the majority of the cryptocurrencies studied in this paper.
