**5. Results**

As stated in Section 4.6, we use different measures for point and density forecasting; the focus is initially on point forecasting. The first results are given in Table 2: the percentages of actual observations falling outside the 95% credible interval obtained by simulation. Comparing the BVAR model with the BVAR-GARCH model, only for Ripple does the BVAR-GARCH model place the actual observations inside the 95% credible interval less often. This would imply that the BVAR-GARCH intervals track the volatility of the data better than those of the BVAR model, with Ripple being the opposite. This is in line with expectations, since the kurtosis of Ripple (see Table 1) is significantly higher than that of the other cryptocurrencies. The BVAR-SV and BVARX-SV models have the highest percentages for all cryptocurrencies except Bitcoin, which suggests that using stochastic volatility does not, overall, give well-calibrated credible intervals. The results of the BVAR model and the BVARX model are close to each other, so there is no clear distinction between these two models. The BVARX-GARCH model, however, stands out the most: it places the largest share of observations inside the 95% credible interval, the only exceptions being the BVAR-GARCH model for Ethereum and the BVARX-SV model for Bitcoin.

Overall, including the cryptopredictor variables appears helpful when simulating forecasts, since in almost every case they lower the percentage of actual observations outside the 95% credible interval. Using a student-t distribution in the SV model increases the share of observations outside the interval only for Bitcoin, which is expected, as Bitcoin is the least volatile of the cryptocurrencies. When the cryptopredictor variables are added to the SV-t model, this percentage decreases only for Ripple, and only slightly.
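The coverage statistic in Table 2 can be computed directly from the simulated forecast draws. The sketch below is our own illustration (function names and toy data are not from the paper): for each day, form the equal-tailed 95% credible interval from the posterior predictive draws and count how often the realised return falls outside it.

```python
import numpy as np

def pct_outside_interval(draws, actuals, level=0.95):
    """Share (in %) of realised values outside the equal-tailed credible interval.

    draws:   (n_periods, n_draws) simulated one-day-ahead forecast draws
    actuals: (n_periods,) realised returns
    """
    alpha = (1.0 - level) / 2.0
    lower = np.quantile(draws, alpha, axis=1)        # 2.5% quantile per day
    upper = np.quantile(draws, 1.0 - alpha, axis=1)  # 97.5% quantile per day
    outside = (actuals < lower) | (actuals > upper)
    return 100.0 * outside.mean()

# Toy check: if draws and actuals come from the same distribution,
# roughly 5% of observations should fall outside the 95% interval.
rng = np.random.default_rng(0)
draws = rng.standard_normal((2000, 4000))
actuals = rng.standard_normal(2000)
print(f"{pct_outside_interval(draws, actuals):.1f}%")
```

A well-calibrated model should come close to the nominal 5%; the percentages in Table 2 measure how far each model deviates from this.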


**Table 2.** Percentage of actual observations outside of the 95% credible interval retrieved by simulation.

For every cryptocurrency, the credible intervals are also plotted (see Figures A1–A4 in Appendix A). In these figures, the credible intervals of the BVAR models are fairly steady for all cryptocurrencies; hence these models do not capture the volatile movements of the data well. In the extended versions, e.g., the BVAR-SV or BVAR-GARCH model, the credible intervals capture the movements better: when shocks occur, the intervals adapt to them. The BVARX-SV models, however, stand out the most; their credible intervals are very noisy, so the predictors do not help to narrow the credible interval for one-day-ahead prediction.

Table 3 shows the results for the second point forecasting measure described earlier. This predictability is not tested statistically but gives insight into the directional accuracy of the forecasts. The returns are used to check whether the predicted direction is correct. The BVAR-SV model predicts the direction correctly more often than the BVAR and BVAR-GARCH models in all cases. Another observation is that including the cryptopredictor variables improves directional accuracy only for Ethereum and Ripple. A possible reason for this behaviour is that Ripple depends more on market movement than the other cryptocurrencies do. However, the percentages are below or close to 50%, which implies that these models (BVAR and BVAR-GARCH) cannot predict the movement very precisely. That statement applies, for now, only to predicting whether a cryptocurrency goes up or down.

An important observation from this table is that the stochastic volatility models score best overall, in some cases reaching about 60–67%, which is much more precise than, for example, the 35.45% of the BVAR-GARCH model for Bitcoin. This is especially true for the SV model with a student-t distribution; thus, among these models, an SV model with student-t errors is the best way to forecast the direction of the cryptocurrencies.
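The directional measure in Table 3 reduces to a sign comparison between predicted and realised returns. A minimal sketch (the function name and data are illustrative, not the paper's code):

```python
import numpy as np

def direction_hit_rate(point_forecasts, actuals):
    """Percentage of periods in which the forecast return has the same
    sign as the realised return (up/down predicted correctly)."""
    same_sign = np.sign(point_forecasts) == np.sign(actuals)
    return 100.0 * np.mean(same_sign)

# Illustrative example: three of the four signs match.
forecasts = np.array([0.5, -0.2, 0.1, -0.3])
realised = np.array([0.4, 0.1, 0.2, -0.5])
print(direction_hit_rate(forecasts, realised))  # 75.0
```

A rate near 50% is no better than a coin flip, which is why the sub-50% scores of the BVAR and BVAR-GARCH models indicate weak directional predictability.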


**Table 3.** Percentage of forecasts in the right direction (up or down).

Moving to the last point forecast measure, Table 4 contains the ratios of the RMSE. The table reports the RMSE of the benchmark model (BVAR) together with the ratios of the other models. As expected from the descriptive statistics, Ripple is the cryptocurrency with the highest RMSE, due to its high kurtosis.

For Ripple and Litecoin, the SV models are significantly better than the benchmark model. The GARCH model is in no case significantly better than the benchmark; a possible cause is that cryptocurrencies do not follow such dynamics. We can state that including the cryptopredictor variables does not affect the RMSE of the models enough to improve forecast performance. For Bitcoin, no model performs significantly better than the BVAR, which could be caused by the aforementioned stability of Bitcoin relative to the other cryptocurrencies.


**Table 4.** Ratio of RMSE against benchmark.

*Notes:* (1) The "X" indicates models with the cryptopredictor variables included; the "t" indicates that the student-t distribution is used. (2) For the BVAR, the benchmark model, the table reports the RMSE; for the other models it reports the ratio between the RMSE of the current model and that of the benchmark. Entries less than 1 indicate that forecasts from the current model are more accurate than forecasts from the benchmark model. (3) \*\* and \* indicate RMSE ratios significantly different from 1 at the 5% and 10% levels, respectively, according to the Diebold–Mariano test. (4) Gray cells indicate models that belong to the Superior Set of Models delivered by the Model Confidence Set procedure at the 10% confidence level.

The grey cells indicate the model confidence set; this also confirms our conclusion, since the SV models are in this set in almost every case (except for Litecoin and the VARX-SV model). If one wants to forecast these cryptocurrencies with one of these models, then, judging by the RMSE, the preferred option is to use stochastic volatility.

Tables 5 and 6 contain the results of the density measures CRPS and PL. The CRPS results do not differ much from the RMSE results. One difference is that under the CRPS, the GARCH model outperforms the benchmark for Bitcoin, and for Ripple when the cryptopredictor variables are included. Hence, the densities of Bitcoin and Ripple follow the dynamics of a GARCH model more closely than those of the benchmark. However, the SV model also outperforms the GARCH model, since its values are in many cases lower. The model confidence set now also includes the GARCH model for Bitcoin.


**Table 5.** Ratio of CRPS against benchmark.

*Notes:* (1) The "X" indicates models with the cryptopredictor variables included; the "t" indicates that the student-t distribution is used. (2) For the BVAR, the benchmark model, the table reports the CRPS; for the other models it reports the ratio between the CRPS of the current model and that of the benchmark. Entries less than 1 indicate that forecasts from the current model are more accurate than forecasts from the benchmark model. (3) \*\* and \* indicate CRPS ratios significantly different from 1 at the 5% and 10% levels, respectively, according to the Diebold–Mariano test. (4) Gray cells indicate models that belong to the Superior Set of Models delivered by the Model Confidence Set procedure at the 10% confidence level.

The conclusion drawn from the first density forecast measure (CRPS) is that Ethereum is now in the same situation as Bitcoin was under the RMSE: no model is significantly better than the benchmark. The reason could be that the forecast densities of Ethereum do not follow the movements captured by the models used, so the predictability of Ethereum is low, its uncertainty being higher than that of the other cryptocurrencies.

Regarding the density forecast under the CRPS, the main conclusion is that including stochastic volatility in the model formulation leads to better results with respect to the benchmark (BVAR model) and to the GARCH specification. In particular, the student-t specification of the errors in the SV models leads to better results, with great improvements for every cryptocurrency. If one includes the cryptopredictors in the analysis, the improvements are modest, except when the errors are student-t specified for stochastic volatility.
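The CRPS of a simulated forecast density can be estimated directly from the draws via the standard ensemble formula CRPS = E|X − y| − ½E|X − X′|. The sketch below (our own illustration, not the paper's code) uses the sorted-sample identity for the second expectation.

```python
import numpy as np

def crps_ensemble(draws, actual):
    """Sample-based CRPS: E|X - y| - 0.5 * E|X - X'| over predictive draws.
    Lower values indicate a sharper, better-calibrated density."""
    x = np.sort(np.asarray(draws, dtype=float))
    n = x.size
    term1 = np.mean(np.abs(x - actual))
    # E|X - X'| from the sorted sample: (2 / n^2) * sum_i (2i - n - 1) * x_(i)
    i = np.arange(1, n + 1)
    term2 = 2.0 * np.sum((2 * i - n - 1) * x) / (n * n)
    return term1 - 0.5 * term2

# A point mass at the realised value has CRPS 0; a two-point ensemble
# {0, 1} evaluated at 0.5 gives 0.5 - 0.25 = 0.25.
print(crps_ensemble([0.5, 0.5, 0.5], 0.5), crps_ensemble([0.0, 1.0], 0.5))
```

Because the CRPS penalises both bias and dispersion, the ratios in Table 5 reward models whose whole predictive density, not just its centre, tracks the data.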


**Table 6.** Difference of PL against benchmark.

*Notes:* (1) The "X" indicates models with the cryptopredictor variables included; the "t" indicates that the student-t distribution is used. (2) For the BVAR, the benchmark model, the table reports the PL; for the other models it reports the difference between the PL of the current model and that of the benchmark. Entries greater than 0 indicate that forecasts from the current model are more accurate than forecasts from the benchmark model. (3) \*\* and \* indicate PL differences significantly different from 0 at the 5% and 10% levels, respectively, according to the Diebold–Mariano test. (4) Gray cells indicate models that belong to the Superior Set of Models delivered by the Model Confidence Set procedure at the 10% confidence level.

The predictive likelihood (PL, or log predictive score (LS)) gives somewhat different results than the previous measures. First, the predictive likelihoods are very close to each other across cryptocurrencies, which indicates that the models perform similarly for all of them. Only for Ethereum are there models that perform significantly better than the benchmark. In that case, the SV models are the most significant, while the GARCH and VAR models including the cryptopredictor variables are less significant.
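The log predictive score can be estimated from the same forecast draws by evaluating a density estimate of the predictive distribution at the realised value; higher scores are better. A minimal sketch under the assumption of a Gaussian kernel with Silverman's rule-of-thumb bandwidth (names and data are illustrative, and the paper's estimator may differ):

```python
import numpy as np

def log_predictive_score(draws, actual):
    """Log predictive likelihood of the realised value, estimated with a
    Gaussian kernel density over the forecast draws (Silverman bandwidth)."""
    x = np.asarray(draws, dtype=float)
    h = 1.06 * x.std(ddof=1) * x.size ** (-0.2)  # Silverman's rule of thumb
    z = (actual - x) / h
    density = np.mean(np.exp(-0.5 * z * z)) / (h * np.sqrt(2.0 * np.pi))
    return np.log(density)

# Draws from the true density score higher at the realised value than
# draws from a badly mislocated density.
rng = np.random.default_rng(2)
good = log_predictive_score(rng.normal(0.0, 1.0, 20000), 0.1)
bad = log_predictive_score(rng.normal(3.0, 1.0, 20000), 0.1)
print(good > bad)  # True
```

Unlike the CRPS, the log score rewards only the density mass placed exactly at the realised value, which helps explain why the two measures can rank the models differently.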

Overall, the model confidence set again contains the SV models. This time, however, the SV-t models are not in the set, except for Litecoin with the cryptopredictor variables included. Litecoin, moreover, has almost a full set; only the SV-t model is excluded, so Litecoin does not follow a single model but can be explained by multiple models. The GARCH models are now in the model confidence set as well, which illustrates that the log scores of the forecasts are describable as GARCH movements.

Regarding the density forecast under the PL, the main conclusion is that including stochastic volatility in the model formulation leads to better results for Ethereum with respect to the benchmark (BVAR model) and to the GARCH specification. In contrast to the CRPS, the inclusion of the student-t specification of the errors in the SV model leads to no significantly better results. If one includes the cryptopredictors in the analysis, there are improvements only for Ethereum, and only without the student-t specification.
