**4. Results and Discussion**

In this section, we compare the VaR forecast results for the six ASEAN equity indices. We first present the results for each index individually and then compare the findings across indices. For each model, we compute 1-, 5-, and 20-day-ahead predictions, which correspond to forecasts one day, one week, and one month ahead. Note that we do not forecast the VaR for the whole period, but for a single point in the future. The results are presented in Tables 2–7.

We start our analysis with the results for the Indonesian JCI (Table 2). The traffic light test does not place any of the models in the red zone. Moreover, GARCH appears only in the green zone, i.e., it never has more than four violations throughout the out-of-sample period for either the long or the short trading position. However, the GARCH model does not pass the conditional coverage test by Christoffersen (1998) or the multilevel Pérignon and Smith (2008) test at any forecast horizon. Several models are able to depict the VaR at all quantiles under consideration for the 1-day-ahead forecast on the long trading position: FHS, FIGARCH, SV, and SV-L. Only FIGARCH shows the same ability on the short trading position. Its loss functions are satisfactory, but FIGARCH belongs to the best performing models only at the 2.5% quantile. The generally good performance of this model hints toward an elevated shock persistence in volatility. At longer forecast horizons, only HS performs well for all horizons on both trading sides with respect to the multilevel coverage test. The fact that HS is not able to pass the conditional coverage test may indicate that the model tends to produce clustered violations, which the multilevel coverage test does not detect.
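The zone classification used above can be illustrated with a minimal sketch. It assumes the usual Basel convention for 1% VaR: violations are counted over a rolling 250-day window, with 0 to 4 violations mapped to the green zone, 5 to 9 to the yellow zone, and 10 or more to the red zone. The function names and the rolling-window bookkeeping are illustrative assumptions, not the exact procedure behind Tables 2–7.

```python
from collections import Counter

GREEN_MAX = 4    # 0-4 violations in the window: green zone
YELLOW_MAX = 9   # 5-9 violations: yellow zone; 10 or more: red zone


def traffic_light_zone(n_violations):
    """Map a violation count for 1% VaR over one window to a zone label."""
    if n_violations <= GREEN_MAX:
        return "green"
    if n_violations <= YELLOW_MAX:
        return "yellow"
    return "red"


def zone_day_counts(hits, window=250):
    """Count how many days fall into each zone.

    hits   : list of 0/1 indicators, 1 if the return violated the 1% VaR
    window : assumed length of the rolling evaluation window (hypothetical default)
    """
    if len(hits) < window:
        raise ValueError("need at least one full window of observations")
    counts = Counter()
    running = sum(hits[:window])          # violations in the first window
    counts[traffic_light_zone(running)] += 1
    for t in range(window, len(hits)):
        # slide the window forward by one day
        running += hits[t] - hits[t - window]
        counts[traffic_light_zone(running)] += 1
    return counts
```

Under this reading, a model that stays in the green zone on every day has at most four violations in any 250-day window, which is a statement about the number of violations only and says nothing about whether they cluster in time.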

The second equity index we analyze is the Malaysian KLSE (Table 3). Here, three models fail the regulatory traffic light test. While RiskMetrics has several days in the red zone for the long trading position, the HS and FHS models fall into the red zone for the short trading position. Moreover, we observe some asymmetric behavior. RiskMetrics completely fails to meet the criteria of the coverage tests for the long trading position. However, it passes all tests for the short trading position. Almost the same behavior is observed for FIGARCH, the only exception being the 2.5% VaR in the conditional coverage test of Christoffersen (1998). On the long trading position, APARCH achieves good results, especially for 1-day-ahead predictions. This suggests that both asymmetric news impact and long memory are present in this market's volatility, which is further underlined by the performance of FIAPARCH. All stochastic volatility models pass the multilevel coverage test for the long trading position.

Next, we compare the results for the Philippine PCOMP index (Table 4). No model appears in the red zone of the traffic light test, and thus all of them could be used without being replaced by the regulator. Here, we find five complete failures of models regarding the two statistical coverage tests. Neither RiskMetrics for the long trading position, nor GARCH or any stochastic volatility specification for the short trading position passes any of the tests. In addition, the two asymmetric GARCH models seem to have problems with the specific dynamics of the PCOMP index. Both APARCH and FIAPARCH perform very poorly, with only a few passed tests. Generally, our model set does not include a clear candidate to be preferred in terms of VaR prediction performance. However, HS and FHS deliver the most promising results with respect to the multilevel test and the corresponding loss function results.
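To make the idea of a multilevel coverage test concrete, the following sketch implements a multinomial likelihood-ratio test in the spirit of Pérignon and Smith (2008) for the long trading position. It is an illustration under stated assumptions rather than the exact statistic behind the tables: each return is sorted into one of four bins defined by the 1%, 2.5%, and 5% VaR forecasts, the observed bin frequencies are compared with the nominal probabilities 1%, 1.5%, 2.5%, and 95%, and the statistic is referred to a chi-squared distribution with three degrees of freedom. All function and argument names are hypothetical.

```python
import math


def chi2_sf_3(x):
    """Survival function of the chi-squared distribution with 3 degrees of freedom."""
    x = max(x, 0.0)  # guard against tiny negative values from rounding
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


def multilevel_coverage_test(returns, var_1, var_2_5, var_5):
    """Joint coverage test for three VaR levels (long position).

    returns                : realized returns
    var_1, var_2_5, var_5  : VaR forecasts expressed as return thresholds,
                             assumed ordered so that var_1 <= var_2_5 <= var_5
    Returns the likelihood-ratio statistic and its p-value.
    """
    counts = [0, 0, 0, 0]
    for r, v1, v2, v5 in zip(returns, var_1, var_2_5, var_5):
        if r < v1:
            counts[0] += 1      # beyond the 1% VaR
        elif r < v2:
            counts[1] += 1      # between the 1% and 2.5% VaR
        elif r < v5:
            counts[2] += 1      # between the 2.5% and 5% VaR
        else:
            counts[3] += 1      # no violation at any level
    probs = [0.01, 0.015, 0.025, 0.95]   # nominal bin probabilities
    n = sum(counts)
    # LR = 2 * sum_i n_i * log(observed share / nominal share); empty bins add 0
    lr = 2.0 * sum(c * math.log((c / n) / p)
                   for c, p in zip(counts, probs) if c > 0)
    return lr, chi2_sf_3(lr)
```

Because the statistic depends only on how many returns land in each bin, a model can pass it while its violations arrive in clusters, which is why a separate conditional coverage test is informative.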


FIAPARCH-SkSt 1173/38/0 **0.0828** − − − − − − − − − − −

SV 653/558/0 0.1111 − − 0.1813 − − **0.2297** − − 0.3130 0.2961 **0.2695**




Bold-faced loss functions indicate inclusion in the Model Confidence Set (Hansen et al. 2011) at a significance level of 10% with 10,000 bootstraps. The test by Pérignon and Smith (2008) is reported in a similar manner, except that the three VaR levels (1%, 2.5%, and 5%) are tested jointly.

*J. Risk Financial Manag.* **2018**, *11*, 18






Table 5 presents the results from the VaR backtests for the SET, traded in Bangkok. From the traffic light test, it becomes apparent that RiskMetrics and FHS lead to several days in the red zone for the long trading position. The general result for the SET is that most models can cope with the long trading position to some extent, but completely fail to depict the dynamics on the short trading position. The results for HS suggest that it can be used for 1-day-ahead predictions for the long trading position. Even though it is not rejected by the unconditional coverage tests, it has difficulty avoiding clustering of VaR violations. This behavior is illustrated in Figure 2, which shows the slow reaction to VaR violations on the short trading position and the overall good coverage for 1% VaR forecasts. Additionally, SV-*t* shows reasonably good performance on the long trading position, which is reflected in the fact that it passes all multilevel tests and belongs to the set of the best models for 5- and 20-day-ahead forecasts. For both trading positions, however, only FIAPARCH and HS achieve good results, at least with respect to the multilevel test. Hence, we conclude that both asymmetry and long memory play an important role in the variance dynamics of the SET, indicating that variance shocks are persistent and asymmetric in their impact.

**Figure 2.** Value-at-Risk (*α* = 1%) 1-day ahead forecast with Historical Simulation for the SET index.
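The slow reaction of Historical Simulation can be seen from how the forecast is built. The following is a minimal sketch, assuming a rolling window of past returns and a linearly interpolated empirical quantile; the window length of 250 days, the interpolation rule, and all function names are illustrative assumptions and not necessarily the settings used for the figure.

```python
import math


def empirical_quantile(sample, p):
    """Empirical p-quantile with linear interpolation between order statistics."""
    xs = sorted(sample)
    pos = p * (len(xs) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def hs_var_forecasts(returns, alpha=0.01, window=250):
    """1-day-ahead Historical Simulation VaR for both trading positions.

    For each day t >= window, the forecast uses only returns[t-window:t].
    Returns a list of (long_var, short_var) return thresholds:
    the alpha-quantile for the long and the (1 - alpha)-quantile for the short position.
    """
    forecasts = []
    for t in range(window, len(returns)):
        past = returns[t - window:t]
        forecasts.append((empirical_quantile(past, alpha),
                          empirical_quantile(past, 1.0 - alpha)))
    return forecasts


def violation_indicators(returns, forecasts, window=250):
    """0/1 violation series for the long and the short trading position."""
    realized = returns[window:]
    long_hits = [int(r < lv) for r, (lv, _) in zip(realized, forecasts)]
    short_hits = [int(r > sv) for r, (_, sv) in zip(realized, forecasts)]
    return long_hits, short_hits
```

Since the quantile only moves when an extreme observation enters or leaves the window, a single large shock changes the forecast by at most one order statistic. This is consistent with violations that arrive in clusters after a volatility increase.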

The STI from Singapore provides interesting results. From Table 6, we find that RiskMetrics (long) and FHS (short) are included in the red zone of the Basel traffic light test. Consequently, the models would be replaced by the regulatory standard approach and the institution would be penalized with a higher factor on the minimum capital requirements accounting for the bad model choice. Interestingly, RiskMetrics shows a good performance on the short trading side, where it passes most of the tests. The worst results are achieved by the GARCH model, which fails all tests, even though it stays in the green zone over the whole out-of-sample period. The stochastic volatility models show good performance on the long trading position but cannot provide equally good results on the short trading position. APARCH provides very good results for both trading positions regarding 1-day ahead forecasts, which indicates that asymmetries play an important role in the STI return structure.

Lastly, we compare the forecasting results for the Vietnamese equity index VNI (Table 7). Within our model set, which includes very widely used VaR estimation procedures, only the SV model provides an average to good performance. All other models are either in the red zone (RiskMetrics, FHS, FIGARCH) or fail most of the statistical coverage tests (GARCH, APARCH, FIAPARCH). The models in the red zone, however, are only included for one trading side. For the long trading position, FIGARCH and FHS provide good results from the coverage tests. The stochastic volatility models show very good performance for 1- and 5-days ahead predictions on the long trading position and belong to the model confidence set at every test they pass.

Finally, we compare all results from the six different equity indices of Indonesia, Malaysia, the Philippines, Singapore, Thailand, and Vietnam. GARCH seems to be the regulator's darling with respect to the traffic light characterization. Although popular, it fails almost all statistical coverage tests, even though, for most indices, it stays in the green zone of the traffic light test 100% of the time. This indicates that the model yields overly conservative VaR forecasts, which would result in particularly high minimum capital requirements. In addition, the very popular RiskMetrics model shows poor performance. McMillan and Kambouroudis (2009) conclude that the model only performs well in small markets and at high VaR quantiles. Our findings suggest that the six markets selected in this paper may already be too big for the RiskMetrics approach. Interestingly, HS is rarely rejected by the multilevel coverage test, i.e., regardless of the specific forecast horizon, it provides sufficient coverage ratios. However, it does not provide satisfactory results for the conditional coverage test in all indices. This might be due to its slow adaptation to shocks, which results in clustering of violations. The SV model specifications provide a framework with good overall performance in all markets, but only on the long trading position. However, especially for shorter forecast horizons, the SV models belong to the model confidence sets.
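The distinction between correct coverage and clustered violations can be made explicit with a short sketch of the likelihood-ratio tests associated with Christoffersen (1998). It assumes the common textbook decomposition in which the conditional coverage statistic is the sum of an unconditional coverage statistic and a first-order Markov independence statistic, referred to chi-squared distributions with one, one, and two degrees of freedom. The function names are hypothetical, and the implementation details of the tests reported in the tables may differ.

```python
import math


def _xlogy(n, p):
    """n * log(p), with the convention that the term is 0 when n == 0."""
    return 0.0 if n == 0 else n * math.log(p)


def _loglik(n0, n1, p):
    """Bernoulli log-likelihood of n1 violations and n0 non-violations."""
    return _xlogy(n0, 1.0 - p) + _xlogy(n1, p)


def _sf_df1(x):
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))   # chi-squared(1) survival function


def _sf_df2(x):
    return math.exp(-max(x, 0.0) / 2.0)              # chi-squared(2) survival function


def christoffersen_test(hits, p):
    """hits: list of integer 0/1 violation indicators; p: nominal VaR level."""
    n1 = sum(hits)
    n0 = len(hits) - n1
    pi_hat = n1 / len(hits)
    lr_uc = -2.0 * (_loglik(n0, n1, p) - _loglik(n0, n1, pi_hat))

    # transition counts n[i][j]: state i yesterday followed by state j today
    n = [[0, 0], [0, 0]]
    for prev, cur in zip(hits[:-1], hits[1:]):
        n[prev][cur] += 1
    row0, row1 = n[0][0] + n[0][1], n[1][0] + n[1][1]
    pi01 = n[0][1] / row0 if row0 else 0.0
    pi11 = n[1][1] / row1 if row1 else 0.0
    pi = (n[0][1] + n[1][1]) / (row0 + row1)
    ll_null = _loglik(n[0][0] + n[1][0], n[0][1] + n[1][1], pi)
    ll_alt = _loglik(n[0][0], n[0][1], pi01) + _loglik(n[1][0], n[1][1], pi11)
    lr_ind = -2.0 * (ll_null - ll_alt)

    lr_cc = lr_uc + lr_ind
    return {"lr_uc": lr_uc, "p_uc": _sf_df1(lr_uc),
            "lr_ind": lr_ind, "p_ind": _sf_df1(lr_ind),
            "lr_cc": lr_cc, "p_cc": _sf_df2(lr_cc)}
```

As a usage example, two violation series with the same number of violations illustrate the pattern described for HS: evenly spread violations pass both tests, whereas the same number of violations on consecutive days leaves the unconditional statistic unchanged but produces a large independence statistic.

```python
spread = [0] * 500
for i in (50, 150, 250, 350, 450):
    spread[i] = 1
clustered = [0] * 500
for i in range(100, 105):
    clustered[i] = 1

print(christoffersen_test(spread, 0.01))
print(christoffersen_test(clustered, 0.01))
```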

Comparing the model performance for each index, we find evidence that the markets in our analyzed group are heterogeneous with respect to their volatility properties. For example, STI is dominated by asymmetric effects and long memory models are rejected by our coverage tests while, for the VNI, long memory models show a good forecasting performance and asymmetric models are rejected.


**Table 5.** Value-at-Risk backtest results for SET returns.

