3.1. In-Sample and Out-Of-Sample Predictability Results
In order to set the stage for our forecasting experiments, we first present full-sample results in
Table 1 for the variant of the HAR-RV model that features all five climate-risk factors as additional predictor variables. The full-sample estimation results show that, as one would expect given the slowly decaying autocorrelation functions plotted in
Figure 2, the coefficients of the core HAR-RV model are, with one exception, highly significantly different from zero, and they have the expected positive sign in all cases. The exception is natural gas, for which the coefficient estimated for the monthly realized volatility is significant only at the 10% level for the long forecast horizon. On balance, the full-sample results demonstrate that the core HAR-RV model captures an important element of the dynamics of the realized volatilities. The evidence that the climate-risk factors are relevant full-sample predictors, in contrast, is weak. Their estimated coefficients are statistically insignificant in the majority of cases. The few exceptions include the coefficients estimated for global warming (
) and international summits (
) in the case of crude oil, U.S. climate policy and natural disasters (
) and international summits (
) in the case of heating oil, and natural disasters (
) and the narrative factor (
) in the case of natural gas. These exceptions, though, do not discount the overall impression that climate-risk factors do not contribute much to the in-sample predictability of the realized volatilities. We further observe that the (adjusted) coefficient of determination increases when we switch from the daily to the weekly forecast horizon, and then decreases again in the case of crude oil and natural gas when we turn to the analysis of the long forecast horizon.
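As a rough illustration of the regression behind the full-sample results, the following sketch builds the predictor rows of an HAR-RV model extended by climate-risk factors. It assumes the standard HAR-RV layout (daily RV, 5-day average, 22-day average) and an average-RV target over the next h days; the function names and the layout of the `climate` input are hypothetical and not taken from the paper.

```python
def har_row(rv, t, h):
    """Return HAR-RV predictors at day t and the target for horizon h.

    Assumed layout (not confirmed by this section): intercept, daily RV,
    average RV over the last 5 days, average RV over the last 22 days.
    The target is the average RV over days t+1, ..., t+h.
    """
    daily = rv[t]
    weekly = sum(rv[t - 4:t + 1]) / 5
    monthly = sum(rv[t - 21:t + 1]) / 22
    target = sum(rv[t + 1:t + 1 + h]) / h
    return [1.0, daily, weekly, monthly], target


def build_design(rv, climate, h):
    """Stack HAR-RV predictors and climate-risk factors for all usable days.

    `climate[t]` is a list with the climate-risk factors observed on day t
    (hypothetical input format).
    """
    rows, targets = [], []
    # The first 21 days are needed to form the 22-day average,
    # and the last h days are needed to form the target.
    for t in range(21, len(rv) - h):
        x, y = har_row(rv, t, h)
        rows.append(x + list(climate[t]))
        targets.append(y)
    return rows, targets
```

The rows and targets can then be passed to any least-squares routine; setting h to 1, 5, or 22 corresponds to the daily, weekly, and monthly horizons discussed in the text.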
It is also interesting to observe that the signs of the climate-risk factors are positive in some cases, and negative in others. Moreover, the coefficients estimated for international summits, for example, have a positive sign in the case of crude oil and heating oil, but a negative sign when we study natural gas (
h = 5). The estimated coefficient for a given climate-risk factor can even change sign across forecast horizons as, for example, in the case of global warming and natural gas. While the economic interpretation of sign switches of the estimated coefficients across forecast horizons should not be stretched too far given that most estimated full-sample coefficients are not significantly different from zero, it still is worth noting that, on economic grounds, both a positive and a negative sign can be rationalized. As [
25] observe, a higher incidence of natural disasters and global warming (operating through increased media coverage of sources of concern) and international summits (which policymakers typically use to put forward proposals related to a global tax on pollutants) are likely to signal “bad news” for the economy. The signal that an increase in media coverage of U.S. climate policy news conveys about potential transition risks, in turn, is likely to depend upon which of the two major U.S. political parties holds the power in Washington. In any event, with climate risks affecting the realized volatility through multiple opposing channels, as outlined in the introduction, the mixed signs should not come as a surprise.
Against the background of the rather weak and inconclusive full-sample results, we next turn our attention to our out-of-sample forecasting experiments. In this regard, it should be noted, as [
48] argues, that an out-of-sample analysis is the ultimate test of any predictive model in terms of the econometric methodology and the predictor(s) under scrutiny.
Table 2 documents our baseline out-of-sample forecasting results. The table shows the
p-values of the test proposed by [
49] for equal mean-squared prediction error (MSPE). The classic HAR-RV model is the benchmark model, and the model extended to include climate-risk factors is the rival model. The alternative hypothesis is that the rival model has a smaller MSPE than the benchmark model. Hence, the Clark–West test is a one-sided test. We observe that the test results for the short (that is, daily) forecast horizon are insignificant, with only one exception. Similarly, the majority of test results for the weekly forecast horizon are insignificant; the tests yield statistically significant results in three cases when we study natural gas. The evidence that the climate-risk factors have predictive value for out-of-sample forecasts of realized volatility becomes stronger when we turn to the monthly forecast horizon, where 12 out of 18 test results are highly significant. For
h = 22, we find that using all five climate-risk factors as predictors of the realized variances always yields significant test results. Hence, the key finding from the baseline out-of-sample test results is that climate-risk factors have predictive value for realized volatility mainly at the long (monthly) forecast horizon.
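The logic of the one-sided Clark–West comparison can be sketched as follows. The sketch computes the MSPE-adjusted loss differential for nested models and a one-sided p-value from the standard normal distribution. It uses the plain sample variance of the adjusted differentials, which is a simplifying assumption; for multi-step horizons, a heteroskedasticity- and autocorrelation-consistent variance estimator would normally be used instead. The function name is hypothetical.

```python
import math


def clark_west(actual, f_bench, f_rival):
    """One-sided Clark-West test for nested forecasting models.

    Returns (t_statistic, p_value). A small p-value favors the rival
    (larger) model over the benchmark. Uses the plain sample variance,
    which is a simplification for horizons h > 1.
    """
    adj = [
        (y - b) ** 2 - ((y - r) ** 2 - (b - r) ** 2)
        for y, b, r in zip(actual, f_bench, f_rival)
    ]
    n = len(adj)
    mean = sum(adj) / n
    var = sum((a - mean) ** 2 for a in adj) / (n - 1)
    t_stat = mean / math.sqrt(var / n)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(t_stat / math.sqrt(2.0))
    return t_stat, p_value
```

The adjustment term (b - r) ** 2 corrects for the extra estimation noise that the larger model carries under the null hypothesis, which is why the rival model can be favored even when its raw MSPE is not smaller than that of the benchmark.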
To assess the robustness of our key finding, we document in
Table 3 test results for the square root of RV. Such a robustness check is in order given that
Figure 1 shows that the realized volatilities exhibited occasional large peaks during our sample period. The main finding of this robustness check is that we observe significant test results predominantly when we study the monthly forecast horizon, even though there are a few significant test results also for the weekly forecast horizon. Hence, the main finding of this robustness test is consistent with the finding from the baseline test results that we lay out in
Table 2.
A visual inspection of
Figure 1 further reveals that the dynamics of the realized volatilities changed at roughly observation 1500. In order to account for this observation, we report in
Table 4 results for a shorter sample period, that is, we delete the first 1500 forecasts before setting up the Clark–West test. The results are broadly in line with our baseline test results. We observe strong evidence of predictive value of the climate-risk factors at the monthly forecast horizon for crude oil and heating oil and, to a somewhat lesser extent, for natural gas. For natural gas, we further observe that half of the test results at the weekly forecast horizon are significant. We also observe, corroborating the results of our baseline forecasting experiment that we summarize in
Table 2, that the test results for the full model that features all five climate-risk factors are significant at the monthly forecast horizon.
In
Table 5, we report test results that we obtain when we use three popular shrinkage estimators to estimate the HAR-RV model extended by all climate-risk factors. The three shrinkage estimators are interesting for the purpose of our forecasting experiments because they identify a parsimonious forecasting model in a completely data-driven way. We consider the following three shrinkage estimators: the Lasso estimator, the Ridge-regression estimator, and an elastic net. The latter can be interpreted as a combination of the Lasso and Ridge-regression estimators. The test results show again that the climate-risk factors help to improve the accuracy of forecasts of the realized volatilities at the monthly forecast horizon relative to the accuracy of forecasts that we obtain from the benchmark HAR-RV model (estimated by ordinary least squares). For crude oil and heating oil, all test results are significant (
), while the evidence of an improvement in forecast accuracy is weaker for natural gas. Moreover, the Ridge-regression estimator in particular tends to yield significant test results (
) even at the daily and weekly forecast horizons.
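To make the relation between the three estimators concrete, the following sketch shows how each one shrinks a single least-squares coefficient in the textbook special case of an orthonormal design. This is an illustrative simplification, not the estimation procedure used for Table 5; the penalty parameters `lam1` and `lam2` and the function names are hypothetical.

```python
import math


def lasso_shrink(beta_ols, lam1):
    """Soft-thresholding: shrink toward zero and set small coefficients to zero."""
    return math.copysign(max(abs(beta_ols) - lam1, 0.0), beta_ols)


def ridge_shrink(beta_ols, lam2):
    """Proportional shrinkage: coefficients get smaller but never exactly zero."""
    return beta_ols / (1.0 + lam2)


def elastic_net_shrink(beta_ols, lam1, lam2):
    """Soft-thresholding followed by proportional shrinkage."""
    return lasso_shrink(beta_ols, lam1) / (1.0 + lam2)
```

In this special case the elastic net is literally the Lasso step followed by the Ridge step, which illustrates why it can be read as a combination of the two: it can drop predictors entirely, as the Lasso does, while also dampening the remaining coefficients, as Ridge regression does.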
Table 6 reports the results we obtain when we use somewhat longer rolling-estimation windows (500 and 1000 observations) to estimate our forecasting models. The message to take home from the results summarized in
Table 6 is that, on balance, there is evidence (somewhat stronger for the window that uses 500 than for the window that uses 1000 observations) that considering climate-risk factors as predictors of realized volatilities at a monthly forecast horizon yields forecasts that are superior relative to the forecasts computed by means of a benchmark HAR-RV model. For the monthly forecast horizon, we observe that the test results for the forecasting model that features all five climate-risk factors in its array of predictors are significant for both rolling-estimation windows.
As a further variant of our forecasting experiment, we report in
Table 7 results that we obtain when we study a recursive-estimation window. The changing dynamics of the realized volatilities documented in
Figure 1 imply that a recursive-estimation window is not our preferred choice for the analysis of our data. Nevertheless, it is interesting to sketch briefly the results for a recursive-estimation window. As in the case of a rolling-estimation window, we observe several significant test results for the monthly forecast horizon. The forecasting model that uses all five climate-risk factors as predictors always yields significant test results at the long forecast horizon.
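The difference between the rolling and the recursive estimation schemes can be sketched with a small index generator. A rolling window keeps the estimation sample at a fixed length and drops the oldest observation at each step, whereas a recursive window keeps the first observation and grows over time. The function name and the half-open index convention are assumptions made for illustration.

```python
def estimation_windows(n_obs, window, scheme="rolling"):
    """Yield (start, end) index pairs for successive estimation samples.

    Each sample covers observations start, ..., end - 1, and the forecast
    is formed at origin `end`. `window` is the length of the first sample.
    """
    if scheme not in ("rolling", "recursive"):
        raise ValueError("scheme must be 'rolling' or 'recursive'")
    for end in range(window, n_obs):
        # Rolling: fixed length. Recursive: always start at the first observation.
        start = end - window if scheme == "rolling" else 0
        yield start, end
```

Because a rolling window discards old observations, its parameter estimates adapt more quickly when the dynamics of the data change, which is consistent with the preference for rolling windows stated in the text.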
Following the suggestion of an anonymous referee, and in order to better understand the possible statistical reasons behind the weak in-sample evidence relative to the stronger out-of-sample performance of the climate-risk variables, we conducted the multiple structural break tests of [
50] on the augmented HAR-RV model that includes all five climate-risk predictors. We found five breaks (specific dates of which are available upon request from the authors) each at
h = 1, 5 and 22 for all three of the energy prices. Given the evidence of structural breaks, it is not surprising that the full-sample regressions provide only weak evidence that the climate-risk factors matter for in-sample predictability.
In this regard, it is also worth mentioning that plots (complete details of which are available upon request from the authors) of the time-varying coefficients of the climate-risk factors estimated by means of a rolling-estimation window indicate that the signs of the estimated coefficients changed over time, possibly reflecting that the relative importance of the multiple opposing channels through which climate risks affect the realized volatility was not constant over time. The full-sample regressions capture the “average sign” of the coefficients and, thereby, recover only a weak predictive value of the climate-risk factors for realized volatility. An out-of-sample analysis, in contrast, is better suited to recover such changing patterns in the link between realized volatility and the climate-risk factors, as it is based on time-varying parameter estimates of the models derived from rolling and recursive windows. Intuition, thus, suggests that an out-of-sample analysis can be expected to yield stronger evidence of predictability than a full-sample analysis, which, in turn, is vindicated by our out-of-sample forecasting results.
Finally, we consider the possibility that policymakers and forecasters are interested in forecasts for horizons that extend beyond one month. Moreover, one can conceive situations in which policymakers and forecasters differentiate between positive and negative forecast errors. An underestimation of energy-price volatility in the wake of an energy crisis that gathers steam, for example, might be costlier for a policymaker in terms of approval rates than a corresponding overestimation of the same absolute size. Overestimation of energy-price volatility, in turn, may result in excessive storage and high opportunity costs. In order to model a potential differential weighting of under- and overestimations of realized volatility, we use the loss function studied by [
51,
52]. The loss function is given by $L(p, \alpha) = [\alpha + (1 - 2\alpha)\,\mathbf{1}(e < 0)]\,|e|^{p}$, where $e$ denotes the forecast error (realized minus predicted volatility) and $\mathbf{1}(\cdot)$ denotes the indicator function. Setting $p = 1$ results in a quasi-linear function that depends on the absolute forecast error (L1 loss), while setting $p = 2$ restricts the loss function to be of the quadratic type (L2 loss). The (a)symmetry parameter, $\alpha \in (0, 1)$, governs the relative loss from an under- or overestimation of realized volatility. For $\alpha = 0.5$, we obtain a symmetric loss function. For $\alpha > 0.5$ ($\alpha < 0.5$), the loss from underestimating (overestimating) realized volatility exceeds the loss from an overestimation (underestimation) of the same (absolute) size.
Figure 4 summarizes the results for this loss function. We report results for forecast horizons from one day to three months (that is, 66 days). The figure displays the loss ratio that we obtain by dividing the sum of the losses from the out-of-sample forecast errors of the HAR-RV benchmark model by the sum of the losses from the out-of-sample forecast errors of the HAR-RV model extended to include all climate-risk factors. A ratio exceeding unity, thus, indicates that the extended model has a better out-of-sample forecasting performance than the benchmark model, given the (a)symmetry parameter,
. The results show that the forecasting gains from using the climate-risk factors as predictors tend to increase with the forecast horizon for the case of a symmetric loss function. In the asymmetric case, in turn, the forecasting gains for long forecast horizons are mainly concentrated in the region where
when we study crude oil and an L1 loss function. For an L2 loss function, the forecasting gains are more evenly distributed across the interval of the asymmetry parameter. For heating oil and natural gas, in turn, the forecasting gains in the case of long forecast horizons are strongest in the region where
.
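The loss ratio described above can be sketched in a few lines. The sketch assumes that the asymmetric loss takes the form [alpha + (1 - 2 * alpha) * 1(e < 0)] * |e| ** p, with the forecast error e defined as realized minus predicted volatility, so that alpha above 0.5 penalizes underestimation more heavily. The symbols alpha and p, the sign convention, and the function names are assumptions made for illustration.

```python
def asymmetric_loss(error, alpha, p):
    """Loss of one forecast error under the assumed asymmetric loss function.

    error: realized minus predicted value (positive means underestimation).
    alpha: asymmetry parameter in (0, 1); 0.5 gives a symmetric loss.
    p:     1 for an L1-type loss, 2 for an L2-type loss.
    """
    weight = alpha + (1.0 - 2.0 * alpha) * (1.0 if error < 0 else 0.0)
    return weight * abs(error) ** p


def loss_ratio(actual, f_bench, f_extended, alpha, p):
    """Cumulated benchmark loss divided by cumulated extended-model loss.

    A ratio above one indicates that the extended model performs better.
    """
    loss_bench = sum(asymmetric_loss(y - f, alpha, p) for y, f in zip(actual, f_bench))
    loss_ext = sum(asymmetric_loss(y - f, alpha, p) for y, f in zip(actual, f_extended))
    return loss_bench / loss_ext
```

Evaluating `loss_ratio` over a grid of alpha values and forecast horizons would produce the kind of surface that a figure such as Figure 4 summarizes.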