1. Introduction
The analysis of the volatility of electricity spot prices is crucial for traders, portfolio managers, policy makers, and other market participants. The growing interest in modeling the dynamics of electricity prices has revealed several distinctive features typically not observed in financial assets due to the nonstorability of electricity. As electricity is not storable, and because of the inelasticity of supply and demand, electricity prices are known to be much more volatile than other commodity prices. In particular, the main stylized facts show that daily and intradaily electricity spot prices are usually characterized by seasonality, mean reversion, high volatility persistence, frequent price jumps and spikes of short duration, inverse leverage effects (electricity price volatility reacts more to positive shocks than to negative shocks), stationarity in both the price level and squared prices, and negative prices that are mainly related to the inability to dispose of electricity freely together with nontrivial start-up costs for generators (Bierbrauer et al. 2007; Byström 2005; Chan et al. 2008; Escribano et al. 2011; Fanone et al. 2013; Frömmel et al. 2014; Higgs and Worthington 2008; Knittel and Roberts 2005).
The popular GARCH-type framework is widely used to model and forecast the volatility of electricity prices (e.g., Bowden and Payne 2008; Escribano et al. 2011; Garcia et al. 2005; Hickey et al. 2012; Knittel and Roberts 2005; Liu and Shi 2013, among others). Standard GARCH models usually rely on daily data to estimate the latent conditional variance, using all current and past daily squared returns to provide expectations on future volatility. However, the daily return offers only a “weak” signal on the current level of volatility; therefore, this class of models is not able to capture rapid changes in the volatility level.
Since electricity cannot be physically stored directly, production and consumption need to be continuously balanced to smooth supply and demand shocks (Bierbrauer et al. 2007). In this direction, the liquidity of the electricity market has grown rapidly. The increased availability of high-frequency information has led to the development of new econometric methods for modeling and forecasting volatility, revealing that high-frequency data are much more informative about the price process not only at the intraday level, but also at the daily level. The daily Realized Volatility (RV), defined as the summation of the squared intradaily price changes, provides an unbiased and highly efficient estimator of return volatility (Andersen et al. 2001, 2003; Barndorff-Nielsen and Shephard 2002).
Volatility modeling using intradaily price frequencies has received considerable attention, not only in the financial market but also in the electricity market. In this framework, most of the existing literature relies on HAR-type models to directly estimate time series of realized measures. Chan et al. (2008) used the HAR model of Corsi (2009) and the HAR-CJ model of Andersen et al. (2007) to estimate volatility and identify jumps in the electricity price process in five regions of the Australian market. However, their results show only a modest improvement in volatility forecasts when total variation is separated into continuous and jump components, with no strong evidence that the HAR-type specifications outperform the EGARCH model. Haugom et al. (2011) added exogenous effects to the HAR model and the HAR-CJ model to assess the day-ahead predictions in the Nord Pool forward market, finding forecast improvements from the inclusion of exogenous effects. Haugom and Ullrich (2012) extended the HAR approach, including forward realized volatility as a predictor to improve spot price volatility forecasts for some electricity markets of the United States. By contrast, focusing on the conditional variance of returns, Frömmel et al. (2014) referred to the Realized GARCH models (Hansen and Huang 2016; Hansen et al. 2012) with a single measurement equation to forecast the volatility on the Electricity Power Exchange market using both the RV and the intraday range as realized measures. Their empirical results suggest that the RGARCH specifications outperform the EGARCH model in terms of forecasting accuracy, especially when the intraday range is used, a finding they attribute (among other potential determinants) to range-based measurements being more robust to microstructure noise bias.
The aim of this paper is to assess the benefits of jointly using different realized measures in fitting and forecasting electricity price volatility. Unlike Frömmel et al. (2014), we use Realized GARCH models to combine information through multiple measurement equations and multiple realized estimators. In particular, to deal with market microstructure frictions and extreme jumps in electricity price series, in addition to the RV, we also refer to the robust estimators Realized Kernel (RK) (Barndorff-Nielsen et al. 2008) and medRV (MRV) (Andersen et al. 2012). Moreover, since accurate volatility modeling is crucial for risk management (Byström 2005; Chan and Gray 2006), we also focus on the ability of models to accurately predict Value-at-Risk (VaR) and Expected Shortfall (ES) at different risk levels.
Our empirical analysis on time series of spot prices sampled at 30-min intervals in the regional Australian power markets of New South Wales, Queensland, and Victoria reveals some interesting findings. First, the Realized Exponential GARCH (REGARCH) specifications, combining multiple realized measures, improve the in-sample fit over the standard GARCH(1,1) and the simple Realized GARCH (RGARCH) models. Second, the use of the jump-robust MRV as a realized measure in the RGARCH model leads to a significant improvement in volatility forecasting by minimizing the QLIKE (Patton 2011), Mean Squared Error, and Mean Absolute Error loss functions. Finally, in evaluating the performances in forecasting VaR and ES at the risk levels of 1%, 2.5%, and 5% via the Quantile Loss function of González-Rivera et al. (2004) and the class of strictly consistent loss functions of Fissler and Ziegel (2016), respectively, the GARCH(1,1) and some REGARCH specifications clearly outperform the simple RGARCH based on a single measurement equation. The Model Confidence Set (MCS) of Hansen et al. (2011) is used to assess the significance of differences in the predictive performances of the models under analysis.
2. Model Specifications
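The GARCH model is by far the most widely used specification for fitting and forecasting financial volatility. Let $r_t$ be the daily log-return at time $t$ and $\mathcal{F}_{t-1}$ the information available up to time $t-1$; then, the GARCH(1,1) model takes the following form:
$$r_t = \sqrt{h_t}\, z_t, \tag{1}$$
$$h_t = \omega + \alpha r_{t-1}^2 + \beta h_{t-1}, \tag{2}$$
where $h_t = \mathrm{var}(r_t \mid \mathcal{F}_{t-1})$ and $z_t \overset{iid}{\sim} (0,1)$. The positivity condition for the conditional variance requires $\alpha$ and $\beta$ to be non-negative constants and $\omega$ to be a (strictly) positive constant, while a necessary condition for the weak stationarity of GARCH(1,1) is $\alpha + \beta < 1$.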
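To make the recursion concrete, the following minimal Python sketch filters the conditional variance of a GARCH(1,1) under the standard form $h_t = \omega + \alpha r_{t-1}^2 + \beta h_{t-1}$. The function names and the choice of starting the recursion at the unconditional variance are illustrative assumptions, not part of the estimation procedure used in the paper.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Filter h_t = omega + alpha * r_{t-1}^2 + beta * h_{t-1} for t = 1..T."""
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("positivity requires omega > 0, alpha >= 0, beta >= 0")
    if alpha + beta >= 1:
        raise ValueError("weak stationarity requires alpha + beta < 1")
    # Assumption: initialise at the unconditional variance omega / (1 - alpha - beta).
    h = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        h.append(omega + alpha * r * r + beta * h[-1])
    return h


def garch11_forecast(returns, omega, alpha, beta):
    """One-step-ahead variance forecast given all observed returns."""
    h = garch11_variance(returns, omega, alpha, beta)
    return omega + alpha * returns[-1] ** 2 + beta * h[-1]
```

For example, `garch11_forecast(daily_returns, 0.01, 0.1, 0.8)` returns the variance forecast for the day following the last observation, where `daily_returns` is a hypothetical list of daily log-returns.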
Many studies have documented that realized volatility measures based on intradaily returns can greatly improve the accuracy of volatility forecasts. Unlike the standard GARCH approach, the Realized GARCH (RGARCH) of Hansen et al. (2012) employs both low-frequency (daily returns) and high-frequency (realized volatility measures) information to model the dynamics of daily volatility. Following the log-linear specification, the RGARCH is given by
$$r_t = \sqrt{h_t}\, z_t, \tag{3}$$
$$\log h_t = \omega + \beta \log h_{t-1} + \gamma \log x_{t-1}, \tag{4}$$
$$\log x_t = \xi + \varphi \log h_t + \tau(z_t) + u_t, \tag{5}$$
where $x_t$ is a realized volatility measure and $\tau(z_t)$ is the leverage function, with $z_t \overset{iid}{\sim} (0,1)$ and $u_t \overset{iid}{\sim} (0, \sigma_u^2)$ being mutually independent.
Therefore, the RGARCH provides a joint modeling framework of the return and realized volatility, replacing squared returns with a more informative volatility estimator to capture the conditional variance dynamics. The model is completed by the measurement equation in (5), which allows us to define the link between the (ex-post) realized measure and the (ex-ante) latent conditional variance. In addition, substituting the measurement equation into the GARCH equation, the model implies an AR(1) representation for the log-conditional variance, namely,
$$\log h_t = \mu + \pi \log h_{t-1} + \gamma w_{t-1}, \tag{6}$$
where $\mu = \omega + \gamma\xi$, $\pi = \beta + \gamma\varphi$ and $w_t = \tau(z_t) + u_t$, with the restriction $|\pi| < 1$ to ensure the stationarity of the process (Hansen et al. 2012; Li et al. 2019).
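As an illustration of how the GARCH and measurement equations interact, the sketch below filters the log-conditional variance of a log-linear RGARCH and recovers the measurement residuals. It assumes the Hansen et al. (2012) form, $\log h_t = \omega + \beta \log h_{t-1} + \gamma \log x_{t-1}$ and $\log x_t = \xi + \varphi \log h_t + \tau(z_t) + u_t$, with a quadratic leverage function $\tau(z) = \tau_1 z + \tau_2 (z^2 - 1)$; the function name, argument names, and the user-supplied starting value `log_h0` are hypothetical.

```python
import math


def rgarch_filter(returns, realized, omega, beta, gamma,
                  xi, phi, tau1, tau2, log_h0):
    """Return (log_h, z, u) for the log-linear Realized GARCH.

    log_h[t] : log conditional variance of day t
    z[t]     : standardized return r_t / sqrt(h_t)
    u[t]     : measurement-equation residual of day t
    """
    log_h, z, u = [log_h0], [], []
    for t in range(len(returns)):
        if t > 0:
            # GARCH equation: yesterday's realized measure drives today's variance.
            log_h.append(omega + beta * log_h[-1]
                         + gamma * math.log(realized[t - 1]))
        z_t = returns[t] / math.exp(0.5 * log_h[t])
        leverage = tau1 * z_t + tau2 * (z_t * z_t - 1.0)
        # Measurement equation solved for the residual u_t.
        u.append(math.log(realized[t]) - xi - phi * log_h[t] - leverage)
        z.append(z_t)
    return log_h, z, u
```

By construction, the filtered series satisfies the AR(1) representation with intercept $\omega + \gamma\xi$ and persistence $\beta + \gamma\varphi$, which provides a simple numerical check on any implementation.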
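Hansen and Huang (2016) proposed the Realized Exponential GARCH (REGARCH) model, which allows the inclusion of multiple realized measures of volatility and includes an explicit leverage term in the GARCH equation, providing further flexibility in modeling the dependence between returns and volatility.
In its general formulation, considering a vector of $K$ realized measures $x_t = (x_{1,t}, \ldots, x_{K,t})'$, the REGARCH is defined as
$$r_t = \sqrt{h_t}\, z_t, \tag{7}$$
$$\log h_t = \omega + \beta \log h_{t-1} + \tau(z_{t-1}) + \gamma' u_{t-1}, \tag{8}$$
$$\log x_{j,t} = \xi_j + \varphi_j \log h_t + \delta_j(z_t) + u_{j,t}, \qquad j = 1, \ldots, K, \tag{9}$$
where $z_t \overset{iid}{\sim} (0,1)$ and $u_t = (u_{1,t}, \ldots, u_{K,t})' \overset{iid}{\sim} (0, \Sigma)$. In the empirical analysis, as usual, a quadratic form is used for the leverage functions, that is, $\tau(z_t) = \tau_1 z_t + \tau_2 (z_t^2 - 1)$ and $\delta_j(z_t) = \delta_{j,1} z_t + \delta_{j,2} (z_t^2 - 1)$.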
3. Estimation and Inference
The GARCH and R(E)GARCH models are estimated using a maximum likelihood (ML) approach. In particular, we assume a standardized Student-t distribution for innovations to model both fat-tail and excess kurtosis observed in return series.
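For REGARCH specifications, due to the presence of $K$ measurement equations, following Hansen and Huang (2016), we assume that $u_t \sim N_K(0, \Sigma)$, where $N_K(0, \Sigma)$ denotes a $K$-variate Normal distribution with mean $0$ and variance-covariance matrix $\Sigma$. Therefore, the log-likelihood function for this class of models is given by
$$\mathcal{L}(r, x; \theta) = \ell(r) + \ell(x \mid r),$$
$$\ell(r) = \sum_{t=1}^{T} \left[ \log \Gamma\!\left(\tfrac{\nu+1}{2}\right) - \log \Gamma\!\left(\tfrac{\nu}{2}\right) - \tfrac{1}{2} \log\big(\pi(\nu-2)\big) - \tfrac{1}{2} \log h_t - \tfrac{\nu+1}{2} \log\!\left(1 + \frac{r_t^2}{h_t(\nu-2)}\right) \right], \tag{10}$$
$$\ell(x \mid r) = -\frac{1}{2} \sum_{t=1}^{T} \left[ K \log(2\pi) + \log|\Sigma| + u_t' \Sigma^{-1} u_t \right], \tag{11}$$
where $u_t = (u_{1,t}, \ldots, u_{K,t})'$, $\nu$ denotes the degrees of freedom of the Student-t distribution, and $\pi$ here is the mathematical constant. The log-likelihood accounts for the contribution of realized measures by (11) and the contribution of returns by (10).
On the other hand, as the RGARCH is characterized by a single realized measure, Equation (11) reduces to
$$\ell(x \mid r) = -\frac{1}{2} \sum_{t=1}^{T} \left[ \log(2\pi) + \log \sigma_u^2 + \frac{u_t^2}{\sigma_u^2} \right],$$
with $u_t = \log x_t - \xi - \varphi \log h_t - \tau(z_t)$.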
Finally, as standard GARCH models do not allow for measurement equations, estimation of model parameters is performed by focusing only on the partial log-likelihood of returns in Equation (10).
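The partial log-likelihood of returns can be evaluated with a few lines of code once the conditional variances are available. The sketch below assumes the usual standardized (unit-variance) Student-t density with $\nu > 2$ degrees of freedom; the function name is hypothetical, and the conditional variances are taken as given from whichever model is being compared.

```python
import math


def student_t_partial_loglik(returns, variances, nu):
    """Sum of log densities of r_t under a standardized Student-t with variance h_t."""
    if nu <= 2:
        raise ValueError("a finite variance requires nu > 2")
    const = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(math.pi * (nu - 2.0)))
    total = 0.0
    for r, h in zip(returns, variances):
        total += (const - 0.5 * math.log(h)
                  - 0.5 * (nu + 1.0) * math.log1p(r * r / (h * (nu - 2.0))))
    return total
```

Because this quantity depends only on the returns and the filtered variances, it can be computed in the same way for GARCH, RGARCH, and REGARCH fits, which is what makes it usable as a common yardstick.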
It is worth noting that the GARCH and R(E)GARCH specifications are not directly comparable in terms of the maximized global log-likelihood. However, since the contribution to the log-likelihood value of returns is the same for both classes of models, the partial log-likelihood of the returns in (10) enables us to compare the empirical fit of the conventional GARCH with that of R(E)GARCH-type models.
4. The Data
The dataset used in this study consists of half-hourly spot prices from the Australian electricity market. In Australia, the Australian Energy Market Operator (AEMO) manages the National Electricity Market (NEM), which interconnects Queensland (QLD), New South Wales (NSW), the Australian Capital Territory (ACT), South Australia (SA), Victoria (VIC), and Tasmania (TAS), as well as the Wholesale Electricity Market (WEM) in Western Australia. In particular, the empirical analysis focuses on 30-min spot prices for NSW, QLD, and VIC for the period between 1 January 2012 and 31 December 2019, where the prices are in Australian dollars per megawatt hour. Continuously recorded half-hourly spot prices are publicly available at https://www.aemo.com.au (accessed on 12 May 2021).
The summary statistics reported in Table 1 highlight some stylized facts of price and return dynamics. Unlike the prices of financial assets, electricity prices can be negative when the supply of electricity temporarily exceeds demand, resulting in a minimum half-hourly intradaily price lower than zero for the three observed markets. A further characteristic of electricity prices is that the maximum price is more than 200 times larger than the average price, implying high variability and a high degree of skewness and kurtosis. These features can easily be seen in Figure 1, which shows the time series of 30-min spot prices in the considered Australian electricity markets. Each market exhibits negative prices and several spikes with a magnitude much higher than the average price. This emphasizes a further characteristic of electricity prices: the existence of extreme jumps.
As negative prices are a rare occurrence, to compute log-returns, we excluded days with nonpositive prices (Haugom and Ullrich 2012; Qu et al. 2018). Further, to account for intraday seasonal patterns in the raw data, following Haugom and Ullrich (2012), returns are demeaned using half-hourly median returns, $r^{*}_{t,i} = r_{t,i} - \tilde{r}_{m,d,i}$, where $\tilde{r}_{m,d,i}$ denotes the median return for day $t$ in month $m$, on day of the week $d$, and in half-hour period $i$. Therefore, in our empirical study, we used demeaned intraday returns $r^{*}_{t,i}$ for computing realized volatility measures.
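The seasonal adjustment described above amounts to a grouped median subtraction. The following minimal sketch assumes each intraday return is tagged with its month, day of the week, and half-hour slot; the record layout and function name are hypothetical choices for illustration.

```python
from collections import defaultdict
from statistics import median


def demean_by_seasonal_median(records):
    """records: list of (month, weekday, half_hour, ret) tuples.

    Returns the list of returns with the median of their
    (month, weekday, half_hour) cell subtracted, in the original order.
    """
    cells = defaultdict(list)
    for month, weekday, half_hour, ret in records:
        cells[(month, weekday, half_hour)].append(ret)
    medians = {key: median(values) for key, values in cells.items()}
    return [ret - medians[(month, weekday, half_hour)]
            for month, weekday, half_hour, ret in records]
```

Using the median rather than the mean keeps the seasonal benchmark from being distorted by the extreme spikes that characterize half-hourly electricity returns.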
On the other hand, we refer to daily close-to-close log-returns as daily price changes. Although the mean and median are close to zero, electricity returns are highly volatile, as evident from the standard deviation, which varies between 0.216 and 0.314. In addition, the negative skewness and excess kurtosis clearly reveal the non-Gaussian nature of the distribution.
Finally, descriptive statistics point out that the 30-min RV is, as expected, characterized by positive skewness and a strong excess kurtosis because of the many peaks and troughs in the series. This can easily be seen in Figure 2, which shows the RV behavior for the three electricity markets, where each plot has the same y-axis scale to facilitate comparison across markets. The QLD market reveals several periods of high volatility and the most extreme levels of RV compared to NSW and VIC, which instead show more moderate and less recurrent spikes.
As price volatility is characterized by different dynamics under different market conditions, in addition to the Realized Volatility (RV) (Andersen et al. 2003), to account for market microstructure noise and jumps in our empirical analysis, we also referred to the robust estimators Realized Kernel (RK) (Barndorff-Nielsen et al. 2008) and medRV (MRV) (Andersen et al. 2012), computed at a frequency of 30 min.
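To illustrate why a jump-robust estimator matters for spiky electricity prices, the sketch below computes the RV as the sum of squared intraday returns and the medRV in its standard form, $\frac{\pi}{6 - 4\sqrt{3} + \pi} \frac{N}{N-2} \sum_{i=2}^{N-1} \mathrm{med}(|r_{i-1}|, |r_i|, |r_{i+1}|)^2$. The function names are hypothetical, and the Realized Kernel is omitted because it requires additional choices (kernel and bandwidth) not detailed here.

```python
import math
from statistics import median


def realized_variance(intraday_returns):
    """RV: sum of squared intraday returns for one day."""
    return sum(r * r for r in intraday_returns)


def med_rv(intraday_returns):
    """medRV: scaled sum of squared medians of three adjacent absolute returns."""
    n = len(intraday_returns)
    if n < 3:
        raise ValueError("medRV needs at least three intraday returns")
    abs_r = [abs(r) for r in intraday_returns]
    scale = math.pi / (6.0 - 4.0 * math.sqrt(3.0) + math.pi) * n / (n - 2)
    return scale * sum(median(abs_r[i - 1:i + 2]) ** 2 for i in range(1, n - 1))
```

With 48 half-hourly returns of equal size and one isolated spike, the spike enters the RV in full, whereas the medRV is unchanged because the median of any three adjacent absolute returns ignores a single outlier.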
5. In-Sample Analysis
In this section, we present the in-sample results for NSW in Table 2, QLD in Table 3, and VIC in Table 4. For each electricity market, we report the in-sample fit of the GARCH(1,1); three RGARCH and REGARCH models based on the single realized estimators RV, RK, and MRV; three REGARCH models with two measurement equations for the pairs (RV,RK), (RV,MRV), and (RK,MRV); and the REGARCH(RV,RK,MRV) combining all three realized estimators considered. Note that to simplify the presentation of the results, the RGARCH models are estimated using the autoregressive representation in order to make the estimated coefficients comparable with those of the REGARCH models.
The empirical results show that the estimates of the GARCH(1,1) for NSW, QLD, and VIC are 0.863, 0.669, and 0.874, respectively, confirming the fact that QLD is the most-nervous electricity market. This aspect is also highlighted by the estimate of taking the highest value for QLD. Focusing on the volatility persistence, for the GARCH(1,1), we observe ; whereas, for the R(E)GARCH-type models, (the parameter which summarizes the persistence of volatility for this class of models) is, on average, about 0.96. Thus, volatility decay is faster for models including high-frequency information. Moreover, a larger for individual realized measures in R(E)GARCHs with respect to the standard GARCH(1,1) is a sign that intraday information provides a stronger signal on future volatility than squared returns.
Analyzing the estimated parameters of the measurement equations, it turns out that although the realized measures are upward biased measures of the conditional variance $h_t$, they are approximately proportional to $h_t$, as suggested by estimates of the measurement-equation slope $\varphi$ close to 1.
Furthermore, differently to what is usually observed for financial stock returns showing
and
as well as
and
, here, we find that all parameters of the leverage functions are positive with
and
. This implies an inverse leverage effect, as positive shocks in electricity prices lead to a larger increase in volatility than negative shocks, confirming the findings of Knittel and Roberts (2005), Frömmel et al. (2014), and Qu et al. (2018), among others.
Overall, the estimated measurement error variance confirms that the QLD electricity market is the most volatile, showing the highest values of . Additionally, the parameter estimates for are approximately the same for RV and RK, but are larger for MRV, highlighting that a frequency of 30 min is not sufficient to cancel out the effects of price jumps. In addition, focusing on the individual realized measures, an inverse relationship emerges between the coefficient and the variance of the residual measurement error. This is because reflects the “amount” of information about volatility variation; so, the larger the coefficient, the more accurate the realized measure.
Regarding the parameters, as expected, their estimates are very close to 1, indicating that RV, RK, and MRV are highly positively correlated, implying that some are negative or not significant for REGARCH models using two or three realized measures.
The estimated degrees-of-freedom parameter $\nu$ of the Student-t distribution ranges between 2.85 and 5.37, with the upper bound always provided by the GARCH(1,1), confirming the existence of leptokurtosis in the conditional distribution of returns.
Finally, as the full log-likelihood is not comparable across models using different realized measures, we report only the partial log-likelihood of the return component. Not surprisingly, there is a clear improvement from RGARCH to REGARCH specifications, which provide the highest maximized values. The lowest values of the partial log-likelihood always occur for the RGARCH(MRV), while, overall, the standard GARCH(1,1) performs quite well.
6. Out-of-Sample Analysis
In this section, one-day-ahead forecasts of volatility, Value-at-Risk (VaR), and Expected Shortfall (ES) are generated with the rolling window method with daily re-estimation. In particular, the first estimation window is of 2000 observations, leading to different out-of-sample periods for the electricity markets: 26/06/2017–31/12/2019 (913 days) for NSW; 01/08/2017–31/12/2019 (813 days) for QLD; 06/09/2017–31/12/2019 (788 days) for VIC.
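The rolling-window scheme with daily re-estimation can be sketched as a simple loop: each window of fixed length is used to re-fit the model and produce the forecast for the following day. In the sketch below, `fit_and_forecast` is a hypothetical callback standing in for any of the models considered, and the sample-variance forecaster is only a placeholder to keep the example self-contained.

```python
from statistics import pvariance


def rolling_one_step_forecasts(returns, window, fit_and_forecast):
    """Re-estimate on each window of `window` days and forecast the next day.

    The k-th forecast uses observations k .. k + window - 1 and targets
    day k + window, so len(returns) - window forecasts are produced.
    """
    forecasts = []
    for start in range(len(returns) - window):
        sample = returns[start:start + window]
        forecasts.append(fit_and_forecast(sample))
    return forecasts


def sample_variance_forecast(sample):
    """Placeholder forecaster: the variance of the estimation window."""
    return pvariance(sample)
```

With a window of 2000 observations, a series of 2913 daily returns would yield 913 one-day-ahead forecasts, which matches the length of the NSW out-of-sample period reported above.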
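The out-of-sample forecasting performance of the models is evaluated by considering different loss functions. First, the ability to accurately forecast volatility is assessed by the QLIKE (Patton 2011), Mean Squared Error (MSE), and Mean Absolute Error (MAE):
$$\begin{aligned} \mathrm{QLIKE} &= \frac{1}{T} \sum_{t=1}^{T} \left( \log \hat{h}_t + \frac{\tilde{\sigma}_t^2}{\hat{h}_t} \right), \\ \mathrm{MSE} &= \frac{1}{T} \sum_{t=1}^{T} \left( \tilde{\sigma}_t^2 - \hat{h}_t \right)^2, \\ \mathrm{MAE} &= \frac{1}{T} \sum_{t=1}^{T} \left| \tilde{\sigma}_t^2 - \hat{h}_t \right|, \end{aligned} \tag{12}$$
where $\hat{h}_t$ is the 1-step-ahead conditional variance forecast and $\tilde{\sigma}_t^2$ is the volatility proxy at time $t$. Since the magnitude of the bias due to microstructure noise and jumps tends to vanish at low frequencies, to consider different market scenarios in our forecasting study, we refer to the RV computed at the frequencies of 30 min, 2 h, and 6 h as volatility proxies.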
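The three volatility loss functions are straightforward to compute from a series of variance forecasts and proxies. The sketch below assumes the QLIKE in the form $\log \hat{h}_t + \tilde{\sigma}_t^2 / \hat{h}_t$, which is one of several parametrizations that rank forecasts identically; the function names are hypothetical.

```python
import math


def qlike(proxies, forecasts):
    """Average QLIKE loss, assuming the form log(h) + proxy / h."""
    return sum(math.log(h) + s / h for s, h in zip(proxies, forecasts)) / len(forecasts)


def mse(proxies, forecasts):
    """Average squared error between volatility proxy and variance forecast."""
    return sum((s - h) ** 2 for s, h in zip(proxies, forecasts)) / len(forecasts)


def mae(proxies, forecasts):
    """Average absolute error between volatility proxy and variance forecast."""
    return sum(abs(s - h) for s, h in zip(proxies, forecasts)) / len(forecasts)
```

Note that the QLIKE depends on the forecast error only through the ratio of proxy to forecast, which makes it less sensitive than the MSE to the extreme levels of volatility typical of electricity markets.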
Next, we evaluate the out-of-sample forecasting ability of the models considering one-step-ahead VaR and ES forecasts generated for three different risk levels: 1%, 2.5%, and 5%. The adequacy of VaR forecasts is assessed through the Quantile Loss ($QL$) function (González-Rivera et al. 2004; Koenker 2005)
$$QL_t(\alpha) = \big(\alpha - l_t(\alpha)\big)\big(r_t - VaR_t(\alpha)\big), \tag{13}$$
where $l_t(\alpha) = I\big(r_t < VaR_t(\alpha)\big)$. The $QL$ is a strictly consistent scoring rule for VaR prediction. Further, being an asymmetric loss function, it is particularly suited to assess quantile risk measures as it imposes a higher penalty, with weight $(1 - \alpha)$, for observations below the $\alpha$-quantile level, namely, when the observed return falls below the VaR.
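The asymmetry of the Quantile Loss is easiest to see in code. The sketch below assumes the usual form $(\alpha - I(r_t < VaR_t))(r_t - VaR_t)$ averaged over the forecast period; the function name is hypothetical.

```python
def quantile_loss(returns, var_forecasts, alpha):
    """Average quantile (tick) loss of VaR forecasts at risk level alpha."""
    total = 0.0
    for r, v in zip(returns, var_forecasts):
        hit = 1.0 if r < v else 0.0       # 1 when the return falls below the VaR
        total += (alpha - hit) * (r - v)  # weight (1 - alpha) on violations, alpha otherwise
    return total / len(returns)
```

At $\alpha = 0.05$, a return that falls 0.1 below the VaR contributes $0.95 \times 0.1 = 0.095$, whereas a return that lies 0.3 above the VaR contributes only $0.05 \times 0.3 = 0.015$.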
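However, as ES turns out to be jointly elicitable with VaR, we rely on the class of (strictly) consistent scoring functions (Fissler and Ziegel 2016) to evaluate the ability of the proposed models to jointly forecast VaR and ES
$$FZ_t(\alpha) = \big(\mathbb{1}\{r_t \le v_t\} - \alpha\big)\left(G_1(v_t) - G_1(r_t) + \frac{1}{\alpha} G_2(e_t)\, v_t\right) - G_2(e_t)\left(\frac{1}{\alpha} \mathbb{1}\{r_t \le v_t\}\, r_t - e_t\right) - \mathcal{G}_2(e_t), \tag{14}$$
with $v_t$ and $e_t$, the VaR and ES, respectively, while $G_1$ is weakly increasing, $G_2$ is strictly increasing and strictly positive, and $\mathcal{G}_2' = G_2$.
Although several strictly consistent scoring rules for the pair (VaR, ES) can be obtained as special cases of the family of functions in (14), following Patton et al. (2019), we assume VaR and ES to be strictly negative, with $e_t \le v_t < 0$, and set $G_1(x) = 0$ and $G_2(x) = -1/x$, resulting in the zero-degree homogeneous loss function
$$FZ0_t(\alpha) = -\frac{1}{\alpha e_t}\, \mathbb{1}\{r_t \le v_t\}\,(v_t - r_t) + \frac{v_t}{e_t} + \log(-e_t) - 1. \tag{15}$$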
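For reference, the zero-degree homogeneous loss of Patton et al. (2019), commonly written as $-\frac{1}{\alpha e}\mathbb{1}\{r \le v\}(v - r) + \frac{v}{e} + \log(-e) - 1$, can be evaluated as in the sketch below. The function name is hypothetical, and the check that both forecasts are strictly negative reflects the assumption stated in the text.

```python
import math


def fz0_loss(returns, var_forecasts, es_forecasts, alpha):
    """Average FZ0 loss for joint (VaR, ES) forecasts at risk level alpha."""
    total = 0.0
    for r, v, e in zip(returns, var_forecasts, es_forecasts):
        if v >= 0 or e >= 0:
            raise ValueError("FZ0 assumes strictly negative VaR and ES forecasts")
        hit = 1.0 if r <= v else 0.0
        total += (-hit * (v - r) / (alpha * e)
                  + v / e + math.log(-e) - 1.0)
    return total / len(returns)
```

A VaR violation is penalized in proportion to its size divided by $\alpha |e|$, so at the 1% level an exceedance costs five times as much as the same exceedance at the 5% level, for a given ES forecast.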
Finally, the Model Confidence Set (MCS) of Hansen et al. (2011) is used to assess the significance of differences in the predictive performances of the models under analysis considering the confidence levels of 75% and 90%. In particular, in the MCS implementation, we have considered a Range statistic and 5000 bootstrap resamples generated by means of a block-bootstrap procedure, where the optimal block length has been estimated using the method described in Patton et al. (2009).
Table 5 shows the out-of-sample model comparison based on average losses for QLIKE, MSE, and MAE using different volatility proxies. Values in boldface indicate the models that return the minimum average loss, while those shaded in gray and light-gray are associated with models that are included in the 75% and 90% MCS, respectively.
The results in Table 5 clearly indicate that the RGARCH relying on the jump-robust estimator medRV provides more accurate volatility forecasts than other models, always minimizing the loss functions considered for all proxies and each electricity market. Furthermore, it is the only model entering the MCS at every confidence level. No other models enter the MCS, with the exception of the RGARCH(RV) and RGARCH(RK) using the 30-min RV and the RGARCH(RK) using the 2-hour RV, which are included in the 90% MCS for the MSE loss function and the VIC electricity market. On the other hand, the GARCH(1,1) appears to be the worst competitor as it produces the highest loss in every possible scenario. The largest discrepancies between the GARCH(1,1) and R(E)GARCH models occur for the QLIKE loss function, confirming that the QLIKE is more powerful in rejecting poorly performing predictors (Liu et al. 2015; Patton 2011). Overall, the simple RGARCH structure leads to substantial improvements in the accuracy of volatility forecasts over the REGARCH models based on one or more realized measures.
The scenario is completely reversed when forecasting VaR and ES. The results in Table 6 point out that REGARCH specifications provide lower average values of the Quantile Loss and of the FZ0 loss than RGARCH models, which are always excluded from the MCS at any risk level. At the same time, the GARCH(1,1) model also turns out to be a good competitor, minimizing the loss functions in most cases. In particular, it is the only model that always enters the MCS at the most extreme 1% risk level both for VaR and ES, and for NSW and QLD, no other model belongs to the MCS. Although the REGARCH(RV,RK,MRV) model minimizes the Quantile Loss and the FZ0 loss for different risk scenarios, models based on a single realized measure or a combination of two almost always enter the MCS, especially when RV and RK are considered to explain volatility dynamics. Furthermore, the MCS shows that the differences between the models are more pronounced in forecasting ES. Finally, moving towards less extreme risk levels, such as 5%, there is less discrimination between models.
7. Conclusions
This paper uses half-hourly spot prices from the Australian electricity markets of New South Wales, Queensland, and Victoria to predict volatility and manage risk in energy markets. In this framework, we extend the literature on modeling the conditional variance of returns using the Realized GARCH approach by combining information from multiple realized robust and nonrobust measures to capture the key features of electricity prices such as extreme jumps and the inverse leverage effect. Our empirical analysis underlines the following points. First, specifications with multiple realized measures outperform those based on a single realized measure as well as the standard GARCH(1,1), resulting in a remarkably better fit of the data in-sample. Second, the medRV jump-robust measure significantly increases the accuracy of out-of-sample volatility forecasts. In particular, the simple Realized GARCH based on a single measurement equation for the jump-robust medRV estimator always minimizes the set of loss functions considered—i.e., QLIKE, MSE, and MAE—and is the only model that enters the MCS under all circumstances addressed. Finally, in contrast to volatility forecasting, when assessing the predictive ability of the models in terms of VaR and ES, it emerges that the standard GARCH is highly competitive especially for the extreme risk level of 1%. Similarly, REGARCH models based on one or more realized measures outperform the simple RGARCH, which shows—in this case—the worst results in minimizing the loss functions at any risk level. Further, the MCS highlights greater discrimination between models in predicting ES. Electricity market participants aim to constantly pursue optimal trading limits in order to adequately allocate capital and to cover potential losses if trading limits are violated. 
This is also because overcapitalization implies idle capital that could undermine the profitability of energy industries; at the same time, undercapitalization could cause financial difficulties when they are unable to honor trading contracts. Therefore, accurately predicting VaR and ES is essential for effective energy risk management, as they are the most commonly used measures for establishing optimal trading limits. One aspect that has not been considered here, and that is worth examining in future research, is how the inclusion of exogenous factors, such as weather conditions, would affect electricity price volatility. In addition, volatility inter-relationships between various regions and energy markets are interesting future research areas, as is extending the results to energy markets other than Australia. Finally, as this study has mainly focused on modeling one-day-ahead volatility, exploiting the properties of the Realized GARCH-type models, a natural extension would be to predict price volatility at a longer horizon.