Article

GARMA, HAR and Rules of Thumb for Modelling Realized Volatility

by David Edmund Allen 1,2,3,* and Shelton Peiris 1

1 School of Mathematics and Statistics, University of Sydney, Camperdown, NSW 2006, Australia
2 Department of Finance, Asia University, Taichung 41354, Taiwan
3 School of Business and Law, Edith Cowan University, Joondalup, WA 6027, Australia
* Author to whom correspondence should be addressed.
Risks 2023, 11(10), 179; https://doi.org/10.3390/risks11100179
Submission received: 4 September 2023 / Revised: 8 October 2023 / Accepted: 11 October 2023 / Published: 16 October 2023

Abstract

This paper features an analysis of the relative effectiveness, in terms of Adjusted R-squared, of a variety of methods of modelling realized volatility (RV): Gegenbauer processes in Auto-Regressive Moving Average format (GARMA), Heterogeneous Auto-Regressive (HAR) models, and simple rules of thumb. The analysis is applied to two data sets featuring the RV of the S&P500 index, sampled at 5 min intervals, provided by the Oxford-Man RV database. The GARMA model does perform slightly better than the HAR model, but both models are matched by a simple rule-of-thumb regression model based on lags of squared, cubed and quartic demeaned daily returns.

1. Introduction

This paper features the application of time series modelling techniques to the modelling of RV; in this particular case, RV sampled at 5 min intervals, as supplied by the Oxford-Man Institute (see: https://realized.oxford-man.ox.ac.uk/data, accessed on 22 March 2022). We compare the effectiveness of Gegenbauer processes in Auto-Regressive Moving Average format (GARMA models), Heterogeneous Auto-Regressive (HAR) models, and other rules of thumb based on demeaned squared, cubed and quartic daily returns, as methods for modelling and forecasting daily 5 min RV. The aim of the paper is to explore whether sophisticated models outperform simple rules of thumb.
Over the past 100 years, considerable advances have been made in time series modelling. Yule (1926) and Slutsky (1927) pioneered the stochastic analysis of time series and introduced the concepts underlying autoregressive (AR) and moving average (MA) models. Box and Jenkins (1970) suggested methods for applying autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) models to find the best fit of a time series model to past values of a time series.
A contrary view of the appropriateness of this approach has been promoted by Commandeur and Koopman (2007), who suggest that the Box–Jenkins approach is fundamentally problematic. They have championed the adoption of alternative state-space methods on the grounds that many real economic series are not truly stationary, even after differencing.
In the 1980s, attention switched to the consideration of the issues related to stationarity and non-stationary time series, fractional integration and cointegration. Granger and Joyeux (1980) and Hosking (1981) focussed attention on fractionally integrated autoregressive moving average (ARFIMA or FARIMA) processes. Unit root testing to assess the stationarity of a time series became established via the application of the Dickey–Fuller test, following the work of Dickey and Fuller (1979).
Engle and Granger (1987) developed the concept of cointegration, whereby two time series might be individually integrated and non-stationary I(1), but some linear combination of them might possess a lower order of integration and be stationary, in which case the series are said to be cointegrated. Many of these conceptual developments have important applications to economic and financial time series, and to economic and financial theory.
One of the common features of financial time series is that the variance of the series is not homoscedastic, and that the squared innovations are autocorrelated. The Autoregressive Conditional Heteroskedasticity (ARCH) model, which incorporates past squared error terms, was developed by Engle (1982). Bollerslev (1986) further generalised it to GARCH by including lagged conditional variance terms. GARCH therefore predicts that the best indicator of future variance is a weighted average of the long-run variance, the predicted variance for the current period, and any new information in this period, as captured by the squared residuals. GARCH models provide an estimate of the conditional variance of a financial price time series.
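The GARCH(1,1) weighted-average structure just described can be sketched as a simple variance recursion. The paper's estimations were carried out in R; this Python fragment is purely illustrative, and the function name and initialisation choice are our own:

```python
import numpy as np

def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance recursion of a GARCH(1,1) model:
    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1],
    i.e. a weighted average of the long-run variance, last period's
    forecast, and news captured by the squared residual."""
    sigma2 = np.empty(len(returns), dtype=float)
    sigma2[0] = np.var(returns)  # initialise at the sample variance
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2
```

With alpha + beta < 1 the recursion is covariance stationary, with implied long-run variance omega / (1 - alpha - beta).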
The concept of RV and associated metrics were developed by Andersen et al. (2001), Andersen et al. (2003), and Barndorff-Nielsen and Shephard (2002). This alternative approach measures the variance directly from the observed values of the price series. These RV measures are theoretically sound, high-frequency, nonparametric-based estimators of the variation of the price path of an asset during the times at which the asset trades frequently on an exchange.
The modelling of the variance of financial time series and the use of RV is the focus of this paper. In the empirical analysis, we use 5 min RV estimates from Oxford-Man for the S&P500 Index as the RV benchmark. Their database contains daily (close-to-close) financial returns and a corresponding sequence of daily realized measures rm_1, rm_2, …, rm_T.
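At its core, a daily realized measure of this kind is the sum of squared intraday log returns over the day's sampling grid. A minimal sketch in Python (illustrative only; the Oxford-Man estimators also apply subsampling and noise corrections not shown here):

```python
import numpy as np

def realized_variance(prices):
    """Realized variance for one trading day: the sum of squared
    log returns over the intraday (e.g. 5 min) sampling grid."""
    log_returns = np.diff(np.log(np.asarray(prices, dtype=float)))
    return float(np.sum(log_returns ** 2))

# Toy example: a simulated 5 min price path (79 grid points roughly
# corresponds to one 6.5 h US trading day sampled every 5 min).
rng = np.random.default_rng(0)
path = 100.0 * np.exp(np.cumsum(0.001 * rng.standard_normal(79)))
rv5 = realized_variance(path)
```

Taking the square root of this quantity gives the corresponding realized volatility.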
Corsi (2009, p. 174) suggests “an additive cascade model of volatility components defined over different time periods”, in which “the volatility cascade leads to a simple AR-type model in the realized volatility with the feature of considering different volatility components realized over different time horizons”, and which he termed a “Heterogeneous Autoregressive model of Realized Volatility” (HAR). We make use of the Corsi (2009) HAR model to model RV in some of the empirical tests included in this paper.
However, the main focus of this paper is the application of Gegenbauer processes to the modelling of RV. Gegenbauer processes were introduced by Hosking (1981) and further developed by Anděl (1986) and Gray et al. (1989). The latter proposed the class of time series models known as Gegenbauer ARMA, or, as abbreviated, GARMA processes, which are the central focus of this paper.
In the current paper, we compare the effectiveness of GARMA models, as opposed to HAR models and other rules of thumb, based on demeaned squared daily returns, as methods for modelling and forecasting daily 5 min RV.
The paper is a further companion piece to two previous studies in the topic’s general area, namely, Allen and McAleer (2020) and Allen (2020), that compared the effectiveness of stochastic volatility (SV), vanilla GARCH and HAR models, as opposed to simple rules of thumb, in their effectiveness as tools for capturing the RV of major stock market indices.
The current paper concentrates on the S&P500 index and examines whether GARMA, HAR or simple rules of thumb better capture its RV sampled at 5 min intervals, as provided by Oxford-Man. Thus, the central concern is which method best captures the long memory properties of a historical time series of RV5 for the S&P500 index. This is in contrast with the two previously mentioned studies, which contrasted the effectiveness of the volatility models per se.
The benchmark is provided by the estimates of RV5, in a sample of daily estimates of RV5, running from 8 May 1997 to 30 August 2013 with 4096 observations, plus a longer-period sample of RV5, also based on the S&P500 Index, running from 4 January 2000 to 30 April 2020, comprising 5099 observations. This is the same data set as used in Allen (2020).
The motivation for this paper is provided by Poon and Granger (2005, p. 507), who observed that: “as a rule of thumb, historical volatility methods work equally well compared with more sophisticated ARCH class and SV models.” This paper similarly seeks to explore whether simple rules of thumb, in this case based on the use of a regression model featuring squared demeaned daily returns, with the subsequent addition of cubed and quartic powers of them, perform as well as more sophisticated time series models.
The paper is divided into four sections: Section 2 reviews the literature and econometric methods employed. Section 3 presents the results, and Section 4 presents the conclusions.

2. Previous Work and Econometric Models

Recent reviews of the literature on the nature and applications of Gegenbauer processes are provided by Hunt et al. (2021) and Dissanayake et al. (2018). Peiris et al. (2005) and Peiris and Thavaneswaran (2007) considered long memory models driven by heteroskedastic GARCH errors. Peiris and Asai (2016) returned to this topic, and Guegan (2000) combined Gegenbauer processes with integrated GARCH (GIGARCH) to include the attributes of long memory, seasonality and heteroskedasticity at the same time, in the modelling of volatility.

2.1. The Basic GEGENBAUER Model

Let {X_t, t = 1, 2, …} be a stationary random process with autocovariance γ(k) = Cov(X_t, X_{t+k}) and autocorrelation function ρ(k) = γ(k)/γ(0), where k = 1, 2, …. The spectral density function (sdf) is denoted by:

f(ω) = (1/2π) Σ_{k=−∞}^{∞} ρ(k) e^{−iωk},   −π ≤ ω ≤ π,

where ω is the Fourier frequency.
There are various ways in which the long memory component of the Gegenbauer model can be specified, as discussed in Dissanayake et al. (2018). In the analysis that follows, we utilise the R package GARMA, as developed by Hunt (2022a).
A Gegenbauer process is a long memory process generated by the dynamic equation:

(1 − 2uB + B²)^δ X_t = ε_t,   (1)

where |u| < 1, δ ∈ (0, 0.5), and ε_t is a short memory process characterised by a positive and bounded spectral density f_ε(ω). If ε_t ~ WN(0, σ²), (1) is a Gegenbauer process of order δ, or a GARMA(0, δ, 0, u) process. Dissanayake et al. (2018) mention that (1) complies with the definition of a long memory process at the frequency ω₀ = arccos(u). According to (1), X_t arises from filtering the process ε_t by the infinite impulse response

(1 − 2uB + B²)^{−δ} = Σ_{j=0}^{∞} C_j^δ(u) B^j.

It can be shown that a stationary Gegenbauer process contains an unbounded spectrum at ω₀ and is long memory when 0 < δ < 1/2. This special frequency ω₀ is called the Gegenbauer or G-frequency, as outlined by Dissanayake et al. (2018, p. 416).
The k-factor GARMA model, as fit by the GARMA package (Hunt 2022a), is specified as:

φ(B) ∏_{i=1}^{k} (1 − 2u_iB + B²)^{d_i} (1 − B)^d (X_t − μ) = θ(B) ε_t,
  • where φ(B) represents the short-memory autoregressive component of order p,
  • θ(B) represents the short-memory moving average component of order q,
  • (1 − 2u_iB + B²)^{d_i} represents the long-memory Gegenbauer component (in general, there may be k of these),
  • d represents integer differencing (currently only d = 0 or 1 is supported),
  • X_t represents the observed process,
  • ε_t represents the random component of the model; these are assumed to be uncorrelated but identically distributed variates,
  • B represents the backshift operator, defined by B X_t = X_{t−1}.
When k = 0, this is just a short memory model, as would be represented by an ARIMA model.
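The coefficients C_j^δ(u) in the impulse-response expansion above satisfy the standard three-term recursion for Gegenbauer polynomials, which gives a simple way to approximate a GARMA(0, δ, 0, u) path by truncated filtering of white noise. A Python sketch (the paper's estimation used the R GARMA package; the function names and truncation length here are our own illustrative choices):

```python
import numpy as np

def gegenbauer_coeffs(u, delta, n):
    """First n coefficients C_j^delta(u) of the expansion
    (1 - 2uB + B^2)^(-delta) = sum_{j>=0} C_j^delta(u) B^j,
    via the three-term Gegenbauer polynomial recursion at u."""
    c = np.zeros(n)
    c[0] = 1.0
    if n > 1:
        c[1] = 2.0 * delta * u
    for j in range(2, n):
        c[j] = (2.0 * u * (j + delta - 1.0) * c[j - 1]
                - (j + 2.0 * delta - 2.0) * c[j - 2]) / j
    return c

def simulate_garma0(u, delta, n, trunc=1000, seed=0):
    """Approximate GARMA(0, delta, 0, u) path: filter Gaussian
    white noise with the first `trunc` Gegenbauer coefficients."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + trunc)
    c = gegenbauer_coeffs(u, delta, trunc)
    return np.convolve(eps, c, mode="valid")[:n]
```

The coefficients can be checked against the generating function: summing C_j^δ(u) t^j for small t reproduces (1 − 2ut + t²)^{−δ}.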
Dissanayake (2015) considered a class of Gegenbauer processes generated by Gaussian white noise and GARCH errors and suggested that many other models could be nested within this framework. He explored a number of related issues and used state space modelling to explore seasonal Gegenbauer processes.
Phillip (2018) applied a Gegenbauer long memory stochastic volatility model with leverage and a bivariate Student’s t-error distribution to describe the innovations of the observations and latent volatility jointly, and demonstrated its effectiveness in applications to cryptocurrency time series. He demonstrated, by means of MCMC runs for various values of [u, d], that when d is low and u → 0, the convergence of u to its true value becomes slower. This is an expected result since, as d → 0, the process carries less information and becomes “less long-memory”.

2.2. Heterogenous Autoregressive Model (HAR)

Corsi (2009, p. 174) suggests “an additive cascade model of volatility components defined over different time periods”, in which “the volatility cascade leads to a simple AR-type model in the realized volatility with the feature of considering different volatility components realized over different time horizons”, and which he termed a “Heterogeneous Autoregressive model of Realized Volatility”. Corsi (2009) suggests that his model can reproduce the main empirical features of financial returns (long memory, fat tails, and self-similarity) in a parsimonious way. He writes his model as:
σ_{t+1d}^{(d)} = c + β^{(d)} RV_t^{(d)} + β^{(w)} RV_t^{(w)} + β^{(m)} RV_t^{(m)} + ω̃_{t+1d}^{(d)},

where σ^{(d)} is the daily integrated volatility, and RV_t^{(d)}, RV_t^{(w)} and RV_t^{(m)} are, respectively, the daily, weekly, and monthly (ex post) observed realized volatilities.
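In practice, the HAR model reduces to an OLS regression of the next day's RV on today's RV and its trailing 5-day (weekly) and 22-day (monthly) averages. A sketch of the construction and fit (Python for illustration; the paper's HAR estimations used the R 'HARModel' package, and the window lengths follow Corsi's convention):

```python
import numpy as np

def har_design(rv):
    """HAR regressors: daily RV plus its 5-day and 22-day trailing
    averages, aligned so that row t explains rv[t + 1]."""
    X, y = [], []
    for t in range(21, len(rv) - 1):
        X.append([1.0,
                  rv[t],                      # daily component
                  rv[t - 4:t + 1].mean(),     # weekly component
                  rv[t - 21:t + 1].mean()])   # monthly component
        y.append(rv[t + 1])
    return np.array(X), np.array(y)

# OLS fit: coefficient vector is [c, beta_d, beta_w, beta_m]
rng = np.random.default_rng(1)
rv = np.abs(rng.standard_normal(300))   # placeholder for an RV5 series
X, y = har_design(rv)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The overlapping averaging windows are what give this simple AR-type regression its approximate long-memory behaviour.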

2.3. Historical Volatility Model (HISVOL)

Poon and Granger (2005) discuss various practical issues involved in forecasting volatility. They suggest that the HISVOL model has the following form:
σ̂_t = φ_1 σ_{t−1} + φ_2 σ_{t−2} + ⋯ + φ_τ σ_{t−τ},
where σ̂_t is the expected standard deviation at time t, φ is the weight parameter, and σ is the historical standard deviation for the periods indicated by the subscripts. Poon and Granger (2005) suggest that this group of models includes the random walk, historical averages, autoregressive (fractionally integrated) moving average models, and various forms of exponential smoothing that depend on the weight parameter φ.
We use a simple form of this model in which the estimate of σ is the previous day’s demeaned squared return. Poon and Granger (2005) review 66 previous studies and suggest that implied standard deviations appear to perform best, followed by historical volatility and GARCH, which have roughly equal performance.
Barndorff-Nielsen and Shephard (2003) point out that taking the sums of squares of increments of log prices has a long tradition in the financial economics literature. See, for example, Poterba and Summers (1986), Schwert (1989), Taylor and Xu (1997), Christensen and Prabhala (1998), Dacorogna et al. (1998), and Andersen et al. (2001). Shephard and Sheppard (2010, p. 200, footnote 4) note that: “Of course, the most basic realized measure is the squared daily return”. We utilise this approach as the basis of our historical volatility model. Furthermore, Perron and Shi (2020) show how the squared low-frequency returns can be expressed in terms of the temporal aggregation of a high-frequency series in relation to volatility measures.
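Our rule-of-thumb specification is therefore an OLS regression of RV5 on lags of the squared demeaned daily return (with cubed and quartic powers added in the extended version). The design can be sketched as follows (Python, purely illustrative; the function name and alignment are our own):

```python
import numpy as np

def rule_of_thumb_design(returns, rv, n_lags=20):
    """Regress rv[t] on a constant plus n_lags lags of the squared
    demeaned daily return: r2[t-1], ..., r2[t-n_lags]."""
    r2 = (returns - returns.mean()) ** 2
    T = len(r2)
    cols = [np.ones(T - n_lags)]
    cols += [r2[n_lags - k:T - k] for k in range(1, n_lags + 1)]
    X = np.column_stack(cols)
    y = rv[n_lags:]
    return X, y
```

Cubed and quartic powers of the demeaned return enter the extended model as additional lagged columns built the same way.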

3. Results

3.1. The Data Sets

To expedite a direct comparison with previous work on the HAR model, we use the R library package ‘HARModel’ by Sjoerup (2019). This contains RV measures for the S&P500 index from April 1997 to August 2013, and we use the RV5 measures sampled at 5 min intervals.
Table 1 provides a statistical description of this RV5 data set, together with that of another slightly longer data set, taken from 2000 to 2020, also featuring S&P500 index RV5 data taken from Allen and McAleer (2020). Both data sets feature RV5 estimates taken from the Oxford-Man Realized library.
One of the features of estimates of RV is that the time series displays long-memory characteristics. Long memory, also referred to as long-range dependence, refers to the persistence of statistical dependence between observations in a time series as the sampling interval between them increases.
Figure 1 displays the long memory characteristics of the two RV5 time series that we analyse. The first panel in the two plots displays the basic series of RV5, and the two large spikes in RV5 correspond to the effects on volatility of the Global Financial Crisis (GFC) that occurred in 2008. The two panels marked ACF and PACF refer to the autocorrelation and partial autocorrelation statistics.
The R GARMA package, Hunt (2022a), was used to generate the two graphs. The program was instructed to use 100 lags of daily observations. The blue lines in the bottom two sets of panels display the standard error bands. The long memory properties of RV5 are apparent in both sets of diagrams in that the ACF statistics remain well outside the error bands for 100 lags, and the PACF is outside the error bands for up to 30 lags.
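The diagnostic here is the slow decay of the sample ACF relative to the error bands; a minimal sample-ACF computation can be sketched as (Python, illustrative; the paper's plots were generated in R):

```python
import numpy as np

def sample_acf(x, n_lags):
    """Sample autocorrelations rho(0..n_lags). Long memory shows up
    as values remaining outside the +/- 1.96/sqrt(T) error bands
    even at long lags."""
    x = np.asarray(x, dtype=float)
    xd = x - x.mean()
    denom = float(xd @ xd)
    acf = [1.0]
    acf += [float(xd[:-k] @ xd[k:]) / denom for k in range(1, n_lags + 1)]
    return np.array(acf)
```

For a series of length T, values of the sample ACF outside ±1.96/√T are significantly different from zero at the 5 percent level under the white-noise null.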
These long memory characteristics are used in Corsi’s (2009) HAR model and will be a feature of the Gegenbauer models that we fit to the data sets.

3.2. The Basic HAR Model

Table 2 provides summary descriptions of the HAR models fitted to the two RV5 data sets, and Figure 2 provides plots of the fits. The results presented in Table 2 show that the basic HAR model does an excellent job of capturing the time series properties of RV5. All the estimates are significant at a one percent level, as are the F statistics, and the Adjusted R-squares are 52 percent for the period 1997–2013 and 56 percent for the period 2000–2020, respectively.
The plots in Figure 2 confirm this but do suggest that the large periodic peaks in RV5 are not captured so effectively by the HAR model. The question remains as to whether the Gegenbauer model will perform more effectively.

3.3. Gegenbauer Results

The R library package (GARMA), Hunt (2022a), was used to fit GARMA models to the RV5 series for the S&P500 index as sourced from the Oxford-Man library. This shorter RV series, for the S&P500 from 1997 to 2013, was taken from the HARmodel R library package, Sjoerup (2019).
A detailed summary of the methods used in the GARMA package is available in Hunt (2022b). The GARMA package provides the ability to fit stationary, univariate GARMA models to a time series and to forecast from those models. The garma() function in the GARMA package is the main function for estimating the parameters of a GARMA model. It provides three methods of parameter estimation: the Whittle method (Whittle 1953), the conditional sum-of-squares (CSS) method (for a discussion, see Hunt 2022b, chp. 2, eq. 2.3.2), and the WLLS method. The latter, the Whittle Log Least-Squares method, was proposed by Hunt (2022b, chp. 3). The Whittle method was used in the estimations reported in this paper.
A summary of the Gegenbauer model estimated for this data is shown in Table 3. A potential advantage of the Gegenbauer model is that it is non-linear and more flexible than the HAR model.
The results of a regression of the fit from the Gegenbauer model for this data on the actual RV5 estimates for the S&P500 index for this period are shown in Table 4. The HAR model regression for this period, shown in the first half of Table 2, had an Adjusted R-squared of 0.52. The result for the Gegenbauer model estimation is an Adjusted R-squared of 0.567, and so the non-linear model does show an increased explanatory power.
We also fitted the Gegenbauer model to the longer time-period of RV estimates running from 2000 to 2020 and present the results in Table 5.
We regressed the actual daily RV5 series on the Gegenbauer model fit for the longer period from 2000 to 2020, and the results are shown in Table 6. The slope coefficient is significant at the 1 percent level and is very close to 1, whilst the Adjusted R-square is 0.59, and the F statistic for the regression, with a value of 7326, is also significant at the 1 percent level. The Adjusted R-square for the HAR model for the same period was 0.56, so the Gegenbauer model provides a marginally better fit than the HAR model.

3.4. How Do Rule of Thumb Approaches Perform?

The next issue is how squared demeaned end-of-day returns perform as a simple rule-of-thumb method to explain RV5. Table 7 presents the results of the regression of RV5 for the longer period of 2000 to 2020 on 20 lags of squared demeaned daily returns.
It can be seen in Table 7 that 20 lags of squared demeaned returns do not perform quite as well as the Gegenbauer or HAR models but still have an Adjusted R-Square of 54 percent, which is a marginal 2 percent less than the HAR model and 5 percent less than the Gegenbauer model. Only 4 of the 20 lags used in this rule-of-thumb approach are insignificant. The Durbin–Watson statistic of 1.43 suggests that a considerable amount of autocorrelation remains in the residuals.
The application of Ramsey RESET tests (Ramsey 1969) suggests that squares and cubes of the explanatory variable could add to the power of the regression. Table 8 reports the results of adding 10 lags of cubed demeaned SPRET and 10 lags of demeaned SPRET raised to the power 4.
This is essentially another non-linear model, but admittedly, we now have 40 explanatory variables in the model in the form of lags of three explanatory variables. The Adjusted R-Square now increases to over 59 percent, which matches the power of the Gegenbauer model. Admittedly, the Durbin–Watson statistic is still a relatively low 1.54. This suggests that there is still autocorrelation in the residuals, which could be exploited further in enhanced modifications of the model.
The above regression is quite clumsy and contains some redundant terms. An anonymous reviewer suggested that we apply a lasso technique to reduce the size of the regression. This is not entirely consistent with using a ‘rule of thumb’ approach, and it would be a simple matter to drop the redundant terms in the regression above.
Adaptive lasso regression uses different penalties (weights) for different regressors when running a lasso regression. Under certain conditions on those weights, the results have the so-called oracle property (see Zou 2006), in contrast to the standard lasso approach.
Zou (2006) derives a necessary condition for lasso variable selection to be consistent. His version of the lasso, the adaptive lasso, employs adaptive weights for penalizing different coefficients in the ℓ₁ penalty. He demonstrates that the adaptive lasso enjoys oracle properties, namely, that it performs as well as if the true underlying model were given in advance. Zou (2006) takes y = (y_1, …, y_n)^T as the response vector and x_j = (x_{1j}, …, x_{nj})^T, j = 1, …, p, as the linearly independent predictors, and lets X = [x_1, …, x_p] be the predictor matrix. He assumes that E[y|x] = β*_1 x_1 + ⋯ + β*_p x_p. The data are assumed to be centred, so the intercept is not included in the regression equation. He lets A = {j : β*_j ≠ 0} and assumes that |A| = p_0 < p, which implies that the true model depends only on a subset of the predictors.
Let β̂(δ) denote the coefficient estimator produced by a fitting procedure δ. Following the arguments of Fan and Li (2001), δ can be termed an oracle procedure if β̂(δ) (asymptotically) has the following oracle properties:
  • It identifies the right subset model, {j : β̂_j ≠ 0} = A;
  • It attains the optimal estimation rate, √n (β̂(δ)_A − β*_A) →_d N(0, Σ*), where Σ* is the covariance matrix under knowledge of the true subset model.
The lasso is a regularization technique for simultaneous estimation and variable selection (Tibshirani 1996). The lasso estimates are defined as:

β̂(lasso) = argmin_β ‖y − Σ_{j=1}^{p} x_j β_j‖² + λ Σ_{j=1}^{p} |β_j|,

where λ is a nonnegative regularization parameter. The second term is the so-called ℓ₁ penalty, which is crucial for the success of the lasso. The lasso continuously shrinks the coefficients toward 0 as λ increases, and some coefficients are shrunk to exactly 0 if λ is sufficiently large.
Zou (2006) proposes the adaptive lasso, in which adaptive weights are used for penalizing different coefficients in the ℓ₁ penalty, and demonstrates that the adaptive lasso enjoys the oracle properties. He employs the OLS estimator β̂(OLS) to construct the adaptive weights, and suggests that the computation involves the LARS algorithm (Efron et al. 2004).
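The adaptive lasso can be reduced to a plain lasso by rescaling each column of the design by its weight. A self-contained sketch with a simple coordinate-descent solver (Python; the paper used Schreiber's (2023) GRETL package, and the function names and penalty parameter lam here are our own illustrative choices):

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=500):
    """Plain lasso, objective 0.5*||y - Xb||^2 + lam*sum|b_j|,
    solved by cyclic coordinate descent (no intercept)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]   # partial residual
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return beta

def adaptive_lasso(X, y, lam, gamma=1.0):
    """Adaptive lasso (Zou 2006): weights w_j = 1/|beta_ols_j|^gamma;
    solve a plain lasso on the column-rescaled design X_j / w_j,
    then map the coefficients back."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = 1.0 / (np.abs(beta_ols) ** gamma + 1e-12)
    beta_scaled = lasso_cd(X / w, y, lam)
    return beta_scaled / w
```

Regressors with small OLS coefficients receive large penalties and are driven to exactly zero, which is the screening behaviour exploited in Table 9.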
We implemented this version of the adaptive lasso technique using a GRETL function package containing code submitted by Schreiber (2023). Table 9 shows the regression screening results chosen by the technique as applied to the variables in Table 8.
The chosen variables were then used in a new variant of the rule-of-thumb OLS regression of RV5 on the various lags of squared, cubed and quartic demeaned returns selected by the adaptive lasso technique. The results are shown in Table 10.
The results of the application of the adaptive lasso technique shown in Table 10 confirm the previous findings reported in Table 8, namely, that a rule-of-thumb method using squared, cubed, and quartic end-of-day returns in the explanation of RV5 matches those obtained by application of the Gegenbauer technique.

4. Conclusions

In this paper, we have explored the use of the Gegenbauer process, or GARMA model, to capture the behaviour of RV5 volatility of the S&P500 index, as reported by the Oxford-Man database. The results suggest that the non-linear Gegenbauer model performs slightly better than the HAR model in capturing RV5. However, a simplified rule-of-thumb model based on the use of lagged squared, cubed, and quartic demeaned daily returns performed equally well. These results, for the S&P500 index, suggest that non-linear models perform better than linear ones in capturing the long memory properties of RV5, and that sophisticated models do not necessarily dominate rules of thumb.

Author Contributions

Conceptualization, D.E.A. and S.P.; methodology, D.E.A. and S.P.; software, D.E.A. and S.P.; validation, D.E.A. and S.P.; formal analysis, D.E.A.; investigation, D.E.A.; resources, S.P.; data curation, D.E.A.; writing—original draft preparation, D.E.A.; writing—review and editing, D.E.A. and S.P.; visualization, D.E.A.; project administration, S.P.; funding acquisition, S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Index data were obtained from Yahoo Finance on 20 March 2022. The realized volatility data were originally obtained from Oxford-Man at http://realized.oxford-man.ox.ac.uk (accessed on 1 January 2020), but this database has recently been removed from public availability.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Allen, David E., and Michael McAleer. 2020. Do we need stochastic volatility and Generalised Autoregressive Conditional Heteroscedasticity? Comparing squared end-of-day returns on FTSE. Risks 8: 12. [Google Scholar] [CrossRef]
  2. Allen, David Edmund. 2020. Stochastic volatility and GARCH: Do squared end-of-day returns provide similar information? Journal of Risk and Financial Management 13: 202. [Google Scholar] [CrossRef]
  3. Anděl, Jiří. 1986. Long memory time series models. Kybernetika 22: 105–23. [Google Scholar]
  4. Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Heiko Ebens. 2001. The distribution of realized stock return volatility. Journal of Financial Economics 61: 43–76. [Google Scholar] [CrossRef]
  5. Andersen, Torben G., Tim Bollerslev, Francis X. Diebold, and Paul Labys. 2003. Modeling and forecasting realized volatility. Econometrica 71: 529–626. [Google Scholar] [CrossRef]
  6. Barndorff-Nielsen, Ole E., and Neil Shephard. 2002. Econometric analysis of realised volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society, Series B 63: 253–80. [Google Scholar] [CrossRef]
  7. Barndorff-Nielsen, Ole E., and Neil Shephard. 2003. Realized power variation and stochastic volatility models. Bernoulli 9: 243–65. [Google Scholar] [CrossRef]
  8. Bollerslev, Tim. 1986. Generalized Autoregressive Conditional Heteroscedasticity. Journal of Econometrics 31: 307–327. [Google Scholar] [CrossRef]
  9. Box, George E. P., and Gwilym M. Jenkins. 1970. Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day. [Google Scholar]
  10. Christensen, Bent J., and Nagpurnanand R. Prabhala. 1998. The relation between implied and realized volatility. Journal of Financial Economics 37: 125–50. [Google Scholar] [CrossRef]
  11. Commandeur, Jacques J. F., and Siem Jan Koopman. 2007. Introduction to State Space Time Series Analysis. Oxford: Oxford University Press. [Google Scholar]
  12. Corsi, Fulvio. 2009. A simple approximate long-memory model of realized volatility. Journal of Financial Econometrics 7: 174–96. [Google Scholar] [CrossRef]
  13. Dacorogna, Michel M., Ulrich A. Muller, Richard B. Olsen, and Olivier V. Pictet. 1998. Modelling short term volatility with GARCH and HARCH. In Nonlinear Modelling of High Frequency Financial Time Series. Edited by C. Dunis and B. Zhou. Chichester: Wiley. [Google Scholar]
  14. Dickey, David A., and Wayne A. Fuller. 1979. Distribution of the Estimators for Autoregressive Time Series with a Unit Root. Journal of the American Statistical Association 74: 427–31. [Google Scholar]
  15. Dissanayake, G. S., M. Shelton Peiris, and Tommaso Proietti. 2018. Fractionally Differenced Gegenbauer Processes with Long Memory: A Review. Statistical Science 33: 413–26. [Google Scholar] [CrossRef]
  16. Dissanayake, Gnanadarsha Sanjaya. 2015. Advancement of Fractionally Differenced Gegenbauer Processes with Long Memory. Ph.D. thesis, School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia. [Google Scholar]
  17. Efron, Bradley, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. 2004. Least Angle Regression. The Annals of Statistics 32: 407–99. [Google Scholar] [CrossRef]
18. Engle, Robert F. 1982. Autoregressive conditional heteroskedasticity with estimates of the variance of United Kingdom inflation. Econometrica 50: 987–1007.
19. Engle, Robert F., and Clive W. J. Granger. 1987. Co-integration and error correction: Representation, estimation and testing. Econometrica 55: 251–76.
20. Fan, Jianqing, and Runze Li. 2001. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association 96: 1348–60.
21. Granger, Clive W. J., and Roselyne Joyeux. 1980. An introduction to long-memory time series models and fractional differencing. Journal of Time Series Analysis 1: 15–29.
22. Gray, Henry L., Nien-Fan Zhang, and Wayne A. Woodward. 1989. On generalized fractional processes. Journal of Time Series Analysis 10: 233–57. Erratum in Journal of Time Series Analysis 15: 561–62.
23. Guegan, Dominique. 2000. A new model: The k-factor GIGARCH process. Journal of Signal Processing 4: 265–71.
24. Hosking, J. R. M. 1981. Fractional differencing. Biometrika 68: 165–76.
25. Hunt, Richard. 2022a. garma: Fitting and forecasting Gegenbauer ARMA time series models. R package version 0.9.11. Available online: https://CRAN.R-project.org/package=garma (accessed on 3 October 2023).
26. Hunt, Richard. 2022b. Investigations into seasonal ARMA processes. Ph.D. thesis, School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia.
27. Hunt, Richard, Shelton Peiris, and Neville Weber. 2021. Estimation methods for stationary Gegenbauer processes. Statistical Papers 63: 1707–41.
28. Peiris, M. Shelton, and Manabu Asai. 2016. Generalized fractional processes with long memory and time dependent volatility revisited. Econometrics 4: 37.
29. Peiris, Shelton, David Allen, and Udara Peiris. 2005. Generalised autoregressive models with conditional heteroscedasticity: An application to financial time series modelling. In Proceedings of the 2004 Workshop on Research Methods: Statistics and Finance. Edited by Eric J. Beh, Robert G. Clark and J. C. W. Rayner. Wollongong: University of Wollongong, pp. 75–83. ISBN 1 74128 107 5.
30. Peiris, M. Shelton, and Aera Thavaneswaran. 2007. An introduction to volatility models with indices. Applied Mathematics Letters 20: 177–82.
31. Perron, Pierre, and Wendong Shi. 2020. Temporal aggregation and long memory for asset price volatility. Journal of Risk and Financial Management 13: 181.
32. Phillip, Andrew. 2018. On Gegenbauer long memory stochastic volatility models: A Bayesian Markov chain Monte Carlo approach with applications. Ph.D. thesis, School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia.
33. Poon, Ser-Huang, and Clive Granger. 2005. Practical issues in forecasting volatility. Financial Analysts Journal 61: 45–56.
34. Poterba, James M., and Lawrence H. Summers. 1986. The persistence of volatility and stock market fluctuations. American Economic Review 76: 1124–41.
35. Ramsey, James Bernard. 1969. Tests for specification errors in classical linear least squares regression analysis. Journal of the Royal Statistical Society, Series B 31: 350–71.
36. Schreiber, Sven. 2023. adalasso.gfn GRETL code function. Available online: https://gretl.sourceforge.net/cgi-bin/gretldata.cgi?opt=SHOW_FUNCS (accessed on 3 October 2023).
37. Schwert, G. William. 1989. Why does stock market volatility change over time? Journal of Finance 44: 1115–53.
38. Shephard, Neil, and Kevin Sheppard. 2010. Realising the future: Forecasting with high-frequency-based volatility (HEAVY) models. Journal of Applied Econometrics 25: 197–231.
39. Sjoerup, Emil. 2019. HARModel: Heterogeneous autoregressive models. R package version 1.0. Available online: https://CRAN.R-project.org/package=HARModel (accessed on 3 October 2023).
40. Slutsky, Eugen. 1927. The summation of random causes as the source of cyclic processes. Econometrica 5: 105–46.
41. Taylor, Stephen J., and Xinzhong Xu. 1997. The incremental volatility information in one million foreign exchange quotations. Journal of Empirical Finance 4: 317–40.
42. Tibshirani, Robert. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58: 267–88.
43. Whittle, Peter. 1953. The analysis of multiple stationary time series. Journal of the Royal Statistical Society, Series B 15: 125.
44. Yule, G. Udny. 1926. Why do we sometimes get nonsense correlations between time-series? A study in sampling and the nature of time-series. Journal of the Royal Statistical Society 89: 1–63.
45. Zou, Hui. 2006. The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101: 1418–29.
Figure 1. Plots of the RV5 samples for the S&P500 index and their long-range dependence. (Dashed blue line represents error bands).
Figure 2. Plots of fitted HAR models.
Table 1. Descriptive statistics for the RV5 data sets.

Descriptor                S&P500 1997–2013    S&P500 2000–2020
Number of Observations    4096                5099
Minimum                   0.04329             0.00000122
Maximum                   60.56               0.0074
Median                    0.6294              0.0000471
Mean                      1.1752              0.000112
Standard Deviation        2.3151              0.000269

NB: The RV5 data taken from the R library package HARModel were scaled up within the package.
Table 2. Summary of the HAR models fitted to the RV5 data sets.

S&P500 1997–2013 RV5
Coefficient          Estimate        Standard Error    t-Value
beta0                0.11231         0.03065           3.664 ***
beta1                0.22734         0.01870           12.157 ***
beta5                0.49035         0.03144           15.595 ***
beta22               0.18638         0.02813           6.624 ***
Adjusted R-squared   0.5221
F-Statistic          1484 ***

S&P500 2000–2020 RV5
Coefficient          Estimate        Standard Error    t-Value
beta0                1.218 × 10−5    2.877 × 10−6      4.235 ***
beta1                2.703 × 10−1    1.704 × 10−2      15.858 ***
beta5                5.295 × 10−1    2.633 × 10−2      20.108 ***
beta22               9.134 × 10−2    2.225 × 10−2      4.105 ***
Adjusted R-squared   0.5608
F-Statistic          2162 ***

Note: *** indicates significance at the 1% level.
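The HAR specification summarized in Table 2 regresses daily RV on averages of lagged RV over daily, weekly (5-day) and monthly (22-day) horizons, corresponding to the beta1, beta5 and beta22 coefficients. The estimates above were produced with the R package HARModel; the sketch below is purely illustrative of the construction, using numpy and a synthetic placeholder series rather than the S&P500 data:

```python
import numpy as np

def har_design(rv, lags=(1, 5, 22)):
    """Build the HAR design matrix: one column per horizon, each column
    holding the trailing mean of rv over the previous `L` days."""
    p = max(lags)
    n = len(rv) - p
    X = np.column_stack([
        np.array([rv[t - L:t].mean() for t in range(p, len(rv))])
        for L in lags
    ])
    # Prepend an intercept column (beta0 in Table 2).
    return np.column_stack([np.ones(n), X]), rv[p:]

def ols(X, y):
    """Ordinary least squares via lstsq for numerical stability."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic placeholder series, NOT the OxfordMan RV5 data.
rng = np.random.default_rng(0)
rv = np.abs(rng.standard_normal(500))
X, y = har_design(rv)
beta = ols(X, y)  # beta = [beta0, beta1, beta5, beta22]
```

The first 22 observations are lost to lag construction, which is why regressions of this kind start at observation 23 of the sample.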
Table 3. Gegenbauer estimation of RV5 with GARMA for the S&P500 from 1997 to 2013, constant included with no trend.

Series       Intercept       U1          δ1        ar1        ar2             ar3              ar4       ar5
Coefficient  1.139 × 10−4    0.9794239   0.33368   −0.27435   2.215 × 10−11   −5.991 × 10−11   0.09145   0.08230
S.E.         7.430 × 10−8    0.0001457   0.03893   0.07677    5.745 × 10−2    3.259 × 10−2     0.02167   0.01137

Series       ar6        ar7             ar8       ar9       ar10      ar11      ar12      ar13
Coefficient  −0.02743   9.408 × 10−11   0.02743   0.2469    0.16461   0.08230   0.1097    0.10974
S.E.         0.01075    1.034 × 10−2    0.01050   0.0118    0.02678   0.02949   0.0260    0.02663

Series       ar14       ar15      ar16      ar17      ar18            ar19            ar20             ar21
Coefficient  0.02743    0.0823    0.0823    0.02743   1.320 × 10−10   1.125 × 10−10   −6.332 × 10−12   −0.03658
S.E.         0.02655    0.0205    0.0208    0.02007   1.578 × 10−2    1.237 × 10−2    1.103 × 10−2     0.01056

Series       ar22       ar23       ar24      ar25       ar26       ar27      ar28            ar29
Coefficient  −0.02743   −0.10974   0.04572   −0.02743   −0.02743   0.02743   4.256 × 10−11   −4.386 × 10−11
S.E.         0.01127    0.01226    0.01763   0.01204    0.01327    0.01359   1.129 × 10−2    1.117 × 10−2

Series       ar30      Gegenbauer frequency   Gegenbauer period   Gegenbauer exponent
Coefficient  0.02743   0.0323                 30.9197             0.3337
S.E.         0.01050
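The GARMA fits in Tables 3 and 5 are driven by the Gegenbauer long-memory filter (1 − 2uB + B²)^(−δ), whose expansion coefficients are the Gegenbauer polynomials C_j^δ(u); the reported Gegenbauer period is the reciprocal of the Gegenbauer frequency f, with u = cos(2πf). Estimation itself was carried out with the R garma package; the sketch below merely illustrates the filter coefficients (via the standard three-term recurrence) and the frequency/period relationship implied by the estimated u in Table 3:

```python
import math

def gegenbauer_coeffs(delta, u, n):
    """First n+1 coefficients of (1 - 2*u*B + B**2)**(-delta), i.e. the
    Gegenbauer polynomials C_j^delta(u), via the three-term recurrence:
    C_0 = 1, C_1 = 2*delta*u,
    C_j = (2*u*(j + delta - 1)*C_{j-1} - (j + 2*delta - 2)*C_{j-2}) / j."""
    c = [1.0, 2.0 * delta * u]
    for j in range(2, n + 1):
        c.append((2.0 * u * (j + delta - 1.0) * c[j - 1]
                  - (j + 2.0 * delta - 2.0) * c[j - 2]) / j)
    return c[: n + 1]

# Frequency/period implied by the estimated u in Table 3.
u = 0.9794239
f = math.acos(u) / (2.0 * math.pi)   # Gegenbauer frequency
period = 1.0 / f                      # Gegenbauer period, in days
```

The computed frequency and period agree with the values reported at the foot of Table 3 (0.0323 and 30.9197).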
Table 4. Regression of RV5 on Gegenbauer model estimates, 1997–2013.

                     Coefficient     S.E.
Constant             0.0003596       0.0287058
RV5                  0.9994219 ***   0.0136477
Adjusted R-squared   0.567
F-Statistic          5363

Note: *** indicates significance at the 1% level.
Table 5. Gegenbauer estimation of RV5 with GARMA for the S&P500 from 2000 to 2020, constant included with no trend.

Series       Intercept       U1          δ1        ar1       ar2       ar3        ar4       ar5
Coefficient  1.175 × 10−4    0.9776774   0.12974   0.09841   0.18564   −0.06645   0.11861   0.1606
S.E.         8.315 × 10−8    0.0004903   0.03286   0.06504   0.02608   0.01162    0.01376   0.0119

Series       ar6        ar7        ar8       ar9       ar10       ar11      ar12      ar13
Coefficient  −0.01016   −0.02717   0.02901   0.23708   −0.02952   0.01219   0.05422   0.05570
S.E.         0.01662    0.01295    0.01187   0.01259   0.02260    0.01466   0.01351   0.01408

Series       ar14       ar15      ar16      ar17       ar18       ar19      ar20
Coefficient  −0.02716   0.05100   0.05079   −0.04578   −0.03084   0.01445   0.04350
S.E.         0.01417    0.01183   0.01241   0.01240    0.01136    0.01165   0.01133

Series       Gegenbauer frequency   Gegenbauer period   Gegenbauer exponent
Coefficient  0.0337                 29.6812             0.1297
Table 6. Regression of RV5 on Gegenbauer model estimates, 2000–2020.

                     Coefficient       S.E.
Constant             6.72543 × 10−7    2.77936 × 10−6
RV5                  0.993054 ***      0.0116015
Adjusted R-squared   0.589660
F-Statistic          7326.850 ***

Note: *** indicates significance at the 1% level.
Table 7. Regression of RV5 on squared demeaned daily returns for 2000 to 2020; OLS, using observations 21–5099 (T = 5079); dependent variable: rv5.

Variable      Coefficient    Std. Error      t-Ratio   p-Value
const         0.0000302614   0.00000284505   10.64     0.000 ***
SQSPRET_1     0.174294       0.00555060      31.40     0.000 ***
SQSPRET_2     0.0935967      0.00558548      16.76     0.000 ***
SQSPRET_3     0.0527928      0.00587581      8.99      0.000 ***
SQSPRET_4     0.0280810      0.00587693      4.78      0.000 ***
SQSPRET_5     0.0591188      0.00587284      10.07     0.000 ***
SQSPRET_6     0.0386591      0.00589632      6.56      0.000 ***
SQSPRET_7     0.0127360      0.00592375      2.15      0.031 **
SQSPRET_8     0.0125674      0.00590739      2.127     0.033 **
SQSPRET_9     0.0460141      0.00590538      7.79      0.000 ***
SQSPRET_10    0.0109619      0.00588044      1.86      0.062 *
SQSPRET_11    −0.0125986     0.00588112      −2.14     0.032 **
SQSPRET_12    0.00679795     0.00590660      1.15      0.249
SQSPRET_13    0.000835242    0.00590814      0.14      0.888
SQSPRET_14    −0.00784497    0.00592586      −1.32     0.186
SQSPRET_15    −0.00255057    0.00589736      −0.433    0.665
SQSPRET_16    −0.0109373     0.00587609      −1.861    0.063
SQSPRET_17    0.00499043     0.00587928      0.848     0.396
SQSPRET_18    0.0124227      0.00592245      2.098     0.036 **
SQSPRET_19    0.0124595      0.00563006      2.213     0.027 **
SQSPRET_20    −0.00716919    0.00559011      −1.282    0.12

Mean dependent var   0.000112    S.D. dependent var   0.000269
Sum squared resid    0.000168    S.E. of regression   0.000182
R²                   0.543145    Adjusted R²          0.541339
F(20, 5058)          300.6678    p-value(F)           0.000000
Log-likelihood       36531.92    Akaike criterion     −73021.83
Schwarz criterion    −72884.64   Hannan–Quinn         −72973.79
ρ̂                    0.281412    Durbin–Watson        1.437144

Note: ***, ** and * denote significance at the 1%, 5% and 10% levels, respectively.
Table 8. Regression of RV5 on squared, cubed and quartic daily demeaned returns for 2000 to 2020; OLS, using observations 21–5099 (T = 5079); dependent variable: rv5.

Variable        Coefficient     Std. Error     t-Ratio    p-Value
const           0.000082971     0.0000303009   1.594      0.1110
SQSPRET_1       0.231351        0.00958455     24.14      0.000 ***
SQSPRET_2       0.110377        0.00959229     11.51      0.000 ***
SQSPRET_3       0.118282        0.00966599     12.24      0.000 ***
SQSPRET_4       0.0666821       0.00969419     6.879      0.000 ***
SQSPRET_5       0.0642053       0.00980063     6.551      0.000 ***
SQSPRET_6       0.0669488       0.00989797     6.764      0.000 ***
SQSPRET_7       0.0345805       0.00995553     3.473      0.0005 ***
SQSPRET_8       0.0169334       0.00994948     1.702      0.0888 *
SQSPRET_9       0.0254747       0.00993460     2.564      0.0104 **
SQSPRET_10      0.0227310       0.00975198     2.331      0.0198 **
SQSPRET_11      −0.00478774     0.00583425     −0.8206    0.4119
SQSPRET_12      0.0213587       0.00588183     3.631      0.0003 ***
SQSPRET_13      0.00934631      0.00582412     1.605      0.1086
SQSPRET_14      −0.00178946     0.00578095     −0.3095    0.7569
SQSPRET_15      −0.000564490    0.00574051     −0.09833   0.9217
SQSPRET_16      −0.00229510     0.00575739     −0.3986    0.6902
SQSPRET_17      0.00247291      0.00572209     0.4322     0.6656
SQSPRET_18      0.00617666      0.00576039     1.072      0.2837
SQSPRET_19      0.00692568      0.00554060     1.250      0.2114
SQSPRET_20      −0.0188372      0.00539186     −3.494     0.0005 ***
CUSPRET_1       −0.534907       0.0592631      −9.026     0.0000 ***
CUSPRET_2       −0.574605       0.0614957      −9.344     0.0000 ***
CUSPRET_3       −0.542658       0.0625408      −8.677     0.0000 ***
CUSPRET_4       −0.421881       0.0625029      −6.750     0.0000 ***
CUSPRET_5       −0.581211       0.0623480      −9.322     0.0000 ***
CUSPRET_6       −0.406321       0.0625585      −6.495     0.0000 ***
CUSPRET_7       −0.306419       0.0622486      −4.923     0.0000 ***
CUSPRET_8       −0.0728304      0.0624320      −1.167     0.2434
CUSPRET_9       0.0440999       0.0614311      0.7179     0.4729
CUSPRET_10      −0.143643       0.0603871      −2.379     0.0174
sq_SQSPRET_1    −11.1016        0.991204       −11.20     0.0000 ***
sq_SQSPRET_2    −5.00005        0.992337       −5.039     0.0000 ***
sq_SQSPRET_3    −10.1474        1.01023        −10.04     0.0000 ***
sq_SQSPRET_4    −7.20797        1.01302        −7.115     0.0000 ***
sq_SQSPRET_5    −2.06748        1.01765        −2.032     0.0422 **
sq_SQSPRET_6    −5.51711        1.01948        −5.412     0.0000 ***
sq_SQSPRET_7    −4.90765        1.02938        −4.768     0.0000 ***
sq_SQSPRET_8    −0.981766       1.03193        −0.9514    0.3415
sq_SQSPRET_9    2.00887         1.02378        1.962      0.0498 **
sq_SQSPRET_10   −2.40620        1.00723        −2.389     0.0169 **

Mean dependent var   0.000112    S.D. dependent var   0.000269
Sum squared resid    0.000148    S.E. of regression   0.000172
R²                   0.596908    Adjusted R²          0.593707
F(40, 5038)          186.5095    p-value(F)           0.000000
Log-likelihood       36849.86    Akaike criterion     −73617.72
Schwarz criterion    −73349.87   Hannan–Quinn         −73523.92
ρ̂                    0.228344    Durbin–Watson        1.543216

Note: ***, ** and * denote significance at the 1%, 5% and 10% levels, respectively.
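The regressors in Tables 7 and 8 are the simple "rules of thumb": lags of squared, cubed and quartic demeaned daily returns. The paper's estimates were computed in gretl; as an illustrative numpy sketch of how such a design matrix can be assembled (synthetic placeholder data; matching Table 8's layout of 20 squared, 10 cubed and 10 quartic lags):

```python
import numpy as np

def power_lag_design(returns, rv, sq_lags=20, cu_lags=10, qu_lags=10):
    """Design matrix for regressing rv_t on lagged powers of demeaned
    daily returns: squared (lags 1..sq_lags), cubed (1..cu_lags) and
    quartic (1..qu_lags), plus an intercept."""
    r = returns - returns.mean()  # demeaned daily returns
    powers = [(r ** 2, sq_lags), (r ** 3, cu_lags), (r ** 4, qu_lags)]
    p = max(sq_lags, cu_lags, qu_lags)
    cols = []
    for series, nlags in powers:
        for L in range(1, nlags + 1):
            # Value of the power series at t - L, for t = p .. len(series)-1.
            cols.append(series[p - L: len(series) - L])
    X = np.column_stack([np.ones(len(rv) - p)] + cols)
    return X, rv[p:]

# Synthetic placeholder data, NOT the S&P500 sample.
rng = np.random.default_rng(1)
ret = 0.01 * rng.standard_normal(300)
rv = ret ** 2 + 1e-5
X, y = power_lag_design(ret, rv)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With the default settings this yields 40 regressors plus a constant, the same count as Table 8.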
Table 9. Regressors chosen by the adaptive lasso technique applied to the variables in Table 8.

Variable            Weight
sq_DMSPRET_1        0.23908
sq_DMSPRET_2        0.12086
sq_DMSPRET_3        0.12002
sq_DMSPRET_4        0.063485
sq_DMSPRET_5        0.059401
sq_DMSPRET_6        0.067899
sq_DMSPRET_7        0.026704
sq_DMSPRET_9        0.012712
sq_DMSPRET_12       0.012814
CUSPRET_1           −0.50290
CUSPRET_2           −0.54575
CUSPRET_3           −0.48000
CUSPRET_4           −0.36132
CUSPRET_5           −0.55055
CUSPRET_6           −0.32513
CUSPRET_7           −0.26524
CUSPRET_9           0.026202
CUSPRET_10          −0.074585
sq_sq_DMSPRET_1     −11.184
sq_sq_DMSPRET_2     −5.4840
sq_sq_DMSPRET_3     −9.7038
sq_sq_DMSPRET_4     −6.4668
sq_sq_DMSPRET_5     −1.3524
sq_sq_DMSPRET_6     −4.7695
sq_sq_DMSPRET_7     −3.7897
sq_sq_DMSPRET_8     1.0471
sq_sq_DMSPRET_9     2.6983
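The variables in Table 9 were selected with the adaptive lasso (Zou 2006), implemented in gretl via Schreiber's adalasso.gfn function. As a self-contained illustration of the method itself, not the code used in the paper, the numpy sketch below runs on synthetic data: an initial OLS fit supplies the adaptive weights, a plain lasso is then solved on rescaled columns by coordinate descent, and the coefficients are rescaled back:

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Plain lasso, objective (1/2)||y - Xb||^2 + lam*n*||b||_1,
    solved by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with coordinate j removed.
            r = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r, lam * n) / col_sq[j]
    return beta

def adaptive_lasso(X, y, lam, gamma=1.0):
    """Adaptive lasso (Zou 2006): rescale each column by |beta_ols|**gamma,
    solve a plain lasso, then rescale the coefficients back; this is
    equivalent to penalty weights 1/|beta_ols|**gamma."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    w = np.abs(beta_ols) ** gamma + 1e-12  # guard against exact zeros
    beta_scaled = lasso_cd(X * w, y, lam)
    return beta_scaled * w

# Synthetic example with a sparse true coefficient vector [2, 0, 0, 1, 0].
rng = np.random.default_rng(2)
X = rng.standard_normal((100, 5))
y = X @ np.array([2.0, 0.0, 0.0, 1.0, 0.0]) + 0.1 * rng.standard_normal(100)
beta = adaptive_lasso(X, y, lam=0.01)
```

Because the adaptive weights shrink small initial coefficients hard, the irrelevant columns are driven to (near) zero while the true signals survive almost unshrunk, which is the oracle property Zou (2006) establishes.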
Table 10. Regression of RV5 on squared, cubed and quartic daily demeaned returns for 2000 to 2020, as chosen by the adaptive lasso technique.

Variable            Coefficient      Std. Error       t-Ratio   p-Value
const               0.00000637199    0.00000298585    2.134     0.0329 **
sq_DMSPRET_1        0.232117         0.00945099       24.56     6.33 × 10−126 ***
sq_DMSPRET_2        0.117260         0.00937219       12.51     2.15 × 10−35 ***
sq_DMSPRET_3        0.118641         0.00943350       12.58     9.71 × 10−36 ***
sq_DMSPRET_4        0.0668396        0.00939540       7.114     1.28 × 10−12 ***
sq_DMSPRET_5        0.0652956        0.00954500       6.841     8.81 × 10−12 ***
sq_DMSPRET_6        0.0725886        0.00967025       7.506     7.14 × 10−14 ***
sq_DMSPRET_7        0.0420426        0.00958814       4.385     1.18 × 10−5 ***
sq_DMSPRET_9        0.0243582        0.00964983       2.524     0.0116 **
sq_DMSPRET_12       0.0203625        0.00534193       3.812     0.0001 ***
CUSPRET_1           −0.523256        0.0581730        −8.995    3.29 × 10−19 ***
CUSPRET_2           −0.577212        0.0587275        −9.829    1.35 × 10−22 ***
CUSPRET_3           −0.525699        0.0607949        −8.647    7.00 × 10−18 ***
CUSPRET_4           −0.399551        0.0586792        −6.809    1.10 × 10−11 ***
CUSPRET_5           −0.600792        0.0592911        −10.13    6.67 × 10−24 ***
CUSPRET_6           −0.360412        0.0582064        −6.192    6.41 × 10−10 ***
CUSPRET_7           −0.307560        0.0577694        −5.324    1.06 × 10−7 ***
CUSPRET_9           0.0320326        0.0565113        0.5668    0.5709
CUSPRET_10          −0.0800320       0.0544286        −1.470    0.1415
sq_sq_DMSPRET_1     −10.9763         0.978161         −11.22    7.02 × 10−29 ***
sq_sq_DMSPRET_2     −5.62131         0.964744         −5.827    6.00 × 10−9 ***
sq_sq_DMSPRET_3     −9.91128         0.993007         −9.981    3.02 × 10−23 ***
sq_sq_DMSPRET_4     −7.04655         0.991848         −7.104    1.38 × 10−12 ***
sq_sq_DMSPRET_5     −2.34791         1.00553          −2.335    0.0196 **
sq_sq_DMSPRET_6     −5.40256         1.00500          −5.376    7.97 × 10−8 ***
sq_sq_DMSPRET_7     −5.33317         1.00213          −5.322    1.07 × 10−7 ***
sq_sq_DMSPRET_8     0.835833         0.550288         1.519     0.1289
sq_sq_DMSPRET_9     1.69941          0.996423         1.706     0.0882 *

Mean dependent var   0.000112    S.D. dependent var   0.000269
Sum squared resid    0.000149    S.E. of regression   0.000172
R²                   0.594745    Adjusted R²          0.592578
F(27, 5051)          274.5461    p-value(F)           0.000000
Log-likelihood       36836.27    Akaike criterion     −73616.54
Schwarz criterion    −73433.62   Hannan–Quinn         −73552.48
ρ̂                    0.223409    Durbin–Watson        1.553007

Note: ***, ** and * denote significance at the 1%, 5% and 10% levels, respectively.
