1. Introduction
Financial markets can be very volatile, meaning that asset prices can change substantially within a given period. Volatility is a statistical measure of the dispersion of returns for an asset over a short time period. Market volatility is mainly reflected in the deviation of expectations about future asset values. It can present significant investment risk, or opportunities for astute investors. Many investors trade portfolios made up of individual positions, each with its own volatility profile. These individual variations, when combined, create a measure of portfolio risk or volatility that can be very different from the risk or volatility of the individual positions, depending on how they correlate with each other. Hence, investors should align their portfolios not only with relevant expected returns but also with a manageable risk level, which is subject to the risk profile of the portfolio constituents.
Understanding individual risk dynamics and their cross-dependency can shed light on the future returns and volatilities of the portfolio constituents and guide portfolio strategies. Moreover, forecasting assets’ future returns and volatilities is also a key factor when pricing options contracts. As volatility is time-varying and unobservable, it needs to be estimated. Researchers estimate volatility mainly in two ways: fitting time series return models and measuring volatility directly using daily or intraday prices.
To model the volatility dynamics, the last few decades have seen the popularisation of many return models based on daily closing prices and their extensions to incorporate volatility measures. Generalised autoregressive conditional heteroskedasticity (GARCH) models are constructed using daily closing prices, discarding all intraday price movements and estimating volatility as a latent process in the return series [1]. The volatility dynamic for the variance is expressed as a deterministic function of the lagged squared innovations and the lagged variances. An extension of the deterministic volatility in the GARCH model to stochastic volatility is the stochastic volatility (SV) model introduced by [2]. Since then, GARCH and SV have become the two main classes of models describing the time-varying autocorrelated volatility process.
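As a concrete illustration of the deterministic volatility recursion described above, the following sketch filters the conditional variances of a GARCH(1,1) model from a return series. The parameter values are hypothetical and chosen only for illustration; in practice they are estimated by maximum likelihood.

```python
# GARCH(1,1): sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}.
# Parameter values below are illustrative, not estimates from this paper.

def garch_variance(returns, omega=0.05, alpha=0.1, beta=0.85):
    """Filter the conditional variance series of a GARCH(1,1) model."""
    sigma2 = [omega / (1.0 - alpha - beta)]  # start at the unconditional variance
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r**2 + beta * sigma2[-1])
    return sigma2

returns = [0.5, -1.2, 0.3, 2.0, -0.7]
print(garch_variance(returns))
```

Given the filtered variances, the Gaussian log-likelihood of the returns follows directly, which is how the parameters are estimated in practice.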
Instead of treating volatility as a latent process, some researchers endeavoured to measure volatility directly. Refs. [3,4,5,6] proposed different types of range-based volatility measures based on the opening, highest, lowest and closing prices. With the availability of high-frequency asset price data, another approach to improve the accuracy of daily volatility measures is realised volatility, which makes use of intraday price information. Extensions of realised variance [7] to many more realised volatility measures have been introduced to improve the efficiency of daily volatility measures [8,9]. However, these volatility measures, like returns, contain random noise. Hence, they are often fitted to models that smooth out the noise and/or provide an additional source of information. In the realised GARCH model [10], they are incorporated into the volatility equation via a term from a separate realised volatility model.
Alternatively, volatility models have been proposed and applied directly to the volatility measures. These models should capture features of volatility clustering, such as long-term persistence, short-term persistence and the leverage effect. Ref. [11] proposed the conditional autoregressive range (CARR) model to model the volatility measures directly. Compared with the GARCH model, these volatility models treat the volatility measures, rather than returns, as a stochastic instead of a deterministic process. They can be further extended to provide volatility estimates in the return models to capture the volatility clustering in returns. This results in two-stage volatility and return models that fit the CARR models in the first stage and impute the fitted volatilities into the return models to capture the heteroskedasticity of returns in the second stage [12,13]. These two-stage CARR-return models differ from the GARCH and SV models, which use only the information in returns. Instead, the two-stage models use both return and volatility information and model them as two stochastic processes. Hence, the two-stage models also differ from the realised GARCH model: although the realised GARCH model likewise has two stochastic processes for returns and realised volatilities, the two-stage models omit the latent volatility equation.
Although the univariate GARCH models are popular in financial econometrics, they neglect the contemporaneous dependency over a set of assets, stock market indices or exchange rates. These contemporaneous dependencies induce return and volatility spillovers between multiple assets and are important in many financial applications such as portfolio optimisation and hedging strategy formulation. Without capturing the contemporaneous dependency through covariance measures, the univariate GARCH models are limited in assessing the risk of a portfolio. Multivariate modelling and forecasting of asset returns can capture the dynamics of the covariance matrix and, hence, have prominent applications in portfolio optimisation. The increasing interaction and interconnection between financial markets have motivated the need for reliable modelling and forecasting of the contemporaneous dependencies of financial asset returns. However, the variance–covariance matrix of asset returns is not directly observable in practice. Traditional volatility models treat it either as a deterministic latent quantity, as in the multivariate GARCH (MGARCH) models, or as a stochastic latent quantity, as in the multivariate SV (MSV) models [14]. These models estimate the latent variance–covariance matrix using returns.
Extending from the univariate setting, the variance–covariance matrices are, in general, modelled as a linear function of lagged observed and modelled variance–covariance matrices, multiplied by some coefficient matrices to be estimated. However, as the number of assets increases, the number of parameters in a multivariate model may explode, causing problems of model and, hence, parameter instability. This problem is called the curse of dimensionality. Moreover, the variance–covariance matrices in the MGARCH models should be positive definite [15]. The vectorisation GARCH (VEC-GARCH) model introduced by [16] suffers from both the high dimensionality and non-positive-definiteness problems. Since then, various MGARCH models have been developed to address the dimensionality and positive definiteness issues. The Baba–Engle–Kraft–Kroner GARCH (BEKK-GARCH) model proposed by [17] adopts quadratic long- and short-term persistence matrices to ensure positive definiteness and reduce the number of parameters. Moreover, the dynamic conditional correlation (DCC)-GARCH model [18] offers direct modelling of variances and correlations and is very appealing. Multivariate extension also applies to the SV model family. Ref. [19] proposed the MSV model, which allows the variances and covariances to evolve through time. In the model, a set of asset returns is driven by some latent factors which are specified as SV processes. The dimension of the parameter space of this model is manageable as it increases only linearly with the number of assets being modelled. Hence, the MSV model has fewer parameters than the MGARCH model.
In recent years, different range-based MGARCH models have been proposed to improve modelling efficiency. They incorporate the range information in different ways. Among them, [20] proposed calculating the covariance using the variance of each component and their pairwise sum, so that range-based volatility measures such as the Parkinson high-low measure [3] can be applied. The range-based variance–covariance matrix can then be constructed. The BEKK-High-Low-GARCH model proposed by [21] adopted this range-based variance–covariance matrix as the short-term persistence matrix, replacing the outer product of the return residual vector in the standard BEKK-GARCH model.
A similar idea applies to the DCC-GARCH model for the short-term persistence matrix in the correlation matrix equation. Ref. [22] proposed the co-range DCC-GARCH model, which adopts the range-based correlation matrix standardised from the range-based variance–covariance matrix, replacing the outer product of the standardised return residual vector in the standard DCC-GARCH model. For the univariate model, Ref. [23] proposed the range-GARCH (RGARCH) model, in which the squared return residual in the volatility equation is replaced by the Parkinson [3] volatility measure. Then, Ref. [24] proposed the DCC-RGARCH model, which estimates the short-term persistence matrix using the outer product of the standardised return residual vector from the RGARCH model. Another approach uses the range information to standardise return residuals. Ref. [25] estimates the short-term persistence matrix using the outer product of the return residual vector standardised by dividing each component by the fitted mean of the univariate CARR model for that component. These models incorporate a range-based volatility measure in different ways to improve estimates of the covariance or correlation matrix for returns and increase the accuracy of covariance, variance and correlation forecasts compared with MGARCH models that use only closing prices.
On the other hand, extending CARR models for volatility measures to multivariate settings implies modelling some positive definite variance–covariance matrix measures using distributions defined for matrices. These highly sophisticated matrix distributions include the Wishart [26] and matrix-F [27] distributions. Alternatively, Ref. [20] extended the CARR model to the multivariate CARR (MCARR) model to model the variances of individual asset returns and their pairwise sums. Through the estimated variance of a pairwise sum, the estimated covariance of the asset pair can be determined, and so the estimated variance–covariance matrix can be constructed. An obvious advantage of the MCARR model is that modelling covariance via the variance of the pairwise sum allows the same CARR model structure to be applied. Otherwise, one would need to adopt a covariance model with a different distribution assumption on a different support. However, Ref. [20] did not provide a numerical demonstration of the application of the proposed model.
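The pairwise-sum device of [20] rests on the elementary identity Cov(X, Y) = [Var(X + Y) − Var(X) − Var(Y)]/2, so any variance measure that can be applied to the summed series also yields a covariance estimate. A minimal sketch with made-up sample data:

```python
def covariance_from_pairwise_sum(var_x, var_y, var_sum):
    """Recover Cov(X, Y) from Var(X), Var(Y) and Var(X + Y)."""
    return 0.5 * (var_sum - var_x - var_y)

def sample_variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# toy return series for two assets and their pairwise sum
x = [0.1, -0.2, 0.3, 0.0]
y = [0.2, -0.1, 0.1, -0.3]
xy = [a + b for a, b in zip(x, y)]
cov = covariance_from_pairwise_sum(
    sample_variance(x), sample_variance(y), sample_variance(xy))
print(cov)  # equals the sample covariance of x and y
```

Because the identity is exact, the recovered value matches the ordinary sample covariance; in the MCARR setting, the sample variances are replaced by modelled range-based variances.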
Similar to the two-stage CARR-return models of [13], the MCARR model can also be extended to a two-stage MCARR-return model. However, empirical studies using the MCARR and two-stage MCARR-return models are lacking in the literature. This paper aims to provide a practical framework for the application of these models. Our approach is similar to [21] in modelling the covariance rather than the correlation matrix. However, we apply the stochastic MCARR model to the vector of range-based variances of each component and their pairwise sums, instead of taking the range-based variance–covariance matrix as the short-term persistence matrix in the deterministic covariance model of the GARCH family. Our approach is more interpretable and direct. To estimate the parameters of these models, we consider four popular methods, namely, maximum likelihood (ML) [28,29], quasi-ML [11,30,31], Bayesian [12,32,33] and estimating functions [34]. We choose the ML method since the likelihood function can be completely specified under the MCARR and two-stage MCARR-return models and the models can be easily implemented using the MATLAB optimisation function fmincon.
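To illustrate the ML approach in the first stage, the following Python sketch evaluates the negative log-likelihood of a univariate CARR(1,1) model with unit-exponential errors (the simplest CARR error distribution, assumed here for illustration only; the error distributions used in this paper differ). In practice this objective would be passed to a constrained optimiser such as fmincon in MATLAB or scipy.optimize.minimize in Python.

```python
import math

def carr_filter(ranges, omega, alpha, beta):
    """Conditional mean recursion of CARR(1,1): lam_t = omega + alpha*R_{t-1} + beta*lam_{t-1}."""
    lam = [omega / (1.0 - alpha - beta)]  # start at the unconditional mean
    for r in ranges[:-1]:
        lam.append(omega + alpha * r + beta * lam[-1])
    return lam

def carr_neg_loglik(ranges, omega, alpha, beta):
    """Negative log-likelihood under R_t = lam_t * eps_t with eps_t ~ Exp(1)."""
    lam = carr_filter(ranges, omega, alpha, beta)
    return sum(math.log(l) + r / l for r, l in zip(ranges, lam))

ranges = [0.8, 1.3, 0.9, 1.7, 1.1]  # toy daily range data
print(carr_neg_loglik(ranges, omega=0.1, alpha=0.2, beta=0.7))
```

The MCARR likelihood has the same filter-then-evaluate structure, with vectors of range-based variances replacing the scalar ranges.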
To select the best-performing models, the forecast performance for volatilities can be evaluated by some loss functions. Apart from the root mean squared forecast error and the mean absolute forecast error, Ref. [35] proposed a loss function family that can be adopted to assess model predictability for both in-sample estimates and out-of-sample forecasts. The scalar parameter in the loss function family enables the loss function to cover a wide variety of shapes, ranging from symmetric to asymmetric loss, with a heavier penalty on either under-prediction or over-prediction. Ref. [35] claimed that these loss functions are robust to the choice of volatility proxy. The best-performing two-stage MCARR-return model can then be applied to provide forecast risk measures such as value-at-risk (VaR) [36] and conditional VaR (CVaR) [37], which are essential tools to measure the potential loss of an asset. The performance of VaR for returns can be assessed using backtests based on counts of tail losses that exceed VaR [38,39].
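The count-based backtest of [38] compares the observed VaR violation rate with the nominal level p via a likelihood ratio that is asymptotically chi-square with one degree of freedom. A minimal sketch (the observation and violation counts below are made up for illustration):

```python
import math

def kupiec_uc_statistic(num_obs, num_violations, p):
    """Kupiec unconditional coverage LR statistic (asymptotically chi-square, 1 df)."""
    T, x = num_obs, num_violations
    if x == 0:
        return -2.0 * T * math.log(1.0 - p)  # limit of the statistic as x/T -> 0
    phat = x / T
    log_null = (T - x) * math.log(1.0 - p) + x * math.log(p)
    log_alt = (T - x) * math.log(1.0 - phat) + x * math.log(phat)
    return -2.0 * (log_null - log_alt)

# e.g. 64 forecast days with 3 violations of a 5% VaR (hypothetical counts)
print(kupiec_uc_statistic(64, 3, 0.05))
```

A statistic below the chi-square critical value (3.84 at the 5% level) means the violation frequency is consistent with the nominal coverage.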
The contribution of this paper is threefold. Firstly, we propose the MCARR model to study the dynamics of volatility and correlation of asset returns. We adopt asymmetric mean functions in the MCARR model to capture the leverage effects. We fit the efficient scaled realised Parkinson volatility measure of each index series and their pairwise sums to the MCARR model to obtain in-sample estimates and forecasts of volatilities. The covariances and, hence, correlations of pairwise indices are calculated using the fitted volatilities of the individual indices and the pairwise sums of indices. We also investigate the performance of the MCARR model with asymmetric and symmetric mean functions. The accuracy of volatility forecasts relative to the volatility proxy based on a scaled realised volatility measure is compared using two robust loss functions proposed in [35]. Secondly, we show how the volatility and covariance estimates from the MCARR models are imputed into the return models to model the variance–covariance matrix of returns in the two-stage MCARR-return model. The modelling performance of the two-stage MCARR-return model is compared with that of BEKK-GARCH models. Lastly, we provide forecasts of various VaR and CVaR measures for returns, and tail quantile (TQ) and tail conditional expectation (TCE) measures for closing prices, based on the best fitted two-stage MCARR-return models. The accuracy of these VaR forecasts is further tested using the Kupiec unconditional coverage (KUC) [38] and Christoffersen independence (CI) [39] tests.
The remainder of this paper is organised as follows. Section 2 introduces the volatility measures used in this study. Section 3 describes the MCARR and two-stage MCARR-return models with different mean functions. Section 4 provides empirical applications to some daily index series. The performance of in-sample estimation and out-of-sample forecasting of volatilities and returns is evaluated and compared with the univariate CARR model and BEKK-GARCH models. The best model for each pair of indices is used to provide one-day-ahead forecasts of VaR and CVaR for returns and TQ and TCE for closing prices. Finally, Section 5 concludes the paper.
2. Realised Volatility Measures
Let $P_t$ be the price of an asset at day $t$. It follows the geometric Brownian motion
$$\mathrm{d}P_t = \mu P_t\,\mathrm{d}t + \sigma P_t\,\mathrm{d}W_t, \qquad (1)$$
where $\mu$ is a drift for $P_t$, $\sigma$ is a volatility parameter and $W_t$ is the standard Brownian motion. The logarithmic price $p_t = \ln P_t$ then satisfies
$$\mathrm{d}p_t = \left(\mu - \tfrac{1}{2}\sigma^2\right)\mathrm{d}t + \sigma\,\mathrm{d}W_t \qquad (2)$$
by Ito’s lemma. The logarithmic price process $p_t$ follows a Brownian motion with drift parameter $\mu - \sigma^2/2$ and volatility parameter $\sigma$. Although (1) and (2) are defined in continuous time, they are often discretised in real applications as most asset price series are measured over discrete time points. The logarithmic return (hereafter referred to as return) of an asset price is given by
$$r_{\mathrm{CC},t} = c_t - c_{t-1},$$
where $c_t$ denotes the logarithmic closing price at day $t$. It is also called the close-to-close (CC) return. Returns can also be measured intraday (called open-to-close (CO)) as
$$r_{\mathrm{CO},t} = c_t - o_t,$$
where $o_t$ denotes the logarithmic opening price at day $t$. Compared with the CC return $r_{\mathrm{CC},t}$, research shows that the intraday CO return $r_{\mathrm{CO},t}$ contributes more to the total return. Intraday returns are of particular importance for day traders, who use daytime gyrations in stocks and markets to make trading profits. Hence, we adopt the intraday CO return $r_{\mathrm{CO},t}$. For volatility, the stationary process $\sigma$ in (2) is often extended to a time-varying process $\sigma_t$ to capture the volatility clustering in returns. Ref. [3] proposed an unbiased measure for such a volatility process, called the Parkinson (PK) measure, which is given by
$$\sigma^2_{\mathrm{PK},t} = \frac{(h_t - l_t)^2}{4\ln 2},$$
where $h_t$ and $l_t$ denote the highest and lowest logarithmic prices of day $t$, respectively. Using more intraday information, [9] proposed the scaled realised PK (SRPK) measure given by
$$\sigma^2_{\mathrm{SRPK},t} = \frac{\sum_{s=t-m}^{t-1} \sigma^2_{\mathrm{PK},s}}{\sum_{s=t-m}^{t-1} \sigma^2_{\mathrm{RPK},s}}\,\sigma^2_{\mathrm{RPK},t}, \qquad (4)$$
where $\sigma^2_{\mathrm{PK},s}$ is the PK measure of day $s$, $\sigma^2_{\mathrm{RPK},t}$ is the realised PK (RPK) measure of day $t$, which is given by
$$\sigma^2_{\mathrm{RPK},t} = \sum_{l=1}^{L} \frac{(h_{t,l} - l_{t,l})^2}{4\ln 2},$$
where $h_{t,l}$ and $l_{t,l}$ are the highest and lowest logarithmic prices in the $l$-th ($l = 1, \dots, L$) interval of day $t$, and $m$ trading days (3 months) is the optimal scaling period suggested by [40]. This SRPK measure, using more intraday information, is more efficient, and the scaling factor, which is the multiple before $\sigma^2_{\mathrm{RPK},t}$ in (4), adjusts for the downward bias due to infrequent trading. To assess the performance of the SRPK measure in the empirical study, we use the scaled realised CO (SRCO) measure as a proxy for the true volatility $\sigma^2_t$. The SRCO measure is given by
$$\sigma^2_{\mathrm{SRCO},t} = \frac{\sum_{s=t-m}^{t-1} r^2_{\mathrm{CO},s}}{\sum_{s=t-m}^{t-1} \sigma^2_{\mathrm{RCO},s}}\,\sigma^2_{\mathrm{RCO},t},$$
by applying the idea of scaled realised measures to the realised CO (RCO) measure defined as
$$\sigma^2_{\mathrm{RCO},t} = \sum_{l=1}^{L} r^2_{t,l},$$
where $r_{t,l}$ is the intraday return of the $l$-th interval of day $t$, $r^2_{\mathrm{CO},s}$ is the intraday volatility measure of day $s$, and $m$ trading days (3 months) is the optimal scaling period suggested by [40].
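Under the definitions above, the daily PK measure, its realised variant and the scaling step can be computed as follows. The high/low values and the two-day scaling window are synthetic and purely illustrative; in the empirical study the scaling window spans 3 months of trading days.

```python
import math

def pk(high, low):
    """Parkinson measure from daily highest/lowest log prices: (h - l)^2 / (4 ln 2)."""
    return (high - low) ** 2 / (4.0 * math.log(2.0))

def rpk(intraday_highs, intraday_lows):
    """Realised PK: sum of interval-level PK measures within one day."""
    return sum(pk(h, l) for h, l in zip(intraday_highs, intraday_lows))

def srpk(rpk_today, pk_window, rpk_window):
    """Scaled realised PK: RPK rescaled so that, over the scaling window,
    it matches the unbiased daily PK measure on average."""
    return (sum(pk_window) / sum(rpk_window)) * rpk_today

# toy intraday highest/lowest log prices for one day
highs, lows = [4.61, 4.63, 4.62], [4.59, 4.60, 4.58]
print(rpk(highs, lows))
```

The same scaling function applies verbatim to the SRCO measure, with squared CO returns replacing the PK measures in the numerator.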
To measure covariances, these range-based measures fail. Ref. [7] introduced the realised sample covariance between return series $i$ and $j$ as
$$\widehat{\mathrm{cov}}_{ij,t} = \sum_{l=1}^{L} (c_{i,t,l} - o_{i,t,l})(c_{j,t,l} - o_{j,t,l}), \qquad (8)$$
where $c_{i,t,l}$ and $o_{i,t,l}$ are, respectively, the last and first trading logarithmic prices of series $i$ in the $l$-th ($l = 1, \dots, L$) interval of day $t$. This realised sample covariance measure is defined in a similar way to the realised sample variance measure using high-frequency data and can be used to estimate the off-diagonal entries in the variance–covariance matrix of asset returns. However, models for the covariance measures are different from models for variance measures because covariances have real support $(-\infty, \infty)$ whereas variances are always positive. To avoid modelling variance and covariance using different model structures, [20] proposed the MCARR model, which uses only variance measures and estimates the covariance using the variance of the pairwise sum, as detailed in the next section.
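The realised sample covariance above is a sum, over intraday intervals, of the products of the two series’ interval returns. A short sketch with toy interval first/last log prices:

```python
def realised_covariance(opens_i, closes_i, opens_j, closes_j):
    """Realised sample covariance: sum over intraday intervals of the
    product of the two series' interval log returns."""
    return sum((ci - oi) * (cj - oj)
               for oi, ci, oj, cj in zip(opens_i, closes_i, opens_j, closes_j))

# toy first/last log prices per interval for two assets
oi, ci = [4.60, 4.61, 4.62], [4.61, 4.62, 4.615]
oj, cj = [3.20, 3.21, 3.22], [3.21, 3.22, 3.218]
print(realised_covariance(oi, ci, oj, cj))
```

With $i = j$, the same expression reduces to the realised sample variance, which is the sense in which the two measures are defined analogously.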
5. Conclusions
In this paper, we propose the two-stage MCARR-return models to study the returns, volatilities and correlations of market indices. We provide details of applying the models through an analysis of the S&P 500, DJIA and DJUSFI indices with 1-minute sampling frequency. In these empirical studies, we begin by proposing the SRPK volatility measures and applying them to MCARR models with MLN-distributed errors for each pair of indices in the first stage. We consider two models, MCARR(1,1,0) and MCARR(1,1,1), in which the mean function has no leverage effect and a leverage effect, respectively. We explore ways to simplify the coefficient matrices in the mean function to reduce the number of parameters. In the second stage, the volatility and covariance estimates or forecasts from the best-performing MCARR model are imputed into the multivariate return models to estimate the returns. The errors of these return models are assigned the MVT distribution. We compare the in-sample fits of volatilities and returns using LL and AIC. For returns, we compare the performance of the two-stage MCARR(1,1,0)-return and MCARR(1,1,1)-return models with MVT error distribution against BEKK-GARCH(1,1) models with MVN and MVT distributions. The results from each pair of in-sample fits show that the best two-stage MCARR-return models outperform all BEKK-GARCH models. The best two-stage MCARR-return model is the MCARR(1,1,0)-return model with a constant mean function for two pairs of indices and the MCARR(1,1,1)-return model with a constant mean function for one pair of indices.
These best models are then applied to provide one-day-ahead volatility and return forecasts for 64 days. The predictive performance of the volatility forecasts is evaluated using RL functions with the SRCO measure chosen as the proxy for true volatility. One-day-ahead VaR and CVaR forecasts of returns at lower and upper risk levels of 0.005, 0.025 and 0.05, and one-day-ahead TQ and TCE forecasts of closing prices at lower and upper risk levels of 0.025, are obtained using the best-performing model. The p-values of the KUC and CI tests show insufficient evidence to reject the null hypotheses of the VaR model at the 5% level across all risk levels, confirming the accuracy of the VaR forecasts. In summary, the proposed two-stage MCARR-return models can capture the persistence, volatility spillover and leverage effects of the volatility dynamics, the cross-dependence between asset returns, and the leptokurtic return distribution. These effects are all estimated to be significant in our data.
However, there are a few limitations to our study. Firstly, intraday price information is required to measure the variance of the sum of a pair of returns using any volatility measure such as the SRPK that we adopt. Daily prices fail to measure the variance of the sum of two returns, as the maximum of the sum is not the sum of the maxima unless the sum and, hence, each return are continuously observed. Therefore, the dynamics of the sum-of-returns process need to be estimated using realised volatility based on more intra-period observations, as explained in (12) and (13). On the other hand, collecting high-frequency intraday price information is possible from many platforms. Apart from the SRPK, there are other more efficient volatility measures, such as the scaled realised Garman–Klass and scaled realised Rogers–Satchell measures, since these measures use intraday open-close in addition to high-low information.
Secondly, the MCARR model models the covariance indirectly through the variance of the pairwise sum of returns. This is less efficient, as features of the covariance may differ from those of the variance of the pairwise sum. For future research, one may consider modelling directly the time series of variance–covariance matrix measures, using (8) as one possible measure. The matrix measure can then be assigned the conditional Wishart distribution [26], with the mean matrix adopting a structure similar to (35) for the BEKK-GARCH model, which contains short-term and long-term persistence matrices. The variance–covariance matrix of returns in (28) can then be estimated using the fitted mean matrix. Moreover, the mean matrix can also adopt a structure similar to the DCC-GARCH model, which models the variance and correlation matrices simultaneously. This model adopts range information more directly than the range-based MGARCH models. The Wishart model ensures positive-definite covariance matrices without imposing parameter constraints, and it can be estimated easily using the maximum likelihood method with a closed-form pdf. This is an interesting direction for the analysis of moderate-dimensional variance–covariance matrix processes.
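To make the closed-form-pdf claim concrete, the following sketch evaluates the Wishart log-density for a 2×2 covariance matrix measure in pure Python (scipy.stats.wishart offers the same computation; the matrices and degrees of freedom below are illustrative only):

```python
import math

def wishart_logpdf_2x2(X, df, V):
    """Log-density of a 2x2 Wishart(df, scale V) matrix X, for df > 1."""
    p = 2
    det_X = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    det_V = V[0][0] * V[1][1] - V[0][1] * V[1][0]
    # tr(V^{-1} X) via the closed-form 2x2 inverse
    trace = (V[1][1] * X[0][0] - V[0][1] * X[1][0]
             - V[1][0] * X[0][1] + V[0][0] * X[1][1]) / det_V
    # multivariate gamma function: ln Gamma_p(df/2)
    log_mgamma = (p * (p - 1) / 4.0) * math.log(math.pi) + sum(
        math.lgamma(df / 2.0 + (1 - j) / 2.0) for j in range(1, p + 1))
    return ((df - p - 1) / 2.0 * math.log(det_X) - trace / 2.0
            - df * p / 2.0 * math.log(2.0)
            - df / 2.0 * math.log(det_V) - log_mgamma)

X = [[1.2, 0.3], [0.3, 0.9]]   # observed covariance matrix measure
V = [[1.0, 0.0], [0.0, 1.0]]   # scale matrix (mean matrix = df * V)
print(wishart_logpdf_2x2(X, df=5.0, V=V))
```

Summing such log-densities over days, with the scale matrix driven by a BEKK- or DCC-type recursion, gives the log-likelihood for the proposed future model.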
However, the MCARR and Wishart models cannot solve the curse of dimensionality. As the number of assets increases, the number of parameters may explode, causing convergence problems and parameter instability. To reduce the number of parameters, we suggest imposing constraints such as symmetric coefficient matrices for the short- and long-term persistence and diagonal coefficient matrices for the leverage effect. Another approach is to use factor models for time series [42]. Moreover, machine learning models can handle high-dimensional data without imposing any distributional or structural assumptions. Convolutional neural networks (CNN) and recurrent neural networks (RNN) are the most common types of neural network for capturing image data and sequential data, respectively. Ref. [43] used GARCH and RNN models to forecast Bitcoin’s return volatility and VaR. They showed that the RNN model is more sensitive and responds more quickly to volatility changes than traditional financial time series models. Ref. [44] proposed a convolutional long short-term memory neural network (CLSTM-NN) to handle high-dimensional realised covariance matrices consistently. However, although the CLSTM-NN demonstrates excellent forecasting ability, the specific features that drive the covariance process seem to be hidden in a black box. This limitation is particularly unappealing when the dynamics of the covariance process are the focus of the model. Hence, future research can be directed to exploring hybrid statistical and neural network models that overcome the curse of dimensionality while preserving some model interpretability.