2.1. Modelling and Inference
A common feature of financial time series is the presence of volatility clustering (see, e.g., Cont 2001). Common tools used to address such features are Generalized Auto-Regressive Conditional Heteroscedasticity (GARCH) models (Bollerslev 1986), which generalize the ARCH models introduced by Engle (1982). Let $r_t$ be the return discrete-time process with zero mean. The standardized disturbances $z_t$ are independent and identically distributed (i.i.d.) with zero mean, $\mathbb{E}[z_t] = 0$, and unit variance, $\mathbb{E}[z_t^2] = 1$. Then, the GARCH($p,q$) process for return $r_t$ is defined as

$$ r_t = \sqrt{h_t}\, z_t $$

and

$$ h_t = \omega + \sum_{i=1}^{q} \alpha_i r_{t-i}^2 + \sum_{j=1}^{p} \beta_j h_{t-j}, $$

where $h_t$ is the conditional variance, $\omega > 0$, $\alpha_i \ge 0$, and $\beta_j \ge 0$.
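To make the recursion concrete, the following R sketch simulates a GARCH(1,1) path directly from the two equations above; the parameter values are illustrative assumptions, not estimates.

```r
# Minimal sketch: simulate a GARCH(1,1) path r_t = sqrt(h_t) z_t.
set.seed(1)
n     <- 1000
omega <- 1e-5; alpha <- 0.08; beta <- 0.90   # alpha + beta < 1: covariance stationary
z <- rnorm(n)                                # i.i.d. standardized disturbances
h <- numeric(n); r <- numeric(n)
h[1] <- omega / (1 - alpha - beta)           # start from the unconditional variance
r[1] <- sqrt(h[1]) * z[1]
for (t in 2:n) {
  h[t] <- omega + alpha * r[t - 1]^2 + beta * h[t - 1]  # conditional variance recursion
  r[t] <- sqrt(h[t]) * z[t]                             # return equation
}
```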
When studying spillover risk, it is natural to look for multivariate extensions of the GARCH model to characterize the joint evolution of stock returns. Before presenting the multivariate model, it is useful to define the following quantities of interest. Assuming a market with $N$ assets, at time $t$ we have:
- $r_t = (r_{1,t}, \dots, r_{N,t})'$ is the vector of assets' returns at time $t$;
- $H_t$ is the conditional covariance matrix;
- $h_t = (h_{1,t}, \dots, h_{N,t})'$ is the vector of the univariate conditional variances;
- $D_t = \mathrm{diag}(\sqrt{h_{1,t}}, \dots, \sqrt{h_{N,t}})$ is a square matrix with the conditional standard deviations on the main diagonal and zeros otherwise;
- $R_t$ is the positive definite conditional correlation matrix;
- $Q_t$ is the conditional covariance matrix of the standardized residuals $\varepsilon_t = D_t^{-1} r_t$;
- $\bar{Q}$ is the unconditional covariance matrix of the standardized residuals.
Full generalizations of a univariate model, such as the VEC GARCH model (Bollerslev et al. 1988; Ling and McAleer 2003) or the BEKK model (Engle and Kroner 1995), have been extensively discussed in the literature. Using matrix notation, it is possible to characterize a multivariate GARCH as follows:

$$ r_t = H_t^{1/2} z_t \qquad (1) $$

and

$$ \mathrm{vech}(H_t) = \mathrm{vech}(\Omega) + \sum_{i=1}^{q} A_i\, \mathrm{vech}\!\left(r_{t-i} r_{t-i}'\right) + \sum_{j=1}^{p} B_j\, \mathrm{vech}(H_{t-j}), \qquad (2) $$

where $A_i$ and $B_j$ are $\frac{N(N+1)}{2} \times \frac{N(N+1)}{2}$ matrices and $z_t$ is an $\mathbb{R}^N$-valued i.i.d. sequence of random variables with zero mean and unit variances (see Engle and Kroner (1995) for the restrictions required to ensure stationarity and positive semi-definiteness of the conditional covariance matrix).
Multivariate GARCH models have the drawback of having a large number of parameters, making the estimation complex and computationally challenging; hence these models are suitable only if the dimensionality $N$ is very small. A solution to the dimensionality problem is to impose further restrictions on the multivariate process. A common restricted specification is the Constant Conditional Correlation (CCC) model proposed by Bollerslev (1990), which assumes that the conditional correlation matrix is constant over time, so that one can focus solely on the estimation of the conditional variances. According to the CCC-GARCH model, Equation (1) is given by (3) and Equation (2) can be written as follows:

$$ h_t = \omega + \sum_{i=1}^{q} \alpha_i \odot \left(r_{t-i} \odot r_{t-i}\right) + \sum_{j=1}^{p} \beta_j \odot h_{t-j}, $$

where $\omega$ is the $N$-dimensional vector of unconditional variances with $\omega > 0$, $\alpha_i$ and $\beta_j$ are the $N$-dimensional vectors of ARCH and GARCH parameters of order $q$ and $p$ with $\alpha_i \ge 0$, $\beta_j \ge 0$, and $\odot$ is the Hadamard product.
The CCC-GARCH model assumes that the conditional covariance matrix, $H_t$, can be factorized as

$$ H_t = D_t R D_t, $$

where the correlation matrix $R$ is assumed to be constant throughout time and the conditional standard deviation matrix $D_t$ is a diagonal matrix given by

$$ D_t = \mathrm{diag}\!\left(\sqrt{h_{1,t}}, \dots, \sqrt{h_{N,t}}\right). $$

The generic element of the conditional covariance matrix $H_t$ is constructed as

$$ [H_t]_{ij} = \rho_{ij} \sqrt{h_{i,t}\, h_{j,t}}, $$

where $\rho_{ij}$ is the constant conditional correlation coefficient between the $i$th and $j$th variables.
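As a minimal illustration of this factorization, the following R snippet builds $H_t$ from hypothetical conditional variances and a hypothetical constant correlation matrix for $N = 3$ assets.

```r
# Minimal sketch of the CCC factorization H_t = D_t R D_t for one date.
h_t <- c(0.00010, 0.00025, 0.00016)        # univariate conditional variances
R   <- matrix(c(1.0, 0.3, 0.2,
                0.3, 1.0, 0.4,
                0.2, 0.4, 1.0), nrow = 3)  # constant conditional correlations
D_t <- diag(sqrt(h_t))                     # conditional standard deviations
H_t <- D_t %*% R %*% D_t                   # conditional covariance matrix
# Generic element check: [H_t]_ij = rho_ij * sqrt(h_i h_j)
all.equal(H_t[1, 2], R[1, 2] * sqrt(h_t[1] * h_t[2]))
```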
The multivariate GARCH model with a dynamic conditional correlation structure (DCC), introduced by Engle (2002), improves the dynamic relationship, assuming a time-varying correlation matrix as follows:

$$ H_t = D_t R_t D_t. \qquad (9) $$

The dynamic correlation model allows $R_t$ to be time-varying, and its dynamics are modeled assuming a GARCH(1,1) process for the covariance of the standardized residuals. Hence $R_t$ is decomposed into

$$ R_t = \mathrm{diag}(Q_t)^{-1/2}\, Q_t\, \mathrm{diag}(Q_t)^{-1/2}, \qquad (10) $$

where

$$ Q_t = (1 - a - b)\,\bar{Q} + a\, \varepsilon_{t-1}\varepsilon_{t-1}' + b\, Q_{t-1}, $$

where $a$ and $b$ are the ARCH and GARCH parameters of the DCC model, respectively. By following the GARCH model from Equation (2), the generic element of the time-varying conditional covariance matrix of the standardized residuals $Q_t$ can be expressed as

$$ q_{ij,t} = (1 - a - b)\,\bar{q}_{ij} + a\, \varepsilon_{i,t-1}\varepsilon_{j,t-1} + b\, q_{ij,t-1}, $$

where $\bar{q}_{ij} = \mathbb{E}[\varepsilon_{i,t}\varepsilon_{j,t}]$. The process is mean-reverting as long as $a + b < 1$ and $a, b \ge 0$. In the particular case of $a + b = 1$, the process will follow the exponential smoother matrix of the standardized residuals, as described in Engle (2002). Finally, the generic conditional correlation $\rho_{ij,t} = q_{ij,t}/\sqrt{q_{ii,t}\, q_{jj,t}}$ can be written in matrix form as in Equation (10). Substituting the conditional correlation matrix into Equation (9), the DCC is given by

$$ H_t = D_t\, \mathrm{diag}(Q_t)^{-1/2}\, Q_t\, \mathrm{diag}(Q_t)^{-1/2}\, D_t. $$
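A minimal R sketch of the correlation filter implied by Equations (9) and (10) is given below, assuming a $T \times N$ matrix `eps` of standardized residuals; the values of $a$ and $b$ are placeholders, not estimates.

```r
# Minimal sketch of the DCC(1,1) recursion for Q_t and R_t.
dcc_filter <- function(eps, a = 0.02, b = 0.95) {
  Tn <- nrow(eps); N <- ncol(eps)
  Qbar <- cov(eps)                          # unconditional covariance of residuals
  Q <- Qbar
  R <- array(NA_real_, dim = c(N, N, Tn))
  for (t in 1:Tn) {
    if (t > 1) {
      e <- eps[t - 1, ]
      Q <- (1 - a - b) * Qbar + a * tcrossprod(e) + b * Q
    }
    d <- 1 / sqrt(diag(Q))
    R[, , t] <- diag(d) %*% Q %*% diag(d)   # rescale Q_t into a correlation matrix
  }
  R
}
```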
Restricted GARCH models beyond CCC- and DCC-GARCH have been discussed by Caporin (2008) and Billio et al. (2021), who introduce spatial matrices within BEKK models for measuring risk spillover. In these approaches, the interaction components of the model are based on spatial weight matrices provided exogenously (for instance, on the basis of geographical distances among assets, or some similarity metrics). These models allow easier and more accurate estimation by effectively imposing restrictions on the parameter space. An alternative approach to improving the estimation of multivariate GARCH models is to introduce sparsity in the parameter estimates by using an $\ell_1$ penalization, as suggested by Dhaene et al. (2022).
2.1.1. Spatial DCC-GARCH
In this work, we introduce a spatial extension of the DCC-GARCH model. The model is based on the approach of Borovkova and Lopuhaa (2012). In particular, to enrich the DCC-GARCH model with a spatial component, we introduce a spatial matrix $\tilde{W}$ into the vector of the conditional variances $h_t$. The resulting conditional variance is expressed as

$$ h_t = \omega + \left(A + \tilde{A}\tilde{W}\right)\left(r_{t-1} \odot r_{t-1}\right) + \left(B + \tilde{B}\tilde{W}\right) h_{t-1}, \qquad (15) $$

where $A$, $\tilde{A}$, $B$, and $\tilde{B}$ are diagonal matrices. The term $\tilde{W}$ is the weight matrix with $w_{ij} \ge 0$ and $w_{ii} = 0$, given by

$$ \tilde{W} = \begin{pmatrix} 0 & w_{12} & \cdots & w_{1N} \\ w_{21} & 0 & \cdots & w_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ w_{N1} & w_{N2} & \cdots & 0 \end{pmatrix}. $$

The $i$-th element of $h_t$ becomes

$$ h_{i,t} = \omega_i + \alpha_i r_{i,t-1}^2 + \tilde{\alpha}_i \tilde{r}_{i,t-1}^2 + \beta_i h_{i,t-1} + \tilde{\beta}_i \tilde{h}_{i,t-1}, \qquad (16) $$

where $\tilde{r}_{i,t-1}^2 = \sum_{j=1}^{N} w_{ij}\, r_{j,t-1}^2$ and $\tilde{h}_{i,t-1} = \sum_{j=1}^{N} w_{ij}\, h_{j,t-1}$. The introduction of the spatial component results in two exogenous spatial variables in the conditional variance equation and two additional parameters, $\tilde{\alpha}_i$ and $\tilde{\beta}_i$, which measure the influence of the aggregated lagged variances and squared returns of all the other assets. These two new variables measure the aggregated spillover effects. To complete the Spatial DCC-GARCH model, we then estimate the conditional correlation matrix following the two-step procedure described in Engle and Sheppard (2001); see Section 2.1.2.
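The following R sketch implements the matrix recursion (15) in our notation; the function and argument names are ours, and the initialization at the sample variances is an illustrative choice.

```r
# Minimal sketch of the spatial conditional variance recursion (15):
# h_t = omega + (A + At W)(r_{t-1} o r_{t-1}) + (B + Bt W) h_{t-1}.
spatial_h <- function(r, W, omega, alpha, alpha_s, beta, beta_s) {
  Tn <- nrow(r); N <- ncol(r)
  A <- diag(alpha); At <- diag(alpha_s)   # own and spatial ARCH (diagonal)
  B <- diag(beta);  Bt <- diag(beta_s)    # own and spatial GARCH (diagonal)
  h <- matrix(NA_real_, Tn, N)
  h[1, ] <- apply(r, 2, var)              # initialize at the sample variances
  for (t in 2:Tn) {
    h[t, ] <- omega +
      (A + At %*% W) %*% (r[t - 1, ]^2) + # own and neighbours' squared returns
      (B + Bt %*% W) %*% h[t - 1, ]       # own and neighbours' lagged variances
  }
  h
}
```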
The condition for the weak stationarity of the spatial GARCH model follows from the corresponding stationarity condition derived for E-CCC models by Jeantheau (1998) and Conrad and Karanasos (2010). The positivity conditions on all GARCH coefficients are not necessary for the positivity of variance and, in many empirical cases, may be too restrictive, ruling out possible negative volatility feedback. Conrad and Karanasos (2010) studied the E-CCC models and stated necessary and sufficient conditions (in terms of the process parameters) for the positivity of variance; these conditions are summarized in Theorem 1 of their paper. It can be seen easily that our spatial DCC-GARCH(1,1) model is equivalent to the E-DCC model of order one, with the particular form of the parameter matrices $A + \tilde{A}\tilde{W}$ and $B + \tilde{B}\tilde{W}$. So, for the conditional variances to be positive, conditions (C1)–(C3) of Theorem 1 of Conrad and Karanasos (2010) must apply. The proposed spatial DCC-GARCH(1,1) model is weakly stationary if the modulus of the largest eigenvalue of the matrix $A + \tilde{A}\tilde{W} + B + \tilde{B}\tilde{W}$ is less than 1. In that case, the unconditional variances are given by $\bar{h} = \left(I_N - A - \tilde{A}\tilde{W} - B - \tilde{B}\tilde{W}\right)^{-1}\omega$. More specifically, the unconditional variance of the $i$th bank is the $i$th element of $\bar{h}$, and it is positive for $\omega_i > 0$ and parameter values satisfying the positivity conditions (C1)–(C3).
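A minimal R sketch of this stationarity check and of the computation of the unconditional variances follows; the helper name is ours.

```r
# Check weak stationarity via the spectral radius of (A + At W + B + Bt W)
# and, if it holds, recover the unconditional variances.
stationarity_check <- function(W, alpha, alpha_s, beta, beta_s, omega) {
  M    <- diag(alpha) + diag(alpha_s) %*% W + diag(beta) + diag(beta_s) %*% W
  rho  <- max(Mod(eigen(M, only.values = TRUE)$values))  # largest eigenvalue modulus
  hbar <- if (rho < 1) solve(diag(nrow(M)) - M, omega) else rep(NA_real_, nrow(M))
  list(spectral_radius = rho, stationary = rho < 1, uncond_var = hbar)
}
```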
2.1.2. Estimation of the Multivariate Spatial GARCH(1,1) Model
We follow a two-step procedure for the DCC-GARCH estimation, as described in Engle and Sheppard (2001) and Engle (2002). The first step is devoted to the estimation of (16), where the exogenous variable $\tilde{h}_{i,t-1}$ is not observable, since it is a function of the conditional variances of the other assets. Hence, following Borovkova and Lopuhaa (2012), we start by estimating the standard univariate GARCH(1,1) models without the external regressors to obtain the initial parameters $(\hat{\omega}_i, \hat{\alpha}_i, \hat{\beta}_i)$ and the estimated variances $\hat{h}_{i,t}$. Then we use an iterative procedure in which we alternate the following two steps:
1. compute the exogenous variables $\tilde{r}_{i,t-1}^2$ and $\tilde{h}_{i,t-1}$ given the weights and the previously estimated variances;
2. estimate the complete set of parameters $(\hat{\omega}_i, \hat{\alpha}_i, \hat{\tilde{\alpha}}_i, \hat{\beta}_i, \hat{\tilde{\beta}}_i)$ and the new estimated variances $\hat{h}_{i,t}$ according to Equation (16).
We iterate this procedure until the percentage variation of the estimates is less than a small threshold. For more details, please refer to Borovkova and Lopuhaa (2012).
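A sketch of this iterative first step is given below, assuming the rugarch package; the wrapper (names, lag construction, stopping rule) is our illustration rather than the paper's exact implementation. Since rugarch's external variance regressors enter the variance equation contemporaneously, the spatial variables are lagged by one period before being passed in.

```r
library(rugarch)

lag1 <- function(x) rbind(0, x[-nrow(x), , drop = FALSE])  # one-period lag

# r: T x N matrix of returns; W: spatial weight matrix;
# h_hat: T x N initial variance estimates from plain GARCH(1,1) fits.
fit_spatial_step <- function(r, W, h_hat, max_iter = 20, tol = 1e-3) {
  N <- ncol(r); fits <- vector("list", N)
  for (k in seq_len(max_iter)) {
    h_old <- h_hat
    sq_sp <- lag1(r^2   %*% t(W))  # aggregated lagged squared returns of the others
    h_sp  <- lag1(h_hat %*% t(W))  # aggregated lagged variances of the others
    for (i in seq_len(N)) {
      spec <- ugarchspec(
        variance.model = list(model = "sGARCH", garchOrder = c(1, 1),
                              external.regressors = cbind(sq_sp[, i], h_sp[, i])),
        mean.model = list(armaOrder = c(0, 0), include.mean = FALSE))
      fits[[i]]  <- ugarchfit(spec, data = r[, i])
      h_hat[, i] <- as.numeric(sigma(fits[[i]]))^2
    }
    if (max(abs(h_hat - h_old) / h_old) < tol) break  # small percentage variation
  }
  fits
}
```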
In the second step, as in Engle (2002), we maximize the quasi log-likelihood that, when the standardized error $z_t$ follows a multivariate Gaussian distribution, is

$$ L(\theta) = -\frac{1}{2} \sum_{t=1}^{T} \left( N \log(2\pi) + 2 \log|D_t| + \log|R_t| + \varepsilon_t' R_t^{-1} \varepsilon_t \right), $$

where $\theta$ is the set of parameters of the multivariate distribution, with subsets $\phi$ being the spatial GARCH parameters estimated in the first step, and $\psi$ the parameters of the time-varying conditional correlation that are estimated in the second step. Excluding the terms depending only on $\phi$ and other additive and multiplicative constants, we maximize the following function:

$$ L^{*}(\psi) = -\frac{1}{2} \sum_{t=1}^{T} \left( \log|R_t| + \varepsilon_t' R_t^{-1} \varepsilon_t \right). $$

The quasi log-likelihood function under the Student's t distribution is

$$ L(\theta, \nu) = \sum_{t=1}^{T} \left[ \log \Gamma\!\left(\tfrac{\nu + N}{2}\right) - \log \Gamma\!\left(\tfrac{\nu}{2}\right) - \frac{N}{2} \log\!\left(\pi(\nu - 2)\right) - \log|D_t| - \frac{1}{2}\log|R_t| - \frac{\nu + N}{2} \log\!\left(1 + \frac{\varepsilon_t' R_t^{-1} \varepsilon_t}{\nu - 2}\right) \right], $$

where $\nu$ is the degrees of freedom, $(\psi, \nu)$ is the set of parameters estimated in the second step, and $\Gamma(\cdot)$ is the Gamma function. The estimation of the model is implemented in R using the packages rugarch and rmgarch for the estimation of the univariate GARCH models in the first step and of the DCC-GARCH model in the second step, respectively.
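For the second step, a minimal sketch using rmgarch is shown below (without the spatial regressors, for brevity): `dccspec`/`dccfit` estimate the correlation parameters $(a, b)$ and, under `distribution = "mvt"`, the degrees of freedom $\nu$.

```r
library(rmgarch)  # loads rugarch as well

# r: T x N matrix of returns; identical univariate specs for illustration.
uspec <- multispec(replicate(ncol(r), ugarchspec(
  variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
  mean.model = list(armaOrder = c(0, 0), include.mean = FALSE))))
dspec <- dccspec(uspec, dccOrder = c(1, 1), distribution = "mvt")  # or "mvnorm"
dfit  <- dccfit(dspec, data = r)  # two-step quasi-maximum likelihood
```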
Concerning the complexity of estimation, we see that the spatial models add two parameters ($\tilde{\alpha}_i$, $\tilde{\beta}_i$) for each asset; hence the number of additional parameters scales linearly with the number of assets considered. We also see that the Student's t model has one extra parameter compared to the Gaussian model, and that in the limit for $\nu \to \infty$ the former converges to the latter. Moreover, the spatial model nests the non-spatial DCC-GARCH models, where the coefficients of the spatial components are restricted to zero. The Spatial DCC-GARCH model can therefore be considered parsimonious in terms of the number of parameters, especially compared to the VEC GARCH or BEKK models. One drawback of the proposed model is that it requires the exogenous identification of a spatial matrix.
2.1.3. Spatial Weight Matrix
To estimate the spatial DCC-GARCH described in Section 2.1.1, we need to specify the weight matrix $\tilde{W}$, which incorporates the spatial structure defined a priori. The most intuitive way to compute the weights is to consider the geographical distance between the issuers' market cities. However, according to Borovkova and Lopuhaa (2012), the weights obtained in this way are not economically meaningful, and as an alternative they consider a different set of information and compute distances in terms of GDP and market capitalization. In our work, we investigate whether the similarity of the banks' credit exposure structures provides some benefit in catching risk spillover effects. Hence, we propose to consider the cosine similarity between exogenous information on the credit exposure of each bank, derived from the EU-wide stress test of the European Banking Authority (EBA). The higher the cosine similarity, the closer the banks' credit exposures. Suppose two attribute vectors of length $L$, $x_i$ and $x_j$, describe the credit exposure information of banks $i$ and $j$, with $i \ne j$. We define the cosine similarity as follows:

$$ c_{ij} = \frac{\sum_{l=1}^{L} x_{i,l}\, x_{j,l}}{\sqrt{\sum_{l=1}^{L} x_{i,l}^2}\, \sqrt{\sum_{l=1}^{L} x_{j,l}^2}}. $$

We set $c_{ii} = 0$ and we normalize the rows of $C$ by dividing each element by the sum of the row. Doing so, we obtain the matrix $\tilde{W}$, which is the spatial weight matrix used in Equation (15).
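A minimal R sketch of the construction of $\tilde{W}$ from a hypothetical $N \times L$ matrix `X` of credit-exposure attributes (one row per bank):

```r
# Cosine-similarity weight matrix, with zero diagonal and row normalization.
cosine_weights <- function(X) {
  norms <- sqrt(rowSums(X^2))
  C <- (X %*% t(X)) / (norms %o% norms)  # c_ij = <x_i, x_j> / (||x_i|| ||x_j||)
  diag(C) <- 0                           # c_ii = 0: no self-influence
  C / rowSums(C)                         # each row sums to one
}
```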
2.2. Financial Application: CoVaR
Financial institutions use VaR to measure standalone risk. However, the measurement of individual risk is not able to explain the linkages between financial institutions and the financial system. Systemic risk is the possibility that an event at the institutional level could trigger severe instability or the collapse of an entire industry or economy. Adrian and Brunnermeier (2014) introduce CoVaR to help regulators measure risk spillovers.
The Value at Risk (VaR) at level $q$ of a random variable $r$ with cumulative distribution function $F_r(\cdot)$ is defined as

$$ \mathrm{VaR}_q = -\inf\left\{ x \in \mathbb{R} : F_r(x) > q \right\}, $$

where $1 - q$ denotes the confidence level of the VaR. Restricting our analysis to continuous probability distribution functions, VaR can be implicitly defined as the $q$-quantile of the probability distribution function:

$$ \Pr\left(r \le -\mathrm{VaR}_q\right) = q. $$
The Conditional Value-at-Risk (CoVaR) (see Adrian and Brunnermeier 2014), denoted by $\mathrm{CoVaR}_q^{S|i}$, is implicitly defined by the $q$-quantile of the continuous probability distribution of the financial system return $r^S$ conditional on some event $\mathbb{C}(r^i)$ related to institution $i$, where $r^i$ is the return of institution $i$, such that

$$ \Pr\left(r^S \le -\mathrm{CoVaR}_q^{S|i} \,\middle|\, \mathbb{C}(r^i)\right) = q. $$

The CoVaR can capture the contribution to systemic risk by conditioning the VaR on a stressed situation. It captures the spillover of risk between a particular institution and the financial system, and it is commonly used to assess the systemic risk of a bank in a financial system. Inspired by this idea, we concentrate our attention on a pairwise CoVaR analysis between institutions in order to quantify the spillover between couples of banks.
The conditioning event $\mathbb{C}(r^i)$ in the original paper by Adrian and Brunnermeier (2014) is defined as the return of the conditioning asset $i$ being equal to its negative VaR, that is, $r^i = -\mathrm{VaR}_q^i$. In this work, we follow the alternative approach of Girardi and Ergün (2013), which considers as a conditioning event the return $r^i$ being smaller than or equal to the following quantity: $-\mathrm{VaR}_q^i$. This formulation allows us to consider more severe distress events and improves the consistency of the measure with respect to the dependence parameter, allowing for backtesting. Following Girardi and Ergün (2013), the redefined $\mathrm{CoVaR}_q^{S|i}$ is obtained by solving

$$ \Pr\left(r^S \le -\mathrm{CoVaR}_q^{S|i} \,\middle|\, r^i \le -\mathrm{VaR}_q^i\right) = q. $$

Let $f(r^S, r^i)$ be the bivariate probability distribution function of future returns, estimated using the DCC-GARCH model with either Gaussian or Student's t innovations; $\mathrm{CoVaR}_q^{S|i}$ is then implicitly defined as the quantity that solves

$$ \int_{-\infty}^{-\mathrm{CoVaR}_q^{S|i}} \int_{-\infty}^{-\mathrm{VaR}_q^i} f\!\left(r^S, r^i\right)\, dr^i\, dr^S = q \cdot \Pr\left(r^i \le -\mathrm{VaR}_q^i\right) = q^2. \qquad (23) $$

We compute the integral (23) on a grid of 100 values for $\mathrm{CoVaR}_q^{S|i}$ to find the approximated solution under the different distributional assumptions.
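A minimal R sketch of this grid search under Gaussian innovations, assuming the mvtnorm package; `mu` and `Sigma` stand for the one-step-ahead mean and covariance of the pair (system, institution $i$), and the grid bounds are an illustrative choice. Under Student's t innovations, `mvtnorm::pmvt` would play the same role as `pmvnorm`.

```r
library(mvtnorm)

covar_grid <- function(q, mu, Sigma, n_grid = 100) {
  var_i <- -qnorm(q, mean = mu[2], sd = sqrt(Sigma[2, 2]))      # VaR of institution i
  grid  <- seq(0, 5 * sqrt(Sigma[1, 1]), length.out = n_grid)   # candidate CoVaR values
  # Joint probability Pr(r_S <= -CoVaR, r_i <= -VaR_i) for each candidate
  joint <- sapply(grid, function(cv)
    pmvnorm(upper = c(-cv, -var_i), mean = mu, sigma = Sigma))
  grid[which.min(abs(joint - q^2))]  # target: q * Pr(r_i <= -VaR_i) = q^2
}
```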
2.2.1. CoVaR Based on Filtered Historical Simulations (FHS)
In order to compare our results with a model-free approach, we consider Filtered Historical Simulations (FHS).
FHS is a well-known tool for multivariate forecasting and simulation of time series that avoids the need for distributional assumptions on the returns' joint dynamics, relying instead on past realizations. The main novelty of this approach compared to historical simulation is to rescale the innovations by the volatility that prevails on a specific day, therefore reflecting the current market conditions (Barone-Adesi et al. 2002; Giannopoulos and Tunaru 2005; Gurrola-Perez and Murphy 2015). To provide a distribution-free benchmark model for the analysis, we compute the VaR and CoVaR via FHS. Consider a time window of length $T$ and let $\{r_t\}$ be the series of historical returns with $t = 1, \dots, T$. The volatility-weighted return series can be computed as $\tilde{r}_t = \frac{\hat{\sigma}_{T+1}}{\hat{\sigma}_t}\, r_t$, where $\hat{\sigma}_t$ is the volatility estimated with an Exponentially Weighted Moving Average (EWMA) procedure with decay factor $\lambda$ at time $t$, and $\hat{\sigma}_{T+1}$ is the one-day-ahead estimate of volatility at the end of the estimation period. In practice, implementing FHS for the estimation of VaR and CoVaR requires the following steps (a code sketch follows the list):
1. compute the residual (or devol) time series $e_t = r_t / \hat{\sigma}_t$, dividing the returns by the EWMA estimated volatility; this allows us to sample from approximately serially independent and identically distributed data;
2. compute the estimated empirical distribution of the rescaled returns (revol), multiplying the devol time series by the latest estimate of volatility, $\tilde{r}_t = \hat{\sigma}_{T+1}\, e_t$, and assigning to each of the $T$ possible outcomes a weight $1/T$;
3. estimate $\mathrm{VaR}_q$ and $\mathrm{CoVaR}_q$ by computing the empirical $q$-quantiles of the revol series of institution $i$ and of the system conditional on the distress of institution $i$, respectively.
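The sketch below illustrates these steps in R for a single return series, with the usual RiskMetrics decay factor $\lambda = 0.94$ taken as an assumption.

```r
# Minimal FHS sketch: EWMA devolatilization, rescaling by the one-day-ahead
# volatility, and an empirical-quantile VaR.
fhs_var <- function(r, q = 0.05, lambda = 0.94) {
  Tn <- length(r)
  s2 <- numeric(Tn + 1)
  s2[1] <- var(r)                      # initialize the EWMA recursion
  for (t in 1:Tn)
    s2[t + 1] <- lambda * s2[t] + (1 - lambda) * r[t]^2
  devol <- r / sqrt(s2[1:Tn])          # approximately i.i.d. residuals
  revol <- devol * sqrt(s2[Tn + 1])    # rescale by the latest volatility estimate
  -quantile(revol, probs = q)          # VaR as minus the empirical q-quantile
}
```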
The FHS approach has the advantage of being non-parametric, although it has the drawback of requiring a large number of observations to accurately estimate risk, especially for the CoVaR. For this reason, it is not suitable for small values of $q$. For instance, with $q = 0.01$ the expected number of exceedances of the CoVaR for an estimation window of 10,000 daily observations (approx. 40 years) is 1, while for $q = 0.05$ it is 25.
2.2.2. Backtesting VaR and CoVaR
In order to test the goodness of our VaR and CoVaR estimates, we compute a time series of one-day-ahead estimates, each obtained on a rolling estimation window of daily observations. We consider tests based on the number of violations, specifically the unconditional and conditional coverage tests (Christoffersen and Pelletier 2004; Kupiec 1995), as well as tests based on asymmetric loss functions for the VaR and CoVaR (Caporin 2008). The model that provides estimates of VaR and CoVaR with the correct number and distribution of exceedances and/or lower loss function values will be considered the more accurate.
2.2.3. Tests Based on the Number of Violations
In order to determine the accuracy of the proposed model, we consider two tests based on the number of violations.
Denote by:
- $r_{i,t}$ the ex-post realized returns of institution $i$, with $t = 1, \dots, T$;
- $\mathrm{VaR}_{q,t}^i$ the ex-ante Value-at-Risk forecasts made at time $t-1$ for time $t$, where $q$ is the expected coverage;
- $I_t$ a sequence of violations for a given interval of the Value-at-Risk forecast:

$$ I_t = \begin{cases} 1 & \text{if } r_{i,t} < -\mathrm{VaR}_{q,t}^i, \\ 0 & \text{otherwise.} \end{cases} $$
The first test is the Kupiec test or unconditional coverage (UC) test (Kupiec 1995). It tests the null hypothesis that the observed failure rate $\hat{p}$ is equal to the failure rate suggested by the confidence level of the VaR, $q$. Thus, the null hypothesis assumes that the observed violation rate is equal to the expected violation rate. If the null hypothesis is rejected, the model is considered inaccurate at the 95% confidence level.
The conditional coverage (CC) test proposed by Christoffersen and Pelletier (2004) requires that the violations be independently distributed over the testing period, where the dependence can be described as a first-order Markov sequence with transition probability matrix

$$ \Pi = \begin{pmatrix} 1 - \pi_{01} & \pi_{01} \\ 1 - \pi_{11} & \pi_{11} \end{pmatrix}, $$

where $\pi_{01}$ is the probability that, conditional on today being a non-violation, the next period is a violation, and $\pi_{11}$ is the probability that, conditional on today being a violation, the next period is a violation. The hypothesis to test for the conditional coverage property is

$$ H_0 : \pi_{01} = \pi_{11} = q, $$

which assesses the independence of failures in consecutive time periods.
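A minimal R implementation of the unconditional coverage likelihood-ratio statistic is shown below; it assumes the hit sequence contains at least one violation and one non-violation.

```r
# Kupiec UC test on a 0/1 hit sequence I (1 = violation), expected coverage q.
kupiec_uc <- function(I, q) {
  n <- length(I); x <- sum(I); p_hat <- x / n
  lr <- -2 * (x * log(q) + (n - x) * log(1 - q) -
              x * log(p_hat) - (n - x) * log(1 - p_hat))
  c(LR = lr, p_value = 1 - pchisq(lr, df = 1))  # asymptotically chi-squared(1)
}
```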
Girardi and Ergün (2013) proposed backtesting $\mathrm{CoVaR}_q^{S|i}$ via a straightforward application of the standard Kupiec and Christoffersen tests, considering the violations (i.e., $r_{S,t} < -\mathrm{CoVaR}_{q,t}^{S|i}$) for those time periods in which $r_{i,t} < -\mathrm{VaR}_{q,t}^i$. Having that in mind, we compute a second hit sequence, $I_t^{S|i}$, on the sub-sample in which $r_{i,t} < -\mathrm{VaR}_{q,t}^i$, as follows:

$$ I_t^{S|i} = \begin{cases} 1 & \text{if } r_{S,t} < -\mathrm{CoVaR}_{q,t}^{S|i} \text{ and } r_{i,t} < -\mathrm{VaR}_{q,t}^i, \\ 0 & \text{if } r_{S,t} \ge -\mathrm{CoVaR}_{q,t}^{S|i} \text{ and } r_{i,t} < -\mathrm{VaR}_{q,t}^i, \end{cases} $$

where the number of observations of the second hit sequence is equal to the number of violations of the first hit sequence. Hence, for the tests on the CoVaR, the sequence of violations $I_t^{S|i}$ can be used instead of $I_t$.
2.2.4. Backtesting Based on Loss Functions
The backtesting based on the confidence level of the VaR estimates shows the accuracy of an individual model; however, the comparison between different models can be limited. To overcome this drawback, Lopez (1999) proposed a backtesting method based on a loss function. The method focuses on the magnitude of the failure when a violation occurs. Thus, the VaR estimates under the loss function can provide the model's performance as a numerical score. The value of the loss function at time $t$ can be given as

$$ L_t = \begin{cases} f\!\left(r_t, \mathrm{VaR}_{q,t}\right) & \text{if } r_t < -\mathrm{VaR}_{q,t}, \\ g\!\left(r_t, \mathrm{VaR}_{q,t}\right) & \text{if } r_t \ge -\mathrm{VaR}_{q,t}, \end{cases} $$

where $f(\cdot)$ and $g(\cdot)$ are the loss functions applied to exceedances and to values within the VaR, respectively. Finally, $L = \sum_{t=1}^{T} L_t$ is defined as the total loss. The best model can be identified by the lowest total loss. Other works by Abad et al. (2014), Caporin (2008), and Cesarone and Colucci (2016) show several alternative specifications for the loss functions $f(\cdot)$ and $g(\cdot)$, defined from the regulator's and the investor's points of view. From the regulator's point of view, we consider the size of the loss only if a violation occurs:

$$ f\!\left(r_t, \mathrm{VaR}_{q,t}\right) = \left(r_t + \mathrm{VaR}_{q,t}\right)^2, \qquad g\!\left(r_t, \mathrm{VaR}_{q,t}\right) = 0. $$
On the contrary, the investor is interested in both sides, as an overestimation of the VaR may trigger limitations from risk management or lead to higher capital requirements imposed by the regulator. In particular, we consider the functions

$$ f\!\left(r_t, \mathrm{VaR}_{q,t}\right) = (1 - q)\left|r_t + \mathrm{VaR}_{q,t}\right| $$

and

$$ g\!\left(r_t, \mathrm{VaR}_{q,t}\right) = q\left(r_t + \mathrm{VaR}_{q,t}\right). $$

We underline that the resulting loss function $L_t$ is strictly related to the Koenker loss function used for the estimation of quantile regression, defined as

$$ \rho_q(u) = u\left(q - \mathbb{1}_{\{u < 0\}}\right), $$

where $u_t = r_t + \mathrm{VaR}_{q,t}$ and $\mathbb{1}_{\{\cdot\}}$ is the indicator function. In the case of independent and identically distributed returns, the minimizer

$$ \arg\min_{v}\; \mathbb{E}\!\left[\rho_q\!\left(r_t + v\right)\right] $$

is the value at risk. For further details we refer to Koenker and Bassett (1978), Rockafellar and Uryasev (2013), and Giacometti et al. (2021).
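A minimal R sketch of the resulting score, i.e., the average Koenker check loss of a series of VaR forecasts, where lower values indicate a more accurate model:

```r
# Koenker check loss on u_t = r_t + VaR_t; negative u_t marks a violation.
koenker_loss <- function(r, var_hat, q) {
  u <- r + var_hat
  mean(u * (q - (u < 0)))  # rho_q(u) = u * (q - 1{u < 0})
}
```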
We extend the backtesting procedure to the case of the CoVaR as before, estimating the measure on the sub-sample in which $r_{i,t} < -\mathrm{VaR}_{q,t}^i$.