*2.2. Research Methodology*

Step 1: preliminary data preparation and analysis

We first made all 11 equity index time series stationary by transforming prices into logarithmic (continuously compounded) returns. We then tested and confirmed the stationarity of each transformed time series using Augmented Dickey–Fuller and Phillips–Perron unit root tests (see Appendix A, Table A1). The summary statistics for the time series are presented in Table A2.
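The transformation of prices into continuously compounded returns can be sketched as follows. This is a minimal illustration, assuming NumPy is available; the function name `log_returns` and the price values are hypothetical and are not taken from the data set used in the paper. The unit root tests are not reproduced here.

```python
import numpy as np

def log_returns(prices):
    """Continuously compounded returns r_t = ln(P_t) - ln(P_{t-1})."""
    prices = np.asarray(prices, dtype=float)
    if np.any(prices <= 0):
        raise ValueError("prices must be strictly positive")
    return np.diff(np.log(prices))

# Hypothetical closing prices for one index (illustration only).
prices = [100.0, 101.0, 99.5, 102.0]
returns = log_returns(prices)
```

Because log returns are additive, their sum over the sample equals the log of the ratio between the last and the first price.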

Step 2: estimating conditional volatilities for daily returns

We computed daily conditional volatilities for each of the 11 time series in our database. Because many studies in the financial literature confirm the asymmetry of the distributions of daily returns (particularly the negative skewness), we chose an EGARCH (1,1) model following Tsay [34], described by the equations below:

$$y\_t = \mu + \varepsilon\_t, \text{ where } \varepsilon\_t = \sigma\_t z\_t \tag{1}$$

$$\log \sigma\_t^2 = k + \sum\_{i=1}^{P} \gamma\_i \log \sigma\_{t-i}^2 + \sum\_{j=1}^{Q} \alpha\_j \left[ \frac{|\varepsilon\_{t-j}|}{\sigma\_{t-j}} - E \left\{ \frac{|\varepsilon\_{t-j}|}{\sigma\_{t-j}} \right\} \right] + \sum\_{j=1}^{Q} \xi\_j \left( \frac{\varepsilon\_{t-j}}{\sigma\_{t-j}} \right) \tag{2}$$

We then tested and confirmed that the coefficients of the EGARCH (1,1) model calibrated for each of the 11 time series are statistically significant.
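To make the recursion in Equations (1) and (2) concrete for P = Q = 1, the sketch below filters the conditional log-variance for given parameter values. It is an illustration under stated assumptions, not the estimation procedure used in the paper: the innovations are assumed standard normal, so that E|z| = sqrt(2/π); the recursion is started at the sample variance; and the function name and all parameter values are hypothetical. In practice the parameters would be obtained by maximum likelihood.

```python
import numpy as np

def egarch11_log_variance(y, mu, k, gamma, alpha, xi):
    """Filter log(sigma_t^2) from Equations (1)-(2) with P = Q = 1."""
    y = np.asarray(y, dtype=float)
    eps = y - mu                               # residuals from Equation (1)
    expected_abs_z = np.sqrt(2.0 / np.pi)      # E|z_t| for z_t ~ N(0, 1) (assumption)
    log_var = np.empty(len(y))
    log_var[0] = np.log(np.var(eps))           # assumed start value: sample variance
    for t in range(1, len(y)):
        # Standardized shock of the previous period.
        z_prev = eps[t - 1] / np.exp(0.5 * log_var[t - 1])
        log_var[t] = (k + gamma * log_var[t - 1]
                      + alpha * (abs(z_prev) - expected_abs_z)
                      + xi * z_prev)
    return log_var

# Hypothetical returns and parameter values (illustration only).
rng = np.random.default_rng(0)
y = rng.normal(0.0, 0.01, size=500)
log_var = egarch11_log_variance(y, mu=0.0, k=-0.4, gamma=0.95, alpha=0.1, xi=-0.05)
sigma = np.exp(0.5 * log_var)                  # conditional volatility
```

With a negative ξ, a negative shock raises the next-period log-variance by more than a positive shock of the same size, which is the asymmetry that motivates the EGARCH specification.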

Step 3: estimating conditional correlations for daily returns

Next, we estimated conditional correlations between MSCI WORLD and each of the remaining ten equity indices. To do this, we chose a multivariate DCC GARCH model as described by Engle and Sheppard [35] and Sheppard [36]:

$$r\_t \mid F\_{t-1} \sim N(0, H\_t) \text{ and } H\_t = D\_t R\_t D\_t \tag{3}$$

where *D<sub>t</sub>* is a *k* × *k* diagonal matrix containing the time-varying standard deviations estimated using univariate GARCH models, with √*h<sub>it</sub>* on the *i*th diagonal position, and *R<sub>t</sub>* is the *k* × *k* matrix containing the time-varying correlations at time *t*.

The log-likelihood for the above estimator can be expressed as presented below:

$$-\frac{1}{2}\sum\_{t=1}^{T}\left(k\log\left(2\pi\right)+2\log\left(|D\_{t}|\right)+\log\left(|R\_{t}|\right)+\epsilon\_{t}^{'}R\_{t}^{-1}\epsilon\_{t}\right)\tag{4}$$

where ε<sub>*t*</sub> ~ *N*(0, *R<sub>t</sub>*) is the time series of the standardized residuals.
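The log-likelihood in Equation (4) can be evaluated term by term as in the sketch below. It assumes that the conditional standard deviations, the correlation matrices, and the standardized residuals are already available as NumPy arrays; the function name `dcc_log_likelihood` and the array layout are illustrative choices, not part of the original study.

```python
import numpy as np

def dcc_log_likelihood(D, R, eps):
    """Evaluate Equation (4).

    D   : (T, k) conditional standard deviations (diagonal of D_t)
    R   : (T, k, k) conditional correlation matrices R_t
    eps : (T, k) standardized residuals
    """
    T, k = eps.shape
    total = 0.0
    for t in range(T):
        log_det_D = np.sum(np.log(D[t]))            # log|D_t| for a diagonal matrix
        _, log_det_R = np.linalg.slogdet(R[t])      # log|R_t|
        quad = eps[t] @ np.linalg.solve(R[t], eps[t])
        total += k * np.log(2.0 * np.pi) + 2.0 * log_det_D + log_det_R + quad
    return -0.5 * total

# Hypothetical inputs: unit volatilities and identity correlations.
eps = np.array([[0.5, -0.2], [1.0, 0.3], [-0.7, 0.1], [0.0, 0.4]])
D = np.ones_like(eps)
R = np.tile(np.eye(2), (len(eps), 1, 1))
ll = dcc_log_likelihood(D, R, eps)
```

With unit volatilities and identity correlation matrices, the expression reduces to the log-likelihood of independent standard normal variables, which gives a simple way to check the implementation.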

The elements of the *D<sub>t</sub>* matrix are expressed as univariate GARCH (P,Q) processes:

$$h\_{it} = \omega\_i + \sum\_{p=1}^{P\_i} \alpha\_{ip} r\_{i,t-p}^2 + \sum\_{q=1}^{Q\_i} \beta\_{iq} h\_{i,t-q} \tag{5}$$
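For P = Q = 1, the univariate recursion in Equation (5) reduces to a single loop, sketched below. The start value (the sample variance), the function name, and the parameter values are assumptions made for illustration; they are not estimates from the paper.

```python
import numpy as np

def garch11_variance(r, omega, alpha, beta):
    """Conditional variance h_t from Equation (5) with P = Q = 1."""
    r = np.asarray(r, dtype=float)
    h = np.empty(len(r))
    h[0] = np.var(r)                       # assumed start value: sample variance
    for t in range(1, len(r)):
        h[t] = omega + alpha * r[t - 1] ** 2 + beta * h[t - 1]
    return h

# Hypothetical returns and parameters (illustration only).
r = np.array([0.01, -0.02, 0.015, 0.0])
h = garch11_variance(r, omega=1e-6, alpha=0.08, beta=0.9)
vol = np.sqrt(h)                           # entries on the diagonal of D_t
```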

Accordingly, the structure of our dynamic correlation process was:

$$Q\_t = \left(1 - \sum\_{m=1}^{M} \alpha\_m - \sum\_{n=1}^{N} \beta\_n\right)\overline{Q} + \sum\_{m=1}^{M} \alpha\_m \left(\epsilon\_{t-m} \epsilon\_{t-m}'\right) + \sum\_{n=1}^{N} \beta\_n Q\_{t-n} \tag{6}$$

$$R\_t = Q\_t^{\*-1} Q\_t Q\_t^{\*-1} \tag{7}$$

where *Q̄* represents the unconditional covariance of the standardized residuals that resulted from the initial estimation, and *Q\*<sub>t</sub>* denotes the diagonal matrix containing the square roots of the diagonal elements of *Q<sub>t</sub>*.

In addition, the elements of *R<sub>t</sub>* are the time-varying correlations between pairs of index return series and can be expressed as ρ<sub>*ijt*</sub> = σ<sub>*ijt*</sub> / (σ<sub>*it*</sub> σ<sub>*jt*</sub>).
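The correlation dynamics in Equations (6) and (7) can be sketched for M = N = 1 as below. This is an illustration under stated assumptions: the standardized residuals are taken as given, the recursion is started at the unconditional covariance, and the function name and the values of the two scalar parameters `a` and `b` are hypothetical rather than estimates from the paper.

```python
import numpy as np

def dcc11_correlations(eps, a, b):
    """Conditional correlation matrices R_t from Equations (6)-(7), M = N = 1.

    eps : (T, k) standardized residuals; a, b >= 0 with a + b < 1.
    """
    eps = np.asarray(eps, dtype=float)
    T, k = eps.shape
    Q_bar = eps.T @ eps / T                 # unconditional covariance of the residuals
    Q = Q_bar.copy()                        # assumed start value
    R = np.empty((T, k, k))
    for t in range(T):
        if t > 0:
            shock = np.outer(eps[t - 1], eps[t - 1])
            Q = (1.0 - a - b) * Q_bar + a * shock + b * Q      # Equation (6)
        q_star_inv = 1.0 / np.sqrt(np.diag(Q))                 # diagonal of Q*_t^{-1}
        R[t] = Q * np.outer(q_star_inv, q_star_inv)            # Equation (7)
    return R

# Hypothetical standardized residuals for two indices (illustration only).
rng = np.random.default_rng(0)
eps = rng.standard_normal((300, 2))
R = dcc11_correlations(eps, a=0.05, b=0.90)
rho = R[:, 0, 1]                            # time-varying correlation between the pair
```

Rescaling by the square roots of the diagonal of *Q* guarantees that each matrix has ones on its diagonal, so the off-diagonal entries can be read directly as correlations.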

As in Step 2, we then tested and confirmed that the coefficients of every DCC MV-GARCH model that we estimated are significant.

Step 4: analyzing the volatility regimes of the indices' daily returns

As previous studies have found (e.g., Lupu [37]), there is a link between correlation and volatility, especially during negative shocks, which can be seen as a form of contagion in a narrow sense. We were interested in how strongly the gender equality indices exhibit this behavior. To assess this, we investigated whether the volatility regimes of the overall indices and the gender equality indices were synchronized. We estimated a Markov Regime Switching model for each time series of index returns, as proposed by Tsay [34], Hamilton [38,39], and Perlin [40]. The model's output consists of the probabilities of the time series being in either a high-volatility or a low-volatility regime at each period, and the model is described by the equation below:

$$y\_t = \sum\_{i=1}^{N\_{nS}} \beta\_i x\_{i,t}^{nS} + \sum\_{j=1}^{N\_S} \phi\_{j,S\_t} x\_{j,t}^{S} + \varepsilon\_t, \qquad \varepsilon\_t \sim P(\Phi\_{S\_t}) \tag{8}$$

where *N<sub>S</sub>* and *N<sub>nS</sub>* represent the total number of switching and non-switching coefficients, respectively; *x<sup>nS</sup><sub>i,t</sub>* is the subset of *x<sub>i,t</sub>* that groups the independent variables with coefficients that do not switch; *x<sup>S</sup><sub>j,t</sub>* is the subset grouping the variables with coefficients that switch; *P*(Φ) indicates the probability density function of the errors; and Φ is the vector containing the values of the parameters of *P*.
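The regime probabilities produced by such a model can be illustrated with a basic Hamilton filter. The sketch below covers only a simplified special case of Equation (8), with a constant non-switching mean and a Gaussian error whose standard deviation switches between two regimes. All parameter values are assumed to be known and are hypothetical; in the study they would be estimated by maximum likelihood, which is not shown here.

```python
import numpy as np

def hamilton_filter(y, mu, sigmas, P):
    """Filtered probabilities Pr(S_t = s | y_1, ..., y_t).

    Model: y_t = mu + e_t, e_t ~ N(0, sigmas[S_t]^2).
    P[i, j] = Pr(S_t = j | S_{t-1} = i).
    """
    y = np.asarray(y, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    P = np.asarray(P, dtype=float)
    n = len(sigmas)
    prob = np.full(n, 1.0 / n)              # assumed flat initial probabilities
    filtered = np.empty((len(y), n))
    for t, obs in enumerate(y):
        predicted = prob @ P                # Pr(S_t | information up to t-1)
        dens = (np.exp(-0.5 * ((obs - mu) / sigmas) ** 2)
                / (sigmas * np.sqrt(2.0 * np.pi)))
        joint = predicted * dens
        prob = joint / joint.sum()          # Bayes update with the new observation
        filtered[t] = prob
    return filtered

# Hypothetical returns: a calm period followed by a turbulent one.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 0.005, 200), rng.normal(0.0, 0.03, 200)])
P = [[0.98, 0.02], [0.02, 0.98]]
filtered = hamilton_filter(y, mu=0.0, sigmas=[0.005, 0.03], P=P)
high_vol_prob = filtered[:, 1]              # probability of the high-volatility regime
```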

Similar to Badea et al. [41], we labeled the volatility regime at each period by transforming the time series of probabilities produced by the model into a time series of volatility regimes. The labels were binary values in chronological order: 1 if the respective equity index exhibited high volatility and 0 if it was in the quiet regime. Following Badea et al. [41], the rules that we used to derive the values of 0 and 1 were the following:


Step 5: verifying the results from previous stages by two different methods: estimating the slope coefficients of simple linear quantile regressions at different percentiles and using unrestricted VAR models

We analyzed the results obtained with the methods presented above, looking for patterns that confirm whether or not the evolution of the performance/risk measures of the gender equality indices differed from that of the overall market indices.

To validate our findings regarding the correlation between the gender equality indices and the overall market indices, we used two regression methods. We estimated simple quantile regressions in which the dependent variables were the gender equality indices and the explanatory variables were the overall market indices, and we also estimated unrestricted VAR (2) models, separately for the cross-sectoral indices and for the financial sector indices.

As described in the literature on financial econometrics, linear regression describes the average linear relationship between a combination of explanatory variables and a dependent variable, relying on the conditional mean E(*y*|*x*). Because this method offers only a partial explanation of the relationship, we were interested in investigating the values of the slope coefficients between the gender equality indices and the overall market indices at several percentiles of the conditional distribution of these series, and quantile regression is one of the tools available to do this.

In a quantile regression, the estimator for the quantile *q* is obtained by minimizing the objective function described below:

$$Q(\boldsymbol{\beta}\_{q}) = \sum\_{i: y\_{i} \ge \mathbf{x}\_{i}' \boldsymbol{\beta}\_{q}}^{N} q \left| y\_{i} - \mathbf{x}\_{i}' \boldsymbol{\beta}\_{q} \right| + \sum\_{i: y\_{i} < \mathbf{x}\_{i}' \boldsymbol{\beta}\_{q}}^{N} (1-q) \left| y\_{i} - \mathbf{x}\_{i}' \boldsymbol{\beta}\_{q} \right| \tag{9}$$
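The quantile regression objective weights positive residuals by *q* and negative residuals by 1 − *q*. The sketch below evaluates this loss and, for an intercept-only model, shows that the minimizer at *q* = 0.5 is the sample median. The function name and the data are hypothetical, and the search over candidate values is a simple illustration rather than the optimization routine used in the study.

```python
import numpy as np

def quantile_objective(beta, X, y, q):
    """Quantile regression loss: q*|u| for u >= 0 and (1 - q)*|u| for u < 0."""
    resid = y - X @ beta
    weights = np.where(resid >= 0, q, 1.0 - q)
    return np.sum(weights * np.abs(resid))

# Hypothetical data with one outlier; intercept-only design matrix.
y = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
X = np.ones((len(y), 1))

# Evaluate the loss at each observed value used as a candidate intercept.
losses = [quantile_objective(np.array([c]), X, y, q=0.5) for c in y]
best = y[int(np.argmin(losses))]            # candidate with the smallest loss
```

The outlier at 10 pulls the sample mean up to 4, whereas the loss at *q* = 0.5 is minimized at the median, 3, which illustrates why slope estimates at different quantiles can differ from the conditional mean estimate.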

Similar to previous studies (e.g., Hammoudeh et al. [42] and Dekker et al. [43]), we investigated the linkages between the gender equality indices and the overall market indices (separately for the cross-sectoral and financial sector indices) using a vector autoregressive model. Because of the nature of our data, we preferred a simple unrestricted *k*-dimensional VAR(*p*) model, which can be described by the following equation:

$$y\_t = A\_1 y\_{t-1} + \dots + A\_p y\_{t-p} + C x\_t + \varepsilon\_t \tag{11}$$

where *y<sub>t</sub>* = (*y*<sub>1*t*</sub>, *y*<sub>2*t*</sub>, ..., *y<sub>kt</sub>*) is a *k* × 1 vector of endogenous variables, *x<sub>t</sub>* = (*x*<sub>1*t*</sub>, *x*<sub>2*t*</sub>, ..., *x<sub>dt</sub>*) is a *d* × 1 vector of exogenous variables, *A*<sub>1</sub>, *A*<sub>2</sub>, ..., *A<sub>p</sub>* are *k* × *k* matrices of lag coefficients to be estimated, *C* is a *k* × *d* matrix of exogenous coefficients to be estimated, and ε<sub>*t*</sub> = (ε<sub>1*t*</sub>, ε<sub>2*t*</sub>, ..., ε<sub>*kt*</sub>) is a *k* × 1 white noise innovation process.
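An unrestricted VAR(*p*) can be estimated equation by equation with ordinary least squares. The sketch below is an illustration under stated assumptions: the only exogenous variable is a constant, the function name `fit_var` is hypothetical, and the data are simulated from made-up coefficient matrices rather than taken from the indices analyzed in the paper.

```python
import numpy as np

def fit_var(Y, p):
    """OLS estimates for y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} + e_t.

    Y has shape (T, k). Returns (c, [A_1, ..., A_p]).
    """
    Y = np.asarray(Y, dtype=float)
    T, k = Y.shape
    # Regressors: a constant, then lag 1, ..., lag p of all variables.
    lags = [Y[p - i:T - i] for i in range(1, p + 1)]
    Z = np.hstack([np.ones((T - p, 1))] + lags)
    B, *_ = np.linalg.lstsq(Z, Y[p:], rcond=None)      # shape (1 + k*p, k)
    c = B[0]
    A = [B[1 + i * k:1 + (i + 1) * k].T for i in range(p)]
    return c, A

# Simulate a stable bivariate VAR(2) with hypothetical coefficients.
rng = np.random.default_rng(1)
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
T = 5000
Y = np.zeros((T, 2))
for t in range(2, T):
    Y[t] = A1 @ Y[t - 1] + A2 @ Y[t - 2] + rng.normal(size=2)

c_hat, A_hat = fit_var(Y, p=2)
```

On a long simulated sample the estimated matrices in `A_hat` should be close to `A1` and `A2`, which provides a basic check that the lag structure is assembled correctly.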

The results obtained with the methods presented above are described and discussed in the following two sections of our paper.
