*Article* **Non-Parametric Statistic for Testing Cumulative Abnormal Stock Returns**

**Seppo Pynnonen**

Department of Mathematics and Statistics, University of Vaasa, P.O. Box 700, FI-65101 Vaasa, Finland; sjp@uwasa.fi; Tel.:+358-21-449-8311

**Abstract:** Due to the non-normality of stock returns, nonparametric rank tests are gaining acceptance relative to parametric tests in financial economics event studies. In rank tests, financial assets' multiple day cumulative abnormal returns (CARs) are replaced by cumulated ranks. This paper proposes modifications to the existing approaches to improve robustness to cross-sectional correlation of returns arising from event windows that overlap in calendar time. Simulations show that the proposed rank test is well specified in testing CARs and is robust to both completely and partially overlapping event windows.

**Keywords:** finance; economics; event study; clustered event days; cross-sectional correlation; cumulated ranks; rank test; standardized abnormal returns

**JEL Classification:** G14; C10; C15

### **1. Introduction**

The efficient markets hypothesis has been, and still is, a cornerstone of asset pricing theory. Empirical work in this regard is largely concerned with the adjustment of security prices to relevant information. Fama (1970, 1991) refines relevant information into three hierarchical subsets: weak form, semi-strong form, and strong form (Fama 1970) or, equivalently, return predictability, event studies, and private information (Fama 1991). Event studies investigate the effect of unexpected economic events on asset prices. Therefore, event studies can give the most direct evidence on market efficiency (cf. Fama 1991, p. 1577). For this purpose, asset price data available from financial markets can be used with appropriate statistical testing methodology, the reliability of which is central to inference. To this end, the current paper aims to fill a gap in existing non-parametric statistical testing by proposing non-parametric rank tests that are robust to cross-sectional dependence of asset returns in more general circumstances than the existing ones. For an excellent introduction to event studies and related statistical methods, see Campbell et al. (1997, chp. 4).

Regarding methodology, standardizing returns by their respective standard deviations homogenizes the data and has proven to improve testing performance. Because of this improvement, standardized return based tests by Patell (1976) and Boehmer et al. (1991) (BMP) have gained popularity over conventional non-standardized tests in testing event effects on mean security price performance. Harrington and Shrider (2007) found that in short-horizon testing of abnormal returns (i.e., systematic deviation from expected behavior), one should always use methods that are robust to cross-sectional variation in the true abnormal returns.<sup>1</sup> They found that BMP is a good candidate for robust, parametric tests in conventional event studies.<sup>2</sup>

A major problem in statistical tests of returns is that the returns are not normally distributed (Fama 1976). Not surprisingly, non-parametric rank tests introduced by Corrado (1989, 2011), Corrado and Zivney (1992), Campbell and Wasley (1993), and Kolari and Pynnonen (2011), among others, dominate parametric tests in terms of both size and power (e.g., see Campbell and Wasley 1993; Corrado 1989; Corrado and Zivney 1992; Kolari and Pynnonen 2011; Kolari and Pynnönen 2010; Luoma 2011). Furthermore, rank tests by Corrado and Zivney (1992) and Kolari and Pynnonen (2011) that utilize event period re-standardized returns have proven to be robust to event-induced volatility (Kolari and Pynnonen 2011; Kolari and Pynnönen 2010), cross-correlation due to event day clustering (Kolari and Pynnönen 2010), and autocorrelation (Kolari and Pynnonen 2011). These studies are consistent with the view stated in the epilogue of Lehmann (2006): "Rank tests apply often to relatively simple solutions, such as one-, two-, and *s*-sample problems, and testing for independence and randomness, but for these situations they are often the method of choice". (Lehmann 2006, p. v). In addition, the results of rank tests are invariant to monotone transformations of the underlying returns; that is, it does not matter whether the returns are defined as simple returns or as continuously compounded log returns.

**Citation:** Pynnonen, Seppo. 2022. Non-Parametric Statistic for Testing Cumulative Abnormal Stock Returns. *Journal of Risk and Financial Management* 15: 149. https://doi.org/10.3390/jrfm15040149

Academic Editor: Ștefan Cristian Gherghina

Received: 8 December 2021 Accepted: 15 March 2022 Published: 23 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Existing rank based tests, however, are not robust to cross-sectional correlation if the event days are partially overlapping in calendar time. This *partial clustering* occurs when events are scattered more or less randomly in calendar time within an event window rather than clustered on the same calendar day (i.e., *complete clustering* as in Kolari and Pynnönen 2010). Figure 1 illustrates the various degrees of clustering in terms of three stocks. Panel A depicts the non-clustered case, Panel B partial clustering, and Panel C complete clustering. In complete clustering the event days are the same in calendar time, while in partial clustering the event days may or may not be the same in calendar time but the event windows are more or less overlapping. In the non-clustered case the event windows are completely separate in calendar time. In this case all event effects can be investigated under the assumption of cross-sectional independence of returns. In complete clustering, cross-sectional correlation of returns must be fully accounted for. In the partial case the correlation can bias the results depending on the degree of overlap. For example, in the case of Panel B, if the interest is only in the event day effect, there is no biasing effect from the correlation because all the event days are different. On the other hand, if the cumulative return effect over the whole event window is of interest, correlation of returns on the overlapping days affects the joint behavior of the cumulative returns.

Jaffe (1974) is probably the first paper in event study testing to address the potential biasing effect of cross-sectional correlation due to clustered events. Table 2 of Kolari and Pynnönen (2010) explicitly addresses the issue by showing that even a seemingly trivial cross-sectional correlation, such as 0.05, can severely bias testing for event effects towards material over-rejection. The present paper seeks to fill this gap by accounting for cross-sectional correlation in non-parametric event study testing also with partially clustered event days.

The paper is organized as follows. Section 2 reviews some related key literature. Section 3 defines the main concepts and derives some distributional properties of rank statistics. Section 4 introduces the new transformed rank test. Section 5 reports simulation results, and Section 6 concludes.

[Figure 1 has three panels, each drawn against calendar time: Panel A, Non-clustered Event Windows; Panel B, Partial Clustering; Panel C, Complete Clustering.]

**Figure 1.** Event Windows Clustering.

### **2. Review of Related Literature**

Patell and BMP parametric tests are straightforward tests of cumulative abnormal returns (CARs) over multiple day windows. With the correction suggested by Kolari and Pynnönen (2010), these tests are useful in the case of completely clustered event days, and with the correction suggested by Kolari et al. (2018) when event days are either completely or partially clustered. By construction, the Corrado (1989) non-parametric rank test applies to testing single day event returns. Testing for CARs with the same logic requires defining multiple-day returns that match the number of days in the CARs (see Corrado 1989, p. 395; Campbell and Wasley 1993, footnote 4). In practice this approach is carried out by dividing the estimation period and event period into intervals matching the number of days in the CAR. Unfortunately, this procedure is not useful for a number of reasons. Foremost among these is that it does not necessarily lead to a unique testing procedure. In addition, the abnormal return model should be re-estimated for each multiple-day CAR definition. Furthermore, for a fixed estimation period, as the number of days accumulated in a CAR increases, the number of multiple-day estimation period observations quickly becomes impractically low, which weakens the abnormal return model estimation (cf. Kolari and Pynnönen 2010). Kolari and Pynnonen (2011) solve these issues in their generalized rank test approach.

On the other hand, Campbell and Wasley (1993) recommend using the Corrado (1989) rank test to test cumulative abnormal returns by simply accumulating the respective ranks to constitute cumulative ranks (see also Hagnäs and Pynnonen 2014). This practice is adopted in the Eventus software (Cowan 2007) and is probably, for the time being, the most popular procedure for multiple day applications of rank tests. An advantage is that this procedure implicitly accounts for cross-sectional correlation in the case of complete clustering.

In spite of these attractive properties, the cumulative ranks test does not account for cross-sectional correlation due to event windows that partially overlap in calendar time, i.e., the case of partial clustering. As noted above, even a small (positive) correlation biases the standard errors downwards, leading to over-rejection of the null hypothesis of no event effect. Contributing to the event study literature, this paper proposes an adjustment for the standard errors that corrects the bias in non-parametric testing.

### **3. Distributional Properties of Ranks**

We begin by fixing some notations and an underlying assumption to facilitate our theoretical discussion.

**Assumption 1.** *Stock returns $r\_{it}$ for firm $i$ are weak white noise continuous random variables and are cross-sectionally independent over non-overlapping calendar days, or,*

$$\begin{array}{rcl} \mathbb{E}[r\_{it}] &=& \mu\_{i} \text{ for all } t\\ \text{var}[r\_{it}] &=& \sigma\_{i}^{2} \text{ for all } t\\ \text{cov}[r\_{it}, r\_{iu}] &=& 0 \text{ for all } t \neq u\\ & r\_{it} \text{ and } \quad r\_{ju} \quad \text{are independent whenever } i \neq j \text{ and } t \neq u. \end{array} \tag{1}$$

It is a stylized fact that the variances of returns are time varying and that there is mild autocorrelation. The time varying volatility can be partially captured by GARCH modeling. Typical (covariance stationary) GARCH processes nevertheless satisfy Assumption 1, because they are weak white noise with a constant unconditional variance.

Let $\text{AR}\_{it} = r\_{it} - \mathbb{E}[r\_{it}]$ denote the abnormal return of security $i$ on day $t$, and following commonly used notation (e.g., Brown and Warner 1985, p. 6), let day $t = 0$ indicate the event day. Days from $t = T\_0 + 1$ to $t = T\_1$ represent the estimation period relative to the event day, and days from $t = T\_1 + 1$ to $t = T\_2$ represent the event window. The cumulative abnormal return (CAR) from $\tau\_1$ to $\tau\_2$, with $T\_1 < \tau\_1 \le \tau\_2 \le T\_2$, is defined as

$$\text{CAR}\_i(\tau\_1, \tau\_2) = \sum\_{t=\tau\_1}^{\tau\_2} \text{AR}\_{it}.\tag{2}$$

The time period from $\tau\_1$ to $\tau\_2$ is referred to below as the CAR window or CAR period.

Standardized abnormal returns are defined as

$$\text{SAR}\_{it} = \frac{\text{AR}\_{it}}{\text{S}\_{\text{AR}\_i}},\tag{3}$$

where

$$\mathcal{S}\_{\text{AR}\_i} = \sqrt{\frac{1}{T\_1 - T\_0 - 1} \sum\_{t=T\_0+1}^{T\_1} \text{AR}\_{it}^2}. \tag{4}$$

Furthermore, to account for possible event-induced volatility, the re-standardized abnormal returns are defined as in Boehmer et al. (1991) (see also Corrado and Zivney 1992), or

$$\text{SAR}'\_{it} = \begin{cases} \text{SAR}\_{it} / S\_{\text{SAR}\_t}, & T\_1 < t \le T\_2\\ \text{SAR}\_{it}, & \text{otherwise,} \end{cases} \tag{5}$$

where

$$S\_{\text{SAR}\_{t}} = \sqrt{\frac{1}{n-1} \sum\_{i=1}^{n} (\text{SAR}\_{it} - \overline{\text{SAR}\_{t}})^2} \tag{6}$$

is the time $t$ cross-sectional standard deviation of $\text{SAR}\_{it}$, $\overline{\text{SAR}}\_t = \frac{1}{n}\sum\_{i=1}^{n}\text{SAR}\_{it}$, and $n$ is the number of stocks in the portfolio. In addition, let $K\_{it}$ denote the rank numbers of abnormal returns, where $K\_{it} \in \{1, \ldots, T\}$, $t = T\_0 + 1, \ldots, T\_2$, $T = T\_2 - T\_0$, and $i = 1, \ldots, n$.
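Equations (3) to (6) can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions that are not in the paper: all stocks are aligned in event time in one array `ar` of shape `(n, T)`, the first `n_est` columns are the estimation period, and the function name `standardized_abnormal_returns` is hypothetical.

```python
import numpy as np

def standardized_abnormal_returns(ar, n_est):
    """Return (SAR, re-standardized SAR) for an (n, T) array of abnormal returns.

    Assumption: columns 0..n_est-1 are the estimation period and the
    remaining columns are the event window, for every stock.
    """
    ar = np.asarray(ar, dtype=float)
    est = ar[:, :n_est]
    # Eq. (4): estimation-period standard deviation with divisor T1 - T0 - 1
    s_ar = np.sqrt((est ** 2).sum(axis=1) / (n_est - 1))
    sar = ar / s_ar[:, None]                       # Eq. (3)
    # Eq. (6): cross-sectional standard deviation on each event-window day
    s_cs = sar[:, n_est:].std(axis=0, ddof=1)
    sar_re = sar.copy()
    sar_re[:, n_est:] = sar[:, n_est:] / s_cs      # Eq. (5)
    return sar, sar_re
```

After re-standardization, each event-window day has unit cross-sectional standard deviation, which is what removes event-induced volatility before ranking.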

If the available observations in the estimation period vary from one series to another, it is convenient to use *standardized ranks* with zero mean and unit variance. To do this, we compile the known results of rank statistics (e.g., Lehmann 2006, Appendix, Section 1) as described below.

**Result 1.** *Let $K\_{it}$ denote the rank numbers as defined above, then*

$$\mathbb{E}[K\_{it}] = (T+1)/2 \tag{7}$$

$$\text{var}[K\_{it}] = (T^2 - 1) / 12 \tag{8}$$

$$\text{cov}[K\_{is}, K\_{it}] = -(T+1)/12, \quad (s \neq t). \tag{9}$$

**Definition 1.** *Standardized ranks are defined as*

$$
\mathcal{U}\_{it} = \frac{K\_{it} - \frac{1}{2}(T+1)}{\sqrt{(T^2 - 1)/12}}\tag{10}
$$

(cf. Hagnäs and Pynnonen 2014). By Result 1, we obtain:

**Result 2.**

$$\mathbb{E}[\mathcal{U}\_{it}] = 0 \tag{11}$$

$$\text{var}[\mathcal{U}\_{it}] = 1 \tag{12}$$

$$\text{cov}[\mathcal{U}\_{is}, \mathcal{U}\_{it}] = -1/(T-1), \quad (s \neq t). \tag{13}$$
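As a small illustration of Definition 1, the sketch below ranks each row of an array and standardizes the ranks as in Equation (10). It assumes there are no ties and that every series has the same length `T`; the function name `standardized_ranks` is hypothetical.

```python
import numpy as np

def standardized_ranks(x):
    """Rank the values along the last axis (1..T) and standardize per Eq. (10).

    Assumption: no ties, so the double argsort yields the ranks 1..T.
    """
    x = np.asarray(x, dtype=float)
    T = x.shape[-1]
    k = x.argsort(axis=-1).argsort(axis=-1) + 1     # rank numbers K_it
    return (k - (T + 1) / 2) / np.sqrt((T ** 2 - 1) / 12)
```

Because each row of ranks is a permutation of 1, ..., T, the standardized ranks of a row average exactly zero and their squares average exactly one, in line with Equations (11) and (12).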

Next, we define *cumulative standardized ranks* for individual stocks.

**Definition 2.** *The cumulative standardized ranks of a stock $i$ over the event days window from $\tau\_1$ to $\tau\_2$ are defined as*

$$\mathcal{U}\_i(\tau\_1, \tau\_2) = \sum\_{t=\tau\_1}^{\tau\_2} \mathcal{U}\_{it}, \tag{14}$$

*where T*<sup>1</sup> < *τ*<sup>1</sup> ≤ *τ*<sup>2</sup> ≤ *T*2*.*

From Result 2 and the variance-of-the-sum formula, $\text{var}[\mathcal{U}\_i(\tau\_1, \tau\_2)] = \sum\_{t=\tau\_1}^{\tau\_2} \text{var}[\mathcal{U}\_{it}] + \sum\_{s \neq t} \text{cov}[\mathcal{U}\_{is}, \mathcal{U}\_{it}]$, we obtain:

**Result 3.**

$$\mathbb{E}[\mathcal{U}\_i(\tau\_1, \tau\_2)] \quad = \quad 0 \tag{15}$$

$$\text{var}[\mathcal{U}\_i(\tau\_1, \tau\_2)] \quad = \quad \frac{\tau(T-\tau)}{T-1},\tag{16}$$

*where i* = 1, . . . , *n, T*<sup>1</sup> < *τ*<sup>1</sup> ≤ *τ*<sup>2</sup> ≤ *T*2*, and τ* = *τ*<sup>2</sup> − *τ*<sup>1</sup> + 1*.*
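Equation (16) can be checked by brute force for small $T$. The sketch below assumes that, under the null, every ordering of the ranks is equally likely, and enumerates all permutations; the function names are hypothetical and the enumeration is feasible only for very short series.

```python
import numpy as np
from itertools import permutations

def cum_rank_variance(T, tau):
    """Closed form of Eq. (16)."""
    return tau * (T - tau) / (T - 1)

def enumerated_variance(T, tau):
    """Variance of the sum of the first tau standardized ranks,
    averaging over all T! equally likely rank permutations."""
    scale = np.sqrt((T ** 2 - 1) / 12)
    sums = [sum((k - (T + 1) / 2) / scale for k in perm[:tau])
            for perm in permutations(range(1, T + 1))]
    # The mean of the sums is zero, so the mean square is the variance.
    return float(np.mean(np.square(sums)))
```

For example, with $T = 5$ and $\tau = 2$ both functions give $2 \cdot 3 / 4 = 1.5$, which is below $\tau = 2$ because of the negative covariance in Equation (13).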

Rather than investigating individual (cumulative) returns, the practice in event studies is to aggregate individual returns into equally-weighted portfolios such that:

**Definition 3.** *The average cumulative standardized ranks are defined as the equally weighted portfolio of individual cumulative standardized ranks defined in* (14)*, i.e.,*

$$\bar{\mathcal{U}}(\tau\_1, \tau\_2) = \frac{1}{n} \sum\_{i=1}^{n} \mathcal{U}\_i(\tau\_1, \tau\_2) = \sum\_{t=\tau\_1}^{\tau\_2} \bar{\mathcal{U}}\_{t}, \tag{17}$$

*where $T\_1 < \tau\_1 \le \tau\_2 \le T\_2$ and*

$$\bar{\mathcal{U}}\_{t} = \frac{1}{n} \sum\_{i=1}^{n} \mathcal{U}\_{it} \tag{18}$$

*is the time $t$ average of standardized ranks.*

The expected value of $\bar{\mathcal{U}}(\tau\_1, \tau\_2)$ is the same as that of the cumulative ranks of individual securities, or

$$\mathbb{E}[\bar{\mathcal{U}}(\tau\_1, \tau\_2)] = \frac{1}{n} \sum\_{i=1}^n \mathbb{E}[\mathcal{U}\_i(\tau\_1, \tau\_2)] = 0.$$

If the event days are not clustered, the cross-sectional correlations of the return series are zero (or at least negligible). Under cross-sectional independence and by Equation (16), the variance of $\bar{\mathcal{U}}(\tau\_1, \tau\_2)$ is

$$
\sigma\_{\tau}^{2} = \text{var}[\bar{\mathcal{U}}(\tau\_{1}, \tau\_{2})] = \frac{\tau(T-\tau)}{(T-1)n}.\tag{19}
$$

Then, by the central limit theorem,

$$Z = \left(\frac{(T-1)n}{\tau(T-\tau)}\right)^{\frac{1}{2}} \bar{\mathcal{U}}(\tau\_1, \tau\_2) \to N(0, 1) \text{ in distribution as } n \to \infty. \tag{20}$$

The situation is more complicated if the event days are partially overlapping in calendar time, which implies cross-sectional correlation. Recalling that the variances of $\mathcal{U}\_i(\tau\_1, \tau\_2)$ given in Equation (16) are constants (independent of $i$), we can write the cross-sectional covariance of $\mathcal{U}\_i(\tau\_1, \tau\_2)$ and $\mathcal{U}\_j(\tau\_1, \tau\_2)$ as

$$\text{cov}\left[\mathcal{U}\_{i}(\tau\_{1},\tau\_{2}),\mathcal{U}\_{j}(\tau\_{1},\tau\_{2})\right] = \frac{\tau(T-\tau)}{T-1}\rho\_{ij}(\tau\_{1},\tau\_{2}),\tag{21}$$

where $\rho\_{ij}(\tau\_1, \tau\_2)$ is the cross-sectional correlation of $\mathcal{U}\_i(\tau\_1, \tau\_2)$ and $\mathcal{U}\_j(\tau\_1, \tau\_2)$, $i, j = 1, \ldots, n$. Utilizing this result and the variance-of-the-sum formula, the variance of $\bar{\mathcal{U}}(\tau\_1, \tau\_2)$ in (17) becomes:

**Result 4.**

$$\text{var}[\bar{\mathcal{U}}(\tau\_1, \tau\_2)] \quad = \quad \frac{1}{n^2} \sum\_{i=1}^n \text{var}[\mathcal{U}\_i(\tau\_1, \tau\_2)] + \frac{1}{n^2} \sum\_{i=1}^n \sum\_{j \neq i}^n \text{cov}\left[\mathcal{U}\_i(\tau\_1, \tau\_2), \mathcal{U}\_j(\tau\_1, \tau\_2)\right]$$

$$= \quad \frac{\tau(T-\tau)}{(T-1)n} (1 + (n-1)\bar{\rho}\_n(\tau\_1, \tau\_2)), \tag{22}$$

*where*

$$\bar{\rho}\_n(\tau\_1, \tau\_2) = \frac{1}{n(n-1)} \sum\_{i=1}^n \sum\_{\substack{j=1 \\ j \neq i}}^n \rho\_{ij}(\tau\_1, \tau\_2) \tag{23}$$

*is the average cross-sectional correlation of cumulated ranks.*
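To get a feel for the size of the inflation factor in Equation (22), the following sketch evaluates it for a given correlation matrix of cumulated ranks. The function name `var_of_average` and the equicorrelated example are illustrative assumptions only.

```python
import numpy as np

def var_of_average(var_u, corr):
    """Variance of the equally weighted average per Eq. (22).

    var_u : common variance of the individual cumulated ranks, Eq. (16)
    corr  : (n, n) correlation matrix of the cumulated ranks
    Returns (variance of the average, average correlation of Eq. (23)).
    """
    corr = np.asarray(corr, dtype=float)
    n = corr.shape[0]
    rho_bar = (corr.sum() - np.trace(corr)) / (n * (n - 1))   # Eq. (23)
    return var_u / n * (1 + (n - 1) * rho_bar), rho_bar
```

With $n = 50$ and a uniform correlation of 0.05, the factor $1 + (n-1)\bar{\rho}\_n$ equals 3.45, so a test that ignores the correlation would understate the standard error by a factor of about $\sqrt{3.45} \approx 1.86$.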

Cross-sectional dependence affects the asymptotic distribution of the statistic in Equation (20). However, as discussed in (Lehmann 1999, Section 2.8), it is frequently true that asymptotic normality holds provided that the average correlation, $\bar{\rho}\_n(\tau\_1, \tau\_2)$, tends to zero rapidly enough such that

$$\frac{1}{n}\sum\_{i\neq j}^{n}\rho\_{ij}(\tau\_1,\tau\_2) = (n-1)\bar{\rho}\_n(\tau\_1,\tau\_2) \to \gamma \text{ as } n \to \infty,\tag{24}$$

where *γ* is some finite constant. Under this condition the limiting distribution of *Z*-statistic in (20) becomes *N*(0, 1 + *γ*).

From a practical point of view, the crucial result of Formula (22) is that the only unknown parameter to be estimated is the average cross-sectional correlation $\bar{\rho}\_n(\tau\_1, \tau\_2)$. Hagnäs and Pynnonen (2014) discuss approaches to account implicitly for this average correlation in cumulated ranks tests when all events share the same calendar day, i.e., the case of complete clustering. These implicit approaches, however, do not work in the case of partial clustering. Therefore, by utilizing the procedure developed in Kolari et al. (2018), this paper proposes a method to estimate the cross-sectional correlation, $\bar{\rho}\_n(\tau\_1, \tau\_2)$, explicitly and thereby solve the cross-sectional correlation problem in the case of partial clustering.

### **4. Correlation Robust Test for Cumulated Ranks**

Following Kolari et al. (2018), let $\tau\_{ij}$, $0 \le \tau\_{ij} \le \tau$, denote the number of calendar days stocks $i$ and $j$ share in common within the event windows. By independence in Assumption 1, the correlation, $\text{cor}[\mathcal{U}\_{iu}, \mathcal{U}\_{jv}]$, of the standardized ranks $\mathcal{U}\_{iu}$ and $\mathcal{U}\_{jv}$ is zero whenever the underlying calendar days of the relative event days, $u$ and $v$, differ and can be non-zero when the calendar days are the same. Denoting these non-zero correlations (which are also covariances) by $\rho\_{ij}$, we get

$$\text{cov}\left[\mathcal{U}\_{i}(\tau\_{1},\tau\_{2}),\mathcal{U}\_{j}(\tau\_{1},\tau\_{2})\right] = \sum\_{u=\tau\_{1}}^{\tau\_{2}} \sum\_{v=\tau\_{1}}^{\tau\_{2}} \text{cor}\left[\mathcal{U}\_{iu}, \mathcal{U}\_{jv}\right] = \tau\_{ij}\rho\_{ij}.$$

Combining this with (21), we obtain

$$
\rho\_{ij}(\tau\_1, \tau\_2) = \left(\frac{T-1}{T-\tau}\right) \frac{\tau\_{ij}}{\tau} \rho\_{ij}.\tag{25}
$$

We can assume that the overlapping window lengths, $\tau\_{ij}$, and the cross-sectional correlations, $\rho\_{ij}$, are not dependent on each other so that $\sum\_{i \neq j} \tau\_{ij}\rho\_{ij} = n(n-1)\bar{\tau}\bar{\rho}$, where $\bar{\tau}$ is the average number of overlapping calendar days, and $\bar{\rho}$ is the average cross-sectional correlation of $\mathcal{U}\_i$ and $\mathcal{U}\_j$.<sup>3</sup> Consequently, we can rewrite (22) as

$$\text{var}[\bar{\mathcal{U}}(\tau\_1, \tau\_2)] = \frac{\tau(T-\tau)}{(T-1)n} (1 + (n-1)\delta\bar{\rho}),\tag{26}$$

where $\delta = \bar{\tau}(T-1)/(\tau(T-\tau))$ adjusts the average correlation by the fraction of overlapping calendar days within the event window.

It is notable that, even though the average cross-sectional correlation, $\bar{\rho}$, in Equation (26) is based on $n(n-1)/2$ correlations, it can be computed without estimating individual correlations by utilizing the method introduced by Edgerton and Toops (1928). Instead of $n(n-1)/2$ individual correlations, one needs to compute only $n + 1$ variances, which is a computational problem of order $n$ vis-à-vis order $n^2$ when averaging the correlations. To illustrate the idea, consider $n$ random variables $x\_j$, $j = 1, \ldots, n$, and define the standardized variables $z\_j = x\_j/\sigma\_j$. Next, let $\bar{z} = \sum\_j z\_j/n$ denote the average of the variables. Because $\text{var}[z\_j] = 1$ and $\text{cov}[z\_j, z\_k] = \text{cor}[z\_j, z\_k] = \rho\_{jk}$, the variance of $\bar{z}$ becomes $\sigma\_{\bar{z}}^2 = \text{var}[\bar{z}] = (1 + (n-1)\bar{\rho})/n$. Solving for $\bar{\rho}$, we obtain

$$
\bar{\rho} = (n\sigma\_{\bar{z}}^2 - 1)/(n-1). \tag{27}
$$

Hence, to estimate the average cross-sectional correlation, all we need are estimates of the $n$ standard deviations of the $x$-variables and the variance of $\bar{z}$. Finally, for large $n$, Equation (27) shows that $\bar{\rho} \approx \sigma\_{\bar{z}}^2$.
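The shortcut in Equation (27) is easy to try out numerically. The sketch below assumes a complete data matrix `x` with observations in rows and variables in columns; the function name `average_correlation` is hypothetical. With sample standard deviations and a sample variance that use the same divisor, the result coincides with the average of the off-diagonal sample correlations.

```python
import numpy as np

def average_correlation(x):
    """Average pairwise correlation of the columns of x via Eq. (27).

    Only n standard deviations and one variance are computed; no
    individual correlation is estimated.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[1]
    z = x / x.std(axis=0, ddof=1)           # standardized variables z_j
    var_zbar = z.mean(axis=1).var(ddof=1)   # variance of the average z-bar
    return (n * var_zbar - 1) / (n - 1)
```

A quick check is to compare the output with the mean of the off-diagonal entries of `np.corrcoef(x, rowvar=False)`, which should agree up to floating point error.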

Because in our case the calendar days of different stocks are only partially overlapping, we estimate the variance of the average utilizing the clustering robust estimation technique (e.g., see Cameron et al. 2011) suggested in Kolari et al. (2018).

Following Kolari et al. (2018), denote the calendar days of the returns in the combined estimation and event window as $t = 1, \ldots, L$, which implies that $L$ becomes the number of clusters, equaling the number of separate calendar days on which returns are available in the combined estimation and event windows. Let $n\_t$ denote the number of stocks having returns on calendar day $t$ and define

$$
\mathcal{U}\_t = \sum\_{k=1}^{n\_t} \mathcal{U}\_{kt}.\tag{28}
$$

Then

$$
\mathcal{U}\_t^2 = \sum\_{k=1}^{n\_t} \mathcal{U}\_{kt}^2 + \sum\_{i \neq j}^{n\_t} \mathcal{U}\_{it} \mathcal{U}\_{jt}, \tag{29}
$$

so that

$$\sum\_{i \neq j}^{n\_t} \mathcal{U}\_{it} \mathcal{U}\_{jt} = \mathcal{U}\_t^2 - \sum\_{k=1}^{n\_t} \mathcal{U}\_{kt}^2. \tag{30}$$

Summing these up over the calendar days in the combined estimation and event window, we have

$$\sum\_{t=1}^{L} \sum\_{i \neq j}^{n\_t} \mathcal{U}\_{it} \mathcal{U}\_{jt} = \sum\_{t=1}^{L} \mathcal{U}\_t^2 - \sum\_{t=1}^{L} \sum\_{k=1}^{n\_t} \mathcal{U}\_{kt}^2. \tag{31}$$

Taking the average, we get an estimator for the average correlation

$$\hat{\bar{\rho}} = \frac{1}{M} \sum\_{t=1}^{L} \sum\_{i \neq j}^{n\_t} \mathcal{U}\_{it} \mathcal{U}\_{jt}, \tag{32}$$

where

$$M = \sum\_{t=1}^{L} n\_t (n\_t - 1) \tag{33}$$

is the number of the cross-product terms. It is notable that days for which only one return is available drop out automatically (if $n\_t = 1$ for all $t$, then $\hat{\bar{\rho}} = 0$). The potentially tedious computation over all cross-products can be materially simplified by utilizing the right-hand side of Equation (31). By Result 2 the variances of standardized ranks are all equal to one and means equal zero. Therefore, arranging the terms of the rightmost sum of Equation (31) to correspond to variance representations, the (double) sum becomes equal to $\sum\_{t=1}^{L} n\_t$, i.e., the total number of observations.<sup>4</sup> Thus, the only component we need to compute is the first sum of squares on the right-hand side of (31). Therefore, similar to the illustration of computing the average correlation above, the computational effort of computing the average correlation is again of order $n$ (rather than $n^2$). Finally, we get:

**Result 5.** *A computationally efficient form of the average correlation in* (32) *is*

$$
\hat{\bar{\rho}} = \frac{N}{M} (s\_{\mathcal{U}}^2 - 1), \tag{34}
$$

*where $N = \sum\_{t=1}^{L} n\_t$ is the total number of returns, $M$ is given by* (33)*, and*

$$s\_{\mathcal{U}}^2 = \frac{1}{N} \sum\_{t=1}^{L} \mathcal{U}\_t^2 \tag{35}$$

*with $\mathcal{U}\_t$ given in Equation* (28)*. The variance, $s\_{\mathcal{U}}^2$, is a clustering robust variance estimator of standardized ranks in the presence of intra-cluster correlation (cf., e.g., Cameron et al. 2011).*

As noted earlier, $\hat{\bar{\rho}} = 0$ if all $n\_t = 1$.
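Result 5 translates into a short grouped computation. The sketch below is one possible layout, not the paper's implementation: `u` holds the standardized ranks of all stocks and `cal` holds, in the same shape, an integer label of the calendar day of each observation. Both names and the function name `avg_corr_clustered` are assumptions made for illustration.

```python
import numpy as np

def avg_corr_clustered(u, cal):
    """Average cross-sectional correlation via Eq. (34).

    u   : standardized ranks, any shape
    cal : calendar-day label of each entry of u (same shape)
    """
    u = np.asarray(u, dtype=float).ravel()
    cal = np.asarray(cal).ravel()
    # Map calendar days to cluster indices 0..L-1
    _, day = np.unique(cal, return_inverse=True)
    n_t = np.bincount(day)                    # stocks per calendar day
    u_t = np.bincount(day, weights=u)         # Eq. (28), cluster sums
    N = n_t.sum()
    M = (n_t * (n_t - 1)).sum()               # Eq. (33)
    if M == 0:                                # no day shared by two stocks
        return 0.0
    s2_u = (u_t ** 2).sum() / N               # Eq. (35)
    return N / M * (s2_u - 1)                 # Eq. (34)
```

When every stock contributes its full set of standardized ranks, the squared ranks of each stock sum exactly to its number of observations, so this shortcut returns the same value as averaging all cross-products in Equation (32).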

Given the estimator, $\hat{\bar{\rho}}$, of the average cross-sectional correlation, we can define an appropriate cross-sectional correlation robust test for the null hypothesis of zero cumulative abnormal returns

$$H\_0: \mu(\tau\_1, \tau\_2) = \mathbb{E}[\text{CAR}(\tau\_1, \tau\_2)] = 0. \tag{36}$$

The test can be defined in terms of the cumulated ranks using the *z*-ratio

$$z\_{\tau} = \frac{\bar{\mathcal{U}}(\tau\_1, \tau\_2)}{\sigma\_{\tau}\sqrt{1 + (n - 1)\delta\hat{\bar{\rho}}}},\tag{37}$$

where *στ* is the square root of Equation (19), i.e., the variance

$$\sigma\_{\tau}^{2} = \frac{\tau(T-\tau)}{(T-1)n}$$

of $\bar{\mathcal{U}}(\tau\_1, \tau\_2)$ for completely non-overlapping event windows in calendar time [i.e., when $\bar{\rho} = 0$ in Equation (26)], and $\tau = \tau\_2 - \tau\_1 + 1$ is the length of the window of cumulated abnormal returns.

In event studies, the combined length, *T*, of the estimation and event period remains fixed, while the number of event firms, *n*, defines the sample size, thereby being the dimension increased when dealing with the asymptotic distribution of associated test statistics.

Given that the condition in Equation (24) holds for $\hat{\bar{\rho}}$, the null distribution of $z\_{\tau}$ is asymptotically normal with zero mean and unit variance.
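Putting Equations (17), (19), (26), and (37) together, the statistic could be assembled as in the following sketch. The layout is an assumption: `u` and `cal` are `(n, T)` arrays of standardized ranks and calendar-day labels in event time, `first` and `last` are 0-based column indices of the CAR window, and `rho_hat` is an estimate of the average correlation obtained separately. The function name `z_tau` is hypothetical, and the sketch requires $n \ge 2$ and $\tau < T$.

```python
import numpy as np

def z_tau(u, cal, first, last, rho_hat):
    """Correlation robust cumulated ranks statistic of Eq. (37)."""
    u = np.asarray(u, dtype=float)
    cal = np.asarray(cal)
    n, T = u.shape
    tau = last - first + 1
    # Eq. (17): average over stocks of the cumulated standardized ranks
    u_bar = u[:, first:last + 1].sum(axis=1).mean()
    # m[d] = number of stocks whose CAR window contains calendar day d;
    # summing m(m-1) counts shared days over all ordered pairs i != j.
    _, m = np.unique(cal[:, first:last + 1], return_counts=True)
    tau_bar = (m * (m - 1)).sum() / (n * (n - 1))
    delta = tau_bar * (T - 1) / (tau * (T - tau))
    sigma2 = tau * (T - tau) / ((T - 1) * n)          # Eq. (19)
    return u_bar / np.sqrt(sigma2 * (1 + (n - 1) * delta * rho_hat))
```

If no two CAR windows share a calendar day, `tau_bar` is zero and the statistic reduces to $Z$ in Equation (20); with complete clustering `tau_bar` equals $\tau$. A strongly negative `rho_hat` could make the term under the square root negative, which this sketch does not guard against.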

Kolari and Pynnonen (2011) propose replacing the cumulative ranks in Definition 2 by a single rank number which is based on standardized cumulative abnormal returns (SCARs)

$$\text{SCAR}\_i(\tau\_1, \tau\_2) = \frac{\text{CAR}\_i(\tau\_1, \tau\_2)}{S\_{\text{CAR}\_i(\tau\_1, \tau\_2)}} \tag{38}$$

in which $S\_{\text{CAR}\_i(\tau\_1, \tau\_2)}$ is the standard deviation of $\text{CAR}\_i(\tau\_1, \tau\_2)$ (for details, see Kolari and Pynnonen 2011). Their approach again accounts implicitly for cross-sectional correlation due to completely overlapping event days. Here, we extend the approach to cover the partial overlapping case. Rather than using the scaled ranks defined in Kolari and Pynnonen (2011), we use the standardized ranks of Definition 1. Subsequently, denoting the standardized rank of $\text{SCAR}\_i(\tau\_1, \tau\_2)$ by $\mathcal{U}\_{i0}$, we can base the rank test for testing the null hypothesis of zero cumulative abnormal returns in Equation (36) on the average ranks

$$
\bar{\mathcal{U}}\_0 = \frac{1}{n} \sum\_{i=1}^n \mathcal{U}\_{i0}.\tag{39}
$$

If the event periods are completely non-overlapping, the $\mathcal{U}\_{i0}$ are independent with zero mean and unit variance (see Definition 1), in which case the null distribution of $\bar{\mathcal{U}}\_0$ has zero mean and variance $1/n$. However, if the event days are partially overlapping, the components of $\bar{\mathcal{U}}\_0$ absorb the cross-sectional correlation over the CAR window. The correlation that inflates the variance is inherited from the cross-sectional correlations of the $\text{SCAR}\_i$. Kolari et al. (2018) show that the variance inflation factor is of the form $(1 + (n-1)\nu\bar{\rho})$ as in Equation (26) with the exception that $\delta$ is replaced by $\nu = \bar{\tau}/\tau$, the ratio of the average number of overlapping calendar days within the CAR window to the window length. With this correction the variance of $\bar{\mathcal{U}}\_0$ becomes $\text{var}[\bar{\mathcal{U}}\_0] = (1 + (n-1)\nu\bar{\rho})/n$. We can estimate the average cross-sectional correlation, $\bar{\rho}$, as in Equation (32) utilizing only the estimation period in computing $s\_{\mathcal{U}}^2$. For this approach, the standardized ranks in Definition 1 are redefined for the estimation period abnormal returns. Alternatively, one can estimate the cross-sectional correlation exactly as in Result 5. Both approaches will produce essentially the same result in most cases. With the estimated average correlation, we get a cross-sectional correlation robust generalized rank test statistic

$$z\_{\tau,\text{grank}} = \frac{\sqrt{n}\,\bar{\mathcal{U}}\_0}{\sqrt{1 + (n-1)\nu\hat{\bar{\rho}}}},\tag{40}$$

where $\nu = \bar{\tau}/\tau$. Again, given that the condition in Equation (24) holds for $\hat{\bar{\rho}}$, the null distribution of $z\_{\tau,\text{grank}}$ is asymptotically normal with zero mean and unit variance.
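Once the standardized ranks of the SCARs are available, Equation (40) itself is a one-line computation. The sketch below takes those ranks as given in `u0` (how they are obtained is described in Kolari and Pynnonen 2011 and is not reproduced here) together with the average overlap `tau_bar`, the window length `tau`, and an estimate `rho_hat`; all names, including `z_grank`, are hypothetical.

```python
import numpy as np

def z_grank(u0, tau_bar, tau, rho_hat):
    """Correlation robust generalized rank statistic of Eq. (40).

    u0 : standardized ranks U_i0 of the SCARs, one per stock (assumed given)
    """
    u0 = np.asarray(u0, dtype=float)
    n = u0.size
    nu = tau_bar / tau                      # share of overlapping days
    return np.sqrt(n) * u0.mean() / np.sqrt(1 + (n - 1) * nu * rho_hat)
```

With `tau_bar = 0` (no overlap) the denominator is one and the statistic is simply $\sqrt{n}\,\bar{\mathcal{U}}\_0$.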

### **5. Simulation Results**

We generate artificial returns utilizing the Fama and French (2015) five-factor model (FF5),

$$(r\_{i} - r\_f)\_{t} = \alpha\_{i} + \beta\_{i,\text{mkt}} (r\_m - r\_f)\_{t} + \beta\_{i,\text{smb}} \text{SMB}\_{t} + \beta\_{i,\text{hml}} \text{HML}\_{t} + \beta\_{i,\text{rmw}} \text{RMW}\_{t} + \beta\_{i,\text{cma}} \text{CMA}\_{t} + \varepsilon\_{it}, \tag{41}$$

where $r\_m - r\_f$ is the market excess return over the risk-free rate $r\_f$, and SMB, HML, RMW, and CMA are the common market factors proposed by Fama and French. We utilize daily data from 2 January 1990 through 30 October 2020 (7770 daily returns) to generate 20,000 initial daily return series for this sample period. The regression coefficients for each stock are generated from a multivariate normal distribution with mean vector $(0, 1, 0.5, 0.5, 0.5, 0.5)$ and covariance matrix $\sigma\_i^2 (X'X)^{-1}$, in which $\sigma\_i^2$ is the variance of the error term $\varepsilon\_{it}$. The stock specific $\sigma\_i^2$ values are generated by drawing the standard deviations, $\sigma\_i$, independently from a uniform distribution $U(1, 3)$. This corresponds to a range of annual volatilities roughly from 10 percent to 48 percent. The $(X'X)$ matrix is the cross-product matrix of the Fama-French 5-factor regression model.<sup>5</sup> The (7770) error terms $\varepsilon\_{it}$ for stock $i$ are generated independently from the normal distribution $N(0, \sigma\_i^2)$.
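The return generating step can be sketched as follows. This is an illustration only: the paper uses the actual daily Fama-French factor series, whereas the sketch accepts any `(n_days, 5)` factor array, and it assumes that $X$ consists of a constant column followed by the five factors. The function name `simulate_excess_returns` is hypothetical.

```python
import numpy as np

def simulate_excess_returns(factors, n_stocks, rng):
    """Simulate excess returns from a five-factor model as in Eq. (41).

    factors : (n_days, 5) array of factor returns (supplied by the user)
    Returns (returns array of shape (n_stocks, n_days), sigma per stock).
    """
    F = np.asarray(factors, dtype=float)
    X = np.column_stack([np.ones(len(F)), F])    # assumed: intercept + factors
    xtx_inv = np.linalg.inv(X.T @ X)
    mean = np.array([0.0, 1.0, 0.5, 0.5, 0.5, 0.5])
    sigma = rng.uniform(1.0, 3.0, size=n_stocks)  # error standard deviations
    out = np.empty((n_stocks, len(F)))
    for i, s in enumerate(sigma):
        # Coefficients drawn around the mean vector with covariance s^2 (X'X)^-1
        coef = rng.multivariate_normal(mean, s ** 2 * xtx_inv)
        out[i] = X @ coef + rng.normal(0.0, s, size=len(F))
    return out, sigma
```

A call such as `simulate_excess_returns(factors, 20000, np.random.default_rng(0))` would mirror the scale of the experiment, given a real factor matrix.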

In the simulations we define the abnormal returns with respect to the market model as

$$\text{AR}\_{it} = (r\_i - r\_f)\_t - (\hat{\alpha}\_i + \hat{\beta}\_i (r\_m - r\_f)\_t), \tag{42}$$

where $\hat{\alpha}\_i$ and $\hat{\beta}\_i$ are ordinary least squares (OLS) estimates. Therefore, the missing common factors introduce cross-sectional correlation between the abnormal returns. The event period is ±10 trading days around the event day $t = 0$, and the estimation period consists of the 250 days prior to the event period, i.e., relative days −260, . . . , −11.
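A minimal sketch of Equation (42) for a single stock is given below. It assumes the series are ordered in event time with the first `n_est` observations forming the estimation period (250 in the design above, followed by the 21 event-window days); the function name `market_model_ar` is hypothetical.

```python
import numpy as np

def market_model_ar(excess_ret, excess_mkt, n_est):
    """Market-model abnormal returns per Eq. (42).

    alpha and beta are OLS estimates from the first n_est observations;
    abnormal returns are returned for the whole series.
    """
    y = np.asarray(excess_ret, dtype=float)
    x = np.asarray(excess_mkt, dtype=float)
    X = np.column_stack([np.ones(n_est), x[:n_est]])
    coef, *_ = np.linalg.lstsq(X, y[:n_est], rcond=None)
    alpha, beta = coef
    return y - (alpha + beta * x)
```

Because the regression includes an intercept, the estimation-period abnormal returns sum to zero by construction, while the event-window values are out-of-sample prediction errors.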

In the forthcoming experiments we focus on the effect of cross-sectional correlation on the size of the test. Other issues, such as event-induced volatility, are well documented, for example, by Kolari and Pynnonen (2011) and Kolari and Pynnönen (2010). Utilizing the base design initiated by Brown and Warner (1985), we generate 1000 samples of 50 randomly selected stocks (the returns of which are generated by the FF5 model in Equation (41)) under four event day overlap scenarios. In the first case of non-overlapping event days, the returns are cross-sectionally independent. In the second case of completely overlapping events, all firms share the same event day (calendar time), and in the third and fourth scenarios the event days are randomly scattered across 5 and 10 consecutive calendar days, i.e., one and two weeks of trading days, respectively.

We report two-tailed rejection rates for the null hypothesis of no event effect for the event day itself and for event windows of ±1, ±2, ±5, and ±10 days around the event day, i.e., window lengths *τ* = 1, 3, 5, 11, and 21 days. In addition to the statistic *z<sub>τ</sub>* in Equation (37), we report results for the more traditional rank-based test proposed by Campbell and Wasley (1993, p. 85):

$$z\_{\rm cw} = \frac{\sum\_{t=\tau\_1}^{\tau\_2} \bar{k}\_t}{\sqrt{\tau} s\_k},\tag{43}$$

where

$$\bar{k}\_t = \frac{1}{n} \sum\_{i=1}^{n} (\mathcal{K}\_{it} - \mathbb{E}[\mathcal{K}\_{it}]) \tag{44}$$

with E[*K<sub>it</sub>*] = (*T* + 1)/2 and

$$s\_k^2 = \frac{1}{T} \sum\_{t=T\_0+1}^{T\_2} \bar{k}\_t^2. \tag{45}$$
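Equations (43)–(45) can be sketched in Python as follows. The helper names are hypothetical, and two assumptions are made for illustration: each abnormal return series covers the combined estimation and event period of *T* days with the event day located `half_window` days before the end, and there are no tied returns, so that ranks run from 1 to *T*.

```python
import math

def ranks(values):
    """Ranks 1..T of a sequence, assuming no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out

def campbell_wasley_z(abnormal_returns, tau1, tau2, half_window=10):
    """Cumulated rank statistic over event days tau1..tau2 (relative to day 0).

    abnormal_returns: list of n series, each of length T, in event-time order.
    """
    n = len(abnormal_returns)
    T = len(abnormal_returns[0])
    rank_rows = [ranks(series) for series in abnormal_returns]
    expected = (T + 1) / 2                                    # E[K_it]
    # Eq. (44): cross-sectional average of demeaned ranks for each day.
    k_bar = [sum(row[t] - expected for row in rank_rows) / n for t in range(T)]
    # Eq. (45): standard deviation of the daily average ranks over all T days.
    s_k = math.sqrt(sum(k * k for k in k_bar) / T)
    event_idx = T - half_window - 1                           # position of day 0
    tau = tau2 - tau1 + 1
    window_sum = sum(k_bar[event_idx + tau1: event_idx + tau2 + 1])
    return window_sum / (math.sqrt(tau) * s_k)                # Eq. (43)

# Toy example: one stock, five days, event day at position 3.
toy = [[0.1, 0.2, 0.3, 0.4, 0.5]]
z_event_day = campbell_wasley_z(toy, 0, 0, half_window=1)
z_window = campbell_wasley_z(toy, -1, 1, half_window=1)
```

Because the standard deviation is computed from the time series of cross-sectional average ranks, contemporaneous correlation is reflected in it when all event days coincide.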

Furthermore, we report results for the traditional parametric *t*-statistic popular in event studies, which is not robust to cross-sectional correlation (see, e.g., Campbell et al. 1997, chp. 4),

$$t\_{\tau} = \frac{\overline{\text{CAR}}(\tau\_1, \tau\_2)}{\text{s.e.}(\overline{\text{CAR}})},\tag{46}$$

where the numerator is the sample average of the CAR<sub>*i*</sub>(*τ*<sub>1</sub>, *τ*<sub>2</sub>) defined in (2), and the denominator is its standard error. Under independence, the null distribution of *t<sub>τ</sub>* is asymptotically standard normal.
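A minimal sketch of Equation (46) follows. The text does not spell out how the standard error is computed, so the cross-sectional standard error of the individual CARs used here is an assumption, as are the function name `car_t_statistic` and the convention that the event day sits `half_window` days before the end of each series.

```python
import math

def car_t_statistic(abnormal_returns, tau1, tau2, half_window=10):
    """t-statistic of the mean CAR over event days tau1..tau2.

    abnormal_returns: list of n abnormal return series in event-time order.
    Assumption: s.e. is the cross-sectional standard error of the CARs.
    """
    n = len(abnormal_returns)
    cars = []
    for series in abnormal_returns:
        event_idx = len(series) - half_window - 1          # position of day 0
        cars.append(sum(series[event_idx + tau1: event_idx + tau2 + 1]))
    mean_car = sum(cars) / n
    var_car = sum((c - mean_car) ** 2 for c in cars) / (n - 1)
    return mean_car / math.sqrt(var_car / n)

# Toy example: three stocks with single-day CARs of 1, 2, and 3.
t_toy = car_t_statistic([[1.0], [2.0], [3.0]], 0, 0, half_window=0)
```

The cross-sectional standard error treats the CARs as independent draws, which is exactly the assumption that fails when event windows overlap in calendar time.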

Table 1 summarizes the test statistics and their major features.

Table 2 reports the simulated two-tailed rejection rates of the null hypothesis of no abnormal return at the 5% nominal level. The results are clear-cut. Panel A of the table reports the non-overlapping case with zero cross-sectional correlation. As expected, all statistics reject close to the nominal rate. Panel B reports the results for complete overlapping. That is, all events share the same calendar day; hence, returns are prone to cross-sectional correlation. The new statistics *z<sub>τ</sub>* and *z<sub>τ,grank</sub>*, and the more traditional cumulative ranks statistic *z<sub>cw</sub>*, all of which account for cross-sectional correlation in this case, reject reasonably close to the nominal rate up to event windows of ±5 days and exhibit some over-rejection on the longest event window of ±10 days, i.e., 21 days. Not surprisingly, the parametric statistic *t<sub>τ</sub>*, which is not robust to cross-sectional correlation, over-rejects increasingly as the event window lengthens. Panel C reports partial overlapping with events clustered randomly within 5 trading days (about a week). For event-day testing, the a priori non-robust statistics also perform well, rejecting at the nominal rate. However, they over-reject increasingly as the event window grows longer. The a priori partial-overlapping robust statistics, *z<sub>τ</sub>* and *z<sub>τ,grank</sub>*, reject close to the nominal rate up to event window lengths of 5 days and over-reject to some extent for the longest event windows of 11 and 21 days, albeit far less than the non-robust statistics *z<sub>cw</sub>* and *t<sub>τ</sub>*. The results are broadly similar under the decreased overlapping in Panel D. Thus, we conclude that accounting for cross-sectional correlation is crucial to avoid biased inferences in statistical testing, not only under complete overlapping of event windows but also under partial overlapping. For the latter, this paper has introduced two new test statistics.
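When reading rejection rates such as these, it helps to keep the Monte Carlo error in mind. The sketch below computes an empirical two-tailed rejection rate and an approximate 95% band around the nominal rate for a correctly sized test, based on the binomial standard error with 1000 simulated samples. The function names are hypothetical, and the band is offered as a reading aid rather than as part of the paper's analysis.

```python
import math

def rejection_rate(statistics, critical_value=1.96):
    """Share of simulated statistics that reject in a two-tailed test."""
    return sum(abs(z) > critical_value for z in statistics) / len(statistics)

def nominal_rate_band(nominal=0.05, n_sim=1000, z=1.96):
    """Approximate 95% band for the empirical rate of a correctly sized test."""
    half_width = z * math.sqrt(nominal * (1 - nominal) / n_sim)
    return nominal - half_width, nominal + half_width

low, high = nominal_rate_band()   # roughly 0.036 to 0.064
```

Under these assumptions, empirical rates between about 3.6% and 6.4% are compatible with a true 5% size, so only deviations beyond that range point to misspecification.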

**Table 1.** Test statistics and their key features.


**Table 2.** Rejection rates of the null hypothesis of no event effect at the nominal 5% level when the events are non-overlapping, partially overlapping, and completely overlapping.


### **6. Summary and Conclusions**

This paper proposed two variants of a new non-parametric rank-based test statistic for testing cumulative abnormal returns in short-run event studies. The statistics are robust to event-induced volatility and to cross-sectional correlation due to completely or partially overlapping event windows. This latter source of cross-sectional correlation is not taken into account by the existing non-parametric test statistics. Simulation results indicate that, unlike typically utilized test statistics, the proposed statistics reject the null hypothesis of no event effect close to the nominal significance level in the partially overlapping case. We conclude that accounting for cross-sectional correlation is crucial to avoid biased inferences, not only under complete overlapping of event windows but also under partial overlapping. The non-parametric test statistics proposed in this paper serve this purpose. A major limitation of utilizing non-parametric tests in financial economics is that they seem to play mainly side roles. For example, Campbell et al. (1997, Section 4.7) note that nonparametric tests are typically used in conjunction with parametric tests to check the robustness of conclusions based on parametric tests. Even so, it should be noted that robustness checks are increasingly demanded in modern empirical financial research. Non-parametric methods can be the tools of choice in completing the task.

**Funding:** The research has not received external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data available upon request from the author.

**Acknowledgments:** The author wants to thank James Kolari, Henk Snoo, Rudi Wietsma, and referees for many useful comments that improved the paper. All errors are the responsibility of the author.

**Conflicts of Interest:** The author declares no conflict of interest.

### **Notes**


$$\sum\_{t=1}^{L} \sum\_{k=1}^{n\_t} \mathcal{U}\_{kt}^2 = \sum\_{t=t\_1}^{L\_1} \mathcal{U}\_{1t}^2 + \sum\_{t=t\_2}^{L\_2} \mathcal{U}\_{2t}^2 + \dots + \sum\_{t=t\_n}^{L\_n} \mathcal{U}\_{nt}^2 = \sum\_{l=1}^{n} \sum\_{t=t\_l}^{L\_l} \mathcal{U}\_{lt}^2.$$

where *t<sub>i</sub>*, *t<sub>i</sub>* + 1, ..., *L<sub>i</sub>* index the observations on stock *i*, and *T<sub>i</sub>* = *L<sub>i</sub>* − *t<sub>i</sub>* + 1 is the number of observations. By Result 2, var[*U<sub>it</sub>*] = 1, so that

$$\sum\_{t=t\_i}^{L\_i} \mathcal{U}\_{it}^2 = T\_i, \quad \text{and hence} \quad \sum\_{t=1}^{L} \sum\_{k=1}^{n\_t} \mathcal{U}\_{kt}^2 = \sum\_{i=1}^{n} T\_i = N = \sum\_{t=1}^{L} n\_t.$$

<sup>5</sup> Factor returns have been downloaded from the French data library, http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html (accessed on 15 November 2021).

### **References**


Brown, Stephen J., and Jerold B. Warner. 1985. Using daily stock returns. *Journal of Financial Economics* 14: 3–31.


Corrado, Charles J. 2011. Event studies: A methodology review. *Accounting & Finance* 51: 207–34.


Fama, Eugene F. 1991. Efficient capital markets: II. *The Journal of Finance* 46: 1575–617.


Jaffe, Jeffrey F. 1974. Special information and insider trading. *The Journal of Business* 47: 410–28.


Lehmann, Erich L. 1999. *Elements of Large-Sample Theory*. Berlin and Heidelberg: Springer.

Lehmann, Erich L. 2006. *Nonparametrics: Statistical Methods Based on Ranks*, Revised 1st ed. Berlin and Heidelberg: Springer.

