Article

An Intersection–Union Test for the Sharpe Ratio

Chair of Applied Stochastics and Risk Management, Department of Mathematics and Statistics, Helmut Schmidt University, Holstenhofweg 85, D-22043 Hamburg, Germany
Risks 2018, 6(2), 40; https://doi.org/10.3390/risks6020040
Submission received: 26 February 2018 / Revised: 12 April 2018 / Accepted: 13 April 2018 / Published: 19 April 2018

Abstract

An intersection–union test for supporting the hypothesis that a given investment strategy is optimal among a set of alternatives is presented. It compares the Sharpe ratio of the benchmark with that of each other strategy. The intersection–union test takes serial dependence into account and does not presume that asset returns are multivariate normally distributed. An empirical study based on the G–7 countries demonstrates that it is hard to find significant results due to the lack of data, which confirms a general observation in empirical finance.

1. Motivation

This work builds upon Frahm et al. (2012), in which the authors argue why joint and multiple testing procedures should be applied in order to judge whether or not some investment strategy is optimal among a set of several alternatives. Frahm et al. (2012) can be understood as a complement to DeMiguel et al. (2009), who doubt that portfolio optimization on the basis of time-series information is worthwhile at all. Indeed, modern portfolio theory suffers from a serious drawback, namely that portfolio weights are very sensitive to estimation risk. It is well known that portfolio optimization fails on estimating expected asset returns.
DeMiguel et al. (2009) show that well-established investment strategies are not significantly better than the naive strategy, i.e., the equally weighted portfolio. Of course, this does not mean that naive diversification is optimal, but we usually do not have enough observations in order to prove the opposite. They highlight a general problem of empirical finance, namely that hypothesis testing is difficult due to the lack of data. This is all the more true if there is more than one (single) null hypothesis. The results reported by DeMiguel et al. (2009) are convincing, but their statistical methodology does not take the undesirable effects of joint and multiple testing into account. The same holds true for similar studies (see, e.g., Fletcher 2011; Low et al. 2016). By contrast, the test presented in this work is designed to address those problems.
The literature provides a wide range of different investment strategies (see, e.g., Burgess 2000; Conrad and Kaul 1998; DeMiguel et al. 2009; Menkhoff et al. 2012; Sawik 2012; Shen et al. 2007; Szakmary et al. 2010; Vrugt et al. 2004; Zagrodny 2003) and we are typically concerned with the question of whether a given investment strategy is optimal among a set of alternatives. In order to validate our hypothesis, we usually compare the performance of our benchmark, e.g., its certainty equivalent or Sharpe ratio, with the performance of each other strategy that is taken into consideration. Let $d > 1$ be the number of investment strategies and $i \in \{1, 2, \ldots, d\}$ be our benchmark. We may suppose that $i = 1$ without loss of generality. Furthermore, let $\eta = (\eta_1, \eta_2, \ldots, \eta_d) \in \mathbb{R}^d$ be a (column) vector of performance measures. Now, first of all, consider the hypotheses
$$H_0^{\cap}: \eta_1 \geq \eta \quad \text{vs.} \quad H_1^{\cup}: \eta_1 \ngeq \eta .$$
That is, $H_0^{\cap}$ states that our benchmark is optimal. After performing a (joint) hypothesis test, we could reject the null hypothesis $H_0^{\cap}$ in favor of the alternative hypothesis $H_1^{\cup}$. In this case, we could say that there exists some strategy that is better than our benchmark, but not which one. By contrast, if we are not able to reject $H_0^{\cap}$, we must not conclude that our benchmark is optimal. A well-known method for testing the intersection of a number of single null hypotheses is studied by Roy (1953), which is called a union–intersection test (Sen and Silvapulle 2002). However, union–intersection tests are not the object of this work.
By contrast, I consider here the following hypotheses:
$$H_0^{\cup}: \eta_1 \ngeq \eta \quad \text{vs.} \quad H_1^{\cap}: \eta_1 \geq \eta .$$
Now, the joint null hypothesis $H_0^{\cup}$ asserts that our benchmark is not optimal. If we are able to reject $H_0^{\cup}$, our benchmark turns out to be (significantly) optimal among all alternatives. By contrast, in the case in which we cannot reject the null hypothesis, we must not conclude that our benchmark is outperformed by any other strategy. Applying a test for $H_0^{\cup}$ might be the primary goal both in theoretical and in practical applications of portfolio theory.
The former test can be rewritten, equivalently, as
$$H_0^{\cap}: \bigcap_{i=2}^{d} \{\eta_1 \geq \eta_i\} \quad \text{vs.} \quad H_1^{\cup}: \bigcup_{i=2}^{d} \{\eta_1 < \eta_i\},$$
whereas the latter test reads
$$H_0^{\cup}: \bigcup_{i=2}^{d} \{\eta_1 < \eta_i\} \quad \text{vs.} \quad H_1^{\cap}: \bigcap_{i=2}^{d} \{\eta_1 \geq \eta_i\}.$$
This explains the chosen symbols for the null and the alternative hypothesis. However, in the following, I focus on the latter test and write only “$H_0$” and “$H_1$” for notational convenience.
The test proposed in this work is very simple: The null hypothesis is rejected if and only if we can reject each single hypothesis $H_{0i}: \eta_1 < \eta_i$ in favor of $H_{1i}: \eta_1 \geq \eta_i$. Let $A_i$ be the event that $H_{0i}$ is rejected. The probability that all single null hypotheses are rejected satisfies
$$P\Big(\bigcap_{i=2}^{d} A_i\Big) \leq \min_{i \in \{2, \ldots, d\}} P(A_i).$$
If $H_{0i}$ is true for some $i \in \{2, 3, \ldots, d\}$, we must have that $P(A_i) \leq \alpha_i$, where $\alpha_i \in (0, 1)$ denotes the significance level of the (single) hypothesis test for $H_{0i}$. Under $H_0$, at least one single null hypothesis must be true and thus we have that
$$\min_{i \in \{2, \ldots, d\}} P(A_i) \leq \max_{i \in \{2, \ldots, d\}} \alpha_i .$$
Hence, the proposed test for $H_0$ has level $\alpha \in (0, 1)$ if $\alpha_2, \alpha_3, \ldots, \alpha_d \leq \alpha$. The least conservative choice is $\alpha_2 = \alpha_3 = \cdots = \alpha_d = \alpha$, in which case $H_0$ is rejected if and only if the largest p-value of all single tests falls below $\alpha$. Throughout this work, I assume that each single test has level $\alpha$.
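The decision rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `intersection_union_test` and the p-values are hypothetical, and the single p-values are assumed to come from one-sided tests of $H_{0i}: \eta_1 < \eta_i$.

```python
from typing import Sequence


def intersection_union_test(p_values: Sequence[float], alpha: float = 0.05) -> bool:
    """Return True if the joint null H0 (benchmark is not optimal) is rejected.

    p_values holds one p-value per alternative strategy, each from a
    single test of H0i: eta_1 < eta_i at level alpha.
    """
    if not p_values:
        raise ValueError("at least one alternative strategy is required")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    # H0 is rejected only if every single null is rejected, i.e., if the
    # largest p-value falls below alpha. No Bonferroni correction is applied.
    return max(p_values) < alpha


# Hypothetical p-values for d - 1 = 3 alternatives (illustration only).
print(intersection_union_test([0.01, 0.03, 0.04]))  # all single nulls rejected
print(intersection_union_test([0.01, 0.03, 0.20]))  # one single null survives
```

A single large p-value is enough to prevent a rejection, which is why the test becomes demanding as the number of alternatives grows.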
At first glance, this testing procedure might seem to suffer from a lack of power because it does not take the dependence structure of the single test statistics into account. Nonetheless, it is a likelihood-ratio test that is commonly referred to as an intersection–union test (Berger 1997). Thus, it inherits the general asymptotic optimality properties of likelihood-ratio tests that are known from likelihood theory (see, e.g., van der Vaart 1998, chp. 15 and 16). Another striking feature might be the fact that the overall test has the same significance level as each single test. This is because H 0 is rejected only if all single tests lead to a rejection and so we need no Bonferroni correction in order to preserve the significance level of each single test. For more details on that topic, see Berger (1997) as well as Sen and Silvapulle (2002).
In this work, I present an intersection–union test in order to decide whether a given investment strategy is optimal among a set of alternative strategies. This is done with respect to the Sharpe ratio. Joint and multiple tests for the Sharpe ratio are applied also in Frahm et al. (2012) by using a stationary block–bootstrap procedure. By contrast, I provide here analytical results. I refrain from assuming that asset returns are serially independent and multivariate normally distributed. Each single test represents a (nonparametric) generalization of the Jobson–Korkie test (Jobson and Korkie 1981; Memmel 2003). Finally, I apply the intersection–union test to historical data.
The same problem is addressed by Ledoit and Wolf (2008) as well as Schmid and Schmidt (2009) in a bivariate setting. However, the intersection–union test presented here is motivated by a multivariate point of view, i.e., $d > 2$, and its primary goal is to avoid any kind of selection bias that can occur when testing a joint hypothesis. Thus, it cannot be said that the intersection–union test is “better” or “worse” than the tests proposed by Ledoit and Wolf (2008). It is hardly possible to provide any general answer to this question at all (Ledoit and Wolf 2008, sct. 4 and 5). Instead, I try to fill a gap between Frahm et al. (2012) as well as Ledoit and Wolf (2008):
(i) I derive closed-form expressions for the standard errors of the test statistics, instead of providing numerical results that have been obtained by bootstrapping, and
(ii) I do this for the case $d \geq 2$ but not (only) for $d = 2$.

2. The Intersection–Union Test

2.1. Gordin’s Condition

In the following, “$X_n \to X$” denotes almost sure convergence, whereas “$X_n \rightsquigarrow X$” stands for convergence in distribution. Let $P_t > 0$ be the price of some asset or, more generally, the value of some strategy at time $t \in \mathbb{Z}$. Throughout this work, the terms “asset” and “strategy” as well as “price” and “value” are used synonymously. The asset return after Period $t$ is defined as $R_t := P_t / P_{t-1} - 1$. I assume that the return process $\{R_t\}$ is (strongly) stationary with expected return $\mu := E(R_t)$ and variance $\sigma^2 := \mathrm{Var}(R_t) < \infty$. The process $\{R_t\}$ shall also be ergodic. This means that $\frac{1}{n} \sum_{t=1}^{n} f(R_t) \to E\big(f(R)\big)$ for each integrable function $f$ of $R$, where the random variable $R$ has the same distribution as each component of $\{R_t\}$. This guarantees that every finite moment of $R$ can be consistently estimated by the corresponding moment estimator. The return process is ergodic if it is mixing (Bradley 2005). More precisely, for all $k, l = 1, 2, \ldots$, the random vector $(R_t, R_{t+1}, \ldots, R_{t+k})$ is asymptotically independent of $(R_{t-n}, R_{t-n+1}, \ldots, R_{t-n+l})$ as $n \to \infty$ (Hayashi 2000, p. 101).
The ergodicity of $\{R_t\}$ implies that $\mu_n \to \mu$, where $\mu_n := \frac{1}{n} \sum_{t=1}^{n} R_t$ is the sample mean of $R_1, R_2, \ldots, R_n$. Put another way, the return process satisfies the Strong Law of Large Numbers. In order to preserve the Central Limit Theorem (CLT), i.e., $\sqrt{n}\,(\mu_n - \mu) \rightsquigarrow N\big(0, \sigma_L^2\big)$, we need an additional requirement. This is known as Gordin’s condition (Hayashi 2000, p. 402). Let $H_t := (R_t, R_{t-1}, \ldots)$ be the history of $\{R_t\}$ at time $t \in \mathbb{Z}$. It is assumed that $E(R_t \mid H_{t-n})$ converges in mean square to $\mu$ as $n \to \infty$ and, according to Hayashi (2000, p. 403), we must have that
$$\sum_{k=0}^{\infty} \sqrt{E\big(\varepsilon_k^2\big)} < \infty$$
with $\varepsilon_k := E(R_t \mid H_{t-k}) - E(R_t \mid H_{t-k-1})$ for $k = 0, 1, \ldots$. It can be shown that $\sigma_L^2 = \sum_{k=-\infty}^{\infty} \Gamma(k)$, where $\Gamma$ is the autocovariance function of $\{R_t\}$ (Hayashi 2000, Proposition 6.10). The number $\sigma_L^2$ is referred to as the large-sample variance of $\{R_t\}$, whereas $\sigma^2$ represents its stationary variance. In the following, I assume that $\tau^2 := \mathrm{Var}\big((R_t - \mu)^2\big) < \infty$ and that Gordin’s condition is satisfied not only for $\{R_t\}$ but also for $\{(R_t - \mu)^2\}$.
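To make the distinction between the stationary variance $\sigma^2$ and the large-sample variance $\sigma_L^2$ concrete, the following sketch sums the autocovariances of a stationary AR(1) process. The AR(1) model, the parameter `phi`, and the function names are assumptions made purely for illustration; the text does not impose this model. For an AR(1) process with stationary variance $\sigma^2$, we have $\Gamma(k) = \sigma^2 \phi^{|k|}$ and hence $\sigma_L^2 = \sigma^2 (1 + \phi)/(1 - \phi)$.

```python
def ar1_autocovariance(k: int, phi: float, sigma2: float) -> float:
    """Autocovariance Gamma(k) of a stationary AR(1) process (assumed model)."""
    return sigma2 * phi ** abs(k)


def large_sample_variance(phi: float, sigma2: float, max_lag: int = 1000) -> float:
    """Truncated version of sigma_L^2 = sum over all integers k of Gamma(k)."""
    return sum(ar1_autocovariance(k, phi, sigma2) for k in range(-max_lag, max_lag + 1))


phi, sigma2 = 0.2, 1.0  # hypothetical parameters
numeric = large_sample_variance(phi, sigma2)
closed_form = sigma2 * (1.0 + phi) / (1.0 - phi)
print(numeric, closed_form)  # both are close to 1.5
```

With positive autocorrelation the large-sample variance exceeds the stationary variance, so a standard error based on $\sigma^2$ alone would be too small.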
The aforementioned requirements can easily be extended to any d-dimensional return process (Hayashi 2000, p. 405) and applied to a broad class of standard time-series models. There exists a number of alternative criteria for the CLT, which can be found, e.g., in Brockwell and Davis (1991, p. 213) as well as Hamilton (1994, p. 195). However, to the best of my knowledge, Gordin’s condition represents the most unrestrictive set of assumptions about the serial dependence structure of a stochastic process (Eagleson 1975). In particular, it can be considered a natural generalization of the CLT for martingale difference sequences (Hayashi 2000, p. 106).
It is worth emphasizing that the number of dimensions, $d$, is supposed to be fixed. At least, we have to assume that $n/d \to \infty$. If $n/d$ tends to a finite number, the CLT might become invalid and other interesting issues, which are well-known from random matrix theory, can arise (Frahm and Jaekel 2015). By contrast, if the number of observations relative to the number of strategies is sufficiently large, we may expect that the CLT is satisfied under the aforementioned conditions.
I suppose, without loss of generality, that the risk-free interest rate is constantly zero. That is, I implicitly refer to asset returns in excess of the risk-free interest rate that can be observed at the beginning of each period. The Sharpe ratio $\eta := \mu / \sigma$ (Sharpe 1966) is frequently used as a performance measure both in theory and in practice. In the following section, I present the intersection–union test, which can be applied in order to judge whether a given investment strategy possesses the largest Sharpe ratio among a set of alternatives. This can be done under the quite general assumptions about the return process $\{R_t\}$ mentioned above.
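As a minimal sketch of the definitions in this section, the snippet below turns a price series into simple returns $R_t = P_t / P_{t-1} - 1$ and computes the sample Sharpe ratio. The function names and the price series are hypothetical, the risk-free rate is taken to be zero as assumed in the text, and the variance is normalized by $1/n$ (the convention used for $\sigma_n^2$ in Section 2.2).

```python
import math
from typing import List, Sequence


def simple_returns(prices: Sequence[float]) -> List[float]:
    """R_t = P_t / P_{t-1} - 1 for consecutive prices."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices[:-1], prices[1:])]


def sharpe_ratio(excess_returns: Sequence[float]) -> float:
    """Sample Sharpe ratio mu_n / sigma_n with 1/n normalization."""
    n = len(excess_returns)
    mu_n = sum(excess_returns) / n
    sigma2_n = sum((r - mu_n) ** 2 for r in excess_returns) / n
    return mu_n / math.sqrt(sigma2_n)


prices = [100.0, 110.0, 99.0, 108.9]  # hypothetical values of a strategy
returns = simple_returns(prices)      # approximately [0.1, -0.1, 0.1]
print(sharpe_ratio(returns))          # approximately 0.3536
```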

2.2. Asymptotic Properties of Sharpe Ratios

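In this section, I present some asymptotic properties of Sharpe ratios. The reader can find the derivations in Appendix A. It holds that
$$\sigma_n^2 := \frac{1}{n} \sum_{t=1}^{n} (R_t - \mu_n)^2 = \frac{1}{n} \sum_{t=1}^{n} (R_t - \mu)^2 - \underbrace{(\mu_n - \mu)^2}_{\to\, 0} \;\to\; \sigma^2$$
and
$$\sqrt{n}\,\big(\sigma_n^2 - \sigma^2\big) = \sqrt{n}\,\Big(\frac{1}{n} \sum_{t=1}^{n} (R_t - \mu)^2 - \sigma^2\Big) - \underbrace{\sqrt{n}\,(\mu_n - \mu)}_{\rightsquigarrow\, N(0,\, \sigma_L^2)}\, \underbrace{(\mu_n - \mu)}_{\to\, 0} \;\rightsquigarrow\; N\big(0, \tau_L^2\big).$$
This means that $\sigma_n^2$ is a consistent estimator for the stationary variance $\sigma^2$ and $\sqrt{n}\,\big(\sigma_n^2 - \sigma^2\big)$ is asymptotically normally distributed with large-sample variance $\tau_L^2$.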
For assessing the large-sample variance of $\{R_t\}$, i.e., $\sigma_L^2 = \sum_{k=-\infty}^{\infty} \Gamma(k)$, we need to estimate the autocovariance function $\Gamma$. There are many ways to achieve this goal. Usually, one applies either heteroscedasticity-autocorrelation consistent (HAC) inference or some bootstrap procedure (Andrews 1991; Ledoit and Wolf 2008; Politis 2003). A nice comparison between HAC inference and bootstrapping in the context of performance measurement can be found in Ledoit and Wolf (2008).
Bootstrapping is a very powerful tool, but it can be computationally more intensive than HAC inference. Moreover, sometimes it is not clear whether or not the necessary (mathematical) conditions for the bootstrap are satisfied. The method proposed here, in some sense, bypasses the aforementioned problems. However, HAC estimation can also be somewhat obscure when it comes to choosing the right kernel, bandwidth, etc. For this reason, I keep things as simple as possible, i.e., I choose the box-kernel-type HAC estimator
$$\sigma_{Ln}^2 := \Gamma_n(0) + 2 \sum_{k=1}^{l} \Gamma_n(k),$$
where $\Gamma_n$ is the empirical autocovariance function of $\{R_t\}$ with $l \ll n$ (Hayashi 2000, p. 142), i.e.,
$$k \mapsto \Gamma_n(k) := \frac{1}{n} \sum_{t=k+1}^{n} (R_t - \mu_n)(R_{t-k} - \mu_n).$$
It is a stylized fact of empirical finance that $\Gamma_n(k) \approx \Gamma(k) \approx 0$ for all $k \neq 0$, i.e., asset returns are not significantly autocorrelated, and so we may expect that $\sigma_{Ln}^2 \approx \sigma_n^2$.
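The box-kernel estimator of the large-sample variance can be sketched as follows. The function names are hypothetical and the data are illustrative. One caveat that is a general property of the untapered (box) kernel rather than a statement from the text: the estimate is not guaranteed to be nonnegative in small samples.

```python
from typing import Sequence


def empirical_autocovariance(x: Sequence[float], k: int) -> float:
    """Gamma_n(k) = (1/n) * sum_{t=k+1}^{n} (x_t - mean)(x_{t-k} - mean)."""
    n = len(x)
    mean = sum(x) / n
    # Python indices are 0-based, so t runs from k to n - 1 here.
    return sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n)) / n


def box_kernel_variance(x: Sequence[float], lag_length: int) -> float:
    """Gamma_n(0) + 2 * sum_{k=1}^{l} Gamma_n(k) with l = lag_length."""
    return empirical_autocovariance(x, 0) + 2.0 * sum(
        empirical_autocovariance(x, k) for k in range(1, lag_length + 1)
    )


x = [1.0, 2.0, 3.0, 4.0]  # illustrative data, not asset returns
print(empirical_autocovariance(x, 0))  # stationary variance estimate: 1.25
print(box_kernel_variance(x, 1))       # large-sample variance estimate: 1.875
```

Setting `lag_length = 0` reproduces the stationary variance estimate $\sigma_n^2$.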
The large-sample variance of $\{(R_t - \mu)^2\}$ is $\tau_L^2$, which can be estimated by
$$\tau_{Ln}^2 := \Pi_n(0) + 2 \sum_{k=1}^{l} \Pi_n(k),$$
where $\Pi_n$ is the empirical autocovariance function of $\{(R_t - \mu_n)^2\}$, i.e.,
$$k \mapsto \Pi_n(k) := \frac{1}{n} \sum_{t=k+1}^{n} \big((R_t - \mu_n)^2 - \sigma_n^2\big)\big((R_{t-k} - \mu_n)^2 - \sigma_n^2\big).$$
Typically, asset returns are conditionally heteroscedastic. This means that, in contrast to $\sigma_L^2$ vs. $\sigma^2$, the large-sample variance $\tau_L^2$ can be significantly larger than the stationary variance $\tau^2$.
Gordin’s condition guarantees that
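$$\sqrt{n} \begin{pmatrix} \mu_n - \mu \\ \sigma_n^2 - \sigma^2 \end{pmatrix} \rightsquigarrow N\left(0, \begin{pmatrix} \sigma_L^2 & \kappa_L \\ \kappa_L & \tau_L^2 \end{pmatrix}\right),$$
where $\kappa_L$ represents the large-sample covariance between $R$ and $(R - \mu)^2$. Due to the so-called “leverage effect” (Black 1976), we can expect that $\kappa_L$ is negative. Moreover, we already know that $\sqrt{n}\,(\mu_n - \mu) \rightsquigarrow N\big(0, \sigma_L^2\big)$ and, by applying the delta method (van der Vaart 1998, chp. 3), we obtain
$$\sqrt{n}\,(\sigma_n - \sigma) \rightsquigarrow N\Big(0, \frac{\tau_L^2}{4 \sigma^2}\Big),$$
which can be used in order to calculate the standard error of $\sigma_n$.
The Sharpe ratio is estimated by $\eta_n := \mu_n / \sigma_n$ and the delta method leads to
$$\sqrt{n}\,(\eta_n - \eta) \rightsquigarrow N\Big(0, \frac{\sigma_L^2}{\sigma^2} - \eta\, \frac{\kappa_L}{\sigma^3} + \eta^2\, \frac{\tau_L^2}{4 \sigma^4}\Big).$$
Schmid and Schmidt (2009) obtain the same large-sample variance of $\{\eta_n\}$ under the assumption that the processes are strongly mixing (Bradley 2005), but that assumption seems to be more restrictive than Gordin’s condition.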
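The large-sample variance of the Sharpe ratio can be turned into a standard error by plugging in estimates. The sketch below does this with box-kernel estimates throughout. Note the assumptions: the text specifies the estimators for $\sigma_L^2$ and $\tau_L^2$ only, so estimating $\kappa_L$ by the analogous cross-covariance sum is my own choice, and all function names and data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def cross_autocovariance(x: Sequence[float], y: Sequence[float], k: int) -> float:
    """Empirical covariance between x_t and y_{t-k}, normalized by 1/n."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((x[t] - mean_x) * (y[t - k] - mean_y) for t in range(k, n)) / n


def large_sample_covariance(x: Sequence[float], y: Sequence[float], lag_length: int) -> float:
    """Box-kernel estimate: lag-0 term plus both cross terms up to lag_length."""
    total = cross_autocovariance(x, y, 0)
    for k in range(1, lag_length + 1):
        total += cross_autocovariance(x, y, k) + cross_autocovariance(y, x, k)
    return total


def sharpe_ratio_with_standard_error(
    returns: Sequence[float], lag_length: int
) -> Tuple[float, float]:
    """Return (eta_n, standard error of eta_n) based on the delta method."""
    n = len(returns)
    mu_n = sum(returns) / n
    squared_dev = [(r - mu_n) ** 2 for r in returns]
    sigma2_n = sum(squared_dev) / n
    sigma_n = math.sqrt(sigma2_n)
    eta_n = mu_n / sigma_n

    sigma_L2 = large_sample_covariance(returns, returns, lag_length)
    kappa_L = large_sample_covariance(returns, squared_dev, lag_length)  # assumed estimator
    tau_L2 = large_sample_covariance(squared_dev, squared_dev, lag_length)

    avar = (
        sigma_L2 / sigma2_n
        - eta_n * kappa_L / sigma_n ** 3
        + eta_n ** 2 * tau_L2 / (4.0 * sigma_n ** 4)
    )
    if avar <= 0.0:
        # The box kernel does not guarantee a positive estimate.
        raise ValueError("estimated asymptotic variance is not positive")
    return eta_n, math.sqrt(avar / n)


returns = [0.03, 0.01, 0.03, 0.01]  # hypothetical excess returns
print(sharpe_ratio_with_standard_error(returns, lag_length=0))
```

With `lag_length = 0`, the result coincides with the plug-in version of the formula for independent and identically distributed returns.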
To the best of my knowledge, Lo (2002) was the first to analyze the potential impact of serial dependence when estimating the Sharpe ratio. Mertens (2002) points out that the formula for independent and identically distributed asset returns presented by Lo (2002) is based, implicitly, on the normal-distribution hypothesis. More precisely, he shows that the large-sample variance of $\{\eta_n\}$ is
$$1 + \frac{\eta^2}{2} - \gamma_3\, \eta + \frac{\gamma_4 - 3}{4} \cdot \eta^2$$
if the components of $\{R_t\}$ are independent and identically distributed, where
$$\gamma_3 := \frac{E\big((R_t - \mu)^3\big)}{\sigma^3} \quad \text{and} \quad \gamma_4 := \frac{E\big((R_t - \mu)^4\big)}{\sigma^4}$$
denote the skewness and the kurtosis of $R_t$, respectively. Lo (2002) presumes that $\gamma_3 = 0$ and $\gamma_4 = 3$, in which case the large-sample variance of $\{\eta_n\}$ is $1 + \eta^2/2$. Some of those results can be found also in Opdyke (2007). However, Ledoit and Wolf (2008) mention that the formula for serially dependent asset returns presented by Opdyke (2007) is wrong because it does not distinguish between large-sample and stationary (co-)variances. One purpose of this work is to clarify the aforementioned misunderstandings.
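The formula for independent and identically distributed returns is easy to evaluate. The following sketch (function name and numbers are hypothetical) shows how negative skewness and excess kurtosis inflate the large-sample variance relative to the normal case with $\gamma_3 = 0$ and $\gamma_4 = 3$.

```python
def iid_sharpe_variance(eta: float, skewness: float = 0.0, kurtosis: float = 3.0) -> float:
    """Large-sample variance of the Sharpe ratio for i.i.d. returns.

    The defaults (skewness 0, kurtosis 3) give the normal case 1 + eta^2 / 2.
    """
    return 1.0 + eta ** 2 / 2.0 - skewness * eta + (kurtosis - 3.0) / 4.0 * eta ** 2


eta = 0.5  # hypothetical Sharpe ratio
print(iid_sharpe_variance(eta))                               # 1.125
print(iid_sharpe_variance(eta, skewness=-0.5, kurtosis=5.0))  # 1.5
```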
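Suppose, without loss of generality, that we want to compare the Sharpe ratio of Strategy 1 with that of Strategy 2. In Appendix A, the reader can verify that
$$\sqrt{n} \begin{pmatrix} \eta_{1n} - \eta_1 \\ \eta_{2n} - \eta_2 \end{pmatrix} \rightsquigarrow N\left(0, \begin{pmatrix} \omega_{11} & \omega_{12} \\ \omega_{21} & \omega_{22} \end{pmatrix}\right)$$
with
$$\omega_{11} = \frac{\sigma_{L1}^2}{\sigma_1^2} - \eta_1\, \frac{\kappa_{L1}}{\sigma_1^3} + \eta_1^2\, \frac{\tau_{L1}^2}{4 \sigma_1^4}, \qquad \omega_{22} = \frac{\sigma_{L2}^2}{\sigma_2^2} - \eta_2\, \frac{\kappa_{L2}}{\sigma_2^3} + \eta_2^2\, \frac{\tau_{L2}^2}{4 \sigma_2^4},$$
and
$$\omega_{12} = \omega_{21} = \frac{\lambda_{11}}{\sigma_1 \sigma_2} - \frac{\eta_2 \sigma_1 \lambda_{12} + \eta_1 \sigma_2 \lambda_{21}}{2 \sigma_1^2 \sigma_2^2} + \frac{\eta_1 \eta_2 \lambda_{22}}{4 \sigma_1^2 \sigma_2^2},$$
where
$$\begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix}$$
is the large-sample covariance matrix of $\big(R_{1t}, (R_{1t} - \mu_1)^2\big)$ and $\big(R_{2t}, (R_{2t} - \mu_2)^2\big)$.
We conclude that
$$\sqrt{n}\,(\Delta\eta_n - \Delta\eta) \rightsquigarrow N\big(0, \omega_{11} + \omega_{22} - 2 \omega_{12}\big)$$
with $\Delta\eta_n := \eta_{1n} - \eta_{2n}$ and $\Delta\eta := \eta_1 - \eta_2$. It is worth emphasizing that the benchmark must be chosen before examining the Sharpe ratios. Otherwise, the entire procedure would suffer from a selection bias and then the results derived so far are no longer valid. However, this is not a serious drawback: If our choice of the benchmark is based on historical data, we can simply apply the test out of sample.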
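The expressions for $\omega_{11}$, $\omega_{22}$, and $\omega_{12}$ can be assembled into the large-sample variance of the Sharpe-ratio difference as sketched below. The container `Moments`, the function names, and the numbers are hypothetical. The example inputs are chosen to match the serially independent Gaussian case treated in Appendix A, so the output can be checked by hand.

```python
from dataclasses import dataclass


@dataclass
class Moments:
    eta: float       # Sharpe ratio
    sigma: float     # stationary standard deviation
    sigma_L2: float  # large-sample variance of the returns
    kappa_L: float   # large-sample covariance of R and (R - mu)^2
    tau_L2: float    # large-sample variance of (R - mu)^2


def omega_own(m: Moments) -> float:
    """omega_11 or omega_22 for a single strategy."""
    return (
        m.sigma_L2 / m.sigma ** 2
        - m.eta * m.kappa_L / m.sigma ** 3
        + m.eta ** 2 * m.tau_L2 / (4.0 * m.sigma ** 4)
    )


def omega_cross(m1: Moments, m2: Moments, lam11: float, lam12: float,
                lam21: float, lam22: float) -> float:
    """omega_12 = omega_21 from the cross-covariance terms lambda_ij."""
    s1, s2 = m1.sigma, m2.sigma
    return (
        lam11 / (s1 * s2)
        - (m2.eta * s1 * lam12 + m1.eta * s2 * lam21) / (2.0 * s1 ** 2 * s2 ** 2)
        + m1.eta * m2.eta * lam22 / (4.0 * s1 ** 2 * s2 ** 2)
    )


def sharpe_difference_variance(m1: Moments, m2: Moments, lam11: float, lam12: float,
                               lam21: float, lam22: float) -> float:
    """Large-sample variance omega_11 + omega_22 - 2 * omega_12."""
    return omega_own(m1) + omega_own(m2) - 2.0 * omega_cross(m1, m2, lam11, lam12, lam21, lam22)


# Hypothetical Gaussian i.i.d. case: sigma_1 = 1, sigma_2 = 2, sigma_12 = 1.
m1 = Moments(eta=0.2, sigma=1.0, sigma_L2=1.0, kappa_L=0.0, tau_L2=2.0)
m2 = Moments(eta=0.1, sigma=2.0, sigma_L2=4.0, kappa_L=0.0, tau_L2=32.0)
print(sharpe_difference_variance(m1, m2, lam11=1.0, lam12=0.0, lam21=0.0, lam22=2.0))
# approximately 1.02
```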
As already mentioned at the end of Section 1, the given result represents a nonparametric generalization of the Jobson–Korkie test (Jobson and Korkie 1981), which is frequently used in finance. The latter is based on the assumption that asset returns are serially independent and multivariate normally distributed. In this special case, it follows that
$$\sqrt{n}\,(\Delta\eta_n - \Delta\eta) \rightsquigarrow N\Big(0,\; 2\,(1 - \rho_{12}) + \frac{\eta_1^2 + \eta_2^2 - 2 \eta_1 \eta_2 \rho_{12}^2}{2}\Big),$$
where $\rho_{12} := \sigma_{12} / (\sigma_1 \sigma_2)$ is the linear correlation coefficient between the return on Strategy 1 and the return on Strategy 2. This expression for the large-sample variance of $\{\Delta\eta_n\}$ corrects a typographical error made by Jobson and Korkie (1981), as observed by Memmel (2003).
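Under the Jobson–Korkie assumption, a single test of $H_{0i}: \eta_1 < \eta_i$ can be sketched with the closed-form variance above. The function names and inputs are hypothetical; plugging estimates into the variance formula and using a one-sided normal p-value is my reading of how a single test would be carried out, not a procedure spelled out in the text.

```python
import math
from statistics import NormalDist
from typing import Tuple


def jobson_korkie_variance(eta1: float, eta2: float, rho: float) -> float:
    """Large-sample variance of the Sharpe-ratio difference (Memmel's correction)."""
    return 2.0 * (1.0 - rho) + (eta1 ** 2 + eta2 ** 2 - 2.0 * eta1 * eta2 * rho ** 2) / 2.0


def jobson_korkie_test(eta1_hat: float, eta2_hat: float, rho_hat: float,
                       n: int) -> Tuple[float, float]:
    """Return (z-statistic, one-sided p-value) for H0: eta_1 < eta_2.

    Large positive z-values speak against the null hypothesis.
    """
    standard_error = math.sqrt(jobson_korkie_variance(eta1_hat, eta2_hat, rho_hat) / n)
    z = (eta1_hat - eta2_hat) / standard_error
    return z, 1.0 - NormalDist().cdf(z)


# Hypothetical estimates based on n = 577 monthly observations.
z, p = jobson_korkie_test(eta1_hat=0.15, eta2_hat=0.10, rho_hat=0.8, n=577)
print(round(z, 3), round(p, 3))
```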

2.3. Empirical Study

In order to demonstrate the intersection–union test, I consider monthly excess returns on the MSCI stock indices for the G–7 countries, i.e., Canada, France, Germany, Italy, Japan, UK and USA, from January 1970 to January 2018. The given indices are calculated on the basis of USD stock prices that are adjusted for dividends, splits, etc. The sample size corresponds to $n = 577$ and the risk-free interest rate is calculated on the basis of the secondary market 3-month US treasury bill rate at the beginning of each period. I choose the equally weighted portfolio (EWP) of all G–7 countries as a benchmark. This choice can be justified by the argument that investors should make use of international diversification (Jorion 1985).
For estimating the large-sample variances, I choose the lag length $l = 12$. First of all, I show that $\Gamma_n(k) \approx 0$ for all $k \in \{1, 2, \ldots, l\}$. For this purpose, I focus on the empirical autocorrelation function, i.e., $k \mapsto \rho_n(k) := \Gamma_n(k) / \Gamma_n(0)$. Figure A1 (see Appendix B) contains the correlograms with respect to $\{R_t\}$ for the EWP and each G–7 country. The red lines indicate the critical thresholds for the null hypothesis that the (true) autocorrelation at $k$ is zero on the level $\alpha = 0.05$. Furthermore, the reader can find the Ljung–Box Q-statistic in each plot, whose critical threshold on the level $\alpha = 0.05$ amounts to 21.0261. The given results confirm the general opinion that first-order autocorrelations of asset returns do not significantly differ from zero. Put another way, the large-sample variances and covariances of asset returns are not significantly larger than their stationary counterparts. This picture changes substantially in Figure A2, which shows the empirical autocorrelations with respect to $\{(R_t - \mu_n)^2\}$. Now, the Ljung–Box test always leads to a rejection of the null hypothesis $H_0: \rho(1) = \rho(2) = \cdots = \rho(12) = 0$. That is, there is strong evidence that monthly asset returns exhibit conditional heteroscedasticity.
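The Ljung–Box Q-statistic mentioned above can be computed from the empirical autocorrelations via the standard formula $Q = n (n + 2) \sum_{k=1}^{l} \rho_n(k)^2 / (n - k)$. The sketch below is illustrative only: the function names are hypothetical, the series is artificial, and the threshold 21.0261 is the value quoted in the text for $l = 12$ and $\alpha = 0.05$.

```python
from typing import List, Sequence


def autocorrelations(x: Sequence[float], max_lag: int) -> List[float]:
    """Empirical autocorrelations rho_n(1), ..., rho_n(max_lag)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    gamma0 = sum(d * d for d in dev) / n
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / n / gamma0
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(x: Sequence[float], max_lag: int) -> float:
    """Ljung-Box statistic Q = n (n + 2) * sum_k rho_n(k)^2 / (n - k)."""
    n = len(x)
    rho = autocorrelations(x, max_lag)
    return n * (n + 2) * sum(r ** 2 / (n - k) for k, r in enumerate(rho, start=1))


CRITICAL_VALUE_12_LAGS = 21.0261  # threshold quoted in the text (l = 12, alpha = 0.05)

# Artificial, strongly autocorrelated series (alternating signs).
series = [1.0, -1.0] * 50
q = ljung_box_q(series, max_lag=12)
print(q > CRITICAL_VALUE_12_LAGS)  # True: the null of no autocorrelation is rejected
```

Applying the same function to the squared deviations $(R_t - \mu_n)^2$ instead of the returns gives the test for conditional heteroscedasticity discussed in connection with Figure A2.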
Table 1 contains the estimated large-sample variances divided by their stationary counterparts both for $\{R_t\}$ and for $\{(R_t - \mu_n)^2\}$. We can see that the estimates of the large-sample variance of $\{R_t\}$ do not differ very much from the stationary ones, except for Japan, where the large-sample variance seems to be more than twice the stationary variance. By contrast, the estimates of the large-sample variance of $\{(R_t - \mu_n)^2\}$ are always more than twice their stationary counterparts. Hence, it is inappropriate to ignore the serial dependence structure of monthly asset returns.
Table 2 contains the means, standard deviations, and Sharpe ratios for the EWP and the G–7 countries based on the monthly asset returns from January 1970 to January 2018. The standard errors are given in parentheses. Despite the large number of observations, the standard errors of $\mu_n$ and $\eta_n$ are big compared to the corresponding estimates. This is a common problem in financial econometrics or, more specifically, in performance measurement. The last row of Table 2 contains the standard errors of the Sharpe ratios under the Jobson–Korkie assumption, i.e., that asset returns are serially independent and multivariate normally distributed. These numbers are smaller than their nonparametric counterparts and they do not vary too much. Under the Jobson–Korkie assumption, the large-sample variance of $\{\eta_n\}$ is $1 + \eta^2/2 \approx 1$. Hence, the standard error of $\eta_n$ is approximately $1/\sqrt{n}$, which explains why the standard errors are almost constant in the last row of Table 2.
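The approximation $1/\sqrt{n}$ is easy to check numerically for $n = 577$. The Sharpe ratios in the loop below are hypothetical monthly values chosen for illustration, not the estimates from Table 2.

```python
import math


def jobson_korkie_standard_error(eta: float, n: int) -> float:
    """Standard error of the Sharpe ratio when its variance is 1 + eta^2 / 2."""
    return math.sqrt((1.0 + eta ** 2 / 2.0) / n)


n = 577  # sample size stated in the text
print(round(1.0 / math.sqrt(n), 4))  # 0.0416
for eta in (0.05, 0.10, 0.15):  # hypothetical monthly Sharpe ratios
    print(eta, round(jobson_korkie_standard_error(eta, n), 4))
```

For Sharpe ratios of this size the term $\eta^2/2$ changes the standard error only in the fourth decimal place.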
Now, in principle, we would like to support the (alternative) hypothesis that the EWP is optimal compared to each G–7 country. Unfortunately, Table 2 shows that the UK has the largest Sharpe ratio and so the EWP cannot be significantly better. Interestingly, this was not always the case. A closer inspection of the data reveals that the EWP had the largest Sharpe ratio before the financial crisis of 2007–2008. However, now we have to stop our testing procedure. Nonetheless, for informational purposes, I provide the Sharpe-ratio differences for each of the seven pairs, the corresponding standard errors, and the associated t-statistics in Table 3. The reader can verify that it would have been hard to reject $H_0$, anyway. The problem is that every t-statistic must be greater than $\Phi^{-1}(1 - \alpha) = 1.6449$ in order to reject $H_0$, but this stringent condition is fulfilled only for Italy.
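The stopping argument above follows directly from the critical value: a negative Sharpe-ratio difference yields a negative t-statistic, which can never exceed $\Phi^{-1}(1 - \alpha)$. The sketch below expresses the test in terms of t-statistics. The function names and all numbers are hypothetical and are not taken from Table 3.

```python
from statistics import NormalDist
from typing import List, Sequence, Tuple


def critical_value(alpha: float) -> float:
    """One-sided normal critical value Phi^{-1}(1 - alpha)."""
    return NormalDist().inv_cdf(1.0 - alpha)


def iut_from_t_statistics(differences: Sequence[float],
                          standard_errors: Sequence[float],
                          alpha: float = 0.05) -> Tuple[List[float], bool]:
    """Return the t-statistics and whether H0 is rejected.

    differences[i] is the estimated Sharpe ratio of the benchmark minus
    that of alternative i; H0 is rejected only if every t-statistic
    exceeds the critical value.
    """
    t_stats = [d / se for d, se in zip(differences, standard_errors)]
    return t_stats, min(t_stats) > critical_value(alpha)


print(round(critical_value(0.05), 4))  # 1.6449

# Hypothetical case: the second alternative has a larger estimated Sharpe ratio.
print(iut_from_t_statistics([0.02, -0.01, 0.05], [0.03, 0.03, 0.02])[1])  # False
# Hypothetical case: all t-statistics exceed the critical value.
print(iut_from_t_statistics([0.09, 0.08], [0.03, 0.04])[1])  # True
```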
The lower part of Table 3 contains the standard errors of the Sharpe-ratio differences and the t-statistics that are calculated under the Jobson–Korkie assumption. Although the standard errors of $\eta_n$ that are obtained under the same distributional assumption are always lower than their nonparametric counterparts (see the last row of Table 2), the same effect cannot be observed regarding $\Delta\eta_n$. The Jobson–Korkie assumption underestimates the standard errors for some indices, but it overestimates them for other indices. All in all, it appears to be very difficult to compare investment strategies by historical observation because the given results are hardly ever significant if we apply a joint or a multiple hypothesis test (Frahm et al. 2012).

3. Conclusions

In portfolio optimization, we are often concerned with the question of whether a given investment strategy is optimal among a set of alternatives. In this work, I presented an intersection–union test for the null hypothesis that the benchmark is suboptimal in terms of the Sharpe ratio. The proposed test can easily be implemented. Furthermore, it accounts for serial dependence and it does not presume that asset returns are multivariate normally distributed. Thus, it is compatible with the stylized facts of empirical finance. However, an empirical study demonstrates that, in most practical applications, it is hard to reject the null hypothesis due to the lack of data.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Asymptotic Results

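We can write $\sigma = f(\sigma^2)$ with $f: \sigma^2 \mapsto \sqrt{\sigma^2}$. The first derivative of $f$ at $\sigma^2$ is $(2\sigma)^{-1}$. Hence, the asymptotic variance of $\sqrt{n}\,(\sigma_n - \sigma)$ is $\tau_L^2 (2\sigma)^{-2} = \tau_L^2 / (4 \sigma^2)$.
Furthermore, the Sharpe ratio can be written as $\eta = g(\mu, \sigma^2)$ with $g: (\mu, \sigma^2) \mapsto \mu / \sqrt{\sigma^2}$. We obtain
$$\frac{\partial g(\mu, \sigma^2)}{\partial \mu} = \frac{1}{\sigma} \quad \text{and} \quad \frac{\partial g(\mu, \sigma^2)}{\partial \sigma^2} = -\frac{\mu}{2 \sigma^3}.$$
Hence, the asymptotic variance of $\sqrt{n}\,(\eta_n - \eta)$ reads
$$\frac{\sigma_L^2}{\sigma^2} - 2 \cdot \frac{\mu\, \kappa_L}{2 \sigma^4} + \frac{\mu^2 \tau_L^2}{4 \sigma^6} = \frac{\sigma_L^2}{\sigma^2} - \eta\, \frac{\kappa_L}{\sigma^3} + \eta^2\, \frac{\tau_L^2}{4 \sigma^4}.$$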
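The partial derivatives used in the delta method can be checked numerically with central finite differences. This is only a sanity check under hypothetical inputs ($\mu = 1$, $\sigma^2 = 4$); the function names and the step size `h` are my own choices.

```python
import math
from typing import Tuple


def g(mu: float, sigma2: float) -> float:
    """Sharpe ratio as a function of the mean and the variance."""
    return mu / math.sqrt(sigma2)


def numerical_gradient(mu: float, sigma2: float, h: float = 1e-6) -> Tuple[float, float]:
    """Central finite differences of g with respect to mu and sigma^2."""
    d_mu = (g(mu + h, sigma2) - g(mu - h, sigma2)) / (2.0 * h)
    d_sigma2 = (g(mu, sigma2 + h) - g(mu, sigma2 - h)) / (2.0 * h)
    return d_mu, d_sigma2


def analytical_gradient(mu: float, sigma2: float) -> Tuple[float, float]:
    """(1 / sigma, -mu / (2 sigma^3)) as derived in the text."""
    sigma = math.sqrt(sigma2)
    return 1.0 / sigma, -mu / (2.0 * sigma ** 3)


mu, sigma2 = 1.0, 4.0  # hypothetical values
print(numerical_gradient(mu, sigma2))   # approximately (0.5, -0.0625)
print(analytical_gradient(mu, sigma2))  # (0.5, -0.0625)
```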
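Furthermore, if the components of $\{R_t\}$ are independent and identically distributed, we have that $\sigma_L^2 = \sigma^2$,
$$\kappa_L = \mathrm{Cov}\big(R_t, (R_t - \mu)^2\big) = E\big(R_t (R_t - \mu)^2\big) - \mu \sigma^2 = E\big((R_t - \mu)^3\big) + \mu \sigma^2 - \mu \sigma^2 = E\big((R_t - \mu)^3\big),$$
and $\tau_L^2 = \mathrm{Var}\big((R_t - \mu)^2\big) = E\big((R_t - \mu)^4\big) - \sigma^4$, i.e., $\kappa_L / \sigma^3 = \gamma_3$ and $\tau_L^2 / \sigma^4 = \gamma_4 - 1$. Thus, we conclude that
$$\frac{\sigma_L^2}{\sigma^2} - \eta\, \frac{\kappa_L}{\sigma^3} + \eta^2\, \frac{\tau_L^2}{4 \sigma^4} = 1 + \frac{\eta^2}{2} - \gamma_3\, \eta + \frac{\gamma_4 - 3}{4} \cdot \eta^2 .$$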
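Now, consider the asymptotic covariance matrix of
$$\sqrt{n} \begin{pmatrix} \eta_{1n} - \eta_1 \\ \eta_{2n} - \eta_2 \end{pmatrix}.$$
The above result immediately leads to
$$\omega_{11} = \frac{\sigma_{L1}^2}{\sigma_1^2} - \eta_1\, \frac{\kappa_{L1}}{\sigma_1^3} + \eta_1^2\, \frac{\tau_{L1}^2}{4 \sigma_1^4} \quad \text{and} \quad \omega_{22} = \frac{\sigma_{L2}^2}{\sigma_2^2} - \eta_2\, \frac{\kappa_{L2}}{\sigma_2^3} + \eta_2^2\, \frac{\tau_{L2}^2}{4 \sigma_2^4}.$$
Moreover, the asymptotic covariance between $\sqrt{n}\,(\eta_{1n} - \eta_1)$ and $\sqrt{n}\,(\eta_{2n} - \eta_2)$ is
$$\begin{aligned}
\omega_{12} = \omega_{21} &= \begin{pmatrix} \partial g(\mu_1, \sigma_1^2) / \partial \mu_1 \\ \partial g(\mu_1, \sigma_1^2) / \partial \sigma_1^2 \end{pmatrix}^{\!\top} \begin{pmatrix} \lambda_{11} & \lambda_{12} \\ \lambda_{21} & \lambda_{22} \end{pmatrix} \begin{pmatrix} \partial g(\mu_2, \sigma_2^2) / \partial \mu_2 \\ \partial g(\mu_2, \sigma_2^2) / \partial \sigma_2^2 \end{pmatrix} \\
&= \frac{\lambda_{11}}{\sigma_1 \sigma_2} - \frac{\mu_2 \lambda_{12}}{2 \sigma_1 \sigma_2^3} - \frac{\mu_1 \lambda_{21}}{2 \sigma_1^3 \sigma_2} + \frac{\mu_1 \mu_2 \lambda_{22}}{4 \sigma_1^3 \sigma_2^3} \\
&= \frac{\lambda_{11}}{\sigma_1 \sigma_2} - \frac{\eta_2 \sigma_1 \lambda_{12} + \eta_1 \sigma_2 \lambda_{21}}{2 \sigma_1^2 \sigma_2^2} + \frac{\eta_1 \eta_2 \lambda_{22}}{4 \sigma_1^2 \sigma_2^2}.
\end{aligned}$$
If the asset returns are serially independent, the large-sample (co-)variances coincide with their stationary counterparts. More precisely, it holds that $\sigma_{L1}^2 = \sigma_1^2$, $\sigma_{L2}^2 = \sigma_2^2$, and $\lambda_{11} = \sigma_{12}$. Moreover, by using some standard results for the multivariate normal distribution (Muirhead 1982, p. 43), we obtain $\kappa_{L1} = \kappa_{L2} = 0$, $\tau_{L1}^2 = 2 \sigma_1^4$, $\tau_{L2}^2 = 2 \sigma_2^4$, $\lambda_{12} = \lambda_{21} = 0$, and $\lambda_{22} = 2 \sigma_{12}^2$. Thus, we have that
$$\omega_{11} = \frac{\sigma_1^2}{\sigma_1^2} + \eta_1^2\, \frac{2 \sigma_1^4}{4 \sigma_1^4} = 1 + \frac{\eta_1^2}{2} \quad \text{and} \quad \omega_{22} = \frac{\sigma_2^2}{\sigma_2^2} + \eta_2^2\, \frac{2 \sigma_2^4}{4 \sigma_2^4} = 1 + \frac{\eta_2^2}{2}$$
as well as
$$\omega_{12} = \frac{\sigma_{12}}{\sigma_1 \sigma_2} + \eta_1 \eta_2\, \frac{2 \sigma_{12}^2}{4 \sigma_1^2 \sigma_2^2} = \rho_{12} + \frac{\eta_1 \eta_2 \rho_{12}^2}{2}.$$
This leads to the large-sample variance of $\Delta\eta_n$, i.e.,
$$\omega_{11} + \omega_{22} - 2 \omega_{12} = 2\,(1 - \rho_{12}) + \frac{\eta_1^2 + \eta_2^2 - 2 \eta_1 \eta_2 \rho_{12}^2}{2}.$$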

Appendix B. Correlograms

Figure A1. Correlograms with respect to $\{R_t\}$ of the EWP and each G–7 country.
Figure A2. Correlograms with respect to $\{(R_t - \mu_n)^2\}$ of the EWP and each G–7 country.

References

  1. Andrews, Donald W. K. 1991. Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica 59: 817–58.
  2. Berger, Roger L. 1997. Likelihood ratio tests and intersection–union tests. In Advances in Statistical Decision Theory and Applications. Edited by Subramanian Panchapakesan and Narayanaswamy Balakrishnan. Basel: Birkhäuser, pp. 225–37.
  3. Black, Fischer. 1976. Studies of stock price volatility changes. In Proceedings of the Business and Economics Section of the American Statistical Association. Washington: American Statistical Association, pp. 177–81.
  4. Bradley, Richard C. 2005. Basic properties of strong mixing conditions. A survey and some open questions. Probability Surveys 2: 107–44.
  5. Brockwell, Peter J., and Richard A. Davis. 1991. Time Series: Theory and Methods, 2nd ed. Berlin/Heidelberg: Springer.
  6. Burgess, Andrew N. 2000. Statistical arbitrage models of the FTSE 100. In Computational Finance. Edited by Yaser S. Abu-Mostafa, Blake LeBaron, Andrew Lo and Andreas Weigend. Cambridge: MIT Press, pp. 297–312.
  7. Conrad, Jennifer, and Gautam Kaul. 1998. An anatomy of trading strategies. The Review of Financial Studies 11: 489–519.
  8. DeMiguel, Victor, Lorenzo Garlappi, and Raman Uppal. 2009. Optimal versus naive diversification: How inefficient is the 1/N portfolio strategy? Review of Financial Studies 22: 1915–53. [Google Scholar] [CrossRef]
  9. Eagleson, Geoffrey K. 1975. On Gordin’s central limit theorem for stationary processes. Journal of Applied Probability 12: 176–79. [Google Scholar] [CrossRef]
  10. Federal Reserve Bank of St. Louis. 2018. Secondary Market 3-Month US Treasury Bill Rate. Available online: https://fred.stlouisfed.org (accessed on 25 February 2018).
  11. Fletcher, Jonathan. 2011. Do optimal diversification strategies outperform the 1/N strategy in U.K. stock returns? International Review of Financial Analysis 20: 375–85. [Google Scholar] [CrossRef]
  12. Frahm, Gabriel, and Uwe Jaekel. 2015. Tyler’s M-estimator in high-dimensional financial-data analysis. In Modern Nonparametric, Robust and Multivariate Methods. Edited by Klaus Nordhausen and Sara Taskinen. Berlin/Heidelberg: Springer, chp. 17, pp. 289–305.
  13. Frahm, Gabriel, Tobias Wickern, and Christof Wiechers. 2012. Multiple tests for the performance of different investment strategies. Advances in Statistical Analysis 96: 343–83.
  14. Hamilton, James Douglas. 1994. Time Series Analysis. Princeton: Princeton University Press.
  15. Hanke, Michael, and Spiridon Penev. 2018. Comparing large-sample maximum Sharpe ratios and incremental variable testing. European Journal of Operational Research 265: 571–79.
  16. Hayashi, Fumio. 2000. Econometrics. Princeton: Princeton University Press.
  17. Jobson, J. Dave, and Bob M. Korkie. 1981. Performance hypothesis testing with the Sharpe and Treynor measures. Journal of Finance 36: 889–908.
  18. Jorion, Philippe. 1985. International portfolio diversification with estimation risk. Journal of Business 58: 259–78.
  19. Ledoit, Olivier, and Michael Wolf. 2008. Robust performance hypothesis testing with the Sharpe ratio. Journal of Empirical Finance 15: 850–59.
  20. Lo, Andrew W. 2002. The statistics of Sharpe ratios. Financial Analysts Journal 58: 36–52.
  21. Low, Rand Kwong Yew, Robert Faff, and Kjersti Aas. 2016. Enhancing mean variance portfolio selection by modeling distributional asymmetries. Journal of Economics and Business 85: 49–72.
  22. Memmel, Christoph. 2003. Performance hypothesis testing with the Sharpe ratio. Finance Letters 1: 21–23.
  23. Menkhoff, Lukas, Lucio Sarno, Maik Schmeling, and Andreas Schrimpf. 2012. Currency momentum strategies. Journal of Financial Economics 106: 660–84.
  24. Mertens, Elmar. 2002. Comments on Variance of the iid Estimator in Lo. Technical Report. Basel: University of Basel.
  25. MSCI. 2018. End of day data, Country. Available online: https://www.msci.com/end-of-day-data-country (accessed on 25 February 2018).
  26. Muirhead, Robb J. 1982. Aspects of Multivariate Statistical Theory. Hoboken: John Wiley.
  27. Opdyke, John Douglas. 2007. Comparing Sharpe ratios: So where are the p-values? Journal of Asset Management 8: 308–36.
  28. Politis, Dimitris N. 2003. The impact of bootstrap methods on time series analysis. Statistical Science 18: 219–30.
  29. Romano, Joseph P., and Michael Wolf. 2005. Stepwise multiple testing as formalized data snooping. Econometrica 73: 1237–82.
  30. Roy, Samarendra Nath. 1953. On a heuristic method of test construction and its use in multivariate analysis. Annals of Mathematical Statistics 24: 220–38.
  31. Sawik, Bartosz. 2012. Downside risk approach for multi-objective portfolio optimization. In Operations Research Proceedings 2011. Edited by D. Klatte, H.-J. Lüthi and K. Schmedders. Berlin/Heidelberg: Springer, pp. 191–96.
  32. Schmid, Friedrich, and Rafael Schmidt. 2009. Statistical inference for Sharpe’s ratio. In Interest Rate Models, Asset Allocation and Quantitative Techniques for Central Banks and Sovereign Wealth Funds. Edited by Arjan B. Berkelaar, Joachim Coche and Ken Nyholm. Basingstoke: Palgrave Macmillan, pp. 337–57.
  33. Sen, Pranab K., and Mervyn J. Silvapulle. 2002. An appraisal of some aspects of statistical inference under inequality constraints. Journal of Statistical Planning and Inference 107: 3–43.
  34. Sharpe, William F. 1966. Mutual fund performance. Journal of Business 39: 119–38.
  35. Shen, Qian, Andrew C. Szakmary, and Subhash C. Sharma. 2007. An examination of momentum strategies in commodity futures markets. Journal of Futures Markets 27: 227–56.
  36. Szakmary, Andrew C., Qian Shen, and Subhash C. Sharma. 2010. Trend-following trading strategies in commodity futures: A re-examination. Journal of Banking and Finance 34: 409–26.
  37. van der Vaart, Aad W. 1998. Asymptotic Statistics. Cambridge: Cambridge University Press.
  38. Vrugt, Evert B., Rob Bauer, Roderick Molenaar, and Tom Steenkamp. 2004. Dynamic Commodity Timing Strategies. Technical Report. Rochester: SSRN.
  39. Zagrodny, Dariusz. 2003. An optimality of change loss type strategy. Optimization 52: 757–72.
1. A different question is whether some asset universe allows the investor to achieve a higher performance compared to another asset universe (Hanke and Penev 2018).
2. In order to identify the outperforming strategies, we would have to apply a multiple test. For more details on that topic, see Frahm et al. (2012) as well as Romano and Wolf (2005).
3. Any capital income that occurs during Period $t$ is considered part of $P_t$.
4. The total returns have been retrieved from the MSCI webpage (MSCI 2018).
5. The data have been obtained from the Federal Reserve Bank of St. Louis (FRED 2018).
6. The only exception is Japan, where we can find a relatively large Q-statistic of 31.7637.
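Table 1. Variance ratios.

| | EWP | Canada | France | Germany | Italy | Japan | UK | USA |
|---|---|---|---|---|---|---|---|---|
| $\sigma_{Ln}^2 / \sigma_n^2$ | 1.4987 | 1.0299 | 1.2036 | 1.1255 | 1.6913 | 2.1828 | 1.2720 | 1.0118 |
| $\tau_{Ln}^2 / \tau_n^2$ | 2.5962 | 2.7550 | 2.3081 | 2.9514 | 2.3707 | 2.8368 | 2.5027 | 2.6202 |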
Table 2. Means, standard deviations, and Sharpe ratios for the EWP and the G–7 countries. The standard errors are given in the rows labeled SE.

| | EWP | Canada | France | Germany | Italy | Japan | UK | USA |
|---|---|---|---|---|---|---|---|---|
| $\mu_n$ | 0.0053 | 0.0052 | 0.0062 | 0.0060 | 0.0033 | 0.0054 | 0.0052 | 0.0057 |
| $\mathrm{SE}(\mu_n)$ | 0.0023 | 0.0024 | 0.0029 | 0.0028 | 0.0040 | 0.0037 | 0.0020 | 0.0026 |
| $\sigma_n$ | 0.0461 | 0.0560 | 0.0640 | 0.0627 | 0.0732 | 0.0599 | 0.0436 | 0.0620 |
| $\mathrm{SE}(\sigma_n)$ | 0.0030 | 0.0040 | 0.0037 | 0.0041 | 0.0038 | 0.0035 | 0.0028 | 0.0077 |
| $\eta_n$ | 0.1149 | 0.0923 | 0.0971 | 0.0961 | 0.0449 | 0.0898 | 0.1202 | 0.0927 |
| $\mathrm{SE}(\eta_n)$ | 0.0581 | 0.0462 | 0.0492 | 0.0479 | 0.0537 | 0.0624 | 0.0548 | 0.0508 |
| $\mathrm{SE}_{\mathrm{JK}}(\eta_n)$ | 0.0419 | 0.0417 | 0.0417 | 0.0417 | 0.0417 | 0.0417 | 0.0418 | 0.0417 |
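
As a plausibility check on Table 2, the reported Sharpe ratios can be recomputed from the rounded means and standard deviations. The following minimal sketch assumes that $\eta_n = \mu_n / \sigma_n$, i.e., that $\mu_n$ already denotes the mean excess return. The variable names are illustrative and not taken from the article, and small deviations are to be expected because the inputs are rounded to four decimals.

```python
# Values transcribed from Table 2 (rounded to four decimals in the source).
names = ["EWP", "Canada", "France", "Germany", "Italy", "Japan", "UK", "USA"]
mu = [0.0053, 0.0052, 0.0062, 0.0060, 0.0033, 0.0054, 0.0052, 0.0057]
sigma = [0.0461, 0.0560, 0.0640, 0.0627, 0.0732, 0.0599, 0.0436, 0.0620]
eta_reported = [0.1149, 0.0923, 0.0971, 0.0961, 0.0449, 0.0898, 0.1202, 0.0927]

# Assumption: the Sharpe ratio is the mean (excess) return divided by
# the standard deviation of the return.
eta_recomputed = [m / s for m, s in zip(mu, sigma)]

# Largest absolute gap between the recomputed and the reported Sharpe ratios.
max_abs_dev = max(abs(r - e) for r, e in zip(eta_recomputed, eta_reported))

# Strategy with the largest reported Sharpe ratio.
best = names[eta_reported.index(max(eta_reported))]

for name, rec, rep in zip(names, eta_recomputed, eta_reported):
    print(f"{name:8s} recomputed={rec:.4f} reported={rep:.4f}")
print("largest deviation:", round(max_abs_dev, 4))
print("largest reported Sharpe ratio:", best)
```

The recomputed ratios agree with the reported ones up to the rounding of the inputs. Only the UK has a larger estimated Sharpe ratio than the EWP.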
Table 3. Sharpe ratio differences, standard errors, and t-statistics.

| | Canada | France | Germany | Italy | Japan | UK | USA |
|---|---|---|---|---|---|---|---|
| $\Delta\eta_n$ | 0.0226 | 0.0178 | 0.0187 | 0.0700 | 0.0251 | −0.0053 | 0.0222 |
| $\mathrm{SE}(\Delta\eta_n)$ | 0.0213 | 0.0317 | 0.0419 | 0.0269 | 0.0374 | 0.0381 | 0.0376 |
| $t$ | 1.0635 | 0.5598 | 0.4472 | 2.6054 | 0.6718 | −0.1397 | 0.5891 |
| $\mathrm{SE}_{\mathrm{JK}}(\Delta\eta_n)$ | 0.0291 | 0.0227 | 0.0257 | 0.0299 | 0.0354 | 0.0290 | 0.0274 |
| $t_{\mathrm{JK}}$ | 0.7758 | 0.7821 | 0.7298 | 2.3420 | 0.7083 | −0.1833 | 0.8089 |
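
The t-statistics in Table 3 can be reproduced, up to rounding, by dividing each Sharpe ratio difference by its standard error. The sketch below does this for both sets of standard errors and then applies an intersection–union decision rule. It rests on several assumptions that are not stated in the table itself: the differences are taken as EWP minus country, the rule rejects only if every individual t-statistic exceeds a one-sided standard normal critical value, and the 5% level is chosen purely for illustration. All variable names are hypothetical.

```python
from statistics import NormalDist

# Values transcribed from Table 3 (rounded to four decimals in the source).
names = ["Canada", "France", "Germany", "Italy", "Japan", "UK", "USA"]
delta_eta = [0.0226, 0.0178, 0.0187, 0.0700, 0.0251, -0.0053, 0.0222]
se = [0.0213, 0.0317, 0.0419, 0.0269, 0.0374, 0.0381, 0.0376]
se_jk = [0.0291, 0.0227, 0.0257, 0.0299, 0.0354, 0.0290, 0.0274]
t_reported = [1.0635, 0.5598, 0.4472, 2.6054, 0.6718, -0.1397, 0.5891]
t_jk_reported = [0.7758, 0.7821, 0.7298, 2.3420, 0.7083, -0.1833, 0.8089]

# Each t-statistic is the Sharpe ratio difference over its standard error.
t_recomputed = [d / s for d, s in zip(delta_eta, se)]
t_jk_recomputed = [d / s for d, s in zip(delta_eta, se_jk)]

max_dev_t = max(abs(a - b) for a, b in zip(t_recomputed, t_reported))
max_dev_t_jk = max(abs(a - b) for a, b in zip(t_jk_recomputed, t_jk_reported))

# Assumed intersection-union rule: the benchmark is declared optimal only
# if ALL individual tests reject, i.e., if the smallest t-statistic exceeds
# the one-sided critical value (illustrative level: 5%).
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha)
min_t = min(t_reported)
weakest = names[t_reported.index(min_t)]
reject = min_t > critical

print(f"critical value: {critical:.4f}")
print(f"smallest t-statistic: {min_t} ({weakest})")
print("benchmark significantly optimal:", reject)
```

Under these assumptions the smallest t-statistic belongs to the UK and is negative, so the rule does not reject, although the individual comparison with Italy exceeds the critical value. This illustrates why a single strong pairwise result is not sufficient for an intersection–union test.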

Share and Cite

MDPI and ACS Style

Frahm, G. An Intersection–Union Test for the Sharpe Ratio. Risks 2018, 6, 40. https://doi.org/10.3390/risks6020040
