3.1. Consistency Features
We assume that the unobserved distribution function is estimated by the empirical measure from a sample of time-series observations . The payoffs generally are not stationary random variables because they include trend components. This statistical problem can be eliminated or mitigated by transforming the payoffs to gross investment returns and transforming the asset holdings to portfolio weights (for hosts) and (for overlays), which do not change the underlying investment decision-making problem. In order to isolate the effect of sampling variation on the stochastic enhancement restrictions, we also assume that the restrictions that define K and are deterministic and thus are not affected by sampling error.
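The payoff-to-return and holdings-to-weights transformations can be sketched as follows; the single-period setup and the array names are illustrative assumptions, not the paper's notation:

```python
def to_returns_and_weights(payoffs, prices, holdings):
    """Map payoffs to gross investment returns (payoff per unit of price)
    and asset holdings to portfolio weights (value shares).
    Illustrative single-period sketch; names are hypothetical."""
    gross_returns = [x / p for x, p in zip(payoffs, prices)]
    position_values = [h * p for h, p in zip(holdings, prices)]
    total = sum(position_values)
    weights = [v / total for v in position_values]
    return gross_returns, weights
```

Because gross returns are ratios of payoffs to prices, trend components in payoff levels are scaled out, which is the stationarity-restoring effect described above.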
The following empirical weak SAO set is constructed:
This empirical counterpart of the latent set of population SAOs has several attractive statistical consistency features under three general assumptions that are stated below.
Assumption 1. (Stationarity and Mixing). The process is stationary and absolutely regular. The asymptotic order of its mixing coefficients is for some .
These regularity conditions are compatible with stationary versions of ARMA-GARCH or stochastic volatility models, or more generally, processes that satisfy a general class of stochastic recurrence equations; see, for example, [23] (Section 6).
Assumption 2. (Lipschitz Continuity). The population goal is Lipschitz continuous in δ with a Lipschitz coefficient that is continuous at w.r.t. weak convergence.
Given the boundedness of the support of , Lipschitz continuity is easy to establish for standard goals such as expected return and expected utility. A uniform integrability condition implied by the boundedness of the support ensures the required continuity of the coefficient in these cases. The objective function of [13] (the weighted average of individual appraisal ratios) also satisfies the assumption due to the CMT and the aforementioned uniform integrability; the Lipschitz coefficient is in this case a continuous function of moments, as long as the variance involved is strictly positive.
The final assumption generalizes the weak independence assumption of [9]. It is useful to avoid consistency problems that may arise for the empirical solutions due to binding inequalities (equivalences), or , for some non-constant utility functions .
Assumption 3. (Joint Enhancement). For any non-constant , there exists some that solves (3), such that for all . The joint enhancement (henceforth, JE) assumption essentially ensures the existence of some neighborhood of every weak SAO that contains strict SAOs , if strict SAOs exist. JE allows for different for different utility functions v instead of a single for all such utilities. It mitigates the adverse effect of the binding inequalities on consistency. Given any weak SAO for which for some , consider the portfolio defined using the Lebesgue integral , where is the set of strictly monotone Borel measures defined on , and is the degenerate measure at v, . JE implies that for all , and by choosing c appropriately small, it can be chosen to lie as close to as desired.
Various consistency results are obtained based on the above assumptions. The limit theory is derived as . The squiggly arrow (⇝) represents convergence in distribution. The analysis also uses Painlevé–Kuratowski (PK) convergence for sequences of compact subsets of . With high probability (w.h.p.) refers to probability converging to one. represents the set of optimal weak SAOs, i.e., . Given , represents the set of -optimal weak SAOs, i.e., . denotes the set of trivial solutions ; it is obviously a subset of the enlargement .
Theorem 1. Under Assumptions 1–3 the following results are obtained: (i) if and JE occurs, then for any and for any , such that , we have that ; (ii) if , then .
Proof of Theorem 1. Given that any element of belongs to and that when non-trivial weak SAOs do not exist, then , if , then for which . Assumption 1 and the compactness of , along with the FCLT of [24] (Corollary 4.1), imply that , which establishes Theorem 1 (ii). Suppose next that non-trivial weak SAOs exist. Let and . Stationarity and mixing of and the compactness of , with Corollary 2.E of [25], imply that we need to consider only fixed in our derivations. For any that does not have any trivial equivalences, we have that for all , thus for all . Suppose then that has non-trivial equivalences. Then, for any that does not correspond to some equivalence or to some , the previous analysis holds. If and u corresponds to some non-trivial equivalence, then, for a large enough T, consider the strong SAO for L the Lipschitz coefficient of u. Here, . Notice that since (i) is concave, (ii) by construction and (iii) due to joint enhancement, we have that and thereby and, due to Assumption 1, the CMT and the Portmanteau Theorem, the of the latter probability is greater than or equal to . Hence, we have that and . The previous arguments then imply that, in all cases, lies in the of w.h.p. Now, due to the definitions of and the Lipschitz continuity of , we obtain where the first inequality follows from (7). This implies that . Obviously . Any other element of that lies in the empirical weak SAO set with asymptotically positive probability will necessarily converge to some element of . The preceding arguments establish Theorem 1 (i), since there cannot exist accumulation points of sequences of elements of that lie outside . □
The following result complements Theorem 1 by showing that the empirical optimal solutions also approximate the original as .
Theorem 2. (Empirical Solution Properties). Under the premises of Theorem 1, the following results are obtained: (i) where if and joint enhancement (7) occurs, and if ; (ii) if and JE occurs, then every limit of any subsequence of elements of lies in ; (iii) if , then .

Proof of Theorem 2. Assumption 1, the compactness of , along with the FCLT of [24] (Corollary 4.1), the concavity of and Skorokhod representations applicable due to Theorem 3.7.25 of [26] imply the w.h.p. epi-convergence of the latter to , due to Corollary 2.E of [25]. Then, the results in (i), (ii) and (iii) follow from Theorem 1, employing Skorokhod representations applicable due to Theorem 3.7.25 of [26], using Proposition 3.2 in Chapter 5 of [27], and then reverting to the original probability space. Specifically for (iii): since, w.h.p., for any , due to Assumption 2, is constant on w.h.p., it follows that any will not lie in , and thereby not in , w.h.p.; any lies in a.s. for all T and, due to the constancy of G, it also lies in w.h.p. □
The above consistency properties imply that the probability of each of two possible types of decision errors tends to zero: (I) the selected portfolio is an empirical SAO but not a population SAO; (II) the selected portfolio is optimal under the empirical measure but suboptimal under the population distribution because the population optimum is not an empirical SAO.
For a type I error, the following reasoning applies: if , then there exists at least one strict inequality , for some . Since strict inequalities are asymptotically unaffected by sampling error in our stationary and ergodic framework, the probability that such a is falsely included in the empirical SAO set is asymptotically negligible.
As far as a type II error is concerned, there is a non-vanishing probability of false exclusion of non-strict SAOs , which feature contacts for some . However, the neighborhood of these non-robust solutions includes strict SAOs that feature asymptotically strict (scaled) empirical inequalities for all , and that are therefore included in with high probability.
When and are used in place of and , the results above still hold as long as and , and therefore converges to in the Painlevé–Kuratowski topology. This is true even in the case where the latter convergence is in probability, e.g., whenever consistent estimators for a potentially latent upper bound of the support are used in the discretization.
3.2. Empirical Likelihood Ratio Test for Being an SAO
In their empirical analysis, ref. [13] construct arbitrage portfolios that are SAOs in a given sample and test whether these are population SAOs out of the sample. They thus evaluate a single, optimized overlay with portfolio weights that are fixed and known out of sample, thereby avoiding distortions of the test size stemming from overfitting the data.
In this context, the obvious choice of the null hypothesis seems to be . This specification is further supported by the theoretical prediction that a portfolio that is optimized subject to empirical SAO restrictions converges to a true population SAO, if SAOs exist. Therefore, we discuss the testing of the null before discussing the testing of the alternative in the second part of this section.
The use of is also consistent with the usual practice for statistical tests of pairwise dominance. Standard tests such as those of [28] focus on null dominance, due to limiting degeneracies under the alternative of non-dominance. The current null provides a generalization of the standard specification to multiple pairwise dominance relations between combined portfolios and the underlying hosts.
Standard tests for pairwise dominance are usually based on Kolmogorov–Smirnov or Cramér–von Mises statistics for measuring violations of the empirical moment inequalities. These statistics are analytically convenient due to their computational tractability for lattice distributions and their straightforward compatibility with tools of asymptotic analysis such as Glivenko–Cantelli and Donsker theorems for uniform convergence, as well as generalized delta methods. Rejection regions are usually approximated by re-sampling methods that allow for asymptotically exact and consistent inference under stationarity and mixing.
Unfortunately, the Kolmogorov–Smirnov and Cramér–von Mises statistics are relatively inefficient for financial data sets with short time series of non-overlapping, low-frequency returns and broad cross-sections. Re-sampling can also be computationally costly if large optimization problems need to be solved for thousands of pseudo-samples or sub-samples.
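For concreteness, a one-sided Kolmogorov–Smirnov-type violation measure for a single pairwise dominance comparison can be computed as below; this is a generic textbook construction, not the specific statistic of [28]:

```python
def ks_dominance_stat(x, y):
    """One-sided Kolmogorov-Smirnov-type statistic: the largest amount by
    which the empirical CDF of x exceeds that of y over the pooled sample.
    A value of 0 is consistent with x first-order dominating y in-sample."""
    pooled = sorted(set(x) | set(y))
    def ecdf(sample, t):
        return sum(1 for s in sample if s <= t) / len(sample)
    return max(ecdf(x, t) - ecdf(y, t) for t in pooled)
```

A value of zero means the first sample's empirical CDF never lies above the second's, i.e., no first-order violation occurs in-sample.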
To enhance statistical efficiency in finite samples and reduce computational burden, an alternative approach is adopted here based on BEL. A similar approach is used in the tests for non-dominance, efficiency and optimality by [18,19,20]. The proposed approach generalizes these earlier studies by allowing for multiple host portfolios, testing both the null and the alternative, and generalizing the data dependence structure.
A prominent role in our analysis is played by the ‘contact set’, or set of the binding inequalities, . Its cardinality is represented by ; this number is crucial for the approximation of the rejection region of the testing procedure. The analysis below describes sufficient conditions for finiteness of for every member of as well as a modification of the procedure that accounts for the possibility of an infinite contact set.
To account for temporal dependence, the sample is subdivided into potentially overlapping blocks of B consecutive observations, , , with . L is considered to be independent of T, and B is assumed to diverge at a rate slower than . The optimal choice of the block size B is case-dependent and usually involves a trade-off between the data dynamics and the number of asymptotically independent blocks.
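The overlapping-block construction can be sketched as follows; the list-based sample and the function name are illustrative assumptions only:

```python
def overlapping_blocks(sample, block_size):
    """Return all overlapping blocks of `block_size` consecutive
    observations; a sample of length T yields T - B + 1 blocks."""
    T, B = len(sample), block_size
    return [sample[t:t + B] for t in range(T - B + 1)]
```

Non-overlapping blocks would instead step by B; the limit theory above only requires B to diverge slowly relative to the sample size.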
In the following, denotes the set of probability distributions on the set of blocks, and denotes the empirical measure, i.e., for any block . The test statistic measures the smallest possible adjustments to the probability mass function of that ensure that the evaluated overlay is a weak SAO; the optimality property is w.r.t. the divergence from the adjusted distribution to the empirical measure of the blocks. The BELR test statistic can be computed based on a solution to the following minimum relative entropy (MRE) problem:
In this expression, denotes the Kullback–Leibler divergence: . This measure gives a well-known information-theoretic representation of the dissimilarity between two measures sharing a common support (see [29]). For , the finite system (5) is used so that the MRE problem (9) is approximated by an optimization problem that involves a finite system of moment inequalities:
Since any is discrete, only the probability mass levels at the blocks , , need to be determined, and the minimum relative entropy problem reduces to a convex optimization problem, given the formulation in system (4): where .
Inference on can be based on the BELR test statistic , where is a solution of the variational problem (9). The exact limit null distribution of ELR is a chi-bar-squared distribution under the aforementioned assumptions about the data and the blocks. This null distribution is not directly implementable in rejection region analysis because its mixing weights depend on the latent . An asymptotically conservative test can, however, be obtained via majorizing chi-squared distributions and methods of moment selection.
Specifically, the limiting null distribution is dominated by , the degrees of freedom of which equal the number of binding inequalities (the cardinality of the contact set) whenever finite. The number of contacts can be consistently estimated by the number of empirical moment conditions that are approximately binding, or where the slack is a potentially degenerate random variable that weakly converges to zero at an appropriate rate.
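The estimated number of contacts can be sketched as a simple thresholding rule; the tolerance rate sqrt(log T / T) and the constant c0 are illustrative assumptions in the spirit of vanishing-slack selection, not values prescribed by the text:

```python
import math

def estimated_contacts(slacks, T, c0=1.0):
    """Count the empirical moment inequalities that are approximately
    binding: slack below the vanishing tolerance c0 * sqrt(log(T) / T).
    The constant c0 is a tuning choice, hypothetical here."""
    tol = c0 * math.sqrt(math.log(T) / T)
    return sum(1 for s in slacks if abs(s) <= tol)
```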
Consequently, an asymptotically conservative rejection region can be formed using the stochastic distribution , so that the test size, or the probability of false rejection of an SAO, is asymptotically less than or equal to the nominal significance level.
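Given an estimated contact count, a conservative chi-squared critical value can be computed; the sketch below is restricted to an even number of degrees of freedom purely so that the chi-squared tail has a closed Poisson-sum form and the example stays self-contained:

```python
import math

def chi2_sf_even(x, k):
    """Survival function of chi-squared with even dof k:
    P(X > x) = exp(-x/2) * sum_{i=0}^{k/2 - 1} (x/2)^i / i!."""
    assert k % 2 == 0 and k > 0
    term, total = 1.0, 1.0             # i = 0 term
    for i in range(1, k // 2):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total

def chi2_critical_even(k, alpha=0.05, hi=1e3):
    """Upper-alpha chi-squared quantile by bisection: a conservative
    critical value when the estimated contact count is k (even)."""
    lo = 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, k) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For general degrees of freedom, a library quantile function (e.g., scipy.stats.chi2.ppf) would be used instead.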
The following result derives the limit theory of the testing procedure defined by (10) and the rejection of if , where represents the quantile of the stochastic distribution for . The result shows asymptotic conservatism and consistency.
Theorem 3. Suppose that Assumption 1 holds, and that (a) CS is finite; (b) there exists some such that, where represents the minimum eigenvalue of ; (c) the tolerance parameter satisfies , while almost surely; (d) the block size satisfies and for , and L is independent of T; and (e) with T, so that converges to a non-stochastic dense subset of in probability, and . Then, the following results are obtained: (i) where is a zero-mean Gaussian vector with covariance matrix ; (ii) under , while under , ; (iii) under , .

Proof of Theorem 3. Equation (11) follows as in the proof of Theorem 4.2.1 of [30]. Consider the case . Uniformly w.r.t. the , it is found that, due to the definition of the tolerance parameter and Birkhoff's ULLN, , eventually, almost surely. Using Skorokhod representations, since diverges to infinity almost surely, uniformly over the set , then , eventually, almost surely. The previous along with (e) imply that , jointly with , and thereby we obtain that due to Proposition 3.4.1 of [31], where represents the polar cone of . Then, the Portmanteau Theorem establishes (12). For the case , (11) implies that is eventually zero w.h.p., hence (13) follows. Furthermore, under the alternative, the proof of Theorem 4.2.1 of [30] implies that . Thereby, the growth condition on the approximation of along with (c), and via the use of Skorokhod representations, imply that the modified statistic diverges to infinity, while the quantile is almost surely bounded, hence (14) follows. □
The test is consistent under the alternative hypothesis due to the divergence to infinity of the test statistic and the boundedness from above of the quantiles used as critical values. However, the test is asymptotically conservative, due to the majorizing properties of the critical values and the restricted limiting behavior of the slacks.
For any that does not belong to the generic set of SAOs with finite contacts discussed above, the contact set finiteness condition (a) holds whenever is analytic in the Russell–Seo threshold parameter for every extreme point . Analyticity would in turn be obtained if has an analytic density, due to our bounded support framework.
Under the weaker assumption of a continuously differentiable density, another path for obtaining (a) is to ensure that the Hessian of w.r.t. has a finite number of zeros. For example, when and is the set of Russell–Seo elementary utilities, using [32], we obtain that the Hessian equals . Using the diffeo-geometric analysis in Paragraph 5 of [33], we then obtain that the Hessian is not zero whenever is of the same sign for almost every value of on the boundary of , locally uniformly in the null hypothesis.
The strictly positive minimum eigenvalue condition (b) is necessary for sequential convergence of the test statistic under the null hypothesis. Given (a), and if trivial contacts with zero empirical variance are excluded from the analysis, this condition holds whenever the random vector has a full-rank covariance matrix; this can, for example, be tested via the characteristic-roots-based tests of [34] using the empirical covariance matrix of the non-trivial empirical contacts.
The restriction on the asymptotic behavior of the tolerance parameter (c) is usual in the econometric literature; see [35] and references therein. The restriction on the block size divergence rate (d) is also standard (see, for example, Theorem 3 of [36]).
The current first-order limit theory is silent about the optimal choice of the block size B beyond specifying general rates of convergence. Finer details could be obtained by higher-order asymptotics, and it is expected that optimality depends crucially on the temporal-dependence structure of the underlying returns process (see, for example, Section 2 of [37]). The optimal B can be approximated by some empirical variance minimization method (see, for example, [38]).
For the important case of n-th degree SD, or , condition (e) is satisfied when, e.g., the empirical support is partitioned into sub-intervals, for some , and the Russell–Seo thresholds are placed at the boundaries of those sub-intervals.
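An equal-width placement of Russell–Seo thresholds at the boundaries of such a partition can be sketched as follows; the exponent kappa standing in for the elided partition rate is a hypothetical choice:

```python
import math

def russell_seo_thresholds(lo, hi, T, kappa=0.5):
    """Partition the empirical support [lo, hi] into ceil(T**kappa)
    equal-width sub-intervals and place thresholds at the boundaries.
    kappa is an illustrative stand-in for the elided partition rate."""
    m = math.ceil(T ** kappa)
    step = (hi - lo) / m
    return [lo + i * step for i in range(m + 1)]
```

As T grows, the threshold grid becomes dense in the support, which is the denseness required by condition (e).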
Whenever the CS is infinite and (a) fails, a simple modification of the test statistic via the estimated number of contacts would lead to a standard normal limiting null bound distribution. Specifically, performing the test based on the rejection rule can be proven to satisfy (ii) and (iii) of the previous result if V has a spectrum bounded away from zero and (c), (d) and (e) hold. This is obvious from Theorem 3 whenever is finite. Whenever is infinite, the spectrum condition and the fact that is finite for any T imply that, under the null hypothesis, the modified statistic will be bounded above by a standard normal distribution. The lhs of the rejection rule can be shown to converge in distribution to the quantiles of the standard normal. Furthermore, under the alternative, the proof of Theorem 4.2.1 of [30] implies that . Thereby, the growth condition on the approximation of along with (c), and via the use of Skorokhod representations, imply that the modified statistic diverges to infinity. Hence, under the particular restriction on the growth rate of , we obtain a robust modification of the original test that avoids (a).
In cases where contains utilities that correspond to contacts with zero empirical variance, the spectrum boundedness away from zero condition can have a restricted stochastic dominance interpretation. For example, when is the set of Russell–Seo utilities (see [21]), the fact that the population support infimum a is finite implies that, for the condition to hold, a set of thresholds to the right of a should be excluded from the analysis. The analyst can exclude the thresholds in the interval , where is the empirical support infimum and is sufficiently small to ensure that the excluded Russell–Seo utilities are inconsequential, and then test whether the condition holds for the remaining utilities using the rank tests of [39] on the empirical covariance of the empirical contacts.
3.3. Testing the Alternative of Not Being an SAO
The hypothesis generally allows for the construction of tests that are locally more powerful in fixed samples than the alternative or , because data variation is generally more likely to yield false non-dominance classifications than false dominance classifications; the existence of a single inequality with the 'wrong' sign suffices for the system of dominance inequalities to be violated. Nevertheless, the conservative nature of the proposed test for could compromise power at the boundary between the two hypotheses; a possible local lack of power of the BELR test for the null could be mitigated using a second test that evaluates the alternative.
For this purpose, the analysis is completed here with the design of an ELR test for the alternative hypothesis . For finite and , the condition amounts to a finite system: The condition requires violations of this system. Violations occur whenever or when . In the latter case, violations would imply asymptotic non-tightness for Kolmogorov–Smirnov or Cramér–von Mises type statistics under the alternative. Violations can, however, be additionally characterized as solutions to the alternative system ; . Non-trivial binding inequalities for the alternative system always exist, and this implies tightness for the testing procedure that is described below. Using the block structure above, the relevant relative entropy problem follows:
This problem is bi-linear in and hence bi-convex. It can be solved using an alternating direction method of multipliers (ADMM) algorithm that alternates between optimizing w.r.t. and optimizing w.r.t. .
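Because problem (15) is elided here, the following is only a schematic sketch of the alternating idea on a bilinearly coupled surrogate: minimize KL(q, uniform) + q.(G b) with q a probability vector over blocks and b in the simplex over moments. The q-step has a closed Gibbs-tilt form and the b-step is linear over the simplex, so it is attained at a vertex; full ADMM would add a multiplier update, omitted for brevity.

```python
import math

def alternating_min(G, iters=50):
    """Alternating exact block minimization of
    F(q, b) = KL(q || uniform) + q.(G b), with b in the simplex.
    q-step: q_i proportional to exp(-(G b)_i) (closed form).
    b-step: linear in b, so minimized at a simplex vertex.
    Schematic surrogate, not the paper's exact problem (15)."""
    n, J = len(G), len(G[0])
    b = [1.0 / J] * J
    q = [1.0 / n] * n
    for _ in range(iters):
        # q-step: Gibbs tilt against the b-weighted moments
        scores = [sum(G[i][j] * b[j] for j in range(J)) for i in range(n)]
        w = [math.exp(-s) for s in scores]
        z = sum(w)
        q = [wi / z for wi in w]
        # b-step: place all mass on the moment with the smallest q-mean
        col_means = [sum(q[i] * G[i][j] for i in range(n)) for j in range(J)]
        jstar = min(range(J), key=lambda j: col_means[j])
        b = [1.0 if j == jstar else 0.0 for j in range(J)]
    return q, b
```

Each block update is an exact minimization, so the surrogate objective is non-increasing along the iterations, which is the usual convergence rationale for bi-convex alternating schemes.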
The contact set associated with testing the alternative hypothesis is defined as . As noted above, this is non-empty even in cases where and no utilities associated with trivial contacts of zero empirical variance appear in the analysis. For testing the alternative, non-finiteness of the contact set is more common than for testing the null; the alternative hypothesis imposes moment inequalities of differing signs and may also contain binding moment inequalities. In this case, there exist infinitely many convex combinations of the inequalities involved that produce zeros. Thus, inference is performed by modifying the BELR statistic via translation and scaling by the number of empirical contacts and the finite approximation of the simplex in which the parameter attains its values. The rejection rule used is where , with the solution of (15), and, similarly to the previous, is the number of empirical contacts, and is a (potentially stochastic) finite discretization of the ( )-simplex that converges in probability to a dense subset of the ( )-simplex as .
The limit theory of this testing procedure is derived in the following result.
Theorem 4. Suppose that Assumption 1 and conditions (c) and (d) of Theorem 3 hold, with T, while converges in the Painlevé–Kuratowski topology to a non-stochastic, dense subset of in probability, converges in probability in the Painlevé–Kuratowski topology to a dense, non-stochastic subset of the ( )-simplex, and the spectrum of , where the parameter β lies in the set , is bounded away from zero; here, represents the Painlevé–Kuratowski limit in probability of . Then, the following results are obtained:
- (i)
- (ii)
under , and if, moreover, ,
Proof of Theorem 4. Analogous to the proof of Theorem 3, the distance between the number of contacts in and in its PK-limit can be shown to converge to zero in probability. Using this and arguments analogous to those concerning empirical process convergence, Skorokhod representations and bounds on infima of quadratic forms over cones in the proof of Theorem 4.2.1 of [30], it can be shown that, under the null, the test statistic is asymptotically bounded from above by a random variable that weakly converges to a standard normal. Under the alternative, the relevant part of the proof of Theorem 4.2.1 of [30] implies that . Thereby, the growth condition on the approximation of along with (c) imply that the test statistic diverges to infinity. □
The resulting test is also asymptotically conservative and consistent. It extends the ELR test of [18] as it allows for non-singleton and temporal dependence for the underlying stochastic processes. Whenever , the bounded-away-from-zero spectrum condition is similar to the restricted stochastic dominance analysis of [18]; may need reduction when its boundary contains utilities associated with trivial contacts, to ensure the existence of a well-defined limiting distribution. Whenever the set of Russell and Seo utilities is used (see [21]), this can be performed similarly to the analysis described above for the modification of the BELR test for the null and for the case of infinite contacts.