1. Introduction
Let $Y$ be a random variable with finite mean $\mu$ and variance $\sigma^2$. Thus, the random variable $X = Y - \mu$ is centered and has the same variance as $Y$. For a positive real number $b$, the celebrated Cantelli inequality—also known as the one-sided Chebyshev's inequality—reads as follows:
$$\Pr(X \ge b) \le \frac{\sigma^2}{\sigma^2 + b^2}. \tag{1}$$
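As a quick numerical illustration (ours, not part of the original argument), the following Python sketch checks (1) by simulation; the lognormal test distribution and the thresholds are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=1_000_000)
x = y - y.mean()                       # centered version of Y
var = x.var()
for b in (0.5, 1.0, 2.0):
    empirical = (x >= b).mean()
    bound = var / (var + b ** 2)       # right-hand side of (1)
    print(f"b={b}: P(X >= b) ~ {empirical:.4f} <= {bound:.4f}")
```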
Both Cantelli's inequality and the classical Chebyshev's inequality can be (and have been) extended in several ways [1,2] to a random vector $X = (X_1, \dots, X_n)^\top$ in $\mathbb{R}^n$—here and throughout the rest of the paper, $u^\top$ denotes the transpose of the column vector $u$. As shown in [1,2], there is a standard recipe that yields such extensions: Let $X$ be a random vector supported by a subset $S$ of $\mathbb{R}^n$ and let $\Sigma$ be the covariance matrix of $X$. Let $T$ be a Borel subset of $S$ and $f : S \to \mathbb{R}$ such that $f(x) \ge 1$ for all $x \in T$ and $f(x) \ge 0$ for all $x \in S$. Then, with $\mathbf{1}_T$ denoting the indicator of the set $T$ over $S$, one has $\mathbf{1}_T \le f$ and
$$\Pr(X \in T) = \mathbb{E}\,\mathbf{1}_T(X) \le \mathbb{E} f(X).$$
This technique is essentially a “Markov inequality”. By taking $f$ in the family $\{f_u\}_{u \in \mathbb{R}^n}$, where $f_u(x) = \left(\frac{u^\top x + u^\top \Sigma u}{1 + u^\top \Sigma u}\right)^2$, and minimizing $\mathbb{E} f_u(X)$ for $u$ under the constraint $u^\top x \ge 1$ for all $x \in T$, Marshall and Olkin obtained the following strong and general result.
Theorem 1 (Marshall and Olkin [1]). Let $T$ be a closed convex set in $\mathbb{R}^n$ not containing the origin. If $X$ is a centered random vector of $\mathbb{R}^n$ with a positive-definite covariance matrix $\Sigma$, then
$$\Pr(X \in T) \le \frac{1}{1 + \inf_{x \in T} x^\top \Sigma^{-1} x}. \tag{2}$$
Furthermore, the inequality is sharp, in the sense that there exists a centered random vector whose support contains $T$ and whose covariance matrix is $\Sigma$, such that the inequality is attained as an equality.
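To make the statement concrete, here is a minimal numerical sketch (ours): for the illustrative choice $T = \{x : x \ge b \text{ componentwise}\}$ with $b > 0$—a closed convex set avoiding the origin—the infimum in (2) is a convex program that off-the-shelf solvers handle; all matrices and vectors below are toy values.

```python
import numpy as np
from scipy.optimize import minimize

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])   # toy positive-definite covariance
Sigma_inv = np.linalg.inv(Sigma)
b = np.array([1.0, 0.5])

# Convex program: minimize x^T Sigma^{-1} x subject to x >= b.
res = minimize(lambda x: x @ Sigma_inv @ x,
               x0=b,
               jac=lambda x: 2 * Sigma_inv @ x,
               bounds=[(bi, None) for bi in b])
d2 = res.fun                                  # inf of the quadratic form over T
print("Marshall-Olkin bound:", 1.0 / (1.0 + d2))
```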
Note that Cantelli's inequality (1) follows from inequality (2) after dividing the univariate random variable $X$ by the positive threshold $b$ and observing that the variance of $X/b$ is $\sigma^2/b^2$. The minimization problem on the right-hand side of (2) is solved by minimizing the quadratic form $x^\top \Sigma^{-1} x$ over the same set. This is a convex minimization problem that can be solved using the techniques described in [3]. As proved in [1], the infimum in (2) is attained. The function $f_{u^*}$ corresponding to the vector $u^*$, which attains the infimum in inequality (2), can be seen as a kind of envelope of a given shape (in this case quadratic) for the probability on the right-hand side. The same inequality can be interpreted in the following way: First, we linearly approximate $T$ inside the probability. This approximation yields a family of linear inequalities, each of which is the tail of a scalar random variable. We then use Cantelli's inequality (1) to bound each of these tails, and finally, we choose the tightest one. Let us describe this process for a non-empty arbitrary Borel subset $T$ of $\mathbb{R}^n$: Let $\beta(T) = \{u \in \mathbb{R}^n : u^\top x \ge 1 \text{ for all } x \in T\}$—note that $\beta(T)$ is always a closed convex set regardless of the argument $T$ (see Section 2 for more details) and hence a Borel set—since, by definition, $T \subseteq \{x : u^\top x \ge 1\}$ for every $u \in \beta(T)$ and hence $\Pr(X \in T) \le \Pr(u^\top X \ge 1)$, it follows that
$$\Pr(X \in T) \le \inf_{u \in \beta(T)} \Pr(u^\top X \ge 1),$$
where $X$ is a centered random vector with a positive-definite covariance matrix $\Sigma$. Here, for all $u \in \beta(T)$, the random variable $u^\top X$ is a centered random variable with variance $u^\top \Sigma u$. Thus, by Cantelli's inequality, for all $u \in \beta(T)$, it holds that
$$\Pr(u^\top X \ge 1) \le \frac{u^\top \Sigma u}{1 + u^\top \Sigma u}.$$
Thus, if $T$ is a non-empty convex set, after taking the infimum over $\beta(T)$, we recover (2) from another perspective. It is easy to see that the same result holds in any finite-dimensional Euclidean space $\mathcal{V}$.
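Continuing the toy example above, the following sketch (ours) recovers the same numerical bound through the route just described, by minimizing $u^\top \Sigma u$ over the set $\{u \ge 0 : u^\top b \ge 1\}$, which is the relevant set of linear approximations for $T = b + \mathbb{R}^n_+$.

```python
import numpy as np
from scipy.optimize import LinearConstraint, minimize

Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 0.5])

res = minimize(lambda u: u @ Sigma @ u,
               x0=b / (b @ b),                       # feasible start: u.b = 1
               bounds=[(0.0, None)] * 2,
               constraints=[LinearConstraint(b, 1.0, np.inf)])
c = res.fun                                          # inf of u^T Sigma u
print("blocker-route bound:", c / (1.0 + c))         # matches 1/(1 + d^2)
```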
If $T$ is of the form $b + C$, where $C$ is a convex cone in $\mathcal{V}$, then we provide a specialized Cantelli bound that is sharp. Furthermore, if $C$ has a non-empty interior, then $C$ induces a preorder $\succeq_C$ on the ambient space of $\mathcal{V}$ such that the event $\{X \in b + C\}$ can be written as $\{X \succeq_C b\}$ and can be interpreted as a generalized tail inequality (we recover classical tail inequalities when $C$ is the non-negative orthant). Thus, Cantelli's inequality naturally extends to generalized tail inequalities in finite-dimensional Euclidean spaces. While such an extension is not really genuine since, in the finite-dimensional case, it can be derived by standard arguments from the special case of the standard real $n$-dimensional Euclidean space (see Lemma 7), on the other hand, it allows a tight sharpening of the general case (Corollary 1) and also allows us to cast seemingly more complicated events into the framework of tail inequalities, as they occur, for example, in random matrix theory.
Consider the case where $X$ is a random matrix sampled from a symmetric real ensemble (see Section 4.1 for precise definitions). A fundamental problem in this context is to understand, at least asymptotically, when the order of the matrix goes to ∞, the probability that the smallest eigenvalue of $X$ is positive. This problem is completely and precisely solved for Gaussian matrices (see [4,5,6]): the probability of sampling a positive-definite matrix from a symmetric Gaussian ensemble goes to zero exponentially fast. In this paper, using Corollary 1 in Section 3, we prove that the same result holds more generally for the Wigner matrices of the form $(M + M^\top)/\sqrt{2}$, where $M$ is a random matrix whose entries are i.i.d. centered random variables, although the rate we can provide is much weaker. The result follows by looking at the random matrix $X$ as a random vector in the Euclidean space $(\mathrm{Sym}_n(\mathbb{R}), \langle \cdot, \cdot \rangle_F)$, where $\mathrm{Sym}_n(\mathbb{R})$ is the real vector space of real symmetric matrices and $\langle A, B \rangle_F = \operatorname{tr}(A^\top B)$ is the Frobenius inner product. If $\mathrm{PSD}_n$ denotes the cone of positive semi-definite matrices lying in $\mathrm{Sym}_n(\mathbb{R})$ and ⪰ is the partial order induced by $\mathrm{PSD}_n$, then, for a real number $\delta > 0$, with $I$ being the identity matrix of order $n$, the generalized tail $\{X \succeq \delta I\}$ is the event that occurs when the least eigenvalue of $X$ is at least $\delta$. Another similar motivation for studying generalized tails comes from the need to compute the probability of the feasibility of systems of linear inequalities whose unknowns are real random variables and the coefficients of the system are deterministic (see Section 4.2). This problem can be cast in the context of concentration and tail inequalities for linear, possibly random images of random vectors [7,8]: Ref. [8] studies the related problem of determining general concentration inequalities, while [7] actually investigates the more general case of tails of random linear images of random vectors, which, in our interpretation, corresponds to the case where both the coefficients and the unknowns of the system are random variables. In this paper, we give a Cantelli-type bound on the feasibility of a system of linear inequalities with random unknowns and show that the “ordinary tails” of the linear image $f(X)$ of a random vector $X$ under a surjective map $f$ are just generalized tails for the random vector $X$ itself taken with respect to a polyhedral cone. Equivalently, by pursuing the interpretation of tails of linear images as solution sets of a linear system of inequalities in the random vector $X$, we show that if the system has a right-invertible coefficient matrix, then its solution set is a generalized tail for $X$ taken with respect to a polyhedral cone: the rows of the coefficient matrix are the normal vectors of the proper maximal faces of the cone.
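The following Monte Carlo sketch (ours, with arbitrary parameters) illustrates the matrix event in question: for a Wigner-type matrix, the generalized tail $\{X \succeq \delta I\}$ is exactly the event that the least eigenvalue of $X$ is at least $\delta$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, delta = 4, 50_000, 0.0
hits = 0
for _ in range(trials):
    M = rng.standard_normal((n, n))
    X = (M + M.T) / np.sqrt(2)           # Wigner-type symmetric matrix
    if np.linalg.eigvalsh(X).min() >= delta:   # checks X >= delta*I in the PSD order
        hits += 1
print("P(X is positive semi-definite) ~", hits / trials)
```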
It is clear from the discussion above that tail inequalities for random vectors depend crucially on their covariance functions $\Sigma$. When $\Sigma$ has some additional structure, for special choices of the threshold $b$, Cantelli's inequality for generalized tails reads as a very simple expression in terms of certain norms of $b$. For instance (see Section 3 for details), for $b \ge 0$, if $\Sigma^{-1}$ is a non-negative matrix, then
$$\Pr(X \ge b) \le \frac{1}{1 + \|b\|_{\Sigma^{-1}}^2},$$
where $\|b\|_{\Sigma^{-1}} = \sqrt{b^\top \Sigma^{-1} b}$ is the Mahalanobis norm of $b$. Note that inequality (1) specializes to the inequality above after dividing the numerator and denominator by $\sigma^2$. The assumption $\Sigma^{-1} \ge 0$ is only seemingly artificial. It is often satisfied by finitely supported random vectors that occur in network science. In Section 4.3, we will give one such important case. In light of Theorem 7, which rests on observations from [9], having easily computable bounds on tail probabilities, like the one above, is generally the best we can hope for, even if we have no control over the quality of the approximation.
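Such a bound is indeed easily computable; the few lines below (a sketch of ours; the covariance is a toy example whose inverse happens to be entrywise non-negative) evaluate it directly.

```python
import numpy as np

Sigma = np.array([[1.0, -0.3], [-0.3, 1.0]])   # its inverse is entrywise non-negative
b = np.array([1.0, 2.0])
maha2 = b @ np.linalg.inv(Sigma) @ b           # squared Mahalanobis norm of b
print("P(X >= b) <=", 1.0 / (1.0 + maha2))
```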
The rest of the paper is organized as follows: In
Section 2, we develop the machinery to state and prove the main results provided in
Section 3, while in
Section 4, we apply our results to the probability of sampling a positive-definite Wigner matrix, to the probability of feasibility of a system of linear inequalities, and to testing homophily in networks.
2. Preparatory Results
Finite-dimensional Euclidean spaces, namely, finite-dimensional real vector spaces $V$ equipped with an inner product $\langle \cdot, \cdot \rangle$, are denoted by calligraphic letters such as $\mathcal{V} = (V, \langle \cdot, \cdot \rangle)$. The symbol $\mathcal{E}^n$ stands for the Euclidean space $(\mathbb{R}^n, \cdot)$, where $\cdot$ is the standard dot product in $\mathbb{R}^n$. In the following, when we talk about Euclidean spaces, we mean finite-dimensional Euclidean spaces over reals.
When dealing with random vectors, the Borel sets we are interested in have the form $b + C$, where $b \in V$ and $C \subseteq V$ is a non-empty Borel set, often a cone. A cone in $\mathcal{V}$ is a subset $C$ of $V$ that is closed under taking positive scalar multiples, i.e., $\lambda x \in C$ for every $x \in C$ and every $\lambda > 0$, while a convex cone $C$ is a cone that is closed under taking sums, i.e., $x + y \in C$ for all $x, y \in C$. The empty set, the set consisting only of the zero vector of $V$, and $V$ itself are the trivial cones. In the following, when we speak of cones, we mean non-trivial cones.
Crucial to the definition of generalized inequality is the following notion of duality of subsets of Euclidean spaces. First, identify the algebraic dual $V^*$ of $V$ with $V$ by the inner product in $\mathcal{V}$ via the isomorphism $u \mapsto \langle u, \cdot \rangle$. Let $C$ be a non-empty subset of $V$. The dual of $C$ in $\mathcal{V}$ is the set
$$C^* = \{u \in V : \langle u, x \rangle \ge 0 \text{ for all } x \in C\}.$$
Write $C^{**}$ for the double dual of $C$, namely, $C^{**} = (C^*)^*$. The following facts, the first three of which are simple consequences of the definition, are known about the dual of a non-empty set $C$ (see [10,11]).
Lemma 1. Let $\emptyset \ne C \subseteq V$. Then,
- (a) $C^*$ is always a closed convex cone;
- (b) if $C_1 \subseteq C_2$, then $C_2^* \subseteq C_1^*$, i.e., duality is inclusion reversing;
- (c) $C \subseteq C^{**}$;
- (d) $C = C^{**}$ if and only if $C$ is a closed convex cone.
The last property in the lemma implies that $C^{***} = C^*$ because $C^*$ is a closed convex cone by (a). A cone $C$ is proper whenever $C$ is a closed convex cone that is also pointed, i.e., $C \cap -C \subseteq \{0\}$, where $0$ is the zero vector of $V$, and has non-empty interior. A cone $C$ is self-dual in $\mathcal{V}$ if $C^* = C$.
For a random vector $X$ in $\mathcal{E}^n$ and $b \in \mathbb{R}^n$, the event $\{X \ge b\}$ is said to be a tail of $X$. Such an event reads as $\{X_i \ge b_i,\ i = 1, \dots, n\}$ and is the same event as $\{X - b \in \mathbb{R}^n_+\}$. The non-negative orthant $\mathbb{R}^n_+$ is a self-dual cone in $\mathcal{E}^n$. Clearly, $x$ is non-negative if and only if $u^\top x$ is non-negative for all non-negative vectors $u$, i.e., for all $u$ in the dual cone of $\mathbb{R}^n_+$. This fact can be generalized as follows. Let the ambient space of $\mathcal{V}$ be $V$ and let $C$ be a non-empty subset of $V$. For $x, y \in V$, write $x \succeq_C y$ if $x - y \in C$. If $C$ is a convex cone, then $\succeq_C$ is a pre-order on $V$, while if $C$ is a proper cone, then $\succeq_C$ is a partial order on $V$. In any case, even when $C$ is arbitrary, by duality, one has
$$y \succeq_C x \;\Longrightarrow\; \langle u, y \rangle \ge \langle u, x \rangle \ \text{for all } u \in C^*, \tag{3}$$
which reduces to an equivalence when $C$ is a closed convex cone. Generalized inequalities, and hence generalized tails, are well behaved with respect to linear transformations of random vectors because cones are preserved by such maps. In fact, if the latter are invertible, closedness is preserved as well.
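As an illustration of (3) (ours, not from the paper), the following sketch tests the dual-side inequalities for the self-dual second-order (ice cream) cone, sampling vectors of the dual cone at random.

```python
import numpy as np

rng = np.random.default_rng(2)

def in_soc(v):                        # membership in the second-order cone
    return np.linalg.norm(v[:-1]) <= v[-1] + 1e-12

x, y = np.array([1.0, 0.5, 3.0]), np.array([0.5, 0.0, 1.0])
assert in_soc(x - y)                  # here x >=_C y by construction
for _ in range(10_000):
    z = rng.standard_normal(2)
    u = np.append(z, np.linalg.norm(z) * (1 + rng.random()))  # u in C* = C
    assert u @ (x - y) >= -1e-9       # dual-side inequality of (3)
print("dual inequalities verified")
```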
We also need a lesser-known duality device, which we borrow from the theory of blocking pairs of polyhedra [12]. Let $V$ be the ambient space of $\mathcal{V}$. For $T \subseteq V$, the blocker of $T$ in $\mathcal{V}$ is the set
$$\beta(T) = \{u \in V : \langle u, x \rangle \ge 1 \text{ for all } x \in T\}.$$
Analogous to the dual of $T$, the blocker of $T$ has the following properties, the first three of which are straightforward.
Lemma 2. In the Euclidean space $\mathcal{V}$, let $T \subseteq V$. It holds that
- (i) if $0 \notin \overline{\operatorname{conv}}(T)$, then $\beta(T)$ is a non-empty closed convex set;
- (ii) if $T_1 \subseteq T_2$, then $\beta(T_2) \subseteq \beta(T_1)$;
- (iii) $T \subseteq \beta(\beta(T))$;
- (iv) if $T = b + C$, where $b \in V$ and $C$ is such that $0 \in C$, then
$$C^* \cap \beta(\{b\}) \subseteq \beta(b + C) \subseteq \beta(\{b\}); \tag{4}$$
moreover, if $C$ is a nontrivial cone (not necessarily containing $0$), then
$$\beta(b + C) = \{u \in C^* : \langle u, b \rangle \ge 1\}. \tag{5}$$
Proof. If $0 \notin \overline{\operatorname{conv}}(T)$, then, by the separating hyperplane theorem, there exists $u \in V$ such that, after scaling, $\langle u, x \rangle \ge 1$ for all $x \in T$. Hence, if $T$ is non-empty, then so is $\beta(T)$. Moreover, since $\beta(T)$ is the intersection of closed half-spaces, then $\beta(T)$ is a closed convex set. This establishes (i). Statements (ii) and (iii) are straightforward. Let us prove (iv). To prove (4), first observe that if $u \in C^*$ and $\langle u, b \rangle \ge 1$, then $\langle u, b + x \rangle = \langle u, b \rangle + \langle u, x \rangle \ge 1$ for all $x \in C$. Hence, $C^* \cap \beta(\{b\}) \subseteq \beta(b + C)$, and $\beta(b + C) \subseteq \beta(\{b\})$ because $\{b\} \subseteq b + C$ by (ii). Let us prove (5). Since $\{u \in C^* : \langle u, b \rangle \ge 1\} = C^* \cap \beta(\{b\}) \subseteq \beta(b + C)$ (the first inclusion in (4), whose proof did not use $0 \in C$), it follows that to prove (5) it suffices to prove that $\beta(b + C) \subseteq C^*$ and $\beta(b + C) \subseteq \beta(\{b\})$. Let us prove the former inclusion. Note that if $u \in \beta(b + C)$ and $C$ is a cone, then necessarily $\langle u, x \rangle \ge 0$ for all $x \in C$ for, if not, there exists $x_0 \in C$ such that $\langle u, x_0 \rangle < 0$ and $\langle u, b + \lambda x_0 \rangle \ge 1$ for all $\lambda > 0$ (recall that $\lambda x_0 \in C$). Hence, $\langle u, b \rangle + \lambda \langle u, x_0 \rangle < 1$ for sufficiently large $\lambda$, which contradicts $u \in \beta(b + C)$. We conclude that $\langle u, x \rangle \ge 0$ for all $x \in C$, and thus, $u \in C^*$. By the same reasoning, it holds that $\beta(b + C) \subseteq \beta(\{b\})$. To see this, assume by contradiction that $\langle u, b \rangle < 1$ for some $u \in \beta(b + C)$. Since $C$ is a nontrivial cone, there exists $x \in C$, $x \ne 0$, such that $\langle u, b + \lambda x \rangle \ge 1$ for all $\lambda > 0$. However, the latter inequality cannot be satisfied for a small enough $\lambda$. We conclude that the desired inclusion is true. □
Remark 1. Usually the blocker of a polyhedron $T$ of the form $\{x \in \mathbb{R}^n_+ : Ax \ge \mathbf{1}\}$, with $A$ non-negative, is defined as the set $\{u \in \mathbb{R}^n_+ : \langle u, x \rangle \ge 1 \text{ for all } x \in T\}$. Hence, under that definition, the blocker is contained in $\mathbb{R}^n_+$ by construction. However, the two definitions coincide in this case because, reasoning as in the proof of (5), $\beta(T) \subseteq \mathbb{R}^n_+$. Therefore, $\beta(T) = \beta(T) \cap \mathbb{R}^n_+$.

In the Euclidean space $\mathcal{V}$, for a set $C$ and a vector $b$ in $V$, the set
$$C^*_b = \{u \in C^* : \langle u, b \rangle > 0\}$$
plays an essential role in the following intermediate results.
Lemma 3. Let $C$ be a set in the Euclidean space $\mathcal{V}$ and $b \in V$. One has $C^*_b \ne \emptyset$ if and only if $-b \notin C^{**}$. Moreover, if $C$ is a cone, then one has $\beta(b + C) \ne \emptyset$ if and only if $C^*_b \ne \emptyset$.

Proof. To prove the first assertion, note that
$$C^*_b = \emptyset \iff \langle u, b \rangle \le 0 \ \text{for all } u \in C^* \iff -b \in C^{**}.$$
For the second assertion, observe that if $u \in C^*_b$, then $u / \langle u, b \rangle \in \{v \in C^* : \langle v, b \rangle \ge 1\} = \beta(b + C)$ by (iv) in Lemma 2. Conversely, if $u \in \beta(b + C)$, then $u \in C^*$ and $\langle u, b \rangle \ge 1 > 0$, so $u \in C^*_b$ by definition. □

Note that the condition in Lemma 3 can be written as $0 \notin b + C^{**}$.
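A minimal sketch (ours) of Lemma 3 for the self-dual cone $C = \mathbb{R}^n_+$, where $C^{**} = C$: the blocker $\beta(b + C)$ is non-empty exactly when $-b \notin C$, i.e., when some coordinate of $b$ is positive, and an explicit element can be written down.

```python
import numpy as np

def blocker_element(b):
    """Return some u in beta(b + R^n_+), or None if the blocker is empty."""
    i = int(np.argmax(b))
    if b[i] <= 0:                       # -b in C: beta(b + C) is empty
        return None
    u = np.zeros_like(b)
    u[i] = 1.0 / b[i]                   # u in C* = R^n_+ and <u, b> = 1
    return u

b = np.array([2.0, -1.0, 0.0])
u = blocker_element(b)
xs = np.abs(np.random.default_rng(3).standard_normal((1000, 3)))  # points of C
assert u is not None and np.all(xs @ u + u @ b >= 1 - 1e-12)
print("u =", u, "lies in beta(b + C)")
```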
Lemma 4. In the Euclidean space $\mathcal{V}$, let $C$ be a non-trivial cone, and $b \in V$ such that $-b \notin C^{**}$. If $f, g : V \setminus \{0\} \to \mathbb{R}$ are defined by
$$f(u) = \frac{q(u)}{q(u) + \langle u, b \rangle^2}, \qquad g(u) = \frac{q(u)}{1 + q(u)},$$
where $q$ is a positive-definite quadratic form, then
$$\inf_{u \in \beta(b+C)} g(u) = \inf_{u \in C^*_b} f(u).$$

Proof. The assumption on $b$ guarantees that $\beta(b + C)$ and $C^*_b$ are both non-empty by Lemma 3. Since $q$ is a quadratic form, it is homogeneous of degree 2. Hence, $f$ is homogeneous of degree 0, namely, $f(\lambda u) = f(u)$ for every $\lambda > 0$. Observe that $g(u) \ge f(u)$ for all $u \in \beta(b + C)$ and that $f = g$ over the hyperplane $H_b = \{u \in V : \langle u, b \rangle = 1\}$. Thus, by homogeneity, for every $u \in C^*_b$, $f$ is constant on the rays $\{\lambda u\}$, $\lambda > 0$. In particular,
$$\inf_{u \in C^*_b} f(u) = \inf_{u \in C^*_b \cap H_b} f(u) = \inf_{u \in C^*_b \cap H_b} g(u).$$
Since, by (5), it holds that $\beta(b + C) = \{u \in C^* : \langle u, b \rangle \ge 1\} \supseteq C^*_b \cap H_b$, it follows that
$$\inf_{u \in \beta(b+C)} g(u) \le \inf_{u \in C^*_b \cap H_b} g(u) = \inf_{u \in C^*_b} f(u).$$
Let $u \in \beta(b + C)$ and observe that
$$g(u) \ge f(u) \ge \inf_{v \in C^*_b} f(v),$$
because $u \in C^*_b$ (indeed, $u \in C^*$ and $\langle u, b \rangle \ge 1 > 0$). Therefore, equality must hold throughout, yielding the desired equality. □
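The equality of Lemma 4 can be checked numerically; in the sketch below (ours), $C = \mathbb{R}^2_+$, so $C^* = C$ and, by (5), $\beta(b + C) = \{u \ge 0 : \langle u, b \rangle \ge 1\}$; by homogeneity, the infimum of $f$ over $C^*_b$ may be computed on the slice $\langle u, b \rangle = 1$.

```python
import numpy as np
from scipy.optimize import LinearConstraint, minimize

Q = np.array([[1.5, 0.4], [0.4, 0.8]])       # positive-definite quadratic form
b = np.array([1.0, 2.0])
g = lambda u: (u @ Q @ u) / (1.0 + u @ Q @ u)
f = lambda u: (u @ Q @ u) / (u @ Q @ u + (u @ b) ** 2)

lhs = minimize(g, x0=[0.2, 0.4], bounds=[(0, None)] * 2,
               constraints=[LinearConstraint(b, 1.0, np.inf)]).fun
rhs = minimize(f, x0=[0.2, 0.4], bounds=[(0, None)] * 2,
               constraints=[LinearConstraint(b, 1.0, 1.0)]).fun
print(lhs, "~", rhs)                          # the two infima agree
```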
Recall that given a self-adjoint positive-definite endomorphism $A$ of $V$ in $\mathcal{V}$, the bilinear form $(x, y) \mapsto \langle x, Ay \rangle$ induces a norm $p$ on $V$ by $p(x) = \sqrt{\langle x, Ax \rangle}$. The dual norm of $p$ (see [13], Section 5.4) is the norm $p^*$ such that
$$p^*(y) = \sup\{\langle u, y \rangle : u \in V, \ p(u) \le 1\}.$$

Lemma 5. Let $A$ be a self-adjoint positive-definite endomorphism of $V$ in $\mathcal{V}$. If $p$ is the norm on $V$ defined by $p(x) = \sqrt{\langle x, Ax \rangle}$, then $p^*(y) = \sqrt{\langle y, A^{-1} y \rangle}$.
Proof. Although this is actually a “folklore” fact, we were unable to locate a reference. Therefore, we provide a proof here.

Since $A$ is a self-adjoint positive-definite endomorphism, $A = B^2$ for some self-adjoint positive-definite endomorphism $B$ ($B$ is a square-root of $A$). Hence, by the Cauchy–Schwarz inequality and because $B$ is self-adjoint,
$$\langle u, y \rangle = \langle Bu, B^{-1} y \rangle \le \sqrt{\langle u, Au \rangle} \sqrt{\langle y, A^{-1} y \rangle} = p(u) \sqrt{\langle y, A^{-1} y \rangle}.$$
Therefore, $p^*(y) \le \sqrt{\langle y, A^{-1} y \rangle}$. On the other hand, if $y \ne 0$, then the vector $u$ defined by $u = A^{-1} y / \sqrt{\langle y, A^{-1} y \rangle}$ is such that
$$p(u) = 1 \quad \text{and} \quad \langle u, y \rangle = \sqrt{\langle y, A^{-1} y \rangle}.$$
Hence, $p^*$ has the stated expression. □
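A short numerical confirmation of Lemma 5 (ours; the matrix $A$ and the vector $y$ are arbitrary): we maximize $\langle u, y \rangle$ over the unit $p$-ball and compare with the closed form.

```python
import numpy as np
from scipy.optimize import NonlinearConstraint, minimize

A = np.array([[2.0, 0.3], [0.3, 1.0]])
y = np.array([1.0, -2.0])

ball = NonlinearConstraint(lambda u: u @ A @ u, -np.inf, 1.0)   # p(u) <= 1
res = minimize(lambda u: -(u @ y), x0=np.array([0.1, -0.1]), constraints=[ball])
print("numeric p*(y):", -res.fun)
print("closed form  :", float(np.sqrt(y @ np.linalg.inv(A) @ y)))
```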
Lemma 6. In the Euclidean space $\mathcal{V}$, let $C$ be a closed convex cone, $b \ne 0$, and $p$ the norm on $V$ defined by $p(x) = \sqrt{\langle x, Ax \rangle}$, where $A$ is a self-adjoint positive-definite endomorphism of $V$. If $A^{-1} b \in C^*$, then, with $q = p^2$, in the notation of Lemma 4,
$$\inf_{u \in \beta(b+C)} g(u) = \inf_{u \in C^*_b} f(u) = \frac{1}{1 + p^*(b)^2},$$
where $p^*$ is the dual norm of $p$, namely, $p^*(b) = \sqrt{\langle b, A^{-1} b \rangle}$ (Lemma 5).

Proof. Since $C$ is a closed convex cone, then $C = C^{**}$ by (d). Hence, the assumption on $b$ guarantees that $C^*_b$ is non-empty by Lemma 3: if $-b$ belonged to $C^{**} = C$, then $\langle A^{-1} b, -b \rangle \ge 0$ would hold because $A^{-1} b \in C^*$, contradicting $\langle b, A^{-1} b \rangle > 0$. Therefore, we can divide the numerator and the denominator of $f(u)$ by $q(u) = p(u)^2$. This yields
$$f(u) = \frac{1}{1 + \left( \dfrac{\langle u, b \rangle}{p(u)} \right)^2}.$$
Here, if $u^* = A^{-1} b / \sqrt{\langle b, A^{-1} b \rangle}$, then $\langle u^*, b \rangle / p(u^*) = p^*(b)$. Hence, $u^*$ belongs to $C^*_b$ because $u^* \in C^*$, with $u^*$ being a positive scalar multiple of $A^{-1} b$, and $\langle u^*, b \rangle = p^*(b) > 0$. By Lemma 5, after plugging $u^*$ into $f$, we conclude that the supremum of $\langle u, b \rangle / p(u)$ is attained over $C^*_b$, and this concludes the proof. □
3. Results
Let $(\Omega, \mathcal{F}, \Pr)$ be a probability space, where $\Omega$ is a set, $\mathcal{F}$ is a $\sigma$-algebra on $\Omega$, and $\Pr$ is a probability measure on $\mathcal{F}$. Also, let $\mathcal{V} = (V, \langle \cdot, \cdot \rangle)$ be a Euclidean space and $\mathcal{B}$ the smallest $\sigma$-algebra containing all the open balls of $V$ taken with respect to the norm induced by $\langle \cdot, \cdot \rangle$—this $\sigma$-algebra does not depend on the particular inner product chosen on $V$. A random vector $X$ in $\mathcal{V}$ is an $(\mathcal{F}, \mathcal{B})$-measurable map $X : \Omega \to V$. The algebra $\mathcal{B}$ is the algebra of Borel sets of $\mathcal{V}$. To prove our results, we first need the following lemma.
Lemma 7. Let $X$ be a centered random vector in $\mathcal{V}$ with a positive-definite covariance function $\Sigma$. Let $T$ be a non-empty Borel subset of $V$ and let $\beta(T)$ be its blocker in $\mathcal{V}$. Then, for all $u \in \beta(T)$, it holds that
$$\Pr(X \in T) \le \frac{\langle u, \Sigma u \rangle}{1 + \langle u, \Sigma u \rangle}.$$
Therefore, if $\beta(T)$ is non-empty, then
$$\Pr(X \in T) \le \inf_{u \in \beta(T)} \frac{\langle u, \Sigma u \rangle}{1 + \langle u, \Sigma u \rangle}.$$
If $T$ is a closed convex set such that $0 \notin T$, then the above bound is sharp.
Proof. The second inequality follows from the first provided that $\beta(T)$ is non-empty. The proof of the first inequality is formally identical to the proof of the same inequality in $\mathcal{E}^n$ given in Section 1 and is a direct consequence of (ii) and Cantelli's inequality (1) after noticing that, for a random vector $X$ in $\mathcal{V}$, the variance of the random variable $\langle u, X \rangle$ is $\langle u, \Sigma u \rangle$. It remains to be shown that if $T$ is closed and convex in $\mathcal{V}$ and $0 \notin T$, then the bound is sharp. We deduce this result from Theorem 1 after reducing the general case to $\mathcal{E}^n$ by the following argument. Every Euclidean space is a topological vector space with respect to the standard topology induced by the inner product (recall that, in our terminology, Euclidean spaces are finite-dimensional and, hence, Hilbert spaces). Euclidean spaces of the same dimension are pairwise homeomorphic, and all are homeomorphic to $\mathcal{E}^n$ under the coordinate map isomorphism $f$ (with respect to a fixed orthonormal basis). Thus, if $X$ is a centered random vector in $\mathcal{V}$, then $f(X)$ is a centered random vector in $\mathcal{E}^n$, and conversely. Moreover, if the covariance function $\Sigma$ of $X$ is positive-definite, then the covariance matrix $\Sigma_f$ of $f(X)$ is positive-definite, and conversely. Furthermore, $T$ is a convex closed set in $\mathcal{V}$ if and only if $f(T)$ is a closed convex set in $\mathcal{E}^n$ (with the standard Euclidean topology): linear images of convex sets are convex, and homeomorphic linear images of closed convex sets are closed and convex. Finally, since $f$ maps the zero vector $0_V$ of $V$ to the zero vector $0$ of $\mathbb{R}^n$, it follows that $0_V \notin T$ if and only if $0 \notin f(T)$. Here,
$$\inf_{u \in \beta(T)} \frac{\langle u, \Sigma u \rangle}{1 + \langle u, \Sigma u \rangle} = \inf_{v \in \beta(f(T))} \frac{v^\top \Sigma_f v}{1 + v^\top \Sigma_f v} = \frac{1}{1 + \inf_{y \in f(T)} y^\top \Sigma_f^{-1} y},$$
where the last equality follows from $\inf_{v \in \beta(f(T))} v^\top \Sigma_f v = \left(\inf_{y \in f(T)} y^\top \Sigma_f^{-1} y\right)^{-1}$, a consequence of the Cauchy–Schwarz inequality in the inner product induced by $\Sigma_f$ and of the variational characterization of the projection of the origin onto the closed convex set $f(T)$. Therefore, if $Y$ is a random vector in $\mathcal{E}^n$ with a positive-definite covariance matrix $\Sigma_f$ that attains the bound in Theorem 1 as equality, which exists because the hypotheses on $f(T)$ and $\Sigma_f$ are satisfied, then $f^{-1}(Y)$ is a random vector in $\mathcal{V}$ that attains the bound in this lemma. □
Theorem 2. Let $X$ be a centered random vector of $\mathcal{V}$ with positive-definite covariance $\Sigma$. Let $C$ be a non-empty Borel subset of $V$ and $b \in V$, $b \ne 0$. Then, for all $u \in C^*$ such that $\langle u, b \rangle > 0$, it holds that
$$\Pr(X \succeq_C b) \le \frac{\langle u, \Sigma u \rangle}{\langle u, \Sigma u \rangle + \langle u, b \rangle^2}.$$
If $C$ is a closed convex cone in $\mathcal{V}$ and $-b \notin C$, then $C^*_b$ is non-empty and
$$\Pr(X \succeq_C b) \le \inf_{u \in C^*_b} \frac{\langle u, \Sigma u \rangle}{\langle u, \Sigma u \rangle + \langle u, b \rangle^2}.$$
Moreover, the above bound is sharp.
Proof. Since $C \subseteq C^{**}$, the first inequality follows from (3) (with $X$ instead of $y$ and $b$ instead of $x$) by applying Cantelli's inequality (1) to the centered random variable $\langle u, X \rangle$, whose variance is $\langle u, \Sigma u \rangle$: for every $u \in C^*$ with $\langle u, b \rangle > 0$, the event $\{X \succeq_C b\}$ is contained in the event $\{\langle u, X \rangle \ge \langle u, b \rangle\}$. If $C$ is a closed convex cone and $-b \notin C$, then $C^*_b$ is non-empty because the hypotheses of Lemma 3 are satisfied (recall that since $C$ is a closed convex cone, one has $C = C^{**}$ by (d) in Lemma 1). Moreover, still by Lemma 3, $\beta(b + C)$ is non-empty as well. Hence, the second inequality follows from the first by taking the infimum over $C^*_b$, which, by Lemma 4 with $q(u) = \langle u, \Sigma u \rangle$, coincides with the infimum of Lemma 7 over $\beta(b + C)$. It remains to be proven that the bound is sharp. Let $T = b + C$. Since $C$ is closed and convex, then so is $T$. Moreover, the hypotheses on $b$ and $C$ imply $0 \notin T$. Therefore, Lemma 7 applies, and we conclude that the bound is sharp. □
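As an illustration (ours), the sketch below evaluates the sharpened bound of Theorem 2 for the polyhedral cone $C = \mathbb{R}^2_+$ and a toy covariance, again exploiting homogeneity to fix $\langle u, b \rangle = 1$ inside $C^*_b$.

```python
import numpy as np
from scipy.optimize import LinearConstraint, minimize

Sigma = np.array([[1.0, 0.2], [0.2, 2.0]])   # toy covariance of X
b = np.array([0.5, 1.0])                      # threshold with -b not in C

f = lambda u: (u @ Sigma @ u) / (u @ Sigma @ u + (u @ b) ** 2)
res = minimize(f, x0=b / (b @ b), bounds=[(0, None)] * 2,
               constraints=[LinearConstraint(b, 1.0, 1.0)])
print("P(X >=_C b) <=", res.fun)
```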
Although the previous theorem deals with the special case $T = b + C$ of Lemma 7 (which, in turn, up to technicalities, follows from Theorem 1), Theorem 2 provides a useful sharpening of the extended Cantelli inequality given in Theorem 1, which is further exploited in the next corollary.
Corollary 1. Let $X$ be a centered random vector of $\mathcal{V}$ with positive-definite covariance $\Sigma$. Let $C$ be a closed convex cone in $\mathcal{V}$. If $b \in \Sigma C^*$, $b \ne 0$, where $\Sigma C^*$ denotes the linear image of $C^*$ under $\Sigma$, then
$$\Pr(X \succeq_C b) \le \frac{1}{1 + \kappa^2}, \tag{6}$$
where $\kappa = \sqrt{\langle b, \Sigma^{-1} b \rangle}$. Moreover, the bound is sharp.

Proof. Directly from Theorem 2 and Lemmas 4 and 6, with $A = \Sigma$. □
The previous corollary can be further (and straightforwardly) specialized in $\mathcal{E}^n$ if $X$ has a monotone covariance matrix. Recall that an invertible real matrix $A$ is monotone if $A^{-1}$ has nonnegative entries (see [14], Chapter 6). Random vectors whose covariance matrix has this property are common in Network Science. See Section 4.3 for such an example.
Corollary 2. Let $X$ be a centered random vector in $\mathcal{E}^n$ with a positive-definite monotone covariance matrix $\Sigma$. Let $C$ be a closed convex cone in $\mathcal{E}^n$ contained in the orthant $\mathbb{R}^n_+$. If $b \ge 0$ and $b \ne 0$, then the hypotheses of Corollary 1 are satisfied, and hence, (6) holds.

Proof. We only need to show that the hypotheses of the present corollary guarantee that the hypotheses of Corollary 1 are met, namely, $\Sigma^{-1} b \in C^*$. Clearly, $\Sigma^{-1} b \ge 0$. Hence, just as clearly, $\Sigma^{-1} b \in \mathbb{R}^n_+$. Since $\mathbb{R}^n_+$ is a self-dual closed convex cone, one has $(\mathbb{R}^n_+)^* = \mathbb{R}^n_+$ and, thus, $\Sigma^{-1} b \in (\mathbb{R}^n_+)^*$. Since $C \subseteq \mathbb{R}^n_+$, we conclude that $\Sigma^{-1} b \in (\mathbb{R}^n_+)^* \subseteq C^*$ as required. □
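A final sketch (ours) of Corollary 2 with an illustrative monotone covariance matrix of the kind that occurs in network models; all values are toy choices.

```python
import numpy as np

Sigma_inv = np.array([[1.0, 0.2, 0.0],       # an entrywise non-negative inverse
                      [0.2, 1.5, 0.3],
                      [0.0, 0.3, 1.2]])
Sigma = np.linalg.inv(Sigma_inv)             # monotone covariance matrix
b = np.array([1.0, 0.0, 2.0])                # b >= 0, b != 0
assert np.all(Sigma_inv @ b >= 0)            # hence Sigma^{-1} b lies in C*
print("P(X >= b) <=", 1.0 / (1.0 + b @ Sigma_inv @ b))
```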