1. Introduction
The study of high-dimensional random orthogonal and unitary matrices can be traced to a famous paper of É. Borel [1] in which the following result is proved: Let $x_1$ denote the first coordinate of $x$, an $n$-dimensional random vector that is uniformly distributed on the unit sphere in $\mathbb{R}^n$; then, as $n \to \infty$, the random variable $n^{1/2} x_1$ converges in distribution to $Z$, a standard normal random variable.
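Borel's theorem is easy to observe numerically. The following sketch (Python with NumPy, which we assume available; it is an illustration, not part of the formal development) samples points uniformly on the sphere by normalizing Gaussian vectors and checks that $n^{1/2}x_1$ has approximately zero mean and unit variance.

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_on_sphere(n, size, rng):
    """Draw `size` points uniformly on the unit sphere in R^n
    by normalizing standard Gaussian vectors."""
    g = rng.standard_normal((size, n))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

n, size = 1000, 20000
x1 = uniform_on_sphere(n, size, rng)[:, 0]
scaled = np.sqrt(n) * x1          # Borel: sqrt(n) * x_1 is approximately N(0,1)

print(scaled.mean(), scaled.var())
```

The empirical mean and variance are close to $0$ and $1$, as Borel's theorem predicts for large $n$.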
Subsequent to Borel’s paper, there has ensued a literature of substantial size. We mention, as only a few of the papers in this area, the articles of Weingarten [2], Diaconis and Freedman [3], Diaconis, Eaton and Lauritzen [4], Diaconis and Shahshahani [5], Johansson [6], Rains [7], Diaconis and Evans [8], D’Aristotile, Diaconis and Newman [9], Pastur and Vasilchuk [10], Collins and Śniady [11], Meckes [12], Fulman [13], and Jiang [14]. A reader interested in exploring the field further may obtain from those papers many references to the area.
In a survey of the literature, we were especially intrigued by a result of D’Aristotile, Diaconis and Newman [9]. We denote by $O(n)$ the group of $n \times n$ orthogonal matrices, and by the uniform distribution on $O(n)$ we mean the Haar measure, normalized to be a probability distribution. Further, we let $\mathcal{N}(0,1)$ denote the standard normal distribution. Then the result is as follows:

Theorem 1.1. (D’Aristotile et al. [9]) Let $\{A_n\}$ be a sequence of real matrices such that $A_n$ is $n \times n$ and $\operatorname{tr}(A_n A_n') = n$, and let $H_n$ be a random orthogonal matrix that is uniformly distributed on $O(n)$. Then $\operatorname{tr}(A_n H_n)$ converges in distribution to $\mathcal{N}(0,1)$ as $n \to \infty$.

The proof given by D’Aristotile et al. [9] is based on classical probabilistic methods involving tightness. Their result was later studied by Meckes [12], who obtained a bound on the distance, in the total variation metric on the set of probability distributions, between the distribution of $\operatorname{tr}(A_n H_n)$ and the standard normal distribution; as a consequence, Meckes obtained an explicit formula for the rate of convergence to normality.
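For readers who wish to experiment, Theorem 1.1 is easy to illustrate by simulation. The sketch below (Python/NumPy; the helper name `haar_orthogonal` is our own) samples Haar-distributed orthogonal matrices via the QR decomposition of a Gaussian matrix with the standard sign correction, and takes $A_n = I_n$, which satisfies the normalization $\operatorname{tr}(A_nA_n') = n$ assumed in the theorem, so that $\operatorname{tr}(A_nH_n) = \operatorname{tr}(H_n)$.

```python
import numpy as np

rng = np.random.default_rng(1)

def haar_orthogonal(n, rng):
    """Sample from the Haar (uniform) distribution on O(n) via the QR
    decomposition of a Gaussian matrix, with the standard sign fix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

n, reps = 60, 3000
# A_n = I_n gives tr(A_n H_n) = tr(H_n), which should be nearly N(0,1).
samples = np.array([np.trace(haar_orthogonal(n, rng)) for _ in range(reps)])
print(samples.mean(), samples.var())
```

The sample mean and variance are close to $0$ and $1$, consistent with the theorem.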
It was particularly striking to us that, throughout the existing literature on high-dimensional random matrices from the classical compact matrix groups, the theory of generalized hypergeometric functions of matrix argument appears not to have played an explicit role. We found this absence intriguing because it has been known since the work of Herz [15] that the characteristic function of a uniformly distributed random orthogonal matrix can be expressed in terms of the Bessel functions of matrix argument; indeed, a primary motivation for the invention of those Bessel functions was the study of random matrices which are uniformly distributed on $O(n)$.
In this paper, we provide a heuristic derivation of Theorem 1.1. To that end, we will present such features of the theory of the zonal polynomials and of a generalized hypergeometric function of matrix argument as are necessary to make the paper self-contained. It is also noteworthy that the approach given here applies with ease, mutatis mutandis, to cases in which the matrix $H_n$ is uniformly distributed on the unitary group or the symplectic group, and to cases in which $H_n$ is a rectangular random matrix uniformly distributed on a Stiefel manifold corresponding to one of the classical compact matrix groups. In short, the theory of the generalized hypergeometric functions of matrix argument lends itself readily to the study of linear functions of high-dimensional random matrices from the classical compact matrix groups.
Conversely, the study of high-dimensional orthogonal and unitary matrices also yields new results for the Bessel functions of matrix argument. By application of a result of Johansson [6], we will obtain an upper bound on the distance, in the supremum norm on $\mathbb{R}$, between a certain generalized hypergeometric function of scalar matrix argument and the Gaussian quantity, $e^{-t^2/2}$, $t \in \mathbb{R}$.
2. Zonal Polynomials and a Generalized Hypergeometric Function of Matrix Argument
Throughout the paper, we denote the determinant and trace of a square matrix $A$ by $\det A$ and $\operatorname{tr} A$, respectively. We also denote by $I_n$ the identity matrix of order $n$. We denote by $E$ the generic operation of expectation with respect to a probability distribution which, on all occasions, will be explicit from the context.
A partition $\lambda = (\lambda_1, \lambda_2, \ldots)$ is a vector of non-negative integers that are weakly decreasing: $\lambda_1 \ge \lambda_2 \ge \cdots$. The entries $\lambda_j$ are called the parts of $\lambda$; the length of $\lambda$ is the number of non-zero parts; and the weight of $\lambda$ is $|\lambda| = \lambda_1 + \lambda_2 + \cdots$.
The set of partitions of a given weight may be ordered lexicographically: If $\lambda$ and $\mu$ are partitions of the same weight then we write $\lambda < \mu$ if $\lambda_j < \mu_j$ for the first index $j$ such that corresponding parts are unequal.
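These combinatorial notions are easy to experiment with. The sketch below (Python; `partitions` is our own helper, not notation from the paper) generates all partitions of weight $k$ in decreasing lexicographic order; since Python tuples compare lexicographically, tuple comparison coincides with the ordering just defined for partitions of equal weight.

```python
def partitions(k, max_part=None):
    """Generate all partitions of weight k, in decreasing lexicographic
    order, each as a tuple of weakly decreasing positive parts."""
    if max_part is None:
        max_part = k
    if k == 0:
        yield ()
        return
    for first in range(min(k, max_part), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest

parts4 = list(partitions(4))
print(parts4)   # (4,), (3,1), (2,2), (2,1,1), (1,1,1,1)
```

For example, $(2,2) > (2,1,1)$ because the parts first differ at the second index, where $2 > 1$.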
We shall encounter in the sequel the quantity,
$$\rho_\lambda = \sum_{j \ge 1} \lambda_j (\lambda_j - j). \qquad (2.1)$$
Perhaps coincidentally, the term $\rho_\lambda$ has appeared before now in the theory of zonal polynomials. James [16], in proving that the zonal polynomial $C_\lambda$ is an eigenfunction of the Laplace–Beltrami operator on the cone of positive definite matrices, shows that $\rho_\lambda$ appears in the expression for the corresponding eigenvalue; see also Muirhead [17] (p. 229, Equation (5)) and Richards [18].
We will also need the following extremal property of $\rho_\lambda$.

Lemma 2.1. Among all partitions $\lambda$ of weight $k$, the quantity $\rho_\lambda$ is maximized at the lexicographically maximal partition $(k)$ and minimized at the lexicographically minimal partition $(1,\ldots,1)$. In particular,
$$-\tfrac{1}{2}k(k-1) \le \rho_\lambda \le k(k-1) \qquad (2.2)$$
for all partitions $\lambda$ of weight $k$.

Proof. Since the parts of $\lambda$ are weakly decreasing, $\lambda_j - j \le \lambda_1 - 1 \le k - 1$ for every $j$; therefore
$$\rho_\lambda = \sum_{j \ge 1} \lambda_j(\lambda_j - j) \le (k-1)\sum_{j \ge 1}\lambda_j = k(k-1),$$
with equality when $\lambda = (k)$. On the other hand, writing $\rho_\lambda = \sum_j \lambda_j^2 - \sum_j j\lambda_j$, we have $\sum_j \lambda_j^2 \ge \sum_j \lambda_j = k$, while $\sum_j j\lambda_j \le \sum_{j=1}^{k} j = \tfrac12 k(k+1)$, the latter sum being largest for the partition $(1,\ldots,1)$; therefore
$$\rho_\lambda \ge k - \tfrac12 k(k+1) = -\tfrac12 k(k-1),$$
with equality when $\lambda = (1,\ldots,1)$. Thus, we obtain Equation (2.2). □
For $a \in \mathbb{C}$ and any nonnegative integer $j$, the rising factorial, $(a)_j$, is defined as
$$(a)_j = a(a+1)(a+2)\cdots(a+j-1), \qquad (a)_0 = 1. \qquad (2.3)$$
Corresponding to each partition $\lambda$, the partitional rising factorial, $(a)_\lambda$, is defined as
$$(a)_\lambda = \prod_{j \ge 1} \left(a - \tfrac{1}{2}(j-1)\right)_{\lambda_j}. \qquad (2.4)$$
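The two definitions translate directly into code (Python; the function names are our own, and we assume the normalization of Equation (2.4), which is the standard one for the real, zonal case).

```python
def rising(a, j):
    """Classical rising factorial (a)_j = a (a+1) ... (a+j-1), with (a)_0 = 1."""
    out = 1.0
    for i in range(j):
        out *= a + i
    return out

def rising_partition(a, lam):
    """Partitional rising factorial (a)_lambda = prod_j (a - (j-1)/2)_{lambda_j}."""
    out = 1.0
    for j, p in enumerate(lam, start=1):
        out *= rising(a - (j - 1) / 2, p)
    return out

# For a one-part partition (k), (a)_{(k)} reduces to the classical (a)_k.
print(rising_partition(2.5, (3,)), rising(2.5, 3))
print(rising_partition(2.5, (2, 1)))   # (2.5)_2 * (2.0)_1 = 8.75 * 2 = 17.5
```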
Let $S$ be a real symmetric $n \times n$ matrix. For each partition $\lambda$, we denote by $C_\lambda(S)$ the zonal polynomial of the matrix $S$. A complete description of the zonal polynomials may be obtained from James [19], Muirhead [17], or Gross and Richards [20]. Noting that the present paper deals directly with aspects of integration over the orthogonal group $O(n)$, we remark that a direct definition of the zonal polynomials may be obtained as follows: For any symmetric $n \times n$ matrix $S$, and for $j = 1, \ldots, n$, denote by $S_j$ the principal minor of order $j$ of $S$. Let
$$\phi_\lambda(S) = \prod_{j=1}^{n} (\det S_j)^{\lambda_j - \lambda_{j+1}}, \qquad (2.5)$$
with $\lambda_{n+1} = 0$, be the power function corresponding to the partition $\lambda$. Denote by $dH$ the Haar measure on $O(n)$, normalized to be a probability measure. Then $C_\lambda(S)$, the zonal polynomial corresponding to the partition $\lambda$, may be defined by
$$C_\lambda(S) = c_\lambda \int_{O(n)} \phi_\lambda(H'SH)\, dH, \qquad (2.6)$$
where the normalizing constants $c_\lambda$ are positive and are chosen uniquely so that
$$\sum_{|\lambda| = k} C_\lambda(S) = (\operatorname{tr} S)^k \qquad (2.7)$$
for all $k$.
Integral representations of the type given in Equation (2.6) have played a crucial role in earlier studies of central limit theorems for positive definite random matrices (Richards [22]).
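The normalization (2.7) can be verified directly in low degree. For weight $k = 2$ the two zonal polynomials have the classical closed forms $C_{(2)}(S) = \frac13\left[(\operatorname{tr} S)^2 + 2\operatorname{tr}(S^2)\right]$ and $C_{(1,1)}(S) = \frac23\left[(\operatorname{tr} S)^2 - \operatorname{tr}(S^2)\right]$; these explicit values are taken from standard tables (e.g., Muirhead [17]), not from the present paper. The sketch below (Python/NumPy) checks that they sum to $(\operatorname{tr} S)^2$ for a random symmetric matrix.

```python
import numpy as np

def C2(S):
    """C_(2)(S) = ((tr S)^2 + 2 tr(S^2)) / 3, from standard zonal tables."""
    t1, t2 = np.trace(S), np.trace(S @ S)
    return (t1**2 + 2 * t2) / 3

def C11(S):
    """C_(1,1)(S) = 2((tr S)^2 - tr(S^2)) / 3, from standard zonal tables."""
    t1, t2 = np.trace(S), np.trace(S @ S)
    return 2 * (t1**2 - t2) / 3

rng = np.random.default_rng(2)
G = rng.standard_normal((4, 4))
S = G @ G.T                        # a random symmetric positive definite matrix
lhs = C2(S) + C11(S)
rhs = np.trace(S) ** 2             # normalization (2.7) with k = 2
print(lhs, rhs)
```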
We now introduce a generalized hypergeometric function of matrix argument. Let $b \in \mathbb{C}$ be such that $\tfrac12(j-1) - b$ is not a non-negative integer for all $j = 1, \ldots, n$. For any symmetric $n \times n$ matrix $S$, we define a generalized hypergeometric function of matrix argument,
$${}_0F_1(b; S) = \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{|\lambda| = k} \frac{C_\lambda(S)}{(b)_\lambda}, \qquad (2.8)$$
where the inner summation is over all partitions $\lambda$ of weight $k$.
By a result of Gross and Richards [20] (Theorem 6.3), the series in Equation (2.8) converges absolutely for all $S$. With $b = \tfrac{n}{2}$, it is a result of Herz [15] (p. 423; see also James [19]) that for any $n \times m$ real matrix $A$, there holds the integral formula,
$$\int e^{\operatorname{tr}(A'H)}\, dH = {}_0F_1\!\left(\tfrac{n}{2};\, \tfrac{1}{4}A'A\right), \qquad (2.9)$$
the integral being taken with respect to the uniform distribution on the set of $n \times m$ matrices $H$ with orthonormal columns (for $m = n$, the orthogonal group $O(n)$). This result generalizes a well-known formula that expresses a classical Bessel function as an integral over the unit circle; for this reason, the function ${}_0F_1$ also is viewed as a Bessel function of matrix argument.
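The classical identity alluded to here is $\frac{1}{2\pi}\int_0^{2\pi} e^{\mathrm{i}a\cos\theta}\,d\theta = J_0(a) = {}_0F_1\!\left(1; -\tfrac{a^2}{4}\right)$, the $1 \times 1$ case of Equation (2.9). A minimal numerical check (Python/NumPy; `f01_scalar` is our own helper for the scalar series):

```python
import numpy as np

def f01_scalar(b, x, terms=60):
    """Scalar hypergeometric 0F1(; b; x) = sum_k x^k / ((b)_k k!)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / ((b + k) * (k + 1))   # ratio of consecutive series terms
    return total

a = 1.7
theta = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)
# Average of e^{i a cos(theta)} over the unit circle equals J_0(a).
integral = np.exp(1j * a * np.cos(theta)).real.mean()
series = f01_scalar(1.0, -a**2 / 4)
print(integral, series)
```

The two quantities agree to high precision, since the equispaced average of a smooth periodic integrand is spectrally accurate.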
3. The Case of the Stiefel Manifold
We regard this section as preparatory for the ensuing new approach to Theorem 1.1, for the method of hypergeometric functions of matrix argument very easily yields the high-dimensional asymptotic behavior of random matrices taking values in Stiefel manifolds.
Denote by $V_{m,n}$ the Stiefel manifold of all $m$-tuples of orthonormal $n$-dimensional vectors; equivalently, $V_{m,n}$ is the set of $n \times m$ real matrices $H$ such that $H'H = I_m$. As a homogeneous space, $V_{m,n} \cong O(n)/O(n-m)$, hence is compact. An explicit description of the unique $O(n)$-invariant uniform distribution on $V_{m,n}$ is given by Herz [15]. The following result is both a generalization of Borel’s result for the unit sphere and an analog of Theorem 1.1 for the Stiefel manifold.
Theorem 3.1. Let $m$ be a fixed positive integer, and let $\{A_n\}$ be a sequence of real matrices such that $A_n$ is $n \times m$ and $\operatorname{tr}(A_n'A_n) = n$. For each $n \ge m$, let $H_n$ be a random matrix that is uniformly distributed on $V_{m,n}$. Then $\operatorname{tr}(A_n'H_n)$ converges in distribution to $\mathcal{N}(0,1)$ as $n \to \infty$.
To see how this result is obtained, we apply Equation (2.9) to obtain, for $t \in \mathbb{R}$,
$$E\, e^{\mathrm{i}t \operatorname{tr}(A_n'H_n)} = {}_0F_1\!\left(\tfrac{n}{2};\, -\tfrac{t^2}{4}\, A_n'A_n\right).$$
Because $A_n'A_n$ is an $m \times m$ matrix then, by Equation (2.6), $C_\lambda(A_n'A_n) = 0$ if $\lambda$ has length greater than $m$; therefore, in this case, the zonal polynomial expansion involves partitions of length at most $m$ only.

By Equation (2.4), we obtain for any partition $\lambda$ of weight $k$,
$$\left(\tfrac{n}{2}\right)_\lambda = \prod_{j=1}^{m} \left(\tfrac{n}{2} - \tfrac{1}{2}(j-1)\right)_{\lambda_j} \sim \left(\tfrac{n}{2}\right)^k \qquad (3.1)$$
for large $n$, and so, applying also Equation (2.7) and the condition $\operatorname{tr}(A_n'A_n) = n$, we obtain
$${}_0F_1\!\left(\tfrac{n}{2};\, -\tfrac{t^2}{4}A_n'A_n\right) = \sum_{k=0}^{\infty} \frac{(-t^2/4)^k}{k!} \sum_{|\lambda|=k} \frac{C_\lambda(A_n'A_n)}{(\tfrac{n}{2})_\lambda} \sim \sum_{k=0}^{\infty} \frac{(-t^2/4)^k}{k!} \cdot \frac{n^k}{(n/2)^k}.$$
Thus, as $n \to \infty$,
$${}_0F_1\!\left(\tfrac{n}{2};\, -\tfrac{t^2}{4}A_n'A_n\right) \to \sum_{k=0}^{\infty} \frac{(-t^2/2)^k}{k!} = e^{-t^2/2},$$
which establishes that $\operatorname{tr}(A_n'H_n)$ converges in distribution to $\mathcal{N}(0,1)$.
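In the scalar case $m = 1$ with $A_n'A_n = n$, the limit just derived reads ${}_0F_1\!\left(\tfrac{n}{2}; -\tfrac{n t^2}{4}\right) \to e^{-t^2/2}$, which can be observed numerically (Python; `f01_scalar` is our own series helper):

```python
import numpy as np

def f01_scalar(b, x, terms=200):
    """Scalar 0F1(; b; x) computed from its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= x / ((b + k) * (k + 1))
    return total

t = 1.3
# m = 1 and A_n'A_n = n: the characteristic function is 0F1(n/2; -n t^2/4).
approx = {n: f01_scalar(n / 2, -n * t**2 / 4) for n in (10, 100, 1000, 10000)}
target = np.exp(-t**2 / 2)
for n, val in approx.items():
    print(n, val, abs(val - target))
```

The error visibly shrinks (roughly like $1/n$) as $n$ grows.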
For general $A_n$ satisfying the hypotheses of Theorem 3.1, the argument given above also leads to the conclusion,
$$\lim_{n \to \infty} E\, e^{\mathrm{i}t \operatorname{tr}(A_n'H_n)} = e^{-t^2/2}, \qquad t \in \mathbb{R}.$$
We deduce, by applying the standard Cramér–Wold device, that for large $n$ the entries of the matrix $n^{1/2} H_n$ are asymptotically multivariate normally distributed with mean $0$ and identity covariance matrix. We note also that a similar conclusion may be obtained for the results to follow.
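A simulation of Theorem 3.1 (Python/NumPy; `haar_stiefel` is our own helper). We take $A_n = \sqrt{n/m}\,\begin{pmatrix} I_m \\ 0 \end{pmatrix}$, which satisfies $\operatorname{tr}(A_n'A_n) = n$ as the theorem assumes:

```python
import numpy as np

rng = np.random.default_rng(3)

def haar_stiefel(n, m, rng):
    """Uniform random n x m matrix with orthonormal columns (Stiefel
    manifold V_{m,n}), via reduced QR of a Gaussian matrix with sign fix."""
    q, r = np.linalg.qr(rng.standard_normal((n, m)))
    return q * np.sign(np.diag(r))

n, m, reps = 200, 3, 3000
A = np.sqrt(n / m) * np.eye(n, m)       # tr(A'A) = n, as in Theorem 3.1
samples = np.array([np.trace(A.T @ haar_stiefel(n, m, rng)) for _ in range(reps)])
print(samples.mean(), samples.var())
```

The empirical distribution of $\operatorname{tr}(A_n'H_n)$ is close to standard normal, as the theorem predicts.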
4. The Case of the Orthogonal Group
We now present a new approach to Theorem 1.1. In this setting, $A_n$ is an $n \times n$ real matrix satisfying the condition $\operatorname{tr}(A_nA_n') = n$, and the random matrix $H_n$ is uniformly distributed on $O(n)$. Then, for $t \in \mathbb{R}$, we again apply Equation (2.9) to deduce that the characteristic function of the random variable $\operatorname{tr}(A_nH_n)$ is
$$E\, e^{\mathrm{i}t \operatorname{tr}(A_nH_n)} = {}_0F_1\!\left(\tfrac{n}{2};\, -\tfrac{t^2}{4}\, A_nA_n'\right).$$
On expanding the ${}_0F_1$ function in a series of zonal polynomials, we obtain a generating function for the moments of the random variable $\operatorname{tr}(A_nH_n)$:
$$E\, e^{\mathrm{i}t \operatorname{tr}(A_nH_n)} = \sum_{k=0}^{\infty} \frac{(-t^2/4)^k}{k!} \sum_{|\lambda|=k} \frac{C_\lambda(A_nA_n')}{(\tfrac{n}{2})_\lambda}.$$
On comparing the coefficients of like powers of $t$ we deduce that, for $k = 0, 1, 2, \ldots$,
$$E\left[\operatorname{tr}(A_nH_n)\right]^{2k+1} = 0 \quad\text{and}\quad E\left[\operatorname{tr}(A_nH_n)\right]^{2k} = \frac{(2k)!}{4^k\, k!} \sum_{|\lambda|=k} \frac{C_\lambda(A_nA_n')}{(\tfrac{n}{2})_\lambda}. \qquad (4.1)$$
We now examine the asymptotic behavior of the $k$th moment of $\operatorname{tr}(A_nH_n)$ as $n \to \infty$. For a partition $\lambda$ of weight $k$, the same argument used at Equation (3.1), refined by a Taylor–Maclaurin expansion, yields
$$\left(\tfrac{n}{2}\right)_\lambda = \left(\tfrac{n}{2}\right)^k \left(1 + \frac{\rho_\lambda}{n} + O(n^{-2})\right)$$
as $n \to \infty$, where $\rho_\lambda$ is the quantity first encountered at Equation (2.1). Substituting this result into Equation (4.1), we obtain
$$E\left[\operatorname{tr}(A_nH_n)\right]^{2k} = \frac{(2k)!}{4^k\, k!} \left(\tfrac{n}{2}\right)^{-k} \sum_{|\lambda|=k} C_\lambda(A_nA_n') \left(1 - \frac{\rho_\lambda}{n} + O(n^{-2})\right).$$
On applying Equation (2.7) and the condition $\operatorname{tr}(A_nA_n') = n$, the leading term equals $(2k)!/(2^k k!)$. By Equation (2.6), it follows that $C_\lambda(A_nA_n') \ge 0$; hence, by applying Equation (2.2), we obtain
$$\left|\sum_{|\lambda|=k} \frac{\rho_\lambda}{n}\, C_\lambda(A_nA_n')\right| \le \frac{k(k-1)}{n} \sum_{|\lambda|=k} C_\lambda(A_nA_n') = \frac{k(k-1)}{n}\, n^k,$$
so that, after division by $n^k$, the error term is at most $k(k-1)/n$, an upper bound whose numerator is not dependent on $n$. Therefore, we conclude that for fixed $k$,
$$E\left[\operatorname{tr}(A_nH_n)\right]^{2k} \to \frac{(2k)!}{2^k\, k!} = (2k-1)!!$$
as $n \to \infty$, where $(2k-1)!!$ is the $(2k)$th moment of the standard normal distribution; the odd moments vanish identically. Finally, we apply the moment problem (Loève [23], p. 185) to deduce that $\operatorname{tr}(A_nH_n)$ converges in distribution to $\mathcal{N}(0,1)$.
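The moment computation above predicts, for example, $E[\operatorname{tr}(A_nH_n)]^2 \to 1$ and $E[\operatorname{tr}(A_nH_n)]^4 \to 3$. A Monte Carlo sketch with $A_n = I_n$ (Python/NumPy; `haar_orthogonal` is our own helper, and the tolerances below are loose sampling tolerances):

```python
import numpy as np

rng = np.random.default_rng(4)

def haar_orthogonal(n, rng):
    """Haar-distributed element of O(n) via QR with the sign fix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

n, reps = 40, 4000
tr = np.array([np.trace(haar_orthogonal(n, rng)) for _ in range(reps)])
m2, m4 = (tr**2).mean(), (tr**4).mean()
# Moments of N(0,1): E Z^2 = 1, E Z^4 = 3; the moment problem then
# identifies the limiting distribution as standard normal.
print(m2, m4)
```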
We remark that the condition $\operatorname{tr}(A_nA_n') = n$ can be weakened to require only that $n^{-1}\operatorname{tr}(A_nA_n') \to 1$, with a sufficiently fast rate of convergence, as $n \to \infty$.
It is also interesting to discover that the study of high-dimensional random orthogonal matrices yields a new inequality for the generalized hypergeometric function, ${}_0F_1$, of scalar matrix argument.
Proposition 4.1. There exist positive constants $c$ and $d$ such that, for all $t \in \mathbb{R}$ and all $n$,
$$\left|{}_0F_1\!\left(\tfrac{n}{2};\, -\tfrac{t^2}{4}\, I_n\right) - e^{-t^2/2}\right| \le c\, e^{-dn}.$$

Proof. Define the random variable $Y = \operatorname{tr}(H_n)$, where $H_n$ is uniformly distributed on $O(n)$. Denote by $f_Y$ and $\phi$ the probability density functions of $Y$ and the $\mathcal{N}(0,1)$ random variable, respectively. By Johansson [6], Theorem 3.7(b), there exist positive constants $c$ and $d$ such that
$$\int_{-\infty}^{\infty} \left|f_Y(y) - \phi(y)\right| dy \le c\, e^{-dn}$$
for all $n$. Therefore, for $t \in \mathbb{R}$,
$$\left|{}_0F_1\!\left(\tfrac{n}{2};\, -\tfrac{t^2}{4}\, I_n\right) - e^{-t^2/2}\right| = \left|\int_{-\infty}^{\infty} e^{\mathrm{i}ty} \left(f_Y(y) - \phi(y)\right) dy\right| \le \int_{-\infty}^{\infty} \left|f_Y(y) - \phi(y)\right| dy \le c\, e^{-dn}.$$
The proof is complete. □
5. The Case of the Unitary Group
As we noted in the introduction, the method used in Section 4 produces similar results in the case of the unitary and symplectic groups. We shall present the details in the unitary case; as regards the symplectic case, which we leave to the reader, we note that the necessary details on the zonal polynomials and generalized hypergeometric function may be obtained from the paper of Gross and Richards [20].
In the sequel, we denote by $A^*$ the adjoint of a complex matrix $A$: $A^* = \bar{A}'$. We also denote by $U(n)$ the group of $n \times n$ unitary matrices. The analog of Theorem 1.1 in the unitary case, due to Meckes [12], is the following:
Theorem 5.1. (Meckes [12]) Let $\{A_n\}$ be a sequence of complex matrices such that $A_n$ is $n \times n$ and $\operatorname{tr}(A_nA_n^*) = n$ for all $n$. Let $U_n$ be a random unitary matrix which is uniformly distributed on $U(n)$. Then $\operatorname{tr}(A_nU_n)$ converges in distribution to a standard complex normal random variable as $n \to \infty$.
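As with the orthogonal case, Theorem 5.1 can be explored by simulation (Python/NumPy; `haar_unitary` is our own helper). We take $A_n = I_n$; under the normalization assumed here, the limiting standard complex normal $Z$ has $E|Z|^2 = 1$, with real and imaginary parts each of variance $1/2$.

```python
import numpy as np

rng = np.random.default_rng(5)

def haar_unitary(n, rng):
    """Haar-distributed element of U(n): QR of a complex Gaussian matrix,
    with the standard diagonal phase correction."""
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    q, r = np.linalg.qr(g)
    d = np.diag(r)
    return q * (d / np.abs(d))

n, reps = 50, 3000
# A_n = I_n satisfies tr(A_n A_n^*) = n, so tr(A_n U_n) = tr(U_n).
samples = np.array([np.trace(haar_unitary(n, rng)) for _ in range(reps)])
print((np.abs(samples)**2).mean(), samples.real.var(), samples.imag.var())
```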
In this setting, we will need the analogs of the partitional rising factorial, the zonal polynomial, and the generalized hypergeometric function of matrix argument that pertain to the “complex” case; see James [19] or Gross and Richards [20,21]. Specifically, the partitional rising factorial is now defined as
$$[a]_\lambda = \prod_{j \ge 1} (a - j + 1)_{\lambda_j}, \qquad (5.1)$$
where each $(a - j + 1)_{\lambda_j}$ is a classical rising factorial as defined in Equation (2.3); the zonal polynomial is defined for any Hermitian $n \times n$ matrix $S$ as
$$\widetilde{C}_\lambda(S) = \widetilde{c}_\lambda \int_{U(n)} \phi_\lambda(U^*SU)\, dU,$$
where the power function $\phi_\lambda$ is defined in Equation (2.5), and the normalizing constants $\widetilde{c}_\lambda$ are positive and are chosen uniquely so that
$$\sum_{|\lambda|=k} \widetilde{C}_\lambda(S) = (\operatorname{tr} S)^k;$$
and for any $b \in \mathbb{C}$ such that $(j-1) - b$ is not a non-negative integer for all $j = 1, \ldots, n$, the generalized hypergeometric function of matrix argument is defined as
$${}_0\widetilde{F}_1(b; S) = \sum_{k=0}^{\infty} \frac{1}{k!} \sum_{|\lambda|=k} \frac{\widetilde{C}_\lambda(S)}{[b]_\lambda}.$$
Similar to the orthogonal case, the characteristic function of the random variable $\operatorname{tr}(A_nU_n)$ is
$$E\, e^{\mathrm{i}\,\Re[\bar{t}\operatorname{tr}(A_nU_n)]} = {}_0\widetilde{F}_1\!\left(n;\, -\tfrac{|t|^2}{4}\, A_nA_n^*\right), \qquad t \in \mathbb{C},$$
where ${}_0\widetilde{F}_1$ is a generalized hypergeometric function of Hermitian matrix argument. By expanding the ${}_0\widetilde{F}_1$ function in a series of complex zonal polynomials, we obtain a generating function for the moments of the random variable $\operatorname{tr}(A_nU_n)$:
$$E\, e^{\mathrm{i}\,\Re[\bar{t}\operatorname{tr}(A_nU_n)]} = \sum_{k=0}^{\infty} \frac{(-|t|^2/4)^k}{k!} \sum_{|\lambda|=k} \frac{\widetilde{C}_\lambda(A_nA_n^*)}{[n]_\lambda}.$$
By comparing like powers of $t$, we deduce that, for $k = 0, 1, 2, \ldots$,
$$E\left|\operatorname{tr}(A_nU_n)\right|^{2k} = k! \sum_{|\lambda|=k} \frac{\widetilde{C}_\lambda(A_nA_n^*)}{[n]_\lambda}.$$
By Equation (5.1),
$$[n]_\lambda = n^k \left(1 + \frac{\sigma_\lambda}{n} + O(n^{-2})\right)$$
as $n \to \infty$, where
$$\sigma_\lambda = \tfrac12 \sum_{j \ge 1} \lambda_j(\lambda_j - 2j + 1).$$
By means of an argument similar to that given in the proof of Lemma 2.1, the coefficients $\sigma_\lambda$ satisfy the bound $|\sigma_\lambda| \le k(k-1)$ for all partitions $\lambda$ of weight $k$; therefore,
$$\left|\sum_{|\lambda|=k} \frac{\sigma_\lambda}{n}\, \widetilde{C}_\lambda(A_nA_n^*)\right| \le \frac{k(k-1)}{n} \sum_{|\lambda|=k} \widetilde{C}_\lambda(A_nA_n^*) = \frac{k(k-1)}{n}\, n^k,$$
so we obtain, after division by $n^k$, an error term bounded by $k(k-1)/n$, whose numerator is not dependent on $n$. Therefore,
$$E\left|\operatorname{tr}(A_nU_n)\right|^{2k} = k!\left(1 + O(n^{-1})\right).$$
We conclude that for fixed $k$, $E\left|\operatorname{tr}(A_nU_n)\right|^{2k} \to k!$ as $n \to \infty$, where $k!$ is the $(2k)$th absolute moment of the standard complex normal distribution. Finally, we apply the moment problem to deduce that $\operatorname{tr}(A_nU_n)$ converges in distribution to a standard complex normal random variable.
We can also obtain an upper bound on the difference between the ${}_0\widetilde{F}_1$ function of scalar matrix argument and the corresponding Gaussian quantity, $e^{-t^2/4}$. The proof is similar to that of Proposition 4.1 and rests on an inequality of Johansson [6], Theorem 2.6(b).

Proposition 5.2. There exist positive constants $c$ and $d$ such that, for all $t \in \mathbb{R}$ and all $n$,
$$\left|{}_0\widetilde{F}_1\!\left(n;\, -\tfrac{t^2}{4}\, I_n\right) - e^{-t^2/4}\right| \le c\, e^{-dn}.$$