Exact Expressions for Kullback–Leibler Divergence for Multivariate and Matrix-Variate Distributions

Nawa, Victor; Nadarajah, Saralees

doi:10.3390/e26080663

Open AccessArticle

Exact Expressions for Kullback–Leibler Divergence for Multivariate and Matrix-Variate Distributions

by

Victor Nawa

¹

and

Saralees Nadarajah

^2,*

¹

Department of Mathematics and Statistics, University of Zambia, Lusaka 10101, Zambia

²

Department of Mathematics, University of Manchester, Manchester M13 9PL, UK

^*

Author to whom correspondence should be addressed.

Entropy 2024, 26(8), 663; https://doi.org/10.3390/e26080663

Submission received: 9 July 2024 / Revised: 2 August 2024 / Accepted: 3 August 2024 / Published: 4 August 2024

(This article belongs to the Section Information Theory, Probability and Statistics)

Download Versions Notes

Abstract

:

The Kullback–Leibler divergence is a measure of the divergence between two probability distributions, often used in statistics and information theory. However, exact expressions for it are not known for multivariate or matrix-variate distributions apart from a few cases. In this paper, exact expressions for the Kullback–Leibler divergence are derived for over twenty multivariate and matrix-variate distributions. The expressions involve various special functions.

Keywords:

beta function; gamma function; matrix-variate beta function; matrix-variate confluent hypergeometric function; matrix-variate gamma function; matrix-variate Gauss hypergeometric function; matrix-variate type I Dirichlet density; type I Dirichlet density

1. Introduction

The Kullback–Leibler divergence (KLD) due to [1] is a fundamental concept in information theory and statistics used to measure the divergence between two probability distributions. It quantifies how one probability distribution diverges from a second, reference probability distribution. Specifically, it calculates the expected extra amount of information required to represent data sampled from one distribution using a code optimized for another distribution. The KLD is asymmetric and not a true metric as it does not satisfy the triangle inequality. It is widely employed in various fields including machine learning, where it serves as a key component in tasks such as model comparison, optimization, and generative modeling, providing a measure of dissimilarity or discrepancy between probability distributions [2].

Suppose

X

is a continuous vector-variate random variable or a continuous matrix-variate random variable having one of two probability density functions

f_{i} (\cdot; θ_{i})

,

i = 1, 2

parameterized by

θ_{i}

,

i = 1, 2

. The KLD between

f_{1} (\cdot; θ_{1})

and

f_{2} (\cdot; θ_{2})

is defined by

\begin{matrix} K L D = E [log \frac{f_{1} (X; θ_{1})}{f_{2} (X; θ_{2})}], \end{matrix}

(1)

where the expectation is with respect to

f_{1} (\cdot; θ_{1})

.

Because of the increasing applications of the KLD, it is useful to have exact expressions for (1). Apart from the multivariate normal distribution, not many expressions have been derived for (1) for multivariate or matrix-variate distributions. The KLD for the multivariate generalized Gaussian distribution was derived only in 2019, see [3]. The KLD for the multivariate Cauchy distribution was derived only in 2022, see [4]. The KLD for the multivariate t distribution was derived only in 2023, see [5].

The aim of this paper is to derive exact expressions for (1) for over twenty multivariate and matrix-variate distributions. The exact expressions for multivariate distributions are presented in Section 2. The exact expressions for matrix-variate distributions are presented in Section 3. The derivations of all of the expressions including a technical lemma needed for the derivations are presented in Section 4. The distributions considered in this paper are continuous. We shall not be considering discrete distributions including mixtures.

The functions and parameters used in this paper are all real-valued. The calculations involve several real-valued special functions listed in the Appendix A.

2. Exact Expressions for Multivariate Distributions

In this section, we state the exact expressions for (1) for Dirichlet, multivariate generalized Gaussian, inverted Dirichlet, multivariate Gauss hypergeometric, multivariate Kotz type, ref. [6]’s multivariate logistic, ref. [7]’s multivariate logistic, ref. [8]’s multivariate normal, multivariate Pearson type II, multivariate Selberg beta, multivariate weighted exponential and von Mises distributions.

A closed form for (1) for the multivariate generalized Gaussian distribution was derived by [3]. But it involved a special function defined as a

(p - 1)

folded infinite sum. The expression we give in Section 2.2 is much simpler in that it involves a single infinite sum. A closed form for (1) for the Dirichlet distribution is available in [9] and https://statproofbook.github.io/P/dir-kl.html (accessed on 1 July 2024).

2.1. Dirichlet Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{\prod_{i = 1}^{K} x_{i}^{a_{i} - 1}}{B (a_{1}, \dots, a_{K - 1}; a_{K})} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{\prod_{i = 1}^{K} x_{i}^{b_{i} - 1}}{B (b_{1}, \dots, b_{K - 1}; b_{K})} \end{matrix}

for

K \geq 2

,

a_{1} > 0, \dots, a_{K} > 0

,

b_{1} > 0, \dots, b_{K} > 0

,

0 \leq x_{1} \leq 1, \dots, 0 \leq x_{K} \leq 1

and

x_{1} + \dots + x_{K} = 1

. The corresponding KLD is

\begin{matrix} K L D = log [\frac{B (b_{1}, \dots, b_{K - 1}; b_{K})}{B (a_{1}, \dots, a_{K - 1}; a_{K})}] + \sum_{i = 1}^{K} (a_{i} - b_{i}) ψ (a_{i}) - ψ (a_{1} + \dots + a_{K}) \sum_{i = 1}^{K} (a_{i} - b_{i}) . \end{matrix}

2.2. Multivariate Generalized Gaussian Distribution ([10], p. 215)

Consider the joint probability density functions

\begin{matrix} f_{1} (x_{1}, \dots, x_{p}) = \frac{α Γ (\frac{p}{2})}{2 π^{\frac{p}{2}} Γ (\frac{p}{α})} {|V_{1}|}^{- \frac{1}{2}} exp \{- {(x^{T} V_{1}^{- 1} x)}^{\frac{α}{2}}\} \end{matrix}

and

\begin{matrix} f_{2} (x_{1}, \dots, x_{p}) = \frac{β Γ (\frac{p}{2})}{2 π^{\frac{p}{2}} Γ (\frac{p}{β})} {|V_{2}|}^{- \frac{1}{2}} exp \{- {(x^{T} V_{2}^{- 1} x)}^{\frac{β}{2}}\} \end{matrix}

for

- \infty < x_{1} < \infty, \dots, - \infty < x_{p} < \infty

,

α > 0

,

β > 0

and

V_{1}

,

V_{2}

positive definite symmetric matrices. The corresponding KLD is

\begin{matrix} K L D = & log [\frac{α Γ (\frac{p}{β})}{β Γ (\frac{p}{α})}] + \frac{1}{2} log \frac{|V_{2}|}{|V_{1}|} - \frac{p}{α} + \frac{Γ (\frac{p}{2}) Γ (\frac{p + β}{2})}{π^{\frac{p}{2}} Γ (\frac{p}{α})} \sum_{r_{1} = 0}^{\infty} \sum_{r_{2} = 0}^{r_{1}} \dots \sum_{r_{p - 1} = 0}^{r_{p - 2}} (\binom{\frac{β}{2}}{r_{1}}) \\ \times (\binom{r_{1}}{r_{2}}) \dots (\binom{r_{p - 2}}{r_{p - 1}}) λ_{1}^{\frac{β}{2} - r_{1}} \prod_{j = 1}^{p - 1} [{(λ_{j + 1} - λ_{j})}^{r_{j} - r_{j + 1}} B (r_{j} + \frac{p - j}{2}, \frac{1}{2})] \end{matrix}

provided that the infinite sum converges.

2.3. Inverted Dirichlet Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{Γ (a_{1} + \dots + a_{K + 1})}{Γ (a_{1}) \dots Γ (a_{K + 1})} {(1 + \sum_{i = 1}^{K} x_{i})}^{- a_{1} - \dots - a_{K + 1}} \prod_{i = 1}^{K} x_{i}^{a_{i} - 1} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{Γ (b_{1} + \dots + b_{K + 1})}{Γ (b_{1}) \dots Γ (b_{K + 1})} {(1 + \sum_{i = 1}^{K} x_{i})}^{- b_{1} - \dots - b_{K + 1}} \prod_{i = 1}^{K} x_{i}^{b_{i} - 1} \end{matrix}

for

K \geq 2

,

a_{1} > 0, \dots, a_{K + 1} > 0

,

b_{1} > 0, \dots, b_{K + 1} > 0

and

x_{1} > 0, \dots, x_{K} > 0

. The corresponding KLD is

\begin{matrix} K L D = log [\frac{Γ (a_{1} + \dots + a_{K + 1}) Γ (b_{1}) \dots Γ (b_{K + 1})}{Γ (b_{1} + \dots + b_{K + 1}) Γ (a_{1}) \dots Γ (a_{K + 1})}] + \sum_{i = 1}^{K} (a_{i} - b_{i}) ψ (a_{i}) \\ - ψ (a_{K + 1}) \sum_{i = 1}^{K} (a_{i} - b_{i}) \\ + (b_{1} + \dots + b_{K + 1} - a_{1} - \dots - a_{K + 1}) [ψ (a_{1} + \dots + a_{K + 1}) - ψ (a_{K + 1})] . \end{matrix}

2.4. Multivariate Gauss Hypergeometric Distribution [11]

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = C (a_{1}, \dots, a_{K}, b, c) \frac{{(1 - \sum_{i = 1}^{K} x_{i})}^{b - 1} \prod_{i = 1}^{K} x_{i}^{a_{i} - 1}}{{(1 + \sum_{i = 1}^{K} x_{i})}^{c}} \end{matrix}

and

\begin{matrix} f_{2} (x) = C (d_{1}, \dots, d_{K}, e, f) \frac{{(1 - \sum_{i = 1}^{K} x_{i})}^{e - 1} \prod_{i = 1}^{K} x_{i}^{d_{i} - 1}}{{(1 + \sum_{i = 1}^{K} x_{i})}^{f}} \end{matrix}

for

K \geq 2

,

a_{1} > 0, \dots, a_{K} > 0

,

b > 0

,

- \infty < c < \infty

,

d_{1} > 0, \dots, d_{K} > 0

,

e > 0

,

- \infty < f < \infty

,

0 \leq x_{1} \leq 1, \dots, 0 \leq x_{K} \leq 1

and

x_{1} + \dots + x_{K} \leq 1

. The corresponding KLD is

\begin{matrix} K L D = log [\frac{C (a_{1}, \dots, a_{K}, b, c)}{C (d_{1}, \dots, d_{K}, e, f)}] + \sum_{i = 1}^{K} (a_{i} - d_{i}) {\frac{\partial}{\partial α} [\frac{C (a_{1}, \dots, a_{i}, \dots, a_{K}, b, c)}{C (a_{1}, \dots, a_{i} + α, \dots, a_{K}, b, c)}]|}_{α = 0} \\ + (b - e) {\frac{\partial}{\partial α} [\frac{C (a_{1}, \dots, a_{i}, \dots, a_{K}, b, c)}{C (a_{1}, \dots, a_{K}, b + α, c)}]|}_{α = 0} \\ - (c - f) {\frac{\partial}{\partial α} [\frac{C (a_{1}, \dots, a_{i}, \dots, a_{K}, b, c)}{C (a_{1}, \dots, a_{K}, b, c - α)}]|}_{α = 0} . \end{matrix}

2.5. Multivariate Kotz Type Distribution [12]

Consider the joint probability density functions

\begin{matrix} f_{1} (x_{1}, \dots, x_{p}) = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} {|Σ_{1}|}^{- \frac{1}{2}} {(x^{T} Σ_{1}^{- 1} x)}^{N - 1} exp \{- q {(x^{T} Σ_{1}^{- 1} x)}^{a}\} \end{matrix}

and

\begin{matrix} f_{2} (x_{1}, \dots, x_{p}) = \frac{b Γ (\frac{p}{2}) s^{\frac{2 M + p - 2}{2 b}}}{π^{\frac{p}{2}} Γ (\frac{2 M + p - 2}{2 b})} {|Σ_{2}|}^{- \frac{1}{2}} {(x^{T} Σ_{2}^{- 1} x)}^{M - 1} exp \{- s {(x^{T} Σ_{2}^{- 1} x)}^{b}\} \end{matrix}

for

- \infty < x_{1} < \infty, \dots, - \infty < x_{p} < \infty

,

a > 0

,

q > 0

,

N > 1 - \frac{p}{2}

,

b > 0

,

s > 0

,

M > 1 - \frac{p}{2}

and

Σ_{1}

,

Σ_{2}

positive definite symmetric matrices. The corresponding KLD is

\begin{matrix} K L D = log [\frac{a q^{\frac{2 N + p - 2}{2 a}} Γ (\frac{2 M + p - 2}{2 b})}{b s^{\frac{2 M + p - 2}{2 b}} Γ (\frac{2 N + p - 2}{2 a})}] + \frac{1}{2} log \frac{|Σ_{2}|}{|Σ_{1}|} + \frac{N - 1}{a} [ψ (\frac{2 N + p - 2}{2 a}) - log q] \\ - \frac{2 N + p - 2}{2 a} - (M - 1) \frac{Γ (\frac{p}{2})}{a π^{\frac{p}{2}}} \{ψ (\frac{p + 2 N - 2}{2 a}) - log q\} [\prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2})] \\ - (M - 1) \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} log λ_{1} [\prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2})] \\ + (M - 1) \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p - 1} = k} (\binom{k}{i_{1}, \dots, i_{p - 1}}) \\ \times \prod_{j = 1}^{p - 1} \{{[\frac{(λ_{j + 1} - λ_{j})}{λ_{1}}]}^{i_{j}} B (\sum_{ℓ = j}^{p - 1} i_{ℓ} + \frac{p - j}{2}, \frac{1}{2})\} \\ + \frac{s Γ (\frac{p}{2}) Γ (\frac{2 N + p + 2 b - 2}{2 a})}{π^{\frac{p}{2}} q^{\frac{b}{a}} Γ (\frac{2 N + p - 2}{2 a})} \sum_{r_{1} = 0}^{\infty} \sum_{r_{2} = 0}^{r_{1}} \dots \sum_{r_{p - 1} = 0}^{r_{p - 2}} (\binom{b}{r_{1}}) (\binom{r_{1}}{r_{2}}) \dots (\binom{r_{p - 2}}{r_{p - 1}}) λ_{1}^{b - r_{1}} \\ \times \prod_{j = 1}^{p - 1} [{(λ_{j + 1} - λ_{j})}^{r_{j} - r_{j + 1}} B (r_{j} + \frac{p - j}{2}, \frac{1}{2})] \end{matrix}

provided that the infinite sum converges.

2.6. Multivariate Logistic Distribution [6]

Consider the joint probability density functions

\begin{matrix} f_{1} (x_{1}, \dots, x_{p}) = \frac{p! a_{1} \dots a_{p} exp (- a_{1} x_{1} - \dots - a_{p} x_{p})}{{[1 + exp (- a_{1} x_{1}) + \dots + exp (- a_{p} x_{p})]}^{p + 1}} \end{matrix}

and

\begin{matrix} f_{2} (x_{1}, \dots, x_{p}) = \frac{p! b_{1} \dots b_{p} exp (- b_{1} x_{1} - \dots - b_{p} x_{p})}{{[1 + exp (- b_{1} x_{1}) + \dots + exp (- b_{p} x_{p})]}^{p + 1}} \end{matrix}

for

- \infty < x_{1} < \infty, \dots, - \infty < x_{p} < \infty

,

a_{1} > 0, \dots, a_{p} > 0

and

b_{1} > 0, \dots, b_{p} > 0

. The corresponding KLD is

\begin{matrix} K L D & = log (\frac{a_{1} \dots a_{p}}{b_{1} \dots b_{p}}) \\ - (p + 1) \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p} = k} (\binom{k}{i_{1}, \dots, i_{p}}) [\prod_{j = 1}^{p} Γ (\frac{a_{j} + i_{j} b_{j}}{a_{j}})] Γ (1 - \sum_{j = 1}^{p} \frac{i_{j} b_{j}}{a_{j}}) \\ - (p + 1) \frac{Γ^{'} (p + 1) - Γ (p + 1) Γ^{'} (1)}{p!} \end{matrix}

provided that

\sum_{j = 1}^{p} \frac{i_{j} b_{j}}{a_{j}} < 1

and the infinite series converges.

2.7. Multivariate Logistic Distribution [7]

Consider the joint probability density functions

\begin{matrix} f_{1} (x_{1}, \dots, x_{p}) = \frac{{(b)}_{p} a_{1} \dots a_{p} exp (- a_{1} x_{1} - \dots - a_{p} x_{p})}{{[1 + exp (- a_{1} x_{1}) + \dots + exp (- a_{p} x_{p})]}^{b + p}} \end{matrix}

and

\begin{matrix} f_{2} (x_{1}, \dots, x_{p}) = \frac{{(d)}_{p} c_{1} \dots c_{p} exp (- c_{1} x_{1} - \dots - c_{p} x_{p})}{{[1 + exp (- c_{1} x_{1}) + \dots + exp (- c_{p} x_{p})]}^{d + p}} \end{matrix}

for

- \infty < x_{1} < \infty, \dots, - \infty < x_{p} < \infty

,

a_{1} > 0, \dots, a_{p} > 0

,

c_{1} > 0, \dots, c_{p} > 0

,

b > 0

and

d > 0

. The corresponding KLD is

\begin{matrix} K L D & = log [\frac{{(b)}_{p} a_{1} \dots a_{p}}{{(d)}_{p} b_{1} \dots b_{p}}] \\ - \frac{d + p}{Γ (b)} \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p} = k} (\binom{k}{i_{1}, \dots, i_{p}}) [\prod_{j = 1}^{p} Γ (\frac{a_{j} + i_{j} c_{j}}{a_{j}})] Γ (b - \sum_{j = 1}^{p} \frac{i_{j} c_{j}}{a_{j}}) \\ - (b + p) \frac{Γ (b) Γ^{'} (b + p) - Γ (b + p) Γ^{'} (b)}{Γ (b) Γ (b + p)} \end{matrix}

provided that

\sum_{j = 1}^{p} \frac{i_{j} c_{j}}{a_{j}} < b

and the infinite series converges.

2.8. Sarabia [8]’s Multivariate Normal Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x_{1}, \dots, x_{p}) = \frac{\sqrt{a_{1} \dots a_{p}} β_{p} (c, a_{1}, \dots, a_{p})}{{(2 π)}^{\frac{p}{2}}} exp \{- \frac{1}{2} [\sum_{i = 1}^{p} (a_{i} x_{i}^{2}) + c \prod_{i = 1}^{p} (a_{i} x_{i}^{2})]\} \end{matrix}

and

\begin{matrix} f_{2} (x_{1}, \dots, x_{p}) = \frac{\sqrt{b_{1} \dots b_{p}} β_{p} (d, b_{1}, \dots, b_{p})}{{(2 π)}^{\frac{p}{2}}} exp \{- \frac{1}{2} [\sum_{i = 1}^{p} (b_{i} x_{i}^{2}) + d \prod_{i = 1}^{p} (b_{i} x_{i}^{2})]\} \end{matrix}

for

- \infty < x_{1} < \infty, \dots, - \infty < x_{p} < \infty

,

a_{1} > 0, \dots, a_{p} > 0

,

b_{1} > 0, \dots, b_{p} > 0

,

c > 0

and

d > 0

, where

β_{p} (c, a_{1}, \dots, a_{p})

and

β_{p} (d, b_{1}, \dots, b_{p})

denote normalizing constants. The corresponding KLD is

\begin{matrix} K L D & = log [\frac{\sqrt{a_{1} \dots a_{p}} β_{p} (c, a_{1}, \dots, a_{p})}{\sqrt{b_{1} \dots b_{p}} β_{p} (d, b_{1}, \dots, b_{p})}] \\ + [\frac{1}{2} - \frac{c β_{p}^{'} (c, a_{1}, \dots, a_{p})}{β_{p} (c, a_{1}, \dots, a_{p})}] \sum_{i = 1}^{p} (\frac{b_{i}}{a_{i}} - 1) \\ + (\frac{d b_{1} \dots b_{p}}{a_{1} \dots a_{p}} - c) \frac{β_{p}^{'} (c, a_{1}, \dots, a_{p})}{β_{p} (c, a_{1}, \dots, a_{p})}, \end{matrix}

where

β_{p}^{'} (c, a_{1}, \dots, a_{p}) = \frac{\partial}{\partial c} β_{p} (c, a_{1}, \dots, a_{p})

.

2.9. Multivariate Pearson Type II Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{π^{- \frac{K}{2}} Γ (\frac{K}{2}) Γ (\frac{K}{2} + a + b - 1)}{Γ (a) Γ (\frac{K}{2} + b - 1)} {(\sum_{i = 1}^{K} x_{i}^{2})}^{b - 1} {(1 - \sum_{i = 1}^{K} x_{i}^{2})}^{a - 1} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{π^{- \frac{K}{2}} Γ (\frac{K}{2}) Γ (\frac{K}{2} + c + d - 1)}{Γ (c) Γ (\frac{K}{2} + d - 1)} {(\sum_{i = 1}^{K} x_{i}^{2})}^{d - 1} {(1 - \sum_{i = 1}^{K} x_{i}^{2})}^{c - 1} \end{matrix}

for

K \geq 2

,

a > 0

,

b > 0

,

c > 0

,

d > 0

and

0 < x_{1}^{2} + \dots + x_{K}^{2} < 1

. The corresponding KLD is

\begin{matrix} K L D = log [\frac{Γ (\frac{K}{2} + a + b - 1) Γ (c) Γ (\frac{K}{2} + d - 1)}{Γ (a) Γ (\frac{K}{2} + b - 1) Γ (\frac{K}{2} + c + d - 1)}] \\ + (b - d) [ψ (\frac{K}{2} + b - 1) - ψ (\frac{K}{2} + a + b - 1)] \\ + (a - c) [ψ (a) - ψ (\frac{K}{2} + a + b - 1)] . \end{matrix}

2.10. Multivariate Selberg Beta Distribution [13]

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = C (a, b, c) {|\prod_{1 \leq i < j \leq p} (x_{i} - x_{j})|}^{2 c} \prod_{i = 1}^{p} [x_{i}^{a - 1} {(1 - x_{i})}^{b - 1}] \end{matrix}

and

\begin{matrix} f_{2} (x) = C (d, e, f) {|\prod_{1 \leq i < j \leq p} (x_{i} - x_{j})|}^{2 f} \prod_{i = 1}^{p} [x_{i}^{d - 1} {(1 - x_{i})}^{e - 1}] \end{matrix}

for

a > 0

,

b > 0

,

c > 0

,

d > 0

,

e > 0

,

f > 0

and

0 < x_{1} < 1, \dots, 0 < x_{p} < 1

. The corresponding KLD is

\begin{matrix} K L D = log [\frac{C (d, e, f)}{C (a, b, c)}] + (a - d) {\frac{\partial}{\partial α} [\frac{C (a, b, c)}{C (a + α, b, c)}]|}_{α = 0} \\ + (b - e) {\frac{\partial}{\partial α} [\frac{C (a, b, c)}{C (a, b + α, c)}]|}_{α = 0} \\ + 2 (c - f) {\frac{\partial}{\partial α} [\frac{C (a, b, c)}{C (a, b, c + α)}]|}_{α = 0} . \end{matrix}

2.11. Multivariate Weighted Exponential Distribution [14]

Consider the joint probability density functions

\begin{matrix} f_{1} (x_{1}, \dots, x_{p}) = \frac{(\sum_{i = 1}^{p} a_{i}) (\prod_{i = 1}^{p} a_{i})}{a_{p + 1}} \{1 - exp [- a_{p + 1} min (x_{1}, \dots, x_{p})]\} [\prod_{i = 1}^{p} exp (- a_{i} x_{i})] \end{matrix}

and

\begin{matrix} f_{2} (x_{1}, \dots, x_{p}) = \frac{(\sum_{i = 1}^{p} b_{i}) (\prod_{i = 1}^{p} b_{i})}{b_{p + 1}} \{1 - exp [- b_{p + 1} min (x_{1}, \dots, x_{p})]\} [\prod_{i = 1}^{p} exp (- b_{i} x_{i})] \end{matrix}

for

x_{1} > 0, \dots, x_{p} > 0

,

a_{1} > 0, \dots, a_{p + 1} > 0

and

b_{1} > 0, \dots, b_{p + 1} > 0

. The corresponding KLD is

\begin{matrix} K L D & = log [\frac{b_{p + 1} (\sum_{i = 1}^{p} a_{i}) (\prod_{i = 1}^{p} a_{i})}{a_{p + 1} (\sum_{i = 1}^{p} b_{i}) (\prod_{i = 1}^{p} b_{i})}] + \sum_{i = 1}^{p} (b_{i} - a_{i}) (\frac{1}{a_{i}} + \frac{1}{a_{p + 1}}) \\ - \sum_{k = 1}^{\infty} \frac{(a_{1} + \dots + a_{p}) (a_{1} + \dots + a_{p} + a_{p + 1})}{k (a_{1} + \dots + a_{p} + k a_{p + 1}) [a_{1} + \dots + a_{p} + (k + 1) a_{p + 1}]} \\ + \sum_{k = 1}^{\infty} \frac{(a_{1} + \dots + a_{p}) (a_{1} + \dots + a_{p} + a_{p + 1})}{k (a_{1} + \dots + a_{p} + k b_{p + 1}) [a_{1} + \dots + a_{p} + a_{p + 1} + k b_{p + 1}]}, \end{matrix}

which follows from properties stated in [14] provided that the infinite series converge.

2.12. Von Mises Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{κ_{1}^{\frac{p}{2} - 1} exp (κ_{1} μ_{1}^{T} x)}{{(2 π)}^{\frac{p}{2}} I_{\frac{p}{2} - 1} (κ_{1})} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{κ_{2}^{\frac{p}{2} - 1} exp (κ_{2} μ_{2}^{T} x)}{{(2 π)}^{\frac{p}{2}} I_{\frac{p}{2} - 1} (κ_{2})} \end{matrix}

for

κ_{1} > 0

,

κ_{2} > 0

,

μ_{1}^{T} μ_{1} = 1

,

μ_{2}^{T} μ_{2} = 1

and

x^{T} x = 1

. The corresponding KLD is

\begin{matrix} K L D = log [\frac{κ_{1}^{\frac{p}{2} - 1}}{κ_{2}^{\frac{p}{2} - 1}} \frac{I_{\frac{p}{2} - 1} (κ_{2})}{I_{\frac{p}{2} - 1} (κ_{1})}] + (κ_{1} - κ_{2} μ_{2}^{T} μ_{1}) . \end{matrix}

3. Exact Expressions for Matrix-Variate Distributions

In this section, we state exact expressions for (1) for matrix-variate beta, matrix-variate Dirichlet, matrix-variate gamma, matrix-variate Gauss hypergeometric, matrix-variate inverse beta, matrix-variate inverse gamma, matrix-variate Kummer beta, matrix-variate Kummer gamma, matrix-variate normal and matrix-variate two-sided power distributions.

3.1. Matrix-Variate Beta Distribution [15]

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{{|Ω - x|}^{a - \frac{p + 1}{2}} {|x|}^{b - \frac{p + 1}{2}}}{{|Ω|}^{a + b} B_{p} (a, b)} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{{|Ω - x|}^{c - \frac{p + 1}{2}} {|x|}^{d - \frac{p + 1}{2}}}{{|Ω|}^{c + d} B_{p} (c, d)} \end{matrix}

for

a > \frac{p - 1}{2}

,

b > \frac{p - 1}{2}

,

c > \frac{p - 1}{2}

,

d > \frac{p - 1}{2}

and

Ω

,

x

,

Ω - x

being

p \times p

positive definite matrices. The corresponding KLD is

\begin{matrix} K L D = log [\frac{{|Ω|}^{c + d} B_{p} (c, d)}{{|Ω|}^{a + b} B_{p} (a, b)}] + (a - c) {\frac{\partial}{\partial α} \{\frac{{|Ω|}^{α} B_{p} (α + a, b)}{B_{p} (a, b)}\}|}_{α = 0} \\ + (b - d) {\frac{\partial}{\partial α} \{\frac{{|Ω|}^{α} B_{p} (a, α + b)}{B_{p} (a, b)}\}|}_{α = 0} . \end{matrix}

3.2. Matrix-Variate Dirichlet Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x_{1}, \dots, x_{n}) = \frac{1}{B_{p} (a_{1}, \dots, a_{n}; a_{n + 1})} {|x_{1}|}^{a_{1} - p} \dots {|x_{n}|}^{a_{n} - p} {|I_{p} - \sum_{i = 1}^{n} x_{i}|}^{a_{n + 1} - p} \end{matrix}

and

\begin{matrix} f_{2} (x_{1}, \dots, x_{n}) = \frac{1}{B_{p} (b_{1}, \dots, b_{n}; b_{n + 1})} {|x_{1}|}^{b_{1} - p} \dots {|x_{n}|}^{b_{n} - p} {|I_{p} - \sum_{i = 1}^{n} x_{i}|}^{b_{n + 1} - p} \end{matrix}

for

a_{1} > \frac{p - 1}{2}, \dots, a_{n + 1} > \frac{p - 1}{2}

,

b_{1} > \frac{p - 1}{2}, \dots, b_{n + 1} > \frac{p - 1}{2}

and

x_{1}, \dots, x_{n}

,

I_{p} - x_{1} - \dots - x_{n}

being

p \times p

positive definite matrices. The corresponding KLD is

\begin{matrix} K L D = log [\frac{B_{p} (b_{1}, \dots, b_{n}; b_{n + 1})}{B_{p} (a_{1}, \dots, a_{n}; a_{n + 1})}] + (a_{n + 1} - b_{n + 1}) {\frac{\partial}{\partial α} \frac{B_{p} (a_{1}, \dots, a_{n}; a_{n + 1} + α)}{B_{p} (a_{1}, \dots, a_{n}; a_{n + 1})}|}_{α = 0} \\ + \sum_{i = 1}^{n} (a_{i} - b_{i}) {\frac{\partial}{\partial α} \frac{B_{p} (a_{1}, \dots, a_{i} + α, \dots, a_{n}; a_{n + 1})}{B_{p} (a_{1}, \dots, a_{i}, \dots, a_{n}; a_{n + 1})}|}_{α = 0} . \end{matrix}

3.3. Matrix-Variate Gamma Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{{|Σ_{1}|}^{- a}}{b^{a p} Γ_{p} (a)} {|x|}^{a - \frac{p + 1}{2}} exp [tr (- \frac{1}{b} Σ_{1}^{- 1} x)] \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{{|Σ_{2}|}^{- c}}{d^{c p} Γ_{p} (c)} {|x|}^{c - \frac{p + 1}{2}} exp [tr (- \frac{1}{d} Σ_{2}^{- 1} x)] \end{matrix}

for

a > \frac{p - 1}{2}

,

b > 0

,

c > \frac{p - 1}{2}

,

d > 0

and

x

,

Σ_{1}

,

Σ_{2}

being

p \times p

positive definite symmetric matrices. The corresponding KLD is

\begin{matrix} K L D = log [\frac{d^{p c} Γ_{p} (c)}{b^{p a} Γ_{p} (a)} \frac{{|Σ_{2}|}^{c}}{{|Σ_{1}|}^{a}}] + \frac{a - c}{Γ_{p} (a)} {\frac{\partial}{\partial α} [{|Σ_{1}|}^{α} b^{p α} Γ_{p} (a + α)]|}_{α = 0} + \frac{2 a p}{b} - tr [- \frac{2 a}{d} Σ_{2}^{- 1} Σ_{1}] . \end{matrix}

3.4. Matrix-Variate Gauss Hypergeometric Distribution [16]

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{{|x|}^{a - \frac{p + 1}{2}} {|I_{p} - x|}^{b - \frac{p + 1}{2}}}{B_{p} (a, b)_{2} F_{1} (a, c; a + b; - B) {|I_{p} + B x|}^{c}} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{{|x|}^{d - \frac{p + 1}{2}} {|I_{p} - x|}^{e - \frac{p + 1}{2}}}{B_{p} (d, e)_{2} F_{1} (d, f; d + e; - B) {|I_{p} + B x|}^{f}} \end{matrix}

for

a > \frac{p - 1}{2}

,

b > \frac{p - 1}{2}

,

0 \leq c < \infty

,

d > \frac{p - 1}{2}

,

e > \frac{p - 1}{2}

,

0 \leq f < \infty

and

x

,

I_{p} - x

,

B

,

I_{p} + B

being

p \times p

positive definite matrices, where

Γ_{p} (c)

and

Γ_{p} (f)

are assumed to exist. The corresponding KLD is

\begin{matrix} K L D = log [\frac{B_{p} (d, e)_{2} F_{1} (d, f; d + e; - B)}{B_{p} (a, b)_{2} F_{1} (a, c; a + b; - B)}] \\ + (a - d) {\frac{\partial}{\partial α} \frac{B_{p} (a + α, b)_{2} F_{1} (a + α, c; a + b + α; - B)}{B_{p} (a, b)_{2} F_{1} (a, c; a + b; - B)}|}_{α = 0} \\ + (b - e) {\frac{\partial}{\partial α} \frac{B_{p} (a, b + α)_{2} F_{1} (a, c; a + b + α; - B)}{B_{p} (a, b)_{2} F_{1} (a, c; a + b; - B)}|}_{α = 0} \\ + (f - c) {\frac{\partial}{\partial α} \frac{_{2} F_{1} (a, c - α; a + b; - B)}{_{2} F_{1} (a, c; a + b; - B)}|}_{α = 0} . \end{matrix}

3.5. Matrix-Variate Inverse Beta Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{{|Ω + x|}^{- a - b} {|x|}^{b - \frac{p + 1}{2}}}{{|Ω|}^{- a} B_{p} (a, b)} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{{|Ω + x|}^{- c - d} {|x|}^{d - \frac{p + 1}{2}}}{{|Ω|}^{- c} B_{p} (c, d)} \end{matrix}

for

a > \frac{p - 1}{2}

,

b > \frac{p - 1}{2}

,

c > \frac{p - 1}{2}

,

d > \frac{p - 1}{2}

and

x

,

Ω

,

Ω + x

being

p \times p

positive definite matrices. The corresponding KLD is

\begin{matrix} K L D = log [\frac{{|Ω|}^{a - c} B_{p} (c, d)}{B_{p} (a, b)}] + (c + d - a - b) {\frac{\partial}{\partial α} \{\frac{{|Ω|}^{α} B_{p} (a - α, b)}{B_{p} (a, b)}\}|}_{α = 0} \\ + (b - d) {\frac{\partial}{\partial α} \{\frac{{|Ω|}^{α} B_{p} (a - α, α + b)}{B_{p} (a, b)}\}|}_{α = 0} . \end{matrix}

3.6. Matrix-Variate Inverse Gamma Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{{|Σ_{1}|}^{a}}{b^{a p} Γ_{p} (a)} {|x|}^{- a - \frac{p + 1}{2}} exp [tr (- \frac{1}{b} Σ_{1} x^{- 1})] \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{{|Σ_{2}|}^{c}}{d^{c p} Γ_{p} (c)} {|x|}^{- c - \frac{p + 1}{2}} exp [tr (- \frac{1}{d} Σ_{2} x^{- 1})] \end{matrix}

for

a > \frac{p - 1}{2}

,

b > 0

,

c > \frac{p - 1}{2}

,

d > 0

and

x

,

Σ_{1}

,

Σ_{2}

being

p \times p

positive definite symmetric matrices. The corresponding KLD is

\begin{matrix} K L D = log [\frac{d^{p c} Γ_{p} (c)}{b^{p a} Γ_{p} (a)} \frac{{|Σ_{1}|}^{a}}{{|Σ_{2}|}^{c}}] + \frac{c - a}{Γ_{p} (a)} {\frac{\partial}{\partial α} [\frac{{|Σ_{1}|}^{α} Γ_{p} (a - α)}{b^{p α}}]|}_{α = 0} \\ + \frac{2 a}{d} tr [Σ_{2} Σ_{1}^{- 1}] - \frac{2 a p}{b} . \end{matrix}

3.7. Matrix-Variate Kummer Beta Distribution [17]

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{{|x|}^{a - \frac{p + 1}{2}} {|I_{p} - x|}^{b - \frac{p + 1}{2}} exp [- tr (B_{1} x)]}{B_{p} (a, b)_{1} F_{1} (a, a + b; - B_{1})} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{{|x|}^{c - \frac{p + 1}{2}} {|I_{p} - x|}^{d - \frac{p + 1}{2}} exp [- tr (B_{2} x)]}{B_{p} (c, d)_{1} F_{1} (c; c + d; - B_{2})} \end{matrix}

for

a > \frac{p - 1}{2}

,

b > \frac{p - 1}{2}

,

c > \frac{p - 1}{2}

,

d > \frac{p - 1}{2}

and

x

,

I_{p} - x

,

B_{1}

,

B_{2}

being

p \times p

positive definite matrices. The corresponding KLD is

\begin{matrix} K L D = log [\frac{B_{p} (c, d)_{1} F_{1} (c; c + d; - B_{2})}{B_{p} (a, b)_{1} F_{1} (a, a + b; - B_{1})}] \\ + (a - c) {\frac{\partial}{\partial α} \frac{B_{p} (a + α, b)_{1} F_{1} (a + α; a + b + α; - B_{1})}{B_{p} (a, b)_{1} F_{1} (a; a + b; - B_{1})}|}_{α = 0} \\ + (b - d) {\frac{\partial}{\partial α} \frac{B_{p} (a, b + α)_{1} F_{1} (a; a + b + α; - B_{1})}{B_{p} (a, b)_{1} F_{1} (a; a + b; - B_{1})}|}_{α = 0} \\ + tr [(B_{2} - B_{1}) {\frac{\partial}{\partial z} \frac{_{1} F_{1} (a; a + b; z - B_{1})}{_{1} F_{1} (a; a + b; - B_{1})}|}_{z = 0}] . \end{matrix}

3.8. Matrix-Variate Kummer Gamma Distribution [18]

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{{|x|}^{a - \frac{p + 1}{2}} {|I_{p} + x|}^{- b} exp [- tr (B_{1} x)]}{Γ_{p} (a) Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{{|x|}^{c - \frac{p + 1}{2}} {|I_{p} + x|}^{- d} exp [- tr (B_{2} x)]}{Γ_{p} (c) Ψ_{1} (c; c - d + \frac{p + 1}{2}; B_{2})} \end{matrix}

for

a > \frac{p - 1}{2}

,

- \infty < b < \infty

,

c > \frac{p - 1}{2}

,

- \infty < d < \infty

and

x

,

I_{p} + x

,

B_{1}

,

B_{2}

being

p \times p

positive definite matrices, where

Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})

and

Ψ_{1} (c; c - d + \frac{p + 1}{2}; B_{2})

denote Kummer functions with matrix arguments and the parameters are chosen such that these functions exist. The corresponding KLD is

\begin{matrix} K L D = log [\frac{Γ_{p} (c) Ψ_{1} (c; c - d + \frac{p + 1}{2}; B_{2})}{Γ_{p} (a) Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})}] \\ + (a - c) {\frac{\partial}{\partial α} \frac{Γ_{p} (a + α) Ψ_{1} (a + α; a - b + α + \frac{p + 1}{2}; B_{1})}{Γ_{p} (a) Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})}|}_{α = 0} \\ + (d - p) {\frac{\partial}{\partial α} \frac{Ψ_{1} (a; a - b + α + \frac{p + 1}{2}; B_{1})}{Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})}|}_{α = 0} \\ + tr [(B_{2} - B_{1}) {\frac{\partial}{\partial z} \frac{Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1} - z)}{Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})}|}_{z = 0}] . \end{matrix}

3.9. Matrix-Variate Normal Distribution

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = \frac{1}{{(2 π)}^{\frac{n p}{2}} {|V_{1}|}^{\frac{n}{2}} {|U_{1}|}^{\frac{p}{2}}} exp \{- \frac{1}{2} tr [V_{1}^{- 1} {(x - M_{1})}^{T} U_{1}^{- 1} (x - M_{1})]\} \end{matrix}

and

\begin{matrix} f_{2} (x) = \frac{1}{{(2 π)}^{\frac{n p}{2}} {|V_{2}|}^{\frac{n}{2}} {|U_{2}|}^{\frac{p}{2}}} exp \{- \frac{1}{2} tr [V_{2}^{- 1} {(x - M_{2})}^{T} U_{2}^{- 1} (x - M_{2})]\} \end{matrix}

for

U_{1}

,

U_{2}

being positive definite symmetric matrices of dimension

n \times n

,

V_{1}

,

V_{2}

being positive definite symmetric matrices of dimension

p \times p

and

M_{1}

,

M_{2}

being matrices of dimension

n \times p

. The corresponding KLD is

\begin{matrix} K L D = log [\frac{{|V_{2}|}^{\frac{n}{2}} {|U_{2}|}^{\frac{p}{2}}}{{|V_{1}|}^{\frac{n}{2}} {|U_{1}|}^{\frac{p}{2}}}] + \frac{1}{2} tr (U_{1}) tr (V_{2}^{- 1} V_{1} U_{2}^{- 1}) + \frac{1}{2} tr (V_{2}^{- 1} M_{1}^{T} M_{1} U_{2}^{- 1}) \\ - \frac{1}{2} tr (V_{2}^{- 1} M_{1}^{T} M_{2} U_{2}^{- 1}) - \frac{1}{2} tr (V_{2}^{- 1} M_{2}^{T} M_{1} U_{2}^{- 1}) \\ + \frac{1}{2} tr (V_{2}^{- 1} M_{2}^{T} M_{2} U_{2}^{- 1}) - \frac{1}{2} tr (U_{1}) tr (U_{1}^{- 1}) . \end{matrix}

3.10. Matrix-Variate Two-Sided Power Distribution [19]

Consider the joint probability density functions

\begin{matrix} f_{1} (x) = C (a) \{\begin{matrix} {|x|}^{a - \frac{p + 1}{2}} {|B|}^{- a + \frac{p + 1}{2}}, & 0_{p} < x \leq B, \\ {|I_{p} - x|}^{a - \frac{p + 1}{2}} {|I_{p} - B|}^{- a + \frac{p + 1}{2}}, & B \leq x \leq I_{p} \end{matrix} \end{matrix}

and

\begin{matrix} f_{2} (x) = C (b) \{\begin{matrix} {|x|}^{b - \frac{p + 1}{2}} {|B|}^{- b + \frac{p + 1}{2}}, & 0_{p} < x \leq B, \\ {|I_{p} - x|}^{b - \frac{p + 1}{2}} {|I_{p} - B|}^{- b + \frac{p + 1}{2}}, & B \leq x \leq I_{p}, \end{matrix} \end{matrix}

for

a > \frac{p - 1}{2}

,

b > \frac{p - 1}{2}

and

x

,

I_{p} - x

,

B

,

I_{p} - B

being

p \times p

positive definite matrices, where

\begin{matrix} C (a) = {|B|}^{\frac{p + 1}{2}} B_{p} (a, \frac{p + 1}{2}) + {|I_{p} - B|}^{\frac{p + 1}{2}} B_{p} (\frac{p + 1}{2}, a) \end{matrix}

and

\begin{matrix} C (b) = {|B|}^{\frac{p + 1}{2}} B_{p} (b, \frac{p + 1}{2}) + {|I_{p} - B|}^{\frac{p + 1}{2}} B_{p} (\frac{p + 1}{2}, b) . \end{matrix}

The corresponding KLD is

\begin{matrix} K L D = log [\frac{C (a)}{C (b)}] + (a - b) {\frac{\partial}{\partial α} \{{|B|}^{α + \frac{p + 1}{2}} B_{p} (a + α, \frac{p + 1}{2})\}|}_{α = 0} \\ + (a - b) {\frac{\partial}{\partial α} \{{|I_{p} - B|}^{α + \frac{p + 1}{2}} B_{p} (\frac{p + 1}{2}, a + α)\}|}_{α = 0} \\ + (b - a) {|B|}^{\frac{p + 1}{2}} log |B| B_{p} (a, \frac{p + 1}{2}) \\ + (b - a) {|I_{p} - B|}^{\frac{p + 1}{2}} log |I_{p} - B| B_{p} (\frac{p + 1}{2}, a) . \end{matrix}

4. Proofs

Before presenting the proofs of the expressions in Section 2 and Section 3, we state a lemma and give its proof.

4.1. A Technical Lemma

Lemma 1.

Let

\begin{matrix} I (a_{1}, \dots, a_{p}, t_{1}, \dots, t_{p}, b) = \int_{R^{p}} \frac{exp (- \sum_{j = 1}^{p} t_{j} x_{j})}{{[1 + \sum_{j = 1}^{p} exp (- a_{j} x_{j})]}^{b}} d x_{1} \dots d x_{p} \end{matrix}

for

t_{j} > 0

,

a_{j} > 0

,

j = 1, 2, \dots, p

and

b > 0

. Then,

\begin{matrix} I (a_{1}, \dots, a_{p}, t_{1}, \dots, t_{p}, b) = \frac{1}{a_{1} \dots a_{p} Γ (b)} [\prod_{j = 1}^{p} Γ (\frac{t_{j}}{a_{j}})] Γ (b - \sum_{j = 1}^{p} \frac{t_{j}}{a_{j}}) \end{matrix}

provided that

b - \sum_{j = 1}^{p} \frac{t_{j}}{a_{j}} > 0

.

Proof.

Setting

y_{j} = exp (- a_{j} x_{j})

and assuming the conditions in the lemma, we can write

\begin{matrix} I (a_{1}, \dots, a_{p}, t_{1}, \dots, t_{p}, b) \\ = \frac{1}{Γ (b)} \int_{R^{p}} \int_{0}^{\infty} t^{b - 1} exp \{- [1 + \sum_{j = 1}^{p} exp (- a_{j} x_{j})] t\} exp (- \sum_{j = 1}^{p} t_{j} x_{j}) d t d x_{1} \dots d x_{p} \\ = \frac{1}{Γ (b)} \int_{0}^{\infty} t^{b - 1} e^{- t} \prod_{j = 1}^{p} \{\int_{- \infty}^{\infty} exp [- t_{j} x_{j} - t exp (- a_{j} x_{j})] d x_{j}\} d t \\ = \frac{1}{a_{1} \dots a_{p} Γ (b)} \int_{0}^{\infty} t^{b - 1} e^{- t} \prod_{j = 1}^{p} [\int_{0}^{\infty} y_{j}^{\frac{t_{j}}{a_{j}} - 1} exp (- t y_{j}) d y_{j}] d t \\ = \frac{1}{a_{1} \dots a_{p} Γ (b)} [\prod_{j = 1}^{p} Γ (\frac{t_{j}}{a_{j}})] \int_{0}^{\infty} t^{b - \sum_{j = 1}^{p} \frac{t_{j}}{a_{j}} - 1} e^{- t} d t . \end{matrix}

The result follows. □

4.2. Proof for Section 2.1

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{B (b_{1}, \dots, b_{K - 1}; b_{K})}{B (a_{1}, \dots, a_{K - 1}; a_{K})}] + \sum_{i = 1}^{K} (a_{i} - b_{i}) E [log X_{i}] . \end{matrix}

(2)

It is easy to show that

\begin{matrix} E [log X_{i}] = ψ (a_{i}) - ψ (a_{1} + \dots + a_{K}), \end{matrix}

so (2) reduces to the required.

4.3. Proof for Section 2.2

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{α Γ (\frac{p}{β})}{β Γ (\frac{p}{α})}] + \frac{1}{2} log \frac{|V_{2}|}{|V_{1}|} + E [{(X^{T} V_{2}^{- 1} X)}^{\frac{β}{2}}] - E [{(X^{T} V_{1}^{- 1} X)}^{\frac{α}{2}}] . \end{matrix}

(3)

The second expectation in (3) can be expressed as

\begin{matrix} E [{(X^{T} V_{1}^{- 1} X)}^{\frac{α}{2}}] & = \frac{α Γ (\frac{p}{2})}{2 π^{\frac{p}{2}} Γ (\frac{p}{α})} {|V_{1}|}^{- \frac{1}{2}} \int_{R^{p}} {(x^{T} V_{1}^{- 1} x)}^{\frac{α}{2}} exp \{- {(x^{T} V_{1}^{- 1} x)}^{\frac{α}{2}}\} d x \\ = \frac{α Γ (\frac{p}{2})}{2 π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{R^{p}} {(y^{T} y)}^{\frac{α}{2}} exp \{- {(y^{T} y)}^{\frac{α}{2}}\} d y \\ = \frac{α}{2 Γ (\frac{p}{α})} \int_{0}^{\infty} u^{\frac{p}{2} + \frac{α}{2} - 1} exp (- u^{\frac{α}{2}}) d u \\ = \frac{1}{Γ (\frac{p}{α})} \int_{0}^{\infty} t^{\frac{p}{α}} exp (- t) d t \\ = \frac{p}{α}, \end{matrix}

where

y = V_{1}^{- \frac{1}{2}} x

,

u = y^{T} y

and

t = u^{\frac{α}{2}}

.

Let

V = V_{1}^{\frac{1}{2}} V_{2}^{- 1} V_{1}^{\frac{1}{2}}

and

V = P D P^{- 1}

, where

P

is an orthonormal matrix composed of eigenvectors of

V

and

D

is a diagonal matrix composed of eigenvalues say

λ_{i}

of

V

. Then, the first expectation in (3) can be expressed as

\begin{matrix} E [{(X^{T} V_{2}^{- 1} X)}^{\frac{β}{2}}] & = \frac{α Γ (\frac{p}{2})}{2 π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{R^{p}} {[tr ({DP}^{T} {yy}^{T} P)]}^{\frac{β}{2}} exp \{- {(y^{T} y)}^{\frac{α}{2}}\} d y \\ = \frac{α Γ (\frac{p}{2})}{2 π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{R^{p}} {(x^{T} D x)}^{\frac{β}{2}} exp \{- {(x^{T} x)}^{\frac{α}{2}}\} d x \\ = \frac{α Γ (\frac{p}{2})}{2 π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{R} \dots \int_{R} {(\sum_{i}^{p} λ_{i} x_{i}^{2})}^{\frac{β}{2}} exp \{- {(\sum_{i}^{p} x_{i}^{2})}^{\frac{α}{2}}\} d x, \end{matrix}

(4)

where

y = V_{1}^{- \frac{1}{2}} x

and

z = P^{T} y

. Using the pseudo-polar transformation

z_{1} = r sin θ_{1}

,

z_{2} = r cos θ_{1}

sin θ_{2}

, …,

z_{p} = r cos θ_{1}

cos θ_{2}

⋯

cos θ_{p - 1}

https://en.wikipedia.org/wiki/Polar_coordinate_system (accessed on 1 July 2024), (4) can be expressed as

\begin{matrix} E [{(X^{T} V_{2}^{- 1} X)}^{\frac{β}{2}}] & = \frac{α Γ (\frac{p}{2})}{2 π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{0}^{\infty} r^{p - 1} \int_{- \frac{π}{2}}^{\frac{π}{2}} \dots \int_{- π}^{π} {[r^{2} (λ_{1} {sin}^{2} θ_{1} + \dots + λ_{p} {cos}^{2} θ_{1} \dots {cos}^{2} θ_{p - 1})]}^{\frac{β}{2}} \\ \times exp [- {(r^{2})}^{\frac{α}{2}}] [\prod_{j = 1}^{p - 1} {|cos θ_{j}|}^{p - j - 1}] d r \prod_{j = 1}^{p - 1} d θ_{j} \\ = \frac{α Γ (\frac{p}{2})}{π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{0}^{\infty} r^{p + β - 1} exp (- r^{α}) \int_{0}^{1} \dots \int_{0}^{1} \prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}] \\ \times {[B_{p} (x_{1}, \dots, x_{p - 1})]}^{\frac{β}{2}} d x_{1} \dots d x_{p - 1} d r \\ = \frac{Γ (\frac{p}{2}) Γ (\frac{p + β}{2})}{π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} {[B_{p} (x_{1}, \dots, x_{p - 1})]}^{\frac{β}{2}} d x_{1} \dots d x_{p - 1}, \end{matrix}

(5)

where

x_{i} = {cos}^{2} θ_{i}

and

B_{p} (x_{1}, \dots, x_{p - 1}) = λ_{1} + (λ_{2} - λ_{1}) x_{1} + \dots + (λ_{p} - λ_{p - 1}) x_{1} x_{2} \dots x_{p - 1}

. Provided that

|λ_{1}| \geq |(λ_{2} - λ_{1}) x_{1} + \dots + (λ_{p} - λ_{p - 1}) x_{1} x_{2} \dots x_{p - 1}|

holds, we can apply the generalized multinomial theorem to calculate (5) as

\begin{matrix} E [{(X^{T} V_{2}^{- 1} X)}^{\frac{β}{2}}] & = \frac{Γ (\frac{p}{2}) Γ (\frac{p + β}{2})}{π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} \\ \times \sum_{r_{1} = 0}^{\infty} \sum_{r_{2} = 0}^{r_{1}} \dots \sum_{r_{p - 1} = 0}^{r_{p - 2}} (\binom{\frac{β}{2}}{r_{1}}) (\binom{r_{1}}{r_{2}}) \dots (\binom{r_{p - 2}}{r_{p - 1}}) \\ \times λ_{1}^{\frac{β}{2} - r_{1}} {[(λ_{2} - λ_{1}) x_{1}]}^{r_{1} - r_{2}} {[(λ_{3} - λ_{2}) x_{1} x_{2}]}^{r_{2} - r_{3}} \\ \dots {[(λ_{p - 1} - λ_{p - 2}) x_{1} x_{2} \dots x_{p - 2}]}^{r_{p - 2} - r_{p - 1}} {[(λ_{p} - λ_{p - 1}) x_{1} x_{2} \dots x_{p - 1}]}^{r_{p - 1}} d x_{1} \dots d x_{p - 1} \\ = \frac{Γ (\frac{p}{2}) Γ (\frac{p + β}{2})}{π^{\frac{p}{2}} Γ (\frac{p}{α})} \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} \\ \times \sum_{r_{1} = 0}^{\infty} \sum_{r_{2} = 0}^{r_{1}} \dots \sum_{r_{p - 1} = 0}^{r_{p - 2}} (\binom{\frac{β}{2}}{r_{1}}) (\binom{r_{1}}{r_{2}}) \dots (\binom{r_{p - 2}}{r_{p - 1}}) λ_{1}^{\frac{β}{2} - r_{1}} \\ \times \prod_{j = 1}^{p - 1} {[(λ_{j + 1} - λ_{j}) (\prod_{ℓ = 1}^{j} x_{ℓ})]}^{r_{j} - r_{j + 1}} d x_{1} \dots d x_{p - 1} \\ = \frac{Γ (\frac{p}{2}) Γ (\frac{p + β}{2})}{π^{\frac{p}{2}} Γ (\frac{p}{α})} \sum_{r_{1} = 0}^{\infty} \sum_{r_{2} = 0}^{r_{1}} \dots \sum_{r_{p - 1} = 0}^{r_{p - 2}} (\binom{\frac{β}{2}}{r_{1}}) (\binom{r_{1}}{r_{2}}) \dots (\binom{r_{p - 2}}{r_{p - 1}}) λ_{1}^{\frac{β}{2} - r_{1}} \\ \times \prod_{j = 1}^{p - 1} [{(λ_{j + 1} - λ_{j})}^{r_{j} - r_{j + 1}} B (r_{j} + \frac{p - j}{2}, \frac{1}{2})] \end{matrix}

provided that the infinite sum converges. Hence, the required.

4.4. Proof for Section 2.3

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{Γ (a_{1} + \dots + a_{K + 1}) Γ (b_{1}) \dots Γ (b_{K + 1})}{Γ (b_{1} + \dots + b_{K + 1}) Γ (a_{1}) \dots Γ (a_{K + 1})}] + \sum_{i = 1}^{K} (a_{i} - b_{i}) E [log X_{i}] \\ + (b_{1} + \dots + b_{K + 1} - a_{1} - \dots - a_{K + 1}) E [log (1 + \sum_{i = 1}^{K} X_{i})] . \end{matrix}

(6)

It is easy to show that

\begin{matrix} E [log X_{i}] = ψ (a_{i}) - ψ (a_{K + 1}) \end{matrix}

and

\begin{matrix} E [log (1 + \sum_{i = 1}^{K} X_{i})] = ψ (a_{1} + \dots + a_{K + 1}) - ψ (a_{K + 1}), \end{matrix}

so (6) reduces to the required.

4.5. Proof for Section 2.4

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{C (a_{1}, \dots, a_{K}, b, c)}{C (d_{1}, \dots, d_{K}, e, f)}] + \sum_{i = 1}^{K} (a_{i} - d_{i}) E [log X_{i}] + (b - e) E [log (1 - \sum_{i = 1}^{K} X_{i})] \\ - (c - f) E [log (1 + \sum_{i = 1}^{K} X_{i})] . \end{matrix}

(7)

It is easy to show that

\begin{matrix} E [log X_{i}] = {\frac{\partial}{\partial α} [\frac{C (a_{1}, \dots, a_{i}, \dots, a_{K}, b, c)}{C (a_{1}, \dots, a_{i} + α, \dots, a_{K}, b, c)}]|}_{α = 0}, \end{matrix}

\begin{matrix} E [log (1 - \sum_{i = 1}^{K} X_{i})] = {\frac{\partial}{\partial α} [\frac{C (a_{1}, \dots, a_{i}, \dots, a_{K}, b, c)}{C (a_{1}, \dots, a_{K}, b + α, c)}]|}_{α = 0} \end{matrix}

and

\begin{matrix} E [log (1 + \sum_{i = 1}^{K} X_{i})] = {\frac{\partial}{\partial α} [\frac{C (a_{1}, \dots, a_{i}, \dots, a_{K}, b, c)}{C (a_{1}, \dots, a_{K}, b, c - α)}]|}_{α = 0}, \end{matrix}

so (7) reduces to the required.

4.6. Proof for Section 2.5

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{a q^{\frac{2 N + p - 2}{2 a}} Γ (\frac{2 M + p - 2}{2 b})}{b s^{\frac{2 M + p - 2}{2 b}} Γ (\frac{2 N + p - 2}{2 a})}] + \frac{1}{2} log \frac{|Σ_{2}|}{|Σ_{1}|} + (N - 1) E [log (X^{T} Σ_{1}^{- 1} X)] \\ - q E [{(X^{T} Σ_{1}^{- 1} X)}^{a}] - (M - 1) E [log (X^{T} Σ_{2}^{- 1} X)] + s E [{(X^{T} Σ_{2}^{- 1} X)}^{b}] . \end{matrix}

(8)

The second expectation in (8) can be calculated as

\begin{matrix} E [{(X^{T} Σ_{1}^{- 1} X)}^{a}] & = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} {|Σ_{1}|}^{- \frac{1}{2}} \int_{R^{p}} {(x^{T} Σ_{1}^{- 1} x)}^{a + N - 1} exp \{- q {(x^{T} Σ_{1}^{- 1} x)}^{a}\} d x \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{R^{p}} {(y^{T} y)}^{a + N - 1} exp \{- q {(y^{T} y)}^{a}\} d y \\ = \frac{1}{q Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{\infty} t^{\frac{2 N + p - 2}{2 a}} exp (- t) d t \\ = \frac{2 N + p - 2}{2 a q}, \end{matrix}

where

y = Σ_{1}^{- \frac{1}{2}} x

and

t = q {(y^{T} y)}^{a}

.

Let

Σ = Σ_{1}^{\frac{1}{2}} Σ_{2}^{- 1} Σ_{1}^{\frac{1}{2}}

. The first expectation in (8) can be calculated as

\begin{matrix} E [log (X^{T} Σ_{1}^{- 1} X)] & = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} {|Σ_{1}|}^{- \frac{1}{2}} \int_{R^{p}} {(x^{T} Σ_{1}^{- 1} x)}^{N - 1} log (x^{T} Σ_{1}^{- 1} x) exp \{- q {(x^{T} Σ_{1}^{- 1} x)}^{a}\} d x \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{R^{p}} {(y^{T} y)}^{N - 1} log (y^{T} y) exp \{- q {(y^{T} y)}^{a}\} d y \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \frac{\partial}{\partial N} \int_{R^{p}} {(y^{T} y)}^{N - 1} exp \{- q {(y^{T} y)}^{a}\} d y \\ = \frac{q^{\frac{2 N + p - 2}{2 a}}}{Γ (\frac{2 N + p - 2}{2 a})} \frac{\partial}{\partial N} \int_{0}^{\infty} \frac{t^{\frac{2 N + p - 2}{2 a} - 1}}{q^{\frac{2 N + p - 2}{2 a}}} exp (- t) d t \\ = \frac{q^{\frac{2 N + p - 2}{2 a}}}{Γ (\frac{2 N + p - 2}{2 a})} \frac{\partial}{\partial N} [\frac{Γ (\frac{2 N + p - 2}{2 a})}{q^{\frac{2 N + p - 2}{2 a}}}] \\ = \frac{1}{a} [ψ (\frac{2 N + p - 2}{2 a}) - log q], \end{matrix}

where

y = Σ_{1}^{- \frac{1}{2}} x

and

t = q {(y^{T} y)}^{a}

.

As in Section 4.3, write

Σ = P D P^{- 1}

, where

P

is an orthonormal matrix composed of eigenvectors of

Σ

and

D

is a diagonal matrix composed of eigenvalues say

λ_{i}

of

Σ

. Then, the fourth expectation in (8) can be expressed as

\begin{matrix} E [{(X^{T} Σ_{2}^{- 1} X)}^{b}] & = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} {|Σ_{1}|}^{- \frac{1}{2}} \int_{R^{p}} {(x^{T} Σ_{2}^{- 1} x)}^{b} {(x^{T} Σ_{1}^{- 1} x)}^{N - 1} exp \{- q {(x^{T} Σ_{1}^{- 1} x)}^{a}\} d x \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{R^{p}} {(y^{T} Σ y)}^{b} {(y^{T} y)}^{N - 1} exp \{- q {(y^{T} y)}^{a}\} d y \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{R^{p}} {[tr (D P^{T} y y^{T} P)]}^{b} {(y^{T} y)}^{N - 1} exp \{- q {(y^{T} y)}^{a}\} d y \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{R^{p}} {(v^{T} D v)}^{b} {(v^{T} v)}^{N - 1} exp \{- q {(v^{T} v)}^{a}\} d v \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{R^{p}} {(\sum_{i = 1}^{p} λ_{i} v_{i}^{2})}^{b} {(\sum_{i = 1}^{p} v_{i}^{2})}^{N - 1} exp \{- q {(\sum_{i = 1}^{p} v_{i}^{2})}^{a}\} d v, \end{matrix}

(9)

where

y = Σ_{1}^{- \frac{1}{2}} x

and

v = P^{T} y

.

Using the pseudo-polar transformation

v_{1} = r sin θ_{1}

,

v_{2} = r cos θ_{1} sin θ_{2}

, …,

v_{p} = r

cos θ_{1}

cos θ_{2}

⋯

cos θ_{p - 1}

, (9) can be expressed as

\begin{matrix} E [{(X^{T} Σ_{2}^{- 1} X)}^{b}] & = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{\infty} r^{p - 1} \int_{- \frac{π}{2}}^{\frac{π}{2}} \dots \int_{- π}^{π} {[r^{2} (λ_{1} {sin}^{2} θ_{1} + \dots + λ_{p} {cos}^{2} θ_{1} \dots {cos}^{2} θ_{p - 1})]}^{b} \\ \times {(r^{2})}^{N - 1} exp [- q {(r^{2})}^{a}] [\prod_{j = 1}^{p - 1} {|cos θ_{j}|}^{p - j - 1}] d r \prod_{j = 1}^{p - 1} d θ_{j} \\ = \frac{Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{\infty} \frac{t^{\frac{2 N + p + 2 b - 2}{2 a} - 1}}{q^{\frac{2 N + p + 2 b - 2}{2 a}}} exp (- t) \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} \\ \times {[B_{p} (x_{1}, \dots, x_{p - 1})]}^{b} d x_{1} \dots d x_{p - 1} d t \\ = \frac{Γ (\frac{p}{2}) Γ (\frac{2 N + p + 2 b - 2}{2 a})}{π^{\frac{p}{2}} q^{\frac{b}{a}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} \\ \times {[B_{p} (x_{1}, \dots, x_{p - 1})]}^{b} d x_{1} \dots d x_{p - 1}, \end{matrix}

(10)

where

x_{i} = {cos}^{2} θ_{i}

,

t = q r^{2 a}

and

B_{p} (x_{1}, \dots, x_{p - 1}) = λ_{1} + (λ_{2} - λ_{1}) x_{1} + \dots + (λ_{p} - λ_{p - 1})

x_{1} x_{2} \dots x_{p - 1}

.

Provided that

|λ_{1}| \geq |(λ_{2} - λ_{1}) x_{1} + \dots + (λ_{p} - λ_{p - 1}) x_{1} x_{2} \dots x_{p - 1}|

holds, we can apply the generalized multinomial theorem to calculate (10) as

\begin{matrix} E [{(X^{T} Σ_{2}^{- 1} X)}^{b}] & = \frac{Γ (\frac{p}{2}) Γ (\frac{2 N + p + 2 b - 2}{2 a})}{π^{\frac{p}{2}} q^{\frac{b}{a}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} \\ \times \sum_{r_{1} = 0}^{\infty} \sum_{r_{2} = 0}^{r_{1}} \dots \sum_{r_{p - 1} = 0}^{r_{p - 2}} (\binom{b}{r_{1}}) (\binom{r_{1}}{r_{2}}) \dots (\binom{r_{p - 2}}{r_{p - 1}}) λ_{1}^{b - r_{1}} {[(λ_{2} - λ_{1}) x_{1}]}^{r_{1} - r_{2}} {[(λ_{3} - λ_{2}) x_{1} x_{2}]}^{r_{2} - r_{3}} \\ \dots {[(λ_{p - 1} - λ_{p - 2}) x_{1} x_{2} \dots x_{p - 2}]}^{r_{p - 2} - r_{p - 1}} {[(λ_{p} - λ_{p - 1}) x_{1} x_{2} \dots x_{p - 1}]}^{r_{p - 1}} d x_{1} \dots d x_{p - 1} \\ = \frac{Γ (\frac{p}{2}) Γ (\frac{2 N + p + 2 b - 2}{2 a})}{π^{\frac{p}{2}} q^{\frac{b}{a}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} \\ \times \sum_{r_{1} = 0}^{\infty} \sum_{r_{2} = 0}^{r_{1}} \dots \sum_{r_{p - 1} = 0}^{r_{p - 2}} (\binom{b}{r_{1}}) (\binom{r_{1}}{r_{2}}) \dots (\binom{r_{p - 2}}{r_{p - 1}}) \\ \times λ_{1}^{b - r_{1}} \{\prod_{j = 1}^{p - 1} {[(λ_{j + 1} - λ_{j}) \prod_{ℓ = 1}^{j} x_{ℓ}]}^{r_{j} - r_{j + 1}}\} d x_{1} \dots d x_{p - 1} \\ = \frac{Γ (\frac{p}{2}) Γ (\frac{2 N + p + 2 b - 2}{2 a})}{π^{\frac{p}{2}} q^{\frac{b}{a}} Γ (\frac{2 N + p - 2}{2 a})} \sum_{r_{1} = 0}^{\infty} \sum_{r_{2} = 0}^{r_{1}} \dots \sum_{r_{p - 1} = 0}^{r_{p - 2}} (\binom{b}{r_{1}}) (\binom{r_{1}}{r_{2}}) \dots (\binom{r_{p - 2}}{r_{p - 1}}) λ_{1}^{b - r_{1}} \\ \times \prod_{j = 1}^{p - 1} [{(λ_{j + 1} - λ_{j})}^{r_{j} - r_{j + 1}} B (r_{j} + \frac{p - j}{2}, \frac{1}{2})] \end{matrix}

provided that the infinite sum converges.

The third expectation in (8) can be expressed as

\begin{matrix} E [log (X^{T} Σ_{2}^{- 1} X)] & = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} {|Σ_{1}|}^{- \frac{1}{2}} \int_{R^{p}} {(x^{T} Σ_{1}^{- 1} x)}^{N - 1} log (x^{T} Σ_{2}^{- 1} x) exp \{- q {(x^{T} Σ_{1}^{- 1} x)}^{a}\} d x \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{R^{p}} {(y^{T} y)}^{N - 1} log [tr (D P^{T} y y^{T} P)] exp \{- q {(y^{T} y)}^{a}\} d y \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{R^{p}} {(\sum_{i = 1}^{p} v_{i}^{2})}^{N - 1} log (\sum_{i = 1}^{p} λ_{i} v_{i}^{2}) exp \{- q {(\sum_{i = 1}^{p} v_{i}^{2})}^{a}\} d v \\ = \frac{a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{\infty} r^{p - 1} \int_{- \frac{π}{2}}^{\frac{π}{2}} \dots \int_{- π}^{π} {(r^{2})}^{N - 1} \\ \times log [r^{2} (λ_{1} {sin}^{2} θ_{1} + \dots + λ_{p} {cos}^{2} θ_{1} \dots {cos}^{2} θ_{p - 1})] \\ \times exp \{- q {(r^{2})}^{a}\} [\prod_{j = 1}^{p - 1} {|cos θ_{j}|}^{p - j - 1}] d r \prod_{j = 1}^{p - 1} d θ_{j} \\ = I_{1} + I_{2} \end{matrix}

say, where

y = Σ_{1}^{- \frac{1}{2}} x

,

v = P^{T} y

,

\begin{matrix} I_{1} = \frac{2 a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{\infty} r^{p + 2 N - 3} log (r^{2}) exp (- q r^{2 a}) \\ \times \int_{0}^{1} \dots \int_{0}^{1} \prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}] d x_{1} \dots d x_{p - 1} d r \end{matrix}

and

\begin{matrix} I_{2} = \frac{2 a Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{\infty} r^{p + 2 N - 3} exp (- q r^{2 a}) \\ \times \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} log [B_{p} (x_{1}, \dots, x_{p - 1})] d x_{1} \dots d x_{p - 1} d r . \end{matrix}

The

I_{1}

can be calculated as

\begin{matrix} I_{1} & = \frac{Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{\infty} \frac{t^{\frac{p + 2 N - 2}{2 a} - 1}}{q^{\frac{2 N + p - 2}{2 a}}} log {(\frac{t}{q})}^{\frac{1}{a}} exp (- t) \prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2}) d t \\ = \frac{Γ (\frac{p}{2})}{a π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \{a \frac{\partial}{\partial N} \int_{0}^{\infty} t^{\frac{p + 2 N - 2}{2 a} - 1} exp (- t) - Γ (\frac{2 N + p - 2}{2 a}) log q\} \prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2}) \\ = \frac{Γ (\frac{p}{2})}{a π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \{a \frac{\partial}{\partial N} Γ (\frac{p + 2 N - 2}{2 a}) - Γ (\frac{2 N + p - 2}{2 a}) log q\} \prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2}) \\ = \frac{Γ (\frac{p}{2})}{a π^{\frac{p}{2}}} \{ψ (\frac{p + 2 N - 2}{2 a}) - log q\} \prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2}), \end{matrix}

where

t = q r^{2 a}

.

The

I_{2}

can be calculated as

\begin{matrix} I_{2} & = \frac{Γ (\frac{p}{2}) q^{\frac{2 N + p - 2}{2 a}}}{π^{\frac{p}{2}} Γ (\frac{2 N + p - 2}{2 a})} \int_{0}^{\infty} \frac{t^{\frac{2 N + p - 2}{2 a} - 1}}{q^{\frac{2 N + p - 2}{2 a}}} exp (- t) \\ \times \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} log [B_{p} (x_{1}, \dots, x_{p - 1})] d x_{1} \dots d x_{p - 1} d t \\ = \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} \\ \times \{log λ_{1} + log [1 + \frac{(λ_{2} - λ_{1}) x_{1}}{λ_{1}} + \dots + \frac{(λ_{p} - λ_{p - 1}) x_{1} x_{2} \dots x_{p - 1}}{λ_{1}}]\} d x_{1} \dots d x_{p - 1} \\ = \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} log λ_{1} [\prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2})] - \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} \int_{0}^{1} \dots \int_{0}^{1} \{\prod_{j = 1}^{p - 1} [x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}]\} \\ \times \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} {[\frac{(λ_{2} - λ_{1}) x_{1}}{λ_{1}} + \dots + \frac{(λ_{p} - λ_{p - 1}) x_{1} x_{2} \dots x_{p - 1}}{λ_{1}}]}^{k} d x_{1} \dots d x_{p - 1} \\ = \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} log λ_{1} [\prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2})] - \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p - 1} = k} (\binom{k}{i_{1}, \dots, i_{p - 1}}) \\ \times \int_{0}^{1} \dots \int_{0}^{1} \prod_{j = 1}^{p - 1} \{[x_{j}^{\frac{p - j}{2} - 1} {(1 - x_{j})}^{- \frac{1}{2}}] {[\frac{(λ_{j + 1} - λ_{j})}{λ_{1}} (\prod_{ℓ = 1}^{j} x_{ℓ})]}^{i_{j}}\} d x_{1} \dots d x_{p - 1} \\ = \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} log λ_{1} [\prod_{j = 1}^{p - 1} B (\frac{p - j}{2}, \frac{1}{2})] - \frac{Γ (\frac{p}{2})}{π^{\frac{p}{2}}} \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p - 1} = k} (\binom{k}{i_{1}, \dots, i_{p - 1}}) \\ \times \prod_{j = 1}^{p - 1} \{{[\frac{(λ_{j + 1} - λ_{j})}{λ_{1}}]}^{i_{j}} B (\sum_{ℓ = j}^{p - 1} i_{ℓ} + \frac{p - j}{2}, \frac{1}{2})\} \end{matrix}

provided that the infinite sum converges.

Hence, the required.

4.7. Proof for Section 2.6

The corresponding KLD can be expressed as

\begin{matrix} K L D & = log (\frac{a_{1} \dots a_{p}}{b_{1} \dots b_{p}}) + \sum_{ℓ = 1}^{p} (b_{ℓ} - a_{ℓ}) E (X_{ℓ}) \\ + (p + 1) E \{log [1 + exp (- b_{1} X_{1}) + \dots + exp (- b_{p} X_{p})]\} \\ - (p + 1) E \{log [1 + exp (- a_{1} X_{1}) + \dots + exp (- a_{p} X_{p})]\} \\ = log (\frac{a_{1} \dots a_{p}}{b_{1} \dots b_{p}}) + (p + 1) E \{log [1 + exp (- b_{1} X_{1}) + \dots + exp (- b_{p} X_{p})]\} \\ - (p + 1) E \{log [1 + exp (- a_{1} X_{1}) + \dots + exp (- a_{p} X_{p})]\} \end{matrix}

(11)

since the expectations are zero. Using the Taylor expansion for

log (1 + z)

, the first expectation in (11) can be expressed as

\begin{matrix} - \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} E \{{[exp (- b_{1} X_{1}) + \dots + exp (- b_{p} X_{p})]}^{k}\} \\ = - \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p} = k} (\binom{k}{i_{1}, \dots, i_{p}}) E [exp (- i_{1} b_{1} X_{1} - \dots - i_{p} b_{p} X_{p})] \\ = - \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p} = k} (\binom{k}{i_{1}, \dots, i_{p}}) [\prod_{j = 1}^{p} Γ (\frac{a_{j} + i_{j} b_{j}}{a_{j}})] Γ (1 - \sum_{j = 1}^{p} \frac{i_{j} b_{j}}{a_{j}}), \end{matrix}

where the last step follows by Lemma 1 provided that

\sum_{j = 1}^{p} \frac{i_{j} b_{j}}{a_{j}} < 1

and the infinite series converges. The second expectation in (11) can be expressed as

\begin{matrix} p! a_{1} \dots a_{p} \int_{R^{p}} \{{\frac{\partial}{\partial α} {[1 + exp (- a_{1} x_{1}) + \dots + exp (- a_{p} x_{p})]}^{α}|}_{α = 0}\} \\ \times \frac{exp (- a_{1} x_{1} - \dots - a_{p} x_{p}) d x_{1} \dots d x_{p}}{{[1 + exp (- a_{1} x_{1}) + \dots + exp (- a_{p} x_{p})]}^{p + 1}} \\ = p! a_{1} \dots a_{p} \frac{\partial}{\partial α} {\{\int_{R^{p}} \frac{exp (- a_{1} x_{1} - \dots - a_{p} x_{p})}{{[1 + exp (- a_{1} x_{1}) + \dots + exp (- a_{p} x_{p})]}^{p + 1 - α}} d x_{1} \dots d x_{p}\}|}_{α = 0} \\ = p! \frac{\partial}{\partial α} {[\frac{Γ (1 - α)}{Γ (p + 1 - α)}]|}_{α = 0} \\ = \frac{Γ^{'} (p + 1) - Γ (p + 1) Γ^{'} (1)}{p!}, \end{matrix}

where the penultimate step is followed by Lemma 1. Hence, the required.

4.8. Proof for Section 2.7

The corresponding KLD can be expressed as

\begin{matrix} K L D & = log [\frac{{(b)}_{p} a_{1} \dots a_{p}}{{(d)}_{p} b_{1} \dots b_{p}}] + \sum_{ℓ = 1}^{p} (c_{ℓ} - a_{ℓ}) E (X_{ℓ}) \\ + (d + p) E \{log [1 + exp (- c_{1} X_{1}) + \dots + exp (- c_{p} X_{p})]\} \\ - (b + p) E \{log [1 + exp (- a_{1} X_{1}) + \dots + exp (- a_{p} X_{p})]\} \\ = log [\frac{{(b)}_{p} a_{1} \dots a_{p}}{{(d)}_{p} b_{1} \dots b_{p}}] + (d + p) E \{log [1 + exp (- c_{1} X_{1}) + \dots + exp (- c_{p} X_{p})]\} \\ - (b + p) E \{log [1 + exp (- a_{1} X_{1}) + \dots + exp (- a_{p} X_{p})]\} \end{matrix}

(12)

since the expectations are zero. Using the Taylor expansion for

log (1 + z)

, the first expectation in (12) can be expressed as

\begin{matrix} - \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} E \{{[exp (- c_{1} X_{1}) + \dots + exp (- c_{p} X_{p})]}^{k}\} \\ = - \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p} = k} (\binom{k}{i_{1}, \dots, i_{p}}) E [exp (- i_{1} c_{1} X_{1} - \dots - i_{p} c_{p} X_{p})] \\ = - \frac{1}{Γ (b)} \sum_{k = 1}^{\infty} \frac{{(- 1)}^{k}}{k} \sum_{i_{1} + \dots + i_{p} = k} (\binom{k}{i_{1}, \dots, i_{p}}) [\prod_{j = 1}^{p} Γ (\frac{a_{j} + i_{j} c_{j}}{a_{j}})] Γ (b - \sum_{j = 1}^{p} \frac{i_{j} c_{j}}{a_{j}}), \end{matrix}

where the last step follows by Lemma 1 provided that

\sum_{j = 1}^{p} \frac{i_{j} c_{j}}{a_{j}} < b

and the infinite series converges. The second expectation in (12) can be expressed as

\begin{matrix} {(b)}_{p} a_{1} \dots a_{p} \int_{R^{p}} \{{\frac{\partial}{\partial α} {[1 + exp (- a_{1} x_{1}) + \dots + exp (- a_{p} x_{p})]}^{α}|}_{α = 0}\} \\ \times \frac{exp (- a_{1} x_{1} - \dots - a_{p} x_{p}) d x_{1} \dots d x_{p}}{{[1 + exp (- a_{1} x_{1}) + \dots + exp (- a_{p} x_{p})]}^{b + p}} \\ = {(b)}_{p} a_{1} \dots a_{p} \frac{\partial}{\partial α} {\{\int_{R^{p}} \frac{exp (- a_{1} x_{1} - \dots - a_{p} x_{p})}{{[1 + exp (- a_{1} x_{1}) + \dots + exp (- a_{p} x_{p})]}^{b + p - α}} d x_{1} \dots d x_{p}\}|}_{α = 0} \\ = {(b)}_{p} \frac{\partial}{\partial α} {[\frac{Γ (b - α)}{Γ (b + p - α)}]|}_{α = 0} \\ = \frac{Γ (b) Γ^{'} (b + p) - Γ (b + p) Γ^{'} (b)}{Γ (b) Γ (b + p)}, \end{matrix}

where the penultimate step is followed by Lemma 1. Hence, the required.

4.9. Proof for Section 2.8

The corresponding KLD can be expressed as

\begin{matrix} K L D & = log [\frac{\sqrt{a_{1} \dots a_{p}} β_{p} (c, a_{1}, \dots, a_{p})}{\sqrt{b_{1} \dots b_{p}} β_{p} (d, b_{1}, \dots, b_{p})}] \\ + \frac{1}{2} \sum_{i = 1}^{p} (b_{i} - a_{i}) E (X_{i}^{2}) \\ + \frac{1}{2} (d b_{1} \dots b_{p} - c a_{1} \dots a_{p}) E (X_{1}^{2} \dots X_{p}^{2}) . \end{matrix}

(13)

Using results in [8], we can calculate (13) as the required.

4.10. Proof for Section 2.9

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{Γ (\frac{K}{2} + a + b - 1) Γ (c) Γ (\frac{K}{2} + d - 1)}{Γ (a) Γ (\frac{K}{2} + b - 1) Γ (\frac{K}{2} + c + d - 1)}] + (b - d) E [log (\sum_{i = 1}^{K} X_{i}^{2})] \\ + (a - c) E [log (1 - \sum_{i = 1}^{K} X_{i}^{2})] . \end{matrix}

(14)

It is easy to show that

\begin{matrix} E [log (\sum_{i = 1}^{K} X_{i}^{2})] = ψ (\frac{K}{2} + b - 1) - ψ (\frac{K}{2} + a + b - 1) \end{matrix}

and

\begin{matrix} E [log (1 - \sum_{i = 1}^{K} X_{i}^{2})] = ψ (a) - ψ (\frac{K}{2} + a + b - 1), \end{matrix}

so (14) reduces to the required.

4.11. Proof for Section 2.10

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{C (d, e, f)}{C (a, b, c)}] + (a - d) E [log (\prod_{i = 1}^{p} X_{i})] + (b - e) E [log (\prod_{i = 1}^{p} (1 - X_{i}))] \\ + 2 (c - f) E [log (\prod_{1 \leq i < j \leq p} (X_{i} - X_{j}))] . \end{matrix}

(15)

Easy calculations show that

\begin{matrix} E [log (\prod_{i = 1}^{p} X_{i})] = {\frac{\partial}{\partial α} [\frac{C (a, b, c)}{C (a + α, b, c)}]|}_{α = 0}, \end{matrix}

\begin{matrix} E [log (\prod_{i = 1}^{p} (1 - X_{i}))] = {\frac{\partial}{\partial α} [\frac{C (a, b, c)}{C (a, b + α, c)}]|}_{α = 0} \end{matrix}

and

\begin{matrix} E [log (\prod_{1 \leq i < j \leq p} (X_{i} - X_{j}))] = {\frac{\partial}{\partial α} [\frac{C (a, b, c)}{C (a, b, c + α)}]|}_{α = 0}, \end{matrix}

so (7) reduces to the required.

4.12. Proof for Section 2.11

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{b_{p + 1} (\sum_{i = 1}^{p} a_{i}) (\prod_{i = 1}^{p} a_{i})}{a_{p + 1} (\sum_{i = 1}^{p} b_{i}) (\prod_{i = 1}^{p} b_{i})}] + \sum_{i = 1}^{p} (b_{i} - a_{i}) E (X_{i}) \\ + E [log \{1 - exp [- a_{p + 1} min (X_{1}, \dots, X_{p})]\}] \\ - E [log \{1 - exp [- b_{p + 1} min (X_{1}, \dots, X_{p})]\}] . \end{matrix}

(16)

Using the series expansion for

log (1 + z)

, we can express (16) as

\begin{matrix} K L D = log [\frac{b_{p + 1} (\sum_{i = 1}^{p} a_{i}) (\prod_{i = 1}^{p} a_{i})}{a_{p + 1} (\sum_{i = 1}^{p} b_{i}) (\prod_{i = 1}^{p} b_{i})}] + \sum_{i = 1}^{p} (b_{i} - a_{i}) E (X_{i}) \\ - \sum_{k = 1}^{\infty} \frac{E \{exp [- k a_{p + 1} min (X_{1}, \dots, X_{p})]\}}{k} \\ + \sum_{k = 1}^{\infty} \frac{E \{exp [- k b_{p + 1} min (X_{1}, \dots, X_{p})]\}}{k} . \end{matrix}

Hence, the required.

4.13. Proof for Section 2.12

The corresponding KLD can be expressed as

\begin{matrix} K L D & = log [\frac{κ_{1}^{\frac{p}{2} - 1}}{κ_{2}^{\frac{p}{2} - 1}} \frac{I_{\frac{p}{2} - 1} (κ_{2})}{I_{\frac{p}{2} - 1} (κ_{1})}] + (κ_{1} μ_{1}^{T} - κ_{2} μ_{2}^{T}) E (X) \\ = log [\frac{κ_{1}^{\frac{p}{2} - 1}}{κ_{2}^{\frac{p}{2} - 1}} \frac{I_{\frac{p}{2} - 1} (κ_{2})}{I_{\frac{p}{2} - 1} (κ_{1})}] + (κ_{1} μ_{1}^{T} - κ_{2} μ_{2}^{T}) μ_{1} . \end{matrix}

Hence, the required.

4.14. Proof for Section 3.1

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{{|Ω|}^{c + d - a - b} B_{p} (c, d)}{B_{p} (a, b)}] + (a - c) E [log |Ω - X|] + (b - d) E [log |X|] . \end{matrix}

(17)

The expectations in (17) can be calculated as

\begin{matrix} E [log |Ω - X|] = {\frac{\partial}{\partial α} \{\int \frac{{|Ω - x|}^{α + a - \frac{p + 1}{2}} {|x|}^{b - \frac{p + 1}{2}}}{{|Ω|}^{a + b} B_{p} (a, b)} d x\}|}_{α = 0} = {\frac{\partial}{\partial α} \{\frac{{|Ω|}^{α} B_{p} (α + a, b)}{B_{p} (a, b)}\}|}_{α = 0} \end{matrix}

and

\begin{matrix} E [log |X|] = {\frac{\partial}{\partial α} \{\int \frac{{|Ω - x|}^{a - \frac{p + 1}{2}} {|x|}^{α + b - \frac{p + 1}{2}}}{{|Ω|}^{a + b} B_{p} (a, b)} d x\}|}_{α = 0} = {\frac{\partial}{\partial α} \{\frac{{|Ω|}^{α} B_{p} (a, α + b)}{B_{p} (a, b)}\}|}_{α = 0} . \end{matrix}

Hence, the required.

4.15. Proof for Section 3.2

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{B_{p} (b_{1}, \dots, b_{n}; b_{n + 1})}{B_{p} (a_{1}, \dots, a_{n}; a_{n + 1})}] + (a_{n + 1} - b_{n + 1}) E [log |I_{p} - \sum_{i = 1}^{n} X_{i}|] \\ + \sum_{i = 1}^{n} (a_{i} - b_{i}) E [log |X_{i}|] . \end{matrix}

(18)

It is easy to show that

\begin{matrix} E [log |X_{i}|] = {\frac{\partial}{\partial α} \frac{B_{p} (a_{1}, \dots, a_{i} + α, \dots, a_{n}; a_{n + 1})}{B_{p} (a_{1}, \dots, a_{i}, \dots, a_{n}; a_{n + 1})}|}_{α = 0} \end{matrix}

and

\begin{matrix} E [log |I_{p} - \sum_{i = 1}^{n} X_{i}|] = {\frac{\partial}{\partial α} \frac{B_{p} (a_{1}, \dots, a_{n}; a_{n + 1} + α)}{B_{p} (a_{1}, \dots, a_{n}; a_{n + 1})}|}_{α = 0}, \end{matrix}

so (18) reduces to the required.

4.16. Proof for Section 3.3

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{d^{p c} Γ_{p} (c)}{b^{p a} Γ_{p} (a)} \frac{{|Σ_{2}|}^{c}}{{|Σ_{1}|}^{a}}] + (a - c) E [log |X|] + tr [- \frac{1}{b} Σ_{1}^{- 1} E (X)] \\ - tr [- \frac{1}{d} Σ_{2}^{- 1} E (X)] . \end{matrix}

(19)

The first expectation in (19) can be calculated as

\begin{matrix} E [log |X|] & = \frac{{|Σ_{1}|}^{- a}}{b^{a p} Γ_{p} (a)} {\frac{\partial}{\partial α} \{\int {|x|}^{α + a - \frac{p + 1}{2}} exp [tr (- \frac{1}{b} Σ_{1}^{- 1} x)]\}|}_{α = 0} \\ = \frac{1}{Γ_{p} (a)} {\frac{\partial}{\partial α} [{|Σ_{1}|}^{α} b^{p α} Γ_{p} (a + α)]|}_{α = 0} . \end{matrix}

Since

E (X) = 2 a Σ_{1}

, the second and third terms in (19) are equal to

\begin{matrix} tr [- \frac{1}{b} Σ_{1}^{- 1} E (X)] = \frac{2 a p}{b} \end{matrix}

and

\begin{matrix} tr [- \frac{1}{d} Σ_{2}^{- 1} E (X)] = tr [- \frac{2 a}{d} Σ_{2}^{- 1} Σ_{1}], \end{matrix}

respectively. Hence, the required.

4.17. Proof for Section 3.4

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{B_{p} (d, e)_{2} F_{1} (d, f; d + e; - B)}{B_{p} (a, b)_{2} F_{1} (a, c; a + b; - B)}] + (a - d) E [log |X|] + (b - e) E [log |I_{p} - X|] \\ + (f - c) E [log |I_{p} + B X|] . \end{matrix}

(20)

The expectations in (20) can be easily calculated as

\begin{matrix} E [log |X|] = {\frac{\partial}{\partial α} \frac{B_{p} (a + α, b)_{2} F_{1} (a + α, c; a + b + α; - B)}{B_{p} (a, b)_{2} F_{1} (a, c; a + b; - B)}|}_{α = 0}, \end{matrix}

\begin{matrix} E [log |I_{p} - X|] = {\frac{\partial}{\partial α} \frac{B_{p} (a, b + α)_{2} F_{1} (a, c; a + b + α; - B)}{B_{p} (a, b)_{2} F_{1} (a, c; a + b; - B)}|}_{α = 0} \end{matrix}

and

\begin{matrix} E [log |I_{p} + B X|] = {\frac{\partial}{\partial α} \frac{_{2} F_{1} (a, c - α; a + b; - B)}{_{2} F_{1} (a, c; a + b; - B)}|}_{α = 0} . \end{matrix}

Hence, the required.

4.18. Proof for Section 3.5

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{{|Ω|}^{a - c} B_{p} (c, d)}{B_{p} (a, b)}] + (c + d - a - b) E [log |Ω + X|] + (b - d) E [log |X|] . \end{matrix}

(21)

The expectations in (21) can be calculated as

\begin{matrix} E [log |Ω + X|] = {\frac{\partial}{\partial α} \{\int \frac{{|Ω + x|}^{α - a - b} {|x|}^{b - \frac{p + 1}{2}}}{{|Ω|}^{- a} B_{p} (a, b)} d x\}|}_{α = 0} = {\frac{\partial}{\partial α} \{\frac{{|Ω|}^{α} B_{p} (a - α, b)}{B_{p} (a, b)}\}|}_{α = 0} \end{matrix}

and

\begin{matrix} E [log |X|] = {\frac{\partial}{\partial α} \{\int \frac{{|Ω + x|}^{- a - b} {|x|}^{α + b - \frac{p + 1}{2}}}{{|Ω|}^{- a} B_{p} (a, b)} d x\}|}_{α = 0} = {\frac{\partial}{\partial α} \{\frac{{|Ω|}^{α} B_{p} (a - α, α + b)}{B_{p} (a, b)}\}|}_{α = 0} . \end{matrix}

Hence, the required.

4.19. Proof for Section 3.6

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{d^{p c} Γ_{p} (c)}{b^{p a} Γ_{p} (a)} \frac{{|Σ_{1}|}^{a}}{{|Σ_{2}|}^{c}}] + (c - a) E [log |X|] + \frac{1}{d} tr [Σ_{2} E (X^{- 1})] \\ - \frac{1}{b} tr [Σ_{1} E (X^{- 1})] . \end{matrix}

(22)

The first expectation in (22) can be calculated as

\begin{matrix} E [log |X|] & = \frac{{|Σ_{1}|}^{a}}{b^{a p} Γ_{p} (a)} {\frac{\partial}{\partial α} \{\int {|x|}^{α - a - \frac{p + 1}{2}} exp [tr (- \frac{1}{b} Σ_{1} x)]\}|}_{α = 0} \\ = \frac{1}{Γ_{p} (a)} {\frac{\partial}{\partial α} [\frac{{|Σ_{1}|}^{α} Γ_{p} (a - α)}{b^{p α}}]|}_{α = 0} . \end{matrix}

Since

E (X^{- 1}) = 2 a Σ_{1}^{- 1}

, the second and third terms in (22) are equal to

\begin{matrix} tr [Σ_{2} E (X^{- 1})] = 2 a tr [Σ_{2} Σ_{1}^{- 1}] \end{matrix}

and

\begin{matrix} tr [Σ_{1} E (X^{- 1})] = 2 a p, \end{matrix}

respectively. Hence, the required.

4.20. Proof for Section 3.7

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{B_{p} (c, d)_{1} F_{1} (c; c + d; - B_{2})}{B_{p} (a, b)_{1} F_{1} (a, a + b; - B_{1})}] + (a - c) E [log |X|] + (b - d) E [log |I_{p} - X|] \\ + tr [(B_{2} - B_{1}) E (X)] . \end{matrix}

(23)

The expectations in (23) can be easily calculated as

\begin{matrix} E [log |X|] = {\frac{\partial}{\partial α} \frac{B_{p} (a + α, b)_{1} F_{1} (a + α; a + b + α; - B_{1})}{B_{p} (a, b)_{1} F_{1} (a; a + b; - B_{1})}|}_{α = 0}, \end{matrix}

\begin{matrix} E [log |I_{p} - X|] = {\frac{\partial}{\partial α} \frac{B_{p} (a, b + α)_{1} F_{1} (a; a + b + α; - B_{1})}{B_{p} (a, b)_{1} F_{1} (a; a + b; - B_{1})}|}_{α = 0} \end{matrix}

and

\begin{matrix} E (X) = {\frac{\partial}{\partial z} \frac{_{1} F_{1} (a; a + b; z - B_{1})}{_{1} F_{1} (a; a + b; - B_{1})}|}_{z = 0} . \end{matrix}

Hence, the required.

4.21. Proof for Section 3.8

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{Γ_{p} (c) Ψ_{1} (c; c - d + \frac{p + 1}{2}; B_{2})}{Γ_{p} (a) Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})}] + (a - c) E [log |X|] + (d - b) E [log |I_{p} + X|] \\ + tr [(B_{2} - B_{1}) E (X)] . \end{matrix}

(24)

The expectations in (24) can be easily calculated as

\begin{matrix} E [log |X|] = {\frac{\partial}{\partial α} \frac{Γ_{p} (a + α) Ψ_{1} (a + α; a - b + α + \frac{p + 1}{2}; B_{1})}{Γ_{p} (a) Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})}|}_{α = 0}, \end{matrix}

\begin{matrix} E [log |I_{p} + X|] = {\frac{\partial}{\partial α} \frac{Ψ_{1} (a; a - b + α + \frac{p + 1}{2}; B_{1})}{Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})}|}_{α = 0} \end{matrix}

and

\begin{matrix} E (X) = {\frac{\partial}{\partial z} \frac{Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1} - z)}{Ψ_{1} (a; a - b + \frac{p + 1}{2}; B_{1})}|}_{z = 0} . \end{matrix}

Hence, the required.

4.22. Proof for Section 3.9

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{{|V_{2}|}^{\frac{n}{2}} {|U_{2}|}^{\frac{p}{2}}}{{|V_{1}|}^{\frac{n}{2}} {|U_{1}|}^{\frac{p}{2}}}] + \frac{1}{2} E \{tr [V_{2}^{- 1} {(X - M_{2})}^{T} U_{2}^{- 1} (X - M_{2})]\} \\ - \frac{1}{2} E \{tr [V_{1}^{- 1} {(X - M_{1})}^{T} U_{1}^{- 1} (X - M_{1})]\} . \end{matrix}

(25)

The second expectation in (25) can be expressed as

\begin{matrix} tr \{V_{1}^{- 1} E [{(X - M_{1})}^{T} (X - M_{1})] U_{1}^{- 1}\} = tr (U_{1}) tr (U_{1}^{- 1}) . \end{matrix}

The first expectation in (25) can be expressed as

\begin{matrix} tr \{V_{2}^{- 1} E [{(X - M_{2})}^{T} (X - M_{2})] U_{2}^{- 1}\} \\ = tr [V_{2}^{- 1} E (X^{T} X - X^{T} M_{2} - M_{2}^{T} X + M_{2}^{T} M_{2}) U_{2}^{- 1}] \\ = tr \{V_{2}^{- 1} [E (X^{T} X) - E (X^{T} M_{2}) - E (M_{2}^{T} X) + M_{2}^{T} M_{2}] U_{2}^{- 1}\} \\ = tr (U_{1}) tr (V_{2}^{- 1} V_{1} U_{2}^{- 1}) + tr (V_{2}^{- 1} M_{1}^{T} M_{1} U_{2}^{- 1}) - tr (V_{2}^{- 1} M_{1}^{T} M_{2} U_{2}^{- 1}) \\ - tr (V_{2}^{- 1} M_{2}^{T} M_{1} U_{2}^{- 1}) + tr (V_{2}^{- 1} M_{2}^{T} M_{2} U_{2}^{- 1}) . \end{matrix}

Hence, the required.

4.23. Proof for Section 3.10

The corresponding KLD can be expressed as

\begin{matrix} K L D = log [\frac{C (a)}{C (b)}] + (a - b) E [log |X| I \{0_{p} < X \leq B\}] \\ + (a - b) E [log |I_{p} - X| I \{B < X \leq I_{p}\}] \\ + (b - a) {|B|}^{\frac{p + 1}{2}} log |B| B_{p} (a, \frac{p + 1}{2}) \\ + (b - a) {|I_{p} - B|}^{\frac{p + 1}{2}} log |I_{p} - B| B_{p} (\frac{p + 1}{2}, a) . \end{matrix}

(26)

The expectations in (26) can be easily calculated as

\begin{matrix} E [log |X| I \{0_{p} < X \leq B\}] = {\frac{\partial}{\partial α} \{{|B|}^{α + \frac{p + 1}{2}} B_{p} (a + α, \frac{p + 1}{2})\}|}_{α = 0} \end{matrix}

and

\begin{matrix} E [log |I_{p} - X| I \{B < X \leq I_{p}\}] = {\frac{\partial}{\partial α} \{{|I_{p} - B|}^{α + \frac{p + 1}{2}} B_{p} (\frac{p + 1}{2}, a + α)\}|}_{α = 0} . \end{matrix}

Hence, the required.

Author Contributions

Conceptualization, V.N. and S.N.; methodology, V.N. and S.N.; investigation, V.N. and S.N. All authors have read and agreed to the published version of the manuscript.

Funding

Research of the first author was partially supported by grants from the IMU-CDC, Simons Foundation, and the Heilbronn Institute for Mathematical Research.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The authors would like to thank the Editor and the two referees for careful reading and comments which greatly improved the paper.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Special Functions

The following special functions are used in the paper: the gamma function defined by

\begin{matrix} Γ (a) = \int_{0}^{\infty} t^{a - 1} exp (- t) d t \end{matrix}

for

a > 0

; the digamma function is defined by

\begin{matrix} ψ (a) = \frac{d log Γ (a)}{d a} \end{matrix}

for

a > 0

; the beta function is defined by

\begin{matrix} B (a, b) = \int_{0}^{1} t^{a - 1} {(1 - t)}^{b - 1} d t \end{matrix}

for

a > 0

and

b > 0

; the type I Dirichlet density is defined by

\begin{matrix} B (a_{1}, \dots, a_{n}; a_{n + 1}) = \int_{0 \leq t_{1} + \dots + t_{n} \leq 1} t_{1}^{a_{1} - 1} \dots t_{n}^{a_{n} - 1} {(1 - \sum_{i = 1}^{n} t_{i})}^{a_{n + 1} - 1} d t_{1} \dots d t_{n} \end{matrix}

for

a_{j} > 0

,

j = 1, 2, \dots, k + 1

; the modified Bessel function of the first kind of order

ν

defined by

\begin{matrix} I_{ν} (x) = \sum_{k = 0}^{\infty} \frac{1}{Γ (k + ν + 1) k!} {(\frac{x}{2})}^{2 k + ν} \end{matrix}

for

k + ν + 1 \neq 0, - 1, - 2, \dots

; the matrix-variate gamma function is defined by

\begin{matrix} Γ_{p} (α) = \int {|x|}^{α - \frac{p + 1}{2}} exp [- tr (x)] d x \end{matrix}

for

x

a

p \times p

positive definite matrix and

a > \frac{p - 1}{2}

; the matrix-variate beta function is defined by

\begin{matrix} B_{p} (α, β) = \int {|I_{p} - x|}^{α - \frac{p + 1}{2}} {|x|}^{β - \frac{p + 1}{2}} d x \end{matrix}

for

x

a

p \times p

positive definite matrix,

I_{p} - x

a

p \times p

positive definite matrix,

α > \frac{p - 1}{2}

and

β > \frac{p - 1}{2}

; the matrix-variate type I Dirichlet density is defined by

\begin{matrix} B_{p} (a_{1}, \dots, a_{n}; a_{n + 1}) = \int {|x_{1}|}^{a_{1} - p} \dots {|x_{n}|}^{a_{n} - p} {|I_{p} - \sum_{i = 1}^{n} x_{i}|}^{a_{n + 1} - p} d x_{1} \dots d x_{n} \end{matrix}

for

x_{i}

,

i = 1, 2, \dots, n

and

I_{p} - \sum_{i = 1}^{n} x_{i}

being

p \times p

positive definite matrices, and

a_{j} > \frac{p - 1}{2}

,

j = 1, 2, \dots, n + 1

; the matrix-variate confluent hypergeometric function is defined by

\begin{matrix} _{1} F_{1} (a; b; X) = \sum_{k = 0}^{\infty} \sum_{κ} \frac{{(a)}_{κ}}{{(b)}_{κ}} \frac{C_{κ} (X)}{k!} \end{matrix}

for

X

a

p \times p

positive definite matrix and provided that

Γ_{p} (α)

and

Γ_{p} (β)

exist; the matrix-variate Gauss hypergeometric function is defined by

\begin{matrix} _{2} F_{1} (a, b; c; X) = \sum_{k = 0}^{\infty} \sum_{κ} \frac{{(a)}_{κ} {(b)}_{κ}}{{(c)}_{κ}} \frac{C_{κ} (X)}{k!} \end{matrix}

for

X

a

p \times p

positive definite matrix and provided that

Γ_{p} (a)

,

Γ_{p} (b)

and

Γ_{p} (c)

exist, where

C_{κ} (X)

denotes the zonal polynomial of the

p \times p

symmetric matrix

X

corresponding to the ordered partition

κ = (k_{1}, \dots, k_{p})

with

k_{1} \geq \dots \geq k_{p} \geq 0

and

k_{1} + \dots + k_{p} = k

,

\sum_{κ}

denotes summation over all such partitions

κ

, and

\begin{matrix} {(a)}_{κ} = \prod_{i = 1}^{p} {(a - \frac{i - 1}{2})}_{k_{i}}, \end{matrix}

where

{(a)}_{0} = 1

and

{(a)}_{k} = a (a + 1) \cdot (a + k - 1)

.

References

Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat. 1951, 22, 79–86. [Google Scholar] [CrossRef]
Bishop, C.M. Pattern Recognition and Machine Learning; Springer: Cham, Switzerland, 2006. [Google Scholar]
Bouhlel, N.; Dziri, A. Kullback–Leibler divergence between multivariate generalized gaussian distributions. IEEE Signal Process. Lett. 2019, 26, 1021–1025. [Google Scholar] [CrossRef]
Bouhlel, N.; Rousseau, D. A generic formula and some special cases for the Kullback–Leibler divergence between central multivariate Cauchy distributions. Entropy 2022, 24, 838. [Google Scholar] [CrossRef] [PubMed]
Bouhlel, N.; Rousseau, D. Exact Rényi and Kullback–Leibler divergences between multivariate t-distributions. IEEE Signal Process. Lett. 2023, 30, 1672–1676. [Google Scholar] [CrossRef]
Malik, H.J.; Abraham, B. Multivariate logistic distributions. Ann. Stat. 1973, 1, 588–590. [Google Scholar] [CrossRef]
Satterthwaite, S.P.; Hutchinson, T.P. A generalization of Gumbel’s bivariate logistic distribution. Metrika 1978, 25, 163–170. [Google Scholar] [CrossRef]
Sarabia, J.-M. The centered normal conditional distributions. Commun. Stat.-Theory Methods 1995, 24, 2889–2900. [Google Scholar] [CrossRef]
Penny, W.D. Kullback-Liebler Divergences of Normal, Gamma, Dirichlet and Wishart Densities; Wellcome Department of Cognitive Neurology: London, UK, 2001. [Google Scholar]
Kotz, S.; Balakrishnan, N.; Johnson, N.L. Continuous Multivariate Distributions; John Wiley and Sons: New York, NY, USA, 2000. [Google Scholar]
Nagar, D.K.; Bedoya-Valencia, D.; Nadarajah, S. Multivariate generalization of the Gauss hypergeometric distribution. Hacet. J. Math. Stat. 2015, 44, 933–948. [Google Scholar] [CrossRef]
Kotz, S. Multivariate distributions at a cross-road. In Statistical Distributions in Scientific Work; Patil, G.P., Kotz, S., Ord, J.K., Eds.; D. Reidel Publishing Company: Dordrecht, The Netherlands, 1975; Volume 1, pp. 247–270. [Google Scholar]
Pham-Gia, T. The multivariate Selberg beta distribution and applications. Statistics 2009, 43, 65–79. [Google Scholar] [CrossRef]
Al-Mutairi, D.K.; Ghitany, M.E.; Kundu, D. A new bivariate distribution with weighted exponential marginals and its multivariate generalization. Stat. Pap. 2011, 52, 921–936. [Google Scholar] [CrossRef]
Dawid, A.P. Some matrix-variate distribution theory: Notational considerations and a Bayesian application. Biometrika 1981, 68, 265–274. [Google Scholar] [CrossRef]
Gupta, A.K.; Nagar, D.K. Matrix-variate Gauss hypergeometric distribution. J. Aust. Math. Soc. 2012, 92, 335–355. [Google Scholar] [CrossRef]
Nagar, D.K.; Gupta, A.K. Matrix-variate Kummer-beta distribution. J. Aust. Math. Soc. 2002, 73, 11–25. [Google Scholar] [CrossRef]
Nagar, D.K.; Cardeno, L. Matrix variate Kummer-gamma distribution. Random Oper. Stoch. Equ. 2001, 9, 207–218. [Google Scholar]
Zinodiny, S.; Nadarajah, S. Matrix variate two-sided power distribution. Methodol. Comput. Appl. Probab. 2022, 24, 179–194. [Google Scholar] [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Nawa, V.; Nadarajah, S. Exact Expressions for Kullback–Leibler Divergence for Multivariate and Matrix-Variate Distributions. Entropy 2024, 26, 663. https://doi.org/10.3390/e26080663

AMA Style

Nawa V, Nadarajah S. Exact Expressions for Kullback–Leibler Divergence for Multivariate and Matrix-Variate Distributions. Entropy. 2024; 26(8):663. https://doi.org/10.3390/e26080663

Chicago/Turabian Style

Nawa, Victor, and Saralees Nadarajah. 2024. "Exact Expressions for Kullback–Leibler Divergence for Multivariate and Matrix-Variate Distributions" Entropy 26, no. 8: 663. https://doi.org/10.3390/e26080663

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Exact Expressions for Kullback–Leibler Divergence for Multivariate and Matrix-Variate Distributions

Abstract

1. Introduction

2. Exact Expressions for Multivariate Distributions

2.1. Dirichlet Distribution

2.2. Multivariate Generalized Gaussian Distribution ([10], p. 215)

2.3. Inverted Dirichlet Distribution

2.4. Multivariate Gauss Hypergeometric Distribution [11]

2.5. Multivariate Kotz Type Distribution [12]

2.6. Multivariate Logistic Distribution [6]

2.7. Multivariate Logistic Distribution [7]

2.8. Sarabia [8]’s Multivariate Normal Distribution

2.9. Multivariate Pearson Type II Distribution

2.10. Multivariate Selberg Beta Distribution [13]

2.11. Multivariate Weighted Exponential Distribution [14]

2.12. Von Mises Distribution

3. Exact Expressions for Matrix-Variate Distributions

3.1. Matrix-Variate Beta Distribution [15]

3.2. Matrix-Variate Dirichlet Distribution

3.3. Matrix-Variate Gamma Distribution

3.4. Matrix-Variate Gauss Hypergeometric Distribution [16]

3.5. Matrix-Variate Inverse Beta Distribution

3.6. Matrix-Variate Inverse Gamma Distribution

3.7. Matrix-Variate Kummer Beta Distribution [17]

3.8. Matrix-Variate Kummer Gamma Distribution [18]

3.9. Matrix-Variate Normal Distribution

3.10. Matrix-Variate Two-Sided Power Distribution [19]

4. Proofs

4.1. A Technical Lemma

4.2. Proof for Section 2.1

4.3. Proof for Section 2.2

4.4. Proof for Section 2.3

4.5. Proof for Section 2.4

4.6. Proof for Section 2.5

4.7. Proof for Section 2.6

4.8. Proof for Section 2.7

4.9. Proof for Section 2.8

4.10. Proof for Section 2.9

4.11. Proof for Section 2.10

4.12. Proof for Section 2.11

4.13. Proof for Section 2.12

4.14. Proof for Section 3.1

4.15. Proof for Section 3.2

4.16. Proof for Section 3.3

4.17. Proof for Section 3.4

4.18. Proof for Section 3.5

4.19. Proof for Section 3.6

4.20. Proof for Section 3.7

4.21. Proof for Section 3.8

4.22. Proof for Section 3.9

4.23. Proof for Section 3.10

Author Contributions

Funding

Data Availability Statement

Acknowledgments

Conflicts of Interest

Appendix A. Special Functions

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI