Article

Entropy Optimization, Maxwell–Boltzmann, and Rayleigh Distributions

1 Department of Statistics, St. Thomas College, Thrissur, Kerala 680001, India
2 Department of Mathematics and Statistics, McGill University, Montreal, QC H0H H9X, Canada
3 Office for Outer Space Affairs, United Nations, Vienna International Center, A-1400 Vienna, Austria
* Author to whom correspondence should be addressed.
Entropy 2021, 23(6), 754; https://doi.org/10.3390/e23060754
Submission received: 26 April 2021 / Revised: 1 June 2021 / Accepted: 3 June 2021 / Published: 15 June 2021
(This article belongs to the Section Information Theory, Probability and Statistics)

Abstract

In physics, communication theory, engineering, statistics, and other areas, one of the methods of deriving distributions is the optimization of an appropriate measure of entropy under relevant constraints. In this paper, it is shown that by optimizing a measure of entropy introduced by the second author, one can derive densities of univariate, multivariate, and matrix-variate distributions in the real, as well as complex, domain. Several such scalar, multivariate, and matrix-variate distributions are derived. These include multivariate and matrix-variate Maxwell–Boltzmann and Rayleigh densities in the real and complex domains, multivariate Student-t, Cauchy, matrix-variate type-1 beta, type-2 beta, and gamma densities and their generalizations.

1. Introduction

The following notation will be used in this paper. Real scalar variables, whether mathematical variables or random variables, will be denoted by lower-case letters such as $x, y$, etc.; real vector/matrix variables, mathematical or random, will be denoted by capital letters such as $X, Y$, etc. Complex variables will be written with a tilde: $\tilde{x}, \tilde{y}, \tilde{X}, \tilde{Y}$, etc. Scalar constants will be denoted by $a, b$, etc., and vector/matrix constants by $A, B$, etc.; no tilde will be used on constants. If $A=(a_{ij})$ is a $p\times p$ matrix, then its determinant will be denoted by $|A|$ or $\det(A)$, whether the elements $a_{ij}$ are real or complex. The transpose of $A$ is written as $A'$ and the complex conjugate transpose as $A^{*}$. The absolute value of the determinant is $|\det(A)|=\sqrt{\det(AA^{*})}$. For example, if $\det(A)=a+ib$, $i=\sqrt{-1}$, with $a,b$ real scalars, then the absolute value is $|\det(A)|=\sqrt{a^{2}+b^{2}}$. If $X=(x_{ij})$ is a $p\times q$ real matrix, then the wedge product of the differentials $dx_{ij}$ is written as $dX=\wedge_{i=1}^{p}\wedge_{j=1}^{q}dx_{ij}$, where, for two real scalar variables $x$ and $y$ with differentials $dx$ and $dy$, the wedge product is defined as $dx\wedge dy=-dy\wedge dx$, so that $dx\wedge dx=0$ and $dy\wedge dy=0$. If $\tilde{X}$ in the complex domain is a $p\times q$ matrix, then we can write $\tilde{X}=X_{1}+iX_{2}$, $i=\sqrt{-1}$, where $X_{1},X_{2}$ are real; then, we define $d\tilde{X}=dX_{1}\wedge dX_{2}$. If $f(X)$ is a real-valued scalar function of $X$, where $X$ may be a scalar real variable $x$, a scalar complex variable $\tilde{x}$, a vector/matrix real variable $X$, or a vector/matrix complex variable $\tilde{X}$, such that $f(X)\ge 0$ for all $X$ and $\int_{X}f(X)\,dX=1$, then $f(X)$ will be called a statistical density.
In many disciplines, especially in physics, communication theory, engineering, and statistics, one popular method of deriving statistical distributions is the optimization of an appropriate measure of entropy under appropriate constraints. For a real scalar random variable x, [1] introduced a measure of entropy or a measure of uncertainty:
$$S(f)=-c\int_{x}f(x)\ln f(x)\,dx\tag{1}$$
where c is a constant. The corresponding measure for the discrete case is
$$-c\sum_{j=1}^{k}p_{j}\ln p_{j},\quad p_{j}>0,\; j=1,\dots,k,\; p_{1}+\cdots+p_{k}=1$$
where $(p_{1},\dots,p_{k})$ is a discrete probability law. By optimizing $S(f)$, several authors have derived the exponential, Gaussian, and other distributions under constraints stated in terms of moments of $x$. For instance, $E[x]=$ fixed over all densities $f$, where $E(\cdot)$ indicates the expected value of $(\cdot)$, means that the first moment is given; this constraint produces the exponential density. If $E[x]$ and $E[x^{2}]$ are fixed, meaning that the first two moments are given, then one obtains a Gaussian density, etc. The basic entropy measure in (1) has been generalized by various authors. One such generalized entropy is the Havrda–Charvát entropy [2] $H_{\alpha}(f)$ for the real scalar variable $x$, which is given by
$$H_{\alpha}(f)=\frac{\int_{x}[f(x)]^{\alpha}\,dx-1}{2^{1-\alpha}-1},\quad \alpha\ne 1\tag{2}$$
where f ( x ) is a density. The original H α ( f ) is for the discrete case, and the corresponding continuous case is given in (2). Various properties, characterizations, and applications of the Shannon entropy and various α -generalized entropies were discussed by [3]. A modified version of (2) was introduced by Tsallis [4], and it is known in the literature as Tsallis’ entropy, which is the following:
$$T_{q}(f)=\frac{\int_{x}[f(x)]^{q}\,dx-1}{1-q},\quad q\ne 1.\tag{3}$$
Observe that when $\alpha\to 1$ in (2) and $q\to 1$ in (3), both of these generalized entropies in the real scalar case reduce to the Shannon entropy of (1). Tsallis developed the whole area of non-extensive statistical mechanics by deriving Tsallis' statistics through the optimization of (3) under the constraint that the first moment is fixed in the escort density $g(x)=\frac{[f(x)]^{q}}{\int_{x}[f(x)]^{q}\,dx}$. Hundreds of papers have been published on Tsallis' statistics.
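As a small numerical illustration (not from the original paper; the probability law and the values of $q$ are arbitrary choices), one can check directly that Tsallis' entropy approaches the Shannon entropy as $q\to 1$:

```python
import numpy as np

# A discrete probability law (p_1, ..., p_k); values chosen for illustration.
p = np.array([0.5, 0.3, 0.2])
shannon = -np.sum(p * np.log(p))  # Shannon entropy with c = 1

def tsallis(p, q):
    # Discrete Tsallis entropy: (sum_j p_j^q - 1) / (1 - q), q != 1.
    return (np.sum(p ** q) - 1.0) / (1.0 - q)

# As q -> 1, T_q converges to the Shannon entropy.
assert abs(tsallis(p, 1.01) - shannon) < 1e-1
assert abs(tsallis(p, 1.0001) - shannon) < 1e-3
```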
In early 2000, the second author introduced a generalized entropy of the following form:
$$M_{\alpha}(f)=\frac{\int_{X}[f(X)]^{1+\frac{a-\alpha}{\eta}}\,dX-1}{\alpha-a},\quad \alpha\ne a\tag{4}$$
where $f(X)$ is a statistical density, $f(X)\ge 0$, $\int_{X}f(X)\,dX=1$, where $X$ may be a real scalar $x$, a complex scalar $\tilde{x}$, a real vector/matrix $X$, or a complex vector/matrix $\tilde{X}$; $a$ is a fixed real scalar anchoring point, $\alpha$ is a real scalar parameter, and $\eta>0$ is a real scalar constant so that the deviation of $\alpha$ from $a$ is measured in $\eta$ units. In the real scalar case, we can see that when $\alpha\to a$, (4) goes to the Shannon entropy in (1). Therefore, for vector/matrix variables in the real and complex domains, one has in (4) a generalization of the Shannon entropy. If (3) is optimized under the constraint that the first moment $E[x]$ in $f(x)$ is fixed, then it does not lead directly to Tsallis' statistics; one must optimize (3) under the restriction that the first moment in the escort density mentioned above is fixed. Then, one obtains Tsallis' statistics. If (4) is used, then one can derive various real and complex scalar, vector, or matrix-variate distributions directly from $f(X)$ by imposing moment-like restrictions on $f(X)$. A particular case of (4) for $a=1$, $\eta=1$, introduced by the second author, was applied by [5] in time-series analysis, fractional calculus, and other areas. The researchers in [6] used a particular case of (4) for record values and ordered random variables and derived some properties, including characterization theorems. The authors of [7] discussed the analytical properties of the classical Mittag–Leffler function as derived from the solution of the simplest fractional differential equation governing relaxation processes. The authors of [8] studied the complexity of the ultraslow diffusion process by using both the classical Shannon entropy and its general case with the inverse Mittag–Leffler function in conjunction with the structural derivative.
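As a numerical sketch of the limit $\alpha\to a$ (the exponential test density and the values $a=\eta=1$ are illustrative choices, not from the paper), $M_{\alpha}(f)$ can be computed by direct integration and compared with the Shannon entropy:

```python
import numpy as np
from scipy.integrate import quad

# Test density f(x) = e^{-x} on [0, inf); its Shannon entropy
# -int f ln f dx equals E[x] = 1.
f = lambda x: np.exp(-x)

def mathai(alpha, a=1.0, eta=1.0):
    # M_alpha(f) = (int f^{1 + (a - alpha)/eta} dx - 1) / (alpha - a)
    num, _ = quad(lambda x: f(x) ** (1.0 + (a - alpha) / eta), 0, np.inf)
    return (num - 1.0) / (alpha - a)

# As alpha -> a = 1 from either side, M_alpha approaches the Shannon
# entropy with c = 1/eta = 1.
assert abs(mathai(1.001) - 1.0) < 1e-2
assert abs(mathai(0.999) - 1.0) < 1e-2
```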
In the present article, the term “entropy” is used as a mathematical measure of uncertainty or information characterized by some basic axioms, as illustrated by [3]. Thus, it is a functional resulting from a set of axioms, that is, a function that can be interpreted in terms of a statistical density in the continuous case and in terms of multinomial probabilities in the discrete case. A general discussion of “entropy” is not attempted here because, as per Von Neumann, “whoever uses the term ‘entropy’ in a discussion always wins since no one knows what entropy really is, so in a debate, one always has the advantage”. An overview of various entropic functional forms used so far in the literature is available from [9], along with their historical backgrounds and an account of the numbers of citations of these various functional forms. Hence, no detailed discussion of various entropic functional forms is attempted in the present paper. The concept of entropy is applied in general physics, information theory, chaos theory, time series, computer science, data mining, statistics, engineering, mathematical linguistics, stochastic processes, etc. An account of the entropic universe was given by [10], along with answers to the following questions: How different concepts of entropy arose, what the mathematical definitions of each entropy are, how entropies are related to each other, which entropy is appropriate in which areas of application, and their impacts on the scientific community. Hence, the present article does not attempt to repeat the answers to these questions again. The present paper is about one entropy measure on a real scalar variable, its generalizations to vector/matrix variables in the real and complex domains, and an illustration of how this entropy can be optimized under various constraints to derive various statistical densities in the scalar, vector, and matrix variables in the real and complex domains. 
Because the entropy measure to be considered in the present article does not contain derivatives, the method of calculus of variation is used for optimization so that the resulting Euler equations will be simple. Mathematical variables and random variables are treated in the same way so that the double notations used for random variables are avoided. In order to avoid having too many symbols and the resulting confusion, scalar variables are denoted by lower-case letters and vector/matrix variables are denoted by capital letters so that the presentation is concise, consistent, and reader-friendly.

Entropy as an Expected Value

Shannon entropy $S(f)$ can be looked upon as an expected value of $-c\ln f(x)$. In Mathai's entropy (4), one can write the numerator as $\int_{X}\{[f(X)]^{\frac{a-\alpha}{\eta}}-1\}f(X)\,dX$, which is the expected value of $[f(X)]^{\frac{a-\alpha}{\eta}}-1$. Then, (4) is the following expected value:
$$M_{\alpha}(f)=E\left[\frac{\{f(X)\}^{\frac{a-\alpha}{\eta}}-1}{\alpha-a}\right].\tag{5}$$
The quantity inside the expected value operator goes to $-\frac{1}{\eta}\ln f(X)$ when $\alpha\to a$, which is the same as the Shannon case for $c=\frac{1}{\eta}$. Therefore, for $\alpha$ near $a$, the quantity inside the expectation operator is an approximation to $-\frac{1}{\eta}\ln f(X)$.
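The expected-value form above also suggests a direct Monte Carlo estimate of $M_{\alpha}(f)$; the following sketch (exponential density, $a=\eta=1$, $\alpha=0.8$, all illustrative choices) compares the sample average of the quantity inside the expectation with the closed-form value:

```python
import numpy as np

# Monte Carlo estimate of the expected-value form of M_alpha for
# f(x) = e^{-x}, with a = 1, eta = 1, alpha = 0.8.
rng = np.random.default_rng(42)
a, eta, alpha = 1.0, 1.0, 0.8

x = rng.exponential(scale=1.0, size=1_000_000)   # x sampled from f
fx = np.exp(-x)
mc = np.mean((fx ** ((a - alpha) / eta) - 1.0) / (alpha - a))

# Closed form: int f^{1+(a-alpha)} dx = 1/(2-alpha), so
# M_alpha = (1/(2-alpha) - 1)/(alpha - 1) = 1/(2-alpha).
closed = 1.0 / (2.0 - alpha)
assert abs(mc - closed) < 5e-3
```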

2. Optimization of Mathai’s Entropy for the Real Scalar Case

Let $x$ be a real scalar variable and let $f(x)$ be a density function, that is, $f(x)\ge 0$ for all $x$ and $\int_{x}f(x)\,dx=1$. Consider the optimization of (4) under the following moment-like constraints:
$$E\left[x^{\gamma\left(\frac{a-\alpha}{\eta}\right)}\right]=\text{fixed}\quad\text{and}\quad E\left[x^{\gamma\left(\frac{a-\alpha}{\eta}\right)+\delta}\right]=\text{fixed}$$
over all possible densities $f(x)$. Then, if we use the calculus of variations for the optimization of (4), the Euler equation is the following:
$$\frac{\partial}{\partial f}\left[f^{1+\frac{a-\alpha}{\eta}}-\lambda_{1}x^{\gamma\left(\frac{a-\alpha}{\eta}\right)}f+\lambda_{2}x^{\gamma\left(\frac{a-\alpha}{\eta}\right)+\delta}f\right]=0\;\Rightarrow\;\left(1+\frac{a-\alpha}{\eta}\right)f^{\frac{a-\alpha}{\eta}}=\lambda_{1}x^{\gamma\left(\frac{a-\alpha}{\eta}\right)}\left[1-b(a-\alpha)x^{\delta}\right]\;\Rightarrow\;f_{1}(x)=c_{1}x^{\gamma}\left[1-b(a-\alpha)x^{\delta}\right]^{\frac{\eta}{a-\alpha}},\quad\alpha<a\tag{6}$$
where $\lambda_{1}$ and $\lambda_{2}$ are Lagrangian multipliers and $\frac{\lambda_{2}}{\lambda_{1}}$ is taken as $b(a-\alpha)$ for convenience, for $\alpha<a$, $b>0$, $\gamma>0$, $\delta>0$, $\eta>0$; $a$ is a fixed real scalar constant, $1-b(a-\alpha)x^{\delta}>0$, and $c_{1}$ is the normalizing constant. For $\alpha>a$, $f_{1}(x)$ changes into
$$f_{2}(x)=c_{2}x^{\gamma}\left[1+b(\alpha-a)x^{\delta}\right]^{-\frac{\eta}{\alpha-a}}\tag{7}$$
for $\alpha>a$, $b>0$, $\eta>0$, $\delta>0$, $\gamma>0$, $x\ge 0$. When $\alpha\to a$, both $f_{1}(x)$ and $f_{2}(x)$ go to
$$f_{3}(x)=c_{3}x^{\gamma}e^{-b\eta x^{\delta}}\tag{8}$$
for $b>0$, $\eta>0$, $\delta>0$, $\gamma>0$, $x\ge 0$. Observe that all three functions $f_{i}(x)$, $i=1,2,3$ can be reached through the pathway parameter $\alpha$: from $f_{1}(x)$, one can go to $f_{2}(x)$ and $f_{3}(x)$, and similarly, from $f_{2}(x)$, one can obtain $f_{1}(x)$ and $f_{3}(x)$. Hence, $f_{1}(x)$ or $f_{2}(x)$ is Mathai's pathway model for the real scalar positive variable $x$, as a mathematical model or as a statistical model. The model $f_{1}(x)$ is a generalized type-1 beta model, $f_{2}(x)$ is a generalized type-2 beta model, and $f_{3}(x)$ is a generalized gamma model. For $\delta=2$, $\gamma=0$, $f_{3}(x)$ is a real scalar Gaussian model. For $\gamma=2$, $\delta=2$, $f_{3}(x)$ is a Maxwell–Boltzmann density for $x\ge 0$, and for $\gamma=\frac{1}{2}$, $\delta=2$, $x\ge 0$, $f_{3}(x)$ is the Rayleigh density for the real scalar positive variable case. If a location parameter is desired, then $x$ is replaced by $x-m$ in all of the above models, where $m$ is the relocation parameter. For $\gamma=0$, $\delta=1$, $\eta=1$, $a=1$, $\alpha=q$, $f_{i}(x)$, $i=1,2,3$ is Tsallis' statistic of non-extensive statistical mechanics; see [4]. Hundreds of articles have been published on Tsallis' statistics. For $\delta=1$, $\eta=1$, $a=1$, $\alpha>1$, $f_{2}(x)$ and $f_{3}(x)$ (but not $f_{1}(x)$) provide the superstatistics of statistical mechanics. Several articles have been published on superstatistics.
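The pathway transition $f_{1},f_{2}\to f_{3}$ can be illustrated numerically; the sketch below uses the unnormalized kernels with the illustrative choices $b=\eta=a=1$, $\gamma=2$, $\delta=2$ (the Maxwell–Boltzmann-type case):

```python
import numpy as np

# Unnormalized pathway kernels for the real scalar case; parameter
# values b = eta = a = 1, gamma = 2, delta = 2 are illustrative.
b, eta, a, g, d = 1.0, 1.0, 1.0, 2.0, 2.0

def k1(x, alpha):  # alpha < a, on the support where 1 - b(a-alpha)x^d > 0
    return x ** g * (1.0 - b * (a - alpha) * x ** d) ** (eta / (a - alpha))

def k2(x, alpha):  # alpha > a
    return x ** g * (1.0 + b * (alpha - a) * x ** d) ** (-eta / (alpha - a))

def k3(x):         # the alpha -> a limit: generalized gamma kernel
    return x ** g * np.exp(-b * eta * x ** d)

# Both pathway forms converge pointwise to the generalized gamma form.
x = 0.7
assert abs(k1(x, 1.0 - 1e-4) - k3(x)) < 1e-4
assert abs(k2(x, 1.0 + 1e-4) - k3(x)) < 1e-4
```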
Fermi–Dirac and Bose–Einstein densities are also available from the same procedure. In this case, the second factor $x^{\delta}$ in the constraint is replaced by $e^{-cx}$, $c>0$, $x\ge 0$, and the Lagrangian multipliers are taken as $\lambda_{1}$ and $\lambda_{2}$ so that the second factor in Equation (6) becomes $(\lambda_{1}+\lambda_{2}e^{-cx})^{-\frac{\eta}{\alpha-a}}$ for $\alpha>a$, with $\lambda_{1}+\lambda_{2}e^{-cx}>0$, to create a density function. Now, take $\gamma=0$, $\eta=1$, $\alpha-a=1$. Then, for $\lambda_{1}=1$ and $\lambda_{2}=e^{d}$ for some constant $d$, this gives the Fermi–Dirac density, and for $\lambda_{1}=-1$ and $\lambda_{2}=e^{d}$, this gives the Bose–Einstein density.
In model-building situations, if $f_{3}(x)$, namely the generalized gamma model, the Maxwell–Boltzmann model ($\gamma=2$, $\delta=2$), the Rayleigh model ($\gamma=\frac{1}{2}$, $\delta=2$), or the Gaussian model ($\gamma=0$, $\delta=2$), is the stable or ideal situation in a physical system, then $f_{1}(x)$ and $f_{2}(x)$ provide the unstable or chaotic neighborhoods, and through the pathway parameter $\alpha$, one can model the stable situation, the unstable neighborhoods, and the transitional stages in a data-analysis situation. This is the pathway idea of Mathai.

3. Constraints in Terms of the Ellipsoid of Concentration in the Real p-Variate Case

Let $X$ be a $p\times 1$ real vector with distinct real scalar variables $x_{j}$ as elements, $X'=(x_{1},\dots,x_{p})$, where a prime denotes the transpose. Let $\mu$ be a $p\times 1$ location vector. Let the covariance matrix of $X$ be $\Sigma=E[(X-\mu)(X-\mu)']$, $\mu=E[X]$; then, $\Sigma=\Sigma'$, and let $\Sigma>O$ (real positive definite). Then, the square of the Euclidean distance of $X$ from the point of location $\mu$ is $(X-\mu)'(X-\mu)$, and the generalized distance of $X$ from $\mu$ is $(X-\mu)'\Sigma^{-1}(X-\mu)$. Because $\Sigma$ is real positive definite, $u=(X-\mu)'\Sigma^{-1}(X-\mu)$ is known as the ellipsoid of concentration. The probability content of this ellipsoid of concentration is an important quantity in statistical analysis. Let us consider constraints in terms of moments of the ellipsoid of concentration $u$. Consider the following constraints:
$$E\left[u^{\gamma\left(\frac{a-\alpha}{\eta}\right)}\right]=\text{fixed}\quad\text{and}\quad E\left[u^{\gamma\left(\frac{a-\alpha}{\eta}\right)+\delta}\right]=\text{fixed}$$
over all possible densities f ( X ) , where X is a p × 1 vector random variable. Then, optimizing Mathai’s entropy in (4) for all possible densities f ( X ) and proceeding as in Section 2, we have the following three densities: For α < a ,
$$f_{1}(X)=C_{1}\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\gamma}\left[1-b(a-\alpha)\left((X-\mu)'\Sigma^{-1}(X-\mu)\right)^{\delta}\right]^{\frac{\eta}{a-\alpha}}\tag{9}$$
for $\alpha<a$, $b>0$, $\gamma>0$, $\delta>0$, $\Sigma>O$, $\eta>0$. For $\alpha>a$, the model in (9) changes into the model
$$f_{2}(X)=C_{2}\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\gamma}\left[1+b(\alpha-a)\left((X-\mu)'\Sigma^{-1}(X-\mu)\right)^{\delta}\right]^{-\frac{\eta}{\alpha-a}}\tag{10}$$
for $\alpha>a$, $b>0$, $\gamma>0$, $\delta>0$, $\eta>0$, $\Sigma>O$, and for $\alpha\to a$, the models in both (9) and (10) go to the model
$$f_{3}(X)=C_{3}\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\gamma}e^{-b\eta\left((X-\mu)'\Sigma^{-1}(X-\mu)\right)^{\delta}}\tag{11}$$
for $b>0$, $\eta>0$, $\Sigma>O$, where $C_{i}$, $i=1,2,3$ are the normalizing constants. These normalizing constants can be evaluated, and further properties of the models can be studied, with the help of the following results from [11]:
Lemma 1.
Let X = ( x i j ) be a p × q real matrix with distinct real scalar variables x i j as elements. Let A be p × p and B be q × q constant nonsingular matrices. Then,
$$Y=AXB,\;|A|\ne 0,\;|B|\ne 0\;\Rightarrow\;dY=|A|^{q}|B|^{p}\,dX.\tag{12}$$
For the proof of this result, as well as for other similar results, see [11]. We will state one more result from [11] here without proof.
Lemma 2.
Let $X$ be a real $p\times q$, $p\le q$, rank-$p$ matrix with distinct real scalar variables as elements. Let $S=XX'$ so that $S$ is $p\times p$, symmetric, and positive definite. Then, after integrating out over the Stiefel manifold,
$$dX=\frac{\pi^{\frac{pq}{2}}}{\Gamma_{p}\left(\frac{q}{2}\right)}|S|^{\frac{q}{2}-\frac{p+1}{2}}\,dS\tag{13}$$
where, for example, Γ p ( α ) is the real matrix-variate gamma given by
$$\Gamma_{p}(\alpha)=\pi^{\frac{p(p-1)}{4}}\Gamma(\alpha)\Gamma\left(\alpha-\tfrac{1}{2}\right)\cdots\Gamma\left(\alpha-\tfrac{p-1}{2}\right),\quad\Re(\alpha)>\tfrac{p-1}{2}\tag{14}$$
$$=\int_{S>O}|S|^{\alpha-\frac{p+1}{2}}e^{-\mathrm{tr}(S)}\,dS,\quad\Re(\alpha)>\tfrac{p-1}{2}$$
where $\Re(\cdot)$ indicates the real part of $(\cdot)$, $S>O$ is $p\times p$ real positive definite, and $\mathrm{tr}(\cdot)$ indicates the trace of $(\cdot)$.

Evaluation of the Normalizing Constants

Consider $f_{1}(X)$ of (9). Let $Y=\Sigma^{-\frac{1}{2}}(X-\mu)\Rightarrow dY=|\Sigma|^{-\frac{1}{2}}\,dX$ by using Lemma 1, where $\Sigma^{-\frac{1}{2}}$ is the positive definite square root of $\Sigma^{-1}$. Let $s=Y'Y$, which is $1\times 1$ because $Y'$ is $1\times p$ and $Y$ is $p\times 1$. Then, from Lemma 2, $dY=\frac{\pi^{\frac{p}{2}}}{\Gamma(\frac{p}{2})}s^{\frac{p}{2}-1}\,ds$. Therefore, the total integral is
$$1=\int_{X}f_{1}(X)\,dX=C_{1}\int_{X}\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\gamma}\left[1-b(a-\alpha)\left((X-\mu)'\Sigma^{-1}(X-\mu)\right)^{\delta}\right]^{\frac{\eta}{a-\alpha}}dX$$
$$=C_{1}|\Sigma|^{\frac{1}{2}}\int_{Y}[Y'Y]^{\gamma}\left[1-b(a-\alpha)(Y'Y)^{\delta}\right]^{\frac{\eta}{a-\alpha}}dY$$
$$=C_{1}|\Sigma|^{\frac{1}{2}}\frac{\pi^{\frac{p}{2}}}{\Gamma(\frac{p}{2})}\int_{s=0}^{[b(a-\alpha)]^{-\frac{1}{\delta}}}s^{\gamma+\frac{p}{2}-1}\left[1-b(a-\alpha)s^{\delta}\right]^{\frac{\eta}{a-\alpha}}ds$$
$$=C_{1}|\Sigma|^{\frac{1}{2}}\frac{\pi^{\frac{p}{2}}}{\Gamma(\frac{p}{2})}\frac{1}{\delta\,[b(a-\alpha)]^{\frac{1}{\delta}(\gamma+\frac{p}{2})}}\,\frac{\Gamma\left(\frac{1}{\delta}(\gamma+\frac{p}{2})\right)\Gamma\left(1+\frac{\eta}{a-\alpha}\right)}{\Gamma\left(1+\frac{\eta}{a-\alpha}+\frac{1}{\delta}(\gamma+\frac{p}{2})\right)}$$
for α < a . The last step is obtained by integrating out s by using a real type-1 beta integral. Hence, for α < a , the normalizing constant is
$$C_{1}=\frac{\Gamma\left(\frac{p}{2}\right)\,\delta\,[b(a-\alpha)]^{\frac{1}{\delta}(\gamma+\frac{p}{2})}}{|\Sigma|^{\frac{1}{2}}\,\pi^{\frac{p}{2}}}\,\frac{\Gamma\left(1+\frac{\eta}{a-\alpha}+\frac{1}{\delta}(\gamma+\frac{p}{2})\right)}{\Gamma\left(\frac{1}{\delta}(\gamma+\frac{p}{2})\right)\Gamma\left(1+\frac{\eta}{a-\alpha}\right)}$$
for α < a , b > 0 , η > 0 , δ > 0 , γ > 0 , Σ > O . In a similar manner, and by integrating out s by using a real type-2 beta integral, we have the normalizing constant C 2 for α > a as the following:
$$C_{2}=\frac{\Gamma\left(\frac{p}{2}\right)\,\delta\,[b(\alpha-a)]^{\frac{1}{\delta}(\gamma+\frac{p}{2})}}{|\Sigma|^{\frac{1}{2}}\,\pi^{\frac{p}{2}}}\,\frac{\Gamma\left(\frac{\eta}{\alpha-a}\right)}{\Gamma\left(\frac{1}{\delta}(\gamma+\frac{p}{2})\right)\Gamma\left(\frac{\eta}{\alpha-a}-\frac{1}{\delta}(\gamma+\frac{p}{2})\right)}$$
for $\alpha>a$, $\frac{\eta}{\alpha-a}-\frac{1}{\delta}\left(\gamma+\frac{p}{2}\right)>0$, $\gamma>0$, $\delta>0$, $\eta>0$, $\Sigma>O$, and for $\alpha\to a$ we have
$$C_{3}=\frac{\Gamma\left(\frac{p}{2}\right)\,\delta\,(b\eta)^{\frac{1}{\delta}(\gamma+\frac{p}{2})}}{|\Sigma|^{\frac{1}{2}}\,\pi^{\frac{p}{2}}\,\Gamma\left(\frac{1}{\delta}(\gamma+\frac{p}{2})\right)}$$
for b > 0 , η > 0 , δ > 0 , γ > 0 , Σ > O .
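The type-1 beta evaluation behind $C_{1}$ can be verified numerically; the parameter values below are arbitrary illustrative choices satisfying $\alpha<a$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as G

# Radial integral behind C1:
#   int_0^{[b(a-alpha)]^{-1/delta}} s^{gamma+p/2-1} [1 - b(a-alpha)s^delta]^{eta/(a-alpha)} ds
# compared against its closed form from the real type-1 beta integral.
p, g, d, b, eta, a, alpha = 3, 1.5, 2.0, 2.0, 1.0, 1.0, 0.5
bp = b * (a - alpha)          # b(a - alpha) > 0 since alpha < a
c = g + p / 2.0               # gamma + p/2
e = eta / (a - alpha)         # eta/(a - alpha)

upper = bp ** (-1.0 / d)      # support boundary: 1 - b(a-alpha)s^delta = 0
num, _ = quad(lambda s: s ** (c - 1) * (1.0 - bp * s ** d) ** e, 0.0, upper)

closed = G(c / d) * G(1.0 + e) / (d * bp ** (c / d) * G(1.0 + e + c / d))
assert abs(num - closed) < 1e-7
```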
Observe that the model in (9) is a multivariate generalized real type-1 beta model, (10) is a multivariate generalized real type-2 beta model, and (11) is a multivariate generalized real gamma model. For $\delta=2$, $\gamma=2$, (11) is also a real multivariate Maxwell–Boltzmann model, and for $\delta=2$, $\gamma=\frac{1}{2}$, (11) is a real multivariate Rayleigh model. The corresponding densities for $Y=\Sigma^{-\frac{1}{2}}(X-\mu)$ can be called the standard real multivariate Maxwell–Boltzmann and Rayleigh densities, respectively. If the Maxwell–Boltzmann and Rayleigh densities are the stable distributions in a physical system, then the unstable or chaotic neighborhoods are available from (9) and (10), and all of the situations (the stable situation, the unstable neighborhoods, and the transitional stages) can be reached through the pathway parameter $\alpha$. For $\gamma=0$, the model in (9) is very useful in real multivariate reliability analysis; see [12,13]. The model in (10) for $\gamma=0$ corresponds to a multivariate version of the Student-t, Cauchy, multivariate F, and related distributions; see [14].
From the normalizing constants C 1 , C 2 , C 3 , one can also obtain the h-th moment of the ellipsoid of concentration for an arbitrary h. That is, for α < a ,
$$E\left\{\left[b(a-\alpha)\right]^{\frac{1}{\delta}}(X-\mu)'\Sigma^{-1}(X-\mu)\right\}^{h}=\frac{\Gamma\left(\frac{1}{\delta}(\gamma+h+\frac{p}{2})\right)}{\Gamma\left(\frac{1}{\delta}(\gamma+\frac{p}{2})\right)}\,\frac{\Gamma\left(1+\frac{\eta}{a-\alpha}+\frac{1}{\delta}(\gamma+\frac{p}{2})\right)}{\Gamma\left(1+\frac{\eta}{a-\alpha}+\frac{1}{\delta}(\gamma+h+\frac{p}{2})\right)}$$
for $\gamma+h+\frac{p}{2}>0$. The density corresponding to these moments is an H-function. For the theory and applications of the H-function, see [15]. Then, $b(a-\alpha)\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\delta}$ is distributed as a real scalar type-1 beta random variable with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{p}{2}),\,1+\frac{\eta}{a-\alpha}\right)$ for $\alpha<a$. Similarly, $b(\alpha-a)(X-\mu)'\Sigma^{-1}(X-\mu)$ has an H-function distribution, whereas $b(\alpha-a)\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\delta}$ is a real scalar type-2 beta variable with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{p}{2}),\,\frac{\eta}{\alpha-a}-\frac{1}{\delta}(\gamma+\frac{p}{2})\right)$ for $\alpha>a$, and $b\eta\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\delta}$ is a real scalar gamma random variable with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{p}{2}),\,1\right)$.
Theorem 1.
For the $f_{1}(X)$, $f_{2}(X)$, $f_{3}(X)$ defined in (9)–(11), respectively, $b(a-\alpha)\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\delta}$ is a real scalar type-1 beta random variable with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{p}{2}),\,1+\frac{\eta}{a-\alpha}\right)$ for $\alpha<a$; $b(\alpha-a)\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\delta}$ is a real scalar type-2 beta random variable with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{p}{2}),\,\frac{\eta}{\alpha-a}-\frac{1}{\delta}(\gamma+\frac{p}{2})\right)$ for $\alpha>a$ and $\frac{\eta}{\alpha-a}-\frac{1}{\delta}(\gamma+\frac{p}{2})>0$; $b\eta\left[(X-\mu)'\Sigma^{-1}(X-\mu)\right]^{\delta}$ is a real scalar gamma random variable with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{p}{2}),\,1\right)$.
Note 1.
We can relax the condition δ > 0 . Note that the models in (10) and (11) are also valid for δ < 0 , and by defining the support appropriately, we can relax the condition δ > 0 in (9) as well.
Note 2.
Consider a function $g\left((X-\mu)'\Sigma^{-1}(X-\mu)\right)$ with $g(r^{2})\ge 0$ for some real scalar variable $r$, and let $\int_{r}g(r^{2})\,dr<\infty$. Consider the optimization of Mathai's entropy in (4) over all possible densities $f(X)$ and under the constraint
$$E\left\{\left[g\left((X-\mu)'\Sigma^{-1}(X-\mu)\right)\right]^{\frac{a-\alpha}{\eta}}\right\}=\text{fixed}$$
over all $f(X)$, where the expectation is taken with respect to $f(X)$; then, we end up with an elliptically contoured distribution for $f(X)$, for which the corresponding density of $Y=\Sigma^{-\frac{1}{2}}(X-\mu)$ is a spherically symmetric distribution, that is, one invariant under orthonormal transformations or under rotations of the axes of coordinates.

4. Real Matrix-Variate Case

Let $X=(x_{ij})$ be a real $p\times q$, $p\le q$, rank-$p$ matrix with distinct real scalar variables $x_{ij}$ as elements. Let $A>O$ be a $p\times p$ constant positive definite matrix and let $B>O$ be a $q\times q$ constant positive definite matrix. Let $u=\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})$. This $u$ is an important quantity in the statistical literature. Hence, we will impose restrictions in terms of moments of $u$. Consider the optimization of Mathai's entropy in (4) over all densities $f(X)$, where $X$ is a $p\times q$ matrix, as defined above, subject to the constraints:
$$E\left[u^{\gamma\left(\frac{a-\alpha}{\eta}\right)}\right]=\text{fixed}\quad\text{and}\quad E\left[u^{\gamma\left(\frac{a-\alpha}{\eta}\right)+\delta}\right]=\text{fixed}$$
over all possible densities f ( X ) . Then, proceeding as in Section 3, we end up with the following densities, where we use the same notations of f i ( X ) , C i , i = 1 , 2 , 3 in order to avoid having too many symbols: For α < a ,
$$f_{1}(X)=C_{1}\left[\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right]^{\gamma}\left[1-b(a-\alpha)\left(\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right)^{\delta}\right]^{\frac{\eta}{a-\alpha}};\tag{20}$$
For α > a ,
$$f_{2}(X)=C_{2}\left[\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right]^{\gamma}\left[1+b(\alpha-a)\left(\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right)^{\delta}\right]^{-\frac{\eta}{\alpha-a}}\tag{21}$$
and for $\alpha\to a$,
$$f_{3}(X)=C_{3}\left[\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right]^{\gamma}e^{-b\eta\left(\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right)^{\delta}}.\tag{22}$$
For evaluating the normalizing constants, we use the following transformations: $Y=A^{\frac{1}{2}}XB^{\frac{1}{2}}\Rightarrow dY=|A|^{\frac{q}{2}}|B|^{\frac{p}{2}}\,dX$, and $s=\mathrm{tr}(YY')=$ the sum of squares of all the $pq$ elements in $Y$; hence, $\mathrm{tr}(YY')=ZZ'$, where $Z$ is a $1\times pq$ vector. Then, from Lemma 2, $s=ZZ'\Rightarrow dZ=\frac{\pi^{\frac{pq}{2}}}{\Gamma(\frac{pq}{2})}s^{\frac{pq}{2}-1}\,ds$. Then, for $\alpha<a$, we evaluate the $s$-integral by using a real scalar type-1 beta integral; for $\alpha>a$, we evaluate the $s$-integral by using a real scalar type-2 beta integral; for $\alpha\to a$, the $s$-integral is evaluated by using a real scalar gamma integral. Then, the normalizing constants are the following:
$$C_{1}=\frac{|A|^{\frac{q}{2}}|B|^{\frac{p}{2}}\,\Gamma\left(\frac{pq}{2}\right)\,\delta\,[b(a-\alpha)]^{\frac{1}{\delta}(\gamma+\frac{pq}{2})}}{\pi^{\frac{pq}{2}}}\,\frac{\Gamma\left(1+\frac{\eta}{a-\alpha}+\frac{1}{\delta}(\gamma+\frac{pq}{2})\right)}{\Gamma\left(\frac{1}{\delta}(\gamma+\frac{pq}{2})\right)\Gamma\left(1+\frac{\eta}{a-\alpha}\right)}\tag{23}$$
$$C_{2}=\frac{|A|^{\frac{q}{2}}|B|^{\frac{p}{2}}\,\Gamma\left(\frac{pq}{2}\right)\,\delta\,[b(\alpha-a)]^{\frac{1}{\delta}(\gamma+\frac{pq}{2})}}{\pi^{\frac{pq}{2}}}\,\frac{\Gamma\left(\frac{\eta}{\alpha-a}\right)}{\Gamma\left(\frac{1}{\delta}(\gamma+\frac{pq}{2})\right)\Gamma\left(\frac{\eta}{\alpha-a}-\frac{1}{\delta}(\gamma+\frac{pq}{2})\right)}\tag{24}$$
$$C_{3}=\frac{|A|^{\frac{q}{2}}|B|^{\frac{p}{2}}\,\Gamma\left(\frac{pq}{2}\right)\,\delta\,(b\eta)^{\frac{1}{\delta}(\gamma+\frac{pq}{2})}}{\pi^{\frac{pq}{2}}}\,\frac{1}{\Gamma\left(\frac{1}{\delta}(\gamma+\frac{pq}{2})\right)}\tag{25}$$
where, in (23), the conditions are $\alpha<a$, $A>O$, $B>O$, $b>0$, $\eta>0$, $\delta>0$, $\gamma>0$; in (24), the conditions are $\alpha>a$, $A>O$, $B>O$, $b>0$, $\eta>0$, $\delta>0$, $\gamma>0$, $\frac{\eta}{\alpha-a}-\frac{1}{\delta}(\gamma+\frac{pq}{2})>0$; in (25), the conditions are $A>O$, $B>O$, $b>0$, $\eta>0$, $\gamma>0$, $\delta>0$.
Observe that (21) and (22) are available from (20); similarly, (20) and (22) are available from (21). In other words, all densities in (20)–(22) are available through the pathway parameter $\alpha$. Note that (22) for $\delta=1$, $\gamma=1$ can be taken as a multivariate version of the Maxwell–Boltzmann density coming from a rectangular matrix-variate real random variable. Similarly, for $\delta=1$, $\gamma=\frac{1}{2}$, one can take (22) as a version of the multivariate real Rayleigh density coming from a rectangular matrix-variate real random variable. For $\gamma=0$, $\delta=1$, (22) is a real rectangular matrix-variate Gaussian density. One can consider (20) as a generalized real multivariate type-1 beta density, (21) as a generalized real multivariate type-2 beta density, and (22) as the corresponding gamma density. For $\gamma=0$, the model in (20) is a suitable model for reliability analysis in a real multivariate situation. As observed in Section 3, one can see that $\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})$ has an H-function distribution for $\alpha<a$, $\alpha>a$, and $\alpha\to a$. In addition, for $\alpha<a$, $b(a-\alpha)\left[\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right]^{\delta}$ is real scalar type-1 beta distributed with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{pq}{2}),\,1+\frac{\eta}{a-\alpha}\right)$; for $\alpha>a$, $b(\alpha-a)\left[\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right]^{\delta}$ is real scalar type-2 beta distributed with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{pq}{2}),\,\frac{\eta}{\alpha-a}-\frac{1}{\delta}(\gamma+\frac{pq}{2})\right)$; for $\alpha\to a$, $b\eta\left[\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})\right]^{\delta}$ is real scalar gamma distributed with the parameters $\left(\frac{1}{\delta}(\gamma+\frac{pq}{2}),\,1\right)$.
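The gamma-integral evaluation behind $C_{3}$ in (25) can likewise be verified numerically (the parameter values are illustrative):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as G

# After Y = A^{1/2} X B^{1/2} and s = tr(YY'), the normalizing integral
# behind C3 reduces to
#   int_0^inf s^{gamma + pq/2 - 1} exp(-b*eta*s^delta) ds
#   = Gamma((gamma + pq/2)/delta) / (delta * (b*eta)^{(gamma + pq/2)/delta}).
p, q, g, d, b, eta = 2, 3, 1.0, 1.5, 0.8, 1.2
c = g + p * q / 2.0

num, _ = quad(lambda s: s ** (c - 1) * np.exp(-b * eta * s ** d), 0.0, np.inf)
closed = G(c / d) / (d * (b * eta) ** (c / d))
assert abs(num - closed) / closed < 1e-6
```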
Note 3.
If a location parameter $p\times q$ matrix $M$ is to be introduced, then replace $X$ with $X-M$ everywhere. If $q\le p$ and if $X$ is of rank $q$, then one can consider $v=\mathrm{tr}(B^{\frac{1}{2}}X'AXB^{\frac{1}{2}})$. Then, parallel results hold for all of the results in Section 4 by interchanging $A$ with $B$ and $p$ with $q$.

5. Constraints in Terms of Determinants

Let $X=(x_{ij})$ be a $p\times q$, $p\le q$, rank-$p$ matrix with distinct elements $x_{ij}$. Let $A>O$ be a $p\times p$ and $B>O$ a $q\times q$ constant positive definite matrix. Consider the optimization of Mathai's entropy (4) under the constraint
$$E\left[\,\left|I-b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|\,\right]=\text{fixed}$$
over all real $p\times q$, $p\le q$, rank-$p$ matrix-variate densities $f(X)$. Then, following the same procedure as in the above cases, we end up with the density
$$f_{1}(X)=C_{1}\left|I-b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{\frac{\eta}{a-\alpha}}\tag{26}$$
for $\alpha<a$, $b>0$, $\eta>0$, $A>O$, $B>O$, $I-b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}>O$, where $a$ is a fixed scalar constant. In order to avoid having too many symbols, we will use the same notations $f_{i}(X)$, $C_{i}$, $i=1,2,3$ in this section. For $\alpha>a$, the model in (26) changes into
$$f_{2}(X)=C_{2}\left|I+b(\alpha-a)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{-\frac{\eta}{\alpha-a}}\tag{27}$$
for $\alpha>a$, $b>0$, $\eta>0$, $A>O$, $B>O$. When $\alpha\to a$, both $f_{1}(X)$ and $f_{2}(X)$ go to
$$f_{3}(X)=C_{3}\,e^{-b\eta\,\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})}\tag{28}$$
for $b>0$, $\eta>0$, $A>O$, $B>O$. The transition of (26) and (27) to (28) can be seen from the following properties. Let $\lambda_{1},\dots,\lambda_{p}$ be the eigenvalues of $A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}$. Then,
$$\left|I-b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{\frac{\eta}{a-\alpha}}=\prod_{j=1}^{p}\left[1-b(a-\alpha)\lambda_{j}\right]^{\frac{\eta}{a-\alpha}}.$$
However, from the definition of the mathematical constant $e$, we have $\lim_{\alpha\to a-}\left[1-b(a-\alpha)\lambda_{j}\right]^{\frac{\eta}{a-\alpha}}=e^{-b\eta\lambda_{j}}$. In a similar manner, $\lim_{\alpha\to a+}\left[1+b(\alpha-a)\lambda_{j}\right]^{-\frac{\eta}{\alpha-a}}=e^{-b\eta\lambda_{j}}$. Then, the product gives the sum of the eigenvalues, that is, the trace, in the exponent, and hence the result. The normalizing constants $C_{i}$, $i=1,2,3$ can be evaluated by using the following transformations: $Y=A^{\frac{1}{2}}XB^{\frac{1}{2}}\Rightarrow dY=|A|^{\frac{q}{2}}|B|^{\frac{p}{2}}\,dX$ by using Lemma 1, and $S=YY'\Rightarrow dY=\frac{\pi^{\frac{pq}{2}}}{\Gamma_{p}(\frac{q}{2})}|S|^{\frac{q}{2}-\frac{p+1}{2}}\,dS$ by using Lemma 2. Then, evaluating the $S$-integral by using a real matrix-variate type-1 beta integral for $\alpha<a$, a real matrix-variate type-2 beta integral for $\alpha>a$, or a real matrix-variate gamma integral for $\alpha\to a$, we obtain the results, where, for example, $\Gamma_{p}(\alpha)$ is the real matrix-variate gamma defined earlier in (14).
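The determinant-to-exponential transition can be seen numerically, one eigenvalue at a time (the values of $b$, $\eta$, and $\lambda_{j}$ are illustrative):

```python
import numpy as np

# Eigenvalue-by-eigenvalue transition:
#   (1 - b(a-alpha)*lam)^(eta/(a-alpha))  -> exp(-b*eta*lam) as alpha -> a-,
#   (1 + b(alpha-a)*lam)^(-eta/(alpha-a)) -> exp(-b*eta*lam) as alpha -> a+.
b, eta, a, lam = 0.5, 2.0, 1.0, 0.9
target = np.exp(-b * eta * lam)

eps = 1e-5                                   # |alpha - a|
left  = (1.0 - b * eps * lam) ** (eta / eps)   # alpha = a - eps
right = (1.0 + b * eps * lam) ** (-eta / eps)  # alpha = a + eps
assert abs(left - target) < 1e-4
assert abs(right - target) < 1e-4
```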

5.1. Modification of the Constraint in Terms of a Determinant

Let us consider the matrices X , A , B as in Section 5. Consider the optimization of (4) under the following constraint for α < a :
$$E\left[\,\left|A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{\gamma\left(\frac{a-\alpha}{\eta}\right)}\left|I-b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|\,\right]=\text{fixed}$$
over all possible densities f ( X ) . Then, proceeding as in the previous cases, we end up with the following densities:
$$f_{1}(X)=C_{1}\left|A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{\gamma}\left|I-b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{\frac{\eta}{a-\alpha}}\tag{29}$$
for $\alpha<a$, $b>0$, $\gamma>0$, $\eta>0$, $A>O$, $B>O$, $I-b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}>O$. For $\alpha>a$, we have
$$f_{2}(X)=C_{2}\left|A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{\gamma}\left|I+b(\alpha-a)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{-\frac{\eta}{\alpha-a}}\tag{30}$$
for $\alpha>a$, $b>0$, $\eta>0$, $\gamma>0$, $A>O$, $B>O$, and for $\alpha\to a$, we have
$$f_{3}(X)=C_{3}\left|A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{\gamma}e^{-b\eta\,\mathrm{tr}(A^{\frac{1}{2}}XBX'A^{\frac{1}{2}})}\tag{31}$$
for $b>0$, $\eta>0$, $A>O$, $B>O$. Observe that, as in the previous cases, all three models are available through the pathway parameter $\alpha$ from either $f_{1}(X)$ or $f_{2}(X)$. If $f_{3}(X)$ is the stable situation in a physical system, then the unstable neighborhoods are given by $f_{1}(X)$ and $f_{2}(X)$; these stable and unstable stages and the transitional stages can be reached through $\alpha$. For $\gamma=1$, the model in (31) can be taken as the real rectangular matrix-variate Maxwell–Boltzmann density, and for $\gamma=\frac{1}{2}$, it is the real rectangular matrix-variate Rayleigh density. The corresponding densities of $Y=A^{\frac{1}{2}}XB^{\frac{1}{2}}$ can be taken as the standard matrix-variate Maxwell–Boltzmann and Rayleigh densities. The corresponding densities for $S=YY'$ can be taken as the isotropic or spherically symmetric matrix-variate Maxwell–Boltzmann and Rayleigh densities. The normalizing constants can be evaluated by using the transformations in Section 5 and then evaluating the $S$-integral by using real matrix-variate type-1 beta, type-2 beta, and gamma integrals. The final expressions are the following:
$$C_{1}=\frac{|A|^{\frac{q}{2}}|B|^{\frac{p}{2}}\,[b(a-\alpha)]^{p\gamma+\frac{pq}{2}}\,\Gamma_{p}\left(\frac{q}{2}\right)}{\pi^{\frac{pq}{2}}}\,\frac{\Gamma_{p}\left(\gamma+\frac{q}{2}+\frac{p+1}{2}+\frac{\eta}{a-\alpha}\right)}{\Gamma_{p}\left(\gamma+\frac{q}{2}\right)\Gamma_{p}\left(\frac{p+1}{2}+\frac{\eta}{a-\alpha}\right)}\tag{32}$$
$$C_{2}=\frac{|A|^{\frac{q}{2}}|B|^{\frac{p}{2}}\,[b(\alpha-a)]^{p\gamma+\frac{pq}{2}}\,\Gamma_{p}\left(\frac{q}{2}\right)}{\pi^{\frac{pq}{2}}}\,\frac{\Gamma_{p}\left(\frac{\eta}{\alpha-a}\right)}{\Gamma_{p}\left(\gamma+\frac{q}{2}\right)\Gamma_{p}\left(\frac{\eta}{\alpha-a}-\gamma-\frac{q}{2}\right)}\tag{33}$$
$$C_{3}=\frac{|A|^{\frac{q}{2}}|B|^{\frac{p}{2}}\,(b\eta)^{p\gamma+\frac{pq}{2}}\,\Gamma_{p}\left(\frac{q}{2}\right)}{\pi^{\frac{pq}{2}}}\,\frac{1}{\Gamma_{p}\left(\gamma+\frac{q}{2}\right)}\tag{34}$$
where in (32) $\alpha<a$; in (33) $\alpha>a$ and $\frac{\eta}{\alpha-a}-\left(\gamma+\frac{q}{2}\right)>0$; and in (32)–(34) $A>O$, $B>O$, $\gamma>0$, $\eta>0$, $b>0$.
Note 4.
If a location parameter is needed, then replace $X$ with $X-M$, where $M$ is a $p\times q$ constant matrix, everywhere in Section 5 and Section 5.1.

5.2. Arbitrary Moments

Let $u=\left|A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|$; if the $h$-th moment of this determinant $u$ for an arbitrary $h$ is needed, then this moment can be written down by looking at the normalizing constants in (32)–(34). For $\alpha<a$, the $h$-th moment is the following:
$$E[u^{h}]=[b(a-\alpha)]^{-ph}\,\frac{\Gamma_{p}\left(\gamma+\frac{q}{2}+h\right)}{\Gamma_{p}\left(\gamma+\frac{q}{2}\right)}\,\frac{\Gamma_{p}\left(\frac{p+1}{2}+\frac{\eta}{a-\alpha}+\gamma+\frac{q}{2}\right)}{\Gamma_{p}\left(\frac{p+1}{2}+\frac{\eta}{a-\alpha}+\gamma+\frac{q}{2}+h\right)}$$
$$=[b(a-\alpha)]^{-ph}\,\prod_{j=1}^{p}\frac{\Gamma\left(\gamma+\frac{q}{2}+h-\frac{j-1}{2}\right)}{\Gamma\left(\gamma+\frac{q}{2}-\frac{j-1}{2}\right)}\,\frac{\Gamma\left(\frac{p+1}{2}+\frac{\eta}{a-\alpha}+\gamma+\frac{q}{2}-\frac{j-1}{2}\right)}{\Gamma\left(\frac{p+1}{2}+\frac{\eta}{a-\alpha}+\gamma+\frac{q}{2}-\frac{j-1}{2}+h\right)}$$
$$\Rightarrow\;E\left[\,\left|b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|^{h}\,\right]=E[u_{1}^{h}]\,E[u_{2}^{h}]\cdots E[u_{p}^{h}]$$
for $h+\gamma+\frac{q}{2}>\frac{p-1}{2}$, where $u_{1},\dots,u_{p}$ are independently distributed real scalar type-1 beta random variables, with $u_{j}$ having the parameters $\left(\gamma+\frac{q}{2}-\frac{j-1}{2},\,\frac{p+1}{2}+\frac{\eta}{a-\alpha}\right)$, $j=1,\dots,p$, so that we have the following structural representation for $\alpha<a$:
$$\left|b(a-\alpha)A^{\frac{1}{2}}XBX'A^{\frac{1}{2}}\right|=u_{1}\cdots u_{p}.\tag{35}$$
Both sides have the same distribution. Similarly, for α > a , we have the following:
E [ b ( α a ) u ] h = j = 1 p Γ ( γ + q 2 j 1 2 + h ) Γ ( γ + q 2 j 1 2 ) Γ ( η α a ( γ + q 2 ) j 1 2 ) Γ ( η α a ( γ + q 2 ) j 1 2 h ) = E [ v 1 h ] E [ v 2 h ] E [ v p h ] b ( α a ) u = v 1 v 2 v p
for ( h + γ + q 2 ) > p 1 2 , η α a ( γ + q 2 ) j 1 2 > 0 , j = 1 , , p , where v 1 , , v p are independently distributed real scalar type-2 beta variables, with v j having the parameters ( γ + q 2 j 1 2 , η α a ( γ + q 2 ) j 1 2 ) , j = 1 , , p . For α → a , we have the following from (34):
E [ b η u ] h = Γ p ( γ + q 2 + h ) Γ p ( γ + q 2 ) = j = 1 p Γ ( γ + q 2 j 1 2 + h ) Γ ( γ + q 2 j 1 2 ) = E [ w 1 h ] E [ w 2 h ] E [ w p h ] b η u = w 1 w p
for ( h + γ + q 2 ) > p 1 2 , where w 1 , , w p are independently distributed real scalar gamma variables, with w j having the parameters ( γ + q 2 j 1 2 , 1 ) , j = 1 , , p .
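The gamma-product representation in (37) can be illustrated numerically. The following sketch (the parameter values p = 3, q = 4, γ = 1, h = 0.7 are ours, chosen only for the example) draws the independent gamma variables w 1 , … , w p with shapes γ + q 2 − ( j − 1 ) 2 and checks by Monte Carlo that the h-th moment of their product factors into the product of gamma-function ratios given above:

```python
import numpy as np
from scipy.special import gammaln

# Monte Carlo check of the structural representation (37): the h-th moment of a
# product of independent gamma variables w_j, with shape gamma + q/2 - (j-1)/2
# and unit scale, equals the product of ratios Gamma(shape + h)/Gamma(shape).
rng = np.random.default_rng(0)
p, q, gam, h = 3, 4, 1.0, 0.7                       # illustrative values
shapes = [gam + q / 2 - (j - 1) / 2 for j in range(1, p + 1)]

n = 400_000
prod = np.ones(n)
for a in shapes:
    prod *= rng.gamma(shape=a, scale=1.0, size=n)   # w_j ~ Gamma(a, 1), independent

mc_moment = np.mean(prod ** h)
exact = np.exp(sum(gammaln(a + h) - gammaln(a) for a in shapes))
assert abs(mc_moment - exact) / exact < 0.02
```

The same pattern, with the gamma sampler replaced by a beta sampler, verifies the type-1 and type-2 beta representations in (35) and (36).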
Note 5.
Note that u = | A 1 2 X B X ′ A 1 2 | = | Y Y ′ | . Let the rows of Y be Y 1 , , Y p , where Y j is a 1 × q real vector. Then, Y j can be considered to be a point in a q-dimensional Euclidean space. We have p ( p ≤ q ) such points. These points (vectors) are linearly independent because we have assumed that the matrix is of rank p. Taking the points in the order Y 1 , , Y p , these points (vectors) create a convex hull, and in this hull, a parallelotope is determined; the volume content of this parallelotope is the determinant | Y Y ′ | 1 2 . Hence, the distribution of this determinant, as well as the moments, is important in stochastic geometry or in geometrical probabilities and other related areas of image processing, pattern recognition, etc. The scaling constants b ( a − α ) in (35), b ( α − a ) in (36), and b η in (37) can be taken to be unity for convenience. Then, the points Y 1 , , Y p are type-1 beta distributed in (35), type-2 beta distributed in (36), and gamma distributed in (37). In general, Y 1 , , Y p have pathway distributions, or these are pathway-distributed random points in q-space.
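The volume interpretation in Note 5 is easy to verify directly: for a p × q matrix Y of rank p, the volume of the parallelotope spanned by its rows is | Y Y ′ | 1 2 . A minimal sketch (the dimensions are illustrative; for p = 2 the volume reduces to the familiar base-times-height area formula):

```python
import numpy as np

# Volume of the parallelotope spanned by the rows Y_1, ..., Y_p of a p x q
# matrix Y (p <= q) is det(Y Y')^(1/2), as described in Note 5.
rng = np.random.default_rng(1)
p, q = 2, 5                                   # illustrative dimensions
Y = rng.standard_normal((p, q))
vol = np.sqrt(np.linalg.det(Y @ Y.T))

# Cross-check for p = 2: area = |Y_1| |Y_2| sin(angle between Y_1 and Y_2)
y1, y2 = Y
cos_t = y1 @ y2 / (np.linalg.norm(y1) * np.linalg.norm(y2))
area = np.linalg.norm(y1) * np.linalg.norm(y2) * np.sqrt(1 - cos_t**2)
assert abs(vol - area) < 1e-10
```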
Note 6.
If q < p and the matrix X is of rank q, then we may consider B 1 2 X ′ A X B 1 2 . Then, results corresponding to the results in Section 5, Section 5.1 and Section 5.2 are available by interchanging A with B and p with q. Hence, a separate discussion is not needed in this case. Observe also that tr ( A 1 2 X B X ′ A 1 2 ) = tr ( B 1 2 X ′ A X B 1 2 ) , where one is a p × p matrix and the other is a q × q matrix.

6. Complex Case

For a matrix A, its transpose will be written as A ′ and its complex conjugate transpose as A * . If A = A * , then A is called Hermitian. Any complex matrix A can be written as A = A 1 + i A 2 , i = ( − 1 ) 1 2 , A 1 , A 2 real. When A is Hermitian, then A 1 ′ = A 1 , A 2 ′ = − A 2 , that is, A 1 is real symmetric and A 2 is real skew symmetric. If A is p × p Hermitian positive definite, then A = A * > O (Hermitian positive definite). The determinant of A is written as | A | , as well as det ( A ) , and the absolute value of the determinant will be written as | det ( A ) | = [ det ( A ) det ( A * ) ] 1 2 = [ det ( A A * ) ] 1 2 . Variables in the complex domain will be written with a tilde, such as X ˜ . In order to optimize Mathai’s entropy (4) over the density f ( X ˜ ) in the complex domain, we need some results on Jacobians. These will be given here as lemmas without proofs. For the proofs and for other related results in the complex domain, see [11].
Lemma 3.
Let X ˜ = ( x ˜ i j ) be p × q with distinct complex scalar elements x ˜ i j . Let A and B be p × p and q × q nonsingular constant matrices, respectively—real or complex. Then,
Y ˜ = A X ˜ B , | A | 0 , | B | 0 d Y ˜ = | det ( A A * ) | q | det ( B B * ) | p d X ˜ .
Lemma 4.
Let X ˜ = ( x ˜ i j ) be a p × q , p ≤ q , and rank p matrix with distinct complex elements x ˜ i j . Let S ˜ = X ˜ X ˜ * , which is p × p Hermitian positive definite. Then, after integrating over the Stiefel manifold,
d X ˜ = π p q Γ ˜ p ( q ) | det ( S ˜ ) | q p d S ˜
where Γ ˜ p ( α ) is a complex matrix-variate gamma given by
Γ ˜ p ( α ) = π p ( p − 1 ) 2 Γ ( α ) Γ ( α − 1 ) . . . Γ ( α − p + 1 ) , ℜ ( α ) > p − 1
= ∫ S ˜ > O | det ( S ˜ ) | α − p e − tr ( S ˜ ) d S ˜ , ℜ ( α ) > p − 1 .
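The product form of Γ ˜ p ( α ) is straightforward to compute numerically. A small sketch (the function name is ours) that also checks the recursion Γ ˜ p ( α ) = π p − 1 Γ ( α ) Γ ˜ p − 1 ( α − 1 ) , which the product form implies:

```python
import math

def complex_matrix_gamma(p: int, alpha: float) -> float:
    """Complex matrix-variate gamma: pi^(p(p-1)/2) * Gamma(alpha) * ... * Gamma(alpha - p + 1)."""
    return math.pi ** (p * (p - 1) / 2) * math.prod(math.gamma(alpha - j) for j in range(p))

# p = 1 reduces to the ordinary gamma function
assert math.isclose(complex_matrix_gamma(1, 3.5), math.gamma(3.5))

# Recursion implied by the product form (alpha > p - 1 so all arguments are valid)
a, p = 6.25, 4
lhs = complex_matrix_gamma(p, a)
rhs = math.pi ** (p - 1) * math.gamma(a) * complex_matrix_gamma(p - 1, a - 1)
assert math.isclose(lhs, rhs, rel_tol=1e-12)
```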

Optimization in the Complex Domain

As a first problem, let X ˜ be a p × 1 vector variable in the complex domain with distinct scalar complex elements. Let Σ > O be a p × p Hermitian positive definite constant matrix. Consider the Hermitian form u = ( X ˜ μ ) * Σ 1 ( X ˜ μ ) , where μ is a p × 1 constant vector. This can be taken as the ellipsoid of concentration in 2 p -dimensional Euclidean space or as the ellipsoid of concentration in a p-dimensional complex domain. This ellipsoid is an important quantity in statistical analysis, as well as in various other situations. When X ˜ is a vector random variable in the complex domain with the mean value E [ X ˜ ] = μ and with the covariance matrix Σ = Cov ( X ˜ ) = E [ ( X ˜ μ ) ( X ˜ μ ) * ] , then u is the generalized distance of X ˜ from the point of the location of its expected value μ . Hence, we will optimize Mathai’s entropy in (4) under moment-like constraints on u. Consider the following constraints: For α < a ,
E [ u γ ( a α η ) ] = fixed   and   E [ u γ ( a α η ) + δ ] = fixed
over all possible densities f ( X ˜ ) , where a , α , η > 0 , δ > 0 , γ > 0 are all real scalar constants, a is a fixed location, and α is a real parameter. For α < a , proceeding as in the real case, we will end up with the following density:
f 1 ( X ˜ ) = c 1 [ ( X ˜ μ ) * Σ 1 ( X ˜ μ ) ] γ [ 1 b ( a α ) ( ( X ˜ μ ) * Σ 1 ( X ˜ μ ) ) δ ] η a α
for 1 b ( a α ) ( X ˜ μ ) * Σ 1 ( X ˜ μ ) > 0 , where c 1 is the normalizing constant. In order to avoid having too many symbols, we will use the same notations as in the real case, with variables written with a tilde and constants without a tilde. For α > a , we will have the following density:
f 2 ( X ˜ ) = c 2 [ ( X ˜ μ ) * Σ 1 ( X ˜ μ ) ] γ [ 1 + b ( α a ) ( ( X ˜ μ ) * Σ 1 ( X ˜ μ ) ) δ ] η α a
for α > a , b > 0 , η > 0 , δ > 0 , γ > 0 . When α → a , both f 1 ( X ˜ ) and f 2 ( X ˜ ) go to
f 3 ( X ˜ ) = c 3 [ ( X ˜ μ ) * Σ 1 ( X ˜ μ ) ] γ e b η ( X ˜ μ ) * Σ 1 ( X ˜ μ )
for b > 0 , η > 0 . For evaluating the normalizing constants c 1 , c 2 , c 3 , we will use the following transformations: Y ˜ = Σ − 1 2 ( X ˜ − μ ) ⇒ d Y ˜ = | det ( Σ ) | − 1 d X ˜ by using Lemma 3, where Σ − 1 2 is the Hermitian positive definite square root of the Hermitian positive definite Σ − 1 ; s = Y ˜ * Y ˜ ⇒ d Y ˜ = [ π p / Γ ( p ) ] s p − 1 d s by using Lemma 4, where Y ˜ * is 1 × p , and s is 1 × 1 . Then, we evaluate the s-integral by using a real scalar type-1 beta integral for α < a , a real scalar type-2 beta integral for α > a , and a real scalar gamma integral for α → a . Then, we have the following results:
c 1 = Γ ( p ) | det ( Σ ) | π p δ [ b ( a α ) ] γ + p δ Γ ( 1 + η a α + γ + p δ ) Γ ( γ + p δ ) Γ ( 1 + η a α ) , α < a
c 2 = Γ ( p ) | det ( Σ ) | π p δ [ b ( α a ) ] γ + p δ Γ ( η α a ) Γ ( γ + p δ ) Γ ( η α a ( γ + p δ ) ) , α > a
c 3 = Γ ( p ) | det ( Σ ) | π p δ [ b η ] γ + p δ 1 Γ ( γ + p δ )
for b > 0 , η > 0 , γ > 0 , δ > 0 , and in addition, in (46), η α a ( γ + p δ ) > 0 . Observe that through the pathway parameter α , one can reach all three densities f j ( X ˜ ) , j = 1 , 2 , 3 , and hence, f 1 ( X ˜ ) or f 2 ( X ˜ ) is the pathway model in the complex domain for the p × 1 vector random variable X ˜ . In model-building situations, if f 3 ( X ˜ ) is the stable model, then the unstable neighborhoods are given by f 1 ( X ˜ ) and f 2 ( X ˜ ) , and the transitional stages are also reached through α .
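After the transformations above, the evaluation of c 3 rests on a real scalar gamma integral in s. The underlying identity, ∫ 0 ∞ s γ + p − 1 e − b η s δ d s = Γ ( ( γ + p ) / δ ) / [ δ ( b η ) ( γ + p ) / δ ] , obtained by the substitution t = b η s δ , can be checked numerically (the parameter values are illustrative):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as G

# The alpha -> a case reduces to the scalar gamma integral
#   int_0^inf s^(g + p - 1) exp(-b*eta*s^delta) ds
#     = Gamma((g + p)/delta) / (delta * (b*eta)^((g + p)/delta)),
# via the substitution t = b*eta*s^delta (illustrative parameter values).
g, p, b, eta, delta = 0.5, 3, 1.2, 0.8, 1.5
num, _ = quad(lambda s: s**(g + p - 1) * np.exp(-b * eta * s**delta), 0, np.inf)
closed = G((g + p) / delta) / (delta * (b * eta)**((g + p) / delta))
assert abs(num - closed) / closed < 1e-8
```

The type-1 and type-2 beta integrals for the α < a and α > a cases are handled by the same substitution, with finite or infinite support as appropriate.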
For γ = 1 , δ = 1 , one can consider f 3 ( X ˜ ) in (44) as a multivariate Maxwell–Boltzmann density in the complex domain. For γ = 1 2 , δ = 1 , one can take (44) as a multivariate Rayleigh density in the complex domain. For p = 1 , we have the scalar variable Maxwell–Boltzmann and Rayleigh densities in the complex domain from (44). The corresponding real cases may be seen from [12,13]. Observe that (43) and (44) also hold for δ < 0 , but for δ < 0 , the support must be redefined in (42). Hence, a form of multivariate Maxwell–Boltzmann and Rayleigh densities can be defined for δ < 0 as well. In the complex domain, these densities are defined over the whole complex space. In the complex scalar case, if one has to confine to the sector ℜ ( x ˜ − μ ) > 0 , then we multiply the corresponding (44) by 1 2 for p = 1 so that one can consider, for example, a time variable that is real positive for the real part and a phase variable for the complex part. Note that in the Rayleigh case for p = 1 , γ = 1 2 , we have [ ( x ˜ − μ ) * ( x ˜ − μ ) ] 1 2 = | x ˜ − μ | , which is the absolute value. For γ = 0 , (42) gives a very good model for multivariate reliability analysis in the complex domain. Reliability analysis in the complex domain does not seem to have been discussed in the literature.

7. Optimization with a Trace Constraint

Let X ˜ be a p × q , p ≤ q , and rank p matrix with distinct complex scalar variables as elements. Let A > O and B > O be p × p and q × q Hermitian positive definite constant matrices, respectively. Let u = tr ( A 1 2 X ˜ B X ˜ * A 1 2 ) , where A 1 2 is a Hermitian positive definite square root of A. Consider the optimization of Mathai’s entropy in (4) for the density f ( X ˜ ) , where X ˜ is p × q , as described here, subject to the constraints
E [ tr ( A 1 2 X ˜ B X ˜ * A 1 2 ) ] γ ( a α η ) = fixed   and   E [ tr ( A 1 2 X ˜ B X ˜ * A 1 2 ) ] γ ( a α η ) + δ = fixed
over all possible densities f ( X ˜ ) , where X ˜ is p × q , p ≤ q , and of rank p. Then, proceeding as in the real case, we end up with the following densities: For α < a ,
f 1 ( X ˜ ) = C 1 [ u ] γ [ 1 b ( a α ) u δ ] η a α , α < a
f 2 ( X ˜ ) = C 2 [ u ] γ [ 1 + b ( α a ) u δ ] η α a , α > a
f 3 ( X ˜ ) = C 3 [ u ] γ e b η u δ
where u = tr ( A 1 2 X ˜ B X ˜ * A 1 2 ) , and the normalizing constants are the following:
C 1 = | det ( A ) | q | det ( B ) | p Γ ( p q ) π p q δ [ b ( a α ) ] γ + p q δ Γ ( γ + p q δ + 1 + η a α ) Γ ( γ + p q δ ) Γ ( 1 + η a α ) ,
for α < a , b > 0 , γ > 0 , δ > 0 , A > O , B > O , η > 0 ;
C 2 = | det ( A ) | q | det ( B ) | p Γ ( p q ) π p q δ [ b ( α a ) ] γ + p q δ Γ ( η α a ) Γ ( γ + p q δ ) Γ ( η α a ( γ + p q δ ) ) ,
for α > a , b > 0 , η > 0 , γ > 0 , δ > 0 , A > O , B > O , η α a ( γ + p q δ ) > 0 and
C 3 = | det ( A ) | q | det ( B ) | p Γ ( p q ) π p q δ [ b η ] γ + p q δ 1 Γ ( γ + p q δ )
for A > O , B > O , b > 0 , η > 0 , γ > 0 , δ > 0 .
Note that (50) can be considered as a multivariate version of the complex Maxwell–Boltzmann and Rayleigh densities for ( γ = 1 , δ = 1 ) and ( γ = 1 2 , δ = 1 ) , respectively. If q < p and X ˜ is of rank q, then we can take u = tr ( B 1 2 X ˜ * A X ˜ B 1 2 ) and proceed, as in the p ≤ q case, with p and q interchanged and A and B interchanged. We obtain results parallel to the ones above for the case of p ≤ q .

8. Constraints in Terms of Determinants

Let X ˜ be p × q , p ≤ q , and of rank p with distinct complex scalar variables as elements. Let A > O and B > O be p × p and q × q Hermitian positive definite constant matrices, respectively. Consider the optimization of (4) under the constraint
E [ | det ( A 1 2 X ˜ B X ˜ * A 1 2 ) | γ ( a α η ) | det ( I b ( a α ) A 1 2 X ˜ B X ˜ * A 1 2 ) | ] = fixed
over all possible densities f ( X ˜ ) , where X ˜ is p × q , p ≤ q , and of rank p. Then, proceeding as in the real case, we have the following densities:
f 1 ( X ˜ ) = C 1 | det ( A 1 2 X ˜ B X ˜ * A 1 2 ) | γ | det ( I b ( a α ) A 1 2 X ˜ B X ˜ * A 1 2 ) | η a α , α < a
f 2 ( X ˜ ) = C 2 | det ( A 1 2 X ˜ B X ˜ * A 1 2 ) | γ | det ( I + b ( α a ) A 1 2 X ˜ B X ˜ * A 1 2 ) | η α a , α > a
f 3 ( X ˜ ) = C 3 | det ( A 1 2 X ˜ B X ˜ * A 1 2 ) | γ e b η tr ( A 1 2 X ˜ B X ˜ * A 1 2 ) , α a
where in (54), I b ( a α ) A 1 2 X ˜ B X ˜ * A 1 2 > O , and the normalizing constants are the following:
C 1 = | det ( A ) | q | det ( B ) | p Γ ˜ p ( q ) π p q [ b ( a α ) ] p ( γ + q ) Γ ˜ p ( p + 1 2 + η a α + γ + q ) Γ ˜ p ( γ + q ) Γ ˜ p ( p + 1 2 + η a α )
for α < a , b > 0 , γ > 0 , η > 0 , A > O , B > O ;
C 2 = | det ( A ) | q | det ( B ) | p Γ ˜ p ( q ) π p q [ b ( α a ) ] p ( γ + q ) Γ ˜ p ( η α a ) Γ ˜ p ( γ + q ) Γ ˜ p ( η α a ( γ + q ) )
for α > a , b > 0 , γ > 0 , η > 0 , A > O , B > O , η α a ( γ + q ) > 0 and
C 3 = | det ( A ) | q | det ( B ) | p Γ ˜ p ( q ) π p q [ b η ] p ( γ + q ) 1 Γ ˜ p ( γ + q )
for A > O , B > O , b > 0 , η > 0 , γ > 0 . Note that the model in (56) can be taken as the complex rectangular matrix-variate Maxwell–Boltzmann and Rayleigh densities for γ = 1 and γ = 1 2 , respectively. The corresponding real cases were given by [12,13]. The standard versions of complex rectangular matrix-variate Maxwell–Boltzmann and Rayleigh densities are available from (56) by considering the density of Y = A 1 2 X ˜ B 1 2 . Then, in the normalizing constant C 3 , | det ( A ) | q | det ( B ) | p will be absent. The standard density is the following:
f 4 ( Y ˜ ) = Γ ˜ p ( q ) π p q [ ρ ] p ( γ + q ) 1 Γ ˜ p ( γ + q ) | Y ˜ Y ˜ * | γ e ρ tr ( Y ˜ Y ˜ * )
where we have taken b η = ρ for convenience. Then, for p = 1 , q = 1 , we obtain the complex scalar versions of the Maxwell–Boltzmann and Rayleigh densities from (61) as the following:
f 5 ( x ˜ ) = 1 π ρ γ + 1 Γ ( γ + 1 ) [ x ˜ x ˜ * ] γ e ρ x ˜ x ˜ * , ρ > 0 , γ > 0 .
For γ = 1 , we have the Maxwell–Boltzmann density in the complex scalar case, and for γ = 1 2 , we have the Rayleigh density in the complex scalar case. Note that [ x ˜ x ˜ * ] 1 2 = | x ˜ | , which is the absolute value of the scalar complex variable x ˜ . If the domain must be confined to ℜ ( x ˜ ) > 0 , then multiply (61) by 1 2 .
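As a sanity check, f 5 ( x ˜ ) should integrate to 1 over the whole complex plane. Writing x ˜ = r e i θ , the density depends only on r = | x ˜ | , so a one-dimensional radial integral suffices. A numerical sketch for the Maxwell–Boltzmann ( γ = 1 ) and Rayleigh ( γ = 1 2 ) cases (the value of ρ is chosen arbitrarily for the illustration):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as G

# f5 integrates to 1 over the complex plane; in polar coordinates the area
# element is r dr dtheta, and the theta-integral contributes 2*pi, so
#   int f5 = int_0^inf 2*pi*r * (1/pi) * rho^(g+1)/Gamma(g+1) * r^(2g) * exp(-rho*r^2) dr.
def total_mass(g: float, rho: float) -> float:
    integrand = lambda r: (2 * np.pi * r) * (rho**(g + 1) / (np.pi * G(g + 1))) \
                          * r**(2 * g) * np.exp(-rho * r**2)
    return quad(integrand, 0, np.inf)[0]

assert abs(total_mass(1.0, 2.0) - 1) < 1e-8   # Maxwell-Boltzmann case, gamma = 1
assert abs(total_mass(0.5, 2.0) - 1) < 1e-8   # Rayleigh case, gamma = 1/2
```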

8.1. Arbitrary Moments

As in the real case, we can consider the h-th moment of the absolute value of the determinant | det ( A 1 2 X ˜ B X ˜ * A 1 2 ) | for arbitrary h. Then, from (54), we have the following: For α < a ,
E [ | det ( b ( a α ) A 1 2 X ˜ B X ˜ * A 1 2 ) | h ] = Γ ˜ p ( γ + q + h ) Γ ˜ p ( γ + q ) Γ ˜ p ( γ + q + η a α ) Γ ˜ p ( γ + q + η a α + h ) = j = 1 p Γ ( γ + q ( j 1 ) + h ) Γ ( γ + q ( j 1 ) ) Γ ( γ + q + η a α ( j 1 ) ) Γ ( γ + q + η a α ( j 1 ) + h ) = E [ u 1 h ] E [ u 2 h ] . . . E [ u p h ]
for α < a and ( h + γ + q ) > p 1 , where u 1 , . . . , u p are mutually independently distributed real scalar type-1 beta random variables, with u j having the parameters ( γ + q ( j 1 ) , η a α ) , j = 1 , . . . , p . Therefore, we have the structural representation
| det ( b ( a α ) A 1 2 X ˜ B X ˜ * A 1 2 ) | = u 1 . . . u p
where u 1 , . . . , u p are as defined above and α < a . From (55), we have the following:
E [ | det ( b ( α a ) A 1 2 X ˜ B X ˜ * A 1 2 ) | h ] = Γ ˜ p ( γ + q + h ) Γ ˜ p ( γ + q ) Γ ˜ p ( η α a ( γ + q ) ) Γ ˜ p ( η α a ( γ + q ) h ) = j = 1 p Γ ( γ + q ( j 1 ) + h ) Γ ( γ + q ( j 1 ) ) Γ ( η α a ( γ + q ) ( j 1 ) ) Γ ( η α a ( γ + q ) ( j 1 ) h ) = E [ v 1 h ] E [ v 2 h ] E [ v p h ]
for α > a , ( h + γ + q ) > p 1 , ( h + η α a ( γ + q ) ) > p 1 , where v 1 , , v p are mutually independently distributed real scalar type-2 beta random variables with the parameters ( γ + q ( j 1 ) , η α a ( γ + q ) ( j 1 ) ) , j = 1 , , p , and we have the structural representation
| det ( b ( α a ) A 1 2 X ˜ B X ˜ * A 1 2 ) | = v 1 v p
for α > a , where v 1 , v p are defined above. From (56), we have the following results:
E [ | det ( b η A 1 2 X ˜ B X ˜ * A 1 2 ) | h ] = Γ ˜ p ( γ + q + h ) Γ ˜ p ( γ + q ) = j = 1 p Γ ( γ + q ( j 1 ) + h ) Γ ( γ + q ( j 1 ) ) = E [ w 1 h ] E [ w 2 h ] E [ w p h ]
where ( h + γ + q ) > p 1 , w 1 , , w p are mutually independently distributed real scalar gamma random variables with the parameters ( γ + q ( j 1 ) , 1 ) , j = 1 , , p , and we have the structural representation
| det ( b η A 1 2 X ˜ B X ˜ * A 1 2 ) | = w 1 w 2 w p
where w 1 , , w p are defined above. Note that if q < p and X ˜ is of rank q, then we can obtain results for B 1 2 X ˜ * A X ˜ B 1 2 that are parallel to those in Section 8 and Section 8.1 by interchanging p with q and A with B.
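The structural representation in (64) can also be illustrated by simulation: the h-th moment of a product of independent real scalar type-1 beta variables with parameters ( γ + q − ( j − 1 ) , η / ( a − α ) ) should match the product of gamma-function ratios in the moment formula above. A Monte Carlo sketch with illustrative values (the constant beta2 below stands in for η / ( a − α ) ):

```python
import numpy as np
from scipy.special import gammaln

# Monte Carlo sketch of the structural representation (64): for independent
# u_j ~ Beta(gamma + q - (j-1), beta2), the h-th moment of u_1*...*u_p equals
# prod_j Gamma(a_j + h)Gamma(a_j + beta2) / [Gamma(a_j)Gamma(a_j + beta2 + h)].
rng = np.random.default_rng(2)
p, q, gam, h = 3, 4, 1.0, 0.6       # illustrative values
beta2 = 2.5                          # plays the role of eta/(a - alpha)
a_par = [gam + q - (j - 1) for j in range(1, p + 1)]

n = 300_000
prod = np.ones(n)
for a in a_par:
    prod *= rng.beta(a, beta2, size=n)   # u_j ~ type-1 beta, independent

mc = np.mean(prod ** h)
exact = np.exp(sum(gammaln(a + h) - gammaln(a)
                   + gammaln(a + beta2) - gammaln(a + beta2 + h) for a in a_par))
assert abs(mc - exact) / exact < 0.02
```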

9. Concluding Remarks

In this paper, it is shown that a large number of statistical densities belonging to the pathway family [16] of densities in the scalar, vector, and matrix-variate cases in the real and complex domains can be obtained by optimizing a certain entropy measure. The calculus of variations technique was used for the optimization. The notation was simplified and made consistent in order to avoid having too many symbols denoting different types of variables. Mathematical variables and random variables are treated in the same way in order to avoid the double notation usually employed to distinguish random variables, and the resulting confusion.

Author Contributions

N.S., A.M.M. and H.J.H. have equal contributions. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank the referees for their valuable comments, which enabled the authors to improve the presentation of the material in the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
  2. Havrda, J.; Charvát, F. Quantification method of classification processes: Concept of structural α-entropy. Kybernetika 1967, 3, 30–35.
  3. Mathai, A.M.; Rathie, P.N. Basic Concepts in Information Theory and Statistics: Axiomatic Foundations and Applications; Wiley Eastern: New Delhi, India, 1975.
  4. Tsallis, C. Possible generalization of Boltzmann-Gibbs statistics. J. Stat. Phys. 1988, 52, 479–487.
  5. Sebastian, N. Generalized pathway entropy and its application in diffusion entropy analysis and fractional calculus. Commun. Appl. Ind. Math. 2015, 1–20.
  6. Paul, J.; Thomas, P.Y. On some properties and Mathai-Haubold entropy of record values. J. Indian Soc. Probab. Stat. 2019, 31–49.
  7. Mainardi, F. Why the Mittag-Leffler Function Can Be Considered the Queen Function of the Fractional Calculus? Entropy 2020, 22, 1359.
  8. Liang, Y. Diffusion entropy method for ultraslow diffusion using inverse Mittag-Leffler function. Fract. Calc. Appl. Anal. 2018, 21, 104–117.
  9. Ilić, V.M.; Korbel, J.; Gupta, S.; Scarfone, A.M. An overview of generalized entropic forms. arXiv 2021, arXiv:2102.10071v1.
  10. Ribeiro, M.; Henriques, T.; Castro, L.; Souto, A.; Antunes, L.; Costa-Santos, C.; Teixeira, A. The entropy universe. Entropy 2021, 23, 222.
  11. Mathai, A.M. Jacobians of Matrix Transformations and Functions of Matrix Argument; World Scientific Publishing: New York, NY, USA, 1997.
  12. Mathai, A.M.; Princy, T. Analogues of reliability analysis for matrix-variate cases. Linear Algebra Its Appl. 2017, 532, 287–311.
  13. Mathai, A.M.; Princy, T. Multivariate and matrix-variate analogues of Maxwell-Boltzmann and Rayleigh densities. Physica A 2017, 468, 668–676.
  14. Nagy, S.; Dyckerhoff, R.; Mozharovskyi, P. Uniform convergence rates for the approximated half space and projection depth. Electron. J. Stat. 2020, 14, 3939–3975.
  15. Mathai, A.M.; Saxena, R.K.; Haubold, H.J. The H-Function: Theory and Applications; Springer: New York, NY, USA, 2010.
  16. Mathai, A.M. A pathway to matrix-variate gamma and normal densities. Linear Algebra Appl. 2005, 396, 317–328.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
