
On the Interpolating Family of Distributions

by Saralees Nadarajah 1,* and Idika E. Okorie 2

1 Department of Mathematics, University of Manchester, Manchester M13 9PL, UK
2 Department of Mathematics, Khalifa University, Abu Dhabi P.O. Box 127788, United Arab Emirates
* Author to whom correspondence should be addressed.
Axioms 2024, 13(1), 70; https://doi.org/10.3390/axioms13010070
Submission received: 18 December 2023 / Revised: 17 January 2024 / Accepted: 18 January 2024 / Published: 20 January 2024

Abstract: A recent paper introduced the interpolating family (IF) of distributions and derived various of its mathematical properties. Among the most important properties discussed were the integer order moments of the IF distributions. The moments were expressed as an integral (which was not evaluated) or as finite sums of the beta function. In this paper, more general expressions for moments of any integer order or any real order are derived. Apart from being more general, our expressions converge for a wider range of parameter values. Expressions for entropies are also derived, maximum likelihood estimation is considered and the finite sample performance of the maximum likelihood estimates is investigated.

1. Introduction

Ref. [1] introduced the interpolating family (IF) of size distributions, which is given by the probability density function
$$f_X(x) = \frac{bq}{c}\left(\frac{x-x_0}{c}\right)^{b-1}\left[G_p(x)\right]^{-q-1}\left[1-\frac{1}{p+1}\,G_p(x)^{-q}\right]^{p} \qquad (1)$$
for $x_0 \le x < \infty$, $p \ge 0$, $b \ne 0$, $c > 0$, $q > 0$ and $x_0 \ge 0$, where
$$G_p(x) = (p+1)^{-1/q} + \left(\frac{x-x_0}{c}\right)^{b}.$$
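The definitions above are straightforward to code. The following is a minimal sketch of (1) and of $G_p$ (our own illustration, not code from the paper; it assumes $p \ge 0$ and $b > 0$):

```python
import math

def G(x, p, b, c, q, x0):
    # G_p(x) = (p + 1)^(-1/q) + ((x - x0)/c)^b
    return (p + 1.0) ** (-1.0 / q) + ((x - x0) / c) ** b

def pdf(x, p, b, c, q, x0):
    # the density (1) for x >= x0; assumes p >= 0, b > 0, c > 0, q > 0
    g = G(x, p, b, c, q, x0)
    return (b * q / c) * ((x - x0) / c) ** (b - 1.0) * g ** (-q - 1.0) * \
        (1.0 - g ** (-q) / (p + 1.0)) ** p
```

For $p = 0$, $b = c = q = 1$ and $x_0 = 0$, the density reduces to $1/(1+x)^2$, which gives an easy check of the sketch.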
Ref. [1] derived several mathematical properties of (1), including special cases, the cumulative distribution function, survival function, hazard function, quantile function, the median, random variate generation, moments, the mean, variance, unimodality and the location of the mode.
As explained in [1], the distribution given by (1) is not new. The motivation for (1) was to combine Pareto-type distributions and Weibull-type distributions into one mathematical form. The aim of this paper is to derive more of the mathematical properties of (1), and hence to add to its applicability. The more general expressions for the moment properties of (1) given in Section 3 can enable the development of estimation methods based on moments, L moments, trimmed L moments and probability weighted moments. The derivation of entropies for (1) can help to develop entropy-based estimation methods for fitting (1) to real data. The derivation of the maximum likelihood procedure for (1) can help in using that procedure to fit (1) to real data.
Let X be a random variable with its probability density function given by (1). Ref. [1] expressed the rth moment of X as
$$E\left[X^r\right] = \sum_{i=0}^{r}\binom{r}{i} x_0^{i} c^{r-i}\, I(p,b,q), \qquad (2)$$
where
$$I(p,b,q) = q\int_{(p+1)^{-1/q}}^{\infty} y^{-q-1}\left(y-(p+1)^{-1/q}\right)^{\frac{r-i}{b}}\left[1-\frac{y^{-q}}{p+1}\right]^{p} dy. \qquad (3)$$
Ref. [1] did not simplify (2), stating “It is in principle possible to write out $I(p,b,q)$ as an infinite series of beta functions, but because this expression is rather intricate and needs to be worked out on a case-by-case basis just like $I(p,b,q)$, we refrain from doing so”. Ref. [1] then derived simpler expressions for (2) in the following three special cases: (i) $p = 0$, referred to as the IF1 distribution; (ii) $p \to \infty$, referred to as the IF2 distribution; (iii) $0 < p < \infty$ and $b = 1$, referred to as the IF3 distribution. The derived expressions are finite sums or doubly finite sums of the beta function.
In this paper, (2) is simplified in terms of a known special function when X is an IF random variable, whether r is an integer or not. Particular cases of this result, for when X is an IF1 random variable or an IF3 random variable, are also derived. Apart from being more general, our expressions converge for a wider range of parameter values. In fact, some of our expressions hold for all admissible values of r, p, b, c, q and $x_0$.
The expressions given in this paper involve the Wright generalized hypergeometric function, ${}_p\Psi_q(\cdot)$, with p numerator and q denominator parameters ([2], Equation (1.9)), defined by
$${}_p\Psi_q\left[\begin{matrix}\left(\alpha_1,A_1\right),\ldots,\left(\alpha_p,A_p\right)\\ \left(\beta_1,B_1\right),\ldots,\left(\beta_q,B_q\right)\end{matrix};z\right] = \sum_{n=0}^{\infty}\frac{\prod_{j=1}^{p}\Gamma\left(\alpha_j+A_j n\right)}{\prod_{j=1}^{q}\Gamma\left(\beta_j+B_j n\right)}\,\frac{z^n}{n!} \qquad (4)$$
for $z \in \mathbb{C}$, where $\mathbb{C}$ denotes the set of complex numbers, $\alpha_j, \beta_k \in \mathbb{C}$, and $A_j \ne 0$, $B_k \ne 0$ are real numbers for $j = 1, \ldots, p$ and $k = 1, \ldots, q$. This function was originally introduced by [3]. If
$$\sum_{j=1}^{q} B_j - \sum_{j=1}^{p} A_j > -1, \qquad (5)$$
then (4) converges absolutely for all finite values of z. If
$$\sum_{j=1}^{q} B_j - \sum_{j=1}^{p} A_j = -1, \qquad (6)$$
then the radius of convergence of (4) is
$$\rho = \prod_{k=1}^{p}\left|A_k\right|^{-A_k}\prod_{j=1}^{q}\left|B_j\right|^{B_j}. \qquad (7)$$
If (6) holds and $|z| = \rho$, then (4) converges absolutely if
$$\sum_{j=1}^{q}\beta_j - \sum_{j=1}^{p}\alpha_j + \frac{p-q-1}{2} > 0. \qquad (8)$$
(See Theorem 1.5 in [2].)
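The series (4) can be evaluated by direct truncation. The following is a minimal sketch for the ${}_1\Psi_2$ case used throughout this paper (our own illustration; the truncation length `nmax` and the pole handling are implementation choices). Taking the reciprocal gamma function to be zero at its poles is what makes denominator pairs of the form $(p+1,-1)$ terminate the series when p is a nonnegative integer:

```python
import math

def rgamma(x):
    # 1/Gamma(x), taken to be 0 at the poles x = 0, -1, -2, ...
    if x <= 0 and abs(x - round(x)) < 1e-12:
        return 0.0
    return 1.0 / math.gamma(x)

def wright_1psi2(num, den1, den2, z, nmax=60):
    # truncated series for 1Psi2[(a, A); (b1, B1), (b2, B2); z], cf. (4)
    a, A = num
    s = 0.0
    for n in range(nmax):
        r1 = rgamma(den1[0] + den1[1] * n)
        r2 = rgamma(den2[0] + den2[1] * n)
        if r1 == 0.0 or r2 == 0.0:
            continue  # a denominator gamma sits at a pole: the term vanishes
        s += math.gamma(a + A * n) * r1 * r2 * z ** n / math.factorial(n)
    return s
```

For example, ${}_1\Psi_2[(1,1);(1,1),(2,1);1] = \sum_n 1/[(n+1)!\,n!]$, while a denominator pair $(1,-1)$ leaves only the $n = 0$ term.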
Apart from the Wright generalized hypergeometric function, the calculations in this paper use the gamma and beta functions, defined by
$$\Gamma(a) = \int_0^{\infty} t^{a-1}\exp(-t)\,dt$$
and
$$B(a,b) = \int_0^1 t^{a-1}(1-t)^{b-1}\,dt,$$
respectively. The gamma function is defined for any real number $a > 0$. The beta function is defined for any real numbers $a > 0$ and $b > 0$.
The rest of this paper is organized as follows. Section 2 gives a technical lemma that is useful for subsequent calculations. Section 3 derives the rth moment of an IF random variable when r > 0 is an integer or a real number, and likewise derives the rth moments of the IF1 and IF3 random variables. Section 4 derives expressions for two popular entropies. Maximum likelihood estimation for (1) is considered in Section 5, and its finite sample performance is investigated in Section 6. Finally, conclusions are given in Section 7.

2. A Technical Lemma

In this section, a technical lemma is presented. The integral in the lemma arises in the common mathematical properties of (1).
Lemma 1. 
If $\frac{\delta+1}{b} > 0$ and $\gamma > \frac{\delta+1}{b}$, then
$$\int_{x_0}^{\infty}\left(\frac{x-x_0}{c}\right)^{\delta}G_p(x)^{-\gamma}\,dx = \frac{c}{b}\,(p+1)^{\frac{\gamma}{q}-\frac{\delta+1}{bq}}\,B\left(\gamma-\frac{\delta+1}{b},\,\frac{\delta+1}{b}\right).$$
Proof. 
Set $y = \left(\frac{x-x_0}{c}\right)^b$. Then, write
$$\int_{x_0}^{\infty}\left(\frac{x-x_0}{c}\right)^{\delta}G_p(x)^{-\gamma}\,dx = \frac{c}{b}\int_0^{\infty} y^{\frac{\delta+1}{b}-1}\left[(p+1)^{-1/q}+y\right]^{-\gamma}dy. \qquad (9)$$
Then set $z = \frac{(p+1)^{-1/q}}{(p+1)^{-1/q}+y}$. As such, the integral on the right hand side of (9) can be calculated as
$$\int_0^{\infty} y^{\frac{\delta+1}{b}-1}\left[(p+1)^{-1/q}+y\right]^{-\gamma}dy = (p+1)^{\frac{\gamma}{q}-\frac{\delta+1}{bq}}\int_0^1 z^{\gamma-\frac{\delta+1}{b}-1}(1-z)^{\frac{\delta+1}{b}-1}dz. \qquad (10)$$
The result follows by combining (9) and (10). □
The use of Lemma 1 is illustrated later in Section 3 and Section 4.
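Lemma 1 is also easy to check numerically. A small sketch (our own; the parameter values are arbitrary choices satisfying $\frac{\delta+1}{b} > 0$ and $\gamma > \frac{\delta+1}{b}$) compares a quadrature of the left hand side with the closed form on the right:

```python
import math

def beta(a, b):
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def lemma1_rhs(delta, gamma, p, b, c, q):
    # the closed form of Lemma 1
    t = (delta + 1.0) / b
    return (c / b) * (p + 1.0) ** (gamma / q - t / q) * beta(gamma - t, t)

def lemma1_lhs(delta, gamma, p, b, c, q, x0=0.5, T=400.0, n=100000):
    # trapezoidal approximation of the integral on [x0, x0 + T]
    g0 = (p + 1.0) ** (-1.0 / q)
    def f(x):
        return ((x - x0) / c) ** delta * (g0 + ((x - x0) / c) ** b) ** (-gamma)
    h = T / n
    s = 0.5 * (f(x0 + 1e-12) + f(x0 + T))
    s += sum(f(x0 + i * h) for i in range(1, n))
    return s * h
```

With $\delta = 1$, $\gamma = 4$, $p = 2$, $b = 1$, $c = 2$ and $q = 1.5$, the closed form simplifies to $3^{1/3}$, and the quadrature agrees.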

3. Moments of the IF Random Variable

Let X denote an IF random variable. Proposition 1 expresses the integer order moment of X as a finite sum of the Wright generalized hypergeometric function. Proposition 2 expresses the real order moment of $\frac{X-x_0}{c}$ as a single Wright generalized hypergeometric function.
Proposition 1. 
Let X denote an IF random variable. If $r > 0$ is an integer and $q > r/b$, then
$$E\left[X^r\right] = \Gamma(p+1)\,q\sum_{i=0}^{r}\binom{r}{i}x_0^i c^{r-i}\,\Gamma\left(\frac{r-i}{b}+1\right)(p+1)^{1-\frac{r-i}{bq}}\;{}_1\Psi_2\left[\begin{matrix}\left(q-\frac{r-i}{b},\,q\right)\\ (q+1,\,q),\,(p+1,\,-1)\end{matrix};-1\right]. \qquad (11)$$
The Wright generalized hypergeometric function in (11) converges for all admissible values of p, b, c, q and $x_0$ such that either $b > 0$, or $b < 0$ and $1 + r/b > 0$.
Proof. 
Applying the binomial expansion to the last term in the integrand of (3), write
$$I(p,b,q) = q\sum_{j=0}^{\infty}\binom{p}{j}\frac{(-1)^j}{(p+1)^j}\int_{(p+1)^{-1/q}}^{\infty} y^{-jq-q-1}\left(y-(p+1)^{-1/q}\right)^{\frac{r-i}{b}}dy. \qquad (12)$$
Substituting $z = (p+1)^{-1/q}/y$, rewrite (12) as
$$\begin{aligned}
I(p,b,q) &= q\sum_{j=0}^{\infty}\binom{p}{j}(-1)^j (p+1)^{1-\frac{r-i}{bq}}\int_0^1 z^{jq+q-\frac{r-i}{b}-1}(1-z)^{\frac{r-i}{b}}dz\\
&= q\sum_{j=0}^{\infty}\binom{p}{j}(-1)^j (p+1)^{1-\frac{r-i}{bq}}\,B\left(jq+q-\frac{r-i}{b},\,\frac{r-i}{b}+1\right)\\
&= q\,(p+1)^{1-\frac{r-i}{bq}}\sum_{j=0}^{\infty}\frac{\Gamma(p+1)}{j!\,\Gamma(p-j+1)}\,(-1)^j\,\frac{\Gamma\left(jq+q-\frac{r-i}{b}\right)\Gamma\left(\frac{r-i}{b}+1\right)}{\Gamma\left(jq+q+1\right)}\\
&= \Gamma(p+1)\,\Gamma\left(\frac{r-i}{b}+1\right) q\,(p+1)^{1-\frac{r-i}{bq}}\sum_{j=0}^{\infty}\frac{(-1)^j}{j!}\,\frac{\Gamma\left(jq+q-\frac{r-i}{b}\right)}{\Gamma\left(jq+q+1\right)\Gamma\left(p-j+1\right)}.
\end{aligned}$$
Equation (11) follows from the definition in (4). Note that (6) is satisfied and $\rho = 1$ in (7). The left hand side of (8) is $p+1+\frac{r-i}{b}$, which is positive for all i if either $b > 0$, or $b < 0$ and $p+1+r/b > 0$. □
Proposition 2. 
Let X denote an IF random variable. If $r > 0$ is real and $-1 < r/b < q$, then
$$E\left[\left(\frac{X-x_0}{c}\right)^{r}\right] = q\,\Gamma(p+1)\,\Gamma\left(\frac{r}{b}+1\right)(p+1)^{1-\frac{r}{bq}}\;{}_1\Psi_2\left[\begin{matrix}\left(q-\frac{r}{b},\,q\right)\\ (p+1,\,-1),\,(q+1,\,q)\end{matrix};-1\right]. \qquad (13)$$
The Wright generalized hypergeometric function in (13) converges for all admissible values of p, b, c, q and $x_0$ such that $1 + r/b > 0$.
Proof. 
Since $0 \le G_p(x)^{-q}/(p+1) \le 1$ for all x, expand (1) as
$$f_X(x) = \frac{bq\,\Gamma(p+1)}{c}\left(\frac{x-x_0}{c}\right)^{b-1}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,(p+1)^k\,\Gamma(p-k+1)}\,G_p(x)^{-qk-q-1} \qquad (14)$$
for $x_0 \le x < \infty$. As such, write
$$\begin{aligned}
E\left[\left(\frac{X-x_0}{c}\right)^{r}\right] &= \int_{x_0}^{\infty}\frac{bq\,\Gamma(p+1)}{c}\left(\frac{x-x_0}{c}\right)^{r+b-1}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,(p+1)^k\,\Gamma(p-k+1)}\,G_p(x)^{-qk-q-1}\,dx\\
&= \frac{bq\,\Gamma(p+1)}{c}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,(p+1)^k\,\Gamma(p-k+1)}\int_{x_0}^{\infty}\left(\frac{x-x_0}{c}\right)^{r+b-1}G_p(x)^{-qk-q-1}\,dx. \qquad (15)
\end{aligned}$$
Using Lemma 1 to calculate the integral in (15), write
$$\begin{aligned}
E\left[\left(\frac{X-x_0}{c}\right)^{r}\right] &= q\,\Gamma(p+1)\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,\Gamma(p-k+1)}\,(p+1)^{1-\frac{r}{bq}}\,B\left(qk+q-\frac{r}{b},\,\frac{r}{b}+1\right)\\
&= q\,\Gamma(p+1)\,\Gamma\left(\frac{r}{b}+1\right)(p+1)^{1-\frac{r}{bq}}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\,\frac{\Gamma\left(qk+q-\frac{r}{b}\right)}{\Gamma(p-k+1)\,\Gamma(qk+q+1)}.
\end{aligned}$$
Equation (13) follows from the definition in (4). Note that (6) is satisfied and $\rho = 1$ in (7). The left hand side of (8) is $p+1+r/b$, which is positive if $1+r/b > 0$. □
Now, let X denote an IF1 random variable. The rth moment of X when r > 0 is an integer can be obtained by setting p = 0 in (11). The rth moment of $\frac{X-x_0}{c}$ when r > 0 is a real number can be obtained by setting p = 0 in (13).
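In the IF1 case, the denominator pair coming from $\Gamma(p-k+1)$ has $p = 0$, so every series term with index $\ge 1$ vanishes (the reciprocal gamma function is zero at nonpositive integers) and the real order moment collapses to the closed form $E[((X-x_0)/c)^r] = \Gamma(r/b+1)\Gamma(q-r/b)/\Gamma(q)$, which is the familiar Burr XII moment. A quick numerical check (our own sketch; parameter values arbitrary, using the fact that at $p = 0$ the density in the $u = (x-x_0)/c$ scale is $bqu^{b-1}(1+u^b)^{-q-1}$):

```python
import math

def if1_moment(r, b, q):
    # closed form at p = 0: only the first term of the series survives
    return math.gamma(r / b + 1.0) * math.gamma(q - r / b) / math.gamma(q)

def if1_moment_numeric(r, b, q, T=500.0, n=100000):
    # E[((X - x0)/c)^r] by trapezoidal quadrature of the IF1 density
    def f(u):
        return u ** r * b * q * u ** (b - 1.0) * (1.0 + u ** b) ** (-q - 1.0)
    h = T / n
    s = 0.5 * (f(1e-12) + f(T))
    s += sum(f(i * h) for i in range(1, n))
    return s * h
```

For example, with $b = 1$ and $q = 2$, the first moment is 1 and the moment of order $r = 1/2$ is $\pi/4$.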
Proposition 3. 
Let X denote an IF1 random variable. If $r > 0$ is an integer and $q > r/b$, then
$$E\left[X^r\right] = q\sum_{i=0}^{r}\binom{r}{i}x_0^i c^{r-i}\,\Gamma\left(\frac{r-i}{b}+1\right)\;{}_1\Psi_2\left[\begin{matrix}\left(q-\frac{r-i}{b},\,q\right)\\ (q+1,\,q),\,(1,\,-1)\end{matrix};-1\right]. \qquad (16)$$
The Wright generalized hypergeometric function in (16) converges for all admissible values of b, c, q and $x_0$ such that either $b > 0$, or $b < 0$ and $1 + r/b > 0$.
Proof. (16) follows immediately from (11). □
Proposition 4. 
Let X denote an IF1 random variable. If $r > 0$ is real and $-1 < r/b < q$, then
$$E\left[\left(\frac{X-x_0}{c}\right)^{r}\right] = q\,\Gamma\left(\frac{r}{b}+1\right)\;{}_1\Psi_2\left[\begin{matrix}\left(q-\frac{r}{b},\,q\right)\\ (1,\,-1),\,(q+1,\,q)\end{matrix};-1\right]. \qquad (17)$$
The Wright generalized hypergeometric function in (17) converges for all admissible values of b, c, q and $x_0$ such that $1 + r/b > 0$.
Proof. (17) follows immediately from (13). □
Now, let X denote an IF3 random variable. The rth moment of X when r > 0 is an integer can be obtained by setting b = 1 in (11). The rth moment of $\frac{X-x_0}{c}$ when r > 0 is a real number can be obtained by setting b = 1 in (13).
Proposition 5. 
Let X denote an IF3 random variable. If $r > 0$ is an integer and $q > r$, then
$$E\left[X^r\right] = \Gamma(p+1)\,q\sum_{i=0}^{r}\binom{r}{i}x_0^i c^{r-i}\,\Gamma\left(r-i+1\right)(p+1)^{1-\frac{r-i}{q}}\;{}_1\Psi_2\left[\begin{matrix}\left(q-r+i,\,q\right)\\ (q+1,\,q),\,(p+1,\,-1)\end{matrix};-1\right]. \qquad (18)$$
The Wright generalized hypergeometric function in (18) converges for all admissible values of p, c, q and $x_0$.
Proof. (18) follows immediately from (11). □
Proposition 6. 
Let X denote an IF3 random variable. If $r > 0$ is real and $-1 < r < q$, then
$$E\left[\left(\frac{X-x_0}{c}\right)^{r}\right] = q\,\Gamma(p+1)\,\Gamma(r+1)\,(p+1)^{1-\frac{r}{q}}\;{}_1\Psi_2\left[\begin{matrix}\left(q-r,\,q\right)\\ (p+1,\,-1),\,(q+1,\,q)\end{matrix};-1\right]. \qquad (19)$$
The Wright generalized hypergeometric function in (19) converges for all admissible values of p, c, q and $x_0$.
Proof. (19) follows immediately from (13). □

4. Entropies

Two of the most popular entropies are the Shannon entropy [4] and the Rényi entropy [5], defined by
$$S(X) = -\int_{-\infty}^{\infty}\log\left[f_X(x)\right] f_X(x)\,dx \qquad (20)$$
and
$$R(X) = \frac{1}{1-\alpha}\log\left[\int_{-\infty}^{\infty} f_X(x)^{\alpha}\,dx\right], \qquad (21)$$
respectively, for $\alpha \ge 0$ and $\alpha \ne 1$. Propositions 7 and 8 derive explicit expressions for (21) and (20), respectively, when X is an IF random variable.
Proposition 7. 
Let X denote an IF random variable. If $-\alpha < \frac{1-\alpha}{b} < q\alpha$, then
$$\begin{aligned}
R(X) = {}& \log\left[\frac{c}{bq}\,(p+1)^{-1-\frac{1}{bq}}\right] + \frac{1}{1-\alpha}\log\Gamma\left(\alpha+\frac{1-\alpha}{b}\right)\\
& + \frac{1}{1-\alpha}\log\left\{q\,(p+1)\,\Gamma\left(p\alpha+1\right)\;{}_1\Psi_2\left[\begin{matrix}\left(q\alpha-\frac{1-\alpha}{b},\,q\right)\\ \left(p\alpha+1,\,-1\right),\,\left(q\alpha+\alpha,\,q\right)\end{matrix};-1\right]\right\}. \qquad (22)
\end{aligned}$$
The Wright generalized hypergeometric function in (22) converges for all admissible values of p, b, c, q and $x_0$ such that $p\alpha+\alpha+\frac{1-\alpha}{b} > 0$.
Proof. 
Write
$$f_X(x)^{\alpha} = \frac{b^{\alpha}q^{\alpha}}{c^{\alpha}}\left(\frac{x-x_0}{c}\right)^{b\alpha-\alpha}G_p(x)^{-q\alpha-\alpha}\left[1-\frac{1}{p+1}\,G_p(x)^{-q}\right]^{p\alpha}. \qquad (23)$$
Since $0 \le G_p(x)^{-q}/(p+1) \le 1$ for all x, expand (23) as
$$f_X(x)^{\alpha} = \frac{b^{\alpha}q^{\alpha}\,\Gamma(p\alpha+1)}{c^{\alpha}}\left(\frac{x-x_0}{c}\right)^{b\alpha-\alpha}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,(p+1)^k\,\Gamma(p\alpha-k+1)}\,G_p(x)^{-qk-q\alpha-\alpha}$$
for $x_0 \le x < \infty$. As such, using Lemma 1, we have
$$\begin{aligned}
\int_{x_0}^{\infty} f_X(x)^{\alpha}\,dx &= \frac{b^{\alpha}q^{\alpha}\,\Gamma(p\alpha+1)}{c^{\alpha}}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,(p+1)^k\,\Gamma(p\alpha-k+1)}\int_{x_0}^{\infty}\left(\frac{x-x_0}{c}\right)^{b\alpha-\alpha}G_p(x)^{-qk-q\alpha-\alpha}\,dx\\
&= \frac{b^{\alpha-1}q^{\alpha}\,\Gamma(p\alpha+1)}{c^{\alpha-1}}\,(p+1)^{\alpha+\frac{\alpha-1}{bq}}\sum_{k=0}^{\infty}\frac{(-1)^k}{k!\,\Gamma(p\alpha-k+1)}\,B\left(q\alpha+qk+\frac{\alpha-1}{b},\,\frac{b\alpha-\alpha+1}{b}\right)\\
&= \frac{b^{\alpha-1}q^{\alpha}\,\Gamma(p\alpha+1)}{c^{\alpha-1}}\,(p+1)^{\alpha+\frac{\alpha-1}{bq}}\,\Gamma\left(\alpha+\frac{1-\alpha}{b}\right)\sum_{k=0}^{\infty}\frac{(-1)^k}{k!}\,\frac{\Gamma\left(q\alpha+qk+\frac{\alpha-1}{b}\right)}{\Gamma(p\alpha-k+1)\,\Gamma(q\alpha+qk+\alpha)}.
\end{aligned}$$
Equation (22) follows from the definition in (4). Note that (6) is satisfied and $\rho = 1$ in (7). The left hand side of (8) is $p\alpha+\alpha+\frac{1-\alpha}{b}$, which is positive if $\alpha+\frac{1-\alpha}{b} > 0$. □
Proposition 8. 
Let X denote an IF random variable. Then,
$$S(X) = \log\left[\frac{c}{bq}\,(p+1)^{-1-\frac{1}{bq}}\right] - \left(1-\frac{1}{b}\right)\Gamma'(1) - \frac{d}{d\alpha}\left\{\log\left[q\,(p+1)\,\Gamma\left(p\alpha+1\right)\;{}_1\Psi_2\left[\begin{matrix}\left(q\alpha-\frac{1-\alpha}{b},\,q\right)\\ \left(p\alpha+1,\,-1\right),\,\left(q\alpha+\alpha,\,q\right)\end{matrix};-1\right]\right]\right\}\Bigg|_{\alpha=1}.$$
Proof. 
This proof follows from the fact that (20) is a particular case of (21) as α approaches 1. □
Equation (22) provides a way to quantify the uncertainty in a set of data and a flexible framework for capturing different aspects of information content. It can be used in addition to the variance as a measure of uncertainty. Equation (22) is a monotonic increasing function of c, the scale parameter, and is independent of $x_0$, the location parameter. The behavior of (22) with respect to the other parameters depends on the Wright generalized hypergeometric function.
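As a concrete check of (22): at $p = 0$ the Wright series truncates at its first term, giving $R(X) = \log\frac{c}{bq} + \frac{1}{1-\alpha}\log\Gamma\left(\alpha+\frac{1-\alpha}{b}\right) + \frac{1}{1-\alpha}\log\left[q\,\Gamma\left(q\alpha-\frac{1-\alpha}{b}\right)/\Gamma(q\alpha+\alpha)\right]$, which can be compared with a direct quadrature of (21). A sketch (our own; parameter values arbitrary):

```python
import math

def renyi_if1(alpha, b, c, q):
    # (22) specialised to p = 0; the Wright series truncates at its first term
    t = (1.0 - alpha) / b
    return (math.log(c / (b * q))
            + math.lgamma(alpha + t) / (1.0 - alpha)
            + math.log(q * math.gamma(q * alpha - t) / math.gamma(q * alpha + alpha))
            / (1.0 - alpha))

def renyi_if1_numeric(alpha, b, c, q, T=50.0, n=20000):
    # (1/(1 - alpha)) log of the integral of f^alpha, by trapezoidal rule
    def f(x):
        u = x / c
        return (b * q / c) * u ** (b - 1.0) * (1.0 + u ** b) ** (-q - 1.0)
    h = T / n
    s = 0.5 * (f(1e-12) ** alpha + f(T) ** alpha)
    s += sum(f(i * h) ** alpha for i in range(1, n))
    return math.log(s * h) / (1.0 - alpha)
```

For $b = c = q = 1$, the closed form further reduces to $-\log(2\alpha-1)/(1-\alpha)$.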

5. Estimation

Suppose $x_1, \ldots, x_n$ is a random sample from (1). In this section, the maximum likelihood estimation of $(p, b, c, q, x_0)$ is considered and the associated observed information matrix is derived. The log-likelihood function is
$$\log L\left(p,b,c,q,x_0\right) = n\log b + n\log q - nb\log c + (b-1)\sum_{i=1}^{n}\log\left(x_i-x_0\right) - (q+1)\sum_{i=1}^{n}\log G_p\left(x_i\right) + p\sum_{i=1}^{n}\log\left[1-\frac{G_p\left(x_i\right)^{-q}}{p+1}\right]. \qquad (24)$$
The partial derivatives of (24) with respect to the parameters are
$$\frac{\partial\log L}{\partial p} = -(q+1)\sum_{i=1}^{n}\frac{1}{G_p\left(x_i\right)}\frac{\partial G_p\left(x_i\right)}{\partial p} + \sum_{i=1}^{n}\log\left[1-\frac{G_p\left(x_i\right)^{-q}}{p+1}\right] + p\sum_{i=1}^{n}\left[1-\frac{G_p\left(x_i\right)^{-q}}{p+1}\right]^{-1}\frac{G_p\left(x_i\right)^{-q}}{p+1}\left[\frac{1}{p+1}+\frac{q}{G_p\left(x_i\right)}\frac{\partial G_p\left(x_i\right)}{\partial p}\right],$$
$$\frac{\partial\log L}{\partial b} = \frac{n\,\mathrm{sign}(b)}{b} - n\log c + \sum_{i=1}^{n}\log\left(x_i-x_0\right) - (q+1)\sum_{i=1}^{n}\frac{1}{G_p\left(x_i\right)}\frac{\partial G_p\left(x_i\right)}{\partial b} + \frac{pq}{p+1}\sum_{i=1}^{n}\left[1-\frac{G_p\left(x_i\right)^{-q}}{p+1}\right]^{-1}G_p\left(x_i\right)^{-q-1}\frac{\partial G_p\left(x_i\right)}{\partial b},$$
$$\frac{\partial\log L}{\partial c} = -\frac{nb}{c} - (q+1)\sum_{i=1}^{n}\frac{1}{G_p\left(x_i\right)}\frac{\partial G_p\left(x_i\right)}{\partial c} + \frac{pq}{p+1}\sum_{i=1}^{n}\left[1-\frac{G_p\left(x_i\right)^{-q}}{p+1}\right]^{-1}G_p\left(x_i\right)^{-q-1}\frac{\partial G_p\left(x_i\right)}{\partial c},$$
$$\frac{\partial\log L}{\partial q} = \frac{n}{q} - \sum_{i=1}^{n}\log G_p\left(x_i\right) - (q+1)\sum_{i=1}^{n}\frac{1}{G_p\left(x_i\right)}\frac{\partial G_p\left(x_i\right)}{\partial q} + \frac{p}{p+1}\sum_{i=1}^{n}\left[1-\frac{G_p\left(x_i\right)^{-q}}{p+1}\right]^{-1}G_p\left(x_i\right)^{-q}\left[\log G_p\left(x_i\right)+\frac{q}{G_p\left(x_i\right)}\frac{\partial G_p\left(x_i\right)}{\partial q}\right]$$
and
$$\frac{\partial\log L}{\partial x_0} = (1-b)\sum_{i=1}^{n}\frac{1}{x_i-x_0} - (q+1)\sum_{i=1}^{n}\frac{1}{G_p\left(x_i\right)}\frac{\partial G_p\left(x_i\right)}{\partial x_0} + \frac{pq}{p+1}\sum_{i=1}^{n}\left[1-\frac{G_p\left(x_i\right)^{-q}}{p+1}\right]^{-1}G_p\left(x_i\right)^{-q-1}\frac{\partial G_p\left(x_i\right)}{\partial x_0},$$
where $\frac{\partial G_p(x)}{\partial p} = -\frac{1}{q}(p+1)^{-\frac{1}{q}-1}$, $\frac{\partial G_p(x)}{\partial b} = \left(\frac{x-x_0}{c}\right)^b\log\left(\frac{x-x_0}{c}\right)$, $\frac{\partial G_p(x)}{\partial c} = -\frac{b}{c}\left(\frac{x-x_0}{c}\right)^b$, $\frac{\partial G_p(x)}{\partial q} = \frac{1}{q^2}(p+1)^{-\frac{1}{q}}\log(p+1)$ and $\frac{\partial G_p(x)}{\partial x_0} = -\frac{b}{c}\left(\frac{x-x_0}{c}\right)^{b-1}$. The maximum likelihood estimators of $(p,b,c,q,x_0)$, say $(\hat p,\hat b,\hat c,\hat q,\hat x_0)$, can be obtained as the simultaneous solutions of $\frac{\partial\log L}{\partial p} = 0$, $\frac{\partial\log L}{\partial b} = 0$, $\frac{\partial\log L}{\partial c} = 0$, $\frac{\partial\log L}{\partial q} = 0$ and $\frac{\partial\log L}{\partial x_0} = 0$. They can also be obtained by directly maximizing (24). In Section 6, the maximum likelihood estimates were obtained by directly maximizing (24). The optim function in the R software [6] was used for numerical maximization. optim was executed for a wide range of initial values. It did not converge for all of the initial values, but whenever it converged, the maximum likelihood estimates were unique.
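The score expressions above can be validated against finite differences. The following is a minimal sketch (our own, in Python rather than the R code used by the authors; it checks only $\partial\log L/\partial q$, and it assumes $b > 0$ so that $\log b$ can be used):

```python
import math

def G(x, p, b, c, q, x0):
    return (p + 1.0) ** (-1.0 / q) + ((x - x0) / c) ** b

def loglik(theta, xs):
    # the log-likelihood (24); assumes b > 0 and min(xs) > x0
    p, b, c, q, x0 = theta
    n = len(xs)
    s = n * math.log(b) + n * math.log(q) - n * b * math.log(c)
    for x in xs:
        g = G(x, p, b, c, q, x0)
        s += ((b - 1.0) * math.log(x - x0) - (q + 1.0) * math.log(g)
              + p * math.log(1.0 - g ** (-q) / (p + 1.0)))
    return s

def score_q(theta, xs):
    # the analytic partial derivative of (24) with respect to q
    p, b, c, q, x0 = theta
    s = len(xs) / q
    for x in xs:
        g = G(x, p, b, c, q, x0)
        gq = (p + 1.0) ** (-1.0 / q) * math.log(p + 1.0) / q ** 2  # dG/dq
        h = 1.0 - g ** (-q) / (p + 1.0)
        s += (-math.log(g) - (q + 1.0) * gq / g
              + (p / (p + 1.0)) * g ** (-q) * (math.log(g) + q * gq / g) / h)
    return s
```

Comparing `score_q` with a central difference of `loglik` at an arbitrary point is a cheap safeguard against transcription errors in the score functions.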
Confidence intervals and tests of hypotheses about $(p,b,c,q,x_0)$ can be based on the fact that $(\hat p,\hat b,\hat c,\hat q,\hat x_0)$ has an asymptotic normal distribution with mean $(p,b,c,q,x_0)$ and covariance matrix $I^{-1}\left(p,b,c,q,x_0\right)$, where $I\left(p,b,c,q,x_0\right)$ denotes the expected information matrix. For large n, $I\left(p,b,c,q,x_0\right)$ can be approximated by the observed information matrix $J\left(p,b,c,q,x_0\right)$. Standard calculations show that
$$J = \begin{pmatrix} J_{1,1} & J_{1,2} & J_{1,3} & J_{1,4} & J_{1,5}\\ J_{1,2} & J_{2,2} & J_{2,3} & J_{2,4} & J_{2,5}\\ J_{1,3} & J_{2,3} & J_{3,3} & J_{3,4} & J_{3,5}\\ J_{1,4} & J_{2,4} & J_{3,4} & J_{4,4} & J_{4,5}\\ J_{1,5} & J_{2,5} & J_{3,5} & J_{4,5} & J_{5,5} \end{pmatrix}, \qquad (25)$$
where
writing $G_i = G_p\left(x_i\right)$ and $h_i = 1-\frac{G_i^{-q}}{p+1}$ for brevity,
$$J_{1,1} = -(q+1)\sum_{i=1}^{n}\left[\frac{1}{G_i}\frac{\partial^2 G_i}{\partial p^2}-\left(\frac{1}{G_i}\frac{\partial G_i}{\partial p}\right)^2\right] + \frac{2}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q}\left[\frac{1}{p+1}+\frac{q}{G_i}\frac{\partial G_i}{\partial p}\right] - \frac{p}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q}\left[\frac{1}{p+1}+\frac{q}{G_i}\frac{\partial G_i}{\partial p}\right]^2 + \frac{p}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q}\left\{-\frac{2}{(p+1)^2}-\frac{2q}{p+1}\frac{1}{G_i}\frac{\partial G_i}{\partial p}+\frac{q}{G_i}\frac{\partial^2 G_i}{\partial p^2}-q(q+1)\left(\frac{1}{G_i}\frac{\partial G_i}{\partial p}\right)^2\right\},$$
$$J_{1,2} = (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial p}\frac{\partial G_i}{\partial b} + \frac{q}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial G_i}{\partial b} - \frac{pq}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-1}\frac{\partial G_i}{\partial b}\left[\frac{1}{p+1}+\frac{q}{G_i}\frac{\partial G_i}{\partial p}\right] - \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\left[\frac{1}{p+1}\frac{\partial G_i}{\partial b}+\frac{q+1}{G_i}\frac{\partial G_i}{\partial p}\frac{\partial G_i}{\partial b}\right],$$
$$J_{1,3} = (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial p}\frac{\partial G_i}{\partial c} + \frac{q}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial G_i}{\partial c} - \frac{pq}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-1}\frac{\partial G_i}{\partial c}\left[\frac{1}{p+1}+\frac{q}{G_i}\frac{\partial G_i}{\partial p}\right] - \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\left[\frac{1}{p+1}\frac{\partial G_i}{\partial c}+\frac{q+1}{G_i}\frac{\partial G_i}{\partial p}\frac{\partial G_i}{\partial c}\right],$$
$$J_{1,4} = -\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial G_i}{\partial p} + (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial p}\frac{\partial G_i}{\partial q} - (q+1)\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial^2 G_i}{\partial p\,\partial q} + \frac{1}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q}\left[\log G_i+\frac{q}{G_i}\frac{\partial G_i}{\partial q}\right] - \frac{p}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q}\left[\frac{1}{p+1}+\frac{q}{G_i}\frac{\partial G_i}{\partial p}\right]\left[\log G_i+\frac{q}{G_i}\frac{\partial G_i}{\partial q}\right] + \frac{p}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q}\left[-\frac{1}{p+1}\log G_i-\frac{q}{(p+1)G_i}\frac{\partial G_i}{\partial q}+\frac{1}{G_i}\frac{\partial G_i}{\partial p}+\frac{q}{G_i}\frac{\partial^2 G_i}{\partial p\,\partial q}-\frac{q}{G_i}\log G_i\,\frac{\partial G_i}{\partial p}-\frac{q(q+1)}{G_i^2}\frac{\partial G_i}{\partial p}\frac{\partial G_i}{\partial q}\right],$$
$$J_{1,5} = (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial p}\frac{\partial G_i}{\partial x_0} + \frac{q}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial G_i}{\partial x_0} - \frac{pq}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-1}\frac{\partial G_i}{\partial x_0}\left[\frac{1}{p+1}+\frac{q}{G_i}\frac{\partial G_i}{\partial p}\right] - \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\left[\frac{1}{p+1}\frac{\partial G_i}{\partial x_0}+\frac{q+1}{G_i}\frac{\partial G_i}{\partial p}\frac{\partial G_i}{\partial x_0}\right],$$
$$J_{2,2} = \frac{2n\,\delta(b)}{b} - \frac{n\left[\mathrm{sign}(b)\right]^2}{b^2} + (q+1)\sum_{i=1}^{n}\left(\frac{1}{G_i}\frac{\partial G_i}{\partial b}\right)^2 - (q+1)\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial^2 G_i}{\partial b^2} - \frac{pq^2}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-2}\left(\frac{\partial G_i}{\partial b}\right)^2 + \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\left[\frac{\partial^2 G_i}{\partial b^2}-\frac{q+1}{G_i}\left(\frac{\partial G_i}{\partial b}\right)^2\right],$$
$$J_{2,3} = -\frac{n}{c} + (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial c} - (q+1)\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial^2 G_i}{\partial b\,\partial c} - \frac{pq^2}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-2}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial c} + \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\left[\frac{\partial^2 G_i}{\partial b\,\partial c}-\frac{q+1}{G_i}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial c}\right],$$
$$J_{2,4} = -\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial G_i}{\partial b} + (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial q} + \frac{p}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial G_i}{\partial b} - \frac{pq^2}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-2}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial q} - \frac{pq(q+1)}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-2}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial q} + \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial^2 G_i}{\partial b\,\partial q},$$
$$J_{2,5} = \sum_{i=1}^{n}\frac{1}{x_0-x_i} + (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial x_0} - (q+1)\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial^2 G_i}{\partial b\,\partial x_0} - \frac{pq^2}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-2}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial x_0} + \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\left[\frac{\partial^2 G_i}{\partial b\,\partial x_0}-\frac{q+1}{G_i}\frac{\partial G_i}{\partial b}\frac{\partial G_i}{\partial x_0}\right],$$
$$J_{3,3} = \frac{nb}{c^2} + (q+1)\sum_{i=1}^{n}\left(\frac{1}{G_i}\frac{\partial G_i}{\partial c}\right)^2 - (q+1)\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial^2 G_i}{\partial c^2} - \frac{pq^2}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-2}\left(\frac{\partial G_i}{\partial c}\right)^2 - \frac{pq(q+1)}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-2}\left(\frac{\partial G_i}{\partial c}\right)^2 + \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial^2 G_i}{\partial c^2},$$
$$J_{3,4} = -\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial G_i}{\partial c} + (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial c}\frac{\partial G_i}{\partial q} + \frac{p}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial G_i}{\partial c} - \frac{pq}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-1}\frac{\partial G_i}{\partial c}\left[\log G_i+\frac{q}{G_i}\frac{\partial G_i}{\partial q}\right] - \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial G_i}{\partial c}\left[\log G_i+\frac{q+1}{G_i}\frac{\partial G_i}{\partial q}\right],$$
$$J_{3,5} = (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial c}\frac{\partial G_i}{\partial x_0} - (q+1)\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial^2 G_i}{\partial c\,\partial x_0} - \frac{pq^2}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-2}\frac{\partial G_i}{\partial c}\frac{\partial G_i}{\partial x_0} - \frac{pq(q+1)}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-2}\frac{\partial G_i}{\partial c}\frac{\partial G_i}{\partial x_0} + \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial^2 G_i}{\partial c\,\partial x_0},$$
$$J_{4,4} = -\frac{n}{q^2} - 2\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial G_i}{\partial q} + (q+1)\sum_{i=1}^{n}\left(\frac{1}{G_i}\frac{\partial G_i}{\partial q}\right)^2 - (q+1)\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial^2 G_i}{\partial q^2} - \frac{p}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q}\left[\log G_i+\frac{q}{G_i}\frac{\partial G_i}{\partial q}\right]^2 - \frac{p}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q}\left[\log G_i+\frac{q}{G_i}\frac{\partial G_i}{\partial q}\right]^2 + \frac{p}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q}\left\{\frac{2}{G_i}\frac{\partial G_i}{\partial q}-q\left(\frac{1}{G_i}\frac{\partial G_i}{\partial q}\right)^2+\frac{q}{G_i}\frac{\partial^2 G_i}{\partial q^2}\right\},$$
$$J_{4,5} = -\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial G_i}{\partial x_0} + (q+1)\sum_{i=1}^{n}\frac{1}{G_i^2}\frac{\partial G_i}{\partial q}\frac{\partial G_i}{\partial x_0} - \frac{pq}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-1}\frac{\partial G_i}{\partial x_0}\left[\log G_i+\frac{q}{G_i}\frac{\partial G_i}{\partial q}\right] - \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial G_i}{\partial x_0}\left[\log G_i+\frac{q}{G_i}\frac{\partial G_i}{\partial q}\right] + \frac{p}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q}\left[\frac{1}{G_i}\frac{\partial G_i}{\partial x_0}-\frac{q}{G_i^2}\frac{\partial G_i}{\partial q}\frac{\partial G_i}{\partial x_0}\right]$$
and
$$J_{5,5} = (1-b)\sum_{i=1}^{n}\frac{1}{\left(x_i-x_0\right)^2} + (q+1)\sum_{i=1}^{n}\left(\frac{1}{G_i}\frac{\partial G_i}{\partial x_0}\right)^2 - (q+1)\sum_{i=1}^{n}\frac{1}{G_i}\frac{\partial^2 G_i}{\partial x_0^2} - \frac{pq^2}{(p+1)^2}\sum_{i=1}^{n} h_i^{-2}G_i^{-2q-2}\left(\frac{\partial G_i}{\partial x_0}\right)^2 - \frac{pq(q+1)}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-2}\left(\frac{\partial G_i}{\partial x_0}\right)^2 + \frac{pq}{p+1}\sum_{i=1}^{n} h_i^{-1}G_i^{-q-1}\frac{\partial^2 G_i}{\partial x_0^2},$$
where
$$\frac{\partial^2 G_p(x)}{\partial p^2} = \frac{1}{q}\left(\frac{1}{q}+1\right)(p+1)^{-\frac{1}{q}-2}, \quad \frac{\partial^2 G_p(x)}{\partial p\,\partial q} = \frac{1}{q^2}(p+1)^{-\frac{1}{q}-1} - \frac{1}{q^3}(p+1)^{-\frac{1}{q}-1}\log(p+1), \quad \frac{\partial^2 G_p(x)}{\partial b\,\partial q} = 0,$$
$$\frac{\partial^2 G_p(x)}{\partial b^2} = \left(\frac{x-x_0}{c}\right)^b\left[\log\left(\frac{x-x_0}{c}\right)\right]^2, \quad \frac{\partial^2 G_p(x)}{\partial b\,\partial c} = -\frac{b}{c}\left(\frac{x-x_0}{c}\right)^b\log\left(\frac{x-x_0}{c}\right) - \frac{1}{c}\left(\frac{x-x_0}{c}\right)^b,$$
$$\frac{\partial^2 G_p(x)}{\partial b\,\partial x_0} = -\frac{b}{c}\left(\frac{x-x_0}{c}\right)^{b-1}\log\left(\frac{x-x_0}{c}\right) - \frac{1}{c}\left(\frac{x-x_0}{c}\right)^{b-1}, \quad \frac{\partial^2 G_p(x)}{\partial c^2} = \frac{b(b+1)}{c^2}\left(\frac{x-x_0}{c}\right)^b, \quad \frac{\partial^2 G_p(x)}{\partial c\,\partial x_0} = \frac{b^2}{c^2}\left(\frac{x-x_0}{c}\right)^{b-1},$$
$$\frac{\partial^2 G_p(x)}{\partial q^2} = \frac{1}{q^4}(p+1)^{-\frac{1}{q}}\left[\log(p+1)\right]^2 - \frac{2}{q^3}(p+1)^{-\frac{1}{q}}\log(p+1), \quad \frac{\partial^2 G_p(x)}{\partial x_0^2} = \frac{b(b-1)}{c^2}\left(\frac{x-x_0}{c}\right)^{b-2}.$$
In addition, $\delta(\cdot)$ denotes the Dirac delta function.
The $100(1-\alpha)$ percent confidence intervals for p, b, c, q and $x_0$ based on (25) are
$$\hat p \pm z_{1-\frac{\alpha}{2}}\sqrt{\hat J^{1,1}}, \quad \hat b \pm z_{1-\frac{\alpha}{2}}\sqrt{\hat J^{2,2}}, \quad \hat c \pm z_{1-\frac{\alpha}{2}}\sqrt{\hat J^{3,3}}, \quad \hat q \pm z_{1-\frac{\alpha}{2}}\sqrt{\hat J^{4,4}} \quad \text{and} \quad \hat x_0 \pm z_{1-\frac{\alpha}{2}}\sqrt{\hat J^{5,5}},$$
respectively, where $z_{1-\frac{\alpha}{2}}$ denotes the $100\left(1-\frac{\alpha}{2}\right)$ percentile of the standard normal distribution and $\hat J^{j,j}$, $j = 1, 2, \ldots, 5$, denotes the (j,j)th element of the inverse of J with $(p,b,c,q,x_0)$ replaced by $(\hat p,\hat b,\hat c,\hat q,\hat x_0)$.

6. Simulation Study

In this section, a simulation study is conducted to check the finite sample performance of p ^ , b ^ , c ^ , q ^ , x 0 ^ , which was detailed in Section 5. The finite sample performance is checked with respect to bias and the mean squared error. The following scheme was used:
(a)
Set p = 0 , b = 1 , c = 1 , q = 1 , x 0 = 0 and n = 20 ;
(b)
Simulate 10,000 random samples, each of size n, from (1) using the inverse method and the quantile function (detailed in Section 4.2 of [1]);
(c)
Fit (1) to each of the 10,000 samples by the method of maximum likelihood in Section 5, and let p ^ i , b ^ i , c ^ i , q ^ i , x 0 ^ i denote the maximum likelihood estimates for the ith sample;
(d)
Compute the biases as
$$\mathrm{Bias}\left(\hat e\right) = \frac{1}{10{,}000}\sum_{i=1}^{10{,}000}\left(\hat e_i - e\right)$$
for e = p , b , c , q , x 0 ;
(e)
Compute the mean squared errors as
$$\mathrm{MSE}\left(\hat e\right) = \frac{1}{10{,}000}\sum_{i=1}^{10{,}000}\left(\hat e_i - e\right)^2$$
for e = p , b , c , q , x 0 ;
(f)
Repeat steps (b) to (e) for n = 21, 22, …, 100.
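Step (b) relies on the quantile function of (1). The distribution function implied by (1) is $F(x) = \left[1-G_p(x)^{-q}/(p+1)\right]^{p+1}$ (differentiating this expression recovers (1)), so inversion is elementary. A sketch of the inverse-method sampler (our own illustration; Section 4.2 of [1] is the authoritative source for the quantile function):

```python
import math, random

def cdf(x, p, b, c, q, x0):
    g = (p + 1.0) ** (-1.0 / q) + ((x - x0) / c) ** b
    return (1.0 - g ** (-q) / (p + 1.0)) ** (p + 1.0)

def quantile(u, p, b, c, q, x0):
    # solve F(x) = u for x, with 0 <= u < 1
    g = ((p + 1.0) * (1.0 - u ** (1.0 / (p + 1.0)))) ** (-1.0 / q)
    return x0 + c * (g - (p + 1.0) ** (-1.0 / q)) ** (1.0 / b)

def sample(n, p, b, c, q, x0, seed=1):
    rng = random.Random(seed)
    return [quantile(rng.random(), p, b, c, q, x0) for _ in range(n)]
```

The round trip $F(Q(u)) = u$ holds for all $u \in [0, 1)$, which is a convenient deterministic check of the sampler.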
The biases are plotted in Figure 1. The mean squared errors are plotted in Figure 2. The numerical values of the biases and mean squared errors are given in Table 1 and Table 2.
With the exception of p, the biases approach zero as n approaches 100. The biases appear positive for p, c, q and $x_0$ and negative for b. In terms of magnitude, the biases are smallest for $x_0$ and largest for c and q. With the exception of p, the mean squared errors approach zero as n approaches 100. They are smallest for $x_0$ and largest for c and q. The biases and mean squared errors appear small enough for b, c, q and $x_0$ for n close to 100.
The observations noted are for the particular parameter values p = 0, b = 1, c = 1, q = 1, $x_0$ = 0. But the same observations hold for a wide range of other values of p, b, c, q and $x_0$. In particular, with the exception of p, the magnitude of the biases always decreased to zero with increasing n, and the mean squared errors always decreased to zero with increasing n. Hence, the maximum likelihood estimates $(\hat b, \hat c, \hat q, \hat x_0)$ of the interpolating family of distributions can be considered to behave according to the large sample theory of maximum likelihood estimation.

7. Conclusions

A family of distributions, which was proposed by [1] and referred to as the interpolating family of distributions, was studied. More general expressions for the moments of these distributions, as well as expressions for entropies, were derived. The maximum likelihood estimation of the distributions was considered, and the expressions for the score functions and the observed information matrix were derived. Simulations were performed to study the finite sample performance of the estimators. The simulations showed that the maximum likelihood estimator of p did not behave well. This may be overcome by using other estimation methods, including the method of probability weighted moments, biased corrected maximum likelihood estimation, the method of L moments, the method of trimmed L moments, the minimum distance estimation and methods based on entropies.
The most notable results in the paper are as follows: Propositions 1 and 2 expressing the moments of (1) in the most general cases; Proposition 7 expressing the Rényi entropy of (1) in the most general case; Section 5 detailing the explicit expressions for the observed information matrix.
According to ([7], page 371), a flexible family of distributions should have the following properties: versatility, tractability, interpretability, a data generating mechanism and straightforward parameter estimation. With respect to versatility, (1) can exhibit unimodal shapes (see Figure 2 in [1]). However, given that (1) has five parameters, one would like to see if multimodal shapes are possible. With respect to tractability, (1) takes an elementary form and so can be computed easily. The interpretability of the parameters in (1) was discussed in Section 2.2 of [1]; the parameters control location, scale, tail weight and shape, among other features. The quantile function corresponding to (1) takes an elementary form, as shown in Section 4.2 of [1], so data generation from (1) is straightforward. The maximum likelihood estimation for (1) has to be performed numerically (see Section 5). Simulation studies show that the maximum likelihood estimator of p does not behave well, even for large samples.

Author Contributions

All authors have contributed equally to the manuscript. Conceptualization, S.N. and I.E.O.; methodology, S.N. and I.E.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Both authors upheld the ‘Ethical Responsibilities of Authors’.

Informed Consent Statement

Both authors gave explicit consent to participate in this study. Both authors gave explicit consent to publish this manuscript.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors would like to thank the editor and the three referees for their careful reading and comments, as their contributions considerably improved this paper.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Sinner, C.; Dominicy, Y.; Trufin, J.; Waterschoot, W.; Weber, P.; Ley, C. From Pareto to Weibull—A constructive review of distributions on R+. Int. Stat. Rev. 2023, 91, 35–54. [Google Scholar] [CrossRef]
  2. Kilbas, A.A.; Srivastava, H.M.; Trujillo, J.J. Theory and Applications of Fractional Differential Equations; Elsevier: Amsterdam, The Netherlands, 2006. [Google Scholar]
  3. Wright, E.M. The asymptotic expansion of the generalized hypergeometric function. J. Lond. Math. Soc. 1935, 10, 286–293. [Google Scholar] [CrossRef]
  4. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–656. [Google Scholar] [CrossRef]
  5. Rényi, A. On measures of information and entropy. In Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability; Neyman, J., Ed.; University of California Press: Berkeley, CA, USA, 1960; Volume 1, pp. 547–561. [Google Scholar]
  6. R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2024. [Google Scholar]
  7. Ley, C.; Babic, S.; Craens, D. Flexible models for complex data with applications. Annu. Rev. Stat. Appl. 2021, 8, 369–391. [Google Scholar] [CrossRef]
Figure 1. Biases versus n = 20, 21, …, 100. The y axes are in log scale.
Figure 2. Mean squared errors versus n = 20, 21, …, 100. The y axes are in log scale.
Table 1. Biases of p ^ , b ^ , c ^ , q ^ and x 0 ^ .
| n | Bias for $\hat p$ | Bias for $\hat b$ | Bias for $\hat c$ | Bias for $\hat q$ | Bias for $\hat x_0$ |
|---|---|---|---|---|---|
| 20 | 0.080 | −0.169 | 0.192 | 0.251 | 0.051 |
| 21 | 0.077 | −0.155 | 0.203 | 0.211 | 0.047 |
| 22 | 0.070 | −0.164 | 1.308 | 1.494 | 0.045 |
| 23 | 0.073 | −0.159 | 0.133 | 0.183 | 0.040 |
| 24 | 0.078 | −0.163 | 2.497 | 4.725 | 0.042 |
| 25 | 0.072 | −0.145 | 7.497 | 2.826 | 0.039 |
| 26 | 0.072 | −0.154 | 0.196 | 0.202 | 0.035 |
| 27 | 0.066 | −0.130 | 0.173 | 0.183 | 0.036 |
| 28 | 0.067 | −0.144 | 0.260 | 0.259 | 0.034 |
| 29 | 0.061 | −0.141 | 0.091 | 0.137 | 0.035 |
| 30 | 0.066 | −0.122 | 0.115 | 0.152 | 0.033 |
| 31 | 0.066 | −0.122 | 4.587 | 3.811 | 0.030 |
| 32 | 0.075 | −0.135 | 0.112 | 0.157 | 0.032 |
| 33 | 0.074 | −0.126 | 0.150 | 0.164 | 0.029 |
| 34 | 0.111 | −0.120 | 0.140 | 0.160 | 0.028 |
| 35 | 0.066 | −0.111 | 0.833 | 0.246 | 0.028 |
| 36 | 0.071 | −0.118 | 0.175 | 0.161 | 0.028 |
| 37 | 0.055 | −0.102 | 0.094 | 0.114 | 0.027 |
| 38 | 0.058 | −0.108 | 0.558 | 0.166 | 0.028 |
| 39 | 0.062 | −0.106 | 0.157 | 0.166 | 0.024 |
| 40 | 0.059 | −0.110 | 0.172 | 0.130 | 0.025 |
| 41 | 0.058 | −0.105 | 0.127 | 0.132 | 0.024 |
| 42 | 0.058 | −0.110 | 0.120 | 0.130 | 0.022 |
| 43 | 0.053 | −0.093 | 0.543 | 0.181 | 0.022 |
| 44 | 0.061 | −0.098 | 0.166 | 0.134 | 0.023 |
| 45 | 0.053 | −0.089 | 0.144 | 0.111 | 0.021 |
| 46 | 0.055 | −0.093 | 0.103 | 0.114 | 0.022 |
| 47 | 0.141 | −0.086 | 0.128 | 0.112 | 0.021 |
| 48 | 0.060 | −0.089 | 0.119 | 0.113 | 0.020 |
| 49 | 0.057 | −0.089 | 0.237 | 0.134 | 0.021 |
| 50 | 0.067 | −0.091 | 0.199 | 0.127 | 0.019 |
| 51 | 0.056 | −0.084 | 0.076 | 0.090 | 0.018 |
| 52 | 0.053 | −0.080 | 0.061 | 0.080 | 0.019 |
| 53 | 0.056 | −0.082 | 0.053 | 0.082 | 0.017 |
| 54 | 0.061 | −0.081 | 0.101 | 0.109 | 0.018 |
| 55 | 0.060 | −0.085 | 0.096 | 0.097 | 0.017 |
| 56 | 0.061 | −0.090 | 0.070 | 0.093 | 0.017 |
| 57 | 0.057 | −0.081 | 0.094 | 0.090 | 0.018 |
| 58 | 0.078 | −0.084 | 0.090 | 0.097 | 0.017 |
| 59 | 0.068 | −0.076 | 0.070 | 0.087 | 0.017 |
| 60 | 0.050 | −0.081 | 0.299 | 0.140 | 0.017 |
| 61 | 0.055 | −0.071 | 0.083 | 0.079 | 0.015 |
| 62 | 0.061 | −0.081 | 0.112 | 0.097 | 0.016 |
| 63 | 0.052 | −0.073 | 0.337 | 0.153 | 0.015 |
| 64 | 0.057 | −0.066 | 0.091 | 0.088 | 0.016 |
| 65 | 0.194 | −0.070 | 0.158 | 0.115 | 0.015 |
| 66 | 0.063 | −0.068 | 0.067 | 0.076 | 0.014 |
| 67 | 0.172 | −0.067 | 0.082 | 0.088 | 0.015 |
| 68 | 0.048 | −0.068 | 0.072 | 0.070 | 0.014 |
| 69 | 0.052 | −0.067 | 0.062 | 0.070 | 0.013 |
| 70 | 0.057 | −0.072 | 0.105 | 0.097 | 0.014 |
| 71 | 0.075 | −0.074 | 0.087 | 0.090 | 0.013 |
| 72 | 0.055 | −0.067 | 0.074 | 0.077 | 0.014 |
| 73 | 0.060 | −0.066 | 0.102 | 0.080 | 0.013 |
| 74 | 0.050 | −0.068 | 0.067 | 0.076 | 0.013 |
| 75 | 0.053 | −0.069 | 0.071 | 0.075 | 0.013 |
| 76 | 0.080 | −0.067 | 0.148 | 0.091 | 0.012 |
| 77 | 0.075 | −0.067 | 0.085 | 0.081 | 0.012 |
| 78 | 0.052 | −0.059 | 0.055 | 0.071 | 0.012 |
| 79 | 0.050 | −0.064 | 0.077 | 0.083 | 0.012 |
| 80 | 0.062 | −0.071 | 0.077 | 0.090 | 0.012 |
| 81 | 0.049 | −0.063 | 0.165 | 0.089 | 0.012 |
| 82 | 0.060 | −0.062 | 0.247 | 0.101 | 0.012 |
| 83 | 0.058 | −0.062 | 0.080 | 0.078 | 0.012 |
| 84 | 0.063 | −0.064 | 0.060 | 0.067 | 0.011 |
| 85 | 0.060 | −0.061 | 0.058 | 0.073 | 0.012 |
| 86 | 0.047 | −0.060 | 0.071 | 0.075 | 0.011 |
| 87 | 0.056 | −0.057 | 0.345 | 0.089 | 0.011 |
| 88 | 0.078 | −0.063 | 0.078 | 0.079 | 0.011 |
| 89 | 0.070 | −0.058 | 0.072 | 0.071 | 0.011 |
| 90 | 0.044 | −0.064 | 0.076 | 0.068 | 0.010 |
| 91 | 0.940 | −0.057 | 0.433 | 0.086 | 0.011 |
| 92 | 0.104 | −0.056 | 0.116 | 0.073 | 0.010 |
| 93 | 0.054 | −0.053 | 0.205 | 0.076 | 0.010 |
| 94 | 0.057 | −0.053 | 0.085 | 0.070 | 0.010 |
| 95 | 0.135 | −0.062 | 0.161 | 0.081 | 0.010 |
| 96 | 0.056 | −0.056 | 0.055 | 0.071 | 0.010 |
| 97 | 0.049 | −0.053 | 0.056 | 0.064 | 0.009 |
| 98 | 0.067 | −0.065 | 0.122 | 0.097 | 0.009 |
| 99 | 0.052 | −0.053 | 0.057 | 0.063 | 0.010 |
| 100 | 0.110 | −0.058 | 0.099 | 0.085 | 0.009 |
Table 2. Mean squared errors of p̂, b̂, ĉ, q̂ and x̂0.

n     MSE for p̂   MSE for b̂   MSE for ĉ    MSE for q̂   MSE for x̂0
20    0.029       0.079       1.514        0.758       0.005
21    0.067       0.079       1.934        0.404       0.005
22    0.015       0.073       1223.421     1613.096    0.004
23    0.021       0.068       1.575        0.317       0.003
24    0.023       0.073       6.000        2.000       0.004
25    0.020       0.062       54,678.407   7090.633    0.003
26    0.017       0.065       2.425        0.647       0.003
27    0.021       0.066       8.314        1.291       0.003
28    0.018       0.062       12.663       5.791       0.002
29    0.012       0.058       0.309        0.129       0.003
30    0.019       0.053       0.596        0.276       0.002
31    0.018       0.051       2.000        2.000       0.002
32    0.102       0.053       0.321        0.160       0.002
33    0.094       0.055       0.849        0.198       0.002
34    1.597       0.050       1.045        0.240       0.002
35    0.081       0.050       470.452      8.305       0.002
36    0.119       0.046       4.209        0.510       0.002
37    0.012       0.043       0.225        0.095       0.002
38    0.016       0.037       188.949      0.990       0.002
39    0.016       0.046       1.936        0.674       0.001
40    0.014       0.038       5.865        0.366       0.001
41    0.012       0.040       1.112        0.253       0.001
42    0.029       0.034       0.406        0.147       0.001
43    0.011       0.035       89.766       1.348       0.001
44    0.021       0.033       4.228        0.207       0.001
45    0.011       0.037       3.408        0.173       0.001
46    0.013       0.033       0.814        0.133       0.001
47    7.915       0.036       2.364        0.156       0.001
48    0.023       0.031       0.600        0.146       0.001
49    0.021       0.031       19.995       0.978       0.001
50    0.187       0.032       4.219        0.315       0.001
51    0.022       0.027       0.199        0.081       0.001
52    0.012       0.027       0.115        0.057       0.001
53    0.015       0.029       0.079        0.045       0.001
54    0.092       0.022       0.337        0.097       0.001
55    0.031       0.027       0.402        0.106       0.001
56    0.026       0.028       0.151        0.064       0.001
57    0.019       0.025       0.518        0.147       0.001
58    0.301       0.024       0.334        0.107       0.001
59    0.175       0.025       0.186        0.069       0.001
60    0.015       0.021       54.090       2.589       0.001
61    0.029       0.022       0.322        0.066       0.000
62    0.122       0.025       1.277        0.121       0.001
63    0.026       0.022       65.419       3.676       0.000
64    0.165       0.024       0.226        0.085       0.001
65    7.539       0.022       3.725        0.178       0.000
66    0.156       0.021       0.123        0.052       0.000
67    11.188      0.022       0.399        0.097       0.000
68    0.013       0.021       0.120        0.043       0.000
69    0.016       0.022       0.143        0.047       0.000
70    0.058       0.021       0.371        0.109       0.000
71    0.437       0.018       0.228        0.079       0.000
72    0.041       0.018       0.119        0.045       0.000
73    0.213       0.019       0.587        0.114       0.000
74    0.016       0.017       0.139        0.053       0.000
75    0.024       0.018       0.169        0.049       0.000
76    0.637       0.017       3.585        0.189       0.000
77    0.264       0.018       0.299        0.085       0.000
78    0.039       0.019       0.138        0.054       0.000
79    0.020       0.015       0.233        0.070       0.000
80    0.058       0.017       0.153        0.062       0.000
81    0.014       0.018       8.413        0.316       0.000
82    0.105       0.017       19.012       0.214       0.000
83    0.038       0.015       0.389        0.072       0.000
84    0.176       0.016       0.062        0.040       0.000
85    0.176       0.015       0.104        0.048       0.000
86    0.013       0.015       0.165        0.050       0.000
87    0.029       0.016       74.413       0.456       0.000
88    0.446       0.016       0.144        0.061       0.000
89    0.547       0.014       0.126        0.058       0.000
90    0.012       0.014       0.297        0.052       0.000
91    759.774     0.015       0.402        0.213       0.000
92    1.339       0.016       2.688        0.100       0.000
93    0.064       0.014       2.236        0.226       0.000
94    0.053       0.016       0.179        0.066       0.000
95    3.862       0.015       7.381        0.173       0.000
96    0.077       0.014       0.083        0.043       0.000
97    0.020       0.013       0.102        0.034       0.000
98    0.216       0.014       0.542        0.089       0.000
99    0.029       0.013       0.119        0.044       0.000
100   1.882       0.014       0.331        0.099       0.000
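The biases and mean squared errors tabulated above are Monte Carlo summaries of repeated maximum likelihood fits: for each sample size n, many samples are drawn, the model is refitted to each, and the bias and MSE of each estimate are averaged over replications. The sketch below illustrates that scheme in minimal form. It is not the paper's simulation code: fitting the five-parameter IF density requires numerical optimization, so an exponential rate parameter (whose MLE, 1/x̄, has a closed form) stands in as a hypothetical example.

```python
import numpy as np

rng = np.random.default_rng(0)

def bias_mse(n, true_rate=2.0, reps=10_000):
    """Monte Carlo bias and MSE of the MLE of an exponential rate.

    Illustrative stand-in: the paper's tables use the IF distribution,
    whose MLEs must be obtained by numerical optimization.
    """
    # reps independent samples of size n from Exp(rate)
    samples = rng.exponential(scale=1.0 / true_rate, size=(reps, n))
    mle = 1.0 / samples.mean(axis=1)  # closed-form MLE of the rate
    bias = mle.mean() - true_rate
    mse = np.mean((mle - true_rate) ** 2)
    return bias, mse

bias20, mse20 = bias_mse(20)
bias100, mse100 = bias_mse(100)
```

As in Tables 1 and 2, both the magnitude of the bias and the MSE shrink as n grows from 20 to 100, consistent with consistency and asymptotic efficiency of maximum likelihood estimates.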

