Article

Bias-Correction Methods for the Unit Exponential Distribution and Applications

by Hua Xin 1, Yuhlong Lio 2, Ya-Yen Fan 3 and Tzong-Ru Tsai 3,*
1 School of Mathematics and Statistics, Northeast Petroleum University, Daqing 163318, China
2 Department of Mathematical Sciences, University of South Dakota, Vermillion, SD 57069, USA
3 Department of Statistics, Tamkang University, Tamsui District, New Taipei City 251301, Taiwan
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(12), 1828; https://doi.org/10.3390/math12121828
Submission received: 13 May 2024 / Revised: 6 June 2024 / Accepted: 11 June 2024 / Published: 12 June 2024
(This article belongs to the Special Issue Fuzzy Applications in Industrial Engineering, 3rd Edition)

Abstract: The bias of the maximum likelihood estimator can cause a considerable estimation error if the sample size is small. To reduce the bias of the maximum likelihood estimator under the small sample situation, the maximum likelihood and parametric bootstrap bias-correction methods are proposed in this study to obtain more reliable maximum likelihood estimators of the unit exponential distribution parameters. The procedure to implement the bias-corrected maximum likelihood estimation method is derived analytically, and the steps to obtain the bias-corrected bootstrap estimators are presented. The simulation results show that the proposed maximum likelihood bootstrap bias-correction method can significantly reduce the bias and mean squared error of the maximum likelihood estimators for most of the parameter combinations in the simulation study. A soil moisture data set and a numerical example are used for illustration.

1. Introduction

In many instances, we must model proportional variables. Common cases include making inferences about the proportion of successes, party vote shares, consumption proportions, and attendance at specific public events. Applications can be found in social studies, economics, healthcare, engineering, and more.

1.1. Literature Review

The popular modeling process for a proportional sample involves considering a random variable over a range between 0 and 1. Thus, the Beta distribution proposed by Bayes [1] has been widely used to make inferences in applications of proportional sample modeling, for example, by Fleiss et al. [2], Gilchrist [3], and Seber [4]. However, the Beta distribution is not suitable for all applications. Besides the Beta distribution, many other distributions have been suggested in the literature. Leipnik [5] proposed an approximate distribution of the serial correlation coefficient in a circularly correlated universe. Johnson [6] suggested the unit Johnson distribution. Topp and Leone [7] studied a family of J-shaped cumulative frequency functions. Consul and Jain [8] proposed the unit gamma distribution. Kumaraswamy [9] considered a generalized distribution for double-bounded variables. Jørgensen [10] studied a subclass of dispersion models, the proper four-parameter dispersion models.
In recent decades, many works have studied unit interval distribution functions. Smithson and Shou [11] worked on the cumulative distribution function (CDF) and quantile distributions over the unit interval. Altun and Hamedani [12] studied the log-xgamma distribution. Nakamura et al. [13] proposed a new unit interval distribution. Ghitany et al. [14] suggested the unit-inverse Gaussian distribution. Moreover, the unit Gompertz, unit Lindley, and unit Weibull distributions have been studied by Mazucheli et al. [15,16,17]. Altun [18] probed the log-weighted exponential distribution. Gündüz et al. [19] investigated the unit Johnson $S_U$ distribution. Biswas and Chakraborty [20] constructed absolutely continuous distributions over the unit interval: starting from two absolutely continuous random variables with non-negative support, they generated, through the convolution concept, the conditional distribution of one of them given that their sum equals one. Afify et al. [21] proposed another new unit distribution. Krishna et al. [22] studied the unit Teissier distribution. Korkmaz and Korkmaz [23] worked on the unit log–log distribution. Fayomi et al. [24] suggested the unit–power Burr X distribution. Bakouch et al. [25] proposed a new unit exponential distribution and noted that the aforementioned approaches are mainly based on conventional strategies, including (1) log transformation approaches, (2) the CDF and quantile methodology, (3) reciprocal transformation, (4) exponential transformation, (5) the conditional distribution methodology, and (6) the T-X family approach. Bakouch et al. [25] used the epsilon function,
$$\epsilon_{\lambda,a}(x) = \left(\frac{a+x}{a-x}\right)^{\frac{\lambda a}{2}}, \quad x \in (-a, a); \qquad \epsilon_{\lambda,a}(x) = 0 \text{ otherwise},$$
where $\lambda \in \mathbb{R}^{+}$ and $a > 0$. Dombi et al. [26] provided a detailed procedure for establishing the epsilon probability distribution.
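As a quick consistency check, with $a = 1$ and $\lambda = 2\beta$ the epsilon function reduces to $((1+x)/(1-x))^{\beta}$, which is exactly the building block of the UED's CDF below. A minimal sketch (plain Python with illustrative values; not the authors' code):

```python
import math

def epsilon(x, lam, a):
    """Epsilon function: ((a+x)/(a-x))^(lam*a/2) on (-a, a), and 0 otherwise."""
    if -a < x < a:
        return ((a + x) / (a - x)) ** (lam * a / 2.0)
    return 0.0

# With a = 1 and lam = 2*beta, epsilon equals ((1+x)/(1-x))**beta.
beta, x = 1.51, 0.3
assert math.isclose(epsilon(x, 2.0 * beta, 1.0), ((1.0 + x) / (1.0 - x)) ** beta)
assert epsilon(1.5, 2.0 * beta, 1.0) == 0.0  # outside (-1, 1) the function is 0
```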
The unit exponential distribution proposed by Bakouch et al. [25] is more flexible for modeling than the aforementioned ones and can exhibit either negative or positive skewness. They also showed that the proposed unit exponential distribution has an increasing failure rate or belongs to the decreasing mean residual life class if λ > 0 . They also proposed the maximum likelihood estimation process to obtain the estimators of model parameters.

1.2. The Motivation and Organization

In some situations, the unit exponential distribution proposed by Bakouch et al. [25] must be used with a small-sized sample for statistical inference, so it is important to know how to reduce the estimation bias in this case. In this study, we propose an analytic bias-corrected maximum likelihood estimation method to reduce the estimation bias. Random samples are then generated from the unit exponential distribution to verify the performance of the proposed bias-corrected maximum likelihood estimation method. Moreover, a bias-corrected parametric bootstrap estimation procedure is proposed to compete with the bias-corrected maximum likelihood estimation method. This second bias-correction procedure relies on heavy computation rather than on the derivation of a mathematical approximation. This study aims to provide feasible and simple methods to reduce the bias and mean squared error of the maximum likelihood estimators of the unit exponential distribution when the sample size is small. The proposed methods are illustrated with one numerical example and one soil moisture example to demonstrate the applications.
In summary, three contributions are included in this study to reduce the bias of the maximum likelihood estimator of the unit exponential distribution parameters:
  • An analytical procedure of the maximum likelihood bias-correction method is proposed.
  • The implementation of the parametric bootstrap bias-correction method is proposed.
  • The performance of the two proposed bias-correction methods is studied using Monte Carlo simulations. We find that the proposed maximum likelihood bias-correction method is more competitive than the typical maximum likelihood estimation and bootstrap bias-correction methods.
There are six sections in this article. The unit exponential distribution and the maximum likelihood estimation procedure are briefly reviewed in Section 2. Section 3 addresses the detailed derivation of the proposed bias-corrected maximum likelihood estimation method and the bias-corrected parametric bootstrap procedure. In Section 4, Monte Carlo simulations are conducted to evaluate the performance of both bias-corrected estimation methods. Two examples are presented in Section 5 to demonstrate the application of both estimation methods. Some concluding remarks are made in Section 6.

2. The Unit Exponential Distribution and Maximum Likelihood Estimation

Let $X$ be a bounded random variable having support over 0 to 1, and let $\Theta^T = (\theta_1, \theta_2) = (\alpha, \beta)$. The cumulative distribution function (CDF) of the unit exponential distribution can be defined by
$$F(x \mid \Theta) = 1 - \exp\left\{\alpha\left[1 - \left(\frac{1+x}{1-x}\right)^{\beta}\right]\right\} \quad (2)$$
if $0 \le x < 1$, and $F(x \mid \Theta) = 1$ if $x = 1$, where $\alpha$ and $\beta$ are positive unknown parameters. Denote the distribution in Equation (2) by $X \sim \mathrm{UED}(\alpha, \beta)$. In terms of the epsilon function, $F(x \mid \Theta)$ in Equation (2) can be represented by
$$F(x \mid \Theta) = 1 - \exp\left\{\alpha\left[1 - \epsilon_{2\beta,1}(x)\right]\right\}.$$
Let $f(x \mid \Theta) = dF(x \mid \Theta)/dx$; the probability density function (PDF) of $\mathrm{UED}(\alpha, \beta)$ can be obtained as
$$f(x \mid \Theta) = \frac{2\alpha\beta}{1-x^2}\left(\frac{1+x}{1-x}\right)^{\beta} \exp\left\{\alpha\left[1 - \left(\frac{1+x}{1-x}\right)^{\beta}\right]\right\}, \quad 0 < x < 1.$$
Let $x_p$ be the $p$th quantile of $\mathrm{UED}(\alpha, \beta)$, $0 < p < 1$. Utilizing the CDF in Equation (2), one can set $p = F(x_p \mid \Theta)$. Then, $x_p$ can be presented by
$$x_p = \frac{\left[1 - \frac{1}{\alpha}\log(1-p)\right]^{1/\beta} - 1}{\left[1 - \frac{1}{\alpha}\log(1-p)\right]^{1/\beta} + 1}, \quad 0 < p < 1,$$
because the support of $X$ is a subset of the positive reals. Using the formula $m_r = E(X^r) = r\int_0^1 x^{r-1}\{1 - F(x)\}\,dx$, the $r$th moment of $\mathrm{UED}(\alpha, \beta)$ can be presented by
$$m_r = r e^{\alpha} \int_0^1 x^{r-1} \exp\left\{-\alpha\left(\frac{1+x}{1-x}\right)^{\beta}\right\} dx, \quad r = 1, 2, \ldots \quad (6)$$
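As a numerical sanity check, the CDF and the quantile formula above invert each other, which also yields a simple inverse-transform sampler for the UED. The sketch below is plain Python with illustrative parameter values (0.44, 1.51, taken from Set I of the later simulations); it is not the authors' code:

```python
import math
import random

def ued_cdf(x, alpha, beta):
    """CDF of UED(alpha, beta): F(x) = 1 - exp{alpha[1 - ((1+x)/(1-x))^beta]}."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 1.0 - math.exp(alpha * (1.0 - ((1.0 + x) / (1.0 - x)) ** beta))

def ued_quantile(p, alpha, beta):
    """pth quantile x_p, obtained by inverting the CDF."""
    t = (1.0 - math.log(1.0 - p) / alpha) ** (1.0 / beta)
    return (t - 1.0) / (t + 1.0)

# The quantile function inverts the CDF ...
alpha, beta = 0.44, 1.51
p = 0.75
x_p = ued_quantile(p, alpha, beta)
assert abs(ued_cdf(x_p, alpha, beta) - p) < 1e-12

# ... so uniform draws give an inverse-transform sample from the UED.
rng = random.Random(1)
sample = [ued_quantile(rng.random(), alpha, beta) for _ in range(5)]
```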
Let $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ denote a random sample of size $n$ taken from $\mathrm{UED}(\alpha, \beta)$. The maximum likelihood estimation procedure proposed by Bakouch et al. [25] is briefly addressed as follows. The likelihood function is given by
$$L(\Theta \mid \mathbf{x}) = \frac{2^n \alpha^n \beta^n}{\prod_{i=1}^{n}(1 - x_i^2)} \prod_{i=1}^{n}\left(\frac{1+x_i}{1-x_i}\right)^{\beta} \exp\left\{\sum_{i=1}^{n} \alpha\left[1 - \left(\frac{1+x_i}{1-x_i}\right)^{\beta}\right]\right\}.$$
The log-likelihood function $\ell \equiv \ln L(\Theta \mid \mathbf{x})$ can be expressed as
$$\ell = n \ln(2\alpha\beta) + (\beta - 1)\sum_{i=1}^{n} \ln(1 + x_i) - (\beta + 1)\sum_{i=1}^{n} \ln(1 - x_i) + \alpha \sum_{i=1}^{n}\left[1 - \left(\frac{1+x_i}{1-x_i}\right)^{\beta}\right].$$
Denote the first, second, and third derivatives of the log-likelihood function $\ell = \ln L(\Theta \mid \mathbf{x})$ with respect to $\theta_1$ and $\theta_2$ by
$$\ell_i = \frac{\partial \ell}{\partial \theta_i}, \qquad \ell_{ij} = \frac{\partial^2 \ell}{\partial \theta_i \partial \theta_j}, \qquad \ell_{ijk} = \frac{\partial^3 \ell}{\partial \theta_i \partial \theta_j \partial \theta_k}, \quad i, j, k = 1, 2,$$
and denote their mathematical expectations by
$$\eta_i = E(\ell_i), \qquad \eta_{ij} = E(\ell_{ij}), \qquad \eta_{ijk} = E(\ell_{ijk}), \quad i, j, k = 1, 2.$$
We can obtain the first derivatives of $\ell$ as
$$\ell_1 = \frac{\partial \ell}{\partial \alpha} = \frac{n}{\alpha} + \sum_{i=1}^{n}\left[1 - \left(\frac{1+x_i}{1-x_i}\right)^{\beta}\right]$$
and
$$\ell_2 = \frac{\partial \ell}{\partial \beta} = \frac{n}{\beta} + \sum_{i=1}^{n} \ln\left(\frac{1+x_i}{1-x_i}\right) - \alpha \sum_{i=1}^{n}\left(\frac{1+x_i}{1-x_i}\right)^{\beta} \ln\left(\frac{1+x_i}{1-x_i}\right).$$
Setting $\ell_1 = 0$, we can represent $\alpha$ as
$$\alpha = \left\{\frac{1}{n}\sum_{i=1}^{n}\left[\left(\frac{1+x_i}{1-x_i}\right)^{\beta} - 1\right]\right\}^{-1}. \quad (11)$$
Plugging $\alpha$ of Equation (11) into $\ell_2 = 0$, we can solve the resulting nonlinear equation for the maximum likelihood estimate (MLE) of $\beta$, denoted by $\hat{\beta}$, using the Newton–Raphson algorithm. The MLE of $\alpha$, denoted by $\hat{\alpha}$, can then be obtained by replacing $\beta$ with $\hat{\beta}$ in Equation (11).
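The estimation route just described, profiling $\alpha$ out via Equation (11) and then solving the remaining score equation in $\beta$ numerically, can be sketched as follows. For brevity and robustness, the sketch brackets the root and uses bisection instead of Newton–Raphson; this substitution and the illustrative data-generating step are assumptions of the sketch, not the authors' implementation:

```python
import math
import random

def ued_mles(xs, lo=1e-3, hi=50.0, iters=200):
    """MLEs (alpha_hat, beta_hat) of the UED from data xs in (0, 1).

    alpha is profiled out via Equation (11); the score equation for beta
    is then solved by bisection (the paper uses Newton-Raphson instead).
    Assumes the root of the profile score lies in [lo, hi].
    """
    n = len(xs)
    ys = [(1.0 + x) / (1.0 - x) for x in xs]
    lys = [math.log(y) for y in ys]

    def alpha_of(beta):
        # Equation (11): alpha = n / sum(y_i^beta - 1)
        return n / sum(y ** beta - 1.0 for y in ys)

    def score(beta):
        # l_2 with alpha replaced by alpha_of(beta)
        a = alpha_of(beta)
        return (n / beta + sum(lys)
                - a * sum(y ** beta * ly for y, ly in zip(ys, lys)))

    for _ in range(iters):  # bisection on the profile score
        mid = 0.5 * (lo + hi)
        if score(lo) * score(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    beta_hat = 0.5 * (lo + hi)
    return alpha_of(beta_hat), beta_hat

# Illustration on a sample drawn from UED(1, 1) via its quantile function.
rng = random.Random(7)
def ued_q(p):  # quantile of UED(1, 1)
    t = 1.0 - math.log(1.0 - p)
    return (t - 1.0) / (t + 1.0)
data = [ued_q(rng.random()) for _ in range(200)]
alpha_hat, beta_hat = ued_mles(data)
```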
The second derivatives of $\ell$ can be obtained as
$$\ell_{11} = \frac{\partial^2 \ell}{\partial \alpha^2} = -\frac{n}{\alpha^2},$$
$$\ell_{22} = \frac{\partial^2 \ell}{\partial \beta^2} = -\frac{n}{\beta^2} - \alpha \sum_{i=1}^{n}\left(\frac{1+x_i}{1-x_i}\right)^{\beta}\left[\ln\left(\frac{1+x_i}{1-x_i}\right)\right]^{2},$$
and
$$\ell_{12} = \frac{\partial^2 \ell}{\partial \alpha \, \partial \beta} = -\sum_{i=1}^{n}\left(\frac{1+x_i}{1-x_i}\right)^{\beta} \ln\left(\frac{1+x_i}{1-x_i}\right).$$

3. Bias-Correction Methods

Two bias-correction procedures to reduce the bias of MLEs will be derived in this section.

3.1. The Bias-Corrected Maximum Likelihood Estimation Method

Using algebraic computation, we can simplify the third-order partial derivatives $\ell_{ijk}$, $i, j, k = 1, 2$, and the results are given as follows:
$$\ell_{111} = \frac{\partial^3 \ell}{\partial \alpha^3} = \frac{2n}{\alpha^3},$$
$$\ell_{112} = \ell_{121} = \ell_{211} = \frac{\partial^3 \ell}{\partial \alpha^2 \, \partial \beta} = 0,$$
$$\ell_{222} = \frac{\partial^3 \ell}{\partial \beta^3} = \frac{2n}{\beta^3} - \alpha \sum_{i=1}^{n}\left(\frac{1+x_i}{1-x_i}\right)^{\beta}\left[\ln\left(\frac{1+x_i}{1-x_i}\right)\right]^{3},$$
and
$$\ell_{122} = \ell_{212} = \ell_{221} = \frac{\partial^3 \ell}{\partial \beta^2 \, \partial \alpha} = -\sum_{i=1}^{n}\left(\frac{1+x_i}{1-x_i}\right)^{\beta}\left[\ln\left(\frac{1+x_i}{1-x_i}\right)\right]^{2}.$$
Using the results obtained in Section 2, it is straightforward to show that
$$\eta_{11} = -\frac{n}{\alpha^2},$$
$$\eta_{12} = E(\ell_{12}) = -nE\left[\left(\frac{1+X}{1-X}\right)^{\beta} \ln\left(\frac{1+X}{1-X}\right)\right],$$
and
$$\eta_{22} = E(\ell_{22}) = -\frac{n}{\beta^2} - n\alpha E\left[\left(\frac{1+X}{1-X}\right)^{\beta}\left\{\ln\left(\frac{1+X}{1-X}\right)\right\}^{2}\right].$$
It is also straightforward to show that
$$\eta_{111} = E(\ell_{111}) = \frac{2n}{\alpha^3}$$
and
$$\eta_{112} = \eta_{121} = \eta_{211} = 0.$$
Moreover, we can show that
$$\eta_{122} = E(\ell_{122}) = -nE\left[\left(\frac{1+X}{1-X}\right)^{\beta}\left\{\ln\left(\frac{1+X}{1-X}\right)\right\}^{2}\right]$$
and
$$\eta_{222} = E(\ell_{222}) = \frac{2n}{\beta^3} - n\alpha E\left[\left(\frac{1+X}{1-X}\right)^{\beta}\left\{\ln\left(\frac{1+X}{1-X}\right)\right\}^{3}\right].$$
It is difficult to derive the exact formula for the bias correction in this study. Instead, we obtain a simple approximation whose error is small and negligible. We use Taylor's expansion for three reasons: (1) Taylor's expansion is easy to implement in this study; (2) the error can easily be controlled up to the fourth derivative term; and (3) the approximation becomes close to the true value when the sample size $n$ is large. The approximation procedure is analytically derived as follows. Using Taylor's expansion, we can obtain the approximations
$$\ln(1+x) \approx x - \frac{x^2}{2} + \frac{x^3}{3}, \qquad \ln(1-x) \approx -x - \frac{x^2}{2} - \frac{x^3}{3},$$
and
$$y = \ln\left(\frac{1+x}{1-x}\right) = \ln(1+x) - \ln(1-x) \approx 2x + \frac{2x^3}{3} = 2x\left(1 + \frac{x^2}{3}\right).$$
Applying Taylor's expansion to the exponential function again, we can show that
$$\left(\frac{1+x}{1-x}\right)^{\beta} = \exp\left\{\beta \ln\left(\frac{1+x}{1-x}\right)\right\} \approx \exp\left\{2\beta x\left(1 + \frac{x^2}{3}\right)\right\} \approx 1 + 2\beta x + 2\beta^2 x^2.$$
The $r$th moment can then be approximated by
$$m_r \approx r e^{\alpha} \int_0^1 x^{r-1} \exp\left\{-\alpha\left(1 + 2\beta x + 2\beta^2 x^2\right)\right\} dx, \quad r = 1, 2, \ldots \quad (26)$$
Hence, we can obtain
$$\eta_{12} = \eta_{21} = -nE\left[\left(\frac{1+X}{1-X}\right)^{\beta} \ln\left(\frac{1+X}{1-X}\right)\right] \approx -2nE\left[X\left(1+\frac{X^2}{3}\right) + 2\beta X^2\left(1+\frac{X^2}{3}\right) + 2\beta^2 X^3\left(1+\frac{X^2}{3}\right)\right] = -2n\left[m_1 + 2\beta m_2 + \left(2\beta^2 + \frac{1}{3}\right)m_3 + \frac{2}{3}\beta m_4 + \frac{2}{3}\beta^2 m_5\right]$$
and
$$\eta_{22} = -\frac{n}{\beta^2} - n\alpha E\left[\left(\frac{1+X}{1-X}\right)^{\beta}\left\{\ln\left(\frac{1+X}{1-X}\right)\right\}^{2}\right] \approx -\frac{n}{\beta^2} - 4n\alpha E\left[X^2\left(1+\frac{X^2}{3}\right)^{2} + 2\beta X^3\left(1+\frac{X^2}{3}\right)^{2} + 2\beta^2 X^4\left(1+\frac{X^2}{3}\right)^{2}\right] = -\frac{n}{\beta^2} - 4n\alpha\left[m_2 + 2\beta m_3 + 2\left(\beta^2 + \frac{1}{3}\right)m_4 + \frac{4}{3}\beta m_5 + \frac{1}{3}\left(\frac{1}{3} + 4\beta^2\right)m_6 + \frac{2}{9}\beta m_7 + \frac{2}{9}\beta^2 m_8\right].$$
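The moments $m_r$ enter all of the bias terms below, so it is useful to check the Taylor-based approximation of $m_r$ in Equation (26) against the exact integral in Equation (6). The sketch below uses a plain composite Simpson rule in Python with illustrative parameter values; the quadrature settings are assumptions of the sketch:

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson quadrature with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return s * h / 3.0

def m_exact(r, alpha, beta):
    """m_r = r e^alpha * int_0^1 x^{r-1} exp{-alpha ((1+x)/(1-x))^beta} dx."""
    def g(x):
        return x ** (r - 1) * math.exp(-alpha * ((1.0 + x) / (1.0 - x)) ** beta)
    # stop just short of 1, where the integrand has already decayed to 0
    return r * math.exp(alpha) * simpson(g, 0.0, 1.0 - 1e-9)

def m_approx(r, alpha, beta):
    """Equation (26): the exponent replaced by its quadratic Taylor expansion."""
    def g(x):
        return x ** (r - 1) * math.exp(-alpha * (1.0 + 2.0 * beta * x
                                                 + 2.0 * beta ** 2 * x ** 2))
    return r * math.exp(alpha) * simpson(g, 0.0, 1.0)

# Moments are probabilities-of-scale quantities: 0 < m_2 < m_1 < 1.
m1, m2 = m_exact(1, 0.44, 1.51), m_exact(2, 0.44, 1.51)
```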
The Fisher information matrix is a $2 \times 2$ matrix, denoted by $\mathbf{I} \equiv \mathbf{I}(\Theta) = [-\eta_{ij}]$. Let
$$\eta_{ij}^{(k)} = \frac{\partial \eta_{ij}}{\partial \theta_k}, \quad i, j, k = 1, 2.$$
Similarly, using the moment approximation in Equation (26), $\eta_{122}$ and $\eta_{222}$ can be evaluated as
$$\eta_{122} = \eta_{212} = \eta_{221} = -nE\left[\left(\frac{1+X}{1-X}\right)^{\beta}\left\{\ln\left(\frac{1+X}{1-X}\right)\right\}^{2}\right] = \frac{1}{\alpha}\left(\frac{n}{\beta^2} + \eta_{22}\right)$$
and
$$\eta_{222} = \frac{2n}{\beta^3} - n\alpha E\left[\left(\frac{1+X}{1-X}\right)^{\beta}\left\{\ln\left(\frac{1+X}{1-X}\right)\right\}^{3}\right] \approx \frac{2n}{\beta^3} - 8n\alpha\left[m_3 + 2\beta m_4 + (1 + 2\beta^2)m_5 + 2\beta m_6 + \left(\frac{1}{3} + 2\beta^2\right)m_7 + \frac{2}{3}\beta m_8 + \frac{1}{3}\left(\frac{1}{9} + 2\beta^2\right)m_9 + \frac{2}{27}\beta m_{10} + \frac{2}{27}\beta^2 m_{11}\right].$$
The entries $\{\eta_{ij}^{(k)},\ i, j, k = 1, 2\}$ can be presented as follows:
$$\eta_{11}^{(1)} = \frac{\partial \eta_{11}}{\partial \alpha} = \frac{2n}{\alpha^3},$$
$$\eta_{21}^{(1)} = \eta_{12}^{(1)} = \frac{\partial \eta_{12}}{\partial \alpha} \approx -2n\left[m_{1,a} + 2\beta m_{2,a} + \left(2\beta^2 + \frac{1}{3}\right)m_{3,a} + \frac{2}{3}\beta m_{4,a} + \frac{2}{3}\beta^2 m_{5,a}\right],$$
and
$$\eta_{22}^{(1)} = \frac{\partial \eta_{22}}{\partial \alpha} \approx -4n\left[(m_2 + \alpha m_{2,a}) + 2\beta(m_3 + \alpha m_{3,a}) + 2\left(\beta^2 + \frac{1}{3}\right)(m_4 + \alpha m_{4,a}) + \frac{4}{3}\beta(m_5 + \alpha m_{5,a}) + \frac{1}{3}\left(\frac{1}{3} + 4\beta^2\right)(m_6 + \alpha m_{6,a}) + \frac{2}{9}\beta(m_7 + \alpha m_{7,a}) + \frac{2}{9}\beta^2(m_8 + \alpha m_{8,a})\right],$$
where
$$m_{r,a} = \frac{\partial m_r}{\partial \alpha} \approx -2\beta(m_{r+1} + \beta m_{r+2}).$$
We can also obtain
$$m_{r,b} = \frac{\partial m_r}{\partial \beta} \approx -2\alpha(m_{r+1} + 2\beta m_{r+2}),$$
$$\eta_{11}^{(2)} = \frac{\partial \eta_{11}}{\partial \beta} = 0,$$
$$\eta_{12}^{(2)} = \eta_{21}^{(2)} = \frac{\partial \eta_{12}}{\partial \beta} \approx -2n\left[m_{1,b} + 2(m_2 + \beta m_{2,b}) + 4\beta m_3 + \left(2\beta^2 + \frac{1}{3}\right)m_{3,b} + \frac{2}{3}(m_4 + \beta m_{4,b}) + \frac{2}{3}\beta(2m_5 + \beta m_{5,b})\right],$$
and
$$\eta_{22}^{(2)} = \frac{\partial \eta_{22}}{\partial \beta} \approx \frac{2n}{\beta^3} - 4n\alpha\left[m_{2,b} + 2(m_3 + \beta m_{3,b}) + 4\beta m_4 + 2\left(\beta^2 + \frac{1}{3}\right)m_{4,b} + \frac{4}{3}(m_5 + \beta m_{5,b}) + \frac{8}{3}\beta m_6 + \frac{1}{3}\left(\frac{1}{3} + 4\beta^2\right)m_{6,b} + \frac{2}{9}(m_7 + \beta m_{7,b}) + \frac{2}{9}\beta(2m_8 + \beta m_{8,b})\right].$$
In summary, the quantities $\eta_{ij}$, $\eta_{ijk}$, and $\eta_{ij}^{(k)}$, $i, j, k = 1, 2$, are all available in closed or approximate form from the expressions above, where each moment $m_r$ can be approximated using Equation (26).
Denote the entries of the inverse of $\mathbf{I}$ by $\eta^{ij}$, $i, j = 1, 2$; that is, $\mathbf{I}^{-1} = [\eta^{ij}]$. Assume that the log-likelihood function is well behaved and satisfies the usual regularity conditions, with the quantities $\eta_{ij}$, $\eta_{ijk}$, and $\eta_{ij}^{(k)}$ of order $O(n)$. Let $B^{(k)} \equiv B^{(k)}(\Theta)$ be a $2 \times 2$ matrix with entries $b_{ij}^{(k)}$, where
$$b_{ij}^{(k)} = \eta_{ij}^{(k)} - \frac{1}{2}\eta_{ijk}, \quad i, j, k = 1, 2.$$
Let $\mathrm{vec}(B^{(k)})$ denote the vectorization operation that creates a column vector by stacking the columns of $B^{(k)}$ below one another. Using the process proposed by Cordeiro and Klein [27], the bias of the MLE $\hat{\Theta}$ can be presented, up to an $O(n^{-2})$ remainder, by
$$b(\hat{\Theta}) = \mathbf{I}^{-1}(\Theta)\, B \, \mathrm{vec}\!\left(\mathbf{I}^{-1}(\Theta)\right) + O(n^{-2}),$$
where
$$B = [B^{(1)} \mid B^{(2)}]$$
and
$$\left[\mathrm{vec}\!\left(\mathbf{I}^{-1}\right)\right]^{T} = [\eta^{11}, \eta^{21}, \eta^{12}, \eta^{22}].$$
Denote the bias-corrected MLE of $\Theta$ by $\tilde{\Theta}^{T} = (\tilde{\alpha}, \tilde{\beta})$; it can be shown that
$$\tilde{\Theta} = \hat{\Theta} - \hat{\mathbf{I}}^{-1} \hat{B}\, \mathrm{vec}\!\left(\hat{\mathbf{I}}^{-1}\right),$$
where $\hat{\mathbf{I}}^{-1} \equiv \mathbf{I}^{-1}(\hat{\Theta})$ and $\hat{B} \equiv B(\hat{\Theta})$. If $\tilde{\alpha} < 0$ or $\tilde{\beta} < 0$, keep the original MLEs $\hat{\alpha}$ and $\hat{\beta}$ and do not update them with their bias-corrected counterparts.
Based on the aforementioned results, we can obtain
$$b_{11}^{(1)} = \eta_{11}^{(1)} - \frac{1}{2}\eta_{111} = \frac{n}{\alpha^3},$$
$$b_{12}^{(1)} = b_{21}^{(1)} = \eta_{12}^{(1)} - \frac{1}{2}\eta_{121} = \eta_{12}^{(1)},$$
$$b_{22}^{(1)} = \eta_{22}^{(1)} - \frac{1}{2}\eta_{221},$$
and
$$b_{11}^{(2)} = \eta_{11}^{(2)} - \frac{1}{2}\eta_{112} = 0,$$
$$b_{12}^{(2)} = b_{21}^{(2)} = \eta_{12}^{(2)} - \frac{1}{2}\eta_{122},$$
$$b_{22}^{(2)} = \eta_{22}^{(2)} - \frac{1}{2}\eta_{222}.$$
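Once $\hat{\mathbf{I}}$ and $\hat{B} = [\hat{B}^{(1)} \mid \hat{B}^{(2)}]$ have been evaluated at the MLEs, computing $\tilde{\Theta} = \hat{\Theta} - \hat{\mathbf{I}}^{-1}\hat{B}\,\mathrm{vec}(\hat{\mathbf{I}}^{-1})$ is a small piece of linear algebra. The sketch below (plain Python, no external libraries) uses placeholder numbers for $\hat{\mathbf{I}}$ and $\hat{B}$, chosen only to show the shapes; the real entries come from the $\eta$ expressions derived in this section:

```python
def mat_mul(A, B):
    """Multiply matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def vec(M):
    """Column-stacking vectorization, returned as a column vector."""
    return [[M[i][j]] for j in range(len(M[0])) for i in range(len(M))]

def bias_corrected_mle(theta_hat, I_hat, B_hat):
    """Theta_tilde = Theta_hat - I^{-1} B vec(I^{-1}); falls back to the
    original MLEs if a corrected component turns negative (as in the text)."""
    I_inv = inv2(I_hat)
    corr = mat_mul(mat_mul(I_inv, B_hat), vec(I_inv))  # (2x4)(4x1) -> 2x1
    tilde = [theta_hat[k] - corr[k][0] for k in range(2)]
    return tilde if min(tilde) > 0.0 else list(theta_hat)

# Placeholder (hypothetical) values for I_hat and B_hat, for shape only.
theta_hat = (1.0, 2.0)
I_hat = [[2.0, 0.0], [0.0, 4.0]]
B_hat = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
theta_tilde = bias_corrected_mle(theta_hat, I_hat, B_hat)  # [0.75, 1.9375]
```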

3.2. The Bootstrap Bias-Correction Method

In this section, we introduce another popular bias-correction method based on the parametric bootstrap approach. For a comprehensive discussion of parametric bootstrap approaches, see Efron and Tibshirani [28]. This study implements the bootstrap bias-correction steps in the same manner as [29]. Assume that the MLE of $\Theta$, $\hat{\Theta} = (\hat{\alpha}, \hat{\beta})^{T}$, has been obtained from the random sample $\mathbf{x} = (x_1, x_2, \ldots, x_n)$. Implement the following steps:
Step 1:
Generate a random sample $\mathbf{z} = (z_1, z_2, \ldots, z_n)$ from $\mathrm{UED}(\hat{\alpha}, \hat{\beta})$. Use the newly generated random sample $\mathbf{z}$ to obtain the MLE of $\Theta$, and denote it by $\hat{\Theta}^{*}$.
Step 2:
Repeat Step 1 $M$ times. Denote the obtained MLEs by $\hat{\Theta}_1^{*}, \hat{\Theta}_2^{*}, \ldots, \hat{\Theta}_M^{*}$. Evaluate the bias of $\hat{\Theta}$ by
$$\hat{\Theta}_{\mathrm{Bias}} = \frac{1}{M}\sum_{j=1}^{M} \hat{\Theta}_j^{*} - \hat{\Theta}.$$
Then, the bootstrap bias-corrected maximum likelihood (B-BCML) estimate is evaluated by
$$\tilde{\Theta}_{\mathrm{BBC}} = (\tilde{\alpha}_{\mathrm{BBC}}, \tilde{\beta}_{\mathrm{BBC}}) = \hat{\Theta} - \hat{\Theta}_{\mathrm{Bias}} = 2\hat{\Theta} - \frac{1}{M}\sum_{j=1}^{M} \hat{\Theta}_j^{*}.$$
If α ˜ B B C < 0 or β ˜ B B C < 0 , keep the original MLEs α ^ and β ^ and do not update them by their bias-corrected MLEs.
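Steps 1–2 can be written as a short, model-agnostic routine that takes the fitting function and a parametric sampler as arguments. Because the UED MLE routine is lengthy, the usage below substitutes a deliberately simple stand-in model (the normal MLE of (μ, σ²), whose σ² component has a known downward bias of order 1/n); this stand-in is an assumption of the sketch, not part of the paper:

```python
import random

def boot_bias_correct(theta_hat, fit, sampler, M=400, positive=()):
    """Parametric bootstrap bias correction (Steps 1-2 of Section 3.2).

    fit      : data -> tuple of parameter estimates (the MLE routine)
    sampler  : theta -> a fresh sample of size n from the fitted model
    positive : indices of parameters that must stay positive; if a corrected
               value violates this, the original MLEs are kept (as in the text)
    """
    boots = [fit(sampler(theta_hat)) for _ in range(M)]
    mean = [sum(b[k] for b in boots) / M for k in range(len(theta_hat))]
    tilde = [2.0 * theta_hat[k] - mean[k] for k in range(len(theta_hat))]
    if any(tilde[k] <= 0.0 for k in positive):
        return list(theta_hat)
    return tilde

# Stand-in model: normal MLE of (mu, sigma^2); sigma^2 is biased by (n-1)/n.
rng = random.Random(42)
n = 20

def fit(data):
    mu = sum(data) / len(data)
    return (mu, sum((x - mu) ** 2 for x in data) / len(data))  # biased MLE

def sampler(theta):
    mu, s2 = theta
    return [rng.gauss(mu, s2 ** 0.5) for _ in range(n)]

data = [rng.gauss(0.0, 1.0) for _ in range(n)]
theta_hat = fit(data)
theta_tilde = boot_bias_correct(theta_hat, fit, sampler, M=400, positive=(1,))
```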

4. Monte Carlo Simulations

This section uses Monte Carlo simulations to verify the quality of the proposed two bias-correction methods. The parameter combinations, which were used in Bakouch et al. [25] for simulations, are used for the simulation study in this section. Random samples with sizes n = 10 , 15 , 20 , 25 , and 30 are generated from the UED ( α , β ) , where
Set I: $(\alpha, \beta) = (0.4390, 1.5145) \approx (0.44, 1.51)$.
Set II: $(\alpha, \beta) = (0.9856, 0.2178) \approx (0.99, 0.22)$.
Set III: $(\alpha, \beta) = (1.8986, 0.3218) \approx (1.90, 0.32)$.
Set IV: $(\alpha, \beta) = (2.4390, 2.5145) \approx (2.44, 2.51)$.
Assume that $N$ iterative runs are used for the simulation study, where $N$ is a large positive integer, and denote the obtained estimates by $\hat{\theta}^{(j)}$, $j = 1, 2, \ldots, N$. For a parameter $\theta$, the bias and mean squared error (MSE) can be defined, respectively, by
$$\mathrm{bias} = \frac{1}{N}\sum_{j=1}^{N}\left(\hat{\theta}^{(j)} - \theta\right)$$
and
$$\mathrm{MSE} = \frac{1}{N}\sum_{j=1}^{N}\left(\hat{\theta}^{(j)} - \theta\right)^{2}.$$
The relative bias, denoted by
$$\mathrm{RB} = \frac{\mathrm{bias}}{\theta},$$
and the relative squared-root MSE, denoted by
$$\mathrm{RSM} = \frac{\sqrt{\mathrm{MSE}}}{\theta},$$
are used as the performance metrics of the simulation study in this work. Moreover, $N = 10{,}000$ is used to check the quality of the proposed methods. To remove extreme cases with huge MLEs from the simulation study, we generate 10% more samples than needed for the Monte Carlo simulation and then truncate the samples yielding the largest 10% of the $\hat{\alpha}$ values. Denote the maximum likelihood estimation, bias-corrected maximum likelihood estimation, and bootstrap bias-correction methods by the MLE, BC, and Boot-BC methods, respectively.
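For reference, the four performance metrics can be computed from a vector of $N$ estimates in a few lines. The numbers below are made-up estimates for illustration only, not simulation output:

```python
import math

def performance_metrics(estimates, theta):
    """bias, MSE, relative bias (RB), and relative squared-root MSE (RSM)."""
    N = len(estimates)
    bias = sum(e - theta for e in estimates) / N
    mse = sum((e - theta) ** 2 for e in estimates) / N
    return {"bias": bias, "MSE": mse,
            "RB": bias / theta, "RSM": math.sqrt(mse) / theta}

# Made-up estimates of theta = 2.0: deviations are -1, +1, 0, 0,
# so bias = 0, MSE = 0.5, RB = 0, RSM = sqrt(0.5)/2.
metrics = performance_metrics([1.0, 3.0, 2.0, 2.0], 2.0)
```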
Before the intensive simulations, we test the impact of M on the quality of the Boot-BC method. Figure 1 shows the bootstrap bias-correction method update rate for M = 200 , 300, 400, 500, 600, and 700. From Figure 1, we can see that the bootstrap bias-correction method update rate with M = 400 is high for Sets I, II, and IV. Hence, we use M = 400 to implement the bootstrap bias-correction method. All of the simulation results are reported in Table 1, Table 2, Table 3, Table 4 and Table 5.
Given Table 1, Table 2, Table 3, Table 4 and Table 5, we find that the update rates of the BC and Boot-BC methods are low if the sample size is less than 30 and the values of α and β are large; this is because the BC and Boot-BC methods are based on the MLE method. These findings indicate that the performance of the MLE method is unstable if the sample size is smaller than 30 and α and β are large for the UED ( α , β ) . When the sample size grows to over 30, the update rates of the BC and Boot-BC methods significantly grow.
Implementing the Boot-BC method is time-consuming because the bias correction depends on bootstrap sampling and on maximum likelihood estimation for each bootstrap sample; the BC method runs much faster. The Boot-BC and BC methods are competitive when the sample size is over 20. In particular, if the sample size is 30 to 50, both methods can reduce the bias of $\hat{\alpha}$ and $\hat{\beta}$. When the sample size exceeds 30, the BC method outperforms the typical maximum likelihood estimation and Boot-BC methods for most cases in Table 3, Table 4 and Table 5, and it also performs better than the Boot-BC method for most cases regarding the relative squared-root MSE, while being simpler and less time-consuming to compute. We recommend using the BC method for bias reduction for the $\mathrm{UED}(\alpha, \beta)$ if the sample size is 30 to 50.

5. An Example

Two examples are used in this section to illustrate the applications of the proposed bias-correction methods. The first numerical example shows the bias correction effect under a small sample. The second example is used to demonstrate the use of the proposed bias-correction methods for the soil moisture data set.

5.1. The Numerical Example

Thirty measurements were generated from $\mathrm{UED}(\alpha = 1, \beta = 1)$. All the generated measurements are listed in Table 6. The MLEs of $\alpha$ and $\beta$ are $\hat{\alpha} = 0.5639$ and $\hat{\beta} = 1.214$. Both MLEs are used to implement the Kolmogorov–Smirnov (K-S) test for model fitting. The K-S statistic is $D = 0.0966$ with a $p$-value of 0.9483. The K-S test results show that the UED can be a good candidate model for the data set in Table 6.
Using the proposed maximum likelihood and bootstrap bias-correction methods for this data set, we obtain the BC estimates α ˜ = 0.5967 and β ˜ = 1.1658 . The Boot-BC estimates are α ˜ B B C = 0.5272 and β ˜ B B C = 1.0329 . We can see that the proposed maximum likelihood bias-correction method can reduce the bias of the maximum likelihood estimation method, and the BC estimates are closer to their true values of α = 1 and β = 1 than the MLEs. The bootstrap bias-correction method reduces the bias of the MLE of β , but the bias of the Boot-BC estimate of α is slightly increased.

5.2. The Soil Moisture Example

The permanent wilting point indicates the threshold for plants to extract water from the soil: plants cannot extract water from the soil when the soil moisture is below this point. Maity [30] (page 189) reports 40 soil moisture measurements and uses 0.12 as the permanent wilting point threshold to evaluate reliability, resilience, and vulnerability. This data set is also displayed in Table 7 for easy reference.
Using the data set shown in Table 7, the MLEs of the $\mathrm{UED}(\alpha, \beta)$ parameters were obtained as $\hat{\alpha} = 0.2707$ and $\hat{\beta} = 2.9962$; see Table 8. The histogram of the soil moisture data set and the density curve of $\mathrm{UED}(0.2707, 2.9962)$ are exhibited in Figure 2. The summary statistics of the soil moisture data set are minimum = 0.0296, first quartile = 0.1210, median = 0.1946, mean = 0.2088, third quartile = 0.2949, and maximum = 0.4149. Replacing $\alpha$ and $\beta$ in the $\mathrm{UED}(\alpha, \beta)$ by $\hat{\alpha}$ and $\hat{\beta}$, respectively, the quantile versus quantile plot is given in Figure 3. Using the UED to model this data set, the K-S test statistic is $D = 0.125$ with a $p$-value of 0.6457. The K-S test results support the UED as a suitable model for the soil moisture data set.
Using the proposed bias-correction method described in Section 3.1, we can obtain the bias-corrected MLEs as α ˜ = 0.2697 and β ˜ = 2.9847 . Moreover, using the bootstrap bias-correction method addressed in Section 3.2 with M = 400, we obtain the B-BCML estimates as α ˜ B B C = 0.2613 and β ˜ B B C = 2.7608 . Both bias-corrected MLEs are smaller than the original MLEs, and the reduction magnitude of the B-BCML estimates is larger than that of the bias-corrected MLEs.

6. Conclusions

In this paper, we proposed two bias-correction procedures to reduce the bias of the MLEs of the UED parameters when the sample size is small. The first procedure is based on the bias-correction approach proposed by Cordeiro and Klein [27], which reduces the bias to order $O(n^{-2})$. The second procedure is based on the parametric bootstrap. We analytically derived the procedure to implement the Cordeiro and Klein bias-corrected maximum likelihood estimation method. Moreover, the steps to obtain the bias-corrected bootstrap estimators were explored in Section 3.2.
Intensive Monte Carlo simulations were conducted to verify the performance of the two bias-correction techniques. We find that both proposed bias-correction methods perform well. The computation of the Cordeiro and Klein bias-corrected maximum likelihood estimation method is simple, while the bias-corrected bootstrap estimation method is time-consuming. To demonstrate the applications of the two proposed bias-correction methods, a numerical example and a soil moisture data set were used for illustration.

Author Contributions

Conceptualization, investigation, writing and editing, project administration, and funding acquisition: T.-R.T.; validation, investigation, and writing and editing: Y.L.; methodology: H.X.; investigation: Y.-Y.F. and H.X.; and project administration, funding acquisition: T.-R.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Science and Technology Council, Taiwan, grant number NSC 112-2221-E-032-038-MY2; and the National Natural Science Foundation of China, grant number 52174060.

Data Availability Statement

The soil moisture data set can be found on page 189 of Maity [30].

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Bayes, T. An Essay Towards Solving a Problem in the Doctrine of Chances. By the late Rev. Mr. Bayes, F.R.S. communicated by Mr. Price, in a letter to John Canton, A. M. F. R. S. Philos. Trans. R. Soc. 1763, 53, 370–418. [Google Scholar]
  2. Fleiss, J.L.; Levin, B.; Paik, M.C. Statistical Methods for Rates and Proportions, 3rd ed.; John Wiley & Sons Inc.: Hoboken, NJ, USA, 1993. [Google Scholar]
  3. Gilchrist, W. Statistical Modelling with Quantile Functions; CRC Press: Abingdon, UK, 2000. [Google Scholar]
  4. Seber, G.A.F. Statistical Models for Proportions and Probabilities; Springer: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
  5. Leipnik, R.B. Distribution of the Serial Correlation Coefficient in a Circularly Correlated Universe. Ann. Math. Stat. 1947, 18, 80–87. [Google Scholar] [CrossRef]
  6. Johnson, N. Systems of Frequency Curves Derived From the First Law of Laplace. Trab. Estad. 1955, 5, 283–291. [Google Scholar] [CrossRef]
  7. Topp, C.W.; Leone, F.C. A Family of J-Shaped Frequency Functions. J. Am. Stat. Assoc. 1955, 50, 209–219. [Google Scholar] [CrossRef]
  8. Consul, P.C.; Jain, G.C. On the Log-Gamma Distribution and Its Properties. Stat. Hefte 1971, 12, 100–106. [Google Scholar] [CrossRef]
  9. Kumaraswamy, P. A Generalized Probability Density Function for Double-Bounded Random Processes. J. Hydrol. 1980, 46, 79–88. [Google Scholar] [CrossRef]
  10. Jørgensen, B. Proper Dispersion Models. Braz. J. Probab. Stat. 1997, 11, 89–128. [Google Scholar]
  11. Smithson, M.; Shou, Y. CDF-Quantile Distributions for Modelling Random Variables on the Unit Interval. Br. J. Math. Stat. Psychol. 2017, 70, 412–438. [Google Scholar] [CrossRef] [PubMed]
  12. Altun, E.; Hamedani, G. The Log-Xgamma Distribution with Inference and Application. J. Soc. Fr. Stat. 2018, 159, 40–55. [Google Scholar]
  13. Nakamura, L.R.; Cerqueira, P.H.R.; Ramires, T.G.; Pescim, R.R.; Rigby, R.A.; Stasinopoulos, D.M. A New Continuous Distribution on the Unit Interval Applied to Modelling the Points Ratio of Football Teams. J. Appl. Stat. 2019, 46, 416–431. [Google Scholar] [CrossRef]
14. Ghitany, M.E.; Mazucheli, J.; Menezes, A.F.B.; Alqallaf, F. The Unit-Inverse Gaussian Distribution: A New Alternative to Two-Parameter Distributions on the Unit Interval. Commun. Stat. Theory Methods 2019, 48, 3423–3438.
15. Mazucheli, J.; Menezes, A.F.; Dey, S. Unit-Gompertz Distribution with Applications. Statistica 2019, 79, 25–43.
16. Mazucheli, J.; Menezes, A.F.B.; Chakraborty, S. On the One Parameter Unit-Lindley Distribution and Its Associated Regression Model for Proportion Data. J. Appl. Stat. 2019, 46, 700–714.
17. Mazucheli, J.; Menezes, A.F.B.; Fernandes, L.B.; de Oliveira, R.P.; Ghitany, M.E. The Unit-Weibull Distribution as an Alternative to the Kumaraswamy Distribution for the Modeling of Quantiles Conditional on Covariates. J. Appl. Stat. 2019, 47, 954–974.
18. Altun, E. The Log-Weighted Exponential Regression Model: Alternative to the Beta Regression Model. Commun. Stat. Theory Methods 2020, 50, 2306–2321.
19. Gündüz, S.; Korkmaz, M.Ç. A New Unit Distribution Based on the Unbounded Johnson Distribution Rule: The Unit Johnson SU Distribution. Pak. J. Stat. Oper. Res. 2020, 16, 471–490.
20. Biswas, A.; Chakraborty, S. A New Method for Constructing Continuous Distributions on the Unit Interval. arXiv 2021, arXiv:2101.04661.
21. Afify, A.Z.; Nassar, M.; Kumar, D.; Cordeiro, G.M. A New Unit Distribution: Properties and Applications. Electron. J. Appl. Stat. 2022, 15, 460–484.
22. Krishna, A.; Maya, R.; Chesneau, C.; Irshad, M.R. The Unit Teissier Distribution and Its Applications. Math. Comput. Appl. 2022, 27, 12.
23. Korkmaz, M.Ç.; Korkmaz, Z.S. The Unit Log–log Distribution: A New Unit Distribution with Alternative Quantile Regression Modeling and Educational Measurements Applications. J. Appl. Stat. 2023, 50, 889–908.
24. Fayomi, A.; Hassan, A.S.; Baaqeel, H.; Almetwally, E.M. Bayesian Inference and Data Analysis of the Unit–Power Burr X Distribution. Axioms 2023, 12, 297.
25. Bakouch, H.S.; Hussain, T.; Tošić, M.; Stojanović, V.S.; Qarmalah, N. Unit Exponential Probability Distribution: Characterization and Applications in Environmental and Engineering Data Modeling. Mathematics 2023, 11, 4207.
26. Dombi, J.; Jónás, T.; Tóth, Z.E. The Epsilon Probability Distribution and Its Application in Reliability Theory. Acta Polytech. Hung. 2018, 15, 197–216.
27. Cordeiro, G.M.; Klein, R. Bias Correction in ARMA Models. Stat. Probab. Lett. 1994, 19, 169–176.
28. Efron, B.; Tibshirani, R.J. An Introduction to the Bootstrap; Monographs on Statistics and Applied Probability; Chapman & Hall: New York, NY, USA, 1993.
29. Tsai, T.-R.; Xin, H.; Fan, Y.-Y.; Lio, Y.L. Bias-Corrected Maximum Likelihood Estimation and Bayesian Inference for the Process Performance Index Using Inverse Gaussian Distribution. Stats 2022, 5, 1079–1096.
30. Maity, R. Statistical Methods in Hydrology and Hydroclimatology; Springer Nature Singapore Pte Ltd.: Singapore, 2018.
Figure 1. The updated rate of the bootstrap bias-correction method for M = 200, 300, 400, 500, 600, 700.
Figure 2. The histogram of the soil moisture data set and the density curve of UED(α = 0.2707, β = 2.9962).
Figure 3. The quantile–quantile plot of the soil moisture data set versus the UED(α = 0.2707, β = 2.9962).
Table 1. The relative bias (RB) and relative square-root MSE (RSM) for sample size 15.

Method    n   α     β     RB(α)    RSM(α)   RB(β)    RSM(β)   Updated Rate
MLE       15  0.44  1.51  −0.0676  0.7533   0.2746   0.5620   1
BC        15  0.44  1.51   0.1258  1.2614   0.1510   0.5474   0.887
Boot-BC   15  0.44  1.51   0.0330  1.0603   0.1443   0.6652   0.614
MLE       15  0.99  0.22  −0.0201  0.9730   0.4363   0.8581   1
BC        15  0.99  0.22   0.1026  1.2209   0.2488   0.7746   0.901
Boot-BC   15  0.99  0.22  −0.0274  1.0825   0.3053   0.9211   0.461
MLE       15  1.9   0.32  −0.1249  0.9468   0.7639   1.3320   1
BC        15  1.9   0.32   0.0971  0.9999   0.4216   1.1163   0.675
Boot-BC   15  1.9   0.32  −0.1626  1.0010   0.6201   1.3651   0.364
MLE       15  2.44  2.51   0.0414  1.2697   0.4551   0.7416   1
BC        15  2.44  2.51   0.0403  1.3476   0.3602   0.6710   0.978
Boot-BC   15  2.44  2.51  −0.0314  1.3319   0.4308   0.8683   0.399
Table 2. The relative bias (RB) and relative square-root MSE (RSM) for sample size 20.

Method    n   α     β     RB(α)    RSM(α)   RB(β)    RSM(β)   Updated Rate
MLE       20  0.44  1.51  −0.0719  0.6375   0.2110   0.4549   1
BC        20  0.44  1.51   0.0910  1.0245   0.1069   0.4625   0.906
Boot-BC   20  0.44  1.51   0.0455  1.0274   0.0972   0.5808   0.715
MLE       20  0.99  0.22  −0.0488  0.7719   0.3335   0.6828   1
BC        20  0.99  0.22   0.0549  0.9577   0.1812   0.6318   0.93
Boot-BC   20  0.99  0.22  −0.0357  0.9505   0.2143   0.7738   0.579
MLE       20  1.9   0.32  −0.1081  0.8218   0.5766   1.0468   1
BC        20  1.9   0.32   0.1093  0.8498   0.2696   0.8783   0.72
Boot-BC   20  1.9   0.32  −0.1368  0.9076   0.4437   1.1027   0.464
MLE       20  2.44  2.51   0.0038  1.0083   0.4056   0.6994   1
BC        20  2.44  2.51   0.0174  1.1412   0.3347   0.6516   0.98
Boot-BC   20  2.44  2.51  −0.0568  1.0974   0.3699   0.8383   0.471
Table 3. The relative bias (RB) and relative square-root MSE (RSM) for sample size 30.

Method    n   α     β     RB(α)    RSM(α)   RB(β)    RSM(β)   Updated Rate
MLE       30  0.44  1.51  −0.0583  0.5239   0.1422   0.3411   1
BC        30  0.44  1.51   0.0786  0.8206   0.0603   0.3664   0.936
Boot-BC   30  0.44  1.51   0.0440  0.9250   0.0606   0.4837   0.815
MLE       30  0.99  0.22  −0.0441  0.6210   0.2241   0.5064   1
BC        30  0.99  0.22   0.0385  0.7694   0.1161   0.4852   0.968
Boot-BC   30  0.99  0.22  −0.0004  0.8937   0.1268   0.6391   0.713
MLE       30  1.9   0.32  −0.1094  0.6452   0.4033   0.7705   1
BC        30  1.9   0.32   0.0853  0.6386   0.1474   0.6408   0.793
Boot-BC   30  1.9   0.32  −0.1175  0.7918   0.2932   0.8654   0.592
MLE       30  2.44  2.51  −0.0515  0.7310   0.3438   0.6310   1
BC        30  2.44  2.51  −0.0349  0.8658   0.2961   0.6046   0.981
Boot-BC   30  2.44  2.51  −0.0953  0.8694   0.3015   0.7907   0.572
Table 4. The relative bias (RB) and relative square-root MSE (RSM) for sample size 40.

Method    n   α     β     RB(α)    RSM(α)   RB(β)    RSM(β)   Updated Rate
MLE       40  0.44  1.51  −0.0545  0.4607   0.1120   0.2896   1
BC        40  0.44  1.51   0.0630  0.7153   0.0446   0.3181   0.956
Boot-BC   40  0.44  1.51   0.0204  0.8298   0.0543   0.4333   0.862
MLE       40  0.99  0.22  −0.0445  0.5338   0.1749   0.4267   1
BC        40  0.99  0.22   0.0160  0.6398   0.0941   0.4156   0.986
Boot-BC   40  0.99  0.22   0.0125  0.8710   0.0941   0.5846   0.796
MLE       40  1.9   0.32  −0.1004  0.5536   0.3090   0.6252   1
BC        40  1.9   0.32   0.0777  0.5231   0.0807   0.5078   0.865
Boot-BC   40  1.9   0.32  −0.0893  0.7729   0.2125   0.7612   0.704
MLE       40  2.44  2.51  −0.0662  0.6151   0.2954   0.5739   1
BC        40  2.44  2.51  −0.0637  0.6814   0.2606   0.5562   0.99
Boot-BC   40  2.44  2.51  −0.0903  0.8032   0.2447   0.7495   0.654
Table 5. The relative bias (RB) and relative square-root MSE (RSM) for sample size 50.

Method    n   α     β     RB(α)    RSM(α)   RB(β)    RSM(β)   Updated Rate
MLE       50  0.44  1.51  −0.0409  0.4102   0.0859   0.246    1
BC        50  0.44  1.51   0.0641  0.6434   0.0279   0.2771   0.977
Boot-BC   50  0.44  1.51   0.0198  0.7505   0.0409   0.3892   0.898
MLE       50  0.99  0.22  −0.0351  0.472    0.1359   0.3608   1
BC        50  0.99  0.22   0.0097  0.5463   0.0723   0.3545   0.994
Boot-BC   50  0.99  0.22   0.0301  0.833    0.0643   0.5352   0.857
MLE       50  1.9   0.32  −0.0833  0.5045   0.2496   0.5418   1
BC        50  1.9   0.32   0.0688  0.4656   0.0588   0.4243   0.91
Boot-BC   50  1.9   0.32  −0.0557  0.7629   0.161    0.7056   0.773
MLE       50  2.44  2.51  −0.0576  0.5621   0.2529   0.5301   1
BC        50  2.44  2.51  −0.0606  0.5975   0.226    0.5171   0.995
Boot-BC   50  2.44  2.51  −0.0654  0.7776   0.1953   0.7176   0.715
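The RB and RSM columns in Tables 1–5 can be reproduced from the Monte Carlo estimates. The sketch below assumes the standard definitions RB(θ̂) = E(θ̂ − θ)/θ and RSM(θ̂) = sqrt(E(θ̂ − θ)²)/θ; the paper's exact conventions may differ, and the helper name `rb_rsm` is ours:

```python
import numpy as np

def rb_rsm(estimates, true_value):
    """Relative bias and relative square-root MSE of a vector of
    Monte Carlo estimates with respect to the true parameter value."""
    errors = np.asarray(estimates, dtype=float) - true_value
    rb = errors.mean() / true_value                    # relative bias
    rsm = np.sqrt(np.mean(errors ** 2)) / true_value   # relative sqrt(MSE)
    return rb, rsm

# Two estimates straddling the true value 1.0: zero bias, RSM = 0.1.
rb, rsm = rb_rsm([1.1, 0.9], 1.0)
```

Dividing by the true parameter value makes the columns comparable across the four (α, β) combinations, which differ by an order of magnitude.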
Table 6. The numerical example with 30 measurements.

0.1450  0.5176  0.2730  0.2337  0.3614  0.5350  0.1658  0.3711  0.3477  0.3108
0.4370  0.5852  0.5271  0.6111  0.2983  0.1238  0.6071  0.3384  0.3813  0.1458
0.2082  0.0228  0.2861  0.2319  0.0515  0.0210  0.5242  0.7207  0.2820  0.0737
Table 7. The soil moisture measurements.

0.0816  0.2253  0.1944  0.3370  0.1208  0.0954  0.0562  0.2382  0.1949  0.3500
0.4080  0.3745  0.1647  0.2654  0.1300  0.2703  0.3837  0.3152  0.1448  0.1152
0.0717  0.2253  0.4149  0.3370  0.2500  0.1423  0.1258  0.1228  0.2948  0.4024
0.2834  0.2953  0.1647  0.1190  0.0655  0.0532  0.0296  0.2145  0.1526  0.1210
Table 8. The MLEs and bias-corrected MLEs based on the sample of soil moisture measurements.

Method    α estimate         β estimate
MLE       α̂ = 0.2706        β̂ = 2.9962
BC        α̃ = 0.2697        β̃ = 2.9847
Boot-BC   α̃_BBC = 0.2613    β̃_BBC = 2.7608
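Boot-BC estimates like those in Table 8 follow the parametric bootstrap bias correction of Efron and Tibshirani [28]: refit the MLE on B resamples drawn from the fitted model and set θ̃ = 2θ̂ − (1/B)Σ_b θ̂*_b. The sketch below implements this generic recipe only; the paper's UED likelihood routine and its updated-rate safeguard are not reproduced here, so a one-parameter exponential model (whose MLE 1/x̄ has a known upward bias) stands in for illustration, and all function names are ours:

```python
import numpy as np

def boot_bias_correct(sample, mle_fn, sampler_fn, B=500, seed=None):
    """Parametric bootstrap bias correction:
    theta_tilde = 2 * theta_hat - mean of the bootstrap MLEs.
    mle_fn maps a sample to the (scalar or vector) MLE;
    sampler_fn(theta, n, rng) draws a parametric resample of size n."""
    rng = np.random.default_rng(seed)
    theta_hat = mle_fn(sample)
    boot = np.array([mle_fn(sampler_fn(theta_hat, len(sample), rng))
                     for _ in range(B)])
    return 2.0 * theta_hat - boot.mean(axis=0)

# Stand-in model: exponential with rate lam. The MLE 1/xbar is biased
# upward, so the corrected estimate should fall below the raw MLE.
rate_mle = lambda x: 1.0 / np.mean(x)
resample = lambda lam, n, rng: rng.exponential(scale=1.0 / lam, size=n)

rng = np.random.default_rng(7)
data = rng.exponential(scale=0.5, size=20)   # small sample, true rate 2
lam_hat = rate_mle(data)
lam_bc = boot_bias_correct(data, rate_mle, resample, B=400, seed=7)
```

Because `mle_fn` may return a vector, the same function covers two-parameter models such as the UED, where (α̂, β̂) is recomputed on each resample.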