Article

Bayesian Estimation of Entropy for Burr Type XII Distribution under Progressive Type-II Censored Data

Department of Mathematics, Beijing Jiaotong University, Beijing 100044, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Mathematics 2021, 9(4), 313; https://doi.org/10.3390/math9040313
Submission received: 14 December 2020 / Revised: 23 January 2021 / Accepted: 29 January 2021 / Published: 5 February 2021

Abstract
With the rapid development of statistics, information entropy has been proposed as an important indicator for quantifying the uncertainty of information. In this paper, maximum likelihood and Bayesian methods are used to obtain estimators of the entropy of a two-parameter Burr type XII distribution under progressive type-II censored data. For maximum likelihood estimation, the asymptotic confidence intervals of the entropy are calculated. For Bayesian estimation, we consider non-informative and informative priors, and adopt both asymmetric and symmetric loss functions. The posterior risk is also calculated to evaluate the performance of the entropy estimators under the different loss functions. In the numerical simulation, the Lindley approximation and the Markov chain Monte Carlo method were used to obtain the Bayesian estimates, and the highest posterior density credible intervals of the entropy were derived. Finally, the average absolute bias and mean square error were used to evaluate the estimators obtained by the different methods, and a real dataset was analyzed to illustrate the feasibility of the above estimation model.

1. Introduction

Burr type XII distribution was first proposed by Burr in [1], along with eleven other types of Burr distributions. In recent years, Burr type XII has been applied widely in industry, physics and survival analysis as a more flexible alternative to the Weibull distribution. It is also called the Singh–Maddala distribution, which is one of the "generalized log-logistic distributions." The cumulative distribution function (cdf) and probability density function (pdf) of this distribution are, respectively,
F(x) = 1 - (1 + x^{\beta})^{-\alpha}, \quad x > 0, \ \alpha > 0, \ \beta > 0,
and
f(x) = \alpha\beta x^{\beta-1} (1 + x^{\beta})^{-(\alpha+1)},
where β and α are both shape parameters.
In recent years, many researchers have investigated estimation based on the Burr distribution. The probabilistic and statistical properties of Burr type XII and its relevance to other distributions are discussed in [2]. Jaheen and Okasha (2011) [3] used the E-Bayesian method to calculate estimates of the unknown parameter for a Burr type XII distribution with type-II censored data, and compared it with the classical estimation method. Wu et al. (2010) [4] constructed optimal confidence regions for the two parameters and a confidence interval for one shape parameter (β in (1)) of the Burr type XII distribution using several pivotal quantities under progressive type-II censoring.
Censoring is common in many fields, such as pharmacology, social economics and engineering, especially in reliability and survival analysis. In actual production, it is difficult to observe the sample data completely due to time and cost constraints; therefore, censored data are more practical and efficient. Due to the various conditions of life experiments, different censoring types have been suggested. For type-I censoring, the experimental stop time is a pre-determined time T, rather than the time when all items fail. For type-II censoring, the experimental stop time is the occurrence time of a fixed number of failures. At the stop time, the remaining items are removed and the experiment does not continue. However, neither of them allows items to be removed during the experiment. Therefore, progressive censoring has been proposed as a more applicable scheme. Suppose that n independent identically distributed items are tested in survival analysis, (X_1, X_2, \ldots, X_m) with 1 \le m \le n is the observed sample of failures and (R_1, \ldots, R_{m-1}, R_m) represents the numbers of corresponding items removed from the test. Let x_i represent the specific value of X_i. When the first failure X_1 occurs, R_1 surviving items are randomly removed from the experiment. At the second failure time x_2, R_2 surviving items are randomly removed from the experiment. This action is repeated until the mth failure occurs, at which moment R_m = n - m - \sum_{i=1}^{m-1} R_i. Obviously, the censored sample size is n - m. As a general scheme, progressive type-II censoring contains type-II censoring (R_m = n - m, R_1 = \cdots = R_{m-1} = 0) and the complete sampling case (R_1 = \cdots = R_{m-1} = R_m = 0).
Up to now, many scholars have employed censored data to study the estimation of parameters for different distributions in different experimental situations. For example, Panahi and Sayyareh (2014) [5] employed a type-II censored sample to compute the maximum likelihood estimators via the expectation-maximization algorithm, and obtained the Bayesian estimators for the Burr type XII parameters. Qin and Gui (2020) [6] derived the maximum likelihood and Bayesian estimation of Burr type XII parameters based on the competing risks model, and proved the existence and uniqueness of the maximum likelihood estimators. Elsagheer (2016) [7] adopted maximum likelihood, Bayesian, and bootstrap estimation methods for partially accelerated life experiments with a power hazard function under progressive type-II censoring schemes. Censored data have been widely studied because of their practicability and commonness. However, the use of censored data inevitably causes a loss of information (compared with the complete data), and current research on information entropy with censored data is relatively lacking. Therefore, in consideration of applicability and flexibility, we adopt progressive type-II censored data to estimate information entropy. For more discussion on different censoring schemes, refer to [8,9,10].
Information is an abstract concept. Faced with a large amount of data, we can easily measure the quantity of data, but it is not clear how much information the data contain, that is, whether the data are valuable. Shannon put forward information entropy in 1948. By definition, information entropy is an index describing the uncertainty of information. Censored samples inevitably lead to a loss of information, which in turn affects the estimates of information entropy. Therefore, quantifying the information entropy under different censoring cases is necessary. Additionally, due to the emergence of incomplete samples, the estimation of information entropy has aroused the interest of several authors in recent years. For example, Patra et al. [11] studied the problem of estimating the entropy of two exponential populations with scale and ordered location parameters under several censoring schemes. With the extensive application of the Burr distribution in reliability, biology, economics, energy, meteorology, and other fields, its entropy is also of great help to many scholars. In [12], two entropy-based methods were used to apply the extended Burr XII distribution to six peak flow datasets, and quantiles (discharges) corresponding to different return periods were computed. Besides, in spectrum analysis, entropy and the maximum entropy principle (MEP) have many applications as important subjects of signal processing. According to the MEP, it is helpful to select the probability law with maximum entropy before using a probability law in the inference model. Ali et al. [13] presented this process comprehensively. Therefore, the estimation of information entropy is of practical significance to industrial production and experimental design.
Additionally, information entropy is of great research significance in other fields such as physics, communication, economics and so on. Many researchers have contributed to the study of entropy. Sunoj and Sankaran (2012) [14] introduced a Shannon entropy function based on quantiles and studied the properties of the residual entropy function. AboEleneen (2011) [15] simplified the entropy and derived some recurrence relations under progressively type-II censored samples. Lee [16] employed generalized progressive hybrid censored data to study entropy estimation using ML and Bayesian methods under the inverse Weibull distribution. Zhao et al. [17] researched the empirical entropy method under type-II censored data and compared it with the empirical likelihood method using a simulation. The famous Shannon information entropy is defined as
H(X) = H(f) = -\int_0^{\infty} f(x) \ln f(x) \, dx,
where X is a continuous random variable and its pdf is f ( x ) .
Bayesian estimation is a newer and, in many settings, more practical method than classical estimation. Its basic idea is to combine the prior information with the sample information to obtain the posterior information that we will use, which cannot be accomplished by classical estimation. That means the Bayesian method can utilize prior information besides likelihood information, which can make up for the loss of information caused by censoring to some extent. Based on this idea, we adopted the Bayesian method to derive and calculate the information entropy under different prior distributions. Many scholars have used the Bayesian method to make statistical inferences based on the Burr distribution, most of them concerning parameter estimation. Qin and Gui (2020) [6] obtained Bayes estimators and associated credible intervals under different loss functions using the Markov chain Monte Carlo (MCMC) method with progressive type-II censored data; their inference is based on the competing risks model. Maurya et al. (2017) [18] made use of the Tierney and Kadane (TK) method and an importance sampling procedure to derive Bayes estimators under different loss functions under censoring. To the best of our knowledge, no work has been done on applying the Bayesian method to the entropy estimation of a Burr XII distribution with progressively censored data. Furthermore, in the Bayesian method, we chose an approximation called the Lindley method, which is based on a third-order Taylor expansion. Theoretically, it performs better than the TK method (a second-order expansion).
The remaining structure is as follows: Section 2 investigates progressive type-II censoring and derives the maximum likelihood (ML) estimator for entropy with the progressively censored data. Meanwhile, the corresponding asymptotic confidence intervals (ACIs) of entropy are also given. In Section 3, we obtain the Bayesian estimators and corresponding posterior risks (PRs) against different loss functions with informative and non-informative priors. Further, we utilize the Lindley method and the Markov chain Monte Carlo (MCMC) algorithm to calculate the Bayesian estimates. Additionally, the highest posterior density (HPD) credible intervals of entropy are obtained. In Section 4, the application of the Monte Carlo method in the simulation is described, and we compare the results under different prior distributions and loss functions. A set of real data is analyzed to illustrate the applicability of the above methods. In Section 5 we draw some conclusions.

2. Maximum Likelihood Estimation

Suppose that (X_1, X_2, \ldots, X_m) is the progressive type-II censored sample observed from the test with n items. Every X_i follows the distribution defined by (1) and x_i represents the specific value of X_i. Thus the likelihood function with m progressive type-II censored samples is given by
L(\alpha, \beta \mid \underline{x}) = A \prod_{i=1}^{m} f(x_i) \left[\bar{F}(x_i)\right]^{R_i} = A \alpha^m \beta^m \prod_{i=1}^{m} x_i^{\beta-1} (1 + x_i^{\beta})^{-\alpha(R_i+1)-1},
where x_1 \le x_2 \le \cdots \le x_m, \underline{x} = (x_1, x_2, \ldots, x_m), A is a constant and \bar{F}(x_i) = 1 - F(x_i).
Additionally, the corresponding log-likelihood function is
l(\alpha, \beta) = \ln(L) = \ln(A) + m \ln(\alpha\beta) + \sum_{i=1}^{m} \ln(x_i^{\beta-1}) - \sum_{i=1}^{m} \alpha(1+R_i)\ln(1+x_i^{\beta}) - \sum_{i=1}^{m} \ln(1+x_i^{\beta}).
Then the likelihood equations of β and α can be obtained as
\frac{\partial \ln L(\alpha,\beta)}{\partial \beta} = \frac{m}{\beta} + \sum_{i=1}^{m} \ln(x_i) - \sum_{i=1}^{m} \left[\alpha(1+R_i)+1\right] \frac{x_i^{\beta} \ln(x_i)}{1+x_i^{\beta}} = 0,
and
\frac{\partial \ln L(\alpha,\beta)}{\partial \alpha} = \frac{m}{\alpha} - \sum_{i=1}^{m} (1+R_i) \ln(1+x_i^{\beta}) = 0.
Obviously, the ML estimates of β and α (\hat{\beta} and \hat{\alpha}) are the solutions of (6) and (7), which can be obtained by a numerical method such as the Newton–Raphson method.
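As an illustration, the following minimal sketch (not the authors' code; the data array x, removal counts R and the starting point are hypothetical) solves the likelihood Equations (6) and (7) numerically with a SciPy root finder:

```python
import numpy as np
from scipy.optimize import fsolve

def score(theta, x, R):
    """Left-hand sides of the likelihood equations (6) and (7)."""
    alpha, beta = theta
    m = len(x)
    xb = x ** beta
    d_beta = m / beta + np.sum(np.log(x)) - np.sum(
        (alpha * (1 + R) + 1) * xb * np.log(x) / (1 + xb))
    d_alpha = m / alpha - np.sum((1 + R) * np.log(1 + xb))
    return [d_alpha, d_beta]

x = np.array([0.3, 0.5, 0.9, 1.4, 2.2])   # hypothetical observed failures
R = np.array([1, 0, 0, 1, 2])             # hypothetical removal counts
alpha_hat, beta_hat = fsolve(score, x0=[1.0, 1.0], args=(x, R))
```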
In line with the invariance property of ML estimation, it is not difficult to obtain the estimator for a function of the parameters. If the entropy in (3) under the Burr XII distribution can be simplified into a function of the parameters β and α, the ML estimation of entropy can be carried out.
Theorem 1.
Let X be a random variable with cdf (1); then the entropy of X is
H(X) = H(f) = -\ln(\alpha\beta) - \frac{\beta-1}{\beta}\left[\psi(1) - \psi(\alpha)\right] + \frac{1}{\alpha} + 1,
where \psi is defined by \psi(z) = \frac{d}{dz}\ln\Gamma(z), which is also called the digamma function, and \Gamma is the gamma function.
Proof. 
See Appendix A.  □
On the basis of Theorem 1, the ML estimator for entropy can be obtained as
\hat{H}(f) = -\ln(\hat{\alpha}\hat{\beta}) - \frac{\hat{\beta}-1}{\hat{\beta}}\left[\psi(1) - \psi(\hat{\alpha})\right] + \frac{1}{\hat{\alpha}} + 1.
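Since Theorem 1 gives the entropy in closed form, it can be checked numerically against the integral definition (3). The sketch below is only an illustration (SciPy is assumed, and the parameter values are arbitrary):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import digamma

alpha, beta = 3.0, 2.0          # arbitrary parameter values

def pdf(t):
    # Burr XII density (2)
    return alpha * beta * t ** (beta - 1) * (1 + t ** beta) ** (-(alpha + 1))

H_closed = (-np.log(alpha * beta)
            - (beta - 1) / beta * (digamma(1) - digamma(alpha))
            + 1 / alpha + 1)
H_numeric, _ = quad(lambda t: -pdf(t) * np.log(pdf(t)), 0, np.inf)
print(H_closed, H_numeric)      # both should be about 0.2916 here
```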

Asymptotic Confidence Interval for Entropy

In order to obtain the asymptotic confidence interval (ACI) of information entropy, the observed Fisher information matrix of α and β is given first as
I(\hat{\alpha}, \hat{\beta}) = \begin{pmatrix} -\frac{\partial^2 l(\alpha,\beta)}{\partial \alpha^2} & -\frac{\partial^2 l(\alpha,\beta)}{\partial \alpha \partial \beta} \\ -\frac{\partial^2 l(\alpha,\beta)}{\partial \alpha \partial \beta} & -\frac{\partial^2 l(\alpha,\beta)}{\partial \beta^2} \end{pmatrix}\Bigg|_{\alpha=\hat{\alpha},\, \beta=\hat{\beta}},
where
\frac{\partial^2 l(\alpha,\beta)}{\partial \alpha^2} = -\frac{m}{\alpha^2}, \quad \frac{\partial^2 l(\alpha,\beta)}{\partial \alpha \partial \beta} = -\sum_{i=1}^{m} (1+R_i) \frac{x_i^{\beta} \ln x_i}{1+x_i^{\beta}}, \quad \frac{\partial^2 l(\alpha,\beta)}{\partial \beta^2} = -\frac{m}{\beta^2} - \sum_{i=1}^{m} \left[\alpha(1+R_i)+1\right] \frac{x_i^{\beta} (\ln x_i)^2}{(1+x_i^{\beta})^2}.
I^{-1}(\hat{\alpha}, \hat{\beta}) is the inverse of I(\hat{\alpha}, \hat{\beta}), and also the variance–covariance matrix of β and α. When the sample size is large, the asymptotic distribution of the parameter estimators is
\begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} - \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \xrightarrow{D} N\left(0, I^{-1}(\hat{\alpha}, \hat{\beta})\right),
in which the first and second elements on the diagonal of the matrix I^{-1}(\hat{\alpha}, \hat{\beta}) are the estimated variances of \hat{\alpha} and \hat{\beta}, respectively.
Since we want to obtain the asymptotic confidence interval of the entropy in (8), which is a function of \hat{\beta} and \hat{\alpha}, we use the delta method to calculate the variance of the entropy. The delta method creates a linear approximation of a function, so that the variance of the simpler linear function can be calculated. Note that
C = \left( \frac{\partial H(f)}{\partial \alpha}, \frac{\partial H(f)}{\partial \beta} \right),
\frac{\partial H(f)}{\partial \alpha} = -\frac{1}{\alpha} + \frac{\beta-1}{\beta}\psi'(\alpha) - \frac{1}{\alpha^2}, \quad \frac{\partial H(f)}{\partial \beta} = -\frac{1}{\beta} - \frac{1}{\beta^2}\left[\psi(1) - \psi(\alpha)\right],
where \psi'(\alpha) is the derivative of \psi(\alpha) with respect to α (also called the trigamma function). Then, using the delta method, the variance of the entropy can be given as
Var(\hat{H}(f)) \approx \left[ C \, I^{-1}(\alpha, \beta) \, C^{t} \right] \big|_{\beta=\hat{\beta},\, \alpha=\hat{\alpha}},
where C^{t} is the transpose of C. Then, the 100(1−γ)% ACI of entropy is
\left( \hat{H}(f) - z_{\gamma/2}\sqrt{Var(\hat{H}(f))}, \; \hat{H}(f) + z_{\gamma/2}\sqrt{Var(\hat{H}(f))} \right),
where z_{\gamma/2} is the upper (γ/2)th quantile of the standard normal distribution.
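Putting the pieces together, a minimal sketch of this delta-method ACI might look as follows (alpha, beta are the ML estimates, and x, R the assumed data and censoring scheme; the function name is hypothetical):

```python
import numpy as np
from scipy.special import digamma, polygamma
from scipy.stats import norm

def entropy_aci(alpha, beta, x, R, gamma=0.05):
    m = len(x)
    xb, lx = x ** beta, np.log(x)
    off = np.sum((1 + R) * xb * lx / (1 + xb))
    # observed Fisher information: negatives of the second derivatives above
    I = np.array([
        [m / alpha ** 2, off],
        [off, m / beta ** 2
         + np.sum((alpha * (1 + R) + 1) * xb * lx ** 2 / (1 + xb) ** 2)],
    ])
    # gradient C of the entropy with respect to (alpha, beta)
    C = np.array([
        -1 / alpha + (beta - 1) / beta * polygamma(1, alpha) - 1 / alpha ** 2,
        -1 / beta - (digamma(1) - digamma(alpha)) / beta ** 2,
    ])
    var = C @ np.linalg.inv(I) @ C          # delta-method variance
    H = (-np.log(alpha * beta)
         - (beta - 1) / beta * (digamma(1) - digamma(alpha))
         + 1 / alpha + 1)
    z = norm.ppf(1 - gamma / 2)
    return H - z * np.sqrt(var), H + z * np.sqrt(var)
```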

3. Bayes Estimation

Bayesian estimation is a more practical method compared to classical estimation methods, and has attracted much interest from researchers in recent years. Many authors have adopted the Bayesian method to estimate parameters and related functions for different distributions. Fu et al. (2012) [19] considered Bayesian estimation for the Pareto distribution under three non-informative priors. Musleh and Helu (2014) [20] used ML, least squares, approximate ML and Bayesian estimation to make statistical inferences about the unknown parameters of inverse Weibull distributions using progressive type-II censoring schemes. Rastogi and Merovci (2017) [21] obtained Bayesian estimators of the parameters of the Weibull–Rayleigh distribution under asymmetric and symmetric loss functions. Singh and Tripathi [22] investigated Bayesian estimation of the unknown parameters of an inverse Weibull distribution based on progressive type-I interval censored data, and obtained the optimal censoring scheme. For more studies on Bayesian estimation, refer to [23,24,25,26].
In this paper, we use Bayesian estimation to obtain the estimates of entropy. The parameters β and α in (9) are treated as random variables which follow the prior distributions. The parameters of the prior distribution are called hyper-parameters, to distinguish them from the parameters β and α. The prior information combined with the likelihood function can be used to derive the posterior information. By comparing the posterior risks under different prior distributions and loss functions, the relative validity of the estimation can be evaluated.

3.1. Prior Distribution and Corresponding Posterior Distribution

In order to observe the influences of priors on the Bayesian estimators, we adopt an informative prior and a non-informative prior, respectively. For the informative prior, the gamma distribution is adopted for its flexibility. Moreover, in the setting of this paper, the gamma distribution is a conjugate prior with respect to the parameter α, which facilitates the implementation of the sampling algorithm later. Assume that α and β are independent random variables following Γ(a, b) and Γ(c, d), respectively. Therefore, their joint prior distribution has the form
\pi_1(\alpha, \beta) \propto \alpha^{a-1} \beta^{c-1} e^{-b\alpha - d\beta}, \quad a > 0, \ b > 0, \ c > 0, \ d > 0.
Then the posterior distribution of β and α is (using (4) and (15))
\pi_1(\alpha, \beta \mid \underline{x}) = \frac{1}{g_1} \alpha^{a+m-1} \beta^{c+m-1} e^{-b\alpha - d\beta} \prod_{i=1}^{m} x_i^{\beta-1} (1 + x_i^{\beta})^{-\alpha(R_i+1)-1},
where \underline{x} = (x_1, x_2, x_3, \ldots, x_m) and
g_1 = \int_0^{\infty}\int_0^{\infty} \alpha^{a+m-1} \beta^{c+m-1} e^{-b\alpha - d\beta} \prod_{i=1}^{m} x_i^{\beta-1} (1 + x_i^{\beta})^{-\alpha(R_i+1)-1} \, d\alpha \, d\beta.
When the prior distributions of β and α are both taken as non-informative priors, their joint prior distribution has the form
\pi_2(\alpha, \beta) = \frac{1}{\alpha\beta}.
Thus the posterior distribution of β and α is (using (4) and (17))
\pi_2(\alpha, \beta \mid \underline{x}) = \frac{1}{g_2} \alpha^{m-1} \beta^{m-1} \prod_{i=1}^{m} x_i^{\beta-1} (1 + x_i^{\beta})^{-\alpha(R_i+1)-1},
where
g_2 = \int_0^{\infty}\int_0^{\infty} \alpha^{m-1} \beta^{m-1} \prod_{i=1}^{m} x_i^{\beta-1} (1 + x_i^{\beta})^{-\alpha(R_i+1)-1} \, d\alpha \, d\beta.

3.2. Loss Function

In Bayesian estimation, the loss function is used to evaluate the performance of estimators, and a penalty (the posterior risk) is assigned to each estimator. In fact, the loss function measures the gap between the estimates and the true values. Therefore, with the aim of minimizing the posterior risk (PR), we give the Bayesian estimators of H and the corresponding posterior risks under different loss functions in Table 1. Different situations require different loss functions. In order to make the Bayesian inference more comprehensive, we adopt both asymmetric and symmetric loss functions. Besides the squared error loss function (SELF), the weighted squared error loss function (WSELF), the precautionary loss function (PLF) and the K-loss function (KLF) are also included (see [27,28]). The Bayesian estimators of entropy under the different loss functions can be expressed as:
\hat{H}(f)_S = E(H(f) \mid \underline{x}) = \int_0^{\infty}\int_0^{\infty} H(f) \, \pi(\alpha,\beta \mid \underline{x}) \, d\alpha \, d\beta,
\hat{H}(f)_W = \left[E(H^{-1}(f) \mid \underline{x})\right]^{-1} = \left( \int_0^{\infty}\int_0^{\infty} H^{-1}(f) \, \pi(\alpha,\beta \mid \underline{x}) \, d\alpha \, d\beta \right)^{-1},
\hat{H}(f)_K = \sqrt{\frac{E(H(f) \mid \underline{x})}{E(H^{-1}(f) \mid \underline{x})}} = \left[ \frac{\int_0^{\infty}\int_0^{\infty} H(f) \, \pi(\alpha,\beta \mid \underline{x}) \, d\alpha \, d\beta}{\int_0^{\infty}\int_0^{\infty} H^{-1}(f) \, \pi(\alpha,\beta \mid \underline{x}) \, d\alpha \, d\beta} \right]^{1/2},
\hat{H}(f)_P = \sqrt{E(H^2(f) \mid \underline{x})} = \left[ \int_0^{\infty}\int_0^{\infty} H^2(f) \, \pi(\alpha,\beta \mid \underline{x}) \, d\alpha \, d\beta \right]^{1/2},
where \hat{H}(f) represents an estimator of the entropy H(f), and the subscript indicates the loss function. For example, \hat{H}(f)_S stands for the Bayesian estimator of entropy against SELF.
In order to calculate the Bayesian estimators in (19), we first need to obtain E(H(f) \mid \underline{x}), E(H^{-1}(f) \mid \underline{x}) and E(H^2(f) \mid \underline{x}). However, one can observe that these entropy estimators take the form of ratios of two integrals, and it is not easy to simplify them into closed forms. Hence, we consider two different methods, the Lindley approximation and the MCMC algorithm, to solve this problem.

3.3. Lindley Approximation

In this subsection, we employ an approximation method proposed by Lindley in [29] to achieve the numerical calculation of the entropy estimators. Referring to Lindley's method, we can define I(\underline{x}) as
I(\underline{x}) = \frac{\int_0^{\infty}\int_0^{\infty} u(\alpha,\beta) \, e^{\rho(\alpha,\beta) + l(\alpha,\beta \mid \underline{x})} \, d\alpha \, d\beta}{\int_0^{\infty}\int_0^{\infty} e^{\rho(\alpha,\beta) + l(\alpha,\beta \mid \underline{x})} \, d\alpha \, d\beta},
where u(\alpha,\beta) is a function of β and α, l(\alpha,\beta) is the log-likelihood (defined by (5)), and \rho(\alpha,\beta) = \rho_i(\alpha,\beta) = \ln\pi_i(\alpha,\beta), i = 1, 2, is the logarithm of the prior (defined by (15) and (17)). Further, the Lindley method approximates Formula (20) as
I(\underline{x}) \approx u(\hat{\alpha},\hat{\beta}) + \frac{1}{2}\left[ \hat{\sigma}_{\alpha\alpha}(2\hat{u}_{\alpha}\hat{\rho}_{\alpha} + \hat{u}_{\alpha\alpha}) + \hat{\sigma}_{\beta\alpha}(2\hat{u}_{\beta}\hat{\rho}_{\alpha} + \hat{u}_{\beta\alpha}) + \hat{\sigma}_{\beta\beta}(\hat{u}_{\beta\beta} + 2\hat{u}_{\beta}\hat{\rho}_{\beta}) \right] + \frac{1}{2}\left[ (\hat{l}_{\alpha\alpha\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{l}_{\beta\alpha\alpha}\hat{\sigma}_{\beta\alpha} + \hat{l}_{\alpha\beta\alpha}\hat{\sigma}_{\alpha\beta} + \hat{l}_{\beta\beta\alpha}\hat{\sigma}_{\beta\beta})(\hat{u}_{\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{u}_{\beta}\hat{\sigma}_{\alpha\beta}) + (\hat{l}_{\beta\alpha\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{l}_{\beta\alpha\beta}\hat{\sigma}_{\beta\alpha} + \hat{l}_{\alpha\beta\beta}\hat{\sigma}_{\alpha\beta} + \hat{l}_{\beta\beta\beta}\hat{\sigma}_{\beta\beta})(\hat{u}_{\alpha}\hat{\sigma}_{\beta\alpha} + \hat{u}_{\beta}\hat{\sigma}_{\beta\beta}) \right],
where \hat{\alpha} and \hat{\beta} are the ML estimators of α and β. A symbol with a hat represents an estimator, and a subscript indicates a derivative with respect to that variable. For instance, the second derivative of u(\alpha,\beta) with respect to α is expressed as u_{\alpha\alpha}. Similarly, the others are expressed as follows:
\hat{l}_{\alpha\alpha} = \frac{\partial^2 l}{\partial \alpha^2}\Big|_{\alpha=\hat{\alpha}} = -\frac{m}{\hat{\alpha}^2}, \quad \hat{l}_{\beta\beta} = \frac{\partial^2 l}{\partial \beta^2}\Big|_{\beta=\hat{\beta}} = -\frac{m}{\hat{\beta}^2} - \sum_{i=1}^{m} \left[\hat{\alpha}(R_i+1)+1\right] \frac{x_i^{\hat{\beta}}(\ln x_i)^2}{(1+x_i^{\hat{\beta}})^2},
\hat{l}_{\alpha\beta} = \frac{\partial^2 l}{\partial \alpha \partial \beta}\Big|_{\beta=\hat{\beta},\alpha=\hat{\alpha}} = -\sum_{i=1}^{m} (R_i+1) \frac{x_i^{\hat{\beta}} \ln x_i}{1+x_i^{\hat{\beta}}} = \hat{l}_{\beta\alpha},
\hat{l}_{\alpha\alpha\alpha} = \frac{\partial^3 l}{\partial \alpha^3}\Big|_{\alpha=\hat{\alpha}} = \frac{2m}{\hat{\alpha}^3}, \quad \hat{l}_{\beta\beta\alpha} = \frac{\partial^3 l}{\partial \beta^2 \partial \alpha}\Big|_{\beta=\hat{\beta},\alpha=\hat{\alpha}} = -\sum_{i=1}^{m} (R_i+1) \frac{x_i^{\hat{\beta}}(\ln x_i)^2}{(1+x_i^{\hat{\beta}})^2} = \hat{l}_{\beta\alpha\beta},
\hat{l}_{\beta\alpha\alpha} = \frac{\partial^3 l}{\partial \beta \partial \alpha^2}\Big|_{\beta=\hat{\beta},\alpha=\hat{\alpha}} = 0 = \hat{l}_{\alpha\beta\alpha},
\hat{l}_{\beta\beta\beta} = \frac{\partial^3 l}{\partial \beta^3}\Big|_{\beta=\hat{\beta}} = \frac{2m}{\hat{\beta}^3} - \sum_{i=1}^{m} \left[\hat{\alpha}(R_i+1)+1\right] \frac{(\ln x_i)^3 x_i^{\hat{\beta}} (1 - x_i^{\hat{\beta}})}{(1+x_i^{\hat{\beta}})^3},
\hat{\rho}_{1\alpha} = \frac{a-1}{\hat{\alpha}} - b, \quad \hat{\rho}_{1\beta} = \frac{c-1}{\hat{\beta}} - d, \quad \hat{\rho}_{2\alpha} = -\frac{1}{\hat{\alpha}}, \quad \hat{\rho}_{2\beta} = -\frac{1}{\hat{\beta}},
and \hat{\sigma}_{i,j} represents the (i,j)th element of the matrix \left[-\hat{l}_{ij}\right]^{-1} (the inverse of the negative second-derivative matrix), i, j = 1, 2.
By using the above expression, the Bayesian estimator of entropy in (19) can be further expressed in a more specific form.
  • The Bayesian estimator of entropy in (19) under SELF.
    When u(\alpha,\beta) = H(f) = -\ln(\alpha\beta) - \left(\frac{\beta-1}{\beta}\right)\left[\psi(1)-\psi(\alpha)\right] + \frac{1}{\alpha} + 1,
    u_{\alpha} = -\frac{1}{\alpha} + \left(\frac{\beta-1}{\beta}\right)\psi'(\alpha) - \frac{1}{\alpha^2}, \quad u_{\beta} = -\frac{1}{\beta} - \frac{1}{\beta^2}\left[\psi(1)-\psi(\alpha)\right], \quad u_{\alpha\beta} = u_{\beta\alpha} = \frac{1}{\beta^2}\psi'(\alpha), \quad u_{\alpha\alpha} = \frac{1}{\alpha^2} + \left(\frac{\beta-1}{\beta}\right)\psi''(\alpha) + \frac{2}{\alpha^3}, \quad u_{\beta\beta} = \frac{1}{\beta^2} + \frac{2}{\beta^3}\left[\psi(1)-\psi(\alpha)\right],
    the Bayesian estimator can be expressed as
    \hat{H}(f)_S = E(H(f) \mid \underline{x}) \approx u(\hat{\alpha},\hat{\beta}) + \frac{1}{2}\Big[ \hat{\sigma}_{\alpha\alpha}(2\hat{u}_{\alpha}\hat{\rho}_{\alpha} + \hat{u}_{\alpha\alpha}) + \hat{\sigma}_{\beta\alpha}(2\hat{u}_{\beta}\hat{\rho}_{\alpha} + \hat{u}_{\beta\alpha}) + \hat{\sigma}_{\beta\beta}(\hat{u}_{\beta\beta} + 2\hat{u}_{\beta}\hat{\rho}_{\beta}) + (\hat{l}_{\alpha\alpha\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{l}_{\alpha\beta\beta}\hat{\sigma}_{\beta\beta})(\hat{u}_{\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{u}_{\beta}\hat{\sigma}_{\alpha\beta}) + (2\hat{l}_{\alpha\beta\beta}\hat{\sigma}_{\alpha\beta} + \hat{l}_{\beta\beta\beta}\hat{\sigma}_{\beta\beta})(\hat{u}_{\alpha}\hat{\sigma}_{\beta\alpha} + \hat{u}_{\beta}\hat{\sigma}_{\beta\beta}) \Big].
  • The Bayesian estimator of entropy in (19) under WSELF.
    When u(\alpha,\beta) = H(f)^{-1} = \left[ -\ln(\alpha\beta) - \left(\frac{\beta-1}{\beta}\right)\left[\psi(1)-\psi(\alpha)\right] + \frac{1}{\alpha} + 1 \right]^{-1},
    u_{\alpha} = H(f)^{-2}\left[\frac{1}{\alpha} - \left(\frac{\beta-1}{\beta}\right)\psi'(\alpha) + \frac{1}{\alpha^2}\right], \quad u_{\beta} = H(f)^{-2}\left[\frac{1}{\beta} + \frac{1}{\beta^2}\left(\psi(1)-\psi(\alpha)\right)\right], \quad u_{\alpha\beta} = u_{\beta\alpha} = 2H(f)^{-3}\frac{1}{\beta^2}\psi'(\alpha), \quad u_{\alpha\alpha} = 2H(f)^{-3}\left[\frac{1}{\alpha^2} + \left(\frac{\beta-1}{\beta}\right)\psi''(\alpha) + \frac{2}{\alpha^3}\right], \quad u_{\beta\beta} = 2H(f)^{-3}\left[\frac{1}{\beta^2} + \frac{2}{\beta^3}\left(\psi(1)-\psi(\alpha)\right)\right],
    the Bayesian estimator can be expressed as
    \hat{H}(f)_W = \left[E(H^{-1}(f) \mid \underline{x})\right]^{-1} \approx \Big[ u(\hat{\alpha},\hat{\beta}) + \frac{1}{2}\big( \hat{\sigma}_{\alpha\alpha}(2\hat{u}_{\alpha}\hat{\rho}_{\alpha} + \hat{u}_{\alpha\alpha}) + \hat{\sigma}_{\beta\alpha}(2\hat{u}_{\beta}\hat{\rho}_{\alpha} + \hat{u}_{\beta\alpha}) + \hat{\sigma}_{\beta\beta}(\hat{u}_{\beta\beta} + 2\hat{u}_{\beta}\hat{\rho}_{\beta}) + (\hat{l}_{\alpha\alpha\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{l}_{\alpha\beta\beta}\hat{\sigma}_{\beta\beta})(\hat{u}_{\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{u}_{\beta}\hat{\sigma}_{\alpha\beta}) + (2\hat{l}_{\alpha\beta\beta}\hat{\sigma}_{\alpha\beta} + \hat{l}_{\beta\beta\beta}\hat{\sigma}_{\beta\beta})(\hat{u}_{\alpha}\hat{\sigma}_{\beta\alpha} + \hat{u}_{\beta}\hat{\sigma}_{\beta\beta}) \big) \Big]^{-1}.
  • The Bayesian estimator of entropy in (19) under PLF.
    When u(\alpha,\beta) = H(f)^{2} = \left[ -\ln(\alpha\beta) - \left(\frac{\beta-1}{\beta}\right)\left[\psi(1)-\psi(\alpha)\right] + \frac{1}{\alpha} + 1 \right]^{2},
    u_{\alpha} = 2H(f)\left[\left(\frac{\beta-1}{\beta}\right)\psi'(\alpha) - \frac{1}{\alpha} - \frac{1}{\alpha^2}\right], \quad u_{\beta} = 2H(f)\left[-\frac{1}{\beta} - \frac{1}{\beta^2}\left(\psi(1)-\psi(\alpha)\right)\right], \quad u_{\alpha\beta} = u_{\beta\alpha} = 2H(f)\frac{1}{\beta^2}\psi'(\alpha), \quad u_{\alpha\alpha} = 2H(f)\left[\frac{1}{\alpha^2} + \left(\frac{\beta-1}{\beta}\right)\psi''(\alpha) + \frac{2}{\alpha^3}\right], \quad u_{\beta\beta} = 2H(f)\left[\frac{1}{\beta^2} + \frac{2}{\beta^3}\left(\psi(1)-\psi(\alpha)\right)\right],
    the Bayesian estimator can be expressed as
    \hat{H}(f)_P = \sqrt{E(H^{2}(f) \mid \underline{x})} \approx \Big[ u(\hat{\alpha},\hat{\beta}) + \frac{1}{2}\big( \hat{\sigma}_{\alpha\alpha}(2\hat{u}_{\alpha}\hat{\rho}_{\alpha} + \hat{u}_{\alpha\alpha}) + \hat{\sigma}_{\beta\alpha}(2\hat{u}_{\beta}\hat{\rho}_{\alpha} + \hat{u}_{\beta\alpha}) + \hat{\sigma}_{\beta\beta}(\hat{u}_{\beta\beta} + 2\hat{u}_{\beta}\hat{\rho}_{\beta}) + (\hat{l}_{\alpha\alpha\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{l}_{\alpha\beta\beta}\hat{\sigma}_{\beta\beta})(\hat{u}_{\alpha}\hat{\sigma}_{\alpha\alpha} + \hat{u}_{\beta}\hat{\sigma}_{\alpha\beta}) + (2\hat{l}_{\alpha\beta\beta}\hat{\sigma}_{\alpha\beta} + \hat{l}_{\beta\beta\beta}\hat{\sigma}_{\beta\beta})(\hat{u}_{\alpha}\hat{\sigma}_{\beta\alpha} + \hat{u}_{\beta}\hat{\sigma}_{\beta\beta}) \big) \Big]^{1/2}.
  • The Bayesian estimator of entropy in (19) under KLF. In light of the Bayesian estimators under SELF and WSELF, when u(\alpha,\beta) is taken as H(f) and H(f)^{-1}, E(H \mid \underline{x}) and E(H^{-1} \mid \underline{x}) can be obtained respectively. Accordingly,
    \hat{H}(f)_K = \sqrt{\frac{E(H(f) \mid \underline{x})}{E(H^{-1}(f) \mid \underline{x})}}
    can also be calculated.
Obviously, approximate values of the entropy estimators in (19) can be calculated using the Lindley method. The corresponding posterior risk values for the different loss functions in Table 1 can then also be obtained.
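For concreteness, a sketch of the Lindley computation under SELF with the gamma prior is given below; it simply codes the derivative expressions above, with alpha, beta taken to be the ML estimates and a, b, c, d the assumed hyperparameters (the function name is illustrative, not from the paper):

```python
import numpy as np
from scipy.special import digamma, polygamma

def lindley_self(alpha, beta, x, R, a, b, c, d):
    m, xb, lx = len(x), x ** beta, np.log(x)
    # second- and third-order log-likelihood derivatives at the MLE
    l_aa = -m / alpha ** 2
    l_ab = -np.sum((1 + R) * xb * lx / (1 + xb))
    l_bb = -m / beta ** 2 - np.sum(
        (alpha * (1 + R) + 1) * xb * lx ** 2 / (1 + xb) ** 2)
    l_aaa = 2 * m / alpha ** 3
    l_bba = -np.sum((1 + R) * xb * lx ** 2 / (1 + xb) ** 2)
    l_bbb = 2 * m / beta ** 3 - np.sum(
        (alpha * (1 + R) + 1) * lx ** 3 * xb * (1 - xb) / (1 + xb) ** 3)
    sig = np.linalg.inv(-np.array([[l_aa, l_ab], [l_ab, l_bb]]))
    s_aa, s_ab, s_bb = sig[0, 0], sig[0, 1], sig[1, 1]
    r_a, r_b = (a - 1) / alpha - b, (c - 1) / beta - d  # gamma-prior rho terms
    # u = H(f) and its derivatives under SELF
    dpsi = digamma(1) - digamma(alpha)
    tri, tetra = polygamma(1, alpha), polygamma(2, alpha)
    u = -np.log(alpha * beta) - (beta - 1) / beta * dpsi + 1 / alpha + 1
    u_a = -1 / alpha + (beta - 1) / beta * tri - 1 / alpha ** 2
    u_b = -1 / beta - dpsi / beta ** 2
    u_ab = tri / beta ** 2
    u_aa = 1 / alpha ** 2 + (beta - 1) / beta * tetra + 2 / alpha ** 3
    u_bb = 1 / beta ** 2 + 2 * dpsi / beta ** 3
    return (u + 0.5 * (s_aa * (2 * u_a * r_a + u_aa)
                       + s_ab * (2 * u_b * r_a + u_ab)
                       + s_bb * (u_bb + 2 * u_b * r_b))
              + 0.5 * ((l_aaa * s_aa + l_bba * s_bb) * (u_a * s_aa + u_b * s_ab)
                       + (2 * l_bba * s_ab + l_bbb * s_bb) * (u_a * s_ab + u_b * s_bb)))
```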

3.4. MCMC Method with Gibbs Sampling

Although it is easy to calculate Bayesian estimates by the Lindley method, the method cannot provide interval estimates. In consideration of this, we adopted the MCMC method to compute the Bayesian estimates of entropy and obtain the corresponding HPD credible intervals. As a special case of the MCMC method, Gibbs sampling requires the full conditional distribution of each parameter in order to generate a set of Markov chain samples. Taking the gamma prior as an example, the conditional posterior densities of β and α can be obtained respectively as
\pi_1(\alpha \mid \beta, \underline{x}) \propto \alpha^{a+m-1} e^{-\alpha\left[b + \sum_{i=1}^{m}(R_i+1)\ln(1+x_i^{\beta})\right]},
\pi_1(\beta \mid \alpha, \underline{x}) \propto \beta^{c+m-1} e^{-d\beta} \prod_{i=1}^{m} x_i^{\beta-1} (1+x_i^{\beta})^{-\alpha(R_i+1)-1}.
Since a + m > 0 and b + \sum_{i=1}^{m}(R_i+1)\ln(1+x_i^{\beta}) > 0, it can be seen that \pi_1(\alpha \mid \beta, \underline{x}) is a gamma distribution. However, \pi_1(\beta \mid \alpha, \underline{x}) is not a well-known distribution, so we adopt the Metropolis–Hastings method with a normal proposal distribution to generate the sample \beta^{(k)}. Thus, the process of generating Markov chain samples can be described as in Algorithm 1.
Algorithm 1 Gibbs sampling.
1: Set the initial value (\alpha^{(0)}, \beta^{(0)}) and k = 1.
2: Generate \alpha^{(k)} from \Gamma\left(a + m, \; b + \sum_{i=1}^{m}(R_i+1)\ln(1+x_i^{\beta^{(k-1)}})\right).
3: Generate \beta^{(k)} from \pi_1(\beta \mid \alpha^{(k)}, \underline{x}) with the N\left(\beta^{(k-1)}, Var(\hat{\beta})\right) proposal distribution.
4: Set k = k + 1.
5: Repeat steps 2–4 K times to obtain (\alpha^{(k)}, \beta^{(k)}), k = 1, 2, \ldots, K.
6: Substitute (\alpha^{(k)}, \beta^{(k)}) into (9) to compute H_{(k)}(f), H_{(k)}^{-1}(f) and H_{(k)}^{2}(f) for k = 1, 2, \ldots, K.
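A minimal Python sketch of Algorithm 1 follows (the data x, removal counts R, hyperparameters a, b, c, d and a proposal standard deviation prop_sd, e.g., the estimated standard deviation of \hat{\beta}, are assumed inputs; all names are illustrative). Working with the log of the conditional density in the acceptance step avoids numerical underflow:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_cond_beta(beta, alpha, x, R, c, d):
    """Log of the conditional posterior of beta above, up to a constant."""
    if beta <= 0:
        return -np.inf
    return ((c + len(x) - 1) * np.log(beta) - d * beta
            + np.sum((beta - 1) * np.log(x)
                     - (alpha * (1 + R) + 1) * np.log(1 + x ** beta)))

def gibbs(x, R, a, b, c, d, K=10000, prop_sd=0.1):
    m = len(x)
    alpha, beta = 1.0, 1.0          # step 1: initial values
    chain = np.empty((K, 2))
    for k in range(K):
        # step 2: alpha | beta ~ Gamma(a + m, rate = b + sum((1+R)ln(1+x^beta)))
        rate = b + np.sum((1 + R) * np.log(1 + x ** beta))
        alpha = rng.gamma(a + m, 1.0 / rate)
        # step 3: normal-proposal Metropolis-Hastings update of beta
        prop = rng.normal(beta, prop_sd)
        if np.log(rng.uniform()) < (log_cond_beta(prop, alpha, x, R, c, d)
                                    - log_cond_beta(beta, alpha, x, R, c, d)):
            beta = prop
        chain[k] = alpha, beta      # steps 4-5: iterate and record
    return chain
```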
Based on SELF, WSELF, KLF and PLF, the Bayesian estimates in (19) can be obtained as
\hat{H}(f)_S = \frac{\sum_{k=M+1}^{K} H_{(k)}(f)}{K-M}, \quad \hat{H}(f)_W = \left[ \frac{\sum_{k=M+1}^{K} H_{(k)}^{-1}(f)}{K-M} \right]^{-1}, \quad \hat{H}(f)_K = \left[ \frac{\sum_{k=M+1}^{K} H_{(k)}(f)}{\sum_{k=M+1}^{K} H_{(k)}^{-1}(f)} \right]^{1/2}, \quad \hat{H}(f)_P = \left[ \frac{\sum_{k=M+1}^{K} H_{(k)}^{2}(f)}{K-M} \right]^{1/2}.
In the Markov chain algorithm, in order to avoid the influence of the initial value, a number of initial iteration results are discarded. This part is called the burn-in period, and the number of discarded iterations is denoted by M. Similarly, the corresponding posterior risks in Table 1 can also be calculated.
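Given the post burn-in draws (for instance the chain array from the Gibbs sketch above, which is assumed here), the four estimates reduce to a few vectorized lines; note the square roots assume the entropy draws are positive, which holds for the parameter ranges considered in this paper:

```python
import numpy as np
from scipy.special import digamma

def entropy_of(alpha, beta):
    # closed-form entropy (9), vectorized over the draws
    return (-np.log(alpha * beta)
            - (beta - 1) / beta * (digamma(1) - digamma(alpha))
            + 1 / alpha + 1)

M = 1000                                      # burn-in length
H = entropy_of(chain[M:, 0], chain[M:, 1])    # H_(k)(f) for k > M
H_self = H.mean()                             # SELF
H_wself = 1.0 / np.mean(1.0 / H)              # WSELF
H_klf = np.sqrt(H.mean() / np.mean(1.0 / H))  # KLF
H_plf = np.sqrt(np.mean(H ** 2))              # PLF
```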
In the interest of obtaining the HPD intervals, arrange H_{(M+1)}(f), H_{(M+2)}(f), \ldots, H_{(K)}(f) in ascending order to get H^{*}_{(1)}(f) < H^{*}_{(2)}(f) < \cdots < H^{*}_{(K-M)}(f). Thus, several 100(1-\gamma)\% credible intervals of H(f) can be obtained as
\left( H^{*}_{(1)}(f), H^{*}_{([(K-M)(1-\gamma)])}(f) \right), \ldots, \left( H^{*}_{([(K-M)\gamma])}(f), H^{*}_{(K-M)}(f) \right),
where [(K-M)(1-\gamma)] denotes the largest integer less than or equal to (K-M)(1-\gamma). The HPD credible interval is the interval with the shortest length among those in (29).
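A sketch of this shortest-interval search, assuming the post burn-in entropy draws are given, is:

```python
import numpy as np

def hpd_interval(draws, gamma=0.05):
    s = np.sort(draws)
    n = len(s)
    w = int(np.floor(n * (1 - gamma)))   # number of draws each interval spans
    lengths = s[w:] - s[:n - w]          # lengths of all candidate intervals
    i = int(np.argmin(lengths))          # the shortest one is the HPD interval
    return s[i], s[i + w]
```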

4. Simulation Study and Real Data Analysis

4.1. Monte Carlo Simulation

In this subsection, the Monte Carlo simulation method is used to further calculate the entropy estimators obtained by the different methods. In order to demonstrate the performance of the proposed methods more comprehensively, we chose different sample sizes and progressive censoring schemes. That is,
  • Scheme 1: R_m = n - m, R_1 = R_2 = \cdots = R_{m-1} = 0.
  • Scheme 2: R_1 = n - m, R_2 = R_3 = \cdots = R_m = 0.
  • Scheme 3: R_{(m+1)/2} = n - m and R_i = 0 for i \neq (m+1)/2, if m is odd; R_{m/2} = n - m and R_i = 0 for i \neq m/2, if m is even.
  • Scheme 4: R_1 = R_2 = \cdots = R_{2m-n} = 0, R_{2m-n+1} = \cdots = R_m = 1 (assuming that the censoring ratio is less than 50%).
The performance of the point and interval estimates is evaluated by several quantities. In order to compare the performance of the ML and Bayesian estimates of entropy, the average absolute bias (AB) and mean square error (MSE) are calculated by \frac{1}{N}\sum_{j=1}^{N} |\hat{H}(f) - H(f)| and \frac{1}{N}\sum_{j=1}^{N} (\hat{H}(f) - H(f))^2, respectively, with N repetitions. To show the performance of the ACIs and HPD intervals, the average width (AW) and coverage probability (CV) of the 100(1-\gamma)\% confidence/credible intervals of \hat{H}(f) were calculated. In particular, on account of the various loss functions, the posterior risks (in Table 1) were used to evaluate the performance of the Bayesian estimates. To show the results more visually, Figure 1 was drawn with censoring ratio 50% and Scheme 2, where the horizontal axis gives different values of n, and the vertical axis gives the ABs under the different methods.
Given the censoring scheme (R_1, R_2, \ldots, R_m) and the values of n and m, the simulation steps are shown in Algorithm 2.
Algorithm 2 Monte Carlo simulation.
1: Generate m i.i.d. observations U_1, U_2, \ldots, U_m from U(0, 1).
2: For the given values R_1, R_2, \ldots, R_m of the corresponding censoring scheme, set Z_i = U_i^{1/(i + \sum_{j=m-i+1}^{m} R_j)}, i = 1, 2, 3, \ldots, m.
3: Set V_i = 1 - Z_m Z_{m-1} \cdots Z_{m-i+1}, for i = 1, 2, 3, \ldots, m. Hence V_1, V_2, \ldots, V_m is a progressive type-II censored sample of size m from U(0, 1).
4: For given α and β, set x_i = F^{-1}(V_i) = \left[(1 - V_i)^{-1/\alpha} - 1\right]^{1/\beta}, i = 1, 2, 3, \ldots, m. Finally, x_1, x_2, \ldots, x_m is the desired progressive type-II censored sample of size m from the Burr XII(α, β) distribution in (1).
5: Calculate the ML estimates of α and β using (6) and (7), and H(f) in (9). The ACI of \hat{H}(f) can also be calculated.
6: Choose the hyperparameters a, b, c, d. For the different priors and loss functions (in Table 1), use the Lindley method and the Gibbs samples in Section 3.4 to calculate the estimates of the entropy in (19). The PRs and HPD intervals can also be obtained.
7: Repeat steps 1–6 above N times and calculate the MSEs, ABs, AWs, CVs and PRs. This yields Table 2, Table 3, Table 4 and Table 5. In the following tables, Lindley0 indicates the Lindley method with the non-informative prior; accordingly, Lindley1 denotes the Lindley method with the informative (gamma) prior. The symbols MCMC0 and MCMC1 are used similarly.
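Steps 1–4 of Algorithm 2 (the standard uniform-transformation construction of a progressive type-II sample) can be sketched as follows; the parameter values and the scheme R are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def progressive_burr_sample(alpha, beta, R):
    m = len(R)
    U = rng.uniform(size=m)                           # step 1
    # step 2: Z_i = U_i^(1/(i + R_m + ... + R_{m-i+1}))
    Z = U ** (1.0 / (np.arange(1, m + 1) + np.cumsum(R[::-1])))
    # step 3: V_i = 1 - Z_m Z_{m-1} ... Z_{m-i+1}
    V = 1.0 - np.cumprod(Z[::-1])
    # step 4: invert the Burr XII cdf (1) at V
    return ((1.0 - V) ** (-1.0 / alpha) - 1.0) ** (1.0 / beta)

x = progressive_burr_sample(3.0, 2.0, R=np.array([0, 0, 0, 0, 5]))
```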
Table 2. MSEs and ABs (in brackets) of ML and Bayesian estimates of entropy for the Burr XII distribution with α = 3 and β = 2 under different censoring schemes (CS). The hyperparameters for the gamma prior are a = 3, b = 1, c = 2 and d = 1.
n | m | CS | MLE | Lindley0 | Lindley1 | MCMC0 | MCMC1
30 | 15 | 1 | 0.1609(0.3171) | 0.1556(0.3122) | 4.6443(0.3997) | 0.1547(0.3114) | 0.0492(0.1763)
 | | 2 | 0.0614(0.1954) | 0.0594(0.1915) | 0.0283(0.1345) | 0.0595(0.1914) | 0.0353(0.1490)
 | | 3 | 0.0899(0.2337) | 0.0812(0.2226) | 0.0307(0.1226) | 0.0811(0.2230) | 0.0374(0.1552)
 | | 4 | 0.0913(0.2394) | 0.0883(0.2351) | 0.0803(0.1383) | 0.0892(0.2357) | 0.0390(0.1588)
 | 21 | 1 | 0.0620(0.1991) | 0.0601(0.1983) | 0.0244(0.1294) | 0.0599(0.1976) | 0.0329(0.1496)
 | | 2 | 0.0359(0.1457) | 0.0354(0.1458) | 0.0224(0.1182) | 0.0355(0.1459) | 0.0247(0.1236)
 | | 3 | 0.0435(0.1632) | 0.0418(0.1610) | 0.0241(0.1233) | 0.0420(0.1606) | 0.0276(0.1319)
 | | 4 | 0.0613(0.1981) | 0.0590(0.1954) | 0.0260(0.1324) | 0.0590(0.1954) | 0.0341(0.1506)
40 | 20 | 1 | 0.1022(0.2490) | 0.0971(0.2442) | 0.5050(0.1608) | 0.0973(0.2443) | 0.0367(0.1545)
 | | 2 | 0.0429(0.1668) | 0.0413(0.1643) | 0.0253(0.1304) | 0.0414(0.1647) | 0.0280(0.1364)
 | | 3 | 0.0631(0.1977) | 0.0602(0.1950) | 0.0272(0.1338) | 0.0605(0.1954) | 0.0350(0.1511)
 | | 4 | 0.0724(0.2168) | 0.0686(0.2121) | 0.0259(0.1303) | 0.0691(0.2126) | 0.0363(0.1565)
 | 28 | 1 | 0.0410(0.1598) | 0.0401(0.1582) | 0.0221(0.1202) | 0.0403(0.1586) | 0.0259(0.1286)
 | | 2 | 0.0290(0.1351) | 0.0283(0.1323) | 0.0205(0.1139) | 0.0284(0.1323) | 0.0216(0.1167)
 | | 3 | 0.0418(0.1623) | 0.0406(0.1615) | 0.0267(0.1334) | 0.0407(0.1617) | 0.0293(0.1384)
 | | 4 | 0.0398(0.1594) | 0.0377(0.1555) | 0.0220(0.1210) | 0.0376(0.1556) | 0.0251(0.1281)
50 | 25 | 1 | 0.0784(0.2261) | 0.0782(0.2266) | 0.0350(0.1300) | 0.0787(0.2278) | 0.0397(0.1609)
 | | 2 | 0.0299(0.1379) | 0.0291(0.1363) | 0.0205(0.1155) | 0.0292(0.1365) | 0.0216(0.1182)
 | | 3 | 0.0431(0.1635) | 0.0425(0.1629) | 0.0250(0.1270) | 0.0423(0.1623) | 0.0282(0.1337)
 | | 4 | 0.0480(0.1699) | 0.0471(0.1715) | 0.0313(0.1285) | 0.0472(0.1718) | 0.0286(0.1367)
 | 35 | 1 | 0.0336(0.1493) | 0.0330(0.1481) | 0.0215(0.1213) | 0.0330(0.1480) | 0.0233(0.1254)
 | | 2 | 0.0215(0.1162) | 0.0213(0.1156) | 0.0170(0.1037) | 0.0213(0.1156) | 0.0174(0.1048)
 | | 3 | 0.0275(0.1309) | 0.0267(0.1303) | 0.0199(0.1136) | 0.0267(0.1304) | 0.0208(0.1158)
 | | 4 | 0.0313(0.1396) | 0.0305(0.1391) | 0.0204(0.1160) | 0.0306(0.1396) | 0.0219(0.1190)
Table 3. Bayesian estimates and corresponding posterior risks (PRs) (in brackets) of entropy for the Burr XII distribution with α = 0.8 and β = 0.8 under different censoring schemes and loss functions using the Lindley method. The hyperparameters for the gamma prior are a = 0.8, b = 1, c = 0.8 and d = 1. The true value of the entropy is 2.793.
n | m | CS | Lindley0: SELF | WSELF | KLF | PLF | Lindley1: SELF | WSELF | KLF | PLF
30 | 15 | 1 | 2.8022(1.1857) | 2.1462(0.6560) | 2.4524(0.6114) | 3.0063(0.4083) | 2.8208(1.1010) | 2.3003(0.5205) | 2.5473(0.4526) | 3.0097(0.3777)
 | | 2 | 2.8638(1.0253) | 2.4540(0.4098) | 2.6510(0.3340) | 3.0376(0.3475) | 2.8704(0.9687) | 2.4866(0.3837) | 2.6716(0.3086) | 3.0344(0.3281)
 | | 3 | 2.8335(1.1239) | 2.3643(0.4692) | 2.5883(0.3969) | 3.0253(0.3837) | 2.8449(1.0596) | 2.4157(0.4291) | 2.6215(0.3553) | 3.0254(0.3610)
 | | 4 | 2.8145(1.1016) | 2.3209(0.4936) | 2.5558(0.4253) | 3.0038(0.3787) | 2.8264(1.0380) | 2.3869(0.4395) | 2.5974(0.3683) | 3.0044(0.3560)
 | 21 | 1 | 2.8314(0.7104) | 2.5503(0.2810) | 2.6872(0.2204) | 2.9542(0.2456) | 2.8355(0.6838) | 2.5669(0.2685) | 2.6978(0.2092) | 2.9536(0.2363)
 | | 2 | 2.8515(0.6982) | 2.5845(0.2670) | 2.7147(0.2066) | 2.9714(0.2398) | 2.8535(0.6706) | 2.5983(0.2552) | 2.7229(0.1965) | 2.9687(0.2303)
 | | 3 | 2.8426(0.7352) | 2.5560(0.2867) | 2.6955(0.2243) | 2.9691(0.2530) | 2.8459(0.7086) | 2.5718(0.2741) | 2.7054(0.2132) | 2.9678(0.2438)
 | | 4 | 2.7593(0.6666) | 2.4908(0.2685) | 2.6217(0.2156) | 2.8776(0.2365) | 2.7650(0.6417) | 2.5084(0.2567) | 2.6336(0.2047) | 2.8787(0.2274)
40 | 20 | 1 | 2.7927(0.8560) | 2.4480(0.3447) | 2.6147(0.2816) | 2.9420(0.2985) | 2.8024(0.8173) | 2.4765(0.3258) | 2.6344(0.2631) | 2.9446(0.2844)
 | | 2 | 2.8165(0.7627) | 2.5205(0.2960) | 2.6644(0.2349) | 2.9488(0.2646) | 2.8220(0.7310) | 2.5395(0.2825) | 2.6770(0.2224) | 2.9487(0.2533)
 | | 3 | 2.8685(0.8655) | 2.5337(0.3349) | 2.6959(0.2643) | 3.0156(0.2942) | 2.8735(0.8311) | 2.5551(0.3184) | 2.7096(0.2492) | 3.0146(0.2823)
 | | 4 | 2.8304(0.8386) | 2.4943(0.3361) | 2.6571(0.2695) | 2.9749(0.2889) | 2.8366(0.8044) | 2.5188(0.3178) | 2.6730(0.2523) | 2.9750(0.2768)
 | 28 | 1 | 2.7980(0.5549) | 2.5854(0.2126) | 2.6896(0.1644) | 2.8954(0.1949) | 2.8008(0.5395) | 2.5949(0.2059) | 2.6959(0.1587) | 2.8955(0.1894)
 | | 2 | 2.8414(0.5578) | 2.6351(0.2063) | 2.7363(0.1566) | 2.9379(0.1930) | 2.8424(0.5416) | 2.6425(0.1999) | 2.7406(0.1513) | 2.9361(0.1875)
 | | 3 | 2.8446(0.5558) | 2.6299(0.2147) | 2.7351(0.1633) | 2.9407(0.1921) | 2.8461(0.5406) | 2.6384(0.2077) | 2.7403(0.1575) | 2.9395(0.1869)
 | | 4 | 2.8529(0.5577) | 2.6386(0.2143) | 2.7437(0.1624) | 2.9490(0.1923) | 2.8541(0.5427) | 2.6468(0.2073) | 2.7485(0.1567) | 2.9477(0.1871)
50 | 25 | 1 | 2.7753(0.7262) | 2.4931(0.2821) | 2.6304(0.2263) | 2.9031(0.2558) | 2.7822(0.6996) | 2.5122(0.2700) | 2.6437(0.2149) | 2.9052(0.2460)
 | | 2 | 2.8070(0.5725) | 2.5877(0.2193) | 2.6951(0.1695) | 2.9072(0.2004) | 2.8100(0.5545) | 2.5983(0.2117) | 2.7021(0.1630) | 2.9070(0.1940)
 | | 3 | 2.8402(0.6645) | 2.5848(0.2553) | 2.7095(0.1976) | 2.9549(0.2293) | 2.8435(0.6437) | 2.5976(0.2459) | 2.7178(0.1893) | 2.9545(0.2220)
 | | 4 | 2.8524(0.6746) | 2.5789(0.2735) | 2.7122(0.2121) | 2.9683(0.2318) | 2.8554(0.6527) | 2.5943(0.2610) | 2.7217(0.2012) | 2.9675(0.2242)
 | 35 | 1 | 2.8489(0.4326) | 2.6890(0.1599) | 2.7678(0.1189) | 2.9239(0.1499) | 2.8498(0.4237) | 2.6934(0.1564) | 2.7705(0.1161) | 2.9232(0.1468)
 | | 2 | 2.7824(0.4411) | 2.6152(0.1672) | 2.6975(0.1279) | 2.8605(0.1564) | 2.7842(0.4304) | 2.6214(0.1628) | 2.7016(0.1242) | 2.8604(0.1525)
 | | 3 | 2.8171(0.4394) | 2.6553(0.1618) | 2.7350(0.1219) | 2.8940(0.1539) | 2.8184(0.4301) | 2.6602(0.1581) | 2.7381(0.1189) | 2.8937(0.1506)
 | | 4 | 2.8035(0.4380) | 2.6365(0.1669) | 2.7187(0.1266) | 2.8805(0.1541) | 2.8049(0.4283) | 2.6421(0.1628) | 2.7223(0.1232) | 2.8803(0.1507)
Table 4. Bayesian estimates and corresponding PRs (in brackets) of entropy for the Burr XII distribution with α = 0.8 and β = 0.8 under different censoring schemes and loss functions using the MCMC method. The hyperparameters for the gamma prior are a = 0.8, b = 1, c = 0.8 and d = 1.
n | m | CS | MCMC0: SELF | WSELF | KLF | PLF | MCMC1: SELF | WSELF | KLF | PLF
30 | 15 | 1 | 2.7461(1.4005) | 2.1244(0.6217) | 2.4154(0.5853) | 2.9903(0.4883) | 2.7547(1.2807) | 2.2748(0.4799) | 2.5032(0.4219) | 2.9780(0.4468)
 | | 2 | 2.8385(1.1453) | 2.4020(0.4365) | 2.6111(0.3634) | 3.0335(0.3901) | 2.8404(1.0580) | 2.3656(0.4748) | 2.5922(0.4014) | 3.0209(0.3610)
 | | 3 | 2.9709(1.2939) | 2.4918(0.4791) | 2.7208(0.3845) | 3.1812(0.4206) | 2.9606(1.1859) | 2.5657(0.3949) | 2.7561(0.3079) | 3.1545(0.3879)
 | | 4 | 2.8740(1.2657) | 2.5723(0.3017) | 2.7189(0.2346) | 3.0863(0.4247) | 2.8680(1.1661) | 2.4528(0.4152) | 2.6523(0.3386) | 3.0645(0.3931)
 | 21 | 1 | 2.8582(0.8119) | 2.5773(0.2809) | 2.7141(0.2180) | 2.9968(0.2774) | 2.8562(0.7739) | 2.5846(0.2716) | 2.7170(0.2102) | 2.9886(0.2648)
 | | 2 | 2.8724(0.7254) | 2.6055(0.2668) | 2.7357(0.2048) | 2.9960(0.2472) | 2.8741(0.6930) | 2.6235(0.2506) | 2.7460(0.1910) | 2.9922(0.2363)
 | | 3 | 2.7520(0.8026) | 2.4486(0.3034) | 2.5959(0.2478) | 2.8942(0.2843) | 2.7578(0.7635) | 2.4749(0.2829) | 2.6125(0.2286) | 2.8929(0.2703)
 | | 4 | 2.8237(0.8019) | 2.5316(0.2921) | 2.6736(0.2308) | 2.9623(0.2772) | 2.8179(0.7531) | 2.5453(0.2726) | 2.6781(0.2142) | 2.9485(0.2612)
40 | 20 | 1 | 2.8116(1.0896) | 2.4004(0.4111) | 2.5979(0.3426) | 2.9991(0.3750) | 2.8129(1.0103) | 2.4340(0.3789) | 2.6166(0.3113) | 2.9871(0.3484)
 | | 2 | 2.8102(0.8351) | 2.4775(0.3328) | 2.6386(0.2686) | 2.9551(0.2897) | 2.8125(0.7909) | 2.5094(0.3032) | 2.6566(0.2416) | 2.9498(0.2745)
 | | 3 | 2.8118(0.8848) | 2.4865(0.3253) | 2.6442(0.2616) | 2.9650(0.3063) | 2.8165(0.8378) | 2.5143(0.3022) | 2.6611(0.2403) | 2.9615(0.2900)
 | | 4 | 2.8652(0.9589) | 2.5229(0.3423) | 2.6886(0.2714) | 3.0279(0.3254) | 2.8642(0.8974) | 2.5468(0.3173) | 2.7008(0.2492) | 3.0167(0.3052)
 | 28 | 1 | 2.8527(0.5505) | 2.6577(0.1950) | 2.7535(0.1468) | 2.9476(0.1898) | 2.8501(0.5313) | 2.6631(0.1870) | 2.7550(0.1404) | 2.9418(0.1835)
 | | 2 | 2.8826(0.6225) | 2.6429(0.2397) | 2.7602(0.1814) | 2.9886(0.2121) | 2.8818(0.5949) | 2.6554(0.2264) | 2.7663(0.1705) | 2.9832(0.2029)
 | | 3 | 2.8649(0.5455) | 2.6712(0.1937) | 2.7664(0.1450) | 2.9586(0.1874) | 2.8671(0.5258) | 2.6820(0.1851) | 2.7730(0.1380) | 2.9573(0.1805)
 | | 4 | 2.8949(0.6095) | 2.6815(0.2134) | 2.7862(0.1592) | 2.9983(0.2069) | 2.8942(0.5843) | 2.6903(0.2039) | 2.7904(0.1516) | 2.9935(0.1985)
50 | 25 | 1 | 2.9104(0.8956) | 2.5571(0.3532) | 2.7280(0.2763) | 3.0604(0.3000) | 2.9102(0.8493) | 2.5902(0.3200) | 2.7456(0.2471) | 3.0527(0.2849)
 | | 2 | 2.7777(0.7067) | 2.5033(0.2743) | 2.6369(0.2192) | 2.9021(0.2488) | 2.7781(0.6659) | 2.5238(0.2543) | 2.6479(0.2015) | 2.8955(0.2347)
 | | 3 | 2.8245(0.7498) | 2.5478(0.2767) | 2.6826(0.2172) | 2.9542(0.2595) | 2.8246(0.7196) | 2.5614(0.2632) | 2.6898(0.2055) | 2.9492(0.2493)
 | | 4 | 2.8672(0.7257) | 2.6103(0.2569) | 2.7357(0.1969) | 2.9911(0.2477) | 2.8678(0.6898) | 2.6249(0.2429) | 2.7437(0.1851) | 2.9857(0.2357)
 | 35 | 1 | 2.8164(0.4576) | 2.6491(0.1672) | 2.7315(0.1262) | 2.8965(0.1602) | 2.8182(0.4473) | 2.6557(0.1626) | 2.7357(0.1224) | 2.8965(0.1565)
 | | 2 | 2.8728(0.4601) | 2.7126(0.1602) | 2.7916(0.1181) | 2.9518(0.1580) | 2.8690(0.4457) | 2.7143(0.1547) | 2.7906(0.1140) | 2.9457(0.1533)
 | | 3 | 2.8077(0.4152) | 2.6620(0.1456) | 2.7339(0.1094) | 2.8806(0.1460) | 2.8079(0.4019) | 2.6669(0.1410) | 2.7365(0.1057) | 2.8786(0.1414)
 | | 4 | 2.8338(0.4928) | 2.6556(0.1782) | 2.7433(0.1342) | 2.9195(0.1713) | 2.8335(0.4790) | 2.6613(0.1722) | 2.7461(0.1294) | 2.9168(0.1666)
Table 5. Average widths (AWs) (first column) and CVs (second column) of 95% confidence/credible intervals of entropy estimates for Burr XII distribution with α = 3 and β = 2 under different censoring schemes ( γ = 0.05 ).
n | m | CS | ACI: AW | CV | HPD MCMC0: AW | CV | HPD MCMC1: AW | CV
30 | 15 | 1 | 1.303 | 0.880 | 1.437 | 0.930 | 1.127 | 0.992
 | | 2 | 0.843 | 0.890 | 0.925 | 0.934 | 0.827 | 0.984
 | | 3 | 0.979 | 0.896 | 1.095 | 0.934 | 0.949 | 0.982
 | | 4 | 1.055 | 0.882 | 1.175 | 0.930 | 0.990 | 0.992
 | 21 | 1 | 0.895 | 0.912 | 0.952 | 0.940 | 0.840 | 0.980
 | | 2 | 0.711 | 0.924 | 0.755 | 0.954 | 0.698 | 0.958
 | | 3 | 0.776 | 0.896 | 0.831 | 0.934 | 0.756 | 0.964
 | | 4 | 0.857 | 0.908 | 0.920 | 0.942 | 0.821 | 0.960
40 | 20 | 1 | 1.137 | 0.914 | 1.225 | 0.940 | 1.011 | 0.982
 | | 2 | 0.731 | 0.908 | 0.777 | 0.928 | 0.717 | 0.952
 | | 3 | 0.857 | 0.902 | 0.933 | 0.942 | 0.834 | 0.970
 | | 4 | 0.924 | 0.910 | 0.991 | 0.940 | 0.872 | 0.970
 | 28 | 1 | 0.780 | 0.940 | 0.812 | 0.952 | 0.741 | 0.978
 | | 2 | 0.619 | 0.926 | 0.646 | 0.930 | 0.610 | 0.948
 | | 3 | 0.680 | 0.910 | 0.715 | 0.942 | 0.666 | 0.956
 | | 4 | 0.739 | 0.916 | 0.771 | 0.944 | 0.713 | 0.970
50 | 25 | 1 | 1.024 | 0.920 | 1.085 | 0.926 | 0.930 | 0.962
 | | 2 | 0.658 | 0.948 | 0.692 | 0.966 | 0.646 | 0.978
 | | 3 | 0.756 | 0.934 | 0.806 | 0.946 | 0.739 | 0.972
 | | 4 | 0.825 | 0.924 | 0.875 | 0.944 | 0.792 | 0.970
 | 35 | 1 | 0.696 | 0.920 | 0.725 | 0.932 | 0.668 | 0.962
 | | 2 | 0.555 | 0.924 | 0.572 | 0.944 | 0.549 | 0.944
 | | 3 | 0.605 | 0.922 | 0.632 | 0.936 | 0.595 | 0.950
 | | 4 | 0.667 | 0.926 | 0.694 | 0.938 | 0.647 | 0.970
Similarly, we used the Lindley method and a Monte Carlo simulation to obtain a set of posterior risks of entropy under different true values of β, shown in the form of graphs. We used hyperparameters a = 0.8, b = 1, c = 0.8, d = 1, n = 50, m = 35, and the censoring scheme R_1 = R_2 = \cdots = R_{34} = 0, R_{35} = 15. The results are shown in Figure 2.
Here are some observations regarding the performances of the estimators for entropy according to Table 2, Table 3, Table 4 and Table 5 and Figure 2:
  • Comparing the results in Table 2, for fixed n and increasing m, the MSEs and ABs of the entropy estimates decrease as expected, regardless of the censoring scheme. For a fixed censoring ratio (e.g., n = 30, m = 15 and n = 40, m = 20), the MSEs and ABs also decrease distinctly as n increases. The bold entries in Table 2 indicate the best performance under each method.
  • In Table 2, Table 3 and Table 4, different censoring schemes do have impacts on the estimated results, among which Schemes 2 and 1 performed best and worst respectively, in terms of MSEs, ABs and PRs.
  • According to Figure 1 and Table 2, the following conclusions can be drawn: In Bayesian estimation, the MSEs and ABs of the entropy estimates with the informative prior are smaller than those with the non-informative prior, which is obvious for both the Lindley and MCMC methods. Meanwhile, under the same prior, the MSEs and ABs under the Lindley method are close to, and even slightly smaller than, those under MCMC. In Table 2, the ML estimates are worse than the Bayesian estimates, whether obtained by the Lindley or the MCMC method, in terms of MSEs and ABs. The fundamental reason is that Bayesian estimation combines the data information with the priors of the parameters, which ML estimation cannot achieve.
  • As shown in Table 3 and Table 4, the estimates performed best against KLF and worst against SELF in terms of PRs, with both the Lindley and MCMC methods, for all censoring schemes (except for some small sample sizes). In Figure 2, it can be observed more intuitively that the posterior risk against SELF is the largest for small parameter values, and that the PRs against WSELF and PLF are very close in most cases. For the convenience of comparison, several posterior risk values are marked in the figure. It can be seen that the trends of the PRs are very similar under the two kinds of priors, but the PRs under the non-informative prior are always strictly greater than those under the informative prior for the same parameters.
  • In Table 5, it can be noted that the CVs of the HPD credible intervals are closer to the nominal level (95%) than those of the ACIs. The AWs of the ACIs and the HPD intervals under the non-informative prior are very close. However, with the informative prior, the HPD interval performs better than the ACI for each censoring scheme.

4.2. Real Data Analysis

In this section, a set of real data is analyzed to illustrate the feasibility of the above model. This dataset was used in [2]; it concerns the time to first failure of small electric trolleys used for transportation and delivery within large manufacturing plants. Lio et al. [30] used a goodness-of-fit test to check whether the dataset is reasonably described by the Burr XII distribution, and gave the ML estimates of the parameters as α = 0.08 and β = 5.47, respectively. Meanwhile, the Kolmogorov–Smirnov test gives p = 0.1008, with AIC = 4.1757. These results show that the Burr XII distribution fits this set of data reasonably well. For easy reference, the dataset is reproduced in Table 6.
The censoring scheme we chose was R_1 = R_2 = \cdots = R_{15} = 0, R_{16} = 4, which gives Table 7. Additionally, the 95% credible intervals using the MCMC0 and MCMC1 methods are, respectively, (3.441, 7.304) and (3.371, 6.971). If the experimenter obtains an extreme value of entropy using the real data, and is not sure whether it should be discarded or recalculated, one can refer to the credible interval to decide. However, the credible intervals here are relatively wide; the primary reason may be the small sample size of the real data. From the simulation results in the AW columns of Table 5, it can be concluded that a larger sample size leads to a shorter interval length. Therefore, in practice, a larger sample size is helpful for obtaining a shorter posterior interval.

5. Conclusions

In this paper, we investigated statistical inference for the information entropy of the Burr XII distribution using progressive type-II censored data. Both point and interval estimation were developed, from frequentist and Bayesian perspectives. In the Bayesian section, we demonstrated the performance of the estimators under different loss functions, prior distributions and censoring schemes, which is helpful for the selection of models involving entropy, such as those using the maximum entropy principle.
We compared the ML and Bayesian estimators using the Lindley and MCMC methods in terms of MSEs (ABs), and found that the Bayesian estimators performed significantly better than the ML estimators, and that the Bayesian estimators with the informative prior performed better than those with the non-informative prior. Additionally, it was also found that different censoring schemes do have impacts on the estimated results, among which Scheme 2 performed best. If one wants to estimate the entropy of the Burr XII distribution using progressively censored data, as in this article, Bayesian estimation with the informative prior and censoring Scheme 2 may be the appropriate choice. Posterior risks were used to evaluate the performance of the estimators against different loss functions, which provides a variety of comparative references.

Author Contributions

Investigation, X.W.; supervision, W.G. All authors have read and agreed to the published version of the manuscript.

Funding

The authors’ work was partially supported by the National Statistical Science Research Project of China (No. 2019LZ32).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

ML: Maximum likelihood
pdf: Probability density function
cdf: Cumulative distribution function
MCMC: Markov chain Monte Carlo
ACI: Asymptotic confidence interval
HPD: Highest posterior density
MSE: Mean square error
AB: Average absolute bias
PR: Posterior risk
AW: Average width
CV: Coverage probability
SELF: Squared error loss function
WSELF: Weighted squared error loss function
KLF: K-loss function
PLF: Precautionary loss function

Appendix A

For f(x) = \alpha\beta x^{\beta-1}(1+x^{\beta})^{-(\alpha+1)}, x > 0, the log-density is
\ln f(x) = \ln(\alpha\beta) + (\beta-1)\ln x - (1+\alpha)\ln(1+x^{\beta});
then the entropy can be expressed as
H(f) = -\int_0^{\infty} f(x)\ln f(x)\,dx = -\int_0^{\infty} f(x)\left[\ln(\alpha\beta) + (\beta-1)\ln x - (1+\alpha)\ln(1+x^{\beta})\right]dx = -\ln(\alpha\beta) - (\beta-1)E(\ln X) + (1+\alpha)E\left[\ln(1+X^{\beta})\right],
where X is a random variable with pdf f(x). Thus it is necessary to further deduce E(\ln X) and E\left[\ln(1+X^{\beta})\right].
Taking the derivative with respect to α on both sides of \int_0^{\infty} f(x)\,dx = 1 leads to
\int_0^{\infty} \beta x^{\beta-1}(1+x^{\beta})^{-(\alpha+1)}\left[1 - \alpha\ln(1+x^{\beta})\right]dx = 0,
and then
E\left[\ln(1+X^{\beta})\right] = \frac{1}{\alpha}.
Calculate
E(X^r) = \int_0^{\infty} x^r f(x)\,dx.
Make the variable substitution 1 + x^{\beta} = \frac{1}{1-t}, 0 < t < 1. Then (A4) can be reduced to
E(X^r) = \alpha \int_0^1 t^{r/\beta}(1-t)^{\alpha - r/\beta - 1}\,dt = \alpha B\left(\frac{r}{\beta}+1, \alpha - \frac{r}{\beta}\right) = \frac{\Gamma\left(\frac{r}{\beta}+1\right)\Gamma\left(\alpha - \frac{r}{\beta}\right)}{\Gamma(\alpha)},
where \alpha - \frac{r}{\beta} > 0. Taking the derivative of both sides of (A5) with respect to r, we obtain
\frac{dE(X^r)}{dr} = E(X^r \ln X) = \frac{1}{\beta\Gamma(\alpha)}\left[\Gamma'\left(\frac{r}{\beta}+1\right)\Gamma\left(\alpha-\frac{r}{\beta}\right) - \Gamma\left(\frac{r}{\beta}+1\right)\Gamma'\left(\alpha-\frac{r}{\beta}\right)\right].
At r = 0 ,
E(\ln X) = \frac{1}{\beta}\left[\psi(1) - \psi(\alpha)\right].
Substituting (A4) and (A6) into (A2), we get
H(f) = -\ln(\alpha\beta) - \frac{\beta-1}{\beta}\left[\psi(1) - \psi(\alpha)\right] + \frac{1}{\alpha} + 1.

References

  1. Burr, I.W. Cumulative Frequency Functions. Ann. Math. Stats 1942, 13, 215–232. [Google Scholar] [CrossRef]
  2. Zimmer, W.J.; Keats, J.B.; Wang, F.K. The Burr XII Distribution in Reliability Analysis. J. Qual. Technol. 1998, 30, 386–394. [Google Scholar] [CrossRef]
  3. Jaheen, Z.F.; Okasha, H.M. E-Bayesian estimation for the Burr type XII model based on type-II censoring. Appl. Math. Model. 2011, 35, 4730–4737. [Google Scholar] [CrossRef]
  4. Wu, S.F.; Wu, C.C.; Chen, Y.L.; Yu, Y.R.; Lin, Y.P. Interval estimation of a two-parameter Burr-XII distribution under progressive censoring. Stats 2010, 44, 77–88. [Google Scholar] [CrossRef]
  5. Panahi, H.; Sayyareh, A. Parameter estimation and prediction of order statistics for the Burr Type XII distribution with Type II censoring. J. Appl. Stat. 2014, 41, 215–232. [Google Scholar] [CrossRef]
  6. Qin, X.; Gui, W. Statistical inference of Burr-XII distribution under progressive Type-II censored competing risks data with binomial removals. J. Comput. Appl. Math. 2020, 378, 112922. [Google Scholar] [CrossRef]
  7. Elsagheer, R.M. Inferences in Constant-Partially Accelerated Life Tests Based on Progressive Type-II Censoring. Bull. Malays. Math. Sci. Soc. 2016, 41, 609–626. [Google Scholar]
  8. Du, Y.; Guo, Y.; Gui, W. Statistical Inference for the Information Entropy of the Log-Logistic Distribution under Progressive Type-I Interval Censoring Schemes. Symmetry 2018, 10, 445. [Google Scholar] [CrossRef] [Green Version]
  9. Soliman, A.; Abd Ellah, A.; Abou-Elheggag, N.; Modhesh, A. Estimation from Burr type XII distribution using progressive first-failure censored data. J. Stat. Comput. Simul. 2013, 83, 2270–2290. [Google Scholar] [CrossRef]
  10. Wang, L.; Li, H. Inference for exponential competing risks data under generalized progressive hybrid censoring. Commun. Statist. Simul. Comput. 2019, 1–17. [Google Scholar] [CrossRef]
  11. Patra, L.K.; Kayal, S.; Kumar, S. Measuring Uncertainty Under Prior Information. IEEE Trans. Inf. Theory 2020, 66, 2570–2580. [Google Scholar] [CrossRef]
  12. Papalexiou, S.M.; Koutsoyiannis, D. Entropy based derivation of probability distributions: A case study to daily rainfall. Adv. Water Resour. 2012, 45, 51–57. [Google Scholar] [CrossRef]
  13. Ali, M.D. Entropy, Information Theory, Information Geometry and Bayesian Inference in Data, Signal and Image Processing and Inverse Problems. Entropy 2015, 17, 3989–4027. [Google Scholar]
  14. Sunoj, S.M.; Sankaran, P.G. Quantile based entropy function. Stat. Probab. Lett. 2012, 82, 1049–1053. [Google Scholar] [CrossRef]
  15. AboEleneen, Z.A. The Entropy of Progressively Censored Samples. Entropy 2011, 13, 437–449. [Google Scholar] [CrossRef] [Green Version]
  16. Lee, K. Estimation of entropy of the inverse Weibull distribution under generalized progressive hybrid censored data. J. Korean Data Inf. Sci. Soc. 2017, 28, 659–668. [Google Scholar]
  17. Zhao, G.Q.; Liang, W.; He, S.Y. Empirical entropy for right censored data. Acta Math. Appl. Sin. Ser. 2015, 31, 395–404. [Google Scholar] [CrossRef]
  18. Maurya, R.K.; Tripathi, Y.M.; Rastogi, M.K.; Asgharzadeh, A. Parameter estimation for a Burr XII distribution under progressive censoring. Am. J. Math. Manag. Sci. 2017, 36, 259–276. [Google Scholar] [CrossRef]
  19. Fu, J.; Xu, A.; Tang, Y. Objective Bayesian analysis of Pareto distribution under progressive Type-II censoring. Stat. Probab. Lett. 2012, 82, 1829–1836. [Google Scholar] [CrossRef]
  20. Musleh, R.M.; Helu, A. Estimation of the inverse Weibull distribution based on progressively censored data: Comparative study. Reliab. Eng. Syst. Saf. 2014, 131, 216–227. [Google Scholar] [CrossRef]
  21. Rastogi, M.K.; Merovci, F. Bayesian estimation for parameters and reliability characteristic of the Weibull Rayleigh distribution. J. King Saud Univ. Sci. 2017, 30, 472–478. [Google Scholar] [CrossRef]
  22. Singh, S.; Tripathi, Y.M. Estimating the parameters of an inverse Weibull distribution under progressive type-I interval censoring. Stat. Pap. 2018, 59, 21–56. [Google Scholar] [CrossRef]
  23. Lee, W.C.; Wu, J.W.; Hong, M.L.; Lin, L.S.; Chan, R.L. Assessing the lifetime performance index of Rayleigh products based on the Bayesian estimation under progressive type II right censored samples. J. Comput. Appl. Math. 2011, 235, 1676–1688. [Google Scholar] [CrossRef]
  24. Srivastava, R.S.; Kumar, V.; Rao, A.K. Bayesian estimation of the shape parameter and reliability of generalized Pareto distribution using precautionary loss function with censoring. Arch. Soc. Esp. Oftalmol. 2012, 87, 96. [Google Scholar]
  25. Kızılaslan, F. Classical and Bayesian estimation of reliability in a multicomponent stress–strength model based on a general class of inverse exponentiated distributions. Stat. Pap. 2018, 59, 1161–1192. [Google Scholar] [CrossRef]
  26. Albert, J. Bayesian Computation with R, 2nd ed.; Springer: Berlin, Germany, 2009. [Google Scholar]
  27. Renjini, K.R.; Abdul-Sathar, E.I.; Rajesh, G. A study of the effect of loss functions on the Bayes estimates of dynamic cumulative residual entropy for Pareto distribution under upper record values. J. Stat. Comput. Simul. 2016, 86, 324–339. [Google Scholar] [CrossRef]
  28. Sajid, A. On the Mean Residual Life Function and Stress and Strength Analysis under Different Loss Function for Lindley Distribution. J. Qual. Reliab. Eng. 2013, 2013, 1–13. [Google Scholar]
  29. Lindley, D.V. Approximate Bayesian methods. Trab. Estad. Investig. Oper. 1980, 31, 223–245. [Google Scholar] [CrossRef]
  30. Lio, Y.L.; Tsai, T.R.; Wu, S.J. Acceptance sampling plans from truncated life tests based on the Burr type XII percentiles. J. Chin. Inst. Ind. Eng. 2010, 27, 270–280. [Google Scholar] [CrossRef]
Figure 1. ABs of estimators of entropy under different methods and n values.
Figure 2. Posterior risks of estimators of entropy under different β ( α = 0.5 ).
Table 1. Bayesian estimators and corresponding posterior risks of different loss functions.

Loss Function | Bayesian Estimator | Posterior Risk
SELF: (\hat{H} - H)^2 | E(H \mid \underline{x}) | V(H \mid \underline{x})
WSELF: (\hat{H} - H)^2 / H | \left[E(H^{-1} \mid \underline{x})\right]^{-1} | E(H \mid \underline{x}) - \left[E(H^{-1} \mid \underline{x})\right]^{-1}
KLF: \left(\sqrt{\hat{H}/H} - \sqrt{H/\hat{H}}\right)^2 | \sqrt{E(H \mid \underline{x}) / E(H^{-1} \mid \underline{x})} | 2\left[\sqrt{E(H \mid \underline{x}) E(H^{-1} \mid \underline{x})} - 1\right]
PLF: (\hat{H} - H)^2 / \hat{H} | \sqrt{E(H^2 \mid \underline{x})} | 2\left[\sqrt{E(H^2 \mid \underline{x})} - E(H \mid \underline{x})\right]
Table 6. The dataset contains the time to first failure (in months) of electric trolleys used for transportation and delivery within large manufacturing plants.

0.9, 1.5, 2.3, 3.2, 3.9, 5.0, 6.2, 7.5, 8.3, 10.4
11.1, 12.6, 15.0, 16.3, 19.3, 22.6, 24.8, 31.5, 38.1, 53.0
Table 7. Bayesian estimates and corresponding PRs (second row) of entropy for different loss functions. The hyperparameters are a = 1, b = 1, c = 2, d = 1.

 | MLE | Lindley0: SELF | WSELF | KLF | PLF | Lindley1: SELF | WSELF | KLF | PLF | MCMC0: SELF | WSELF | KLF | PLF | MCMC1: SELF | WSELF | KLF | PLF
Estimate | 4.7447 | 4.9564 | 4.7858 | 4.8704 | 5.0331 | 5.0794 | 4.9143 | 4.9962 | 5.1477 | 4.9321 | 4.7606 | 4.8456 | 5.0249 | 4.8318 | 4.6754 | 4.7530 | 4.9139
PR | - | 0.7661 | 0.1706 | 0.0713 | 0.1534 | 0.6989 | 0.1651 | 0.0672 | 0.1367 | 0.9242 | 0.1715 | 0.0721 | 0.1856 | 0.8002 | 0.1564 | 0.0669 | 0.1642
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
