1. Introduction
The Fréchet distribution, also known as the inverse Weibull or Type-II extreme value distribution, is one of the most common lifetime distributions in extreme value theory. It can model different failure rates compared with other well-known distributions, so it has been widely used in several fields, such as engineering, biology, and physics. The three-parameter Kumaraswamy–Fréchet distribution (KFD) was introduced by Shahbaz et al. [
1] to fit real-life data with various shapes of failure rate. Recently, Tomazella et al. [
2] showed that the KFD has a simple structure but non-identifiable parameters. They therefore proposed a reparameterized (identifiable) KFD, also referred to as the Type-II Lehmann Fréchet distribution (LFD-TII). Further, they studied various characteristics of the LFD-TII and showed that it is flexible for modelling data with a unimodal hazard rate shape. Suppose
X is a random variable, used to test the lifetime of a unit (or product), which follows the LFD-TII with two shape parameters
and a scale parameter
as
where
.
Subsequently, its probability density function (PDF, say,
) and cumulative distribution function (CDF, say,
) are given, respectively, by
In addition, its reliability function (RF, say,
) and hazard function (HF, say,
), for
, are given by
respectively. Putting
in (
1), the Fréchet distribution is acquired as a special case. Various shapes of the density and hazard functions, when
and some specified values of
and
, of the LFD-TII are shown in
Figure 1. It shows that the density function has a long right tail and that the peak of the distribution decreases when the parameter values of
and
increase. Further, the HF plots indicate that the failure rate function has a unimodal shape.
In the context of reliability experiments, practitioners prefer the progressive Type-II censoring scheme (PCS-TII) over the conventional Type-II (failure) censoring scheme because it allows surviving item(s) to be removed during the experiment at stages other than the termination point. Before starting the experiment,
n (test sample size),
(effective sample size) and
(removal pattern) must be predetermined by the experimenter. When the first failure
occurs, the
of the surviving units are randomly drawn from the remaining
units. Again, when the second failure
occurs, the
of the surviving units are randomly drawn from the remaining
units, and so on. Lastly, when the
mth failure
occurs,
are removed from the test and terminated. For more details, see an excellent monograph published by Balakrishnan and Cramer [
3].
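The removal mechanism described above can be simulated with the standard uniform-transformation algorithm of Balakrishnan and Sandhu. The sketch below (illustrative Python; the paper's computations use R) returns progressively Type-II censored U(0,1) order statistics, which can then be pushed through any quantile function:

```python
import random

def sim_progressive_uniform(n, R, seed=None):
    """Simulate a PCS-TII sample of U(0,1) progressively censored order
    statistics for removal pattern R (len(R) = m, sum(R) = n - m), using
    the Balakrishnan-Sandhu (1995) algorithm."""
    m = len(R)
    assert sum(R) + m == n, "pattern must satisfy n = m + sum(R)"
    rng = random.Random(seed)
    w = [rng.random() for _ in range(m)]
    # V_i = W_i^(1 / (i + R_m + ... + R_{m-i+1}))
    v = []
    for i in range(1, m + 1):
        exponent = i + sum(R[m - i:])
        v.append(w[i - 1] ** (1.0 / exponent))
    # U_i = 1 - V_m * V_{m-1} * ... * V_{m-i+1}  (increasing in i)
    u, prod = [], 1.0
    for i in range(1, m + 1):
        prod *= v[m - i]
        u.append(1.0 - prod)
    return u
```

Applying the quantile function of the target distribution (e.g., the LFD-TII) to the returned values yields a PCS-TII sample from that distribution.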
In this case, the likelihood function for
is defined as
where
.
In this study, besides the conventional likelihood function (LF) given in (
5), the product of spacings (PS) method is also used. This method was independently introduced by Cheng and Amin [
4] and Ranneby [
5] as an alternative approach for estimating parameter(s) of continuous univariate distributions.
Following the same philosophy as the maximum likelihood estimators (MLEs), the maximum product of spacings estimators (MPSEs) are derived by finding the parameter values that maximize the product of the spacings between the ordered values of the target distribution. Moreover, Anatolyev and Kosenok [
6] showed that, for small samples from heavy-tailed or skewed distributions, product of spacings estimators are highly efficient compared with likelihood estimators. According to Ng et al. [
7], the PS function of PCS-TII data is defined as
where
and
.
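As an illustration of the PS objective, one common form for PCS-TII data combines spacings D_i = F(x_i) − F(x_{i−1}) with a censoring factor [1 − F(x_i)]^{R_i} at each observed failure. The hedged Python sketch below (the function name `log_ps` is ours) evaluates the log of such a product, which an MPSE routine would maximize over the parameters:

```python
import math

def log_ps(cdf, x, R):
    """Log product-of-spacings objective for a sorted PCS-TII sample x with
    removal pattern R, given a CDF evaluated at the current parameter values.
    Spacings: D_1 = F(x_1), D_i = F(x_i) - F(x_{i-1}), D_{m+1} = 1 - F(x_m);
    each failure also contributes R_i * log(1 - F(x_i))."""
    F = [cdf(t) for t in x]
    spacings = [F[0]] + [F[i] - F[i - 1] for i in range(1, len(x))] + [1.0 - F[-1]]
    out = sum(math.log(max(d, 1e-300)) for d in spacings)   # guard against log(0)
    out += sum(r * math.log(max(1.0 - Fi, 1e-300)) for r, Fi in zip(R, F))
    return out
```

Passing this objective (with the CDF parameterized by the unknowns) to a numerical optimizer yields the MPSEs, mirroring how the MLEs are obtained from the log-likelihood.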
To the best of our knowledge, no previous study has addressed the estimation of the unknown parameters of the LFD-TII from incomplete (censored) data. To address this gap, the objectives of the present study are fourfold. First, we focus on both classical and Bayesian estimation to develop point and interval estimates of
,
,
,
and
. Using Fisher’s information, approximate confidence intervals (ACIs) for
,
,
or any of their related functions are constructed. Since classical estimators cannot be obtained explicitly, they are evaluated numerically by applying the Newton–Raphson (N-R) method via the ‘maxLik’ package in
programming software; see Henningsen and Toomet [
8]. Using both the LF and PS methods, the Bayes estimators under the squared-error loss (SEL) and general-entropy loss (GEL) functions cannot be obtained in closed form. To address this problem, utilizing gamma priors, Markov chain Monte Carlo (MCMC) techniques are employed to approximate the Bayes estimates and to construct the highest posterior density (HPD) intervals. To carry out the Bayesian computations, the ‘coda’ package via
programming software proposed by Plummer et al. [
9], which simulates MCMC variates, is used. The second objective discusses how to determine the optimum PCS-TII plan from a set of all possible removal patterns. Third, to compare the performance of the proposed methods, Monte-Carlo simulations are presented. The behavior of the point estimates is compared in terms of their root mean squared-errors (RMSEs) and mean relative absolute biases (MRABs), while the behavior of the interval estimates is compared using their average confidence lengths (ACLs) and coverage probabilities (CPs). The last goal is to demonstrate the suitability and flexibility of the LFD-TII compared to six other popular distributions in modeling real data sets, as well as validating the proposed methodologies in a real-life scenario. We also provide some recommendations.
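Since HPD intervals recur throughout the paper, it may help to note how they are computed from posterior draws: the shortest interval containing the nominal probability mass. The sketch below mirrors what the ‘coda’ package's HPDinterval does, in illustrative Python (the function name is ours):

```python
def hpd_interval(draws, prob=0.95):
    """Empirical HPD interval: the shortest interval containing a fraction
    `prob` of the posterior draws."""
    s = sorted(draws)
    n = len(s)
    k = max(1, int(round(prob * n)))  # number of draws the interval must cover
    # slide a window of k consecutive sorted draws and keep the narrowest one
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[best], s[best + k - 1]
```

For a unimodal posterior this shortest-window interval coincides with the usual HPD region; for symmetric posteriors it is close to the equal-tailed credible interval.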
The next sections of the paper are arranged as follows:
Section 2 concerns the classical inference. Bayesian inference is investigated in
Section 3. Two-sided ACI/HPD intervals are presented in
Section 4. Simulation outputs are presented in
Section 5. In
Section 6, some criteria of optimal censoring are presented. Two real applications are provided in
Section 7. We conclude the article in
Section 8.
5. Simulation Study
In this section, the behavior of the proposed frequentist and Bayesian estimators is evaluated through an extensive Monte-Carlo simulation study.
Following Balakrishnan and Cramer [
3], for various combinations of
n,
m and
R, we generated 1,000 replications from
. At mission time
, the corresponding actual value of
and
is taken as 0.998 and 0.099, respectively. Taking
40 and 80, the failure percentages (FPs),
, are taken as 50% and 90%. In addition, for each set of
n and
m, various patterns of
are used, namely:
In Bayesian calculations, two different sets of hyperparameters for
,
and
are used, namely Prior 1:
and
; and Prior 2:
and
. Using the hybrid strategy proposed in
Section 3.4, we simulate
samples from the MCMC-LF (or MCMC-PS) approach and discard the first 2000 values
as burn-in. Hence, based on 10,000 MCMC-LF samples, the Bayes estimates and 95% HPD intervals are computed using SEL and GEL (with
) functions.
Using the following formulae, the average estimates (AE) of
,
,
,
and
, say, (
), with their RMSEs and MRABs are calculated, respectively, as
and
where
is the number of generated sequences;
is the calculated estimate obtained at
sample of
,
,
,
,
and
.
In addition, the ACLs and CPs of
ACI (or HPD) intervals of
are given by
and
respectively, where
and
denote the lower and upper sides, respectively, and
is the indicator function.
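The performance measures above follow their standard definitions; as a hedged illustration (function names are ours, not the paper's), they can be computed from the simulated estimates as:

```python
import math

def ae(est):
    # average estimate over all generated sequences
    return sum(est) / len(est)

def rmse(est, true):
    # root mean squared error against the actual parameter value
    return math.sqrt(sum((e - true) ** 2 for e in est) / len(est))

def mrab(est, true):
    # mean relative absolute bias
    return sum(abs(e - true) for e in est) / (len(est) * abs(true))

def acl(intervals):
    # average confidence length of (lower, upper) interval estimates
    return sum(u - l for l, u in intervals) / len(intervals)

def cp(intervals, true):
    # coverage probability: fraction of intervals containing the true value
    return sum(1 for l, u in intervals if l <= true <= u) / len(intervals)
```

Each function takes the 1,000 replicated estimates (or intervals) for one quantity and returns the corresponding summary reported in the heatmaps.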
The convergence status of the hybrid MCMC algorithm for the simulated MCMC draws of each unknown parameter is evaluated via both trace and autocorrelation plots (when
and censoring
as an example), see
Figure 3. The trace plots resemble random noise, and the autocorrelation values approach zero as the lag grows. As a result, the MCMC draws are well mixed and the resulting estimates are reasonable.
Graphically, the RMSEs, MRABs, ACLs and CPs of
,
,
,
and
are shown with heatmaps, see
Figure 4,
Figure 5,
Figure 6,
Figure 7 and
Figure 8, respectively, while all simulation results of the same unknown parameters are available in the
supplementary file. All numerical evaluations were coded in
4.0.4 software. The
scripts that support the Monte-Carlo findings are available from the corresponding author upon reasonable request. For brevity, taking Prior 1 (say, P1) as an example, the following notation is used for the proposed estimation methods: Bayes estimates under the SEL function from the LF method are denoted “SE-LF-P1”; Bayes estimates under the GEL function from the LF method for
are denoted “GE1-LF-P1” and “GE1-LF-P2”, respectively; and HPD intervals from the LF method are denoted “HPD-LF-P1”.
All estimates of the unknown parameters and reliability characteristics perform well, in the sense of low RMSEs, MRABs and ACLs as well as high CPs.
As n (or the FP) increases, the proposed estimates perform even better. A similar pattern is also observed when decreases.
Bayesian (point/interval) estimates of all unknown parameters behave better than the frequentist (point/interval) estimates, as expected. Since the variance of Prior 2 is lower than that of Prior 1, the Bayes MCMC estimates based on Prior 2 perform better.
Bayes estimates of , , , and are underestimates (overestimates) when has a positive (negative) value. Meanwhile, the Bayes estimates obtained based on the GEL behave satisfactorily compared to those calculated from the SEL.
The MPSE/MCMC-PS estimates of , and performed better than others while the MLE/MCMC-LF estimates of and performed better than others.
The ACI/HPD interval estimates, in most cases, of and obtained from LF performed better than others while the ACI/HPD interval estimates of , and obtained from PS performed better than others in terms of the smallest ACLs and largest CPs.
Comparing Scheme-1 (removals at the first stage) and Scheme-3 (removals at the last stage), in terms of their lowest RMSE, MRAB and ACL values as well as their highest CP values, the MLE/MCMC-LF estimates for , and perform better under Scheme-3 than Scheme-1, while and perform better under Scheme-1 than Scheme-3.
The ACLs of the interval estimates obtained by the MPSE/MCMC-PS method are wider under Scheme-1 than Scheme-3 for all quantities. In addition, the ACLs of the interval estimates obtained by the MLE/MCMC-LF method of and are narrower under Scheme-1 than Scheme-3, while those associated with , and are wider under Scheme-1 than Scheme-3. The opposite behavior of the 95% ACI/HPD intervals is observed in terms of their CPs.
To sum up, the Bayesian MCMC approach is recommended for estimating the Type-II Lehmann–Fréchet parameters under progressive censoring.
6. Optimum Censoring
In reliability experiments, one of the most significant challenges for any practitioner is how to select, from the set of all possible plans, an optimal censoring plan that provides the greatest amount of information about the unknown distribution parameter(s) of interest. To specify the optimum PCS plan
, the values of
n and
m must be fixed in advance; see Ng et al. [
15]. In the literature, several authors have also discussed the issue of comparing two (or more) different schemes, see, for example, Pradhan and Kundu [
16], Lee and Cho [
17], and Elshahhat and Abu El Azm [
18], among others.
In this study, using the MLEs/MPSEs of , several criteria of optimum PCS-TII plans are used, namely:
.
.
.
.
.
Regarding criterion
, with respect to the MLEs (or MPSEs) of
,
and
, the experimenter aims to maximize the trace of Fisher’s elements
,
and
; regarding criteria
and
, the experimenter aims to minimize the determinant (det) and trace values of the V-C matrices (
29) and (
30); regarding criterion
, the experimenter aims to minimize the variance in the logarithmic
quantile, say
, where
. From (
2), the logarithmic of the quantile function is
Using (
31), the approximated variance estimate of
is given by
where
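The determinant- and trace-based criteria can be illustrated as follows. This is a sketch only, with hypothetical scheme labels and a plain-Python 3×3 determinant (the paper's computations use R): the experimenter evaluates the estimated variance-covariance matrix of the MLEs (or MPSEs) under each candidate removal pattern and picks the one with the smallest determinant or trace:

```python
def trace(M):
    # sum of diagonal entries of a square matrix (list of lists)
    return sum(M[i][i] for i in range(len(M)))

def det3(M):
    # cofactor expansion of a 3x3 determinant
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def pick_optimal(vc_by_scheme, criterion):
    """Pick the censoring scheme minimizing the determinant (D-optimality
    style) or trace (A-optimality style) of its estimated V-C matrix."""
    score = det3 if criterion == "det" else trace
    return min(vc_by_scheme, key=lambda s: score(vc_by_scheme[s]))
```

The quantile-variance criterion works analogously: compute the delta-method variance of the estimated logarithmic quantile under each scheme and minimize it.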
7. Real-Life Applications
To examine the applicability of the theoretical results of
,
,
,
and
to a real situation, two real data sets from the engineering and physics fields are analyzed. The first data set (say Data-I), reported by Caroni [
19], consists of the number of million rotations before the failure of each of 22 ball bearings. The second data set (say, Data-II), presented by Hinkley [
20] and discussed by Elshahhat et al. [
21], represents thirty consecutive values of March precipitation (in inches) at Minneapolis/St Paul. Both data sets I and II are reported in
Table 1.
We first fit the LFD-TII to both complete data sets in
Table 1 along with six lifetime models as competitors, namely: inverse Weibull distribution (IWD), Kumaraswamy-inverse Rayleigh distribution (KIRD), Kumaraswamy-inverted exponential distribution (KIED), Kumaraswamy exponentiated inverse Rayleigh distribution (KEIRD), Kumaraswamy-inverse Weibull distribution (KIWD) and Kumaraswamy-inverse Gompertz distribution (KIGD). All the densities of the competing models (for
and
) are reported in
Table 2. To select the best model, different criteria are used, namely: (i) the Kolmogorov–Smirnov (K-S) statistic with its
p value; (ii) negative log-likelihood (NL); (iii) Akaike’s (A); (iv) consistent Akaike’s (CA); (v) Bayesian (B); and (vi) Hannan–Quinn (HQ) information criteria.
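The information criteria and the K-S distance follow their standard formulas; the sketch below (illustrative Python, whereas the paper relies on the ‘AdequacyModel’ package in R) shows how they are computed from a fitted model's negative log-likelihood and CDF:

```python
import math

def info_criteria(nll, k, n):
    """Standard A, CA, B and HQ information criteria from the negative
    log-likelihood `nll`, number of parameters k and sample size n."""
    return {
        "A":  2 * k + 2 * nll,
        "CA": 2 * k * n / (n - k - 1) + 2 * nll,
        "B":  k * math.log(n) + 2 * nll,
        "HQ": 2 * k * math.log(math.log(n)) + 2 * nll,
    }

def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the fitted CDF and the
    empirical distribution function of the data."""
    x = sorted(data)
    n = len(x)
    return max(max(cdf(v) - i / n, (i + 1) / n - cdf(v))
               for i, v in enumerate(x))
```

The model with the smallest criterion values and K-S distance (and the largest K-S p value) is preferred, which is how the LFD-TII is singled out in the comparison tables.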
Via the ‘AdequacyModel’ package in
programming software proposed by Marinho et al. [
27], using the two data sets I and II listed in
Table 1, the MLEs of model parameters and selection criteria are computed and provided in
Table 3. In addition, the K-S statistics with their
p values are listed in
Table 4. Among all the fitted lifetime models,
Table 3 and
Table 4 show that the LFD-TII has the lowest values with respect to NL, A, CA, B, HQ information criteria and K-S statistics, as well as the highest
p value. Thus, the LFD-TII provides a better fit to both data sets than all the competing distributions. As an example, the
script used to fit the LFD-TII parameters, to evaluate the model selection criteria, and to compute the K-S distance with its
p value, is reported in
Appendix A.
Furthermore,
Figure 9 displays (i) the histograms of data sets I and II with the estimated densities and (ii) the plots of the estimated and empirical reliability functions. It shows that the LFD-TII is the best lifetime model among the competing models for both data sets. It also supports the findings listed in
Table 3 and
Table 4.
From
Table 1, taking
, three different PCS-TII samples are obtained and provided in
Table 5, where the censoring plan
is referred to as
for short. Since we do not have any prior information on
,
and
, the Bayes estimates based on SEL and GEL (for
) are developed using the improper gamma prior. We also take 0.0001 for all given hyperparameters. Using the MCMC algorithm described in
Section 3, for each unknown parameter, the first 10,000 simulated values of 50,000 MCMC samples are discarded. To run the MCMC sampler, the frequentist estimates of
,
and
are used as initial values. According to the LF and PS methods, using each sample in
Table 5, the point estimates with their standard errors (St.Es) as well as the interval estimates with their lengths of
,
,
,
and
are calculated and reported in
Table 6 and
Table 7. The acquired estimators of
and
are evaluated at
and 1 for data sets I and II, respectively. It is observed, from
Table 6 and
Table 7, that the Bayes MCMC estimates of all unknown parameters outperform the frequentist estimates in terms of minimum St.Es, and that the HPD interval estimates likewise outperform the others in terms of the shortest lengths.
Using the estimated variances and covariances of the MLEs and MPSEs for data sets I and II, the optimum PCS-TII plan is selected; the results are listed in
Table 8. To distinguish, the selected optimal censoring scheme is marked with an asterisk. From
Table 8, it is noted that
- (i)
From Data-I: Using the LF approach: is the best censoring plan under , while is the best under . Using the PS approach: is the best censoring plan for all the given optimum criteria.
- (ii)
From Data-II: Using the LF approach: is the best censoring plan under , while is the best under . Using the PS approach: is the best censoring plan under ; is the best censoring plan under ; while is the best censoring plan under .
To demonstrate the performance of the 40,000 MCMC simulated variates of
,
,
,
or
, both trace and histogram plots based on
from data sets I and II are available as
supplementary materials. In each trace plot, the symmetric Bayes estimate and the bounds of its 95% HPD interval are shown by solid (—) and dashed (- - -) horizontal lines, respectively. In addition, the symmetric Bayes estimate is marked with a vertical dash-dotted line in the histogram plots. The trace plots show that the proposed MCMC algorithm converges well and, in most cases, the generated posteriors of all unknown parameters are fairly symmetric. In conclusion, the proposed estimation methodologies using the LF and PS approaches, based on the given data sets I and II, demonstrate that the Type-II Lehmann–Fréchet distribution is a good lifetime model.