1. Introduction
General progressive censoring plays a key role in lifetime analysis. It generalizes the progressive censoring scheme, which allows surviving units to be removed from a life test stage by stage and, to some extent, alleviates time and cost constraints (Sen [1]). Additionally, it encompasses two censoring schemes: left censoring and right censoring. Here, we consider the situation of general progressive Type-II right censoring (GPTIIC).
In a life test, N units are first randomly selected as the test sample. Suppose that the first
r failures are unobserved; for
, while the
failure is observed,
surviving units are withdrawn from the test randomly; while the
failure is observed, all remaining units are withdrawn and the life-test is terminated, that is,
. Denote the totally observed failure times as
, which are also called the general progressively Type-II censored order statistics (GPTIICOS). For notational convenience, represent it with
. Here, denote the corresponding censoring scheme as
. Note that: (i) GPTIIC with
represents the conventional Type-II right censoring scheme. (ii) GPTIIC with
represents the progressive Type-II right censoring scheme. The reader can find more information in the book by Balakrishnan and Aggarwala [
2], which discusses mathematical properties and inferential methods for this general progressive censoring scheme.
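To make the bookkeeping of the scheme concrete, here is a minimal sketch of the accounting identity it implies: every one of the N units is either an unobserved failure, an observed failure, or a withdrawal. The sketch is in Python with illustrative names (the paper itself carries out its computations in R).

```python
# Bookkeeping of a general progressive Type-II censored life test:
# N units on test, the first r failure times unobserved, n - r failures
# observed, and R_i surviving units withdrawn at the i-th observed
# failure.  Names are illustrative, not from the paper.

def check_scheme(N, r, removals):
    """Check that a censoring scheme (R_{r+1}, ..., R_n) is consistent:
    failures plus withdrawals must account for all N units."""
    n = r + len(removals)          # total number of failures
    return N == n + sum(removals)

# Example: N = 20, first r = 2 failures unobserved, 5 observed failures,
# withdrawals (1, 0, 2, 0, 10); the last withdrawal ends the test.
print(check_scheme(20, 2, [1, 0, 2, 0, 10]))  # True
```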
Consider a class of distributions whose probability density functions (PDF) and distribution functions (CDF) take the inverted form of those of the scale family. We call it the inverted scale family (ISF). A positive and continuous random variable
X with CDF (Equation (
1)) and PDF (Equation (2)) belongs to the inverted scale family (i.e., ISF(
)) with the scale parameter
.
Note that a positive and continuous random variable Y with CDF and PDF is said to come from a standard scale distribution.
Distributions from the ISF are important in reliability life data analysis. For example, the generalized inverse exponential distribution (GIED), a member of the ISF that includes the inverse exponential distribution (IED) as a special case, can model aging criteria of various shapes. In addition, a reliability model based on the inverse Weibull distribution (IWD), another ISF member, may be suitable when empirical studies on a given dataset suggest a unimodal hazard function. In light of the importance of the ISF, statistical inference for the related distributions has attracted many authors. For the inverted gamma distribution, Lin [
3] obtained the MLE as well as an estimate of the corresponding reliability function. Under Type-II censoring, Kundu [
4] dealt with the Bayesian estimation for IWD. Prakash [
5] studied some properties for IED from a Bayesian point of view. Moreover, articles by Krishna [
6], Dey [
7], and Essam [
8] are several recent works on estimation for GIED models under progressive censoring.
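As a concrete illustration of an ISF member, the following sketch implements the usual textbook IED(θ) with CDF F(x) = exp(−θ/x) (an assumption consistent with the IED as described in Example 1 below, not a verbatim copy of Equations (1) and (2)):

```python
import math

# The inverse exponential distribution IED(theta) is a standard member
# of the inverted scale family: if Y ~ Exp(theta) then X = 1/Y ~
# IED(theta).  Textbook parametrization, assumed for illustration.

def ied_cdf(x, theta):
    return math.exp(-theta / x)

def ied_pdf(x, theta):
    return (theta / x**2) * math.exp(-theta / x)

def ied_hazard(x, theta):
    # failure rate h(x) = f(x) / (1 - F(x)); unimodal for the IED,
    # which is one reason such models are useful in reliability work
    return ied_pdf(x, theta) / (1.0 - ied_cdf(x, theta))

theta = 5.0
print(ied_cdf(theta, theta))  # F(theta) = exp(-1) ~ 0.3679
```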
Although the GPTIIC scheme and distributions from the ISF both play a vital role in lifetime research, little attention has been paid to samples drawn from the ISF under GPTIIC. We therefore consider the following two estimation problems.
One is interval estimation, which, as a classical estimation problem, has engaged the attention of many authors. Viveros and Balakrishnan [
9] derived the interval estimates for some life-time parameters. Balakrishnan et al. [
10] obtained interval estimates for the normal distribution in the case of progressive Type-II censoring. Wang [
11] conducted a study on the scale family under GPTIIC and proposed the exact interval estimation. Dey [
7] derived interval estimates for GIED under progressive first-failure censoring.
The other is the estimation of
, which denotes the stress-strength parameter. Suppose a random stress acts on a system; to construct a stress-strength model, we denote the system strength by
X and the value of the random stress as
Y. The system fails only when its strength is less than the applied stress. Thus,
can reflect the system reliability. As
R finds wide application in many areas, its estimation has been discussed extensively in the statistical literature. Church and Harris [
12], Downton [
13], Govindarajulu [
14], Woodward and Kelley [
15], and Owen et al. [
16] dealt with the situation of
X and
Y coming from normal distributions. Raqab and Kundu [
17], Kundu and Gupta, [
18], and Kundu and Raqab [
19] conducted studies on this estimation problem for the generalized Rayleigh distributions, Weibull distributions, and three-parameter Weibull distributions, respectively.
Here we conduct a study on the exact interval estimation of
as well as the estimation of
R for ISF under GPTIIC. First, the exact interval estimation of
is achieved in
Section 2 by constructing a pivotal quantity. We then assess the performance of the interval estimation through Monte Carlo simulations and illustrate the estimation process with a remission-time dataset. Moreover, the estimation of
is achieved in
Section 3. The MLE, AMLE, and Bayesian estimator of
R are acquired, together with the corresponding interval estimates. In addition, using Bootstrap methods, we derive two interval estimates suitable for small samples. Finally, we evaluate the effectiveness of the proposed estimators through Monte Carlo simulations and provide an illustrative example based on two real datasets.
2. Exact Interval Estimation for
2.1. Interval Estimation with Pivotal Quantity Method
Here, a theorem (proposed by Aggarwala and Balakrishnan [
2,
20]) is introduced to construct the pivotal quantity.
Theorem 1. Consider are the GPTIICOS from the Uniform (0,1) distribution. Then follow independent beta distributions given by Equation (3).

Lemma 1. Consider are the GPTIICOS from the inverted scale family. Then follow independent beta distributions given by Equation (3).

Proof. Since , ranging from 0 to 1, increases as X increases, are the GPTIICOS from the Uniform (0,1) distribution. For , represent in Theorem 1 with . The proof is thus completed. ☐
Denote the CDF of
as
. Then independent random variables
are from the Uniform (0,1) distribution. Here,
is described as Equation (
4). Thus
are independent
random variables. Here,
represents the
distribution with
degrees of freedom. Then we construct the pivotal quantity as:
It is clear that
follows
. For the inverted scale family, when
,
reduces to
As can be seen from Equation (
6), for
,
strictly decreases as
increases. We can also find that for any positive integer
, the exact confidence interval can be derived from Equation (
5) easily when
is strictly monotonic as
increases. Here we consider the failure rate function
for the inverted scale family to determine the monotonicity of
. The following is a brief overview of
.
Suppose a positive continuous random variable
X from ISF(
) as the lifetime of some units in a life-test. Then, the failure rate function represents the probability intensity that an alive unit will fail at the time
x, which is defined as:
For the inverted scale family, it becomes
Here, can be seen as a function of .
According to
, a theorem is given below. Note that
(as shown in Equation (
5)) is strictly decreasing on
if the condition given in Theorem 2 is satisfied.
Theorem 2. Consider are the GPTIICOS from ISF(θ) with censoring scheme . The pivotal quantity (as shown in Equation (5)) is strictly decreasing on if the failure rate function is strictly decreasing on .

Proof. Let
and for
, let
. The pivotal quantity
becomes
The failure rate function from Equation (
7) can be rewritten as:
According to Equation (
8), since
strictly decreases as
increases, we can infer that
is strictly decreasing on
. Thus the monotonicity of
can be determined. The derivative of
is:
From the above equation, it is clear that is strictly decreasing on . Considering the monotonicity of the CDF, we can find is also strictly decreasing on . Thus, we complete the proof. ☐
The exact confidence interval of is given below.
Theorem 3. For any , if is strictly decreasing on , then is an exact confidence interval for θ. Here, denotes the solution to for any and denotes the ξ percentile of .

Proof. Since
is a strictly decreasing function on
, the probability that
belongs to the interval of Equation (
10) is:
☐
Remark 1. For any , if is strictly increasing on , then is an exact confidence interval for θ. Here, denotes the solution to for and denotes the ξ percentile of .

Example 1. The generalized inverse exponential distribution (i.e., GIED()), which includes the inverse exponential distribution (i.e., IED(θ)) as a special case when , is from the inverted scale family with By Theorem 2, we have with the derivative . Let ; then a system of equations is given by From Equation (12), when , we have and thus , which proves that is strictly decreasing on . So is strictly decreasing on . For the generalized inverse exponential distribution, from Equation (5), the pivotal quantity is developed as: When , reduces to Then, applying Theorem 3, we achieve the exact interval estimation.
Example 2. The inverse Weibull distribution (i.e., IWD()), including the inverse exponential distribution (when ) and the inverse Rayleigh distribution (IRD(θ), when ) as special cases, is from the inverted scale family with By Theorem 2, we have with the derivative . Let ; then a system of equations is given by: Clearly, when , we have and thus , which proves that is strictly decreasing on . So is strictly decreasing on .
For the inverse Weibull distribution, from Equation (5), the pivotal quantity is developed as: When , reduces to Then applying Theorem 3, we achieve the exact interval estimation.
2.2. Simulation Study
Here, the exact interval estimation for the ISF under GPTIIC is implemented through Monte Carlo simulations. We take the IED as the prospective life distribution (its CDF and PDF are given in Example 1) and use interval coverage percentages to evaluate the validity of the exact confidence interval derived in Theorem 3.
According to Balakrishnan and Aggarwala [
2], GPTIICOS from IED(
) can be generated by the following algorithm.
- Step 1:
Generate a random variable from .
- Step 2:
Generate independent Uniform (0,1) random variables .
- Step 3:
Set where .
- Step 4:
Set for .
- Step 5:
Set for .
thus denote the GPTIICOS from IED() with the censoring scheme .
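The five steps above can be sketched as follows for the r = 0 special case (ordinary progressive Type-II censoring), following the Balakrishnan-Sandhu construction, with the textbook IED CDF F(x) = exp(−θ/x) so that F⁻¹(u) = −θ/ln(u); names are illustrative:

```python
import random, math

# Sketch of the simulation algorithm in its r = 0 special case
# (ordinary progressive Type-II censoring) for IED(theta); assumed
# CDF F(x) = exp(-theta/x), hence F^{-1}(u) = -theta / ln(u).
def gen_progressive_ied(theta, removals, seed=0):
    """Censored order statistics from IED(theta), scheme (R_1,...,R_m)."""
    rng = random.Random(seed)
    m = len(removals)
    w = [rng.random() for _ in range(m)]          # Step 2: W_i ~ U(0,1)
    v, tail = [], 0
    for i in range(1, m + 1):
        tail += removals[m - i]                   # R_m + ... + R_{m-i+1}
        v.append(w[i - 1] ** (1.0 / (i + tail)))  # Step 3
    x, prod = [], 1.0
    for i in range(1, m + 1):
        prod *= v[m - i]                          # V_m * ... * V_{m-i+1}
        u = 1.0 - prod                            # Step 4: U_{i:m:N}
        x.append(-theta / math.log(u))            # Step 5: invert the CDF
    return x

print(gen_progressive_ied(5.0, [2, 0, 1]))  # three increasing failure times
```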
Here, suppose
are from IED(5). To evaluate the performance of the exact confidence interval estimation, we consider the cases of
,
and
as well as four different
, including
,
,
, and
. In this case, we give the average confidence intervals for
and the corresponding interval lengths and interval coverage percentages with 10,000 simulations. The results are shown in
Table 1. Here, the considered confidence levels are
and
.
As can be seen from
Table 1, the obtained interval coverage percentages are close to the corresponding confidence levels, indicating that the proposed interval estimation performs well. Additionally, the simulation results show the relationship between the confidence interval length and the values of
:
(a) When
(i.e., the number of completely observed failures) and
N (i.e., the sample size) are fixed, the confidence interval length is, to some extent, related to
r (that is, the number of initially unobserved failures); as shown in
Table 1, the average confidence interval is wider when
r is larger.
(b) When the values of r and n are fixed, the interval length is related to N: it is larger when N is smaller.
(c) When the values of N and r are fixed, the interval length is related to n: it is smaller when n is larger.
Obviously, a narrow confidence interval provides more information about the unknown parameter. Since the confidence interval length decreases as the number of completely observed samples increases, we conclude that the estimation is more accurate when more samples are completely observed.
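A self-contained way to reproduce the flavor of this coverage study is the uncensored special case: if X ~ IED(θ) then θ/X ~ Exp(1), so the pivot 2θΣ(1/Xᵢ) is exactly chi-square with 2n degrees of freedom. The stdlib-only sketch below (a simplified stand-in for the paper's GPTIIC pivot, not its code) even approximates the chi-square quantiles by simulation:

```python
import random

# Coverage check for the complete-sample (uncensored) special case:
# W = 2*theta*sum(1/X_i) ~ chi-square(2n), since theta/X_i ~ Exp(1).

def chi2_quantiles(df, probs, reps=50_000, seed=7):
    rng = random.Random(seed)
    # chi-square with 2k df is twice the sum of k Exp(1) variables
    draws = sorted(2.0 * sum(rng.expovariate(1.0) for _ in range(df // 2))
                   for _ in range(reps))
    return [draws[int(p * reps)] for p in probs]

def coverage(theta=5.0, n=20, reps=2000, level=0.95, seed=1):
    lo_q, hi_q = chi2_quantiles(2 * n, [(1 - level) / 2, (1 + level) / 2])
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        inv_sum = sum(rng.expovariate(1.0) for _ in range(n)) / theta  # sum(1/X_i)
        # W in [lo_q, hi_q]  <=>  theta in [lo_q/(2s), hi_q/(2s)]
        if lo_q / (2 * inv_sum) <= theta <= hi_q / (2 * inv_sum):
            hits += 1
    return hits / reps

print(coverage())  # interval coverage close to the 0.95 nominal level
```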
2.3. An Illustrative Example
A real dataset, reported by Gross and Clark [
21], is provided below to illustrate the estimation process proposed in the previous subsections. It shows the remission times of 20 patients receiving an analgesic (in minutes):
Here, we consider three reliability models based on IWD (as shown in Example 2), GIED (as shown in Example 1), and the Weibull distribution (i.e., WD, the CDF and PDF are given below), respectively.
We use maximum likelihood estimation to fit the above three parametric models, and for each model, five measures are used to test the goodness of fit: (1) −lnL, (2) the Kolmogorov-Smirnov (K-S) distance, (3) the p-value, (4) AIC, and (5) BIC. Here, for each model, −lnL can be calculated from the data and the MLEs of the corresponding parameters. The following briefly introduces AIC and BIC.
(i) AIC, proposed by Akaike [22], reflects the fitting effect of a reliability model to the data. The equation is given below:
Here, for an estimated parametric model,
m denotes the number of parameters, and
denotes the maximum value of the log-likelihood function. For a given dataset, several competing models can be ranked by their AIC values, and the one with the lower AIC ranks higher.
(ii) BIC, proposed by Schwarz [23], is a selection criterion for choosing among several estimated statistical models with different numbers of parameters. The equation is given by:
Here, for an estimated parametric model, m denotes the number of parameters, n denotes the number of observed failures, and denotes the maximum value of the log-likelihood function. BIC is closely related to AIC: for a given dataset, competing models are again ranked by their BIC values, and the one with the lower BIC ranks higher. Compared with AIC, however, BIC imposes a greater penalty on additional parameters.
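The two criteria can be sketched as follows (the log-likelihood values below are hypothetical, not the paper's Table 2 numbers):

```python
import math

# AIC and BIC as defined above: m = number of parameters,
# n = number of observed failures, logL = maximized log-likelihood.
def aic(m, logL):
    return 2 * m - 2 * logL

def bic(m, n, logL):
    return m * math.log(n) - 2 * logL

# Rank two hypothetical two-parameter models fitted to n = 20 points:
models = {"IWD": aic(2, -32.1), "GIED": aic(2, -34.6)}
best = min(models, key=models.get)
print(best)  # the model with the lower AIC ranks higher
```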
In the case of MLE, the values of the above five measures based on the three reliability models are listed in
Table 2. Clearly, the best fit order for the given data set is:
Since IWD has the lowest values of −lnL, the K-S distance, AIC, and BIC, and the highest
p-value, it is considered the best of the three reliability models. Furthermore, the fitting effects of the three reliability models are intuitively shown in
Figure 1. Hence, we fit the IWD model to the data set. It appears from
Table 2 that the values of −lnL, the K-S distance, the p-value, AIC, and BIC in the case of MLE are
,
,
,
, and
, respectively, indicating the IWD model fits well to the data when
takes the value of
. Hence, for estimating the exact confidence interval of
, we can assume that the sample follows IWD(
,3.777054). Set
,
, and then
Table 3 shows the censoring scheme and the generated censored data.
We compute the exact confidence interval for
with Equation (
5). Here, the
and
confidence intervals are obtained to be
and
, respectively.
3. Estimation of
Consider the situation when the distributions of independent random variables
X and
Y are ISF(
) and ISF(
) (the CDF and PDF are shown in
Section 1), respectively.
Then, the stress-strength parameter is described as:
Suppose are the GPTIICOS from X with censoring scheme ; are the GPTIICOS from Y with censoring scheme . Based on the above samples, we discuss the estimation of R.
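Before turning to the estimators, a quick sanity check: in the IED special case, R = P(X > Y) has the closed form θ₁/(θ₁ + θ₂), since θ/X ~ Exp(1) when X ~ IED(θ). The Monte Carlo sketch below verifies this identity; it is an illustrative check, not the paper's estimator:

```python
import random

# Monte Carlo check of R = P(X > Y) for X ~ IED(theta1), Y ~ IED(theta2);
# in this special case R = theta1 / (theta1 + theta2) in closed form.
def mc_stress_strength(theta1, theta2, reps=200_000, seed=3):
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = theta1 / rng.expovariate(1.0)  # X ~ IED(theta1)
        y = theta2 / rng.expovariate(1.0)  # Y ~ IED(theta2)
        hits += x > y
    return hits / reps

theta1, theta2 = 4.0, 2.0
print(mc_stress_strength(theta1, theta2))  # ~ 4/6, i.e. about 0.667
```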
3.1. Maximum Likelihood Estimator (MLE)
In order to get the MLE of R, the MLEs of and need to be derived. Here, denote the MLEs of , , R as , , for convenience.
The likelihood function is described as Equation (
15) based on the above samples.
where
It seems impossible to get the MLEs of
and
explicitly from the likelihood Equations (
16) and (17). However, we can solve Equations (
16) and (17) with the R programming language to obtain
and
. Then
can be expressed as
. In the following subsection, explicit solutions of
and
will be given using an approximate method.
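As a concrete special case where the likelihood equations do admit an explicit solution, consider uncensored IED samples: there the score equation n/θ − Σ(1/xᵢ) = 0 gives θ̂ = n/Σ(1/xᵢ), and invariance of the MLE yields R̂. The sketch below uses hypothetical data and stands in for solving Equations (16) and (17) numerically:

```python
# Complete-sample IED special case: the likelihood equation
# n/theta - sum(1/x_i) = 0 has the closed form theta_hat = n / sum(1/x_i).
# By MLE invariance (IED case), R_hat = t1 / (t1 + t2).
def ied_mle(sample):
    return len(sample) / sum(1.0 / v for v in sample)

x = [2.0, 4.0, 5.0, 8.0]   # strength sample (hypothetical)
y = [1.0, 2.0, 4.0]        # stress sample (hypothetical)
t1, t2 = ied_mle(x), ied_mle(y)
r_hat = t1 / (t1 + t2)
print(round(t1, 4), round(t2, 4), round(r_hat, 4))
```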
3.2. Approximate Maximum Likelihood Estimator (AMLE)
From the above analysis, getting explicit solutions for
and
from the nonlinear Equations (
16) and (17) is quite difficult, so we use Taylor expansions to obtain approximate estimators. Denote the AMLEs of
,
,
R as
,
,
, respectively.
Considering the transformation
, then its CDF and PDF are respectively given by:
Remark 2. is decreasing on .
Proof. According to the monotonicity of , the proof is thus completed. ☐
Let
and
. The likelihood equations of
and
can be rewritten by:
Consider the following Taylor expansions that keep only the first two terms. Let
,
,
and expand them around the point
; let
,
,
and expand them around the point
. Note that for
,
and
are the derivatives of
and
, respectively.
Then we consider the expression for
and
. According to the transformation of
and the monotonicity of
, we have
and
. For
, let
. Here
denote the GPTIICOS from the Uniform (0,1) distribution. Then
can be approximated as
, where
can be obtained from the following equations (proposed by Balakrishnan [
2]).
where
Clearly, can be approximated in a similar way.
Combining the above Taylor expansions, the likelihood Equations (
20) and (21) then reduce to
Here, we present two theorems that give the explicit solutions for and .
Theorem 4 [24]. If is concave on , then the function is said to be log-concave, and the following conditions are satisfied: (i) is monotone decreasing on . (ii) .

Remark 3. It is easy to prove that Theorem 4 holds in the case of and .
Theorem 5 [25]. Suppose is the density function on and is the distribution function. If is log-concave on , then (i) is log-concave, and (ii) is also log-concave on .

Lemma 2. If the standard scale density function is log-concave on , then the AMLEs of and are given in Equation (22), and can be obtained with Equation (14), where

Proof. Substitute
by
and
by
and then Equation (
22) are the solutions to the likelihood equations. We need to prove that
and
are positive. By Theorem 5,
and
are log-concave on
when
is log-concave and we have
and
applying Theorem 4. Thus,
and
. It is impossible that
for
. So,
is negative. Thus, from Equation (
22),
is positive. Similarly, we can prove
is also positive.
☐
Note that
(i) in the cases of GIED and IWD (given in
Section 2.1, when
),
is log-concave on
;
(ii) If is not log-concave on , we can take the positive roots of the likelihood equations to get the AMLEs and .
3.3. R-Symmetric and R-Asymmetric Approximate Confidence Intervals
First, we obtain an asymptotic confidence interval, symmetric about , from the asymptotic distributions of and . Since this interval may not work well for small samples, we further derive two R-asymmetric interval estimates using Bootstrap methods.
3.3.1. R-Symmetric Interval Estimation Based on Asymptotic Distributions
The asymptotic distribution of
can be derived with the Fisher information matrix of
, which is described as follows.
where
can further be used to derive the approximate variance-covariance matrix. If all the regularity conditions are satisfied for distributions from the ISF, we give the following asymptotic distributions for and based on the asymptotic properties of the MLE.
Theorem 6. For , , we have in distribution where with , .
Theorem 7. For , , we have where

Proof. Applying the Delta method and Theorem 6, we can describe the asymptotic distribution of
as
, where
It is easy to prove that and . Thus the proof is completed. ☐
From the results of Theorem 7, a
asymptotic confidence interval can be determined as:
, which is symmetric about
. Here,
A is estimated with
and
.
is
th percentile of
. Similarly, by estimating matrix
A with
and
, the R-symmetric asymptotic confidence interval based on
is thus acquired. Note that
and
is given in
Section 3.2.
3.3.2. R-Asymmetric Interval Estimation Based on Bootstrap Methods
Since the obtained asymptotic distributions may not hold for small samples, we propose two Bootstrap confidence intervals, which are R-asymmetric and expected to perform better.
(a) Bootstrap-p method
- Step 1:
Generate
from
and
from
, respectively. Based on the generated samples, calculate
with Equation (
14);
- Step 2:
Repeat step 1, times;
- Step 3:
Define the CDF of as . Let for a given z. Thus an approximate confidence interval of R is .
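The Bootstrap-p steps above can be sketched as follows in the uncensored IED special case (parametric resampling from the fitted model; the data and B are illustrative, and the percentile interval stands in for the paper's definition via the bootstrap CDF):

```python
import random

# Parametric Bootstrap-p sketch for R in the complete-sample IED case:
# refit theta on each resample drawn from the fitted IED model and take
# percentiles of the bootstrap R* values.  Illustrative only.
def ied_mle(sample):
    return len(sample) / sum(1.0 / v for v in sample)

def boot_p_interval(x, y, B=1000, level=0.95, seed=11):
    rng = random.Random(seed)
    t1, t2 = ied_mle(x), ied_mle(y)
    stars = []
    for _ in range(B):
        xs = [t1 / rng.expovariate(1.0) for _ in x]  # X* ~ IED(t1_hat)
        ys = [t2 / rng.expovariate(1.0) for _ in y]  # Y* ~ IED(t2_hat)
        s1, s2 = ied_mle(xs), ied_mle(ys)
        stars.append(s1 / (s1 + s2))                 # bootstrap R*
    stars.sort()
    a = (1 - level) / 2
    return stars[int(a * B)], stars[int((1 - a) * B) - 1]

x = [2.0, 4.0, 5.0, 8.0]
y = [1.0, 2.0, 4.0]
lo, hi = boot_p_interval(x, y)
print(round(lo, 3), round(hi, 3))
```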
(b) Bootstrap-t method
- Step 1:
Calculate , and with and ;
- Step 2:
Generate from and from , respectively;
- Step 3:
Calculate
with Equation (
14) and calculate
with
(Note that
can be derived from Theorem 7);
- Step 4:
Repeat steps 2 and 3, times;
- Step 5:
Define the CDF of as . Let for a given z. Thus an approximate confidence interval of R is .
3.4. Bayesian Estimation
Since gamma distributions are frequently used as prior distributions for parameters of the inverted scale family in the related articles [
4,
7,
26,
27,
28], we assume independent gamma priors on
and
, that is,
for
. The PDFs are given in Equation (
24). Note that
R can be derived in a similar way when
follows other prior distributions.
Then, the joint posterior density function of
and
is:
Equation (
25) is found to be intractable, and therefore obtaining the corresponding Bayesian estimator analytically is quite difficult. However, samples can be generated from the posterior distribution via Gibbs sampling, so the Bayesian estimator of
R and HPD credible interval, proposed by Chen [
29], can be obtained easily. Here, we give the posterior density functions of
and
:
Both posterior density functions are of unknown form, so the Metropolis-Hastings (M-H) method is used to generate and from and , respectively. The Gibbs sampling algorithm is as follows:
- Step 1:
Determine the initial value ;
- Step 2:
Set ;
- Step 3:
Using the M-H method, generate from . Here, the proposal distribution is ;
- Step 4:
Using the M-H method, generate from . Here, the proposal distribution is ;
- Step 5:
Compute
by Equation (
14);
- Step 6:
Set ;
- Step 7:
Repeat steps 3 to 6 K times.
Then the approximate posterior mean and posterior variance of
R are:
The HPD credible interval at
credible level can be given by:
Here, denotes the smallest of {}.
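In the uncensored IED special case, the gamma prior is actually conjugate: the posterior of each scale parameter is again gamma, Gamma(a + n, b + Σ(1/xᵢ)) in shape-rate form, so R can be sampled directly without an M-H step. The sketch below (illustrative hyperparameters and data; an equal-tailed interval rather than the HPD interval) mirrors the processing of the Gibbs output described above:

```python
import random

# Complete-sample IED case with Gamma(a, b) priors (rate b): the
# posterior of theta is Gamma(a + n, b + sum(1/x_i)), so posterior
# samples of R = theta1/(theta1 + theta2) come from direct simulation.
# Hyperparameters and data below are illustrative.
def posterior_r_samples(x, y, a=1.0, b=1.0, K=5000, seed=5):
    rng = random.Random(seed)
    sx = sum(1.0 / v for v in x)
    sy = sum(1.0 / v for v in y)
    out = []
    for _ in range(K):
        # random.gammavariate takes (shape, scale); scale = 1/rate
        t1 = rng.gammavariate(a + len(x), 1.0 / (b + sx))
        t2 = rng.gammavariate(a + len(y), 1.0 / (b + sy))
        out.append(t1 / (t1 + t2))
    return out

r = posterior_r_samples([2.0, 4.0, 5.0, 8.0], [1.0, 2.0, 4.0])
r.sort()
mean = sum(r) / len(r)
cred = (r[int(0.025 * len(r))], r[int(0.975 * len(r))])  # equal-tailed
print(round(mean, 3), cred)
```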
3.5. Simulation Study
Supposing both
X and
Y are from the IED (as shown in Example 1), the proposed estimates of
R are evaluated through Monte Carlo simulations. The GPTIICOS from
X and
Y are generated with the algorithm in
Section 2.2.
Here three censoring schemes are considered, which are listed and numbered in
Table 4, and six sets of parameter values
are used for simulations, including
,
,
,
,
, and
.
We evaluate the validity of the estimates obtained under each censoring scheme and each value of as follows. First, the accuracy of the point estimates is compared in terms of their mean squared errors (MSE) and biases. Then, using the average interval lengths and interval coverage percentages, we assess the different interval estimates.
The average point estimates for each parameter value and censoring scheme are obtained through 1000 simulations, which are shown in
Table 5 together with the corresponding average biases and MSEs. Further, we give the heat maps of
Table 5 in
Figure 2 and
Figure 3. The proposed intervals at
confidence/credible level are also obtained.
Table 6 gives the obtained average interval lengths and interval coverage percentages for different methods. In our simulations, the HPD credible intervals are acquired after 2000 samples while the Bootstrap confidence intervals are acquired after 250 resamples.
From
Table 5, all the point estimates are quite effective, as the corresponding MSEs and biases are small. Additionally, we infer from
Figure 2 and
Figure 3 that the accuracy of the estimators is somewhat related to the true value of
because the MSEs and biases of the Bayesian estimator show a great difference between the case of
and the others. We find that in the cases where
takes
and
, the MSEs and biases of the Bayesian estimator are much higher than those in the other cases. However, in the cases of
, Bayes estimates are extremely accurate.
For the interval estimations, we refer to
Table 6. The Bootstrap-t interval is wider than the others in most cases and, according to the interval coverage percentages, Bootstrap-t intervals perform slightly better than the asymptotic confidence intervals obtained from
and
. Moreover, the interval coverage percentages of the R-symmetric asymptotic confidence intervals deviate more from
, which indicates that they perform poorly compared with the others. This is probably because small samples affect the asymptotic distribution of
R, which in turn affects the accuracy of the interval estimation. However, the HPD credible interval clearly performs better than the asymptotic and Bootstrap intervals, since its interval coverage percentages are closer to the nominal level. Bayesian estimation is therefore considered the best method, and according to the interval coverage percentages, the ranking of the three methods is given below.
3.6. An Illustrative Example
Here, an analysis of two real datasets, reported by Bader and Priest [
30], is given below. Many reliability models, for example, the models based on the generalized Rayleigh distributions, Weibull distributions and three-parameter Weibull distributions, have been applied to the two datasets (see Raqab and Kundu [
17], Kundu and Gupta [
18], and Kundu and Raqab [
19]).
Data Set 1. The data below show the breaking strength of 63 carbon fibers of 10 mm gauge length in a tension test (unit: GPa).
Data Set 2. The data below show the breaking strength of 69 carbon fibers of 20 mm gauge length in a tension test (unit: GPa).
Here, the fitting of three reliability models based on GIED (as shown in Example 1), WD (as shown in
Section 2.3), and IWD (as shown in Example 2) is considered. We use maximum likelihood estimation to fit the above three parametric models. As in
Section 2.3, for each model, five measures are considered to test the goodness of fit: (1) −lnL, (2) the K-S distance, (3) the p-value, (4) AIC, and (5) BIC. In the case of MLE, these values for the three reliability models are listed in
Table 7. Clearly, the best fit order for both datasets is:
Since GIED has not only the lowest values of −lnL, AIC, and BIC but also the highest
p-value, it is considered as the best model. Furthermore, the fitting effects of the three reliability models are intuitively shown in
Figure 4. So GIED models are used to fit the two datasets separately. It appears from
Table 7 that, for Data Set 1, the values of −lnL, the K-S distance, the p-value, AIC, and BIC in the case of MLE are
,
,
,
, and
, respectively, indicating the GIED model fits the Data Set 1 very well when
takes
As with Data Set 1, the GIED model fits Data Set 2 very well when
takes
. Hence, it is reasonable to assume
X follows GIED(
,175.2879) and
Y follows GIED(
,205.87851). Set
.
Table 8 shows the respective censoring schemes and the generated censored data.
With the R programming language, the proposed estimates of
R are derived. As shown in
Table 9, the MLE, AMLE, and Bayesian estimator are 0.782, 0.794, and 0.782, respectively. The corresponding
confidence/credible intervals are (0.680, 0.885), (0.690, 0.898), and (0.651, 0.871), respectively. Additionally, the Bootstrap-p and Bootstrap-t confidence intervals are obtained to be (0.665, 0.878) and (0.694, 0.913), respectively.