1. Introduction
In practice, products often fail from multiple causes. Such causes of failure are referred to as competing risks in the literature and arise in various fields such as industrial engineering, reliability analysis, and lifetime studies. Under a standard scenario, competing risks data consist of a failure time together with an indicator of the failure cause. Inference for competing risks data has attracted wide attention and been discussed by many authors; see, for example, the recent works of Rafiee et al. [1], Balakrishnan et al. [2], Varghese and Vaidyanatha [3], and Koley et al. [4], among others. In traditional analyses, failure causes are commonly treated as independent, but this assumption may be inappropriate in view of practical complexity. Dependent models therefore often describe competing failure causes more faithfully, and they have been a topic of much recent discussion. Commonly used approaches for modeling dependent failure causes include bivariate and multivariate distributions (e.g., [5,6,7]), shock models (e.g., [8,9,10,11]), and copula methods (e.g., [12,13,14,15,16]). It is worth mentioning that the shock model plays an important role in the analysis of competing risks in reliability theory and lifetime studies, while the copula method provides another popular and flexible way to model dependent variables owing to its ability to separate the marginal distributions from the dependence structure. Beyond the traditional copula functions mentioned in the references above, the vine copula method has recently received much attention in uncertainty analysis because of its novel structural inferential approach, which greatly expands the application scope of copula methods. For example, Amini et al. [
17] proposed a novel way via vine-copula function in uncertainty quantification of aging dams using meta-models to fully capture nonlinear dependencies. Zhang et al. [
18] proposed a vine-copula-based partially accelerated competing risks model using a tampered random variable transformation.
In practical data analysis, failure times usually appear as incomplete data owing to experimental limitations such as cost and time constraints. Among the various features of incomplete data, truncation and censoring are two of the most important characteristics frequently arising in practice. In particular, when units enter the study at a known time point after the time origin, the data suffer left truncation, whereas under right censoring the failure times are only known to exceed a prefixed censoring point. Truncation and censoring are thus classical topics in survival and medical analysis and are widely recognized in biostatistics, reliability engineering, and many other fields; moreover, owing to complex life-cycle environments and testing conditions, left-truncated and right-censored (LTRC) data arise widely and are quite general in practical situations. As a well-known engineering example, Hong et al. [19] reported LTRC data on voltage power transformers: approximately 150,000 high-voltage power transmission transformers are in service in the US, transformers were installed before or after the year 1980, and the failure data were collected by the corresponding energy companies between 1980 and 2008. In this case, complete information is available only for the transformers installed after 1980 and for the transformers installed before 1980 that failed after 1980, while information on transformers still surviving in 2008 is right-censored. The observed transformer failure times therefore appear as LTRC data. Similar real-life examples arise widely in medical treatment, survival analysis, and biostatistics, among other fields. It is therefore meaningful to study LTRC data, which offer both theoretical interest and practical value in real-life analysis and decision-making. In the literature, inference for LTRC data has been discussed extensively from various perspectives. To name a few, Hong et al. [19], Balakrishnan and Mitra [20,21,22], Mitra and Balakrishnan [23], and Kundu and Mitra [24] developed inference with the aid of EM algorithms, and Emura et al. [25] compared the Newton-Raphson (NR) and EM algorithms. Bayesian analysis of LTRC data from lognormal, Weibull, and gamma distributions was developed by Mitra et al. [26], Ranjan et al. [27], and Wang et al. [28]. Analysis of LTRC data via regression approaches with semiparametric and covariate factors was discussed by McGough [29], Zhang et al. [30], Park [31], Frumento and Bottai [32], and Huang and Qin [33], among others. In addition, when multiple failure causes are involved, the associated LTRC competing risks data were studied by Wang et al. [13], Kundu et al. [34], Wang et al. [35], Una-Alvarez and Veraverbeke [36], and Shih and Emura [37], among others; interested readers may refer to Emura and Michimae [38] for a review. For clarity,
Figure 1 is presented to show the associated analysis strategies for LTRC data.
Owing to the theoretical and practical importance of LTRC data, this paper considers inference for dependent LTRC competing risks data. When the failure causes of an LTRC failure time follow a simple shock model, namely the Marshall-Olkin bivariate exponential (MOBE) distribution proposed by Marshall and Olkin [11], estimation of the unknown parameters and the reliability indices is developed under classical and various Bayesian procedures, respectively. The potential novelties and contributions of this paper are as follows. On the one hand, whereas the existing literature focuses on independent LTRC competing risks data, this paper considers dependent competing risks data under the LTRC scheme. Although the lifetimes of the failure causes are modeled by the relatively simple MOBE distribution, similar results could be obtained when other, more complex baseline models are used within a Marshall-Olkin-type distributional structure. On the other hand, various Bayesian inferential approaches, including traditional, E-Bayesian, and objective-Bayesian estimation, are proposed, and the associated risk criterion quantities and asymptotic properties are obtained in different cases. It is worth mentioning that a previous paper by one of the authors [35] also addressed inference for dependent LTRC competing risks data; the common point and the differences between the two papers are as follows. First, the common point is that the dependent LTRC competing risks data are modeled by Marshall-Olkin-type distributions in both papers: in [35], the lifetimes of the causes of risk are modeled by a Marshall-Olkin-type distribution with a Weibull baseline, whereas the exponential-based Marshall-Olkin bivariate distribution is considered in the current paper. Second, one main difference is that the prior distributions differ between the two papers: a general dependent prior is used for the model parameters in [35] and the associated results are obtained via Monte Carlo sampling, whereas in the current paper independent gamma priors are adopted to incorporate extra prior information under a Bayesian perspective, and exact estimators are subsequently established. Finally, compared with the traditional standard Bayesian estimation in [35], another main difference is that E-Bayesian and objective-Bayesian methods are further proposed in the current paper, where the associated estimated risks and E-posterior risks are established, the corresponding asymptotic equivalence is investigated, and relatively robust estimates are provided under such scenarios. For illustration, a flowchart of the main contents of the paper is presented in Figure 2. To the best of our knowledge, this problem has not been discussed before in the literature.
The article is organized as follows. In Section 2, the MOBE model and the LTRC data description are introduced. Section 3 establishes the maximum likelihood estimators (MLEs) and the associated approximate confidence intervals (ACIs) for the parameters of interest. Conventional Bayesian, E-Bayesian, and objective-Bayesian estimations are proposed in Section 4. Simulation studies and two real-life examples are presented in Section 5. Finally, some brief concluding remarks are given in Section 6.
2. Model and Data Description
2.1. Marshall–Olkin Bivariate Exponential Distribution
A random variable $U$ follows an exponential distribution with hazard rate $\lambda>0$ when the probability density function (PDF), cumulative distribution function (CDF), and survival function (SF) of $U$ are given, respectively, by
$$f(u;\lambda)=\lambda e^{-\lambda u},\qquad F(u;\lambda)=1-e^{-\lambda u},\qquad S(u;\lambda)=e^{-\lambda u},\qquad u>0.$$
This is denoted by $U\sim\mathrm{Exp}(\lambda)$.
Let $U_0$, $U_1$, and $U_2$ be independent exponential random variables satisfying $U_j\sim\mathrm{Exp}(\lambda_j)$, $j=0,1,2$. Define $X_1=\min(U_0,U_1)$ and $X_2=\min(U_0,U_2)$; then the random vector $(X_1,X_2)$ has the MOBE distribution with parameters $(\lambda_0,\lambda_1,\lambda_2)$, denoted by $(X_1,X_2)\sim\mathrm{MOBE}(\lambda_0,\lambda_1,\lambda_2)$.
For the sake of simplicity, denote $\lambda=\lambda_0+\lambda_1+\lambda_2$ and $\boldsymbol{\lambda}=(\lambda_0,\lambda_1,\lambda_2)$. Some results for the MOBE model are provided below; the associated proofs are omitted to save space.
Theorem 1. Suppose the random vector $(X_1,X_2)$ follows the MOBE distribution with parameters $(\lambda_0,\lambda_1,\lambda_2)$; then the joint SF of $(X_1,X_2)$ is given by
$$S(x_1,x_2)=P(X_1>x_1,X_2>x_2)=\exp\{-\lambda_1x_1-\lambda_2x_2-\lambda_0\max(x_1,x_2)\}.$$
Corollary 1. Suppose the random vector $(X_1,X_2)$ follows the MOBE distribution with parameters $(\lambda_0,\lambda_1,\lambda_2)$; then the joint PDF of $(X_1,X_2)$ can be expressed as
$$f(x_1,x_2)=\begin{cases}\lambda_1(\lambda_0+\lambda_2)\,e^{-\lambda_1x_1-(\lambda_0+\lambda_2)x_2},&0<x_1<x_2,\\ \lambda_2(\lambda_0+\lambda_1)\,e^{-\lambda_2x_2-(\lambda_0+\lambda_1)x_1},&0<x_2<x_1,\\ \lambda_0\,e^{-\lambda x},&x_1=x_2=x.\end{cases}$$
Corollary 2. Suppose the random vector $(X_1,X_2)$ follows the MOBE distribution with parameters $(\lambda_0,\lambda_1,\lambda_2)$; then the variable $T=\min(X_1,X_2)$ has SF and hazard rate function (HRF) at mission time $t>0$ given by
$$S_T(t)=e^{-\lambda t},\qquad h_T(t)=\lambda=\lambda_0+\lambda_1+\lambda_2.$$
It is noted from Theorem 1 that, when $\lambda_0=0$, the variables $X_1$ and $X_2$ are independent; therefore, the parameter $\lambda_0$ can be regarded as describing the dependence structure between $X_1$ and $X_2$. Moreover, it is noted from Corollary 1 that the probability contributions of the events $\{X_1<X_2\}$, $\{X_1>X_2\}$, and $\{X_1=X_2\}$ are $\lambda_1/\lambda$, $\lambda_2/\lambda$, and $\lambda_0/\lambda$, respectively. In addition, it is seen from Corollary 2 that, for units with dependent failure causes following the MOBE model, the failure time of a unit can be described by the variable $T=\min(X_1,X_2)$.
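As a quick numerical illustration of Corollaries 1 and 2, the following sketch simulates the MOBE construction ($X_1=\min(U_0,U_1)$, $X_2=\min(U_0,U_2)$) and compares the empirical cause probabilities and mean failure time with the stated values; the rate values are illustrative choices, not taken from the paper.

```python
import numpy as np

# Monte Carlo check of the MOBE structure: with independent
# U_j ~ Exp(lam_j), set X1 = min(U0, U1) and X2 = min(U0, U2).
# Corollary 1 gives P(X1 < X2) = lam1/lam, P(X1 > X2) = lam2/lam,
# P(X1 = X2) = lam0/lam; Corollary 2 gives T = min(X1, X2) ~ Exp(lam),
# where lam = lam0 + lam1 + lam2.  The rates below are illustrative.
rng = np.random.default_rng(2024)
lam0, lam1, lam2 = 0.5, 1.0, 1.5
lam = lam0 + lam1 + lam2
n = 200_000

u0 = rng.exponential(1 / lam0, n)
u1 = rng.exponential(1 / lam1, n)
u2 = rng.exponential(1 / lam2, n)
x1, x2 = np.minimum(u0, u1), np.minimum(u0, u2)

p_cause1 = np.mean(x1 < x2)    # failure from cause 1 alone
p_cause2 = np.mean(x1 > x2)    # failure from cause 2 alone
p_both = np.mean(x1 == x2)     # simultaneous failure (U0 is the minimum)
t = np.minimum(x1, x2)         # unit failure time
```

Setting `lam0 = 0` makes the tie probability vanish, recovering the independence of $X_1$ and $X_2$ noted after Theorem 1.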
2.2. Data Description and Notation
Without loss of generality, consider a lifetime experiment with $n$ identical units whose lifetimes are described by independent and identically distributed (i.i.d.) random variables. For the $i$-th unit, it is assumed that there is a prefixed left-truncation point, say $\tau_i$; a unit may be placed on test before or after its truncation point $\tau_i$, and its failure time can be observed only if it exceeds $\tau_i$; otherwise, no information is available for the unit. In addition, if the $i$-th unit survives beyond $\tau_i$, it may be censored at another predetermined point $c_i$. Observations obtained in this manner are referred to as LTRC data. Further, for the sake of clarity, the notation used in this paper for the dependent LTRC competing risks data is presented as follows.
$X_{ij}$: failure time of the $i$-th unit under cause $j$, $j=1,2$;
$\tau_i$: left-truncation time for the $i$-th unit;
$c_i$: right-censoring time for the $i$-th unit;
$T_i$: observed lifetime of the $i$-th unit, i.e., $T_i=\min(X_{i1},X_{i2},c_i)$;
$\delta_i$: failure-cause indicator variable for the $i$-th unit, with $\delta_i=1,2,3$ when the unit fails from cause 1, cause 2, or both causes simultaneously, respectively, and $\delta_i=0$ when the unit is censored;
$\nu_i$: truncation indicator variable for the $i$-th unit, with $\nu_i=1$ if the unit is left truncated and $\nu_i=0$ otherwise;
$I_0$: set of indices of censored observations;
$I_j$: set of indices of failures due to cause $j$, $j=1,2,3$;
$n_j$: cardinality of $I_j$. It is assumed that $n_0+n_1+n_2+n_3=n$.
In this paper, we suppose that there are $n$ LTRC observations with two causes of failure in the experiment, and that the associated competing risks variables $(X_{i1},X_{i2})$, $i=1,\dots,n$, satisfy $(X_{i1},X_{i2})\sim\mathrm{MOBE}(\lambda_0,\lambda_1,\lambda_2)$. In this manner, the following dependent LTRC competing risks data are obtained: $\{(t_i,\tau_i,c_i,\delta_i,\nu_i),\ i=1,\dots,n\}$, where $t_i$ denotes the realized value of $T_i$.
Theorem 2. Suppose the LTRC observations come from the MOBE model; then the likelihood contribution of a testing unit is $\lambda_1e^{-\lambda(t_i-\nu_i\tau_i)}$, $\lambda_2e^{-\lambda(t_i-\nu_i\tau_i)}$, or $\lambda_0e^{-\lambda(t_i-\nu_i\tau_i)}$ for a failure with $\delta_i=1,2,3$, respectively, and $e^{-\lambda(t_i-\nu_i\tau_i)}$ for a censored unit.
From Theorem 2, the likelihood function of the parameters $\lambda_0$, $\lambda_1$, and $\lambda_2$ is
$$L(\lambda_0,\lambda_1,\lambda_2)\propto\lambda_0^{n_3}\lambda_1^{n_1}\lambda_2^{n_2}\,e^{-\lambda W},$$
with $W=\sum_{i=1}^{n}(t_i-\nu_i\tau_i)$.
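To make the data scheme concrete, the sketch below generates dependent LTRC competing risks data from the MOBE model and evaluates the estimates $\hat{\lambda}_j=n_j/W$ implied by the reconstructed likelihood above; the truncation and censoring settings (uniform $\tau_i$ and $c_i$) are illustrative assumptions, not the paper's design.

```python
import numpy as np

# Sketch: simulate dependent LTRC competing risks data from the MOBE
# model and evaluate the likelihood-based estimates lam_j_hat = n_j / W
# with W = sum of (t_i - nu_i * tau_i).  Truncation and censoring
# settings below are illustrative assumptions.
rng = np.random.default_rng(7)
lam0, lam1, lam2 = 0.3, 0.8, 0.6
n_units = 50_000

u0 = rng.exponential(1 / lam0, n_units)
u1 = rng.exponential(1 / lam1, n_units)
u2 = rng.exponential(1 / lam2, n_units)
x1, x2 = np.minimum(u0, u1), np.minimum(u0, u2)
life = np.minimum(x1, x2)

tau = rng.uniform(0.0, 0.5, n_units)          # left-truncation points
cens = tau + rng.uniform(1.0, 3.0, n_units)   # right-censoring points
obs = life > tau                              # only these units are observed
t = np.minimum(life, cens)[obs]               # observed failure/censoring time
d1 = ((x1 < x2) & (life <= cens))[obs]        # failed from cause 1
d2 = ((x1 > x2) & (life <= cens))[obs]        # failed from cause 2
d3 = ((x1 == x2) & (life <= cens))[obs]       # failed from both causes

W = np.sum(t - tau[obs])                      # all observed units truncated here
mle = np.array([d3.sum(), d1.sum(), d2.sum()]) / W  # estimates of lam0, lam1, lam2
```

Because the exponential distribution is memoryless, conditioning on survival past $\tau_i$ leaves the residual lifetime exponential with the same rate, which is why the simple ratio estimates recover the true rates here.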
4. Method of Bayesian Estimation
In this section, traditional Bayesian, E-Bayesian, and objective-Bayesian methods are proposed for estimating the parameters and the reliability indices.
4.1. Prior Information and Posterior Analysis
For Bayesian estimation, prior information should be incorporated into the inferential procedure. In this section, we adopt independent gamma distributions to describe the prior information about the model parameters. In statistical inference, the gamma distribution is flexible and can represent different prior information through a proper choice of hyperparameters; it becomes inversely proportional to its argument when the hyperparameters are set to zero, and several other models, such as the Erlang, exponential, and chi-square distributions, are special cases of the gamma distribution. In addition, the gamma distribution is a maximum entropy probability distribution, an appealing property in fields such as business, science, and engineering. Therefore, it is assumed that the parameters $\lambda_0$, $\lambda_1$, and $\lambda_2$ are statistically independent, and the gamma conjugate prior for $\lambda_j$ is assumed with hyperparameters $a_j>0$ and $b_j>0$ as
$$\pi(\lambda_j)\propto\lambda_j^{a_j-1}e^{-b_j\lambda_j},\qquad j=0,1,2.$$
Therefore, the joint prior of $(\lambda_0,\lambda_1,\lambda_2)$ is given by
$$\pi(\lambda_0,\lambda_1,\lambda_2)\propto\lambda_0^{a_0-1}\lambda_1^{a_1-1}\lambda_2^{a_2-1}e^{-b_0\lambda_0-b_1\lambda_1-b_2\lambda_2},$$
and the joint posterior density of $(\lambda_0,\lambda_1,\lambda_2)$ can be obtained by combining this prior with the likelihood as
$$\pi(\lambda_0,\lambda_1,\lambda_2\mid\text{data})\propto\lambda_0^{n_3+a_0-1}\lambda_1^{n_1+a_1-1}\lambda_2^{n_2+a_2-1}e^{-(W+b_0)\lambda_0-(W+b_1)\lambda_1-(W+b_2)\lambda_2},$$
implying that the marginal posteriors of $\lambda_0$, $\lambda_1$, and $\lambda_2$ are $\mathrm{Gamma}(n_3+a_0,\,W+b_0)$, $\mathrm{Gamma}(n_1+a_1,\,W+b_1)$, and $\mathrm{Gamma}(n_2+a_2,\,W+b_2)$, respectively, which are again gamma distributions.
Under squared error loss, the Bayesian estimator is the posterior expectation, so the following results are obtained directly and details are omitted for concision.
Theorem 4. Under squared error loss, one has:
The Bayesian estimators of the parameters are given by
$$\hat{\lambda}_0^{B}=\frac{n_3+a_0}{W+b_0},\qquad\hat{\lambda}_1^{B}=\frac{n_1+a_1}{W+b_1},\qquad\hat{\lambda}_2^{B}=\frac{n_2+a_2}{W+b_2}.$$
The Bayesian estimators of the SF and HRF can be expressed as
$$\hat{S}^{B}(t)=\Big(\frac{W+b_0}{W+b_0+t}\Big)^{n_3+a_0}\Big(\frac{W+b_1}{W+b_1+t}\Big)^{n_1+a_1}\Big(\frac{W+b_2}{W+b_2+t}\Big)^{n_2+a_2},\qquad\hat{h}^{B}(t)=\hat{\lambda}_0^{B}+\hat{\lambda}_1^{B}+\hat{\lambda}_2^{B}.$$
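The conjugate updates in Theorem 4 can be sketched as follows, assuming the reconstructed gamma posteriors $\mathrm{Gamma}(n_j+a_j,\,W+b_j)$ for the rates; the helper name and argument layout are illustrative, not from the paper.

```python
import numpy as np

def bayes_estimates(n_fail, W, a, b):
    """Squared-error Bayes estimates under independent gamma priors,
    assuming each rate has marginal posterior Gamma(n_j + a_j, W + b_j)
    as in Theorem 4.  n_fail = (n3, n1, n2), paired with (lam0, lam1, lam2);
    the function name and layout are illustrative."""
    n_fail, a, b = (np.asarray(v, float) for v in (n_fail, a, b))
    lam_hat = (n_fail + a) / (W + b)           # posterior means of the rates
    h_hat = lam_hat.sum()                      # Bayes estimate of the HRF
    def s_hat(t):                              # Bayes estimate of the SF at t
        return np.prod(((W + b) / (W + b + t)) ** (n_fail + a))
    return lam_hat, h_hat, s_hat
```

Note that, by Jensen's inequality, the SF estimate exceeds $e^{-\hat{h}t}$, since the posterior mean of $e^{-\lambda t}$ is larger than $e^{-E[\lambda]t}$.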
It is noted that sometimes little prior information can be collected from historical information or past data; flat or non-informative priors may then be more appropriate. Although gamma priors are adopted in our illustration, the Bayesian results can also be established as a special case under such situations by setting all hyperparameters to zero, which reduces the gamma priors to non-informative priors. In addition, for an arbitrary significance level $\alpha\in(0,1)$, $100(1-\alpha)\%$ Bayesian highest posterior density (HPD) credible intervals for the parameters $\lambda_0$, $\lambda_1$, and $\lambda_2$ can also be constructed; a simple approach, namely Algorithm 1, is provided as follows.
Algorithm 1: Bayesian HPD credible interval estimation
- Step 1 For an arbitrarily chosen density level $k>0$, obtain the solutions $\lambda_{jL}$ and $\lambda_{jU}$ ($\lambda_{jL}<\lambda_{jU}$) of the equation $\pi(\lambda_j\mid\text{data})=k$, and construct the interval $[\lambda_{jL},\lambda_{jU}]$.
- Step 2 Calculate the posterior probability $p=P(\lambda_{jL}\le\lambda_j\le\lambda_{jU}\mid\text{data})$.
- Step 3 The HPD credible interval of $\lambda_j$ is obtained from the following three cases: for the given $k$ in Step 1, if $|p-(1-\alpha)|$ is within a prefixed accuracy level, then $[\lambda_{jL},\lambda_{jU}]$ is the targeted HPD credible interval estimate; if $p>1-\alpha$, then increase $k$ and return to Steps 1 and 2; if $p<1-\alpha$, then decrease $k$ and return to Steps 1 and 2.
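A minimal implementation of Algorithm 1 for a gamma marginal posterior might look as follows, using bisection on the density level $k$; the function name and the unimodality assumption (shape parameter larger than one) are our own.

```python
from scipy import stats, optimize

def hpd_gamma(shape, rate, alpha=0.05, tol=1e-6):
    """Algorithm 1 for a Gamma(shape, rate) marginal posterior:
    bisection on the density level k (assumes shape > 1, so the
    posterior is unimodal with an interior mode)."""
    dist = stats.gamma(a=shape, scale=1.0 / rate)
    mode = (shape - 1.0) / rate
    peak = dist.pdf(mode)

    def interval_at_level(k):
        # Step 1: endpoints where the posterior density equals k.
        lo = optimize.brentq(lambda x: dist.pdf(x) - k, 1e-12, mode)
        hi = optimize.brentq(lambda x: dist.pdf(x) - k, mode, mode + 100.0 / rate)
        return lo, hi

    k_lo, k_hi = peak * 1e-12, peak * (1.0 - 1e-12)
    for _ in range(200):
        k = 0.5 * (k_lo + k_hi)
        lo, hi = interval_at_level(k)
        p = dist.cdf(hi) - dist.cdf(lo)     # Step 2: posterior coverage
        if abs(p - (1.0 - alpha)) < tol:    # Step 3: stop at target coverage
            break
        if p > 1.0 - alpha:
            k_lo = k                        # interval too wide: raise k
        else:
            k_hi = k                        # interval too narrow: lower k
    return lo, hi
```

Because the gamma posterior is right-skewed, the resulting interval sits to the left of the equal-tailed interval and is strictly shorter.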
4.2. E-Bayesian Estimation
The expected-Bayesian (E-Bayesian) estimation was first proposed by Han [39] and has since attracted wide attention from many authors; see, for example, the recent works of Basheer et al. [40] and Okasha and Wang [41], among others.
Following the idea of Han [39], the hyperparameters $a_j$ and $b_j$ should be selected to guarantee that the prior density $\pi(\lambda_j)$ is decreasing in $\lambda_j$, which implies that $0<a_j<1$ and $b_j>0$. Under these requirements, the following three independent priors for the hyperparameters $a_j$ and $b_j$ are chosen on the region $\{0<a_j<1,\ 0<b_j<s\}$ with a given upper bound $s>0$:
$$\pi_1(a_j,b_j)=\frac{2(s-b_j)}{s^2},\qquad\pi_2(a_j,b_j)=\frac{1}{s},\qquad\pi_3(a_j,b_j)=\frac{2b_j}{s^2}.$$
Note that since there are no closed forms of the E-Bayesian estimators for the reliability indices $S(t)$ and $h(t)$, for concision the results are reported only for the parameters $\lambda_0$, $\lambda_1$, and $\lambda_2$.
Theorem 5. Under squared error loss and the priors $\pi_1$, $\pi_2$, and $\pi_3$, the E-Bayesian estimators of $\lambda_j$ can be written, respectively, as
$$\hat{\lambda}_{jE1}=\Big(m_j+\tfrac12\Big)\frac{2}{s^2}\Big[(W+s)\ln\frac{W+s}{W}-s\Big],\qquad
\hat{\lambda}_{jE2}=\Big(m_j+\tfrac12\Big)\frac{1}{s}\ln\frac{W+s}{W},\qquad
\hat{\lambda}_{jE3}=\Big(m_j+\tfrac12\Big)\frac{2}{s^2}\Big[s-W\ln\frac{W+s}{W}\Big],$$
where $m_j$ denotes the corresponding failure count ($m_0=n_3$, $m_1=n_1$, $m_2=n_2$).
Further, the posterior densities of the parameter $\lambda_j$ with respect to $\pi_1$, $\pi_2$, and $\pi_3$ can be obtained, respectively, by integrating the conditional gamma posterior of $\lambda_j$ against the corresponding hyperprior.
It is seen that there are no closed forms for the posterior densities of $\lambda_0$, $\lambda_1$, and $\lambda_2$ under $\pi_1$, $\pi_2$, and $\pi_3$. In order to construct E-Bayesian HPD credible intervals, a constrained optimization problem is therefore formulated as follows.
For arbitrary $\alpha\in(0,1)$, let $[\lambda_{jL},\lambda_{jU}]$ be a $100(1-\alpha)\%$ Bayesian interval for $\lambda_j$ under the prior $\pi_m$ ($m=1,2,3$) satisfying
$$P(\lambda_{jL}\le\lambda_j\le\lambda_{jU}\mid\text{data})=1-\alpha.$$
Thus, the shortest-length $100(1-\alpha)\%$ credible interval $[\lambda_{jL},\lambda_{jU}]$ for $\lambda_j$ can be obtained by solving the following optimization problem:
$$\min\;(\lambda_{jU}-\lambda_{jL})\quad\text{subject to}\quad P(\lambda_{jL}\le\lambda_j\le\lambda_{jU}\mid\text{data})=1-\alpha.$$
This can be handled by minimizing the Lagrangian function
$$G(\lambda_{jL},\lambda_{jU},\eta)=(\lambda_{jU}-\lambda_{jL})+\eta\big[P(\lambda_{jL}\le\lambda_j\le\lambda_{jU}\mid\text{data})-(1-\alpha)\big],$$
where $\eta$ is the Lagrange multiplier. Therefore, by the Lagrange multiplier method, the $100(1-\alpha)\%$ Bayesian HPD credible interval $[\lambda_{jL},\lambda_{jU}]$ for $\lambda_j$ with respect to $\pi_m$ can be obtained numerically, where $\lambda_{jL}$ and $\lambda_{jU}$ are the solutions of the following nonlinear equations:
$$P(\lambda_{jL}\le\lambda_j\le\lambda_{jU}\mid\text{data})=1-\alpha,\qquad\pi_m(\lambda_{jL}\mid\text{data})=\pi_m(\lambda_{jU}\mid\text{data}).$$
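For a gamma marginal posterior, the two nonlinear equations above (exact coverage and equal density at the endpoints) can be solved directly, as in this sketch; the equal-tailed starting point and the function name are convenience choices of ours.

```python
from scipy import stats, optimize

def shortest_interval(shape, rate, alpha=0.05):
    """Shortest (HPD) credible interval for a Gamma(shape, rate)
    marginal posterior, obtained by solving the two nonlinear
    equations above: exact 1 - alpha coverage and equal posterior
    density at the two endpoints (assumes shape > 1)."""
    dist = stats.gamma(a=shape, scale=1.0 / rate)

    def equations(x):
        lo, hi = x
        return [dist.cdf(hi) - dist.cdf(lo) - (1.0 - alpha),  # coverage
                dist.pdf(lo) - dist.pdf(hi)]                  # equal density

    # The equal-tailed interval is a convenient starting point.
    x0 = [dist.ppf(alpha / 2.0), dist.ppf(1.0 - alpha / 2.0)]
    lo, hi = optimize.fsolve(equations, x0)
    return lo, hi
```

Solving the system directly is equivalent to the level-search of Algorithm 1 but typically converges in a handful of iterations from the equal-tailed start.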
4.3. Some Results of Bayesian and E-Bayesian Estimation
In the following, the posterior risks (PRs) of the Bayesian estimators are presented under squared error loss; these are generally used to measure the estimation risk of Bayesian estimators.
Theorem 6. The PRs of the Bayesian estimators under squared error loss are the corresponding posterior variances, obtained as
$$\mathrm{PR}\big(\hat{\lambda}_0^{B}\big)=\frac{n_3+a_0}{(W+b_0)^2},\qquad\mathrm{PR}\big(\hat{\lambda}_1^{B}\big)=\frac{n_1+a_1}{(W+b_1)^2},\qquad\mathrm{PR}\big(\hat{\lambda}_2^{B}\big)=\frac{n_2+a_2}{(W+b_2)^2}.$$
Similarly, another criterion quantity, called the E-posterior risk (EPR), was proposed by Han [
42], which is an effective measure for E-Bayesian estimators.
Theorem 7. Under squared error loss, the EPRs of the E-Bayesian estimators of $\lambda_j$ with respect to the priors $\pi_1$, $\pi_2$, and $\pi_3$ can be expressed, respectively, as
$$\mathrm{EPR}_{j1}=\Big(m_j+\tfrac12\Big)\frac{2}{s^2}\Big[\frac{s}{W}-\ln\frac{W+s}{W}\Big],\qquad
\mathrm{EPR}_{j2}=\frac{m_j+\tfrac12}{W(W+s)},\qquad
\mathrm{EPR}_{j3}=\Big(m_j+\tfrac12\Big)\frac{2}{s^2}\Big[\ln\frac{W+s}{W}-\frac{s}{W+s}\Big].$$
Some relations among the various E-Bayesian estimators are also presented as follows.
Theorem 8. Let $s>0$ be fixed. For the E-Bayesian estimators with respect to squared error loss, it is seen that
$$\hat{\lambda}_{jE3}<\hat{\lambda}_{jE2}<\hat{\lambda}_{jE1}\qquad\text{and}\qquad\lim_{W\to\infty}\hat{\lambda}_{jE1}=\lim_{W\to\infty}\hat{\lambda}_{jE2}=\lim_{W\to\infty}\hat{\lambda}_{jE3}.$$
We note that, with respect to the priors $\pi_1$, $\pi_2$, and $\pi_3$, although the E-Bayesian estimators in Theorem 5 satisfy different order relations, they are asymptotically equivalent to one another under the given conditions.
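The order relations and asymptotic equivalence can be checked numerically under the Han-type hyperpriors assumed above; the closed forms coded here follow from integrating the conditional Bayes estimate over those hyperpriors and are a reconstruction for illustration, not quoted from the paper.

```python
import math

def e_bayes_estimates(n_fail, W, s):
    """E-Bayesian estimates of one rate parameter under the three
    Han-type hyperpriors assumed above (uniform a on (0,1); weights
    2(s-b)/s^2, 1/s, 2b/s^2 for b on (0,s)); a reconstruction for
    illustration, with W the total exposure and n_fail the failure count."""
    r = math.log((W + s) / W)
    c = n_fail + 0.5
    e1 = c * (2.0 / s**2) * ((W + s) * r - s)   # prior pi_1
    e2 = c * r / s                              # prior pi_2
    e3 = c * (2.0 / s**2) * (s - W * r)         # prior pi_3
    return e1, e2, e3
```

For moderate samples the three estimates are already close, and their gaps shrink at rate $1/W$, illustrating the asymptotic equivalence in Theorem 8.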
4.4. Objective Bayesian Estimation
Sometimes prior information is difficult to collect, especially when historical data are scarce or a practitioner is unfamiliar with the problem at hand. To provide a fair inference under the Bayesian approach, objective-Bayesian (O-B) methods eliminate the effect of subjective personal choices in the priors. Objective-Bayesian estimation is therefore proposed in this subsection.
Following Guan et al. [43], a probability matching prior for $(\lambda_0,\lambda_1,\lambda_2)$ is adopted. Combining the likelihood with this matching prior yields the posterior density of $(\lambda_0,\lambda_1,\lambda_2)$.
Theorem 9. With respect to the probability matching prior, the resulting posterior density of $(\lambda_0,\lambda_1,\lambda_2)$ is proper. Therefore, under squared error loss, the O-B estimator of each parameter can be expressed as its posterior expectation.
It is noted that Bayesian credible intervals for the model parameters cannot be found directly from this posterior distribution. Alternatively, an importance sampling approach, namely Algorithm 2, is presented for constructing the Bayesian HPD credible intervals as follows.
Algorithm 2: Objective Bayesian HPD credible interval estimation
- Step 1 Generate a candidate value of $(\lambda_0,\lambda_1,\lambda_2)$ from a suitable proposal distribution and compute its importance weight as the ratio of the target posterior to the proposal density.
- Step 2 Repeat Step 1 $N$ times and obtain $(\lambda_0^{(i)},\lambda_1^{(i)},\lambda_2^{(i)})$ with normalized weights $w_i$, $i=1,\dots,N$.
- Step 3 To construct an HPD credible interval of $\lambda_j$, let $g_i$ denote the sampled values of the quantity of interest with associated normalized weights $w_i$. Arrange the $g_i$ in ascending order as $g_{(1)}\le\cdots\le g_{(N)}$ with associated weights $w_{(1)},\dots,w_{(N)}$, where the $w_{(i)}$ are not ordered but are carried along with the $g_{(i)}$. A simulation-consistent estimate of the posterior $p$-th quantile of $g$ is $\hat{g}_p=g_{(N_p)}$, where $N_p$ is the smallest integer satisfying $\sum_{i=1}^{N_p}w_{(i)}\ge p$. Using the above approach, a $100(1-\alpha)\%$ credible interval can be constructed as $(\hat{g}_{\beta},\hat{g}_{\beta+1-\alpha})$ for any $\beta\in(0,\alpha)$; therefore, a $100(1-\alpha)\%$ HPD credible interval of $\lambda_j$ can be constructed as $(\hat{g}_{\beta^{*}},\hat{g}_{\beta^{*}+1-\alpha})$, where $\beta^{*}$ satisfies $\hat{g}_{\beta^{*}+1-\alpha}-\hat{g}_{\beta^{*}}\le\hat{g}_{\beta+1-\alpha}-\hat{g}_{\beta}$ for all $\beta\in(0,\alpha)$.
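Step 3 of Algorithm 2 can be implemented directly from weighted posterior draws, whatever proposal is used in Steps 1 and 2; the following sketch scans the sorted sample for the shortest interval whose accumulated weight reaches $1-\alpha$ (the function name is ours).

```python
import numpy as np

def weighted_hpd(samples, weights, alpha=0.05):
    """Step 3 of Algorithm 2: HPD interval from weighted posterior
    draws, scanning the sorted sample for the shortest interval whose
    accumulated (normalized) weight reaches 1 - alpha."""
    order = np.argsort(samples)
    x = np.asarray(samples, float)[order]
    w = np.asarray(weights, float)[order]
    w = w / w.sum()                 # normalize the importance weights
    cum = np.cumsum(w)

    best_lo, best_hi, best_len = x[0], x[-1], np.inf
    for i in range(len(x)):
        # First index j whose cumulative weight closes a (1-alpha) interval.
        j = np.searchsorted(cum, cum[i] - w[i] + (1.0 - alpha))
        if j >= len(x):
            break
        if x[j] - x[i] < best_len:
            best_lo, best_hi, best_len = x[i], x[j], x[j] - x[i]
    return best_lo, best_hi
```

With equal weights this reduces to the usual shortest-interval rule for plain Monte Carlo draws.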
6. Concluding Remarks
In this paper, estimation for a dependent competing risks model is discussed under the LTRC scheme. When the lifetimes of the causes of failure follow the MOBE distribution, point and interval estimates of the unknown parameters as well as the reliability indices are obtained under classical and Bayesian procedures, respectively. Besides the classical MLEs and ACIs, three types of Bayesian point and interval estimates are proposed, namely common Bayesian, E-Bayesian, and objective-Bayesian estimates. Extensive simulation studies and two real-life examples are carried out to investigate the performance of our methods, and the numerical results show that both the classical and Bayesian results perform satisfactorily, with the proposed Bayesian methods generally superior to the classical likelihood-based results. Although inference for LTRC competing risks data is discussed under the MOBE distribution, the scope of this paper can be extended to many other engineering and reliability applications when other Marshall-Olkin-type bivariate models are available. In addition, there are also limitations in our study. For example, the proposed methods can effectively handle relatively simple problems in which two random variables are correlated through a shock phenomenon, but they cannot fully describe the correlation among complex, multidimensional variables. In such cases, one potential approach is the copula-function-based method, which seems to be of interest and will be discussed in the future.