Article

Weighted Empirical Likelihood for Accelerated Life Model with Various Types of Censored Data

1 Statistics Program, Department of Mathematics, University of Maryland, College Park, MD 20742, USA
2 The Janssen Pharmaceutical Company of Johnson & Johnson, New Brunswick, NJ 08933, USA
* Author to whom correspondence should be addressed.
Stats 2024, 7(3), 944-954; https://doi.org/10.3390/stats7030057
Submission received: 8 July 2024 / Revised: 26 August 2024 / Accepted: 28 August 2024 / Published: 3 September 2024
(This article belongs to the Section Survival Analysis)

Abstract

In the analysis of survival data, the accelerated life model (ALM) is a widely used semiparametric model, and we often encounter various types of censored survival data, such as right censored data, doubly censored data, interval censored data, partly interval-censored data, etc. For complicated types of censored data, statistical inference on the ALM is technically and mathematically challenging, so little work has been done to date. In this article, we extend the concept of weighted empirical likelihood (WEL) from the univariate case to the multivariate case and apply it to the ALM. This leads to an estimation approach, called the weighted maximum likelihood estimator, as well as a WEL-based confidence interval for the regression parameter. The proposed procedures apply to various types of censored data under a unified framework, and some simulation results are presented.

1. Introduction

When we are interested in testing whether a new treatment is more effective in a clinical trial, one simple assumption is that the survival time in the treatment group is proportional to the survival time in the control group. This assumption underlies one of the most widely used statistical models in survival data analysis, the accelerated life model (ALM) [1]. The general form of the ALM is given by
$$T = \frac{T_0}{\psi(Z)}, \tag{1}$$
where $T$ is the survival time of a patient, $T_0$ is the survival time of a patient in the control group, and $Z$ is a $p$-dimensional covariate vector. For the parametric function $\psi(Z) = e^{\beta Z}$ with $\beta$ as parameter, taking logarithms on both sides of the above equation and letting $\beta_0$ denote the true parameter, we write ALM (1) in the following regression form with $\alpha_0 = E(\log T_0)$:
$$\log T = \alpha_0 - \beta_0 Z + \mathcal{E}, \tag{2}$$
where $\mathcal{E} = \log T_0 - \alpha_0$ is a random variable (r.v.) with distribution function (d.f.) $F_0$ that does not depend on the covariate vector $Z$.
Let the random vectors observed on $(T, Z)$ be given by
$$(T_1, Z_1), (T_2, Z_2), \ldots, (T_n, Z_n). \tag{3}$$
In practice, the data actually observed are often not (3), because the lifetime variable $T$ is usually subject to censoring, such as right censoring, double censoring, interval censoring, partly interval-censoring, etc., while the research interest is the impact of the covariate $Z$ on $T$. For simplicity of presentation, in this article we consider the case where the covariate $Z$ is a scalar rather than a vector, i.e., dimension $p = 1$ in (2) and (3); the generalization of our results to the multivariate case $p > 1$ is straightforward.
When $T_i$ in (3) is subject to right censoring, double censoring, interval censoring, partly interval-censoring, etc., the actually observed survival data on (3) are
$$(V_i, \delta_i, Z_i), \quad i = 1, 2, \ldots, n, \tag{4}$$
where $(V_i, \delta_i)$ represents the observed censored data on $T_i$; see the specific descriptions below.
Right Censored Sample: Under right censoring, $(V_i, \delta_i)$ in (4) is given by
$$V_i = \begin{cases} T_i, & \text{if } T_i \le C_i, \ \delta_i = 1 \\ C_i, & \text{if } T_i > C_i, \ \delta_i = 0 \end{cases} \tag{5}$$
where $C_i$ is the right censoring variable and is independent of $(T_i, Z_i)$. With right censored data $(V_i, \delta_i)$, $i = 1, \ldots, n$, observed on the $T_i$'s, the nonparametric maximum likelihood estimator (NPMLE) $\hat{F}_{KM}(t)$ for the d.f. $F_T(t)$ of the lifetime variable $T$ was given by [2], and the strong consistency and asymptotic Gaussian property of $\hat{F}_{KM}(t)$ were established by [3,4], respectively. One example of right censored data is the leukemia data; see [5,6].
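The observation scheme in (5) can be written as a few lines of Python. This is only an illustrative sketch; the function name `right_censor` is hypothetical and does not come from the article.

```python
def right_censor(t, c):
    """Observed pair (V, delta) under right censoring, following (5)."""
    if t <= c:
        return t, 1   # lifetime observed exactly
    return c, 0       # only known that T > C


# Three hypothetical (lifetime, censoring time) pairs.
pairs = [right_censor(t, c) for t, c in [(2.0, 5.0), (7.0, 3.0), (4.0, 4.0)]]
print(pairs)  # [(2.0, 1), (3.0, 0), (4.0, 1)]
```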
Doubly Censored Sample: Under double censoring, $(V_i, \delta_i)$ in (4) is given by
$$V_i = \begin{cases} T_i, & \text{if } D_i < T_i \le C_i, \ \delta_i = 1 \\ C_i, & \text{if } T_i > C_i, \ \delta_i = 2 \\ D_i, & \text{if } T_i \le D_i, \ \delta_i = 3 \end{cases} \tag{6}$$
where $C_i$ and $D_i$ are the right and left censoring variables, respectively, and they are independent of $(T_i, Z_i)$ with $P\{D_i < C_i\} = 1$. Doubly censored data are more complicated than right censored data and have been encountered in some important medical and scientific studies, such as child development data [7,8], breast cancer data [9,10], etc. The breast cancer data considered in [9] are an example of doubly censored survival data (4) under (6). With doubly censored data $(V_i, \delta_i)$, $i = 1, \ldots, n$, observed on the $T_i$'s, the NPMLE $\hat{F}_{MR}$ for $F_T$ was given by [11], and the strong consistency and asymptotic Gaussian property of $\hat{F}_{MR}$ were established by [12,13], respectively.
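As a small illustration of (6), the sketch below maps a lifetime and its two censoring times to the observed pair. The function name `double_censor` is hypothetical, and the explicit check of $D_i < C_i$ is an added safeguard rather than part of the article.

```python
def double_censor(t, c, d):
    """Observed pair (V, delta) under double censoring, following (6)."""
    if d >= c:
        raise ValueError("double censoring assumes D < C")
    if t > c:
        return c, 2   # right censored at C
    if t <= d:
        return d, 3   # left censored at D
    return t, 1       # D < T <= C: lifetime observed exactly


print(double_censor(4.0, c=6.0, d=1.0))  # (4.0, 1)
print(double_censor(8.0, c=6.0, d=1.0))  # (6.0, 2)
print(double_censor(0.5, c=6.0, d=1.0))  # (1.0, 3)
```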
Interval Censored Sample: Under Case 1 interval censoring, $(V_i, \delta_i)$ in (4) is:
$$V_i = C_i, \quad \delta_i = I\{T_i \le C_i\}, \tag{7}$$
where $C_i$ is independent of $(T_i, Z_i)$. Data of this type are also called current status data. For Case 2 interval censoring, we refer to the definition of the $(V_i, \delta_i)$'s in [14]. Case 1 and Case 2 interval censored data are both more complicated than right censored data and have been encountered in some important medical studies, such as AIDS research [15,16], etc. With interval censored Case 1 or Case 2 data $(V_i, \delta_i)$, $i = 1, \ldots, n$, observed on the $T_i$'s, the NPMLE $\hat{F}_{GW}$ for $F_T$ was given by [14], and the strong consistency of $\hat{F}_{GW}$ was also established by [14].
Partly Interval-Censored Sample: Under Case 1 partly interval-censoring, $(V_i, \delta_i)$ in (4) is given by:
$$(V_i, \delta_i) = \begin{cases} (T_i, 1), & \text{if } 1 \le i \le k \\ (C_i, \delta_i), & \text{if } k + 1 \le i \le n \end{cases} \tag{8}$$
where $(C_i, \delta_i)$ is the Case 1 interval censoring data given in (7). For the definition of general partly interval-censored data observed on $T_i$ in (4), we refer to [17]. Partly interval-censored data are also a rather complicated type of censored data and have been encountered in studies of heart disease [18], diabetes [19], etc. With partly interval-censored data $(V_i, \delta_i)$, $i = 1, \ldots, n$, observed on the $T_i$'s, the NPMLE $\hat{F}_H$ for $F_T$ was given by [17], and the strong consistency and asymptotic Gaussian property of $\hat{F}_H$ were also established by [17].
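The Case 1 schemes in (7) and (8) can be sketched as follows. The function names and the sample values are hypothetical; note that in (8) the position of an observation (the first $k$ versus the rest), not the value of $\delta_i$, tells whether the lifetime was observed exactly.

```python
def current_status(t, c):
    """Case 1 interval censoring (7): V = C and delta = I{T <= C}."""
    return c, int(t <= c)


def partly_interval_case1(times, monitors, k):
    """Case 1 partly interval-censoring (8): the first k lifetimes are
    observed exactly, the remaining ones only through current status."""
    observed = []
    for i, (t, c) in enumerate(zip(times, monitors)):
        observed.append((t, 1) if i < k else current_status(t, c))
    return observed


sample = partly_interval_case1([1.0, 2.0, 3.0, 4.0], [9.0, 9.0, 2.5, 5.0], k=2)
print(sample)  # [(1.0, 1), (2.0, 1), (2.5, 0), (5.0, 1)]
```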
To the best of our knowledge, most methods developed so far for the ALM are based on right censored data (4)–(5), and little work has been done in the statistical literature on the ALM (2) for complicated types of censored data, such as doubly censored data, interval censored data, partly interval-censored data, etc.
Some of the works on the ALM with right censored data are based on the idea of least squares method because the log form of ALM (2) looks like a linear regression model; see [20,21,22,23], among others. There are also some empirical likelihood-based methods developed for the ALM with right censored data; see [24,25,26], among others. Ref. [27] considered rank-based methods for the same setting.
It is therefore of great interest to develop a unified likelihood-based approach that provides a consistent estimator of the regression parameter $\beta_0$ in ALM (2) for the various types of censored data mentioned above, so that the ALM can be applied more broadly to survival data in practice. In this context, we note that in [28,29] the weighted empirical likelihood (WEL) is formulated in a unified form through the univariate NPMLE of the underlying d.f. for different types of censored data, and it may be viewed as the asymptotic version of the empirical likelihood function [30] for censored data. As shown in [29], obtaining the usual empirical likelihood function for complicated types of censored data is often difficult under the given model setting, and it usually results in a very complicated likelihood function that is hard to handle and often mathematically intractable. Owing to its simple formulation, the WEL can much more easily incorporate model assumptions into the likelihood function when considering an inference problem under a particular setting with censored data. In this article, we extend the univariate WEL method of [28,29] to the multivariate case and apply it to the ALM. This approach applies in a unified fashion to the various types of censored data mentioned above, and provides a multivariate weighted empirical likelihood based maximum likelihood estimator (WMLE) for $(\beta_0, F_0)$ in ALM (2). Moreover, we provide the weighted empirical likelihood ratio (WELR) based confidence interval for $\beta_0$ in ALM (2).
The rest of this article is organized as follows: Section 2 derives the relevant estimating equations, obtains the WMLE for $(\beta_0, F_0)$ in ALM (2), and describes the WELR based confidence interval for $\beta_0$; Section 3 presents some related simulation results with some concluding remarks. All proofs are given in Section 4.

2. Weighted Maximum Likelihood Estimator

Recall the univariate version of the weighted empirical likelihood given in [28,29] as follows. If the lifetime variables $T_i$ in (3) are subject to right censoring, double censoring, interval censoring, partly interval-censoring, etc., as mentioned in Section 1, then in each case the NPMLE $\hat{F}_{KM}$, $\hat{F}_{MR}$, $\hat{F}_{GW}$, $\hat{F}_H$, etc., for $F_T$ can be written as
$$\hat{F}_n^T(t) = \sum_{i=1}^{m} \hat{p}_i \, I\{U_i^T \le t\}, \quad t \in \mathbb{R}, \tag{9}$$
where $\hat{p}_i > 0$ and $\{U_1^T, \ldots, U_m^T\}$ is a subset of all distinct observed points in $\{V_1, \ldots, V_n\}$. The univariate version of the weighted empirical likelihood (WEL) function is given by
$$L_W(F) = \prod_{i=1}^{m} \left[ F(U_i^T) - F(U_i^T-) \right]^{n \hat{p}_i}, \tag{10}$$
where $F(t)$, $t \in \mathbb{R}$, is an arbitrary d.f. However, such a likelihood function does not apply to the $(T_i, Z_i)$'s in (3), which are bivariate data for $p = 1$.
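To make (10) concrete, the sketch below evaluates $\log L_W(F)$ for a candidate d.f. $F$ described by its point masses at the $U_i^T$'s. The function name `log_wel` and the numerical masses are illustrative assumptions, not values from the article.

```python
import math


def log_wel(p_hat, n, candidate):
    """Log of the univariate WEL (10).

    p_hat:     dict {u: mass of the NPMLE at u}, all masses > 0
    n:         sample size
    candidate: dict {u: F(u) - F(u-)} for an arbitrary d.f. F
    """
    return n * sum(p * math.log(candidate[u]) for u, p in p_hat.items())


# Hypothetical NPMLE masses at three observed points (illustrative values only).
p_hat = {1.0: 0.25, 3.0: 0.375, 4.0: 0.375}
uniform = {u: 1.0 / 3.0 for u in p_hat}

at_npmle = log_wel(p_hat, n=4, candidate=p_hat)
at_uniform = log_wel(p_hat, n=4, candidate=uniform)
print(at_npmle > at_uniform)  # True
```

In this example the candidate that places the NPMLE masses $\hat{p}_i$ on the $U_i^T$'s attains a larger weighted likelihood than the uniform candidate, in line with the WEL being maximized at the NPMLE when no further constraints are imposed.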
A natural generalization of (10) to the bivariate case is as follows. Assume that, without the ALM assumption (2), a consistent distribution estimator for the joint d.f. $G_0(t, z)$ of the random vector $(\log T, Z)$ based on censored data (4) with $V_i$ replaced by $\log V_i$ is given by
$$\hat{G}_n(t, z) = \sum_{i=1}^{m} \sum_{j=1}^{q} \hat{\omega}_{ij} \, I\{U_i \le t, W_j \le z\}, \tag{11}$$
where $U_1 < \cdots < U_m$ and $W_1 < \cdots < W_q$ with $\hat{\omega}_{ij} \ge 0$ and $\sum_{i=1}^{m+1} \sum_{j=1}^{q} \hat{\omega}_{ij} = 1$. Then, the bivariate version of the WEL is given by
$$L_W(G) = \prod_{i=1}^{m} \prod_{j=1}^{q} \left[ dG(U_i, W_j) \right]^{n \hat{\omega}_{ij}}, \tag{12}$$
where $G(t, z)$ is an arbitrary d.f. and $dG(U_i, W_j)$ is the probability mass of $G$ at $(U_i, W_j)$.
Without assuming that $G_0(t, z)$ satisfies ALM assumption (2), for right censored data (4)–(5) with $V_i$ replaced by $\log V_i$, [31] derived the empirical likelihood based NPMLE $\hat{G}_{RR}(t, z)$ for the above bivariate d.f. $G_0(t, z)$, which has the same form as (11), where $U_1 < \cdots < U_m$ are the distinct values of $\log V_1, \ldots, \log V_n$, and $W_1 < \cdots < W_q$ are the distinct values among $Z_1, \ldots, Z_n$. Under regularity conditions, [31] showed that for a discrete covariate $Z$, $\hat{G}_{RR}(t, z)$ is uniformly strongly consistent and $\sqrt{n}\,(\hat{G}_{RR} - G_0)$ converges weakly to a centered Gaussian process as $n \to \infty$. Interestingly, Corollary 1 of [31] showed that $\hat{G}_{RR}(t, z)$ can be expressed as the sum of products of the marginal NPMLE for $F_Z(z)$ and the conditional Kaplan-Meier estimator in the univariate case given $Z = z$. For a continuous covariate $Z$, [32] applied a naive adjustment or a neighborhood adjustment to $\hat{G}_{RR}(t, z)$, which is expected to give a consistent estimator for $G_0(t, z)$. These results suggest that for data (4) with $V_i$ replaced by $\log V_i$, i.e., with the $T_i$'s in (3) subject to double censoring, interval censoring, or partly interval-censoring, a bivariate distribution estimator for the above $G_0(t, z)$ can be constructed similarly via the univariate NPMLEs $\hat{F}_{MR}$ [11], $\hat{F}_{GW}$ [14], or $\hat{F}_H$ [17], respectively, and all of the resulting bivariate distribution estimators can be written in the same form as (11).
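The product representation described above can be sketched for right censored data with a discrete covariate. The code below is an illustration under stated assumptions, not the estimator of [31] itself: it takes each weight to be the empirical mass of $Z$ at $W_j$ times the Kaplan-Meier jump at $U_i$ within the stratum $Z = W_j$, and it simply drops any mass left over when the largest observation in a stratum is censored. The names `km_masses` and `bivariate_weights` are hypothetical.

```python
import math
from collections import defaultdict


def km_masses(pairs):
    """Jumps of the Kaplan-Meier d.f. for pairs (u, delta), delta = 1 if uncensored."""
    surv, masses = 1.0, []
    for u in sorted({u for u, _ in pairs}):
        at_risk = sum(1 for x, _ in pairs if x >= u)
        events = sum(1 for x, d in pairs if x == u and d == 1)
        if events:
            jump = surv * events / at_risk
            masses.append((u, jump))
            surv -= jump
    return masses


def bivariate_weights(data):
    """data: triples (v, delta, z) with v > 0. Returns {(u, w): omega} with u = log v."""
    n = len(data)
    strata = defaultdict(list)
    for v, delta, z in data:
        strata[z].append((math.log(v), delta))
    weights = {}
    for w, pairs in strata.items():
        share = len(pairs) / n              # empirical mass of Z at w
        for u, jump in km_masses(pairs):    # conditional Kaplan-Meier given Z = w
            weights[(u, w)] = share * jump
    return weights


# Hypothetical right censored sample with two covariate values.
data = [(1.0, 1, 0.0), (2.0, 0, 0.0), (3.0, 1, 0.0), (4.0, 1, 0.0),
        (1.5, 1, 1.0), (2.5, 1, 1.0)]
omega = bivariate_weights(data)
print(len(omega), round(sum(omega.values()), 10))  # 5 1.0
```

The censored point contributes no mass of its own; its share is redistributed to later uncensored points in the same stratum, which is the usual Kaplan-Meier behavior.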
Note that without ALM assumption (2), [9] constructed a bivariate distribution estimator $\hat{G}_{RG}(t, z)$ for the above $G_0(t, z)$ with doubly censored data (4) under (6) with $V_i$ replaced by $\log V_i$. This $\hat{G}_{RG}$ can also be written in the form of $\hat{G}_n(t, z)$ in (11), although $\hat{G}_{RG}(t, z)$ contains negative, but asymptotically zero, probability masses.

2.1. Bivariate WEL Function under ALM

Next, based on the above bivariate WEL function $L_W(G)$ in (12), we derive the WEL function under ALM assumption (2). Since $T_0$ is independent of $Z$ under ALM (2), the conditional d.f. of $\log T$ given $Z = z$ is given by
$$G_0(t; z) = P(\log T \le t \mid Z = z) = P\{\mathcal{E} \le t - \alpha_0 + \beta_0 z\} = F_0(t - \alpha_0 + \beta_0 z),$$
where $F_0(t)$ is the d.f. of $\mathcal{E}$ with p.d.f. $f_0(t)$. This gives the conditional p.d.f. of $\log T$ given $Z = z$ and the joint p.d.f. of the random vector $(\log T, Z)$, respectively, as follows:
$$g_0(t; z) = f_0(t - \alpha_0 + \beta_0 z) \quad \text{and} \quad g_0(t, z) = f_0(t - \alpha_0 + \beta_0 z) \, f_Z(z). \tag{13}$$
Thus, under ALM (2), the above bivariate version of the WEL function (12) with $G(t, z) = G_0(t, z)$ is written in the following way:
$$\prod_{i=1}^{m} \prod_{j=1}^{q} \left[ g_0(U_i, W_j) \right]^{n \hat{\omega}_{ij}} = \prod_{i=1}^{m} \prod_{j=1}^{q} \left[ f_0(U_i - \alpha_0 + \beta_0 W_j) \, f_Z(W_j) \right]^{n \hat{\omega}_{ij}},$$
which is proportional to $\prod_{i=1}^{m} \prod_{j=1}^{q} \left[ f_0(U_i - \alpha_0 + \beta_0 W_j) \right]^{n \hat{\omega}_{ij}}$. Hence, the WEL function for $(\alpha_0, \beta_0, F_0)$ in ALM (2) with censored survival data (4) is given by
$$L_W(\alpha, \beta, F) = \prod_{i=1}^{m} \prod_{j=1}^{q} \left[ dF(U_i - \alpha + \beta W_j) \right]^{n \hat{\omega}_{ij}} = \prod_{\hat{\omega}_{ij} > 0} \left[ p_{ij}(\alpha, \beta) \right]^{n \hat{\omega}_{ij}}, \tag{14}$$
where $(\alpha, \beta) \in \mathbb{R}^2$, $F(t)$ is an arbitrary d.f. with the notation $dF(t) = F(t) - F(t-)$, and
$$p_{ij}(\alpha, \beta) = dF(U_i - \alpha + \beta W_j), \quad i = 1, \ldots, m; \ j = 1, \ldots, q. \tag{15}$$
Therefore, the WMLE $(\hat{\alpha}_n, \hat{\beta}_n, \hat{F}_n)$ for $(\alpha_0, \beta_0, F_0)$ is the solution at which $L_W(\alpha, \beta, F)$ is maximized under appropriate estimating equations.

2.2. Estimating Equations

Appropriate estimating equations are needed here because the usual constraints on the probability masses in the optimization problem for the WEL function are not sufficient in the ALM setting: the parameters of the ALM are "hidden" in the probability masses. This leads us to use the idea of [33] in conjunction with empirical likelihood, as follows.
Following [33], the appropriate estimating equations, used as constraints for the maximization of the WEL $L_W(G)$ in (12) under ALM assumption (2), should be based on some function $\mu(Y; \theta)$ for $Y = (\log T, Z)$ and $\theta = (\alpha, \beta)$ such that $E_{G_0}\{\mu(Y; \theta_0)\} = 0$, where $\theta_0 = (\alpha_0, \beta_0)$ and, as stated above Equation (11), $G_0(t, z)$ is the joint d.f. of $(\log T, Z)$. Such an estimating function $\mu(Y; \theta)$ reflects auxiliary information about ALM (2), and the estimating equation for the maximization of $L_W(G)$ is then $E_G\{\mu(Y; \theta)\} = 0$ for an arbitrary d.f. $G(t, z)$ with $\theta = (\alpha, \beta)$ satisfying ALM assumption (2). To find such a function $\mu(Y; \theta)$, we state the following lemma without proof.
Lemma 1.
Let $X$ and $Y$ be two independent random variables with $E\{X\} = 0$; then $E\{XY \mid Y\} = Y E\{X\} = 0$.
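Although the lemma is stated without proof, it follows from two standard properties of conditional expectation. A short derivation, assuming $E|XY| < \infty$ (an integrability condition added here for completeness), is:

```latex
\begin{aligned}
E\{XY \mid Y\} &= Y \, E\{X \mid Y\} && \text{($Y$ is known given $Y$)} \\
               &= Y \, E\{X\}        && \text{($X$ is independent of $Y$)} \\
               &= 0                  && \text{(since $E\{X\} = 0$).}
\end{aligned}
```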
From ALM (2) and Lemma 1, we have
$$E_{G_0}\{(\log T - \alpha_0 + \beta_0 Z)(1, Z) \mid Z\} = E_{G_0}\{\mathcal{E}\,(1, Z) \mid Z\} = E_{G_0}\{\mathcal{E}\}\,(1, Z) = 0,$$
which implies $E_{G_0}\{(\log T - \alpha_0 + \beta_0 Z)(1, Z)\} = 0$. Thus, the function we are looking for is:
$$\mu(Y; \theta) = (\log T - \alpha + \beta Z)(1, Z). \tag{16}$$
The derivation from the WEL function $L_W(G)$ in (12) to the WEL function $L_W(\alpha, \beta, F)$ in (14) under ALM assumption (2) implies that Equation (13) holds for any d.f. $G(t, z)$ with $\theta$ satisfying ALM assumption (2). This means that for such $G(t, z)$, the conditional p.d.f. of $\log T$ given $Z = z$ is given by $dG(t; z) = dF(t - \alpha + \beta z)$, where $F(t)$ is the arbitrary d.f. in (14), and Equation (15) gives $dG(U_i; W_j) = p_{ij}(\alpha, \beta)$. From (13), we know that in the context of the WEL, such $G(t, z)$ has the discrete joint p.d.f. $dG(t, z) = f_Z(z) \, dG(t; z)$ of $(\log T, Z)$ with observed data $(V_i, \delta_i, Z_i)$ in (4), where $U_1, \ldots, U_m$ are the distinct values of $\log V_1, \ldots, \log V_n$, and $W_1, \ldots, W_q$ are the distinct values of $Z_1, \ldots, Z_n$. Thus, we have $f_Z(Z_i) = n^{-1}$, $i = 1, \ldots, n$, which gives
$$\begin{aligned} E_G\{(\log T - \alpha + \beta Z)(1, Z)\} &= E_G\big[ E\{(\log T - \alpha + \beta Z)(1, Z) \mid Z\} \big] \\ &= \sum_{j=1}^{n} f_Z(Z_j) \, E\{(\log T - \alpha + \beta Z)(1, Z) \mid Z = Z_j\} \\ &= n^{-1} \sum_{j=1}^{n} \sum_{i=1}^{n} (\log V_i - \alpha + \beta Z_j)(1, Z_j) \, dG(\log V_i; Z_j) \\ &= n^{-1} \sum_{j=1}^{q} \sum_{i=1}^{m} (U_i - \alpha + \beta W_j)(1, W_j) \, dG(U_i; W_j). \end{aligned}$$
Hence, the estimating equation by (16) is expressed as:
$$E_G\{\mu(Y; \theta)\} = n^{-1} \sum_{i=1}^{m} \sum_{j=1}^{q} (U_i - \alpha + \beta W_j)(1, W_j) \, p_{ij}(\alpha, \beta). \tag{17}$$
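Therefore, under the estimating equation $E_G\{\mu(Y; \theta)\} = 0$, the WMLE $(\hat{\alpha}_n, \hat{\beta}_n, \hat{F}_n)$ for $(\alpha_0, \beta_0, F_0)$ in ALM (2) is given by the solution of the following optimization problem:
$$\begin{aligned} \max \ & L_W(\alpha, \beta, F) = \prod_{\hat{\omega}_{ij} > 0} \left[ p_{ij}(\alpha, \beta) \right]^{n \hat{\omega}_{ij}} \\ \text{subject to: } & p_{ij}(\alpha, \beta) > 0, \quad \sum_{\hat{\omega}_{ij} > 0} p_{ij}(\alpha, \beta) = 1, \quad \text{for } (i, j) \in A, \\ & \sum_{\hat{\omega}_{ij} > 0} (U_i - \alpha + \beta W_j) \, p_{ij}(\alpha, \beta) = 0, \\ & \sum_{\hat{\omega}_{ij} > 0} (U_i - \alpha + \beta W_j) \, W_j \, p_{ij}(\alpha, \beta) = 0, \end{aligned} \tag{18}$$
where $A = \{(i, j) \mid \hat{\omega}_{ij} > 0, \ i = 1, \ldots, m; \ j = 1, \ldots, q\}$.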
Theorem 1.
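The WMLE $(\hat{\alpha}_n, \hat{\beta}_n, \hat{F}_n)$ for $(\alpha_0, \beta_0, F_0)$ is given by
$$\begin{aligned} \hat{\alpha}_n &= \frac{\sum_{\hat{\omega}_{ij} > 0} U_i \hat{\omega}_{ij} \, \sum_{\hat{\omega}_{ij} > 0} W_j^2 \hat{\omega}_{ij} - \sum_{\hat{\omega}_{ij} > 0} U_i W_j \hat{\omega}_{ij} \, \sum_{\hat{\omega}_{ij} > 0} W_j \hat{\omega}_{ij}}{\sum_{\hat{\omega}_{ij} > 0} W_j^2 \hat{\omega}_{ij} - \big( \sum_{\hat{\omega}_{ij} > 0} W_j \hat{\omega}_{ij} \big)^2}, \\ \hat{\beta}_n &= \frac{\sum_{\hat{\omega}_{ij} > 0} U_i \hat{\omega}_{ij} \, \sum_{\hat{\omega}_{ij} > 0} W_j \hat{\omega}_{ij} - \sum_{\hat{\omega}_{ij} > 0} U_i W_j \hat{\omega}_{ij}}{\sum_{\hat{\omega}_{ij} > 0} W_j^2 \hat{\omega}_{ij} - \big( \sum_{\hat{\omega}_{ij} > 0} W_j \hat{\omega}_{ij} \big)^2}, \\ \hat{F}_n(t) &= \sum_{i=1}^{m} \sum_{j=1}^{q} \hat{\omega}_{ij} \, I\{U_i - \hat{\alpha}_n + \hat{\beta}_n W_j \le t\}. \end{aligned} \tag{19}$$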
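The closed form in Theorem 1 is simple to evaluate once the weights $\hat{\omega}_{ij}$ are available. The following Python sketch assumes the positive weights are supplied as a dictionary keyed by $(U_i, W_j)$ and sum to one; the function names `wmle` and `f_hat` are illustrative and are not the authors' FORTRAN routines.

```python
def wmle(weights, tol=1e-8):
    """Closed-form (alpha_hat, beta_hat) of Theorem 1.

    weights: dict {(u, w): omega} with omega > 0, assumed to sum to one.
    """
    if abs(sum(weights.values()) - 1.0) > tol:
        raise ValueError("weights are assumed to sum to one")
    s_u = sum(om * u for (u, w), om in weights.items())
    s_w = sum(om * w for (u, w), om in weights.items())
    s_uw = sum(om * u * w for (u, w), om in weights.items())
    s_ww = sum(om * w * w for (u, w), om in weights.items())
    denom = s_ww - s_w ** 2
    if denom <= 0.0:
        raise ValueError("the covariate must take at least two distinct values")
    beta = (s_u * s_w - s_uw) / denom
    alpha = (s_u * s_ww - s_uw * s_w) / denom
    return alpha, beta


def f_hat(weights, alpha, beta, t):
    """Estimated d.f. of the error term: mass of residuals u - alpha + beta*w <= t."""
    return sum(om for (u, w), om in weights.items() if u - alpha + beta * w <= t)


# Noise-free check: points exactly on log T = 1 - 2 Z, equal weights.
weights = {(1.0 - 2.0 * w, w): 1.0 / 3.0 for w in (0.0, 0.5, 1.0)}
alpha_hat, beta_hat = wmle(weights)
print(round(alpha_hat, 6), round(beta_hat, 6))  # 1.0 2.0
```

Note the sign convention: because the model is $\log T = \alpha_0 - \beta_0 Z + \mathcal{E}$, $\hat{\beta}_n$ is the negative of the weighted least squares slope of $U_i$ on $W_j$.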
The proof of Theorem 1 is given in Section 4. To construct the weighted empirical likelihood ratio (WELR) based confidence interval (WELRCI) for $\beta_0$, consider the hypothesis test $H_0: \beta = \beta_0$ vs. $H_1: \beta \ne \beta_0$; then the WELR is given by
$$r(\beta_0) = \frac{\sup_{\alpha \in \mathbb{R}, \, F \in \mathcal{F}} L_W(\alpha, \beta_0, F)}{L_W(\hat{\alpha}_n, \hat{\beta}_n, \hat{F}_n)} = \frac{L_W(\tilde{\alpha}_n, \beta_0, \tilde{F}_n)}{L_W(\hat{\alpha}_n, \hat{\beta}_n, \hat{F}_n)}, \tag{20}$$
where $\tilde{\alpha}_n = \sum_{\hat{\omega}_{ij} > 0} U_i \hat{\omega}_{ij} + \beta_0 \sum_{\hat{\omega}_{ij} > 0} W_j \hat{\omega}_{ij}$ and $\tilde{F}_n(t) = \sum_{i=1}^{m} \sum_{j=1}^{q} \hat{\omega}_{ij} \, I\{U_i - \tilde{\alpha}_n + \beta_0 W_j \le t\}$. In turn, the WELRCI for $\beta_0$ is given by $S_n = \{\beta \mid r(\beta) \ge c\}$, where $0 < c < 1$ is a constant determined from the asymptotic properties of the WELR $r(\beta_0)$. From Theorem 3 of [29], it is expected that $-2 \log r(\beta_0)$ asymptotically has a scaled chi-squared distribution for large sample size $n$.
Remark 1.
Note that the results in this section hold in the unified form given above for the various types of censored data mentioned in Section 1, as long as a consistent bivariate distribution estimator $\hat{G}_n(t, z)$ of the form (11) for the d.f. $G_0(t, z)$ of the random vector $(\log T, Z)$ is available for the particular type of censored data without ALM assumption (2). For a multivariate covariate vector $Z$ with $p > 1$, the number of constraint equations for the WMLE in optimization problem (18) increases accordingly to $p + 1$, following the derivation of constraint Equation (17).

3. Simulation Studies

This section presents some simulation results for the WMLE $\hat{\beta}_n$ in Theorem 1 for $\beta_0$ in ALM (2) based on right censored data (4)–(5). In all our simulation studies, $\hat{G}_n(t, z)$ in (11) is calculated based on $\hat{G}_{RR}(t, z)$ by [31], and FORTRAN routines for computing $\hat{\beta}_n$ are available from the authors.
Let Exp($\mu$) represent the exponential distribution with mean $\mu$, and $U\{\frac{1}{k}, \ldots, \frac{k-1}{k}\}$ the uniform distribution on the set $\{\frac{1}{k}, \ldots, \frac{k-1}{k}\}$. In our simulation studies, we use Exp(1) as the d.f. of $T_0$, Exp(2) as the d.f. of the right censoring variable $C_i$, and the following three cases as the d.f. of the covariate $Z$ in ALM (2):
Case 1:
$Z_i \sim F_Z = U\{\frac{1}{6}, \frac{2}{6}, \frac{3}{6}, \frac{4}{6}, \frac{5}{6}\}$.
Case 2:
$Z_i \sim F_Z = U\{\frac{1}{11}, \frac{2}{11}, \frac{3}{11}, \frac{4}{11}, \frac{5}{11}, \frac{6}{11}, \frac{7}{11}, \frac{8}{11}, \frac{9}{11}, \frac{10}{11}\}$.
Case 3:
$Z_i \sim F_Z$, with p.d.f. $f_Z(z) = P\{Z_i = z\}$ given by:
$f_Z(\frac{1}{6}) = 0.15$, $f_Z(\frac{2}{6}) = 0.20$, $f_Z(\frac{3}{6}) = 0.30$, $f_Z(\frac{4}{6}) = 0.20$, $f_Z(\frac{5}{6}) = 0.15$.
For a given value of $\beta_0$ and generated values of $T_0$ and $Z$, $T$ is computed based on ALM (2), and the $(V_i, \delta_i)$'s are obtained via $\log T_i$ and $\log C_i$. For each of $\beta_0 = 0, \pm 1, \pm 2$, we generate 1000 samples for each sample size $n = 50, 100, 200, 500, 1000$. For each case of $F_Z$, Table 1, Table 2 and Table 3, respectively, report the simulation averages of $\hat{\beta}_n$, with the simulation standard deviation given in parentheses next to the average. The censoring percentage in each setting is also reported in Table 1, Table 2 and Table 3, respectively.
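The censoring percentages of this design can be cross-checked analytically. The sketch below assumes $T = T_0 e^{-\beta_0 Z}$ (the form implied by regression form (2), with the intercept absorbed into $T_0$), with $T_0$, $C$ and $Z$ mutually independent; the function name `censoring_fraction` is hypothetical. For two independent exponential variables, $P(T > C)$ equals the rate of $C$ divided by the sum of the two rates.

```python
import math


def censoring_fraction(beta0, z_support, z_probs, mean_t0=1.0, mean_c=2.0):
    """P(T > C) for T = T0 * exp(-beta0 * Z), T0 ~ Exp(mean_t0), C ~ Exp(mean_c).

    Assumes T0, C and Z are mutually independent and Z is discrete.
    """
    rate_c = 1.0 / mean_c
    total = 0.0
    for z, pz in zip(z_support, z_probs):
        rate_t = math.exp(beta0 * z) / mean_t0   # given Z = z, T is exponential
        total += pz * rate_c / (rate_c + rate_t)
    return total


# Case 1: Z uniform on {1/6, ..., 5/6}.
support = [k / 6.0 for k in range(1, 6)]
probs = [0.2] * 5
for beta0 in (2, 1, 0, -1, -2):
    print(beta0, round(100 * censoring_fraction(beta0, support, probs), 1))
```

Under these assumptions, the values obtained for Case 1 are close to the censoring percentages reported in Table 1, which is consistent with larger positive $\beta_0$ shortening $T$ and hence reducing censoring.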
Remark 2.
Note that in our simulation studies, we consider a covariate $Z$ with a discrete distribution with p.d.f. $f_Z(z)$ in three different cases: Case 1 and Case 2 are both uniform, but the support of Case 2 has more points than that of Case 1, while $f_Z(z)$ in Case 3 is symmetric but not uniform. This selection of $f_Z(z)$ is intended to show how different covariate distributions affect the performance of the WMLE $\hat{\beta}_n$. Table 1, Table 2 and Table 3 show that for $\beta_0 = 0$, the WMLE $\hat{\beta}_n$ performs similarly in Cases 1–3 and the estimation errors decrease as the sample size increases. This is because for $\beta_0 = 0$, the ALM (2) becomes $\log T = \alpha_0 + \mathcal{E}$, which is not affected by the covariate $Z$. For $\beta_0 \ne 0$, the simulation results in Table 1, Table 2 and Table 3 show that for Cases 1–3 with discrete covariate $Z$, the WMLE $\hat{\beta}_n$ performs well and gets closer to the true value $\beta_0$ as the sample size increases, as expected. It should be noted that Table 1, Table 2 and Table 3 include some simulations with very heavily censored data, and the WMLE $\hat{\beta}_n$ still performs well for all discrete covariates $Z$ considered.
Remark 3.
Here, we discuss some situations not covered by the simulation results in Table 1, Table 2 and Table 3. For right censored data (4)–(5) with continuous covariate $Z$, the naive or neighborhood adjustment proposed in [32] for the bivariate d.f. $\hat{G}_{RR}(t, z)$ of [31] is needed, and our WMLE $\hat{\beta}_n$ still works well. For doubly censored data given by (4) with (6), the consistent estimator $\hat{G}_n(t, z)$ in (11) for $G_0(t, z)$ may be the estimator $\hat{G}_{RG}(t, z)$ of [9], or may be constructed in the same way as $\hat{G}_{RR}(t, z)$ above, i.e., as the sum of products of the marginal NPMLE for $F_Z(z)$ and the conditional NPMLE $\hat{F}_{MR}$ in the univariate case by [11] given $Z = z$. For interval censored data or partly interval-censored data given by (4) with (7) or (8), the consistent estimator $\hat{G}_n(t, z)$ in (11) for $G_0(t, z)$ can be obtained in the same way, and our WMLE $\hat{\beta}_n$ then follows from the same formulas in Theorem 1.

4. Proofs

Proof of Theorem 1.
Notice that the maximum value of $L_W(\alpha, \beta, F)$ in (18) can only be attained at those $p_{ij}(\alpha, \beta)$'s which satisfy $p_{ij}(\alpha, \beta) > 0$ for $(i, j) \in A$; this fact gives the constraint conditions on $p_{ij}(\alpha, \beta)$ in (18).
For simplicity of notation, we denote in (15):
$$p_{ij}(\alpha, \beta) = p_{ij}, \quad (i, j) \in A, \tag{21}$$
then the constraints in (18) become
$$\begin{aligned} & p_{ij} > 0, \quad \sum_{\hat{\omega}_{ij} > 0} p_{ij} = 1, \quad \text{for } (i, j) \in A, \\ & \sum_{\hat{\omega}_{ij} > 0} (U_i - \alpha + \beta W_j) \, p_{ij} = 0, \\ & \sum_{\hat{\omega}_{ij} > 0} (U_i - \alpha + \beta W_j) \, W_j \, p_{ij} = 0. \end{aligned} \tag{22}$$
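From the last two equations in (22) and the constraint equation $\sum_{\hat{\omega}_{ij} > 0} p_{ij} = 1$, we solve for $(\alpha, \beta)$ in terms of the $U_i$'s, $W_j$'s and $p_{ij}$'s as follows:
$$\begin{aligned} & \sum_{\hat{\omega}_{ij} > 0} U_i p_{ij} = \alpha \sum_{\hat{\omega}_{ij} > 0} p_{ij} - \beta \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij} = \alpha - \beta \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij}, \\ & \sum_{\hat{\omega}_{ij} > 0} U_i W_j p_{ij} = \alpha \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij} - \beta \sum_{\hat{\omega}_{ij} > 0} W_j^2 p_{ij} \\ \Rightarrow \ & \sum_{\hat{\omega}_{ij} > 0} U_i p_{ij} \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij} = \alpha \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij} - \beta \Big( \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij} \Big)^2 \\ \Rightarrow \ & \beta = \frac{\sum_{\hat{\omega}_{ij} > 0} U_i p_{ij} \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij} - \sum_{\hat{\omega}_{ij} > 0} U_i W_j p_{ij}}{\sum_{\hat{\omega}_{ij} > 0} W_j^2 p_{ij} - \big( \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij} \big)^2}, \quad \alpha = \sum_{\hat{\omega}_{ij} > 0} U_i p_{ij} + \beta \sum_{\hat{\omega}_{ij} > 0} W_j p_{ij}. \end{aligned} \tag{23}$$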
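Thus, the optimization problem (18) is equivalent to:
$$\begin{aligned} \max_{\mathbf{p}} \ & L(\mathbf{p}) = \prod_{\hat{\omega}_{ij} > 0} p_{ij}^{\, n \hat{\omega}_{ij}} \\ \text{subject to: } & p_{ij} > 0, \quad \sum_{\hat{\omega}_{ij} > 0} p_{ij} = 1, \quad \text{for } (i, j) \in A, \end{aligned} \tag{24}$$
where the $p_{ij}$'s are given by (15) and (21). The solution $\hat{p}_{ij}$ of the above optimization problem (24) gives the WMLE $(\hat{\alpha}_n, \hat{\beta}_n)$ via (23):
$$\begin{aligned} \hat{\alpha}_n &= \sum_{\hat{\omega}_{ij} > 0} U_i \hat{p}_{ij} + \hat{\beta}_n \sum_{\hat{\omega}_{ij} > 0} W_j \hat{p}_{ij}, \\ \hat{\beta}_n &= \frac{\sum_{\hat{\omega}_{ij} > 0} U_i \hat{p}_{ij} \sum_{\hat{\omega}_{ij} > 0} W_j \hat{p}_{ij} - \sum_{\hat{\omega}_{ij} > 0} U_i W_j \hat{p}_{ij}}{\sum_{\hat{\omega}_{ij} > 0} W_j^2 \hat{p}_{ij} - \big( \sum_{\hat{\omega}_{ij} > 0} W_j \hat{p}_{ij} \big)^2}, \end{aligned} \tag{25}$$
and the WMLE $\hat{F}_n(t)$ is given via (15) by
$$\hat{F}_n(t) = \sum_{i=1}^{m} \sum_{j=1}^{q} \hat{p}_{ij} \, I\{U_i - \hat{\alpha}_n + \hat{\beta}_n W_j \le t\}. \tag{26}$$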
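Next, we solve optimization problem (24) as follows. Denote $\mathbf{p} = \{p_{ij} \mid (i, j) \in A\}$; we then maximize $L(\mathbf{p})$ in (24) over the $p_{ij}$'s to obtain the $\hat{p}_{ij}$'s. Let
$$\ell(\mathbf{p}) = n^{-1} \log L(\mathbf{p}) = \sum_{\hat{\omega}_{ij} > 0} \hat{\omega}_{ij} \log p_{ij}. \tag{27}$$
Thus, the optimization problem (24) is equivalent to:
$$\begin{aligned} \max_{\mathbf{p}} \ & \ell(\mathbf{p}) = \sum_{\hat{\omega}_{ij} > 0} \hat{\omega}_{ij} \log p_{ij} \\ \text{subject to: } & p_{ij} > 0, \quad \sum_{\hat{\omega}_{ij} > 0} p_{ij} = 1, \quad \text{for } (i, j) \in A. \end{aligned} \tag{28}$$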
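For the Lagrange multiplier $\lambda$, we let
$$h_1(\mathbf{p}) = \sum_{\hat{\omega}_{ij} > 0} \hat{\omega}_{ij} \log p_{ij} + \lambda \Big( \sum_{\hat{\omega}_{ij} > 0} p_{ij} - 1 \Big),$$
then its partial derivative with respect to $p_{ij} > 0$ for $(i, j) \in A$ is given by
$$\frac{\partial h_1}{\partial p_{ij}} = \frac{\hat{\omega}_{ij}}{p_{ij}} + \lambda.$$
Setting $\nabla h_1(\mathbf{p}) = 0$, from the second constraint condition in (28) and the conditions on the $\hat{\omega}_{ij}$'s in (11) we obtain:
$$\sum_{\hat{\omega}_{ij} > 0} p_{ij} \frac{\partial h_1}{\partial p_{ij}} = \sum_{\hat{\omega}_{ij} > 0} \hat{\omega}_{ij} + \lambda \sum_{\hat{\omega}_{ij} > 0} p_{ij} = 1 + \lambda = 0 \ \Rightarrow \ \lambda = -1.$$
Plugging $\lambda = -1$ into the equation $\nabla h_1(\mathbf{p}) = 0$, we obtain that for $(i, j) \in A$:
$$\frac{\partial h_1}{\partial p_{ij}} = \frac{\hat{\omega}_{ij}}{p_{ij}} - 1 = 0 \ \Rightarrow \ p_{ij} = \hat{\omega}_{ij} > 0.$$
Thus, we know that $\ell(\mathbf{p})$ in optimization problem (28) is maximized at
$$\hat{\mathbf{p}} = \{\hat{p}_{ij} = \hat{\omega}_{ij} \mid (i, j) \in A\}. \tag{30}$$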
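The maximizer found by the Lagrange argument can also be checked numerically. The sketch below uses hypothetical weights and compares $\ell(\hat{\mathbf{p}})$ at $\hat{p}_{ij} = \hat{\omega}_{ij}$ with the value of $\ell$ at randomly drawn points of the probability simplex; it is a sanity check under these illustrative values, not a proof.

```python
import math
import random


def ell(p, omega):
    """Objective of (28): sum of omega_ij * log p_ij over the positive weights."""
    return sum(w * math.log(q) for q, w in zip(p, omega))


rng = random.Random(0)
omega = [0.1, 0.2, 0.3, 0.4]          # hypothetical positive weights summing to one
best = ell(omega, omega)              # value at p_ij = omega_ij

never_beaten = True
for _ in range(1000):
    raw = [rng.random() + 1e-12 for _ in omega]   # small offset avoids log(0)
    total = sum(raw)
    p = [r / total for r in raw]                  # random point on the simplex
    if ell(p, omega) > best + 1e-12:
        never_beaten = False

print(never_beaten)  # True
```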
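From (25), (26) and (30), we know that the WMLE $(\hat{\alpha}_n, \hat{\beta}_n, \hat{F}_n)$ for $(\alpha_0, \beta_0, F_0)$ is given by
$$\begin{aligned} \hat{\alpha}_n &= \sum_{\hat{\omega}_{ij} > 0} U_i \hat{\omega}_{ij} + \hat{\beta}_n \sum_{\hat{\omega}_{ij} > 0} W_j \hat{\omega}_{ij}, \\ \hat{\beta}_n &= \frac{\sum_{\hat{\omega}_{ij} > 0} U_i \hat{\omega}_{ij} \sum_{\hat{\omega}_{ij} > 0} W_j \hat{\omega}_{ij} - \sum_{\hat{\omega}_{ij} > 0} U_i W_j \hat{\omega}_{ij}}{\sum_{\hat{\omega}_{ij} > 0} W_j^2 \hat{\omega}_{ij} - \big( \sum_{\hat{\omega}_{ij} > 0} W_j \hat{\omega}_{ij} \big)^2}, \\ \hat{F}_n(t) &= \sum_{i=1}^{m} \sum_{j=1}^{q} \hat{\omega}_{ij} \, I\{U_i - \hat{\alpha}_n + \hat{\beta}_n W_j \le t\}, \end{aligned}$$
which implies (19) of Theorem 1. □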

Author Contributions

Conceptualization, J.-J.R. and Y.L.; methodology, J.-J.R. and Y.L.; software, J.-J.R. and Y.L.; validation, J.-J.R. and Y.L.; formal analysis, J.-J.R. and Y.L.; investigation, J.-J.R. and Y.L.; resources, J.-J.R. and Y.L.; data curation, J.-J.R. and Y.L.; writing—original draft preparation, J.-J.R. and Y.L.; writing—review and editing, J.-J.R. and Y.L.; visualization, J.-J.R. and Y.L.; supervision, J.-J.R. and Y.L.; project administration, J.-J.R. and Y.L.; funding acquisition, J.-J.R. All authors have read and agreed to the published version of the manuscript.

Funding

Dr. Ren’s research was partially supported by NSF grant DMS-1407461.

Data Availability Statement

This article develops novel statistical methodology; thus, it does not include any data analysis and has no data to share.

Acknowledgments

The authors thank three reviewers for their comments and suggestions on the earlier draft of this article.

Conflicts of Interest

Author Yiming Lyu was employed by The Janssen Pharmaceutical Company of Johnson & Johnson. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Cox, D.R.; Oakes, D. Analysis of Survival Data; Chapman & Hall: London, UK, 1984.
  2. Kaplan, E.L.; Meier, P. Nonparametric estimation from incomplete observations. J. Am. Stat. Assoc. 1958, 53, 457–481.
  3. Stute, W.; Wang, J.L. The strong law under random censorship. Ann. Stat. 1993, 21, 1591–1607.
  4. Gill, R. Large sample behavior of the product-limit estimator on the whole line. Ann. Stat. 1983, 11, 49–58.
  5. Cox, D.R. Regression models and life-tables (with discussion). J. R. Stat. Soc. Ser. B 1972, 34, 187–220.
  6. Gehan, A.E. A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika 1965, 52, 203–223.
  7. Leiderman, P.H.; Babu, D.; Kagia, J.; Kraemer, H.C.; Leiderman, C.F. African infant precocity and some social influences during the first year. Nature 1973, 242, 247–249.
  8. Turnbull, B.W. Nonparametric estimation of a survivorship function with doubly censored data. J. Am. Stat. Assoc. 1974, 69, 169–173.
  9. Ren, J.; Gu, M.G. Regression M-estimators for doubly censored data. Ann. Stat. 1997, 25, 2638–2664.
  10. Ren, J.; Peer, P.G. A study on effectiveness of screening mammograms. Int. J. Epidemiol. 2000, 29, 803–806.
  11. Mykland, P.A.; Ren, J. Self-consistent and maximum likelihood estimation for doubly censored data. Ann. Stat. 1996, 24, 1740–1764.
  12. Chang, M.N.; Yang, G.L. Strong consistency of a nonparametric estimator of the survival function with doubly censored data. Ann. Stat. 1987, 15, 1536–1547.
  13. Gu, M.G.; Zhang, C.H. Asymptotic properties of self-consistent estimators based on doubly censored data. Ann. Stat. 1993, 21, 611–624.
  14. Groeneboom, P.; Wellner, J.A. Information Bounds and Nonparametric Maximum Likelihood Estimation; Birkhäuser Verlag: Basel, Switzerland, 1992.
  15. Kim, M.Y.; De Gruttola, V.G.; Lagakos, S.W. Analyzing doubly censored data with covariates, with application to AIDS. Biometrics 1993, 49, 13–22.
  16. Ren, J. Goodness of fit tests with interval censored data. Scand. J. Stat. 2003, 30, 211–226.
  17. Huang, J. Asymptotic properties of nonparametric estimation based on partly interval-censored data. Stat. Sin. 1999, 9, 501–519.
  18. Odell, P.M.; Anderson, K.M.; D’Agostino, R.B. Maximum likelihood estimation for interval-censored data using a Weibull-based accelerated failure time model. Biometrics 1992, 48, 951–959.
  19. Enevoldsen, A.K.; Borch-Johnson, K.; Kreiner, S.; Nerup, J.; Deckert, T. Declining incidence of persistent proteinuria in type I (insulin-dependent) diabetic patient in Denmark. Diabetes 1987, 36, 205–209.
  20. Buckley, J.; James, I. Linear regression with censored data. Biometrika 1979, 66, 429–436.
  21. Jin, Z.; Lin, D.Y.; Ying, Z. On least-squares regression with censored data. Biometrika 2006, 93, 147–161.
  22. Lai, T.L.; Ying, Z. Large sample theory of a modified Buckley-James estimator for regression analysis with censored data. Ann. Stat. 1991, 19, 1370–1402.
  23. Ritov, Y. Estimation in a linear regression model with censored data. Ann. Stat. 1990, 18, 303–328.
  24. Li, G.; Wang, Q. Empirical likelihood regression analysis for right censored data. Stat. Sin. 2003, 13, 51–68.
  25. Zhou, M. Empirical likelihood analysis of the rank estimator for the censored accelerated failure time model. Biometrika 2005, 92, 492–498.
  26. Zhou, M.; Li, G. Empirical likelihood analysis of the Buckley-James estimator. J. Multivar. Anal. 2008, 99, 649–664.
  27. Jin, Z.; Lin, D.Y.; Wei, L.J.; Yang, Z. Rank-based inference for the accelerated failure time model. Biometrika 2003, 90, 341–353.
  28. Ren, J. Weighted Empirical Likelihood Ratio Confidence Intervals for the Mean with Censored Data. Ann. Inst. Stat. Math. 2001, 53, 498–516.
  29. Ren, J. Weighted empirical likelihood in some two-sample semiparametric models with various types of censored data. Ann. Stat. 2008, 36, 147–166.
  30. Owen, A.B. Empirical likelihood ratio confidence intervals for a single functional. Biometrika 1988, 75, 237–249.
  31. Ren, J.; Riddlesworth, T.D. Empirical likelihood bivariate nonparametric maximum likelihood estimator with right censored data. Ann. Inst. Stat. Math. 2014, 66, 913–930.
  32. Ren, J. Empirical likelihood bivariate nonparametric maximum likelihood estimator with right censored data and continuous covariate. Stat. Its Interface 2017, 10, 601–605.
  33. Qin, J.; Lawless, J. Empirical Likelihood and general estimating equations. Ann. Stat. 1994, 22, 300–325.
Table 1. Simulation Results of $\hat{\beta}_n$ for Right Censored Data with Case 1 of $F_Z$.

| Sample Size | Ave. $\hat{\beta}_n$ ($\beta_0 = 2$) | Ave. $\hat{\beta}_n$ ($\beta_0 = 1$) | Ave. $\hat{\beta}_n$ ($\beta_0 = 0$) | Ave. $\hat{\beta}_n$ ($\beta_0 = -1$) | Ave. $\hat{\beta}_n$ ($\beta_0 = -2$) |
|---|---|---|---|---|---|
| $n = 50$ | 1.987 (0.79) | 0.992 (0.80) | 0.005 (0.81) | −0.905 (0.83) | −1.639 (0.84) |
| $n = 100$ | 2.003 (0.56) | 0.999 (0.57) | 0.012 (0.58) | −0.933 (0.60) | −1.735 (0.62) |
| $n = 200$ | 2.001 (0.39) | 0.993 (0.40) | 0.001 (0.41) | −0.968 (0.42) | −1.809 (0.45) |
| $n = 500$ | 2.000 (0.25) | 0.999 (0.25) | −0.004 (0.26) | −0.989 (0.27) | −1.884 (0.30) |
| $n = 1000$ | 1.996 (0.17) | 0.996 (0.17) | −0.007 (0.18) | −1.001 (0.19) | −1.931 (0.22) |
| Censoring % | 16.5% | 23.5% | 33.4% | 45.3% | 57.3% |
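
The finite-sample bias implied by Table 1 can be tabulated directly from the reported averages. The snippet below is only an illustrative sketch: it assumes the five columns correspond to $\beta_0 = 2, 1, 0, -1, -2$ (the negative signs are inferred from the reported averages), and the names `TRUE_BETAS`, `TABLE1_AVERAGES`, `empirical_bias`, and `BIAS` are hypothetical helpers, not part of the article.

```python
# Averages of the estimator copied from Table 1 (right censored data, Case 1).
# ASSUMPTION: the column order is beta_0 = 2, 1, 0, -1, -2; the signs of the
# last two true values are inferred from the sign of the reported averages.
TRUE_BETAS = [2, 1, 0, -1, -2]

TABLE1_AVERAGES = {
    50:   [1.987, 0.992, 0.005, -0.905, -1.639],
    100:  [2.003, 0.999, 0.012, -0.933, -1.735],
    200:  [2.001, 0.993, 0.001, -0.968, -1.809],
    500:  [2.000, 0.999, -0.004, -0.989, -1.884],
    1000: [1.996, 0.996, -0.007, -1.001, -1.931],
}


def empirical_bias(averages, true_values):
    """Return (average estimate - true value) for each column, rounded to 3 places."""
    return [round(a - t, 3) for a, t in zip(averages, true_values)]


# Bias per sample size, in the same column order as TRUE_BETAS.
BIAS = {n: empirical_bias(avgs, TRUE_BETAS) for n, avgs in TABLE1_AVERAGES.items()}

if __name__ == "__main__":
    for n, row in BIAS.items():
        print(f"n = {n:4d}: " + "  ".join(f"{b:+.3f}" for b in row))
```

Under this reading, the largest discrepancies sit in the last column, where the censoring percentage is highest, and they shrink as $n$ grows.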
Table 2. Simulation Results of $\hat{\beta}_n$ for Right Censored Data with Case 2 of $F_Z$.

| Sample Size | Ave. $\hat{\beta}_n$ ($\beta_0 = 2$) | Ave. $\hat{\beta}_n$ ($\beta_0 = 1$) | Ave. $\hat{\beta}_n$ ($\beta_0 = 0$) | Ave. $\hat{\beta}_n$ ($\beta_0 = -1$) | Ave. $\hat{\beta}_n$ ($\beta_0 = -2$) |
|---|---|---|---|---|---|
| $n = 50$ | 1.975 (0.70) | 0.982 (0.71) | 0.006 (0.72) | −0.870 (0.73) | −1.551 (0.73) |
| $n = 100$ | 2.007 (0.51) | 1.000 (0.51) | 0.010 (0.52) | −0.917 (0.54) | −1.681 (0.55) |
| $n = 200$ | 2.011 (0.36) | 1.006 (0.36) | 0.002 (0.37) | −0.966 (0.39) | −1.782 (0.41) |
| $n = 500$ | 2.002 (0.22) | 1.007 (0.23) | −0.003 (0.23) | −1.009 (0.25) | −1.891 (0.27) |
| $n = 1000$ | 1.995 (0.16) | 1.000 (0.16) | −0.004 (0.17) | −1.020 (0.18) | −1.950 (0.20) |
| Censoring % | 16.7% | 23.6% | 33.4% | 45.3% | 57.2% |
Table 3. Simulation Results of $\hat{\beta}_n$ for Right Censored Data with Case 3 of $F_Z$.

| Sample Size | Ave. $\hat{\beta}_n$ ($\beta_0 = 2$) | Ave. $\hat{\beta}_n$ ($\beta_0 = 1$) | Ave. $\hat{\beta}_n$ ($\beta_0 = 0$) | Ave. $\hat{\beta}_n$ ($\beta_0 = -1$) | Ave. $\hat{\beta}_n$ ($\beta_0 = -2$) |
|---|---|---|---|---|---|
| $n = 50$ | 1.983 (0.87) | 0.990 (0.88) | 0.006 (0.90) | −0.896 (0.93) | −1.609 (0.93) |
| $n = 100$ | 2.007 (0.62) | 1.001 (0.63) | 0.012 (0.64) | −0.927 (0.66) | −1.715 (0.70) |
| $n = 200$ | 2.005 (0.43) | 0.997 (0.44) | −0.000 (0.45) | −0.966 (0.48) | −1.793 (0.50) |
| $n = 500$ | 2.002 (0.27) | 0.999 (0.28) | −0.005 (0.29) | −0.996 (0.31) | −1.884 (0.33) |
| $n = 1000$ | 1.997 (0.19) | 0.997 (0.19) | −0.008 (0.21) | −1.008 (0.21) | −1.935 (0.24) |
| Censoring % | 16.3% | 23.5% | 33.4% | 45.3% | 57.4% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Ren, J.-J.; Lyu, Y. Weighted Empirical Likelihood for Accelerated Life Model with Various Types of Censored Data. Stats 2024, 7, 944-954. https://doi.org/10.3390/stats7030057