Proportional Hazard Model and Proportional Odds Model under Dependent Truncated Data

Hsieh, Jin-Jian; Chen, Yun-Jhu

doi:10.3390/axioms11100521

by Jin-Jian Hsieh * and Yun-Jhu Chen

Department of Mathematics, National Chung Cheng University, Chia-Yi 621301, Taiwan

* Author to whom correspondence should be addressed.

Axioms 2022, 11(10), 521; https://doi.org/10.3390/axioms11100521

Submission received: 25 August 2022 / Revised: 22 September 2022 / Accepted: 27 September 2022 / Published: 1 October 2022

(This article belongs to the Special Issue Applied Mathematics in Biology and Medicine)

Abstract:

Truncated data arise when the event time of interest can be observed only if it satisfies a certain condition. Most existing approaches analyze this kind of data by assuming that the truncation variable is quasi-independent of the event time of interest. However, in many situations, the quasi-independence assumption may not be suitable. In this article, the authors use copulas to relax the quasi-independence assumption, and the survival function of the event time of interest is estimated by a copula-graphic approach. The authors then propose two estimation procedures for the proportional hazard (PH) model and the proportional odds (PO) model, which can be applied to right-truncated data and to left-truncated and right-censored data. The performance of the proposed estimation approaches is assessed via simulation studies. Finally, the proposed methodologies are applied to two real datasets (the retirement center dataset and the transfusion-related AIDS dataset).

Keywords:

censoring; copula model; proportional hazard model; proportional odds model; dependent truncated data

1. Introduction

In survival analysis, censoring and truncation are frequently encountered, and both complicate statistical inference. When censoring occurs, the survival time is not known exactly, but partial information is available for the censored subject. Under truncation, no information at all is available for a truncated subject. In this article, we consider two kinds of data: left-truncated and right-censored data, and right-truncated data. The pair of truncated variables, (X, Y), satisfies the condition Y > X. When Y is the variable of interest, the data are called left-truncated data, and the observed data are \{(X_{i}, T_{i}, δ_{i}) : X_{i} < T_{i}, i = 1, \dots, n\}, where T = \min(Y, C), δ = 1_{\{Y < C\}}, and C is the right-censoring time. On the other hand, when X is the variable of interest, the data are said to be right-truncated data, and the observed data are \{(X_{i}, Y_{i}) : X_{i} < Y_{i}, i = 1, \dots, n\}. Previous studies assumed that X and Y are quasi-independent for survival function estimation [1,2,3], the linear regression model [4,5], and the hazard rate model [6,7,8]. However, this assumption of quasi-independence may not be suitable in some situations. For instance, ref. [9] rejected the quasi-independence assumption using conditional Kendall’s tau for the transfusion-related AIDS dataset. Throughout this paper, we formulate a regression model on the event time of interest (X or Y), which includes the PH model and the PO model, to study the relationship between the event time and covariates. For the regression problem under dependent truncated data, ref. [10] considered semiparametric inference for an accelerated failure time model, incorporating the information of the truncation variable into the regression model to relax the independent truncation assumption. Ref. [11] suggested an expectation-maximization algorithm to relax the independent truncation assumption for the Cox regression model. Ref. [12] proposed conditional maximum likelihood estimators to relax the independent truncation assumption. This paper uses copulas to relax the independent truncation assumption between X and Y. In addition, we apply the method of [13], which generalizes the copula-graphic estimator, to estimate the survival functions and Kendall’s tau. Then, we extend the methods of [14] (the application of the area between two survival curves) and [15] (the minimization of the norm distance between two survival curves) to estimate the regression parameters for the PH model and the PO model under dependent truncated data.

The rest of this article is organized as follows. In Section 2, the dependent truncated data, the copula models, and the regression models (the PH model and the PO model) are introduced. In Section 3, using the method of [13], we estimate the survival functions and Kendall’s tau in the case of dependent truncated data. Furthermore, we build a regression model and propose two methods for estimating the regression parameters under dependent truncated data by extending the methods of [14,15], which were originally applied to semi-competing risks data and dependent current status data, respectively. Finally, the bootstrapping procedure is described. In Section 4, the performance of the proposed estimation procedures is examined via simulation studies with different sample sizes, values of Kendall’s tau, and copula models. In Section 5, we analyze two real datasets [16,17] using the estimation procedures described in Section 3. Finally, we conclude in Section 6.

2. Data and Model Assumptions

2.1. Dependent Truncated Data

Dependent truncated data are a type of incomplete data in survival analysis. The pair of truncated variables (X, Y) can be observed only when it satisfies Y > X. Depending on whether the variable of interest is Y or X, the data are termed left-truncated or right-truncated, respectively. In the example of [16], the retirement center dataset is a set of left-truncated and right-censored data, with X defined as the entry age, Y as the length of lifetime, and C as the censoring time. Assume that C is independent of (X, Y) and δ = 1_{\{Y < C\}}. In this example, participants were included only if they lived long enough to enter the retirement house. Define T = \min(Y, C); the observed data are \{(X_{i}, T_{i}, δ_{i}) : X_{i} < T_{i}, i = 1, \dots, n\}. In the other case, the dataset of [17] is a set of right-truncated data, with X defined as the induction time and Y as the time between the infection onset and the end of the study. In this example, participants were included only if they had developed AIDS before 102 months from the beginning of the study (i.e., they satisfied X < Y). The observed data are \{(X_{i}, Y_{i}) : X_{i} < Y_{i}, i = 1, \dots, n\}.
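The observation scheme for left-truncated and right-censored data can be illustrated with a minimal Python sketch. The function name `observe_ltrc` and the numeric triples are hypothetical and are not taken from either dataset; the sketch only applies the definitions T = min(Y, C), δ = 1{Y < C}, and the inclusion rule X < T stated above.

```python
def observe_ltrc(latent_triples):
    """Turn latent (x, y, c) triples into observed (x, t, delta) records.

    A subject is kept only if x < t, where t = min(y, c); subjects with
    x >= t are truncated and leave no record at all.
    """
    observed = []
    for x, y, c in latent_triples:
        t = min(y, c)                 # observed follow-up time
        delta = 1 if y < c else 0     # 1 = event observed, 0 = censored
        if x < t:                     # truncation condition X < T
            observed.append((x, t, delta))
    return observed


# Hypothetical latent values (entry age, lifetime, censoring time).
latent = [(1, 3, 5), (4, 2, 6), (2, 5, 3), (3, 6, 2)]
records = observe_ltrc(latent)
print(records)
```

In this toy input, the second and fourth subjects are truncated because their entry time is not smaller than min(Y, C), which is the sense in which truncated subjects contribute no information.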

2.2. Semisurvival Copula

A semisurvival copula was used in [13] for the case of truncated data. It models the dependence between X and Y when the pairs satisfy the condition X < Y. The joint distribution of X and Y, conditional on Y > X, is expressed as follows:

π(x, y) = \Pr(X \leq x, Y > y | Y > X) = \frac{1}{c} C_{α} \{F_{X}(x), S_{Y}(y)\},

(1)

where F_{X}(x) is the cumulative distribution function of X, S_{Y}(y) is the survival function of Y, C_{α} is a semisurvival copula function, and c is a normalizing constant, which represents \Pr(Y > X). The Archimedean copula model (AC model) [18] is a popular subclass of copula models with generating function ϕ_{α}, which can be expressed as follows:

C_{α}(u, v) = ϕ_{α}^{-1}\{ϕ_{α}(u) + ϕ_{α}(v)\}, 0 < u, v < 1,

(2)

where ϕ_{α} : (0, 1] \to [0, \infty) is a continuous, strictly decreasing, convex function with ϕ_{α}(1) = 0, ϕ_{α}'(t) < 0, and ϕ_{α}''(t) > 0. Furthermore, α is an association parameter related to Kendall’s tau. The AC model has a simple form and includes many well-known copula functions, such as the Clayton, Frank, Gumbel, and independence copulas.
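To make Equation (2) concrete, the following Python sketch builds a copula from a generator and its inverse. It assumes the Clayton generator ϕ_α(t) = (t^{-α} - 1)/α with α > 0 and the standard Clayton relation τ = α/(α + 2); these formulas are textbook facts about the Clayton family rather than statements from this article, and the function names are hypothetical. How τ of the copula maps to the dependence between X and Y in the semisurvival setting is not addressed here.

```python
def clayton_generator(t, alpha):
    """Clayton generator phi_alpha(t) = (t**(-alpha) - 1) / alpha, alpha > 0."""
    return (t ** (-alpha) - 1.0) / alpha


def clayton_generator_inverse(s, alpha):
    """Inverse generator phi_alpha^{-1}(s) = (1 + alpha * s)**(-1 / alpha)."""
    return (1.0 + alpha * s) ** (-1.0 / alpha)


def archimedean_copula(u, v, alpha):
    """C_alpha(u, v) = phi^{-1}(phi(u) + phi(v)) as in Equation (2)."""
    total = clayton_generator(u, alpha) + clayton_generator(v, alpha)
    return clayton_generator_inverse(total, alpha)


def clayton_closed_form(u, v, alpha):
    """Closed form of the Clayton copula, used only as a cross-check."""
    return (u ** (-alpha) + v ** (-alpha) - 1.0) ** (-1.0 / alpha)


def clayton_kendall_tau(alpha):
    """Kendall's tau of the Clayton copula: tau = alpha / (alpha + 2)."""
    return alpha / (alpha + 2.0)


def clayton_alpha_from_tau(tau):
    """Invert the tau relation: alpha = 2 * tau / (1 - tau)."""
    return 2.0 * tau / (1.0 - tau)


print(archimedean_copula(0.3, 0.7, 2.0), clayton_closed_form(0.3, 0.7, 2.0))
```

The generator route and the closed form agree, and C_α(u, 1) = u follows from ϕ_α(1) = 0, which is the boundary condition listed above.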

2.3. Regression Model

Let T^{*} be the event time of interest, which could be Y or X. We consider the following regression model:

h(T^{*}) = Z' θ + ε,

(3)

where h(.) is an unknown monotonically increasing function, Z is a discrete covariate, θ is a parameter, both Z and θ are p \times 1 vectors, and ε is the error term, whose distribution is known. Two scenarios are considered, in each of which h(.) is unknown and the distribution of ε is specified. In the first, ε follows the Gumbel extreme value distribution, which gives the Cox proportional hazard model. In the second, ε follows the standard logistic distribution, which gives the proportional odds model.

The Cox PH model, a regression model commonly employed in survival analysis, links the hazard function with the covariates. It consists of two parts: the baseline hazard function, h_{0}(y), and the exponential term, \exp(Z' θ). With h(y, Z) denoting the hazard at time y given covariate Z (distinct from the transformation h(.) in model (3)), the form of the Cox PH model is as follows:

h(y, Z) = h_{0}(y) e^{Z' θ}.

(4)

Furthermore, the hazard ratio of Z_{1} to Z_{2} can be expressed as follows:

HR = \frac{h(y, Z_{1})}{h(y, Z_{2})} = \exp\{(Z_{1} - Z_{2})' θ\}.

(5)

The PO model is designed for studying the effect of covariates on the odds. The form of the PO model is as follows:

ϕ(y, Z) = ϕ_{0}(y) e^{Z' θ},

(6)

where ϕ(y, Z) is the failure odds of an individual at time y with covariate Z, and ϕ_{0}(y) is the baseline odds. Furthermore, the odds ratio of Z_{1} to Z_{2} can be expressed as follows:

OR = \frac{ϕ(y, Z_{1})}{ϕ(y, Z_{2})} = \exp\{(Z_{1} - Z_{2})' θ\}.

(7)

3. The Proposed Estimation Procedures

The purpose of this article is to estimate the parameter θ in the regression model (3) under dependent truncated data; θ measures the covariate effect on the event time of interest. Using the methods of [13,19], the survival function of Y (or X) can be estimated, and the estimator is denoted by \hat{S}_{Y}(y) (or \hat{S}_{X}(x)). In the following two subsections, two methods are proposed to estimate the regression parameter θ. The first, proposed by [14], was used for semi-competing risks data. The second, proposed by [15], was used for dependent current status data. Here, we extend the two methods to left-truncated and right-censored data and to right-truncated data. When Y is the variable of interest, we apply \hat{S}_{Y}(y) in Method 1 and Method 2, as in the derivations below. When X is the variable of interest, we apply \hat{S}_{X}(x) in Method 1 and Method 2 instead.

3.1. Method 1: The Application of the Area between Two Survival Curves

For a binary covariate Z = 0, 1, the following test statistic can be used to test the hypothesis H_{0} : S_{Y,0}(x) = S_{Y,1}(x), where S_{Y,z}(x) = \Pr(Y > x | Z = z):

U_{Y} = \sqrt{\frac{n_{0} n_{1}}{n}} \int W(x) \{\hat{S}_{Y,0}(x) - \hat{S}_{Y,1}(x)\} dx.

(8)

In the above test statistic, W(x) is the weight function, n_{0} is the sample size of group Z = 0, n_{1} is the sample size of group Z = 1, and n = n_{0} + n_{1} is the total sample size. For the general case, define \{z_{k}, k = 1, 2, \dots, K\} as the set of all possible values of Z and θ_{0} as the true value of θ. If the regression model (3) is true, there exists a functional transformation ξ_{θ}(.) such that ξ_{(z_{j} - z_{k})^{T} θ_{0}}(S_{Y, z_{k}})(y) = S_{Y, z_{j}}(y). Define g_{kj}(y, θ) = ξ_{z_{kj}^{T} θ}(S_{Y, z_{k}})(y) - S_{Y, z_{j}}(y); then the estimator of g_{kj} is \hat{g}_{kj}(y, θ) = ξ_{z_{kj}^{T} θ}(\hat{S}_{Y, z_{k}})(y) - \hat{S}_{Y, z_{j}}(y), where z_{kj} = z_{j} - z_{k} and \hat{S}_{Y, z}(y) = \widehat{\Pr}(Y > y | Z = z). Therefore, the estimating equation for θ becomes

U(θ) = \sum_{k < j} w_{0}(z_{kj}^{T} θ) z_{kj} \sqrt{\frac{n_{k} n_{j}}{n_{k} + n_{j}}} \int_{0}^{t_{kj}} W_{kj}(y) \hat{g}_{kj}(y, θ) dy = 0,

(9)

where t_{kj} is the largest value of Y in the subsample with Z = z_{k} or Z = z_{j}; w_{0}(.) is the weight function for the pair of groups z_{k} and z_{j}; and W_{kj}(.) is the weight function over time y for the two survival curves. Ref. [13] proved the consistency and asymptotic normality of \hat{S}_{Y, z}(x). The consistency and asymptotic normality of the Method 1 estimator \hat{θ} can then be proved by Theorem 1 of [14].

The discussion above covers a covariate with multiple values z_{k}, k = 1, \dots, K. The following examples are described for Z = 0, 1. When h(.) is unknown and the distribution of ε is specified, the general functional transformation is ξ_{θ}(S)(y) = S_{ε}[S_{ε}^{-1}\{S(y)\} - θ], where S_{ε}(y) = \Pr(ε > y) is the survival function of ε. Under model (3) with Z = 0, 1, the survival functions of the two groups can be written as follows:

S_{Y,0}(x) = \Pr(Y > x | Z = 0) = \Pr(h(Y) > h(x) | Z = 0) = \Pr(ε > h(x)) = S_{ε}(h(x)),
S_{Y,1}(x) = \Pr(Y > x | Z = 1) = \Pr(h(Y) > h(x) | Z = 1) = \Pr(ε > h(x) - θ) = S_{ε}(h(x) - θ).

Plugging S_{Y,0}(x) into ξ_{θ_{0}}(.) gives

ξ_{θ_{0}}(S_{Y,0})(x) = S_{ε}[S_{ε}^{-1}\{S_{ε}(h(x))\} - θ_{0}] = S_{ε}(h(x) - θ_{0}) = S_{Y,1}(x),

(10)

where θ_{0} is the true value of θ. The next two subsections discuss two specific models for the case where Z = 0, 1, h(.) is unknown, and the distribution of ε is specified.
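The general transformation ξ_θ(S) = S_ε[S_ε^{-1}(S) - θ] can be checked numerically against the closed forms stated in Sections 3.1.1 and 3.1.2. The Python sketch below is only an illustration; the function names are hypothetical, and the inverse survival functions are derived here from the two stated forms of S_ε.

```python
import math


def gumbel_survival(y):
    """S_eps(y) = exp(-exp(y)), the error survival function in the PH case."""
    return math.exp(-math.exp(y))


def gumbel_survival_inverse(s):
    """Inverse of exp(-exp(y)) for 0 < s < 1."""
    return math.log(-math.log(s))


def logistic_survival(y):
    """S_eps(y) = 1 / (1 + exp(y)), the error survival function in the PO case."""
    return 1.0 / (1.0 + math.exp(y))


def logistic_survival_inverse(s):
    """Inverse of 1 / (1 + exp(y)) for 0 < s < 1."""
    return math.log((1.0 - s) / s)


def transform(s, theta, surv, surv_inv):
    """xi_theta(S) = S_eps[S_eps^{-1}(S) - theta] for one survival probability s."""
    return surv(surv_inv(s) - theta)


def ph_closed_form(s, theta):
    """PH case: xi_theta(S) = S ** exp(-theta)."""
    return s ** math.exp(-theta)


def po_closed_form(s, theta):
    """PO case: xi_theta(S) = S / {exp(-theta) - S * exp(-theta) + S}."""
    e = math.exp(-theta)
    return s / (e - s * e + s)


for s in (0.1, 0.5, 0.9):
    for theta in (-0.3, 0.3):
        ph = transform(s, theta, gumbel_survival, gumbel_survival_inverse)
        po = transform(s, theta, logistic_survival, logistic_survival_inverse)
        print(s, theta, ph - ph_closed_form(s, theta), po - po_closed_form(s, theta))
```

The printed differences are at the level of floating-point rounding, which is consistent with Equations (11) and (13) being special cases of the general transformation.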

3.1.1. Estimation under Cox Proportional Hazard Model

When ε has the Gumbel extreme value distribution, the regression model (3) is the Cox PH model. In this case, S_{ε}(y) = \exp\{-\exp(y)\} and ξ_{θ}(S)(y) = S(y)^{\exp(-θ)}, so that

S_{Y,1}(y) = S_{Y,0}(y)^{\exp(-θ)}.

(11)

Hence, the equation for estimating θ becomes

\hat{U}(θ) = \sqrt{\frac{n_{0} n_{1}}{n}} \int_{0}^{t_{(n)}} W(t) \{\hat{S}_{Y,0}(t)^{\exp(-θ)} - \hat{S}_{Y,1}(t)\} dt = 0.

Since the estimated survival functions change only at the observed jump points, the integral can be written as a sum over these points, and the estimating equation for θ can be expressed as

\hat{U}(θ) = \sqrt{\frac{n_{0} n_{1}}{n}} \sum_{i = 1}^{n - 1} W(t_{(i)}) (t_{(i + 1)} - t_{(i)}) \{\hat{S}_{Y,0}(t_{(i)})^{\exp(-θ)} - \hat{S}_{Y,1}(t_{(i)})\} = 0,

(12)

where t_{(i)} is the i-th ordered survival jump time point in the pooled sample of the two groups.
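A minimal Python sketch of solving the estimating equation (12) is given below. It assumes W = 1, takes the two estimated survival curves as already evaluated on a common grid of ordered time points, drops the constant factor \sqrt{n_0 n_1 / n} (which does not change the root), and uses plain bisection. The function names and the synthetic curves are hypothetical; the copula-graphic estimation step of [13] is not reproduced.

```python
import math


def u_hat_ph(theta, grid, s0, s1):
    """Sum in Equation (12) with W = 1 and without the constant factor.

    grid: ordered time points t_(1) < ... < t_(n)
    s0, s1: survival estimates of groups Z = 0 and Z = 1 at those points
    """
    total = 0.0
    for i in range(len(grid) - 1):
        width = grid[i + 1] - grid[i]
        total += width * (s0[i] ** math.exp(-theta) - s1[i])
    return total


def solve_by_bisection(func, lo=-5.0, hi=5.0, tol=1e-10):
    """Find a root of func on [lo, hi], assuming a sign change on the interval."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0:
        raise ValueError("no sign change on the search interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Synthetic curves that satisfy Equation (11) exactly with theta = -0.3.
grid = [0.05 * i for i in range(1, 101)]
s0 = [math.exp(-t) for t in grid]
s1 = [s ** math.exp(0.3) for s in s0]
theta_hat = solve_by_bisection(lambda th: u_hat_ph(th, grid, s0, s1))
print(theta_hat)
```

Bisection is used here because, for survival values strictly between 0 and 1, each summand is increasing in θ, so the sum has at most one root; any bracketing root finder would serve the same purpose.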

3.1.2. Estimation under the Proportional Odds Model

When ε has the standard logistic distribution, the regression model (3) is the PO model. In this case, S_{ε}(y) = 1 / (1 + e^{y}) and ξ_{θ}(S)(y) = S(y) / \{\exp(-θ) - S(y) \exp(-θ) + S(y)\}, so that

S_{Y,1}(y) = \frac{S_{Y,0}(y)}{\exp(-θ) - S_{Y,0}(y) \exp(-θ) + S_{Y,0}(y)}.

(13)

Hence, the equation for estimating θ becomes

\hat{U}(θ) = \sqrt{\frac{n_{0} n_{1}}{n}} \int_{0}^{t_{(n)}} W(t) \{\frac{\hat{S}_{Y,0}(t)}{\exp(-θ) - \hat{S}_{Y,0}(t) \exp(-θ) + \hat{S}_{Y,0}(t)} - \hat{S}_{Y,1}(t)\} dt = 0.

Therefore, the estimating equation for θ is

\hat{U}(θ) = \sqrt{\frac{n_{0} n_{1}}{n}} \sum_{i = 1}^{n - 1} W(t_{(i)}) (t_{(i + 1)} - t_{(i)}) \{\frac{\hat{S}_{Y,0}(t_{(i)})}{\exp(-θ) - \hat{S}_{Y,0}(t_{(i)}) \exp(-θ) + \hat{S}_{Y,0}(t_{(i)})} - \hat{S}_{Y,1}(t_{(i)})\} = 0.

(14)

3.2. Method 2: The Minimization of the Norm Distance between Two Survival Curves

Method 1 uses the area between two survival curves. Method 2, as proposed by [15], instead minimizes the norm distance between the two curves. Define the estimator of g_{kj} as \hat{g}_{kj}(y, θ) = ξ_{z_{kj}^{T} θ}(\hat{S}_{Y, z_{k}})(y) - \hat{S}_{Y, z_{j}}(y); then the norm distance between the two survival curves can be expressed as follows:

\begin{matrix} \tilde{U}(θ) & = \sum_{k < j} w_{0}(z_{kj}^{T} θ) \| \hat{g}_{kj}(y, θ) \| \\ & = \sum_{k < j} w_{0}(z_{kj}^{T} θ) \| ξ_{z_{kj}^{T} θ}(\hat{S}_{Y, z_{k}})(y) - \hat{S}_{Y, z_{j}}(y) \| \\ & = \sum_{k < j} w_{0}(z_{kj}^{T} θ) [\sum_{t_{(i)} \in A_{kj}} \{ξ_{z_{kj}^{T} θ}(\hat{S}_{Y, z_{k}})(t_{(i)}) - \hat{S}_{Y, z_{j}}(t_{(i)})\}^{2}]^{\frac{1}{2}}, \end{matrix}

(15)

where w_{0}(.) is the weight function and A_{kj} is the set of survival jump time points in the pooled sample with Z = z_{k} or Z = z_{j}. We obtain the estimate of θ by minimizing \tilde{U}(θ). Following an argument similar to that in Appendix B of [14], together with the large-sample properties of \hat{S}_{Y, z_{k}}(y) [13], the consistency of \hat{θ} can be obtained. Then, by a Taylor series expansion and the inference procedure in the proof of Theorem 1 of [14], the asymptotic normality of \sqrt{n}(\hat{θ} - θ_{0}) can be proved.
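The minimization in Method 2 can be sketched in Python for the two-group case with w_0 = 1 and the PO transformation of Section 3.1.2. The golden-section search is an illustrative choice that assumes the distance has a single minimum on the search interval; the article does not state which optimizer was used. The function names and the synthetic survival curves are hypothetical.

```python
import math


def po_transform(s, theta):
    """PO transformation xi_theta(S) = S / {exp(-theta) - S * exp(-theta) + S}."""
    e = math.exp(-theta)
    return s / (e - s * e + s)


def norm_distance(theta, s0, s1, transform=po_transform):
    """Euclidean norm of xi_theta(S0) - S1 over the jump points, as in Equation (15)."""
    return math.sqrt(sum((transform(a, theta) - b) ** 2 for a, b in zip(s0, s1)))


def golden_section_minimize(func, lo, hi, tol=1e-8):
    """Minimize a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while b - a > tol:
        if fc < fd:
            # The minimum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            # The minimum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return 0.5 * (a + b)


# Synthetic curves that satisfy Equation (13) exactly with theta = 0.3.
grid = [0.05 * i for i in range(1, 101)]
s0 = [math.exp(-t) for t in grid]
s1 = [po_transform(s, 0.3) for s in s0]
theta_hat = golden_section_minimize(lambda th: norm_distance(th, s0, s1), -3.0, 3.0)
print(theta_hat)
```

Passing a different `transform` (for example the PH power transformation) would give the corresponding PH version of the same distance.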

3.3. Estimate Variance by the Bootstrap Approach

In this paper, we apply the bootstrap method to estimate the variance of the estimator of θ. For left-truncated and right-censored data, (u_{i}, v_{i}) are generated from the copula model, independently of w_{i}, and u_{i}, v_{i}, and w_{i} each follow Uniform(0,1). From [13], \hat{S}_{Y, z_{k}}(t), \hat{F}_{Y, z_{k}}(t) = 1 - \hat{S}_{Y, z_{k}}(t), \hat{F}_{X, z_{k}}(t), and \hat{S}_{C, z_{k}}(t) for group Z = z_{k} can be obtained. Therefore, we can obtain the data \{(X_{i}^{**}, Y_{i}^{**}, C_{i}^{**}), i = 1, \dots, n_{z_{k}}\} with X_{i}^{**} < \min(Y_{i}^{**}, C_{i}^{**}), where X_{i}^{**} = \hat{F}_{X, z_{k}}^{-1}(u_{i}), Y_{i}^{**} = \hat{F}_{Y, z_{k}}^{-1}(v_{i}), and C_{i}^{**} = \hat{F}_{C, z_{k}}^{-1}(w_{i}), with \hat{F}_{C, z_{k}}(t) = 1 - \hat{S}_{C, z_{k}}(t). Define T_{i}^{**} = \min(Y_{i}^{**}, C_{i}^{**}) and δ_{i}^{**} = I(Y_{i}^{**} < C_{i}^{**}); then the new bootstrapped data are

\{(X_{i}^{**}, T_{i}^{**}, δ_{i}^{**}) : X_{i}^{**} < T_{i}^{**}, i = 1, \dots, n\}.

In the other case, for right-truncated data, (u_{i}, v_{i}) are generated from the copula model with Uniform(0,1) margins. By [13], \hat{S}_{X, z_{k}}(t), \hat{F}_{X, z_{k}}(t) = 1 - \hat{S}_{X, z_{k}}(t), and \hat{F}_{Y, z_{k}}(t) for group Z = z_{k} are already obtained. Therefore, we can obtain the data \{(X_{i}^{**}, Y_{i}^{**}), i = 1, \dots, n_{z_{k}}\} with X_{i}^{**} < Y_{i}^{**}, where X_{i}^{**} = \hat{F}_{X, z_{k}}^{-1}(u_{i}) and Y_{i}^{**} = \hat{F}_{Y, z_{k}}^{-1}(v_{i}). The new bootstrapped data are

\{(X_{i}^{**}, Y_{i}^{**}) : X_{i}^{**} < Y_{i}^{**}, i = 1, \dots, n\}.

Based on the bootstrapped data, we can estimate θ by the above methods. Repeating the procedure B times, the standard deviation and the confidence interval can be obtained from the B values of \hat{θ}.
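The resampling step for right-truncated data can be sketched in Python as follows. Several choices here are assumptions made only for illustration: the copula is taken to be Clayton with α > 0 and sampled by conditional inversion, the estimated distribution functions are represented as step functions given by jump times and values, and both margins are inverted through \hat{F} exactly as written above. The function names and the toy step functions are hypothetical.

```python
import random


def inverse_step_cdf(times, cdf_values, p):
    """Generalized inverse: smallest jump time whose CDF value is >= p."""
    for t, f in zip(times, cdf_values):
        if f >= p:
            return t
    return times[-1]  # p exceeds the largest estimated CDF value


def sample_clayton_pair(alpha, rng):
    """Draw (u, v) from a Clayton copula (alpha > 0) by conditional inversion."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    w = 1.0 - rng.random()
    v = ((w ** (-alpha / (1.0 + alpha)) - 1.0) * u ** (-alpha) + 1.0) ** (-1.0 / alpha)
    return u, v


def bootstrap_right_truncated(n, times_x, cdf_x, times_y, cdf_y, alpha, rng,
                              max_draws=100000):
    """Generate n bootstrap pairs (x, y), keeping only draws with x < y."""
    sample = []
    for _ in range(max_draws):
        u, v = sample_clayton_pair(alpha, rng)
        x = inverse_step_cdf(times_x, cdf_x, u)
        y = inverse_step_cdf(times_y, cdf_y, v)
        if x < y:  # truncation condition
            sample.append((x, y))
            if len(sample) == n:
                return sample
    raise RuntimeError("too few draws satisfied x < y")


# Toy step functions standing in for the estimated margins of one group.
times_x, times_y = [1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]
cdf = [0.25, 0.5, 0.75, 1.0]
boot = bootstrap_right_truncated(50, times_x, cdf, times_y, cdf, 2.0, random.Random(0))
print(len(boot), boot[:3])
```

Re-estimating θ on each of B such samples and taking the standard deviation of the B estimates gives the bootstrap standard deviation described above.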

4. Simulation Studies

In this section, we examine the performance of the proposed estimation procedures via simulations. We generate samples of size n = 100 (or 200) for each group, using the Clayton, Gumbel, and Frank copulas for the dependence between X and Y. Below, the left-truncated and right-censored data and the right-truncated data are discussed separately.

When Y is the variable of interest, define h(Y) = \log(Y) for the PH model, h(Y) = \log(e^{Y} - 1) for the PO model, and w_{0}(.) = W_{kj}(.) = 1, and generate C from Uniform(0,10). For the Clayton and Gumbel copulas with τ = -0.05 and the Frank copula with τ = 0.3, 0.5, we generate e^{ε} from exp(1) and X from exp(1) under the PH model, and ε from the standard logistic distribution and X from exp(1) under the PO model. For the Frank copula with τ = 0.7, we generate e^{ε} from exp(1) and X from exp(0.7) under the PH model, and ε from the standard logistic distribution and X from exp(0.7) under the PO model.

When X is the variable of interest, define h(X) = \log(X) for the PH model, h(X) = \log(e^{X} - 1) for the PO model, and w_{0}(.) = W_{kj}(.) = 1. We set Kendall’s tau τ = -0.05 for the Clayton and Gumbel copulas, and τ = 0.3, 0.5, 0.7 for the Frank copula. Then, we generate e^{ε} from exp(1) and Y from exp(1) for the PH model, and ε from the standard logistic distribution and Y from exp(1) for the PO model.

For the regression model, we consider two cases for T^{*}, the event time of interest (X or Y). In Case 1, we consider the regression model with Z = 0, 1,

Case 1: h(T^{*}) = θ_{0} Z + ε,

where θ_{0} = -0.3 for the PH model and θ_{0} = 0.3 for the PO model. In Case 2, we consider the regression model

Case 2: h(T^{*}) = θ_{10} Z_{1} + θ_{20} Z_{2} + ε,

where (Z_{1}, Z_{2}) = (0, 0) for group 1, (Z_{1}, Z_{2}) = (1, 0) for group 2, (Z_{1}, Z_{2}) = (0, 1) for group 3, and (θ_{10}, θ_{20}) = (-0.3, -0.3) for the PH model and (θ_{10}, θ_{20}) = (0.3, 0.3) for the PO model. With 500 simulation runs and 100 bootstrap replications per run, we report five indices of the simulation results: the bias of the proposed method (Bias), the empirical standard deviation (EmpSD), the average standard deviation based on the bootstrapping method (AveSD), the mean squared error (MSE), and the coverage probability (CP) of the 95% confidence interval.
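The five reported indices can be computed from the per-run estimates and bootstrap standard deviations as in the Python sketch below. The interval used for the coverage probability is assumed to be the normal-approximation interval \hat{θ} ± 1.96 SD, which the article does not state explicitly; the function name and the toy inputs are hypothetical.

```python
import statistics


def summarize_simulation(estimates, bootstrap_sds, theta0, z=1.96):
    """Return Bias, EmpSD, AveSD, MSE, and CP for one simulation setting.

    estimates: theta-hat from each simulation run
    bootstrap_sds: bootstrap standard deviation from each run
    theta0: true parameter value
    """
    runs = len(estimates)
    bias = sum(estimates) / runs - theta0
    emp_sd = statistics.stdev(estimates)        # spread of theta-hat across runs
    ave_sd = sum(bootstrap_sds) / runs          # average bootstrap SD
    mse = sum((e - theta0) ** 2 for e in estimates) / runs
    # Assumed interval: theta-hat +/- z * bootstrap SD.
    covered = sum(1 for e, s in zip(estimates, bootstrap_sds)
                  if abs(e - theta0) <= z * s)
    return {"Bias": bias, "EmpSD": emp_sd, "AveSD": ave_sd,
            "MSE": mse, "CP": covered / runs}


# Toy inputs for illustration only, not simulation output from the article.
summary = summarize_simulation([-0.2, -0.3, -0.4, -0.3],
                               [0.05, 0.10, 0.04, 0.10], theta0=-0.3)
print(summary)
```

Comparing EmpSD with AveSD is the usual check of whether the bootstrap standard deviation tracks the actual sampling variability of the estimator.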

Tables 1–4 show the results of Case 1 and Case 2 when the variable of interest is Y, and Tables 5–8 show the simulation results when the variable of interest is X. According to Tables 1–4, under left-truncated and right-censored data, the proposed methods perform well, and the standard deviation and the mean squared error of Method 1 are smaller than those of Method 2 in most situations. From Tables 5–8, under right-truncated data, the proposed methods also perform well, and the standard deviation and the mean squared error of Method 2 are smaller than those of Method 1 under (i) the Clayton and Gumbel copulas with the PH and the PO models and (ii) the Frank copula with the PO model. However, the standard deviation and the mean squared error of Method 1 are smaller than those of Method 2 under the Frank copula with the PH model. Moreover, across the tables, the standard deviation and the mean squared error decrease as the correlation increases. The coverage probability (CP) of the 95% confidence interval is close to 0.95.

5. Data Analysis

In this section, we analyze two real datasets [16,17] with the proposed methods. Ref. [16] presented a retirement center dataset from the Channing House retirement community in Palo Alto, California, which includes the age at death (i.e., the failure time), the age at admission to the community (i.e., the truncation time), and the age at leaving the community or at the termination of the study (i.e., the censoring time). In this dataset, the failure time is the variable of interest, and the data are an example of left-truncated and right-censored data. The data include 462 observations: 97 males (46 deceased and 51 censored) and 365 females (130 deceased and 235 censored). All observations in this dataset were obtained through the health care program at the center. The residents were granted easy access to medical care without any additional financial burden. For these data, we divide the observations into two groups by gender. We rescale time so that one unit equals 10 months for the truncation time X and the length of the lifetime Y. Define Z = 0 for females and Z = 1 for males. We employ the PH model and the PO model to study the relationship between the failure time and Z. The first plot of Figure 1 shows the survival curves of the failure time with the Frank copula in the female and male groups.

The second dataset was introduced by [17], who studied patients who developed AIDS through contaminated blood transfusions up until 1 July 1986. The data include the infection time Q, the induction time X, and age in years. Define Y = 102 - Q as the time between the infection onset and the termination of the study. Individuals were observed only when X < Y. In this dataset, the induction time is the variable of interest, and the data are an example of right-truncated data. The dataset includes 293 observations, 34 observations aged 0–4 years as child patients, 119 observations aged 5–59 years as adult patients, and 141 observations aged 60 and over as elderly patients. Similar to [9], we divide the observations into three groups by age. Define (Z_{1}, Z_{2}) = (0, 0) for child, (Z_{1}, Z_{2}) = (1, 0) for adult, and (Z_{1}, Z_{2}) = (0, 1) for elderly patients. We use the PH model and the PO model to investigate the relationship between the induction time and (Z_{1}, Z_{2}). We rescale time so that one unit equals 10 months for the induction time X and for Y. The second plot of Figure 1 shows the survival curves of the induction time with the Frank copula in the three age groups. Ref. [9] claimed that X and Y were not quasi-independent. Hence, we analyze the data with the Frank copula model [13] to specify the relationship between X and Y.

In the retirement center dataset, \hat{τ} is 0.39 for the male group and 0.09 for the female group. In the transfusion-related AIDS dataset, \hat{τ} is 0.13, 0.34, and 0.40 for the child group, the adult group, and the elderly group, respectively. With 1000 bootstrap replications, Table 9 and Table 10 show the estimate of θ (\hat{θ}), the standard deviation of \hat{θ} (SD), the 95% confidence interval of θ (95% C.I.), the D_{R} statistic (DR), and the p-value of D_{R} (PV), where D_{R} is the model selection statistic from [14]. According to these tables, all the p-values of D_{R} are larger than 0.05. That is, the PH model and the PO model are both adequate for the retirement center dataset and the transfusion-related AIDS dataset. Additionally, the better-fitting model for each dataset can be chosen as the one with the smaller DR. The PH model is the better-fitting model for the retirement center dataset, and the PO model is the better-fitting model for the transfusion-related AIDS dataset. In the retirement center dataset, the 95% confidence interval contains 0, which means that the difference in the lifetime between males and females is not significant. In the transfusion-related AIDS dataset, the differences in the induction time between (i) child and adult patients and (ii) child and elderly patients are both significant, but the difference in the induction time between adult and elderly patients is not significant. Next, we give some examples to explain the covariate effect based on \hat{θ}. In the retirement center data, from Table 9 with Method 1 and the PH model, \widehat{HR} = e^{0.3211} = 1.3786, which means that the hazard of death in males is 1.3786 times that in females. In the transfusion-related AIDS data, from Table 10 with Method 1 and the PO model, \hat{θ} = (-1.9358, -2.2260, 0.2902), which means that the failure odds of AIDS onset in adults are 0.1443 times those in children, the failure odds of AIDS onset in elderly patients are 0.1079 times those in children, and the failure odds of AIDS onset in adults are 1.3367 times those in elderly patients.
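The ratios quoted above follow from exponentiating the reported estimates, as the short Python check below illustrates. The variable names are hypothetical, and the inputs are the rounded estimates quoted in the text, so the results are expected to agree with the reported ratios only to about three decimal places.

```python
import math

# Rounded estimates quoted in the text (Method 1).
theta_hat_ph = 0.3211                              # retirement center data, PH model
theta_adult, theta_elderly = -1.9358, -2.2260      # AIDS data, PO model

hazard_ratio = math.exp(theta_hat_ph)                        # males vs. females
or_adult_vs_child = math.exp(theta_adult)                    # adults vs. children
or_elderly_vs_child = math.exp(theta_elderly)                # elderly vs. children
or_adult_vs_elderly = math.exp(theta_adult - theta_elderly)  # adults vs. elderly

print(round(hazard_ratio, 3), round(or_adult_vs_child, 3),
      round(or_elderly_vs_child, 3), round(or_adult_vs_elderly, 3))
```

The third reported component, 0.2902, equals the difference between the first two, which is why it yields the adult versus elderly odds ratio.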

6. Conclusions

This paper studies a general regression model, which includes the PH model and the PO model, under dependent truncated data, and applies the copula model to relax the quasi-independence assumption between the truncation time and the failure time. Based on [13], we obtain the estimators of F_{X}, S_{Y}, and α. Two proposed methods (the application of the area between two survival curves and the minimization of the norm distance between two survival curves) are used to estimate the regression parameter θ for dependent truncated data. The simulations show that the suggested approaches perform well, and they allow the two methods to be compared under various settings. When Y is the variable of interest, Method 1 is more appropriate in most situations. When X is the variable of interest, Method 1 is more appropriate under the Frank copula with the PH model. In contrast, Method 2 is more appropriate under (i) the Clayton and Gumbel copulas with the PH and the PO models and (ii) the Frank copula with the PO model. Finally, we analyze two real datasets (the retirement center dataset and the transfusion-related AIDS dataset). In the retirement center dataset, we find that the hazard of death in males is higher than in females, but the difference is not significant, which agrees with the analysis by [20]. In the transfusion-related AIDS dataset, we find that the child patients have shorter induction periods than adult and elderly patients, and that the differences in the induction time between (i) child and adult patients and (ii) child and elderly patients are both significant, which agrees with the analysis by [13]. The advantage of the proposed method is that it can be applied to two kinds of data, right-truncated data and left-truncated and right-censored data, under a general regression model that includes the PH model and the PO model.

Author Contributions

Conceptualization, J.-J.H.; methodology, J.-J.H.; software, J.-J.H. and Y.-J.C.; validation, J.-J.H. and Y.-J.C.; formal analysis, J.-J.H. and Y.-J.C.; investigation, J.-J.H.; resources, J.-J.H.; data curation, J.-J.H. and Y.-J.C.; writing—original draft preparation, J.-J.H. and Y.-J.C.; writing—review and editing, J.-J.H.; visualization, J.-J.H. and Y.-J.C.; supervision, J.-J.H.; project administration, J.-J.H.; funding acquisition, J.-J.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Science and Technology Council of Taiwan, grant number: MOST 110-2118-M-194-002-MY2.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Turnbull, B.W. The empirical distribution function with arbitrarily grouped, censored and truncated data. J. R. Stat. Soc. Ser. B 1976, 38, 290–295.
2. Efron, B.; Petrosian, V. Survival analysis of the gamma-ray burst data. J. Am. Stat. Assoc. 1994, 89, 452–462.
3. Lagakos, S.W.; Barraj, L.M.; Gruttola, V.D. Nonparametric analysis of truncated survival data, with application to AIDS. Biometrika 1988, 75, 515–523.
4. Bhattacharya, P.K.; Chernoff, H.; Yang, S.S. Nonparametric estimation of the slope of a truncated regression. Ann. Stat. 1983, 11, 505–514.
5. Tsui, K.L.; Jewell, N.P.; Wu, C.F.J. A nonparametric approach to the truncated regression problem. J. Am. Stat. Assoc. 1988, 83, 785–792.
6. Alioum, A.; Commenges, D. A proportional hazards model for arbitrarily censored and truncated data. Biometrics 1996, 52, 512–524.
7. Finkelstein, D.M.; Moore, D.F.; Schoenfeld, D.A. A proportional hazards model for truncated AIDS data. Biometrics 1993, 49, 731–740.
8. Kim, M.; Paik, M.C.; Jang, J.; Cheung, Y.K.; Willey, J.; Elkind, S.V.; Sacco, R.L. Cox proportional hazards models with left truncation and time-varying coefficient: Application of age at event as outcome in cohort studies. Biom. J. 2017, 59, 405–419.
9. Tsai, W.Y. Testing the assumption of independence of truncation time and failure time. Biometrika 1990, 77, 169–177.
10. Emura, T.; Wang, W. Semiparametric inference for an accelerated failure time model with dependent truncation. Ann. Inst. Stat. Math. 2016, 68, 1073–1094.
11. Rennert, L.; Xie, S.X. Cox regression model under dependent truncation. Biometrics 2022, 78, 460–473.
12. Shen, P.S.; Hsu, H. Conditional maximum likelihood estimation for semiparametric transformation models with doubly truncated data. Comput. Stat. Data Anal. 2020, 144, 106862.
13. Chaieb, L.L.; Rivest, L.P.; Abdous, B. Estimating survival under a dependent truncation. Biometrika 2006, 93, 655–669.
14. Hsieh, J.J.; Wang, W.; Ding, A. Regression analysis based on semi-competing risks data. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 2008, 70, 3–20.
15. Hsieh, J.J.; Lai, Y.H. Proportional hazard model and proportional odds model under dependent current status data. Master’s Thesis, National Chung Cheng University, Chia-Yi, Taiwan, 2019.
16. Hyde, J. Testing survival under right-censoring and left-truncation. Biometrika 1977, 64, 225–230.
17. Kalbfleisch, J.D.; Lawless, J.F. Inference based on retrospective ascertainment: An analysis of the data on transfusion-related AIDS. J. Am. Stat. Assoc. 1989, 84, 360–372.
18. Genest, C.; Rivest, L.P. Statistical inference procedures for bivariate Archimedean copulas. J. Am. Stat. Assoc. 1993, 88, 1034–1043.
19. Lai, T.L.; Ying, Z. Estimating a distribution function with truncated and censored data. Ann. Stat. 1991, 19, 417–442.
20. Shen, P.S. A class of rank-based test for left-truncated and right-censored data. Ann. Inst. Stat. Math. 2009, 61, 461–476.

Figure 1. The survival curves for the retirement center data and AIDS data.

Table 1. The estimators of $θ$ under Case 1 with $τ = -0.05$ when the variable of interest is Y.

			Method 1					Method 2
$n_{Z}$	Copula	Model	Bias	EmpSd	AveSd	MSE	CP	Bias	EmpSd	AveSd	MSE	CP
100	Clayton	PH	−0.0323	0.5006	0.4864	0.2516	0.954	−0.0261	0.5694	0.5568	0.3249	0.952
		PO	0.0257	0.6575	0.7026	0.4329	0.948	0.0174	0.6785	0.7195	0.4607	0.944
200	Clayton	PH	−0.0276	0.2804	0.2882	0.0794	0.946	−0.0252	0.3196	0.3340	0.1028	0.952
		PO	−0.0150	0.4551	0.5097	0.2073	0.972	−0.0172	0.4702	0.5218	0.2214	0.960
100	Gumbel	PH	−0.0297	0.2818	0.2960	0.0803	0.968	−0.0356	0.3564	0.3726	0.1283	0.964
		PO	−0.0293	0.4646	0.4663	0.2167	0.938	−0.0390	0.5089	0.52463	0.2605	0.948
200	Gumbel	PH	−0.0195	0.2094	0.2013	0.0442	0.950	−0.0264	0.2562	0.2507	0.0663	0.948
		PO	−0.0059	0.3497	0.3423	0.1223	0.950	−0.0029	0.3787	0.3796	0.1434	0.968

Table 2. The estimators of $θ$ under Case 1 when the variable of interest is Y.

				Method 1					Method 2
$n_{Z}$	Copula	$τ$	Model	Bias	EmpSd	AveSd	MSE	CP	Bias	EmpSd	AveSd	MSE	CP
100	Frank	0.3	PH	0.0268	0.3638	0.3293	0.1330	0.932	0.0326	0.4388	0.4021	0.1936	0.926
			PO	−0.0520	0.6403	0.5925	0.4127	0.930	−0.0528	0.6542	0.6024	0.4308	0.930
		0.5	PH	0.0318	0.2944	0.2997	0.0877	0.950	0.0352	0.3392	0.3513	0.1163	0.956
			PO	−0.0203	0.5456	0.5455	0.2981	0.952	−0.0232	0.5226	0.5214	0.2737	0.954
		0.7	PH	−0.0353	0.2007	0.2318	0.0415	0.970	−0.0355	0.2219	0.2589	0.0505	0.980
			PO	0.0418	0.3286	0.3544	0.1097	0.968	0.0324	0.2879	0.3156	0.0839	0.974
200	Frank	0.3	PH	0.0185	0.2796	0.2692	0.0785	0.938	0.0200	0.3434	0.3268	0.1183	0.942
			PO	0.0390	0.5467	0.4942	0.3004	0.932	0.0418	0.5544	0.5023	0.3091	0.936
		0.5	PH	−0.0011	0.2285	0.2466	0.0522	0.960	0.0005	0.2608	0.2858	0.0680	0.964
			PO	0.0187	0.4188	0.4099	0.1757	0.940	0.0154	0.3956	0.3881	0.1567	0.946
		0.7	PH	−0.0264	0.1493	0.1626	0.0230	0.966	−0.0217	0.1661	0.1789	0.0281	0.966
			PO	0.0324	0.2133	0.2370	0.0466	0.972	0.0260	0.1858	0.2042	0.0352	0.968

Table 3. The estimators of $θ$ under Case 2 with $τ = -0.05$ when the variable of interest is Y.

				Method 1					Method 2
$n_{Z}$	Copula	Model	$θ$	Bias	EmpSd	AveSd	MSE	CP	Bias	EmpSd	AveSd	MSE	CP
100	Clayton	PH	$θ_{1}$	−0.0226	0.4171	0.3938	0.1745	0.936	−0.0141	0.4803	0.4558	0.2309	0.940
			$θ_{2}$	−0.0183	0.4027	0.3932	0.1625	0.936	−0.0053	0.4764	0.4576	0.2269	0.934
		PO	$θ_{1}$	−0.0414	0.8127	0.7505	0.6623	0.954	−0.0356	0.8131	0.7544	0.6624	0.950
			$θ_{2}$	−0.0028	0.7608	0.7360	0.5788	0.954	−0.0021	0.7749	0.7420	0.6004	0.946
200	Clayton	PH	$θ_{1}$	−0.0249	0.3108	0.2855	0.0972	0.924	−0.0266	0.3557	0.3303	0.1272	0.926
			$θ_{2}$	−0.0303	0.3069	0.2842	0.0951	0.922	−0.0232	0.3504	0.3273	0.1233	0.926
		PO	$θ_{1}$	0.0005	0.4345	0.4516	0.1888	0.942	0.0061	0.4484	0.4596	0.2011	0.934
			$θ_{2}$	−0.0083	0.4685	0.4459	0.2196	0.938	−0.0023	0.4861	0.4559	0.2363	0.932
100	Gumbel	PH	$θ_{1}$	−0.0208	0.3037	0.2725	0.0927	0.912	−0.0284	0.3723	0.3368	0.1394	0.908
			$θ_{2}$	−0.0508	0.3005	0.2803	0.0929	0.932	−0.0578	0.3693	0.3453	0.1397	0.930
		PO	$θ_{1}$	−0.0112	0.5248	0.4581	0.2755	0.944	−0.0014	0.5666	0.4924	0.3210	0.930
			$θ_{2}$	−0.0332	0.5089	0.4691	0.2601	0.940	−0.0219	0.5511	0.5065	0.3041	0.946
200	Gumbel	PH	$θ_{1}$	−0.0255	0.2216	0.1968	0.0497	0.926	−0.0359	0.2705	0.2431	0.0744	0.922
			$θ_{2}$	−0.0316	0.2181	0.2026	0.0485	0.950	−0.0354	0.2646	0.2501	0.0713	0.950
		PO	$θ_{1}$	0.0038	0.3387	0.3220	0.1147	0.944	0.0068	0.3616	0.3471	0.1308	0.940
			$θ_{2}$	−0.0116	0.3315	0.3242	0.1100	0.942	−0.0047	0.3635	0.3501	0.1321	0.944

Table 4. The estimators of $θ$ under Case 2 when the variable of interest is Y.

					Method 1					Method 2
$n_{Z}$	Copula	$τ$	Model	$θ$	Bias	EmpSd	AveSd	MSE	CP	Bias	EmpSd	AveSd	MSE	CP
100	Frank	0.3	PH	$θ_{1}$	−0.0053	0.3374	0.3128	0.1139	0.934	−0.0128	0.4117	0.3778	0.1697	0.920
				$θ_{2}$	−0.0316	0.3553	0.3303	0.1273	0.928	−0.0378	0.4325	0.3974	0.1885	0.922
			PO	$θ_{1}$	0.0028	0.6187	0.5897	0.3828	0.938	−0.0001	0.6247	0.5961	0.3902	0.940
				$θ_{2}$	0.0081	0.6018	0.5953	0.3622	0.948	0.0047	0.6222	0.6027	0.3872	0.950
		0.5	PH	$θ_{1}$	0.0253	0.3035	0.2893	0.0927	0.934	0.0314	0.3556	0.3414	0.1274	0.926
				$θ_{2}$	0.0154	0.3162	0.2972	0.1002	0.928	0.0225	0.3708	0.3484	0.1380	0.934
			PO	$θ_{1}$	−0.0125	0.5091	0.5391	0.2593	0.956	−0.0227	0.4951	0.5116	0.2457	0.952
				$θ_{2}$	0.0378	0.5161	0.5305	0.2677	0.948	0.0275	0.4890	0.5026	0.2399	0.952
		0.7	PH	$θ_{1}$	−0.0066	0.2002	0.2112	0.0401	0.968	−0.0061	0.2263	0.2372	0.0513	0.962
				$θ_{2}$	−0.0315	0.2008	0.2143	0.0413	0.956	−0.0289	0.2178	0.2390	0.0483	0.970
			PO	$θ_{1}$	0.0319	0.3202	0.3532	0.1035	0.964	0.0263	0.2799	0.3109	0.0790	0.964
				$θ_{2}$	0.0263	0.3248	0.3520	0.1062	0.974	0.0181	0.2875	0.3106	0.0830	0.972
200	Frank	0.3	PH	$θ_{1}$	0.0067	0.3022	0.2680	0.0914	0.936	0.0055	0.3685	0.3275	0.1358	0.934
				$θ_{2}$	−0.0126	0.2889	0.2821	0.0836	0.938	−0.0127	0.3514	0.3420	0.1236	0.944
			PO	$θ_{1}$	−0.0073	0.5046	0.4683	0.2547	0.930	−0.0001	0.5130	0.4746	0.2632	0.924
				$θ_{2}$	0.0117	0.5025	0.4675	0.2527	0.942	0.0205	0.5060	0.4747	0.2565	0.936
		0.5	PH	$θ_{1}$	−0.0028	0.2629	0.2427	0.0691	0.924	−0.0061	0.3093	0.2840	0.0957	0.918
				$θ_{2}$	0.0056	0.2570	0.2467	0.0661	0.926	0.0100	0.3011	0.2862	0.0908	0.928
			PO	$θ_{1}$	0.0181	0.4210	0.4274	0.1776	0.956	0.0158	0.3943	0.4027	0.1557	0.960
				$θ_{2}$	0.0376	0.4061	0.4219	0.1663	0.942	0.0358	0.3764	0.3975	0.1430	0.958
		0.7	PH	$θ_{1}$	−0.0171	0.1461	0.1498	0.0216	0.964	−0.0198	0.1608	0.1671	0.0263	0.960
				$θ_{2}$	−0.0186	0.1433	0.1507	0.0209	0.966	−0.0162	0.1560	0.1660	0.0246	0.966
			PO	$θ_{1}$	0.0254	0.2164	0.2359	0.0475	0.976	0.0273	0.1872	0.2022	0.0358	0.972
				$θ_{2}$	0.0234	0.2345	0.2361	0.0555	0.962	0.0184	0.1942	0.2022	0.0380	0.972

Table 5. The estimators of $θ$ under Case 1 with $τ = -0.05$ when the variable of interest is X.

			Method 1					Method 2
$n_{Z}$	Copula	Model	Bias	EmpSd	AveSd	MSE	CP	Bias	EmpSd	AveSd	MSE	CP
100	Clayton	PH	0.0382	0.4689	0.5291	0.2213	0.986	0.0079	0.4659	0.5295	0.2171	0.984
		PO	0.0372	0.9020	0.9787	0.8149	0.968	0.0267	0.7170	0.8316	0.5148	0.972
200	Clayton	PH	0.0194	0.2902	0.2891	0.0846	0.956	−0.0031	0.2557	0.2665	0.0654	0.956
		PO	0.0259	0.5889	0.5695	0.3475	0.930	0.0147	0.4225	0.4139	0.1787	0.940
100	Gumbel	PH	0.0457	0.4382	0.4152	0.1941	0.950	0.0239	0.3944	0.3892	0.1561	0.950
		PO	0.0480	0.8515	0.7571	0.7273	0.934	0.0146	0.5982	0.5819	0.3581	0.958
200	Gumbel	PH	0.0536	0.3378	0.3029	0.1170	0.952	0.0261	0.2859	0.2699	0.0824	0.936
		PO	0.0113	0.6745	0.6106	0.4551	0.932	−0.0106	0.4164	0.4199	0.1735	0.960

Table 6. The estimators of $θ$ under Case 1 when the variable of interest is X.

				Method 1					Method 2
$n_{Z}$	Copula	$τ$	Model	Bias	EmpSd	AveSd	MSE	CP	Bias	EmpSd	AveSd	MSE	CP
100	Frank	0.3	PH	0.0331	0.4416	0.4570	0.1961	0.950	0.0179	0.4779	0.4874	0.2287	0.948
			PO	−0.0249	0.7691	0.8107	0.5922	0.948	−0.0349	0.7033	0.7170	0.4959	0.946
		0.5	PH	−0.0063	0.3680	0.4048	0.1355	0.976	−0.0397	0.4215	0.4567	0.1793	0.970
			PO	−0.0014	0.6859	0.7133	0.4705	0.966	−0.0064	0.6465	0.6621	0.4180	0.964
		0.7	PH	0.0016	0.2539	0.3195	0.0644	0.982	−0.0587	0.2892	0.3811	0.0871	0.982
			PO	0.0036	0.4650	0.5175	0.2163	0.972	−0.0005	0.4304	0.4919	0.1853	0.974
200	Frank	0.3	PH	0.0253	0.3209	0.3309	0.1036	0.958	0.0031	0.3431	0.3553	0.1177	0.950
			PO	−0.0006	0.6926	0.6527	0.4797	0.934	−0.0176	0.5938	0.5633	0.3530	0.926
		0.5	PH	0.0138	0.2236	0.2197	0.0502	0.934	−0.0115	0.2354	0.2344	0.0556	0.930
			PO	−0.0058	0.5275	0.5259	0.2783	0.938	0.0070	0.4916	0.4753	0.2417	0.938
		0.7	PH	−0.0006	0.1581	0.1895	0.0250	0.980	−0.0305	0.1672	0.2053	0.0289	0.972
			PO	0.0060	0.3061	0.2943	0.0938	0.930	0.0143	0.2640	0.2663	0.0699	0.928

Table 7. The estimators of $θ$ under Case 2 with $τ = -0.05$ when the variable of interest is X.

				Method 1					Method 2
$n_{Z}$	Copula	Model	$θ$	Bias	EmpSd	AveSd	MSE	CP	Bias	EmpSd	AveSd	MSE	CP
100	Clayton	PH	$θ_{1}$	0.0396	0.4900	0.5181	0.2417	0.986	0.0080	0.4824	0.5225	0.2328	0.988
			$θ_{2}$	0.0383	0.4941	0.5204	0.2456	0.974	0.0114	0.4931	0.5187	0.2432	0.974
		PO	$θ_{1}$	−0.0181	0.9071	0.9843	0.8231	0.968	−0.0186	0.6848	0.8196	0.4693	0.984
			$θ_{2}$	0.0140	0.9838	1.0189	0.9680	0.968	0.0061	0.7876	0.8355	0.6203	0.980
200	Clayton	PH	$θ_{1}$	0.0358	0.3247	0.3144	0.1067	0.956	0.0041	0.2893	0.2887	0.0837	0.952
			$θ_{2}$	0.0290	0.3192	0.3114	0.1027	0.940	0.0028	0.2922	0.2885	0.0854	0.938
		PO	$θ_{1}$	−0.0069	0.6139	0.5803	0.3769	0.940	0.0027	0.4234	0.4186	0.1793	0.944
			$θ_{2}$	−0.0257	0.6227	0.5764	0.3884	0.922	−0.0314	0.4180	0.4144	0.1757	0.930
100	Gumbel	PH	$θ_{1}$	0.0136	0.4293	0.4003	0.1845	0.932	−0.0104	0.3868	0.3749	0.1497	0.940
			$θ_{2}$	0.0429	0.4388	0.4090	0.1944	0.936	0.0133	0.3938	0.3821	0.1552	0.946
		PO	$θ_{1}$	−0.0274	0.8631	0.7659	0.7457	0.920	−0.0197	0.5823	0.5648	0.3395	0.942
			$θ_{2}$	0.0254	0.8361	0.7653	0.6997	0.942	0.0097	0.5812	0.5842	0.3379	0.964
200	Gumbel	PH	$θ_{1}$	0.0105	0.3103	0.2964	0.0964	0.956	−0.0070	0.2611	0.2655	0.0682	0.960
			$θ_{2}$	0.0033	0.2987	0.2926	0.0892	0.962	−0.0126	0.2511	0.2633	0.0632	0.958
		PO	$θ_{1}$	0.0035	0.7022	0.6143	0.4931	0.940	0.0023	0.4135	0.4072	0.1710	0.952
			$θ_{2}$	0.0396	0.7082	0.6343	0.5031	0.940	0.0049	0.4238	0.4273	0.1797	0.962

Table 8. The estimators of $θ$ under Case 2 when the variable of interest is X.

					Method 1					Method 2
$n_{Z}$	Copula	$τ$	Model	$θ$	Bias	EmpSd	AveSd	MSE	CP	Bias	EmpSd	AveSd	MSE	CP
100	Frank	0.3	PH	$θ_{1}$	0.0220	0.4441	0.4629	0.1977	0.950	−0.0104	0.4907	0.4964	0.2409	0.950
				$θ_{2}$	0.0037	0.4408	0.4589	0.1943	0.954	−0.0261	0.4749	0.4946	0.2262	0.956
			PO	$θ_{1}$	−0.0112	0.8613	0.8231	0.7420	0.948	−0.0197	0.7354	0.7079	0.5412	0.944
				$θ_{2}$	0.0080	0.8313	0.8118	0.6911	0.952	−0.0015	0.7363	0.7093	0.5421	0.940
		0.5	PH	$θ_{1}$	−0.0078	0.3374	0.3986	0.1139	0.974	−0.0382	0.3856	0.4567	0.1501	0.972
				$θ_{2}$	−0.0176	0.3460	0.3982	0.1201	0.974	−0.0502	0.3854	0.4500	0.1511	0.970
			PO	$θ_{1}$	−0.0484	0.7556	0.7170	0.5733	0.930	−0.0165	0.7168	0.6674	0.5141	0.940
				$θ_{2}$	−0.0216	0.7357	0.7159	0.5418	0.938	−0.0212	0.6886	0.6607	0.4746	0.940
		0.7	PH	$θ_{1}$	−0.0093	0.2472	0.3085	0.0612	0.974	−0.0541	0.2935	0.3706	0.0891	0.980
				$θ_{2}$	−0.0064	0.2447	0.3103	0.0599	0.980	−0.0577	0.2891	0.3684	0.0869	0.984
			PO	$θ_{1}$	−0.0490	0.4702	0.5188	0.2234	0.966	−0.0310	0.4429	0.4926	0.1971	0.968
				$θ_{2}$	−0.0505	0.5325	0.5164	0.2861	0.944	−0.0425	0.4922	0.4915	0.2441	0.952
200	Frank	0.3	PH	$θ_{1}$	0.0121	0.3128	0.3122	0.0980	0.958	−0.0042	0.3419	0.3370	0.1169	0.940
				$θ_{2}$	0.0258	0.3091	0.3164	0.0962	0.948	0.0052	0.3300	0.3389	0.1089	0.948
			PO	$θ_{1}$	0.0021	0.7255	0.6964	0.5264	0.950	0.0093	0.6142	0.5952	0.3773	0.938
				$θ_{2}$	−0.0130	0.7226	0.6886	0.5223	0.944	−0.0160	0.6081	0.5898	0.3700	0.946
		0.5	PH	$θ_{1}$	0.0217	0.2080	0.2108	0.0437	0.940	0.0013	0.2284	0.2279	0.0522	0.934
				$θ_{2}$	0.0082	0.2068	0.2104	0.0428	0.938	−0.0162	0.2212	0.2267	0.0492	0.940
			PO	$θ_{1}$	0.0357	0.5554	0.5235	0.3098	0.926	0.0234	0.4832	0.4698	0.2340	0.932
				$θ_{2}$	0.0418	0.5151	0.5183	0.2671	0.948	0.0151	0.4612	0.4637	0.2129	0.940
		0.7	PH	$θ_{1}$	0.0047	0.1633	0.1881	0.0267	0.974	−0.0288	0.1747	0.2007	0.0314	0.964
				$θ_{2}$	0.0012	0.1562	0.1845	0.0244	0.974	−0.0286	0.1722	0.1989	0.0305	0.964
			PO	$θ_{1}$	0.0014	0.3730	0.3817	0.1391	0.942	0.0036	0.3279	0.3436	0.1075	0.948
				$θ_{2}$	0.0159	0.3708	0.3827	0.1377	0.948	0.0119	0.3114	0.3439	0.0971	0.950

Table 9. The analysis for the retirement center data.

		Method 1						Method 2
Copula	Model	$\hat{θ}$	SD	95% C.I. lower	95% C.I. upper	DR	PV	$\hat{θ}$	SD	95% C.I. lower	95% C.I. upper	DR	PV
Frank	PH	0.3211	0.4908	−0.6408	1.2831	0.1651	0.669	0.1749	0.4310	−0.6699	1.0197	0.1651	0.659
	PO	0.5155	0.6694	−0.7965	1.8275	0.1695	0.538	0.3011	0.5881	−0.8516	1.4538	0.1695	0.553

Table 10. The analysis for the AIDS data.

			Method 1						Method 2
Copula	Model	Parameter	$\hat{θ}$	SD	95% C.I. lower	95% C.I. upper	DR	PV	$\hat{θ}$	SD	95% C.I. lower	95% C.I. upper	DR	PV
Frank	PH	$θ_{1}$	−1.0065	0.4244	−1.8383	−0.1746			−1.2322	0.4709	−2.1552	−0.3092
		$θ_{2}$	−1.1941	0.3715	−1.9222	−0.4659	0.2661	0.823	−1.3928	0.4052	−2.1870	−0.5986	0.3488	0.728
		$θ_{1} - θ_{2}$	0.1876	0.4918	−0.7763	1.1515			0.1606	0.5246	−0.8676	1.1888
	PO	$θ_{1}$	−1.9358	0.7267	−3.3601	−0.5115			−2.1772	0.7509	−3.6489	−0.7054
		$θ_{2}$	−2.2260	0.6684	−3.5361	−0.9159	0.2254	0.918	−2.3858	0.7037	−3.7651	−1.0065	0.2628	0.804
		$θ_{1} - θ_{2}$	0.2902	0.7392	−1.1586	1.7390			0.2086	0.7053	−1.1737	1.5909

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Hsieh, J.-J.; Chen, Y.-J. Proportional Hazard Model and Proportional Odds Model under Dependent Truncated Data. Axioms 2022, 11, 521. https://doi.org/10.3390/axioms11100521
