Smoothed Weighted Quantile Regression for Censored Data in Survival Analysis

Cai, Kaida; Liu, Hanwen; Fu, Wenzhi; Zhao, Xin

doi:10.3390/axioms13120831


¹

Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China

²

Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China

³

Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China

⁴

Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing 210096, China

^*

Author to whom correspondence should be addressed.

Axioms 2024, 13(12), 831; https://doi.org/10.3390/axioms13120831

Submission received: 23 October 2024 / Revised: 22 November 2024 / Accepted: 26 November 2024 / Published: 27 November 2024

(This article belongs to the Section Mathematical Analysis)


Abstract:

In this study, we propose a smoothed weighted quantile regression (SWQR), which combines convolution smoothing with a weighted framework to address the computational limitations of existing censored quantile regression methods. By smoothing the non-differentiable quantile regression loss function, SWQR can improve computational efficiency and allow for more stable model estimation in complex datasets. We construct an efficient optimization process based on gradient-based algorithms by introducing weight refinement and iterative parameter estimation methods to minimize the smoothed weighted quantile regression loss function. In the simulation studies, we compare the proposed method with two existing methods: martingale-based quantile regression (MartingaleQR) and weighted quantile regression (WeightedQR). The results emphasize the superior computational efficiency of SWQR, which requires significantly less runtime than the other methods, particularly WeightedQR, especially in settings with large sample sizes. Additionally, SWQR maintains robust performance, achieving competitive accuracy and handling the challenges of right censoring effectively, particularly at higher quantiles. We further illustrate the proposed method using a real dataset on primary biliary cirrhosis, where it exhibits stable coefficient estimates and robust performance across quantile levels with different censoring rates. These findings highlight the potential of SWQR as a flexible and robust method for analyzing censored data in survival analysis, particularly in scenarios where computational efficiency is a key concern.

Keywords:

quantile regression; convolution smoothing; survival analysis; right censoring; gradient-based optimization

MSC:

62G08; 62N02; 62P10; 62H12; 90C53

1. Introduction

Survival analysis, a statistical method used to study the time until the occurrence of an event of interest, often faces the challenge of censoring, which occurs when the exact survival time is unknown [1]. Censoring arises in many real-world applications, such as medical studies where some patients may leave the study early, or their event of interest (e.g., death) may not have occurred by the study’s conclusion [2]. Right-censoring, where only the lower bound of the survival time is observed, is the most common type. As noted by Shedden et al. [3], right-censoring was present in 46.6% of patients in their lung cancer study due to factors like transfer to other hospitals or death unrelated to lung cancer. Understanding and addressing this phenomenon is critical for accurate survival time modeling, particularly in clinical trials where survival times are frequently subject to censoring [4,5].

Traditional methods, such as the Cox proportional hazards model and the accelerated failure time (AFT) model, have been widely employed to estimate the conditional distribution of survival time [6,7,8]. However, both approaches rely on the homogeneity assumption, which posits that covariates affect only the expected survival time but not its distributional shape. While these models are computationally convenient and have strong asymptotic properties, they offer limited insights because they primarily focus on the conditional mean of survival time [1,9]. In many practical applications, such as drug efficacy trials, it is more relevant to focus on the tail behavior of the survival distribution, for instance, on the longest surviving patients [10,11]. This limitation of traditional models has prompted the introduction of quantile regression into survival analysis, where the objective is to model the conditional quantiles of the survival time distribution rather than just its mean [12]. Quantile regression offers a comprehensive and flexible alternative to traditional survival analysis methods, enabling the evaluation of covariate effects across the entire distribution of survival times while maintaining interpretable results, as comprehensively reviewed by Peng [13].

Quantile regression, developed by Koenker and Bassett Jr. [14], provides a method to estimate the conditional quantiles of a response variable based on covariates. Powell [12] extended this method to censored data, specifically focusing on the right-censoring scenario. In this case, the observed response is the minimum of the actual survival time and the censoring time. Powell’s approach [12] provided a way to estimate the quantile regression parameters under right-censoring by modifying the original quantile regression loss function. Subsequent research has expanded upon this work, offering alternative perspectives for handling censored data. One such perspective is the weighted quantile regression approach, as proposed by Portnoy [15] and Wang and Wang [16], which introduces weights to handle different types of censored and uncensored observations. Meanwhile, Peng and Huang [17] approached the problem from a martingale perspective, developing an estimating equation that uses a series of quantile levels to approximate the cumulative effect of censoring on the survival time distribution. However, even with these advances, challenges remain, especially when dealing with censored data and high-dimensional settings [15]. Wang et al. [18] introduced a variable selection method for censored quantile regression, addressing the challenge of identifying significant predictors in survival analysis under censoring. Furthermore, Fei et al. [19] proposed a method for high-dimensional censored quantile regression, enabling improved inference in settings where covariates far exceed the sample size. Another major hurdle is that the traditional quantile regression loss function is non-differentiable, making it difficult to apply gradient-based optimization techniques, which are commonly used in high-dimensional data analysis [14,20]. In response to these challenges, convolution smoothing has emerged as a powerful technique [21,22].
By smoothing the non-differentiable quantile regression loss function, convolution smoothing transforms the optimization problem into a smooth one, allowing for the use of efficient gradient-based methods [20,23,24]. This approach has garnered significant attention in recent years for its ability to handle both computational complexity and non-smooth optimization issues. Fernandes et al. [20] demonstrated the effectiveness of convolution smoothing in quantile regression without censoring, showing that this method could greatly improve convergence rates while maintaining estimation accuracy. Building on this foundation, He et al. [23] and He et al. [24] extended convolution smoothing techniques to high-dimensional data and censored data using martingale-based approaches, significantly enhancing the scalability of quantile regression in survival analysis. Yan et al. [25] further advanced this area by introducing inference techniques for high-dimensional quantile regression, leveraging convolution smoothing and debiasing to enhance estimation accuracy and hypothesis testing.

In this study, we propose a smoothed weighted quantile regression method based on the framework introduced by Portnoy [15]. The primary objective is to develop an approach that efficiently estimates quantile-specific effects in censored survival data by integrating convolution smoothing into a weighted quantile regression framework. This method addresses the non-differentiability of weighted quantile regression loss functions, enabling stable and computationally efficient estimation, particularly for datasets with varying degrees of censoring. By incorporating convolution smoothing into a weighted quantile regression framework, our method not only resolves the optimization challenges posed by non-smooth loss functions but also leverages adaptive weighting to effectively account for censoring mechanisms. One of the key advantages of our method is its ability to balance accuracy and computational feasibility. In contrast to existing techniques, our method offers superior performance in terms of estimation accuracy across a wide range of quantile levels. By introducing convolution smoothing, we ensure that the optimization problem is smooth and differentiable, which facilitates the use of quasi-Newton and other gradient-based algorithms. This leads to faster convergence and more stable estimates, even in complex survival datasets.

The rest of the paper is organized as follows. In Section 2, we present the methodological framework, including the formulation of the smoothed weighted quantile regression. The algorithm and optimization approach are illustrated in Section 3. Section 4.1 describes the simulation study setup and provides a comparative analysis of SWQR against existing methods. The real data application, which demonstrates the practical effectiveness of SWQR in modeling censored survival data, is detailed in Section 4.2. Discussions and conclusions are provided in Section 5.

2. Methodology

In quantile regression, the conditional quantile function of the response variable Y given covariates X is expressed as

F_{Y | X}^{- 1} (τ) = X^{⊤} β (τ),

where

τ \in (0, 1)

is the quantile level, and

β (τ)

represents the quantile-specific regression coefficients. Suppose there are n independent individuals, and let

Y_{i}

and

X_{i} = {(X_{i 1}, \dots, X_{i p})}^{⊤}

denote the response variable and

p \times 1

vector of covariates for the ith individual. For uncensored data, given the observations

(y_{i}, x_{i})

with

i = 1, \dots, n

, the quantile regression coefficients

β (τ)

are estimated by minimizing the following objective function

\hat{β} (τ) = arg min_{β} \frac{1}{n} \sum_{i = 1}^{n} ρ_{τ} (y_{i} - x_{i}^{⊤} β),

where

ρ_{τ} (u)

is defined as

ρ_{τ} (u) = u (τ - I (u \leq 0)),

with

I (u \leq 0)

being the indicator function [26]. This approach works well for uncensored data but cannot handle right-censored observations directly.
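The check function and the uncensored objective above can be sketched in a few lines. This is a minimal illustration assuming NumPy; the function names `check_loss` and `qr_objective` are hypothetical and do not come from the paper.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - I(u <= 0))."""
    u = np.asarray(u, dtype=float)
    # The multiplier is (tau - 1) for u <= 0 and tau for u > 0.
    return np.where(u <= 0, tau - 1.0, tau) * u

def qr_objective(beta, X, y, tau):
    """Average check loss of the residuals y - X @ beta."""
    return np.mean(check_loss(y - X @ beta, tau))

# Intercept-only example: the objective is smaller at the sample median
# than at a neighbouring value when tau = 0.5.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.ones((5, 1))
print(qr_objective(np.array([3.0]), X, y, 0.5),
      qr_objective(np.array([2.0]), X, y, 0.5))
```

For an intercept-only model the minimizer is a sample quantile of the responses, which is a quick way to sanity-check an implementation of the loss.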

This study focuses on estimating the parameters of quantile regression models when the response variable is subject to right-censoring, a common issue in survival analysis and biomedical research. When censoring occurs, we only observe

{\tilde{y}}_{i} = min (y_{i}, C_{i})

, where

C_{i}

is the censoring time, and

δ_{i} = I (y_{i} \leq C_{i})

indicates whether the event time is observed (value 1) or censored (value 0). In the presence of right-censoring, weights are introduced to account for censored observations [15]. To address the censoring, we first partition the quantile range

(0, τ)

into segments as follows

0 < τ_{1} < τ_{2} < \dots < τ_{K} = τ,

which divides the quantile levels into K steps. For each quantile level

τ_{k}

, according to Portnoy [15], the weighted quantile regression estimator is given by

{\hat{β}}_{w} (τ_{k}) = arg min_{β} \frac{1}{n} \sum_{i = 1}^{n} \{w_{i, k} ρ_{τ_{k}} ({\tilde{y}}_{i} - x_{i}^{⊤} β) + (1 - w_{i, k}) ρ_{τ_{k}} (y^{M} - x_{i}^{⊤} β)\},

(1)

where

w_{i, k}

are weights for censored and uncensored observations,

ρ_{τ_{k}} (u)

is the check function for quantile level

τ_{k}

, and

y^{M}

is a large constant. The weights

w_{i, k}

are computed based on the censoring status

w_{i, k} = \{\begin{matrix} 1, & δ_{i} = 1 or (δ_{i} = 0 and {\tilde{y}}_{i} > x_{i}^{⊤} {\hat{β}}_{w} (τ_{k - 1})), \\ \frac{τ_{k} - τ_{l (i)}}{1 - τ_{l (i)}}, & δ_{i} = 0 and {\tilde{y}}_{i} \leq x_{i}^{⊤} {\hat{β}}_{w} (τ_{k - 1}), \end{matrix}

(2)

where

τ_{l (i)}

with

l = 1, \dots, k - 1

and

i = 1, \dots, n

is the smallest quantile level such that

x_{i}^{⊤} {\hat{β}}_{w} (τ_{l}) \geq {\tilde{y}}_{i}

for the ith individual.
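The weighting rule in Equation (2) can be illustrated with a short sketch. It assumes that the fitted values at the previous grid level and the crossing level of each censored observation have already been computed; the function name `censoring_weights` and its argument names are hypothetical.

```python
import numpy as np

def censoring_weights(y_obs, delta, fitted_prev, tau_k, tau_cross):
    """Weights w_{i,k} following Equation (2).

    y_obs       : observed responses (minimum of event and censoring time)
    delta       : 1 if the event is observed, 0 if censored
    fitted_prev : x_i' beta_hat(tau_{k-1}) for each observation
    tau_k       : current quantile level
    tau_cross   : tau_{l(i)}, the first grid level at which the fit reached
                  y_obs; only read for crossed censored observations
    """
    y_obs = np.asarray(y_obs, dtype=float)
    delta = np.asarray(delta)
    fitted_prev = np.asarray(fitted_prev, dtype=float)
    tau_cross = np.asarray(tau_cross, dtype=float)

    # Censored observations already crossed by the previous fit are reweighted.
    crossed = (delta == 0) & (y_obs <= fitted_prev)
    w = np.ones(y_obs.shape[0])
    w[crossed] = (tau_k - tau_cross[crossed]) / (1.0 - tau_cross[crossed])
    return w

# Three observations: uncensored, censored and crossed, censored and not crossed.
w = censoring_weights([1.0, 2.0, 3.0], [1, 0, 0], [5.0, 2.5, 1.0],
                      tau_k=0.5, tau_cross=[np.nan, 0.2, np.nan])
print(w)
```

In Equation (1), the remaining mass 1 - w for a crossed censored observation is assigned to the large constant y^M.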

The function

ρ_{τ} (u)

, known as the check function, is non-smooth at

u = 0

, which introduces challenges in optimization since gradient-based methods struggle with non-differentiable points. This non-smoothness often leads to slow convergence or difficulty in obtaining precise parameter estimates. To address this challenge, inspired by the work of [20], we introduce convolution smoothing as a solution. This technique involves applying a smoothing kernel to the check function, effectively transforming it into a smooth and differentiable function, which allows us to leverage gradient-based optimization methods more efficiently. By smoothing the original check function of Equation (1), the optimization becomes more stable and converges more rapidly. The loss function of the smoothed quantile regression can then be expressed as

\begin{matrix} R_{w s} = & \frac{1}{n} \sum_{i = 1}^{n} \{w_{i, k} \int_{- \infty}^{+ \infty} ρ_{τ_{k}} (u) K_{h} (({\tilde{y}}_{i} - x_{i}^{⊤} β) - u) d u \\ + (1 - w_{i, k}) \int_{- \infty}^{+ \infty} ρ_{τ_{k}} (u) K_{h} ((y^{M} - x_{i}^{⊤} β) - u) d u\}, \end{matrix}

(3)

where

K_{h} (u)

is a kernel function with bandwidth h that controls the level of smoothing. The corresponding regression estimator

{\hat{β}}_{w s} (τ_{k})

can be obtained by minimizing the loss function. In this study, we use the Gaussian kernel, defined as

K_{h} (u) = \frac{1}{h} K (\frac{u}{h}), K (u) = \frac{1}{\sqrt{2 π}} e^{- u^{2} / 2} .

The bandwidth h controls the degree of smoothing: a smaller h results in minimal smoothing, while a larger h produces greater smoothing.
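For the Gaussian kernel, the convolution of the check function with the kernel has a closed form: writing r for the residual, the smoothed loss equals r(τ − Φ(−r/h)) + hφ(r/h), where Φ and φ are the standard normal distribution and density functions. This expression is not stated in the text above; it follows from a standard Gaussian integral and is offered here as a sketch. The helper names are hypothetical.

```python
from math import erf, pi, sqrt
import numpy as np

_erf = np.vectorize(erf)

def norm_cdf(z):
    """Standard normal distribution function, vectorized."""
    return 0.5 * (1.0 + _erf(np.asarray(z, dtype=float) / sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density, vectorized."""
    z = np.asarray(z, dtype=float)
    return np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)

def smoothed_check_loss(r, tau, h):
    """Gaussian-kernel convolution of the check function, evaluated at residual r."""
    r = np.asarray(r, dtype=float)
    return r * (tau - norm_cdf(-r / h)) + h * norm_pdf(r / h)

# With a small bandwidth the smoothed loss is close to the check function away
# from zero; with a larger bandwidth the kink at zero is rounded off.
for h in (0.01, 0.5):
    print(h, smoothed_check_loss(np.array([-1.0, 0.0, 1.0]), 0.3, h))
```

Differentiating this expression with respect to r gives τ − Φ(−r/h), which is a useful check when implementing the gradient.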

The convolution process integrates the check function with the kernel, spreading the residuals

{\tilde{y}}_{i} - x_{i}^{⊤} β

over a range

\int_{- \infty}^{\infty} ρ_{τ_{k}} (u) K_{h} (({\tilde{y}}_{i} - x_{i}^{⊤} β) - u) d u .

This smoothing ensures that the resulting objective function is continuously differentiable. This approach maintains the essential properties of the quantile regression model while enabling efficient computation of gradients and Hessians, facilitating faster and more reliable convergence during the optimization process. The gradient of the smoothed loss function is computed as

\frac{\partial R_{w s}}{\partial β} = \frac{1}{n} \sum_{i = 1}^{n} [w_{i, k} (\int_{- \infty}^{x_{i}^{⊤} β - {\tilde{y}}_{i}} K_{h} (u) d u - τ_{k}) + (1 - w_{i, k}) (\int_{- \infty}^{x_{i}^{⊤} β - y^{M}} K_{h} (u) d u - τ_{k})] x_{i} .

(4)

Similarly, the Hessian matrix, representing the second-order derivatives of the smoothed loss function, is given by

\frac{\partial^{2} R_{w s}}{\partial β \partial β^{⊤}} = \frac{1}{n} \sum_{i = 1}^{n} [w_{i, k} K_{h} (x_{i}^{⊤} β - {\tilde{y}}_{i}) + (1 - w_{i, k}) K_{h} (x_{i}^{⊤} β - y^{M})] x_{i} x_{i}^{⊤} .

(5)
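Equations (3)–(5) can be sketched for the Gaussian kernel as follows. The loss uses the Gaussian closed form r(τ − Φ(−r/h)) + hφ(r/h) for the convolution integral, which is an assumption of this sketch rather than a formula given in the text; the gradient and Hessian follow Equations (4) and (5). All function names and the argument `y_big` (standing in for the large constant y^M) are hypothetical.

```python
from math import erf, pi, sqrt
import numpy as np

_erf = np.vectorize(erf)

def norm_cdf(z):
    return 0.5 * (1.0 + _erf(np.asarray(z, dtype=float) / sqrt(2.0)))

def norm_pdf(z):
    z = np.asarray(z, dtype=float)
    return np.exp(-0.5 * z ** 2) / sqrt(2.0 * pi)

def smoothed_check(r, tau, h):
    # Closed form of the Gaussian convolution of the check function.
    return r * (tau - norm_cdf(-r / h)) + h * norm_pdf(r / h)

def swqr_loss(beta, X, y_obs, w, tau, h, y_big):
    """Smoothed weighted loss, Equation (3)."""
    fit = X @ beta
    return np.mean(w * smoothed_check(y_obs - fit, tau, h)
                   + (1.0 - w) * smoothed_check(y_big - fit, tau, h))

def swqr_grad(beta, X, y_obs, w, tau, h, y_big):
    """Gradient, Equation (4): the kernel integral up to a equals Phi(a / h)."""
    fit = X @ beta
    s = (w * (norm_cdf((fit - y_obs) / h) - tau)
         + (1.0 - w) * (norm_cdf((fit - y_big) / h) - tau))
    return X.T @ s / len(y_obs)

def swqr_hess(beta, X, y_obs, w, tau, h, y_big):
    """Hessian, Equation (5): K_h(a) = phi(a / h) / h."""
    fit = X @ beta
    k = (w * norm_pdf((fit - y_obs) / h)
         + (1.0 - w) * norm_pdf((fit - y_big) / h)) / h
    return (X * k[:, None]).T @ X / len(y_obs)
```

A finite-difference comparison of `swqr_loss` against `swqr_grad`, and of `swqr_grad` against `swqr_hess`, is a simple way to confirm that the three pieces are mutually consistent before using them in an optimizer.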

This smoothed approach not only ensures the differentiability of the loss function but also facilitates the computation of gradients and Hessians, which are essential for efficient optimization using quasi-Newton methods or other gradient-based algorithms. This methodology integrates convolution smoothing and weighted quantile regression to efficiently estimate parameters in the presence of right-censored data. By smoothing the check function, the optimization process is made more efficient while retaining the key properties of quantile regression. Translation tools, including Youdao Dictionary software and Google Translate, were used to assist in grammar correction and language refinement during the preparation of this study.

3. Computational Algorithm

In this study, we employ a line search method combined with a quasi-Newton approach [24]. Specifically, we utilize the Barzilai-Borwein (BB) step size method, first introduced by Barzilai and Borwein [27]. This method has been proven to enhance the efficiency of gradient-based algorithms, particularly in large-scale optimization problems. In this context, the gradient

g^{(t)}

, step direction

d^{(t)}

, and Hessian approximation

H^{(t)}

are computed at each iteration t. We define

δ^{(t)} = β^{(t + 1)} - β^{(t)}

, representing the change in the estimated parameters between iterations. Similarly,

g^{(t)} = d^{(t)} - d^{(t - 1)}

denotes the difference between successive gradients. The method satisfies the secant equation

δ^{(t)} \approx η^{(t)} g^{(t)}

, where the scalar

η^{(t)}

is determined by minimizing the following least squares problem

η^{(t)} = arg min_{η} {∥δ^{(t)} - η g^{(t)}∥}_{2}^{2} .

That is,

η^{(t)}

can be calculated as

η^{(t)} = \frac{〈 δ^{(t)}, g^{(t)} 〉}{〈 g^{(t)}, g^{(t)} 〉} .

The quasi-Newton algorithm (Algorithm 1) with Barzilai-Borwein step size method is as follows.

Algorithm 1: BB-Quasi–Newton Algorithm
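The listing of Algorithm 1 is not reproduced in this text. As a rough illustration only, the sketch below runs a plain gradient iteration whose step size is the Barzilai-Borwein ratio given above, computed from the parameter change and the gradient change between iterations. It is not the authors' algorithm: it omits any line search or safeguard, and the function name `bb_gradient_descent` and its arguments are hypothetical.

```python
import numpy as np

def bb_gradient_descent(grad, beta0, eta0=1.0, max_iter=500, tol=1e-8):
    """Gradient iteration with the step eta = <delta, gdiff> / <gdiff, gdiff>.

    grad  : callable returning the gradient at beta
    beta0 : starting value
    eta0  : initial step size, used until a BB step is available
    """
    beta = np.asarray(beta0, dtype=float).copy()
    g = grad(beta)
    eta = eta0
    for t in range(max_iter):
        beta_new = beta - eta * g
        g_new = grad(beta_new)
        if np.linalg.norm(g_new) < tol:
            return beta_new, t + 1
        delta = beta_new - beta      # change in parameters
        gdiff = g_new - g            # change in gradients
        denom = gdiff @ gdiff
        if denom > 0:
            eta_bb = (delta @ gdiff) / denom
            if eta_bb > 0:           # keep the previous step otherwise
                eta = eta_bb
        beta, g = beta_new, g_new
    return beta, max_iter

# Toy check on a strictly convex quadratic 0.5 * b'Ab - c'b.
A = np.diag([1.0, 3.0])
c = np.array([1.0, 3.0])
beta_hat, n_iter = bb_gradient_descent(lambda b: A @ b - c, np.zeros(2), eta0=0.1)
print(beta_hat, n_iter)
```

In practice the gradient passed in would be that of the smoothed weighted loss in Equation (4), and a safeguard on the step size is advisable because BB iterations are not monotone.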

To solve the optimization problem presented in Equation (3), we incorporate weight refinement and iterative parameter estimation into gradient-based algorithms to minimize the smoothed weighted quantile regression loss function. Compared to traditional linear programming approaches, this significantly improves computational efficiency. The optimization process is based on the BB-Quasi–Newton algorithm above. Assume we have sample observations

{({\tilde{y}}_{i}, δ_{i}, x_{i})}_{i = 1}^{n}

, quantile levels

τ_{1} < τ_{2} < \dots < τ_{K}

, initial value

β^{(0)}

, initial learning rate

η^{(0)}

, maximum number of iterations maxTime, and tolerance tol. The optimization process is as follows.

Step 1: For the first quantile level $τ_{1}$, directly set $w_{i, 1}^{c} = 1$, substitute into Line 17, and calculate the initial estimate ${\hat{β}}_{w s}^{c} (τ_{1})$.
Step 2: Assume we have obtained ${\hat{β}}_{w s}^{c} (τ_{l})$ for $l = 1, \dots, k - 1$. For each $i = 1, 2, \dots, n$, if $x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{k - 1}) \geq {\tilde{y}}_{i}$, calculate the temporary quantile level

$τ_{l (i)}^{t} = \frac{[x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{l (i)}) - {\tilde{y}}_{i}] τ_{l (i)} + [{\tilde{y}}_{i} - x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{l (i) - 1})] τ_{l (i) - 1}}{x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{l (i)}) - x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{l (i) - 1})} .$
Step 3: Use $τ_{l (i)}^{t}$ from Step 2, substitute into the following equation, and calculate the temporary weight $w_{i, k}^{t}$ with $i = 1, \dots, n$,

$w_{i, k}^{t} = \{\begin{matrix} 1, & δ_{i} = 1 or (δ_{i} = 0 and {\tilde{y}}_{i} > x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{k - 1})), \\ \frac{τ_{k} - τ_{l (i)}^{t}}{1 - τ_{l (i)}^{t}}, & δ_{i} = 0 and {\tilde{y}}_{i} \leq x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{k - 1}) . \end{matrix}$
Step 4: Substitute the temporary weight $w_{i, k}^{t}$ from Step 3 into Equation (3), and apply the BB-Quasi–Newton algorithm to compute ${\hat{β}}_{w s}^{t} (τ_{k})$.
Step 5: For observations where $x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{k - 1}) < {\tilde{y}}_{i} < x_{i}^{⊤} {\hat{β}}_{w s}^{t} (τ_{k})$, recompute $τ_{l (i)}^{c}$ using the following formula

$τ_{l (i)}^{c} = \frac{(x_{i}^{⊤} {\hat{β}}_{w s}^{t} (τ_{k}) - {\tilde{y}}_{i}) τ_{k} + ({\tilde{y}}_{i} - x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{k - 1})) τ_{k - 1}}{x_{i}^{⊤} {\hat{β}}_{w s}^{t} (τ_{k}) - x_{i}^{⊤} {\hat{β}}_{w s}^{c} (τ_{k - 1})} .$
Step 6: Substitute the results from Steps 5 and 2 into the following equation to recalculate the weight $w_{i, k}^{c}$,

$w_{i, k}^{c} = \{\begin{matrix} 1, & δ_{i} = 1 or (δ_{i} = 0 and {\tilde{y}}_{i} > x_{i}^{⊤} {\hat{β}}_{w s}^{t} (τ_{k})), \\ \frac{τ_{k} - τ_{l (i)}^{c}}{1 - τ_{l (i)}^{c}}, & δ_{i} = 0 and {\tilde{y}}_{i} \leq x_{i}^{⊤} {\hat{β}}_{w s}^{t} (τ_{k}) . \end{matrix}$
Step 7: Finally, substitute the updated weight $w_{i, k}^{c}$ into the following equation and use the BB-Quasi–Newton algorithm to compute the final estimate ${\hat{β}}_{w s}^{c} (τ_{k})$,

$\begin{matrix} {\hat{β}}_{w s}^{c} (τ_{k}) = arg min_{β} \frac{1}{n} \sum_{i = 1}^{n} \{w_{i, k}^{c} \int_{- \infty}^{+ \infty} ρ_{τ_{k}} (u) K_{h} (({\tilde{y}}_{i} - x_{i}^{⊤} β) - u) d u \\ + (1 - w_{i, k}^{c}) \int_{- \infty}^{+ \infty} ρ_{τ_{k}} (u) K_{h} ((y^{M} - x_{i}^{⊤} β) - u) d u\} . \end{matrix}$

4. Results

4.1. Simulation Study

In this section, we present a detailed simulation study to evaluate the performance of the proposed smoothed weighted quantile regression method for censored data. We compare our approach with two existing methods, including the weighted quantile regression method and martingale-based quantile regression method [15,17]. Portnoy [15] proposed a recursively reweighted estimation procedure for censored quantile regression, extending the Kaplan–Meier estimator to handle more complex regression settings involving censoring. On the other hand, Peng and Huang [17] developed a martingale-based estimating equation, which simplifies quantile regression with survival data by minimizing

L_{1}

-type convex functions, ensuring computational efficiency and robustness. For the comparison methods, the quantreg package in R is used, employing the crq.fit.por and crq.fit.pen functions for the methods of Portnoy [15] and Peng and Huang [17], respectively. All simulations were executed on a workstation equipped with two 24-core 2.99 GHz CPUs and 128 GB of memory. Each simulation run utilized 8 CPU cores concurrently.

In the simulation study, the covariates

x_{i} \in R^{p}

are split into three groups

\begin{matrix} x_{i} = (x_{i, 1}, \dots, x_{i, p_{1}}, x_{i, p_{1} + 1}, \dots, x_{i, p_{2}}, x_{i, p_{2} + 1}, \dots, x_{i, p}), \end{matrix}

where the three groups are drawn from different distributions. Specifically,

x_{i, j}

with

j = 1, \dots, p_{1}

follow a multivariate normal distribution

N (0, Σ)

, where

Σ

is a

p_{1} \times p_{1}

covariance matrix with

σ_{j, k} = 0 . 5^{| j - k |}

, for

1 \leq j, k \leq p_{1}

. The second group of covariates,

x_{i, j}

with

j = p_{1} + 1, \dots, p_{2}

, follows a multivariate uniform distribution on the interval

{[- 2, 2]}^{p_{2} - p_{1}}

with covariance matrix

Σ

, and the final group

x_{i, j}

with

j = p_{2} + 1, \dots, p

consists of binary variables drawn from a Bernoulli distribution with probability 0.5. The response variable

y_{i} = {\tilde{x}}_{i}^{⊤} γ + ϵ_{i}

, where

γ_{j} \sim U (- 2, 2), j = 1, 2, \dots, p

,

ϵ_{i} \sim t_{2}

. Let

Q_{t_{2}} (τ)

represent the quantile function of the t-distribution with 2 degrees of freedom and denote

x_{i} = {(1, {\tilde{x}}_{i}^{⊤})}^{⊤}

, the response variable

y_{i}

is generated according to the following model

y_{i} = x_{i}^{⊤} β^{*} (τ) + (ϵ_{i} - Q_{t_{2}} (τ)),

where

β^{*} (τ) = {(Q_{t_{2}} (τ), γ^{⊤})}^{⊤}

,

γ = {(γ_{1}, \dots, γ_{p})}^{⊤}

where each

γ_{j}

follows a uniform distribution on the interval

[- 2, 2]

and

ϵ_{i}

follows a t-distribution with 2 degrees of freedom. Random censoring

C_{i}

is introduced by sampling censoring times from a mixture distribution:

C_{i} = I (u_{i} = 1) N (0, 16) + I (u_{i} = 2) N (5, 1) + I (u_{i} = 3) N (10, 0.25), i = 1, \dots, n,

where

u_{i}

are randomly drawn from the set

{1, 2, 3}

. The observed response

{\tilde{y}}_{i}

is then the minimum of the true response and the censoring time,

{\tilde{y}}_{i} = min (y_{i}, C_{i})

. Furthermore, the bandwidth of the Gaussian kernel for our method is selected as

h = 0.035

.
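The data-generating process described above can be sketched as follows, assuming NumPy. Several choices are assumptions of this sketch rather than details confirmed by the text: the uniform covariates are drawn independently (the text mentions a covariance matrix for this group, which is not reproduced here), the second argument of each normal censoring component is read as a variance, and the mixture labels are drawn with equal probability. The function names are hypothetical.

```python
import numpy as np

def t2_quantile(tau):
    """Quantile function of the t-distribution with 2 degrees of freedom."""
    return (2.0 * tau - 1.0) / np.sqrt(2.0 * tau * (1.0 - tau))

def simulate_censored_data(n, p, p1, p2, tau, rng):
    # Group 1: multivariate normal with covariance 0.5^|j - k|.
    idx = np.arange(p1)
    sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
    x_norm = rng.multivariate_normal(np.zeros(p1), sigma, size=n)
    # Group 2: uniform on [-2, 2] (assumption: independent coordinates).
    x_unif = rng.uniform(-2.0, 2.0, size=(n, p2 - p1))
    # Group 3: Bernoulli(0.5).
    x_bin = rng.binomial(1, 0.5, size=(n, p - p2)).astype(float)
    x_tilde = np.hstack([x_norm, x_unif, x_bin])

    gamma = rng.uniform(-2.0, 2.0, size=p)
    eps = rng.standard_t(df=2, size=n)
    y = x_tilde @ gamma + eps

    # Censoring times from the three-component normal mixture
    # (assumption: N(m, v) with v a variance, equal mixing probabilities).
    u = rng.integers(1, 4, size=n)
    means = np.array([0.0, 5.0, 10.0])
    sds = np.sqrt(np.array([16.0, 1.0, 0.25]))
    c = rng.normal(means[u - 1], sds[u - 1])

    y_obs = np.minimum(y, c)
    delta = (y <= c).astype(int)
    X = np.column_stack([np.ones(n), x_tilde])
    beta_star = np.concatenate([[t2_quantile(tau)], gamma])
    return X, y_obs, delta, beta_star

X, y_obs, delta, beta_star = simulate_censored_data(
    n=3000, p=25, p1=10, p2=20, tau=0.5, rng=np.random.default_rng(0))
print(X.shape, 1.0 - delta.mean())  # design shape and empirical censoring rate
```

The intercept of the true coefficient vector changes with the quantile level through the t quantile, while the slopes stay fixed, which matches the location-shift model written above.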

For each quantile level

τ_{k}

, the estimation is conducted using a quantile grid of

{τ_{k}}_{k = 1}^{m}

with values ranging from

{0.05, 0.055, 0.060, \dots, 0.8}

. The performance of each method is assessed at every quantile level based on the estimation error, average quantile loss, and runtime. The estimation error is quantified using

∥ \hat{β} (τ_{k}) - β^{*} (τ_{k}) ∥_{2}

, while the average quantile loss is given by calculating

\frac{1}{n} \sum_{i = 1}^{n} ρ_{τ_{k}} (y_{i} - {\hat{y}}_{i})

. We consider three main scenarios (1–3) with different sample sizes and numbers of covariates, each replicated 1000 times. Specifically, we have

n = 3000, p = 25

with

p_{1} = 10

and

p_{2} = 20

,

n = 4000, p = 35

with

p_{1} = 14

and

p_{2} = 28

, and

n = 6000, p = 50

with

p_{1} = 20

and

p_{2} = 40

. In addition to these primary settings, we introduce Scenario 4 specifically designed to evaluate the performance of the proposed method under varying censoring rates. This scenario considers a sample size of

n = 4000

and

p = 60

, with

p_{1} = 24

and

p_{2} = 48

, and examines three censoring levels: 20%, 35%, and 50%. This scenario provides insight into how the degree of censoring impacts estimation accuracy and runtime, further testing the robustness of the proposed method. The acronyms used for the methods are: “SWQR” for our proposed Smoothed Weighted Quantile Regression method, “WeightedQR” for Portnoy’s method [15], and “MartingaleQR” for Peng and Huang’s method [17]. The results are summarized in the following figures and tables, which show the performance of the three methods.

Figure 1, Figure 2 and Figure 3 provide a comprehensive comparison of estimation error trends and runtimes across Scenarios 1–3 with varying sample sizes (

n = 3000, 4000, 6000

) and covariate dimensions (

p = 25, 35, 50

). Across all scenarios, SWQR consistently demonstrates competitive performance, achieving estimation errors comparable to or slightly better than WeightedQR and significantly outperforming MartingaleQR, particularly at mid-range quantile levels (

τ = 0.3

to

τ = 0.6

). The trends indicate that MartingaleQR suffers from higher errors, especially for lower and upper quantile levels, which could be attributed to its less robust handling of censoring effects. These findings confirm that SWQR balances both robustness and accuracy in quantile estimation across all tested conditions. The runtime analysis, as summarized in the corresponding figures, highlights the computational efficiency of SWQR. SWQR consistently requires the least runtime among the three methods, making it the most efficient in scenarios involving larger datasets. WeightedQR, while competitive in terms of estimation error, incurs substantially higher computational costs, particularly as the sample size and covariate dimensionality grow. MartingaleQR exhibits moderate runtimes but suffers from higher estimation errors, making it less suitable for practical applications.

Table 1, Table 2 and Table 3, featuring estimation errors and average quantile losses, provide insights into the comparative performance of the methods across different scenarios. SWQR achieves average quantile loss values closely aligned with, and often slightly lower than, those of WeightedQR, showcasing its robustness and accuracy in quantile estimation. Notably, in Scenario 3 (

n = 6000, p = 50

), SWQR maintains its effectiveness despite the increased dataset size, demonstrating scalability and suitability for handling large-scale survival data. MartingaleQR, on the other hand, consistently shows higher average quantile losses and estimation errors. The runtime analysis is summarized in Table 4. SWQR demonstrates a significant computational advantage over WeightedQR, requiring substantially less time to complete estimation tasks across all scenarios. This is particularly evident in Scenarios 2 and 3, where WeightedQR exhibits a steep increase in runtime as the sample size and covariate dimensionality grow. Although MartingaleQR is computationally efficient, its higher average quantile losses and estimation errors make it less favorable for practical use. The overall balance of accuracy and computational efficiency achieved by SWQR underscores its suitability for large-scale survival analysis, offering a reliable alternative to existing methods when both precision and speed are critical considerations.

In Scenario 4, we evaluated the performance of MartingaleQR, WeightedQR, and SWQR under varying censoring rates (20%, 35%, and 50%, respectively) with

n = 4000

and

p = 60

. Figure 4, Figure 5 and Figure 6 depict the estimation errors and runtimes across quantile levels. The results reveal that SWQR consistently outperforms MartingaleQR in terms of estimation error, particularly at lower quantile levels. Compared to WeightedQR, SWQR demonstrates comparable or slightly better estimation performance across all censoring rates, maintaining its robustness even under higher levels of censoring. Furthermore, the increase in censoring rate slightly elevates the estimation error for all methods; however, the relative performance among the methods remains consistent. Regarding runtime, SWQR achieves the lowest computational time across all scenarios, maintaining efficiency regardless of the censoring rate. WeightedQR, in contrast, exhibits significantly higher runtimes, especially under lower censoring rates.

Table 5, Table 6, Table 7 and Table 8 provide a detailed numerical comparison, highlighting the estimation errors, average quantile losses, and runtimes for each censoring rate. The average quantile losses for SWQR are closely aligned with those of WeightedQR and remain consistently lower than those of MartingaleQR. This indicates SWQR’s ability to handle data with varying censoring rates effectively. Notably, as the censoring rate increases, the performance gap between WeightedQR and SWQR narrows, with SWQR maintaining its advantage in computational efficiency. In terms of the runtimes recorded in Table 8, SWQR consistently demonstrates the fastest execution across all censoring levels, highlighting its computational efficiency. While WeightedQR offers competitive accuracy, it suffers from considerably higher computational costs, making it less practical for large-scale applications. MartingaleQR, on the other hand, achieves moderate runtime but struggles with reduced estimation accuracy, particularly under higher censoring rates. This analysis underscores SWQR’s strong balance between speed and accuracy, showcasing its robustness and scalability for datasets with different censoring rates.

In conclusion, the simulation results highlight that the proposed SWQR method consistently surpasses MartingaleQR in estimation accuracy across various quantile levels, particularly as the sample size and number of covariates increase. SWQR balances computational efficiency and estimation accuracy, outperforming WeightedQR, which is limited by its high computational cost in large-scale settings. According to the results, the disparity in runtimes becomes increasingly evident with higher dimensionality, emphasizing SWQR’s capability to deliver reliable performance with minimal computational burden. These results underscore the practicality of SWQR, making it an effective solution for large-scale survival data analysis where both accuracy and resource efficiency are critical considerations.

4.2. Real Data Application

To further evaluate the performance of the proposed smoothed weighted quantile regression method, we applied it to a dataset on primary biliary cirrhosis (PBC), a liver disease dataset extensively used in the survival analysis literature [28]. This dataset consists of information on 418 patients, with their survival times either observed or right-censored. The data include clinical measurements such as age, edema status, and biochemical markers (serum bilirubin, albumin, and prothrombin time). Approximately 40% of the patients in the dataset are censored, making it a suitable candidate for testing the robustness and accuracy of censored quantile regression methods. Our aim is to model the relationship between the logarithm of survival time and the selected covariates at different quantile levels. The response variable of interest is the logarithm of survival time, $Y = \ln (T)$, and the covariates are: age of the patient in years ($X_{1}$), edema status ($X_{2}$), logarithm of serum bilirubin concentration in mg/dL ($X_{3}$), logarithm of albumin concentration in g/dL ($X_{4}$), and logarithm of prothrombin time in seconds ($X_{5}$). We model the conditional quantile function of the logarithm of survival time at each quantile level $τ$ as:

$$F_{Y | X}^{- 1} (τ) = β_{0} (τ) + β_{1} (τ) X_{1} + β_{2} (τ) X_{2} + β_{3} (τ) X_{3} + β_{4} (τ) X_{4} + β_{5} (τ) X_{5},$$

where the regression coefficients $β_{j} (τ)$ represent the effect of each covariate on the $τ$-th quantile of the log survival time distribution. For comparison, we also apply MartingaleQR and WeightedQR [15,17]. For the SWQR method, a quantile grid ${τ_{k}}_{k = 1}^{m}$ is selected, ranging from 0.05 to 0.80 in increments of 0.05. The coefficients are estimated at each quantile level, and the performance of all three methods is assessed based on the estimated coefficients.

The estimated regression coefficients for each method are computed across the quantile grid, with the results compared in Table 9 and Figure 7. Each covariate’s impact on survival time was examined, highlighting differences in how the methods handle censored observations. For lower quantiles ($τ = 0.1$ to $τ = 0.3$), the SWQR estimates closely match those from the method of Portnoy [15] in coefficient magnitude and direction, while showing slightly more stability than the method of Peng and Huang [17]. For example, the effect of serum bilirubin ($X_{3}$) on survival time diminishes as we move toward higher quantiles, reflecting how severe cases impact survival time differently across the population. Similarly, the impact of age on survival ($X_{1}$) shows consistent trends across all methods, with SWQR capturing the effects more consistently across quantiles. At higher quantiles ($τ = 0.6$ and above), the SWQR method continues to provide reliable coefficient estimates, while the performance of the method of Portnoy [15] starts to deteriorate due to challenges with higher censoring. The method of Peng and Huang [17] also encounters computational difficulties, particularly in estimating the coefficient for $X_{2}$ (edema status), as its estimates fluctuate more erratically at these higher quantiles.
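Because the response is $Y = \ln (T)$ and quantiles are preserved under monotone transformations, a coefficient $β_{j} (τ)$ acts multiplicatively on the $τ$-th quantile of survival time itself. The sketch below illustrates this reading with SWQR estimates copied from Table 9; it assumes the covariate logarithms are natural logarithms, and the helper names are illustrative.

```python
import math

# SWQR estimates copied from Table 9 (beta_1: age in years, beta_3: log serum bilirubin).
beta_age = {0.2: -0.0315, 0.5: -0.0319, 0.7: -0.0262}
beta_log_bilirubin = {0.2: -0.6682, 0.5: -0.5899, 0.7: -0.5648}

def factor_per_unit(beta):
    """Multiplicative change in the tau-th quantile of T per one-unit covariate increase."""
    return math.exp(beta)

def factor_per_doubling(beta):
    """Multiplicative change in the tau-th quantile of T when a log-transformed
    covariate's underlying measurement doubles (assumes natural log): exp(beta * ln 2)."""
    return 2.0 ** beta

for tau in (0.2, 0.5, 0.7):
    print(
        f"tau={tau}: per year of age x{factor_per_unit(beta_age[tau]):.3f}, "
        f"per doubling of bilirubin x{factor_per_doubling(beta_log_bilirubin[tau]):.3f}"
    )
```

Read this way, one additional year of age scales the median survival time by a factor of about 0.97, and a doubling of serum bilirubin scales the 0.2 quantile by roughly 0.63 and the 0.7 quantile by roughly 0.68, holding the other covariates fixed. These are descriptive conversions of the reported point estimates and carry no uncertainty quantification.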

In this real-data application, the SWQR method demonstrated its robustness and flexibility in handling censored survival data across a wide range of quantile levels. The method provided stable and accurate coefficient estimates, particularly at higher quantiles where censoring becomes more pronounced. The consistent performance of SWQR across different quantile levels, coupled with its computational advantages, makes it a valuable tool for analyzing censored survival data in medical and other applied fields.

5. Discussion and Conclusions

In this study, we introduced smoothed weighted quantile regression, a novel approach designed to address the challenge of right-censored data, commonly encountered in survival analysis. SWQR combines convolution smoothing with weighted quantile regression, offering an efficient solution to the non-differentiability issues inherent in traditional methods and thus enabling gradient-based optimization. By comparing SWQR with two widely used methods, MartingaleQR and WeightedQR, our findings demonstrate that SWQR is both computationally efficient and robust across various settings.
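The idea of replacing the non-differentiable check loss with a smooth surrogate that gradient methods can minimize may be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes a logistic smoothing kernel (for which the smoothed loss has the closed form τr + h·log(1 + exp(−r/h))), takes the observation weights `w` as given rather than deriving them from the censoring mechanism, and uses a fixed step size instead of the Barzilai-Borwein rule named in the abbreviation list. All function names are hypothetical.

```python
import numpy as np

def smoothed_loss(beta, X, y, w, tau, h):
    """Weighted logistic-smoothed check loss, averaged over observations."""
    r = y - X @ beta
    return float(np.mean(w * (tau * r + h * np.logaddexp(0.0, -r / h))))

def smoothed_grad(beta, X, y, w, tau, h):
    """Gradient of smoothed_loss with respect to beta."""
    r = y - X @ beta
    # Derivative in r: tau - 1 / (1 + exp(r / h)), written with tanh for numerical stability.
    psi = tau - 0.5 * (1.0 - np.tanh(r / (2.0 * h)))
    return -(X.T @ (w * psi)) / len(y)

def fit_smoothed_qr(X, y, w, tau, h, n_iter=500):
    """Fixed-step gradient descent on the smoothed loss (requires w >= 0).

    The logistic density is at most 1/4, so the Hessian is bounded by
    lambda_max(X' diag(w) X / n) / (4h); its reciprocal is a safe step size.
    """
    lipschitz = np.linalg.eigvalsh(X.T @ (w[:, None] * X) / len(y))[-1] / (4.0 * h)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        beta = beta - smoothed_grad(beta, X, y, w, tau, h) / lipschitz
    return beta

# Toy usage on simulated, uncensored data (unit weights are placeholders).
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, 2.0, -1.0])
y = X @ beta_true + rng.normal(size=n)
w = np.ones(n)
beta_hat = fit_smoothed_qr(X, y, w, tau=0.5, h=0.3)
```

The fixed step guarantees that each iteration does not increase the smoothed loss, at the cost of slower progress than adaptive step-size rules would typically give. With censored data, the unit weights would be replaced by censoring-adjusted weights.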

The results from the simulation study reveal that SWQR generally performs better than MartingaleQR in terms of estimation accuracy, particularly at certain quantile levels and as the sample size and number of covariates increase. Across different quantile levels, SWQR exhibits superior performance, particularly at higher quantiles where the presence of right-censoring can severely degrade the accuracy of other methods. While WeightedQR performs slightly better in terms of accuracy at certain quantiles, it comes at a high computational cost, making it impractical for large-scale data. In contrast, SWQR strikes a balance, maintaining high levels of accuracy while remaining computationally feasible. This balance becomes more evident in large-scale settings, as shown in the simulations with $n = 6000$ and $p = 50$, where the runtime of WeightedQR increases dramatically, whereas SWQR continues to perform efficiently. These results confirm that SWQR is a scalable and practical method for handling large datasets, making it a valuable tool in applications where computational resources are a constraint. Furthermore, the method’s ability to provide reliable estimates across a broad range of quantile levels is especially advantageous when analyzing the tails of a distribution, where outliers or extreme values may carry significant importance. Additionally, SWQR consistently achieves average quantile losses that closely match those of WeightedQR across different scenarios (within about 0.001 in Tables 1–3 and 5–7), highlighting its ability to maintain accuracy while balancing computational costs. This performance is especially notable in Scenario 4, where censoring rates are varied. Under higher censoring rates, SWQR demonstrates robustness and stability in estimation accuracy compared to MartingaleQR, which struggles significantly. WeightedQR maintains high accuracy but with a steep increase in runtime, further emphasizing the advantage of SWQR in practical applications.

The real data application using the PBC dataset provided further validation of SWQR’s practical utility, and its results were broadly consistent with those observed in the simulations. While the dataset is of moderate size, SWQR demonstrated stable performance across the quantile spectrum, particularly in cases with heavy censoring. The differences between SWQR, WeightedQR, and MartingaleQR in the real data echoed the patterns found in the simulations. SWQR provided more reliable estimates at higher quantiles compared to MartingaleQR, which struggled with heavily censored data. The average quantile loss confirms the utility of SWQR in real-world scenarios, showing its ability to handle heavy censoring without sacrificing estimation quality. The flexibility of SWQR in modeling the distribution of survival times across different quantiles shows its capability for clinical applications where extreme outcomes are crucial, such as identifying high-risk patients. However, since the patterns of covariate effects, such as the diminishing influence of serum bilirubin at higher quantiles, align with known clinical behaviors, the real data application primarily reinforced the findings from the simulations rather than presenting novel insights.

Despite the strengths of SWQR, there remain areas for future research. One notable challenge is the selection of the bandwidth parameter for the smoothing kernel; while the chosen bandwidth performed well in our simulations, more systematic approaches for optimizing this parameter may yield further improvements. Additionally, investigating the performance of SWQR in high-dimensional settings, especially in ultra-high-dimensional scenarios where $p ≫ n$, would be a valuable direction for future research to assess its robustness. Furthermore, developing and reporting theoretical results, such as the asymptotic properties of SWQR, could also provide a stronger foundation for its application in survival analysis.
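The role of the bandwidth can be illustrated with a short worked calculation. It assumes a logistic smoothing kernel purely for illustration, since that choice yields closed forms; the symbols $\ell_h$ for the smoothed loss and $\rho_τ$ for the check loss are notation introduced here.

```latex
% Check loss and its logistic-kernel smoothed version with bandwidth h > 0:
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr), \qquad
\ell_h(u) = \tau u + h \log\bigl(1 + e^{-u/h}\bigr).

% Approximation gap: nonnegative, largest at u = 0, and shrinking linearly in h.
\ell_h(u) - \rho_\tau(u) = h \log\bigl(1 + e^{-|u|/h}\bigr) \in (0,\; h \log 2].

% Curvature: bounded by a quantity that grows as h shrinks.
\ell_h''(u) = \frac{1}{h}\,\frac{e^{-u/h}}{\bigl(1 + e^{-u/h}\bigr)^{2}} \le \frac{1}{4h}.
```

Under this assumption, a smaller $h$ keeps the smoothed loss closer to the original check loss but raises the curvature bound, which forces smaller gradient steps; a larger $h$ does the opposite. Any data-driven bandwidth rule has to balance these two effects.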

In conclusion, the proposed SWQR method offers a practical and robust approach for quantile regression in the presence of right-censored data. Its ability to balance computational efficiency with high estimation accuracy, even in large-scale settings, makes it a valuable tool in fields such as biomedicine, economics, and beyond. As data complexity continues to grow, the flexibility and scalability of SWQR will be critical for advancing statistical modeling and decision-making in a wide range of applications.

Author Contributions

Conceptualization, K.C. and X.Z.; methodology, K.C. and H.L.; software, H.L.; data curation, H.L.; formal analysis, K.C., H.L. and W.F.; writing—original draft preparation, K.C., H.L., W.F. and X.Z.; funding acquisition, K.C. and X.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (12301334, 12201108), the Natural Science Foundation of Jiangsu Province (BK20230804), and the Fundamental Research Funds for the Central Universities (2242023R40055, MCCSE2023B04, 2242023K40012). Kaida Cai is a recipient of the Zhishan Young Scholar Award at Southeast University.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in the study are available in the StatLib at https://lib.stat.cmu.edu/datasets/pbc, accessed on 1 October 2024.

Acknowledgments

The authors acknowledge the use of language tools, including Youdao Dictionary software and Google Translate, for assisting in grammar correction and language refinement during the preparation of this manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SWQR	Smoothed Weighted Quantile Regression
QR	Quantile Regression
PBC	Primary Biliary Cirrhosis
MartingaleQR	Martingale-based Quantile Regression
WeightedQR	Weighted Quantile Regression
BB	Barzilai-Borwein

References

1. Lawless, J.F. Statistical Models and Methods for Lifetime Data; John Wiley & Sons: Hoboken, NJ, USA, 2011.
2. Klein, J.P.; Moeschberger, M.L. Survival Analysis: Techniques for Censored and Truncated Data; Springer Science & Business Media: New York, NY, USA, 2006.
3. Shedden, K.; Taylor, J.M.G.; Enkemann, S.; Tsao, M.S.; Yeatman, T.; Gerald, W.L.; Eschrich, S.A.; Jurisica, I.; Giordano, T.J.; Misek, D.E.; et al. Gene expression–based survival prediction in lung adenocarcinoma: A multi-site, blinded validation study. Nat. Med. 2008, 14, 822–827.
4. Fleming, T.R.; Lin, D. Survival analysis in clinical trials: Past developments and future directions. Biometrics 2000, 56, 971–983.
5. Collett, D. Modelling Survival Data in Medical Research; Chapman and Hall/CRC: Boca Raton, FL, USA, 2023.
6. Wei, L.J. The accelerated failure time model: A useful alternative to the Cox regression model in survival analysis. Stat. Med. 1992, 11, 1871–1879.
7. Saikia, R.; Barman, M.P. A review on accelerated failure time models. Int. J. Stat. Syst. 2017, 12, 311–322.
8. Cox, D.R. Analysis of Survival Data; Chapman and Hall/CRC: Boca Raton, FL, USA, 2018.
9. Kalbfleisch, J.D.; Prentice, R.L. The Statistical Analysis of Failure Time Data; John Wiley & Sons: Hoboken, NJ, USA, 2011.
10. Keiding, N.; Andersen, P.K.; Klein, J.P. The role of frailty models and accelerated failure time models in describing heterogeneity due to omitted covariates. Stat. Med. 1997, 16, 215–224.
11. Royston, P.; Altman, D.G. External validation of a Cox prognostic model: Principles and methods. BMC Med Res. Methodol. 2013, 13, 33.
12. Powell, J.L. Censored regression quantiles. J. Econom. 1986, 32, 143–155.
13. Peng, L. Quantile regression for survival data. Annu. Rev. Stat. Its Appl. 2021, 8, 413–437.
14. Koenker, R.; Bassett, G., Jr. Regression quantiles. Econom. J. Econom. Soc. 1978, 23, 33–50.
15. Portnoy, S. Censored regression quantiles. J. Am. Stat. Assoc. 2003, 98, 1001–1012.
16. Wang, H.J.; Wang, L. Locally weighted censored quantile regression. J. Am. Stat. Assoc. 2009, 104, 1117–1128.
17. Peng, L.; Huang, Y. Survival analysis with quantile regression models. J. Am. Stat. Assoc. 2008, 103, 637–649.
18. Wang, H.J.; Zhou, J.; Li, Y. Variable selection for censored quantile regression. Stat. Sin. 2013, 23, 145.
19. Fei, Z.; Zheng, Q.; Hong, H.G.; Li, Y. Inference for high-dimensional censored quantile regression. J. Am. Stat. Assoc. 2023, 118, 898–912.
20. Fernandes, M.; Guerre, E.; Horta, E. Smoothing quantile regressions. J. Bus. Econ. Stat. 2021, 39, 338–357.
21. Nesterov, Y. Smooth minimization of non-smooth functions. Math. Program. 2005, 103, 127–152.
22. Tan, K.M.; Wang, L.; Zhou, W.X. High-dimensional quantile regression: Convolution smoothing and concave regularization. J. R. Stat. Soc. Ser. B Stat. Methodol. 2022, 84, 205–233.
23. He, X.; Pan, X.; Tan, K.M.; Zhou, W.X. Scalable estimation and inference for censored quantile regression process. Ann. Stat. 2022, 50, 2899–2924.
24. He, X.; Pan, X.; Tan, K.M.; Zhou, W.X. Smoothed quantile regression with large-scale inference. J. Econom. 2023, 232, 367–388.
25. Yan, Y.; Wang, X.; Zhang, R. Confidence intervals and hypothesis testing for high-dimensional quantile regression: Convolution smoothing and debiasing. J. Mach. Learn. Res. 2023, 24, 1–49.
26. Koenker, R. Quantile Regression; Cambridge University Press: Cambridge, UK, 2005.
27. Barzilai, J.; Borwein, J.M. Two-point step size gradient methods. IMA J. Numer. Anal. 1988, 8, 141–148.
28. Dickson, E.R.; Grambsch, P.M.; Fleming, T.R.; Fisher, L.D.; Langworthy, A. Prognosis in primary biliary cirrhosis: Model for decision making. Hepatology 1989, 10, 1–7.

Figure 1. Estimation errors and runtimes for Scenario 1 with $n = 3000$ and $p = 25$ at different quantile levels.

Figure 2. Estimation errors and runtimes for Scenario 2 with $n = 4000$ and $p = 35$ at different quantile levels.

Figure 3. Estimation errors and runtimes for Scenario 3 with $n = 6000$ and $p = 50$ at different quantile levels.

Figure 4. Estimation errors and runtimes for Scenario 4 with $n = 4000$, $p = 60$, and 20% censoring at different quantile levels.

Figure 5. Estimation errors and runtimes for Scenario 4 with $n = 4000$, $p = 60$, and 35% censoring at different quantile levels.

Figure 6. Estimation errors and runtimes for Scenario 4 with $n = 4000$, $p = 60$, and 50% censoring at different quantile levels.

Figure 7. Estimated regression coefficients for the covariates across different quantile levels using the SWQR, WeightedQR, and MartingaleQR methods. The covariates include (a) age, (b) edema, (c) log(bilirubin), (d) log(albumin), and (e) log(prothrombin time).

Table 1. Estimation errors and average quantile losses for Scenario 1 with $n = 3000$ and $p = 25$ at different quantile levels.

$τ$	Estimation Error			Average Quantile Loss
$τ$	MartingaleQR	WeightedQR	SWQR	MartingaleQR	WeightedQR	SWQR
0.2	0.4476	0.3255	0.3257	0.5693	0.5652	0.5658
0.3	0.2882	0.2455	0.2453	0.6494	0.6476	0.6479
0.4	0.2386	0.2230	0.2222	0.6928	0.6919	0.6921
0.5	0.2210	0.2144	0.2151	0.7059	0.7055	0.7056
0.6	0.2287	0.2299	0.2280	0.6907	0.6905	0.6906
0.7	0.2611	0.2656	0.2652	0.6448	0.6447	0.6448

Table 2. Estimation errors and average quantile losses for Scenario 2 with $n = 4000$ and $p = 35$ at different quantile levels.

$τ$	Estimation Error			Average Quantile Loss
$τ$	MartingaleQR	WeightedQR	SWQR	MartingaleQR	WeightedQR	SWQR
0.2	0.4485	0.3442	0.3586	0.5605	0.5565	0.5577
0.3	0.3038	0.2670	0.2733	0.6401	0.6384	0.6389
0.4	0.2494	0.2349	0.2363	0.6838	0.6829	0.6833
0.5	0.2322	0.2237	0.2237	0.6975	0.6970	0.6972
0.6	0.2330	0.2307	0.2309	0.6828	0.6826	0.6826
0.7	0.2646	0.2691	0.2700	0.6379	0.6379	0.6379

Table 3. Estimation errors and average quantile losses for Scenario 3 with $n = 6000$ and $p = 50$ at different quantile levels.

$τ$	Estimation Error			Average Quantile Loss
$τ$	MartingaleQR	WeightedQR	SWQR	MartingaleQR	WeightedQR	SWQR
0.2	0.4563	0.3432	0.3497	0.5542	0.5503	0.5512
0.3	0.3064	0.2650	0.2674	0.6332	0.6316	0.6321
0.4	0.2512	0.2343	0.2369	0.6766	0.6757	0.6762
0.5	0.2317	0.2254	0.2260	0.6906	0.6902	0.6905
0.6	0.2371	0.2378	0.2386	0.6768	0.6766	0.6768
0.7	0.2651	0.2675	0.2674	0.6332	0.6331	0.6333

Table 4. Average runtimes for Scenarios 1–3 with different sample sizes and covariate dimensions.

Average Runtime (in Seconds)	MartingaleQR	WeightedQR	SWQR
$n = 3000, p = 25$	1.8564	4.4825	1.2736
$n = 4000, p = 35$	4.2152	10.8706	2.2430
$n = 6000, p = 50$	11.2568	41.0936	4.4198

Table 5. Estimation errors and average quantile losses for Scenario 4 with $n = 4000$, $p = 60$, and 20% censoring at different quantile levels.

$τ$	Estimation Error			Average Quantile Loss
$τ$	MartingaleQR	WeightedQR	SWQR	MartingaleQR	WeightedQR	SWQR
0.2	0.5284	0.4141	0.4098	0.5590	0.5549	0.5550
0.3	0.3643	0.3211	0.3183	0.6399	0.6382	0.6382
0.4	0.2998	0.2806	0.2772	0.6845	0.6836	0.6836
0.5	0.2713	0.2692	0.2672	0.6987	0.6983	0.6983
0.6	0.2829	0.2839	0.2824	0.6847	0.6844	0.6845
0.7	0.3235	0.3294	0.3271	0.6401	0.6400	0.6400

Table 6. Estimation errors and average quantile losses for Scenario 4 with $n = 4000$, $p = 60$, and 35% censoring at different quantile levels.

$τ$	Estimation Error			Average Quantile Loss
$τ$	MartingaleQR	WeightedQR	SWQR	MartingaleQR	WeightedQR	SWQR
0.2	0.5392	0.4231	0.4276	0.5646	0.5606	0.5607
0.3	0.3763	0.3397	0.3394	0.6465	0.6449	0.6449
0.4	0.3147	0.2992	0.2973	0.6917	0.6910	0.6910
0.5	0.2964	0.2937	0.2938	0.7064	0.7060	0.7061
0.6	0.3127	0.3178	0.3169	0.6925	0.6924	0.6924
0.7	0.3736	0.3759	0.3776	0.6481	0.6481	0.6481

Table 7. Estimation errors and average quantile losses for Scenario 4 with $n = 4000$, $p = 60$, and 50% censoring at different quantile levels.

$τ$	Estimation Error			Average Quantile Loss
$τ$	MartingaleQR	WeightedQR	SWQR	MartingaleQR	WeightedQR	SWQR
0.2	0.5593	0.4504	0.4688	0.5652	0.5616	0.5625
0.3	0.4016	0.3688	0.3747	0.6475	0.6462	0.6468
0.4	0.3449	0.3313	0.3359	0.6931	0.6926	0.6929
0.5	0.3285	0.3269	0.3326	0.7081	0.7079	0.7082
0.6	0.3611	0.3614	0.3729	0.6947	0.6948	0.6952
0.7	0.4380	0.4345	0.4538	0.6511	0.6513	0.6517

Table 8. Average runtimes for Scenario 4 with $n = 4000$ and $p = 60$ at different censoring rates.

Average Runtime (in Seconds)	MartingaleQR	WeightedQR	SWQR
20% censoring	11.9355	30.7947	4.2853
35% censoring	9.9623	28.2293	4.5146
50% censoring	7.6677	26.9169	4.8053

Table 9. Estimated regression coefficients at different quantile levels.

Estimated Coefficients	MartingaleQR	WeightedQR	SWQR
${\hat{β}}_{1} (τ = 0.2)$	−0.0242	−0.0248	−0.0315
${\hat{β}}_{1} (τ = 0.3)$	−0.0247	−0.0217	−0.0271
${\hat{β}}_{1} (τ = 0.4)$	−0.0247	−0.0251	−0.0272
${\hat{β}}_{1} (τ = 0.5)$	−0.0314	−0.0320	−0.0319
${\hat{β}}_{1} (τ = 0.6)$	−0.0294	−0.0334	−0.0287
${\hat{β}}_{1} (τ = 0.7)$	−0.0241	−0.0235	−0.0262
${\hat{β}}_{2} (τ = 0.2)$	−0.8194	−0.8309	−0.7714
${\hat{β}}_{2} (τ = 0.3)$	−0.8565	−0.9189	−0.8974
${\hat{β}}_{2} (τ = 0.4)$	−0.9623	−0.9629	−1.0515
${\hat{β}}_{2} (τ = 0.5)$	−0.8513	−0.7693	−0.9534
${\hat{β}}_{2} (τ = 0.6)$	−0.6393	−0.5547	−0.5323
${\hat{β}}_{2} (τ = 0.7)$	−0.6384	−0.4901	−0.6046
${\hat{β}}_{3} (τ = 0.2)$	−0.5543	−0.5881	−0.6682
${\hat{β}}_{3} (τ = 0.3)$	−0.5842	−0.5738	−0.6463
${\hat{β}}_{3} (τ = 0.4)$	−0.5776	−0.5564	−0.6280
${\hat{β}}_{3} (τ = 0.5)$	−0.5365	−0.5030	−0.5899
${\hat{β}}_{3} (τ = 0.6)$	−0.4969	−0.4778	−0.5551
${\hat{β}}_{3} (τ = 0.7)$	−0.5626	−0.4953	−0.5648
${\hat{β}}_{4} (τ = 0.2)$	1.7351	1.4623	1.6809
${\hat{β}}_{4} (τ = 0.3)$	1.7709	1.5881	1.4501
${\hat{β}}_{4} (τ = 0.4)$	0.9546	0.9600	0.9270
${\hat{β}}_{4} (τ = 0.5)$	1.2977	1.5140	0.9674
${\hat{β}}_{4} (τ = 0.6)$	1.3242	1.2769	1.5422
${\hat{β}}_{4} (τ = 0.7)$	1.3222	1.5836	1.4378
${\hat{β}}_{5} (τ = 0.2)$	−3.4812	−3.2869	−3.3993
${\hat{β}}_{5} (τ = 0.3)$	−2.3751	−2.7282	−2.6992
${\hat{β}}_{5} (τ = 0.4)$	−2.5118	−2.6874	−2.3583
${\hat{β}}_{5} (τ = 0.5)$	−1.0635	−1.3906	−1.0801
${\hat{β}}_{5} (τ = 0.6)$	−0.8393	−1.8940	−1.5616
${\hat{β}}_{5} (τ = 0.7)$	−0.8966	−1.3901	−1.0299

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
