Abstract
This work proposes a resampling technique to approximate the smoothing parameter of Beran’s estimator. It is based on resampling by the smoothed bootstrap and minimising the bootstrap approximation of the mean integrated squared error to find the bootstrap bandwidth. The behaviour of this method has been tested by simulation on several models. Bootstrap confidence intervals are also addressed in this research and their performance is analysed in the simulation study.
1. Introduction
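Let $\{(X_i, Z_i, \delta_i)\}_{i=1}^{n}$ be a simple random sample of $(X, Z, \delta)$, with $X$ being the covariate, $Z = \min\{T, C\}$ the observed variable and $\delta = I(T \le C)$ the uncensoring indicator. Usually, $T$ is the time until the occurrence of an event and $C$ is the censoring time. The generalised product-limit estimator of the conditional survival function $S(t \mid x) = P(T > t \mid X = x)$ proposed in [1] is given by
$$\hat{S}_h(t \mid x) = \prod_{i=1}^{n} \left( 1 - \frac{I(Z_i \le t,\ \delta_i = 1)\, w_{h,i}(x)}{1 - \sum_{j=1}^{n} I(Z_j < Z_i)\, w_{h,j}(x)} \right),$$
where
$$w_{h,i}(x) = \frac{K\left((x - X_i)/h\right)}{\sum_{j=1}^{n} K\left((x - X_j)/h\right)},$$
with $i = 1, \dots, n$, $K$ a kernel function, and $h = h_n$ is the bandwidth for the covariable. This estimator depends on the smoothing parameter $h$, which is unknown in practice. A method for the automatic selection of this bandwidth is therefore valuable in the analysis of real data subject to censoring. Bootstrap confidence intervals for $S(t \mid x)$ are also proposed.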
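As a minimal sketch of how Beran's estimator can be computed, the following Python function evaluates the product-limit estimate on a grid of times. The Gaussian kernel, the function names `nw_weights` and `beran_survival`, and the clamping of numerical round-off are assumptions made for illustration only.

```python
import numpy as np

def nw_weights(x, X, h):
    """Nadaraya-Watson weights w_{h,i}(x); the Gaussian kernel is an assumption."""
    k = np.exp(-0.5 * ((x - np.asarray(X, dtype=float)) / h) ** 2)
    return k / k.sum()

def beran_survival(t_grid, x, X, Z, delta, h):
    """Beran estimate of S(t | x) at every t in t_grid."""
    Z = np.asarray(Z, dtype=float)
    delta = np.asarray(delta)
    order = np.argsort(Z, kind="stable")
    Zs, ds, w = Z[order], delta[order], nw_weights(x, X, h)[order]
    # Denominator 1 - sum_j I(Z_j < Z_i) w_j, read off the cumulative weights.
    cum_w = np.concatenate(([0.0], np.cumsum(w)))
    denom = 1.0 - cum_w[np.searchsorted(Zs, Zs, side="left")]
    # One factor per observation; censored observations contribute a factor of 1.
    factors = np.ones_like(w)
    m = (ds == 1) & (w > 0)
    factors[m] = np.clip(1.0 - w[m] / np.maximum(denom[m], w[m]), 0.0, 1.0)
    # S(t | x) is the product of the factors of all observations with Z_i <= t.
    surv = np.concatenate(([1.0], np.cumprod(factors)))
    return surv[np.searchsorted(Zs, np.asarray(t_grid, dtype=float), side="right")]
```

When all covariate values coincide, the weights are uniform and the estimate reduces to the Kaplan-Meier estimator, which gives a convenient sanity check for an implementation of this kind.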
2. Bandwidth Selector
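Let $r$ be an appropriate pilot bandwidth. The bootstrap resampling algorithm consists of generating $T_i^* \sim \hat{F}_r(\cdot \mid X_i)$ and $C_i^* \sim \hat{G}_r(\cdot \mid X_i)$, where $\hat{F}_r$ and $\hat{G}_r$ denote the Beran estimators, with bandwidth $r$, of the conditional distribution functions of $T$ and $C$, and obtaining
$$Z_i^* = \min\{T_i^*, C_i^*\}, \qquad \delta_i^* = I(T_i^* \le C_i^*)$$
for each $i = 1, \dots, n$. The bootstrap sample is formed as $\{(X_i, Z_i^*, \delta_i^*)\}_{i=1}^{n}$.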
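The optimal smoothing parameter is the bandwidth that minimises the mean integrated squared error, given by
$$\operatorname{MISE}_x(h) = E\left[ \int \left( \hat{S}_h(t \mid x) - S(t \mid x) \right)^2 dt \right].$$
Then, the bootstrap bandwidth is obtained by minimising the Monte Carlo approximation of the bootstrap MISE, defined as
$$\operatorname{MISE}_x^*(h) \simeq \frac{1}{B} \sum_{k=1}^{B} \int \left( \hat{S}_h^{*,k}(t \mid x) - \hat{S}_r(t \mid x) \right)^2 dt,$$
where $\hat{S}_r(t \mid x)$ is the Beran survival estimation with pilot bandwidth $r$ using the original sample $\{(X_i, Z_i, \delta_i)\}_{i=1}^{n}$, $\hat{S}_h^{*,k}(t \mid x)$ is the Beran survival estimation with bandwidth $h$ using the $k$-th bootstrap resample, and $B$ is the number of bootstrap resamples.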
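The selection procedure can be sketched in Python under several stated assumptions: the life and censoring times are resampled from the Beran estimates (pilot bandwidth `r`) of their conditional distributions at each observed covariate value, any probability mass left beyond the largest observation is placed at that observation, the kernel is Gaussian, the integral is approximated by a Riemann sum on an equally spaced time grid, and the minimisation is a plain search over a user-supplied grid of bandwidths. All function names are hypothetical.

```python
import numpy as np

def _beran_steps(x, X, Z, delta, h):
    """Sorted times and the Beran survival values just after each of them."""
    k = np.exp(-0.5 * ((x - X) / h) ** 2)            # Gaussian kernel (assumed)
    order = np.argsort(Z, kind="stable")
    Zs, ds, w = Z[order], delta[order], (k / k.sum())[order]
    cum_w = np.concatenate(([0.0], np.cumsum(w)))
    denom = 1.0 - cum_w[np.searchsorted(Zs, Zs, side="left")]
    f = np.ones_like(w)
    m = (ds == 1) & (w > 0)
    f[m] = np.clip(1.0 - w[m] / np.maximum(denom[m], w[m]), 0.0, 1.0)
    return Zs, np.cumprod(f)

def beran_survival(t_grid, x, X, Z, delta, h):
    """Beran estimate of S(t | x) on t_grid."""
    Zs, surv = _beran_steps(x, X, Z, delta, h)
    return np.concatenate(([1.0], surv))[np.searchsorted(Zs, t_grid, side="right")]

def _draw(rng, x, X, Z, delta, r):
    """Inverse-CDF draw from the Beran distribution estimate at covariate x."""
    Zs, surv = _beran_steps(x, X, Z, delta, r)
    idx = np.searchsorted(1.0 - surv, rng.uniform(), side="left")
    return Zs[min(idx, len(Zs) - 1)]    # leftover mass goes to the largest time

def bootstrap_resample(rng, X, Z, delta, r):
    """One resample (Z*, delta*); the covariates X are kept fixed (assumed)."""
    T = np.array([_draw(rng, xi, X, Z, delta, r) for xi in X])       # life times
    C = np.array([_draw(rng, xi, X, Z, 1 - delta, r) for xi in X])   # censoring times
    return np.minimum(T, C), (T <= C).astype(int)

def bootstrap_bandwidth(x, t_grid, X, Z, delta, r, h_grid, B=50, seed=0):
    """Grid minimiser of the Monte Carlo bootstrap MISE, and the MISE values."""
    rng = np.random.default_rng(seed)
    dt = t_grid[1] - t_grid[0]                       # equally spaced grid assumed
    pilot = beran_survival(t_grid, x, X, Z, delta, r)
    resamples = [bootstrap_resample(rng, X, Z, delta, r) for _ in range(B)]
    mise = np.array([
        np.mean([np.sum((beran_survival(t_grid, x, X, Zb, db, h) - pilot) ** 2) * dt
                 for Zb, db in resamples])
        for h in h_grid
    ])
    return h_grid[int(np.argmin(mise))], mise
```

The same `B` resamples are reused for every candidate bandwidth, so the comparison across `h_grid` is not disturbed by resampling noise. Very small bandwidths can make every Gaussian kernel value underflow to zero, so the grid should stay within the range of the covariate spacing.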
3. Bootstrap Confidence Intervals
Let be an appropriate smoothing parameter and fixed values , the bootstrap confidence interval for a confidence level of is given by
where is the Beran estimation with the pilot bandwidth r that is used in the bootstrap resampling, and and are the and percentiles of the resampling distribution of , being the Beran survival estimation of the bootstrap resample.
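For illustration, the sketch below builds a pointwise interval at one fixed pair of time and covariate values from the percentiles of the bootstrap deviations. It uses the generic basic bootstrap construction, which is an assumption: the centring and sign conventions of the interval in this work may differ. The function name and its arguments are hypothetical, and the Beran estimates are taken as precomputed inputs.

```python
import numpy as np

def basic_bootstrap_interval(estimate, pilot_estimate, boot_estimates, alpha=0.05):
    """Pointwise basic bootstrap interval for S(t | x) at one fixed (t, x).

    estimate        Beran estimate from the original sample (working bandwidth)
    pilot_estimate  Beran estimate with the pilot bandwidth r, which plays the
                    role of the true survival function in the bootstrap world
    boot_estimates  Beran estimates computed from the B bootstrap resamples
    """
    deviations = np.asarray(boot_estimates, dtype=float) - pilot_estimate
    q_low, q_high = np.quantile(deviations, [alpha / 2, 1 - alpha / 2])
    # Invert the bootstrap deviations around the estimate (assumed convention),
    # then truncate to [0, 1] because the target is a probability.
    lower = max(float(estimate - q_high), 0.0)
    upper = min(float(estimate - q_low), 1.0)
    return lower, upper
```

Applying the function at each point of a time grid gives pointwise bands of the kind plotted against the estimated survival curve.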
4. Simulation Study
A simulation study is carried out to analyse the behaviour of the bootstrap algorithm previously described. Several models with different conditional probabilities of censoring were considered. Figure 1 shows the bootstrap estimations of the conditional survival function in two of these scenarios: Model 1 considers the Weibull distribution for life and censoring times and Model 2 considers exponential life and censoring times. Both models have a conditional probability of censoring equal to 0.5. Figure 2 shows the bootstrap confidence intervals in one sample from Models 1 and 2.
Figure 1.
Theoretical survival function (solid line), Beran’s estimation with optimal bandwidth (dotted line) and Beran’s estimation with bootstrap bandwidth (dashed line) for Model 1 (left) and Model 2 (right).
Figure 2.
Theoretical survival function (solid line), Beran’s estimator with bootstrap bandwidth (dashed line) and the bootstrap confidence intervals (dotted line) for each t in a grid of size in Model 1 (left) and Model 2 (right).
5. Conclusions
The results of the simulations show that this bootstrap algorithm provides adequate smoothing parameters to estimate the survival function in this context. The bootstrap bandwidths obtained are close to the optimal ones, and the estimation errors obtained with both are similar. The bootstrap confidence intervals behave reasonably.
Future work will focus on developing a method for choosing the two-dimensional smoothing parameter involved in the doubly smoothed Beran estimator presented in [2]. The construction of confidence intervals for the conditional survival function based on the doubly smoothed Beran estimator will also be addressed.
Funding
This research has been supported by MINECO Grant MTM2017-82724-R, and by the Xunta de Galicia (Grupos de Referencia Competitiva ED431C-2016-015 and Centro Singular de Investigación de Galicia ED431G/01), all of them through the ERDF.
References
- Beran, R. Nonparametric Regression with Randomly Censored Survival Data; Technical Report; University of California: Los Angeles, CA, USA, 1981.
- Peláez Suárez, R.; Cao Abad, R.; Vilar Fernández, J.M. Nonparametric Estimation of the Conditional Survival Function with Double Smoothing. SORT 2021, 45.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

