Article

On the Admissibility of Simultaneous Bootstrap Confidence Intervals

1
Department of Mathematics and Statistics, York University, Toronto, ON M3J 1P3, Canada
2
Berlin Institute of Health, Institute of Biometry and Clinical Epidemiology Berlin and Charité-Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, 10117 Berlin, Germany
3
Berlin Institute of Health (BIH), Anna-Louisa-Karsch-Straße 2, 10178 Berlin, Germany
4
BNU-HKBU United International College, Zhuhai 519087, China
*
Author to whom correspondence should be addressed.
Symmetry 2021, 13(7), 1212; https://doi.org/10.3390/sym13071212
Submission received: 6 May 2021 / Revised: 23 June 2021 / Accepted: 29 June 2021 / Published: 6 July 2021
(This article belongs to the Section Life Sciences)

Abstract

Simultaneous confidence intervals are commonly used in joint inference of multiple parameters. When the underlying joint distribution of the estimates is unknown, nonparametric methods can be applied to provide distribution-free simultaneous confidence intervals. In this note, we propose new one-sided and two-sided nonparametric simultaneous confidence intervals based on the percentile bootstrap approach. The admissibility of the proposed intervals is established. The numerical results demonstrate that the proposed confidence intervals maintain the correct coverage probability for both normal and non-normal distributions. For smoothed bootstrap estimates, we extend Efron’s (2014) nonparametric delta method to construct nonparametric simultaneous confidence intervals. The methods are applied to construct simultaneous confidence intervals for LASSO regression estimates.

1. Introduction

In statistical sciences, the computation of simultaneous confidence intervals (SCIs) for parameters of interest plays a major role [1]. Well-known application areas include multiple testing problems, regression analysis, and multivariate inference. When the finite-sample distribution of the estimator (here, a vector) under the alternative is known, exact SCIs can be computed. However, that is often not the case, and when the underlying distribution of the estimator is difficult to obtain analytically, bootstrap methods (drawing with replacement from the data) can be used to approximate its sampling distribution (see, e.g., [2,3], among others). The authors of [4] provide an algorithm to construct SCIs for multivariate estimates, which counts the number of bootstrap samples that fall outside a confidence region. To achieve the pre-specified level of $(1-\alpha)\times 100\%$, one can try different values of the lower and upper bounds of the intervals until the target level is attained. In [5], a more efficient way of computing $(1-\alpha)\times 100\%$ upper SCIs was developed by sorting the bootstrap realizations of the multivariate estimates by the maximum rank across all the coordinates. Then, the $(1-\alpha)\times 100\%$ percentile of the maximum rank is determined. The limit of the confidence interval for each coordinate is the value of the estimate sharing the same rank as the $(1-\alpha)$ percentile of the maximum rank. By symmetry, the $(1-\alpha)\times 100\%$ lower SCIs can also be constructed.
In this note, we improve their method by sharpening the limits of the SCIs for each coordinate. The value of the estimate sharing the same rank as the $(1-\alpha)\times 100\%$ percentile of the maximum rank is indeed at least as high as the estimates in the selected $(1-\alpha)\times 100\%$ samples, but as a threshold for each individual coordinate it is too conservative. We can tighten the thresholds rather than using the same maximum rank cutoff for every coordinate (see details below). Furthermore, we show that the newly proposed SCIs are admissible in the sense that there exist no other bootstrap SCIs which have the same $(1-\alpha)\times 100\%$ coverage probability while being uniformly smaller than or equal to the proposed confidence intervals in all coordinates and strictly smaller in at least one coordinate. We apply the proposed nonparametric SCIs to LASSO penalized regression estimates to perform post model selection inference. We observe that the nonparametric method is able to preserve the asymmetry of the sampling distribution of the LASSO penalized regression estimates. This is advantageous compared to parametric SCIs, which are often constructed symmetrically around the point estimate and ignore the asymmetric nature of the penalized estimates.

2. Methods

In the following, let $Z$ denote a dataset and let $\theta = (\theta_1, \ldots, \theta_p)$ denote a vector of parameters of interest. Then, the SCIs $C_j = [a_j(Z), b_j(Z)]$, $j = 1, \ldots, p$, have a simultaneous coverage level of $1-\alpha$ if
$$P\left( \bigcap_{j=1}^{p} \left\{ a_j(Z) \le \theta_j \le b_j(Z) \right\} \right) \ge 1-\alpha.$$
Bootstrap-based algorithms for the computation of the thresholds $a_j(Z)$ and $b_j(Z)$ will now be discussed. First, we introduce the Mandel–Betensky (MB) intervals and our proposed modification (MMB), which is obtained by sharpening their individual thresholds. Later, we show that the MMB intervals are admissible at level $(1-\alpha)$.
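As a rough illustration of what simultaneous coverage means in the bootstrap setting, the following sketch counts the share of bootstrap estimate vectors that fall inside all $p$ intervals at once. It is only a bootstrap analogue of the probability statement above, and the function name `simultaneous_coverage` and the toy numbers are hypothetical choices made for this example.

```python
def simultaneous_coverage(boot, lower, upper):
    """Share of bootstrap vectors lying inside every interval simultaneously.

    boot  : list of B bootstrap estimate vectors, each of length p
    lower : list of p lower limits
    upper : list of p upper limits
    """
    inside = sum(
        all(lo <= x <= hi for x, lo, hi in zip(row, lower, upper))
        for row in boot
    )
    return inside / len(boot)


# Toy data (hypothetical): four bootstrap vectors with p = 2 coordinates.
toy = [[0, 0], [1, 5], [2, 1], [3, 3]]
joint = simultaneous_coverage(toy, lower=[0, 0], upper=[2, 4])
```

In this toy example each coordinate is covered in three of the four vectors, yet only two vectors are covered in both coordinates, which is why marginal limits generally have to be widened to reach a joint level.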

2.1. Mandel–Betensky Intervals and Modified (MMB) Upper Intervals

We first estimate the upper bounds (using resampling) and will focus later on two-sided SCIs. Since the computation of the upper MMB bounds uses the same initial steps as that of the MB intervals, we provide their computational details simultaneously:
  • Generate $B$ bootstrap samples $Z_b$, $b = 1, \ldots, B$, by sampling with replacement and obtain the bootstrap estimates $\tilde\theta_b = \{\tilde\theta_{b1}, \ldots, \tilde\theta_{bp}\}$.
  • For each coordinate $j$, $j = 1, \ldots, p$, order the bootstrap estimates as $\tilde\theta_{(1j)} \le \tilde\theta_{(2j)} \le \cdots \le \tilde\theta_{(Bj)}$. In the case of tied observations, we use random ranks. The rank of $\tilde\theta_{bj}$ is denoted as $r(b,j)$. For each sample, find the maximum rank $r(b) = \max_j r(b,j)$, which is the largest rank associated with the $b$th bootstrap sample.
  • Calculate the $(1-\alpha)$-percentile $r_{1-\alpha}$ of $r(b)$, $b = 1, 2, \ldots, B$.
  • Denote the collection of bootstrap samples with the maximum rank at or below $r_{1-\alpha}$ as $\Phi = \{b : r(b) \le r_{1-\alpha}\}$.
  • Construct the upper limits for each coordinate: for coordinate $j$, calculate $t_j = \max_{b \in \Phi} r(b,j)$. The upper limits of the SCIs are given by $\tilde\theta_{(t_1 1)}, \tilde\theta_{(t_2 2)}, \ldots, \tilde\theta_{(t_p p)}$.
In comparison, the MB method uses $t_j = r_{1-\alpha}$ for all $j$. It can be shown that $t_j = \max_{b \in \Phi} r(b,j) \le r_{1-\alpha}$. This is because for any $b \in \Phi$, $r(b,j) \le r(b) \le r_{1-\alpha}$. The proposed method uses individual rank thresholds instead of the same maximum rank threshold for all coordinates. By this means, the new method sharpens the upper bounds and leads to SCIs that are universally shorter than or equal to the MB intervals. We summarize the results in the following theorem.
Theorem 1.
Assume there are no ties for the $(1-\alpha)$-percentile of $r(b)$. The proposed modified MB upper intervals have $(1-\alpha)$ coverage and they are admissible at level $(1-\alpha)$ in the sense that there are no other simultaneous confidence intervals which have $(1-\alpha)$ coverage and whose lengths (in all the coordinates) are universally shorter than those of the modified MB intervals.
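The upper-interval steps can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation (their R code is linked in the Data Availability Statement). The function names and the percentile convention (the $k$th smallest maximum rank with $k = \lceil (1-\alpha)B \rceil$) are assumptions made for this example.

```python
import math
import random


def coordinate_ranks(boot, rng):
    """ranks[b][j] = rank (1..B) of boot[b][j] within coordinate j; ties broken at random."""
    B, p = len(boot), len(boot[0])
    ranks = [[0] * p for _ in range(B)]
    for j in range(p):
        order = sorted(range(B), key=lambda b: (boot[b][j], rng.random()))
        for pos, b in enumerate(order, start=1):
            ranks[b][j] = pos
    return ranks


def upper_limits(boot, alpha=0.05, seed=0):
    """Return (MMB upper limits, MB upper limits, indices of the retained set Phi)."""
    rng = random.Random(seed)
    B, p = len(boot), len(boot[0])
    ranks = coordinate_ranks(boot, rng)
    max_rank = [max(row) for row in ranks]          # r(b)
    # Assumed percentile convention: k-th smallest value of r(b);
    # the small offset guards against floating-point round-up.
    k = math.ceil((1 - alpha) * B - 1e-9)
    r_cut = sorted(max_rank)[k - 1]                 # r_{1-alpha}
    phi = [b for b in range(B) if max_rank[b] <= r_cut]
    ordered = [sorted(row[j] for row in boot) for j in range(p)]
    t = [max(ranks[b][j] for b in phi) for j in range(p)]   # t_j <= r_cut
    mmb = [ordered[j][t[j] - 1] for j in range(p)]          # individual thresholds
    mb = [ordered[j][r_cut - 1] for j in range(p)]          # common threshold
    return mmb, mb, phi


# Usage on simulated bootstrap estimates (hypothetical data, p = 3).
demo_rng = random.Random(1)
demo = [[demo_rng.gauss(0, 1) for _ in range(3)] for _ in range(2000)]
mmb_up, mb_up, kept = upper_limits(demo, alpha=0.05)
```

Because $t_j \le r_{1-\alpha}$, each entry of `mmb_up` is no larger than the matching entry of `mb_up`, while every sample in `kept` still lies below both sets of limits.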

2.2. Modified MB Two-Sided Simultaneous Confidence Intervals

  • Follow the first two steps of the algorithm for the computation of the MMB upper intervals.
  • Calculate the $(1-\alpha/2)$ percentile $r_{1-\alpha/2}$ of the maximum rank $r(b)$.
  • Denote the collection of bootstrap samples with the maximum sample rank equal to or below $r_{1-\alpha/2}$ as $\Phi$, i.e., $b \in \Phi$ if $r(b) \le r_{1-\alpha/2}$.
  • For each coordinate $j$, order the bootstrap estimates in set $\Phi$. The new rank of $\tilde\theta_{bj}$ within set $\Phi$ is denoted as $r(b,j)$. For each sample, find the minimum sample rank $r(b) = \min_j r(b,j)$, which is the smallest rank associated with the $b$th bootstrap sample in set $\Phi$.
  • Calculate the $\alpha/(2-\alpha)$ percentile $r_{\alpha/(2-\alpha)}$ of the minimum rank $r(b)$.
  • Denote the collection of bootstrap samples within $\Phi$ with the minimum sample rank at or above $r_{\alpha/(2-\alpha)}$ as $\Psi = \{b \in \Phi : r(b) \ge r_{\alpha/(2-\alpha)}\}$.
  • To construct the upper and lower limits for each coordinate, calculate $t_j = \max_{b \in \Psi} r(b,j)$ and $w_j = \min_{b \in \Psi} r(b,j)$. The upper limits of the simultaneous confidence intervals are $\tilde\theta_{(t_j j)}$ and the lower limits are $\tilde\theta_{(w_j j)}$, $j = 1, \ldots, p$, respectively.
Theorem 2.
Assume there are no ties for the $(1-\alpha/2)$-percentile of the maximum rank $r(b)$ and the $\alpha/(2-\alpha)$ percentile of the minimum rank $r(b)$. The proposed modified MB two-sided intervals have $(1-\alpha)$ coverage and they are admissible at level $(1-\alpha)$ in the sense that there are no other two-sided simultaneous confidence intervals which have $(1-\alpha)$ coverage and whose interval lengths (for all of the coordinates) are universally smaller than those of the MMB intervals.
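A minimal Python sketch of the two-sided construction follows. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the percentile conventions (the $k$th smallest maximum rank with $k = \lceil (1-\alpha/2)B \rceil$, then dropping the $\lfloor |\Phi|\,\alpha/(2-\alpha) \rfloor$ samples with the smallest minimum rank) are choices made here. Since $t_j$ and $w_j$ are the largest and smallest ranks within $\Psi$, the limits reduce to the coordinate-wise maximum and minimum over $\Psi$.

```python
import math
import random


def ranks_within(values, rng):
    """Ranks 1..m of a list of numbers, ties broken at random."""
    order = sorted(range(len(values)), key=lambda i: (values[i], rng.random()))
    ranks = [0] * len(values)
    for pos, i in enumerate(order, start=1):
        ranks[i] = pos
    return ranks


def mmb_two_sided(boot, alpha=0.05, seed=0):
    """Return (lower limits, upper limits, indices of the retained set Psi)."""
    rng = random.Random(seed)
    B, p = len(boot), len(boot[0])
    # Ranks over all B samples and the maximum rank of each sample.
    cols = [ranks_within([row[j] for row in boot], rng) for j in range(p)]
    max_rank = [max(cols[j][b] for j in range(p)) for b in range(B)]
    # Phi: samples whose maximum rank is at or below the (1 - alpha/2) percentile.
    k_hi = math.ceil((1 - alpha / 2) * B - 1e-9)
    r_hi = sorted(max_rank)[k_hi - 1]
    phi = [b for b in range(B) if max_rank[b] <= r_hi]
    # Re-rank within Phi and take the minimum rank of each retained sample.
    cols_phi = [ranks_within([boot[b][j] for b in phi], rng) for j in range(p)]
    min_rank = [min(cols_phi[j][i] for j in range(p)) for i in range(len(phi))]
    # Psi: drop the samples whose minimum rank lies below the alpha/(2 - alpha) percentile.
    k_lo = math.floor(alpha / (2 - alpha) * len(phi) + 1e-9)
    r_lo = sorted(min_rank)[k_lo]
    psi = [phi[i] for i in range(len(phi)) if min_rank[i] >= r_lo]
    # The order statistics at ranks t_j and w_j are the extremes over Psi.
    upper = [max(boot[b][j] for b in psi) for j in range(p)]
    lower = [min(boot[b][j] for b in psi) for j in range(p)]
    return lower, upper, psi
```

With no ties at the two cutoffs, `psi` holds $B(1-\alpha)$ samples; with ties this sketch keeps all tied samples, so the set can be slightly larger.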
Proofs of Theorems 1 and 2 are provided in Appendix A. The theoretical results above assume that there are no ties for the $(1-\alpha/2)$ percentile of the maximum rank $r(b)$ and the $\alpha/(2-\alpha)$ percentile of the minimum rank $r(b)$. In practice, however, ties can often occur. Increasing the number of bootstrap samples can decrease the chance of having ties. In the presence of ties in the ranking $r(b)$, there could be multiple samples, denoted as set $S_1$, with the same maximum rank $r_{1-\alpha/2}$. We can arbitrarily split them between set $\Phi$ and set $\Phi^c$ while maintaining that there are $B(1-\alpha/2)$ samples in $\Phi$. A similar strategy can be applied when there are multiple samples, denoted as set $S_2$, with tied minimum sample rank at the $\alpha/(2-\alpha)$-percentile of $r(b)$. The resulting SCIs are not unique, but the admissibility result still holds true if the competing set of intervals has the same coverage rate and the covered set of bootstrap samples $\Omega$ satisfies the condition $\Omega \cap \Psi^c \not\subset (S_1 \cup S_2)$. The condition implies that $\Omega$ contains at least one sample $b$ such that $b$ is not in $\Psi$ and also not in $S_1 \cup S_2$. That means either the maximum rank satisfies $r(b) > r_{1-\alpha/2}$ or the minimum rank satisfies $r(b) < r_{\alpha/(2-\alpha)}$. Then, the competing set cannot have intervals universally shorter than the proposed MMB intervals.
We conduct simulation studies to verify the validity of the proposed method. We generate data from the regression model $y_i = x_i^T \beta + \epsilon_i$, where $\beta = (\beta_1, \ldots, \beta_p)^T$, and $\epsilon_i$ follows either a normal $N(0,1)$ or a $\mathrm{lognormal}(0,1)$ distribution, $i = 1, \ldots, n$. We construct the nonparametric $100(1-\alpha)\%$ SCIs for the unknown regression parameters with varying sample sizes and regression parameters. We compare the performance of the percentile-based methods with the simultaneous confidence intervals based on multivariate normal (MVN) critical values. We also include the Bonferroni method, which uses the critical value of the univariate normal distribution with the significance level adjusted to $\alpha/p$. The covariance structure of the multivariate normal is estimated by the sample covariance of the bootstrap estimates. For each simulation setting, we simulate 1000 datasets. For each dataset, we generate 10,000 bootstrap samples. The significance level $\alpha$ is set to $0.05$. Table 1 shows that the proposed MMB intervals maintain satisfactory simultaneous coverage rates close to the nominal level for both normal and lognormal errors and different values of $p$. The MMB method's coverage probability is very close to that of the MB method, while the interval lengths are uniformly shorter across all the simulation settings. The Bonferroni method is always at least as conservative as the MVN method. When $p$ increases, for lognormal errors, the MVN method and the Bonferroni method are conservative, with a familywise type I error rate equal to $0.018$, which is much less than the nominal level of $0.05$.
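The Bonferroni comparator can be sketched as follows. The text only states that the univariate normal critical value is used at the adjusted level $\alpha/p$; the two-sided form $z_{1-\alpha/(2p)}$, the centring at the point estimates, and the function names below are assumptions made for this illustration.

```python
from statistics import NormalDist, stdev


def bonferroni_z(alpha, p):
    """Two-sided normal critical value at per-coordinate level alpha / p."""
    return NormalDist().inv_cdf(1 - alpha / (2 * p))


def bonferroni_intervals(point, boot, alpha=0.05):
    """Intervals point_j +/- z * sd_j, with sd_j taken from the bootstrap estimates.

    point : list of p point estimates (assumed centre of each interval)
    boot  : list of B bootstrap estimate vectors, each of length p
    """
    p = len(point)
    z = bonferroni_z(alpha, p)
    intervals = []
    for j in range(p):
        sd = stdev(row[j] for row in boot)   # bootstrap standard deviation
        intervals.append((point[j] - z * sd, point[j] + z * sd))
    return intervals
```

The critical value grows with $p$, so the intervals widen as more parameters are covered jointly, which is one way to read the conservative coverage reported for larger $p$.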

3. Simultaneous Confidence Interval for Penalized Regression Estimates

In this application, we construct simultaneous confidence intervals using penalized estimates obtained from LASSO generalized linear regression [6]. In order to construct post model selection confidence intervals, two strategies were proposed by [7]: bootstrap confidence intervals and smoothed bootstrap confidence intervals.
Here, different bootstrap samples may lead to different identified submodels $s$. This implies that for some of the bootstrap samples, $\theta_j$ could be thresholded to zero, whereas in other bootstrap samples, the resulting $\theta_j$ could be nonzero. The author of [7] proposed using the smoothed bootstrap estimate $\hat\theta = \sum_{b=1}^{B} \tilde\theta_b / B$ ([8,9,10]) to correct the erratic jumpiness of the estimates obtained from different bootstrap samples. He also proposed a method for estimating the asymptotic standard error and for constructing a confidence interval for a single parameter using the nonparametric delta method. It is, however, desirable to construct SCIs using the vector of smoothed bootstrap estimates. First, we compute the (asymptotic) covariance matrix of two smoothed bootstrap estimates. This result is established for the ideal bootstrap, where $B$ equals the number $n^n$ of all possible bootstrap samples. Given the original dataset $Z = (z_1, \ldots, z_n)$, let the $b$th bootstrap sample be $(z^*_{b1}, \ldots, z^*_{bn})$ and the two estimates be $\tilde\theta_{bl}$ and $\tilde\theta_{bm}$. Let $Z^*_{bi} = \#\{k \in \{1, \ldots, n\} : z^*_{bk} = z_i\}$, which is the number of data points in the bootstrap sample equal to $z_i$. Then, the nonparametric delta-method-based estimator of the asymptotic covariance for the ideal smoothed bootstrap estimates $\hat\theta_l = \sum_{b=1}^{B} \tilde\theta_{bl}/B$ and $\hat\theta_m = \sum_{b=1}^{B} \tilde\theta_{bm}/B$ is
$$\tilde\sigma_{lm} = \sum_{i=1}^{n} \mathrm{Cov}^*(Z^*_{bi}, \tilde\theta_{bl})\, \mathrm{Cov}^*(Z^*_{bi}, \tilde\theta_{bm}),$$
where $\mathrm{Cov}^*(Z^*_{bi}, \tilde\theta_{bl})$ and $\mathrm{Cov}^*(Z^*_{bi}, \tilde\theta_{bm})$ denote the bootstrap covariance between $Z^*_{bi}$ and $\tilde\theta_{bl}$, and the bootstrap covariance between $Z^*_{bi}$ and $\tilde\theta_{bm}$, respectively.
The ideal smoothed bootstrap estimates are functionals of the data $(z_1, \ldots, z_n)$ and symmetric in their arguments. They can be considered as functionals of the empirical distribution $\hat F$, which assigns probability $1/n$ to each of the observed data points. It has been shown by [11] that the asymptotic covariance of two functionals of the empirical distribution can be obtained as $\frac{1}{n^2} \sum_{i=1}^{n} \hat D_{il} \hat D_{im}$, where
$$\hat D_{il} = \lim_{\epsilon \to 0} \frac{\hat\theta_l(p_0 + \epsilon(\delta_i - p_0)) - \hat\theta_l(p_0)}{\epsilon}, \qquad \hat D_{im} = \lim_{\epsilon \to 0} \frac{\hat\theta_m(p_0 + \epsilon(\delta_i - p_0)) - \hat\theta_m(p_0)}{\epsilon},$$
$\delta_i$ denotes a vector of all zeros except a one for the $i$th coordinate, and $p_0 = (1/n, \ldots, 1/n)$. From Efron (2014) [7], we obtain $\hat D_{il} = n \sum_{b=1}^{B} (Z^*_{bi} - 1)\tilde\theta_{bl}/B = n\,\mathrm{Cov}^*(Z^*_{bi}, \tilde\theta_{bl})$ and $\hat D_{im} = n \sum_{b=1}^{B} (Z^*_{bi} - 1)\tilde\theta_{bm}/B = n\,\mathrm{Cov}^*(Z^*_{bi}, \tilde\theta_{bm})$. In practice, the estimator of the covariance of the non-ideal bootstrap estimates can take the form $\hat\sigma_{lm} = \sum_{i=1}^{n} \widehat{\mathrm{Cov}}_{il}\, \widehat{\mathrm{Cov}}_{im}$, with
$$\widehat{\mathrm{Cov}}_{il} = \frac{1}{B} \sum_{b=1}^{B} (Z^*_{bi} - Z^*_{\cdot i})(\tilde\theta_{bl} - \hat\theta_l), \qquad \widehat{\mathrm{Cov}}_{im} = \frac{1}{B} \sum_{b=1}^{B} (Z^*_{bi} - Z^*_{\cdot i})(\tilde\theta_{bm} - \hat\theta_m),$$
where $Z^*_{\cdot i}$ denotes the average of $Z^*_{bi}$ over the $B$ bootstrap samples. Then, the joint limiting distribution of $(\hat\theta_1, \ldots, \hat\theta_p)$ is multivariate normal with a covariance matrix that can be estimated as $\hat\Sigma = (\hat\sigma_{lm})$. The numerical evaluation of the multivariate normal distribution is provided by the R package mvtnorm, see [12]. Let $c_{1-\alpha}$ denote the critical value of the multivariate normal distribution with covariance matrix $\Sigma$ (estimated by $\hat\Sigma$); then the SCIs for all the parameters are given by
$$CI_l = \left[ \hat\theta_l - c_{1-\alpha} \sqrt{\hat\sigma_{ll}},\; \hat\theta_l + c_{1-\alpha} \sqrt{\hat\sigma_{ll}} \right], \quad l = 1, \ldots, p.$$
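The finite-$B$ covariance estimator $\hat\sigma_{lm}$ can be sketched directly from the resampling counts $Z^*_{bi}$ and the bootstrap estimates $\tilde\theta_{bl}$. The sketch below is illustrative only: the function names are hypothetical, and the two toy estimators (sample mean and sample median) stand in for the LASSO coefficients of the application.

```python
import random


def resample_counts(n, rng):
    """Draw one bootstrap index set of size n and the counts Z*_bi for i = 0..n-1."""
    idx = [rng.randrange(n) for _ in range(n)]
    counts = [0] * n
    for i in idx:
        counts[i] += 1
    return idx, counts


def delta_covariance(counts, estimates):
    """Nonparametric delta-method covariance matrix (sigma_hat_lm).

    counts[b][i]    : Z*_bi, times data point i appears in bootstrap sample b
    estimates[b][l] : theta_tilde_bl, l-th estimate from bootstrap sample b
    """
    B, n, p = len(counts), len(counts[0]), len(estimates[0])
    z_bar = [sum(c[i] for c in counts) / B for i in range(n)]      # Z*_.i
    t_bar = [sum(e[l] for e in estimates) / B for l in range(p)]   # smoothed estimates
    cov = [
        [
            sum((counts[b][i] - z_bar[i]) * (estimates[b][l] - t_bar[l]) for b in range(B)) / B
            for l in range(p)
        ]
        for i in range(n)
    ]
    return [
        [sum(cov[i][l] * cov[i][m] for i in range(n)) for m in range(p)]
        for l in range(p)
    ]


# Usage with two toy estimators on simulated data (hypothetical example).
rng = random.Random(0)
data = [rng.gauss(0, 1) for _ in range(20)]
all_counts, all_estimates = [], []
for _ in range(500):
    idx, c = resample_counts(len(data), rng)
    draw = sorted(data[i] for i in idx)
    all_counts.append(c)
    all_estimates.append([sum(draw) / len(draw), draw[len(draw) // 2]])
sigma_hat = delta_covariance(all_counts, all_estimates)
```

The diagonal entries `sigma_hat[l][l]` are the variance estimates whose square roots scale the critical value $c_{1-\alpha}$ in the interval formula above.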
We perform generalized linear regression with the LASSO penalty on the heart data [13]. The dataset contains the coronary heart disease status of 462 patients, and the covariates include sbp (systolic blood pressure), tobacco (cumulative tobacco), ldl (low density lipoprotein cholesterol), adiposity, famhist (family history of heart disease), typea (type-A behavior), obesity, alcohol (current alcohol consumption), and age (age at onset). The optimal penalty size is determined by the Bayesian information criterion. Then, we draw 1000 bootstrap samples and construct the SCIs for the unknown regression parameters. We compare the results of the MMB and MB methods with the smoothed bootstrap confidence intervals using the critical values from the multivariate normal distribution and the Bonferroni critical values. The nonparametric delta method is used to derive the limiting covariance structure. Table 2 reports the lower and upper bounds obtained by all four methods. The MVN and Bonferroni methods generate confidence intervals that contain zero for every covariate. In comparison, most of the confidence intervals generated by the MMB and MB methods have an upper or lower bound equal to zero. Figure 1 depicts the confidence intervals generated by all four methods. Each confidence interval is scaled by the standard deviation of the point estimate so that all intervals can be presented on the same scale. The MMB and MB methods produce mostly asymmetric confidence intervals which are either non-negative or non-positive. This asymmetric structure not only reflects the sparse nature of the penalized estimates, but also provides evidence on the direction of the parameter. Compared to the MB method, the proposed MMB method produces shorter intervals.

4. Discussion

We propose nonparametric simultaneous confidence intervals to provide distribution-free joint inferences for multiple parameters. Compared to existing MB intervals, the proposed modified MB intervals have uniformly sharper upper and lower bounds. When used in post model selection inference, the MMB intervals are advantageous as they preserve the asymmetry of the sampling distributions of the penalized estimates.
When the number of parameters increases with the sample size, it is difficult to develop simultaneous confidence intervals with the desired coverage probability for all the parameters. There are a number of challenges for parametric and nonparametric methods: (1) obtaining the empirical multivariate distribution of $\hat\theta_1, \ldots, \hat\theta_p$ requires a much larger number of bootstrap samples; (2) the correction for multiplicity is more severe; (3) the quantile of the high-dimensional multivariate distribution is harder to compute. To overcome these issues, we may adopt the approach of false-discovery-rate adjusted multiple confidence intervals [14]. This warrants future investigation to develop nonparametric simultaneous intervals for high-dimensional parameters.

Author Contributions

Conceptualization, X.G., F.K. and Q.L.; methodology, X.G., F.K. and Q.L.; formal analysis, X.G.; writing, X.G., F.K. and Q.L.; funding acquisition, X.G. All authors have read and agreed to the published version of the manuscript.

Funding

Gao’s research was funded by the Natural Sciences and Engineering Research Council of Canada.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The R codes of all the numerical studies are available at https://github.com/xingaostat/Nonparametric-simultaneous-intervals, accessed on 2 July 2021.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
| Abbreviation | Meaning |
|---|---|
| SCI | simultaneous confidence interval |
| LASSO | least absolute shrinkage and selection operator |
| MB | Mandel–Betensky interval |
| MMB | modified Mandel–Betensky interval |
| MVN | multivariate normal |

Appendix A

Proof of Theorem 1.
There are $B(1-\alpha)$ samples in $\Phi$; thus, the simultaneous confidence intervals have $(1-\alpha)$ coverage. To prove the admissibility, we consider a competing set of simultaneous confidence intervals which also have $(1-\alpha)$ coverage, with $\breve\theta_j$, $1 \le j \le p$, denoting the upper limits of the intervals. Due to the same rate of coverage, the competing set contains the same number $B(1-\alpha)$ of bootstrap samples, denoted as the set $\Psi$. First, we consider the situation that the competitor set $\Psi$ is the same as the set $\Phi$. Assume there exists a coordinate $j$ such that the threshold for the competitor satisfies $\breve\theta_j < \tilde\theta_{(t_j j)}$; then the competitor will not cover the bootstrap sample $b^*$, where $b^* = \operatorname{argmax}_{b \in \Phi} r(b,j)$. This is a contradiction, and it implies that for all coordinates the competing simultaneous confidence intervals have to be at least as large as the MMB intervals. Next, consider the situation that there is at least one bootstrap sample $b' \in \Phi^c$ which is contained in the competitor set $\Psi$. We can find a coordinate $j'$ at which sample $b'$ attains its maximum rank, $j' = \operatorname{argmax}_j r(b',j)$. When $j'$ is not unique, we can arbitrarily choose one among the candidates. We have $\tilde\theta_{b'j'} \le \breve\theta_{j'}$. For coordinate $j'$, because $b' \in \Phi^c$, it follows that $r(b',j') > r_{1-\alpha} \ge t_{j'}$; therefore, $\breve\theta_{j'} \ge \tilde\theta_{b'j'} > \tilde\theta_{(t_{j'} j')}$. Then, the interval length of the competitor is greater than that of the MMB for the coordinate $j'$. This establishes the proof of the theorem. □
Proof of Theorem 2.
There are $B(1-\alpha/2)$ samples in set $\Phi$, and there are $B(1-\alpha/2)\{1-\alpha/(2-\alpha)\} = B(1-\alpha)$ samples in set $\Psi$. So the simultaneous confidence intervals have the targeted $(1-\alpha)$ coverage. To prove the admissibility, we consider a competing set of intervals which also have $(1-\alpha)$ coverage, with the upper limits $\breve\theta_j^U$ and the lower limits $\breve\theta_j^L$, $j = 1, \ldots, p$, and which contain the same number $B(1-\alpha)$ of bootstrap samples, denoted by the set $\Omega$. First, consider the situation that the set $\Omega$ is equal to the set $\Psi$. Assume that there is a coordinate $j$ such that the two-sided interval of the competing set is shorter, which implies either $\breve\theta_j^U < \tilde\theta_{(t_j j)}$ or $\breve\theta_j^L > \tilde\theta_{(w_j j)}$. For the former, the competing set will not cover the bootstrap sample $b^*$ with $b^* = \operatorname{argmax}_{b \in \Psi} r(b,j)$. For the latter, the competing set will not contain the bootstrap sample $b^{**}$ with $b^{**} = \operatorname{argmin}_{b \in \Psi} r(b,j)$. This contradicts the fact that the set $\Omega$ is equal to the set $\Psi$. Thus, for each coordinate, the competing interval length has to be at least as large as that of the MMB interval. If the set $\Omega$ is not equal to the set $\Psi$, then there is at least one bootstrap sample $b' \in \Phi^c$ or $b' \in \Phi \cap \Psi^c$ which is contained in the set $\Omega$. For $b' \in \Phi^c$, consider the coordinate $j'$ such that, for sample $b'$, $j' = \operatorname{argmax}_j r(b',j)$. If $j'$ is not unique, choose one among the candidates. As $\breve\theta_{j'}^U$ is the competing interval's upper limit for the coordinate $j'$, we have $\tilde\theta_{b'j'} \le \breve\theta_{j'}^U$. For coordinate $j'$, because $b' \in \Phi^c$, it follows that $r(b',j') > r_{1-\alpha/2}$. This leads to $\tilde\theta_{b'j'} > \tilde\theta_{(t_{j'} j')}$. Thus, we have $\breve\theta_{j'}^U > \tilde\theta_{(t_{j'} j')}$. Then the interval length of the competing interval is greater than that of the MMB for the coordinate $j'$. For $b' \in \Phi \cap \Psi^c$, consider any coordinate $j'$ at which sample $b'$ attains the lowest rank among all coordinates, $j' = \operatorname{argmin}_j r(b',j)$. As $\breve\theta_{j'}^L$ is the lower limit, we have $\tilde\theta_{b'j'} \ge \breve\theta_{j'}^L$. For coordinate $j'$, because $b' \in \Phi \cap \Psi^c$, it follows that $r(b',j') < r_{\alpha/(2-\alpha)}$; therefore, $\tilde\theta_{b'j'} < \tilde\theta_{(w_{j'} j')}$ and $\breve\theta_{j'}^L < \tilde\theta_{(w_{j'} j')}$. This establishes the proof of the theorem. □
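The sample count used at the start of the proof of Theorem 2 can be checked in one line. The derivation below only expands the product stated there; the numerical illustration with $B = 10{,}000$ and $\alpha = 0.05$ is a worked example added for concreteness.

```latex
\begin{align*}
B\left(1-\frac{\alpha}{2}\right)\left\{1-\frac{\alpha}{2-\alpha}\right\}
  &= B\cdot\frac{2-\alpha}{2}\cdot\frac{2-2\alpha}{2-\alpha} \\
  &= B\,(1-\alpha).
\end{align*}
% Example: B = 10000, alpha = 0.05.
% |Phi| = 10000 * 0.975 = 9750; the removed share is 0.05 / 1.95 = 1/39,
% i.e. 9750 / 39 = 250 samples, leaving |Psi| = 9500 = 10000 * 0.95.
```

This is why the second cutoff is taken at the $\alpha/(2-\alpha)$ percentile rather than at $\alpha/2$: the lower trimming is applied to the already reduced set $\Phi$, not to all $B$ samples.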

References

  1. Draper, N.R.; Guttman, I. Confidence intervals versus regions. Statistician 1995, 44, 399–403.
  2. Efron, B. Bootstrap methods: Another look at the Jackknife. Ann. Stat. 1979, 7, 1–26.
  3. Hall, P. The Bootstrap and Edgeworth Expansion; Springer: Berlin/Heidelberg, Germany, 1992.
  4. Davison, A.C.; Hinkley, D.V. Bootstrap Methods and Their Application (No. 1); Cambridge University Press: London, UK, 1997.
  5. Mandel, M.; Betensky, R. Simultaneous confidence intervals based on the percentile bootstrap approach. Comput. Stat. Data Anal. 2008, 52, 2158–2165.
  6. Tibshirani, R. Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B Methodol. 1996, 58, 267–288.
  7. Efron, B. Estimation and Accuracy after Model Selection. J. Am. Stat. Assoc. 2014, 109, 991–1007.
  8. Efron, B.; Tibshirani, R. Using Specially Designed Exponential Families for Density Estimation. Ann. Stat. 1996, 24, 2431–2461.
  9. Breiman, L. Bagging Predictors. Mach. Learn. 1996, 24, 123–140.
  10. Buja, A.; Stuetzle, W. Observations on Bagging. Stat. Sin. 2006, 16, 323–351.
  11. Jaeckel, L. The Infinitesimal Jackknife; Bell Laboratory: Holmdel, NJ, USA, 1972.
  12. Hothorn, T.; Bretz, F.; Westfall, P.; Heiberger, R.M.; Schuetzenmeister, A.; Scheibe, S.; Hothorn, M.T. Package ‘multcomp’: Simultaneous Inference in General Parametric Models. In Project for Statistical Computing; CRC Press: Vienna, Austria, 2016.
  13. Hastie, T.; Tibshirani, R.; Friedman, J. Elements of Statistical Learning; Data Mining, Inference and Prediction; Springer: New York, NY, USA, 2001.
  14. Benjamini, Y.; Yekutieli, D. False discovery rate–adjusted multiple confidence intervals for selected parameters. J. Am. Stat. Assoc. 2005, 100, 71–81.
Figure 1. The plots of the SCI generated by four methods on heart data.
Table 1. Simultaneous coverage probability and size of the confidence interval.
| p | n | Error | MMB Cov-Prob | MMB Size | MB Cov-Prob | MB Size | MVN Cov-Prob | MVN Size | Bonferroni Cov-Prob | Bonferroni Size |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | 200 | $N(0,1)$ | 0.9490 | 0.5893 | 0.9490 | 0.5907 | 0.9500 | 0.5880 | 0.9500 | 0.5900 |
| 3 | 200 | $\log N(0,1)$ | 0.9250 | 1.5566 | 0.9260 | 1.5635 | 0.9470 | 1.5513 | 0.9500 | 1.5625 |
| 15 | 200 | $N(0,1)$ | 0.9410 | 1.6827 | 0.9440 | 1.6915 | 0.9410 | 1.6756 | 0.9430 | 1.6827 |
| 15 | 200 | $\log N(0,1)$ | 0.9340 | 4.4606 | 0.9380 | 4.5004 | 0.9660 | 4.4374 | 0.9700 | 4.4821 |
| 30 | 200 | $N(0,1)$ | 0.9440 | 2.7455 | 0.9450 | 2.7757 | 0.9380 | 2.7245 | 0.9420 | 2.7359 |
| 30 | 200 | $\log N(0,1)$ | 0.9570 | 7.2960 | 0.9600 | 7.4004 | 0.9820 | 7.2451 | 0.9820 | 7.3127 |

Results are obtained from 1000 simulated datasets; the number of bootstrap samples is 10,000. “Cov-Prob” denotes the coverage probability; the size of the intervals is calculated as $\|\hat\theta^U - \hat\theta^L\|_2^2$. The simultaneous confidence level is 0.95.
Table 2. Comparison of four methods for the construction of simultaneous confidence intervals of LASSO regression estimates for heart data.
| Variable | MMB Upper | MMB Lower | MB Upper | MB Lower | MVN NP Delta Upper | MVN NP Delta Lower | Bonferroni NP Delta Upper | Bonferroni NP Delta Lower |
|---|---|---|---|---|---|---|---|---|
| intercept | 0.8804 | −6.9971 | 0.8999 | −7.4115 | −0.3777 | −6.0964 | −0.3514 | −6.1227 |
| sbp | 0.0156 | −0.0243 | 0.0156 | −0.0292 | 0.0092 | −0.0097 | 0.0093 | −0.0098 |
| tobacco | 0.2374 | 0.0000 | 0.2374 | 0.0000 | 0.1921 | −0.0480 | 0.1932 | −0.0491 |
| ldl | 0.4703 | 0.0000 | 0.4703 | 0.0000 | 0.3693 | −0.1772 | 0.3718 | −0.1797 |
| adiposity | 0.1543 | 0.0000 | 0.1846 | −0.0408 | 0.0595 | −0.0430 | 0.0599 | −0.0434 |
| famhist | 1.3744 | 0.0000 | 1.3994 | 0.0000 | 1.0511 | −0.5562 | 1.0585 | −0.5636 |
| typea | 0.0929 | 0.0000 | 0.0929 | −0.0111 | 0.0453 | −0.0278 | 0.0456 | −0.0281 |
| obesity | 0.0000 | −0.2857 | 0.0000 | −0.3246 | 0.0873 | −0.1324 | 0.0884 | −0.1334 |
| alcohol | 0.0272 | −0.0073 | 0.0272 | −0.0172 | 0.0154 | −0.0103 | 0.0155 | −0.0104 |
| age | 0.0920 | 0.0000 | 0.0920 | 0.0000 | 0.0853 | −0.0092 | 0.0857 | −0.0096 |

The number of bootstrap samples is 1000. Upper and Lower stand for the upper and lower limits of the SCIs. The simultaneous confidence level is 0.95. Both the MVN and Bonferroni methods are implemented using the nonparametric delta method to obtain the covariance estimates for the smoothed bootstrap estimates.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Gao, X.; Konietschke, F.; Li, Q. On the Admissibility of Simultaneous Bootstrap Confidence Intervals. Symmetry 2021, 13, 1212. https://doi.org/10.3390/sym13071212