Resampling under Complex Sampling Designs: Roots, Development and the Way Forward
Abstract
1. Introduction
1.1. Generalities
1.2. Superpopulation Model and Sampling Design: Basic Aspects
- (a) is a probability measure on for every in .
 - (b) is a Borel-measurable function of for every .
 
1.3. Descriptive and Analytic Inference
2. From Efron’s iid Bootstrap to Pseudo-Population Based Resampling
2.1. Efron’s Bootstrap: A Few Basic Aspects
- Conditionally on , the r.v.s in are i.i.d. with common d.f. , the finite population d.f.
 - Unconditionally, the r.v.s in are i.i.d. with common d.f. .
 
- E1. Conditionally on , converges weakly to a Brownian bridge W on the scale of as N, n increase. The same result also holds unconditionally.
 - E2. weakly converges to a Brownian bridge W on the scale of as N increases.
 - E3. and are asymptotically independent.
 - E4. If , with , then converges weakly to , as n, N increase.
 - E5. Conditionally on , , converges weakly to a Brownian bridge on the scale of as N, n increase.
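As a concrete point of reference for the i.i.d. case discussed in this subsection, the following is a minimal sketch of Efron's bootstrap: resamples of the same size as the original sample are drawn with replacement, and the statistic is recomputed on each one. The function name `efron_bootstrap`, the simulated Gaussian data, and the number of replicates are assumptions made for illustration only; they do not come from the paper.

```python
import random
import statistics


def efron_bootstrap(sample, statistic, n_replicates=1000, seed=0):
    """Return the statistic computed on n_replicates resamples,
    each drawn with replacement and of the same size as `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_replicates):
        # Sampling with replacement mimics i.i.d. draws from the
        # empirical distribution function of the observed sample.
        resample = rng.choices(sample, k=n)
        replicates.append(statistic(resample))
    return replicates


# Illustrative data (an assumption): 200 i.i.d. Gaussian observations.
data_rng = random.Random(42)
sample = [data_rng.gauss(10.0, 2.0) for _ in range(200)]

boot_means = efron_bootstrap(sample, statistics.fmean, n_replicates=2000, seed=1)

# Bootstrap standard error of the sample mean.
boot_se = statistics.stdev(boot_means)

# Plug-in standard error: the exact bootstrap value that boot_se approximates.
plugin_se = statistics.pstdev(sample) / len(sample) ** 0.5
```

For the sample mean, the Monte Carlo value `boot_se` should be close to `plugin_se`; the resampling machinery becomes useful for statistics with no closed-form standard error.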
 
3. Failure of Efron’s Bootstrap in the Non-i.i.d. Case
- S1. Conditionally on , converges weakly to , where W is a Brownian bridge on the scale of as N, n increase. The same result also holds unconditionally.
 - S2. weakly converges to a Brownian bridge W on the scale of as N increases.
 - S3. and are asymptotically independent.
 - S4. converges weakly to W, a Brownian bridge on the scale of , as n, N increase.
 - S5. Conditionally on and , converges weakly to a Brownian bridge on the scale of as N, n increase.
 
- This is the closest to Efron’s original idea of replicating, at a sample level, the sampling process from the population.
 - This is the only resampling procedure justified by asymptotic arguments similar to those of [17] for Efron’s bootstrap.
 
4. Accounting for the Sampling Design in Resampling: The Pseudo-Population Approach
4.1. Pseudo-Populations: Definition
4.2. Resampling from Pseudo-Populations
4.3. Resampling Based on Pseudo-Populations: Basic Results for Descriptive Inference
- Under appropriate regularity conditions, the conditional distribution of , given and , converges weakly, as both n and N tend to infinity, to a Gaussian process with null mean function and covariance kernel . This result, furthermore, holds for a set of sequences of s and s having -probability 1.
 - If the functional is Hadamard-differentiable at with Hadamard derivative , then, again conditionally on and , tends in distribution to , which is a Normal variate with zero expectation and variance .
 
- .
 - Under appropriate regularity conditions, the conditional distribution of , given , , , converges weakly, as both n and N tend to infinity, to a Gaussian process with a null mean function and covariance kernel . This result, furthermore, holds for a set of sequences of s and s having -probability 1 and in probability w.r.t. the sampling design.
 - .
 - If the functional is continuously Hadamard-differentiable at , with Hadamard derivative , then, again conditionally on , , , tends in distribution to , which is a Normal variate with zero expectation and variance .
 
- Conditional approach. A single pseudo-population is constructed, and M independent bootstrap samples are drawn. In this way, M independent replications are generated.
 - Unconditional approach. M independent pseudo-populations are constructed, and from each of them, a single bootstrap sample is drawn. In this case, M independent replications are generated.
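The two approaches above can be sketched in code under strong simplifying assumptions. The sketch below assumes simple random sampling without replacement from a population of known size N, and builds the pseudo-population in the spirit of Booth, Butler and Hall: each sampled unit is copied floor(N/n) times, and the remaining units are drawn at random without replacement from the sample. That construction, the function names, and the simulated data are illustrative choices, not the general procedure of the paper, which covers arbitrary designs.

```python
import random
import statistics


def build_pseudo_population(sample, N, rng):
    """Pseudo-population of size N: floor(N/n) copies of every sampled
    unit, completed by N mod n units drawn without replacement."""
    n = len(sample)
    k, r = divmod(N, n)
    pseudo = list(sample) * k
    pseudo.extend(rng.sample(sample, r))  # random completion step
    return pseudo


def conditional_replicates(sample, N, statistic, M, seed=0):
    """One pseudo-population, M bootstrap samples drawn from it."""
    rng = random.Random(seed)
    n = len(sample)
    pseudo = build_pseudo_population(sample, N, rng)
    return [statistic(rng.sample(pseudo, n)) for _ in range(M)]


def unconditional_replicates(sample, N, statistic, M, seed=0):
    """M pseudo-populations, one bootstrap sample drawn from each."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(M):
        pseudo = build_pseudo_population(sample, N, rng)
        replicates.append(statistic(rng.sample(pseudo, n)))
    return replicates


# Illustrative data (an assumption): n = 50 units observed out of N = 230.
data_rng = random.Random(7)
sample = [data_rng.gauss(100.0, 15.0) for _ in range(50)]
N, M = 230, 1000

cond = conditional_replicates(sample, N, statistics.fmean, M, seed=1)
uncond = unconditional_replicates(sample, N, statistics.fmean, M, seed=2)

se_cond = statistics.stdev(cond)
se_uncond = statistics.stdev(uncond)

# Textbook standard error of the mean under SRSWOR, for comparison.
f = len(sample) / N
se_srs = ((1 - f) * statistics.variance(sample) / len(sample)) ** 0.5
```

In this equal-probability setting both approaches should give standard errors close to `se_srs`, including the finite population correction that Efron's with-replacement bootstrap would miss. The conditional approach builds the pseudo-population only once, which is cheaper when N is large.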
 
4.4. Resampling Based on Pseudo-Populations: Basic Results for Analytic Inference
- The generation of s from the superpopulation model.
 - The selection of the sample from the finite population.
 
- 1.
 - Under appropriate regularity conditions, the (unconditional) distribution of converges weakly, as both n and N tend to infinity, to a Gaussian process with a null mean function and covariance kernel .
 - .
 - Under appropriate regularity conditions, and conditionally on , , , the distribution of converges weakly, as both n and N tend to infinity, to the same Gaussian process with a null mean function and covariance kernel .
 - 2.
 - The limiting process can be written as , where is the limiting Gaussian process obtained for descriptive inference, is an independent Gaussian process (essentially, a Brownian bridge on the scale of ), and f is the limiting value of the sampling fraction.
 - 3.
 - If the functional is Hadamard-differentiable at , with Hadamard derivative , then tends in distribution to , which is a Normal variate with zero expectation and variance .
 - .
 - If the functional is continuously Hadamard-differentiable at , with Hadamard derivative , then, conditionally on , , and , tends in distribution to the same Normal variate with zero expectation and variance .
 
5. Computational Issues
6. Open Problems and Final Considerations
Author Contributions
Funding
Institutional Review Board Statement
Conflicts of Interest
References
- Efron, B. Bootstrap methods: Another look at the jackknife. Ann. Stat. 1979, 7, 1–26.
 - Mashreghi, Z.; Haziza, D.; Léger, C. A survey of bootstrap methods in finite population sampling. Stat. Surv. 2016, 10, 1–52.
 - McCarthy, P.J.; Snowden, C.B. The bootstrap and finite population sampling. In Vital and Health Statistics; Public Health Service Publication, U.S. Government Printing: Washington, DC, USA, 1985; Volume 95, pp. 1–23.
 - Rao, J.N.K.; Wu, C.F.J. Resampling inference with complex survey data. J. Am. Stat. Assoc. 1988, 83, 231–241.
 - Sitter, R.R. A resampling procedure for complex data. J. Am. Stat. Assoc. 1992, 87, 755–765.
 - Chatterjee, A. Asymptotic properties of sample quantiles from a finite population. Ann. Inst. Stat. Math. 2011, 63, 157–179.
 - Rao, J.N.K.; Wu, C.F.J.; Yue, K. Some recent work on resampling methods for complex surveys. Surv. Methodol. 1992, 18, 209–217.
 - Conti, P.L.; Marella, D. Inference for quantiles of a finite population: Asymptotic vs. resampling results. Scand. J. Stat. 2015, 42, 545–561.
 - Beaumont, J.F.; Patak, Z. On the Generalized Bootstrap for Sample Surveys with Special Attention to Poisson Sampling. Int. Stat. Rev. 2012, 80, 127–148.
 - Antal, E.; Tillé, Y. A direct bootstrap method for complex sampling designs from a finite population. J. Am. Stat. Assoc. 2011, 106, 534–543.
 - Gross, S.T. Median estimation in sample surveys. In Proceedings of the Section on Survey Research Methods, American Statistical Association, Houston, TX, USA, 11–14 August 1980; pp. 181–184.
 - Chao, M.T.; Lo, S.H. A bootstrap method for finite population. Sankhya 1985, 47, 399–405.
 - Booth, J.G.; Butler, R.W.; Hall, P. Bootstrap methods for finite populations. J. Am. Stat. Assoc. 1994, 89, 1282–1289.
 - Holmberg, A. A bootstrap approach to probability proportional-to-size sampling. In Proceedings of the ASA Section on Survey Research Methods, Alexandria, VA, USA, 1998; pp. 378–383.
 - Chauvet, G. Méthodes de Bootstrap en Population Finie. Ph.D. Dissertation, Laboratoire de Statistique d’enquêtes, CREST-ENSAI, Université de Rennes, Rennes, France, 2007.
 - Conti, P.L. On the estimation of the distribution function of a finite population under high entropy sampling designs, with applications. Sankhya B 2014, 76, 234–259.
 - Bickel, P.J.; Freedman, D. Some asymptotic theory for the bootstrap. Ann. Stat. 1981, 9, 1196–1216.
 - van der Vaart, A. Asymptotic Statistics; Cambridge University Press: Cambridge, UK, 1998.
 - Pfeffermann, D.; Sverchkov, M. Parametric and semi-parametric estimation of regression models fitted to survey data. Sankhya B 1999, 61, 166–186.
 - Conti, P.L.; Marella, D.; Mecatti, F.; Andreis, F. A unified principled framework for resampling based on pseudo-populations: Asymptotic theory. Bernoulli 2020, 26, 1044–1069.
 - Pfeffermann, D.; Sverchkov, M. Prediction of finite population totals based on the sample distribution. Surv. Methodol. 2004, 30, 79–92.
 - Boistard, H.; Lopuhaä, H.P.; Ruiz-Gazen, A. Functional central limit theorems for single-stage sampling design. Ann. Stat. 2017, 45, 1728–1758.
 - Bertail, P.; Chautru, E.; Clémençon, S. Empirical Processes in Survey Sampling with (Conditional) Poisson Designs. Scand. J. Stat. 2017, 44, 97–111.
 - Han, Q.; Wellner, J.A. Complex sampling designs: Uniform limit theorems and applications. Ann. Stat. 2021, 49, 459–485.
 - Di Iorio, A. Analytic Inference in Finite Population Framework Via Resampling. Unpublished Ph.D. Thesis, Department of Statistical Science, Sapienza Università di Roma, Roma, Italy, 2016.
 - Ranalli, M.G.; Mecatti, F. Comparing Recent Approaches for Bootstrapping Sample Survey Data: A First Step Towards a Unified Approach. In Proceedings of the ASA Section on Survey Research Methods, Alexandria, VA, USA, 2012; pp. 4088–4099.
 - Quatember, A. Pseudo-Populations—A Basic Concept in Statistical Surveys; Springer: New York, NY, USA, 2015.
 - Quatember, A. The Finite Population Bootstrap—From the Maximum Likelihood to the Horvitz-Thompson Approach. Austrian J. Stat. 2014, 43, 93–102.
 - Conti, P.L.; Mecatti, F.; Nicolussi, F. Efficient unequal probability resampling from finite populations. Comput. Stat. Data Anal. 2022, 167, 107366.
 - Thompson, S.K. Sampling, 3rd ed.; Wiley: New York, NY, USA, 2012.
 - Thompson, S.K. Adaptive and Network Sampling for Inference and Interventions in Changing Populations. J. Surv. Stat. Methodol. 2017, 5, 1–21.
 
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Conti, P.L.; Mecatti, F. Resampling under Complex Sampling Designs: Roots, Development and the Way Forward. Stats 2022, 5, 258-269. https://doi.org/10.3390/stats5010016
