Improved Median Estimation in Stratified Surveys via Nontraditional Auxiliary Measures
Abstract
1. Introduction
Advantages of the Proposed Estimators
- Robust to Skewed and Outlier-Susceptible Data: The proposed estimators employ robust statistical measures such as the trimean, decile mean, and quartile-based indicators, which enhance their performance in datasets affected by skewness or extreme values.
- Improved Estimation Efficiency: Through transformation techniques and optimal parameter adjustment, the new estimators consistently yield lower mean squared errors (MSEs) and higher percent relative efficiencies (PREs) when compared to traditional approaches.
- Use of Nontraditional Auxiliary Information: Unlike classical estimators that rely solely on mean-based auxiliary variables, the proposed class effectively utilizes additional distributional characteristics such as interquartile range, midrange, quartile average, and quartile deviation.
- Flexible and Customizable Structure: The general form of the estimators allows for adjustment of parameters based on sampling design and the nature of auxiliary data, making them suitable for a variety of practical scenarios.
- Validated Through Simulation and Real Data: The estimators have been thoroughly evaluated using simulated populations from multiple distributions as well as real datasets, confirming their accuracy and reliability in practical applications.
- Practical Relevance Across Fields: These estimators are especially useful in domains such as economics, environmental studies, and healthcare, where data often exhibit asymmetry and require robust, median-based analysis.
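As a rough illustration of the auxiliary measures named in the list above, the following sketch computes them for a single auxiliary variable. The function name `auxiliary_measures` and the example data are hypothetical, and the formulas are the common textbook definitions (for instance, trimean = (Q1 + 2 Q2 + Q3)/4), which may differ in detail from the definitions used in the paper.

```python
import numpy as np

def auxiliary_measures(x):
    """Return common quantile-based summaries of an auxiliary variable.

    Hypothetical helper: textbook definitions, not necessarily the paper's.
    """
    x = np.asarray(x, dtype=float)
    q1, q2, q3 = np.percentile(x, [25, 50, 75])
    deciles = np.percentile(x, np.arange(10, 100, 10))  # D1, ..., D9
    return {
        "median": q2,
        "interquartile_range": q3 - q1,
        "midrange": (x.min() + x.max()) / 2,
        "quartile_average": (q1 + q3) / 2,
        "quartile_deviation": (q3 - q1) / 2,
        "trimean": (q1 + 2 * q2 + q3) / 4,
        "decile_mean": deciles.mean(),
    }

# Illustrative data: the integers 1..101, then the same data with one extreme value.
clean = np.arange(1, 102)
contaminated = clean.copy()
contaminated[-1] = 10_000

m_clean = auxiliary_measures(clean)
m_cont = auxiliary_measures(contaminated)
print(m_clean)
print(m_cont)
```

In this toy example the quartile-based summaries are unaffected by the single extreme value, whereas the midrange, which depends on the minimum and maximum, is not.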
2. Concepts and Existing Estimators
3. Suggested Class of Estimators
3.1. Properties of the Suggested Class of Estimators
3.2. Implementation Algorithm
4. Mathematical Comparison
- (i):
- (ii):
- (iii):
- (iv):
- (v):
- (vi):
- (vii):
- (viii):
5. Results and Discussion
5.1. Simulation Study
- Simulated dataset 1: A highly skewed distribution for X is described by the Exponential distribution, , with .
- Simulated dataset 2: A slightly skewed distribution for X is described by the Log-Normal distribution, given by , with .
- Simulated dataset 3: A uniform distribution is used as the baseline for X, represented as , with .
- Simulated dataset 4: The distribution of X with moderate skewness and spread is represented by a Gamma distribution, specified as , with a correlation coefficient .
- Simulated dataset 5: A distribution with heavy tails for X is represented by the Cauchy distribution, described as , with .
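One possible way to generate such populations is sketched below. The distribution parameters and correlation levels did not survive in the text above, so every numeric value here (scale, shape, `rho`, population size) is a placeholder, and the linear construction of Y from X is only an assumed device for inducing correlation, not necessarily the authors' procedure.

```python
import numpy as np

def make_population(dist, size=1500, rho=0.8, seed=0):
    """Draw X from a named distribution and build a correlated Y.

    All parameter values are hypothetical placeholders.
    """
    rng = np.random.default_rng(seed)
    draw = {
        "exponential": lambda: rng.exponential(scale=1.0, size=size),
        "lognormal": lambda: rng.lognormal(mean=0.0, sigma=0.5, size=size),
        "uniform": lambda: rng.uniform(0.0, 1.0, size=size),
        "gamma": lambda: rng.gamma(shape=2.0, scale=1.0, size=size),
        "cauchy": lambda: rng.standard_cauchy(size=size),
    }
    x = draw[dist]()
    noise = rng.normal(size=size)
    # Y = rho * X + sqrt(1 - rho^2) * sd(X) * noise gives corr(X, Y) close to rho
    # when X has finite variance (this does not hold in theory for the Cauchy case).
    y = rho * x + np.sqrt(1.0 - rho**2) * np.std(x) * noise
    return x, y

x, y = make_population("exponential", rho=0.8)
print(round(float(np.corrcoef(x, y)[0, 1]), 3))
```

For heavy-tailed X such as the Cauchy case, the Pearson correlation is not well defined in the population, so a rank-based construction may be more appropriate there.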
1. Using the specified distributions, generate a population of values for the variables X and Y.
2. Divide the population into three equal strata of 500 units each (1500 units in total); this balanced stratification allows a clear assessment of estimator performance under moderate stratification.
3. To assess the accuracy of the estimators, calculate key statistical summaries, including the maximum and minimum values. In addition, determine the most efficient outcomes for the available estimators.
4. Draw a sample from each stratum by simple random sampling without replacement (SRSWOR) from the corresponding stratum population.
5. Calculate the simulated percent relative efficiency (SPRE) of every estimator discussed in this study across the different sample sizes, so that each estimator's SPRE is assessed over a collection of samples.
6. After 35,000 iterations of steps 3 and 4, apply the designated formulas to obtain the final simulated mean squared errors (SMSEs) and simulated percent relative efficiencies (SPREs).
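The simulation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the stratified median is taken as the weighted sum of stratum medians, the only comparison estimator is a simple ratio-type median estimator, strata are formed by a random partition, and the demo population, the per-stratum sample size `n_h`, and the function names are all hypothetical. SMSE is computed as the average squared deviation from the target, and SPRE as 100 times the baseline SMSE divided by the estimator's SMSE.

```python
import numpy as np

def stratum_medians(values, strata):
    """Median of `values` within each index set in `strata`."""
    return np.array([np.median(values[idx]) for idx in strata])

def simulate_spre(x, y, n_strata=3, n_h=50, reps=1000, seed=1):
    rng = np.random.default_rng(seed)
    N = len(y)
    # Assumed stratification: random partition into (nearly) equal strata.
    strata = np.array_split(rng.permutation(N), n_strata)
    weights = np.array([len(idx) / N for idx in strata])
    # Assumed targets: weighted sums of stratum population medians.
    Mx = weights @ stratum_medians(x, strata)
    My = weights @ stratum_medians(y, strata)

    est = {"baseline": np.empty(reps), "ratio": np.empty(reps)}
    for r in range(reps):
        # SRSWOR of n_h units within each stratum.
        sample = [rng.choice(idx, size=n_h, replace=False) for idx in strata]
        mx_hat = weights @ stratum_medians(x, sample)
        my_hat = weights @ stratum_medians(y, sample)
        est["baseline"][r] = my_hat
        est["ratio"][r] = my_hat * Mx / mx_hat  # ratio-type adjustment

    smse = {k: float(np.mean((v - My) ** 2)) for k, v in est.items()}
    spre = {k: 100.0 * (smse["baseline"] / m) for k, m in smse.items()}
    return smse, spre

# Hypothetical demo population: 1500 units, i.e. three strata of 500.
rng = np.random.default_rng(0)
x_pop = rng.gamma(shape=2.0, scale=1.0, size=1500)
y_pop = 2.0 * x_pop + rng.normal(scale=0.5, size=1500)
smse, spre = simulate_spre(x_pop, y_pop, reps=1000)
print(smse)
print(spre)
```

The text specifies 35,000 iterations; the smaller default for `reps` here only keeps the sketch quick to run.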
5.2. Numerical Analysis Using Real Datasets
- :
- This variable denotes the overall amount of fish collected from every source in the year 1995, encompassing both commercial operations and recreational fishing efforts;
- This variable indicates the count of fish captured in 1994 by people engaged in recreational saltwater fishing, with all commercial catch activities excluded;
- :
- This variable represents the total number of fish caught in 1995, reflecting the complete harvest for that year;
- This variable reflects the quantity of fish caught in 1994 by recreational saltwater anglers, highlighting the contribution of non-commercial fishing to the overall catch.
- Employment level by division and district in 2010;
- Number of registered factories by division and district in 2010;
- Employment level by division and district in 2012;
- Number of registered factories by division and district in 2012.
- Total student enrollment in the 2012–2013 academic year;
- Number of government primary schools for both genders in the 2012–2013 academic year;
- Number of students enrolled in the 2012–2013 academic year;
- Number of government middle schools for both genders in the 2012–2013 academic year.
5.3. Discussion
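- The results from both simulated and real datasets, presented in Table 2 and Table 3, indicate that the proposed estimators generally attain higher PRE values than the established ones discussed in Section 2. The one exception in the reported values is the first simulated population (Pop-1), where the seventh existing estimator (PRE 241.26) exceeds three of the proposed estimators (221.92, 215.65, and 225.99). Overall, this highlights the improved efficiency of the suggested estimators relative to existing techniques.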
- Additionally, the upward trend in the graphical representations shown in Figure 1 and Figure 2, based on various distributions and actual datasets, further confirms that the new estimators consistently achieve higher PRE values than the conventional estimators. The inverse correlation between the PRE values of the new and traditional estimators strengthens the idea that the newly introduced estimators offer a more efficient estimation method.
- Although different distributions and correlation levels were used, Figure 1 shows a similar trend across all plots. This consistency highlights the robustness of the proposed estimators, particularly , which are designed using transformation techniques and nontraditional auxiliary measures. The drop in PRE from estimator 7 to 8 is due to the higher variability in ’s nonlinear form. Each plot includes 16 estimators: the first 8 are existing methods ( to ) and the last 8 are proposed estimators ( to ). Among them, , , and achieve the highest PRE values.
- The datasets used in Figure 2 come from different sources, and they share structural similarities such as skewness and moderate to strong correlation between study and auxiliary variables. This explains the similar trends in PRE values across plots, reflecting the robustness of the proposed estimators under such conditions. The sharp drop in PRE from estimator 4 to 5 is due to the structural difference in their exponential forms. Estimator reduces the effect of median differences, while may increase variability, especially when and differ, making it less stable. Among all existing and proposed estimators, consistently achieves the highest PRE values and is recommended for practical use.
- The boxplots in Figure 3 and Figure 4 provide a concise summary of the distribution of percent relative efficiency (PRE) values across both simulated and real data settings. Notably, the proposed estimators (particularly and demonstrate consistently higher median PRE values and narrower interquartile ranges, indicating not only superior efficiency but also greater stability across different population scenarios. In contrast, the existing estimators show lower medians and greater variability, highlighting the robustness and reliability of the proposed methods under varied sampling conditions.
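The comparisons discussed above can be checked directly against the reported simulation values. The sketch below transcribes the PRE table for Pop-1 to Pop-5 and assumes, following the discussion, that rows 1 to 8 are the existing estimators and rows 9 to 16 the proposed ones; the variable names are illustrative only.

```python
import numpy as np

# PRE values from the simulation table (rows: estimators 1-16; columns: Pop-1..Pop-5).
pre = np.array([
    [100.00, 100.00, 100.00, 100.00, 100.00],
    [119.270, 117.015, 116.24, 115.70, 114.54],
    [133.18, 127.12, 126.11, 123.08, 125.10],
    [137.22, 133.18, 131.16, 128.13, 133.15],
    [111.56, 112.97, 111.99, 107.92, 113.49],
    [193.77, 168.53, 158.43, 167.52, 164.49],
    [241.26, 191.76, 166.51, 190.75, 181.65],
    [141.35, 140.25, 148.33, 135.20, 139.24],
    [271.31, 235.50, 198.94, 223.29, 234.22],
    [290.30, 256.94, 232.86, 244.06, 257.28],
    [221.92, 207.97, 182.80, 203.97, 216.08],
    [286.36, 252.49, 212.00, 233.25, 246.40],
    [215.65, 203.47, 177.38, 197.26, 210.59],
    [276.46, 244.38, 210.70, 228.70, 241.24],
    [225.99, 210.60, 201.30, 202.15, 214.81],
    [245.75, 230.47, 209.48, 206.88, 219.59],
])
existing, proposed = pre[:8], pre[8:]  # assumed split: first 8 existing, last 8 proposed

def median_iqr(values):
    """Median and interquartile range, as summarized by a boxplot."""
    q1, q2, q3 = np.percentile(values, [25, 50, 75])
    return q2, q3 - q1

# Does every proposed estimator beat every existing one, population by population?
dominates = proposed.min(axis=0) > existing.max(axis=0)
# 1-based row index of the estimator with the highest PRE in each population.
best = pre.argmax(axis=0) + 1

print("existing median, IQR:", median_iqr(existing))
print("proposed median, IQR:", median_iqr(proposed))
print("strict dominance per population:", dominates)
print("best estimator per population:", best)
```

For these transcribed values, estimator 10 has the highest PRE in all five populations, and the strict dominance check holds in four of the five; in Pop-1, estimator 7 (241.26) exceeds estimators 11, 13, and 15.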
6. Conclusions
Limitations and Considerations
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Algorithm 1. Computation of the Proposed Estimator
Different Classes of | ||||
---|---|---|---|---|
1 | ||||
1 | 0 | |||
0 | 1 | |||
0 | ||||
1 | 1 | |||
1 | ||||
0 |
Estimator | Pop-1 | Pop-2 | Pop-3 | Pop-4 | Pop-5 |
---|---|---|---|---|---|
1 | 100 | 100 | 100 | 100 | 100 |
2 | 119.270 | 117.015 | 116.24 | 115.70 | 114.54 |
3 | 133.18 | 127.12 | 126.11 | 123.08 | 125.10 |
4 | 137.22 | 133.18 | 131.16 | 128.13 | 133.15 |
5 | 111.56 | 112.97 | 111.99 | 107.92 | 113.49 |
6 | 193.77 | 168.53 | 158.43 | 167.52 | 164.49 |
7 | 241.26 | 191.76 | 166.51 | 190.75 | 181.65 |
8 | 141.35 | 140.25 | 148.33 | 135.20 | 139.24 |
9 | 271.31 | 235.50 | 198.94 | 223.29 | 234.22 |
10 | 290.30 | 256.94 | 232.86 | 244.06 | 257.28 |
11 | 221.92 | 207.97 | 182.80 | 203.97 | 216.08 |
12 | 286.36 | 252.49 | 212.00 | 233.25 | 246.40 |
13 | 215.65 | 203.47 | 177.38 | 197.26 | 210.59 |
14 | 276.46 | 244.38 | 210.70 | 228.70 | 241.24 |
15 | 225.99 | 210.60 | 201.30 | 202.15 | 214.81 |
16 | 245.75 | 230.47 | 209.48 | 206.88 | 219.59 |
Estimator | Dataset 1 | Dataset 2 | Dataset 3 |
---|---|---|---|
1 | 100 | 100 | 100 |
2 | 104.48 | 103.49 | 102.31 |
3 | 106.46 | 224.199 | 105.40 |
4 | 113.59 | 220.09 | 105.77 |
5 | 43.62 | 42.76 | 45.31 |
6 | 119.70 | 232.48 | 106.50 |
7 | 121.86 | 233.11 | 106.61 |
8 | 117.35 | 201.57 | 107.62 |
9 | 148.19 | 251.67 | 184.34 |
10 | 182.32 | 274.98 | 207.57 |
11 | 132.82 | 237.03 | 166.16 |
12 | 162.21 | 246.33 | 196.40 |
13 | 127.77 | 239.19 | 160.10 |
14 | 160.10 | 245.82 | 191.41 |
15 | 151.01 | 241.02 | 166.93 |
16 | 159.09 | 243.62 | 171.21 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Alghamdi, A.S.; Almulhim, F.A. Improved Median Estimation in Stratified Surveys via Nontraditional Auxiliary Measures. Symmetry 2025, 17, 1136. https://doi.org/10.3390/sym17071136