Article

Improved Median Estimation in Stratified Surveys via Nontraditional Auxiliary Measures

by Abdulaziz S. Alghamdi 1 and Fatimah A. Almulhim 2,*
1 Department of Mathematics, College of Science & Arts, King Abdulaziz University, P.O. Box 344, Rabigh 21911, Saudi Arabia
2 Department of Mathematical Sciences, College of Science, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
* Author to whom correspondence should be addressed.
Symmetry 2025, 17(7), 1136; https://doi.org/10.3390/sym17071136
Submission received: 3 June 2025 / Revised: 10 July 2025 / Accepted: 11 July 2025 / Published: 15 July 2025
(This article belongs to the Section Mathematics)

Abstract

This research focuses on estimating the population median within a stratified random sampling framework by using robust statistical measures with transformation-based methodologies. An efficient estimator aims to minimize both the bias and the variance, thereby reducing the overall mean squared error (MSE) and leading to more reliable outcomes. We introduce an improved class of estimators that utilizes transformation techniques to address data variability and enhance estimation accuracy. To evaluate their performance, we derive expressions for the bias and MSE up to the first-order approximation for both existing and newly developed estimators, establishing theoretical conditions for their effectiveness. Additionally, the proposed estimators are compared with traditional methods using simulated populations generated from different probability distributions and actual datasets. The results indicate that the newly introduced estimators improve precision and efficiency in median estimation. When assessed against conventional estimators, the new estimators perform better in terms of the percent relative efficiency criterion.

1. Introduction

Research on estimating the finite population median has been relatively limited compared to the extensive work on parameters such as the mean, variance, proportion, and total. The median is often a more reliable measure of central tendency than the mean, especially in highly skewed distributions, such as those of income, expenditures, taxation, and production. Effective methods for median estimation in finite populations remain understudied, highlighting the need for further research on estimation techniques that utilize auxiliary variables. Further information regarding the use of unknown population parameters can be found in [1,2,3,4,5].
The use of auxiliary information to estimate the population median has been greatly enhanced by early foundational research [6,7,8,9], which laid the groundwork for subsequent studies in this field. Various estimators for estimating the finite population median have been developed over time, utilizing different sampling methods [9]. One notable contribution came from [10], who introduced innovative approaches to improve regression and ratio estimators for median estimation. In another advancement, Ref. [11] proposed a generalized set of estimators that incorporate two auxiliary variables within the same framework, while Ref. [12] explored double sampling techniques to enhance estimation accuracy. Additionally, Ref. [13] defined a minimum unbiased estimator by including the known median of an auxiliary variable. Further progress was made by [14], who examined median estimation in two-phase sampling with the use of two auxiliary variables. To improve median estimation under stratified random sampling and simple random sampling, Refs. [15,16] proposed enhanced estimators. More recently, Refs. [17,18,19] developed new estimators incorporating auxiliary data for various population parameters under different sampling methods. In response to growing interest in improving median estimation, numerous developments have been made in recent years. For an in-depth understanding of this topic, readers are encouraged to consult [20,21,22,23,24,25,26,27,28], and the references within these works.
Estimators often become less accurate or yield misleading results when faced with extreme values. Traditional estimators, including ratio, product, regression, and exponential methods for estimating the finite population median, depend extensively on standard auxiliary variable measures, which limits their effectiveness. This dependence can be particularly problematic in datasets affected by outliers. An efficient estimator should keep both the bias and the mean squared error small, leading to more reliable outcomes. This paper introduces an advanced class of stratified estimators that apply a transformation technique. By linearly combining two robust measures, the trimean and decile mean, with four non-traditional measures of the auxiliary variable (the interquartile range, midrange, quartile average, and quartile deviation), these estimators aim to estimate the finite population median under stratified random sampling. This transformation improves the efficiency of the estimators and enhances their ability to handle data variability. The proposed estimators are particularly suited for skewed distributions and datasets with outliers, making them more adaptable than many traditional methods. Their flexibility makes them especially valuable in fields like environmental research, healthcare, and income analysis.

Advantages of the Proposed Estimators

The proposed class of estimators addresses several limitations observed in traditional median estimation methods, particularly under stratified sampling. Below are the key benefits that highlight their theoretical strength and practical utility:
  • Robust to Skewed and Outlier-Susceptible Data: The proposed estimators employ robust statistical measures such as the trimean, decile mean, and quartile-based indicators, which enhance their performance in datasets affected by skewness or extreme values.
  • Improved Estimation Efficiency: Through transformation techniques and optimal parameter adjustment, the new estimators consistently yield lower mean squared errors (MSEs) and higher percent relative efficiencies (PREs) when compared to traditional approaches.
  • Use of Nontraditional Auxiliary Information: Unlike classical estimators that rely solely on mean-based auxiliary variables, the proposed class effectively utilizes additional distributional characteristics such as interquartile range, midrange, quartile average, and quartile deviation.
  • Flexible and Customizable Structure: The general form of the estimators allows for adjustment of parameters based on sampling design and the nature of auxiliary data, making them suitable for a variety of practical scenarios.
  • Validated Through Simulation and Real Data: The estimators have been thoroughly evaluated using simulated populations from multiple distributions as well as real datasets, confirming their accuracy and reliability in practical applications.
  • Practical Relevance Across Fields: These estimators are especially useful in domains such as economics, environmental studies, and healthcare, where data often exhibit asymmetry and require robust, median-based analysis.
This paper is structured into multiple sections to ensure a logical and organized presentation. The fundamental terminology and theoretical background relevant to this work are introduced in Section 2, which also includes a review of previously developed estimators. Section 3 introduces a new family of estimators and explores their formulation in detail. A comprehensive analytical comparison between the existing and proposed estimators is presented in Section 4. Section 5 outlines the simulation procedure used to generate various population scenarios from selected distributions, aimed at testing and supporting the theoretical insights. This section also features numerical results to illustrate practical applications. The final section, Section 6, summarizes the study’s contributions and offers suggestions for future research directions.

2. Concepts and Existing Estimators

A finite population of size $N$ is divided into $L$ distinct strata. This population is represented by the vector $\gamma = (\gamma_1, \gamma_2, \ldots, \gamma_N)$. Each stratum $h$ contains $N_h$ units, and the total number of units across all strata is $N$, so that $\sum_{h=1}^{L} N_h = N$. In every stratum, the variable of interest is $Y$, and $X$ is an auxiliary variable. For the $i$th unit in the $h$th stratum, the values are recorded as $y_{hi}$ and $x_{hi}$. A sample of size $n_h$ is selected without replacement from each stratum, so that the total sample size satisfies $\sum_{h=1}^{L} n_h = n$. The true median values of $Y$ and $X$ in the $h$th stratum are denoted by $M_{yh}$ and $M_{xh}$, and the corresponding sample medians by $\hat{M}_{yh}$ and $\hat{M}_{xh}$. Let $C_{M_{yh}}$ and $C_{M_{xh}}$ represent the population coefficients of variation of the study variable $Y$ and the auxiliary variable $X$, respectively. The probability density functions evaluated at the population medians are $f_{yh}(M_{yh})$ and $f_{xh}(M_{xh})$. The correlation between the medians of the study variable and the auxiliary variable within the $h$th stratum is denoted by $\rho_{yxh}$. It is expressed as:
$$ \rho_{yxh} = \rho(M_{yh}, M_{xh}) = 4P_{11}(y_h, x_h) - 1, $$
where $P_{11}$ represents the joint probability given by:
$$ P_{11} = P(y_h \le M_{yh} \cap x_h \le M_{xh}). $$
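As an illustration of the relation $\rho = 4P_{11} - 1$, the following minimal Python sketch estimates $P_{11}$ from paired data by counting the units that fall at or below both medians. The function name `median_correlation` is hypothetical, and using sample medians in place of the population medians is an assumption made only for this example.

```python
from statistics import median

def median_correlation(y, x):
    """Estimate rho = 4 * P11 - 1, where P11 is the share of units
    with y <= median(y) and x <= median(x) at the same time."""
    if len(y) != len(x) or not y:
        raise ValueError("y and x must be non-empty and of equal length")
    my, mx = median(y), median(x)
    p11 = sum(1 for yi, xi in zip(y, x) if yi <= my and xi <= mx) / len(y)
    return 4.0 * p11 - 1.0
```

With perfectly concordant data, half of the units lie below both medians, so $P_{11} = 0.5$ and the measure equals 1; with perfectly discordant data $P_{11} = 0$ and the measure equals $-1$.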
To study the mathematical behavior of various estimators, such as their bias, mean squared error, and the condition for achieving the lowest MSE, specific relative error terms are applied. Let
$$ e_{0h} = \frac{\hat{M}_{yh} - M_{yh}}{M_{yh}} $$
and
$$ e_{1h} = \frac{\hat{M}_{xh} - M_{xh}}{M_{xh}}, $$
such that $E(e_{ih}) = 0$ for $i = 0, 1$.
Furthermore,
$$ E(e_{0h}^2) = \delta_h C_{M_{yh}}^2, $$
$$ E(e_{1h}^2) = \delta_h C_{M_{xh}}^2 $$
and
$$ E(e_{0h} e_{1h}) = \delta_h C_{M_{yxh}} = \delta_h \rho_{yxh} C_{M_{yh}} C_{M_{xh}}, $$
where
$$ C_{M_{yh}} = \frac{1}{M_{yh} f_{yh}(M_{yh})}, $$
$$ C_{M_{xh}} = \frac{1}{M_{xh} f_{xh}(M_{xh})} $$
are the population coefficients of variation of the variables $Y$ and $X$ in the $h$th stratum, and
$$ \delta_h = \frac{1}{4}\left(\frac{1}{n_h} - \frac{1}{N_h}\right) $$
is the finite population correction factor.
This section evaluates the bias and mean squared error of commonly used stratified estimators for estimating the median of a population. It also examines how these estimators perform in comparison to a newly proposed class, with the goal of identifying possible improvements in accuracy and efficiency.
The traditional unbiased estimator for the stratified population median is given below:
$$ \hat{M}_1 = \sum_{h=1}^{L} W_h \hat{M}_{yh}. \tag{1} $$
The variance associated with $\hat{M}_1$ is defined as:
$$ V(\hat{M}_1) = \sum_{h=1}^{L} \delta_h W_h^2 M_{yh}^2 C_{M_{yh}}^2, \tag{2} $$
where
$$ W_h = \frac{N_h}{N} $$
represents the known stratum weight for the $h$th stratum.
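The stratified median estimator and its first-order variance can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the dictionary layout of the stratum parameters are assumptions introduced here, and the density value at the median is taken as known.

```python
from statistics import median

def delta_h(n_h, N_h):
    """delta_h = (1/4) * (1/n_h - 1/N_h)."""
    return 0.25 * (1.0 / n_h - 1.0 / N_h)

def stratified_median(samples, stratum_sizes):
    """Weighted sum of stratum sample medians, with W_h = N_h / N."""
    N = sum(stratum_sizes)
    return sum(N_h / N * median(s) for s, N_h in zip(samples, stratum_sizes))

def variance_m1(strata):
    """First-order variance of the stratified median estimator.
    Each stratum is a dict (hypothetical layout) with keys:
    N (stratum size), n (sample size), My (median), fy (density at My)."""
    N = sum(s["N"] for s in strata)
    total = 0.0
    for s in strata:
        W = s["N"] / N
        C = 1.0 / (s["My"] * s["fy"])  # coefficient of variation C_{M_yh}
        total += delta_h(s["n"], s["N"]) * W**2 * s["My"]**2 * C**2
    return total
```

Because $M_{yh}^2 C_{M_{yh}}^2 = 1 / f_{yh}(M_{yh})^2$, the variance depends on the study variable only through the density at the median.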
The estimator $\hat{M}_2$ is a stratified extension of the ratio-based method introduced by [9], which is defined as follows:
$$ \hat{M}_2 = \sum_{h=1}^{L} W_h \frac{\hat{M}_{yh}}{\hat{M}_{xh}} M_{xh}. \tag{3} $$
The bias and mean squared error of $\hat{M}_2$ are given by the following formulas:
$$ Bias(\hat{M}_2) \cong \sum_{h=1}^{L} \delta_h W_h M_{yh} \left( C_{M_{xh}}^2 - C_{M_{yxh}} \right) \tag{4} $$
and
$$ MSE(\hat{M}_2) \cong \sum_{h=1}^{L} \delta_h W_h^2 M_{yh}^2 \left( C_{M_{yh}}^2 + C_{M_{xh}}^2 - 2C_{M_{yxh}} \right). \tag{5} $$
A difference-type estimator was first introduced by [13], and its corresponding form for stratified sampling, denoted by $\hat{M}_3$, is as follows:
$$ \hat{M}_3 = \sum_{h=1}^{L} W_h \left[ \hat{M}_{yh} + d_h \left( M_{xh} - \hat{M}_{xh} \right) \right], \tag{6} $$
where $d_h$ is an unknown constant whose optimal value for the $h$th stratum is:
$$ d_{h(\min)} = \rho_{yxh} \frac{M_{yh} C_{M_{yh}}}{M_{xh} C_{M_{xh}}}. $$
The minimum MSE of $\hat{M}_3$ is given below:
$$ MSE(\hat{M}_3)_{\min} \cong \sum_{h=1}^{L} \delta_h W_h^2 M_{yh}^2 C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right). \tag{7} $$
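Based on the approach introduced by [29], the following estimators are constructed for estimating the median in a stratified sampling framework:
$$ \hat{M}_4 = \sum_{h=1}^{L} W_h \hat{M}_{yh} \exp\left( \frac{M_{xh} - \hat{M}_{xh}}{M_{xh} + \hat{M}_{xh}} \right) \tag{8} $$
and
$$ \hat{M}_5 = \sum_{h=1}^{L} W_h \hat{M}_{yh} \exp\left( \frac{\hat{M}_{xh} - M_{xh}}{M_{xh} + \hat{M}_{xh}} \right). \tag{9} $$
The biases and mean squared errors (MSEs) of $\hat{M}_4$ and $\hat{M}_5$ are as follows:
$$ Bias(\hat{M}_4) \cong \sum_{h=1}^{L} \delta_h W_h M_{yh} \left( \frac{3}{8} C_{M_{xh}}^2 - \frac{1}{2} C_{M_{yxh}} \right), \tag{10} $$
$$ Bias(\hat{M}_5) \cong \sum_{h=1}^{L} \delta_h W_h M_{yh} \left( \frac{1}{2} C_{M_{yxh}} - \frac{3}{8} C_{M_{xh}}^2 \right), \tag{11} $$
$$ MSE(\hat{M}_4) \cong \sum_{h=1}^{L} \delta_h W_h^2 M_{yh}^2 \left( C_{M_{yh}}^2 + \frac{1}{4} C_{M_{xh}}^2 - C_{M_{yxh}} \right) \tag{12} $$
and
$$ MSE(\hat{M}_5) \cong \sum_{h=1}^{L} \delta_h W_h^2 M_{yh}^2 \left( C_{M_{yh}}^2 + \frac{1}{4} C_{M_{xh}}^2 + C_{M_{yxh}} \right). \tag{13} $$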
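To make the ratio and exponential forms concrete, the sketch below computes the point estimates $\hat{M}_2$, $\hat{M}_4$, and $\hat{M}_5$ from stratum samples. The function name and argument layout are hypothetical; the population medians of the auxiliary variable are assumed to be known, as the estimators require.

```python
import math
from statistics import median

def ratio_and_exp_estimates(y_samples, x_samples, Mx, stratum_sizes):
    """Return (M2_hat, M4_hat, M5_hat) for stratified samples.
    y_samples, x_samples: one list of sampled values per stratum.
    Mx: known population medians of X per stratum.
    stratum_sizes: N_h per stratum, used for W_h = N_h / N."""
    N = sum(stratum_sizes)
    m2 = m4 = m5 = 0.0
    for ys, xs, Mxh, N_h in zip(y_samples, x_samples, Mx, stratum_sizes):
        W = N_h / N
        my_hat, mx_hat = median(ys), median(xs)
        z = (Mxh - mx_hat) / (Mxh + mx_hat)   # argument of the exponential
        m2 += W * my_hat / mx_hat * Mxh       # ratio form
        m4 += W * my_hat * math.exp(z)        # exponential ratio form
        m5 += W * my_hat * math.exp(-z)       # exponential product form
    return m2, m4, m5
```

When the sample median of $X$ equals its population median in every stratum, all three estimates reduce to the plain stratified median.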
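Using the difference-based methodology proposed by [10,14], the following estimators for median estimation under stratified random sampling are formulated:
$$ \hat{M}_6 = \sum_{h=1}^{L} W_h \left[ d_{1h} \hat{M}_{yh} + d_{2h} \left( M_{xh} - \hat{M}_{xh} \right) \right], \tag{14} $$
$$ \hat{M}_7 = \sum_{h=1}^{L} W_h \left[ d_{3h} \hat{M}_{yh} + d_{4h} \left( M_{xh} - \hat{M}_{xh} \right) \right] \frac{M_{xh}}{\hat{M}_{xh}} \tag{15} $$
and
$$ \hat{M}_8 = \sum_{h=1}^{L} W_h \left[ d_{5h} \hat{M}_{yh} + d_{6h} \left( M_{xh} - \hat{M}_{xh} \right) \right] \exp\left( \frac{M_{xh} - \hat{M}_{xh}}{M_{xh} + \hat{M}_{xh}} \right). \tag{16} $$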
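The optimum values of the constants $d_{ih}$ $(i = 1, 2, \ldots, 6)$ are:
$$ d_{1h(opt)} = \frac{1}{1 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)}, $$
$$ d_{2h(opt)} = \frac{M_{yh}}{M_{xh}} \left[ \frac{\rho_{yxh} C_{M_{yh}}}{1 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)} \right], $$
$$ d_{3h(opt)} = \frac{1 - \delta_h C_{M_{xh}}^2}{1 - \delta_h C_{M_{xh}}^2 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)}, $$
$$ d_{4h(opt)} = \frac{M_{yh}}{M_{xh}} \left[ 1 + d_{3h(opt)} \left( \frac{\rho_{yxh} C_{M_{yh}}}{C_{M_{xh}}} - 2 \right) \right], $$
$$ d_{5h(opt)} = \frac{1}{8} \left[ \frac{8 - \delta_h C_{M_{xh}}^2}{1 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)} \right] $$
and
$$ d_{6h(opt)} = \frac{M_{yh}}{M_{xh}} \left[ \frac{1}{2} + d_{5h(opt)} \left( \frac{\rho_{yxh} C_{M_{yh}}}{C_{M_{xh}}} - 1 \right) \right]. $$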
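Substituting the optimum values of $d_{ih}$ $(i = 1, 2, \ldots, 6)$ for the $h$th stratum gives the following biases and minimum mean squared errors:
$$ Bias(\hat{M}_6) \cong \sum_{h=1}^{L} \delta_h W_h M_{yh} \left( d_{1h} - 1 \right), \tag{17} $$
$$ Bias(\hat{M}_7) \cong \sum_{h=1}^{L} \delta_h W_h \left[ M_{yh} \left( d_{3h} - 1 \right) + \delta_h d_{3h} M_{yh} \left( C_{M_{xh}}^2 - C_{M_{yxh}} \right) + \delta_h d_{4h} M_{xh} C_{M_{xh}}^2 \right], \tag{18} $$
$$ Bias(\hat{M}_8) \cong \sum_{h=1}^{L} \delta_h W_h \left[ M_{yh} \left( d_{5h} - 1 \right) + \delta_h d_{5h} M_{yh} \left( \frac{3}{8} C_{M_{xh}}^2 - \frac{1}{2} C_{M_{yxh}} \right) + \frac{\delta_h}{2} d_{6h} M_{xh} C_{M_{xh}}^2 \right], \tag{19} $$
$$ MSE(\hat{M}_6)_{\min} \cong \sum_{h=1}^{L} \frac{\delta_h W_h^2 M_{yh}^2 C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)}{1 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)}, \tag{20} $$
$$ MSE(\hat{M}_7)_{\min} \cong \sum_{h=1}^{L} \frac{\delta_h W_h^2 M_{yh}^2 \left( 1 - \delta_h C_{M_{xh}}^2 \right) C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)}{1 - \delta_h C_{M_{xh}}^2 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)} \tag{21} $$
and
$$ MSE(\hat{M}_8)_{\min} \cong \sum_{h=1}^{L} \frac{\delta_h W_h^2 M_{yh}^2 \left[ C_{M_{xh}}^2 \left( 1 - \rho_{yxh}^2 \right) - \frac{\delta_h}{4} C_{M_{xh}}^2 \left\{ \frac{1}{16} C_{M_{xh}}^2 + C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right) \right\} \right]}{1 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)}. \tag{22} $$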
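The first-order expressions for $\hat{M}_1$ to $\hat{M}_6$ are simple enough to evaluate directly once the stratum parameters are available. The following sketch does so and converts the results to percent relative efficiencies with respect to $\hat{M}_1$. The function names and the dictionary keys (`W`, `delta`, `My`, `Cy`, `Cx`, `rho`) are hypothetical choices made for this illustration.

```python
def first_order_mses(strata):
    """Evaluate the first-order variance/MSE expressions of M1..M6.
    Each stratum is a dict with keys W, delta, My, Cy, Cx, rho."""
    out = {name: 0.0 for name in ("M1", "M2", "M3", "M4", "M5", "M6")}
    for s in strata:
        base = s["delta"] * s["W"]**2 * s["My"]**2
        cy2, cx2 = s["Cy"]**2, s["Cx"]**2
        cyx = s["rho"] * s["Cy"] * s["Cx"]      # C_{M_yxh}
        resid = cy2 * (1.0 - s["rho"]**2)       # C_y^2 (1 - rho^2)
        out["M1"] += base * cy2
        out["M2"] += base * (cy2 + cx2 - 2.0 * cyx)
        out["M3"] += base * resid
        out["M4"] += base * (cy2 + cx2 / 4.0 - cyx)
        out["M5"] += base * (cy2 + cx2 / 4.0 + cyx)
        out["M6"] += base * resid / (1.0 + s["delta"] * resid)
    return out

def percent_relative_efficiency(mses, baseline="M1"):
    """PRE of each estimator relative to the baseline estimator."""
    return {name: 100.0 * mses[baseline] / v for name, v in mses.items()}
```

Since $1 - \rho_{yxh}^2 \le 1$ and the denominator of the $\hat{M}_6$ expression is at least 1, the values for $\hat{M}_3$ and $\hat{M}_6$ never exceed the variance of $\hat{M}_1$ under these formulas.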

3. Suggested Class of Estimators

In this section, the proposed class of estimators is based on commonly used techniques in survey sampling, including ratio-type, product-type, and exponential-type forms. Inspired by the work of [30,31,32,33], we use transformation terms and bias correction components that are effective when the auxiliary variable is correlated with the study variable. The flexibility provided by the parameters allows the construction of a broad class of estimators, including several known forms as special cases. This proposed class of estimators is derived by linearly combining unconventional measures and robust measures based on stratified random sampling. The use of this transformation enhances the efficiency of the estimator and helps manage data variability. The proposed estimator is expressed as:
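$$ \hat{M}_S = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} \left( \frac{M_{xh}}{\hat{M}_{xh}} \right)^{\alpha_{1h}} + k_{2h} \left( M_{xh} - \hat{M}_{xh} \right) \left( \frac{M_{xh}}{\hat{M}_{xh}} \right)^{\alpha_{2h}} \right] \exp\left( \frac{a_{1h} \left( M_{xh} - \hat{M}_{xh} \right)}{a_{1h} \left( \hat{M}_{xh} + M_{xh} \right) + 2a_{2h}} \right), \tag{23} $$
where the known parameters $(\alpha_{1h}, \alpha_{2h}, a_{1h}, a_{2h})$ are associated with the auxiliary variable $X$, and the constants $k_{ih}$ (for $i = 1, 2$) are unknown. By varying the combinations of $\alpha_{1h}$, $\alpha_{2h}$, $a_{1h}$, and $a_{2h}$ as detailed in Table 1, we can further derive different estimators from Equation (23).
The quantities appearing in Table 1 are defined as follows:
$$ L_h = \exp\left( \frac{a_{1h} \left( \hat{M}_{xh} - M_{xh} \right)}{a_{1h} \left( \hat{M}_{xh} + M_{xh} \right) + 2a_{2h}} \right), $$
Interquartile range: $QR_h = Q_{3h} - Q_{1h}$,
Midrange: $MR_h = \frac{X_{h(\min)} + X_{h(\max)}}{2}$,
Quartile average: $QA_h = \frac{Q_{3h} + Q_{1h}}{2}$,
Quartile deviation: $QD_h = \frac{Q_{3h} - Q_{1h}}{2}$,
Trimean: $TM_h = \frac{Q_{1h} + 2Q_{2h} + Q_{3h}}{4}$
and
Decile mean: $DM_h = \frac{\sum_{i=1}^{9} D_{ih}}{9}$.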
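The six auxiliary measures defined above can be computed with standard routines. The sketch below uses Python's `statistics.quantiles`; the choice of the "inclusive" quantile method is an assumption, since the quartile and decile convention is not specified here, and the function name `auxiliary_measures` is hypothetical.

```python
from statistics import quantiles

def auxiliary_measures(x):
    """Robust and non-traditional measures of an auxiliary variable.
    Quartiles and deciles use the 'inclusive' method (an assumption)."""
    q1, q2, q3 = quantiles(x, n=4, method="inclusive")
    deciles = quantiles(x, n=10, method="inclusive")  # D1..D9
    return {
        "QR": q3 - q1,                    # interquartile range
        "MR": (min(x) + max(x)) / 2.0,    # midrange
        "QA": (q3 + q1) / 2.0,            # quartile average
        "QD": (q3 - q1) / 2.0,            # quartile deviation
        "TM": (q1 + 2.0 * q2 + q3) / 4.0, # trimean
        "DM": sum(deciles) / 9.0,         # decile mean
    }
```

Different quartile conventions can give slightly different values in small strata, so the same convention should be used for every stratum and every estimator being compared.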

3.1. Properties of the Suggested Class of Estimators

To derive the bias and mean squared error (MSE) expressions for the proposed class of estimators, we use a first-order approximation by expanding the terms and neglecting higher-order components of degree greater than 2. This simplification is standard in survey sampling literature, as it provides clear analytical expressions while maintaining accuracy, especially when the sample size is moderate or large.
Therefore, to investigate the statistical properties of the proposed estimator, Equation (23) is rewritten in terms of the relative errors $e_{0h}$ and $e_{1h}$. This formulation allows for the derivation of the bias and MSE of $\hat{M}_S$, as detailed below:
$$ \hat{M}_S = \sum_{h=1}^{L} W_h \left[ k_{1h} M_{yh} \left( 1 + e_{0h} \right) \left( 1 + e_{1h} \right)^{-\alpha_{1h}} - k_{2h} M_{xh} e_{1h} \left( 1 + e_{1h} \right)^{-\alpha_{2h}} \right] \times \exp\left\{ -\frac{g_{4h} e_{1h}}{2} \left( 1 + \frac{g_{4h}}{2} e_{1h} \right)^{-1} \right\}, \tag{24} $$
where
$$ g_{4h} = \frac{a_{1h} M_{xh}}{a_{1h} M_{xh} + a_{2h}}. $$
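The step from the exponent in Equation (23) to its relative-error form in Equation (24) uses $\hat{M}_{xh} = M_{xh}(1 + e_{1h})$. A small numerical check of this identity, with hypothetical function names and arbitrary illustrative inputs, can be sketched as follows.

```python
def exponent_direct(Mx, mx_hat, a1, a2):
    """Exponent of (23): a1 (Mx - mx_hat) / (a1 (mx_hat + Mx) + 2 a2)."""
    return a1 * (Mx - mx_hat) / (a1 * (mx_hat + Mx) + 2.0 * a2)

def exponent_relative(Mx, mx_hat, a1, a2):
    """Exponent of (24): -(g e1 / 2) / (1 + g e1 / 2),
    with e1 = (mx_hat - Mx) / Mx and g = a1 Mx / (a1 Mx + a2)."""
    e1 = (mx_hat - Mx) / Mx
    g = a1 * Mx / (a1 * Mx + a2)
    return -(g * e1 / 2.0) / (1.0 + g * e1 / 2.0)
```

Both functions return the same value for any admissible inputs, which confirms that the two expressions of the exponent agree exactly and not only to first order.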
We expand the right-hand side of Equation (24) as a Taylor series to the first order of approximation, discarding terms in $e_{ih}$ of degree higher than two because their impact is minimal. This gives the following expression:
$$ \hat{M}_S - \sum_{h=1}^{L} W_h M_{yh} \cong \sum_{h=1}^{L} W_h \left[ -M_{yh} + k_{1h} M_{yh} \left( 1 + e_{0h} - e_{1h} \left( \alpha_{1h} + \frac{g_{4h}}{2} \right) + e_{1h}^2 \left( \frac{\alpha_{1h} g_{4h}}{2} + \frac{3g_{4h}^2}{8} + \frac{\alpha_{1h} (\alpha_{1h} + 1)}{2} \right) - e_{0h} e_{1h} \left( \alpha_{1h} + \frac{g_{4h}}{2} \right) \right) - k_{2h} M_{xh} \left( e_{1h} - e_{1h}^2 \left( \alpha_{2h} + \frac{g_{4h}}{2} \right) \right) \right]. \tag{25} $$
The bias of $\hat{M}_S$ is obtained by taking the expectation of Equation (25) and substituting the terms $(e_{0h}, e_{1h}, e_{1h}^2, e_{0h} e_{1h})$ with their corresponding expected values. This results in the following expression:
$$ Bias(\hat{M}_S) \cong -\sum_{h=1}^{L} W_h \left[ M_{yh} - k_{1h} M_{yh} \Delta_{Dh} - k_{2h} \Delta_{Gh} \right], \tag{26} $$
where
$$ \Delta_{Dh} = 1 + \delta_h \left[ C_{M_{xh}}^2 \frac{3g_{4h}^2 + 4\alpha_{1h} \left( g_{4h} + \alpha_{1h} + 1 \right)}{8} - C_{M_{yxh}} \frac{2\alpha_{1h} + g_{4h}}{2} \right] $$
and
$$ \Delta_{Gh} = \delta_h M_{xh} C_{M_{xh}}^2 \, \frac{2\alpha_{2h} + g_{4h}}{2}. $$
The MSE of $\hat{M}_S$ is determined by squaring both sides of Equation (25) and taking the expectation. This process yields the following equation:
$$ MSE(\hat{M}_S) \cong \sum_{h=1}^{L} W_h^2 \left[ M_{yh}^2 + k_{1h}^2 M_{yh}^2 \Delta_{Ah} + k_{2h}^2 \Delta_{Bh} - 2k_{1h} M_{yh}^2 \Delta_{Dh} - 2k_{2h} M_{yh} \Delta_{Gh} + 2k_{1h} k_{2h} M_{yh} \Delta_{Fh} \right], \tag{27} $$
where
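$$ \Delta_{Ah} = 1 + \delta_h \left[ C_{M_{yh}}^2 + C_{M_{xh}}^2 \left\{ \alpha_{1h} \left( 3\alpha_{1h} + 1 \right) + 2g_{4h} \left( g_{4h} + \frac{1}{2} \right) \right\} - 2C_{M_{yxh}} \left( 2\alpha_{1h} + g_{4h} \right) \right], $$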
$$ \Delta_{Bh} = \delta_h M_{xh}^2 C_{M_{xh}}^2 $$
and
$$ \Delta_{Fh} = \delta_h M_{xh} \left[ C_{M_{xh}}^2 \left( \alpha_{1h} + \alpha_{2h} + g_{4h} \right) - C_{M_{yxh}} \right]. $$
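The optimal values of $k_{1h}$ and $k_{2h}$ are derived by minimizing Equation (27), and they are given as follows:
$$ k_{1h(opt)} = \frac{\Delta_{Bh} \Delta_{Dh} - \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} $$
and
$$ k_{2h(opt)} = \frac{M_{yh} \left( \Delta_{Ah} \Delta_{Gh} - \Delta_{Dh} \Delta_{Fh} \right)}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2}. $$
By substituting the optimal values of $k_{1h}$ and $k_{2h}$ into Equations (26) and (27), the resulting minimum bias and MSE for $\hat{M}_S$ are obtained, as shown below:
$$ Bias(\hat{M}_S)_{\min} \cong -\sum_{h=1}^{L} W_h M_{yh} \left[ 1 - \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] \tag{28} $$
and
$$ MSE(\hat{M}_S)_{\min} \cong \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ 1 - \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right]. \tag{29} $$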
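The minimization that leads to the optimal constants is a two-variable quadratic problem, so it can be checked numerically. The sketch below treats the five $\Delta$ terms of one stratum as given inputs (their values in the usage example are arbitrary illustrative numbers, not taken from the paper) and verifies that the stratum MSE evaluated at the optimal constants coincides with the closed-form minimum. All function names are hypothetical.

```python
def optimal_constants(My, A, B, D, F, G):
    """Optimal (k1, k2) for one stratum from the Delta terms A, B, D, F, G."""
    det = A * B - F**2
    if det == 0:
        raise ValueError("Delta_A * Delta_B - Delta_F^2 must be non-zero")
    k1 = (B * D - F * G) / det
    k2 = My * (A * G - D * F) / det
    return k1, k2

def stratum_mse(My, A, B, D, F, G, k1, k2):
    """Bracketed term of the first-order MSE for one stratum."""
    return (My**2 + k1**2 * My**2 * A + k2**2 * B
            - 2.0 * k1 * My**2 * D - 2.0 * k2 * My * G
            + 2.0 * k1 * k2 * My * F)

def stratum_min_mse(My, A, B, D, F, G):
    """Closed-form minimum of the bracketed term for one stratum."""
    return My**2 * (1.0 - (A * G**2 + B * D**2 - 2.0 * D * F * G)
                    / (A * B - F**2))
```

For example, with `My=10, A=1.2, B=0.5, D=1.05, F=0.1, G=0.2`, evaluating `stratum_mse` at the constants returned by `optimal_constants` reproduces `stratum_min_mse`, and moving either constant away from its optimum increases the value.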
In the proposed class of estimators, certain model parameters are selected to minimize the mean squared error (MSE). Section 3 provides analytical expressions for their optimal values, while practical selection is performed in both simulation (Section 5.1) and real data applications (Section 5.2). For the three real-life datasets, parameter values yielding the lowest empirical MSE were chosen, confirming their adaptability and effectiveness across different population structures.

3.2. Implementation Algorithm

To facilitate the practical application of the proposed estimator $\hat{M}_S$, we provide a step-by-step algorithm that outlines the computation procedure using a stratified sample. For more details, please refer to Appendix A.
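As a rough illustration only, and not a reproduction of the algorithm in Appendix A, a point estimate of the general form in Equation (23) could be computed along the following lines. The function name and argument layout are hypothetical, and the constants $k_{1h}$, $k_{2h}$ and the parameters $\alpha_{1h}$, $\alpha_{2h}$, $a_{1h}$, $a_{2h}$ are assumed to have been chosen beforehand.

```python
import math
from statistics import median

def proposed_estimate(y_samples, x_samples, Mx, weights,
                      k1, k2, alpha1, alpha2, a1, a2):
    """Point estimate of the general class in Equation (23).
    All arguments after x_samples are per-stratum lists."""
    total = 0.0
    for h, (ys, xs) in enumerate(zip(y_samples, x_samples)):
        my_hat, mx_hat = median(ys), median(xs)
        ratio = Mx[h] / mx_hat
        diff = Mx[h] - mx_hat
        # Exponential component of (23)
        expo = math.exp(a1[h] * diff / (a1[h] * (mx_hat + Mx[h]) + 2.0 * a2[h]))
        inner = (k1[h] * my_hat * ratio**alpha1[h]
                 + k2[h] * diff * ratio**alpha2[h])
        total += weights[h] * inner * expo
    return total
```

Setting `k1 = 1`, `k2 = 0`, `alpha1 = 1`, `a1 = 0`, and `a2 = 1` in every stratum reduces this expression to the ratio form, which gives a simple sanity check of an implementation.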

4. Mathematical Comparison

We provide the efficiency criteria in this section by using the MSE formulas of the recommended family of estimators given in Section 3 and all other estimators given in Section 2, namely $\hat{M}_1$, $\hat{M}_2$, $\hat{M}_3$, $\hat{M}_4$, $\hat{M}_5$, $\hat{M}_6$, $\hat{M}_7$, and $\hat{M}_8$.
(i):
Comparing the formula proved in (29) with the one given in (2), the following condition is obtained:
$$ Var(\hat{M}_1) > MSE(\hat{M}_S)_{\min} \quad \text{if} \quad \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ \delta_h C_{M_{yh}}^2 - 1 + \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] > 0. $$
(ii):
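Comparing the formula obtained in (29) with the expression in (5), we get the following condition:
$$ MSE(\hat{M}_2) > MSE(\hat{M}_S)_{\min} \quad \text{if} \quad \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ \delta_h \left( C_{M_{yh}}^2 + C_{M_{xh}}^2 - 2C_{M_{yxh}} \right) - 1 + \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] > 0. $$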
(iii):
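Comparing the formula obtained in (29) with the formula given in (7) yields the following condition:
$$ MSE(\hat{M}_3)_{\min} > MSE(\hat{M}_S)_{\min} \quad \text{if} \quad \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right) - 1 + \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] > 0. $$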
(iv):
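Comparing the formula proved in (29) with the one given in (12), the following condition is obtained:
$$ MSE(\hat{M}_4) > MSE(\hat{M}_S)_{\min} \quad \text{if} \quad \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ \delta_h \left( C_{M_{yh}}^2 + \frac{1}{4} C_{M_{xh}}^2 - C_{M_{yxh}} \right) - 1 + \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] > 0. $$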
(v):
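Comparing the formula proved in (29) with the one given in (13), the following condition is derived:
$$ MSE(\hat{M}_5) > MSE(\hat{M}_S)_{\min} \quad \text{if} \quad \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ \delta_h \left( C_{M_{yh}}^2 + \frac{1}{4} C_{M_{xh}}^2 + C_{M_{yxh}} \right) - 1 + \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] > 0. $$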
(vi):
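Comparing the formula obtained in (29) with the expression in (20), we get the following condition:
$$ MSE(\hat{M}_6)_{\min} > MSE(\hat{M}_S)_{\min} \quad \text{if} \quad \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ \frac{\delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)}{1 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)} - 1 + \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] > 0. $$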
(vii):
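Comparing the formula proved in (29) with the expression in (21), we obtain the following condition:
$$ MSE(\hat{M}_7)_{\min} > MSE(\hat{M}_S)_{\min} \quad \text{if} \quad \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ \frac{\delta_h C_{M_{yh}}^2 \left( 1 - \delta_h C_{M_{xh}}^2 \right) \left( 1 - \rho_{yxh}^2 \right)}{1 - \delta_h C_{M_{xh}}^2 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)} - 1 + \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] > 0. $$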
(viii):
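Comparing the formula mentioned in (29) with the expression in (22), we get the following condition:
$$ MSE(\hat{M}_8)_{\min} > MSE(\hat{M}_S)_{\min} \quad \text{if} \quad \sum_{h=1}^{L} W_h^2 M_{yh}^2 \left[ \frac{\delta_h C_{M_{xh}}^2 \left( 1 - \rho_{yxh}^2 \right) - \frac{\delta_h^2}{4} C_{M_{xh}}^2 \left\{ \frac{1}{16} C_{M_{xh}}^2 + C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right) \right\}}{1 + \delta_h C_{M_{yh}}^2 \left( 1 - \rho_{yxh}^2 \right)} - 1 + \frac{\Delta_{Ah} \Delta_{Gh}^2 + \Delta_{Bh} \Delta_{Dh}^2 - 2\Delta_{Dh} \Delta_{Fh} \Delta_{Gh}}{\Delta_{Ah} \Delta_{Bh} - \Delta_{Fh}^2} \right] > 0. $$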

5. Results and Discussion

This section generates five artificial populations from suitable probability distributions, most of them positively skewed. In addition, three separate datasets are analyzed to assess the reliability and practical performance of the proposed estimators.

5.1. Simulation Study

When estimating the median, the selection of an appropriate distribution is guided by the underlying features of the population. The median proves particularly effective for handling datasets that are asymmetric, include extreme values, or diverge from normality. In this analysis, the auxiliary variable $X$ was generated using one of five carefully chosen distributions covering skewed, uniform, and heavy-tailed shapes:
  • Simulated dataset 1: A highly skewed distribution for $X$ is described by the Exponential distribution, $X \sim \text{Exponential}(\mu_0 = \frac{1}{2})$, with $\rho_{yx} = 0.55$.
  • Simulated dataset 2: A slightly skewed distribution for $X$ is described by the Log-Normal distribution, $X \sim \text{Log-Normal}(\mu_1 = 8, \mu_2 = 5)$, with $\rho_{yx} = 0.50$.
  • Simulated dataset 3: A uniform distribution is used as the baseline for $X$, $X \sim \text{Uniform}(\mu_3 = 2, \mu_4 = 10)$, with $\rho_{yx} = 0$.
  • Simulated dataset 4: The distribution of $X$ with moderate skewness and spread is represented by a Gamma distribution, $X \sim \text{Gamma}(\beta_1 = 10, \beta_2 = 6)$, with a correlation coefficient $\rho_{yx} = 0.6$.
  • Simulated dataset 5: A distribution with heavy tails for $X$ is represented by the Cauchy distribution, $X \sim \text{Cauchy}(\beta_3 = 6, \beta_4 = 2)$, with $\rho_{yx} = 0.45$.
As a result, these five distributions are ideal for examining and demonstrating the effectiveness of the proposed estimators across different scenarios and their respective properties. These distributions deliberately reflect a range of practical scenarios frequently observed in stratified survey sampling, such as skewed income distributions, symmetric patterns in educational levels, and heavy-tailed characteristics in environmental measurements, thereby allowing a realistic and comprehensive evaluation of the proposed estimators under varying conditions. Before presenting the results, it is important to note that the effect of the correlation coefficient on estimator performance is implicitly examined through simulation studies involving multiple populations with varying correlation levels, including positive, zero, and negative values. This approach allows us to assess the robustness and adaptability of the proposed estimators across different correlation scenarios.
The following equation can be employed to determine the variable $Y$:
$$ Y = \rho_{yx} \times X + e, $$
where $\rho_{yx}$ represents the correlation coefficient, while the error component $e$ follows a standard normal distribution, denoted as $N(0, 1)$.
To evaluate the efficiency and robustness of both the proposed and existing estimators, we applied specific R programming techniques under various distribution types and correlation scenarios to examine their PRE values.
1. Using the specified distributions, generate a dataset consisting of $N = 1500$ values for the variables $X$ and $Y$.
2. Divide the population of size $N = 1500$ into three equal strata of 500 units each to create a balanced stratification, allowing clear assessment of estimator performance under moderate stratification.
3. To assess the accuracy of the estimators, calculate key statistical summaries, including the maximum and minimum values. Additionally, determine the optimum values of the constants for the available estimators.
4. Draw a sample of size $n_h$ from each stratum using simple random sampling without replacement (SRSWOR) from the corresponding population of size $N_h$.
5. Calculate the simulated percent relative efficiency (SPRE) values for all estimators discussed in this study across different sample sizes. This phase ensures that the SPRE of each estimator is analyzed for a collection of samples.
6. After 35,000 iterations of steps 4 and 5, use the formulas below to determine the final simulated mean squared errors (SMSEs) and simulated percent relative efficiencies (SPREs):
$$ SMSE(\hat{\Delta}_t)_{\min} = \frac{\sum_{k=1}^{35000} \left( \hat{\Delta}_{tk} - M_y \right)^2}{35000} $$
and
$$ SPRE = \frac{SVar(\hat{M}_1)}{SMSE(\hat{\Delta}_t)_{\min}} \times 100, $$
where $SVar(\hat{M}_1)$ represents the simulated variance of estimator $\hat{M}_1$, and $\hat{\Delta}_t = \hat{M}_1, \hat{M}_2, \hat{M}_3, \hat{M}_4, \hat{M}_5, \hat{M}_6, \hat{M}_7, \hat{M}_8, \hat{M}_{S1}, \hat{M}_{S2}, \ldots, \hat{M}_{S8}$. The simulated percent relative efficiencies of the proposed class of estimators and the existing estimators are presented in Table 2.
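The simulation steps above can be sketched in Python for the simplest case, comparing only $\hat{M}_1$ and the ratio estimator $\hat{M}_2$ on an exponential population. This is a minimal sketch under several assumptions that are not stated in the text: the exponential parameter is treated as a rate, strata are formed from consecutive blocks of units, $n_h = 50$ in every stratum, the target $M_y$ is taken as the weighted sum of stratum medians, and far fewer than 35,000 replications are used by default to keep the run short. The function name `simulate_spre` is hypothetical.

```python
import random
from statistics import median, pvariance

def simulate_spre(reps=2000, N=1500, L=3, n_h=50, rho=0.55, seed=1):
    """Monte Carlo sketch of SVar(M1), SMSE(M2), and SPRE(M2)."""
    rng = random.Random(seed)
    x = [rng.expovariate(0.5) for _ in range(N)]          # assumed rate 1/2
    y = [rho * xi + rng.gauss(0.0, 1.0) for xi in x]      # Y = rho * X + e
    size = N // L
    strata = [list(zip(y[h * size:(h + 1) * size], x[h * size:(h + 1) * size]))
              for h in range(L)]
    W = [len(s) / N for s in strata]
    My = [median(yi for yi, _ in s) for s in strata]
    Mx = [median(xi for _, xi in s) for s in strata]
    target = sum(w * m for w, m in zip(W, My))            # assumed target M_y
    m1_draws, m2_draws = [], []
    for _ in range(reps):
        m1 = m2 = 0.0
        for h, s in enumerate(strata):
            sample = rng.sample(s, n_h)                   # SRSWOR in stratum h
            my_hat = median(yi for yi, _ in sample)
            mx_hat = median(xi for _, xi in sample)
            m1 += W[h] * my_hat
            m2 += W[h] * my_hat / mx_hat * Mx[h]
        m1_draws.append(m1)
        m2_draws.append(m2)
    svar_m1 = pvariance(m1_draws)
    smse_m2 = sum((m - target) ** 2 for m in m2_draws) / reps
    return {"SVar_M1": svar_m1, "SMSE_M2": smse_m2,
            "SPRE_M2": 100.0 * svar_m1 / smse_m2}
```

The same loop structure extends to the remaining estimators by adding one accumulator per estimator inside the replication loop.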
Computational Considerations:
The proposed estimators M ^ S 1 through M ^ S 8 rely on robust auxiliary statistics and simple algebraic transformations, all of which are computationally efficient. These measures (e.g., trimean, interquartile range, quartile deviation) can be calculated using standard statistical routines without the need for iterative or matrix-intensive operations. Our simulation study, conducted with N = 1500 units and 35,000 replications, was executed in R within a reasonable computational time, confirming the practical feasibility of applying the proposed estimators in large-scale survey settings.

5.2. Numerical Analysis Using Real Datasets

The performance of the suggested estimators is assessed by examining their mean squared error across three distinct datasets. A detailed statistical overview is presented below.
Population 1 (Source: [13])
$Y_1$: This variable denotes the overall amount of fish collected from every source in the year 1995, encompassing both commercial operations and recreational fishing efforts;
$X_1$: This variable indicates the count of fish captured in 1994 by people engaged in recreational saltwater fishing, with all commercial catch activities excluded;
$Y_2$: This variable represents the total number of fish caught in 1995, reflecting the complete harvest for that year;
$X_2$: This variable reflects the quantity of fish caught in 1994 by recreational saltwater anglers, highlighting the contribution of non-commercial fishing to the overall catch.
$N_1 = 69$, $n_1 = 17$, $N_2 = 69$, $n_2 = 17$, $X_{1(\min)} = 17051.500$, $X_{1(\max)} = 20987.500$, $X_{2(\min)} = 15055$, $X_{2(\max)} = 19005$, $M_{x1} = 2011$, $M_{y1} = 2068$, $M_{x2} = 2007$, $M_{y2} = 2068$, $f_{x1}(M_{x1}) = 0.00014$, $f_{y1}(M_{y1}) = 0.00014$, $f_{x2}(M_{x2}) = 0.00014$, $f_{y2}(M_{y2}) = 0.00014$, $\rho_{y_1 x_1} = 0.151$, $\rho_{y_2 x_2} = 0.314$, $TM_1 = 4043$, $TM_2 = 3777$, $DM_1 = 3853$, $DM_2 = 3615.200$, $QR_1 = 3936$, $QR_2 = 3936$, $QA_1 = 2956$, $QA_2 = 3002$, $QD_1 = 1968$, $QD_2 = 1975$, $MR_1 = 19019.500$, $MR_2 = 17030$.
Population 2.
A real finite population is analyzed using data from the 2013 Punjab Development Statistics (page 226) [34], which contains information on the number of registered factories and employment levels across various districts and divisions.
$Y_1$: Employment level by division and district in 2010;
$X_1$: Number of registered factories by division and district in 2010;
$Y_2$: Employment level by division and district in 2012;
$X_2$: Number of registered factories by division and district in 2012.
$N_1 = 36$, $n_1 = 14$, $N_2 = 36$, $n_2 = 14$, $X_{1(\min)} = 24$, $X_{1(\max)} = 1986$, $X_{2(\min)} = 24$, $X_{2(\max)} = 2055$, $M_{x1} = 168.500$, $M_{y1} = 10484.500$, $M_{x2} = 171.500$, $M_{y2} = 10494.500$, $f_{x1}(M_{x1}) = 0.002463666$, $f_{y1}(M_{y1}) = 0.00004033736$, $f_{x2}(M_{x2}) = 0.002315051$, $f_{y2}(M_{y2}) = 0.00004086913$, $\rho_{y_1 x_1} = 0.912$, $\rho_{y_2 x_2} = 0.5194465$, $TM_1 = 193.438$, $TM_2 = 195.750$, $DM_1 = 432.500$, $DM_2 = 431.500$, $QR_1 = 252.250$, $QR_2 = 265$, $QA_1 = 218.375$, $QA_2 = 220$, $QD_1 = 127.125$, $QD_2 = 132.500$, $MR_1 = 1005$, $MR_2 = 1039.500$.
Population 3.
A real finite population dataset is examined, sourced from the 2014 edition of the Punjab Development Statistics (page 135) [35]. The dataset provides details on student enrollment in government-run primary and middle schools, disaggregated by gender.
$Y_1$: The overall student enrollment for the academic year 2012–2013;
$X_1$: The number of government-managed primary schools for both genders in the 2012–2013 academic year;
$Y_2$: The number of students registered in the 2012–2013 academic year;
$X_2$: The overall number of government middle schools serving both genders in the 2012–2013 academic year.
$N_1 = 36$, $n_1 = 14$, $N_2 = 36$, $n_2 = 14$, $X_{1(\min)} = 388$, $X_{1(\max)} = 1534$, $X_{2(\min)} = 84$, $X_{2(\max)} = 478$, $M_{x1} = 1016.500$, $M_{y1} = 116230$, $M_{x2} = 206$, $M_{y2} = 49661$, $f_{x1}(M_{x1}) = 0.000951993$, $f_{y1}(M_{y1}) = 0.00000835$, $f_{x2}(M_{x2}) = 0.004094403$, $f_{y2}(M_{y2}) = 0.0000143374$, $\rho_{y_1 x_1} = 0.084$, $\rho_{y_2 x_2} = 0.875$, $TM_1 = 891.188$, $TM_2 = 210.688$, $DM_1 = 982.650$, $DM_2 = 231$, $QR_1 = 378.250$, $QR_2 = 125.750$, $QA_1 = 891.875$, $QA_2 = 215.375$, $QD_1 = 982.650$, $QD_2 = 62.875$, $MR_1 = 961$, $MR_2 = 281$.
Next, compute the percent relative efficiency values for each estimator. The results, highlighting the effectiveness of the proposed estimators, are displayed in Table 3.
$$ PRE = \frac{Var(\hat{M}_1)}{MSE(\hat{T})} \times 100, $$
where $\hat{T} = \hat{M}_1, \hat{M}_2, \hat{M}_3, \hat{M}_4, \hat{M}_5, \hat{M}_6, \hat{M}_7, \hat{M}_8, \hat{M}_{S1}, \hat{M}_{S2}, \ldots, \hat{M}_{S8}$.

5.3. Discussion

Simulations were carried out using appropriate distributions with varying $\rho_{yx}$ values. Additionally, the performance of the proposed estimator family was evaluated by analyzing three datasets. The percent relative efficiency (PRE) served to assess the various estimators. The PRE values for the newly proposed family and existing estimators across five simulated distributions appear in Table 2. The results from real datasets are shown in Table 3. Based on these analyses, the following key conclusions can be drawn:
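  • The results from both simulated and real datasets, as presented in Table 2 and Table 3, indicate that the PRE values of the newly introduced estimators are higher than those of the previously established ones discussed in Section 2. The only exception is Pop-1 in Table 2, where $\hat{M}_7$ (PRE 241.26) exceeds $\hat{M}_{S3}$, $\hat{M}_{S5}$, and $\hat{M}_{S7}$ (221.92, 215.65, and 225.99, respectively). Overall, this highlights the enhanced effectiveness of the suggested estimators in relation to existing techniques.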
  • Additionally, the upward trend in the graphical representations shown in Figure 1 and Figure 2, based on various distributions and actual datasets, further confirms that the new estimators consistently achieve higher PRE values than the conventional estimators. The inverse correlation between the PRE values of the new and traditional estimators strengthens the idea that the newly introduced estimators offer a more efficient estimation method.
  • Although different distributions and correlation levels were used, Figure 1 shows a similar trend across all plots. This consistency highlights the robustness of the proposed estimators, particularly $\hat{M}_{S2}$, which are designed using transformation techniques and nontraditional auxiliary measures. The drop in PRE from estimator 7 to 8 is due to the higher variability in the nonlinear form of $\hat{M}_8$. Each plot includes 16 estimators: the first 8 are existing methods ($\hat{M}_1$ to $\hat{M}_8$) and the last 8 are proposed estimators ($\hat{M}_{S1}$ to $\hat{M}_{S8}$). Among them, $\hat{M}_{S2}$, $\hat{M}_{S4}$, and $\hat{M}_{S6}$ achieve the highest PRE values.
  • The results in Table 2 and Table 3 show that the estimator $\hat{M}_{S2}$ consistently achieves the highest percent relative efficiency (PRE) among all considered estimators. Therefore, $\hat{M}_{S2}$ is recommended for practical use, particularly with skewed or outlier-prone data in stratified sampling.
  • The datasets used in Figure 2 come from different sources, but they share structural similarities such as skewness and moderate to strong correlation between the study and auxiliary variables. This explains the similar trends in PRE values across plots, reflecting the robustness of the proposed estimators under such conditions. The sharp drop in PRE from estimator 4 to 5 is due to the structural difference in their exponential forms. Estimator $\hat{M}_4$ reduces the effect of median differences, while $\hat{M}_5$ may increase variability, especially when $M_{xh}$ and $\hat{M}_{xh}$ differ, making it less stable. Among all existing and proposed estimators, $\hat{M}_{S2}$ consistently achieves the highest PRE values and is recommended for practical use.
  • The boxplots in Figure 3 and Figure 4 provide a concise summary of the distribution of percent relative efficiency (PRE) values across both simulated and real data settings. Notably, the proposed estimators (particularly $\hat{M}_{S2}$, $\hat{M}_{S4}$, and $\hat{M}_{S6}$) demonstrate consistently higher median PRE values and narrower interquartile ranges, indicating not only superior efficiency but also greater stability across different population scenarios. In contrast, the existing estimators show lower medians and greater variability, highlighting the robustness and reliability of the proposed methods under varied sampling conditions.
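The ranking claims in the list above can be checked mechanically against the published PRE values. The sketch below transcribes Table 3 and reports, for each real dataset, which estimator has the highest PRE and whether every proposed estimator beats every existing one. The short labels such as `MS2` and the variable names are conventions assumed for this example only.

```python
# PRE values transcribed from Table 3; columns are Dataset 1, 2, 3.
# Labels "M1".."M8" are existing estimators, "MS1".."MS8" proposed ones.
table3 = {
    "M1": (100, 100, 100),
    "M2": (104.48, 103.49, 102.31),
    "M3": (106.46, 224.199, 105.40),
    "M4": (113.59, 220.09, 105.77),
    "M5": (43.62, 42.76, 45.31),
    "M6": (119.70, 232.48, 106.50),
    "M7": (121.86, 233.11, 106.61),
    "M8": (117.35, 201.57, 107.62),
    "MS1": (148.19, 251.67, 184.34),
    "MS2": (182.32, 274.98, 207.57),
    "MS3": (132.82, 237.03, 166.16),
    "MS4": (162.21, 246.33, 196.40),
    "MS5": (127.77, 239.19, 160.10),
    "MS6": (160.10, 245.82, 191.41),
    "MS7": (151.01, 241.02, 166.93),
    "MS8": (159.09, 243.62, 171.21),
}

existing = [f"M{i}" for i in range(1, 9)]
proposed = [f"MS{i}" for i in range(1, 9)]

best_per_dataset = []
proposed_dominate = []
for j in range(3):
    # Estimator with the largest PRE in column j.
    best_per_dataset.append(max(table3, key=lambda name, j=j: table3[name][j]))
    # Does the weakest proposed estimator still beat the strongest existing one?
    max_existing = max(table3[name][j] for name in existing)
    min_proposed = min(table3[name][j] for name in proposed)
    proposed_dominate.append(min_proposed > max_existing)

print(best_per_dataset)
print(proposed_dominate)
```

The same two checks can be run on Table 2 by swapping in its five columns.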

6. Conclusions

This research presented a stratified class of estimators aimed at estimating the median of a finite population using stratified random sampling, based on various transformations of an auxiliary variable. Bias and mean squared error formulas for both existing and proposed estimators were derived through a first-order approximation approach, which offered valuable insights into their accuracy and potential for improvement. To assess the practical utility of the proposed estimators, a series of simulation studies was carried out across five distinct probability distributions under various scenarios, along with the examination of three actual datasets. The findings, outlined in Table 2 and Table 3, reveal that the newly proposed estimators outperformed the traditional ones in nearly every setting considered, demonstrating greater precision and efficiency.
According to the detailed outcomes from both the simulation studies and the real-dataset evaluations, as shown in Table 2 and Table 3, the estimator $\hat{M}_{S2}$ consistently achieves the highest percent relative efficiency values among all estimators considered. As a result, $\hat{M}_{S2}$ is strongly suggested for practical use, particularly in scenarios involving skewed distributions or datasets influenced by extreme values under stratified random sampling.
Additionally, this study explored the performance of the improved estimators within the context of stratified random sampling. Based on the results, there is considerable potential to develop new estimators that further reduce the mean squared error (MSE) under different sampling schemes. Future work could focus on extending these estimators to address challenges such as non-response, measurement errors, or incomplete auxiliary information. Investigating their application in double or multi-phase sampling, as well as their robustness in the presence of outliers or highly skewed data, also represents a valuable direction for further research.

Limitations and Considerations

While the proposed estimators $\hat{M}_{S1}$ through $\hat{M}_{S8}$ demonstrate higher performance under skewed distributions and datasets affected by outliers, there are scenarios in which they may underperform. In particular, their relative efficiency may decrease when the auxiliary variable shows very low or no correlation with the study variable, thereby reducing the effectiveness of using auxiliary information. Additionally, in populations where the auxiliary variable is symmetrically or normally distributed, traditional estimators may perform comparably well. Finally, the use of robust measures (such as trimean or quartile deviation) can become unstable when strata contain very small sample sizes, potentially affecting estimator reliability. Researchers are therefore encouraged to evaluate data characteristics before selecting the appropriate estimator.

Author Contributions

Conceptualization, A.S.A.; Methodology, A.S.A.; Software, A.S.A.; Validation, F.A.A.; Formal analysis, A.S.A.; Investigation, F.A.A.; Resources, F.A.A.; Data curation, A.S.A.; Writing—original draft, A.S.A.; Writing—review & editing, A.S.A. and F.A.A.; Visualization, A.S.A. and F.A.A.; Supervision, F.A.A.; Project administration, F.A.A.; Funding acquisition, F.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2025R515), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Algorithm 1 Computation of the Proposed Estimator $\hat{M}_S$
Input:
 • Stratified sample data $\{y_{hi}, x_{hi}\}$, for $h = 1, 2, \ldots, L$
 • Stratum sizes $N_h$, sample sizes $n_h$
 • Stratum weights $W_h = N_h / N$
 • Chosen parameters $\alpha_{1h}, \alpha_{2h}, a_{1h}, a_{2h}$
Step 1: Compute sample and population statistics (per stratum $h$):
 • Sample medians: $\hat{M}_{yh}, \hat{M}_{xh}$; population medians: $M_{yh}, M_{xh}$
 • Auxiliary measures:
  - $QR_h = Q_{3h} - Q_{1h}$, $MR_h = \frac{X_{h,\min} + X_{h,\max}}{2}$, $QA_h = \frac{Q_{1h} + Q_{3h}}{2}$
  - $QD_h = \frac{Q_{3h} - Q_{1h}}{2}$, $TM_h = \frac{Q_{1h} + 2Q_{2h} + Q_{3h}}{4}$, $DM_h = \frac{1}{9}\sum_{i=1}^{9} D_{ih}$
 • Estimate:
$$C_{M_{yh}} = \frac{1}{M_{yh} f_{yh}(M_{yh})}, \quad C_{M_{xh}} = \frac{1}{M_{xh} f_{xh}(M_{xh})}, \quad C_{M_{y_h x_h}} = \rho_{y_h x_h} C_{M_{yh}} C_{M_{xh}}$$
Step 2: Compute intermediate components:
$$\delta_h = \frac{1}{4}\left(\frac{1}{n_h} - \frac{1}{N_h}\right), \quad g_{4h} = \frac{a_{1h} M_{xh}}{a_{1h} M_{xh} + a_{2h}}$$
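$$\Delta_{Dh} = 1 + \delta_h \left[ C_{M_{xh}}^2 \, \frac{3 g_{4h}^2 + 4\alpha_{1h}(g_{4h} + \alpha_{1h} + 1)}{8} - C_{M_{y_h x_h}} (2\alpha_{1h} + g_{4h}) \right]$$
$$\Delta_{Gh} = \delta_h M_{xh} C_{M_{xh}}^2 (2\alpha_{2h} + g_{4h})$$
$$\Delta_{Ah} = 1 + \delta_h \left[ C_{M_{yh}}^2 + C_{M_{xh}}^2 \, \frac{\alpha_{1h}(3\alpha_{1h} + 1) + 2 g_{4h}(g_{4h} + 1)}{2} - 2 C_{M_{y_h x_h}} (2\alpha_{1h} + g_{4h}) \right]$$
$$\Delta_{Bh} = \delta_h M_{xh}^2 C_{M_{xh}}^2, \quad \Delta_{Fh} = \delta_h M_{xh} \left[ C_{M_{xh}}^2 (\alpha_{1h} + \alpha_{2h} + g_{4h}) - C_{M_{y_h x_h}} \right]$$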
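Step 3: Compute optimal constants $k_{1h}$, $k_{2h}$:
$$k_{1h} = \frac{\Delta_{Bh}\Delta_{Dh} - \Delta_{Fh}\Delta_{Gh}}{\Delta_{Ah}\Delta_{Bh} - (\Delta_{Fh})^2}, \quad k_{2h} = \frac{M_{yh}\left(\Delta_{Ah}\Delta_{Gh} - \Delta_{Dh}\Delta_{Fh}\right)}{\Delta_{Ah}\Delta_{Bh} - (\Delta_{Fh})^2}$$
Step 4: Compute the estimator $\hat{M}_S$:
$$\hat{M}_S = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} \left( \frac{M_{xh}}{\hat{M}_{xh}} \right)^{\alpha_{1h}} + k_{2h} (M_{xh} - \hat{M}_{xh}) \left( \frac{M_{xh}}{\hat{M}_{xh}} \right)^{\alpha_{2h}} \right] \exp\left( \frac{a_{1h}(M_{xh} - \hat{M}_{xh})}{a_{1h}(M_{xh} + \hat{M}_{xh}) + 2a_{2h}} \right)$$
Output: Final estimate of the population median $\hat{M}_S$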
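A minimal Python sketch of Steps 1, 3 and 4 of Algorithm 1 is given below, under stated assumptions. The quartiles and deciles are computed with the standard library's "inclusive" interpolation rule, which is only one of several conventions and may differ from the one used by the authors. The Step 2 terms $\Delta_{Ah}, \Delta_{Bh}, \Delta_{Dh}, \Delta_{Fh}, \Delta_{Gh}$ are taken as inputs rather than recomputed. All function names, argument names and example numbers are hypothetical.

```python
import math
import statistics


def robust_measures(x):
    """Step 1: auxiliary measures for one stratum's auxiliary values x.

    Assumption: quartiles and deciles use the 'inclusive' interpolation
    rule of statistics.quantiles; other conventions give other values.
    """
    q1, q2, q3 = statistics.quantiles(x, n=4, method="inclusive")
    deciles = statistics.quantiles(x, n=10, method="inclusive")  # D_1..D_9
    return {
        "QR": q3 - q1,                 # quartile range
        "MR": (min(x) + max(x)) / 2,   # midrange
        "QA": (q1 + q3) / 2,           # quartile average
        "QD": (q3 - q1) / 2,           # quartile deviation
        "TM": (q1 + 2 * q2 + q3) / 4,  # trimean
        "DM": sum(deciles) / 9,        # decile mean
    }


def optimal_constants(dA, dB, dD, dF, dG, My):
    """Step 3: k1 and k2 from the Step 2 Delta terms (supplied by the caller)."""
    denom = dA * dB - dF ** 2
    k1 = (dB * dD - dF * dG) / denom
    k2 = My * (dA * dG - dD * dF) / denom
    return k1, k2


def stratum_term(My_hat, Mx_hat, Mx, k1, k2, alpha1, alpha2, a1, a2):
    """Step 4: bracketed term times the exponential factor for one stratum."""
    ratio = Mx / Mx_hat
    bracket = (k1 * My_hat * ratio ** alpha1
               + k2 * (Mx - Mx_hat) * ratio ** alpha2)
    exponent = a1 * (Mx - Mx_hat) / (a1 * (Mx + Mx_hat) + 2 * a2)
    return bracket * math.exp(exponent)


def estimate_median(strata):
    """Weighted sum over strata; each dict holds W plus stratum_term arguments."""
    total = 0.0
    for s in strata:
        args = {key: value for key, value in s.items() if key != "W"}
        total += s["W"] * stratum_term(**args)
    return total


# Hypothetical two-stratum example with made-up numbers.
example = [
    dict(W=0.4, My_hat=10.0, Mx_hat=5.2, Mx=5.0, k1=0.98, k2=0.3,
         alpha1=1, alpha2=0, a1=6.0, a2=5.5),
    dict(W=0.6, My_hat=20.0, Mx_hat=7.9, Mx=8.0, k1=0.97, k2=0.5,
         alpha1=1, alpha2=0, a1=9.0, a2=8.2),
]
print(estimate_median(example))
```

A useful sanity check on this sketch: with $k_{1h} = 1$, $k_{2h} = 0$ and $\hat{M}_{xh} = M_{xh}$, the ratio and exponential factors both equal 1, so the estimate reduces to the weighted sum of the sample medians $\sum_h W_h \hat{M}_{yh}$.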

References

  1. Alghamdi, A.S.; Alrweili, H. A comparative study of new ratio-type family of estimators under stratified two-phase sampling. Mathematics 2025, 13, 327.
  2. Daraz, U.; Alomair, M.A.; Albalawi, O.; Al Naim, A.S. New techniques for estimating finite population variance using ranks of auxiliary variable in two-stage sampling. Mathematics 2024, 12, 2741.
  3. Alghamdi, A.S.; Almulhim, F.A. Optimizing finite population mean estimation using simulation and empirical data. Mathematics 2025, 13, 1635.
  4. Alomair, M.A.; Daraz, U. Dual transformation of auxiliary variables by using outliers in stratified random sampling. Mathematics 2024, 12, 2839.
  5. Alghamdi, A.S.; Alrweili, H. New class of estimators for finite population mean under stratified double phase sampling with simulation and real-life application. Mathematics 2025, 13, 329.
  6. Gross, S. Median estimation in sample surveys. In Proceedings of the Section on Survey Research Methods; American Statistical Association Ithaca: Alexandria, VA, USA, 1980.
  7. Sedransk, J.; Meyer, J. Confidence intervals for the quantiles of a finite population: Simple random and stratified simple random sampling. J. R. Stat. Soc. Ser. B (Methodol.) 1978, 40, 239–252.
  8. Philip, S.; Sedransk, J. Lower bounds for confidence coefficients for confidence intervals for finite population quantiles. Commun. Stat.-Theory Methods 1983, 12, 1329–1344.
  9. Kuk, Y.C.A.; Mak, T.K. Median estimation in the presence of auxiliary information. J. R. Stat. Soc. Ser. B 1989, 51, 261–269.
  10. Rao, T.J. On certain methods of improving ratio and regression estimators. Commun. Stat.-Theory Methods 1991, 20, 3325–3340.
  11. Singh, S.; Joarder, A.H.; Tracy, D.S. Median estimation using double sampling. Aust. N. Z. J. Stat. 2001, 43, 33–46.
  12. Khoshnevisan, M.; Singh, H.P.; Singh, S.; Smarandache, F. A General Class of Estimators of Population Median Using Two Auxiliary Variables in Double Sampling; Virginia Polytechnic Institute and State University: Blacksburg, VA, USA, 2002.
  13. Singh, S. Advanced Sampling Theory With Applications: How Michael Selected Amy; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2003; Volume 2.
  14. Gupta, S.; Shabbir, J.; Ahmad, S. Estimation of median in two-phase sampling using two auxiliary variables. Commun. Stat.-Theory Methods 2008, 37, 1815–1822.
  15. Aladag, S.; Cingi, H. Improvement in estimating the population median in simple random sampling and stratified random sampling using auxiliary information. Commun. Stat.-Theory Methods 2015, 44, 1013–1032.
  16. Solanki, R.S.; Singh, H.P. Some classes of estimators for median estimation in survey sampling. Commun. Stat.-Theory Methods 2015, 44, 1450–1465.
  17. Daraz, U.; Wu, J.; Albalawi, O. Double exponential ratio estimator of a finite population variance under extreme values in simple random sampling. Mathematics 2024, 12, 1737.
  18. Daraz, U.; Wu, J.; Alomair, M.A.; Aldoghan, L.A. New classes of difference cum-ratio-type exponential estimators for a finite population variance in stratified random sampling. Heliyon 2024, 10, e33402.
  19. Daraz, U.; Alomair, M.A.; Albalawi, O. Variance estimation under some transformation for both symmetric and asymmetric data. Symmetry 2024, 16, 957.
  20. Shabbir, J.; Gupta, S. A generalized class of difference type estimators for population median in survey sampling. Hacet. J. Math. Stat. 2017, 46, 1015–1028.
  21. Almulhim, F.A.; Alghamdi, A.S. Simulation-based evaluation of robust transformation techniques for median estimation under simple random sampling. Axioms 2025, 14, 301.
  22. Daraz, U.; Almulhim, F.A.; Alomair, M.A.; Alomair, A.M. Population median estimation using auxiliary variables: A simulation study with real data across sample sizes and parameters. Mathematics 2025, 13, 1660.
  23. Irfan, M.; Maria, J.; Shongwe, S.C.; Zohaib, M.; Bhatti, S.H. Estimation of population median under robust measures of an auxiliary variable. Math. Probl. Eng. 2021, 2021, 4839077.
  24. Shabbir, J.; Gupta, S.; Narjis, G. On improved class of difference type estimators for population median in survey sampling. Commun. Stat.-Theory Methods 2022, 51, 3334–3354.
  25. Subzar, M.; Lone, S.A.; Ekpenyong, E.J.; Salam, A.; Aslam, M.; Raja, T.A.; Almutlak, S.A. Efficient class of ratio cum median estimators for estimating the population median. PLoS ONE 2023, 18, e0274690.
  26. Iseh, M.J. Model formulation on efficiency for median estimation under a fixed cost in survey sampling. Model Assist. Stat. Appl. 2023, 18, 373–385.
  27. Hussain, M.A.; Javed, M.; Zohaib, M.; Shongwe, S.C.; Awais, M.; Zaagan, A.A.; Irfan, M. Estimation of population median using bivariate auxiliary information in simple random sampling. Heliyon 2024, 10, 7.
  28. Bhushan, S.; Kumar, A.; Lone, S.A.; Anwar, S.; Gunaime, N.M. An efficient class of estimators in stratified random sampling with an application to real data. Axioms 2023, 12, 576.
  29. Bahl, S.; Tuteja, R. Ratio and product type exponential estimators. J. Inf. Optim. Sci. 1991, 12, 159–164.
  30. Daraz, U.; Shabbir, J.; Khan, H. Estimation of finite population mean by using minimum and maximum values in stratified random sampling. J. Mod. Appl. Stat. Methods 2018, 17, 20.
  31. Daraz, U.; Khan, M. Estimation of variance of the difference-cum-ratio-type exponential estimator in simple random sampling. Res. Math. Stat. 2021, 8, 1899402.
  32. Daraz, U.; Agustiana, D.; Wu, J.; Emam, W. Twofold auxiliary information under two-phase sampling: An improved family of double-transformed variance estimators. Axioms 2025, 14, 64.
  33. Daraz, U.; Wu, J.; Agustiana, D.; Emam, W. Finite population variance estimation using Monte Carlo simulation and real life application. Symmetry 2025, 17, 84.
  34. Bureau of Statistics. Punjab Development Statistics Government of the Punjab, Lahore, Pakistan; Bureau of Statistics: Islamabad, Pakistan, 2013. Available online: https://www.pbs.gov.pk/content/microdata (accessed on 10 July 2025).
  35. Bureau of Statistics. Punjab Development Statistics Government of the Punjab, Lahore, Pakistan; Bureau of Statistics: Islamabad, Pakistan, 2014. Available online: https://www.pbs.gov.pk/content/microdata (accessed on 10 July 2025).
Figure 1. A graphical summary of findings derived from data generated across multiple population distributions. The figure compares 16 estimators: the first 8 represent existing methods ($\hat{M}_1$ to $\hat{M}_8$), while the remaining 8 correspond to the proposed estimators ($\hat{M}_{S1}$ to $\hat{M}_{S8}$).
Figure 2. A graphical illustration of the outcomes based on data obtained from three real population datasets. The figure compares 16 estimators: the first 8 represent existing methods ($\hat{M}_1$ to $\hat{M}_8$), while the remaining 8 correspond to the proposed estimators ($\hat{M}_{S1}$ to $\hat{M}_{S8}$). (a) Data-1 (Source: [13]). (b) Data-2 (Source: [34]). (c) Data-3 (Source: [35]).
Figure 3. A boxplot representation of the percent relative efficiency (PRE) values for different estimators across multiple simulated populations.
Figure 4. A boxplot representation of the percent relative efficiency (PRE) values for different estimators across real datasets.
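Table 1. A family of new proposed estimators.

| Different classes of $\hat{M}_S$ | $\alpha_{1h}$ | $\alpha_{2h}$ | $a_{1h}$ | $a_{2h}$ |
| --- | --- | --- | --- | --- |
| $\hat{M}_{S1} = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} \left( \frac{M_{xh}}{\hat{M}_{xh}} \right) + k_{2h} (M_{xh} - \hat{M}_{xh}) \left( \frac{\hat{M}_{xh}}{M_{xh}} \right) \right] L_h$ | $1$ | $-1$ | $QR_h$ | $DM_h$ |
| $\hat{M}_{S2} = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} \left( \frac{M_{xh}}{\hat{M}_{xh}} \right) + k_{2h} (M_{xh} - \hat{M}_{xh}) \right] L_h$ | $1$ | $0$ | $MR_h$ | $TM_h$ |
| $\hat{M}_{S3} = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} \left( \frac{\hat{M}_{xh}}{M_{xh}} \right) + k_{2h} (M_{xh} - \hat{M}_{xh}) \left( \frac{\hat{M}_{xh}}{M_{xh}} \right) \right] L_h$ | $-1$ | $-1$ | $QA_h$ | $QD_h$ |
| $\hat{M}_{S4} = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} + k_{2h} (M_{xh} - \hat{M}_{xh}) \left( \frac{M_{xh}}{\hat{M}_{xh}} \right) \right] L_h$ | $0$ | $1$ | $QD_h$ | $QA_h$ |
| $\hat{M}_{S5} = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} + k_{2h} (M_{xh} - \hat{M}_{xh}) \left( \frac{\hat{M}_{xh}}{M_{xh}} \right) \right] L_h$ | $0$ | $-1$ | $TM_h$ | $MR_h$ |
| $\hat{M}_{S6} = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} \left( \frac{M_{xh}}{\hat{M}_{xh}} \right) + k_{2h} (M_{xh} - \hat{M}_{xh}) \left( \frac{M_{xh}}{\hat{M}_{xh}} \right) \right] L_h$ | $1$ | $1$ | $DM_h$ | $QR_h$ |
| $\hat{M}_{S7} = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} \left( \frac{\hat{M}_{xh}}{M_{xh}} \right) + k_{2h} (M_{xh} - \hat{M}_{xh}) \left( \frac{M_{xh}}{\hat{M}_{xh}} \right) \right] L_h$ | $-1$ | $1$ | $MR_h$ | $QA_h$ |
| $\hat{M}_{S8} = \sum_{h=1}^{L} W_h \left[ k_{1h} \hat{M}_{yh} \left( \frac{\hat{M}_{xh}}{M_{xh}} \right) + k_{2h} (M_{xh} - \hat{M}_{xh}) \right] L_h$ | $-1$ | $0$ | $QD_h$ | $TM_h$ |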
Table 2. Percent relative efficiency (PRE) values for simulated distributions.

| Estimator | Pop-1 | Pop-2 | Pop-3 | Pop-4 | Pop-5 |
| --- | --- | --- | --- | --- | --- |
| $\hat{M}_1$ | 100 | 100 | 100 | 100 | 100 |
| $\hat{M}_2$ | 119.270 | 117.015 | 116.24 | 115.70 | 114.54 |
| $\hat{M}_3$ | 133.18 | 127.12 | 126.11 | 123.08 | 125.10 |
| $\hat{M}_4$ | 137.22 | 133.18 | 131.16 | 128.13 | 133.15 |
| $\hat{M}_5$ | 111.56 | 112.97 | 111.99 | 107.92 | 113.49 |
| $\hat{M}_6$ | 193.77 | 168.53 | 158.43 | 167.52 | 164.49 |
| $\hat{M}_7$ | 241.26 | 191.76 | 166.51 | 190.75 | 181.65 |
| $\hat{M}_8$ | 141.35 | 140.25 | 148.33 | 135.20 | 139.24 |
| $\hat{M}_{S1}$ | 271.31 | 235.50 | 198.94 | 223.29 | 234.22 |
| $\hat{M}_{S2}$ | 290.30 | 256.94 | 232.86 | 244.06 | 257.28 |
| $\hat{M}_{S3}$ | 221.92 | 207.97 | 182.80 | 203.97 | 216.08 |
| $\hat{M}_{S4}$ | 286.36 | 252.49 | 212.00 | 233.25 | 246.40 |
| $\hat{M}_{S5}$ | 215.65 | 203.47 | 177.38 | 197.26 | 210.59 |
| $\hat{M}_{S6}$ | 276.46 | 244.38 | 210.70 | 228.70 | 241.24 |
| $\hat{M}_{S7}$ | 225.99 | 210.60 | 201.30 | 202.15 | 214.81 |
| $\hat{M}_{S8}$ | 245.75 | 230.47 | 209.48 | 206.88 | 219.59 |
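The median line of each box in Figure 3 can be approximated numerically from Table 2, assuming each box summarises one estimator's five PRE values (one per simulated population). The sketch below computes those medians and ranks the estimators; the labels and variable names are conventions assumed for this example.

```python
import statistics

# PRE values transcribed from Table 2; columns are Pop-1 .. Pop-5.
table2 = {
    "M1": (100, 100, 100, 100, 100),
    "M2": (119.270, 117.015, 116.24, 115.70, 114.54),
    "M3": (133.18, 127.12, 126.11, 123.08, 125.10),
    "M4": (137.22, 133.18, 131.16, 128.13, 133.15),
    "M5": (111.56, 112.97, 111.99, 107.92, 113.49),
    "M6": (193.77, 168.53, 158.43, 167.52, 164.49),
    "M7": (241.26, 191.76, 166.51, 190.75, 181.65),
    "M8": (141.35, 140.25, 148.33, 135.20, 139.24),
    "MS1": (271.31, 235.50, 198.94, 223.29, 234.22),
    "MS2": (290.30, 256.94, 232.86, 244.06, 257.28),
    "MS3": (221.92, 207.97, 182.80, 203.97, 216.08),
    "MS4": (286.36, 252.49, 212.00, 233.25, 246.40),
    "MS5": (215.65, 203.47, 177.38, 197.26, 210.59),
    "MS6": (276.46, 244.38, 210.70, 228.70, 241.24),
    "MS7": (225.99, 210.60, 201.30, 202.15, 214.81),
    "MS8": (245.75, 230.47, 209.48, 206.88, 219.59),
}

# Median PRE of each estimator across the five simulated populations.
median_pre = {name: statistics.median(vals) for name, vals in table2.items()}

# Estimators ordered from highest to lowest median PRE.
ranked = sorted(median_pre, key=median_pre.get, reverse=True)

for name in ranked:
    print(f"{name:>4}  {median_pre[name]:.2f}")
```

Ranking by the median rather than by a single population keeps one unusually favourable population, such as Pop-1 for $\hat{M}_7$, from dominating the comparison.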
Table 3. PRE values of various estimators using real populations.

| Estimator | Dataset 1 | Dataset 2 | Dataset 3 |
| --- | --- | --- | --- |
| $\hat{M}_1$ | 100 | 100 | 100 |
| $\hat{M}_2$ | 104.48 | 103.49 | 102.31 |
| $\hat{M}_3$ | 106.46 | 224.199 | 105.40 |
| $\hat{M}_4$ | 113.59 | 220.09 | 105.77 |
| $\hat{M}_5$ | 43.62 | 42.76 | 45.31 |
| $\hat{M}_6$ | 119.70 | 232.48 | 106.50 |
| $\hat{M}_7$ | 121.86 | 233.11 | 106.61 |
| $\hat{M}_8$ | 117.35 | 201.57 | 107.62 |
| $\hat{M}_{S1}$ | 148.19 | 251.67 | 184.34 |
| $\hat{M}_{S2}$ | 182.32 | 274.98 | 207.57 |
| $\hat{M}_{S3}$ | 132.82 | 237.03 | 166.16 |
| $\hat{M}_{S4}$ | 162.21 | 246.33 | 196.40 |
| $\hat{M}_{S5}$ | 127.77 | 239.19 | 160.10 |
| $\hat{M}_{S6}$ | 160.10 | 245.82 | 191.41 |
| $\hat{M}_{S7}$ | 151.01 | 241.02 | 166.93 |
| $\hat{M}_{S8}$ | 159.09 | 243.62 | 171.21 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Alghamdi, A.S.; Almulhim, F.A. Improved Median Estimation in Stratified Surveys via Nontraditional Auxiliary Measures. Symmetry 2025, 17, 1136. https://doi.org/10.3390/sym17071136
