Double Exponential Ratio Estimator of a Finite Population Variance under Extreme Values in Simple Random Sampling

Daraz, Umer; Wu, Jinbiao; Albalawi, Olayan

doi:10.3390/math12111737


by Umer Daraz ¹, Jinbiao Wu ¹,* and Olayan Albalawi ²

¹ School of Mathematics and Statistics, Central South University, Changsha 410017, China

² Department of Statistics, Faculty of Science, University of Tabuk, Tabuk 71491, Saudi Arabia

* Author to whom correspondence should be addressed.

Mathematics 2024, 12(11), 1737; https://doi.org/10.3390/math12111737

Submission received: 13 April 2024 / Revised: 28 May 2024 / Accepted: 29 May 2024 / Published: 3 June 2024

(This article belongs to the Special Issue Survey Statistics and Survey Sampling: Challenges and Opportunities)


Abstract: This article presents an improved class of efficient estimators aimed at estimating the finite population variance of the study variable. These estimators are especially useful when we have information about the minimum/maximum values of the auxiliary variable within a framework of simple random sampling. The characteristics of the proposed class of estimators, including bias and mean squared error (MSE) under simple random sampling, are derived through a first-order approximation. To assess the performance and validate the theoretical outcomes, we conduct a simulation study. Results indicate that the proposed class of estimators has lower MSEs as compared to other existing estimators across all simulation scenarios. Three datasets are used in the application section to emphasize the effectiveness of the proposed class of estimators over conventional unbiased variance estimators, ratio and regression estimators, and other existing estimators.

Keywords: auxiliary information; study variable; minimum and maximum values; variance estimation; bias; MSE

MSC: 62D05

1. Introduction

The goal of survey sampling is to collect precise information on the characteristics of a population, maximizing the efficiency of the estimators under investigation while reducing expense, time, and human effort. Many populations contain a few extreme values, and estimates of unknown population characteristics can be quite sensitive to them: ignoring these values may inflate or deflate the results. When a dataset contains extreme values, the efficiency of classical estimators tends to decrease in terms of mean squared error (MSE), and there may be a temptation to remove such data from the sample. To confront this challenge effectively, it is essential to incorporate this information into the estimation of the population characteristics. By performing a linear transformation on the known minimum and maximum values of the auxiliary variable, ref. [1] provided two estimators. Such designs were not studied further until the work of [2], who applied the idea of using extreme values to various estimators of the finite population mean. Under extreme values, ref. [3] improved the estimation of the finite population mean using a stratified random sampling strategy. For more details, see [4,5,6,7] and references therein.

It is difficult to control variability in applications, and the estimation problem of finite population variance is a significant concern. Researchers encounter this issue in biological and agricultural studies, which makes the desired outcomes appear unpredictable. A careful approach to utilizing auxiliary information can improve the estimator’s accuracy. For estimating the finite population variance, a number of researchers have proposed different kinds of estimators, including [8,9,10,11,12,13,14,15,16,17,18,19,20,21].

In this article, the extreme values of the auxiliary variable are retained in the data and used as auxiliary information. Motivated by [10], we propose an improved class of estimators of the finite population variance that uses the known extreme values of the auxiliary variable under a simple random sampling scheme.

The remainder of this article is organized as follows. Appendix A introduces the concepts and notation. Section 2 describes some existing estimators. Section 3 discusses the proposed class of estimators in depth. Section 4 provides the mathematical comparison. Section 5 presents a simulation study, based on six artificial populations generated from different probability distributions, to verify the theoretical results of Section 4, together with numerical examples that illustrate these results. Finally, Section 6 gives some conclusions and ideas for further research.

2. Existing Estimators

In this section, we consider the existing estimators of the finite population variances and compare them with our proposed class of estimators.

The usual unbiased estimator of the population variance is $\hat{S}_{y}^{2} = s_{y}^{2}$, and its variance is given by:

Var(\hat{S}_{y}^{2}) = θ S_{y}^{4} λ_{40}^{*} .

(1)

For the population variance, [12] proposed the ratio estimator $\hat{S}_{r}^{2}$, which is given by:

\hat{S}_{r}^{2} = s_{y}^{2} \left( \frac{S_{x}^{2}}{s_{x}^{2}} \right) .

(2)

The bias and MSE of $\hat{S}_{r}^{2}$ are expressed as follows:

Bias(\hat{S}_{r}^{2}) ≅ θ S_{y}^{2} (λ_{04}^{*} - λ_{22}^{*})

(3)

and

MSE(\hat{S}_{r}^{2}) ≅ θ S_{y}^{4} (λ_{40}^{*} + λ_{04}^{*} - 2 λ_{22}^{*}) .

(4)

The linear regression estimator $\hat{S}_{lr}^{2}$, proposed by [22], is defined as:

\hat{S}_{lr}^{2} = s_{y}^{2} + b_{(s_{y}^{2}, s_{x}^{2})} (S_{x}^{2} - s_{x}^{2}),

(5)

where $b_{(s_{y}^{2}, s_{x}^{2})} = \frac{s_{y}^{2} \hat{λ}_{22}^{*}}{s_{x}^{2} \hat{λ}_{04}^{*}}$ is the sample regression coefficient.

The MSE of the estimator $\hat{S}_{lr}^{2}$ is expressed as follows:

MSE(\hat{S}_{lr}^{2}) ≅ θ S_{y}^{4} λ_{40}^{*} (1 - ρ^{* 2}),

(6)

where $ρ^{*} = \frac{λ_{22}^{*}}{\sqrt{λ_{40}^{*}} \sqrt{λ_{04}^{*}}}$.

For the population variance under simple random sampling, [9] proposed an exponential ratio-type estimator $\hat{S}_{bt}^{2}$, which is defined as:

\hat{S}_{bt}^{2} = s_{y}^{2} \exp \left( \frac{S_{x}^{2} - s_{x}^{2}}{S_{x}^{2} + s_{x}^{2}} \right) .

(7)

The bias and MSE of $\hat{S}_{bt}^{2}$ are expressed as follows:

Bias(\hat{S}_{bt}^{2}) ≅ \frac{1}{2} θ S_{y}^{2} \left( \frac{3 λ_{04}^{*}}{4} - λ_{22}^{*} \right)

(8)

and

MSE(\hat{S}_{bt}^{2}) ≅ θ S_{y}^{4} \left( λ_{40}^{*} + \frac{λ_{04}^{*}}{4} - λ_{22}^{*} \right) .

(9)

Using the kurtosis of an auxiliary variable in simple random sampling, ref. [18] suggested a ratio-type estimator $\hat{S}_{us}^{2}$, which is defined as:

\hat{S}_{us}^{2} = s_{y}^{2} \left( \frac{S_{x}^{2} + λ_{04}}{s_{x}^{2} + λ_{04}} \right) .

(10)

The bias and MSE of $\hat{S}_{us}^{2}$ are expressed as follows:

Bias(\hat{S}_{us}^{2}) ≅ θ S_{y}^{2} g_{0} (g_{0} λ_{04}^{*} - λ_{22}^{*})

(11)

and

MSE(\hat{S}_{us}^{2}) ≅ θ S_{y}^{4} (λ_{40}^{*} + g_{0}^{2} λ_{04}^{*} - 2 g_{0} λ_{22}^{*}),

(12)

where $g_{0} = \frac{S_{x}^{2}}{S_{x}^{2} + λ_{04}}$.

According to [13], some ratio estimators are defined as:

{\hat{S}}_{c k_{1}}^{2} = s_{y}^{2} (\frac{S_{x}^{2} + C_{x}}{s_{x}^{2} + C_{x}}),

(13)

{\hat{S}}_{c k_{2}}^{2} = s_{y}^{2} (\frac{λ_{04} S_{x}^{2} + C_{x}}{λ_{04} s_{x}^{2} + C_{x}})

(14)

and

{\hat{S}}_{c k_{3}}^{2} = s_{y}^{2} (\frac{C_{x} S_{x}^{2} + λ_{04}}{C_{x} s_{x}^{2} + λ_{04}}) .

(15)

The bias and MSE of $\hat{S}_{ck_{i}}^{2}$ $(i = 1, 2, 3)$ are expressed as follows:

Bias(\hat{S}_{ck_{i}}^{2}) ≅ θ S_{y}^{2} g_{i} (g_{i} λ_{04}^{*} - λ_{22}^{*})

(16)

and

MSE(\hat{S}_{ck_{i}}^{2}) ≅ θ S_{y}^{4} (λ_{40}^{*} + g_{i}^{2} λ_{04}^{*} - 2 g_{i} λ_{22}^{*}),

(17)

where $g_{1} = \frac{S_{x}^{2}}{S_{x}^{2} + C_{x}}$, $g_{2} = \frac{λ_{04} S_{x}^{2}}{λ_{04} S_{x}^{2} + C_{x}}$, and $g_{3} = \frac{C_{x} S_{x}^{2}}{C_{x} S_{x}^{2} + λ_{04}}$.
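The first-order MSE expressions (1), (4), (6), (9), (12) and (17) share a common structure, which the following minimal Python sketch makes explicit. The function name `existing_mse` and the input values are hypothetical and chosen only for illustration; they are not taken from the article's tables.

```python
def existing_mse(n, N, Sy2, Sx2, Cx, lam40, lam04, lam22):
    """First-order MSEs from Eqs. (1), (4), (6), (9), (12) and (17)."""
    theta = 1.0 / n - 1.0 / N
    # Starred moments as defined in Appendix A
    l40, l04, l22 = lam40 - 1.0, lam04 - 1.0, lam22 - 1.0
    scale = theta * Sy2 ** 2  # theta * S_y^4

    def ratio_type(g):
        # Shared form of Eqs. (4), (12) and (17); g = 1 recovers Eq. (4)
        return scale * (l40 + g * g * l04 - 2.0 * g * l22)

    g0 = Sx2 / (Sx2 + lam04)
    g1 = Sx2 / (Sx2 + Cx)
    g2 = lam04 * Sx2 / (lam04 * Sx2 + Cx)
    g3 = Cx * Sx2 / (Cx * Sx2 + lam04)
    rho2 = l22 ** 2 / (l40 * l04)  # squared rho* of Eq. (6)
    return {
        "y": scale * l40,                        # Eq. (1)
        "r": ratio_type(1.0),                    # Eq. (4)
        "lr": scale * l40 * (1.0 - rho2),        # Eq. (6)
        "bt": scale * (l40 + l04 / 4.0 - l22),   # Eq. (9)
        "us": ratio_type(g0),                    # Eq. (12)
        "ck1": ratio_type(g1),                   # Eq. (17), i = 1
        "ck2": ratio_type(g2),                   # Eq. (17), i = 2
        "ck3": ratio_type(g3),                   # Eq. (17), i = 3
    }


# Illustrative inputs only (hypothetical, not from the article)
mse = existing_mse(n=20, N=200, Sy2=4.0, Sx2=9.0, Cx=0.5,
                   lam40=3.2, lam04=3.0, lam22=1.8)
```

Because the ratio-type expressions are quadratic in $g$ and minimized at $g = λ_{22}^{*} / λ_{04}^{*}$, the regression MSE in (6) is a lower bound for every ratio-type entry in the returned dictionary.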

3. Proposed Estimator

In this section, motivated by [10], an improved class of estimators is introduced by utilizing the known minimum and maximum values of the auxiliary variable to estimate the finite population variance. The proposed estimator is defined as:

{\hat{S}}_{e d}^{2} = s_{y}^{2} exp [ω_{1} \{\frac{(\bar{X} - \bar{x})}{(\bar{X} + \bar{x}) + 2 a_{1}}\}] exp [ω_{2} \{\frac{a_{2} (S_{x}^{2} - s_{x}^{2})}{a_{2} (S_{x}^{2} + s_{x}^{2}) + 2 a_{3}}\}],

(18)

where $ω_{i}$ $(i = 1, 2)$ are known constants, and $a_{1} = X_{M} - X_{m}$, $a_{2}$, and $a_{3}$ are parameters of the auxiliary variable. Several members of the proposed class are derived from (18) and listed in Table 1, where

L_{1} = (\frac{(\bar{X} - \bar{x})}{(\bar{X} + \bar{x}) + 2 (x_{M} - x_{m})}) .
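As an illustration of how (18) is evaluated on a single sample, the following Python sketch computes the estimator from sample values and known population quantities. The function name `s2_ed` and its argument names are hypothetical; the default $ω_{1} = ω_{2} = 1$ is assumed here only because those values are used later when the bias and MSE are simplified.

```python
import math
from statistics import mean, variance


def s2_ed(y, x, X_bar, Sx2, x_min, x_max, a2, a3, w1=1.0, w2=1.0):
    """Evaluate Eq. (18) for one sample; a1 is the range x_max - x_min."""
    a1 = x_max - x_min
    x_bar = mean(x)
    sx2 = variance(x)  # sample variance, divisor n - 1
    sy2 = variance(y)
    # Exponent of the first factor: information on the mean and the range
    t1 = (X_bar - x_bar) / ((X_bar + x_bar) + 2.0 * a1)
    # Exponent of the second factor: information on the variance
    t2 = a2 * (Sx2 - sx2) / (a2 * (Sx2 + sx2) + 2.0 * a3)
    return sy2 * math.exp(w1 * t1) * math.exp(w2 * t2)


# Toy sample (hypothetical numbers): the sample mean and variance of x
# coincide with the population values, so the estimator reduces to s_y^2
estimate = s2_ed(y=[2.0, 4.0, 9.0], x=[1.0, 2.0, 3.0],
                 X_bar=2.0, Sx2=1.0, x_min=0.0, x_max=5.0, a2=1.0, a3=5.0)
```

When the sample mean and variance of the auxiliary variable fall below their population values (with positive $a_{2}$ and $a_{3}$), both exponents are positive and the estimate is adjusted upward from $s_{y}^{2}$.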

Now, we rewrite (18) in terms of errors to obtain the bias and the MSE of the suggested estimator $\hat{S}_{ed}^{2}$, that is:

\hat{S}_{ed}^{2} = S_{y}^{2} (1 + e_{0}) \exp \left[ \frac{- ω_{1} g_{4} e_{1}}{2} \left( 1 + \frac{g_{4} e_{1}}{2} \right)^{- 1} \right] \exp \left[ \frac{- ω_{2} g_{5} e_{2}}{2} \left( 1 + \frac{g_{5} e_{2}}{2} \right)^{- 1} \right],

(19)

where $g_{4} = \frac{\bar{X}}{\bar{X} + a_{1}}$ and $g_{5} = \frac{a_{2} S_{x}^{2}}{a_{2} S_{x}^{2} + a_{3}}$.

Expanding the exponentials in (19) in a Taylor series and retaining terms up to the second order in the errors, we obtain:

\hat{S}_{ed}^{2} - S_{y}^{2} ≅ S_{y}^{2} \left[ e_{0} - \frac{ω_{1} g_{4}}{2} e_{1} - \frac{ω_{2} g_{5}}{2} e_{2} + \left( \frac{ω_{1} g_{4}^{2}}{4} + \frac{ω_{1}^{2} g_{4}^{2}}{8} \right) e_{1}^{2} + \left( \frac{ω_{2} g_{5}^{2}}{4} + \frac{ω_{2}^{2} g_{5}^{2}}{8} \right) e_{2}^{2} - \frac{ω_{1} g_{4}}{2} e_{0} e_{1} - \frac{ω_{2} g_{5}}{2} e_{0} e_{2} + \frac{ω_{1} ω_{2} g_{4} g_{5}}{4} e_{1} e_{2} \right],

(20)

where the coefficient of $e_{1} e_{2}$ is the product of the two first-order coefficients, $(ω_{1} g_{4} / 2)(ω_{2} g_{5} / 2)$.

Taking the expectation of (20), the bias of $\hat{S}_{ed}^{2}$ is given by:

Bias(\hat{S}_{ed}^{2}) ≅ θ S_{y}^{2} \left[ \left( \frac{ω_{1} g_{4}^{2}}{4} + \frac{ω_{1}^{2} g_{4}^{2}}{8} \right) C_{x}^{2} + \left( \frac{ω_{2} g_{5}^{2}}{4} + \frac{ω_{2}^{2} g_{5}^{2}}{8} \right) λ_{04}^{*} - \frac{ω_{1} g_{4}}{2} C_{x} λ_{21} - \frac{ω_{2} g_{5}}{2} λ_{22}^{*} + \frac{ω_{1} ω_{2} g_{4} g_{5}}{4} C_{x} λ_{03} \right] .

(21)

Squaring both sides of (20), retaining terms up to the second order, and taking the expectation gives the first-order approximate MSE; here the $e_{1} e_{2}$ term carries the factor 2 that arises from squaring the first-order part of (20):

MSE(\hat{S}_{ed}^{2}) ≅ θ S_{y}^{4} \left[ λ_{40}^{*} + \frac{ω_{1}^{2} g_{4}^{2}}{4} C_{x}^{2} + \frac{ω_{2}^{2} g_{5}^{2}}{4} λ_{04}^{*} - ω_{1} g_{4} C_{x} λ_{21} - ω_{2} g_{5} λ_{22}^{*} + \frac{ω_{1} ω_{2} g_{4} g_{5}}{2} C_{x} λ_{03} \right] .

(22)

Substituting the known constants $ω_{1} = ω_{2} = 1$ into (21) and (22) and simplifying, we obtain:

Bias(\hat{S}_{ed}^{2}) ≅ θ S_{y}^{2} \left[ \frac{3}{8} (g_{4}^{2} C_{x}^{2} + g_{5}^{2} λ_{04}^{*}) - \frac{1}{2} (g_{4} C_{x} λ_{21} + g_{5} λ_{22}^{*}) + \frac{1}{4} g_{4} g_{5} C_{x} λ_{03} \right]

(23)

and

MSE(\hat{S}_{ed}^{2}) ≅ θ S_{y}^{4} \left[ λ_{40}^{*} + \frac{1}{4} (g_{4}^{2} C_{x}^{2} + g_{5}^{2} λ_{04}^{*}) - \frac{1}{2} (2 g_{4} C_{x} λ_{21} + 2 g_{5} λ_{22}^{*} - g_{4} g_{5} C_{x} λ_{03}) \right] .

(24)
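The first-order MSE in (24) can be evaluated directly from summary statistics. The following Python sketch does so under the assumption $ω_{1} = ω_{2} = 1$; the function name `mse_ed` and the numerical inputs are hypothetical and serve only as an illustration.

```python
def mse_ed(n, N, Sy2, Sx2, X_bar, Cx, x_min, x_max, a2, a3,
           lam40, lam04, lam22, lam21, lam03):
    """First-order MSE of the proposed estimator, Eq. (24)."""
    theta = 1.0 / n - 1.0 / N
    # Starred moments as defined in Appendix A
    l40, l04, l22 = lam40 - 1.0, lam04 - 1.0, lam22 - 1.0
    a1 = x_max - x_min                  # range of the auxiliary variable
    g4 = X_bar / (X_bar + a1)
    g5 = a2 * Sx2 / (a2 * Sx2 + a3)
    quad = g4 ** 2 * Cx ** 2 + g5 ** 2 * l04
    cross = 2.0 * g4 * Cx * lam21 + 2.0 * g5 * l22 - g4 * g5 * Cx * lam03
    return theta * Sy2 ** 2 * (l40 + quad / 4.0 - cross / 2.0)


# Illustrative inputs only (hypothetical, not from the article)
value = mse_ed(n=20, N=200, Sy2=4.0, Sx2=9.0, X_bar=10.0, Cx=0.5,
               x_min=5.0, x_max=15.0, a2=1.0, a3=9.0,
               lam40=3.2, lam04=3.0, lam22=1.8, lam21=0.4, lam03=0.3)
```

As a consistency check, setting $g_{4} = 0$ and $g_{5} = 1$ in (24) reproduces the MSE of the exponential ratio-type estimator in (9).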

4. Mathematical Comparison

In this section, we compare the proposed class of estimators $\hat{S}_{ed}^{2}$ with the existing estimators $\hat{S}_{y}^{2}, \hat{S}_{r}^{2}, \hat{S}_{lr}^{2}, \hat{S}_{bt}^{2}, \hat{S}_{us}^{2}$, and $\hat{S}_{ck_{i}}^{2}$ $(i = 1, 2, 3)$.

Condition (i): By (1) and (24), $Var(\hat{S}_{y}^{2}) > MSE(\hat{S}_{ed}^{2})$ if

(2 g_{4} C_{x} λ_{21} + 2 g_{5} λ_{22}^{*} - g_{4} g_{5} C_{x} λ_{03}) > \frac{1}{2} (g_{4}^{2} C_{x}^{2} + g_{5}^{2} λ_{04}^{*}) .
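As a sketch of the step behind Condition (i), subtracting (24) from (1) gives the inequality directly. The shorthand $A$ and $B$ below is introduced only for this illustration and does not appear in the original derivation.

```latex
% Shorthand (assumed for illustration only)
\begin{aligned}
A &= g_{4}^{2} C_{x}^{2} + g_{5}^{2} \lambda_{04}^{*}, \qquad
B = 2 g_{4} C_{x} \lambda_{21} + 2 g_{5} \lambda_{22}^{*} - g_{4} g_{5} C_{x} \lambda_{03}, \\
Var(\hat{S}_{y}^{2}) - MSE(\hat{S}_{ed}^{2})
  &\cong \theta S_{y}^{4} \left[ \lambda_{40}^{*} - \lambda_{40}^{*} - \tfrac{1}{4} A + \tfrac{1}{2} B \right]
   = \tfrac{1}{2} \theta S_{y}^{4} \left[ B - \tfrac{1}{2} A \right].
\end{aligned}
```

Since $θ S_{y}^{4} > 0$ whenever $n < N$, the difference is positive exactly when $B > \frac{1}{2} A$, which is the stated condition. The remaining conditions follow in the same way from the corresponding MSE expressions.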

Condition (ii): By (4) and (24), $MSE(\hat{S}_{r}^{2}) > MSE(\hat{S}_{ed}^{2})$ if

[2 g_{4} C_{x} λ_{21} + 2 λ_{22}^{*} (g_{5} - 2) - g_{4} g_{5} C_{x} λ_{03}] > \frac{1}{2} [g_{4}^{2} C_{x}^{2} + λ_{04}^{*} (g_{5}^{2} - 4)] .

Condition (iii): By (6) and (24), $MSE(\hat{S}_{lr}^{2}) > MSE(\hat{S}_{ed}^{2})$ if

(2 g_{4} C_{x} λ_{21} + 2 g_{5} λ_{22}^{*} - g_{4} g_{5} C_{x} λ_{03}) > \frac{1}{2} (4 ρ^{* 2} λ_{40}^{*} + g_{4}^{2} C_{x}^{2} + g_{5}^{2} λ_{04}^{*}) .

Condition (iv): By (9) and (24), $MSE(\hat{S}_{bt}^{2}) > MSE(\hat{S}_{ed}^{2})$ if

[2 g_{4} C_{x} λ_{21} + 2 λ_{22}^{*} (g_{5} - 1) - g_{4} g_{5} C_{x} λ_{03}] > \frac{1}{2} [g_{4}^{2} C_{x}^{2} + λ_{04}^{*} (g_{5}^{2} - 1)] .

Condition (v): By (12) and (24), $MSE(\hat{S}_{us}^{2}) > MSE(\hat{S}_{ed}^{2})$ if

[2 g_{4} C_{x} λ_{21} + 2 λ_{22}^{*} (g_{5} - 2 g_{0}) - g_{4} g_{5} C_{x} λ_{03}] > \frac{1}{2} [g_{4}^{2} C_{x}^{2} + λ_{04}^{*} (g_{5}^{2} - 4 g_{0}^{2})] .

Condition (vi): By (17) and (24), $MSE(\hat{S}_{ck_{i}}^{2}) > MSE(\hat{S}_{ed}^{2})$ if

[2 g_{4} C_{x} λ_{21} + 2 λ_{22}^{*} (g_{5} - 2 g_{i}) - g_{4} g_{5} C_{x} λ_{03}] > \frac{1}{2} [g_{4}^{2} C_{x}^{2} + λ_{04}^{*} (g_{5}^{2} - 4 g_{i}^{2})] .

5. Numerical Comparison

In this section, to examine the performance of the proposed class of estimators, we compare the MSEs of different estimators using a simulation study and three real datasets.

5.1. Simulation Study

To validate the theoretical findings presented in Section 4, we follow the approach of [10] to carry out a simulation study. The auxiliary variable X is generated artificially for six distinct populations using the following probability distributions:

Population 1: $X \sim Uniform(α_{1} = 0, α_{2} = 3)$;
Population 2: $X \sim Uniform(α_{1} = 4, α_{2} = 6)$;
Population 3: $X \sim Gamma(γ_{1} = 5, γ_{2} = 6)$;
Population 4: $X \sim Gamma(γ_{1} = 8, γ_{2} = 10)$;
Population 5: $X \sim Exponential(μ = 2)$;
Population 6: $X \sim Exponential(μ = 6)$.

Subsequently, the variable of interest Y is calculated as:

Y = r_{y x} \times X + e,

where $r_{y x} = 0.80$ represents the correlation coefficient between the study and auxiliary variables, and $e \sim N(0, 1)$ denotes the error term.

In R (v. 4.4.0), the following steps were carried out to determine the mean squared errors (MSEs) of the suggested estimators:

Step 1: First, we use one of the probability distributions listed above to generate a population of size 1500.
Step 2: From Step 1, we obtain the population total, as well as the minimum and maximum values of the auxiliary variable.
Step 3: We use simple random sampling without replacement (SRSWOR) to draw samples of different sizes from each population.
Step 4: Determine the MSE values of all the estimators covered in this article for each sample size.
Step 5: Steps 3 and 4 are repeated 60,000 times; the findings for the artificial populations are presented in Table 2, while Table 3 summarizes the results for the real datasets.
Step 6: Use the following formula to obtain the MSE of each estimator across all replications:

$M S E {({\hat{S}}_{k}^{2})}_{min} = \frac{\sum_{g = 1}^{60,000} {({\hat{S}}_{k}^{2} - S_{Y}^{2})}^{2}}{60,000}, k = r, l r, b t, u s, c k_{1}, c k_{2}, c k_{3}, e d_{i} (i = 1, 2, \dots, 8) .$
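The steps above can be sketched in Python as follows. This is only an illustrative outline, not the authors' R code: it uses Population 1, a reduced number of replications, a single sample size, and one hypothetical member of the proposed class with $a_{2} = 1$ and $a_{3} = x_{M} - x_{m}$. The function name `simulate_mse`, the seed, and these defaults are all assumptions.

```python
import math
import random


def _mean(v):
    return sum(v) / len(v)


def _var(v):
    # Unbiased variance with divisor len(v) - 1
    m = _mean(v)
    return sum((t - m) ** 2 for t in v) / (len(v) - 1)


def simulate_mse(reps=2000, N=1500, n=50, seed=1):
    """Monte Carlo MSE of three variance estimators under SRSWOR."""
    rng = random.Random(seed)
    # Step 1: X ~ Uniform(0, 3), Y = 0.80 * X + e with e ~ N(0, 1)
    X = [rng.uniform(0.0, 3.0) for _ in range(N)]
    Y = [0.80 * x + rng.gauss(0.0, 1.0) for x in X]
    # Step 2: known population quantities
    X_bar, Sx2, Sy2 = _mean(X), _var(X), _var(Y)
    a1 = max(X) - min(X)
    a2, a3 = 1.0, a1  # hypothetical choice of the class parameters
    sq_err = {"usual": 0.0, "ratio": 0.0, "ed": 0.0}
    for _ in range(reps):
        # Step 3: simple random sample without replacement
        idx = rng.sample(range(N), n)
        xs = [X[i] for i in idx]
        ys = [Y[i] for i in idx]
        x_bar, sx2, sy2 = _mean(xs), _var(xs), _var(ys)
        t1 = (X_bar - x_bar) / ((X_bar + x_bar) + 2.0 * a1)
        t2 = a2 * (Sx2 - sx2) / (a2 * (Sx2 + sx2) + 2.0 * a3)
        est = {
            "usual": sy2,                                # s_y^2
            "ratio": sy2 * Sx2 / sx2,                    # Eq. (2)
            "ed": sy2 * math.exp(t1) * math.exp(t2),     # Eq. (18), w1 = w2 = 1
        }
        # Steps 4 to 6: accumulate squared errors around the true variance
        for k, v in est.items():
            sq_err[k] += (v - Sy2) ** 2
    return {k: v / reps for k, v in sq_err.items()}


if __name__ == "__main__":
    print(simulate_mse())
```

The values produced by this sketch depend on the seed, the sample size, and the number of replications, so they are not expected to reproduce the entries of Table 2.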

5.2. Numerical Examples

To check the effectiveness of the proposed estimators, we employed three real datasets to compare the mean squared errors (MSEs) of the various estimators. The descriptions and summary statistics of the datasets are provided below:

Data 1. (Source: [10])
Y: The total enrollment of students in 2012 and X: Government elementary and secondary schools in 2012. The following are the summary statistics:

$\begin{matrix} N = 36, n = 15, \bar{X} = 1054.39, \bar{Y} = 148,718.70, X_{M} = 2370, X_{m} = 388, S_{x} = 402.61, \\ S_{y} = 182,315.10, C_{x} = 0.38, C_{y} = 1.23, ρ_{y x} = 0.18, λ_{40} = 2366, λ_{04} = 4698, λ_{03} = 3297, \\ λ_{21} = 3298, λ_{22} = 2976 . \end{matrix}$
Data 2. (Source: [10])
Y: Departmental employment levels in 2012 and X: Number of factories the departments registered in 2012. The following are the summary statistics:

$\begin{matrix} N = 36, n = 15, \bar{X} = 335.78, \bar{Y} = 52,432.86, X_{M} = 2055, X_{m} = 24, S_{x} = 451.14, \\ S_{y} = 178,201.10, C_{x} = 1.34, C_{y} = 3.3986, ρ_{y x} = 0.39, λ_{40} = 3366, λ_{04} = 4698, \\ λ_{03} = 3697, λ_{21} = 3698, λ_{22} = 3276 . \end{matrix}$
Data 3. (Source: [23], p. 24)
Y: Food expenses related to the family’s employment and X: Families’ weekly income. The following are the summary statistics:

$\begin{matrix} N = 33, n = 5, \bar{X} = 72.55, \bar{Y} = 27.49, X_{M} = 95, X_{m} = 58, S_{x} = 10.58, S_{y} = 10.13, \\ C_{x} = 0.15, C_{y} = 0.37, ρ_{y x} = 0.25, λ_{40} = 3.25, λ_{04} = 3.80, λ_{03} = 1.00, λ_{21} = 2.91, \\ λ_{22} = 2.22 . \end{matrix}$

To evaluate the performance of the proposed class of estimators, we employed simulation studies and three real datasets. In order to compare various estimators, the MSE criterion is applied. Table 2 presents the MSE values of the proposed and existing estimators derived from the simulation study, whereas Table 3 presents the results for real datasets. Furthermore, various diagrams are used to present the MSEs of the proposed and existing estimators for both simulation studies and real datasets. The following are some general findings:

For all simulation scenarios and real datasets, Table 2 and Table 3 show that the MSE values of the proposed estimators are lower than those of the existing estimators described in the literature. This confirms the superior performance of the proposed estimators in comparison to the existing ones.
Figure 1, Figure 2 and Figure 3 illustrate the same pattern for the simulation studies and real datasets: the lines in the graphs trend downward from the existing estimators to the proposed ones, indicating that the MSEs of the proposed estimators are lower than those of the existing estimators.

6. Conclusions

In this article, we introduced a set of efficient estimators for estimating the finite population variance. These estimators utilize the known minimum and maximum values of the auxiliary variable. To compare the properties of the proposed estimators with existing ones, we presented theoretical conditions in Section 4 that demonstrate the superior efficiency of the proposed estimators. To validate these conditions, we conducted a simulation study and analyzed several empirical datasets. The results, as shown in Table 2, indicate that the proposed estimators consistently outperform the existing estimators in terms of MSE. This observation is further supported by the empirical data presented in Table 3, which confirms the theoretical findings in Section 4. Based on both the simulation and empirical results, we conclude that the proposed estimators $\hat{S}_{ed_{i}}^{2}$ $(i = 1, 2, 3, \dots, 8)$ exhibit greater efficiency compared to the other considered estimators. Among these proposed estimators, $\hat{S}_{ed_{8}}^{2}$ is particularly preferred due to its lowest MSE.

In this work, we analyzed the properties of the proposed class of estimators only under a simple random sampling scheme. New estimators could also be developed for two-phase sampling, and our results may help in finding more efficient estimators with smaller MSEs. Further research on this topic can help improve the estimators’ efficiency.

Author Contributions

Conceptualization, U.D. and J.W.; Methodology, U.D.; Software, U.D.; Validation, J.W.; Formal analysis, J.W. and O.A.; Investigation, J.W.; Resources, J.W. and O.A.; Data curation, U.D. and J.W.; Writing—original draft, U.D.; Writing—review & editing, U.D.; Visualization, U.D. and O.A.; Supervision, J.W.; Project administration, U.D. and O.A.; Funding acquisition, O.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was funded by the National Social Science Foundation of China under grant number 20BTJ044.

Data Availability Statement

The real data are secondary, and their sources are given in Section 5.2; the simulated data were generated using R software.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Let us consider a finite population $δ = (δ_{1}, δ_{2}, \dots, δ_{N})$, where the values of the auxiliary variable X and the study variable Y for the i-th unit are denoted by $X_{i}$ and $Y_{i}$, respectively. The population means of the auxiliary and study variables are $\bar{X} = (1 / N) \sum_{i = 1}^{N} X_{i}$ and $\bar{Y} = (1 / N) \sum_{i = 1}^{N} Y_{i}$, respectively. The corresponding population variances are $S_{x}^{2} = \frac{1}{N - 1} \sum_{i = 1}^{N} (X_{i} - \bar{X})^{2}$ and $S_{y}^{2} = \frac{1}{N - 1} \sum_{i = 1}^{N} (Y_{i} - \bar{Y})^{2}$, respectively. The population correlation coefficient between Y and X is $ρ_{y x} = S_{y x} / (S_{y} S_{x})$, where $S_{y x}$ is the population covariance.

A random sample of n units is selected from the population by simple random sampling without replacement in order to estimate the unknown population variance $S_{y}^{2}$. The sample means of the auxiliary and study variables are $\bar{x} = (1 / n) \sum_{i = 1}^{n} X_{i}$ and $\bar{y} = (1 / n) \sum_{i = 1}^{n} Y_{i}$, and the corresponding sample variances are $s_{x}^{2} = \frac{1}{n - 1} \sum_{i = 1}^{n} (X_{i} - \bar{x})^{2}$ and $s_{y}^{2} = \frac{1}{n - 1} \sum_{i = 1}^{n} (Y_{i} - \bar{y})^{2}$, respectively. Additionally, the sample coefficients of variation of the auxiliary and study variables are $c_{x} = s_{x} / \bar{x}$ and $c_{y} = s_{y} / \bar{y}$, with population counterparts $C_{x} = S_{x} / \bar{X}$ and $C_{y} = S_{y} / \bar{Y}$.

To derive the biases and mean squared errors of the various estimators, we define the following error terms:

e_{0} = \frac{s_{y}^{2} - S_{y}^{2}}{S_{y}^{2}}, \quad e_{1} = \frac{\bar{x} - \bar{X}}{\bar{X}}, \quad e_{2} = \frac{s_{x}^{2} - S_{x}^{2}}{S_{x}^{2}},

such that $E(e_{i}) = 0$ for $i = 0, 1, 2$, and

E(e_{0}^{2}) = θ λ_{40}^{*}, \quad E(e_{1}^{2}) = θ C_{x}^{2}, \quad E(e_{2}^{2}) = θ λ_{04}^{*},

E(e_{0} e_{1}) = θ C_{x} λ_{21}, \quad E(e_{0} e_{2}) = θ λ_{22}^{*}, \quad E(e_{1} e_{2}) = θ C_{x} λ_{03},

where $λ_{40}^{*} = λ_{40} - 1$, $λ_{04}^{*} = λ_{04} - 1$, $λ_{22}^{*} = λ_{22} - 1$, and $θ = \frac{1}{n} - \frac{1}{N}$.

Also:

λ_{r s} = \frac{μ_{r s}}{μ_{20}^{r / 2} μ_{02}^{s / 2}},

where

μ_{r s} = \frac{\sum_{i = 1}^{N} (Y_{i} - \bar{Y})^{r} (X_{i} - \bar{X})^{s}}{N - 1},

and $λ_{40} = β_{2 (y)}$ and $λ_{04} = β_{2 (x)}$ are the population coefficients of kurtosis.
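The standardized moments $λ_{r s}$ can be computed directly from population values. The following Python sketch is a minimal illustration; the function name `lam` and the toy data are hypothetical.

```python
def lam(Y, X, r, s):
    """Standardized moment lambda_rs = mu_rs / (mu_20^(r/2) * mu_02^(s/2))."""
    N = len(Y)
    y_bar = sum(Y) / N
    x_bar = sum(X) / N

    def mu(p, q):
        # Central product moment with divisor N - 1, as in Appendix A
        total = sum((y - y_bar) ** p * (x - x_bar) ** q for y, x in zip(Y, X))
        return total / (N - 1)

    return mu(r, s) / (mu(2, 0) ** (r / 2) * mu(0, 2) ** (s / 2))


# Toy population (hypothetical values, for illustration only)
X = [-1.0, 1.0, -1.0, 1.0]
Y = [2.0 * x for x in X]
kurt_x = lam(Y, X, 0, 4)   # lambda_04, the kurtosis coefficient of X
corr = lam(Y, X, 1, 1)     # lambda_11, the correlation between Y and X
```

With this definition, $λ_{20} = λ_{02} = 1$ and $λ_{11}$ equals the correlation coefficient $ρ_{y x}$, so a perfectly linear toy population gives `corr` equal to 1 up to rounding.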

References

1. Mohanty, S.; Sahoo, J. A note on improving the ratio method of estimation through linear transformation using certain known population parameters. Sankhyā Indian J. Stat. Ser. 1995, 57, 93–102.
2. Khan, M.; Shabbir, J. Some improved ratio, product, and regression estimators of finite population mean when using minimum and maximum values. Sci. World J. 2013, 2013, 431868.
3. Daraz, U.; Shabbir, J.; Khan, H. Estimation of finite population mean by using minimum and maximum values in stratified random sampling. J. Mod. Appl. Stat. Methods 2018, 17, 20.
4. Cekim, H.O.; Cingi, H. Some estimator types for population mean using linear transformation with the help of the minimum and maximum values of the auxiliary variable. Hacet. J. Math. Stat. 2017, 46, 685–694.
5. Chatterjee, S.; Hadi, A.S. Regression Analysis by Example; John Wiley & Sons: Hoboken, NJ, USA, 2013.
6. Khan, M. Improvement in estimating the finite population mean under maximum and minimum values in double sampling scheme. J. Stat. Appl. Probab. Lett. 2015, 2, 115–121.
7. Walia, G.S.; Kaur, H.; Sharma, M. Ratio type estimator of population mean through efficient linear transformation. Am. J. Math. Stat. 2015, 5, 144–149.
8. Alomair, M.A.; Gardazi, S.A.H.S. Hybrid class of robust type estimators for variance estimation using mean and variance of auxiliary variable. Heliyon 2024, 10, E31039.
9. Bahl, S.; Tuteja, R. Ratio and product type exponential estimators. J. Inf. Optim. Sci. 1991, 12, 159–164.
10. Daraz, U.; Khan, M. Estimation of variance of the difference-cum-ratio-type exponential estimator in simple random sampling. Res. Math. Stat. 2021, 8, 1899402.
11. Dubey, V.; Sharma, H. On estimating population variance using auxiliary information. Stat. Transit. New Ser. 2008, 9, 7–18.
12. Isaki, C.T. Variance estimation using auxiliary information. J. Am. Stat. Assoc. 1983, 78, 117–123.
13. Kadilar, C.; Cingi, H. Ratio estimators for the population variance in simple and stratified random sampling. Appl. Math. Comput. 2006, 173, 1047–1059.
14. Shabbir, J.; Gupta, S. Some estimators of finite population variance of stratified sample mean. Commun. Stat. Theory Methods 2010, 39, 3001–3008.
15. Shabbir, J.; Gupta, S. Using rank of the auxiliary variable in estimating variance of the stratified sample mean. Int. J. Comput. Theor. Stat. 2019, 6, 171–181.
16. Singh, H.; Chandra, P. An alternative to ratio estimator of the population variance in sample surveys. J. Transp. Stat. 2008, 9, 89–103.
17. Singh, H.P.; Solanki, R.S. A new procedure for variance estimation in simple random sampling using auxiliary information. J. Stat. Pap. 2013, 54, 479–497.
18. Upadhyaya, L.; Singh, H. An estimator for population variance that utilizes the kurtosis of an auxiliary variable in sample surveys. Vikram Math. J. 1999, 19, 14–17.
19. Yadav, S.K.; Kadilar, C.; Shabbir, J.; Gupta, S. Improved family of estimators of population variance in simple random sampling. J. Stat. Theory Pract. 2015, 9, 219–226.
20. Yasmeen, U.; Noor-ul-Amin, M. Estimation of Finite Population Variance Under Stratified Sampling Technique. J. Reliab. Stat. Stud. 2021, 14, 565–584.
21. Zaman, T.; Bulut, H. An efficient family of robust-type estimators for the population variance in simple and stratified random sampling. Commun. Stat. Theory Methods 2023, 52, 2610–2624.
22. Watson, D.J. The estimation of leaf area in field crops. J. Agric. Sci. 1937, 27, 474–483.
23. Cochran, W.G. Sampling Techniques; John Wiley and Sons: Hoboken, NJ, USA, 1963.

Figure 1. Graphical representation of the estimators’ MSE results using artificial data. Note: The vertical axis shows the MSEs of the estimators, and the horizontal axis shows the corresponding estimators. For simplicity, we label the estimators with numbers ranging from 1 to 16. For more detailed information, please refer to Table 2. Source: calculations performed by the authors.


Figure 2. Graphical representation of the estimators’ MSE results using artificial data. Note: The vertical axis shows the MSEs of the estimators, and the horizontal axis shows the corresponding estimators. For simplicity, we label the estimators with numbers ranging from 1 to 16. For more detailed information, please refer to Table 2. Source: calculations performed by the authors.

Figure 3. Graphical representation of the estimators’ MSE results using real data; (a) Data 1 (Source: [10]); (b) Data 2 (Source: [10]); (c) Data 3 (Source: [23], p. 24). Note: The vertical axis shows the MSEs of the estimators, and the horizontal axis shows the corresponding estimators. For simplicity, we label the estimators with numbers ranging from 1 to 16. For more detailed information, please refer to Table 3. Source: calculations performed by the authors.

Table 1. Some classes of the proposed estimator.

Subsets of the proposed estimator ${\hat{S}}_{e d}^{2}$	$a_{2}$	$a_{3}$
${\hat{S}}_{e d_{1}}^{2} = s_{y}^{2} exp [ω_{1} L_{1}] exp [ω_{2} \{\frac{c_{x} (S_{x}^{2} - s_{x}^{2})}{c_{x} (S_{x}^{2} + s_{x}^{2}) + 2 (x_{M} - x_{m})}\}]$	$c_{x}$	$x_{M} - x_{m}$
${\hat{S}}_{e d_{2}}^{2} = s_{y}^{2} exp [ω_{1} L_{1}] exp [ω_{2} \{\frac{- c_{x} (S_{x}^{2} - s_{x}^{2})}{- c_{x} (S_{x}^{2} + s_{x}^{2}) + 2 (x_{M} - x_{m})}\}]$	$- c_{x}$	$x_{M} - x_{m}$
${\hat{S}}_{e d_{3}}^{2} = s_{y}^{2} exp [ω_{1} L_{1}] exp [ω_{2} \{\frac{(x_{M} - x_{m}) (S_{x}^{2} - s_{x}^{2})}{(x_{M} - x_{m}) (S_{x}^{2} + s_{x}^{2}) + 2 c_{x}}\}]$	$x_{M} - x_{m}$	$c_{x}$
${\hat{S}}_{e d_{4}}^{2} = s_{y}^{2} exp [ω_{1} L_{1}] exp [ω_{2} \{\frac{(x_{M} - x_{m}) (S_{x}^{2} - s_{x}^{2})}{(x_{M} - x_{m}) (S_{x}^{2} + s_{x}^{2}) - 2 c_{x}}\}]$	$x_{M} - x_{m}$	$- c_{x}$
${\hat{S}}_{e d_{5}}^{2} = s_{y}^{2} exp [ω_{1} L_{1}] exp [ω_{2} \{\frac{(x_{M} - x_{m}) (S_{x}^{2} - s_{x}^{2})}{(x_{M} - x_{m}) (S_{x}^{2} + s_{x}^{2}) + 2 β_{2 (x)}}\}]$	$x_{M} - x_{m}$	$β_{2 (x)}$
${\hat{S}}_{e d_{6}}^{2} = s_{y}^{2} exp [ω_{1} L_{1}] exp [ω_{2} \{\frac{β_{2 (x)} (S_{x}^{2} - s_{x}^{2})}{β_{2 (x)} (S_{x}^{2} + s_{x}^{2}) + 2 (x_{M} - x_{m})}\}]$	$β_{2 (x)}$	$x_{M} - x_{m}$
${\hat{S}}_{e d_{7}}^{2} = s_{y}^{2} exp [ω_{1} L_{1}] exp [ω_{2} \{\frac{(x_{M} - x_{m}) (S_{x}^{2} - s_{x}^{2})}{(x_{M} - x_{m}) (S_{x}^{2} + s_{x}^{2}) - 2 β_{2 (x)}}\}]$	$x_{M} - x_{m}$	$- β_{2 (x)}$
${\hat{S}}_{e d_{8}}^{2} = s_{y}^{2} exp [ω_{1} L_{1}] exp [ω_{2} \{\frac{- β_{2 (x)} (S_{x}^{2} - s_{x}^{2})}{- β_{2 (x)} (S_{x}^{2} + s_{x}^{2}) + 2 (x_{M} - x_{m})}\}]$	$- β_{2 (x)}$	$x_{M} - x_{m}$

Table 2. MSE of different estimators using the artificial populations.


Estimator	Pop-I	Pop-II	Pop-III	Pop-IV	Pop-V	Pop-VI
[1] ${\hat{S}}_{y}^{2}$	0.98 $\times 10^{- 3}$	8.70 $\times 10^{- 4}$	8.00 $\times 10^{- 4}$	1.30 $\times 10^{- 3}$	2.46 $\times 10^{- 2}$	1.05 $\times 10^{- 3}$
[2] ${\hat{S}}_{r}^{2}$	6.98 $\times 10^{- 4}$	7.80 $\times 10^{- 4}$	7.02 $\times 10^{- 4}$	1.10 $\times 10^{- 3}$	2.09 $\times 10^{- 2}$	1.00 $\times 10^{- 3}$
[3] ${\hat{S}}_{l r}^{2}$	6.98 $\times 10^{- 4}$	6.70 $\times 10^{- 4}$	5.99 $\times 10^{- 4}$	1.05 $\times 10^{- 3}$	2.06 $\times 10^{- 2}$	1.00 $\times 10^{- 3}$
[4] ${\hat{S}}_{b t}^{2}$	6.20 $\times 10^{- 4}$	6.40 $\times 10^{- 4}$	5.99 $\times 10^{- 4}$	1.05 $\times 10^{- 3}$	2.06 $\times 10^{- 2}$	1.00 $\times 10^{- 3}$
[5] ${\hat{S}}_{u s}^{2}$	6.20 $\times 10^{- 4}$	6.50 $\times 10^{- 4}$	5.99 $\times 10^{- 4}$	1.05 $\times 10^{- 3}$	2.10 $\times 10^{- 2}$	1.00 $\times 10^{- 3}$
[6] ${\hat{S}}_{c k_{1}}^{2}$	6.20 $\times 10^{- 4}$	6.09 $\times 10^{- 4}$	5.99 $\times 10^{- 4}$	1.00 $\times 10^{- 3}$	2.02 $\times 10^{- 2}$	1.00 $\times 10^{- 3}$
[7] ${\hat{S}}_{c k_{2}}^{2}$	6.20 $\times 10^{- 4}$	6.09 $\times 10^{- 4}$	5.99 $\times 10^{- 4}$	1.05 $\times 10^{- 3}$	2.04 $\times 10^{- 2}$	1.00 $\times 10^{- 3}$
[8] ${\hat{S}}_{c k_{3}}^{2}$	6.20 $\times 10^{- 4}$	6.20 $\times 10^{- 4}$	5.99 $\times 10^{- 4}$	1.05 $\times 10^{- 3}$	2.07 $\times 10^{- 2}$	8.50 $\times 10^{- 4}$
[9] ${\hat{S}}_{e d_{1}}^{2}$	1.19 $\times 10^{- 4}$	1.20 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$	2.50 $\times 10^{- 4}$	3.60 $\times 10^{- 3}$	1.00 $\times 10^{- 4}$
[10] ${\hat{S}}_{e d_{2}}^{2}$	0.98 $\times 10^{- 4}$	0.98 $\times 10^{- 4}$	0.98 $\times 10^{- 4}$	1.00 $\times 10^{- 4}$	1.20 $\times 10^{- 3}$	4.50 $\times 10^{- 4}$
[11] ${\hat{S}}_{e d_{3}}^{2}$	1.39 $\times 10^{- 4}$	1.39 $\times 10^{- 4}$	1.10 $\times 10^{- 4}$	1.40 $\times 10^{- 4}$	3.40 $\times 10^{- 3}$	1.00 $\times 10^{- 4}$
[12] ${\hat{S}}_{e d_{4}}^{2}$	1.20 $\times 10^{- 4}$	1.45 $\times 10^{- 4}$	1.55 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$	3.60 $\times 10^{- 3}$	1.00 $\times 10^{- 4}$
[13] ${\hat{S}}_{e d_{5}}^{2}$	1.50 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$	2.80 $\times 10^{- 4}$	4.10 $\times 10^{- 3}$	1.00 $\times 10^{- 4}$
[14] ${\hat{S}}_{e d_{6}}^{2}$	1.50 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$	2.60 $\times 10^{- 4}$	5.20 $\times 10^{- 3}$	1.50 $\times 10^{- 4}$
[15] ${\hat{S}}_{e d_{7}}^{2}$	1.20 $\times 10^{- 4}$	1.00 $\times 10^{- 4}$	1.30 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$	2.50 $\times 10^{- 3}$	1.00 $\times 10^{- 4}$
[16] ${\hat{S}}_{e d_{8}}^{2}$	1.49 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$	2.45 $\times 10^{- 4}$	3.50 $\times 10^{- 4}$	5.50 $\times 10^{- 4}$	1.80 $\times 10^{- 4}$

Table 3. MSEs using empirical datasets.

Estimator	Data 1	Data 2	Data 3
[1] ${\hat{S}}_{y}^{2}$	1.02 $\times 10^{23}$	1.32 $\times 10^{23}$	5020.63
[2] ${\hat{S}}_{r}^{2}$	4.78 $\times 10^{22}$	5.93 $\times 10^{22}$	4663.93
[3] ${\hat{S}}_{l r}^{2}$	2.07 $\times 10^{22}$	4.24 $\times 10^{22}$	3070.74
[4] ${\hat{S}}_{b t}^{2}$	2.42 $\times 10^{22}$	4.96 $\times 10^{22}$	3091.4
[5] ${\hat{S}}_{u s}^{2}$	4.38 $\times 10^{22}$	5.69 $\times 10^{22}$	4483.93
[6] ${\hat{S}}_{c k_{1}}^{2}$	4.78 $\times 10^{22}$	5.93 $\times 10^{22}$	4656.39
[7] ${\hat{S}}_{c k_{2}}^{2}$	4.78 $\times 10^{22}$	5.93 $\times 10^{22}$	4661.94
[8] ${\hat{S}}_{c k_{3}}^{2}$	3.83 $\times 10^{22}$	5.75 $\times 10^{22}$	3792.23
[9] ${\hat{S}}_{e d_{1}}^{2}$	1.55 $\times 10^{22}$	3.59 $\times 10^{22}$	2977.56
[10] ${\hat{S}}_{e d_{2}}^{2}$	1.52 $\times 10^{22}$	3.60 $\times 10^{22}$	2663.27
[11] ${\hat{S}}_{e d_{3}}^{2}$	1.49 $\times 10^{22}$	3.58 $\times 10^{22}$	2668.04
[12] ${\hat{S}}_{e d_{4}}^{2}$	1.49 $\times 10^{22}$	3.58 $\times 10^{22}$	2667.649
[13] ${\hat{S}}_{e d_{5}}^{2}$	1.49 $\times 10^{22}$	3.58 $\times 10^{22}$	2634.19
[14] ${\hat{S}}_{e d_{6}}^{2}$	1.498 $\times 10^{22}$	3.58 $\times 10^{22}$	2668.402
[15] ${\hat{S}}_{e d_{7}}^{2}$	1.49 $\times 10^{22}$	3.58 $\times 10^{22}$	2668.01
[16] ${\hat{S}}_{e d_{8}}^{2}$	1.40 $\times 10^{22}$	3.57 $\times 10^{22}$	2518.48
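
One way to read Table 3 is to express each MSE relative to that of the usual estimator ${\hat{S}}_{y}^{2}$. The sketch below does this for a few Data 3 entries copied from the table. The ratio used (baseline MSE divided by estimator MSE, times 100) is a common efficiency summary, but it is an assumption here and not a quantity reported in this table; the function name `relative_efficiency` and the dictionary keys are hypothetical labels.

```python
def relative_efficiency(mse_baseline, mse_estimator):
    """Return 100 * mse_baseline / mse_estimator.

    Values above 100 mean the estimator has a smaller MSE than the baseline.
    """
    if mse_estimator <= 0:
        raise ValueError("MSE must be positive")
    return 100.0 * mse_baseline / mse_estimator


# Selected rows of the Data 3 column of Table 3.
data3_mse = {
    "S_y": 5020.63,
    "S_r": 4663.93,
    "S_lr": 3070.74,
    "S_ed5": 2634.19,
    "S_ed8": 2518.48,
}

baseline = data3_mse["S_y"]
for name, mse in data3_mse.items():
    print(f"{name}: {relative_efficiency(baseline, mse):.1f}")
```

The same function applies to the Data 1 and Data 2 columns, since the ratio is scale-free.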

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

