1. Introduction
It is well known that the sample mean, $\bar{X}$, is the uniformly minimum variance unbiased (UMVU) estimator of the normal population mean $\mu$; see the paper by Sahai et al. [1]. Dropping the requirement of unbiasedness, Searls [2] proposed the minimum mean squared error (MMSE) estimator for the normal mean with known coefficient of variation (CV). Khan [3] discussed the estimation of the mean with known CV in the one-sample case. Gleser and Healy [4] proposed the minimum quadratic risk scale-invariant estimator for the normal mean with known CV. Bhat and Rao [5] investigated tests for a normal mean with known CV. Niwitpong et al. [6] provided confidence intervals for the difference between normal population means with known CVs. Niwitpong [7] presented confidence intervals for the normal mean with known CV. Niwitpong and Niwitpong [8] proposed a confidence interval for the normal mean with known CV based on the best unbiased estimator, which was proposed by Khan [3]. Niwitpong [9] proposed a confidence interval for the normal mean with known CV based on the t-test. Niwitpong and Niwitpong [10] constructed new confidence intervals for the difference between normal means with known CV. Sodanin et al. [11] proposed confidence intervals for the common mean of normal distributions with known CV.
In practice, however, the CV is unknown and needs to be estimated. Therefore, Srivastava [12] proposed a UMVU estimator for the estimation of the normal mean with unknown CV, where the CV is defined as $\sigma/\mu$
. The UMVU estimator, derived from the MMSE estimator of Searls [2], is more efficient than the usual unbiased estimator, the sample mean $\bar{X}$, whenever the CV is at least 0.5. Srivastava and Singh [
13] provided a UMVU estimate of the relative efficiency ratio of
. Moreover, Sahai [
14] developed a new estimator for the normal mean with unknown CV. Sahai and Acharya [
15] studied the iterative estimation of the normal population mean using computational-statistical intelligence. However, a confidence interval provides more information about a population quantity than a point estimate does. It is therefore of practical and theoretical importance to develop procedures for confidence interval estimation of the mean of the normal distribution with unknown CV. Hence, along similar lines to Srivastava [12], we construct new confidence intervals for the normal mean with unknown CV and compare them with the standard confidence intervals based on the Student’s t-distribution and the z-distribution. The comparison can be based on the coverage probability, as well as the length of the confidence intervals. The average length of the confidence intervals could also be obtained analytically and hence compared; see, e.g., Sodanin et al. [16], who proposed confidence intervals for the normal population mean with unknown CV based on the generalized confidence interval (GCI) approach. This paper extends the work of Sodanin et al. [16] by constructing confidence intervals for the normal population mean with unknown CV based on both the GCI approach and a new large sample (LS) approach. Furthermore, three new confidence intervals for the difference between normal means with unknown CVs are proposed, based on the GCI approach, the LS approach and the method of variance estimates recovery (MOVER) approach, and are compared with the well-known Welch–Satterthwaite (WS) approach. For more on confidence intervals for the CV, we refer readers to Banik and Kibria [17], Gulhar et al. [18] and, recently, Albatineh et al. [19], among others.
This paper is organized as follows. In Section 2, the confidence intervals for the single normal mean with unknown CV are presented. In Section 3, the confidence intervals for the difference between normal means with unknown CVs are provided. In Section 4, simulation results are presented to evaluate the coverage probabilities and average lengths in the comparison of the proposed approaches. In Section 5, the proposed approaches are illustrated using three examples. Section 6 summarizes this paper.
2. Confidence Intervals for the Mean of the Normal Distribution with Unknown Coefficient of Variation
Suppose that $X = (X_1, X_2, \ldots, X_n)$ are independent random variables each having the normal distribution with mean $\mu$ and variance $\sigma^2$. The CV is defined by $\sigma/\mu$. Let $\bar{X}$ and $S^2$ be the sample mean and sample variance for $X$, respectively. Furthermore, let $\bar{x}$ and $s^2$ be the observed values of $\bar{X}$ and $S^2$, respectively.
Searls [
2] proposed the MMSE estimator for the normal population mean with variance,
, defined by:
However, the CV needs to be estimated. Srivastava [
12] proposed an estimator of the mean with unknown CV, which is defined by:
Moreover, Sahai [
14] proposed an alternative estimator of the normal population mean with unknown CV, which is defined by:
The estimator of
is defined by:
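As a rough illustration of how such plug-in estimators are computed from data, the following Python sketch may help. The displayed formulas are not reproduced in the text above, so the two shrinkage forms used here, $n\bar{x}/(n + s^2/\bar{x}^2)$ for the Srivastava-type estimator and $n\bar{x}/(n - s^2/\bar{x}^2)$ for the Sahai-type estimator, are assumptions, as are the function name and the example data.

```python
from statistics import mean, variance


def mean_estimators_unknown_cv(sample):
    """Return (x_bar, theta_hat, theta_star_hat) for one sample.

    ASSUMPTION: the two shrinkage forms below are illustrative; they are
    not copied from the equations of the paper.
    """
    n = len(sample)
    x_bar = mean(sample)
    s2 = variance(sample)        # unbiased sample variance (divisor n - 1)
    cv2_hat = s2 / x_bar ** 2    # squared sample coefficient of variation
    theta_hat = n * x_bar / (n + cv2_hat)        # assumed Srivastava-type form
    theta_star_hat = n * x_bar / (n - cv2_hat)   # assumed Sahai-type form
    return x_bar, theta_hat, theta_star_hat


# Hypothetical data, used only to exercise the function.
x_bar, theta_hat, theta_star_hat = mean_estimators_unknown_cv(
    [9.8, 10.4, 10.1, 9.6, 10.3, 9.9]
)
```

Under these assumed forms and a positive sample mean, the first estimator shrinks the sample mean toward zero and the second inflates it slightly, and both approach $\bar{x}$ as $n$ grows.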
Theorem 1. Suppose that is a random sample from . Suppose and are a sample mean and a sample variance, respectively. Let be an estimator of the normal population mean with unknown CV, and let be an estimator of . The mean and variance of are obtained by:
and
Proof. Let
and
. Since
. Then:
According to Thangjai et al. [
20], the mean of
is computed by the moment generating function, and the variance of
is computed by Stein’s lemma. Therefore, the mean and the variance of
are defined by:
From Thangjai et al. [
20], the mean and the variance of
are defined by:
and
Therefore, the mean and variance of
are defined by:
and
According to Blumenfeld [
21], the mean and variance of
are obtained by:
and
Hence, Theorem 1 is proven. ☐
Proposition 1. Let be a random sample from the normal distribution with the mean μ and the variance . Let and be the corresponding point estimates of μ and . Then:where , is in Equation (5) and is in Equation (6). Proof. Let
be an estimator of the mean with unknown CV, and let
be an estimator of
. From Theorem 1,
is distributed normally with mean
and variance
, which is defined by:
where:
and
Applying the asymptotic theory, the estimator
is consistent. That is,
converges in probability to
and
converges in probability to
μ as
. The estimator is asymptotically normal and is defined by:
where
represents that it converges in the distribution. Hence, Proposition 1 is proven. ☐
Theorem 2. Suppose that $X = (X_1, X_2, \ldots, X_n)$ is a random sample from $N(\mu, \sigma^2)$. Let θ be the normal population mean with unknown CV, and let $\hat{\theta}$ be an estimator of θ. The mean and variance of $\hat{\theta}$ are obtained by:
and
Proof. The proof of the mean and variance of $\hat{\theta}$ is similar to that of Theorem 1. ☐
Proposition 2. Let be a random sample from the normal distribution with the mean μ and the variance . Let and be the corresponding point estimates of μ and . Then:where , is in Equation (8), and is in Equation (9). Proof. For the proof of the distribution of is similar to Proposition 1. ☐
2.1. Generalized Confidence Intervals for the Mean of the Normal Distribution with Unknown Coefficient of Variation
Definition 1. Let $X = (X_1, X_2, \ldots, X_n)$ be a random sample from a distribution $F(x \mid \gamma)$, which depends on a vector of parameters $\gamma = (\theta, \vartheta)$, where θ is the parameter of interest and ϑ is possibly a vector of nuisance parameters. Weerahandi [22] defines a generalized pivot $R(X, x, \theta, \vartheta)$ for confidence interval estimation, where x is an observed value of X, as a random variable having the following two properties:
- (i) $R(X, x, \theta, \vartheta)$ has a probability distribution that is free of unknown parameters.
- (ii) The observed value of $R(X, x, \theta, \vartheta)$ at $X = x$ is the parameter of interest.
Let $R(\alpha)$ be the $100\alpha$-th percentile of $R(X, x, \theta, \vartheta)$. Then, $(R(\alpha/2), R(1 - \alpha/2))$ becomes a $100(1-\alpha)\%$ two-sided GCI for θ.
Recall that:
where $V$ has a chi-squared distribution with $n - 1$ degrees of freedom. Now, write:
The generalized pivotal quantity (GPQ) for
is defined by:
Moreover, the mean is given by:
where $Z$ and $U$ denote a standard normal random variable and a chi-square random variable with $n - 1$ degrees of freedom, respectively. Thus, the GPQ for μ is defined by:
Therefore, the GPQ for
θ is defined by:
Moreover, the GPQ for
is defined by:
Therefore, the $100(1-\alpha)\%$ two-sided confidence intervals for the single normal mean with unknown CV based on the GCI approach are obtained by:
and
where
and
denote the
-th percentiles of
and
, respectively.
Algorithm 1. For a given , the GCI for θ and can be computed by the following steps:
- Step 1.
Generate
, and then, compute
from Equation (
13).
- Step 2.
Generate
and
, then compute
from Equation (
15).
- Step 3.
Compute
from Equation (
16), and compute
from Equation (
17).
- Step 4.
Repeat Steps 1–3 a total q times, and obtain an array of ’s and ’s.
- Step 5.
Compute , , and .
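The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the GPQ for $\sigma^2$ is taken as $(n-1)s^2/V$, the GPQ for μ as $\bar{x} - (Z/\sqrt{U})\sqrt{(n-1)s^2/n}$, and the GPQs for θ and for the alternative parameter are obtained by plugging these into the assumed forms $n\mu/(n \pm \sigma^2/\mu^2)$. The function names and the percentile rule are hypothetical.

```python
import math
import random


def _percentile(sorted_vals, p):
    """Empirical p-quantile (0 < p < 1) of an already sorted list."""
    idx = min(len(sorted_vals) - 1, max(0, math.ceil(p * len(sorted_vals)) - 1))
    return sorted_vals[idx]


def gci_single_mean(x_bar, s2, n, q=5000, alpha=0.05, seed=None):
    """Monte Carlo GCIs; returns (ci_theta, ci_theta_star)."""
    rng = random.Random(seed)
    r_theta, r_theta_star = [], []
    for _ in range(q):
        # Step 1: V ~ chi-square(n - 1), then the assumed GPQ for sigma^2.
        v = rng.gammavariate((n - 1) / 2, 2.0)
        r_sigma2 = (n - 1) * s2 / v
        # Step 2: Z ~ N(0, 1), U ~ chi-square(n - 1), then the assumed GPQ for mu.
        z = rng.gauss(0.0, 1.0)
        u = rng.gammavariate((n - 1) / 2, 2.0)
        r_mu = x_bar - (z / math.sqrt(u)) * math.sqrt((n - 1) * s2 / n)
        # Step 3: plug both GPQs into the assumed parameter forms.
        ratio = r_sigma2 / r_mu ** 2
        r_theta.append(n * r_mu / (n + ratio))
        r_theta_star.append(n * r_mu / (n - ratio))
    # Steps 4-5: after q repetitions, take the alpha/2 and 1 - alpha/2 percentiles.
    r_theta.sort()
    r_theta_star.sort()
    lo_p, hi_p = alpha / 2, 1 - alpha / 2
    return (
        (_percentile(r_theta, lo_p), _percentile(r_theta, hi_p)),
        (_percentile(r_theta_star, lo_p), _percentile(r_theta_star, hi_p)),
    )


# Hypothetical summary statistics, used only to exercise the function.
ci_theta, ci_theta_star = gci_single_mean(x_bar=10.0, s2=1.0, n=30, q=2000, seed=1)
```

A chi-square variate with $k$ degrees of freedom is drawn here as a gamma variate with shape $k/2$ and scale 2, which avoids any dependency outside the standard library.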
2.2. Large Sample Confidence Intervals for the Mean of the Normal Distribution with Unknown Coefficient of Variation
Again, from Equations (
2) and (
4), the estimators of the mean with unknown CV are defined by:
and
From Theorem 2, the variance of
is defined by:
with
μ and
replaced by
and
, respectively.
From Theorem 1, the variance of
is defined by:
with
μ and
replaced by
and
, respectively.
Therefore, the $100(1-\alpha)\%$ two-sided confidence intervals for the single normal mean with unknown CV based on the LS approach are obtained by:
and
where $z_{1-\alpha/2}$ denotes the $100(1-\alpha/2)$-th quantile of the standard normal distribution.
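A large sample interval of this kind has the familiar form "estimate plus or minus a normal quantile times the estimated standard error". The sketch below shows only that generic step. The variance formulas of Theorems 1 and 2 are not reproduced here, so `variance_hat` is a placeholder that the caller must supply, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def large_sample_interval(estimate, variance_hat, alpha=0.05):
    """Two-sided 100(1 - alpha)% interval: estimate +/- z * sqrt(variance_hat).

    ASSUMPTION: variance_hat is the plug-in variance estimate of the
    estimator, computed elsewhere from the relevant theorem.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # 100(1 - alpha/2)-th normal quantile
    half_width = z * math.sqrt(variance_hat)
    return estimate - half_width, estimate + half_width


# Hypothetical inputs, used only to exercise the function.
lower, upper = large_sample_interval(estimate=10.0, variance_hat=0.04)
```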
Algorithm 2. The coverage probability for θ and can be computed by the following steps:
- Step 1.
Generate from and then compute and .
- Step 2.
Use Algorithm 1 to construct and record whether or not the value of θ falls in the corresponding confidence interval.
- Step 3.
Use Algorithm 1 to construct and record whether or not the value of falls in the corresponding confidence interval.
- Step 4.
Use Equation (
24) to construct
and record whether or not the value of
θ falls in the corresponding confidence interval.
- Step 5.
Use Equation (
25) to construct
and record whether or not the value of
falls in the corresponding confidence interval.
- Step 6.
Repeat Steps 1–5, a total M times. Then, for and , the fraction of times that all θ are in their corresponding confidence intervals provides an estimate of the coverage probability. Similarly, for and , the fraction of times that all are in their corresponding confidence intervals provides an estimate of the coverage probability.
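The Monte Carlo loop of Algorithm 2 can be sketched generically: simulate a sample, build an interval, record whether it covers the target value, and average over M replications. In this minimal sketch the interval constructor is passed in as a function, and a simple z-interval for μ stands in for the proposed intervals. The function names, the default M and the parameter values are assumptions chosen for illustration.

```python
import math
import random
from statistics import NormalDist, mean, variance


def z_interval(sample, alpha=0.05):
    """Standard z-interval for the mean, used here only as a stand-in."""
    n = len(sample)
    half = NormalDist().inv_cdf(1 - alpha / 2) * math.sqrt(variance(sample) / n)
    centre = mean(sample)
    return centre - half, centre + half


def coverage_and_length(make_interval, target, mu, sigma, n, M=2000, seed=0):
    """Estimate coverage probability and average length over M simulated samples."""
    rng = random.Random(seed)
    hits = 0
    total_length = 0.0
    for _ in range(M):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]   # Step 1: generate data
        lo, hi = make_interval(sample)                      # Steps 2-5: build interval
        hits += lo <= target <= hi                          # record coverage
        total_length += hi - lo
    return hits / M, total_length / M                       # Step 6: summarize


cp, avg_len = coverage_and_length(z_interval, target=1.0, mu=1.0, sigma=0.5, n=50)
```

Passing the interval constructor as an argument lets the same loop be reused for each interval under comparison, with `target` set to the parameter that interval is meant to cover.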
3. Confidence Intervals for the Difference between the Means of Normal Distributions with Unknown Coefficients of Variation
Suppose that $X = (X_1, X_2, \ldots, X_n)$ are independent random variables each having a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$. Additionally, suppose that $Y = (Y_1, Y_2, \ldots, Y_m)$ are independent random variables each having a normal distribution with mean $\mu_Y$ and variance $\sigma_Y^2$. Furthermore, X and Y are independent. Let $\bar{X}$ and $S_X^2$ be the sample mean and the sample variance for X, respectively. Furthermore, let $\bar{x}$ and $s_X^2$ be the observed values of $\bar{X}$ and $S_X^2$, respectively. Similarly, let $\bar{Y}$ and $S_Y^2$ be the sample mean and the sample variance for Y, respectively. Furthermore, let $\bar{y}$ and $s_Y^2$ be the observed values of $\bar{Y}$ and $S_Y^2$, respectively.
Let
be the difference between means with unknown CVs. The estimators of
δ are defined by:
and
where
and
denote the estimator of
and
, respectively, and
and
denote the estimator of
and
, respectively.
Theorem 3. Suppose that is a random sample from , and suppose that is a random sample from . Let X and Y be independent. Let and be the sample mean and the sample variance for X, respectively. Furthermore, let and be the sample mean and the sample variance for Y, respectively. Let and be the mean with unknown CV of X and Y, respectively. Let δ be the difference between and . Let be an estimator of δ. The mean and variance of are obtained by:and Proof. Let
be the difference between means with unknown CVs. Let
be an estimator of
δ, which is defined by:
Thus, the mean and variance of
are obtained by:
and
Hence, Theorem 3 is proven. ☐
Theorem 4. Suppose that is a random sample from and suppose that is a random sample from . Let X and Y be independent. Let and be the sample mean and the sample variance for X, respectively. Furthermore, let and be the sample mean and the sample variance for Y, respectively. Let and be the mean with unknown CV of X and Y, respectively. Let be the difference between and . Additionally, let be an estimator of . The mean and variance of are obtained by:and Proof. For the proof of the mean and variance of is similar to Theorem 3. ☐
3.1. Generalized Confidence Intervals for the Difference between Means of Normal Distributions with Unknown Coefficients of Variation
From the random variable
X and
Y, since:
The GPQs for
and
are defined by:
Moreover, the means are given by:
Thus, the GPQs for
and
are defined by:
Therefore, the GPQ for δ is defined by:
Moreover, the GPQ for
is defined by:
Therefore, the $100(1-\alpha)\%$ two-sided confidence intervals for the difference between normal means with unknown CVs based on the GCI approach are obtained by:
and
where
and
denote the
-th percentiles of
and
, respectively.
Algorithm 3. For a given and , the GCI for δ and can be computed by the following steps:
- Step 1.
Generate
and
, then compute
and
from Equation (
33).
- Step 2.
Generate
,
,
and
, then compute
and
from Equation (
35).
- Step 3.
Compute
from Equation (
36), and compute
from Equation (
37).
- Step 4.
Repeat Steps 1–3, a total q times, and obtain an array of ’s and ’s.
- Step 5.
Compute , , and .
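Algorithm 3 can be sketched along the same lines as the one-sample case: draw GPQs for each population independently, take their difference, and read off percentiles. This minimal sketch covers only δ and assumes the per-sample GPQ form $nR_\mu/(n + R_{\sigma^2}/R_\mu^2)$ with $R_{\sigma^2} = (n-1)s^2/V$ and $R_\mu = \bar{x} - (Z/\sqrt{U})\sqrt{(n-1)s^2/n}$. These forms, like the function names, are assumptions rather than the paper's equations.

```python
import math
import random


def gpq_theta_draws(x_bar, s2, n, q, rng):
    """Draw q values of the assumed GPQ for theta from one sample's summaries."""
    draws = []
    for _ in range(q):
        v = rng.gammavariate((n - 1) / 2, 2.0)    # V ~ chi-square(n - 1)
        r_sigma2 = (n - 1) * s2 / v               # assumed GPQ for sigma^2
        z = rng.gauss(0.0, 1.0)                   # Z ~ N(0, 1)
        u = rng.gammavariate((n - 1) / 2, 2.0)    # U ~ chi-square(n - 1)
        r_mu = x_bar - (z / math.sqrt(u)) * math.sqrt((n - 1) * s2 / n)
        draws.append(n * r_mu / (n + r_sigma2 / r_mu ** 2))
    return draws


def gci_delta(x_bar, s2_x, n, y_bar, s2_y, m, q=5000, alpha=0.05, seed=None):
    """Monte Carlo GCI for the difference delta, from independent GPQ draws."""
    rng = random.Random(seed)
    r_x = gpq_theta_draws(x_bar, s2_x, n, q, rng)
    r_y = gpq_theta_draws(y_bar, s2_y, m, q, rng)
    r_delta = sorted(a - b for a, b in zip(r_x, r_y))   # GPQ for delta
    lo = r_delta[max(0, math.ceil(q * alpha / 2) - 1)]
    hi = r_delta[min(q - 1, math.ceil(q * (1 - alpha / 2)) - 1)]
    return lo, hi


# Hypothetical summary statistics, used only to exercise the function.
delta_lo, delta_hi = gci_delta(10.0, 1.0, 30, 8.0, 1.0, 30, q=2000, seed=2)
```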
3.2. Large Sample Confidence Intervals for the Difference between Means of Normal Distributions with Unknown Coefficients of Variation
Again, the estimators of the difference between means with unknown CVs are defined by:
and
From Theorem 3, the variance of
is defined by:
with
,
,
and
replaced by
,
,
and
, respectively.
From Theorem 4, the variance of
is defined by:
with
,
,
and
replaced by
,
,
and
, respectively.
Therefore, the $100(1-\alpha)\%$ two-sided confidence intervals for the difference between normal means with unknown CVs based on the LS approach are obtained by:
and
where $z_{1-\alpha/2}$ denotes the $100(1-\alpha/2)$-th quantile of the standard normal distribution.
3.3. Method of Variance Estimates Recovery Confidence Intervals for the Difference between Means of Normal Distributions with Unknown Coefficients of Variation
Since the difference between means is denoted by
, where
and
are the means of
and
, respectively, suppose that
and
are estimators of
and
, respectively. The confidence intervals for
and
are defined by:
and
Similarly, the difference between means is denoted by
. The confidence intervals for
and
are defined by:
and
The MOVER approach, introduced by Donner and Zou [
23], is used to construct the $100(1-\alpha)\%$ two-sided confidence interval
of
where
and
denote the lower limit and upper limit of the confidence interval, respectively. The lower limit and upper limit for
δ are given by:
and
Similarly, the lower limit and upper limit for
are given by:
and
Therefore, the $100(1-\alpha)\%$ two-sided confidence intervals for the difference between normal means with unknown CVs based on the MOVER approach are obtained by:
and
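The MOVER limits for a difference are commonly written as $L = (\hat{\theta}_1 - \hat{\theta}_2) - \sqrt{(\hat{\theta}_1 - l_1)^2 + (u_2 - \hat{\theta}_2)^2}$ and $U = (\hat{\theta}_1 - \hat{\theta}_2) + \sqrt{(u_1 - \hat{\theta}_1)^2 + (\hat{\theta}_2 - l_2)^2}$, where $(l_i, u_i)$ is a confidence interval for the $i$-th parameter. The sketch below implements this standard form. It is assumed, not confirmed, that the limits in the text take this shape, and the function and argument names are hypothetical.

```python
import math


def mover_difference(est1, l1, u1, est2, l2, u2):
    """Standard MOVER limits for (parameter 1 - parameter 2).

    est1, est2: point estimates; (l1, u1), (l2, u2): separate confidence
    intervals for the two parameters at the same confidence level.
    """
    diff = est1 - est2
    lower = diff - math.sqrt((est1 - l1) ** 2 + (u2 - est2) ** 2)
    upper = diff + math.sqrt((u1 - est1) ** 2 + (est2 - l2) ** 2)
    return lower, upper


# Hypothetical inputs: estimates 5 and 3 with intervals (4, 6) and (2, 4).
mover_lo, mover_hi = mover_difference(5.0, 4.0, 6.0, 3.0, 2.0, 4.0)
```

The lower limit combines the lower margin of the first interval with the upper margin of the second, and the upper limit does the reverse, so asymmetric single-parameter intervals carry through to the interval for the difference.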
Algorithm 4. The coverage probability for δ and can be computed by the following steps:
- Step 1.
Generate from , and then, compute and . Additionally, generate from , and then, compute and .
- Step 2.
Use Algorithm 3 to construct , and record whether or not the values of δ fall in the corresponding confidence interval.
- Step 3.
Use Algorithm 3 to construct , and record whether or not the values of fall in the corresponding confidence interval.
- Step 4.
Use Equation (
44) to construct
, and record whether or not the values of
δ fall in the corresponding confidence interval.
- Step 5.
Use Equation (
45) to construct
, and record whether or not the values of
fall in the corresponding confidence interval.
- Step 6.
Use Equation (
54) to construct
, and record whether or not the values of
δ fall in the corresponding confidence interval.
- Step 7.
Use Equation (
55) to construct
, and record whether or not the values of
fall in the corresponding confidence interval.
- Step 8.
Repeat Steps 1–7, a total M times. Then, for , and , the fraction of times that all δ are in their corresponding confidence intervals provides an estimate of the coverage probability. Similarly, for , and , the fraction of times that all are in their corresponding confidence intervals provides an estimate of the coverage probability.
4. Simulation Studies
To compare the performance of the confidence intervals introduced in Section 2 and Section 3 in terms of coverage probability and average length, two simulation studies were conducted. Comparison studies were also conducted using the Student’s t-distribution, the z-distribution and the WS approach. The Student’s t-distribution was used to construct the confidence interval for the single mean of the normal distribution when the sample size is small, whereas the z-distribution was used when the sample size is large. The WS approach was used to construct the confidence interval for the difference of the means of the normal distribution; see the paper by Niwitpong and Niwitpong [24]. The nominal confidence level was set to 0.95. A confidence interval was preferred when its coverage probability was greater than or close to the nominal confidence level and its average length was the shortest.
First, the performance of the confidence intervals for the single mean of the normal distribution with unknown CV ($\theta$ and $\theta^*$) was compared. The confidence intervals were constructed with the GCI approach (
and
) and the LS approach (
and
). Furthermore, the standard confidence interval for the single mean of the normal distribution (
) was constructed based on the Student’s t-distribution and the z-distribution. Algorithm 1 and Algorithm 2 were used to compute coverage probabilities and average lengths with
2500 and
5000 of sample size n from
$N(\mu, \sigma^2)$ for $\mu = 1.0$, $\sigma =$ 0.3, 0.5, 0.7, 0.9, 1.0, 1.1, 1.3, 1.5, 1.7, 2.0 and $n =$ 10, 20, 30, 50, 100. The CVs were computed as $\sigma/\mu$.
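The simulation design for the single-mean study can be enumerated as a small grid, as in the following sketch. The values are those listed above; the variable names and the dictionary layout are illustrative choices.

```python
from itertools import product

mu = 1.0
sigmas = [0.3, 0.5, 0.7, 0.9, 1.0, 1.1, 1.3, 1.5, 1.7, 2.0]
sample_sizes = [10, 20, 30, 50, 100]

# One configuration per (n, sigma) pair; the CV is sigma / mu.
design = [
    {"n": n, "sigma": sigma, "cv": sigma / mu}
    for n, sigma in product(sample_sizes, sigmas)
]
```

Because μ is fixed at 1.0, each CV equals the corresponding σ, so varying σ is the same as varying the CV in this design.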
Table 1 and
Table 2 show the coverage probabilities and average lengths of the 95% two-sided confidence intervals for
θ,
and
μ. The results indicated that the GCIs are similar to the paper by Sodanin et al. [
16] in terms of coverage probability and average length. For the GCI approach,
provides better confidence interval estimates than
in almost all cases. This is because the coverage probabilities of
are close to 1.00 when σ increases. Hence,
is a conservative confidence interval when σ increases. For the LS approach, the coverage probabilities of
and
provide less than the nominal confidence level of 0.95 and are close to 1.00 when
σ increases. Therefore, the LS approach is not recommended to construct the confidence interval for the single mean of the normal distribution with unknown CV. This is then compared with
. For a small sample size, the coverage probability of
performs as well as that of
. The length of
is a bit shorter than the length of
. Hence,
is better than
in terms of the average length when the sample size is small. For a large sample size,
is better than
in terms of coverage probability. Furthermore, the coverage probability of
is more stable than that of
in all sample size cases.
The second simulation study compared the performance of the confidence intervals for the difference between two means of normal distributions with unknown CVs ($\delta$ and $\delta^*$). Three approaches were considered: GCIs are defined as
and
; large sample confidence intervals are defined as
and
; and MOVER confidence intervals are defined as
and
compared with the WS confidence interval for the difference of the means of the normal distribution (
). Algorithm 3 and Algorithm 4 were used to compute coverage probabilities and average lengths with
2500 and
5000. Samples of size $n$ from $N(\mu_X, \sigma_X^2)$ and of size $m$ from $N(\mu_Y, \sigma_Y^2)$ were generated with $(n, m) =$ (10,10), (10,20), (30,30), (20,30), (50,50), (30,50), (100,100) and (50,100). The population means were $\mu_X = \mu_Y = 1.0$, and the population standard deviations were $\sigma_X =$ 0.3, 0.5, 0.7, 0.9, 1.0, 1.1, 1.3, 1.5, 1.7, 2.0 and $\sigma_Y = 1.0$. The coefficients of variation were computed as $\sigma_X/\mu_X$ and $\sigma_Y/\mu_Y$; since $\mu_X = \mu_Y$, the ratio of the two CVs reduces to $\sigma_X/\sigma_Y$.
Table 3 and
Table 4 show the coverage probabilities and average lengths of the 95% two-sided confidence intervals for
δ,
and
. For the GCI approach, the coverage probabilities of
are close to the nominal confidence level of 0.95 for all cases. For small sample sizes,
is the conservative confidence interval because the coverage probabilities are in the range from 0.97–1.00. Moreover, the coverage probabilities of
are close to the nominal confidence level of 0.95 when the sample sizes (
n and
m) increase. For the LS approach,
have the coverage probabilities under the nominal confidence level of 0.95 and close to the nominal confidence level of 0.95 when the sample sizes are large. Furthermore,
is a conservative confidence interval because the coverage probabilities are close to 1.00. For the MOVER approach, the coverage probability of
is not stable, whereas
is a conservative confidence interval. In addition,
is better than
in terms of coverage probability.
6. Discussion and Conclusions
Sodanin et al. [
16] constructed the GCIs for the mean of the normal distribution with unknown CV. This paper provides generalized confidence intervals (
and
) and proposes large sample confidence intervals (
and
) for the single mean of the normal distribution with unknown CV (
θ and
). Comparison studies were also conducted using the standard confidence interval for the normal mean (
) based on the Student’s t-distribution and the z-distribution, which are much simpler and easier to implement. Moreover, new confidence intervals were proposed for the difference between two means of normal distributions with unknown CVs (δ and
). The confidence intervals for δ and
were constructed based on the GCI approach (
and
), the LS approach (
and
) and the MOVER approach (
and
), compared with the standard confidence interval, using the WS approach to construct the confidence interval for the difference of two means of the normal distribution (
). The coverage probabilities and average lengths of the proposed confidence intervals were evaluated through Monte Carlo simulations.
For the single mean with unknown CV, the results are similar to the paper by Sodanin et al. [
16] in terms of coverage probability and average length for all cases. The coverage probabilities of
were satisfactorily stable around 0.95. Therefore,
was preferred for the single mean of the normal distribution with unknown CV.
and
have the coverage probabilities under the nominal confidence level of 0.95 and close to 1.00 when
σ increases. Therefore, the LS approach is not recommended to construct the confidence interval for the mean with unknown CV. Furthermore,
is better than
in terms of the average length when the sample size is small, whereas
is better than
in terms of coverage probability when the sample size is large. However, the coverage probability of
is more stable than that of
in all sample size cases. Therefore, the GCI approach is recommended as an interval estimator for the mean with unknown CV.
For the difference of two means with unknown CVs, the coverage probabilities of satisfy the nominal confidence level of 0.95 for all cases. Therefore, was preferred for the difference of the means with unknown CVs. The LS and MOVER approaches are not recommended to construct the confidence interval for the difference of means with unknown CVs. Furthermore, is better than in terms of the coverage probability. Therefore, the GCI approach can be used to estimate the confidence interval for the difference of means with unknown CVs.
Hence, this paper shows that the estimator of Srivastava [12] can be used to construct confidence intervals both for the single normal mean and for the difference between normal means when the CVs are unknown.