1. Introduction
Consider a sample of
from the two-parameter gamma model with mean
and shape
. The joint density is
where
is a minimal sufficient statistic. The two-parameter gamma distribution is often used to model the non-negative data with right-skewed distribution. Moreover, depending on the values of the parameters, it can have a decreasing failure rate, a constant failure rate, or an increasing failure rate. This makes it a valuable model for analyzing data arising from engineering, environmental, meteorological, and medical studies.
Similar to the normal distribution, the two-parameter gamma distribution has a two-dimensional minimal sufficient statistic
. Another version of the minimal sufficient statistic is
, where
r is the log offset of the arithmetic mean from the geometric mean
Notice that the density of
has location model form for a fixed
. It follows from [
1] that the conditional density for
s given
r and the marginal density for
r take the form
where
appears in the transformed measure
which requires
-dimensional integration. Hence, the joint density for
is
By change of variable, the joint density for (s, t) takes the form
The same result can be obtained by using the properties of the exponential transformation model by [
2] or the conditional argument in [
3]. Note that
requires
-dimensional integration, and it is available exactly only for small value of
n (see [
3,
4]). Unlike the normal model, where inference for the normal mean can be obtained explicitly, inference for the gamma mean is a complicated and challenging problem. Many asymptotics inferential methods for the gamma mean exist in statistical literature. Moreover, most of the existing asymptotics methods are likelihood-based methods. Some are very simple but not very accurate, and others are very accurate but mathematically complicated and also computationally intensive. Furthermore, only limited methods can be applied to the problem of comparing the means of
independent gamma distributions.
Ref. [
5] considered the log-likelihood function obtained from (1), which takes the form
The maximum likelihood estimate (MLE) is
, where
, and
satisfies
with
being the digamma function. In addition, the observed information matrix evaluated at MLE is
It is well-known that the variance-covariance matrix of the maximum likelihood estimators can be approximated by
. Hence, the standardized maximumm likelihood estimator is
where
can be approximated by the
entry of
. With the regularity conditions stated in [
6] and also in [
7], for large
n,
with first order accuracy,
. Thus, inference for
can be obtained based on the limiting distribution. This method is generally known as the Wald method or the asymptotic MLE method.
Another commonly used method to obtain inference for
is the log-likelihood ratio method. Let
be the constrained MLE, which maximizes
for a fixed
. In this case,
must satisfy
Then the log-likelihood ratio statistic is
Again, with the regularity conditions stated in [
6] and also in [
7], for large
n,
with first order accuracy,
. Hence, inference for
can be approximated based on the limiting
distribution. This method is also known as the Wilks method.
To improve the accuracy of the log-likelihood ratio method, ref. [
8] applied the Bartlett correction to the log-likelihood ratio statistic (see [
9]). The resulting Bartlett corrected log-likelihood ratio statistic takes the form
where
and
is known as the Bartlett correction factor. Ref. [
9] showed that the Bartlett corrected log-likelihood ratio statistic converges to
distribution with fourth order accuracy. Hence, inference for
can be approximated based on the limiting
distribution.
Refs. [
3,
4] showed that the exact form of
in (2) is only available when
n is small. Jensen used the fact that the model is an exponential-transformation model and applied the saddlepoint method to approximate
and derived an inference procedure for
with third order accuracy. However, due to the complexity of the method, ref. [
3] only provided tables for 1, 2.5, 97.5, and 99 percentiles of
for sample sizes 10, 20, 40, and
∞, which were obtained by extensive iterative calculations. On the other hand, ref. [
4] proposed another third order method to obtain inference for
. This method is asymptotically equivalent to Jensen’s method with the exception that it involves direct implementation of the method derived in [
10].
Note that Gross and Clark’s method is very simple but not very accurate. The log-likelihood ratio method is slightly more complicated because of the calculation of the constrained MLE, and it is still not very accurate. The Bartlett corrected log-likelihood ratio method by [
8] gives very accurate results. It is also relatively straightforward because [
8] derived all the necessary equations. The method presented in [
4] is also very accurate, but it is computational intensive. Gross and Clark’s method, Jensen’s method and Fraser, Reid and Wong’s method are not applicable to the problem of testing homogeneity of
independent gamma means. The log-likelihood ratio method can be extended to this problem but it has only first order accuracy. Ref. [
8] also derived the explicit Bartlett correction factor for testing equality of two independent gamma means, but, due to the complexity of the method, did not derive the explicit Bartlett correction factor for testing homogeneity of
independent gamma means.
In this paper, a Bartlett-type correction of the log-likelihood ratio statistic is proposed in
Section 2. The proposed Bartlett-type correction factor is numerically obtained by simulations. The proposed method is then applied testing homogeneity of
independent gamma means in
Section 3 and
Section 4, respectively. Some concluding remarks are given in
Section 5. Real-life examples and simulation studies results are presented to compare the accuracy of the proposed method with the existing methods.
2. Main Results
Let
be the log-likelihood function with a
p-dimensional parameter
. With the regularity conditions stated in [
6], the log-likelihood ratio statistic,
is asymptotically distributed as
with first order accuracy, where
is the overall MLE, which maximizes
. Ref. [
9] showed that the mean of
can be expressed as
where
n is the size of the observed sample and
is the Bartlett correction factor. Hence, the Bartlett corrected log-likelihood ratio statistic is
and has mean
p with fourth order accuracy.
The above method can be generalized to the case when
is the parameter of interest and dimension of
is
. With the regularity conditions stated in [
6], the log-likelihood ratio statistic
is asymptotically distributed as
with first order accuracy. Note that
is the overall MLE, which maximizes
, and
is the constrained MLE, which maximizes
for a given value of
. The Bartlett corrected log-likelihood ratio statistic is then
and
has mean
m with fourth order accuracy.
Theoretically, the Bartlett correction method gives extremely accurate results, even for small sample sizes. However, obtaining the explicit closed form expression of the Bartlett correction factor is a very difficult problem. There exists only limited problems in statistical literature that the explicit closed form, or even the asymptotic form of the Bartlett correction is available. For example, Ref. [
8] obtained the Bartlett correction factor for the one-sample gamma mean problem as well as for the equality of the two independent gamma means problem only. However, they did not discuss the case for testing homogeneity of
independent gamma means. The aim of this paper is to propose a systematic way of approximating a Bartlett-type correction factor.
Since the log-likelihood ratio statistic
has the limiting distribution
, similar to the Bartlett correction method, we want to find a scale transformation of
such that the transformed statistic has the exact mean
m. An obvious transformation is
However, calculating
is extremely complicated, if not impossible. If we have an observed sample of
, we can then apply the method of moments to estimate
. The primary task is to obtain such a sample. We propose to employ simulations to create such a sample. The main idea to to generate samples from the original model but with the parameters being the constrained MLE obtained from the original observed sample. We summarized the idea into the following algorithm.
Assume: | is a sample from a model with density
, and is the parameter of interest. |
Have: | The log-likelihood function is given in (3). |
| From the log-likelihood function, we can obtain
and . |
Step 1: | Generate a sample of size , from
the density . |
Step 2: | From the simulated sample, obtain the log-likelihood ratio statistic and
denote it as . |
Step 3: | Repeat Steps 1 to 2 N times, where N is large. |
| As a result, we have . |
Step 4: | Obtain
|
Step 5: | By the method of moments, is an consistent estimate of
. Hence,
has mean m, and its limiting distribution is . |
3. One-Sample Gamma Mean Problem
Consider the one-sample gamma mean problem with the log-likelihood given in (3). Ref. [
5] proposed to use Wald statistic given in (4) to obtain inference for
, whereas [
8] recommended to use the Bartlett corrected log-likelihood ratio statistic given in (6) to obtain inference for
. In this paper, a Bartlett-typed corrected log-likelihood ratio statistic based on the algorithm given in
Section 2 is proposed as an alternative approach to obtain inference for
. To compare the results obtained by these methods, we consider the data set given in [
5], which is the survival time of 20 mice exposed to 240 rad of gamma radiation:
152 | 152 | 115 | 109 | 137 | 88 | 94 | 77 | 160 | 165 |
125 | 40 | 128 | 123 | 136 | 101 | 62 | 153 | 83 | 69 |
Table 1 recorded the 95% confidence intervals for
and the
p-values for testing
vs.
obtained by the methods discussed in this paper. From
Table 1, we observed that results obtained by Jensen and Kristensen’s method and by the proposed method are almost identical. However, they are very different from the results obtained by Gross and Clark’s method and the standard log-likelihood ratio statistic method.
Simulation studies are performed to compare the accuracy of the methods discussed in this paper. In particular, 5000 simulated samples are obtained for each combination of
, and
n. Moreover, we use
for each of the simulated sample to approximate the mean of the log-likelihood ratio statistic. The proportion of samples that was rejected at 5% significance level are recorded in
Table 2. Theoretically, the true percentage of samples that will be rejected is 5% with a standard deviation of 0.31%. Extensive simulation studies were performed. All results are very similar and, therefore, only a subset of them are presented in
Table 2. Results from other combinations of the parameters are available from the authors upon request.
We observed that results by Gross and Clark’s method are significantly different from the nominal 5% level, but the accuracy is improving slowly as the sample size n increases. Results by the log-likelihood ratio method are slightly better, and the accuracy improves much faster as the sample size increases. Results by both Jensen and Kristensen’s method and the proposed method are very accurate and always within 3 standard deviation of the nominal 5% level, even when the sample size is as small as 5.
4. Testing Homogeneity of Independent Gamma Means
In this section, the proposed method is extended to testing homogeneity of
independent gamma means problem. Let
be a sample from the two-parameter gamma model with mean
and shape
, where
and
. Moreover, assume the
k two-parameter gamma models are independent. Let
be the log-likelihood function from the
ith model. Then the joint log-likelihood function is
where
Since the
k models are independent, the overall MLE is
where
and
must satisfy
Hence, the log-likelihood function is evaluated at MLE
.
To test homogeneity of
k gamma means, the null and alternative hypotheses are
The constrained log-likelihood function is
The constrained MLE can must satisfy
Thus, the log-likelihood function evaluated at the constrained MLE is
. Finally, the log-likelihood ratio statistic is
and it is asymptotically distributed as
with first order accuracy. Note that for this problem, the Wald method is not applicable. Moreover, ref. [
8] did not derive the exact Bartlett correction factor for
. However, the proposed method is still applicable.
To illustrate the application of the log-likelihood ratio method and the proposed method for this problem, we consider the intervals in service-hours between failures of the air-conditioning equipment in 10 Boeing 720 jet aircrafts reported in “Example T” from [
11]. It is assumed that the reported times for each aircraft is distributed as a two-parameter gamma distribution. The question of interest is whether the ten aircrafts have the same mean intervals in service hours between failure. In other words, we are testing
The log-likelihood ratio method gives a
p-value of 0.0871, whereas the proposed method gives a
p-value of 0.1295 (using
N = 100,000). At 10% level of significance, the two methods give contradictory results with the log-likelihood ratio method rejecting
, and the proposed method failing to reject
.
As in the one sample case, to compare the accuracy of the two methods, extensive simulation studies were performed. For each combination of
, 5000 simulated samples are obtained. Moreover, for each simulated sample, N = 100 is used to estimate the mean of the log-likelihood ratio statistic. The proportion of samples that reject the null hypothesis of homogeneity of the
k means at 5% significance levels are recorded. Since all results are very similar, only results from the 8 cases listed in
Table 3 are reported.
Results in
Table 4 and
Table 5 demonstrate that the log-likelihood ratio method gives unsatisfactory results, especially when the sample sizes are small. However, the accuracy of the results improve as the sample sizes increase. In comparison, the results from the proposed method are very accurate, and they are always within 3 standard deviation of the nominal 5% level, regardless of the sample sizes.