1. Introduction
Advanced statistical methods often require probabilities derived from intractable distributions, which leads to complicated calculations. The saddlepoint approximation technique can solve some of these problems. This method of approximation was proposed by Daniels [1], and it is a particular application of mathematical saddlepoint methods to statistical science. The technique provides an approximation formula for the cumulative distribution function (CDF) or the probability mass function (PMF) from the associated moment generating function (MGF). The saddlepoint approach has been applied to many scientific problems, such as bivariate symmetry tests for complete and competing risks data (Bell and Haller [2]). Kepner and Randles [3] compared tests for bivariate symmetry versus location and/or scale alternatives. Rao and Raghunath [4] proposed a simple nonparametric test for bivariate symmetry about a line. The saddlepoint method also provides a very accurate solution for estimating the hazard rate function (see Dissanayake and Trindade [5]), and Al Mutairi [6] discussed improved methods for calculating the hazard rate function for the stopped sum model using the saddlepoint approximation. The distribution of a linear combination of random variables appears in many applied problems and has attracted the attention of many scholars; in particular, the linear combination of two independent random variables has been investigated by many researchers (see, e.g., Amari and Misra [7], Di Salvo [8], Almasi et al. [9], Al Mutairi and Low [10], and Al Mutairi [11], among others). The demand for these models has increased because of their importance and wide range of applications.
This study improves the hazard rate function estimation methods for a linear combination of independent Poisson random variables by using the saddlepoint approximation.
It is well known that the Poisson distribution is stable under summation; that is, the sum of two independent Poisson random variables again has a Poisson distribution. This means that if two random variables $X_1 \sim \mathrm{Poisson}(\lambda_1)$ and $X_2 \sim \mathrm{Poisson}(\lambda_2)$ are independent, then their sum satisfies $X_1 + X_2 \sim \mathrm{Poisson}(\lambda_1 + \lambda_2)$.
At the same time, a rescaled Poisson random variable is no longer Poisson. The random variable $aX$, where $X$ is Poisson and the constant $a$ is not an integer, is not integer-valued and, hence, not Poisson. Thus, the exact distribution of a linear combination of independent Poisson random variables is not always known and is very difficult to obtain.
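The two facts above can be checked numerically. The following is a minimal sketch, not part of the original study: the rates `lam1`, `lam2` and the scale `a` are hypothetical values chosen only for illustration. It convolves two Poisson mass functions and compares the result with the Poisson mass function of the summed rate, then lists the first support points of a rescaled variable.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Exact Poisson probability mass function."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def pmf_of_sum(k: int, lam1: float, lam2: float) -> float:
    """PMF of X1 + X2 for independent Poisson variables, by direct convolution."""
    return sum(poisson_pmf(i, lam1) * poisson_pmf(k - i, lam2) for i in range(k + 1))

# Hypothetical rates, chosen only for illustration.
lam1, lam2 = 1.5, 2.5

# Largest discrepancy between the convolution and Poisson(lam1 + lam2) over k = 0..19.
max_gap = max(
    abs(pmf_of_sum(k, lam1, lam2) - poisson_pmf(k, lam1 + lam2)) for k in range(20)
)

# Rescaling by a non-integer constant moves the support off the integers.
a = 0.5
support = [a * k for k in range(5)]
```

Here `max_gap` is at the level of floating-point rounding, while `support` contains non-integer points, so the rescaled variable cannot be Poisson.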
Therefore, approximation methods are essential and can provide useful information about the distribution of these complicated models. The saddlepoint approximation can stand in for the exact answer with high accuracy, which adds to its advantages in saving time and effort.
2. Saddlepoint Approximation for the Linear Combination of Poisson Random Variables
Linear combinations of Poisson random variables are a special case of linear combinations of convoluted random variables, which occur in a wide range of fields. In most cases, the exact distribution of these linear combinations is extremely difficult to determine, and the normal approximation usually performs poorly for such distributions. A better way to approximate the distribution of a linear combination is the saddlepoint approximation, which provides accurate expressions for distribution functions that have no known closed form. This method not only yields an accurate approximation near the centre of the distribution but also controls the relative error in the far tail of the distribution.
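The linear combination of Poisson random variables is given by
$$Y = \sum_{i=1}^{n} a_i X_i,$$
where $a_1, \dots, a_n$ are real constants and $X_i \sim \mathrm{Poisson}(\lambda_i)$, $i = 1, \dots, n$, are independent Poisson random variables. Note that we do not require that $X_1, \dots, X_n$ be identically distributed.
Recall that the cumulant generating function (CGF) of a random variable $X$ is defined as the logarithm of its moment generating function:
$$K_X(s) = \ln M_X(s) = \ln E\left[e^{sX}\right].$$
The saddlepoint technique requires calculating the cumulant generating function for the linear combination $Y$ of Poisson random variables as
$$K(s) = \sum_{i=1}^{n} \lambda_i \left(e^{a_i s} - 1\right).$$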
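The CGF of the linear combination and its first two derivatives are the only ingredients the saddlepoint formulas need. A minimal sketch follows, assuming the coefficients and Poisson means are passed as equal-length sequences; the function names `cgf`, `cgf_d1` and `cgf_d2` and the numerical values are illustrative and do not come from the original study.

```python
import math
from typing import Sequence

def cgf(s: float, a: Sequence[float], lam: Sequence[float]) -> float:
    """K(s) = sum_i lam_i * (exp(a_i * s) - 1) for Y = sum_i a_i * X_i."""
    return sum(l * (math.exp(c * s) - 1.0) for c, l in zip(a, lam))

def cgf_d1(s: float, a: Sequence[float], lam: Sequence[float]) -> float:
    """First derivative K'(s); K'(0) is the mean of Y."""
    return sum(c * l * math.exp(c * s) for c, l in zip(a, lam))

def cgf_d2(s: float, a: Sequence[float], lam: Sequence[float]) -> float:
    """Second derivative K''(s); K''(0) is the variance of Y."""
    return sum(c * c * l * math.exp(c * s) for c, l in zip(a, lam))

# Hypothetical coefficients and Poisson means, chosen only for illustration.
a = (0.5, 2.0)
lam = (1.0, 3.0)
mean_y = cgf_d1(0.0, a, lam)      # 0.5*1 + 2*3
variance_y = cgf_d2(0.0, a, lam)  # 0.25*1 + 4*3
```

Evaluating the derivatives at zero recovers the mean and variance of the linear combination, which is a quick sanity check on the CGF.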
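Note that in Butler [12], Sections 1.1.2 and 1.1.5 (formulae (1.13) and (1.14)), the saddlepoint approximation is defined only for continuous and integer-valued distributions:
$$\hat{p}(k) = \frac{1}{\sqrt{2\pi K''(\hat{s})}} \exp\left\{K(\hat{s}) - \hat{s}k\right\}, \qquad (1)$$
where $K$ is the CGF and $\hat{s}$ is the unique solution for the saddlepoint equation $K'(\hat{s}) = k$.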
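The saddlepoint mass function can be sketched in a few lines once the saddlepoint equation is solved numerically. The code below is an illustrative sketch rather than the implementation used in the study: it assumes all coefficients and means are positive and the evaluation point is positive, so that $K'$ is increasing and a simple bisection finds the root. The function names and the example parameters are hypothetical.

```python
import math

def K(s, a, lam):
    """CGF of Y = sum_i a_i X_i with independent X_i ~ Poisson(lam_i)."""
    return sum(l * (math.exp(c * s) - 1.0) for c, l in zip(a, lam))

def K1(s, a, lam):
    """First derivative of the CGF."""
    return sum(c * l * math.exp(c * s) for c, l in zip(a, lam))

def K2(s, a, lam):
    """Second derivative of the CGF."""
    return sum(c * c * l * math.exp(c * s) for c, l in zip(a, lam))

def solve_saddlepoint(x, a, lam, tol=1e-12):
    """Solve K'(s) = x by bisection (assumes a_i > 0, lam_i > 0 and x > 0)."""
    if x <= 0:
        raise ValueError("x must be positive for this sketch")
    lo, hi = -1.0, 1.0
    while K1(lo, a, lam) > x:   # widen the bracket to the left
        lo *= 2.0
    while K1(hi, a, lam) < x:   # widen the bracket to the right
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if K1(mid, a, lam) < x:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def saddlepoint_pmf(x, a, lam):
    """Saddlepoint mass function: exp(K(s) - s*x) / sqrt(2*pi*K''(s))."""
    s = solve_saddlepoint(x, a, lam)
    return math.exp(K(s, a, lam) - s * x) / math.sqrt(2.0 * math.pi * K2(s, a, lam))

# Hypothetical check: with a = (1, 1) the sum is Poisson(lam1 + lam2).
approx = saddlepoint_pmf(5, (1.0, 1.0), (1.0, 2.0))
exact = math.exp(-3.0) * 3.0 ** 5 / math.factorial(5)
```

With unit coefficients the exact mass function is available, so `approx` and `exact` can be compared directly; for other coefficients only `approx` is cheap to obtain.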
The saddlepoint approximation for a discrete cumulative distribution function requires some modification of the continuous formula to reach a very high level of accuracy. Daniels [13] derived three such continuity corrections. Let $X$ be a discrete random variable with CDF $F(k)$ supported on the integers and with mean $\mu$. The right or left tail probability is used in many applied problems and in calculating the failure rate, as shown in this work.
All three saddlepoint continuity corrections are very accurate; which one is the most accurate depends on the specific application or problem (Butler [12]).
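The distribution of the linear combination $Y$ of Poisson random variables is, in general, neither continuous nor integer-valued, and, hence, we need a continuity correction for the saddlepoint approximation. For more details on these arguments, see Daniels [1] and Daniels [13], Section 2. The first continuity correction for the right tail is given as
$$\widehat{\Pr}_1(X \ge k) = 1 - \Phi(\hat{w}) - \phi(\hat{w})\left(\frac{1}{\hat{w}} - \frac{1}{\tilde{u}_1}\right), \quad k \ne \mu, \qquad (2)$$
where $\mu$ is the mean of the random variable $X$, $\Phi$ and $\phi$ are the standard normal distribution and density functions, respectively, and
$$\hat{w} = \operatorname{sgn}(\hat{s})\sqrt{2\left\{\hat{s}k - K(\hat{s})\right\}}, \qquad \tilde{u}_1 = \left(1 - e^{-\hat{s}}\right)\sqrt{K''(\hat{s})},$$
with $\hat{s}$ the solution of the saddlepoint equation $K'(\hat{s}) = k$.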
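The second continuity correction is given as
$$\widehat{\Pr}_2(X \ge k) = 1 - \Phi(\tilde{w}_2) - \phi(\tilde{w}_2)\left(\frac{1}{\tilde{w}_2} - \frac{1}{\tilde{u}_2}\right), \quad k^- \ne \mu,$$
where $k^- = k - 1/2$, $\tilde{s}$ is the unique solution for the saddlepoint equation $K'(\tilde{s}) = k^-$, and
$$\tilde{w}_2 = \operatorname{sgn}(\tilde{s})\sqrt{2\left\{\tilde{s}k^- - K(\tilde{s})\right\}}, \qquad \tilde{u}_2 = 2\sinh\left(\tilde{s}/2\right)\sqrt{K''(\tilde{s})}.$$
The third continuity correction suggested in Butler [12], Section 1 (pp. 17–18), is the same as the second continuity correction but with $\tilde{u}_2$ replaced by $\tilde{u}_3 = \tilde{s}\sqrt{K''(\tilde{s})}$.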
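The three continuity corrections differ only in the evaluation point and in the term that replaces $\tilde{u}$, so they can share one routine. The sketch below follows the formulas as given in Butler [12] for an integer lattice; it is an illustration under stated assumptions (positive coefficients and means, positive evaluation point, bisection for the saddlepoint), and the function names and example parameters are hypothetical rather than taken from the study.

```python
import math

def K(s, a, lam):
    """CGF of Y = sum_i a_i X_i with independent X_i ~ Poisson(lam_i)."""
    return sum(l * (math.exp(c * s) - 1.0) for c, l in zip(a, lam))

def K1(s, a, lam):
    return sum(c * l * math.exp(c * s) for c, l in zip(a, lam))

def K2(s, a, lam):
    return sum(c * c * l * math.exp(c * s) for c, l in zip(a, lam))

def solve_saddlepoint(x, a, lam, tol=1e-12):
    """Solve K'(s) = x by bisection (assumes a_i > 0, lam_i > 0 and x > 0)."""
    lo, hi = -1.0, 1.0
    while K1(lo, a, lam) > x:
        lo *= 2.0
    while K1(hi, a, lam) < x:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if K1(mid, a, lam) < x:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def right_tail(k, a, lam, correction=1):
    """Continuity-corrected saddlepoint approximation to Pr(Y >= k)."""
    if correction not in (1, 2, 3):
        raise ValueError("correction must be 1, 2 or 3")
    x = k if correction == 1 else k - 0.5   # corrections 2 and 3 use k - 1/2
    if math.isclose(x, K1(0.0, a, lam)):
        raise ValueError("the formula is not defined at the mean")
    s = solve_saddlepoint(x, a, lam)
    w = math.copysign(math.sqrt(2.0 * (s * x - K(s, a, lam))), s)
    root = math.sqrt(K2(s, a, lam))
    if correction == 1:
        u = (1.0 - math.exp(-s)) * root
    elif correction == 2:
        u = 2.0 * math.sinh(0.5 * s) * root
    else:
        u = s * root
    return 1.0 - Phi(w) - phi(w) * (1.0 / w - 1.0 / u)

# Hypothetical check against the exact Poisson(3) right tail at k = 5.
exact_tail = 1.0 - sum(math.exp(-3.0) * 3.0 ** j / math.factorial(j) for j in range(5))
tails = [right_tail(5, (1.0, 1.0), (1.0, 2.0), c) for c in (1, 2, 3)]
```

For unit coefficients the exact tail is known, which makes it easy to see how close each correction is; for non-integer coefficients the same routine applies the formulas as written, without any lattice rescaling.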
3. An Example of the Saddlepoint Approximation for the Distribution Function
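Consider the linear combination of two independent Poisson random variables
$$Y = a_1 X_1 + a_2 X_2,$$
where $X_1 \sim \mathrm{Poisson}(\lambda_1)$ and $X_2 \sim \mathrm{Poisson}(\lambda_2)$. The CGF is
$$K(s) = \lambda_1\left(e^{a_1 s} - 1\right) + \lambda_2\left(e^{a_2 s} - 1\right).$$
The saddlepoint equation is
$$K'(\hat{s}) = a_1\lambda_1 e^{a_1\hat{s}} + a_2\lambda_2 e^{a_2\hat{s}} = k.$$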
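In the special case $a_1 = a_2 = 1$ the saddlepoint equation can be solved in closed form. The short derivation below is a worked illustration, writing $\lambda = \lambda_1 + \lambda_2$ (a shorthand introduced here); it shows that the saddlepoint mass function is the exact Poisson mass function with $k!$ replaced by Stirling's approximation.

```latex
\begin{align*}
K(s) &= \lambda\left(e^{s} - 1\right), \qquad \lambda = \lambda_1 + \lambda_2, \\
K'(\hat{s}) &= \lambda e^{\hat{s}} = k
  \;\Longrightarrow\; \hat{s} = \ln\!\left(\frac{k}{\lambda}\right), \qquad
  K''(\hat{s}) = \lambda e^{\hat{s}} = k, \\
\hat{p}(k) &= \frac{1}{\sqrt{2\pi k}}
  \exp\!\left\{k - \lambda - k\ln\!\left(\frac{k}{\lambda}\right)\right\}
  = \frac{e^{-\lambda}\lambda^{k}}{\sqrt{2\pi k}\,k^{k}e^{-k}}, \\
p(k) &= \frac{e^{-\lambda}\lambda^{k}}{k!}, \qquad
  k! \approx \sqrt{2\pi k}\,k^{k}e^{-k}.
\end{align*}
```

The relative error of $\hat{p}(k)$ therefore does not depend on $\lambda$ and shrinks roughly like $1/(12k)$ as $k$ grows.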
To show the performance of this method, we consider the sum of two independent Poisson random variables with means
and
, which is also a Poisson random variable with mean
. Let
For
, we have
and, hence, the saddlepoint is
and the PMF
from Equation (1) is
The exact
is given by calculating
when
.
The absolute error of our approximation is .
Next, we find the first continuity correction given in Equation (2)
because
The exact right tail is
with an absolute error of
.
We note that saddlepoint approximations give accurate results in calculating the mass function and distribution function and can replace the exact answers. We can compare this with other approximation methods, for example, the normal approximation for the right tail when
with an absolute error of
.
To assess the performance of the saddlepoint approach, the present study also compared it with another method of approximation, namely, the Haldane approximation. Pentikäinen [14] established the accuracy of that method under certain circumstances. Suppose $X$ is a random variable with mean $\mu$, variance $\sigma^2$, and coefficient of skewness $\gamma$. The Haldane type A approximation for the right tail is given by
where
(see Borowiak [15]). We get
with an absolute error of
.
From the calculations above, we conclude that the saddlepoint approximation for the distribution function is more accurate than the normal and Haldane approximations and, hence, the saddlepoint hazard (rate) function is more accurate than those of the associated normal and Haldane approximations.
The aim of this study is to compare the saddlepoint and normal approximations, which are common approximation methods.
4. The Hazard Rate Function Based on the Saddlepoint Approximation
The hazard rate function for a discrete distribution taking values in $\{0, 1, 2, \dots\}$ is defined as
$$h(k) = \frac{\Pr(X = k)}{\Pr(X \ge k)}$$
(see, e.g., Daly [16]). Consider the same example of the sum of two independent Poisson random variables with means
and
as in the previous section. Equations (3) and (5) lead to the saddlepoint approximation of the hazard rate function with the first continuity correction
while the exact hazard rate based on Equations (4) and (6) is
Here, the absolute error for the saddlepoint and exact values is
. For the normal approximation, based on Equation (7), the hazard rate approximation is
with an absolute error of
.
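The hazard rate comparison above combines the saddlepoint mass function with the first continuity correction for the right tail. The following sketch illustrates that ratio for a sum of two Poisson variables, where the exact hazard rate is available. It assumes the hazard rate is the mass at $k$ divided by the right tail $\Pr(X \ge k)$, uses hypothetical means, and is not the code used to produce the values reported in the paper.

```python
import math

def K(s, a, lam):
    """CGF of Y = sum_i a_i X_i with independent X_i ~ Poisson(lam_i)."""
    return sum(l * (math.exp(c * s) - 1.0) for c, l in zip(a, lam))

def K1(s, a, lam):
    return sum(c * l * math.exp(c * s) for c, l in zip(a, lam))

def K2(s, a, lam):
    return sum(c * c * l * math.exp(c * s) for c, l in zip(a, lam))

def solve_saddlepoint(x, a, lam, tol=1e-12):
    """Solve K'(s) = x by bisection (assumes a_i > 0, lam_i > 0 and x > 0)."""
    lo, hi = -1.0, 1.0
    while K1(lo, a, lam) > x:
        lo *= 2.0
    while K1(hi, a, lam) < x:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if K1(mid, a, lam) < x:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def saddlepoint_hazard(k, a, lam):
    """Saddlepoint mass function divided by the first-correction right tail."""
    s = solve_saddlepoint(k, a, lam)
    k2 = K2(s, a, lam)
    exponent = K(s, a, lam) - s * k          # equals -w**2 / 2
    pmf = math.exp(exponent) / math.sqrt(2.0 * math.pi * k2)
    w = math.copysign(math.sqrt(-2.0 * exponent), s)
    u1 = (1.0 - math.exp(-s)) * math.sqrt(k2)
    Phi_w = 0.5 * (1.0 + math.erf(w / math.sqrt(2.0)))
    phi_w = math.exp(-0.5 * w * w) / math.sqrt(2.0 * math.pi)
    tail = 1.0 - Phi_w - phi_w * (1.0 / w - 1.0 / u1)
    return pmf / tail

def exact_poisson_hazard(k, lam):
    """Exact hazard rate Pr(X = k) / Pr(X >= k) for X ~ Poisson(lam)."""
    pmf = lambda j: math.exp(-lam) * lam ** j / math.factorial(j)
    return pmf(k) / (1.0 - sum(pmf(j) for j in range(k)))

# Hypothetical means 1 and 2 with unit coefficients, evaluated at k = 5.
approx_h = saddlepoint_hazard(5, (1.0, 1.0), (1.0, 2.0))
exact_h = exact_poisson_hazard(5, 3.0)
```

The evaluation point must differ from the mean, since the right-tail formula is undefined there.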
The discussion above pertains to the special case of the linear combination of two independent, not identically distributed Poisson random variables with constants $a_1 = a_2 = 1$. We consider this case because the exact distribution of this sum is known, which makes it possible to compare the exact solution with the approximations.
Now we consider other positive real values for $a_1$ and $a_2$, which give a general linear combination of independent Poisson random variables. It has a complicated distribution, and the exact mass and density functions are, in most cases, unknown or difficult to obtain. Therefore, approximation methods are very important.
Tables 1, 2 and 3 report, for the first, second and third saddlepoint continuity corrections, respectively, the exact values together with the saddlepoint and normal approximations of the hazard rate function for the linear combination
, where
and
, for different values
. The exact value is found by generating
random samples from the linear combination using the R programme.
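The simulation step can be sketched as follows. The original study used R; this Python version is an assumed re-creation of the idea, not the authors' code. It draws Poisson variates with Knuth's multiplication method (adequate for small means), forms the linear combination, and estimates the hazard rate as the proportion of tail samples that fall exactly at the evaluation point. The sample size, seed, tolerance and parameters are hypothetical.

```python
import math
import random

def sample_poisson(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; suitable for small lam only."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulated_hazard(y, a, lam, n=100_000, seed=0, eps=1e-9):
    """Monte Carlo estimate of Pr(Y = y) / Pr(Y >= y) for Y = sum_i a_i X_i."""
    rng = random.Random(seed)
    at_y = in_tail = 0
    for _ in range(n):
        value = sum(c * sample_poisson(l, rng) for c, l in zip(a, lam))
        if value >= y - eps:           # sample lies in the right tail
            in_tail += 1
            if abs(value - y) <= eps:  # sample lies at the evaluation point
                at_y += 1
    if in_tail == 0:
        raise ValueError("no samples in the tail; increase n or lower y")
    return at_y / in_tail

# Hypothetical example with non-integer coefficients.
h_mc = simulated_hazard(4.5, (0.5, 1.5), (1.0, 2.0), n=20_000)
```

The small tolerance `eps` guards against floating-point rounding when the coefficients are not integers. The estimate becomes noisy far in the tail, where few samples land, so the sample size should grow with the evaluation point.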
The results in Tables 1, 2 and 3 demonstrate the performance of the saddlepoint approach: its relative error with respect to the exact values is smaller than the relative error of the normal approximation.
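Next, to show the effectiveness of this method of approximation, we consider a four-component linear combination of independent Poisson random variables
$$Y = a_1 X_1 + a_2 X_2 + a_3 X_3 + a_4 X_4,$$
where $X_i \sim \mathrm{Poisson}(\lambda_i)$, $i = 1, \dots, 4$. Applying the same technique as for two independent Poisson random variables, we obtain that the CGF is given by
$$K(s) = \sum_{i=1}^{4} \lambda_i\left(e^{a_i s} - 1\right)$$
and the saddlepoint equation is obtained as
$$K'(\hat{s}) = \sum_{i=1}^{4} a_i \lambda_i e^{a_i \hat{s}} = k.$$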
For example, suppose that
with constants
. Consider the value
. This also gives us a good opportunity to investigate the accuracy of this method, because the exact solution is already known as
.
From the saddlepoint equation , we obtain that and the saddlepoint is .
Based on Equation (1),
with its corresponding exact answer
. This leads to a high level of accuracy for calculating the mass function, with an absolute error of
.
The exact right tail probability when
is obtained as
Calculating the first continuity correction by Equation (2), we obtain
This provides the first continuity correction of the saddlepoint approximation for the right tail with
with an absolute error of
.
The corresponding normal approximation for
is
with an absolute error of
. This means that the saddlepoint approximation is closer to the exact value than the normal approximation.
From the calculations above, we can derive the saddlepoint approximation for the hazard rate for the first continuity correction when
as
with an absolute error, for the exact hazard rate
, of
.
Tables 4, 5 and 6 present the first, second and third saddlepoint continuity corrections, respectively, together with the associated normal approximation.
From Tables 4, 5 and 6, we see that the saddlepoint approximation performs well, with a higher level of accuracy than the normal approximation; that is, the relative error for the saddlepoint is smaller than the relative error for the normal approximation.