1. Introduction
Let $X_1, X_2, \ldots$ be independent and identically distributed random variables (i.i.d. r.v.'s), and let $N_\lambda$ be a Poisson r.v. with expectation $\lambda > 0$, independent of the sequence $X_1, X_2, \ldots$ for each $\lambda > 0$. The r.v.
$$S_\lambda := X_1 + \cdots + X_{N_\lambda}$$
is called a Poisson random sum, and its distribution is called a compound Poisson distribution. Here, for definiteness, we assume that $S_\lambda = 0$ whenever $N_\lambda = 0$. Poisson random sums are popular mathematical models in many fields. In particular, in the classical collective risk model [1], the r.v. $S_\lambda$ describes the total insurance claim amount per time unit, with the intensity of the claim arrivals equaling $\lambda$. Many examples of applied problems that make use of Poisson random sums can be found, e.g., in the books [2,3,4]. As a rule, these problems can be successfully solved only if the distribution of the r.v. $S_\lambda$ is either known or approximated accurately enough.
Assume that $0 < \mathsf{E}X^2 < \infty$. We denote by
$$\Delta_\lambda := \sup_x \left| \mathsf{P}\!\left( \frac{S_\lambda - \lambda\mathsf{E}X}{\sqrt{\lambda\mathsf{E}X^2}} < x \right) - \Phi(x) \right|$$
the uniform distance between the d.f. of the normalized Poisson random sum and the standard normal d.f. $\Phi$, where we use that $\mathsf{E}S_\lambda = \lambda\mathsf{E}X$ and $\mathsf{D}S_\lambda = \lambda\mathsf{E}X^2$. As is well known, under the above assumptions, the compound Poisson distributions are asymptotically normal: $\Delta_\lambda \to 0$ as $\lambda \to \infty$. Therefore, irrespective of the common distribution of the summands $X_j$, the distribution function (d.f.) of the Poisson random sum $S_\lambda$ can be approximated by the normal law with the corresponding location and scale parameters, provided that reasonable ("convenient," computable) estimates for the uniform distance $\Delta_\lambda$ are available.
Under the above assumptions, $\Delta_\lambda$ may converge to zero arbitrarily slowly ([5], Theorems 5 and 8). Some possible upper bounds for $\Delta_\lambda$ in this situation were presented in [6]. However, under some additional moment-type conditions, the rate of convergence of $\Delta_\lambda$ to zero can be rather universally estimated by a "convenient" power-type function. For example, if $\mathsf{E}|X|^{2+\delta} < \infty$ for some $\delta \in (0, 1]$, then $\Delta_\lambda = O(\lambda^{-\delta/2})$ as $\lambda \to \infty$. The particular form of the bound is determined by the available moment characteristics of $X$.
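As a numerical illustration of this asymptotic normality (ours, not part of the original text), the uniform distance can be estimated by simulation. The sketch below assumes Exponential(1) summands (so $\mathsf{E}X = 1$, $\mathsf{E}X^2 = 2$) and the standard compound-Poisson normalization $\mathsf{E}S_\lambda = \lambda\mathsf{E}X$, $\mathsf{D}S_\lambda = \lambda\mathsf{E}X^2$; the function name and its parameters are ours.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)
# Standard normal d.f. Phi, vectorized over a numpy array.
_phi = np.vectorize(lambda t: 0.5 * (1.0 + erf(t / sqrt(2.0))))

def uniform_distance(lam, ex=1.0, ex2=2.0, n_sims=50_000):
    """Monte Carlo estimate of the uniform (Kolmogorov) distance between the
    d.f. of the normalized Poisson random sum and Phi, for Exp(1) summands."""
    counts = rng.poisson(lam, n_sims)
    totals = np.array([rng.exponential(1.0, c).sum() if c else 0.0
                       for c in counts])
    z = np.sort((totals - lam * ex) / np.sqrt(lam * ex2))
    f = _phi(z)
    i = np.arange(1, n_sims + 1)
    # Exact Kolmogorov-Smirnov statistic of the sample against Phi.
    return max((i / n_sims - f).max(), (f - (i - 1) / n_sims).max())

for lam in (1.0, 10.0, 100.0):
    print(f"lambda = {lam:6.1f}  estimated uniform distance = "
          f"{uniform_distance(lam):.4f}")
```

The estimated distance shrinks roughly like $\lambda^{-1/2}$ as $\lambda$ grows, in line with the $O(\lambda^{-\delta/2})$ rate quoted above for $\delta = 1$.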
The main attention has traditionally been given to the case $\delta = 1$ since, generally, for $\delta > 1$, the convergence rate remains the same as for $\delta = 1$. Moreover, by analogy with convergence-rate bounds for sums of a non-random number of independent r.v.'s, central moments were initially used in the moment-type bounds for $\Delta_\lambda$, since these bounds were themselves obtained by a more or less ingenious application of the formula of total probability in order to extend to random sums the bounds initially constructed for non-random sums. These bounds had a rather cumbersome form, as shown in [7,8].
However, in the construction of estimates of the accuracy of the normal approximation to compound Poisson distributions, it turned out to be convenient and reasonable to use non-central moments. In these terms, the bounds take a pretty simple form [9,10]:
$$\Delta_\lambda \le C_P \cdot L_\lambda, \qquad L_\lambda := \frac{\mathsf{E}|X|^3}{\sqrt{\lambda}\,(\mathsf{E}X^2)^{3/2}}, \quad (1)$$
where $L_\lambda$ is called the non-central Lyapunov ratio or the non-central Lyapunov fraction. Estimate (1) is an analog of the Berry–Esseen inequality for Poisson random sums (or for compound Poisson distributions).
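Assuming the standard form of the non-central Lyapunov ratio, $L_\lambda = \mathsf{E}|X|^3 / (\sqrt{\lambda}\,(\mathsf{E}X^2)^{3/2})$ (a reconstruction, since the displayed formula is not reproduced here), the right-hand side of (1) is elementary to evaluate; a minimal sketch with hypothetical names:

```python
from math import sqrt

def noncentral_lyapunov_ratio(abs_m3, m2, lam):
    """L_lambda = E|X|^3 / (sqrt(lambda) * (E X^2)^(3/2))."""
    return abs_m3 / (sqrt(lam) * m2 ** 1.5)

# Exponential(1) summands: E|X|^3 = 3! = 6 and E X^2 = 2.
for lam in (1.0, 100.0, 10_000.0):
    print(f"lambda = {lam:8.0f}  L_lambda = "
          f"{noncentral_lyapunov_ratio(6.0, 2.0, lam):.6f}")
```

Multiplying $L_\lambda$ by an admissible value of the constant in (1) then gives a fully computable upper bound on the uniform distance $\Delta_\lambda$.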
The first upper bounds for the constant $C_P$ in (1) [9,10,11] were greater than the then best-known upper bounds for the absolute constant C in the classical Berry–Esseen inequality [12,13]
$$\sup_x \left| \mathsf{P}\!\left( \frac{X_1 + \cdots + X_n - n\mathsf{E}X_1}{\sqrt{n\mathsf{D}X_1}} < x \right) - \Phi(x) \right| \le C\,\ell_n, \qquad \ell_n := \frac{\mathsf{E}|X_1 - \mathsf{E}X_1|^3}{(\mathsf{D}X_1)^{3/2}\sqrt{n}},$$
where $\ell_n$ is known as the central Lyapunov ratio or the central Lyapunov fraction. Michel [14] was the first to prove that inequality (1) holds with the same constant as the classical Berry–Esseen inequality (four years later, this result was independently re-proved in [15]). Finally, the authors of [16] succeeded in proving that the constant in (1) is strictly smaller. Namely, in that paper, the upper bound $C_P \le 0.3041$ was obtained, which is strictly less than Esseen's lower bound $0.4097\ldots$ [17] for the absolute constant C. Later, the upper bound for $C_P$ was lowered further in [18] (see also [19], Theorem 2.4.3) and in ([20], Theorem 4). The first lower bound for $C_P$ was obtained in the paper [21]. In ([5], Theorem 5) and ([22], Chapter 3, p. 50), this estimate was improved. In [5], an intermediate estimate was obtained in terms of the least upper bound with respect to $\lambda$ and m, whereas in [22], exact values were found to provide the lower bound for this supremum; however, letting $\lambda \to \infty$ yields only the (smaller) limit value. The lower bound for the constant $C_P$ is presented here with the separation of the leading term, due to which this number plays the same asymptotic role in inequality (1) as the Esseen lower bound $(\sqrt{10} + 3)/(6\sqrt{2\pi}) = 0.4097\ldots$ plays in the classical Berry–Esseen inequality. For more details concerning asymptotically exact constants, see [5,23]. A detailed survey of the moment-type bounds for the accuracy of the normal approximation to the compound Poisson distribution, including both non-asymptotic and asymptotic settings, can be found in [5] (for the non-asymptotic setting, see also [18], Section 3).
It should be noted that estimate (1) in terms of the non-central Lyapunov ratio implies a similar estimate in terms of the central Lyapunov ratio,
$$\Delta_\lambda \le C_0 \cdot \frac{\mathsf{E}|X - \mathsf{E}X|^3}{\sqrt{\lambda}\,(\mathsf{D}X)^{3/2}}, \quad (4)$$
where $C_0$ is an absolute constant, but not vice versa. Namely, let $\mathcal{F}_3$ denote the class of all distributions on the real line with finite third moments. In 1996, S. Shorgin [24] proved a sharp inequality between the non-central and central Lyapunov ratios valid for all distributions in $\mathcal{F}_3$; hence, with the account of the upper bound for the constant in (1) from [20], an admissible value of $C_0$ in (4) follows, and it also follows that inequality (4) does not imply (1); that is, bound (1) in terms of the non-central Lyapunov ratio is not only obtained in a more natural way than (4) but is also more accurate. However, inequality (4) is extremely convenient in estimating the rate of convergence of the distributions of randomly stopped random walks with equivalent elementary trends and variances to variance-mean mixtures of normal laws [25,26,27,28,29], in particular, to the skew exponential power law, the skew Student law and, more generally, to the variance-generalized gamma and generalized hyperbolic distributions. Note that such asymptotic behavior of the elementary trends and variances is typical for the increments of a Wiener process with drift, and, due to the considerable trends, the central moments of the elementary increments are computed in a much simpler way than the non-central ones, which gives inequality (4) an advantage over inequality (1).
In 2001, S. Shorgin [30] conjectured the exact value of the constant in (1) (hypothesis (5) below) and described the hypothetical extremal two-point distribution of the r.v. X. In 2011, Korolev, Shevtsova, and Shorgin [31] demonstrated that the least upper bound in question can be sought in the class of distributions concentrated in at most three points and computed a numerical estimate supporting the hypothesis; see also ([19], Section 2.4). Note that the upper bound for the constant that was best known as of 2011 [18] yielded only a worse upper bound, published in the cited works.
In the present paper, a complete proof of hypothesis (5) is given, but the main result consists of the solution to this problem in a more delicate setting. Namely, we fix the value of the normalized mathematical expectation and, instead of the unconditional optimization problem (5), solve the conditional optimization problem (6), which allows us to take the possible smallness of the centering parameter into account and to majorize the ratio in question by a quantity close to unity, which is almost one and a half times more accurate than is allowed by (5). The extreme values of the normalized expectation are not considered here because the only distribution satisfying the corresponding conditions is the one degenerate at the point t. The solution to the conditional optimization problem (6) reduces the calculation of the least upper bound to the quantity in (7). In the present paper, this quantity is calculated for each value of the centering parameter (Theorem 1 and Table 1), and hypothesis (5) is proved by representing the unconditional supremum as an iterated one and calculating the outer least upper bound with respect to the centering parameter (Theorem 2 and Table 1). In particular, it follows from (7) that, for any distribution with a known value of the normalized first-order moment, inequality (4) holds with a sharper value of the constant. The values of this constant, rounded up to the fourth digit, are presented for some values of the centering parameter in the fourth column of Table 1. In addition, in Theorem 3, the form of the constant is presented for the case where only an upper bound for the normalized expectation is known.
Regarding the methods, the computation of the least upper bound in (7) is implemented in two steps: a reduction to distributions concentrated in at most two points (see Section 3, "Reduction to the case of two-point distributions") and the analysis of the two-point distributions (see Section 4, "Analysis of the two-point distributions"); the last step is, in fact, the most difficult one from a technical point of view. It should also be noted here that the standard technique based on the works [32,33,34] (see also [35]) allows a reduction only to three-point distributions, since there are three linear conditions in total in (6) and (7): the two moment conditions plus one probability-normalization condition. In fact, the same moments should be fixed in (5) to make the objective function linear with respect to the underlying distribution, and, hence, no further reduction in (5) can be achieved by the standard techniques alone. Therefore, we use an alternative approach based on the construction of a special lower bound with two tangency points, in the form of a linear combination of the functions generating the required moment conditions (Lemma 1 in Section 3), and on subsequently integrating the obtained inequality with respect to x (Lemma 2 in Section 3). This trick allows us to immediately reduce the calculation of the least upper bound in (7) to the analysis of the two-point distributions, which is implemented in Lemma 4 of Section 4.
Section 2, "Formulations of main results," contains accurate formulations of the main results, and Section 5, "Proofs of main results," contains their proofs.
To conclude this introductory overview, note as well that an "opposite" problem of comparing the central and non-central absolute moments was considered in the papers [36] and [37], the latter for an arbitrary moment order and for a wider class of functions of X, and also in [38] under an additional restriction.
3. Reduction to the Case of Two-Point Distributions
The aim of the present section is to prove that for every r.v. X with a finite third moment, there exists an r.v. Y with the same expectation and variance, and with the third absolute moment matching that of X (and whose distribution is then uniquely defined), whose objective value is no smaller. This immediately implies that the investigation of the least upper bound in (7) can be restricted to the analysis of the two-point distributions only.
Following Richter [39], we start with the construction (Lemma 1) of a special lower bound for the function under consideration, which satisfies the following two important properties:
it is a linear combination of the functions generating the given moment conditions; and
it has exactly two tangency points with the function being bounded.
Afterward, we integrate (Lemma 2) the obtained inequality with respect to x to construct a lower bound for the objective functional as a linear combination of the fixed moments, and we note that equality in the obtained inequality is attained iff X is a two-point r.v. with specific possible values. Finally, we prove in Lemma 3 that for every admissible parameter value and any r.v. X satisfying the above three moment conditions, there exists a two-point distribution (of the r.v. Y) whose support satisfies all the conditions on the coefficients imposed by Lemma 2 and which then satisfies the required inequality. The last statement allows us to immediately conclude that only the two-point distributions may be extremal.
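The moment-matching mechanism behind this reduction can be illustrated numerically (a sketch under our own parametrization, not the paper's notation): for any prescribed mean and variance there is a one-parameter family of two-point laws matching them, and the third absolute moment varies continuously with the parameter p, which is how a two-point r.v. Y matching all three moments of X can be located.

```python
from math import sqrt

def two_point(mean, var, p):
    """Two-point law P(Y = x1) = p, P(Y = x2) = 1 - p with the given
    mean and variance; valid for any p in (0, 1)."""
    s = sqrt(var)
    x1 = mean - s * sqrt((1 - p) / p)
    x2 = mean + s * sqrt(p / (1 - p))
    return (x1, p), (x2, 1 - p)

# The first two moments are matched for every p, while E|Y|^3 sweeps
# a continuum of values as p varies.
for p in (0.1, 0.5, 0.9):
    (x1, q1), (x2, q2) = two_point(1.0, 2.0, p)
    m = q1 * x1 + q2 * x2
    v = q1 * x1 ** 2 + q2 * x2 ** 2 - m ** 2
    abs_m3 = q1 * abs(x1) ** 3 + q2 * abs(x2) ** 3
    print(f"p = {p:.1f}  mean = {m:.3f}  var = {v:.3f}  E|Y|^3 = {abs_m3:.3f}")
```

By continuity of $\mathsf{E}|Y|^3$ in p, the intermediate value theorem supplies a parameter value matching the third absolute moment of X, in the spirit of Lemma 3.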
Lemma 1. Let
Then, for all
such that
the inequality
holds, where
with equality attained exactly at the two points:
and .

Remark 1. In ([38], Lemma 1), it was demonstrated that for any
and real
such that
the inequality
holds with the same functions
as in Lemma 1 for the case where
with equality attained exactly at the two points
and .

Let
be the left-hand and right-hand sides of (20), respectively.
Figure 3, Figure 4 and Figure 5 illustrate that several variants of the location of the tangency points of the functions f and g with respect to the stationary points of g are possible. On the left side of these figures are the plots of f (solid line) and g (dotted line), whereas on the right side, for clarity, their difference is plotted.
Proof. By virtue of relations (21)–(24), the problem is reduced by a scale transformation to the case considered below. We let
The coefficients a, b, c, and d given in the formulation of the lemma were constructed so that the points u and v are the tangency points of the functions f and g; that is, these coefficients are defined as the solution to the following system of four linear equations:
Next, we prove that the required inequality holds for all admissible values of the argument.
(1a) Let
. We have
Since
it suffices to show that
We have
therefore,
and
. Moreover,
if and only if
.
(1b) Let
. Then
Since
it suffices to show that
We have
therefore,
and
. Moreover,
if and only if
.
(1c) Let
. Then
moreover,
if and only if
(as was proved above),
Taking into account the relations
we have
. Moreover,
if and only if
. Note that
therefore,
, and
Hence,
increases, and, taking into account
, we find that
for
; that is,
decreases for
. Since
we have
for
.
2. Now let
. We have
(2a) Let
. Then
where
Note that
with the equality attained iff
. Therefore, it suffices to show that
. However, this follows from the relations
(2b) Let
. Then
where
Note that
with the equality attained iff
. Therefore, it suffices to show that
. However, this follows from the relations
(2c) Let
. For all
, we have
Moreover,
Consider the case
Since
the function
is concave. Since
the function
has at most one root
on the interval
. Moreover,
for
and
for
. Therefore,
either increases on the whole interval
(if
is nonnegative), or increases on
and decreases on
, so that
Since
and
, we have
.
Now consider the case
. In this case,
is convex. Note that
Since
, and
is convex, the function
has exactly one root
on the interval
. Moreover,
for
and
for
. So, the function
h increases on the interval
and decreases on
. Therefore,
Taking into account
and
, we have
for all
. □
Lemma 1 trivially yields the following statement.
Lemma 2. For any
and every
such that
the inequality
holds, with equality attained iff the distribution of the r.v. X is concentrated in the two points:
and

By , let us denote the class of all non-degenerate two-point distributions. Obviously, .
Lemma 3. For any
moreover, the least upper bound on the right-hand side can be attained only on two-point distributions.

Proof. It suffices to prove that for any
and r.v.
X with
there exists a two-point r.v.
Y with
satisfying the inequality
Indeed, the above moment conditions imply that
where only equality is possible since
.
(1) Let
. Consider a two-point r.v.
that takes values
with probabilities
p and
, respectively, and satisfies
Then we necessarily have
We show that
iff
. We have
The last inequality trivially holds for
since the left-hand side is positive, whereas the right-hand side is non-positive. If
, then both sides of this inequality are positive. Therefore, they can be squared:
Unifying the intervals under consideration, we obtain the desired statement. Note that on
the function
of the argument
p takes all the values from the interval
because, for any
, we have
and
is continuous. Hence, for every
there exists
such that
Furthermore, note that
and, hence, the couple
satisfies all the conditions of Lemma 2, according to which, taking into account the definition of the r.v.
, we have
where the equality is attained iff the distribution of the r.v.
X is concentrated in exactly two points
and
; that is, iff
. Therefore, the desired statement holds with the r.v.
.
(2) Now let
. By virtue of Jensen's inequality for the strictly convex function
, we have
where equality holds iff
The condition
immediately implies that, in this case, the r.v.
X must have a two-point distribution of the form
. So, the desired statement holds with
. □