Stieltjes and Hamburger Reduced Moment Problem When MaxEnt Solution Does Not Exist

Pier Luigi Novi Inverardi; Aldo Tagliani

doi:10.3390/math9040309

Abstract

For a given set of moments whose predetermined values represent the available information, we consider the case where the Maximum Entropy (MaxEnt) solutions for Stieltjes and Hamburger reduced moment problems do not exist. Genuinely relying upon MaxEnt rationale we find the distribution with largest entropy and we prove that this distribution gives the best approximation of the true but unknown underlying distribution. Despite the nice properties just listed, the suggested approximation suffers from some numerical drawbacks and we will discuss this aspect in detail in the paper.

Keywords:

probability distribution; Stieltjes and Hamburger reduced moment problem; entropy; maximum entropy; moment space

1. Problem Formulation and MaxEnt Rationale

In the context of testable information, that is, a statement about a probability distribution whose truth or falsity is well defined, the principle of maximum entropy states that the probability distribution which best represents the current state of knowledge is the one with the largest entropy. In this spirit, Maximum Entropy (MaxEnt) methods are traditionally used to select a probability distribution when some (prior) knowledge about the true probability distribution is available and several (possibly infinitely many) different probability distributions are consistent with it. In such a situation, MaxEnt methods are appropriate tools for inference about the true but unknown underlying distribution generating the observed data.

Suppose that X is an absolutely continuous random variable having probability density function (pdf) f defined on an unbounded support

S_{X}

and that

{μ_{k}^{*}}_{k = 1}^{M}

, with

μ_{0}^{*} = 1

, be M finite integer moments whose values are pre-determined that is,

μ_{k}^{*} = \int_{S_{X}} x^{k} f (x) d x, k = 0, \dots, M,

(1)

for an arbitrary

M \in N

. Quantities such as those in (1) represent the available (pre-determined) information about X.

The Stieltjes (Hamburger) reduced moment problem [1] consists of recovering an unknown pdf f, having support

S_{X} = R^{+}

(

S_{X} = R

), from the knowledge of the prescribed moment set

{μ_{k}^{*}}_{k = 1}^{M}

.

Due to the non-uniqueness of the recovered density, the best choice among the (potentially, infinite) competitors may be done by invoking the Maximum Entropy (MaxEnt) principle [2] which consists in maximizing the Shannon-entropy

H_{f} = - \int_{S_{X}} f (x) ln f (x) d x

under the constraints (1). Since entropy may be regarded as an objective measure of the uncertainty in a distribution, “... the MaxEnt distribution is uniquely determined as the one which is maximally non-committal with regard to the missing information” ([2], p. 623), so that “... It agrees with what is known but expresses maximum uncertainty with respect to all other matters, and thus leaves a maximum possible freedom for our final decisions to be influenced by the subsequent sample data” ([3], p. 231). In other words, the MaxEnt method dictates the most "reasonable and objective" distribution subject to the given constraints.

More formally, in such a situation we have to solve a constrained optimization problem involving the Shannon entropy and a set of given constraints (here the first M integer moments and the normalization constraint given by (1) when

k = 0

).

This problem is typically solved using the method of Lagrange multipliers leading to a MaxEnt distribution whose density function

f_{M}

is given by [4], p. 59

f_{M ∣ (μ_{1}^{*}, \dots, μ_{M}^{*})} (x) = exp (- \sum_{j = 0}^{M} λ_{j} x^{j}),

(2)

fulfils the given constraints

{μ_{k}^{*}}_{k = 0}^{M}

since

μ_{k}^{*} = \int_{S_{X}} x^{k} f_{M} (x) d x, k = 0, 1, . . ., M

(3)

and has entropy

H_{f_{M}} (μ_{1}^{*}, \dots, μ_{M}^{*}) = - \int_{S_{X}} f_{M} (x) ln f_{M} (x) d x = λ_{0} + \sum_{k = 1}^{M} λ_{k} μ_{k}^{*}

(4)

where

H_{f_{M}} (μ_{1}^{*}, \dots, μ_{M}^{*}) = max_{f \in C_{M}} H_{f} (μ_{1}^{*}, \dots, μ_{M}^{*})

and

C_{M} = : \{f \geq 0 | \int_{S_{X}} x^{k} f (x) d x = μ_{k}^{*}, k = 0, \dots, M\} = \{f_{(μ_{1}^{*}, \dots, μ_{M}^{*})}\} .

(5)
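For completeness, the entropy expression (4) can be checked directly from the exponential form (2). The short derivation below is only a worked restatement under the assumption that the moment constraints (3) hold and that $μ_{0}^{*} = 1$; it introduces no new symbols.

```latex
\begin{aligned}
H_{f_M} &= -\int_{S_X} f_M(x)\,\ln f_M(x)\,dx
         = \int_{S_X} f_M(x)\sum_{j=0}^{M}\lambda_j x^{j}\,dx \\
        &= \sum_{j=0}^{M}\lambda_j\,\mu_j^{*}
         = \lambda_0 + \sum_{j=1}^{M}\lambda_j\,\mu_j^{*},
         \qquad \mu_0^{*}=1 .
\end{aligned}
```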

From now on, for the sake of brevity, we will write each member f of

C_{M}

omitting the dependency on

(μ_{1}^{*}, \dots, μ_{M}^{*})

, predetermined set of moments; hence, f and

f_{M}

will stand for

f_{(μ_{1}^{*}, \dots, μ_{M}^{*})}

and

f_{M ∣ (μ_{1}^{*}, \dots, μ_{M}^{*})} \in C_{M}

, respectively. The same will be done for the corresponding entropies: we will write

H_{f}

and

H_{f_{M}}

in place of

H_{f} (μ_{1}^{*}, \dots, μ_{M}^{*})

and

H_{f_{M}} (μ_{1}^{*}, \dots, μ_{M}^{*})

.

A few words about our notation are now opportune. Since in the sequel an arbitrary moment

μ_{j}

may play different roles, we use

$μ_{j}^{*}$ for prescribed moments;
$μ_{j}$ for variable (free to vary) moments;
$μ_{j, f_{j - 1}}$ for the j-th moment of $f_{j - 1}$ , that is $μ_{j, f_{j - 1}} = \int_{S_{X}} x^{j} f_{j - 1} (x) d x$ (in general $μ_{j, f_{j - 1}} \neq μ_{j}^{*}$ );
$μ_{j}^{-}$ to indicate the smallest value of $μ_{j}$ , once $(μ_{1}^{*}, \dots, μ_{j - 1}^{*})$ are prescribed.

Our attention is solely addressed towards sequences

{μ_{k}^{*}}_{k = 1}^{\infty}

whose underlying density f has finite entropy

H_{f}

. More precisely, only distributions with

H_{f} = - \infty

are not considered. Indeed, once

{μ_{k}^{*}}_{k = 0}^{\infty}

is assigned,

H_{f} = + \infty

is not feasible, as it is well known in MaxEnt setup that

H_{f} \leq H_{f_{2}} = \frac{1}{2} ln [2 π e (μ_{2}^{*} - {(μ_{1}^{*})}^{2})]

is finite because, by Lyapunov’s inequality,

μ_{2}^{*} - {(μ_{1}^{*})}^{2} > 0

(Hamburger case) and

H_{f} \leq H_{f_{1}} = 1 + ln μ_{1}^{*}

is finite for every

μ_{1}^{*} > 0

(Stieltjes case).
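The two closed-form upper bounds above are easy to evaluate. The following is a minimal sketch; the function names are ours and purely illustrative, and the formulas are exactly the Gaussian ($f_{2}$, Hamburger) and exponential ($f_{1}$, Stieltjes) entropies quoted in the text.

```python
import math


def hamburger_entropy_bound(mu1: float, mu2: float) -> float:
    """Entropy of the Gaussian MaxEnt density f_2 matching (mu1, mu2)."""
    variance = mu2 - mu1 ** 2
    if variance <= 0.0:
        # Lyapunov's inequality requires mu2 > mu1**2 for a genuine density.
        raise ValueError("need mu2 > mu1**2 (positive variance)")
    return 0.5 * math.log(2.0 * math.pi * math.e * variance)


def stieltjes_entropy_bound(mu1: float) -> float:
    """Entropy of the exponential MaxEnt density f_1 with mean mu1 > 0."""
    if mu1 <= 0.0:
        raise ValueError("need mu1 > 0")
    return 1.0 + math.log(mu1)
```

For instance, `hamburger_entropy_bound(0.0, 1.0)` gives the entropy of the standard Normal distribution, and `stieltjes_entropy_bound(1.0)` gives the entropy of the unit-mean exponential distribution.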

Here

(λ_{0}, \dots, λ_{M})

is the vector of Lagrange multipliers, with

λ_{M} \geq 0

to guarantee integrability of

f_{M}

. If it is possible to determine Lagrange multipliers from the constraints

{μ_{k}^{*}}_{k = 1}^{M}

then the moment problem admits solution and

f_{M}

is the MaxEnt solution (which is unique in S due to the strict concavity of (4)).

The above non negativity condition on

λ_{M}

, which is a consequence of the unbounded support

S_{X}

, is crucial and renders the moment problem solvable only under certain restrictive assumptions on the prescribed moment vector

(μ_{1}^{*}, \dots, μ_{M}^{*})

. This is the ultimate reason upon which the present paper relies.

The existence conditions of the MaxEnt solution

f_{M}

have been investigated in depth in the literature ([5,6,7,8,9], to mention just some widely cited papers); over the years an intense debate, combining the results of the above papers, has established the correct existence conditions underlying the Stieltjes and Hamburger moment problems (more details on this topic may be found in Appendix A).

On the other hand, when the existence conditions for

f_{M}

are not satisfied, the non-existence of the MaxEnt solution in the Stieltjes and Hamburger reduced moment problems raises a series of interesting and important questions about how to find an approximant of the unknown density f that is least committed to the information not given to us (still obeying Jaynes’ Principle). This problem is addressed in the present paper.

More formally, take

C_{M}

to be the set of the density functions satisfying the

M + 1

moment constraints (that is, they share the same

M + 1

predetermined moments) and let

μ (C_{M})

be the moment space associated to

C_{M}

; hence, the indeterminacy of the moment problem (1) follows.

A common way to regularize the problem, as recalled before, consists in applying the MaxEnt Principle obtaining

E_{M}

, the set of MaxEnt density functions, which is a subset of

C_{M}

; consequently, let

μ (E_{M})

be the moment space relative to the set of MaxEnt density functions

E_{M}

. Because, in general,

μ (C_{M})

strictly includes

μ (E_{M})

there are admissible moment vectors in

Int (μ (C_{M}))

, the interior of

μ (C_{M})

, for which the moment problem (1) is solvable but the MaxEnt problem (3) has no solution and the usual regularization based on MaxEnt strategy is therefore precluded.

The implications of this issue are often understated in practical applications, where the usual procedure is limited to one of the following:

1. In the Stieltjes or symmetric Hamburger case, replacing the support $R^{+}$ ( $R$ ) with an arbitrarily large interval $[a, b]$ . The problem is then solved numerically within the bounded interval $[a, b]$ , which turns the original Stieltjes (Hamburger) moment problem into a Hausdorff one. In the MaxEnt setup, the latter admits a solution for each set ${μ_{k}^{*}}_{k = 1}^{M} \in Int (μ (C_{M}))$ ([7], Theorem 2).
2. If $f_{M}$ does not exist (a conclusion drawn solely from numerical evidence), falling back on $f_{M - 1}$ (Stieltjes case) or $f_{M - 2}$ (symmetric Hamburger case), which always exist (see Appendix A). In such a case, although the first M moments are known, we have to settle for a density constrained by ${μ_{k}^{*}}_{k = 1}^{M - 1}$ or ${μ_{k}^{*}}_{k = 1}^{M - 2}$ . This is not fully coherent with the MaxEnt principle, which prescribes using not merely some but all of the available information; hence discarding available information seems to conflict conceptually with the MaxEnt spirit. However, from the point of view of practical applications, whether or not the prescribed moment $μ_{M}^{*}$ is taken into account seems to have negligible effects on the summary quantities of the underlying distribution (mostly expected values of suitable functions) in which we may be interested. We will return to this issue, after having carefully motivated and proved the proposed solution, in the last section of the paper, devoted to discussion and conclusions.

We call the solutions 1. and 2. “forced” pseudo-solutions; they might indeed lead to the unpleasant fact that a MaxEnt solution always exists, although the original Stieltjes (Hamburger) moment problem does not admit any solution. Hence the crucial question is: does there exist a way to regularize the (indeterminate) moment problem (1) coherently with all and only the available information, exploiting the MaxEnt rationale without forcing unnatural solutions, i.e., solutions based on a totally inappropriate application of the MaxEnt principle?

Before proceeding, recall

C_{M}

and define the following class of density functions:

{\tilde{C}}_{M} = : \{f \geq 0 | \int_{S_{X}} x^{k} f (x) d x = μ_{k}^{*}, k = 0, \dots, M, μ_{M + 1} = + \infty\}

(6)

with

{\tilde{C}}_{M} \subset C_{M}

, whose entries satisfy the given constraints expressed in terms of

M + 1

assigned integer moments

μ_{k}^{*} = E (X^{k}), k = 0, 1, \dots, M

.

Now the question is: once

{μ_{k}^{*}}_{k = 1}^{M} \in μ (C_{M}) \ μ (E_{M})

are pre-determined (that is, the MaxEnt problem does not admit solution) what is the optimal choice of the pdf that we can select in place of

f_{M}

? Relying upon the MaxEnt rationale, the best substitute of the missing

f_{M}

should be given by a suitable

{\tilde{f}}_{M} \in C_{M}

having the overall largest entropy; that is, select

{\tilde{f}}_{M} \in C_{M}

actually satisfying the relationship

sup_{f \in C_{M}} H_{f} - H_{{\tilde{f}}_{M}} < ε

(7)

for an arbitrarily small

ε

.

We aim to find

{sup}_{f \in C_{M}} H_{f}

,

{\tilde{f}}_{M}

and the corresponding entropy

H_{{\tilde{f}}_{M}}

, proving that it may be accomplished by MaxEnt machinery (see Equations (9)–(11) below).

The remainder of the paper is organized as follows. Section 2 and Section 3 are devoted to evaluating the best pdf in the Stieltjes and Hamburger cases, respectively. Section 4 is devoted to numerical aspects, and Section 5 rounds off with some concluding remarks. In Appendix A, the existence conditions of MaxEnt distributions in the Stieltjes and Hamburger cases are briefly reviewed.

2. Stieltjes Case

Let us consider

{μ_{k}^{*}}_{k = 1}^{M} \in μ (C_{M}) \ μ (E_{M})

; consequently

f_{M}

does not exist. In this section we formally justify the rationale and the optimality of the proposed substitute

{\tilde{f}}_{M}

of the MaxEnt density

f_{M}

. We deal with the issue of selecting the “best” pdf that both satisfies the constraints (given by the predetermined integer moments) and has the overall largest entropy.

Before starting, some relevant facts need to be collected. Since the MaxEnt density

f_{M}

does not exist, both

f_{M - 1}

with its M-th moment

μ_{M, f_{M - 1}} < μ_{M}^{*}

and

f_{M + 1} = f_{M + 1} (μ_{M + 1}) = f_{M + 1 ∣ μ_{1}^{*}, \dots, μ_{M}^{*}, μ_{M + 1}} \in C_{M}

exist; the latter exists for any value

μ_{M + 1} > μ_{M + 1}^{-}

(see Appendix A for more details).

Since the procedure adopted here remains valid for each value

μ_{M}^{*} > μ_{M, f_{M - 1}}

, as

μ_{M}^{*} \to + \infty

, from Lyapunov’s inequality we have

μ_{M + 1} > {(μ_{M}^{*})}^{(1 + \frac{1}{M})}

and consequently

μ_{M + 1} \to + \infty

too. Moreover, since the MaxEnt density does not exist, some additional information not given to us must be added; of course,

μ_{M + 1}

is the most suitable candidate to represent it.
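The Lyapunov bound quoted above can be evaluated directly. The sketch below is only illustrative (the function name is hypothetical) and assumes a non-negative random variable, as in the Stieltjes case; it returns the threshold that $μ_{M + 1}$ cannot fall below once $μ_{M}^{*}$ is fixed.

```python
def lyapunov_lower_bound(mu_M: float, M: int) -> float:
    """Lyapunov threshold (mu_M)**(1 + 1/M) for the (M+1)-th moment.

    Assumes a non-negative random variable, so that ordinary and
    absolute moments coincide.
    """
    if M < 1:
        raise ValueError("M must be a positive integer")
    if mu_M <= 0.0:
        raise ValueError("mu_M must be positive")
    return mu_M ** (1.0 + 1.0 / M)
```

As $μ_{M}^{*}$ grows, this threshold grows as well, which is the mechanism by which $μ_{M + 1}$ is pushed to infinity.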

Once this is established, the relevant question is: what value should be assigned to

μ_{M + 1}

? Recalling that

H_{f_{M + 1}} (μ_{1}^{*}, \dots, μ_{M}^{*}, μ_{M + 1}) = H_{f_{M + 1}} (μ_{M + 1})

is monotonically increasing ([4], Equation (2.73), p. 59), with upper bound

H_{f_{M - 1}}

so that

{lim}_{μ_{M + 1} \to \infty} H_{f_{M + 1}} (μ_{M + 1})

exists,

μ_{M + 1}

should take the largest possible value, so that the decrease in entropy is as small as possible.

Since

{\{μ_{k}^{*}\}}_{k = 1}^{M} \in μ (C_{M}) \ μ (E_{M})

,

f_{M}

and then

H_{f_{M}}

are meaningless,

C_{M}

includes infinitely many f and

{sup}_{f \in C_{M}} H_{f}

must be calculated. The moment set

(μ_{1}^{*}, \dots, μ_{M - 1}^{*}, μ_{M}^{*} > μ_{M, f_{M - 1}}, μ_{M + 1})

is considered too, where

μ_{M, f_{M - 1}}

is the M-th order moment of

f_{M - 1}

and

μ_{M + 1}

varies continuously within

μ (E_{M + 1})

with

(μ_{1}^{*}, \dots, μ_{M}^{*}, μ_{M + 1}^{-}) \in \partial μ (C_{M + 1}) .

(8)

If

f_{M + 1} (μ_{M + 1})

is the density corresponding to the set of moments (8), the following theorem holds.

Theorem 1.

The following two relationships hold

sup_{f \in C_{M}} H_{f} = H_{f_{M - 1}}

(9)

and

lim_{μ_{M + 1} \to \infty} H_{f_{M + 1}} (μ_{M + 1}) = H_{f_{M - 1}} .

(10)

Now,

{\tilde{f}}_{M}

is identified with

f_{M + 1} ({\tilde{μ}}_{M + 1})

where

{\tilde{μ}}_{M + 1}

is such that

H_{f_{M - 1}} - H_{f_{M + 1}} ({\tilde{μ}}_{M + 1}) < ε

(11)

and ε indicates a fixed tolerance.

Proof.

If

f_{M}

does not exist,

f_{M - 1}

exists with entropy

H_{f_{M - 1}}

and M-th moment

μ_{M, f_{M - 1}} < μ_{M}^{*}

. The function

H_{f_{M}} (μ_{M})

, with

μ_{M} > μ_{M}^{-}

, is monotonically increasing. Two cases are possible:

If $μ_{M}^{*} \leq μ_{M, f_{M - 1}}$ , one has ${sup}_{f \in C_{M}} H_{f} = {max}_{f \in C_{M}} H_{f} = H_{f_{M}} \leq H_{f_{M - 1}}$ . The latter represents the maximum attainable entropy once $(μ_{1}^{*}, \dots, μ_{M}^{*})$ are prescribed;
If $μ_{M}^{*} > μ_{M, f_{M - 1}}$ , from the monotonicity of $H_{f_{M}} (μ_{M})$ it follows that ${sup}_{f \in C_{M}} H_{f} = H_{f_{M - 1}}$ , independently of $μ_{M}^{*}$ . Equivalently, $H_{f_{M - 1}}$ is a strict upper bound for the entropies of all densities which have the same lower moments $(μ_{1}^{*}, . . ., μ_{M - 1}^{*})$ as $f_{M - 1}$ but whose highest moment $μ_{M}^{*}$ exceeds $μ_{M, f_{M - 1}}$ .

Hence Equation (9) is proved.

Let us now consider the suitable class

E_{M + 1} (μ_{M + 1}) = \{f_{M + 1} (μ_{M + 1}) = f_{M + 1 ∣ μ_{1}^{*}, \dots, μ_{M}^{*}, μ_{M + 1}} (x)\}

(12)

where

μ_{M + 1} \in (μ_{M + 1}^{-}, \infty)

is assumed as parameter and

μ_{M, f_{M - 1}}

is the M-th order moment of

f_{M - 1}

. Equivalently, the entries of

E_{M + 1} (μ_{M + 1})

are MaxEnt pdfs constrained by

(μ_{1}^{*}, . . ., μ_{M}^{*}, μ_{M + 1})

, belong to

C_{M}

and, primarily, they all have analytically tractable entropy.

In (5), (6) and (12) three classes of functions

C_{M}

,

{\tilde{C}}_{M}

and

E_{M + 1} (μ_{M + 1})

have been defined. Relying upon the identity

C_{M} = E_{M + 1} (μ_{M + 1}) \cup (C_{M} \ E_{M + 1} (μ_{M + 1}) \ {\tilde{C}}_{M}) \cup {\tilde{C}}_{M}

we investigate the entropy

H_{f}

of functions f belonging to (a)

E_{M + 1} (μ_{M + 1})

, (b)

C_{M} \ E_{M + 1} (μ_{M + 1}) \ {\tilde{C}}_{M}

and (c)

{\tilde{C}}_{M}

respectively.

(a) $H_{f_{M + 1}} (μ_{M + 1})$ , bounded from above by $H_{f_{M - 1}}$ , is a differentiable, monotonically increasing function of $μ_{M + 1}$ ; hence it tends to a finite limit, so that

$sup_{f_{M + 1} \in E_{M + 1} (μ_{M + 1})} H_{f_{M + 1}} (μ_{M + 1}) = lim_{μ_{M + 1} \to \infty} H_{f_{M + 1}} (μ_{M + 1}) .$
(b) Each $f \in C_{M} \ E_{M + 1} (μ_{M + 1}) \ {\tilde{C}}_{M}$ has entropy $H_{f}$ and a finite $(M + 1)$ -th moment, say $μ_{M + 1, f}$ . Since f and $f_{M + 1 ∣ μ_{M + 1, f}}$ share the same moments $(μ_{1}^{*}, . . ., μ_{M}^{*}, μ_{M + 1, f})$ , the inequality $H_{f} \leq H_{f_{M + 1}} (μ_{M + 1, f})$ holds, from which

$sup_{f \in (C_{M} \ E_{M + 1} (μ_{M + 1}) \ {\tilde{C}}_{M})} H_{f} \leq sup_{f_{M + 1} \in E_{M + 1} (μ_{M + 1})} H_{f_{M + 1}} (μ_{M + 1}) = lim_{μ_{M + 1} \to \infty} H_{f_{M + 1}} (μ_{M + 1}) .$
(c) In analogy with (12), let us introduce the following class

$C_{M + 1} (μ_{M + 1}) = \{f (μ_{M + 1}) = f_{∣ μ_{1}^{*}, \dots, μ_{M}^{*}, μ_{M + 1}} (x)\}$

where $μ_{M + 1} > μ_{M + 1}^{-}$ assumes arbitrary values. For a fixed $μ_{M + 1}$ , each $f \in C_{M + 1} (μ_{M + 1})$ satisfies the following inequality

$H_{f} \leq sup_{f \in C_{M + 1} (μ_{M + 1})} H_{f} = H_{f_{M + 1}} (μ_{M + 1}) .$

Taking $μ_{M + 1} \to \infty$ , $C_{M + 1} (μ_{M + 1})$ coincides with ${\tilde{C}}_{M}$ and then

$sup_{f \in {\tilde{C}}_{M}} H_{f} = lim_{μ_{M + 1} \to \infty} H_{f_{M + 1}} (μ_{M + 1}) .$

Collecting the results obtained in items (a), (b), (c) above and taking into account (9), one has

sup_{f \in C_{M}} H_{f} = lim_{μ_{M + 1} \to \infty} H_{f_{M + 1}} (μ_{M + 1}) = H_{f_{M - 1}} .

Hence Equation (10) is also proved.

Equation (10) is restated as follows: if

ε

indicates a fixed tolerance, there exists a value

{\tilde{μ}}_{M + 1}

of

μ_{M + 1}

such that

H_{f_{M - 1}} - H_{f_{M + 1}} ({\tilde{μ}}_{M + 1}) < ε

holds. Next

{\tilde{f}}_{M}

is identified with

f_{M + 1} ({\tilde{μ}}_{M + 1})

so that its entropy

H_{{\tilde{f}}_{M}}

coincides with

H_{f_{M + 1}} ({\tilde{μ}}_{M + 1})

, from which the desired result

H_{f_{M - 1}} - H_{{\tilde{f}}_{M}} < ε

(or, equivalently, (7)) follows. As a consequence,

{\tilde{f}}_{M}

is the proposed substitute of

f_{M}

and Equation (11) is proved. □

In conclusion:

Although $f_{M}$ does not exist and the usual MaxEnt approach fails, a solution is recovered through (9)–(11), from which the desired result (7) follows.
When the MaxEnt density $f_{M}$ exists it is unique, unlike ${\tilde{f}}_{M}$ , which depends on the assumed tolerance. This remark will be used in the numerical examples below.

3. Hamburger Case

The non-symmetric Hamburger case with M even is disregarded here because the existence of the MaxEnt solution

f_{M}

is guaranteed. Now, we will concentrate our attention on the symmetric case with

M \geq 4

and on the non-symmetric case with

M \geq 3

odd. In both cases, thanks to the MaxEnt formalism, the procedure used in the Stieltjes case can be extended to the Hamburger one (see [9]); this is one of the main advantages of the MaxEnt machinery.

3.1. Symmetric Case with $M \geq 4$ Even

We recall that

f_{M}

is a symmetric function for every even M, so that the odd-indexed Lagrange multipliers vanish, that is

λ_{2 j - 1} = 0

.

Theorem 2.

Suppose the moment set

(μ_{1}^{*}, . . ., μ_{M}^{*})

is prescribed and

f_{M}

does not exist. The symmetric Hamburger case is analogous to the Stieltjes one, and Theorem 1 holds true with

μ_{M + 1}

replaced by

μ_{M + 2}

and

μ_{M - 1}

by

μ_{M - 2}

.

Proof.

It suffices to recall that if

f_{M}

does not exist for a prescribed moment set

(μ_{1}^{*}, . . ., μ_{M}^{*})

then

f_{M - 2}

exists with its next moments

μ_{M - 1, f_{M - 2}}

,

μ_{M, f_{M - 2}}

and entropy

H_{f_{M - 2}}

. If

μ_{M}^{*} = μ_{M, f_{M - 2}}

holds, then

H_{f_{M - 2}}

is the maximum attainable entropy. In analogy with (12) let

E_{M + 2} = \{f_{M + 2} (μ_{M + 2}) = : f_{M + 2 ∣ (μ_{1}^{*}, \dots, μ_{M}^{*}, μ_{M + 1}^{*} = 0, μ_{M + 2})}, μ_{M + 2} \in (μ_{M + 2}^{-}, \infty)\}

where the parameter

μ_{M + 2}

is introduced and thanks to MaxEnt machinery the proof continues analogously to the Stieltjes case. □

3.2. Non-Symmetric Case with $M \geq 3$ Odd

If M is odd

f_{M}

does not exist for every set of moments belonging to

μ (C_{M})

because

\int_{R} f_{M} (x) d x = + \infty

. We now look at the problem from a different point of view. Suppose

(μ_{1}^{*}, . . ., μ_{M}^{*}) \in μ (C_{M})

is prescribed. In general

f_{M - 1}

exists (equivalently,

(μ_{1}^{*}, . . ., μ_{M - 1}^{*}) \in μ (E_{M - 1})

(see Appendix A)), with its M-th moment

μ_{M, f_{M - 1}}

. Two alternatives are possible:

1. $μ_{M}^{*} = μ_{M, f_{M - 1}}$ : then $f_{M}$ with $λ_{M} = 0$ exists and coincides with $f_{M - 1}$ , and the usual MaxEnt method may be used;
2. $μ_{M}^{*} \neq μ_{M, f_{M - 1}}$ (the highly probable case): then $f_{M}$ does not exist.

Both items 1. and 2. recall the Stieltjes case; more precisely, item 2. may be solved taking into account

f_{M + 1} (μ_{M + 1}) \in E_{M + 1}

(which exists for each

μ_{M + 1} > μ_{M + 1}^{-}

). Then the non-symmetric case, with M odd and for every set of moments belonging to

μ (C_{M})

, is solved analogously to the Stieltjes case and Theorem 1 holds true. A consequence of the results achieved in this section is the following. Let us consider a non-symmetric Hamburger moment problem. In Theorem 1 we proved

H_{f_{M - 1}} = sup_{f \in C_{M}} H_{f} = lim_{μ_{M + 1} \to \infty} H_{f_{M + 1}} (μ_{M + 1}) .

Since the entropy is monotonically non-increasing as M increases, the latter equalities enable us to set

H_{f_{M}} = H_{f_{M - 1}}

. As a consequence, the two subsequences

{H_{f_{2 M}}}

and

{H_{f_{2 M - 1}}}

have coinciding entries. In a previous paper, Milev and Tagliani proved that

{H_{f_{2 M}}}

converges to

H_{f}

([10], Theorem 1). Joining together the two achievements, both

{H_{f_{2 M}}}

and

{H_{f_{2 M - 1}}}

converge to the same limit

H_{f}

, then so does

{H_{f_{M}}}

, filling the gaps left by even moments.

4. Numerical Aspects

The procedure described above, rooted in MaxEnt machinery, suffers from some numerical drawbacks, which are discussed here. It is worth recalling that similar drawbacks were previously found ([11,12]), although only for the special value

M = 4

in the Hamburger case, exploring special regions of the moment space. Essentially, numerical troubles arise because the expected solution

f_{M + 1}

is contaminated with a small wiggle that (a) moves to infinity, (b) is scaled in such a way that its contribution to the (M + 1)-th order moment

μ_{M + 1}

is always

O (1)

and (c) may become invisible to numerical methods of quadrature.

Now we provide some theoretical grounds to justify the above heuristics, which hold true in both the Hamburger and Stieltjes cases thanks to the MaxEnt formalism. First of all, under the constraints

(μ_{1}^{*}, \dots, μ_{M}^{*}, μ_{M + 1})

, we prove that the wiggle exists. To this end, both the relationships

λ_{M} < 0

for each

μ_{M + 1}

and

λ_{M + 1} \to 0

as

μ_{M + 1} \to \infty

have to be proved.

$λ_{M} < 0$ for each $μ_{M + 1}$ . As recalled in Appendix A, if $f_{M - 1}$ exists with its M-th moment $μ_{M, f_{M - 1}}$ , the MaxEnt density $f_{M}$ does not exist when $μ_{M}^{*} > μ_{M, f_{M - 1}}$ . Let us consider $f_{M}$ as $μ_{M}$ varies continuously. Then $λ_{M}$ is monotonically decreasing in $μ_{M}$ ([9], Equation (2.1)), with $λ_{M} = 0$ at $μ_{M} = μ_{M, f_{M - 1}}$ . As a consequence, no set $(λ_{1}, . . ., λ_{M})$ satisfies the constraints ( $μ_{1}^{*}, \dots, μ_{M}^{*}$ ), since the monotonicity of $λ_{M}$ would require $λ_{M} < 0$ , which violates integrability. Let us now consider $f_{M + 1}$ as $μ_{M + 1}$ varies continuously. Here, for each $μ_{M + 1}$ , $λ_{M + 1} > 0$ guarantees integrability, so that $λ_{M}$ may assume any real value. Collecting the results about $λ_{M}$ , the set $(λ_{1}, \dots, λ_{M - 1}, λ_{M} < 0, λ_{M + 1})$ satisfies the constraints $(μ_{1}^{*}, \dots, μ_{M}^{*}, μ_{M + 1})$ for each $μ_{M + 1}$ . Equivalently, we can say that $(λ_{1}, \dots, λ_{M - 1}, λ_{M} < 0)$ serve to meet $(μ_{1}^{*}, \dots, μ_{M}^{*})$ , whilst $λ_{M + 1}$ serves to meet $μ_{M + 1}$ .
$λ_{M + 1} \to 0$ as $μ_{M + 1} \to \infty$ . Differentiating (4), written for $f_{M + 1}$ , with respect to $μ_{M + 1}$ and recalling the relationship ([9], Equation (2.1))

$\sum_{j = 0}^{M} μ_{j}^{*} \frac{d λ_{j}}{d μ_{M + 1}} + μ_{M + 1} \frac{d λ_{M + 1}}{d μ_{M + 1}} = 0$

one has

$\frac{d H_{f_{M + 1}} (μ_{M + 1})}{d μ_{M + 1}} = \sum_{j = 0}^{M} μ_{j}^{*} \frac{d λ_{j}}{d μ_{M + 1}} + μ_{M + 1} \frac{d λ_{M + 1}}{d μ_{M + 1}} + λ_{M + 1} = λ_{M + 1} .$

By Theorem 1, $H_{f_{M + 1}} (μ_{M + 1}) \to H_{f_{M - 1}}$ as $μ_{M + 1} \to \infty$ , so that $\frac{d H_{f_{M + 1}} (μ_{M + 1})}{d μ_{M + 1}} \to 0$ and then $λ_{M + 1} \to 0$ too.
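The identity between the derivative of the entropy with respect to the highest moment and the corresponding Lagrange multiplier can be checked by hand in the simplest analogous setting. The worked example below is our own illustration, not part of the proof: it takes the Stieltjes case with a single prescribed moment $μ_{1}$, where the MaxEnt density is exponential and everything is available in closed form.

```latex
\begin{aligned}
f_1(x) &= \exp(-\lambda_0 - \lambda_1 x), \quad x \ge 0,
 &\lambda_1 &= \frac{1}{\mu_1}, \quad \lambda_0 = \ln \mu_1, \\
H_{f_1}(\mu_1) &= \lambda_0 + \lambda_1 \mu_1 = 1 + \ln \mu_1,
 &\frac{d H_{f_1}}{d \mu_1} &= \frac{1}{\mu_1} = \lambda_1 .
\end{aligned}
```

Here the highest multiplier also tends to zero as the highest moment tends to infinity, mirroring the behaviour of $λ_{M + 1}$.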

We are ready to prove the statement concerning the fact that

f_{M + 1} (μ_{M + 1})

exhibits a small wiggle at

x ≫ 1

(analogously, in symmetric Hamburger case the wiggle is exhibited at

∣ x ∣ ≫ 1

). At

x ≫ 1

f_{M + 1} (x) \sim exp (- λ_{M} x^{M} - λ_{M + 1} x^{M + 1})

so that

f_{M + 1}

admits maximum value at

x_{wig} = - \frac{M λ_{M}}{(M + 1) λ_{M + 1}} > 0 .

As

μ_{M + 1}

increases we proved the relationships

λ_{M} < 0

and

λ_{M + 1} \to 0

, so that

x_{wig} > 0

moves to infinity (from numerical evidence, as

μ_{M + 1}

increases,

∣ λ_{M} ∣ \to 0

too, but much more slowly than

λ_{M + 1}

). Since

f_{M + 1}

has finite moments

(μ_{1}^{*}, . . ., μ_{M}^{*})

for each

μ_{M + 1}

, it follows that the wiggle, a compact packet, is scaled in such a way that its contribution to the (M + 1)-th order moment

μ_{M + 1}

is always

O (1)

(whilst for all higher moments the contribution due to this maximum obviously grows without bound, as a consequence of Lyapunov’s inequality).
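To make the location of the wiggle concrete, the sketch below evaluates $x_{wig}$ from the two leading multipliers and the leading-order log-density used in the asymptotic argument. The function names and the sample multiplier values in the usage note are hypothetical and serve only as an illustration.

```python
def wiggle_location(M: int, lam_M: float, lam_M1: float) -> float:
    """Stationary point x_wig = -M*lam_M / ((M+1)*lam_M1) of the tail exponent."""
    if lam_M >= 0.0:
        raise ValueError("a wiggle requires lam_M < 0")
    if lam_M1 <= 0.0:
        raise ValueError("integrability requires lam_M1 > 0")
    return -M * lam_M / ((M + 1) * lam_M1)


def tail_log_density(x: float, M: int, lam_M: float, lam_M1: float) -> float:
    """Leading-order log of f_{M+1}(x) for large x (lower-order terms dropped)."""
    return -lam_M * x ** M - lam_M1 * x ** (M + 1)
```

With, say, `M = 2` and `lam_M = -0.01`, decreasing `lam_M1` from `1e-4` to `1e-5` moves the value returned by `wiggle_location` ten times further out, in line with the statement that the wiggle moves to infinity as $λ_{M + 1} \to 0$.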

An additional complication comes from the fact that the height and position of the wiggle are extremely sensitive to the parameters

λ_{M}

and

λ_{M + 1}

, so that it becomes progressively smaller until it is “invisible” if an unsuitable numerical quadrature method is adopted. As a consequence, the procedure becomes increasingly ill-conditioned, to such a degree that numerical error precludes finding a suitable solution. As a remedy, for instance, the quadrature on the unbounded domain can be mapped onto a finite interval, and an adaptive quadrature is required. Since the wiggle moves along the x-axis as

μ_{M + 1}

increases, a fixed nodes quadrature formula could be unsuitable as the wiggle could become invisible for some values of

μ_{M + 1}

.
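The remedy just described can be sketched as follows. This is a minimal illustration under our own assumptions, not the implementation used for the examples: the half-line is mapped onto $[0, 1)$ through the substitution $x = t / (1 - t)$ (one of several possible choices), and a textbook recursive adaptive Simpson rule is applied to the mapped integrand. All function names are hypothetical.

```python
import math


def map_to_unit_interval(g):
    """Return h(t) = g(t/(1-t)) / (1-t)**2, so that int_0^inf g = int_0^1 h."""
    def h(t):
        if t >= 1.0:
            return 0.0  # assumes g decays fast enough for the limit to be 0
        x = t / (1.0 - t)
        return g(x) / (1.0 - t) ** 2
    return h


def adaptive_simpson(f, a, b, tol=1e-10, max_depth=40):
    """Recursive adaptive Simpson rule: refines only where the error is large."""
    def simpson(fa, fm, fb, width):
        return width / 6.0 * (fa + 4.0 * fm + fb)

    def recurse(a, b, fa, fm, fb, whole, tol, depth):
        m = 0.5 * (a + b)
        flm, frm = f(0.5 * (a + m)), f(0.5 * (m + b))
        left = simpson(fa, flm, fm, m - a)
        right = simpson(fm, frm, fb, b - m)
        delta = left + right - whole
        if depth <= 0 or abs(delta) <= 15.0 * tol:
            return left + right + delta / 15.0
        return (recurse(a, m, fa, flm, fm, left, 0.5 * tol, depth - 1)
                + recurse(m, b, fm, frm, fb, right, 0.5 * tol, depth - 1))

    fa, fm, fb = f(a), f(0.5 * (a + b)), f(b)
    whole = simpson(fa, fm, fb, b - a)
    return recurse(a, b, fa, fm, fb, whole, tol, max_depth)


def stieltjes_moment(density, k, tol=1e-10):
    """k-th moment of a density on [0, inf) via the mapped adaptive rule."""
    h = map_to_unit_interval(lambda x: x ** k * density(x))
    return adaptive_simpson(h, 0.0, 1.0, tol)
```

For the unit-mean exponential density `lambda x: math.exp(-x)`, the moments of order 0, 1 and 2 should come out close to 1, 1 and 2. The support itself is never truncated: only the integration variable changes.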

The above remedies are just numerical devices, not a reduction of the Stieltjes or Hamburger problem to a Hausdorff one. Indeed, all the subsequent numerical examples use random variables X having unbounded support

R^{+}

or

R

.

Moreover, the dual formulation, which evaluates

(λ_{1}, . . ., λ_{M + 1})

minimizing the potential function

{λ_{j}}_{j = 1}^{M + 1} : min_{λ_{1}, . . ., λ_{M + 1}} [ln (\int_{S} exp (- \sum_{j = 1}^{M + 1} λ_{j} x^{j}) d x) + \sum_{j = 1}^{M} λ_{j} μ_{j}^{*} + λ_{M + 1} μ_{M + 1}]

avoids the computation of higher moments, which is required by Newton-type methods for solving (3).
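As a minimal sketch of the dual approach in the simplest possible setting, consider the Stieltjes case with a single prescribed moment $μ_{1}$ (not the $M + 1$ constraints used in the paper). There the potential function is available in closed form, $- ln λ_{1} + λ_{1} μ_{1}$, and its minimum equals the entropy $1 + ln μ_{1}$ of the exponential density. The golden-section search and all names below are our own illustrative choices.

```python
import math


def potential_m1(lam1: float, mu1: float) -> float:
    """ln(int_0^inf exp(-lam1*x) dx) + lam1*mu1 = -ln(lam1) + lam1*mu1, lam1 > 0."""
    return -math.log(lam1) + lam1 * mu1


def minimize_unimodal(fun, lo: float, hi: float, iters: int = 200) -> float:
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    for _ in range(iters):
        if fun(c) < fun(d):
            hi = d  # the minimizer lies in [lo, d]
        else:
            lo = c  # the minimizer lies in [c, hi]
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)
```

For `mu1 = 2.0`, minimizing `lambda l: potential_m1(l, 2.0)` over `[1e-6, 10.0]` should return a multiplier close to `0.5`, and the minimum of the potential should be close to `1 + math.log(2.0)`. Only the normalizing integral enters the objective, so no moment of order higher than those prescribed is needed.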

The drawbacks just illustrated lead us to supplement the stopping criterion (7), based on entropy, with a further one based on the moments, which ensures that the relationship

μ_{j, f_{M + 1}} = μ_{j}^{*}, j = 1, . . ., M

holds true. That is,

max_{1 \leq j \leq M} ∣ \frac{μ_{j}^{*} - μ_{j, f_{M + 1}}}{μ_{j}^{*}} ∣ < ε_{1}

(13)

(or involving the absolute error) for a proper

ε_{1}

.
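Criterion (13) amounts to a one-line check. In the sketch below (the function name is hypothetical) we assume, as one possible reading of the remark about the absolute error, that a prescribed moment equal to zero is compared through its absolute error, since the relative error is undefined there.

```python
def max_relative_moment_error(target, achieved, zero_tol=0.0):
    """Largest relative deviation between prescribed and reproduced moments.

    Moments whose prescribed value has magnitude <= zero_tol fall back to
    the absolute error (an assumption made here to avoid division by zero).
    """
    if len(target) != len(achieved):
        raise ValueError("moment lists must have the same length")
    worst = 0.0
    for mu_star, mu in zip(target, achieved):
        err = abs(mu_star - mu)
        if abs(mu_star) > zero_tol:
            err /= abs(mu_star)
        worst = max(worst, err)
    return worst
```

The reconstruction would be accepted when the returned value is below $ε_{1}$.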

The following question arises: is

{\tilde{f}}_{M}

, here identified with

f_{M + 1} ({\tilde{μ}}_{M + 1})

where

{\tilde{μ}}_{M + 1}

is chosen so that stopping criteria (7) and (13) are satisfied, an acceptable approximation of the underlying unknown density? Although the wiggle has no physical meaning, one would nevertheless like to calculate accurate quantities of interest from the approximate density. We will return to this issue in the final part of the paper.

For practical purposes in both Stieltjes and Hamburger case

{\tilde{f}}_{M}

is calculated according to (9)–(11) solely by means of MaxEnt machinery, following these two distinct steps:

First, the sequence ${μ_{k}^{*}}_{k = 0}^{M}$ is prescribed and $f_{M}$ does not exist; then we know that $f_{M - 1}$ exists with entropy $H_{f_{M - 1}}$ ;
The next step relies on the monotonicity of $H_{f_{M + 1}} (μ_{M + 1})$ . If $ε$ indicates a fixed tolerance, $f_{M + 1} (μ_{M + 1})$ is calculated for increasing values of $μ_{M + 1}$ until, for some ${\tilde{μ}}_{M + 1}$ , the inequality $H_{f_{M - 1}} - H_{f_{M + 1}} ({\tilde{μ}}_{M + 1}) < ε$ is satisfied, implicitly assuming that (13) is satisfied too. Then ${\tilde{f}}_{M} \equiv f_{M + 1} ({\tilde{μ}}_{M + 1})$ is set.
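The two steps above can be sketched as a simple search loop. This is only a schematic outline under stated assumptions: `entropy_of` is a hypothetical user-supplied callable returning $H_{f_{M + 1}} (μ_{M + 1})$ (that is, it hides the MaxEnt solver), the geometric growth of $μ_{M + 1}$ is our own choice, and the moment check (13) is assumed to be performed inside the solver.

```python
def find_mu_tilde(entropy_of, h_upper, eps, mu_start, growth=1.5, max_steps=200):
    """Increase mu_{M+1} until H_{f_{M-1}} - H_{f_{M+1}}(mu_{M+1}) < eps.

    entropy_of : callable mu -> entropy of the MaxEnt density f_{M+1}(mu)
                 (hypothetical; supplied by the user's MaxEnt solver)
    h_upper    : entropy H_{f_{M-1}} of the lower-order MaxEnt density
    Returns the first accepted pair (mu_tilde, entropy).
    """
    if growth <= 1.0:
        raise ValueError("growth must exceed 1 so that mu increases")
    mu = mu_start
    for _ in range(max_steps):
        h = entropy_of(mu)
        if h_upper - h < eps:
            return mu, h
        mu *= growth
    raise RuntimeError("tolerance not reached within max_steps")
```

As a toy check with a stand-in entropy curve `lambda mu: 1.0 - 1.0 / mu ** 2` (not a real MaxEnt entropy, merely an increasing function bounded by 1), calling `find_mu_tilde` with `h_upper = 1.0`, `eps = 1e-4` and `mu_start = 5.0` stops at the first trial value above 100.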

Before illustrating some numerical examples that confirm the soundness of the proposed method, it is worth spending a few words on the outlined procedure. The calculation of

{\tilde{f}}_{M}

is obtained through an approximate procedure and hence has a limited range of applicability. The main problem is the presence of wiggles; to contain their detrimental effect it is necessary that the convergence of

H_{f_{M + 1}} (μ_{M + 1})

to

H_{f_{M - 1}}

be fast. For example, the value

μ_{M}^{*}

that precludes the existence of

f_{M}

in the Stieltjes case, must be such that the difference

μ_{M}^{*} - μ_{M, f_{M - 1}}

be small. Larger values make the convergence of

H_{f_{M + 1}} (μ_{M + 1})

to

H_{f_{M - 1}}

slow, allowing the generation of a small wiggle at great distance. The latter may become invisible to numerical quadrature methods.

Below are some numerical examples which both take into account the above remarks about the difference

μ_{M}^{*} - μ_{M, f_{M - 1}}

and illustrate the theoretical and numerical aspects mentioned above.

Example 1.

The Stieltjes case with

M = 2

and prescribed

(μ_{1}^{*}, μ_{2}^{*})

is considered. Now

f_{M}

exists if and only if the inequality

{(μ_{1}^{*})}^{2} < μ_{2}^{*} \leq 2 {(μ_{1}^{*})}^{2}

holds ([5], Theorem 2). The moment set

\{(μ_{1}, μ_{2}) ∣ μ_{2} = 2 μ_{1}^{2}, μ_{1} > 0\} \subset Int (μ (C_{M}))

represents an additional boundary in

μ (C_{M})

. If the moments satisfy the reverse inequality

μ_{2}^{*} > 2 {(μ_{1}^{*})}^{2}

there is no pdf which maximizes the entropy.
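The existence condition for $M = 2$ quoted above translates into a simple test on the two prescribed moments. The sketch below is illustrative only; the function name is hypothetical.

```python
def stieltjes_m2_maxent_exists(mu1: float, mu2: float) -> bool:
    """True iff mu1**2 < mu2 <= 2*mu1**2, i.e. the MaxEnt density f_2 exists."""
    if mu1 <= 0.0:
        raise ValueError("the Stieltjes case requires mu1 > 0")
    return mu1 ** 2 < mu2 <= 2.0 * mu1 ** 2
```

For example, the pair `(1.0, 1.5)` satisfies the condition, whereas `(1.0, 2.1)` does not, so in the latter case there is no entropy-maximizing density.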

We consider the latter case taking

μ_{1}^{*} = 1

and

μ_{2}^{*} = 2.1

; then

f_{M + 1} ({\tilde{μ}}_{M + 1})

is calculated by means of (11), with

H_{f_{M - 1}} = 1 + ln μ_{1}^{*} = 1

. Values of entropy

H_{f_{M + 1}} (μ_{M + 1})

with increasing values of

μ_{M + 1} > μ_{M + 1}^{-} = 4.41

(not reported here) lead to the conclusion that the entropy stabilizes rapidly as

μ_{M + 1}

increases. This may be evidence of high accuracy in the reconstruction.
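The lower limit 4.41 used in this example is consistent with the standard positivity condition on the shifted Hankel determinant for Stieltjes moment sequences, $μ_{1} μ_{3} - μ_{2}^{2} > 0$. The sketch below assumes that this is how $μ_{3}^{-}$ is obtained; the function name is hypothetical.

```python
def stieltjes_mu3_lower_bound(mu1: float, mu2: float) -> float:
    """Infimum of admissible mu3 given (mu1, mu2): mu3 > mu2**2 / mu1."""
    if mu1 <= 0.0:
        raise ValueError("the Stieltjes case requires mu1 > 0")
    return mu2 ** 2 / mu1
```

With the prescribed values `mu1 = 1.0` and `mu2 = 2.1` the function returns a value equal, up to rounding, to 4.41.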

Taking

ε = 10^{- 4}

, Equation (11) is satisfied starting from

{\tilde{μ}}_{M + 1} = 20

. Then

{\tilde{f}}_{M}

, which is identified with

f_{M + 1} ({\tilde{μ}}_{M + 1})

, jointly with

f_{M - 1}

are displayed in Figure 1 (top). The difference between

{\tilde{f}}_{M}

and

f_{M - 1}

is insignificant since

μ_{M}^{*} - μ_{M, f_{M - 1}} = 0.1

was chosen to avoid the detrimental effect of the wiggle. In Figure 1 (bottom) the same

f_{M + 1} ({\tilde{μ}}_{M + 1})

, on a logarithmic scale and on an extended x-axis scale, is reported to highlight the presence of a small wiggle. The moments

μ_{1, {\tilde{f}}_{M}}

,

μ_{2, {\tilde{f}}_{M}}

satisfy

∣ μ_{1}^{*} - μ_{1, {\tilde{f}}_{M}} ∣ \sim 10^{- 8}

,

∣ μ_{2}^{*} - μ_{2, {\tilde{f}}_{M}} ∣ \sim 10^{- 6}

, respectively. It can be concluded that

{\tilde{f}}_{M} \equiv f_{M + 1} ({\tilde{μ}}_{M + 1})

satisfies all the expected theoretical properties and can be considered the "best" substitute of the missing

f_{M}

.

Figure 1. Stieltjes case,

M = 2

.

f_{M + 1} ({\tilde{μ}}_{M + 1})

and

f_{M - 1}

(top).

f_{M + 1} ({\tilde{μ}}_{M + 1})

in logarithmic scale (bottom).

Example 2.

Hamburger case with

M = 3

; here

μ_{1}^{*} = 0, μ_{2}^{*} = 1, μ_{3}^{*} = 0.5

, with the skewness = 0.5, are assumed. MaxEnt density

f_{M}

does not exist if the first 3 moments are specified and the skewness is required to be non-zero. Then

f_{M}

does not exist whilst

f_{M - 1}

(Normal distribution) exists, with entropy

H_{f_{M - 1}} = \frac{1}{2} ln [2 π e (μ_{2}^{*} - {(μ_{1}^{*})}^{2})] ≃ 1.4189385

. Taking

ε = 10^{- 3}

, Equation (11) is satisfied starting from

{\tilde{μ}}_{M + 1} = 14.65

. Then

{\tilde{f}}_{M}

, which is identified with

f_{M + 1} ({\tilde{μ}}_{M + 1})

, jointly with

f_{M - 1}

are displayed in Figure 2 (top). In Figure 2 (bottom) the same

f_{M + 1} ({\tilde{μ}}_{M + 1})

is displayed, but on a logarithmic scale and on an extended x-axis scale, to highlight the presence of a small wiggle.

Figure 2. Hamburger case,

M = 3

.

f_{M + 1} ({\tilde{μ}}_{M + 1})

and Normal (top).

f_{M + 1} ({\tilde{μ}}_{M + 1})

in logarithmic scale (bottom).

The moments

μ_{1, {\tilde{f}}_{M}}

,

μ_{2, {\tilde{f}}_{M}}

,

μ_{3, {\tilde{f}}_{M}}

satisfy

∣ μ_{1}^{*} - μ_{1, {\tilde{f}}_{M}} ∣ \sim 10^{- 7}

,

∣ μ_{2}^{*} - μ_{2, {\tilde{f}}_{M}} ∣ \sim 10^{- 7}

,

∣ μ_{3}^{*} - μ_{3, {\tilde{f}}_{M}} ∣ \sim 10^{- 5}

, respectively. It can be concluded that

{\tilde{f}}_{M}

satisfies all the expected theoretical properties and can be considered the "best" substitute of the missing

f_{M}

.

Remark 1.

It is worth noting that the nonsymmetric Hamburger case with

M = 3

has been discussed in [13], pp. 413–415, Equation (12.32), but solely on the basis of simple heuristic reasoning; the authors use a tricky problem to observe that, even if the Lagrange multipliers cannot be chosen to satisfy the given constraints, the “maximum” entropy can be found and is equal to

sup_{f \in C_{M}} H_{f} = H_{f_{M - 1}}

concluding that in this situation the entropy may only be ϵ-achievable. To give a simple example of this, but not a formal justification, the authors consider the case in which a Normal distribution is contaminated with a small “wiggle” at a very high value of x; consequently, the moments of the new distribution are almost the same as those of the non-contaminated Normal, the biggest change being in the third moment (the new distribution is no longer symmetric). However, by adding new wiggles in suitable positions to balance the changes caused by the original wiggle, we can bring the first and second moments back to their original values and also obtain any value of the third moment without reducing the entropy significantly below that of the associated non-contaminated Normal (hence the conclusion about the ϵ-achievability of the entropy).

The heuristic procedure just described is displayed in Figure 2 and can be interpreted by saying that

{\tilde{f}}_{M}

may be identified with the Normal distribution on which some wiggles are superimposed.
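The first step of this heuristic can be illustrated numerically. The sketch below is not the construction of [13]: it assumes, purely for illustration, a single Gaussian bump of weight $p = c / a^{3}$ centred at a distant point a, added to a standard Normal. The function names and the choice of p are assumptions. The moments of the mixture are available in closed form, and the entropy of the mixture lies between the Normal entropy and the Normal entropy plus the binary entropy of p.

```python
import math


def wiggle_moments(c, a):
    """Moments of (1 - p) N(0, 1) + p N(a, 1) with p = c / a**3.

    Hypothetical contamination used only to illustrate the heuristic.
    """
    p = c / a ** 3
    m1 = p * a                   # mean
    m2 = 1.0 + p * a ** 2        # second moment
    m3 = p * (a ** 3 + 3.0 * a)  # third moment
    return p, m1, m2, m3


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli(p) label, 0 < p < 1."""
    return -p * math.log(p) - (1.0 - p) * math.log(1.0 - p)


normal_entropy = 0.5 * math.log(2.0 * math.pi * math.e)
for a in (10.0, 100.0, 1000.0):
    p, m1, m2, m3 = wiggle_moments(0.5, a)
    # The mixture entropy lies in [normal_entropy, normal_entropy + binary_entropy(p)]
    print(f"a={a:7.1f}  p={p:.2e}  m1={m1:.2e}  m2-1={m2 - 1:.2e}  "
          f"m3={m3:.5f}  entropy slack<={binary_entropy(p):.2e}")
```

As a grows, the first two moments approach 0 and 1, the third moment stays close to c = 0.5, and the width of the entropy interval shrinks to zero. The remaining small shifts in the first two moments are what the additional balancing wiggles mentioned above would have to compensate.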

This result is a particular case of the more general result covered by this paper and coincides with the above (9)–(11) when, in this case,

f_{M - 1}

is the density function of a Normal distribution.

Lastly, the above heuristics agree with the general mathematical result that two continuous density functions having the same first

M + 1

moments (including

μ_{0} = 1

) cross each other in at least

M + 1

points ([14], Vol.1, No. 140, p. 83). In our case

f_{M + 1}

and the Normal density plotted in Figure 2 share the first

M + 1 = 3

moments (μ_{0}, μ_{1}, μ_{2}) and they cross each other at three points, as inspection of Figure 2 suggests.

Example 3.

Symmetric Hamburger case with

M = 4

, prescribed

(μ_{2}^{*} = 1, μ_{4}^{*} = 4)

and MaxEnt density

f_{M}

are considered.

f_{M - 2}

is the Normal distribution with

μ_{M, f_{M - 2}} = 3

and entropy

H_{f_{M - 2}} = \frac{1}{2} ln [2 π e μ_{2}^{*}] ≃ 1.4189385

.

f_{M}

does not exist, since its existence condition

μ_{M}^{*} \leq μ_{M, f_{M - 2}} = 3

is not satisfied. A further even moment

μ_{M + 2}

with increasing values is added and

f_{M + 2} (μ_{M + 2})

has to be calculated.
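The Normal quantities used in Example 3 follow from standard closed forms, which the sketch below evaluates. The helper names are illustrative; the even-moment formula $(k - 1)!! \, σ^{k}$ for a centred Normal and the existence test $μ_{M}^{*} \leq μ_{M, f_{M - 2}}$ quoted above are the only ingredients.

```python
import math


def normal_even_moment(k, var):
    """E[X^k] for X ~ N(0, var) and even k: (k - 1)!! * var**(k / 2)."""
    if k % 2 != 0 or k < 0:
        raise ValueError("k must be a non-negative even integer")
    double_factorial = math.prod(range(1, k, 2))  # (k - 1)!!
    return double_factorial * var ** (k // 2)


def normal_entropy(var):
    """Differential entropy of N(0, var) in nats."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)


mu2_star, mu4_star = 1.0, 4.0
mu4_normal = normal_even_moment(4, mu2_star)
print("fourth moment of f_{M-2}:", mu4_normal)
print("entropy of f_{M-2}:", normal_entropy(mu2_star))
print("f_M exists:", mu4_star <= mu4_normal)
```

For $μ_{2}^{*} = 1$ and $μ_{4}^{*} = 4$ the Normal fourth moment is 3, so the existence condition fails, as stated in the example.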

Taking

ε = 10^{- 3}

, Equation (11) is satisfied starting from

{\tilde{μ}}_{M + 2} = 160

. Then

{\tilde{f}}_{M}

, which is identified with

f_{M + 2} ({\tilde{μ}}_{M + 2})

, jointly with

f_{M - 2}

are displayed in Figure 3 (top). In Figure 3 (bottom) the same

f_{M + 2} ({\tilde{μ}}_{M + 2})

, on a logarithmic scale and on an extended x-axis scale, is reported to highlight the presence of two symmetric wiggles travelling in opposite directions. The moments

μ_{2, {\tilde{f}}_{M}}

,

μ_{4, {\tilde{f}}_{M}}

satisfy

∣ μ_{2}^{*} - μ_{2, {\tilde{f}}_{M}} ∣ \sim 10^{- 8}

,

∣ μ_{4}^{*} - μ_{4, {\tilde{f}}_{M}} ∣ \sim 10^{- 6}

, respectively.

Figure 3. Symmetric Hamburger case with

M = 4

.

f_{M + 2} ({\tilde{μ}}_{M + 2})

and Normal (top).

f_{M + 2} ({\tilde{μ}}_{M + 2})

in logarithmic scale (bottom).

In each of the previous three examples we have assumed that

μ_{M}^{*}

and

μ_{M, f_{M - 1}}

differ by a small amount, in order to avoid the detrimental effect due to the wiggle; consequently, the difference between

{\tilde{f}}_{M}

and

f_{M - 1}

becomes insignificant too. As a result,

1.: The convergence of $H_{f_{M + 1}}$ to $H_{f_{M - 1}}$ is fast, which avoids the formation of small evanescent wiggles at a great distance;
2.: The rise of numerical quadrature problems is avoided as well.

As a consequence the two densities

f_{M + 1}

and

f_{M - 1}

are almost superimposed, precluding the possibility of evaluating the effect produced in

f_{M - 1}

from having discarded

μ_{M}^{*}

. It would be interesting to be able to assess how high values of

μ_{M}^{*} - μ_{M, f_{M - 1}}

may affect the difference

f_{M + 1} - f_{M - 1}

. The goal could be achieved by a suitable numerical quadrature method.

5. Discussion and Conclusions

In the present paper, we have discussed the case which arises when, in the presence of a prescribed moment set

(μ_{1}^{*}, μ_{2}^{*}, \dots, μ_{M}^{*})

representing the available information, the (reduced) moment problem admits a solution but the MaxEnt density, as a solution of the regularization problem, does not exist. In the previous sections, we have given the conditions under which a solution of the Stieltjes and Hamburger (reduced) moment problems may be found in the genuine Jaynes’ spirit, by finding the overall largest entropy distribution which is compatible with the available information and showing that this is the best approximant of the underlying true but unknown distribution. The substitute of the missing MaxEnt solution is found using solely the usual MaxEnt machinery.

Now we look at the issue from a different point of view. Suppose

(μ_{1}^{*}, μ_{2}^{*}, \dots, μ_{M}^{*})

represent all and only the available information. Two cases, 1. and 2., may arise:

1.: Only the first M moments may be measured and, additionally, $f_{M}$ exists. In this situation the traditional MaxEnt machinery produces the usual solution $f_{M}$ , which has a well-known analytical form corresponding to Jaynes’ non-committal approximant (MaxEnt) of the underlying f (see Equation (2));
2.: Only the first M moments may be measured and, additionally, $f_{M}$ does not exist. Here any information about the analytical form of the substitute of the missing MaxEnt solution is lacking. If only the first M moments may be measured, it is reasonable to assume that the underlying f admits the first M moments only. The available information can then be restated as $(μ_{1}^{*}, μ_{2}^{*}, \dots, μ_{M}^{*}, μ_{M + 1} = + \infty)$ . Next, assuming that $μ_{M + 1}$ takes a finite value, the MaxEnt machinery may be invoked, which yields $f_{M + 1}$ as above and, consequently, Theorem 1. To find a genuinely minimally committal approximant of the underlying f in the MaxEnt spirit, the limit

$lim_{μ_{M + 1} \to \infty} H_{f_{M + 1}} (μ_{M + 1})$

is taken, so that, owing to the monotonicity of $H_{f_{M + 1}}$ , the spurious information represented by $μ_{M + 1}$ has a minimum effect on the approximant (in other words, the approximant is guaranteed to be minimally committal).

The solution we have proposed in this paper for case 2. offers an alternative and exhaustive answer to the common empirical “forced” practices consisting in

(a): Replacing an unbounded support with an arbitrarily large interval, or
(b): Neglecting the prescribed higher moment so that the reduced number of moments allows the existence of MaxEnt solution.

As we have discussed at length (see Introduction), solutions like (a) and (b) imply a forced pseudo-solution of the original problem which conflicts with the MaxEnt rationale.

The above conflict is not merely theoretical: it has practical consequences. This leads us to distinguish the theoretical and practical aspects of the procedure we proposed. The MaxEnt technique is invoked because one regards the resulting distribution as “the best” and, consequently, the obtained results as “the best”. Essentially, this is the practitioner’s main concern. More specifically, since the MaxEnt distribution constrained by the first M moments does not exist, we are inclined to turn to

f_{M - 1}

. Depending on whether

μ_{M}^{*}

is discarded or retained,

f_{M - 1}

or

{\tilde{f}}_{M}

will be used to approximate the unknown underlying density f. It may happen that some summarizing quantities based on different approximations of f, such as

f_{M - 1}

or

{\tilde{f}}_{M}

, remain essentially unaltered, as we illustrate in the next few lines.

If g is a bounded function of X,

{\tilde{f}}_{M}

and

f_{M - 1}

lead to similar values of the expected value of g (X), as Pinsker’s inequality ([15], p. 390) and (11) yield,

\begin{matrix} ∣ E_{f_{M - 1}} [g (X)] - E_{{\tilde{f}}_{M}} [g (X)] ∣ & \leq \int_{S_{X}} ∣ g (x) ∣ \cdot ∣ f_{M - 1} (x) - {\tilde{f}}_{M} (x) ∣ d x \\ & \leq {‖ g ‖}_{\infty} \sqrt{2 (H_{f_{M - 1}} - H_{{\tilde{f}}_{M}})} \leq {‖ g ‖}_{\infty} \sqrt{2 \cdot ε} . \end{matrix}

As a consequence, although we settle for a density constrained by fewer moments, which is conceptually in contrast with the MaxEnt spirit, the results remain essentially unaltered.

The matter is similar when quantiles have to be calculated. They may be expressed as expected values of suitable bounded functions: indeed, for fixed x,

F (x) = E [g (t)]

with

g (t) = 1

if

t \in [0, x]

and

g (t) = 0

if

t \in (x, \infty)

. Then, if

{\tilde{F}}_{M}

and

F_{M - 1}

denote the distribution functions corresponding to

{\tilde{f}}_{M}

and

f_{M - 1}

, respectively, we have in the Stieltjes case (and, mutatis mutandis, the same holds in the Hamburger case)

\begin{matrix} ∣ F_{M - 1} (x) - {\tilde{F}}_{M} (x) ∣ & \leq \int_{0}^{x} ∣ f_{M - 1} (t) - {\tilde{f}}_{M} (t) ∣ d t \\ & \leq \int_{0}^{\infty} ∣ f_{M - 1} (t) - {\tilde{f}}_{M} (t) ∣ d t \\ & \leq \sqrt{2 (H_{f_{M - 1}} - H_{{\tilde{f}}_{M}})} \leq \sqrt{2 \cdot ε} . \end{matrix}
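To give a feeling for the size of the two bounds above, the sketch below evaluates $‖ g ‖_{\infty} \sqrt{2 ε}$ for the tolerances used in the examples ( $ε = 10^{- 3}$ and $ε = 10^{- 4}$ ). The function name and the default $‖ g ‖_{\infty} = 1$, which corresponds to the distribution function bound, are illustrative choices.

```python
import math


def pinsker_bound(eps, g_sup=1.0):
    """Upper bound g_sup * sqrt(2 * eps) on the difference of expectations.

    eps   : entropy gap H_{f_{M-1}} - H_{f~_M} allowed by Equation (11)
    g_sup : sup-norm of the bounded function g (1 for an indicator)
    """
    if eps < 0 or g_sup < 0:
        raise ValueError("eps and g_sup must be non-negative")
    return g_sup * math.sqrt(2.0 * eps)


for eps in (1e-3, 1e-4):
    print(f"eps = {eps:g}: |F_(M-1)(x) - F~_M(x)| <= {pinsker_bound(eps):.4f}")
```

The bounds are roughly 0.045 and 0.014, respectively. Since the bound decays only like the square root of ε, halving it requires reducing ε by a factor of four.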

Again, although we settle for a density constrained by fewer moments, which goes conceptually against the spirit of Jaynes, the results concerning expected values of g remain essentially unaltered. However, if g is an arbitrary unbounded function of X, then the above chain of inequalities does not hold and the calculation of expected values of g could lead to different results, i.e.,

E_{f_{M - 1}} [g (X)] \neq E_{{\tilde{f}}_{M}} [g (X)]

.

In conclusion: if the maximum entropy distribution does not exist, being guided by the spirit of maximum entropy could always turn out to be the best choice.

Author Contributions

Conceptualization, P.L.N.I. and A.T.; Methodology, P.L.N.I.; Software, A.T.; Writing original draft, A.T.; Writing, review and editing, P.L.N.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Existence of MaxEnt Distributions

Symmetric positive definite Hankel matrices

Δ_{M} = {‖ μ_{i + j} ‖}_{i, j = 0}^{N}

and

Δ_{M, 1} = {‖ μ_{i + j + 1} ‖}_{i, j = 0}^{N}

,

M = 2 N

and

M = 1, 2, . . .

are recalled. A necessary condition for the existence of MaxEnt distributions is the positivity of the determinants

∣ Δ_{M} ∣ > 0

for Hamburger case and

∣ Δ_{M} ∣ > 0

,

∣ Δ_{M, 1} ∣ > 0

for the Stieltjes case ([1], Theorem 1.2, p. 5 and Theorem 1.3, p. 6). The sufficient conditions for existence in both cases are quoted below. Once the first moments

(μ_{1}, . . ., μ_{M})

are assigned

let us define the Mth moment space $C_{M}$ as the convex hull of the curve $\{(x^{1}, \dots, x^{M}), x \in S\}$ . $C_{M}$ is convex and closed with boundary $\partial C_{M}$ , so that $C_{M} = Int C_{M} \cup \partial C_{M}$ ;
let us call $μ_{M + 1}^{-}$ the value of $μ_{M + 1}$ such that $∣ Δ_{N} ∣ = 0$ or $∣ Δ_{N, 1} ∣ = 0$ , with $M = 2 N$ or $M = 2 N + 1$ , so that the point $(μ_{1}, \dots, μ_{M}, μ_{M + 1}^{-}) \in \partial C_{M + 1}$ ;
we also recall the moment space $E_{M}$ relative to the MaxEnt densities, where $E_{M} \subseteq C_{M}$ . Here we mean $E_{M} = Int E_{M} \cup \partial E_{M}$ . Note that $\partial E_{M}$ includes both points $\in \partial C_{M}$ and points $\in Int C_{M}$ , so that $E_{M} = Int E_{M} \cup (\partial E_{M} \cap \partial C_{M}) \cup (\partial E_{M} \cap Int C_{M})$ . Additional boundary points $(\partial E_{M} \cap Int C_{M})$ arise from the conditions
(i) $λ_{M} = 0$ (see Equation (2)) in the Stieltjes case. For the special case $M = 2$ , see [5], Theorem 2 or Example 2.
(ii) $λ_{M - 1} = λ_{M} = 0$ (see Equation (2)) in the symmetric Hamburger case.
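The necessary conditions on the Hankel determinants can be checked mechanically. The following is a minimal sketch under stated assumptions: the helper names `hankel` and `det` are illustrative, the list `moments` is assumed to start at $μ_{0} = 1$, and exact rational arithmetic is used so that a vanishing determinant on the boundary is detected without rounding error.

```python
from fractions import Fraction


def hankel(moments, shift, n):
    """(n + 1) x (n + 1) matrix with entries mu_{i + j + shift}."""
    return [[Fraction(moments[i + j + shift]) for j in range(n + 1)]
            for i in range(n + 1)]


def det(matrix):
    """Determinant by exact Gaussian elimination on Fractions."""
    a = [row[:] for row in matrix]
    size = len(a)
    result = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result


# Moments mu_0..mu_3 of the exponential density with mean 1 (mu_k = k!)
exponential = [1, 1, 2, 6]
print(det(hankel(exponential, 0, 1)), det(hankel(exponential, 1, 1)))

# mu_1 = 1, mu_2 = 2.1 with mu_3 placed at the boundary value 4.41
boundary = [1, 1, Fraction(21, 10), Fraction(441, 100)]
print(det(hankel(boundary, 0, 1)), det(hankel(boundary, 1, 1)))
```

For the exponential moments both determinants are positive, whereas for the second moment set the shifted determinant is exactly zero, which corresponds to a point on the boundary of the moment space.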

Appendix A.1. Stieltjes Case

The moment set

(μ_{1}^{*}, . . ., μ_{M}^{*})

is prescribed. The existence of

f_{M}

has been investigated ([7], Theorem 4 and [8], Theorem 3). Here the results are summarized in the following items.

Let us suppose $f_{M}$ exists with its $(M + 1)$ th moment $μ_{M + 1, f_{M}}$ . If $μ_{M + 1}^{*} \leq μ_{M + 1, f_{M}}$ , then $f_{M + 1}$ exists; conversely if $μ_{M + 1}^{*} > μ_{M + 1, f_{M}}$ , then $f_{M + 1}$ does not exist. The existence of $f_{M}$ is iteratively and numerically determined, starting from $f_{1}$ which exists.
If $f_{M}$ does not exist, both $f_{M - 1}$ and $f_{M + 1}$ exist for every $μ_{M - 1} > μ_{M - 1}^{-}$ and $μ_{M + 1} > μ_{M + 1}^{-}$ respectively;
If $(μ_{1}^{*}, \dots, μ_{M - 1}^{*})$ are fixed, whilst $μ_{M} > μ_{M}^{-}$ varies continuously, the entropy $H_{f_{M}} (μ_{M})$ of $f_{M}$ is a monotonically increasing function ([4], p. 59, Equation (2.73)).

Appendix A.2. Hamburger Case

The moment set

(μ_{1}^{*}, . . ., μ_{M}^{*})

with M even, is considered. The existence of

f_{M}

has been investigated ([7], Theorem 4 and [9], Theorem 2). Here the results are summarized in the next three items:

In the non-symmetric case, the existence of $f_{M}$ is guaranteed except for a special set of moments which is unknown a priori. Hence, excluding the latter, the positivity of the Hankel determinants, which is a necessary condition for solvability, guarantees the existence of $f_{M}$ .
In the symmetric case (i.e., $μ_{2 j - 1}^{*} = 0$ ) the condition of the solvability of $f_{M}$ is analogous to Stieltjes case, being $f_{M}$ symmetric. Thus the existence of $f_{M}$ , $M \geq 4$ , is iteratively and numerically determined, starting from $f_{2}$ which exists (being the Normal distribution); if $f_{M}$ does not exist, both $f_{M - 2}$ and $f_{M + 2}$ exist for every $μ_{M - 2} > μ_{M - 2}^{-}$ and $μ_{M + 2} > μ_{M + 2}^{-}$ respectively;
if $(μ_{1}^{*}, \dots, μ_{M - 1}^{*})$ are fixed, whilst $μ_{M} > μ_{M}^{-}$ varies continuously, thanks to the MaxEnt machinery, the entropy $H_{f_{M}} (μ_{M})$ of $f_{M}$ is a monotonically increasing function ([4], p. 59, Equation (2.73)). In the MaxEnt setup this guarantees that procedures and results valid for the Stieltjes case can be equally extended to the Hamburger one.

References

1. Shohat, J.A.; Tamarkin, J.D. The Problem of Moments; Mathematical Surveys and Monographs; American Mathematical Society: Providence, RI, USA, 1943; Volume I.
2. Jaynes, E.T. Information Theory and Statistical Mechanics. Phys. Rev. 1957, 106, 620–630.
3. Jaynes, E.T. Prior Probabilities. IEEE Trans. Syst. Sci. Cybern. 1968, 4, 227–241.
4. Kesavan, H.K.; Kapur, J.N. Entropy Optimization Principles with Applications; Academic Press: London, UK, 1992.
5. Dowson, D.C.; Wragg, A. Maximum-entropy distributions having prescribed first and second moments (Corresp.). IEEE Trans. Inf. Theory 1973, 19, 689–693.
6. Kociszewski, A. The existence conditions for maximum entropy distributions, having prescribed the first three moments. J. Phys. A Math. Gen. 1986, 19, L823–L827.
7. Junk, M. Maximum entropy for reduced moment problems. Math. Models Methods Appl. Sci. 2000, 10, 1001–1025.
8. Tagliani, A. Maximum entropy solutions and moment problem in unbounded domains. Appl. Math. Lett. 2003, 16, 519–524.
9. Tagliani, A. Hamburger moment problem and maximum entropy: On the existence conditions. Appl. Math. Comp. 2014, 231, 111–116.
10. Milev, M.; Tagliani, A. Entropy convergence of finite moment approximations in Hamburger and Stieltjes problems. Stat. Probab. Lett. 2017, 120, 114–117.
11. Junk, M. Domain of Definition of Levermore’s Five-Moment System. J. Stat. Phys. 1998, 93, 1143–1167.
12. Summy, D.P.; Pullin, D.I. On the Five-Moment Hamburger Maximum Entropy Reconstruction. J. Stat. Phys. 2018, 172, 854–879.
13. Cover, T.M.; Thomas, J.A. Elements of Information Theory; Wiley-Interscience: Hoboken, NJ, USA, 2006.
14. Pólya, G.; Szegö, G. Problems and Theorems in Analysis I; Springer: Berlin, Germany, 1972.
15. Kullback, S. Information Theory and Statistics; John Wiley: New York, NY, USA, 1959.


Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
