Stochastic Order and Generalized Weighted Mean Invariance

Mateu Sbert; Jordi Poch; Shuning Chen; Víctor Elvira

doi:10.3390/e23060662

Abstract

In this paper, we present order invariance theoretical results for weighted quasi-arithmetic means of a monotonic series of numbers. The quasi-arithmetic mean, or Kolmogorov–Nagumo mean, generalizes the classical mean and appears in many disciplines, from information theory to physics, from economics to traffic flow. Stochastic orders are defined on weights (or equivalently, discrete probability distributions). They were introduced to study risk in economics and decision theory, and recently have found utility in Monte Carlo techniques and in image processing. We show in this paper that, if two distributions of weights are ordered under first stochastic order, then for any monotonic series of numbers their weighted quasi-arithmetic means share the same order. This means for instance that arithmetic and harmonic mean for two different distributions of weights always have to be aligned if the weights are stochastically ordered, this is, either both means increase or both decrease. We explore the invariance properties when convex (concave) functions define both the quasi-arithmetic mean and the series of numbers, we show its relationship with increasing concave order and increasing convex order, and we observe the important role played by a new defined mirror property of stochastic orders. We also give some applications to entropy and cross-entropy and present an example of multiple importance sampling Monte Carlo technique that illustrates the usefulness and transversality of our approach. Invariance theorems are useful when a system is represented by a set of quasi-arithmetic means and we want to change the distribution of weights so that all means evolve in the same direction.

Keywords:

stochastic order; weighted mean; arithmetic mean; harmonic mean; Kolmogorov mean; quasi-arithmetic mean; cross-entropy; Shannon entropy; Rényi entropy; Tsallis entropy; multiple importance sampling

1. Introduction and Motivation

Stochastic orders [1,2], are orders defined in probability theory and statistics, to quantify the concept of one random variable being bigger or smaller than another one. Discrete probability distributions, also called probability mass functions (pmf), are sequences of n-tuples of non-negative values that add up to 1, and can be thus interpreted in several ways, for instance, as weights in computation of moments of the discrete random variable described by the pmf, or as equivalence classes of compositional data [3]. Stochastic orders have found application in decision and risk theory [4], and in economics in general, among many other fields [2]. Some stochastic orders have been defined based on order invariance: two pmf’s are ordered when the arithmetic means of any increasing sequence of real numbers weighted with the corresponding pmf’s are ordered in the same direction. This raises the question whether this invariance might also hold for other kind of means beyond the arithmetic mean.

The quasi-arithmetic means, also called Kolmogorov or Kolmogorov–Nagumo means, are ubiquitous in many branches of science [5]. They have the expression

g^{- 1} (\sum_{k = 1}^{M} α_{k} g (b_{k}))

, where

g (x)

is a real-valued strictly monotonous function,

{b_{k}}_{k = 1}^{M}

a sequence of reals, and

{α_{k}}_{k = 1}^{M}

a set of weights with

\sum_{k = 1}^{M} α_{k} = 1

. This family of means comprises the usual means: arithmetic

g (x) = x

,

\sum_{k = 1}^{M} α_{k} b_{k}

, harmonic

g (x) = 1 / x

,

{(\sum_{k = 1}^{M} \frac{α_{k}}{b_{k}})}^{- 1}

, power mean

g (x) = x^{p}

,

{(\sum_{k = 1}^{M} α_{k} b_{k}^{p})}^{\frac{1}{p}}

. For a long time, economists have discussed the best mean for a problem [6]. Harmonic mean is used for the price earning ratio, and power means are used to represent the aggregate labor demand and its corresponding wage [7], and the constant elasticity of substitution (CES) [8]. Yoshida [9,10] has studied the invariance under quasi-arithmetic means with function

g (x)

increasing and for utility functions. In information theory, Alfred Rényi [11] defined axiomatically the entropy of a probability distribution as a Kolmogorov mean of the information

1 / log p_{k}

conveyed by result k with probability

p_{k}

, and recently, Americo et al. [12] defined conditional entropy based on quasi-arithmetic means. In physics, the equivalent spring constant of springs combined in series is obtained as the harmonic mean of the individual spring constants, and in parallel, as their arithmetic mean [13], while the equivalent resistance of resistors combined in parallel is obtained by the harmonic mean of the individual resistances, and in series by their arithmetic mean [14]. In traffic flow [15], arithmetic and harmonic mean of speed distribution are used. In [16] both geometric and harmonic mean are used in addition to arithmetic mean to improve noise source maps.

In our recent work on inequalities for generalized, quasi-arithmetic weighted means [17], we found some invariance properties, that depended on the particular relationship considered between the sequences of weights. These relationships between weights define first stochastic order, and likelihood ratio order. Their application to multiple importance sampling (MIS), a Monte Carlo technique, has been presented in [18], its application to cross entropy in [19], and in [20] applications to image processing, traffic flow and income distribution have been shown. In [21], the invariance results on products of distributions of independent scalar r.v.’s [22] was generalized to any joint distribution of a 2-dimensional r.v.

In this paper, we show that the order invariance is a necessary and sufficient condition for first stochastic order, and that it holds under any quasi-arithmetic mean. We also study invariance under the second stochastic order, likelihood ratio, hazard-rate, and increasing convex stochastic orders. The fact that the invariance results hold for both increasing and decreasing monotonic functions allows us to use both utilities and liabilities, represented by efficiencies and expected error, respectively, to look for an optimal solution, where in liabilities we look for minimum expected error, while in utilities for maximum efficiency.

The rest of the paper is organized as follows. In Section 2, we introduce the stochastic order; in Section 3, the arithmetic mean and its relationship with stochastic order. In Section 4, we present the invariance theorems; in Section 5, we discuss the invariance for concave (convex) functions; in Section 6, its application to stochastic orders; in Section 7, we present an example based on the linear combination of Monte Carlo estimators. Finally, conclusions and future work are given in Section 8.

2. Stochastic Orders

Stochastic orders are pre-orders (i.e, binary relationships holding symmetric and transitive properties) defined on probability distributions with finite support. Note that equivalently, one can think of sequences (i.e., ordered sets) of non-negative weights/values that sum up to one. Observe that any sequence

{α_{k}}

of M positive numbers such that

\sum_{k = 1}^{M} α_{k} = 1

can be considered a probability distribution. It can be seen too as an element of the (M-1)-Simplex, i.e.,

{α_{k}} \in Δ_{M - 1}

. While several interpretations hold and hence increase the range of applicability, in the remaining of this paper, we will talk of sequence without any loss of generality.

Notation.

We use the symbols

≺, ≻

to represent orders between two sequences

{α_{k}}

and

{α_{k}^{'}}

of size M, e.g., and we write

{α_{k}} ≻ {α_{k}^{'}}

or equivalently

{α_{k}^{'}} ≺ {α_{k}}

. We will denote the elements of the sequences without the curly brackets, e.g., the first element of the sequence

{α_{k}}

is denoted as

α_{1}

, while the last one is

α_{M}

. Moreover, the first and last elements of the sequence receive special attention, since when

α_{1} \leq α_{1}^{'}

, we can write this order as

{α_{k}} ≻_{F} {α_{k}^{'}}

(where

≻_{F}

stands for first ) and whenever

α_{M} \geq α_{M}^{'}

, we can write

{α_{k}} ≻_{L} {α_{k}^{'}}

(where

≻_{L}

stands for last ). The case with both

α_{1} \leq α_{1}^{'}

and

α_{M} \geq α_{M}^{'}

, we can denote as

{α_{k}} ≻_{F L} {α_{k}^{'}}

. The orders

≻_{F}, ≻_{L}, ≻_{F L}

are superorders of first stochastic dominance order,

{α_{k}} ≻_{F S D} {α_{k}^{'}}

, that will be studied in Section 6. We denote as

{α_{M - k + 1}}

the sequence with the same elements of

{α_{k}}

but in reversed order.

Example 1

(Toy example). Given sequences

{α_{k}} = {0.5, 0.1, 0.2, 0.2}

, and

{α_{k}^{'}} =

{0.4, 0.1, 0.2, 0.3}

, we have both

{α_{k}} ≺_{F} {α_{k}^{'}}

,

{α_{k}} ≺_{L} {α_{k}^{'}}

, and thus

{α_{k}} ≺_{F L} {α_{k}^{'}}

. On the other hand for sequences

{α_{k}}

and

{α_{k}^{″}} = {0.4, 0.3, 0.2, 0.1}

, we have

{α_{k}} ≺_{F} {α_{k}^{″}}

,

{α_{k}} ⊀_{L} {α_{k}^{″}}

, and thus

{α_{k}} ⊀_{F L} {α_{k}^{″}}

.

Property 1

(Mirror property). One desirable property of stochastic orders is such that when reversing the ordering of the sequence elements, the stochastic order should be reversed as well, i.e., if

{α_{k}} ≻ {α_{k}^{'}}

, then

{α_{M - k + 1}^{'}} ≻ {α_{M - k + 1}}

. This is similar to the invariance of physical laws to the right and left hand. We call this property the mirror property.

Definition 1

(Mirror property). We say that a stochastic order has the mirror property if

{α_{k}^{'}} ≺ {α_{k}} \Rightarrow {α_{M - k + 1}} ≺ {α_{M - k + 1}^{'}} .

(1)

Observe that the simple orders defined before,

≻_{F}

,

≻_{L}

, do not hold this property, but

≻_{F L}

holds it. We will see in Section 6 that usual stochastic orders do hold the mirror property. However, an order that is insensitive to the permutation of the elements of a sequence, like majorization or Lorentz order, does not hold the mirror property.

3. Quasi-Arithmetic Mean

Stochastic orders are usually defined by invariance to arithmetic mean [1,2] (see Section 6), and we want to investigate in this paper invariance to more general means. We define here the kind of means we are interested in.

Definition 2

(Quasi-arithmetic or Kolmogorov or Kolmogorov–Nagumo mean). A quasi-arithmetic weighted mean (or Kolmogorov mean)

M ({b_{k}}, {α_{k}})

of a sequence of real numbers

{b_{k}}_{k = 1}^{M}

, is of the form

g^{- 1} (\sum_{k = 1}^{M} α_{k} g (b_{k})))

, where

g (x)

is a real valued, invertible strictly monotonic function, with inverse function

g^{- 1} (x)

and

{α_{k}}_{k = 1}^{M}

are positive weights such that

\sum_{k = 1}^{M} α_{k} = 1

.

Examples of such mean are arithmetic weighted mean (

g (x) = x

), harmonic weighted mean (

g (x) = 1 / x

), geometric weighted mean (

g (x) = log x

) and in general, the weighted power means (

g (x) = x^{r}

).

Given a distribution

{p_{k}}

, Shannon entropy,

H ({p_{k}}) = - \sum_{k} p_{k} log p_{k}

, and Rényi entropy,

R_{β} ({p_{k}}) = \frac{1}{1 - β} log \sum_{k} p_{k}^{β}

, can be considered as quasi-arithmetic means of the sequence

{\frac{1}{log p_{k}}}

with weights

{p_{k}}

: Shannon entropy with

g (x) = x

(arithmetic mean or expected value), and Rényi entropy with

g (x) = 2^{(β - 1) x}

[11]. Tsallis entropy,

S_{q} ({p_{k}}) = \frac{1 - \sum_{i = 1}^{M} p_{i}^{q}}{q - 1}

, can be considered the weighted arithmetic mean of the sequence

{\frac{1}{{ln}_{q} p_{k}}}

with weights

{p_{k}}

, where

{ln}_{q} x = \frac{x^{1 - q} - 1}{1 - q}

is the q-logarithm function [23]. Without loss of generality, we consider from now on that

{b_{k} = f (k)}

where

f (x)

is a real valued function.

Lemma 1.

Consider the sequences of M positive weights

{α_{k}}_{k = 1}^{M}

and

{α_{k}^{'}}_{k = 1}^{M}

,

\sum_{k = 1}^{M} α_{k} = 1, \sum_{k = 1}^{M} α_{k}^{'} = 1

,

g (x)

a strictly monotonic function. The following conditions (a), (b), (a’), (b’), (c) and (d) are equivalent.

(a) for

g (x)

increasing, for any increasing function

f (x)

the following inequality holds:

\sum_{k = 1}^{M} α_{k}^{'} g (f (k)) \leq \sum_{k = 1}^{M} α_{k} g (f (k))

(2)

(b) for

g (x)

increasing, for any increasing function

f (x)

the following inequality holds:

\sum_{k = 1}^{M} α_{M - k + 1} g (f (k)) \leq \sum_{k = 1}^{M} α_{M - k + 1}^{'} g (f (k))

(3)

(a’) for

g (x)

decreasing, for any increasing function

f (x)

the following inequality holds:

\sum_{k = 1}^{M} α_{k} g (f (k)) \leq \sum_{k = 1}^{M} α_{k}^{'} g (f (k))

(4)

(b’) for

g (x)

decreasing, for any increasing function

f (x)

the following inequality holds:

\sum_{k = 1}^{M} α_{M - k + 1}^{'} g (f (k)) \leq \sum_{k = 1}^{M} α_{M - k + 1} g (f (k))

(5)

(c) the following inequalities hold:

\begin{matrix} α_{1}^{'} & \geq & α_{1} \\ α_{1}^{'} + α_{2}^{'} & \geq & α_{1} + α_{2} \\ ⋮ \\ α_{1}^{'} + \dots + α_{M - 1}^{'} & \geq & α_{1} + \dots + α_{M - 1} \\ α_{1}^{'} + \dots + α_{M - 1}^{'} + α_{M}^{'} & = & α_{1} + \dots + α_{M - 1} + α_{M} \end{matrix}

(6)

(d) the following inequalities hold:

\begin{matrix} α_{M}^{'} & \leq & α_{M} \\ α_{M}^{'} + α_{M - 1}^{'} & \leq & α_{M} + α_{M - 1} \\ ⋮ \\ α_{M}^{'} + \dots + α_{2}^{'} & \leq & α_{M} + \dots + α_{2} \\ α_{M}^{'} + α_{M - 1}^{'} + \dots + α_{1}^{'} & = & α_{M} + α_{M - 1} + \dots + α_{1} \end{matrix}

(7)

If

f (x)

is decreasing, the inequalities in a) through b’) are reversed.

Proof.

Indirect or partial proofs can be found in [17,20]. We provide a complete proof in the Appendix A. □

Note.

Observe that in Lemma 1, it is sufficient to consider the monotonicity of the sequences

{f (k)}_{k = 1}^{M}

. Furthermore, Lemma 1 can be extended to any real sequences

{y_{k}}

and

{x_{k}}

, such that

\sum_{k = 1}^{M} y_{k} = \sum_{k = 1}^{M} x_{k}

. It is enough to observe that the order of all the inequalities are unchanged by adding a positive constant, so that

{y_{k}}

and

{x_{k}}

can be made positive, and also are unchanged by the multiplication of a positive constant, so that the resulting

{y_{k}}

and

{x_{k}}

sequences can be normalized.

Theorem 1.

Given a mean

M

with strictly monotonic function

g (x)

and two distributions

{α_{k}^{'}}, {α_{k}}

, the following propositions are equivalent:

(a) for all increasing functions

f (x)

M ({f (k)}, {α_{k}^{'}}) \leq M ({f (k)}, {α_{k}})

(8)

(b) for all increasing functions

f (x)

M ({f (k)}, {α_{M - k + 1}}) \leq M ({f (k)}, {α_{M - k + 1}^{'}})

(9)

(a’) for all decreasing functions

f (x)

M ({f (k)}, {α_{k}^{'}}) \geq M ({f (k)}, {α_{k}})

(10)

(b’) for all decreasing functions

f (x)

M ({f (k)}, {α_{M - k + 1}}) \geq M ({f (k)}, {α_{M - k + 1}^{'}})

(11)

(c) Condition (c) of Lemma 1 holds

Proof.

It is a direct consequence of Lemma 1 and the definition of quasi-arithmetic mean, observing that the inverse

g^{- 1} (x)

of a strictly monotonic increasing (respectively decreasing) function

g (x)

is also increasing (respectively decreasing). □

4. Invariance

Theorem 2

(Invariance). Given two distributions

{α_{k}^{'}}, {α_{k}}

, and two quasi-arithmetic means

M

,

M^{🟉}

, the following propositions are equivalent:

(a) for all increasing functions

f (x)

M ({f (k)}, {α_{k}^{'}}) \leq M ({f (k)}, {α_{k}})

(12)

(b) for all increasing functions

f (x)

M^{🟉} ({f (k)}, {α_{k}^{'}}) \leq M^{🟉} ({f (k)}, {α_{k}})

(13)

(a’) for all decreasing functions

f (x)

M ({f (k)}, {α_{k}^{'}}) \geq M ({f (k)}, {α_{k}})

(14)

(b’) for all decreasing functions

f (x)

M^{🟉} ({f (k)}, {α_{k}^{'}}) \geq M^{🟉} ({f (k)}, {α_{k}})

(15)

Proof.

It is a direct consequence of the observation that conditions (c) and (d) in Lemma 1 do not depend on any particular

g (x)

function considered, and thus the order of inequalities does not change with the mean considered as long as

{α_{k}}

and

{α_{k}^{'}}

are kept fixed. □

The following properties relate stochastic order with the quasi-arithmetic mean. Let

I

be the set of monotonous functions,

I^{<}

the set of increasing functions,

I^{>}

the set of decreasing functions.

Definition 3

(preserve mean order property). We say that a stochastic order preserves mean order for a given mean

M

and a set

I^{<} \subset I^{<}

of increasing functions when, for all functions

f (x) \in I^{<}

and any distributions

{α_{k}^{'}}, {α_{k}}

,

{α_{k}} ≺ {α_{k}^{'}} \Rightarrow M ({f (k)}, {α_{k}^{'}}) \leq M ({f (k)}, {α_{k}}) .

(16)

Definition 4

(preserve inverse mean order property). We say that a stochastic order preserves inverse mean order for a given mean

M

and a set

I^{>} \subset I^{>}

of decreasing functions when, for all functions

f (x) \in I^{>}

and any distributions

{α_{k}^{'}}, {α_{k}}

,

{α_{k}} ≺ {α_{k}^{'}} \Rightarrow M ({f (k)}, {α_{k}^{'}}) \geq M ({f (k)}, {α_{k}}) .

(17)

Theorem 2 together with the preserve mean order properties allows us to state the following invariance property:

Theorem 3

(preserve mean order invariance). Given a stochastic order that preserves mean order (respectively preserves inverse mean order) for a given mean and for

I^{<}

(respectively for

I^{>}

), then for any mean it preserves both mean order for

I^{<}

and inverse mean order for

I^{>}

. In other words, the preserve mean order properties are invariant with respect to the mean considered.

Observe that from Lemma 1 and Theorems 1 and 3, we have that a necessary and sufficient condition for an order to preserve mean order for

I^{<}

(or preserve inverse mean order for

I^{>}

) is the holding of Equation (6), independently of the mean considered. We will see in Section 6 that this corresponds to first stochastic dominance order.

5. Concavity and Convexity

Let us consider now

I_{v}^{<} \subset I^{<}

, the set of all increasing concave functions,

I_{x}^{<} \subset I^{<}

, the set of all increasing convex functions,

I_{v}^{>} \subset I^{>}

, the set of all decreasing concave functions, and

I_{x}^{>} \subset I^{>}

, the set of all decreasing convex functions. The following theorem relates the preserve mean order properties with the mirror property.

Theorem 4.

If an order holds the mirror property, then holding preserve mean order for

I_{v}^{<}

(respectively

I_{x}^{<}

) implies holding preserve inverse mean order for

I_{v}^{>}

(respectively

I_{x}^{>}

) and viceversa.

Proof.

Suppose

{α_{k}^{'}} ≺ {α_{k}}

and

f (x)

decreasing and concave (respectively convex). Then, by the mirror property

{α_{M - k + 1}} ≺ {α_{M - k + 1}^{'}}

, and by the hypothesis of the theorem

\begin{matrix} M ({f (k)}, {α_{k}}) & = & M ({f (M - k + 1)}, {α_{M - k + 1}}) = M ({f (m (k))}, {α_{M - k + 1}}) \\ \leq & M ({f (m (k))}, {α_{M - k + 1}^{'}}) = M ({f (k)}, {α_{k}^{'}}), \end{matrix}

(18)

where

m (x) = M - x + 1

, and because if

f (x)

is decreasing and concave (respectively convex) then

f (m (x))

is increasing and concave (respectively convex). □

The following result is necessary to prove Lemma 2.

Theorem 5.

Given

f (x)

and weights

{α_{k}^{'}}, {α_{k}}

, if inequality

M ({f (k)}, {α_{k}}) \leq M ({f (k)},

{α_{k}})

holds for any

g (x)

strictly increasing and convex (respectively concave) then it holds for any

g (x)

strictly decreasing and concave (respectively convex). If it holds for any

g (x)

strictly decreasing and convex (respectively concave) then it holds for any

g (x)

strictly increasing and concave (respectively convex).

Proof.

Consider the quasi-arithmetic mean with Kolmogorov function

g^{🟉} = - g (x)

(with inverse

g^{🟉 - 1} = g^{- 1} (- x)

). When

g (x)

is increasing,

g^{🟉} (x)

is decreasing and viceversa. When

g (x)

is convex,

g^{🟉} (x)

is concave and viceversa. We have

\begin{matrix} g^{🟉 - 1} (\sum_{i} α_{i}^{'} g^{🟉} (f (i))) & = & g^{- 1} (- (\sum_{i} α_{i}^{'} (- g (f (i))))) = g^{- 1} (\sum_{i} α_{i}^{'} g (f (i))) \\ \leq & g^{- 1} (\sum_{i} α_{i} g (f (i))) = g^{- 1} (- (\sum_{i} α_{i} (- g (f (i))))) = g^{🟉 - 1} (\sum_{i} α_{i} g^{🟉} (f (i))) \end{matrix}

(19)

□

The following Lemma is needed to prove how the invariance properties for one mean extends to other means.

Lemma 2.

Given two distributions

{α_{k}^{'}}, {α_{k}}

, and a quasi-arithmetic mean

M ({f (k)}, {α_{k}})

with function

g (x)

. Consider the following Equations:

M ({f (k)}, {α_{k}^{'}}) \leq M ({f (k)}, {α_{k}}),

(20)

and

\begin{matrix} M ({f (k)}, {α_{M - k + 1}}) \leq M ({f (k)}, {α_{M - k + 1}^{'}}) . \end{matrix}

(21)

Then, for each line in Table 1, and for

g (x), f (x)

filling the conditions in first and second column in Table 1 and holding Equations (20) and (21), Equations (20) and (21) hold too for

g^{🟉} (x), f^{🟉} (x)

filling the conditions in columns three and four.

Table 1. For each line, for

g (x), f (x)

filling the conditions in first and second column, then Equations (20) and (21) hold for

g^{🟉}, f^{🟉}

filling the conditions in columns three and four. By changing

f^{🟉}

from increasing to decreasing, the reverse of Equations (20) and (21) hold. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

Additionally, for each line in Table 1, changing

f^{🟉} (x)

in column four from increasing to decreasing, Equations (20) and (21) hold with the inequalities reversed.

Proof.

The proof of lines 1–8 in Table 1 is in the Appendix B. Lines 1’-8’ in Table 1 are a direct consequence of Theorem 5 applied to lines 1–8. □

Theorem 6.

Given a quasi-arithmetic mean

M ({f (k)}, {α_{k}})

with function

g (x)

, we have that if an order holds the mirror property then for each line in Table 1, and for

g (x)

filling the condition in the first column in Table 1 and preserving order for

f (x)

in the second column in Table 1, the mean for

g^{🟉} (x)

filling the condition in column three in Table 1 preserves order for

f^{🟉} (x)

filling the condition in column four in Table 1.

Proof.

It is enough to apply Lemma 2, taking into account that if Equation (20) holds then Equation (21) holds too by the mirror property. □

Corollary 1.

Consider the weighted arithmetic mean

A ({f (k)}, {α_{k}})

. Given an order that holds the mirror property and preserves the order for mean

A

, then

(a) If order is preserved for mean

A

and for

I_{v}^{<}

then order is preserved for any mean with concave-increasing/convex-decreasing (respectively concave-decreasing/convex-increasing) function

g (x)

and for

I_{v}^{<}

(respectively for

I_{x}^{<}

).

(b) If order is preserved for mean

A

and for

I_{x}^{<}

, then order is preserved for any mean with convex-increasing/concave-decreasing (respectively convex-decreasing/concave-increasing) function

g (x)

and for

I_{x}^{<}

(respectively for

I_{v}^{<}

).

Proof.

Arithmetic mean is a quasi-arithmetic mean with increasing function

g (x) = x

, which is both concave-increasing and convex-increasing, and thus Table 1 collapses to Table 2.

Table 2. For each line,

g (x) = x

,

f (x)

filling the conditions in second column and Equations (20) and (21) holding, then Equations (20) and (21) hold too for

g^{🟉}, f^{🟉}

filling the conditions in columns three and four. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

□

Functions

x^{p}

, for

p \geq 1

are convex-increasing and with

p \leq 0

are convex-decreasing over

R_{+ +}

,

e^{x}

convex-increasing over

R

, while

log x

,

x^{p}

, for

0 < p \leq 1

are concave-increasing over

R_{+ +}

. Affine functions are both concave and convex over

R

. If

g (x)

is convex,

- g (x)

is concave and viceversa, and the composition of concave-increasing and concave is concave, and convex-increasing and convex is convex. We will see in the next section that preserving the order for mean

A

for

I_{v}^{<}

is defined as second-order stochastic dominance or increasing concave order, and preserving the order for mean

A

for

I_{x}^{<}

is defined as increase convex order stochastic dominance. Both orders hold the mirror property and thus Corollary 1 applies to both.

6. Application to Stochastic Orders and Cross-Entropy

6.1. First-Order Stochastic Dominance

Originating in the economics risk literature [24], first-order stochastic dominance [1,2] FSD, between two probability distributions,

{α_{k}^{'}}

,

{α_{k}}

,

{α_{k}^{'}} ≺_{F S D} {α_{k}}

, is defined as:

Definition 5.

{α_{k}^{'}} ≺_{F S D} {α_{k}}

⇔ for any increasing function

f (x)

,

E {[f (k)]}_{α_{k}^{'}} \leq E {[f (k)]}_{α_{k}} .

Remember that the expected value is the arithmetic weighted mean.

A necessary and sufficient condition for first-order stochastic dominance is defined by the condition

c)

of Lemma 1, which is equivalent to condition

d)

of Lemma 1, thus

{α_{k}^{'}} ≺_{F S D} {α_{k}} \Rightarrow {α_{M - k + 1}} ≺_{F S D} {α_{M - k + 1}^{'}}

, i.e., the mirror property holds. From the definition of FSD, and from Theorem 3, we can redefine FSD,

{α_{k}^{'}} ≺_{F S D} {α_{k}}

as there exists a mean

M

such that, for any increasing function

f (x)

, Equation (16) holds. This is, the definition of FSD is independent of the mean considered, while the original definition relies on the expected value (arithmetic mean). The mean considered can be arithmetic, harmonic, geometric or any other quasi-arithmetic mean.

Let us consider now a strictly monotonous function

g (x)

, and define a generalized cross-entropy

GCE ({p_{i}}, {q_{i}}) = g^{- 1} (\sum_{i} p_{i} g (- log q_{i}))

. Observe that it is a quasi-arithmetic mean, and for

g (x) = x

, we get the cross-entropy

C E ({p_{i}}, {q_{i}}) = - \sum p_{i} log q_{i}

. Other functions that generalize cross entropy have been defined in the context of training deep neural networks [25]. We can state the following theorem:

Theorem 7.

Given distributions

{p_{i}^{'}}, {p_{i}}, {q_{i}}

, increasing, and

{p_{k}^{'}} ≺_{F S D} {p_{k}}

, then for any mean

GCE

with function

g (x)

,

GCE ({p_{i}}, {q_{i}}) \leq GCE ({p_{i}^{'}}, {q_{i}})

.

Proof.

Observe first that

f (k) = - log q_{k}

is a decreasing sequence. From the hypothesis, condition

c)

of Lemma 1 holds, thus we can then apply Theorem 1 to the mean with function

g (x)

and for

{f (k)}

decreasing. □

The following result relates the entropies of two distributions with the first stochastic order.

Theorem 8.

Given

{p_{k}}, {q_{k}}

increasing, and

{q_{k}} ≺_{F S D} {p_{k}}

, then

H ({p_{k}}) \leq H ({q_{k}})

, where

H

stands for Shannon entropy.

Proof.

The Kullback–Leibler distance is always positive [26],

\sum_{i} p_{i} log \frac{p_{i}}{q_{i}} \geq 0

, and thus we have that

H ({p_{k}}) = - \sum_{i} p_{i} log p_{i} \leq - \sum p_{i} log q_{i}

. Applying Theorem 7 for

g (x) = x

,

- \sum p_{i} log q_{i} \leq - \sum q_{i} log q_{i} = H ({q_{k}}) .

□

6.2. Second-Order Stochastic Dominance and Increase Convex Ordering

Definition 6.

Second-order stochastic dominance between two probability distributions,

{α_{k}}

,

{α_{k}^{'}}

,

{α_{k}^{'}} ≺_{S S D} {α_{k}}

, occurs when for any increasing concave function

f (x)

,

E {[f (k)]}_{α_{k}^{'}} \leq E {[f (k)]}_{α_{k}} .

Trivially

{α_{k}^{'}} ≺_{F S D} {α_{k}} \Rightarrow {α_{k}^{'}} ≺_{S S D} {α_{k}} .

Let us consider the cumulative distribution function,

F_{i} = α_{1} + α_{2} + \dots + α_{i}

, and the survival function

{\bar{F}}_{i} = 1 - F_{i} = α_{M} + α_{M - 1} + \dots + α_{i + 1}

. A necessary and sufficient condition for second-order stochastic dominance [1,2] is the following:

\begin{matrix} F_{1}^{'} & \geq & F_{1} \\ F_{1}^{'} + F_{2}^{'} & \geq & F_{1} + F_{2} \\ ⋮ \\ F_{1}^{'} + \dots + F_{M - 1}^{'} & \geq & F_{1} + \dots + F_{M - 1} \\ F_{1}^{'} + \dots + F_{M - 1}^{'} + F_{M}^{'} & \geq & F_{1} + \dots + F_{M - 1} + F_{M}, \end{matrix}

(22)

or equivalently,

\begin{matrix} {\bar{F}}_{1}^{'} & \leq & {\bar{F}}_{1} \\ {\bar{F}}_{1}^{'} + {\bar{F}}_{2}^{'} & \leq & {\bar{F}}_{1} + {\bar{F}}_{2} \\ ⋮ \\ {\bar{F}}_{1}^{'} + \dots + {\bar{F}}_{M - 1}^{'} & \leq & {\bar{F}}_{1} + \dots + {\bar{F}}_{M - 1} \\ {\bar{F}}_{1}^{'} + \dots + {\bar{F}}_{M - 1}^{'} + {\bar{F}}_{M}^{'} & \leq & {\bar{F}}_{1} + \dots + {\bar{F}}_{M - 1} + {\bar{F}}_{M} . \end{matrix}

(23)

Let us see first that

{α_{k}^{'}} ≺_{S S D} {α_{k}} \Rightarrow {α_{M - k + 1}} ≺_{S S D} {α_{M - k + 1}^{'}}

, i.e., SSD holds the mirror property. Effectively, define

G_{i} = α_{M} + α_{M - 1} + \dots + α_{i + 1}

, and the survival function

{\bar{G}}_{i} = 1 - G_{i} = α_{1} + α_{2} + \dots + α_{i}

. However,

G_{i} = {\bar{F}}_{i}

,

{\bar{G}}_{i} = F_{i}

,

G_{i}^{'} = {\bar{F}}^{'}_{i}

,

{\bar{G}}^{'}_{i} = F_{i}^{'}

, and substituting in Equation (22) or Equation (23), we obtain the desired result.

From Corollary 1, second-order stochastic dominance preserves mean order for all means

M

defined by a concave-increasing/convex-decreasing (respectively concave-decreasing/convex-increasing) function

g (x)

, and for the set of all concave-increasing functions

I_{v}^{<}

(respectively convex-increasing functions

I_{x}^{<}

). For instance, order is preserved for the geometric mean, or any mean with

g (x) = x^{p}

,

0 < p \leq 1

and for

I_{v}^{<}

, or for or any mean with

g (x) = x^{p}

with

p < 0

, as the harmonic mean, for

I_{x}^{<}

. In particular, we can state the following theorem about cross-entropy of two distributions,

Theorem 9.

Given distributions

{p_{i}^{'}}, {p_{i}}, {q_{i} = f (i)}

,

f (x)

concave-increasing, and

{p_{i}^{'}} ≺_{S S D} {p_{i}}

, then

C E ({p_{i}}, {q_{i}}) \leq C E ({p_{i}^{'}}, {q_{i}})

.

Proof.

We have that

C E ({p_{i}}, {q_{i}}) = - log G ({q_{i}}, {p_{i}}) = - log (Π_{i} q_{i}^{p_{i}}) = - log (exp (\sum_{i}

p_{i} log q_{i}))

, where

G

stands for geometric mean. The geometric mean is a quasi-arithmetic mean with function

g (x) = log x

, concave-increasing. Using Definition 6 and applying Corollary 1(a), we obtain the inequality

G ({q_{i}}, {p_{i}^{'}}) \leq G ({q_{i}}, {p_{i}})

and we apply the function

- log x

to both members of this inequality. □

When we consider

f (x)

, a convex instead of a concave function, we talk of increasing convex order, ICX. This is,

Definition 7.

Second-order stochastic dominance between two probability distributions,

{α_{k}}

,

{α_{k}^{'}}

,

{α_{k}^{'}} ≺_{I C X} {α_{k}}

, occurs when for any increasing convex function

f (x)

,

E {[f (k)]}_{α_{k}^{'}} \leq E {[f (k)]}_{α_{k}} .

{α_{k}}

is greater in increasing convex order than

{α_{k}^{'}}

,

{α_{k}^{'}} ≺_{I C X} {α_{k}}

, if and only if the following inequalities hold:

\begin{matrix} F_{M}^{'} & \geq & F_{M} \\ F_{M}^{'} + F_{M - 1}^{'} & \geq & F_{M} + F_{M - 1} \\ ⋮ \\ F_{M}^{'} + \dots + F_{2}^{'} & \geq & F_{M} + \dots + F_{2} \\ F_{M}^{'} + \dots + F_{2}^{'} + F_{1}^{'} & \geq & F_{M} + \dots + F_{2} + F_{1} \end{matrix}

(24)

or equivalently,

\begin{matrix} {\bar{F}}_{M}^{'} & \leq & {\bar{F}}_{M} \\ {\bar{F}}_{M}^{'} + {\bar{F}}_{M - 1}^{'} & \leq & {\bar{F}}_{M} + {\bar{F}}_{M - 1} \\ ⋮ \\ {\bar{F}}_{M}^{'} + \dots + {\bar{F}}_{2}^{'} & \leq & {\bar{F}}_{M} + \dots + {\bar{F}}_{2} \\ {\bar{F}}_{M}^{'} + \dots + {\bar{F}}_{2}^{'} + {\bar{F}}_{1}^{'} & \leq & {\bar{F}}_{M} + \dots + {\bar{F}}_{2} + {\bar{F}}_{1} \end{matrix}

(25)

Trivially

{α_{k}^{'}} ≺_{F S D} {α_{k}} \Rightarrow {α_{k}^{'}} ≺_{I C X} {α_{k}} .

Mirror property is also immediate. From Corollary 1, second-order stochastic dominance preserves mean order for all means

M

defined by a convex-increasing/concave-decreasing (respectively convex-decreasing/concave-increasing) function

g (x)

and for the set of all convex-increasing functions

I_{x}^{<}

(respectively concave-increasing functions

I_{v}^{<}

). For instance, any mean with

g (x) = x^{p}

,

p \geq 1

as the weighted quadratic mean with

I_{x}^{<}

, or

g (x) = x^{p}

,

p < 0

as the harmonic mean with

I_{v}^{<}

, or the mean with

g (x) = - log x

with

I_{v}^{<}

. In particular, we can state the following result,

Theorem 10.

Given distributions

{p_{i}^{'}}, {p_{i}}, {q_{i} = f (i)}

,

f (x)

concave-increasing, and

{p_{i}^{'}} ≺_{I C X} {p_{i}}

, then

C E ({p_{i}}, {q_{i}}) \leq C E ({p_{i}^{'}}, {q_{i}})

.

Proof.

Consider the quasi-arithmetic mean

G^{'} ({q_{i}}, {p_{i}})

with

g (x) = - log x

, convex-decreasing. We have that

C E ({p_{i}}, {q_{i}}) = - log G^{'} ({q_{i}}, {p_{i}})

. Using Definition 7 and applying Corollary 1(b), we have that

G^{'} ({q_{i}}, {p_{i}^{'}}) \leq G^{'} ({q_{i}}, {p_{i}})

, and we apply to each member of the inequality the function

- log x

. □

6.3. Likelihood Ratio Dominance

Definition 8.

Likelihood ratio dominance, LR,

{α_{k}^{'}} ≺_{L R} {α_{k}}

, is defined as

{α_{k}^{'}} ≺_{L R} {α_{k}} \Leftrightarrow \forall (i, j), 1 \leq i, j \leq M, i \leq j \Rightarrow α_{i}^{'} / α_{j}^{'} \geq α_{i} / α_{j} .

(26)

As

α_{i}^{'} / α_{j}^{'} \geq α_{i} / α_{j} \Rightarrow α_{j} / α_{i} \geq α_{j}^{'} / α_{i}^{'}

, we have that LR holds the mirror property.

It can be shown that LR order implies FSD order,

and then LR order holds Theorems 3 and 7 and Equation (16) holds for any mean

M

,

Theorem 11.

Likelihood-ratio order implies first stochastic order, i.e.,

{α_{k}^{'}} ≺_{L R} {α_{k}} \Rightarrow {α_{k}^{'}} ≺_{F S D} {α_{k}} .

(27)

Proof.

There are indirect proofs in the literature [1,17]. A direct proof is in the Appendix C. □

As the condition for LR order, Equation (26), is easy to check, this order comes very handy in proving sufficient condition for FSD order. Additionally, for the uniform distribution

{1 / N}

and any increasing distribution

{α_{k}}

, we have that

{1 / M} ≺_{L R} {α_{k}}

, while for any decreasing distribution

{α_{k}}

we have that

{α_{k}} ≺_{L R} {1 / M}

.

Consider Shannon entropy,

H ({p_{k}})

, and Rényi entropy

R_{β} ({p_{k}})

of a distribution

{p_{k}}

, that as seen in Section 3 are quasi-arithmetic means of the sequence

{1 / log p_{k}}

with weights

{p_{k}}

, and with

g (x) = x

and

g (x) = 2^{(β - 1) x}

, respectively. Without loss of generality, we can consider

{p_{k}}

in increasing order, then the sequence

{1 / log p_{k}}

will be decreasing. We have that

{1 / M} ≺ {p_{k}}

, and then we can apply Theorem 3, and obtain for Shannon entropy

H ({p_{k}}) \leq A ({1 / log p_{k}}),

where

A ({1 / log p_{k}})

is the arithmetic mean of

{1 / log p_{k}}

, and for Rényi entropy, observing that

g^{- 1} (x) = \frac{1}{1 - β} log x

,

R_{β} ({p_{k}}) \leq \frac{1}{1 - β} log \sum_{k} \frac{1}{M} p_{k}^{(β - 1)} = \frac{1}{1 - β} log A ({p_{k}^{(β - 1)}} .

For Tsallis entropy, which can be considered the weighted arithmetic mean of the sequence

{1 / {ln}_{q} p_{k}}

with weights

{p_{k}}

, as for

{p_{k}}

in increasing order the sequence

{1 / {ln}_{q} p_{k}}

will be decreasing (the derivative of

{ln}_{q} x

is positive), we have similarly to Shannon entropy,

S_{q} ({p_{k}}) \leq A ({1 / {ln}_{q} p_{k}}) .

We can also state the following result for the generalized cross entropy

GCE

. We say that

{p_{i}}

is comonotonic [27,28] with

{q_{i}}

when, for all

1 \leq i, j \leq M

,

q_{i} < q_{j} \Rightarrow p_{i} \leq p_{j}

.

Theorem 12.

Given distributions

{p_{i}}, {q_{i}}

comonotonic, then

GCE ({p_{i}}, {q_{i}}) \leq GCE ({1 / M},

{q_{i}})

.

Proof.

Without loss of generality, we can reorder

{p_{i}}, {q_{i}}

so that both are increasing. We know that

{1 / M} ≺_{L R} {p_{k}}

and thus

{1 / M} ≺_{F S D} {p_{k}}

. From the definition of

≺_{F S D}

, preserve mean order holds for increasing functions and for arithmetic mean, thus we can then apply Theorem 3 for

{- log q_{i}}

decreasing. □

From Theorem 12, when

{p_{i}}, {q_{i}}

are comonotonic, then

C E ({p_{i}}, {q_{i}}) \leq \frac{1}{M} \sum_{k = 1}^{M} - log q_{k} .

(28)

6.4. Hazard Rate

Definition 9.

A probability distribution

{α_{k}}

is greater in hazard rate, HR, to

{α_{k}^{'}}

,

{α_{k}^{'}} ≺_{H R} {α_{k}}

, if and only if for all

j \geq i

, the following condition is filled

\frac{{\bar{F}}^{'}_{i}}{{\bar{F}}_{i}} \geq \frac{{\bar{F}}^{'}_{j}}{{\bar{F}}_{j}} .

Observe that this can be written as

\frac{{\bar{F}}_{j}}{{\bar{F}}^{'}_{j}} \geq \frac{{\bar{F}}_{i}}{{\bar{F}}^{'}_{i}},

or

\frac{1 - F_{j}}{1 - {F^{'}}_{j}} \geq \frac{1 - F_{i}}{1 - {F^{'}}_{i}},

If we consider now the sequences written in inverse order, i.e.,

α_{M - i + 1}, α_{M - i + 1}^{'}

, it is clear that if

i \leq j

then

M - j + 1 \leq M - i + 1

, and from

1 - {\bar{F}}_{j} = {\bar{F}}_{M - j + 1}

, we get

\frac{{\bar{F}}_{M - j + 1}}{{\bar{F}}^{'}_{M - j + 1}} \geq \frac{{\bar{F}}_{M - i + 1}}{{\bar{F}}^{'}_{M - i + 1}},

which means

{α_{M - k + 1}} ≺_{H R} {α_{M - k + 1}^{'}}

, and HR order holds the mirror property.

It can be shown that

{α_{k}^{'}} ≺_{H R} {α_{k}} \Rightarrow {α_{k}^{'}} ≺_{F S D} {α_{k}}

and then HR order holds Theorems 3 and 7, and Equation (16) holds for any mean

M

.

7. Example: Linear Combination of Monte Carlo Techniques

When we want to estimate an integral

μ = \int h (x) d x

using MIS (multiple importance sampling) Monte Carlo methods, we have several choices of techniques, each of them with a given pdf

{p_{k} (x)}

,

1 \leq k \leq n

, which provide the primary estimators

{I_{k, 1} = \frac{h (x)}{p_{k} (x)}}

. If for

h (x) > 0

, we have that

p_{k} (x) > 0

, then the technique is unbiased. We are interested in optimal ways of combining the techniques. One option is linearly combining the different estimators

{I_{i, n_{i}}}

,

\sum_{i = 1}^{M} w_{i} I_{i, n_{i}}

, with weights

{w_{k}}

and sampling proportions

{n_{i} = α_{i} N}

, with

N = \sum_{i = 1}^{M} n_{i}

. If all techniques are unbiased, the resulting combination is also unbiased. The variance is given by

V_{N} = \sum_{i = 1}^{M} w_{i}^{2} \frac{v_{i}}{n_{i}} = \frac{1}{N} \sum_{i = 1}^{M} w_{i}^{2} \frac{v_{i}}{α_{i}} = \frac{1}{N} V,

(29)

where

v_{i}

are the variances for the primary estimators of each technique and V is the variance for the primary MIS estimator.

The optimal combination of weights, i.e. the one that leads to minimum variance, has been studied in [18,29,30].

The variance value will depend on two sets of weights,

{w_{i}}

and

{α_{i}}

, but there are cases where we can reduce it to a single set of weights. We present the following examples, where variances are taken for

N = 1

:

when $α_{i} = w_{i}$ , then $V = \sum_{i = 1}^{M} α_{i} v_{i}$ , the weighted arithmetic mean ( $g (x) = x$ ) of ${v_{i}}$ .
when the sampling proportions are fixed, the optimal variance is given by $H ({\frac{v_{k}}{α_{k}}})$ , the weighted harmonic mean ( $g (x) = 1 / x$ ) of ${v_{i}}$ .
when weights ${w_{i}}$ are fixed, the optimal variance is given by ${(\sum_{k = 1}^{M} w_{k} {\sqrt{v}}_{k})}^{2}$ , which is the weighted power mean of ${v_{i}}$ with exponent $r = 1 / 2$ ( $g (x) = x^{1 / 2}$ ).

Observe that the variance in these three cases is a quasi-arithmetic mean.

Let us now order

{v_{k}}

in increasing order, and let us take

α_{k}^{'} = h (v_{k})

,

h (x)

decreasing, and

α_{k} = 1 / M

. We have that

α_{k}^{'} = h (v_{k}) ≺_{L R} {α_{k} = 1 / M}

and thus, by Theorem 11,

α_{k}^{'} = h (v_{k}) ≺_{F S D} {α_{k} = 1 / M}

, and we can apply Theorem 2 with

f (k) = v_{k}

to the quasi-arithmetic means above. Thus, the variance is less when taking sampling proportions or coefficients decreasing in

v_{k}

than for equal sampling or equal weighting.

The same would be the case when considering a different measure of error, whenever the error of the combined techniques can also be expressed as a weighted quasi-arithmetic mean of the values of this measure for all techniques. For instance, suppose we use as measure of error the standard deviation,

σ_{i}^{2} = v_{i}

,

σ^{2} = V

, the above cases become

when $α_{i} = w_{i}$ , then $σ = \sqrt{V} = \sqrt{\sum_{i = 1}^{M} α_{i} σ_{i}^{2}}$ , the weighted root mean square or quadratic mean of ${σ_{i}}$ , or weighted power mean with $r = 2$ ( $g (x) = x^{2}$ ).
when the weights ${w_{i}}$ are fixed, the optimal standard deviation is given by $σ = \sqrt{V} = \sqrt{H ({\frac{σ_{k}^{2}}{α_{k}}})}$ , the power mean of ${σ_{i}}$ with $r = - 2$ ( $g (x) = x^{- 2}$ ).
when the sampling proportions are fixed, the optimal standard deviation is given by $σ = \sqrt{V} = \sum_{k = 1}^{M} w_{k} σ_{k}$ , the arithmetic mean of ${σ_{i}}$ ( $g (x) = x$ ).

For more details see [18].

Liabilities vs. Utilities

From the above example, we can study the relationship between utilities, which we try to maximize, and liabilities, that we try to minimize. As a liability can always be considered as the inverse of a utility, we can see from Theorem 1 that establishing invariance properties on the order between means of utilities is equivalent to doing this with liabilities, except that the order is inverted. We can define for instance the efficiency

E = 1 / V

as the inverse of the variance, and obtain it as a weighted mean of individual technique’s efficiencies,

e_{i} = 1 / v_{i}

. Consider, for instance, the first case in the example above, where

V = \sum_{i = 1}^{M} α_{i} v_{i}

. We have

E = \frac{1}{V} = \frac{1}{\sum_{i = 1}^{M} α_{i} v_{i}} = \frac{1}{\sum_{i = 1}^{M} α_{i} e_{i}^{- 1}} = H ({\frac{e_{i}}{α_{i}}}),

where

H

denotes the weighted harmonic mean.

Considering now the second case

E = \frac{1}{V} = \frac{1}{H ({\frac{v_{k}}{α_{k}}})} = \sum_{i = 1}^{M} α_{i} v_{i}^{- 1} = \sum_{i = 1}^{M} α_{i} e_{i},

or weighted arithmetic mean. For the third case,

E = \frac{1}{V} = \frac{1}{{(\sum_{k = 1}^{M} w_{k} {\sqrt{v}}_{k})}^{2}} = {(\sum_{k = 1}^{M} w_{k} e_{k}^{- 1 / 2})}^{- 2},

which is the weighted power mean with

r = - 1 / 2

.

8. Conclusions and Future Work

We have presented in this paper the relationship between stochastic orders and quasi-arithmetic means. We have proved several ordering invariance theorems, that show that given two distributions under a certain stochastic order, the ordering of the means is preserved for any quasi-arithmetic mean we might consider, this is, not only for the arithmetic mean (or expected value). We have shown how the results apply to first order, second order, likelihood ratio, hazard-rate, and increasing convex stochastic orders, and its application to cross-entropy. We have also presented an application example based on the linear combination of Monte Carlo estimators, and shown that the invariance allows costs or liabilities to be considered as the symmetric case of utilities.

In the future, we want to generalize our results to spatial weight matrices [31]. The rows in a spatial weight matrix are weights that give the influence of n entities over each other. Different weighted means such as arithmetic, harmonic, or geometric [32] can be used to compute this influence. We can thus apply our invariance results to each row. We will also investigate which of our results for Shannon entropy extend to Tsallis entropy too. Both Shannon and Tsallis entropy are weighted arithmetic means [23], and given

{p_{k}}

monotonic, both

{1 / log p_{k}}

and

{1 / {ln}_{q} p_{k}}

are monotonic too. Finally, we will investigate the invariance of the different stochastic orders under the operations of compositional data [3].

Author Contributions

Conceptualization, M.S. and S.C.; methodology, M.S. and J.P.; validation, J.P. and V.E.; writing—original draft preparation, M.S.; writing—review and editing, all authors. All authors have read and agreed to the published version of the manuscript.

Funding

Mateu Sbert and Jordi Poch are funded in part by grant PID2019-106426RB-C31 from the Spanish Government. Víctor Elvira was partially supported by Agence Nationale de la Recherche of France under PISCES project (ANR-17-CE40-0031-01).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Lemma 1

Proof.

Subtracting 1 from both sides of each inequality proves

(c) \Leftrightarrow (d)

. To prove

(a) \Rightarrow (c)

, we proceed in the following way. Consider the increasing sequence

{g (f (1)), \dots, g (f (1)),

g (f (M)), \dots, g (f (M))}

,

f (1) < f (M)

(and thus

g (f (1)) < g (f (M))

by the strict monotonicity of

g (x)

), and where

g (f (1))

is written l times, denote

L = a_{1} + \dots + a_{l}

,

L^{'} = a_{1}^{'} + \dots + a_{l}^{'}

. Since

a_{l + 1} + \dots + a_{M} = 1 - L

,

a_{l + 1}^{'} + \dots + a_{M}^{'} = 1 - L^{'}

, then (a) gives

L^{'} g (f (1)) + (1 - L^{'}) g (f (M)) - L g (f (1)) - (1 - L) g (f (M)) \leq 0,

i.e.,

(L^{'} - L) (g (f (1)) - g (f (M))) \leq 0 \Rightarrow L^{'} \geq L .

This proves the first

M - 1

inequalities in c, thus

(a) \Rightarrow (c)

. Observe now that

L^{'} \geq L \Rightarrow 1 - L \geq 1 - L^{'},

and thus

(a) \Rightarrow (d)

(although this can be seen directly from

(a) \Rightarrow (c)

and

(c) \Leftrightarrow (d)

. To prove that

(b)

implies

(c)

and

(d)

, consider the sequence,

{g (f (1)), \dots, g (f (1)), g (f (M)), \dots, g (f (M))},

f (1) < f (M)

,

g (f (M))

is written l times, and the same definitions as before for

L, L^{'}

, then

(b)

gives

(1 - L) g (f (1)) + L g (f (M)) - (1 - L^{'}) g (f (1)) - L^{'} g (f (M)) \leq 0,

and we proceed as above. Thus,

(a) \Rightarrow (c)

.

Let us see now that

(c)

implies

(a)

.

Define for

1 \leq k \leq M

,

A_{k} = \sum_{j = 1}^{k} α_{k}, A_{k}^{'} = \sum_{j = 1}^{k} α_{k}^{'}

, and

A_{0} = A_{0}^{'} = 0

. Then,

\begin{matrix} \sum_{k = 1}^{M} α_{k} g (f (k)) - \sum_{k = 1}^{M} α_{k}^{'} g (f (k)) = \sum_{k = 1}^{M} (α_{k} - α_{k}^{'}) g (f (k)) \\ = & \sum_{k = 1}^{M} (A_{k} - A_{k - 1} - {A^{'}}_{k} + {A^{'}}_{k - 1}) g (f (k)) \\ = & \sum_{k = 1}^{M} (A_{k} - {A^{'}}_{k}) g (f (k)) - \sum_{k = 1}^{M} (A_{k - 1} - {A^{'}}_{k - 1}) g (f (k)) \\ = & \sum_{k = 1}^{M - 1} (A_{k} - {A^{'}}_{k}) g (f (k)) - \sum_{k = 0}^{M - 1} (A_{k} - {A^{'}}_{k}) g (f (k + 1)) \\ = & \sum_{k = 1}^{M - 1} (A_{k} - {A^{'}}_{k}) g (f (k)) - \sum_{k = 1}^{M - 1} (A_{k} - {A^{'}}_{k}) g (f (k + 1)) \\ = & \sum_{k = 1}^{M - 1} (A_{k} - {A^{'}}_{k}) (g (f (k)) - g (f (k + 1))) \geq 0, \end{matrix}

(A1)

as

(c

) implies that for all k,

A_{k}^{'} - A_{k} \geq 0

, and

{g (f (k))}

is an increasing sequence. Thus,

(c) \Rightarrow (a)

.

Repeating the proof for the sequences

{α_{M - k + 1}}

and

{α_{M - k + 1}^{'}}

, we obtain that

(d) \Rightarrow (b)

.

Let

g (x)

be decreasing. Observe that

\sum_{k = 1}^{M} α_{M - k + 1} g (f (k)) = \sum_{k = 1}^{M} α_{M - k + 1} g (f (m (M - k + 1))) = \sum_{k = 1}^{M} α_{k} g (f (m (k))),

where

m (k) = M - k + 1

, and thus

(a) \Rightarrow (b^{'})

because

{g (f (m (k)))}

is an increasing sequence. Now suppose

g (x)

is increasing, from

\sum_{k = 1}^{M} α_{k} g (f (k)) = \sum_{k = 1}^{M} α_{M - k + 1} g (f (M - k + 1)) = \sum_{k = 1}^{M} α_{M - k + 1} g (f (m (k))),

we have that

(b^{'}) \Rightarrow (a)

because

{g (f (m (k)))}

is a decreasing sequence. We can show similarly that

(b) \Leftrightarrow (a^{'})

.

Consider now

f (x)

decreasing. Reversed

(a)

is

\sum_{k = 1}^{M} α_{k} g (f (k)) \leq \sum_{k = 1}^{M} α_{k}^{'} g (f (k)),

(A2)

but it can be written as

\begin{matrix} \sum_{k = 1}^{M} α_{M - k + 1} g (f (M - k + 1)) \leq \sum_{k = 1}^{M} α_{M - k + 1}^{'} g (f (M - k + 1)) \\ = & \sum_{k = 1}^{M} α_{M - k + 1} g (f (m (k))) \leq \sum_{k = 1}^{M} α_{M - k + 1}^{'} g (f (m (k))), \end{matrix}

(A3)

where

{g (f (m (k)))}

is an increasing sequence, and thus

(c) \Rightarrow (a)

when

f (x)

is decreasing and for order of inequality in

(a)

reversed. The other cases for

f (x)

decreasing can be analogously proven. □

Appendix B. Proof of Lines 1–8 from Table 1 in Lemma 2

Proof.

Let us prove first Lines 1–4 from Table 1.

Let

g (x)

be the strictly monotonic function associated with mean

M

. Consider the following inequalities. For

g (x)

increasing,

g^{🟉}

increasing:

\begin{matrix} \sum_{k = 1}^{M} α_{k}^{'} g^{🟉} (f^{🟉} (k)) & = & \sum_{k = 1}^{M} α_{k}^{'} g (g^{- 1} (g^{🟉} (f^{🟉} (k)))) \\ \leq & \sum_{k = 1}^{M} α_{k} g (g^{- 1} (g^{🟉} (f^{🟉} (k)))) = \sum_{k = 1}^{M} α_{k} g^{🟉} (f^{🟉} (k)) . \end{matrix}

(A4)

For

g (x)

decreasing,

g^{🟉}

decreasing:

\begin{matrix} \sum_{k = 1}^{M} α_{k} g^{🟉} (f^{🟉} (k)) & = & \sum_{k = 1}^{M} α_{k} g (g^{- 1} (g^{🟉} (f^{🟉} (k)))) \\ \leq & \sum_{k = 1}^{M} α_{k}^{'} g (g^{- 1} (g^{🟉} (f^{🟉} (k)))) \leq \sum_{k = 1}^{M} α_{k}^{'} g^{🟉} (f^{🟉} (k)) . \end{matrix}

(A5)

Observe that the central inequalities in Equations (A4) and (A5) are true whenever Equation (20) holds and viceversa. We are interested in considering all possibilities for it to hold, when

f^{🟉}

and

f (x) = g^{- 1} (h (x))

are increasing, with

h (x) = g^{🟉} (f^{🟉} (x))

. We need first to study the convexity/concavity of the function

g^{- 1} (h (x))

. Let us suppose all functions are in

C^{2}

, then

h^{'} (x) = {g^{🟉}}^{'} (f^{🟉} (x)) {f^{🟉}}^{'} (x)

, and

h^{″} (x) = {g^{🟉}}^{″} (f^{🟉} (x)) {({f^{🟉}}^{'} (x))}^{2} + {g^{🟉}}^{'} (f^{🟉} (x)) {f^{🟉}}^{″} (x)

. Dropping the arguments for convenience, we have

h^{″} = {g^{🟉}}^{″} {({f^{🟉}}^{'})}^{2} + {g^{🟉}}^{'} {f^{🟉}}^{″}

. For

h^{″}

to be positive (convex)

{g^{🟉}}^{″}

has to be positive (convex) and

{g^{🟉}}^{'}, {f^{🟉}}^{″}

have to be both positive or negative (thus

g^{🟉}

convex-increasing, f convex, or

g^{🟉}

convex-decreasing, f concave). For

h^{″}

to be negative,

{g^{🟉}}^{″}

has to be negative (concave) and

{g^{🟉}}^{'}, {f^{🟉}}^{″}

have to have a different sign (thus

g^{🟉}

concave-increasing, f concave, or

g^{🟉}

concave-decreasing, f convex). For

g^{- 1} (h (x))

, we can apply the same rule. Observe also that the composition of two increasing or two decreasing functions is increasing, and increasing and decreasing or viceversa is decreasing. For the inverse function, as

(g^{- 1} (g (x)) = x

we have that

{g^{- 1}}^{'} (g (x)) g^{'} (x) = 1

,

{g^{- 1}}^{″} (g (x)) {(g^{'} (x))}^{2} + {g^{- 1}}^{'} (g (x)) g^{″} (x) = 0

, which isolating

{g^{- 1}}^{″} (g (x))

and dropping arguments becomes

{g^{- 1}}^{″} = - {g^{- 1}}^{'} g^{″}

, and thus for

g^{- 1}

convex-increasing (concave-increasing), the inverse is concave-increasing (convex-increasing), while for

g^{- 1}

convex-decreasing (concave-decreasing), the inverse is convex-decreasing (concave-decreasing). In Table A1, we show the different possibilities for

f (x)

to be increasing when

f^{🟉}

is increasing.

□

Table A1. Different possible combinations where the concavity/convexity of

g^{- 1} (h (x))

can be predicted for

g (x)

and

f^{🟉}, f (x)

increasing. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

Table A1. Different possible combinations where the concavity/convexity of

g^{- 1} (h (x))

can be predicted for

g (x)

and

f^{🟉}, f (x)

increasing. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

	g	$g^{- 1}$	$g^{🟉}$	$f^{🟉}$	$h = g^{🟉} \circ f^{🟉}$	$f = g^{- 1} \circ h$
1	ICX	ICV	ICV	ICV	ICV	ICV
2	DCX	DCX	DCV	ICX	DCV	ICX
3	ICV	ICX	ICX	ICX	ICX	ICX
4	DCV	DCV	DCX	ICV	DCX	ICV

Let us prove now Lines 5–8 from Table 1.

Consider the following inequalities, where

m (x) = M - x + 1

, for

g (x)

increasing,

g^{🟉}

decreasing,

\begin{matrix} \sum_{k = 1}^{M} α_{k} g^{🟉} (f^{🟉} (k)) & = & \sum_{k = 1}^{M} α_{M - k + 1} g^{🟉} (f^{🟉} (M - k + 1)) \\ = & \sum_{k = 1}^{M} α_{M - k + 1} g (g^{- 1} (g^{🟉} (f^{🟉} (m (k))))) \leq \sum_{k = 1}^{M} α_{M - k + 1}^{'} g (g^{- 1} (g^{🟉} (f^{🟉} (m (k))))) \\ = & \sum_{k = 1}^{M} α_{M - k + 1}^{'} g^{🟉} (f^{🟉} (M - k + 1)) = \sum_{k = 1}^{M} α_{k}^{'} g^{🟉} (f^{🟉} (k)), \end{matrix}

(A6)

and for

g (x)

decreasing,

g^{🟉}

increasing

\begin{matrix} \sum_{k = 1}^{M} α_{k}^{'} g^{🟉} (f^{🟉} (k)) & = & \sum_{k = 1}^{M} α_{M - k + 1}^{'} g^{🟉} (f^{🟉} (M - k + 1)) \\ = & \sum_{k = 1}^{M} α_{M - k + 1}^{'} g (g^{- 1} (g^{🟉} (f^{🟉} (m (k))))) \leq \sum_{k = 1}^{M} α_{M - k + 1} g (g^{- 1} (g^{🟉} (f^{🟉} (m (k))))) \\ = & \sum_{k = 1}^{M} α_{M - k + 1} g^{🟉} (f^{🟉} (M - k + 1)) = \sum_{k = 1}^{M} α_{k} g^{🟉} (f^{🟉} (k)) . \end{matrix}

(A7)

Observe that the central inequalities in Equations (A6) and (A7) are true whenever Equation (21) holds and viceversa. We are interested in considering all possibilities for it to hold, when

f^{🟉}

and

f (x) = g^{- 1} (h (x))

are increasing, with

h (x) = g^{🟉} (f^{🟉} (m (x)))

. As before, we need first to study the convexity/concavity of the function

g^{- 1} (h (x))

.

Consider the function

h (x) = g^{🟉} (f^{🟉} (m (x))

. The second derivative is, as

m^{'} (x) = - 1

and

m^{″} (x) = 0

,

h^{″} (x) = {g^{🟉}}^{″} (f^{🟉} (m (x)) {f^{🟉}}^{' 2} (m (x)) + {g^{🟉}}^{'} (f^{🟉} (m (x)) {f^{🟉}}^{″} (m (x))

. Let us drop the arguments for convenience, then

h^{″} = {g^{🟉}}^{″} {f^{🟉}}^{' 2} + {g^{🟉}}^{'} {f^{🟉}}^{″}

. Observe that it is the same result obtained above. In Table A2, we see the possible combinations for

f (x)

to be increasing when

f^{🟉} (x)

is increasing. The difference with Table A1 is that we have to consider here

m (x)

.

Table A2. Different possible combinations where the concavity/convexity of

g^{- 1} (h (x))

can be predicted and

f (x)

is increasing when

g (x)

is increasing and

g^{🟉} (x)

decreasing (or viceversa) when

f^{🟉}

is increasing. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

Table A2. Different possible combinations where the concavity/convexity of

g^{- 1} (h (x))

can be predicted and

f (x)

is increasing when

g (x)

is increasing and

g^{🟉} (x)

decreasing (or viceversa) when

f^{🟉}

is increasing. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

	g	$g^{- 1}$	$g^{🟉}$	$f^{🟉}$	$f^{🟉} (m)$	$h = g^{🟉} \circ f^{🟉} \circ m$	$f = g^{- 1} \circ h$
5	ICX	ICV	DCV	ICX	DCX	ICV	ICV
6	DCX	DCX	ICV	ICV	DCV	DCV	ICX
7	ICV	ICX	DCX	ICV	DCV	ICX	ICX
8	DCV	DCV	ICX	ICX	DCX	DCX	ICV

Then, we can summarize the results from Table A1 and Table A2 in Table 1. This will do to prove Equation (20). The results for Equation (21) are obtained as above by substituting

{α_{k}}, {α_{k}^{'}}

by

{α_{M - k + 1}^{'}}, {α_{M - k + 1}}

.

Suppose now

f^{🟉} (x)

is decreasing and that Equations (20) and Equation (21) hold for

g^{🟉} (x)

and for

f^{🟉} (x)

increasing. Then, the following inequalities, for

g^{🟉} (x)

increasing,

\begin{matrix} \sum_{k} α_{k} g^{🟉} (f^{🟉} (k)) & = & \sum_{k} α_{M - k + 1} g^{🟉} (f^{🟉} (m (k))) \\ \leq & \sum_{k} α_{M - k + 1}^{'} g^{🟉} (f^{🟉} (m (k))) = \sum_{k} α_{k}^{'} g^{🟉} (f^{🟉} (k)), \end{matrix}

(A8)

\begin{matrix} \sum_{k} α_{M - k + 1}^{'} g^{🟉} (f^{🟉} (k)) & = & \sum_{k} α_{k}^{'} g^{🟉} (f^{🟉} (m (k))) \\ \leq & \sum_{k} α_{k} g^{🟉} (f^{🟉} (m (k))) = \sum_{k} α_{M - k + 1} g^{🟉} (f^{🟉} (k)), \end{matrix}

(A9)

and for

g^{🟉} (x)

decreasing,

\begin{matrix} \sum_{k} α_{k}^{'} g^{🟉} (f^{🟉} (k)) & = & \sum_{k} α_{M - k + 1}^{'} g^{🟉} (f^{🟉} (m (k))) \\ \leq & \sum_{k} α_{M - k + 1} g^{🟉} (f^{🟉} (m (k))) = \sum_{k} α_{k} g^{🟉} (f^{🟉} (k)), \end{matrix}

(A10)

\begin{matrix} \sum_{k} α_{M - k + 1} g^{🟉} (f^{🟉} (k)) & = & \sum_{k} α_{k} g^{🟉} (f^{🟉} (m (k))) \\ \leq & \sum_{k} α_{k}^{'} g^{🟉} (f^{🟉} (m (k))) = \sum_{k} α_{M - k + 1}^{'} g^{🟉} (f^{🟉} (k)), \end{matrix}

(A11)

hold because

f^{🟉} (m (x))

is an increasing function. Thus, Equations (20) and (21) hold with the direction of inequality reversed. Observe also that

f^{🟉} (m (x))

will have the same character of convexity/concavity than

f^{🟉} (x)

. Applying Equations (A8)–(A11), according to whether

g^{🟉}

is increasing or decreasing, to each line of Table 1, we obtain the desired result.

Appendix C. Proof of Theorem 11

Proof.

First, note that if

\frac{α_{i}^{'}}{α_{j}^{'}} \geq \frac{α_{i}}{α_{j}}

for

i \leq j

, then

α_{1}^{'} \geq α_{1}

and

α_{M}^{'} \leq α_{M}

, see [17]. Let us proceed now by induction. For

M = 2

, the condition

c)

of Lemma 1 is true, because

α_{1}^{'} \geq α_{1}

, and

α_{1}^{'} + α_{2}^{'} = α_{1} + α_{2}

. Let us suppose that it is true for

M - 1

. If

{α_{k}^{'}}_{k = 1}^{M} ≺_{L R} {α_{k}}_{k = 1}^{M}

, then it is immediate that

{α_{k}^{'} / \sum_{k = 1}^{M - 1} α_{k}^{'}}_{k = 1}^{M - 1} ≺_{L R} {α_{k} / \sum_{k = 1}^{M - 1} α_{k}}_{k = 1}^{M - 1}

, and by the induction hypothesis, the condition

c)

of Lemma 1 holds, and thus

\begin{matrix} α_{1}^{'} / \sum_{k = 1}^{M - 1} α_{k}^{'} & \geq & α_{1} / \sum_{k = 1}^{M - 1} α_{k} \\ (α_{1}^{'} + α_{2}^{'}) / \sum_{k = 1}^{M - 1} α_{k}^{'} & \geq & (α_{1} + α_{2}) / \sum_{k = 1}^{M - 1} α_{k} \\ \dots & \geq & \dots \\ (α_{1}^{'} + α_{2}^{'} + \dots + α_{M - 1}^{'}) / \sum_{k = 1}^{M - 1} α_{k}^{'} & = & (α_{1} + α_{2} + \dots + α_{n - 1}) / \sum_{k = 1}^{M - 1} α_{k} . \end{matrix}

(A12)

However, we know that

α_{M}^{'} \leq α_{M}

, and thus

\sum_{k = 1}^{M - 1} α_{k}^{'} \geq \sum_{k = 1}^{M - 1} α_{k}

because

\sum_{k = 1}^{M - 1} α_{k}^{'} = \sum_{k = 1}^{M - 1} α_{k} = 1

, and thus we obtain condition

c)

of Lemma 1. □

References

Belzunce, F.; Martinez-Riquelme, C.; Mulero, J. An Introduction to Stochastic Orders; Academic Press: Cambridge, MA, USA, 2016; pp. i–ii. [Google Scholar] [CrossRef]
Shaked, M.; Shanthikumar, G. Stochastic Orders; Springer: Berlin/Heidelberg, Germany, 2007; 474p. [Google Scholar] [CrossRef]
Pawlowsky-Glahn, V.; Egozcue, J.; Tolosana-Delgado, R. Modeling and Analysis of Compositional Data; J. Wiley & Sons: Hoboken, NJ, USA, 2015. [Google Scholar]
Levy, H. Stochastic Dominance and Expected Utility: Survey and Analysis. Manag. Sci. 1992, 38, 555–593. [Google Scholar] [CrossRef]
Bullen, P. Handbook of Means and Their Inequalities; Springer Science+Business Media: Berlin/Heidelberg, Germany, 2003. [Google Scholar]
Coggeshall, F. The Arithmetic, Geometric, and Harmonic Means. Q. J. Econ. 1886, 1, 83–86. [Google Scholar] [CrossRef]
Fernández-Villaverde, J. The Econometrics of DSGE Models. SERIEs 2010, 1, 3–49. [Google Scholar] [CrossRef]
Wikipedia Contributors. Constant Elasticity of Substitution. Available online: https://en.wikipedia.org/wiki/Constant_elasticity_of_substitution (accessed on 14 April 2021).
Yoshida, Y. Weighted Quasi-Arithmetic Means and a Risk Index for Stochastic Environments. Int. J. Uncertainty, Fuzziness -Knowl.-Based Syst. 2011, 19, 1–16. [Google Scholar] [CrossRef]
Yoshida, Y. Weighted Quasi-Arithmetic Means: Utility Functions and Weighting Functions. In Modeling Decisions for Artificial Intelligence, MDAI 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 25–36. [Google Scholar]
Rényi, A. On Measures of Entropy and Information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: Berkeley, CA, USA, 1961; Volume 1, pp. 547–561. [Google Scholar]
Américo, A.; Khouzani, M.; Malacaria, P. Conditional Entropy and Data Processing: An Axiomatic Approach Based on Core-Concavity. IEEE Trans. Inf. Theory 2020, 66, 5537–5547. [Google Scholar] [CrossRef]
Wikipedia Contributors. Series and Parallel Springs. Available online: https://en.wikipedia.org/wiki/Series_and_parallel_springs (accessed on 2 April 2021).
Wikipedia Contributors. Resistor. Available online: https://en.wikipedia.org/wiki/Resistor (accessed on 2 April 2021).
Wikibooks Contributors. Fundamentals of Transportation/Traffic Flow. 2017. Available online: https://en.wikibooks.org/wiki/Fundamentals_of_Transportation/Traffic_Flow (accessed on 18 January 2018).
Padois, T.; Doutres, O.; Sgard, F.; Berry, A. On the use of geometric and harmonic means with the generalized cross-correlation in the time domain to improve noise source maps. J. Acoust. Soc. Am. 2016, 140. [Google Scholar] [CrossRef]
Sbert, M.; Poch, J. A necessary and sufficient condition for the inequality of generalized weighted means. J. Inequalities Appl. 2016, 2016, 292. [Google Scholar] [CrossRef]
Sbert, M.; Havran, V.; Szirmay-Kalos, L.; Elvira, V. Multiple importance sampling characterization by weighted mean invariance. Vis. Comput. 2018, 34, 843–852. [Google Scholar] [CrossRef]
Sbert, M.; Poch, J.; Chen, M.; Bardera, A. Some Order Preserving Inequalities for Cross Entropy and Kullback-Leibler Divergence. Entropy 2018, 20, 959. [Google Scholar] [CrossRef]
Sbert, M.; Ancuti, C.; Ancuti, C.O.; Poch, J.; Chen, S.; Vila, M. Histogram Ordering. IEEE Access 2021, 9, 28785–28796. [Google Scholar] [CrossRef]
Sbert, M.; Yoshida, Y. Stochastics orders on two-dimensional space: Application to cross entropy. In Modeling Decisions for Artificial Intelligence - MDAI 2020; Torra, V., Narukawa, Y., Nin, J., Agell, N., Eds.; Springer: Berlin/Heidelberg, Germany, 2020. [Google Scholar]
Yoshida, Y. Weighted quasi-arithmetic means on two-dimensional regions: An independent case. In Modeling Decisions for Artificial Intelligence - MDAI 2016; Torra, V., Narukawa, Y., Eds.; Springer: Berlin/Heidelberg, Germany, 2016; pp. 82–93. [Google Scholar]
Tsallis, C.; Baldovin, F.; Cerbino, R.; Pierobon, P. Introduction to Nonextensive Statistical Mechanics and Thermodynamics. Available online: http://arXiv:cond-mat/0309093v1 (accessed on 13 May 2021).
Hadar, J.; Russell, W. Rules for Ordering Uncertain Prospects. Am. Econ. Rev. 1969, 59, 25–34. [Google Scholar]
Zhang, Z.; Sabuncu, M.R. Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels. arXiv 2018, arXiv:1805.07836. [Google Scholar]
Cover, T.M.; Thomas, J.A. Elements of Information Theory; John Wiley & Sons: New York, NY, USA, 2006. [Google Scholar]
Dellacherie, C. Quelques commentarires sur les prolongements de capacités. Available online: http://www.numdam.org/article/SPS_1971__5__77_0.pdf (accessed on 22 May 2021).
Renneberg, D. Non Additive Measure and Integral; Kluwer Academic Publ.: Dordrecht, The Netherlands, 1994. [Google Scholar]
Havran, V.; Sbert, M. Optimal Combination of Techniques in Multiple Importance Sampling. In Proceedings of the 13th ACM SIGGRAPH International Conference on Virtual-Reality Continuum and Its Applications in Industry, VRCAI ’14; ACM: New York, NY, USA, 2014; pp. 141–150. [Google Scholar] [CrossRef]
Sbert, M.; Havran, V. Adaptive multiple importance sampling for general functions. Vis. Comput. 2017, 33, 845–855. [Google Scholar] [CrossRef]
Zhou, X.; Lin, H. Spatial Weights Matrix. In Encyclopedia of GIS; Springer: Boston, MA, USA, 2008; p. 1113. [Google Scholar] [CrossRef]
Smith, M.J.D.; Goodchild, M.F.; Longley, P. Geospatial Analysis: A Comprehensive Guide to Principles, Techniques and Software Tools; Troubador Publishing Ltd: Leicester, UK, 2015. [Google Scholar]

Table 1. For each line, for

g (x), f (x)

filling the conditions in first and second column, then Equations (20) and (21) hold for

g^{🟉}, f^{🟉}

filling the conditions in columns three and four. By changing

f^{🟉}

from increasing to decreasing, the reverse of Equations (20) and (21) hold. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

Table 1. For each line, for

g (x), f (x)

filling the conditions in first and second column, then Equations (20) and (21) hold for

g^{🟉}, f^{🟉}

filling the conditions in columns three and four. By changing

f^{🟉}

from increasing to decreasing, the reverse of Equations (20) and (21) hold. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

	$g$	$f$	$g^{🟉}$	$f^{🟉}$
1	ICX	ICV	ICV	ICV
1’	"	"	DCX	ICV
2	DCX	ICX	DCV	ICX
2’	"	"	ICX	ICX
3	ICV	ICX	ICX	ICX
3’	"	"	DCV	ICX
4	DCV	ICV	DCX	ICV
4’	"	"	ICV	ICV
5	ICX	ICV	DCV	ICX
5’	"	"	ICX	ICX
6	DCX	ICX	ICV	ICV
6’	"	"	DCX	ICV
7	ICV	ICX	DCX	ICV
7’	"	"	ICV	ICV
8	DCV	ICV	ICX	ICX
8’	"	"	DCV	ICX

Table 2. For each line,

g (x) = x

,

f (x)

filling the conditions in second column and Equations (20) and (21) holding, then Equations (20) and (21) hold too for

g^{🟉}, f^{🟉}

filling the conditions in columns three and four. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

Table 2. For each line,

g (x) = x

,

f (x)

filling the conditions in second column and Equations (20) and (21) holding, then Equations (20) and (21) hold too for

g^{🟉}, f^{🟉}

filling the conditions in columns three and four. ICX: convex and increasing, ICV: concave and increasing, DCX: convex and decreasing, DCV: concave and decreasing.

$f$	$g^{🟉}$	$f^{🟉}$
ICV	ICV	ICV
ICV	DCX	ICV
ICX	ICX	ICX
ICX	DCV	ICX
ICV	DCV	ICX
ICV	ICX	ICX
ICX	DCX	ICV
ICX	ICV	ICV

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Article Metrics

Citations

Article Access Statistics

Journal Statistics

Multiple requests from the same IP address are counted as one view.

	g	$g^{- 1}$	$g^{🟉}$	$f^{🟉}$	$h = g^{🟉} \circ f^{🟉}$	$f = g^{- 1} \circ h$
1	ICX	ICV	ICV	ICV	ICV	ICV
2	DCX	DCX	DCV	ICX	DCV	ICX
3	ICV	ICX	ICX	ICX	ICX	ICX
4	DCV	DCV	DCX	ICV	DCX	ICV

	g	$g^{- 1}$	$g^{🟉}$	$f^{🟉}$	$f^{🟉} (m)$	$h = g^{🟉} \circ f^{🟉} \circ m$	$f = g^{- 1} \circ h$
5	ICX	ICV	DCV	ICX	DCX	ICV	ICV
6	DCX	DCX	ICV	ICV	DCV	DCV	ICX
7	ICV	ICX	DCX	ICV	DCV	ICX	ICX
8	DCV	DCV	ICX	ICX	DCX	DCX	ICV

$f$	$g^{🟉}$	$f^{🟉}$
ICV	ICV	ICV
ICV	DCX	ICV
ICX	ICX	ICX
ICX	DCV	ICX
ICV	DCV	ICX
ICV	ICX	ICX
ICX	DCX	ICV
ICX	ICV	ICV

	g	$g^{- 1}$	$g^{🟉}$	$f^{🟉}$	$h = g^{🟉} \circ f^{🟉}$	$f = g^{- 1} \circ h$
1	ICX	ICV	ICV	ICV	ICV	ICV
2	DCX	DCX	DCV	ICX	DCV	ICX
3	ICV	ICX	ICX	ICX	ICX	ICX
4	DCV	DCV	DCX	ICV	DCX	ICV

	g	$g^{- 1}$	$g^{🟉}$	$f^{🟉}$	$f^{🟉} (m)$	$h = g^{🟉} \circ f^{🟉} \circ m$	$f = g^{- 1} \circ h$
5	ICX	ICV	DCV	ICX	DCX	ICV	ICV
6	DCX	DCX	ICV	ICV	DCV	DCV	ICX
7	ICV	ICX	DCX	ICV	DCV	ICX	ICX
8	DCV	DCV	ICX	ICX	DCX	DCX	ICV

$f$	$g^{🟉}$	$f^{🟉}$
ICV	ICV	ICV
ICV	DCX	ICV
ICX	ICX	ICX
ICX	DCV	ICX
ICV	DCV	ICX
ICV	ICX	ICX
ICX	DCX	ICV
ICX	ICV	ICV

	g	$g^{- 1}$	$g^{🟉}$	$f^{🟉}$	$h = g^{🟉} \circ f^{🟉}$	$f = g^{- 1} \circ h$
1	ICX	ICV	ICV	ICV	ICV	ICV
2	DCX	DCX	DCV	ICX	DCV	ICX
3	ICV	ICX	ICX	ICX	ICX	ICX
4	DCV	DCV	DCX	ICV	DCX	ICV

	g	$g^{- 1}$	$g^{🟉}$	$f^{🟉}$	$f^{🟉} (m)$	$h = g^{🟉} \circ f^{🟉} \circ m$	$f = g^{- 1} \circ h$
5	ICX	ICV	DCV	ICX	DCX	ICV	ICV
6	DCX	DCX	ICV	ICV	DCV	DCV	ICX
7	ICV	ICX	DCX	ICV	DCV	ICX	ICX
8	DCV	DCV	ICX	ICX	DCX	DCX	ICV

$f$	$g^{🟉}$	$f^{🟉}$
ICV	ICV	ICV
ICV	DCX	ICV
ICX	ICX	ICX
ICX	DCV	ICX
ICV	DCV	ICX
ICV	ICX	ICX
ICX	DCX	ICV
ICX	ICV	ICV