1. Introduction
Research on classical inequalities, such as Jensen's and Hölder's, has experienced great expansion. These inequalities first appeared in discrete and integral forms, and many generalizations and improvements have since been proved (see, for instance, [1,2]). Lately, they have proven to be very useful in information theory (see, for instance, [3]).
Let $I$ be an interval in $\mathbb{R}$ and $f\colon I\to\mathbb{R}$ a convex function. If $\mathbf{x}=(x_1,\dots,x_n)$ is any $n$-tuple in $I^n$ and $\mathbf{p}=(p_1,\dots,p_n)$ a nonnegative $n$-tuple such that $P_n=\sum_{i=1}^{n}p_i>0$, then the well-known Jensen inequality
$$f\!\left(\frac{1}{P_n}\sum_{i=1}^{n}p_i x_i\right)\le\frac{1}{P_n}\sum_{i=1}^{n}p_i f(x_i) \qquad (1)$$
holds (see [4,5] or, for example, [6] (p. 43)). If $f$ is strictly convex, then (1) is strict unless all the $x_i$ with $p_i>0$ are equal.
Jensen’s inequality is one of the most famous inequalities in convex analysis, and many other well-known inequalities (such as Hölder’s inequality, the arithmetic–geometric–harmonic mean inequality, etc.) are its special cases. Besides mathematics, it has many applications in statistics, information theory, and engineering.
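For intuition (a small numerical sketch, not part of the paper's argument), inequality (1) can be checked directly for a convex function such as $e^t$; the function and variable names below are illustrative, not the paper's notation:

```python
import math

def jensen_sides(f, x, p):
    """Left and right sides of Jensen's inequality (1) for points x and weights p."""
    P = sum(p)                                   # P_n = p_1 + ... + p_n > 0
    mean = sum(pi * xi for pi, xi in zip(p, x)) / P
    return f(mean), sum(pi * f(xi) for pi, xi in zip(p, x)) / P

# exp is convex, so the left side never exceeds the right side.
lhs, rhs = jensen_sides(math.exp, [0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

Equality holds when all points coincide, matching the strictness condition above.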
Strongly related to Jensen’s inequality is the Lah–Ribarič inequality (see [7]),
$$\frac{1}{P_n}\sum_{i=1}^{n}p_i f(x_i)\le\frac{b-\bar{x}}{b-a}\,f(a)+\frac{\bar{x}-a}{b-a}\,f(b),\qquad \bar{x}=\frac{1}{P_n}\sum_{i=1}^{n}p_i x_i, \qquad (2)$$
which holds when $f$ is a convex function on $[a,b]$, $\mathbf{p}$ is as in (1), and $\mathbf{x}$ is any $n$-tuple in $[a,b]^n$. If $f$ is strictly convex, then (2) is strict unless $x_i\in\{a,b\}$ for all $i$ with $p_i>0$.
The Lah–Ribarič inequality has been extensively investigated, and the interested reader can find many related results in the recent literature as well as in monographs such as [6,8,9]. It is interesting to find further refinements of the above inequality.
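As a numerical illustration (a sketch using the standard discrete form of (2): the right-hand side is the chord of $f$ over $[a,b]$ evaluated at the weighted mean), the inequality can be verified for $f=\exp$:

```python
import math

def lah_ribaric_sides(f, x, p, a, b):
    """Both sides of the discrete Lah-Ribaric inequality (2) for convex f on [a, b]."""
    P = sum(p)
    xbar = sum(pi * xi for pi, xi in zip(p, x)) / P      # weighted mean, lies in [a, b]
    lhs = sum(pi * f(xi) for pi, xi in zip(p, x)) / P
    rhs = (b - xbar) / (b - a) * f(a) + (xbar - a) / (b - a) * f(b)
    return lhs, rhs

lhs, rhs = lah_ribaric_sides(math.exp, [0.2, 0.5, 1.7], [1.0, 1.0, 2.0], 0.0, 2.0)
```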
Our main result will be a refinement of inequality (2). Using the same technique, we will give a refinement of inequality (1) (see [10]).
In addition, we deal with the notion of f-divergences, which measure the distance between two probability distributions. One of the most important is the Csiszár f-divergence, whose special cases include the Shannon entropy, Jeffrey’s distance, the Kullback–Leibler divergence, the Hellinger distance, and the Bhattacharyya distance. We deduce relations for the mentioned f-divergences.
Let us say a few words about the organization of the paper. In the following section we give a new refinement of the Lah–Ribarič inequality and state a known refinement of the Jensen inequality obtained by the same technique. Using the obtained results, we give a refinement of the famous Hölder inequality and some new refinements for the weighted power means and quasi-arithmetic means. In addition, we give a historical remark regarding the Jensen–Boas inequality. In Section 3, we give the results for various f-divergences. These are further examined for the Zipf–Mandelbrot law.
2. New Refinements
The starting point of this consideration is the following lemma (see [11]).
Lemma 1. Let f be a convex function on an interval I. If $a,b\in I$ are such that $a<b$, then the inequality
$$f(x)\le\frac{b-x}{b-a}\,f(a)+\frac{x-a}{b-a}\,f(b)$$
holds for any $x\in[a,b]$.

The main result is a refinement of the Lah–Ribarič inequality (2). As we will see, its proof is based on the idea from the proof of the Jensen–Boas inequality.
Theorem 1. Let be a convex function on , , be as in (1), be any n-tuple in and . Let where for , , , for and , , for . Then holds, where . If f is concave on I, then the inequalities in (3) are reversed.

Proof. Using the Lah–Ribarič inequality (2) for each of the subsets , we obtain
Using , and Lemma 1, we obtain
□
Remark 1. If , the corresponding term in the sum on the right-hand side of the first inequality in the proof of Theorem 1 remains unaltered (i.e., it is equal to ).
Using the same technique, we obtain the following refinement of the Jensen inequality (1).
Theorem 2. Let I be an interval in and a convex function. Let be any n-tuple in and a nonnegative n-tuple such that . Let where for , and . Then holds. If f is concave on I, then the inequalities in (4) are reversed.

Proof. Using Jensen’s inequality (1), we obtain
which is (4). □
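Numerically, the partitioning technique behind Theorem 2 produces a chain of the following shape (a sketch assuming the refining middle term is the weighted mixture of blockwise Jensen bounds; the function name and block structure are illustrative, not the paper's notation):

```python
import math

def jensen_refinement(f, x, p, blocks):
    """Chain f(global mean) <= sum_k (P_k/P) f(block mean) <= weighted mean of f(x_i),
    for a partition `blocks` of the index set {0, ..., n-1}."""
    P = sum(p)
    left = f(sum(pi * xi for pi, xi in zip(p, x)) / P)
    middle = 0.0
    for J in blocks:
        PJ = sum(p[i] for i in J)
        mJ = sum(p[i] * x[i] for i in J) / PJ
        middle += PJ / P * f(mJ)                 # blockwise Jensen contribution
    right = sum(pi * f(xi) for pi, xi in zip(p, x)) / P
    return left, middle, right

l, m, r = jensen_refinement(math.exp, [0.1, 0.9, 2.0, 3.0],
                            [1.0, 2.0, 1.0, 1.0], [[0, 1], [2, 3]])
```

The left inequality is Jensen applied to the block means; the right one is Jensen applied within each block.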
The idea behind the proof of our main result (and of the refinement of the Jensen inequality) can also be found in another well-known result (see [6] (pp. 55–60)).
In Jensen’s inequality there is the condition “ a nonnegative n-tuple such that ”. In 1919, Steffensen proved the same inequality (1) under slightly relaxed conditions (see [12]).
Theorem 3 (Jensen–Steffensen).
If is a convex function, is a real monotonic n-tuple such that , and is a real n-tuple such that , then (1) holds. If f is strictly convex, then inequality (1) is strict unless .

One of many generalizations of the Jensen inequality is the Riemann–Stieltjes integral form of the Jensen inequality.
Theorem 4 (the Riemann–Stieltjes form of Jensen’s inequality).
Let be a continuous convex function, where I is the range of the continuous function . The inequality holds provided that λ is increasing, bounded and . Analogously, the integral form of the Jensen–Steffensen inequality is given.
Theorem 5 (the Jensen–Steffensen inequality).
If f is continuous and monotonic (either increasing or decreasing) and λ is either continuous or of bounded variation satisfying , then (5) holds.

In 1970, Boas gave an integral analogue of the Jensen–Steffensen inequality under slightly different conditions.
Theorem 6 (the Jensen–Boas inequality).
If λ is continuous or of bounded variation satisfying for all , and , and if f is continuous and monotonic (either increasing or decreasing) in each of the intervals , then inequality (5) holds.

In 1982, J. Pečarić gave the following proof of the Jensen–Boas inequality.
Proof. If , with the notation
we have
Using Jensen’s inequality (1), we obtain
Using the Jensen–Steffensen inequality (5) on each subinterval , we obtain
If , for some j, then on , and we can easily prove that the Jensen–Boas inequality is valid. □
Looking at the previous proof, we see that the technique is the same as the one used for our main result and for the refinement of the Jensen inequality.
By using Theorem 2, we obtain the following refinement of the discrete Hölder inequality (see [13,14]).
Corollary 1. Let such that . Let , such that . Then:

Proof. We use Theorem 2 with . Then and from (4), we obtain
For the function , from (7), we obtain
Multiplying by , and raising to the power of , we obtain
which is (6). □
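A direct numerical check of the classical Hölder inequality that Corollary 1 refines (conjugate exponents with $1/p+1/q=1$; the helper name is illustrative):

```python
def holder_sides(a, b, p_exp, q_exp):
    """Both sides of Hölder's inequality: sum a_i b_i <= ||a||_p * ||b||_q, 1/p + 1/q = 1."""
    lhs = sum(ai * bi for ai, bi in zip(a, b))
    rhs = (sum(ai ** p_exp for ai in a) ** (1 / p_exp)
           * sum(bi ** q_exp for bi in b) ** (1 / q_exp))
    return lhs, rhs

# p = q = 2 is the Cauchy-Schwarz special case.
lhs, rhs = holder_sides([1.0, 2.0, 3.0], [0.5, 0.25, 4.0], 2, 2)
```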
Corollary 2. Under the same conditions as in the previous corollary, for , , , we obtain

Proof. First, let . We use Theorem 2 with . Then and from (4), we obtain
For the function , we obtain
Multiplying by , and then by , we obtain
which is (8).
If , then , and the same result follows from symmetry (see the comments in Corollary 1). □
It is interesting to show how the previously obtained results apply to the weighted discrete power means and the weighted discrete quasi-arithmetic means.
Let $\mathbf{x}=(x_1,\dots,x_n)$ be a positive $n$-tuple, $\mathbf{p}=(p_1,\dots,p_n)$ a nonnegative $n$-tuple with $P_n=\sum_{i=1}^{n}p_i>0$, and $r\in\mathbb{R}$. The weighted discrete power means of order $r$ are defined as
$$M_r(\mathbf{x};\mathbf{p})=\begin{cases}\left(\dfrac{1}{P_n}\displaystyle\sum_{i=1}^{n}p_i x_i^{\,r}\right)^{1/r}, & r\neq 0,\\[6pt] \left(\displaystyle\prod_{i=1}^{n}x_i^{\,p_i}\right)^{1/P_n}, & r=0.\end{cases}$$
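In code, the weighted power means can be sketched as follows (standard form, with the geometric mean as the $r=0$ case), together with their classical monotonicity in the order $r$:

```python
import math

def power_mean(x, p, r):
    """Weighted discrete power mean of order r; the r == 0 case is the geometric mean."""
    P = sum(p)
    if r == 0:
        return math.exp(sum(pi * math.log(xi) for pi, xi in zip(p, x)) / P)
    return (sum(pi * xi ** r for pi, xi in zip(p, x)) / P) ** (1 / r)

x, p = [1.0, 4.0, 9.0], [1.0, 1.0, 2.0]
# Power means are nondecreasing in r: harmonic <= geometric <= arithmetic <= quadratic.
means = [power_mean(x, p, r) for r in (-1, 0, 1, 2)]
```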
Using Theorem 2, we obtain the following inequalities for the weighted discrete power means. Let us note that the left-hand and right-hand sides of both inequalities are the same; only the mixed means in the middle, which provide the refinement, differ.
Corollary 3. Let , , , . Let such that . Then where , , , , for .

Proof. We use Theorem 2 with for , , , , . From (4), we obtain
Substituting with , and then raising to the power , we obtain
which is (9).
Similarly, we use Theorem 2 with for , , , . We obtain
Substituting with , and then raising to the power , inequality (10) easily follows. The other cases follow similarly. □
Let $I$ be an interval in $\mathbb{R}$. Let $\mathbf{x}=(x_1,\dots,x_n)\in I^n$ and let $\mathbf{p}=(p_1,\dots,p_n)$ be a nonnegative $n$-tuple with $P_n=\sum_{i=1}^{n}p_i>0$. Then, for a strictly monotone continuous function $g\colon I\to\mathbb{R}$, the discrete weighted quasi-arithmetic mean is defined as
$$M_g(\mathbf{x};\mathbf{p})=g^{-1}\!\left(\frac{1}{P_n}\sum_{i=1}^{n}p_i\,g(x_i)\right).$$
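A minimal sketch of the quasi-arithmetic mean in code (assuming the standard form $M_g=g^{-1}\big(\frac{1}{P_n}\sum_i p_i\,g(x_i)\big)$); choosing $g(t)=\log t$ recovers the geometric mean and $g(t)=t$ the arithmetic mean:

```python
import math

def quasi_arithmetic_mean(x, p, g, g_inv):
    """Weighted quasi-arithmetic mean: apply g, average with weights p, then invert g."""
    P = sum(p)
    return g_inv(sum(pi * g(xi) for pi, xi in zip(p, x)) / P)

x, p = [1.0, 2.0, 8.0], [1.0, 1.0, 1.0]
geometric = quasi_arithmetic_mean(x, p, math.log, math.exp)          # g(t) = log t
arithmetic = quasi_arithmetic_mean(x, p, lambda t: t, lambda t: t)   # g(t) = t
```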
Using Theorem 2, we obtain the following inequalities for quasi-arithmetic means.
Corollary 4. Let I be an interval in . Let , , , . Let be a strictly monotone continuous function such that is convex. Let where for , and . Then where , , , , for .

Proof. Theorem 2 with and gives
□
3. Applications in Information Theory
In this section we give basic results concerning the discrete Csiszár f-divergence. In addition, bounds for the divergence of the Zipf–Mandelbrot law are obtained.
Let us denote the set of all probability densities by , i.e., if for and .
In [15], Csiszár introduced the f-divergence functional
$$D_f(\mathbf{p},\mathbf{q})=\sum_{i=1}^{n}q_i\,f\!\left(\frac{p_i}{q_i}\right),$$
where $f\colon(0,\infty)\to\mathbb{R}$ is a convex function; it represents a “distance function” on the set of probability distributions. In order to allow nonnegative probability distributions in the f-divergence functional, we adopt, as usual, the conventions
$$f(0):=\lim_{t\to 0^+}f(t);\qquad 0\,f\!\left(\tfrac{0}{0}\right):=0;\qquad 0\,f\!\left(\tfrac{x}{0}\right):=\lim_{t\to 0^+}t\,f\!\left(\tfrac{x}{t}\right),\ x>0,$$
and the following definition of a generalized f-divergence functional is given.
Definition 1 (the Csiszár f-divergence functional).
Let be an interval, and let be a function. Let be an n-tuple of real numbers and be an n-tuple of nonnegative real numbers such that for every . The Csiszár f-divergence functional is defined as

Theorem 7. Let I be an interval in and a convex function. Let be an n-tuple of real numbers and be an n-tuple of nonnegative real numbers such that for every . Let where for , , and . Then holds.

Proof. Using Theorem 2 with and , we obtain
which is (13). □
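A hedged sketch of the Csiszár f-divergence under the common convention $D_f(\mathbf{p},\mathbf{q})=\sum_i q_i f(p_i/q_i)$ with all $q_i>0$ (the generalized functional above relaxes this positivity requirement); choosing $f(t)=t\log t$ recovers the Kullback–Leibler divergence:

```python
import math

def csiszar_f_divergence(f, p, q):
    """D_f(p, q) = sum_i q_i * f(p_i / q_i), assuming every q_i > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

p, q = [0.2, 0.5, 0.3], [0.4, 0.4, 0.2]
# f(t) = t log t is convex with f(1) = 0; this choice recovers the
# Kullback-Leibler divergence, which is nonnegative by Jensen's inequality.
kl = csiszar_f_divergence(lambda t: t * math.log(t), p, q)
```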
Corollary 5. If in the previous theorem we take and to be probability distributions, and , we directly obtain the following result:

Theorem 8. Let be a convex function on , . Let be an n-tuple of real numbers and be an n-tuple of nonnegative real numbers such that . Let where for , , , for and , , for . Then holds.

Proof. Using Theorem 1 with and , we obtain
which is (15). □
Corollary 6. If, in the previous theorem, we take and to be probability distributions, we directly obtain the following result:

If $\mathbf{p}=(p_1,\dots,p_n)$ and $\mathbf{q}=(q_1,\dots,q_n)$ are probability distributions, the Kullback–Leibler divergence, also called the relative entropy or KL divergence, is defined as
$$D_{KL}(\mathbf{p},\mathbf{q})=\sum_{i=1}^{n}p_i\log\frac{p_i}{q_i}.$$
The next corollary provides bounds for the Kullback–Leibler divergence of two probability distributions.
Corollary 7. Let where for , and . Let and be n-tuples of nonnegative real numbers. Then
Let and be probability distributions. Then

Proof. Let and be n-tuples of nonnegative real numbers. Since the function is convex, the first inequality follows from Theorem 7 by setting . The second inequality is a special case of the first inequality for probability distributions and . □
Corollary 8. Let where for , and , for . Let and be n-tuples of nonnegative real numbers. Let , , and , for . Then
Let and be probability distributions. Let , , and , for . Then

Proof. Let and be n-tuples of nonnegative real numbers. Since the function is convex, the first inequality follows from Theorem 8 by setting . The second inequality is a special case of the first inequality for probability distributions and . □
Now we deduce relations for some further special cases of the Csiszár f-divergence.
Definition 2 (the Shannon entropy).
For a probability distribution $\mathbf{p}=(p_1,\dots,p_n)$, the discrete Shannon entropy is defined as
$$H(\mathbf{p})=-\sum_{i=1}^{n}p_i\log p_i.$$
Corollary 9. Let . Let where for , and . Then

Proof. Using Theorem 7 with and , we obtain
For , inequality (17) follows. □
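For concreteness, a minimal sketch of the Shannon entropy (natural logarithm), together with the classical fact that it is maximized by the uniform distribution:

```python
import math

def shannon_entropy(p):
    """Discrete Shannon entropy H(p) = -sum_i p_i log p_i (natural logarithm)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

n = 4
uniform = [1 / n] * n
skewed = [0.7, 0.1, 0.1, 0.1]
```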
Corollary 10. Let , , such that . Let where for , , , for and , , for . Then holds.

Proof. Using Theorem 8 with , and , we obtain
and (17) easily follows. □
Definition 3 (Jeffrey’s distance).
For probability distributions $\mathbf{p}$ and $\mathbf{q}$, the discrete Jeffrey distance is defined as
$$D_J(\mathbf{p},\mathbf{q})=\sum_{i=1}^{n}(p_i-q_i)\log\frac{p_i}{q_i}.$$
Corollary 11. Let . Let where for , and . Then

Proof. Using Corollary 5 with , we obtain
and (18) easily follows. □
Corollary 12. Let , , such that . Let where for , , , for and , , for . Then holds.

Proof. Using Corollary 6 with , we obtain
and (19) easily follows. □
Definition 4 (the Hellinger distance).
For , the discrete Hellinger distance is defined as

Corollary 13. Let . Let where for , and . Then

Proof. Using Corollary 5 with , (20) follows. □
Corollary 14. Let , , such that . Let where for , , , for and , , for . Then holds.

Proof. Using Corollary 6 with , (21) follows. □
Definition 5 (Bhattacharyya distance).
For , the discrete Bhattacharyya distance is defined as

Corollary 15. Let . Let where for , and . Then

Proof. Using Corollary 5 with , (22) follows. □
Corollary 16. Let , , such that . Let where for , , , for and , , for . Then holds.

Proof. Using Corollary 6 with , (23) follows. □
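The three distances above admit short numerical sketches under their common conventions (the paper's exact normalizations may differ; here the squared Hellinger distance carries the factor $1/2$ and the Bhattacharyya quantity is the coefficient $\sum_i\sqrt{p_i q_i}$):

```python
import math

def jeffrey_distance(p, q):
    # D_J(p, q) = sum_i (p_i - q_i) * log(p_i / q_i)  >= 0
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def hellinger_sq(p, q):
    # h^2(p, q) = (1/2) * sum_i (sqrt(p_i) - sqrt(q_i))^2, lies in [0, 1]
    return 0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))

def bhattacharyya_coeff(p, q):
    # B(p, q) = sum_i sqrt(p_i * q_i); equals 1 iff p == q (for distributions)
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

p, q = [0.2, 0.5, 0.3], [0.4, 0.4, 0.2]
```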
Now we are going to derive results from Theorems 7 and 8 for the Zipf–Mandelbrot law.
The Zipf–Mandelbrot law is a discrete probability distribution defined by the probability mass function
$$f(i;N,q,s)=\frac{1}{(i+q)^s\,H_{N,q,s}},\qquad i=1,\dots,N,$$
where
$$H_{N,q,s}=\sum_{j=1}^{N}\frac{1}{(j+q)^s}$$
is a generalization of the harmonic number and $N\in\{1,2,\dots\}$, $q\ge 0$ and $s>0$ are parameters.
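The probability mass function above translates directly into code; a quick sketch that also checks it is a genuine, rank-decreasing probability distribution:

```python
def zipf_mandelbrot_pmf(N, q, s):
    """Zipf-Mandelbrot law: f(i; N, q, s) = (i + q)^(-s) / H_{N,q,s}, i = 1, ..., N,
    where H_{N,q,s} = sum_{j=1}^{N} (j + q)^(-s) generalizes the harmonic number."""
    H = sum((j + q) ** (-s) for j in range(1, N + 1))
    return [(i + q) ** (-s) / H for i in range(1, N + 1)]

pmf = zipf_mandelbrot_pmf(N=10, q=1.5, s=1.2)
```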
If we define as a Zipf–Mandelbrot law M-tuple, we have
where
and the Csiszár functional becomes
where , and the parameters are such that .
If and are both defined as Zipf–Mandelbrot law M-tuples, then the Csiszár functional becomes
where , and the parameters are such that .
Now, from Theorem 7, we have the following result.
Corollary 17. Let I be an interval in and a convex function. Let be an n-tuple of real numbers and be an n-tuple of nonnegative real numbers such that for every . Let where for , . Suppose are such that , . Then holds.

Proof. If we define as a Zipf–Mandelbrot law n-tuple with parameters , then from Theorem 7 it follows that
which is (24). □
From Theorem 8 we have the following result.
Corollary 18. Let be a convex function on , . Let be an n-tuple of real numbers. Suppose are such that . Let where for , , , and , , for . Then holds.

Proof. If we define as a Zipf–Mandelbrot law n-tuple with parameters , then from Theorem 8 it follows that
which is (25). □
Now, from Theorem 7, we also have the following result.
Corollary 19. Let I be an interval in and a convex function. Let where for , . Suppose are such that , . Then holds.

Proof. If we define and as Zipf–Mandelbrot law n-tuples with parameters , then from Theorem 7, we obtain (26). □
From Theorem 8, we have the following result.
Corollary 20. Let be a convex function on , . Suppose are such that . Let where for , , , and , , for . Then holds.

Proof. If we define and as Zipf–Mandelbrot law n-tuples with parameters , then from Theorem 8, we obtain (27). □
Since the minimal value of is and its maximal value is , from the right-hand side of (24) and the left-hand side of (25), we obtain the following result.
Corollary 21. Let be a convex function on , . Let be an n-tuple of real numbers. Suppose are such that . Let where for , , , and , , for . Then holds.

Proof. Using and from the right-hand side of (24) and the left-hand side of (25), we obtain
and (28) follows. □
4. Conclusions
In this paper we have obtained a refinement of the Lah–Ribarič inequality and a refinement of the Jensen inequality, both of which follow from applying the Lah–Ribarič inequality and the Jensen inequality on disjoint subsets of the index set $\{1,\dots,n\}$.
Using these results, we found a refinement of the discrete Hölder inequality and refinements of some inequalities for the discrete weighted power means and the discrete weighted quasi-arithmetic means. In addition, some interesting estimates for the discrete Csiszár divergence and for its important special cases were obtained.
It would be interesting to see whether this method can be used to give refinements of some other inequalities. One could also try to apply it to refine the Jensen inequality and the Lah–Ribarič inequality for operators.