*Article* **Fisher-like Metrics Associated with** *φ***-Deformed (Naudts) Entropies**

**Cristina-Liliana Pripoae 1, Iulia-Elena Hirica 2, Gabriel-Teodor Pripoae 2,\* and Vasile Preda 2,3,4**


**Abstract:** The paper defines and studies new semi-Riemannian generalized Fisher metrics and Fisher-like metrics, associated with entropies and divergences. Examples of seven such families are provided, based on exponential PDFs. The particular case when the basic entropy is a *φ*-deformed one, in the sense of Naudts, is investigated in detail, with emphasis on the variation of the emergent scalar curvatures. Moreover, the paper highlights the impact on these geometries determined by the addition of some group logarithms.

**Keywords:** *φ*-deformed (Naudts) entropy; divergence; relative group entropy; generalized Fisher metric; Fisher-like metric; MaxEnt problem

**MSC:** 53B12; 22E70; 94A17; 53B20

**1. Introduction**

*1.1. History*

Entropy is a very versatile measure of order (or of chaos). In the last few decades, the growing need to model stochastic phenomena has led to the appearance of many new families of entropy functionals, with increasing levels of generality, reliability and applicability [1–19]. One of the recent interesting directions of study uses the relative group entropies, based on group logarithms (see [20,21] and references therein).

The geometrization method, a powerful tool in modeling, was applied in the investigation of some statistically relevant parameter sets, beginning with the work of the pioneers: Fisher, Rao, Efron and Amari [12,22,23]. This bridge allows the use of the differential geometric machinery to understand the local and the global behavior of statistical objects.

In particular, the Fisher (semi-Riemannian) metrics correspond to the Fisher Information matrices. Their invariants, especially those tensor fields expressing different kinds of curvature properties, are used in parameter estimation theory as control tools. For example, the scalar curvature function measures the average statistical uncertainty of a density matrix [12,20,24].

Consider a statistical model, governed by a given entropy, and two or more fixed parameterized probability density functions (PDFs) within it. Various divergences ("distance-like functionals") can be defined in this framework, able to detect how these PDFs relate to each other. A kind of infinitesimal variation of such divergences, w.r.t. the parameters, may provide interpretations for some Fisher-like metrics. Several types of divergences are used, including the Kullback–Leibler and the Bregman ones. For recent viewpoints upon divergences, see [14,25,26].

**Citation:** Pripoae, C.-L.; Hirica, I.-E.; Pripoae, G.-T.; Preda, V. Fisher-like Metrics Associated with *φ*-Deformed (Naudts) Entropies. *Mathematics* **2022**, *10*, 4311. https://doi.org/10.3390/math10224311

Academic Editor: Sorin V. Sabau

Received: 12 October 2022; Accepted: 15 November 2022; Published: 17 November 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In 2002, Naudts introduced ([27]) the "*φ*-deformed entropy", via a positive strictly increasing function *φ*, which plays the role of a "generalized logarithm". (We shall call it "*φ*-deformed (Naudts) entropy" and not simply "*φ*-deformed entropy", in order to avoid confusion and to distinguish it from other "deformed" entropies, all originating, sooner or later, from the Boltzmann–Gibbs–Shannon (BGS) germ.) This new entropy extends (with some technical precautions) the Tsallis and the Kaniadakis entropies, among others. Using it, new Fisher metrics were defined [27–29], ranging from simple ones to some more "baroque" constructions. Their applicability covers a wide area, from Physics (the starting point) to Information geometry [29–32].

Using a *φ*-deformed (Naudts) exponential family of PDFs, Matsuzoe et al. [33] investigated the geometry of statistical manifolds derived from a sequence of escort expectations.

Korbel et al. [30] studied properties of the Fisher metrics associated with the *φ*-deformed (Naudts) entropies, in the case of exponential-type PDFs. Particular choices of the function *φ* provided examples based on (*c*, *d*)-entropies. Dealing with the MaxEnt problem, they used the Fisher information of the *φ*-deformed (Naudts) exponential entropies, in order to reveal a duality between the cases with linear constraints and those based on escort constraints.

Inspired by these previous works, we believe that a systematic study of semi-Riemannian metrics, canonically associated with the *φ*-deformed (Naudts) entropy, is necessary and might provide useful statistical tools in the future. Our paper suggests a method of research, which combines the beaten path with some new speculative ideas.

#### *1.2. The Content of the Paper*

In Section 2, we recall (in a creative manner) the notations and fix the conventions concerning (the different variants of) entropy and divergence; we closely follow [34]. We make some comments about the place of the Naudts' *φ*-deformed entropy in the "Universe" of generalized entropies. We recall here some other examples of remarkable entropies (Tsallis, Kaniadakis, Sharma–Taneja–Mittal). Our main new idea is the distinction we make between the "quotient" divergence and the "difference" divergence, in the context of generalized logarithms; in the particular case of the Neperian logarithm, these two notions coincide, but in other cases (such as that of the *φ*-deformed (Naudts) entropy) they are distinct.

In Section 3, we fix the needed notions concerning the generalized Fisher-like metrics associated with the entropies and with the relative (group) entropies, following (especially) [20]. Following the distinction made in Section 2 between the two kinds of divergences, we introduce two generalized Fisher-like metrics (GFM1 and GFM2), which coincide in the classical setting with the Fisher metric. Three other Fisher-like metrics are defined, in a formal way, as auxiliary (but eventually useful) by-products of the former ones.

In Section 4, we determine the semi-Riemannian geometries of the generalized Fisher-like metrics, associated with group relative entropies based on *φ*-deformed (Naudts) entropies and divergences. Their coefficients are expressed in terms of both PDFs and of the *φ*-deformed logarithm and may depend on a group logarithm too.

In Section 5, we give seven families of examples of such metrics, for the case when the involved PDFs are exponential. The scalar curvature functions are computed, and their variation is studied.

In Section 6, we define and solve the MaxEnt problem based on the *φ*-deformed (Naudts) entropy, for univariate PDFs, and we generalize some thermodynamic relations.

#### *1.3. Conventions*

Implicitly, the integrals are supposed to be correctly defined and to commute with their derivatives. "Differentiable" means "smooth", even if, sometimes, a weaker assumption would be enough. When a symmetric matrix is called "a (semi-Riemannian) metric", we implicitly assume that it is non-degenerate; positive definiteness is not assumed unless otherwise stated.

#### **2. Entropies and Divergences—A Breviary**

We consider a real valued random variable *x* on a domain *X* ⊂ ℝ<sup>*m*</sup>. We denote by *ρ* = *ρ*(*x*) a fixed probability density function (PDF); then, *ρ*(*x*) ≥ 0 and ∫<sub>*X*</sub> *ρ*(*x*) *dx* = 1. We fix a real valued differentiable function *ϕ*, as a "controlling tool". In this setting, the generalized (normalized) entropy is

$$H[\rho] = -\int\_X \rho(\mathbf{x})\,\varphi(\rho(\mathbf{x}))\,d\mathbf{x}.\tag{1}$$

We shall use a similar notation for other entropy-like functionals too. In the literature, the avatars of the "generalized logarithm" *ϕ* are subject to additional restrictions, imposed through application-inspired axioms.

Let *F* : [0, ∞) × [0, ∞) → ℝ be a smooth function and *σ* an additional fixed PDF. We define

$$D(\rho, \sigma) := \int\_X F(\rho(\mathbf{x}), \sigma(\mathbf{x})) d\mathbf{x}.\tag{2}$$

We suppose that *D*(*ρ*, *σ*) ≥ 0 and *D*(*ρ*, *σ*) = 0 if and only if *ρ* = *σ*. The number *D*(*ρ*, *σ*) is called the (generalized) divergence between *ρ* and *σ* and measures to what extent *σ* influences *ρ*. Sometimes, additional properties of the divergence function are added, axiomatically.

**Example 1.** *With the previous notations, we recall some well-known examples of entropies ([35–37]). (i) In the particular case when ϕ*(*y*) := *log*(*y*)*, from Formula (1), we obtain the Boltzmann– Gibbs–Shannon (BGS) entropy.*

*(ii) Consider a fixed parameter q* ∈ R\{1}*. The Tsallis q-logarithm*

$$\varphi\_{\{q\}}^T(y) := \frac{y^{1-q} - 1}{1 - q} \tag{3}$$

*provides a Tsallis entropy. Usually, for ϕ<sup>T</sup><sub>{q}</sub>, we use the notation log<sup>T</sup><sub>{q}</sub>. When q → 1, the BGS entropy is recovered.*

*(iii) Let us fix k* ∈ [−1, 1]\{0}*. The Kaniadakis k-logarithm*

$$
\varphi\_{\{k\}}^{K}(y) := \frac{y^k - y^{-k}}{2k} \tag{4}
$$

*defines a Kaniadakis entropy (also named k-deformed entropy). Usually, ϕ<sup>K</sup><sub>{k}</sub> is denoted log<sup>K</sup><sub>{k}</sub>. When k → 0, we recover again the BGS entropy.*

*(iv) Fix two real parameters k and r. The Sharma–Taneja–Mittal* (*k*,*r*)*-logarithm*

$$\varphi\_{\{(k,r)\}}^{\mathrm{STM}}(y) := y^r \cdot \frac{y^k - y^{-k}}{2k}$$

*provides a Sharma–Taneja–Mittal (STM) entropy (also named (k,r)-deformed entropy). Instead of ϕ<sup>STM</sup><sub>{(k,r)}</sub>, we shall write log<sup>STM</sup><sub>{(k,r)}</sub>. The Kaniadakis k-logarithm and the Tsallis q-logarithm are recovered as particular cases, for r = 0 and for r = ±|k|, respectively. When (k,r) → (0, 0), we recover the BGS entropy. Sometimes, additional restrictions are imposed on the domain of the parameters, required by convergence conditions imposed on some integrals (see [38–40] for details).*

*(v) ([27]) Let φ : (0, ∞) → ℝ be a positive, differentiable, strictly increasing function. (Sometimes, in the literature, "non-decreasing" is required instead of "strictly increasing".) Define the φ-deformed (Naudts) logarithm*

$$\log\_{\phi}^{N}(y) := \int\_{1}^{y} \frac{1}{\phi(z)} dz. \tag{5}$$

*The function ϕ<sup>N</sup><sub>φ</sub> := log<sup>N</sup><sub>φ</sub> defines the φ-deformed (Naudts) entropy. The previous formula may also be read "backwards":*

$$\phi(y) = \left(\frac{\partial}{\partial y} \varphi\_{\phi}^{N}(y)\right)^{-1}. \tag{6}$$

*Moreover, given an arbitrary "generalized logarithm" ϕ as in (1), Formula (6) always provides a differentiable function φ; if it is positive and strictly increasing, we have expressed ϕ as a φ-deformed (Naudts) logarithm. Sometimes, this procedure works for some restrictions of the involved parameters only. For example, the preceding four entropies are recovered as particular cases of φ-deformed (Naudts) entropies, as follows: BGS for φ := id; Tsallis for φ(y) := y<sup>q</sup>, with the restrictions q > 0 and y ∈ (0, ∞); Kaniadakis for φ(y) := 2(y<sup>k−1</sup> + y<sup>−k−1</sup>)<sup>−1</sup>, with the additional restriction*

$$y^{2k} < \frac{k+1}{k-1},$$

*for y ∈ (0, ∞); STM for*

$$\phi(y) := 2k[(k+r)y^{k+r-1} + (k-r)y^{r-k-1}]^{-1},$$

*with the additional restriction*

$$y^{2k} < \frac{(r-k)(r-k-1)}{(r+k)(r+k-1)},$$

*for y ∈ (0, ∞). These additional restrictions are imposed in order for φ to be strictly increasing.*

*(vi) Let G* = *G*(*t*) *be a formal group logarithm, which is a differentiable real valued function with some special algebraic properties, inspired from the formal series linking Lie groups to Lie algebras. More precisely,*

$$G(t) := \sum\_{i=0}^{\infty} c\_i \frac{t^{i+1}}{i+1},$$

*where c<sub>0</sub> = 1 and c<sub>i</sub> ∈ ℚ. Its inverse is*

$$F(s) := \sum\_{i=0}^{\infty} \gamma\_i \frac{s^{i+1}}{i+1},$$

*where γ<sub>i</sub> ∈ ℚ, γ<sub>0</sub> = 1, γ<sub>1</sub> = −c<sub>1</sub>, γ<sub>2</sub> = (3/2)c<sub>1</sub><sup>2</sup> − c<sub>2</sub> and so on. (We refer to [20,21,41] for details about these functions.) The simplest example is G(t) = t.*

*We define the generalized group entropy functional (GGEF) associated with (1) by*

$$S\_G(\rho) := \int\_X \rho(\mathbf{x}) G((\varphi \circ \rho)(\mathbf{x})) d\mathbf{x}.\tag{7}$$

*In particular, for ϕ* := −*log, we recover the well-known* group entropy functional *([20,41]) associated with (1)*

$$S\_G(\rho) := \int\_X \rho(\mathbf{x}) G(\log \rho(\mathbf{x})^{-1}) d\mathbf{x}.\tag{8}$$

*Similar GGEFs can be provided by replacing the Neperian logarithm by other "generalized" logarithms (e.g., Tsallis, Kaniadakis, STM, etc.). In Section 3, we shall introduce the geometries associated with the GGEF, based on φ-deformed (Naudts) entropies. Accordingly, we shall use the generalized logarithm log<sup>N</sup><sub>φ</sub> from (5).*
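The identifications listed in Example 1 (v) can be checked numerically. The following is a minimal sketch, not part of the original paper: it evaluates the integral in Formula (5) with a composite Simpson rule and compares the result with the closed forms (3) and (4). The helper names (`simpson`, `naudts_log`, `max_gap`), the parameter values `Q = 0.7`, `K = 0.4` and the sample points are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def naudts_log(phi, y):
    """phi-deformed logarithm, Formula (5): integral of 1/phi from 1 to y > 0."""
    return simpson(lambda z: 1.0 / phi(z), 1.0, y)

def tsallis_log(y, q):
    """Tsallis q-logarithm, Formula (3)."""
    return (y ** (1 - q) - 1) / (1 - q)

def kaniadakis_log(y, k):
    """Kaniadakis k-logarithm, Formula (4)."""
    return (y ** k - y ** (-k)) / (2 * k)

Q, K = 0.7, 0.4  # illustrative parameter values (assumptions)

# Generating functions phi from Example 1 (v).
def phi_bgs(z):
    return z

def phi_tsallis(z):
    return z ** Q

def phi_kaniadakis(z):
    return 2.0 / (z ** (K - 1) + z ** (-K - 1))

def max_gap(phi, closed_form, points=(0.3, 1.0, 2.5)):
    """Largest gap between the quadrature value of (5) and a closed form."""
    return max(abs(naudts_log(phi, y) - closed_form(y)) for y in points)

print(max_gap(phi_bgs, math.log))
print(max_gap(phi_tsallis, lambda y: tsallis_log(y, Q)))
print(max_gap(phi_kaniadakis, lambda y: kaniadakis_log(y, K)))
```

The three printed gaps are pure quadrature errors; the sketch checks only the integral identity behind (5) and (6), not the monotonicity restrictions on *φ*.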

**Example 2.** *With the previous notations, we recall some well-known examples of divergences.*

*(i) An important particular case is the generalized (quotient) relative entropy (a.k.a. generalized divergence) between ρ and σ (see [34,42])*

$$\tilde{D}(\rho \parallel \sigma) := \int\_X \rho(\mathbf{x}) \varphi\left(\frac{\rho(\mathbf{x})}{\sigma(\mathbf{x})}\right) d\mathbf{x}.\tag{9}$$

*Here, the function F from (2) is F(z, y) := zϕ(z/y). We accept (formally) that 0 · ϕ(0/σ) = 0, ρ · ϕ(ρ/0) = 0 and ϕ(1) = 0. In particular, when ϕ := log, we recover the Kullback–Leibler divergence ([20]).*

*Another particular case considers f : [0, ∞) → (−∞, ∞] to be a convex function, with f(1) = 0 and f(0) = lim<sub>t→0+</sub> f(t). For ϕ(y) := (1/y) f(y), we recover the f-divergence ([43] and references therein). The slightly more general notion of (f, Γ)-divergence (see [44]) may be recovered in a similar way.*

*(ii) In a similar way, we define the generalized (difference) relative entropy between ρ and σ, as*

$$D(\rho \parallel \sigma) := \int\_X \rho(\mathbf{x}) [\varphi(\rho(\mathbf{x})) - \varphi(\sigma(\mathbf{x}))] d\mathbf{x}.\tag{10}$$

*Here, the function F from (2) is F(z, y) := z[ϕ(z) − ϕ(y)]. In particular, when ϕ := log, D̃ coincides with D and we recover the Kullback–Leibler divergence, as in (i). When ϕ := log<sup>N</sup><sub>φ</sub>, the divergence D was considered in [27]; we mention that, in this case, D does not coincide with D̃.*

*In general, a necessary and sufficient condition on ϕ, ρ and σ, in order that D = D̃, is the vanishing of the mean (w.r.t. ρ) of the function ϕ(ρ/σ) − ϕ(ρ) + ϕ(σ). A sufficient (but quite strong) condition is provided by the functional equation ϕ(ρ/σ) = ϕ(ρ) − ϕ(σ).*

*(iii) In the hypothesis of Example 1 (vi), we can define generalized divergences as relative group entropies, which combine the formal group logarithm G, the ϕ-likelihood function and the previous quotient or difference operation upon two PDFs. For example, the analogue of (10) is*

$$D\_G(\rho \parallel \sigma) := \int\_X \rho(\mathbf{x}) \cdot G\left(\varphi(\rho(\mathbf{x})) - \varphi(\sigma(\mathbf{x}))\right) d\mathbf{x}.$$

*(iv) Consider two fixed PDFs ρ*<sup>1</sup> *and ρ*2*. Denote ψ* : R → R *as a fixed convex differentiable function. In this setting, the Bregman divergence is*

$$D\_{\psi}(\rho\_1 \parallel \rho\_2) := \int\_X \{\psi(\rho\_1(\mathbf{x})) - \psi(\rho\_2(\mathbf{x})) - (\rho\_1(\mathbf{x}) - \rho\_2(\mathbf{x}))\psi'(\rho\_2(\mathbf{x}))\}d\mathbf{x}.\tag{11}$$

*We mention that the function F(z, y) := ψ(z) − ψ(y) − (z − y) · ψ′(y) is convex too.*
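The distinction between the quotient divergence (9) and the difference divergence (10) can be illustrated numerically. The sketch below is an illustration under stated assumptions, not a computation from the paper: it takes *ρ* and *σ* to be the exponential densities with rates 1 and 2 on [0, ∞), truncates the integrals to [0, 60], and uses a composite Simpson rule. The function names and the Tsallis index *q* = 0.5 are hypothetical choices.

```python
import math

def simpson(f, a, b, n=6000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def rho(x):
    """Exponential PDF with rate 1 (assumed example)."""
    return math.exp(-x)

def sigma(x):
    """Exponential PDF with rate 2 (assumed example)."""
    return 2.0 * math.exp(-2.0 * x)

def tsallis_log(y, q=0.5):
    """Tsallis q-logarithm, Formula (3)."""
    return (y ** (1 - q) - 1) / (1 - q)

def quotient_divergence(gen_log, upper=60.0):
    """Formula (9), with the integral truncated to [0, upper]."""
    return simpson(lambda x: rho(x) * gen_log(rho(x) / sigma(x)), 0.0, upper)

def difference_divergence(gen_log, upper=60.0):
    """Formula (10), with the integral truncated to [0, upper]."""
    return simpson(lambda x: rho(x) * (gen_log(rho(x)) - gen_log(sigma(x))),
                   0.0, upper)

for name, gen_log in (("log", math.log), ("Tsallis, q = 0.5", tsallis_log)):
    print(name, quotient_divergence(gen_log), difference_divergence(gen_log))
```

For the Neperian logarithm the two printed values agree (both give the Kullback–Leibler divergence), while for the Tsallis logarithm they differ, in line with Example 2 (ii).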

Let *ρ* = *ρ*(*x*, *t*) be a *time-dependent* PDF, where *x*, *t* ∈ R. Then, the entropy in (1) will also depend on the parameter *t*, so *H*[*ρ*] = *H*[*ρ*](*t*). We consider a *potential energy function V* = *V*(*x*) and its associated *energy average function*

$$\mathcal{U}[\rho](t) := \int\_{\mathbb{R}} V(\mathbf{x}) \rho(\mathbf{x}, t) d\mathbf{x}.\tag{12}$$

(If needed, restriction of these functions to open subsets is possible). This particular framework will be used in Section 6 only.

#### **3. Fisher-like Metrics Associated with Generalized Entropies and Generalized Divergences**

In this section, we recall the notion of Fisher metric associated with a family of (generalized) entropies or divergences, defined on the space of parameters of an arbitrary PDF, using mainly [20,34]. For a more general setting, see [34].

Consider the case when the PDF *ρ* in Section 2 depends, moreover, on *n* real parameters *θ*<sup>1</sup>, ..., *θ*<sup>*n*</sup>, with *θ* := (*θ*<sup>1</sup>, ..., *θ*<sup>*n*</sup>) ∈ Θ, where Θ is an open set of ℝ<sup>*n*</sup>. Thus, *ρ* : *X* × Θ → ℝ, *ρ* = *ρ*(*x*, *θ*). Let *ϕ* : ℝ → ℝ be a differentiable controlling function, *ϕ* = *ϕ*(*y*). The dependence on *θ* leads to a generalized entropy function *H* : Θ → ℝ, canonically derived from Formula (1):

$$H(\theta) = -\int\_X \rho(\mathbf{x}, \theta) \cdot \varphi(\rho(\mathbf{x}, \theta)) d\mathbf{x}.\tag{13}$$

In a similar natural way, we can define generalized divergence functions, by *θ*-parameterizing (2) and its avatars.

Define

$$g\_{ij}(\theta) := -\int\_{X} \rho(\mathbf{x}, \theta) \frac{\partial^2 \varphi(\rho(\mathbf{x}, \theta))}{\partial \theta^i \partial \theta^j} d\mathbf{x} \quad , \quad i, j = \overline{1, n} \tag{14}$$

and

$$\tilde{g}\_{ij}(\theta) := \int\_{X} \rho(\mathbf{x}, \theta) \frac{\partial \varphi(\rho(\mathbf{x}, \theta))}{\partial \theta^{i}} \cdot \frac{\partial \varphi(\rho(\mathbf{x}, \theta))}{\partial \theta^{j}} d\mathbf{x} \quad , \quad i, j = \overline{1, n}. \tag{15}$$

We suppose that the matrices (*g<sub>ij</sub>*)<sub>*i*,*j*=1,…,*n*</sub> and (*g̃<sub>ij</sub>*)<sub>*i*,*j*=1,…,*n*</sub> are non-degenerate, and *g* has constant index on Θ. We call *g* and *g̃* *generalized Fisher metrics of type 1 and type 2*, respectively, and denote them by GFM1 and GFM2. Both metrics are "means", w.r.t. *ρ*, of some *ϕ*-mediated "information matrices": the Hessian of *ϕ* ◦ *ρ* and the product of the gradient of *ϕ* ◦ *ρ* with its transpose, respectively. The diagonal coefficients *g̃<sub>ii</sub>*(*θ*), *i* = 1, …, *n*, generalize the Fisher Information Numbers from [45], which can be recovered when *ϕ* is the Tsallis logarithm.

In general, the semi-Riemannian metric *g* and the Riemannian metric *g̃* differ from each other and differ from the Hessian (a semi-Riemannian metric, if non-degenerate)

$$h\_{ij}(\theta) := \frac{\partial^2 H(\theta)}{\partial \theta^i \partial \theta^j}.\tag{16}$$

We define, in a formal way, two auxiliary symmetric tensors of (0,2)-type *α* and *β*, given by

$$\alpha\_{ij}(\theta) := \int\_X \frac{\partial^2 \rho(\mathbf{x}, \theta)}{\partial \theta^i \partial \theta^j} \cdot \varphi(\rho(\mathbf{x}, \theta)) d\mathbf{x} \tag{17}$$

and

$$\beta\_{ij}(\theta) := \int\_{X} \left\{ \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{i}} \cdot \frac{\partial \varphi(\rho(\mathbf{x}, \theta))}{\partial \theta^{j}} + \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{j}} \cdot \frac{\partial \varphi(\rho(\mathbf{x}, \theta))}{\partial \theta^{i}} \right\} d\mathbf{x}.\tag{18}$$

We remark that, if non-degenerate, *α* and *β* provide semi-Riemannian metrics. In this case, these metrics are also of Fisher type, as they express "means" w.r.t. the PDF *ρ* of two "derived information matrices", of coefficients *ρ*<sup>−1</sup> · *ρ<sub>ij</sub>* · *ϕ*(*ρ*) and *ρ*<sup>−1</sup> · (*ρ<sub>i</sub>* · *ϕ<sub>j</sub>*(*ρ*) + *ρ<sub>j</sub>* · *ϕ<sub>i</sub>*(*ρ*)), respectively, where the lower indices denote partial derivatives w.r.t. the parameters.

**Example 3.** *Consider the particular case of the BGS-entropy, with ϕ := log.*

*(i) In this case, both previous GFM1 and GFM2 coincide with the classical (Riemannian) Fisher metric g*<sup>0</sup> *associated with H (or ϕ) [20].*

*In the general case, it would be interesting to find all the controlling functions ϕ, for which g coincides with g*˜*. Does this property necessarily imply that ϕ is proportional with log, modulo a non-null constant? A further step would be to look for appropriate functions ϕ, in order that g and g*˜*: be homothetic or conformal; have the same geodesics; have the same curvature, etc. To this differential geometric viewpoint, a statistical counterpart may eventually correspond.*

*(ii) Let X ⊂ ℝ<sup>m</sup> be an open set, let C = C(x), F<sub>1</sub> = F<sub>1</sub>(x), ..., F<sub>n</sub> = F<sub>n</sub>(x) be smooth functions on X and let ν = ν(θ) be a smooth function of the parameters. Consider ρ : X × ℝ<sup>n</sup> → ℝ, the PDF of exponential type given by*

$$\rho(\mathbf{x}, \theta) := \exp\{C(\mathbf{x}) - \nu(\theta) + \sum\_{i=1}^{n} F\_i(\mathbf{x})\theta^i\}.$$

*The associated Fisher metric is g = Hess ν, which is a Hessian metric.*

*(iii) For this choice of the function ϕ, the derived information matrices associated with α and β have the coefficients ρ<sup>−1</sup> · ρ<sub>ij</sub> · log(ρ) and 2ρ<sup>−2</sup> · ρ<sub>i</sub> · ρ<sub>j</sub>, respectively, for i, j = 1, …, n. The "perturbed" Hessian matrix associated with α is similar to the one studied in some recent statistical applications (see, for example, [46]).*
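The coincidence of GFM1 and GFM2 in the BGS case, and their separation for other controlling functions, can be observed on a one-parameter toy model. The sketch below is an illustration under stated assumptions, not taken from the paper: it uses the exponential PDF *ρ*(*x*, *θ*) = *θ* exp(−*θx*) on [0, ∞) (so *n* = 1), truncates the integrals in (14) and (15) to [0, 40], and approximates the *θ*-derivatives of *ϕ* ◦ *ρ* by central finite differences. All function names and numerical settings are hypothetical.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def rho(x, theta):
    """Exponential PDF on [0, inf) with rate theta (assumed toy model)."""
    return theta * math.exp(-theta * x)

def tsallis_log(y, q=0.5):
    """Tsallis q-logarithm, Formula (3)."""
    return (y ** (1 - q) - 1) / (1 - q)

def gfm1(gen_log, theta, h=1e-3, upper=40.0):
    """GFM1 of (14) for n = 1, with a central second difference in theta."""
    def integrand(x):
        d2 = (gen_log(rho(x, theta + h)) - 2 * gen_log(rho(x, theta))
              + gen_log(rho(x, theta - h))) / h ** 2
        return rho(x, theta) * d2
    return -simpson(integrand, 0.0, upper)

def gfm2(gen_log, theta, h=1e-3, upper=40.0):
    """GFM2 of (15) for n = 1, with a central first difference in theta."""
    def integrand(x):
        d1 = (gen_log(rho(x, theta + h)) - gen_log(rho(x, theta - h))) / (2 * h)
        return rho(x, theta) * d1 ** 2
    return simpson(integrand, 0.0, upper)

# BGS case: both values approximate the classical Fisher information 1/theta^2.
print(gfm1(math.log, 2.0), gfm2(math.log, 2.0))
# Tsallis case: the two generalized Fisher metrics no longer coincide.
print(gfm1(tsallis_log, 1.0), gfm2(tsallis_log, 1.0))
```

In this toy model the metrics reduce to single positive numbers, so the example only shows that *g* and *g̃* differ in value; it says nothing about index or curvature.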

**Remark 1.** *(i) We give an interpretation and a motivation for the definition of the GFM1, in a slightly more general case than [20]. Consider ϕ a fixed controlling function. Let ρ = ρ(x, θ) and σ := ρ(x, θ<sub>0</sub>) be two families of parameterized PDFs over X ⊂ ℝ<sup>m</sup>, with θ, θ<sub>0</sub> ∈ ℝ<sup>n</sup>, and let*

$$D(\rho \parallel \sigma)(\theta, \theta\_0) := \int\_X \rho(\mathbf{x}, \theta) \cdot [\varphi(\rho(\mathbf{x}, \theta)) - \varphi(\sigma(\mathbf{x}, \theta\_0))] d\mathbf{x}$$

*be the generalized (difference) relative entropy between them, as in (10). Denote Δθ := θ − θ<sub>0</sub> and suppose its norm to be infinitesimally small. We know that D(ρ ‖ σ) has a unique minimum for ρ = σ, i.e., for θ<sub>0</sub> = θ. The Taylor decomposition around θ = θ<sub>0</sub> gives*

$$D(\rho \parallel \sigma)(\theta, \theta\_0) = -\frac{1}{2} \int\_X \rho(\mathbf{x}, \theta\_0) \cdot \Delta \theta^i \cdot \Delta \theta^j \cdot \left(Hess\_{\varphi \circ \rho}\right)\_{ij}(\theta\_0) \, d\mathbf{x} + \mathcal{O}((\Delta \theta)^3) =$$

$$= \frac{1}{2} \cdot \Delta \theta^i \cdot \Delta \theta^j \cdot g\_{ij}(\theta\_0) + \mathcal{O}((\Delta \theta)^3).$$

*The second order approximation of this expression is precisely half of the GFM1 g, evaluated at θ<sub>0</sub>.*

*When ϕ := log, we recover the interpretation given in [20].*

*(ii) We do not know a similar interpretation for the GFM2 g̃.*

*(iii) The generalized group relative divergences from Example 2 (iii) provide analogous formulas. We shall study them in the next section, in the particular case of the φ-deformed (Naudts) entropy.*

*(iv) The definition of Fisher metrics described previously is closely related to the need for understanding a variation of a PDF w.r.t. another (reference) one; the output of this "variational calculus factory" are functions. We signal here the forthcoming book [47], containing new revolutionary ideas in Variational calculus, including invariants of tensorial type, motivated by differential geometric problems; this source provides new insights for the definition and the study of divergence-like tensor fields, as a path toward a new bundle spaces approach in Statistics.*

*(v) All the previous tensor fields g, g̃, h, α, and β have constant index on each connected component of their definition domains.*

*An open problem is to find the most general hypotheses under which these tensor fields are non-degenerate (in order to define semi-Riemannian metrics). Locally, the answer is simple: let θ<sub>0</sub> be a point in the parameter space, such that the determinant of the corresponding matrix, evaluated at θ<sub>0</sub>, is not null. Then, the tensor field is non-degenerate in an open neighborhood of θ<sub>0</sub>. For many families of examples (and in Section 5 we add several more), this property holds true. A common practice in the literature is to stop here, without investigating global conditions which are fulfilled in general cases. To our knowledge, global existence results for Fisher metrics, in the general setting, are not proven yet. Moreover, the eventual singular points have an interest of their own, as they may signal (in a suitable statistical model) a phase transition ([48]).*

*We consider it useful to point out here the paper [49], where a different but correlated problem is studied: namely, to what extent the Fisher metric is (globally) unique, modulo the action of a diffeomorphism group.*

#### **4. The Fisher Geometries Associated with GGEFs Based on** *φ***-Deformed (Naudts) Entropies and Divergences**

We particularize now the results from Section 3, for the case of the Naudts entropies. Let us fix the context more precisely.

Consider *φ* a positive, differentiable and strictly increasing function as in Example 1 (v) and the *φ*-deformed (Naudts) logarithm log<sup>*N*</sup><sub>*φ*</sub> defined in Formula (5). Let *ρ* : *X* × Θ → ℝ, *ρ* = *ρ*(*x*, *θ*) be a family of parameterized PDFs, as in Section 3. The associated GFM1 *g* and the GFM2 *g̃* are obtained as particular cases from (14) and (15):

$$g\_{ij}(\theta) := -\int\_{X} \rho(\mathbf{x}, \theta) \frac{\partial^2 \log\_{\phi}^N(\rho(\mathbf{x}, \theta))}{\partial \theta^i \partial \theta^j} d\mathbf{x} \quad , \quad i, j = \overline{1, n} \tag{19}$$

and

$$\tilde{g}\_{ij}(\theta) := \int\_{X} \rho(\mathbf{x}, \theta) \frac{\partial \log^{N}\_{\phi}(\rho(\mathbf{x}, \theta))}{\partial \theta^{i}} \cdot \frac{\partial \log^{N}\_{\phi}(\rho(\mathbf{x}, \theta))}{\partial \theta^{j}} d\mathbf{x} \quad , \quad i, j = \overline{1, n}. \tag{20}$$

We suppose, as usual, that *g* and *g̃* are non-degenerate and that *g* has a constant index on Θ.

We also consider, via (16), the associated Hessian metric *h* = *h*(*θ*)

$$h\_{ij}(\theta) = -\frac{\partial^2}{\partial \theta^i \partial \theta^j} \left\{ \int\_X \rho(\mathbf{x}, \theta) \cdot \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) d\mathbf{x} \right\} \quad , \quad i, j = \overline{1, n}. \tag{21}$$

**Proposition 1.** *With the previous notations, for every i*, *j* = 1, *n, we have*

$$
g\_{ij}(\theta) = \int\_{X} \rho(\mathbf{x}, \theta) \Big\{ \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{i}} \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{j}} \cdot \phi^{-2}(\rho(\mathbf{x}, \theta)) \cdot \phi^{\prime}(\rho(\mathbf{x}, \theta)) - \tag{22}
$$

$$
- \frac{\partial^2 \rho(\mathbf{x}, \theta)}{\partial \theta^i \partial \theta^j} \cdot \phi^{-1}(\rho(\mathbf{x}, \theta)) \Big\} d\mathbf{x},
$$

$$
\tilde{g}\_{ij}(\theta) = \int\_{X} \rho(\mathbf{x}, \theta) \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{i}} \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{j}} \cdot \phi^{-2}(\rho(\mathbf{x}, \theta)) d\mathbf{x},\tag{23}
$$

*and*

$$h\_{ij}(\theta) = \int\_X \Big\{ \rho(\mathbf{x}, \theta) \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^i} \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^j} \cdot \phi^{-2}(\rho(\mathbf{x}, \theta)) \cdot \phi^\prime(\rho(\mathbf{x}, \theta)) - \tag{24}$$

$$-\frac{\partial^2 \rho(\mathbf{x}, \theta)}{\partial \theta^i \partial \theta^j} \cdot \log^N\_{\phi}(\rho(\mathbf{x}, \theta)) - 2 \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^i} \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^j} \cdot \phi^{-1}(\rho(\mathbf{x}, \theta)) - $$

$$-\rho(\mathbf{x}, \theta) \cdot \frac{\partial^2 \rho(\mathbf{x}, \theta)}{\partial \theta^i \partial \theta^j} \cdot \phi^{-1}(\rho(\mathbf{x}, \theta)) \Big\} d\mathbf{x}.$$

In this case, *α* and *β* are given by

$$\alpha\_{ij}(\theta) := \int\_X \frac{\partial^2 \rho(\mathbf{x}, \theta)}{\partial \theta^i \partial \theta^j} \cdot \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) d\mathbf{x}$$

and

$$\beta\_{ij}(\theta) := \int\_{X} \left\{ \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{i}} \cdot \frac{\partial \log^{N}\_{\phi}(\rho(\mathbf{x}, \theta))}{\partial \theta^{j}} + \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{j}} \cdot \frac{\partial \log^{N}\_{\phi}(\rho(\mathbf{x}, \theta))}{\partial \theta^{i}} \right\} d\mathbf{x}.$$

**Corollary 1.** *In a condensed form, we have the following relation*

$$h = g - \alpha - \beta.$$
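The relation *h* = *g* − *α* − *β* can be sanity-checked numerically on a toy model. The sketch below is an illustration under stated assumptions, not a computation from the paper: it takes *φ*(*y*) = *y*<sup>*Q*</sup> with *Q* = 0.5 (the Tsallis case), the one-parameter exponential PDF *ρ*(*x*, *θ*) = *θ* exp(−*θx*) on [0, ∞), and integrals truncated to [0, 40]. Here *g* is computed as the integral of *ρ* · {*ρ<sub>θ</sub>*² · *φ*′(*ρ*)/*φ*(*ρ*)² − *ρ<sub>θθ</sub>*/*φ*(*ρ*)}, which follows from (19) and (5) by the chain rule; *α* and *β* use their definitions, and *h* is obtained from a finite-difference second derivative of the entropy (13). All names are hypothetical.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

Q = 0.5  # illustrative Tsallis index (assumption), so phi(y) = y**Q

def phi(y):
    return y ** Q

def dphi(y):
    return Q * y ** (Q - 1)

def log_phi(y):
    """Closed form of Formula (5) for phi(y) = y**Q."""
    return (y ** (1 - Q) - 1) / (1 - Q)

def rho(x, t):
    """Exponential PDF with rate t (assumed toy model)."""
    return t * math.exp(-t * x)

def rho_t(x, t):
    """First derivative of rho w.r.t. the parameter t."""
    return (1 - t * x) * math.exp(-t * x)

def rho_tt(x, t):
    """Second derivative of rho w.r.t. the parameter t."""
    return (t * x * x - 2 * x) * math.exp(-t * x)

def integral(f, upper=40.0):
    return simpson(f, 0.0, upper)

def g(t):
    """GFM1, written in terms of phi and rho only."""
    return integral(lambda x: rho(x, t) * (
        rho_t(x, t) ** 2 * dphi(rho(x, t)) / phi(rho(x, t)) ** 2
        - rho_tt(x, t) / phi(rho(x, t))))

def alpha(t):
    return integral(lambda x: rho_tt(x, t) * log_phi(rho(x, t)))

def beta(t):
    return integral(lambda x: 2 * rho_t(x, t) ** 2 / phi(rho(x, t)))

def entropy(t):
    """Generalized entropy (13) with the phi-deformed logarithm."""
    return -integral(lambda x: rho(x, t) * log_phi(rho(x, t)))

def h(t, eps=1e-3):
    """Hessian (16), via a central second difference of the entropy."""
    return (entropy(t + eps) - 2 * entropy(t) + entropy(t - eps)) / eps ** 2

theta = 1.0
print(h(theta), g(theta) - alpha(theta) - beta(theta))
```

The two printed numbers agree up to discretization error; this is a consistency check for one model and one parameter value, not a proof.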

We consider now, in addition, a fixed formal group logarithm *G*, as in Example 1 (vi). Let *σ* := *ρ*(*x*, *θ*<sub>0</sub>) be the associated parameterized PDF and *D*<sub>*G*,*φ*</sub> = *D*<sub>*G*,*φ*</sub>(*ρ* ‖ *σ*)(*θ*, *θ*<sub>0</sub>) be the generalized (difference) group relative entropy (a.k.a. the generalized (difference) group divergence), as a particularization of (10) and Remark 1 (i), (iii), written as

$$D\_{G, \phi}(\rho \parallel \sigma)(\theta, \theta\_0) = \int\_X \rho(\mathbf{x}, \theta) \cdot G\left(\log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0))\right) d\mathbf{x}.$$

Denote the generalized group Fisher metric associated with *D*<sub>*G*,*φ*</sub> by

$$\xi\_{jk}(\theta\_0) := \frac{\partial^2 D\_{G,\phi}(\rho \parallel \sigma)(\theta, \theta\_0)}{\partial \theta^j \partial \theta^k} \Big|\_{\theta = \theta\_0}. \tag{25}$$

This Hessian-type metric will be calculated in the next result.

**Proposition 2.** *With the previous notations, we have the relation*

$$
\xi\_{jk}(\theta\_0) = G'(0) \cdot \Big\{ \int\_X \frac{\partial^2 \rho(\mathbf{x}, \theta\_0)}{\partial \theta^j \partial \theta^k} \cdot \frac{\rho(\mathbf{x}, \theta\_0)}{\phi(\rho(\mathbf{x}, \theta\_0))} d\mathbf{x} + \tag{26}
$$

$$
+ 2 \int\_X \phi(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^j} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^k} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) d\mathbf{x} -
$$

$$
- \int\_X \rho(\mathbf{x}, \theta\_0) \cdot \phi'(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^j} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^k} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) d\mathbf{x} \Big\} +
$$

$$
+ G''(0) \cdot \int\_X \rho(\mathbf{x}, \theta\_0) \cdot \frac{\partial}{\partial \theta^j} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^k} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) d\mathbf{x},
$$

*which may be re-written as depending only on φ and ρ, in*

$$\mathcal{G}\_{\vec{\mathbb{R}}}(\theta\_0) = G'(0) \cdot \left\{ \int\_X \frac{\partial^2 \rho(\mathbf{x}, \theta\_0)}{\partial \theta^j \partial \theta^k} \cdot \frac{\rho(\mathbf{x}, \theta\_0)}{\phi(\rho(\mathbf{x}, \theta\_0))} d\mathbf{x} + \mathbf{Q} \right\}$$

$$+ 2 \int\_X \Phi^{-1}(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^j} \rho(\mathbf{x}, \theta\_0) \cdot \frac{\partial}{\partial \theta^k} \rho(\mathbf{x}, \theta\_0) d\mathbf{x} -$$

$$- \int\_X \rho(\mathbf{x}, \theta\_0) \cdot \Phi'(\rho(\mathbf{x}, \theta\_0)) \cdot \Phi^{-2}(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^j} \rho(\mathbf{x}, \theta\_0) \cdot \frac{\partial}{\partial \theta^k} \rho(\mathbf{x}, \theta\_0) d\mathbf{x} \right\} +$$

$$+ G''(0) \cdot \int\_X \rho(\mathbf{x}, \theta\_0) \cdot \Phi^{-2}(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^j} \rho(\mathbf{x}, \theta\_0) \cdot \frac{\partial}{\partial \theta^k} \rho(\mathbf{x}, \theta\_0) d\mathbf{x}.$$

**Proof.** We follow the line of reasoning from [20]. As

$$\log\_{\phi}^{N}(\rho(\mathbf{x},\theta)) = \int\_{1}^{\rho(\mathbf{x},\theta)} \frac{1}{\phi(y)} dy,$$

we calculate

$$\frac{\partial}{\partial \theta^k} \log^N\_{\phi}(\rho(\mathbf{x}, \theta)) = \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^k} \cdot \frac{1}{\phi(\rho(\mathbf{x}, \theta))}.$$

Suppose, for the moment, that *θ*<sub>0</sub> is fixed. Denote

$$A(\theta) := D\_{G, \phi}(\rho(\mathbf{x}, \theta) \parallel \rho(\mathbf{x}, \theta\_0)).$$

We calculate successively

$$\begin{split}
\frac{\partial A}{\partial \theta^k}(\theta) &= \int\_X \left\{ \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^k} \cdot G \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) + \right. \\
&\quad \left. + \rho(\mathbf{x}, \theta) \cdot G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \cdot \frac{\partial}{\partial \theta^k} \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) \right\} d\mathbf{x} = \\
&= \int\_X \left\{ \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^k} \cdot G \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) + \right. \\
&\quad \left. + \frac{\rho(\mathbf{x}, \theta)}{\phi(\rho(\mathbf{x}, \theta))} \cdot G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^k} \right\} d\mathbf{x} = \\
&= \int\_X \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^k} \cdot \left\{ G \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) + \right. \\
&\quad \left. + \frac{\rho(\mathbf{x}, \theta)}{\phi(\rho(\mathbf{x}, \theta))} \cdot G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \right\} d\mathbf{x}
\end{split}$$

and

$$\begin{split}
\frac{\partial^2 A}{\partial \theta^j \partial \theta^k}(\theta) &= \int\_X \Bigg[ \frac{\partial^2 \rho(\mathbf{x}, \theta)}{\partial \theta^j \partial \theta^k} \cdot \left\{ G \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) + \right. \\
&\quad \left. + \frac{\rho(\mathbf{x}, \theta)}{\phi(\rho(\mathbf{x}, \theta))} \cdot G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \right\} + \\
&\quad + \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^k} \cdot \left\{ \frac{\partial}{\partial \theta^j} \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) \cdot G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) + \right. \\
&\quad + G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^j} \cdot \frac{\phi(\rho(\mathbf{x}, \theta)) - \rho(\mathbf{x}, \theta) \cdot \phi'(\rho(\mathbf{x}, \theta))}{\phi^2(\rho(\mathbf{x}, \theta))} + \\
&\quad \left. + \frac{\rho(\mathbf{x}, \theta)}{\phi(\rho(\mathbf{x}, \theta))} \cdot G'' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \cdot \frac{\partial}{\partial \theta^j} \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) \right\} \Bigg] d\mathbf{x} = \\
&= \int\_X \Bigg[ \frac{\partial^2 \rho(\mathbf{x}, \theta)}{\partial \theta^j \partial \theta^k} \cdot \left\{ G \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) + \right. \\
&\quad \left. + \frac{\rho(\mathbf{x}, \theta)}{\phi(\rho(\mathbf{x}, \theta))} \cdot G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \right\} + \\
&\quad + \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^j} \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^k} \cdot \frac{1}{\phi(\rho(\mathbf{x}, \theta))} \cdot \left\{ G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) + \right. \\
&\quad + G' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \cdot \frac{\phi(\rho(\mathbf{x}, \theta)) - \rho(\mathbf{x}, \theta) \cdot \phi'(\rho(\mathbf{x}, \theta))}{\phi(\rho(\mathbf{x}, \theta))} + \\
&\quad \left. + \frac{\rho(\mathbf{x}, \theta)}{\phi(\rho(\mathbf{x}, \theta))} \cdot G'' \left( \log\_{\phi}^N(\rho(\mathbf{x}, \theta)) - \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \right) \right\} \Bigg] d\mathbf{x}.
\end{split}$$

We set *θ* := *θ*<sub>0</sub> and use the property *G*(0) = 0. It follows that

$$\begin{split}
\hat{g}\_{jk}(\theta\_0) &:= \frac{\partial^2 A}{\partial \theta^j \partial \theta^k} \Big|\_{\theta = \theta\_0} = \int\_X \Bigg[ \frac{\partial^2 \rho(\mathbf{x}, \theta\_0)}{\partial \theta^j \partial \theta^k} \cdot \left\{ G(0) + \frac{\rho(\mathbf{x}, \theta\_0)}{\phi(\rho(\mathbf{x}, \theta\_0))} \cdot G'(0) \right\} + \\
&\quad + \frac{\partial \rho(\mathbf{x}, \theta\_0)}{\partial \theta^j} \cdot \frac{\partial \rho(\mathbf{x}, \theta\_0)}{\partial \theta^k} \cdot \frac{1}{\phi(\rho(\mathbf{x}, \theta\_0))} \cdot \left\{ G'(0) + \frac{\rho(\mathbf{x}, \theta\_0)}{\phi(\rho(\mathbf{x}, \theta\_0))} \cdot G''(0) + \right. \\
&\quad \left. + G'(0) \cdot \frac{\phi(\rho(\mathbf{x}, \theta\_0)) - \rho(\mathbf{x}, \theta\_0) \cdot \phi'(\rho(\mathbf{x}, \theta\_0))}{\phi(\rho(\mathbf{x}, \theta\_0))} \right\} \Bigg] d\mathbf{x} = \\
&= G'(0) \cdot \int\_X \frac{\partial^2 \rho(\mathbf{x}, \theta\_0)}{\partial \theta^j \partial \theta^k} \cdot \frac{\rho(\mathbf{x}, \theta\_0)}{\phi(\rho(\mathbf{x}, \theta\_0))} d\mathbf{x} + \\
&\quad + \int\_X \phi(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^j} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^k} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \cdot G'(0) \cdot \left\{ 2 - \frac{\rho(\mathbf{x}, \theta\_0) \cdot \phi'(\rho(\mathbf{x}, \theta\_0))}{\phi(\rho(\mathbf{x}, \theta\_0))} \right\} d\mathbf{x} + \\
&\quad + G''(0) \cdot \int\_X \rho(\mathbf{x}, \theta\_0) \cdot \frac{\partial}{\partial \theta^j} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) \cdot \frac{\partial}{\partial \theta^k} \log\_{\phi}^N(\rho(\mathbf{x}, \theta\_0)) d\mathbf{x}.
\end{split}$$

From this chain of equalities, we obtain both (26) and (27).

Suppose, moreover, that *G*(*t*) = *t*. Then, we have

$$\begin{split}
\hat{g}\_{jk}(\theta\_0) = \int\_X \Bigg\{ & \frac{\partial^2 \rho(\mathbf{x}, \theta\_0)}{\partial \theta^j \partial \theta^k} \cdot \frac{\rho(\mathbf{x}, \theta\_0)}{\phi(\rho(\mathbf{x}, \theta\_0))} + 2 \frac{\partial \rho(\mathbf{x}, \theta\_0)}{\partial \theta^j} \cdot \frac{\partial \rho(\mathbf{x}, \theta\_0)}{\partial \theta^k} \cdot \frac{1}{\phi(\rho(\mathbf{x}, \theta\_0))} - \\
& - \frac{\partial \rho(\mathbf{x}, \theta\_0)}{\partial \theta^j} \cdot \frac{\partial \rho(\mathbf{x}, \theta\_0)}{\partial \theta^k} \cdot \frac{\rho(\mathbf{x}, \theta\_0) \cdot \phi'(\rho(\mathbf{x}, \theta\_0))}{\phi^2(\rho(\mathbf{x}, \theta\_0))} \Bigg\} d\mathbf{x}.
\end{split} \tag{28}$$

Rewriting this formula in a condensed form, we obtain the following result, which complements Corollary 1.

**Corollary 2.** *With the previous notations, for G*(*t*) = *t, we obtain*

$$
\hat{g} = -h - \alpha.
$$

By analogy, starting with a generalized (quotient) group relative entropy (a.k.a. the generalized (quotient) group divergence) *D̃*<sub>*G*,*φ*</sub> = *D̃*<sub>*G*,*φ*</sub>(*ρ* ∥ *σ*)(*θ*, *θ*<sub>0</sub>), obtained as a particularization of (9), we shall obtain, in the sequel, other Fisher-like metrics, similar to the ones in Proposition 2 and Corollary 2.

Denote the generalized group Fisher metric associated with *D̃*<sub>*G*,*φ*</sub> by

$$\overline{g}\_{jk}(\theta\_0) := \frac{\partial^2 \tilde{D}\_{G,\phi}(\rho \parallel \sigma)(\theta, \theta\_0)}{\partial \theta^j \partial \theta^k} \Big|\_{\theta = \theta\_0}. \tag{29}$$

**Proposition 3.** *With the previous notations, we have the relation*

$$\overline{\mathbf{g}} = \left\{ G'(0) \cdot \left[ \frac{2}{\phi(1)} - \frac{\phi'(1)}{\phi^2(1)} \right] + G''(0) \cdot \frac{1}{\phi^2(1)} \right\} \cdot \mathbf{g}^0,\tag{30}$$

*where g*<sup>0</sup> *denotes the classical Fisher metric.*

**Proof.** We adapt the proof of Proposition 2, from the divergence *D*<sub>*G*,*φ*</sub> to the divergence *D̃*<sub>*G*,*φ*</sub>. Suppose that *θ*<sub>0</sub> is fixed. Denote

$$\tilde{A}(\theta) := \tilde{D}\_{G, \phi}(\rho(\mathbf{x}, \theta) \parallel \rho(\mathbf{x}, \theta\_0)).$$

It follows that

$$\begin{split}
\frac{\partial \tilde{A}}{\partial \theta^{k}}(\theta) &= \int\_{X} \left\{ \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{k}} \cdot G \left( \log\_{\phi}^{N} \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right] \right) + \right. \\
&\quad \left. + \rho(\mathbf{x}, \theta) \cdot G' \left( \log\_{\phi}^{N} \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right] \right) \cdot \frac{\partial}{\partial \theta^{k}} \log\_{\phi}^{N} \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right] \right\} d\mathbf{x} = \\
&= \int\_{X} \left\{ \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{k}} \cdot G \left( \log\_{\phi}^{N} \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right] \right) + \right. \\
&\quad \left. + \rho(\mathbf{x}, \theta) \cdot G' \left( \log\_{\phi}^{N} \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right] \right) \cdot \phi^{-1}\left( \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right) \cdot \rho^{-1}(\mathbf{x}, \theta\_{0}) \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{k}} \right\} d\mathbf{x} = \\
&= \int\_{X} \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^{k}} \cdot \left\{ G \left( \log\_{\phi}^{N} \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right] \right) + \right. \\
&\quad \left. + \rho(\mathbf{x}, \theta) \cdot G' \left( \log\_{\phi}^{N} \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right] \right) \cdot \phi^{-1}\left( \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_{0})} \right) \cdot \rho^{-1}(\mathbf{x}, \theta\_{0}) \right\} d\mathbf{x}
\end{split}$$

and

$$\begin{split}
\frac{\partial^2 \tilde{A}}{\partial \theta^j \partial \theta^k}(\theta) &= \int\_X \Bigg[ \frac{\partial^2 \rho(\mathbf{x}, \theta)}{\partial \theta^j \partial \theta^k} \cdot \left\{ G \left( \log\_{\phi}^N \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right] \right) + \right. \\
&\quad \left. + \rho(\mathbf{x}, \theta) \cdot \phi^{-1}\left( \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right) \cdot \rho^{-1}(\mathbf{x}, \theta\_0) \cdot G' \left( \log\_{\phi}^N \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right] \right) \right\} + \\
&\quad + \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^k} \cdot \left\{ \phi^{-1}\left( \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right) \cdot \rho^{-1}(\mathbf{x}, \theta\_0) \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^j} \cdot G' \left( \log\_{\phi}^N \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right] \right) + \right. \\
&\quad + G' \left( \log\_{\phi}^N \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right] \right) \cdot \phi^{-2}\left( \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right) \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^j} \cdot \\
&\quad \cdot \left[ \rho^{-1}(\mathbf{x}, \theta\_0) \cdot \phi\left( \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right) - \rho(\mathbf{x}, \theta) \cdot \rho^{-2}(\mathbf{x}, \theta\_0) \cdot \phi'\left( \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right) \right] + \\
&\quad \left. + \rho(\mathbf{x}, \theta) \cdot \phi^{-2}\left( \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right) \cdot \rho^{-2}(\mathbf{x}, \theta\_0) \cdot G'' \left( \log\_{\phi}^N \left[ \frac{\rho(\mathbf{x}, \theta)}{\rho(\mathbf{x}, \theta\_0)} \right] \right) \cdot \frac{\partial \rho(\mathbf{x}, \theta)}{\partial \theta^j} \right\} \Bigg] d\mathbf{x}.
\end{split}$$

We set *θ* := *θ*<sub>0</sub> and use the property *G*(0) = 0. We obtain

$$\begin{split}
\overline{g}\_{jk}(\theta\_{0}) &:= \frac{\partial^{2} \tilde{A}}{\partial\theta^{j}\partial\theta^{k}} \Big|\_{\theta=\theta\_{0}} = \int\_{X} \Bigg[ \frac{\partial^{2}\rho(\mathbf{x},\theta\_{0})}{\partial\theta^{j}\partial\theta^{k}} \cdot \left\{ G(0) + \frac{1}{\phi(1)} \cdot G'(0) \right\} + \\
&\quad + \frac{\partial\rho(\mathbf{x},\theta\_{0})}{\partial\theta^{j}} \cdot \frac{\partial\rho(\mathbf{x},\theta\_{0})}{\partial\theta^{k}} \cdot \frac{1}{\rho(\mathbf{x},\theta\_{0})} \cdot \left\{ \frac{1}{\phi(1)} \cdot G'(0) + \frac{1}{\phi^{2}(1)} \cdot G''(0) + G'(0) \cdot \frac{\phi(1) - \phi'(1)}{\phi^{2}(1)} \right\} \Bigg] d\mathbf{x} = \\
&= \left[ G'(0) \cdot \left( \frac{2}{\phi(1)} - \frac{\phi'(1)}{\phi^{2}(1)} \right) + G''(0) \cdot \frac{1}{\phi^{2}(1)} \right] \cdot \int\_{X} \rho^{-1}(\mathbf{x},\theta\_{0}) \cdot \frac{\partial\rho(\mathbf{x},\theta\_{0})}{\partial\theta^{j}} \cdot \frac{\partial\rho(\mathbf{x},\theta\_{0})}{\partial\theta^{k}} d\mathbf{x} + \\
&\quad + G'(0) \cdot \frac{1}{\phi(1)} \cdot \int\_{X} \frac{\partial^{2}\rho(\mathbf{x},\theta\_{0})}{\partial\theta^{j}\partial\theta^{k}} d\mathbf{x}.
\end{split}$$

The first integral equals *g*<sup>0</sup><sub>*jk*</sub>(*θ*<sub>0</sub>). The second integral vanishes: since ∫<sub>*X*</sub> *ρ*(*x*, *θ*) *dx* = 1 for every *θ*, differentiating twice under the integral sign gives zero. This proves Formula (30).

**Remark 2.** *(i) In Proposition 1, we establish the basic formulas for the future development of associated Riemannian geometries determined by g, g*˜*, h, α, β, in terms of the function φ-deformed (Naudts) entropy (curvature, geodesics, Riemannian distance in the positive definite case). Examples of scalar curvature functions derived from these formulas will be shown in the next section. The coefficients of GFM1 g extend known ones from [29], derived for PDFs of exponential type and for particular functions φ. The other Fisher metrics are new.*

*An interesting consequence of Proposition 1 is the fact that g and g̃ do not coincide in general, unlike in the case of the Napierian (natural) logarithm. This can be seen directly, by comparing their φ-dependent coefficients.*

*(ii) In Proposition 2, we derive the Fisher-like metric g*ˆ *associated with the divergence DG*,*φ, as a generalization of a construction in [30] for the case of a Kullback–Leibler divergence, of a trivial group logarithm G* = *id and for PDFs of exponential type.*

*(iii) In Proposition 3, the Fisher-like metric ḡ associated with the divergence D̃<sub>G,φ</sub> is, to our knowledge, completely new.*

*The metrics in Formula (30) are homothetic, via a constant k<sub>G,φ</sub> which is implicitly supposed to be non-zero. Interestingly, k<sub>G,φ</sub> depends only on the values φ(1), φ′(1) of the deformation function at 1 and on the values G′(0), G″(0) of the group logarithm at 0. Its independence of the PDFs gives k<sub>G,φ</sub> a "universality" feature, which probably corresponds to some special, as yet uncovered, property of the statistical model.*

*Suppose, moreover, that G(t) = t. Substituting the values G′(0) = 1 and G″(0) = 0 into (30), we obtain*

$$\overline{\mathfrak{g}} = \left[ \frac{2}{\phi(1)} - \frac{\phi'(1)}{\phi^2(1)} \right] \cdot \mathfrak{g}^0. \tag{31}$$
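Formula (31) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it takes the normal family (32) of Section 5 and the power deformation *φ*(*t*) = *t*<sup>*c*</sup>, for which *φ*(1) = 1 and *φ*′(1) = *c*, so that (31) predicts the Hessian (2 − *c*) · *g*<sup>0</sup>. The function names, the midpoint quadrature and the finite-difference step are hypothetical choices, not part of the paper.

```python
import math

def rho(x, mean, std):
    """Normal PDF, as in (32)."""
    return math.exp(-(x - mean) ** 2 / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)

def log_phi(r, c):
    """Naudts logarithm for the assumed phi(t) = t**c: integral of y**(-c) from 1 to r."""
    return math.log(r) if c == 1 else (r ** (1 - c) - 1) / (1 - c)

def quotient_divergence(theta, theta0, c, n=4000, width=10.0):
    """Midpoint-rule value of the quotient divergence for G(t) = t (assumed setting)."""
    lo = theta0[0] - width * theta0[1]
    h = 2 * width * theta0[1] / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        p = rho(x, *theta)
        total += p * log_phi(p / rho(x, *theta0), c) * h
    return total

def hessian_diag(theta0, c, index, eps=1e-3):
    """Second central difference of the divergence along coordinate `index`, at theta0."""
    def shifted(sign):
        theta = list(theta0)
        theta[index] += sign * eps
        return quotient_divergence(theta, theta0, c)
    centre = quotient_divergence(theta0, theta0, c)
    return (shifted(+1) - 2 * centre + shifted(-1)) / eps ** 2

c, theta0 = 0.5, (0.0, 1.0)
g_bar_11 = hessian_diag(theta0, c, 0)   # (31) predicts (2 - c) / std**2
g_bar_22 = hessian_diag(theta0, c, 1)   # (31) predicts 2 * (2 - c) / std**2
```

For *c* = 1 the same code reduces to the Kullback–Leibler case, where the Hessian is the classical Fisher metric itself.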

#### **5. Examples**

We particularize now the results from Section 4, for the case when *ρ* is an exponential PDF and *m* = 1, *n* = 2. The deforming function *φ* will be chosen conveniently, in order to be able to compute the integrals.

Let *X* := ℝ and *ρ* : ℝ × ℝ × (0, ∞) → ℝ be the exponential (normal) PDF given by

$$\rho(\mathbf{x}; \theta^1, \theta^2) = \frac{1}{\sqrt{2\pi}\theta^2} \cdot e^{-\frac{(\mathbf{x} - \theta^1)^2}{2(\theta^2)^2}}.\tag{32}$$

We denote the partial derivatives of *ρ*, with respect to the variables *θ*<sup>1</sup> and *θ*<sup>2</sup>, by *ρ*<sub>1</sub>, *ρ*<sub>2</sub>, *ρ*<sub>11</sub>, *ρ*<sub>12</sub>, *ρ*<sub>22</sub>. A short calculation ([34]) leads to the formulas

$$\rho\_1 = \frac{\mathbf{x} - \theta^1}{(\theta^2)^2} \cdot \rho \quad , \quad \rho\_2 = \left\{ \frac{(\mathbf{x} - \theta^1)^2}{(\theta^2)^3} - \frac{1}{\theta^2} \right\} \cdot \rho,$$

$$\rho\_{11} = \left\{ \frac{(\mathbf{x} - \theta^1)^2}{(\theta^2)^4} - \frac{1}{(\theta^2)^2} \right\} \cdot \rho \quad , \quad \rho\_{12} = \left\{ \frac{(\mathbf{x} - \theta^1)^3}{(\theta^2)^5} - \frac{3(\mathbf{x} - \theta^1)}{(\theta^2)^3} \right\} \cdot \rho,$$

$$\rho\_{22} = \left\{ \frac{(\mathbf{x} - \theta^1)^4}{(\theta^2)^6} - \frac{5(\mathbf{x} - \theta^1)^2}{(\theta^2)^4} + \frac{2}{(\theta^2)^2} \right\} \cdot \rho.$$

The classical Fisher metric *g*<sup>0</sup> has the coefficients *g*<sup>0</sup><sub>11</sub> = (*θ*<sup>2</sup>)<sup>−2</sup>, *g*<sup>0</sup><sub>12</sub> = *g*<sup>0</sup><sub>21</sub> = 0 and *g*<sup>0</sup><sub>22</sub> = 2(*θ*<sup>2</sup>)<sup>−2</sup> (see, for example, [2,34]).
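These formulas lend themselves to a quick numerical check. The following sketch, an illustration rather than part of the paper, integrates *ρ*<sub>*j*</sub> · *ρ*<sub>*k*</sub> / *ρ* over a wide interval to recover the classical Fisher coefficients, and also confirms that the second-order derivatives of *ρ* integrate to zero. The helper names, the sample point (mean 0.3, standard deviation 1.7) and the midpoint rule are assumptions made for this example.

```python
import math

def rho(x, mean, std):
    """Normal PDF (32); mean plays the role of theta^1, std of theta^2."""
    return math.exp(-(x - mean) ** 2 / (2 * std ** 2)) / (math.sqrt(2 * math.pi) * std)

def first_derivs(x, mean, std):
    """(rho_1, rho_2), using the closed forms stated in the text."""
    p, u = rho(x, mean, std), x - mean
    return u / std ** 2 * p, (u ** 2 / std ** 3 - 1 / std) * p

def second_derivs(x, mean, std):
    """(rho_11, rho_12, rho_22), using the closed forms stated in the text."""
    p, u = rho(x, mean, std), x - mean
    r11 = (u ** 2 / std ** 4 - 1 / std ** 2) * p
    r12 = (u ** 3 / std ** 5 - 3 * u / std ** 3) * p
    r22 = (u ** 4 / std ** 6 - 5 * u ** 2 / std ** 4 + 2 / std ** 2) * p
    return r11, r12, r22

def integrate(f, mean, std, n=20000, width=12.0):
    """Midpoint rule on [mean - width*std, mean + width*std] (illustrative choice)."""
    lo = mean - width * std
    h = 2 * width * std / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

def fisher_metric(mean, std):
    """Returns (g0_11, g0_12, g0_22) as integrals of rho_j * rho_k / rho."""
    def entry(j, k):
        def integrand(x):
            d = first_derivs(x, mean, std)
            return d[j] * d[k] / rho(x, mean, std)
        return integrate(integrand, mean, std)
    return entry(0, 0), entry(0, 1), entry(1, 1)

mean, std = 0.3, 1.7
g0 = fisher_metric(mean, std)            # expected: (std**-2, 0, 2 * std**-2)
second_integrals = [
    integrate(lambda x, k=k: second_derivs(x, mean, std)[k], mean, std)
    for k in range(3)
]                                        # each should be numerically zero
```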

For future calculations, we shall use the following simple result.

**Lemma 1.** *Let c, k<sub>1</sub>, k<sub>2</sub> be fixed real constants, with k<sub>1</sub> ≠ 0, k<sub>2</sub> ≠ 0. Then, the semi-Riemannian metric*

$$y^{-c} \cdot \begin{bmatrix} k\_1 & 0\\ 0 & k\_2 \end{bmatrix}$$

*on the set y ≠ 0 in ℝ<sup>2</sup> has the scalar curvature*

$$-\frac{c}{2k\_2} \cdot y^{c - 2}.$$
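Lemma 1 can be cross-checked numerically in the Riemannian case. The sketch below assumes *k*<sub>1</sub>, *k*<sub>2</sub> > 0 and *y* > 0, and uses the classical Gaussian curvature formula for an orthogonal metric *E* *dx*<sup>2</sup> + *G* *dy*<sup>2</sup> whose coefficients depend on *y* only, namely −(1 / (2√(*EG*))) · ∂<sub>*y*</sub>(*E*<sub>*y*</sub> / √(*EG*)). The comparison relies on the normalization in which the scalar curvature of a surface coincides with its Gaussian curvature, as used in this section. All names and step sizes are illustrative.

```python
import math

def curvature_numeric(c, k1, k2, y, h=1e-4):
    """Gaussian curvature of y**(-c) * diag(k1, k2) via nested central differences.

    Assumes k1, k2 > 0 and y > 0 so that sqrt(E * G) is real.
    """
    E = lambda t: k1 * t ** (-c)
    G = lambda t: k2 * t ** (-c)

    def flux(t):
        dE = (E(t + h) - E(t - h)) / (2 * h)      # dE/dy
        return dE / math.sqrt(E(t) * G(t))

    dflux = (flux(y + h) - flux(y - h)) / (2 * h)
    return -dflux / (2 * math.sqrt(E(y) * G(y)))

def curvature_lemma(c, k2, y):
    """Closed form stated in Lemma 1."""
    return -c / (2 * k2) * y ** (c - 2)

# Classical Fisher metric of the normal family: c = 2, k1 = 1, k2 = 2.
fisher_numeric = curvature_numeric(2.0, 1.0, 2.0, 1.3)
fisher_closed = curvature_lemma(2.0, 2.0, 1.3)

# A generic sample with non-integer exponent.
sample_gap = abs(curvature_numeric(0.7, 1.5, 0.8, 2.1) - curvature_lemma(0.7, 0.8, 2.1))
```

With *c* = 2 and *k*<sub>2</sub> = 2 the closed form gives the constant −1/2, independently of *y*.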

In the sequel, we give examples of the semi-Riemannian metrics from Propositions 1–3, under various particular assumptions.

**I—The case of** *g***.** Suppose *φ*(*t*) := *t*<sup>*c*</sup>, with *c* ∈ (0, 2) an arbitrary fixed parameter. From Formula (22), we calculate the coefficients

$$g\_{11} = K\_1(c) \cdot (\theta^2)^{c - 3} \quad , \quad g\_{12} = g\_{21} = 0 \quad , \quad g\_{22} = K\_2(c) \cdot (\theta^2)^{c - 3},$$

where

$$K\_1(c) = (2 - c)^{-\frac{3}{2}} \cdot (\sqrt{2\pi})^{c - 1} \quad , \quad K\_2(c) = (c^3 - 4c^2 + 6c - 1) \cdot (2 - c)^{-\frac{5}{2}} \cdot (\sqrt{2\pi})^{c - 1}.$$

There exists a unique *c*<sub>0</sub> ∈ (0.18, 0.19) such that *K*<sub>2</sub>(*c*<sub>0</sub>) = 0. For this value, *g* is degenerate. The metric *g* is Lorentzian when *c* ∈ (0, *c*<sub>0</sub>) and Riemannian when *c* ∈ (*c*<sub>0</sub>, 2).
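The location of *c*<sub>0</sub> can be reproduced with a few lines of code. Since the factors (2 − *c*)<sup>−5/2</sup> and (√(2π))<sup>*c*−1</sup> are positive on (0, 2), the sign of *K*<sub>2</sub> is that of the cubic *c*<sup>3</sup> − 4*c*<sup>2</sup> + 6*c* − 1. The sketch below, with illustrative function names and a plain bisection chosen for simplicity, brackets its root in the interval quoted above.

```python
def k2_cubic(c):
    """Polynomial factor of K_2(c); the remaining factors are positive on (0, 2)."""
    return c ** 3 - 4 * c ** 2 + 6 * c - 1

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; requires a sign change of f on [lo, hi]."""
    if f(lo) * f(hi) >= 0:
        raise ValueError("no sign change on the given interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

c0 = bisect(k2_cubic, 0.18, 0.19)

# The derivative 3c^2 - 8c + 6 has negative discriminant, so the cubic is
# strictly increasing and the root found above is the only real one.
discriminant = 8 ** 2 - 4 * 3 * 6
```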

The scalar curvature *S*<sup>{*c*}</sup> = *S*<sup>{*c*}</sup>(*θ*) of *g* is

$$S^{\{c\}}(\theta) = \frac{1}{2K\_2(c)} \cdot (c-3) \cdot (\theta^2)^{1-c}.$$

The scalar curvature *S*<sup>{*c*}</sup> does not vanish anywhere, and its sign is opposite to that of *K*<sub>2</sub>(*c*). Moreover, *S*<sup>{*c*}</sup> is constant if and only if *c* = 1, i.e., only when *g* is the classical Fisher metric *g*<sup>0</sup>. If the scalar curvature is used as a control, this may lead to a quick criterion for distinguishing the BGS entropy case from the *φ*-deformed (Naudts) entropy case. (The statistical interpretation of the scalar curvature of the Fisher metrics may be found in [20].)

We depict in Figure 1 (and magnify in Figure 2 around *c* = 1 and in Figure 3 around *c* = 0.19) how *S*<sup>{*c*}</sup> varies w.r.t. *c* and *θ*<sup>2</sup> (denoted *t*).

**Figure 1.** The variation of *S*<sup>{*c*}</sup> w.r.t. *c* ∈ (0, *c*<sub>0</sub>) ∪ (*c*<sub>0</sub>, 2) and *θ*<sup>2</sup> := *t*.

**Figure 2.** The variation of *S*<sup>{*c*}</sup> w.r.t. *c* ∈ (0.8, 1.2) and *θ*<sup>2</sup> := *t*.

**Figure 3.** The variation of *S*<sup>{*c*}</sup> w.r.t. *c* ∈ (0.18, 0.20) and *θ*<sup>2</sup> := *t*.

**II—The case of** *g̃***.** Suppose *φ*(*t*) := *t*<sup>*c*</sup>, with *c* ∈ (0, 3/2) an arbitrary fixed parameter. From Formula (23), we calculate the coefficients

$$\tilde{g}\_{11} = \tilde{K}\_1(c) \cdot (\theta^2)^{2c-4} \quad , \quad \tilde{g}\_{12} = \tilde{g}\_{21} = 0 \quad , \quad \tilde{g}\_{22} = \tilde{K}\_2(c) \cdot (\theta^2)^{2c-4},$$

where

$$\tilde{K}\_1(c) = (3 - 2c)^{-\frac{3}{2}} \cdot (\sqrt{2\pi})^{2c - 2} \quad , \quad \tilde{K}\_2(c) = (4c^2 - 8c + 6) \cdot (3 - 2c)^{-\frac{5}{2}} \cdot (\sqrt{2\pi})^{2c - 2}.$$

The scalar curvature *S̃*<sup>{*c*}</sup> = *S̃*<sup>{*c*}</sup>(*θ*) of the Riemannian metric *g̃* is

$$\tilde{S}^{\{c\}}(\theta) = \frac{1}{\tilde{K}\_2(c)} \cdot (c-2) \cdot (\theta^2)^{2-2c}.$$

We mention that the scalar curvature is negative; it decreases indefinitely as the variable *θ*<sup>2</sup> grows and the parameter *c* goes to 0; it tends to 0 as *c* goes to 3/2. We depict in Figure 4 how *S̃*<sup>{*c*}</sup> varies w.r.t. *c* and *θ*<sup>2</sup> (denoted *t*).

**Figure 4.** The variation of *S̃*<sup>{*c*}</sup> w.r.t. *c* and *θ*<sup>2</sup> := *t*.

**III—The case of** *h***.** Suppose *φ*(*t*) := *t*<sup>*c*</sup>, with *c* ∈ (0, 2) an arbitrary fixed parameter. From Formula (24), we calculate the coefficients

$$h\_{11} = h\_{12} = h\_{21} = 0 \quad , \quad h\_{22} = K\_4(c) \cdot (\theta^2)^{c-3},$$

where

$$K\_4(c) = -(2 - c)^{\frac{1}{2}} \cdot (\sqrt{2\pi})^{c - 1}.$$

As the (0,2)-type tensor field *h* is degenerate, it does not define a semi-Riemannian metric. In this case, there is no scalar curvature to compute.

**IV—The case of** *α***.** Suppose *φ*(*t*) := *t*<sup>*c*</sup>, with *c* ∈ (0, 2) an arbitrary fixed parameter. From Formula (17) or from Proposition 1, we calculate the coefficients

$$\alpha\_{11} = K\_5(c) \cdot (\theta^2)^{c-3} \quad , \quad \alpha\_{12} = \alpha\_{21} = 0 \quad , \quad \alpha\_{22} = K\_6(c) \cdot (\theta^2)^{c-3},$$

where

$$K\_5(c) = - (\sqrt{2\pi})^{c - 1} \cdot (2 - c)^{-\frac{3}{2}} \quad , \quad K\_6(c) = (1 - 2c) \cdot (2 - c)^{-\frac{5}{2}} \cdot (\sqrt{2\pi})^{c - 1}.$$

The (0,2)-type tensor field *α* is degenerate for *c* = 1/2. If *c* ∈ (0, 1/2), then *α* is a Lorentzian metric. If *c* ∈ (1/2, 2), then (−*α*) is a Riemannian metric.

The scalar curvature *U*<sup>{*c*}</sup> = *U*<sup>{*c*}</sup>(*θ*) of (−*α*) is

$$U^{\{c\}}(\theta) = \frac{1}{2K\_6(c)} \cdot (3-c) \cdot (\theta^2)^{1-c},$$

and has the sign of *K*<sub>6</sub>. We depict in Figure 5 how *U*<sup>{*c*}</sup> varies w.r.t. *c* and *θ*<sup>2</sup> (denoted *t*).

**Figure 5.** The variation of *U*<sup>{*c*}</sup> w.r.t. *c* and *θ*<sup>2</sup> := *t*.

**V—The case of** *β***.** Suppose *φ*(*t*) := *t*<sup>*c*</sup>, with *c* ∈ (0, 2) an arbitrary fixed parameter. From Formula (18) or from Proposition 1, we calculate the coefficients

$$\beta\_{11} = K\_7(c) \cdot (\theta^2)^{c-3} \quad , \quad \beta\_{12} = \beta\_{21} = 0 \quad , \quad \beta\_{22} = K\_8(c) \cdot (\theta^2)^{c-3},$$

where

$$K\_7(c) = 2(\sqrt{2\pi})^{c-1}(2-c)^{-\frac{3}{2}} \quad , \quad K\_8(c) = 2(c^2 - 2c + 3) \cdot (2-c)^{-\frac{5}{2}} \cdot (\sqrt{2\pi})^{c-1}.$$

The scalar curvature *V*<sup>{*c*}</sup> = *V*<sup>{*c*}</sup>(*θ*) of *β* is

$$V^{\{c\}}(\theta) = \frac{1}{2K\_8(c)} \cdot (c-3) \cdot (\theta^2)^{1-c},$$

and takes negative values. We depict in Figure 6 how *V*<sup>{*c*}</sup> varies w.r.t. *c* and *θ*<sup>2</sup> (denoted *t*).

**Figure 6.** The variation of *V*<sup>{*c*}</sup> w.r.t. *c* and *θ*<sup>2</sup> := *t*.
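The coefficients obtained in cases I, III, IV and V can be tested against the relation *h* = *g* − *α* − *β* of Corollary 1. Since all four tensors share the factor (*θ*<sup>2</sup>)<sup>*c*−3</sup>, it suffices to compare the constants *K*<sub>*i*</sub>(*c*). The sketch below is only a consistency check of the displayed constants; the helper `scale` and the sampled values of *c* are illustrative choices.

```python
import math

def scale(c, power):
    """Common factor (sqrt(2*pi))**(c-1) * (2-c)**power appearing in every K_i."""
    return math.sqrt(2 * math.pi) ** (c - 1) * (2 - c) ** power

def K1(c): return scale(c, -1.5)
def K2(c): return (c ** 3 - 4 * c ** 2 + 6 * c - 1) * scale(c, -2.5)
def K4(c): return -scale(c, 0.5)
def K5(c): return -scale(c, -1.5)
def K6(c): return (1 - 2 * c) * scale(c, -2.5)
def K7(c): return 2 * scale(c, -1.5)
def K8(c): return 2 * (c ** 2 - 2 * c + 3) * scale(c, -2.5)

def corollary1_residuals(c):
    """Residuals of h = g - alpha - beta for the (1,1) and (2,2) coefficients."""
    r11 = 0.0 - (K1(c) - K5(c) - K7(c))        # h_11 = 0
    r22 = K4(c) - (K2(c) - K6(c) - K8(c))      # h_22 = K_4 * (theta^2)**(c-3)
    return r11, r22

residuals = [corollary1_residuals(c) for c in (0.1, 0.5, 1.0, 1.5, 1.9)]
```

The (2,2) residual vanishes because (*c*<sup>3</sup> − 4*c*<sup>2</sup> + 6*c* − 1) − (1 − 2*c*) − 2(*c*<sup>2</sup> − 2*c* + 3) = (*c* − 2)<sup>3</sup>.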

**VI—The case of** *ĝ***.** Suppose *φ*(*t*) := *t*<sup>*c*</sup>, with *c* ∈ (0, 3/2) an arbitrary fixed parameter. From Formula (27), we calculate the coefficients

$$
\hat{g}\_{11} = K\_{9}(c) \cdot (\theta^2)^{c-3} + K\_{10}(c) \cdot (\theta^2)^{2c-3},
$$

$$
\hat{g}\_{12} = \hat{g}\_{21} = 0,
$$

$$
\hat{g}\_{22} = K\_{11}(c) \cdot (\theta^2)^{c-3} + K\_{12}(c) \cdot (\theta^2)^{2c-3},
$$

where

$$K\_{9}(c) = G'(0) \cdot (\sqrt{2\pi})^{c-1} \cdot (2 - c)^{-\frac{3}{2}},$$

$$K\_{10}(c) = G''(0) \cdot (\sqrt{2\pi})^{2c-2} \cdot (3 - 2c)^{-\frac{3}{2}},$$

$$K\_{11}(c) = -G'(0) \cdot (\sqrt{2\pi})^{c-1} \cdot (2 - c)^{-\frac{5}{2}} \cdot (c^3 - 6c^2 + 10c - 7),$$

$$K\_{12}(c) = G''(0) \cdot (\sqrt{2\pi})^{2c-2} \cdot (3 - 2c)^{-\frac{5}{2}} \cdot (4c^2 - 8c + 6).$$

We suppose that the group logarithm *G* is chosen such that *ĝ* is non-degenerate. The scalar curvature *Ŝ*<sup>{*c*}</sup> = *Ŝ*<sup>{*c*}</sup>(*θ*) of *ĝ* is calculated using MAPLE:

$$\begin{split}
\hat{S}^{\{c\}}(\theta) = {} & \frac{1}{4} \cdot (\theta^2)^{-2} \cdot \left( K\_{12} (\theta^2)^{c} + K\_{11} \right)^{-2} \left( K\_{10} (\theta^2)^{c} + K\_{9} \right)^{-3} \cdot \Big\{ \\
& (\theta^2)^{3} \cdot \left[ K\_9^3 K\_{12} c^2 - K\_9^3 K\_{12} c - 18 K\_9^2 K\_{10} K\_{11} - 3 K\_9^2 K\_{10} K\_{11} c^2 + 11 K\_9^2 K\_{10} K\_{11} c - 6 K\_9^3 K\_{12} \right] + \\
& + (\theta^2)^{3+c} \cdot \left[ - 18 K\_9^2 K\_{10} K\_{12} - 18 K\_9 K\_{10}^2 K\_{11} + 2 K\_9^2 K\_{10} K\_{12} c - 5 K\_9 K\_{10}^2 K\_{11} c^2 + 16 K\_9 K\_{10}^2 K\_{11} c + K\_9^2 K\_{10} K\_{12} c^2 \right] + \\
& + (\theta^2)^{3+2c} \cdot \left[ - 18 K\_9 K\_{10}^2 K\_{12} - 2 K\_{10}^3 K\_{11} c^2 + 7 K\_{10}^3 K\_{11} c + 7 K\_9 K\_{10}^2 K\_{12} c - 6 K\_{10}^3 K\_{11} \right] + \\
& + (\theta^2)^{3-c} \cdot \left[ 2 K\_9^3 K\_{11} c - 6 K\_9^3 K\_{11} \right] + (\theta^2)^{3+3c} \cdot \left[ 4 K\_{10}^3 K\_{12} c - 6 K\_{10}^3 K\_{12} \right] \Big\}.
\end{split}$$

Interestingly, the scalar curvature *Ŝ*<sup>{*c*}</sup> is a rational function of *θ*<sup>2</sup> and (*θ*<sup>2</sup>)<sup>*c*</sup>.
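The closed form can be compared with a direct numerical evaluation of the curvature. The sketch below treats *K*<sub>9</sub>, …, *K*<sub>12</sub> as free positive constants (an assumption made only to keep the metric Riemannian in the test), groups the bracketed terms of the formula by monomial in the *K*<sub>*i*</sub>, and compares the result with the Gaussian curvature of the diagonal metric with coefficients *K*<sub>9</sub>(*θ*<sup>2</sup>)<sup>*c*−3</sup> + *K*<sub>10</sub>(*θ*<sup>2</sup>)<sup>2*c*−3</sup> and *K*<sub>11</sub>(*θ*<sup>2</sup>)<sup>*c*−3</sup> + *K*<sub>12</sub>(*θ*<sup>2</sup>)<sup>2*c*−3</sup>, obtained by nested central differences. Function names, sample values and step sizes are illustrative.

```python
import math

def closed_form_curvature(c, K9, K10, K11, K12, y):
    """The displayed rational expression, with y standing for theta^2."""
    u = y ** c
    pref = 0.25 * y ** -2 * (K12 * u + K11) ** -2 * (K10 * u + K9) ** -3
    t_a = y ** 3 * (K9 ** 3 * K12 * (c * c - c - 6)
                    + K9 ** 2 * K10 * K11 * (-3 * c * c + 11 * c - 18))
    t_b = y ** (3 + c) * (K9 ** 2 * K10 * K12 * (c * c + 2 * c - 18)
                          + K9 * K10 ** 2 * K11 * (-5 * c * c + 16 * c - 18))
    t_c = y ** (3 + 2 * c) * (K9 * K10 ** 2 * K12 * (7 * c - 18)
                              + K10 ** 3 * K11 * (-2 * c * c + 7 * c - 6))
    t_d = y ** (3 - c) * K9 ** 3 * K11 * (2 * c - 6)
    t_e = y ** (3 + 3 * c) * K10 ** 3 * K12 * (4 * c - 6)
    return pref * (t_a + t_b + t_c + t_d + t_e)

def numeric_curvature(c, K9, K10, K11, K12, y, h=1e-4):
    """Gaussian curvature -(1/(2*sqrt(EG))) * d/dy(E_y / sqrt(EG)), by finite differences.

    Assumes all K_i > 0 and y > 0 (Riemannian case).
    """
    E = lambda t: t ** (c - 3) * (K9 + K10 * t ** c)
    G = lambda t: t ** (c - 3) * (K11 + K12 * t ** c)

    def flux(t):
        dE = (E(t + h) - E(t - h)) / (2 * h)
        return dE / math.sqrt(E(t) * G(t))

    dflux = (flux(y + h) - flux(y - h)) / (2 * h)
    return -dflux / (2 * math.sqrt(E(y) * G(y)))

sample = (0.7, 1.2, 0.4, 2.0, 0.9, 1.5)   # (c, K9, K10, K11, K12, theta^2), arbitrary
gap = abs(closed_form_curvature(*sample) - numeric_curvature(*sample))

# With K10 = K12 = 0 the expression should collapse to (c - 3) / (2 K11) * y**(1 - c).
reduced = closed_form_curvature(0.7, 1.2, 0.0, 2.0, 0.0, 1.5)
```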

We now particularize the setting to the BGS group logarithm *G*(*t*) := *t* and substitute *G*′(0) = 1 and *G*″(0) = 0 in the previous formulas. Then,

$$\hat{g}\_{11} = K\_{9}(c) \cdot (\theta^2)^{c-3} \quad , \quad \hat{g}\_{12} = \hat{g}\_{21} = 0 \quad , \quad \hat{g}\_{22} = K\_{11}(c) \cdot (\theta^2)^{c-3},$$

where

$$K\_{9}(c) = (\sqrt{2\pi})^{c-1} \cdot (2-c)^{-\frac{3}{2}},$$

$$K\_{11}(c) = -(\sqrt{2\pi})^{c-1} \cdot (2-c)^{-\frac{5}{2}} \cdot (c^3 - 6c^2 + 10c - 7).$$

In this particular case, the scalar curvature *Ŝ*<sup>{*c*}</sup> = *Ŝ*<sup>{*c*}</sup>(*θ*) of the Riemannian metric *ĝ* has the form

$$
\hat{S}^{\{c\}}(\theta) = \frac{1}{2K\_{11}(c)} \cdot (c-3) \cdot (\theta^2)^{1-c}.
$$

(The same formula may be recovered directly from Lemma 1.) We mention that *Ŝ*<sup>{*c*}</sup> takes negative values for every *c* ∈ (0, 2). In Figure 7, we depict how this particular *Ŝ*<sup>{*c*}</sup> varies w.r.t. *c* and *θ*<sup>2</sup> (denoted *t*).

**Figure 7.** The variation of *Ŝ*<sup>{*c*}</sup> w.r.t. *c* and *θ*<sup>2</sup> := *t*.
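For *G*(*t*) = *t*, the constants of case VI can be compared with those of cases III and IV through the relation *ĝ* = −*h* − *α*, which is how Corollary 2 reads for these coefficients. The following sketch checks this for the constants alone (the common factor (*θ*<sup>2</sup>)<sup>*c*−3</sup> cancels) and also samples the sign of *K*<sub>11</sub> on (0, 2). The helper names and the grid are illustrative assumptions.

```python
import math

def scale(c, power):
    """Common factor (sqrt(2*pi))**(c-1) * (2-c)**power."""
    return math.sqrt(2 * math.pi) ** (c - 1) * (2 - c) ** power

def K4(c): return -scale(c, 0.5)                         # h_22 constant
def K5(c): return -scale(c, -1.5)                        # alpha_11 constant
def K6(c): return (1 - 2 * c) * scale(c, -2.5)           # alpha_22 constant
def K9(c): return scale(c, -1.5)                         # g-hat_11 constant, G(t) = t
def K11(c): return -(c ** 3 - 6 * c ** 2 + 10 * c - 7) * scale(c, -2.5)

def corollary2_residuals(c):
    """Residuals of g-hat = -h - alpha for the (1,1) and (2,2) coefficients."""
    r11 = K9(c) - (-0.0 - K5(c))          # h_11 = 0
    r22 = K11(c) - (-K4(c) - K6(c))
    return r11, r22

grid = [0.01 * i for i in range(1, 200)]                 # samples of c in (0, 2)
max_residual = max(max(abs(r) for r in corollary2_residuals(c)) / K11(c) for c in grid)
k11_positive = all(K11(c) > 0 for c in grid)             # consistent with S-hat < 0
```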

**VII—The case of** *ḡ***.** From (30), we have the coefficients of *ḡ*:

$$\overline{g}\_{11} = k\_{G, \phi} \cdot (\theta^2)^{-2} \quad , \quad \overline{g}\_{12} = \overline{g}\_{21} = 0 \quad , \quad \overline{g}\_{22} = 2k\_{G, \phi} \cdot (\theta^2)^{-2},$$

where

$$k\_{G, \phi} = G'(0) \cdot \left[ \frac{2}{\phi(1)} - \frac{\phi'(1)}{\phi^2(1)} \right] + G''(0) \cdot \frac{1}{\phi^2(1)}.$$

For the moment, we suppose that *G* and *φ* are suitably chosen, such that *k*<sub>*G*,*φ*</sub> > 0. It follows that *ḡ* is a Riemannian metric. As the scalar curvature of *g*<sup>0</sup> is a negative constant *S*<sup>0</sup> = −1/2, we deduce that the scalar curvature of *ḡ* is a negative constant *S̄* = −(1/2) · *k*<sub>*G*,*φ*</sub> w.r.t. *θ* too. In what follows, we study the variation of *S̄* in two particular cases.

*VII*<sub>1</sub>. Let *G*(*t*) := *t* be the BGS group logarithm function and consider *φ*(*t*) := *t*<sup>*a*<sup>2</sup></sup> + *e*<sup>*t*<sup>*b*<sup>3</sup></sup></sup>, where the real parameters *a* and *b* satisfy *a*<sup>2</sup> + *e* · *b*<sup>3</sup> < 2(1 + *e*). Denote the respective metrics by *ḡ*<sup>{*a*,*b*}</sup> and their scalar curvatures by *S̄*<sup>{*a*,*b*}</sup>. Then,

$$\overline{S}^{\{a,b\}} = -\frac{1}{2} \cdot \left\{ \frac{2}{1+e} - \frac{a^2 + e b^3}{(1+e)^2} \right\}.$$

We mention that *k*<sub>*G*,*φ*</sub> > 0 (and hence *S̄*<sup>{*a*,*b*}</sup> < 0). The dependency of *S̄*<sup>{*a*,*b*}</sup> on *a* and *b* may be seen in Figure 8.

**Figure 8.** The variation of *S̄*<sup>{*a*,*b*}</sup> w.r.t. *a* and *b*.

The family of Fisher-like Riemannian metrics *ḡ*<sup>{*a*,*b*}</sup> may be considered as evolving from the classical Fisher metric *g*<sup>0</sup>. Their evolution may be controlled through their scalar curvature.

*VII*2. Let *<sup>G</sup>*(*t*) :<sup>=</sup> *<sup>e</sup>*(1−*q*)*<sup>t</sup>* <sup>−</sup><sup>1</sup> <sup>1</sup>−*<sup>q</sup>* be the Tsallis grup logarithm function, where *<sup>q</sup>* <sup>=</sup> 1. Let us define *φ*(*t*) := *ta*<sup>2</sup> + *e<sup>t</sup> b*3 , with real parameters *<sup>a</sup>* and *<sup>b</sup>* satisfying *<sup>a</sup>*<sup>2</sup> <sup>+</sup> *<sup>e</sup>*· *<sup>b</sup>*<sup>3</sup> <sup>+</sup> *<sup>q</sup>* <sup>−</sup> <sup>1</sup> <sup>&</sup>lt; <sup>2</sup>(<sup>1</sup> <sup>+</sup> *<sup>e</sup>*). We denote the associated metric by *g*{*a*,*b*;*q*} and its scalar curvature by *S* {*a*,*b*;*q*} . Then,

$$\overline{S}^{\{a,b;q\}} = -\frac{1}{2} \cdot \left\{ \frac{2}{1+\varepsilon} - \frac{a^2 + eb^3 + q - 1}{(1+\varepsilon)^2} \right\}.$$

We mention that *k*<sub>*G*,*φ*</sub> > 0 (and hence *S̄*<sup>{*a*,*b*;*q*}</sup> < 0). The dependence of *S̄*<sup>{*a*,*b*;*q*}</sup> on *a* and *b* may be seen in Figure 9, for *q* taking successively the values 1, 11, 21, 31 (from bottom to top). The value *q* = 1 is no longer a forbidden (singular) one!

**Figure 9.** The variation of *S̄*<sup>{*a*,*b*;*q*}</sup> w.r.t. *a* and *b*, when *q* ∈ {1, 11, 21, 31}.

The family of Fisher-like Riemannian metrics *g*<sup>{*a*,*b*;*q*}</sup> may be considered as evolving from the classical Fisher metric *g*<sup>0</sup>, and also as "expanding" from the BGS group logarithm to the *q*-dependent Tsallis group logarithm. The evolution of these metrics may be controlled through their scalar curvature, which, in addition to the previous case *VII*<sub>1</sub>, "foliates" following the values of *q*.
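The "foliation" in *q* can be made concrete: in the closed formula of case VII2, changing *q* only shifts the curvature by a constant that does not depend on *a* and *b*. The sketch below illustrates this under the assumption that the constant in the formula is Euler's number *e*; the names `scalar_curvature_tsallis` and `sheet_offset`, as well as the sample point, are hypothetical.

```python
import math

E = math.e  # Euler's number, assumed to be the constant e in the formula


def scalar_curvature_tsallis(a: float, b: float, q: float) -> float:
    """Curvature -1/2 * {2/(1+e) - (a^2 + e*b^3 + q - 1)/(1+e)^2}."""
    numerator = a**2 + E * b**3 + q - 1.0
    if not numerator < 2.0 * (1.0 + E):
        raise ValueError("parameters violate a^2 + e*b^3 + q - 1 < 2(1 + e)")
    return -0.5 * (2.0 / (1.0 + E) - numerator / (1.0 + E) ** 2)


def sheet_offset(q: float) -> float:
    """Vertical distance between the q-sheet and the q = 1 sheet."""
    return (q - 1.0) / (2.0 * (1.0 + E) ** 2)


if __name__ == "__main__":
    a, b = 1.0, -3.0  # sample point admissible for every q used below
    base = scalar_curvature_tsallis(a, b, 1.0)
    for q in (1.0, 11.0, 21.0, 31.0):
        s = scalar_curvature_tsallis(a, b, q)
        print(f"q={q:>4}: S = {s:.6f}, shift from q=1: {s - base:.6f}")
```

At *q* = 1 the expression coincides with the formula of case VII1, and larger values of *q* give larger (less negative) curvature at a fixed point (*a*, *b*), which matches the bottom-to-top ordering described for Figure 9.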

**Remark 3.** *(i) The parameters' domains are subsets of* R × (0, ∞)*, which is two-dimensional. Therefore, for all the metrics in this section, the scalar curvature coincides with the Gaussian curvature. The coefficients of the metrics depend on the variable θ*<sub>2</sub> *only, which has the meaning of standard deviation. It follows that the scalar curvature functions are also independent of the mean of the PDF, modeled by θ*<sub>1</sub>*. This dependence of the geometric invariants only on the standard deviation suggests applications where a similar property appears: see, for example, [50–54].*

*(ii) Using general differential geometric arguments, we knew a priori that the metrics must be (locally) conformal with the Euclidean (or Minkowskian) metric of the plane. However, we obtained more: the conformal factors are explicitly derived, they are global and, as expected, they are also independent of the mean θ*<sub>1</sub>*. Moreover, the metric g in example VII is even homothetic with the Euclidean metric.*

*If we consider a curve in the parameters space, its length (w.r.t. any of the respective metrics) depends only on the standard deviation; instead, the angle of two such curves does not depend on either the mean or the standard deviation.*

*(iii) The statistical significance of the sectional curvature of Fisher-like metrics g, g̃, h, β, ĝ, g can be obtained by analogy with Ruppeiner's geometric modelization of the Gaussian thermodynamic fluctuations [55]. His "thermodynamic curvature" (R) corresponds to the sectional curvature and measures the inter-particle interaction: when R* = 0*, there is no interaction, and the cases R* > 0 *or R* < 0 *correspond to repulsive or attractive interactions, respectively ([55], apud [48,56]). This approach was developed and generalized by the Geometrothermodynamics theory [57].*

*Another viewpoint interprets the scalar curvature as a measure of the stability of the statistical model, in a direct proportionality relation ([58], apud [59]).*

*(iv) It may be worth noting the following special property, apparently collateral to the main path of the discourse. Let us fix a value q*<sub>0</sub> *of the Tsallis parameter and a value of the scalar curvature S̄*<sup>{*a*,*b*;*q*<sub>0</sub>}</sup> *in example VII*<sub>2</sub>*, denoted by s*<sub>0</sub>*. Then, the solution of the equation*

$$s\_0 = -\frac{1}{2} \cdot \left\{ \frac{2}{1+e} - \frac{a^2 + e b^3 + q\_0 - 1}{(1+e)^2} \right\}$$

*is an elliptic curve in the plane of coordinates* (*a*, *b*)*, written in Weierstrass form. In Figure 10, we drew these elliptic curves, corresponding to s*<sub>0</sub> = −1 *and to q*<sub>0</sub> ∈ {1, −51, −101, −1001} *(from left to right).*

**Figure 10.** The elliptic curves associated with *s*<sub>0</sub> = −1 and *q*<sub>0</sub> ∈ {1, −51, −101, −1001}.
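As a short worked step, the level-set equation of Remark 3 (iv) can be solved for *a*<sup>2</sup>, which makes the cubic structure explicit. The constant *C*(*s*<sub>0</sub>, *q*<sub>0</sub>) below is notation introduced here for convenience, and *e* is assumed to denote Euler's number:

$$-2s\_0 = \frac{2}{1+e} - \frac{a^2 + e b^3 + q\_0 - 1}{(1+e)^2} \;\Longleftrightarrow\; a^2 = -e\, b^3 + C(s\_0, q\_0), \qquad C(s\_0, q\_0) := 2(1+e)\left[1 + (1+e)s\_0\right] + 1 - q\_0.$$

With (*x*, *y*) = (*b*, *a*), this is a cubic of the form *y*<sup>2</sup> = −*e x*<sup>3</sup> + *C*, which is non-singular exactly when *C* ≠ 0. For *s*<sub>0</sub> = −1, the constant reduces to *C* = 1 − *q*<sub>0</sub> − 2*e*(1 + *e*), so changing *q*<sub>0</sub> only changes the constant term of the cubic.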

#### **6. The MaxEnt Problem for the** *φ***-Deformed (Naudts) Entropy**

Let *V* = *V*(*x*) be a fixed potential energy function, *φ* be a fixed positive strictly increasing function and *U*<sub>0</sub> > 0 be a fixed real number. Consider *ρ* = *ρ*(*x*) a univariate PDF, satisfying

$$\int\_{\mathbb{R}} V(x)\rho(x)dx = \mathcal{U}\_0$$

and let *H*<sup>*N*</sup><sub>*φ*</sub>[*ρ*] be its associated *φ*-deformed (Naudts) entropy, based on (5).

**Theorem 1.** *The optimization problem*

$$\max \ H\_{\phi}^{N}[\rho]$$

*has the solution*

$$\rho^{ME}\_{\phi}(\mathbf{x}) = \exp^{N}\_{\{\phi\}} \left[ \gamma + \beta V(\mathbf{x}) \right],\tag{33}$$

*where* exp<sup>*N*</sup><sub>{*φ*}</sub> *is the inverse function of* log<sup>*N*</sup><sub>{*φ*}</sub>*; β and γ are the Lagrange multipliers determined by the constraints, and satisfy the inequality γ* + *βV*(*x*) > 0*.*

**Proof.** The proof is a standard one; see, for example, [60], §12.1.

**Remark 4.** *Under the previous hypothesis, we denote: the (maximal) φ-deformed (Naudts) entropy H* := *H*<sup>*N*</sup><sub>*φ*</sub>[*ρ*<sup>*ME*</sup><sub>*φ*</sub>]*; the mean force with respect to ρ*<sup>*ME*</sup><sub>*φ*</sub>

$$\mathcal{U} := \int\_{\mathbb{R}} V(x) \cdot \rho\_{\phi}^{ME}(x) dx;$$

*the φ-deformed generalized free energy*

$$F := -\frac{\gamma}{\beta}.$$

*We obtain φ-deformed generalizations of the thermodynamic relations:*

$$F = U + \frac{1}{\beta}H \quad , \quad \frac{d}{d\beta}(\beta F) = U.$$

*In the previous relations, all the notions depend on φ; we skipped it, in order to keep the formalism simpler. For some physical interpretations, we recommend [29,61,62]. In the particular cases when the φ-deformed (Naudts) entropy is of Tsallis or of Kaniadakis type, we recover the formulas from [38,39].*
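Purely as an algebraic unpacking of the two relations in Remark 4 (not an independent derivation), the definition *F* := −*γ*/*β* lets them be restated in terms of the Lagrange multipliers:

$$\beta F = -\gamma, \qquad F = U + \frac{1}{\beta}H \;\Longleftrightarrow\; H = -\gamma - \beta U, \qquad \frac{d}{d\beta}(\beta F) = U \;\Longleftrightarrow\; \frac{d\gamma}{d\beta} = -U.$$

In this form, the first relation expresses the maximal entropy through *γ*, *β* and *U*, and the second says that *γ*, viewed as a function of *β*, has derivative −*U*.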

#### **7. Conclusions**

(**i**) In this paper, we refined the search of relevant semi-Riemannian metrics associated in a canonical manner to manifolds of parameterized PDFs, via remarkable entropies and divergences. We stress the main general ideas:


(**ii**) In particular, based on the *φ*-deformed (Naudts) entropy, we focused on the following topics:


(**iii**) Future work will be directed toward:


(**iv**) There exist two different but connected approaches to entropy: in Thermodynamics and in Statistical mechanics. Its geometrization by means of Fisher metrics follows two apparently different paths. The procedures to construct Fisher-like metrics from entropy are analogous, as they originate from the same general differential geometric methods. Instead, the basic manifold these metrics act upon (i.e., the space of the parameters) is essentially different. Moreover, entropy in Thermodynamics is "more deterministic" and one does not use a log-likelihood function which "produces" it.

The first formalism is dominated by the ideas of Weinhold, Ruppeiner and Quevedo [55,57,63], and is extensively used in models for the entropy of black holes (see [64] and references therein).

Our paper engaged in the second path and is dependent on log-likelihood functions, especially on the *φ*-deformed (Naudts) one. However, we are aware that more connections between the two theories are needed, with refined comparisons of the Riemannian models they both rely on.

**Author Contributions:** Conceptualization, C.-L.P., I.-E.H., G.-T.P. and V.P.; validation, C.-L.P., I.-E.H., G.-T.P. and V.P.; writing, C.-L.P., I.-E.H., G.-T.P. and V.P.; visualization, G.-T.P.; supervision, C.-L.P., I.-E.H., G.-T.P. and V.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** We are grateful to the reviewers for their valuable enlightening remarks.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Alexander Zhdanok 1,\* and Anna Khuruma <sup>2</sup>**


**Abstract:** In this study, we consider general Markov chains (MC) defined by a transition probability (kernel) that is finitely additive. These Markov chains were constructed by S. Ramakrishnan within the concepts and symbolism of game theory. Here, we study these MCs by using the operator approach. In our work, the state space (phase space) of the MC has any cardinality and the sigma-algebra is discrete. The construction of a phase space allows us to decompose the Markov kernel (and the Markov operators that it generates) into the sum of two components: countably additive and purely finitely additive kernels. We show that the countably additive kernel is atomic. Some properties of Markov operators with a purely finitely additive kernel and their invariant measures are also studied. A class of combined finitely additive MC and two of its subclasses are introduced, and the properties of their invariant measures are proven. Some asymptotic regularities of such MCs were revealed.

**Keywords:** Markov chains; Markov operators; finitely additive Markov chains; finitely additive measures; invariant measures; decompositions of Markov chains

**MSC:** 60J05; 60J10; 28A33; 46E27

#### **1. Introduction**

In this study, classical Markov chains (MC) are interpreted as random Markov processes with discrete time (in the usual sense) in the phase space (*X*, Σ), where *X* is some set (space) and Σ is some sigma-algebra of subsets in *X*. We also consider time-homogeneous MCs. If *X* is an arbitrary infinite set that does not highlight any structure other than sigma-algebra Σ, then these MCs are called general.

In 1937, Kryloff and Bogoliouboff [1,2] proposed an operator-theoretical treatment of the general MC study that was then explicitly developed by Yosida and Kakutani [3]. The essence of the treatment is that the MC is given by a transition function (probability) *P*(*x*, *E*), *x* ∈ *X*, *E* ∈ Σ, which as a kernel defines two dual integral Markov operators *T* and *A* in spaces of measurable functions and in spaces of measures, respectively.

A Markov chain is identified with an iterative sequence of probability measures {*μ*<sup>*n*</sup>}. Such a sequence is generated by the second Markov operator, *μ*<sup>*n*</sup> = *Aμ*<sup>*n*−1</sup> = *A*<sup>*n*−1</sup>*μ*<sup>1</sup>, with an arbitrary initial probability measure *μ*<sup>1</sup>. We use this treatment in this work.

In the classical theory of MCs, the transition function (probability) *P*(*x*, ·) is assumed to be a countably additive measure in its second argument. At the same time, in economic game theory, developed in the 1960s by Dubins and Savage [4] and their numerous students and followers, it became necessary to also involve finitely additive probability measures in the construction of specific random processes. In particular, in [5], some constructions and investigations of finitely additive measures similar to Markov chains were presented.

**Citation:** Zhdanok, A.; Khuruma, A. Decomposition of Finitely Additive Markov Chains in Discrete Space. *Mathematics* **2022**, *10*, 2083. https://doi.org/10.3390/math10122083

Academic Editors: Alexandru Agapie, Denis Enachescu, Vlad Stefan Barbu, Bogdan Iftimie and Andreas C. Georgiou

Received: 3 April 2022 Accepted: 13 June 2022 Published: 15 June 2022


**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Based on the work of [5], in 1981, Ramakrishnan [6] developed a new object construction in the language of strategies, thus named finitely additive Markov chains. These chains are generated by a transition function (strategy) that is finitely additive by the second argument. The phase space (*X*, Σ) in [6] is a discrete set with the sigma-algebra of all its subsets. In the framework of this construction, the study in [6] contains proof of a number of non-trivial theorems, including ergodic ones, based on these specific chains properties within game theory terms. Some additional questions on this topic were also discussed in further publications by Ramakrishnan (see, e.g., [7]).

Other authors also continued to study some problems of the theory of finitely additive Markov chains based on Ramakrishnan's findings (see, e.g., [8]). The authors of such works actively used the special apparatus of random variables defined by finitely additive probabilities.

Zhdanok also used finitely additive measures in the study of general classical Markov chains in the works [9,10].

In this paper, we study general Markov chains generated by a transition function that is finitely additive in its second argument, as mentioned above. We consider Markov chains defined on a discrete space. However, we do not use any specific features of game theory, and we solve a different range of problems. We also do not use the apparatus of random variables.

In Section 2, we provide an operator approach for studying general Markov chains with a countably additive transition function on an arbitrary measurable space. We use and develop this construction for finitely additive transition functions.

In Section 3, a discrete topology and a discrete sigma-algebra containing all subsets of the set *X* are introduced into the phase space (of any cardinality) of finitely additive MCs. We study the properties of countably additive and purely finitely additive transition functions and the Markov operators generated by them in such spaces. We then prove that countably additive transition functions are atomic measures with a finite or countable support and prove that the Markov operators of MCs with a purely finitely additive transition function transform all finitely additive measures (including countably additive ones) into purely finitely additive measures.

The transition function of an arbitrary finitely additive MC and the Markov operators generated by it are decomposed into a countably additive component and a purely finitely additive component. Their general properties are studied.

In Section 4, we prove that, for any purely finitely additive MC, all its invariant finitely additive measures are purely finitely additive. The class of combined finitely additive MCs is also introduced here. We then prove that such MCs do not have invariant countably additive measures.

In Section 5, we consider the decomposition of a Markov sequence of measures of combined MCs into a countably additive component and a purely finitely additive component. Combined MCs have two subclasses. The first subclass is when the countably additive component of the Markov operator transforms all purely finitely additive measures into countably additive ones (condition (*H*<sub>1</sub>)). The second subclass is when the same component transforms all purely finitely additive measures into the same ones (condition (*H*<sub>2</sub>)). Under condition (*H*<sub>1</sub>), the norms of countably additive and purely finitely additive components of a Markov sequence of measures were proven to be time-stationary. Additionally, under condition (*H*<sub>2</sub>), the norms of countably additive components of a Markov sequence of measures were proven to converge exponentially to zero. The simple conditions (*G*<sub>1</sub>) and (*G*<sub>2</sub>) on the transition function of the MC are given, under which the "qualitative" conditions (*H*<sub>1</sub>) and (*H*<sub>2</sub>) are also satisfied. The corresponding theorems are then proven.

Examples of finitely additive MCs on a segment are considered in detail in Sections 4 and 5 and their phase portraits are shown.

#### **2. Definitions, Notation and Some Information**

Let *X* be an arbitrary infinite set and Σ be a sigma-algebra of its subsets containing all one-point subsets from *X*. Let *B*(*X*, Σ) denote the Banach space of bounded Σ-measurable functions *f* : *X* → *R* with sup-norm.

We also consider Banach spaces of bounded measures *μ* : Σ → *R*, with the norm equal to the total variation of the measure *μ* (but one can also use the topologically equivalent sup-norm):

*ba*(*X*, Σ) is the space of finitely additive measures, and

*ca*(*X*, Σ) is the space of countably additive measures.

If *μ* ≥ 0, then the norm is ||*μ*|| = *μ*(*X*).

**Definition 1** ([11])**.** *A finitely additive measure μ, μ* ≥ 0*, is called purely finitely additive (pure charge, pure mean) if any countably additive measure λ satisfying the condition* 0 ≤ *λ* ≤ *μ is identically zero. An alternating measure μ is called purely finitely additive if both components of its Jordan decomposition μ* = *μ*<sup>+</sup> − *μ*<sup>−</sup> *are purely finitely additive.*

**Lemma 1.** *If the measure μ is purely finitely additive, then it is equal to zero on every one-point set: μ*({*x*}) = 0, ∀*x* ∈ *X.*

**Proof of Lemma 1.** Take a purely finitely additive measure *μ* ≥ 0. Suppose that there is a point *x*<sub>0</sub> ∈ *X* such that *μ*({*x*<sub>0</sub>}) = *α* > 0. We take the Dirac measure *δ*<sub>*x*<sub>0</sub></sub> at the point *x*<sub>0</sub>. Then, *δ*<sub>*x*<sub>0</sub></sub>(*X* \ {*x*<sub>0</sub>}) = 0 and *α* · *δ*<sub>*x*<sub>0</sub></sub>(*E*) ≤ *μ*(*E*) for all *E* ∈ Σ, i.e., 0 ≤ *α* · *δ*<sub>*x*<sub>0</sub></sub> ≤ *μ*. All Dirac measures are countably additive, so the measure *α* · *δ*<sub>*x*<sub>0</sub></sub> is also countably additive, and it is not identically zero. This contradicts Definition 1. Therefore, the statement in Lemma 1 is true for *μ* ≥ 0. Applying this to both components of the Jordan decomposition shows that the statement is also true for any sign-alternating purely finitely additive measure.

Obviously, a purely finitely additive measure is equal to zero on any finite set as well. The converse, generally speaking, is not true, for example, for the Lebesgue measure on the segment [0, 1].

**Remark 1.** *If the measure μ is identically zero, then it can formally be considered both countably additive and purely finitely additive.*

**Theorem 1** (Yosida–Hewitt decomposition, see [11])**.** *Any finitely additive measure μ can be uniquely decomposed into the sum μ* = *μ*<sub>*ca*</sub> + *μ*<sub>*pfa*</sub>*, where μ*<sub>*ca*</sub> *is countably additive and μ*<sub>*pfa*</sub> *is a purely finitely additive measure.*

Bounded purely finitely additive measures also form a Banach space *pfa*(*X*, Σ) with the same norm and *ba*(*X*, Σ) = *ca*(*X*, Σ) ⊕ *pfa*(*X*, Σ).

We denote the sets of non-negative measures:

*V*<sub>*ba*</sub> = {*μ* ∈ *ba*(*X*, Σ) : *μ*(*X*) ≤ 1}, *V*<sub>*ca*</sub> = {*μ* ∈ *ca*(*X*, Σ) : *μ*(*X*) ≤ 1}, *V*<sub>*pfa*</sub> = {*μ* ∈ *pfa*(*X*, Σ) : *μ*(*X*) ≤ 1}.

Measures from these sets are called probabilistic if *μ*(*X*) = 1.

We also denote by *S*<sub>*ba*</sub>, *S*<sub>*ca*</sub>, and *S*<sub>*pfa*</sub> the sets of all probability measures in *V*<sub>*ba*</sub>, *V*<sub>*ca*</sub>, and *V*<sub>*pfa*</sub>, respectively.

**Definition 2.** *The classical Markov chains (MCs) on a measurable space* (*X*, Σ) *are given by their transition function (probability kernel) P*(*x*, *E*), *x* ∈ *X*, *E* ∈ Σ*, under the usual conditions:*


The numerical value of the function *P*(*x*, *E*) is the probability that the system moves from the point *x* ∈ *X* to the set *E* ∈ Σ in one step (per unit of time).

We emphasize that the transition function of the classical Markov chain is a countably additive measure in the second argument.

We also call such transition functions countably additive kernels.

The transition function generates two Markov linear bounded positive integral operators:

$$\begin{aligned} T: &B(X,\Sigma) \to B(X,\Sigma), (Tf)(\mathbf{x}) = Tf(\mathbf{x}) = \int\_X f(y)P(\mathbf{x}, dy), \\ \forall f \in B(X,\Sigma), \forall \mathbf{x} \in X; \\ A: &ca(X,\Sigma) \to ca(X,\Sigma), (A\mu)(E) = A\mu(E) = \int\_X P(\mathbf{x}, E)\mu(d\mathbf{x}), \\ \forall \mu \in ca(X,\Sigma), \forall E \in \Sigma. \end{aligned}$$

The operator *A* is isometric in the cone of non-negative measures, in particular, *AS*<sub>*ca*</sub> ⊂ *S*<sub>*ca*</sub>. Let the initial measure be *μ*<sup>1</sup> ∈ *S*<sub>*ca*</sub>. Then, the iterative sequence of countably additive probability measures *μ*<sup>*n*+1</sup> = *Aμ*<sup>*n*</sup> ∈ *S*<sub>*ca*</sub>, *n* ∈ *N*, is usually identified with the Markov chain. We call {*μ*<sup>*n*</sup>} a Markov sequence of measures.
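To make the two operators and the Markov sequence tangible, here is a minimal sketch on a finite state space, where the integrals reduce to finite sums. This is a simplification made only for illustration (the paper assumes *X* infinite); the 3-state kernel `P` and the names `apply_T`, `apply_A` and `markov_sequence` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical 3-state kernel: P[x][y] = P(x, {y}); every row sums to 1.
P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(1, 2), F(1, 2)],
]


def apply_T(f):
    """(Tf)(x) = sum over y of f(y) * P(x, {y})."""
    n = len(P)
    return [sum(P[x][y] * f[y] for y in range(n)) for x in range(n)]


def apply_A(mu):
    """(A mu)({y}) = sum over x of P(x, {y}) * mu({x})."""
    n = len(P)
    return [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]


def markov_sequence(mu1, steps):
    """Return [mu^1, mu^2, ..., mu^(steps+1)] with mu^(n+1) = A mu^n."""
    seq = [mu1]
    for _ in range(steps):
        seq.append(apply_A(seq[-1]))
    return seq


if __name__ == "__main__":
    for n, mu in enumerate(markov_sequence([F(1), F(0), F(0)], 3), start=1):
        print(f"mu^{n} =", [str(m) for m in mu], "total mass =", sum(mu))
```

Starting from the point mass at the first state, the sketch gives *μ*<sup>2</sup> = (1/2, 1/2, 0) and *μ*<sup>3</sup> = (3/8, 1/2, 1/8); the total mass stays equal to 1 at every step, in line with *AS*<sub>*ca*</sub> ⊂ *S*<sub>*ca*</sub>.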

The topological dual of the space *B*(*X*, Σ) is (isomorphic to) the space of finitely additive measures: *B*<sup>∗</sup>(*X*, Σ) = *ba*(*X*, Σ) (see, for example, [12]). In this case, the operator *T*<sup>∗</sup> : *ba*(*X*, Σ) → *ba*(*X*, Σ) serves as the topological adjoint (conjugate) of the operator *T*, and it is uniquely determined by the well-known rule of integral "scalar products":

$$
\langle T^\*\mu, f\rangle = \langle \mu, Tf\rangle, \forall f \in B(X, \Sigma), \forall \mu \in ba(X, \Sigma).
$$
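The duality rule can be checked directly in a toy setting. The sketch below uses a finite state space, where every measure is countably additive and the pairing ⟨*μ*, *f*⟩ is a finite sum, so it only illustrates the rule itself and not the finitely additive extension; the kernel `P`, the test data and all function names are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical 3-state kernel: P[x][y] = P(x, {y}); every row sums to 1.
P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(1, 2), F(1, 2)],
]


def apply_T(f):
    """(Tf)(x) = sum over y of f(y) * P(x, {y})."""
    return [sum(P[x][y] * f[y] for y in range(len(f))) for x in range(len(P))]


def apply_T_star(mu):
    """(T* mu)({y}) = sum over x of P(x, {y}) * mu({x})."""
    return [sum(mu[x] * P[x][y] for x in range(len(mu))) for y in range(len(P))]


def pairing(mu, f):
    """<mu, f> = integral of f with respect to mu (a finite sum here)."""
    return sum(m * v for m, v in zip(mu, f))


f = [F(3), F(-1), F(2)]            # an arbitrary bounded function
mu = [F(1, 5), F(3, 10), F(1, 2)]  # an arbitrary probability measure

lhs = pairing(apply_T_star(mu), f)
rhs = pairing(mu, apply_T(f))

if __name__ == "__main__":
    print("<T* mu, f> =", lhs, " <mu, Tf> =", rhs)
```

Both sides are the same double sum over pairs (*x*, *y*), taken in a different order, which is why exact rational arithmetic gives identical values.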

The operator *T*<sup>∗</sup> is the only bounded extension of the operator *A* to the space *ba*(*X*, Σ) that preserves its analytic form

$$T^\*\mu(E) = \int\_X P(\mathfrak{x}, E)\mu(d\mathfrak{x}), \forall \mu \in ba(X, \Sigma), \forall E \in \Sigma.$$

The operator *T*<sup>∗</sup> has its own invariant subspace *ca*(*X*, Σ), i.e., *T*<sup>∗</sup>[*ca*(*X*, Σ)] ⊂ *ca*(*X*, Σ), on which it coincides with the original operator *A*. The operator *T*<sup>∗</sup> is also isometric, and *T*<sup>∗</sup>*S*<sub>*ba*</sub> ⊂ *S*<sub>*ba*</sub>. The construction of the Markov operators *T* and *T*<sup>∗</sup> is now functionally closed. We continue to denote the operator *T*<sup>∗</sup> as *A*.

In such a setting, it is natural to consider the Markov sequences of probabilistic finitely additive measures

$$
\mu^1 \in S\_{ba}, \quad \mu^{n+1} = A\mu^n \in S\_{ba}, \quad n \in N,
$$

while retaining the countable additivity of the transition function *P*(*x*, ·) in the second argument.

Despite this circumstance, the image *Aμ* of a purely finitely additive measure *μ* can remain purely finitely additive, i.e., generally speaking,

$$A\left[ba(X,\Sigma)\right] \not\subset ca(X,\Sigma).$$

The integral over a finitely additive measure, usually called the Radon integral, is constructed according to the same scheme as the Lebesgue integral over the Lebesgue measure. Its construction was developed in [12] and, in a more modern form, in [13]. Note that, if the original space *X* is countable and the measure *μ* is not countably additive, then the integral on *X* cannot be replaced by a sum (series). Such integrals have other features as well.

**Definition 3.** *If Aμ* = *μ holds for some positive finitely additive measure μ, then we call such a measure invariant for the operator A (and for the Markov chain).*

*An invariant probability countably additive measure is often called the stationary distribution of a Markov chain.*

The question of the existence of invariant measures and their properties is one of the main questions in the theory of Markov chains.
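For a concrete feel of Definition 3, the following sketch checks the invariance equation *Aμ* = *μ* on a finite state space and approximates the invariant measure by iterating *A*. The finite setting, the 3-state kernel `P` and the function names are assumptions made only for illustration; in the infinite and finitely additive setting of the paper, such a direct computation is not available in general.

```python
from fractions import Fraction as F

# Hypothetical 3-state kernel: P[x][y] = P(x, {y}); every row sums to 1.
P = [
    [F(1, 2), F(1, 2), F(0)],
    [F(1, 4), F(1, 2), F(1, 4)],
    [F(0), F(1, 2), F(1, 2)],
]


def apply_A(mu):
    """(A mu)({y}) = sum over x of P(x, {y}) * mu({x})."""
    n = len(P)
    return [sum(mu[x] * P[x][y] for x in range(n)) for y in range(n)]


def is_invariant(mu):
    """Exact check of the invariance equation A mu = mu."""
    return apply_A(mu) == mu


def iterate(mu, steps):
    """Apply A to mu the given number of times."""
    for _ in range(steps):
        mu = apply_A(mu)
    return mu


candidate = [F(1, 4), F(1, 2), F(1, 4)]        # solves A mu = mu for this P
approx = iterate([F(1), F(0), F(0)], 40)       # 40 steps from a point mass
gap = max(abs(p - c) for p, c in zip(approx, candidate))

if __name__ == "__main__":
    print("candidate invariant:", is_invariant(candidate))
    print("largest deviation after 40 steps:", float(gap))
```

For this particular kernel, the deviation from the invariant measure halves at every step, so the iterates converge geometrically; this behavior is specific to the chosen example.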

We denote the sets of all non-zero invariant measures for the operator *A* as follows:

$$\Delta\_{ba} = \{ \mu \in V\_{ba} : \mu = A\mu \},$$

$$\Delta\_{\mathrm{ca}} = \{ \mu \in V\_{\mathrm{ca}} : \mu = A\mu \},$$

and

$$\Delta\_{pfa} = \{ \mu \in V\_{pfa} : \mu = A\mu \}.$$

The classical Markov chain with a countably additive transition probability may or may not have invariant countably additive probability measures, i.e., possibly Δ<sub>*ca*</sub> = ∅ (for example, for a symmetric walk on *Z*).

In ([14], Theorem 2.2), Šidak proved that any countably additive MC on an arbitrary measurable space (*X*, Σ) with an operator extended to the space of finitely additive measures has at least one invariant finitely additive measure, i.e., always Δ<sub>*ba*</sub> ≠ ∅.

In ([14], Theorem 2.5), for such MCs (in the general case), Šidak established that, if a finitely additive measure *μ* is invariant, *Aμ* = *μ*, and *μ* = *μ*<sub>*ca*</sub> + *μ*<sub>*pfa*</sub> is its decomposition into countably additive and purely finitely additive components, then each of them is also invariant: *Aμ*<sub>*ca*</sub> = *μ*<sub>*ca*</sub> and *Aμ*<sub>*pfa*</sub> = *μ*<sub>*pfa*</sub>.

We now give our key definition of finitely additive MCs.

**Definition 4.** *A transition function of a finitely additive MC on an arbitrary (phase) measurable space* (*X*, Σ) *is a function P*(*x*, *E*), *x* ∈ *X*, *E* ∈ Σ*, for which conditions (1), (2), and (4) from Definition 2 are satisfied and, instead of condition (3), the following condition* (3′) *holds: P*(*x*, ·) ∈ *ba*(*X*, Σ), ∀*x* ∈ *X. We will also call such transition functions finitely additive.*

We consider specific finitely additive MCs that are not countably additive in Examples 2–5 below.

The finitely additive transition function *P*(*x*, *E*) also generates two integral operators: *T* : *B*(*X*, Σ) → *B*(*X*, Σ) and *A* : *ba*(*X*, Σ) → *ba*(*X*, Σ) in the same analytical form, with *T*<sup>∗</sup> = *A*.

The Markov operators *T* and *T*<sup>∗</sup> = *A* are linear, bounded, and positive. In addition, the operator *A* is isometric in the cone of non-negative finitely additive measures, and *AS*<sub>*ba*</sub> ⊂ *S*<sub>*ba*</sub>. However, in this case, generally speaking, the operator *A* does not transform countably additive measures into the same ones, that is, *A*[*ca*(*X*, Σ)] ⊄ *ca*(*X*, Σ). Finitely additive MCs are also associated with their Markov sequences of finitely additive measures {*μ*<sup>*n*</sup>}.

**Remark 2.** *As already noted in the Introduction, in [6], Ramakrishnan introduced the concept of finitely additive Markov chains. This definition uses a number of concepts and constructions used only in game theory. The transition function of such MCs (in our terms) in [6] was interpreted as some conditional strategy, which, as a function of sets, is finitely additive. In Definition 4 and in the following comments about Markov operators, the usual language of functional analysis (measure theory and linear operator theory) is used. Strictly comparing these completely different approaches to constructing the theory of finitely additive Markov chains is very difficult (most likely impossible). However, some analogies for individual results are easy to see.*

It is natural to consider the decomposition of such transition functions (kernels) into two components: countably additive and purely finitely additive.

To define such kernels, we take as a basis Definition 2 and some information from Revuz's book ([15], Chapter 1, §1) and transfer them to the finitely additive case.

**Definition 5.** *A numerical function P*(*x*, *E*) *of two variables x* ∈ *X and E* ∈ Σ *is called a sub-Markov countably additive kernel if conditions (1), (2), and (3) from Definition 2 are satisfied.*

Similarly, we introduce the terms sub-Markov and Markov kernels for the cases when the kernel *P*(*x*, ·) is finitely additive or purely finitely additive in the second argument for each *x* ∈ *X*.

We can say that, in this case, we replace condition (3) in Definitions 2 and 4 with the following conditions:

(3′) *P*(*x*, ·) ∈ *ba*(*X*, Σ), ∀*x* ∈ *X*, and

(3″) *P*(*x*, ·) ∈ *pfa*(*X*, Σ), ∀*x* ∈ *X*, respectively.

The integral operators *T* and *A* in spaces of functions and measures generated by a sub-Markov (Markov) kernel are also called sub-Markov (Markov).

The already cited Yosida–Hewitt Theorem 1 [11] on the decomposition of a finitely additive measure implies the following statement.

**Proposition 1.** *Let X be an infinite set and let the sigma-algebra* Σ *of its subsets contain all one-point sets. Any Markov finitely additive kernel P*(*x*, *E*) *on* (*X*, Σ) *is uniquely represented as the sum of its countably additive and purely finitely additive components: P*(*x*, *E*) = *P*<sub>*ca*</sub>(*x*, *E*) + *P*<sub>*pfa*</sub>(*x*, *E*), *where P*<sub>*ca*</sub>(*x*, ·) ∈ *ca*(*X*, Σ), *P*<sub>*pfa*</sub>(*x*, ·) ∈ *pfa*(*X*, Σ), *for all x* ∈ *X, E* ∈ Σ*.*

**Proof of Proposition 1.** For each fixed *x* ∈ *X*, the transition function of a finitely additive MC is a finitely additive probability measure *P*(*x*, ·) in its second argument, i.e., *P*(*x*, ·) ∈ *ba*(*X*, Σ), by Definition 4. Therefore, for each *x* ∈ *X*, the transition function *P*(*x*, ·) has a unique decomposition *P*(*x*, ·) = *P*<sub>*ca*</sub>(*x*, ·) + *P*<sub>*pfa*</sub>(*x*, ·) into its countably additive and purely finitely additive components, according to Theorem 1.

We cannot yet call the components *P*<sub>*ca*</sub>(*x*, ·) and *P*<sub>*pfa*</sub>(*x*, ·) sub-Markov kernels, because the Σ-measurability of the functions *P*<sub>*ca*</sub>(·, *E*) and *P*<sub>*pfa*</sub>(·, *E*) for different *E* ∈ Σ and for an arbitrary sigma-algebra Σ is not guaranteed. By contrast, the original Markov kernel *P*(·, *E*) is Σ-measurable for any *E* ∈ Σ by definition.

If the components *P*<sub>*ca*</sub>(·, *E*) or *P*<sub>*pfa*</sub>(·, *E*) are non-measurable, then no sub-Markov operators *T* and *A* are integrally expressed in terms of them.

The question of measurability with respect to the first argument of two components in the decompositions of the Markov kernel in Proposition 1 was pointed out by one of the authors of this article in their paper [9]. It was hypothesized that non-measurable decompositions exist. This problem was solved by Gutman and Sotnikov in their work [16].

They proved a number of theorems on the singularities of the decompositions of transition functions (kernels) into the sum of their countably additive and purely finitely additive components in different cases and proved that non-measurable decompositions exist, in particular, on the segment [0, 1] with Lebesgue sigma-algebra.

Later, Sotnikov [17] constructed a class of strongly additive transition functions in which both of their decomposition components are measurable.

In this paper, we use another possibility of ensuring the measurability of the components in the decompositions of the finitely additive Markov kernel, which serves as an introduction to the next section, in which discrete topologies in an arbitrary MC phase space are discussed.

#### **3. Finitely Additive Markov Kernels in Discrete Space**

In the theory of Markov chains, the term "discrete" is used in different senses, and is applied to both the time parameter and the state space of the MC. We use the classical definition from functional analysis (see, for example, [18]), which is also used in some papers on the theory of MCs.

**Definition 6.** *A topological space* (*X*, *τ*) *is called discrete if all its subsets are simultaneously open and closed (clopen), that is, the topology τ* = 2<sup>*X*</sup> *is the set of all subsets of the set X.*

Such a topology in *X* is generated by the discrete metric *d*(*x*, *y*), equal to 1 for *x* ≠ *y* and equal to 0 for *x* = *y*. In a discrete space, all points are metrically isolated. A discrete metric (and topology) can be introduced in any set *X*. In particular, the discrete topology can be introduced in all "principal" number sets: *N*, *Z*, *Q*, [0, 1], and *R* = *R*<sup>1</sup>, as well as in *R*<sup>*m*</sup> (*m* ∈ *N*), transforming them into discrete spaces.

If a topological space is discrete, then, obviously, its Borel sigma-algebra is B = *τ* = 2<sup>*X*</sup>. This sigma-algebra contains all subsets of the set *X*. Such a sigma-algebra in *X* is also called discrete. We will denote it by Σ<sub>*d*</sub>.
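The discrete metric is simple enough to check by brute force on a small sample. The sketch below verifies the metric axioms on a hypothetical finite set of points and shows that a ball of radius 1/2 contains only its center, which is why every one-point set (and hence every subset) is open; all names in it are illustrative.

```python
from itertools import product


def discrete_metric(x, y):
    """d(x, y) = 0 if x equals y, and 1 otherwise."""
    return 0 if x == y else 1


def satisfies_metric_axioms(points):
    """Brute-force check of identity, symmetry and the triangle inequality."""
    for x, y in product(points, repeat=2):
        d = discrete_metric(x, y)
        if (d == 0) != (x == y) or d != discrete_metric(y, x):
            return False
    return all(
        discrete_metric(x, z) <= discrete_metric(x, y) + discrete_metric(y, z)
        for x, y, z in product(points, repeat=3)
    )


def open_ball(points, center, radius):
    """Points of the sample at distance strictly less than radius from center."""
    return {y for y in points if discrete_metric(center, y) < radius}


points = ["a", "b", "c", 0, 1.5]  # arbitrary sample of a set X

if __name__ == "__main__":
    print("metric axioms hold:", satisfies_metric_axioms(points))
    print("ball of radius 1/2 around 'a':", open_ball(points, "a", 0.5))
```

Any radius larger than 1 gives the whole sample as the ball, so there are only two kinds of balls: single points and the entire space.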

Ramakrishnan [6] uses a similar definition of the discrete phase space of an MC.

If the space *X* is discrete, then, obviously, any bounded numerical function *f* : *X* → *R* is measurable with respect to the discrete sigma-algebra Σ<sub>*d*</sub>, that is, *f* ∈ *B*(*X*, Σ<sub>*d*</sub>). In particular, the components *P*<sub>*ca*</sub>(·, *E*) and *P*<sub>*pfa*</sub>(·, *E*) of the MC transition function in Proposition 1 are Σ<sub>*d*</sub>-measurable in the first argument for all *E* ∈ Σ<sub>*d*</sub>.

Note that all numeric functions *f* : *X* → *R* on any discrete space (*X*, Σ<sub>*d*</sub>) are continuous in the discrete topology *τ* = 2<sup>*X*</sup>.

Let us introduce the concept of a measure atom, known in different versions (we just need to use a simplified version of its definition).

**Definition 7.** *Let* (*X*, Σ) *be an arbitrary measurable space and μ* : Σ → *R be some countably additive measure. An element x* ∈ *X is called an atom of the measure μ if μ*({*x*}) ≠ 0*. If a bounded measure μ*, *μ* ≥ 0, *has a support (set of full measure) D* ∈ Σ*, consisting of a finite or countable family of its atoms, then such a measure is called atomic (discrete). In this case, D* = {*x*<sub>1</sub>, *x*<sub>2</sub>, ... } *and μ*(*D*) = ∑<sub>*n*</sub> *μ*({*x*<sub>*n*</sub>}) = *μ*(*X*)*.*

The atomic measure *μ* ≥ 0 can be represented as follows

$$\mu(E) = \sum\_{n} \alpha\_n \delta\_{x\_n}(E),$$

where *E* ∈ Σ, *δ*<sub>*x*<sub>*n*</sub></sub> are Dirac measures concentrated at the points *x*<sub>*n*</sub>, *α*<sub>*n*</sub> = *μ*({*x*<sub>*n*</sub>}), and ∑<sub>*n*</sub> *α*<sub>*n*</sub> = *μ*(*D*) = *μ*(*X*).
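The representation of an atomic measure as a weighted sum of Dirac measures translates directly into code. The following sketch stores finitely many atoms with their weights and evaluates *μ*(*E*) for a set *E* given by a membership predicate; the helper `atomic_measure` and the sample weights are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as F


def atomic_measure(weights):
    """Build mu(E) = sum of alpha_n over the atoms x_n that lie in E.

    `weights` maps each atom x_n to its weight alpha_n > 0; the set E is
    passed as a predicate `contains(x)`, so E itself may be infinite.
    """
    def mu(contains):
        return sum((alpha for x, alpha in weights.items() if contains(x)), F(0))
    return mu


# Hypothetical atoms 0, 1, 5 in X = {0, 1, 2, ...} with total mass 1.
weights = {0: F(1, 2), 1: F(1, 3), 5: F(1, 6)}
mu = atomic_measure(weights)

total = mu(lambda x: True)               # mu(X)
on_support = mu(lambda x: x in weights)  # mu(D), D = set of atoms
even = mu(lambda x: x % 2 == 0)          # mass of the even numbers
odd = mu(lambda x: x % 2 == 1)           # mass of the odd numbers

if __name__ == "__main__":
    print("mu(X) =", total, " mu(D) =", on_support)
    print("mu(even) =", even, " mu(odd) =", odd)
```

In this example, *μ*(*D*) = *μ*(*X*) = 1, the even and odd numbers each carry mass 1/2, and any set that misses all three atoms has measure zero.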

Note that a countably additive measure on a nondiscrete measurable space (*X*, Σ) may not have atoms, for example, the Lebesgue measure on ([0, 1], B). Additionally, from Definition 7 and Lemma 1, any purely finitely additive measure on any measurable space has no atoms.

If the set *X* is countable and Σ = Σ*d*, then, obviously, any bounded countably additive measure *μ* on (*X*, Σ*d*) is atomic.

Now, we want to find out how countably additive measures are arranged on an arbitrary discrete space (*X*, Σ*d*). In a wider formulation, this question is considered, for example, in Bourbaki ([19], Chapter III, paragraphs 1 and 2). A locally compact topological space is taken as the initial space *X*. Countably additive measures are defined as linear continuous functionals on the space of continuous functions. Definitions of a discrete space, a discrete (atomic) measure, and its support are given, which differ from those given above. After proving a number of propositions (theorems), in ([19], Chapter III, paragraph 2, item 5), the following statement is formulated: "on a discrete space, any measure is discrete" (here, countably additive measures).

To apply this statement in this work, we need to give precise definitions of the above and other concepts and translate them into our language. Therefore, in Theorem 2 below, we give our proof of the above statement from [19] in our definitions and refine it.

However, for this, we need one well-known and nontrivial theorem of Ulam, stated, for example, in ([20], Chapter 5, Theorems 5.6 and 5.7) and, in more detail, in ([21], Volume 1, Theorem 1.12.40, and Corollary 1.12.41). We present this theorem under the condition that the continuum hypothesis is accepted, i.e., we assume that ℵ1 = *c* (continuum).

**Theorem 2.** *A finite countably additive measure μ defined on all subsets of the set X of cardinality* ℵ1 *(c, continuum) is identically zero if it is zero for each one-point subset.*

Obviously, Ulam's theorem holds trivially for sets *X* with countable cardinality ℵ0. We continue to assume that the continuum hypothesis is true.

**Remark 3.** *In the books [20,21], it is noted that the (extended) Ulam Theorem 2 remains true for sets X of higher, so-called "immeasurable" cardinalities. Immeasurable cardinalities include all cardinalities from the ordered cardinality scale:* ℵ0*,* ℵ1 = *c,* ℵ2*, etc. No example of a set of "measurable" cardinality is known so far.*

**Definition 8.** *A measurable space* (*X*, Σ*d*) *is called an arbitrary discrete space if* Σ*d* = 2<sup>*X*</sup> *and the set X has an arbitrary immeasurable cardinality (including those from the ordered cardinality scale). In other words, we consider only discrete spaces for which the (extended) Ulam theorem is valid.*

Now, let us prove the following promised theorem.

**Theorem 3.** *Any non-zero non-negative bounded countably additive measure μ* : Σ*d* → *R on an arbitrary discrete space* (*X*, Σ*d*) *is atomic (discrete) and has a finite or countable support D* = {*x*1, *x*2, ... } ⊂ *X, for which μ*({*xn*}) = *αn* > 0, *n* ∈ *N,* ∑*n αn* = *μ*(*D*) = *μ*(*X*)*, μ*(*X* \ *D*) = 0*.*

**Proof.** For a countable set *X*, the assertions of the theorem are trivially fulfilled.

Now, let the set *X* have uncountable cardinality. Consider an arbitrary bounded nonnegative countably additive measure *μ* : Σ*d* → *R* for which 0 < *μ*(*X*) = *γ* < ∞. Because the measure *μ* is not identically zero, by Theorem 2 it has at least one one-point atom *x*0 ∈ *X* such that *μ*({*x*0}) = *α*0 > 0. We denote by *D* the set of all atoms of the measure *μ*. As we have shown above, *x*0 ∈ *D* and *D* ≠ ∅. Let us prove that the set *D* is finite or countable.

We split the interval (0, *γ*] of possible non-zero values of the measure *μ* into a countable family of disjoint intervals

$$(0, \gamma] = \cup\_{n=1}^{\infty} (\frac{\gamma}{n+1}, \frac{\gamma}{n}].$$

We denote the inverse images of these intervals as

$$D\_n = \left\{ x \in X : \mu(\{x\}) \in \left(\frac{\gamma}{n+1}, \frac{\gamma}{n}\right] \right\}, \quad D\_n \in \Sigma\_d, \quad n = 1, 2, \ldots$$

Then, the sets *Dn* are also pairwise disjoint and *D* = ∪<sub>*n*=1</sub><sup>∞</sup> *Dn*. Therefore, since the measure *μ* is countably additive, *μ*(*D*) = ∑<sub>*n*=1</sub><sup>∞</sup> *μ*(*Dn*).

By construction, for any point *x* ∈ *Dn*, the inequalities *γ*/(*n* + 1) < *μ*({*x*}) ≤ *γ*/*n* hold, *n* = 1, 2, ... . In addition, *μ*(*Dn*) ≤ *γ* < ∞ for all *n* = 1, 2, ... .

If any of the sets *Dn* were infinite, then, by virtue of the lower bound *μ*({*x*}) > *γ*/(*n* + 1) for *x* ∈ *Dn*, we would have *μ*(*Dn*) = ∞. This contradiction implies that each set *Dn* is finite or empty.

Therefore, the set *D*, as a union of a countable (or finite) family of finite sets, is countable (or finite).
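The counting argument can be made concrete with a short Python sketch. The inequalities *γ*/(*n* + 1) < *μ*({*x*}) ≤ *γ*/*n* are equivalent to *n* = ⌊*γ*/*μ*({*x*})⌋, and a set *Dn* with more than *n* points would have mass exceeding *γ*, so each *Dn* has at most *n* elements. The atoms and weights below are hypothetical and serve only as an illustration.

```python
import math
from fractions import Fraction

def bin_atoms(weights, gamma):
    """Split atoms into D_n = {x : gamma/(n+1) < mu({x}) <= gamma/n}."""
    bins = {}
    for x, alpha in weights.items():
        # gamma/(n+1) < alpha <= gamma/n  is equivalent to  n = floor(gamma/alpha)
        n = math.floor(gamma / alpha)
        bins.setdefault(n, []).append(x)
    return bins

# Hypothetical atomic measure: five atoms with exact rational masses.
weights = {
    "a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 4),
    "d": Fraction(1, 8), "e": Fraction(1, 8),
}
gamma = sum(weights.values())   # mu(X); all mass sits on the atoms here
bins = bin_atoms(weights, gamma)

for n, atoms in sorted(bins.items()):
    # Each D_n can hold at most n atoms, otherwise mu(D_n) > gamma.
    print(n, atoms, len(atoms) <= n)
```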

By construction, *D* ⊂ *X* and *μ*(*D*) ≤ *μ*(*X*). Let us prove that *μ*(*D*) = *μ*(*X*). Because *X* has uncountable cardinality and the set *D* is finite or countable, the set *X* \ *D* ≠ ∅ and is also uncountable.

Suppose that *μ*(*D*) < *μ*(*X*), i.e., *μ*(*X* \ *D*) > 0. By the hypothesis of the theorem, the set *X* is discrete. Consequently, the set *X* \ *D* is also discrete.

The restriction *μD* of the measure *μ* from the set *X* to the set *X* \ *D* also satisfies all of the requirements for the measure *μ* under the conditions of the theorem. Because *μ*(*X* \ *D*) > 0, again applying Ulam's Theorem 2, we obtain that a point *y*0 ∈ *X* \ *D* exists such that *μD*({*y*0}) > 0. However, then, *μ*({*y*0}) > 0. At the same time, *y*0 ∉ *D*, that is, *y*0 ≠ *x* for any *x* ∈ *D*. However, the set *D* was defined as the set of all points *x* ∈ *X* for which *μ*({*x*}) > 0. Thus, we obtain a contradiction. Therefore, *μ*(*D*) = *μ*(*X*) and *μ*(*X* \ *D*) = 0.

Because *D* is finite or countable, we re-number all its points and obtain the last statement of the theorem.

Surprisingly, all countably additive measures on the discrete segment [0, 1] are only atomic. Additionally, the Lebesgue measure does not exist on the discrete segment [0, 1]. In 1923, Stefan Banach proved that the Lebesgue measure defined on the Borel (generated by the Euclidean topology) sigma-algebra of the segment [0, 1] cannot be countably additively extended to the sigma-algebra of all subsets of the segment [0, 1]. However, it can be extended to a finitely additive measure on a discrete sigma-algebra, and infinitely many such extensions exist. This issue is discussed in many sources; see, for example, ([21], Volume 1, items 1.12.29 and 2.12.91).

**Theorem 4.** *A finitely additive nonnegative measure μ defined on an arbitrary discrete space* (*X*, Σ*d*) *is purely finitely additive if and only if the condition μ*({*x*}) = 0 *for all x* ∈ *X is fulfilled.*

**Proof.** The necessity of the condition is obvious. Let us show its sufficiency. Let the condition be satisfied but the measure *μ* be not purely finitely additive. Then, in its decomposition *μ* = *μca* + *μpfa*, the countably additive component *μca* ≠ 0. In this case, by Theorem 3, a point *x*1 ∈ *X* exists such that *μca*({*x*1}) > 0. Since *μ*({*x*1}) ≥ *μca*({*x*1}) > 0, this contradicts the condition. Hence, *μ* is purely finitely additive.

Let us now return to Markov chains. Theorem 3 automatically implies the following statement.

**Theorem 5.** *Let a countably additive sub-Markov kernel P*(*x*, *E*) *be given on an arbitrary discrete space* (*X*, Σ*d*)*, and P*(*x*, *X*) > 0 *for all x* ∈ *X . Then, for any x* ∈ *X, the measure P*(*x*, ·) *is atomic and has a finite or countable support D*(*x*) = {*x*1(*x*), *x*2(*x*),... }*, for which*

*P*(*x*, {*xn*(*x*)}) = *αn*(*x*) > 0, *n* ∈ *N, and* ∑*<sup>n</sup> αn*(*x*) = *P*(*x*, *D*(*x*)) = *P*(*x*, *X*), *P*(*x*, *X* \ *D*(*x*)) = 0*.*

**Corollary 1.** *(From Theorem 4). Let a finitely additive sub-Markov kernel P*(*x*, *E*) *be given on an arbitrary discrete space* (*X*, Σ*d*)*. For any fixed x*0 ∈ *X, the measure P*(*x*0, ·) *is purely finitely additive if and only if P*(*x*0, {*y*}) = 0 *for all y* ∈ *X (including the case y* = *x*0*).*

**Example 1.** *Let X* = [0, 1] *with the Euclidean topology, let* Σ = B *be the Borel sigma-algebra, and let a countably additive MC be given by the kernel P*(*x*, *E*) = *λ*(*E*) *for all x* ∈ *X and E* ∈ B*, where λ is the Lebesgue measure. Such a MC corresponds to a sequence of independent uniformly distributed random variables on the segment* [0, 1]*. Obviously, P*(*x*, {*y*}) = 0 *holds for all x*, *y* ∈ *X. However, the phase space* (*X*, Σ) = ([0, 1], B) *is not discrete, and Theorem 4 is not applicable.*

**Example 2.** *Let us now take the same X* = [0, 1] *with the discrete sigma-algebra* Σ*d. Consider a finitely additive MC defined by the kernel P*(*x*, *E*) = *η*(*E*) *for all x* ∈ *X and E* ∈ Σ*d, where η is some purely finitely additive measure satisfying the following conditions: η* ≥ 0*, η*(*X*) = 1 *and η*((0,*ε*)) = 1 *for all ε* > 0*. Then, obviously, the condition P*(*x*, {*y*}) = 0 *is also satisfied for all x*, *y* ∈ *X, and Theorem 4 is applicable.*

The measure *η* in this example can be informally characterized as follows. It specifies a certain "random variable" that, with probability 1, takes a value as close to the point 0 as desired but never the point 0 itself.

We then denote by *P*<sup>*n*</sup>(*x*, *E*) the *n*-fold integral convolution of the kernel *P*<sup>1</sup>(*x*, *E*) = *P*(*x*, *E*), *n* = 1, 2, 3, ... .

The following statement is easily proven by induction.

**Corollary 2.** *Let a sub-Markov purely finitely additive kernel P*(*x*, *E*) *be given on an arbitrary discrete space* (*X*, Σ*d*)*. Then, for all x*, *y* ∈ *X and n* = 1, 2, 3, ... *, P*<sup>*n*</sup>(*x*, {*y*}) = 0 *(including the case x* = *y).*

In general, the converse is not true. Here is a counter-example.

**Example 3.** *Let some purely finitely additive probability measure η be given on the discrete space* (*X*, Σ*d*)*, where X* = [0, 1]*. Consider on* ([0, 1], Σ*d*) *a finitely additive MC with the following rules for passing in one step: P*(0, {1}) = 1, *P*(*x*, *E*) = *η*(*E*) *for all x* ∈ (0, 1] *and E* ∈ Σ*d. In particular, P*(*x*, {*y*}) = *η*({*y*}) = 0 *for all x* ∈ (0, 1] *and y* ∈ [0, 1]*.*

*Performing the integral convolution of two kernels P*(*x*, *E*)*, we obtain that P*<sup>2</sup>(*x*, {*y*}) = 0 *for all x* ∈ [0, 1] *and y* ∈ [0, 1]*. Moreover, P*(0, {1}) = 1 > 0*.*
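The convolution in Example 3 can be written out as follows. This is a sketch under the assumption that the two-step kernel is given by *P*<sup>2</sup>(*x*, *E*) = ∫ *P*(*z*, *E*) *P*(*x*, *dz*) and that *P*(0, ·) is the probability measure concentrated at the point 1; the integration variable *z* is introduced here for illustration. For *x* = 0,

$$
P^2(0, \{y\}) = \int\_X P(z, \{y\})\, P(0, dz) = P(1, \{y\}) = \eta(\{y\}) = 0.
$$

For *x* ∈ (0, 1], the integrand *P*(*z*, {*y*}) vanishes for all *z* ∈ (0, 1], so only the one-point set {0} can contribute, and *η*({0}) = 0:

$$
P^2(x, \{y\}) = \int\_X P(z, \{y\})\, \eta(dz) = P(0, \{y\}) \cdot \eta(\{0\}) = 0.
$$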

As noted above in Section 2, the operator *A* generated by the countably additive sub-Markov kernel transforms countably additive measures into the same ones, that is, *A*[*ca*(*X*, Σ)] ⊂ *ca*(*X*, Σ). This property is preserved for the particular discrete case Σ = 2<sup>*X*</sup>. However, if the measure *μ* ∈ *Vpfa* is purely finitely additive, then both cases are possible: *Aμ* ∈ *Vca* and *Aμ* ∈ *Vpfa*. The situation is different with a purely finitely additive kernel.

**Theorem 6.** *Let a purely finitely additive sub-Markov kernel P*(*x*, *E*) *be given on an arbitrary discrete space* (*X*, Σ*d*)*. Then, the sub-Markov operator A generated by this kernel transforms all finitely additive measures into purely finitely additive measures, that is, A*[*ba*(*X*, Σ*d*)] ⊂ *pfa*(*X*, Σ*d*)*, in particular, A*[*ca*(*X*, Σ*d*)] ⊂ *pfa*(*X*, Σ*d*) *and A*[*pfa*(*X*, Σ*d*)] ⊂ *pfa*(*X*, Σ*d*)*.*

**Proof.** Let *μ* ∈ *Vba* be a finitely additive measure with *μ*(*X*) > 0. We denote *η* = *Aμ*. Clearly, the measure *η* is also finitely additive.

If *η*(*X*) = 0, that is, *η* ≡ 0 (which is possible), then it can be considered purely finitely additive (see Remark 1) and the theorem is true.

Let *η*(*X*) > 0. Take its decomposition *η* = *ηca* + *ηpfa* into a countably additive component *ηca* and a purely finitely additive component *ηpfa*.

If *ηca*(*X*) = 0, then the measure is *η* = *ηpfa* ∈ *Vpfa* and the theorem is proved.

Suppose that the countably additive measure *ηca*(*X*) > 0. Then, by Theorem 3, the measure *ηca* has at least one atom *a* ∈ *X*, *ηca*({*a*}) = *γ* > 0. Because the measure *ηpfa* is purely finitely additive, then *ηpfa*({*a*}) = 0.

By the hypothesis of the theorem, all kernels *P*(*x*, ·) are purely finitely additive for all *x* ∈ *X*. Such measures vanish on any one-point set. Therefore, *P*(*x*, {*a*}) = 0 for all *x* ∈ *X*. Hence,

$$\gamma = \eta\_{ca}(\{a\}) = \eta\_{ca}(\{a\}) + 0 = \eta\_{ca}(\{a\}) + \eta\_{pfa}(\{a\}) = \eta(\{a\}) = A\mu(\{a\})$$

$$= \int\_{X} P(\mathbf{x}, \{a\})\mu(d\mathbf{x}) = \int\_{X} 0 \cdot \mu(d\mathbf{x}) = 0.$$

Thus, we obtain a contradiction. Therefore, *ηca*(*X*) = 0, and the measure *η* = *ηpfa* = *Aμ* is purely finitely additive.

Now, by using the discrete topology in *X*, we can complete Proposition 1.

**Proposition 2.** *Let an arbitrary discrete space* (*X*, Σ*d*) *be given. Any Markov finitely additive kernel P*(*x*, *E*) *on* (*X*, Σ*d*) *is uniquely presented as the sum of a sub-Markov countably additive kernel Pca*(*x*, *E*) *and a sub-Markov purely finitely additive kernel Ppfa*(*x*, *E*)*:*

$$P(x, E) = P\_{ca}(x, E) + P\_{pfa}(x, E),$$

*where Pca*(*x*, ·) ∈ *ca*(*X*, Σ*d*), *Ppfa*(*x*, ·) ∈ *pfa*(*X*, Σ*d*), *and Pca*(·, *E*) ∈ *B*(*X*, Σ*d*)*, Ppfa*(·, *E*) ∈ *B*(*X*, Σ*d*) *for all x* ∈ *X and E* ∈ Σ*d.*

The last inclusions, *Pca*(·, *E*) ∈ *B*(*X*, Σ*d*) and *Ppfa*(·, *E*) ∈ *B*(*X*, Σ*d*), mean that the kernels *Pca*(·, *E*) and *Ppfa*(·, *E*) are Σ*d*-measurable in the first argument for all *E* ∈ Σ*d*.

Proposition 2 makes it possible to introduce integral sub-Markov operators *Aca* and *Apfa* generated by the corresponding measurable subkernels. These operators act in the space of measures, *Aca*, *Apfa* : *ba*(*X*, Σ*d*) → *ba*(*X*, Σ*d*), and have the same analytical form as the operator *A*. For any *μ* ∈ *ba*(*X*, Σ*d*) and *E* ∈ Σ*d*,

$$A\_{ca}\mu(E) = \int\_X P\_{ca}(x, E)\mu(dx)$$

and

$$A\_{pfa}\mu(E) = \int\_X P\_{pfa}(x, E)\mu(dx).$$

In this case, *A* = *Aca* + *Apfa*.

Because the integral kernels of the operators are non-negative, the operators *Aca* and *Apfa* transform non-negative measures into non-negative measures, i.e., the operators *Aca* and *Apfa* are positive. Because 0 ≤ *Pca*(*x*, *E*) ≤ *P*(*x*, *E*) and 0 ≤ *Ppfa*(*x*, *E*) ≤ *P*(*x*, *E*) for all *x* ∈ *X* and *E* ∈ Σ*d*, the norms satisfy ‖*Aca*‖ ≤ ‖*A*‖ = 1 and ‖*Apfa*‖ ≤ ‖*A*‖ = 1, i.e., the operators are bounded. Thus, both sub-Markov operators *Aca* and *Apfa* are linear, bounded (continuous), and positive.

As we have already found out,

$$A\_{ca}[ca(X,\Sigma\_d)] \subset ca(X,\Sigma\_d), \quad A\_{pfa}[ba(X,\Sigma\_d)] \subset pfa(X,\Sigma\_d).$$

**Corollary 3.** *The following inclusions are true for superpositions of operators Aca and Apfa:*


**Remark 4.** *The operators Aca and Apfa, generally speaking, do not commute, i.e., Aca* · *Apfa* ≠ *Apfa* · *Aca.*

#### **4. Invariant Measures of Markov Operators**

In the paper by Zhdanok ([9], Chapter I, §5, Theorem 5.3), the following statement was proven.

**Theorem 7.** *For any Markov chain with a Markov finitely additive kernel P*(*x*, *E*) *on an arbitrary measurable space* (*X*, Σ)*, an invariant probability finitely additive measure μ* = *Aμ* ∈ *Sba exists, that is,* Δ*ba* ≠ ∅*.*

Earlier, a similar theorem (in the language of strategies) was proven by Ramakrishnan ([6], p. 8, Theorem 2) but in the special case of a discrete phase space. In our Theorem 7 given above, no restrictions on the phase space are assumed.

Now, let a Markov chain with a Markov finitely additive kernel *P*(*x*, *E*) be given on an arbitrary discrete space (*X*, Σ*d*), Σ*d* = 2<sup>*X*</sup>. We previously identified two special "extreme" cases. The first is when the kernel *P*(*x*, ·) is a countably additive measure for every *x* ∈ *X*. The second is when the kernel *P*(*x*, ·) is a purely finitely additive measure for all *x* ∈ *X*.

The first case has already been considered in the previous paragraphs of this article and studied in a number of studies by various authors.

Consider now the second special case.

**Theorem 8.** *Let a Markov chain with a purely finitely additive kernel P*(*x*, *E*) *be given on an arbitrary discrete space* (*X*, Σ*d*)*. Then, for the Markov operator A generated by it, an invariant probabilistic finitely additive measure μ* = *Aμ* ∈ *Vba exists and all its invariant measures are purely finitely additive, that is,* Δ*ba* = Δ*pfa* ≠ ∅ *and* Δ*ca* = ∅*.*

**Proof.** Theorem 7 is proven for any sigma-algebra Σ of subsets of *X* and for any Markov finitely additive kernel. Hence, it is also true for the discrete sigma-algebra Σ*d* = 2<sup>*X*</sup> and for a purely finitely additive kernel.

Therefore, under the conditions of the present theorem, for the operator *A*, an invariant probabilistic finitely additive measure *μ* = *Aμ* exists, defined on the discrete space (*X*, Σ*d*).

From Theorem 6, the measure *μ* and all other invariant measures of the operator *A* are purely finitely additive.

**Definition 9.** *We call a finitely additive MC on an arbitrary discrete space* (*X*, Σ*d*) *combined if its transition function in the decomposition*

$$P(x, E) = P\_{ca}(x, E) + P\_{pfa}(x, E),$$

*satisfies the conditions:*

$$P\_{ca}(x, X) = q\_1, \quad P\_{pfa}(x, X) = q\_2 \quad \text{for all } x \in X,$$

*where* 0 ≤ *q*1, *q*2 ≤ 1*, q*1 + *q*2 = 1*.*

Let the finitely additive MC be combined. Then, as shown in the comments to Proposition 2, its Markov operator *A* can also be represented as the sum *A* = *Aca* + *Apfa* of its two components generated by the sub-Markov kernels *Pca*(*x*, *E*) and *Ppfa*(*x*, *E*), wherein ‖*Aca*‖ = *q*1 and ‖*Apfa*‖ = *q*2.

**Definition 10.** *A combined MC is called non-degenerate if its decomposition from Definition 9 holds for* 0 < *q*1, *q*2 < 1 *and degenerate if q*1 = 0 *or q*2 = 0*.*

Above, in Section 2 and in Theorem 8, we describe the existence of invariant measures and their types for countably additive and purely finitely additive MCs. By Definition 10, they are degenerate cases of combined MCs.

Let the MC be non-degenerate. Let us take functions

$$
\tilde{P}\_{\rm ca}(\mathbf{x}, E) = \frac{1}{q\_1} P\_{\rm ca}(\mathbf{x}, E), \\
\tilde{P}\_{pfa}(\mathbf{x}, E) = \frac{1}{q\_2} P\_{pfa}(\mathbf{x}, E).
$$

Then, the functions *P̃ca*(*x*, *E*) and *P̃pfa*(*x*, *E*) satisfy Definition 1 and are transition functions (Markov kernels) of the corresponding Markov operators

$$
\tilde{A}\_{ca} = \frac{1}{q\_1} A\_{ca}, \quad \tilde{A}\_{pfa} = \frac{1}{q\_2} A\_{pfa}.
$$

Therefore, the Markov operator *A* of the combined MC is a linear combination

$$A = q\_1 \tilde{A}\_{ca} + q\_2 \tilde{A}\_{pfa}$$

of the two Markov operators *Ãca* and *Ãpfa* (hence the name of such MCs and operators in Definition 9).

Recall that, by Theorem 7, any, including combined, finitely additive MC has an invariant finitely additive measure.

**Theorem 9.** *The combined non-degenerate finitely additive MC on an arbitrary discrete space* (*X*, Σ*d*) *has no non-zero invariant countably additive measures, that is,* Δ*ca* = ∅*.*

**Proof.** We carry out the proof by contradiction. Suppose that *μ* = *Aμ* ∈ *Sca*, i.e., the invariant measure *μ* is countably additive. Then,

$$
\mu = A\mu = (A\_{ca} + A\_{pf a})\mu = A\_{ca}\mu + A\_{pfa}\mu,
$$

where *Aca* and *Apfa* are the countably additive and purely finitely additive components of the operator *A*, respectively. Then, *Acaμ* is also a countably additive measure, and *Acaμ*(*X*) = *q*1 > 0, that is, the measure *Acaμ* is non-zero. By Theorem 6, the measure *Apfaμ* is purely finitely additive and non-zero: *Apfaμ*(*X*) = *q*2 > 0.

Consequently, the measure *μ* has a non-zero purely finitely additive component *Apfaμ* and is not countably additive. The resulting contradiction proves the theorem.

From Theorem 9, we obtain the following assertion.

**Theorem 10.** *Let a combined non-degenerate finitely additive MC with invariant probability finitely additive measure μ* = *Aμ* ∈ *Sba on an arbitrary discrete space* (*X*, Σ*d*) *be given. Let μ* = *μca* + *μpfa be its decomposition into countably additive μca and purely finitely additive μpfa components, μca* ≠ 0*, and μpfa* ≠ 0*.*

*Then, the measures μca and μpfa are not invariant for the operator A, that is, μca* ≠ *Aμca and μpfa* ≠ *Aμpfa.*

Recall that by Šidak ([14], Theorem 2.5), for a MC with a countably additive kernel in a similar decomposition of the invariant measure *μ* = *μca* + *μpfa*, *μca* = *Aμca* and *μpfa* = *Aμpfa*. The difference between such MCs and combined ones turned out to be very significant.

Let us give an example to illustrate the last two theorems.

**Example 4.** *Consider on the segment X* = [0, 1] *with discrete sigma-algebra* Σ*d a combined finitely additive MC with kernel*

$$P(x, E) = P\_{ca}(x, E) + P\_{pfa}(x, E).$$

These components are set according to the following rules:

*Pca*(*x*, *E*) = ½ *δ*0(*E*) for all *x* ∈ *X* and *E* ⊂ *X*, where *δ*0 is the Dirac measure at the point 0; *Ppfa*(*x*, *E*) = ½ *η*(*E*) for all *x* ∈ *X* and *E* ⊂ *X*, where *η* is some fixed purely finitely additive measure from *Spfa*. For clarity, we take the measure *η* from the family of purely finitely additive measures satisfying the condition *η*((0, *ε*)) = 1 for any *ε* > 0. Moreover, *Pca*(*x*, *X*) = ½ = *q*1 and *Ppfa*(*x*, *X*) = ½ = *q*2 for all *x* ∈ *X*.

Essentially, all this means that a Markov chain in one step can move from any point *x* ∈ *X* to the point 0 with probability ½ and to any set *E* ⊂ *X* \ {0} with probability ½ *η*(*E*). In particular, from any point *x* ∈ *X*, the system can move with probability ½ to the open interval (0, *ε*) for every *ε* ∈ (0, 1). The phase portrait of such a MC with an arbitrary *ε* ∈ (0, 1) is shown in Figure 1.

**Figure 1.** Phase portrait of the MC from Example 4.

Take an arbitrary (initial) finitely additive probability measure *μ* ∈ *Sba*. Then, for any *E* ⊂ *X*, the following holds:

$$A\mu(E) = \int\_X P(\mathbf{x}, E) d\mu(\mathbf{x}) = \int\_X P\_{ca}(\mathbf{x}, E) d\mu(\mathbf{x}) + \int\_X P\_{pfa}(\mathbf{x}, E) d\mu(\mathbf{x})$$

$$= \frac{1}{2} \int\_X \delta\_0(E) d\mu(\mathbf{x}) + \frac{1}{2} \int\_X \eta(E) d\mu(\mathbf{x})$$

$$= \frac{1}{2} \delta\_0(E) \cdot \mu(X) + \frac{1}{2} \eta(E) \cdot \mu(X) = \frac{1}{2} \delta\_0(E) + \frac{1}{2} \eta(E).$$

Hence, *Aμ* = ½ *δ*0 + ½ *η* for any initial measure *μ*. If *μ* = ½ *δ*0 + ½ *η*, then *Aμ* = *μ*.

Obviously, this is the only invariant probabilistic finitely additive measure for the given MC. The measures *μca* = ½ *δ*0 and *μpfa* = ½ *η* are the non-zero components of the measure *μ*, countably additive and purely finitely additive, respectively, and *μ* = *μca* + *μpfa*. Thus, Theorem 9 is confirmed. Moreover, because *μca*(*X*) = *μpfa*(*X*) = ½, the formula for *Aμ*(*E*) above gives *Aμca* = ½ *μ* ≠ *μca* and *Aμpfa* = ½ *μ* ≠ *μpfa*. Therefore, this example also confirms Theorem 10.
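The computations of Example 4 can be replayed symbolically with a small Python sketch. It relies on a simplifying assumption that is not part of the original text: measures are restricted to finite linear combinations of Dirac measures and of the fixed measure *η*, stored as dictionaries of coefficients. The labels `DELTA0` and `ETA` and the function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
ETA = "eta"                      # symbolic label for the measure eta
DELTA0 = ("delta", 0)            # symbolic label for the Dirac measure at 0

def mass(mu):
    """mu(X): each basis measure (a Dirac measure or eta) has total mass 1."""
    return sum(mu.values())

def apply_A(mu):
    """Example 4: A mu = mu(X) * (1/2 delta_0 + 1/2 eta)."""
    m = mass(mu)
    return {DELTA0: HALF * m, ETA: HALF * m}

mu_star = {DELTA0: HALF, ETA: HALF}                 # candidate invariant measure
mu_ca = {DELTA0: HALF}                              # its countably additive part
mu_pfa = {ETA: HALF}                                # its purely finitely additive part
start = {("delta", Fraction(1, 3)): Fraction(1)}    # hypothetical initial measure

print(apply_A(start) == mu_star)     # one step already lands on mu_star
print(apply_A(mu_star) == mu_star)   # mu_star is invariant
print(apply_A(mu_ca) == mu_ca)       # the components are not invariant
print(apply_A(mu_pfa) == mu_pfa)
```

Within this restricted model, the sketch only checks the algebra of the example; it says nothing about measures outside the chosen span.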

In the combined non-degenerate decomposition *A* = *Aca* + *Apfa* of the finitely additive operator *A*, its countably additive component *Aca* and its purely finitely additive component *Apfa* enter on an equal footing. One might therefore suppose that an analogue of Theorem 9 would also be valid for purely finitely additive invariant measures. However, it is not. Let us give a corresponding counterexample.

**Example 5.** *We consider a finitely additive combined MC on a discrete segment X* = [0, 1] *under the same conditions as in Example 4, but with a different countably additive component of its kernel: Pca*(*x*, *E*) = ½ *δx*(*E*) *for all x* ∈ *X and E* ⊂ *X, where δx is the Dirac measure at the point x.*

Essentially, this means that, in one step, the Markov system can go from any *x* ∈ *X* to the point *x*, i.e., remain in place, with probability ½ and into any set *E* ⊂ *X* \ {*x*} with probability ½ *η*(*E*). In particular, the probability *Ppfa*(*x*, (0, *ε*)) = ½ for any *ε* ∈ (0, 1). The phase portrait of such a MC with an arbitrary *ε* ∈ (0, 1) is shown in Figure 2.

**Figure 2.** Phase portrait of the MC from Example 5.

Obviously, this MC is a combined non-degenerate chain.

Let us perform integral transformations for an arbitrary initial probability measure *μ* ∈ *Sba*, similar to the transformations in Example 4. As a result (omitting the calculations), we have *Aμ* = ½ *μ* + ½ *η*. Then, we solve the equation *μ* = *Aμ*. From the last two equalities, we obtain the only solution *μ* = *η*.
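For completeness, the omitted calculation can be sketched as follows, by analogy with Example 4. It uses only the kernel of Example 5, the identity *δx*(*E*) = 1 for *x* ∈ *E* and 0 otherwise (so that integrating *δx*(*E*) against *μ* gives *μ*(*E*)), and *μ*(*X*) = 1:

$$
\begin{aligned}
A\mu(E) &= \int\_X P\_{ca}(x, E)\,\mu(dx) + \int\_X P\_{pfa}(x, E)\,\mu(dx) \\
&= \frac{1}{2} \int\_X \delta\_x(E)\,\mu(dx) + \frac{1}{2} \int\_X \eta(E)\,\mu(dx) \\
&= \frac{1}{2}\mu(E) + \frac{1}{2}\eta(E) \cdot \mu(X) = \frac{1}{2}\mu(E) + \frac{1}{2}\eta(E).
\end{aligned}
$$

The invariance equation then reduces to a linear equation for *μ*:

$$
\mu = \frac{1}{2}\mu + \frac{1}{2}\eta \iff \frac{1}{2}\mu = \frac{1}{2}\eta \iff \mu = \eta.
$$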

We have shown that this combined non-degenerate MC has a unique invariant finitely additive measure *η*, which is purely finitely additive, i.e., has no non-zero countably additive component.

#### **5. Norms of Components in the Decomposition of a Markov Sequence of Measures and Their Asymptotic Behavior**

Consider a combined non-degenerate finitely additive MC on an arbitrary discrete space (*X*, Σ*d*).

Let an arbitrary initial probability measure *μ*<sup>1</sup> ∈ *Sba*, *μ*<sup>1</sup> = *μ*<sup>1</sup><sub>*ca*</sub> + *μ*<sup>1</sup><sub>*pfa*</sub>, be given, and let *μ*<sup>*n*+1</sup> = *Aμ*<sup>*n*</sup>, *n* ∈ *N*, be the Markov sequence of measures generated by this initial measure. Its decomposition is

$$
\mu^{n+1} = \mu\_{ca}^{n+1} + \mu\_{pfa}^{n+1}.
$$

**Remark 5.** *The notation μ*<sup>*n*+1</sup><sub>*ca*</sub> *can be interpreted in two ways: it can be the countably additive component of the measure μ*<sup>*n*+1</sup>*, i.e.,* (*μ*<sup>*n*+1</sup>)*ca, or it can be the* (*n* + 1)*-th iteration of the measure μ*<sup>1</sup><sub>*ca*</sub>*, i.e.,* (*μ*<sup>1</sup><sub>*ca*</sub>)<sup>(*n*+1)</sup>*. Generally speaking, these two interpretations do not coincide. Hereafter, we mean that μ*<sup>*n*+1</sup><sub>*ca*</sub> = (*μ*<sup>*n*+1</sup>)*ca and μ*<sup>*n*+1</sup><sub>*pfa*</sub> = (*μ*<sup>*n*+1</sup>)*pfa for any n* ∈ *N.*

Because the operator *A* is isometric in the cone of positive measures, the norms ‖*μ*<sup>*n*+1</sup>‖ = *μ*<sup>*n*+1</sup>(*X*) = ‖*μ*<sup>1</sup>‖ = *μ*<sup>1</sup>(*X*) = 1 for each *n* ∈ *N*.

In this section, we consider the norms of the components *μ*<sup>*n*+1</sup><sub>*ca*</sub> and *μ*<sup>*n*+1</sup><sub>*pfa*</sub> for *n* → ∞.

Take the second iteration in the Markov sequence of measures *μ*<sup>2</sup> = *Aμ*<sup>1</sup>. Let us make the appropriate transformations:

$$\begin{split} \mu^2 = \mu\_{ca}^2 + \mu\_{pfa}^2 = A\mu^1 &= (A\_{ca} + A\_{pfa})(\mu\_{ca}^1 + \mu\_{pfa}^1) \\ &= A\_{ca}\mu\_{ca}^1 + A\_{ca}\mu\_{pfa}^1 + A\_{pfa}\mu\_{ca}^1 + A\_{pfa}\mu\_{pfa}^1. \end{split} \tag{1}$$

In the last four terms of the decomposition (1), the first is a countably additive measure and the third and fourth are purely finitely additive measures (see Theorem 6).

The second term *Acaμ*<sup>1</sup><sub>*pfa*</sub> can be a measure of any type. Consider two corresponding main cases: disjoint conditions (*H*1) and (*H*2).

$$(H\_1) \qquad \qquad A\_{ca}(V\_{pfa}) \subset V\_{ca},$$

that is, the operator *Aca* transforms all purely finitely additive measures from *Vpfa* into countably additive measures. Markov chains satisfying this condition (*H*1) exist. Let us show that the Markov chain in Example 4 has this property. In Example 4, the Markov chain kernel has a countably additive component *Pca* = ½ *δ*0.

Let *μ* be an arbitrary purely finitely additive measure: *μ* ∈ *Vpfa*. Then, for any *E* ⊂ *X*, the following holds:

$$A\_{ca}\mu(E) = \int\_X P\_{ca}(\mathbf{x}, E)\mu(d\mathbf{x}) = \frac{1}{2}\int\_X \delta\_0(E)\mu(d\mathbf{x}) = \frac{1}{2}\delta\_0(E)\mu(X),$$

i.e., *Acaμ* = ½ *δ*0 · *μ*(*X*), where the Dirac measure *δ*0 is countably additive. Therefore, condition (*H*1) is satisfied in Example 4.

**Theorem 11.** *Let condition* (*H*1) *be satisfied for a combined non-degenerate finitely additive Markov chain on an arbitrary discrete space* (*X*, Σ*d*)*. Then, for any initial measure μ*<sup>1</sup> ∈ *Sba and for any n* ∈ *N,*

$$\|\mu\_{ca}^{n+1}\| = \mu\_{ca}^{n+1}(X) = q\_1$$
 and 
$$\|\mu\_{pfa}^{n+1}\| = \mu\_{pfa}^{n+1}(X) = q\_2.$$

**Proof.** Here, we carry out the proof by induction. Let *n* = 1. Then, by condition (*H*1), the second term *Acaμ*<sup>1</sup><sub>*pfa*</sub> in decomposition (1) is a countably additive measure. Therefore, due to the uniqueness of the Yosida–Hewitt decomposition of measures, we have

$$
\mu\_{ca}^2 = A\_{ca}\mu\_{ca}^1 + A\_{ca}\mu\_{pfa}^1 = A\_{ca}(\mu\_{ca}^1 + \mu\_{pfa}^1) = A\_{ca}\mu^1.
$$

From here,

$$\|\mu\_{ca}^2\| = \mu\_{ca}^2(X) = A\_{ca}\mu^1(X) = q\_1 \cdot \mu^1(X) = q\_1.$$

Because

$$1 = \|\mu^2\| = \mu\_{ca}^2(X) + \mu\_{pfa}^2(X) = \|\mu\_{ca}^2\| + \|\mu\_{pfa}^2\|,$$

then

$$||\mu\_{pfa}^2|| = 1 - ||\mu\_{ca}^2|| = 1 - q\_1 = q\_2.$$

Thus, the statement of the theorem for *n* = 1 is proven.

Suppose that the statement of the theorem is also true for some *n* ∈ *N*.

Let us make a decomposition similar to the decomposition (1) for *μ*<sup>*n*+1</sup> and obtain the following equalities:

$$\begin{split} \mu^{n+1} = \mu\_{ca}^{n+1} + \mu\_{pfa}^{n+1} = A\mu^n &= (A\_{ca} + A\_{pfa})(\mu\_{ca}^n + \mu\_{pfa}^n) \\ &= A\_{ca}\mu\_{ca}^n + A\_{ca}\mu\_{pfa}^n + A\_{pfa}\mu\_{ca}^n + A\_{pfa}\mu\_{pfa}^n. \end{split} \tag{2}$$

As in the decomposition (1), here, the first term is a countably additive measure, and the third and fourth terms are purely finitely additive measures.

By condition (*H*1), the second term *Acaμ*<sup>*n*</sup><sub>*pfa*</sub> in (2) is a countably additive measure. Therefore, just as for the measure *μ*<sup>2</sup><sub>*ca*</sub>, we obtain that *μ*<sup>*n*+1</sup><sub>*ca*</sub> = *Acaμ*<sup>*n*</sup>. In the same way, we have that

$$\|\mu\_{ca}^{n+1}\| = \mu\_{ca}^{n+1}(X) = A\_{ca}\mu^{n}(X) = q\_1 \cdot \mu^{n}(X) = q\_1.$$

and

$$||\mu\_{pfa}^{n+1}|| = \mu\_{pfa}^{n+1}(X) = A\_{pfa}\mu^n(X) = q\_2 \cdot \mu^n(X) = q\_2.$$

Therefore, the statement of the theorem is true for any *n* ∈ *N*.

**Remark 6.** *Norms μn*+<sup>1</sup> *ca and μn*+<sup>1</sup> *pfa in Theorem 11 are independent of the norms of the components of the initial measure μ*<sup>1</sup> *ca and μ*<sup>1</sup> *pfa. Additionally, this is a very interesting fact.*

**Corollary 4.** *Let the conditions of Theorem 11 be satisfied. Then, for such a Markov chain there exist invariant finitely additive measures μ*∗ = *Aμ*∗*, μ*∗ = *μ*∗ *ca* + *μ*<sup>∗</sup> *pfa, and for all such measures for their components, the equalities are true:*

$$\|\|\mu\_{ca}^\*\|\| = \mu\_{ca}^\*(X) = q\_1,\\ \|\|\mu\_{pfa}^\*\|\| = \mu\_{pfa}^\*(X) = q\_2.$$

Because Markov chains satisfying condition (*H*1) are non-degenerate, that is, 0 < *q*<sub>1</sub>, *q*<sub>2</sub> < 1, they have neither invariant countably additive measures nor invariant purely finitely additive measures.

Corollary 4 clarifies our Theorem 9 under the additional condition (*H*1).

**Remark 7.** *Obviously, Example 4, which satisfies condition* (*H*1)*, also satisfies the assertion of Theorem 11. Moreover, for any initial measure μ<sup>1</sup> ∈ S<sub>ba</sub>, the next Markov measure μ<sup>2</sup> = Aμ<sup>1</sup> coincides with the unique invariant measure μ<sup>2</sup> = μ<sup>∗</sup> = (1/2)δ<sub>0</sub> + (1/2)η of the given MC. Loosely speaking, one can say that the MC from Example 4 "strongly converges", uniformly in the initial measures μ<sup>1</sup>, to its only invariant measure μ<sup>∗</sup>, i.e., this MC has the best ergodic properties.*

We now give the second condition (*H*2) related to the decomposition in (1).

$$(H\_2) \qquad \qquad A\_{ca}(V\_{pfa}) \subset V\_{pfa},$$

that is, the operator *Aca* transforms all purely finitely additive measures from *Vpfa* into purely finitely additive measures. Such Markov chains exist. Let us show that the Markov chain in Example 5 has this property.

In Example 5, *P<sub>ca</sub>*(*x*, *E*) = (1/2)*δ<sub>x</sub>*(*E*) for all *x* ∈ *X* and *E* ⊂ *X*. We take an arbitrary measure *μ* ∈ *V<sub>pfa</sub>*. Then, for all *E* ⊂ *X*, the following holds:

$$A\_{ca}\mu(E) = \int\_X P\_{ca}(x, E)\mu(dx) = \frac{1}{2}\int\_X \delta\_x(E)\mu(dx) = \frac{1}{2}\mu(E).$$

Thus, *A<sub>ca</sub>μ* = (1/2) · *μ*, which is purely finitely additive because *μ* is. Hence, condition (*H*2) is satisfied.

**Theorem 12.** *Let condition* (*H*2) *be satisfied for a combined non-degenerate finitely additive Markov chain on an arbitrary discrete space* (*X*, Σ<sub>*d*</sub>)*. Then, for any initial finitely additive measure μ<sup>1</sup> ∈ S<sub>ba</sub> and for any n ∈ N,*

$$||\mu\_{ca}^{n+1}|| = \mu\_{ca}^{n+1}(X) = q\_1^n \cdot \mu\_{ca}^1(X) = q\_1^n \cdot ||\mu\_{ca}^1||$$

*and*

$$\|\mu\_{pfa}^{n+1}\| = \mu\_{pfa}^{n+1}(X) = 1 - q\_1^n \cdot \|\mu\_{ca}^1\|.$$

**Proof.** Let us return to the decomposition (1). By condition (*H*2), the second term *A<sub>ca</sub>μ<sub>pfa</sub><sup>1</sup>* in expansion (1) is a purely finitely additive measure. Therefore, due to the uniqueness of the Yosida–Hewitt decomposition,

$$\begin{aligned} \mu\_{ca}^2 &= A\_{ca}\mu\_{ca}^1\\ \mu\_{pfa}^2 &= A\_{ca}\mu\_{pfa}^1 + A\_{pfa}\mu\_{ca}^1 + A\_{pfa}\mu\_{pfa}^1 = A\_{ca}\mu\_{pfa}^1 + A\_{pfa}\mu^1. \end{aligned} \tag{3}$$

We now find the norm of the measure *μ<sub>ca</sub><sup>2</sup>* from equalities (3):

$$\|\mu\_{ca}^2\| = \mu\_{ca}^2(X) = A\_{ca}\mu\_{ca}^1(X) = \int\_X P\_{ca}(x, X)\mu\_{ca}^1(dx) = q\_1 \cdot \mu\_{ca}^1(X) = q\_1 \cdot \|\mu\_{ca}^1\|.$$

Because

$$1 = \|\mu^2\| = \|\mu\_{ca}^2\| + \|\mu\_{pfa}^2\|,$$

we have

$$\|\mu\_{pfa}^2\| = 1 - q\_1 \cdot \|\mu\_{ca}^1\|.$$

From the equalities obtained for *n* = 1 (*n* + 1 = 2), it is still difficult to guess the general form of the norms of the components of the Markov measures. Therefore, we now consider one more case, *n* = 2 (*n* + 1 = 3).

Let us make transformations for the measure *μ*<sup>3</sup>, similar to transformations (1) for the measure *μ*<sup>2</sup>, relying on condition (*H*2). As a result, we obtain the equality for the measure

$$\mu\_{ca}^3 = A\_{ca} \mu\_{ca}^2$$

and the equality for the norm of this measure

$$\|\mu\_{ca}^3\| = q\_1 \cdot \mu\_{ca}^2(X) = q\_1^2 \cdot \mu\_{ca}^1(X) = q\_1^2 \cdot \|\mu\_{ca}^1\|.$$

From here,

$$||\mu\_{pfa}^3|| = 1 - q\_1^2 \cdot ||\mu\_{ca}^1||.$$

Suppose now that, for some *n* ∈ *N*, *n* ≥ 2, the equality *μ<sub>ca</sub><sup>n</sup>* = *A<sub>ca</sub>μ<sub>ca</sub><sup>n−1</sup>* holds and that the norms of these measures satisfy ‖*μ<sub>ca</sub><sup>n</sup>*‖ = *q*<sub>1</sub><sup>*n*−1</sup> · ‖*μ<sub>ca</sub><sup>1</sup>*‖.

Then, (omitting transformations) we have

$$\mu\_{ca}^{n+1} = A\_{ca} \mu\_{ca}^{n}$$

$$||\mu\_{ca}^{n+1}|| = q\_1^n \cdot ||\mu\_{ca}^1||,$$

$$||\mu\_{pfa}^{n+1}|| = 1 - q\_1^n \cdot ||\mu\_{ca}^1||.$$

**Remark 8.** *Unlike Theorem 11, in Theorem 12, the norms of the components μ<sub>ca</sub><sup>n+1</sup> and μ<sub>pfa</sub><sup>n+1</sup> of the measure μ<sup>n+1</sup> depend (linearly) on the norms of the components of the initial measure μ<sup>1</sup>.*

**Corollary 5.** *Let the conditions of Theorem 12 be satisfied. Then, for any finitely additive initial measure μ<sup>1</sup> ∈ S<sub>ba</sub>, the components of the Markov sequence of measures μ<sup>n+1</sup> = Aμ<sup>n</sup> generated by it satisfy, as n → ∞,*

$$||\mu\_{ca}^{n}|| \to 0 \text{ and } ||\mu\_{pfa}^{n}|| \to 1.$$

*Moreover, the convergence is uniform with respect to the initial measures μ<sup>1</sup> ∈ S<sub>ba</sub> and exponentially fast.*
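The contrast between Theorems 11 and 12, and the exponential rate in Corollary 5, can be illustrated numerically. The following minimal Python sketch tracks only the two norms; the function names and the sample values of *q*<sub>1</sub> and of the initial countably additive norm are illustrative assumptions and do not come from the paper.

```python
def norms_h1(q1: float) -> tuple[float, float]:
    """Norms (ca, pfa) of mu^{n+1} under (H1): constant for every n >= 1 (Theorem 11)."""
    return q1, 1.0 - q1


def norms_h2(q1: float, ca_norm_1: float, n: int) -> tuple[float, float]:
    """Norms (ca, pfa) of mu^{n+1} under (H2), from the closed form of Theorem 12."""
    ca = q1 ** n * ca_norm_1
    return ca, 1.0 - ca


def norms_h2_iterated(q1: float, ca_norm_1: float, n: int) -> tuple[float, float]:
    """Same quantity, obtained step by step from ||mu_ca^{k+1}|| = q1 * ||mu_ca^k||."""
    ca = ca_norm_1
    for _ in range(n):
        ca *= q1
    return ca, 1.0 - ca


if __name__ == "__main__":
    q1 = 0.5  # hypothetical value with 0 < q1 < 1
    for ca_norm_1 in (0.0, 0.3, 1.0):  # hypothetical initial norms ||mu_ca^1||
        for n in (1, 5, 20):
            ca, pfa = norms_h2(q1, ca_norm_1, n)
            # The ca-norm is bounded by q1**n whatever the initial measure is,
            # which is the uniformity claimed in Corollary 5.
            assert ca <= q1 ** n
            print(f"||mu_ca^1||={ca_norm_1}, n={n}: ca={ca:.6f}, pfa={pfa:.6f}")
```

Because the bound *q*<sub>1</sub><sup>*n*</sup> does not involve the initial measure, the same number of steps suffices for every starting point.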

**Corollary 6.** *Let the conditions of Theorem 12 be satisfied. Then, for such a Markov chain, all of its invariant finitely additive measures (and such measures always exist, see Theorem 7) are purely finitely additive, i.e.,* Δ<sub>*ba*</sub> = Δ<sub>*pfa*</sub> ≠ ∅, Δ<sub>*ca*</sub> = ∅*.*

This statement follows from Theorem 12 or from its Corollary 5, if we take an invariant measure *μ*<sup>∗</sup> = *Aμ*<sup>∗</sup> as the initial measure *μ*<sup>1</sup>.
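This step can be spelled out as a short sketch, in which the symbol *c* is introduced here only for illustration. Take *μ*<sup>1</sup> = *μ*<sup>∗</sup> with *μ*<sup>∗</sup> = *Aμ*<sup>∗</sup> and put *c* = ‖*μ<sub>ca</sub><sup>∗</sup>*‖. Invariance gives *μ<sup>n+1</sup>* = *μ*<sup>∗</sup> for every *n*, so Theorem 12 yields

$$c = \|\mu\_{ca}^{n+1}\| = q\_1^n \cdot c \quad \text{for all } n \in N.$$

Since 0 < *q*<sub>1</sub> < 1 for a non-degenerate chain, letting *n* → ∞ gives *c* = 0, i.e., *μ*<sup>∗</sup> coincides with its purely finitely additive component.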

**Remark 9.** *Let us return to the MC from Example 5. The following assertions follow from the properties of this MC obtained above.*

*It is easy to verify by induction that, for any initial measure μ<sup>1</sup> ∈ S<sub>ba</sub> and for all E ⊂ X, n ∈ N,*

$$|\mu^{n+1}(E) - \eta(E)| = \frac{1}{2^n} |\mu^1(E) - \eta(E)|.$$

*Therefore, for each n ∈ N, the norm of a measure, which is equal to its total variation, satisfies the following estimate:*

$$||\mu^{n+1} - \eta|| = \frac{1}{2^n}||\mu^1 - \eta|| \le \frac{2}{2^n} = \frac{1}{2^{n-1}}.$$

*This implies that the Markov sequence of measures* {*μ<sup>n+1</sup>*} *of the given MC converges strongly (in the metric topology) to the unique invariant purely finitely additive measure η. This convergence is uniform over all initial finitely additive (including countably additive) measures μ<sup>1</sup> ∈ S<sub>ba</sub> and is exponentially fast. Thus, the MC in Example 5 is ergodic.*
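The geometric rate in Remark 9 can be checked on a single set *E*. The sketch below assumes that the chain of Example 5 acts on a fixed set *E* through the scalar recursion *μ<sup>n+1</sup>*(*E*) = (1/2)*μ<sup>n</sup>*(*E*) + (1/2)*η*(*E*). This recursion is consistent with *P<sub>ca</sub>*(*x*, *E*) = (1/2)*δ<sub>x</sub>*(*E*) and with the rate stated in the remark, but it is not restated in this section, and the numbers used for *μ*<sup>1</sup>(*E*) and *η*(*E*) are hypothetical.

```python
def markov_step(mu_e: float, eta_e: float) -> float:
    """One step of the assumed recursion mu^{n+1}(E) = mu^n(E)/2 + eta(E)/2."""
    return 0.5 * mu_e + 0.5 * eta_e


def deviation_after(mu1_e: float, eta_e: float, n: int) -> float:
    """Return |mu^{n+1}(E) - eta(E)| after n steps starting from mu^1(E)."""
    value = mu1_e
    for _ in range(n):
        value = markov_step(value, eta_e)
    return abs(value - eta_e)


if __name__ == "__main__":
    mu1_e, eta_e = 0.9, 0.2  # hypothetical values of mu^1(E) and eta(E)
    for n in range(1, 6):
        closed_form = abs(mu1_e - eta_e) / 2 ** n
        # The iterated deviation should match the closed form 2^{-n} |mu^1(E) - eta(E)|.
        assert abs(deviation_after(mu1_e, eta_e, n) - closed_form) < 1e-12
        print(n, deviation_after(mu1_e, eta_e, n))
```

Each step halves the distance to *η*(*E*), which is exactly the factor 1/2<sup>*n*</sup> in the remark.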

**Remark 10.** *In the previous Remark 9, we talked about the limiting behavior of the Markov sequences of measures themselves, not of their Cesàro means. This stronger type of convergence is due to the fact that the MC from Example 5 does not have cycles of measures.*

*The article by Zhdanok [22] was devoted to cycles of finitely additive measures. MCs with countably additive transition probability were considered with the Markov operator A extended to the space of finitely additive measures.*

The conditions (*H*1) and (*H*2) have a clear qualitative meaning, but they are difficult to verify for specific MCs. Thus, it is desirable to find simple analogues of these conditions in terms of the properties of the transition function of the MC under consideration. We offer two such conditions. Here, we present the first of them, the condition (*G*1).

(*G*1) - There is a finite set *D* ⊂ *X* such that for all *x* ∈ *X* : *Pca*(*x*, *D*) = *Pca*(*x*, *X*) = *q*1, which is equivalent to *Pca*(*x*, *X* \ *D*) = 0.

We still consider an arbitrary discrete phase space and finitely additive combined non-degenerate MCs defined on it.

**Theorem 13.** *Let condition* (*G*1) *be satisfied for some MC. Then, condition* (*H*1) *is satisfied, i.e.,* (*G*1) ⇒ (*H*1)*, and the assertion of Theorem 11 is true.*


**Proof.** Let *μ* ∈ *V<sub>pfa</sub>*, *μ* ≠ 0, i.e., let the measure *μ* be purely finitely additive, and let *η* = *A<sub>ca</sub>μ*. Let condition (*G*1) be satisfied. Then,

$$\eta(D) = A\_{ca}\mu(D) = \int\_X P\_{ca}(x, D)\mu(dx) = \int\_X q\_1\mu(dx) = q\_1\mu(X) > 0.$$

Similarly, we obtain *η*(*X* \ *D*) = 0 and *η*(*X*) = *η*(*D*).

The finitely additive measure *η* = *Acaμ* is concentrated on a finite set *D*. Therefore, it is formally countably additive on *D* and on the whole space *X*. This means that *η* = *Acaμ* ∈ *Vca*. Condition (*H*1) is satisfied, i.e., (*G*1) ⇒ (*H*1), and the assertion of Theorem 11 is true.

Consider one more condition (*G*2) on the transition function of the MC. For an arbitrary *y* ∈ *X*, we denote the set *Qy* = {*x* ∈ *X* : *P*(*x*, {*y*}) > 0}.

(*G*2) For any *y* ∈ *X* the set *Qy* is empty or finite.

**Theorem 14.** *Let condition* (*G*2) *be satisfied for some MC. Then, condition* (*H*2) *is satisfied, i.e.,* (*G*2) ⇒ (*H*2)*, and the assertion of Theorem 12 is true.*


**Proof.** Let *μ* ∈ *V<sub>pfa</sub>*, *μ* ≠ 0, i.e., let the measure *μ* be purely finitely additive, and let *η* = *A<sub>ca</sub>μ*. Then, for any *y* ∈ *X*, the following holds:

$$\eta(\{y\}) = A\_{ca}\mu(\{y\}) = \int\_X P\_{ca}(x, \{y\})\mu(dx) = \int\_{Q\_y} P\_{ca}(x, \{y\})\mu(dx) + \int\_{X \setminus Q\_y} P\_{ca}(x, \{y\})\mu(dx).$$

Because a purely finitely additive measure is equal to zero on any finite set and, by condition (*G*2), the set *Q<sub>y</sub>* is empty or finite, we have *μ*(*Q<sub>y</sub>*) = 0, and the first integral in the expansion above is equal to zero.

By the definition of *Q<sub>y</sub>*, the function *P<sub>ca</sub>*(*x*, {*y*}) is equal to zero for all *x* ∈ *X* \ *Q<sub>y</sub>*. Consequently, the second integral in this expansion is also equal to zero. This implies that *η*({*y*}) = 0 for every *y* ∈ *X*. Then, the measure *η* is purely finitely additive by our Theorem 4. Hence, condition (*H*2) is satisfied, and the assertion of Theorem 12 is true.
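As an illustration of the sets *Q<sub>y</sub>* used in condition (*G*2), the following Python sketch computes them for a kernel whose one-point transition probabilities are stored as a sparse dictionary. The representation `kernel[x][y] = P(x, {y})`, the function name, and the sample states are assumptions made for this example. On a finite sample of states, such a computation only illustrates the definition; it cannot establish that *Q<sub>y</sub>* is finite on an infinite space.

```python
from collections import defaultdict


def predecessor_sets(kernel: dict) -> dict:
    """Return {y: Q_y}, where Q_y = {x : P(x, {y}) > 0}.

    `kernel[x][y]` holds the one-point probability P(x, {y}); missing
    entries are treated as zero.
    """
    q_sets = defaultdict(set)
    for x, row in kernel.items():
        for y, prob in row.items():
            if prob > 0:
                q_sets[y].add(x)
    return dict(q_sets)


# Hypothetical sample of states from X = [0, 1]. Each point keeps mass 1/2 on
# itself, as for P_ca(x, E) = (1/2) * delta_x(E) in Example 5; a purely
# finitely additive component puts no mass on single points.
sample_states = [0.0, 0.25, 0.5, 1.0]
kernel = {x: {x: 0.5} for x in sample_states}

q_sets = predecessor_sets(kernel)
print(q_sets)
```

For this sample, every computed set consists of the single point *y* itself, in agreement with the direct argument for Example 5.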

**Remark 11.** *Let us show that the MC in Example 4 satisfies the condition* (*G*1)*. Recall that the countably additive component Pca*(*x*, *E*) *of the MC transition function in Example 4 has the following form:*

*P<sub>ca</sub>(x, E) = (1/2)δ<sub>0</sub>(E) for all x ∈ X and E ⊂ X, where δ<sub>0</sub> is the Dirac measure at the point 0. Take the finite set D = {0}. Then,*

$$P\_{ca}(x, D) = \frac{1}{2}\delta\_0(D) = \frac{1}{2} = q\_1, \qquad P\_{ca}(x, X \setminus D) = 0 \text{ for all } x \in X.$$

*Thus, condition* (*G*1) *is fulfilled.*

*We showed above that the MC in Example 4 also satisfies condition* (*H*1)*.*

*Let us now return to Example 5, in which the countably additive component of the transition function is given by the following rule: P<sub>ca</sub>(x, E) = (1/2)δ<sub>x</sub>(E) for all x ∈ X and E ⊂ X, where δ<sub>x</sub> is the Dirac measure at the point x.*

*Obviously, any point y ∈ X = [0, 1] can be reached in one step only from itself, with probability 1/2. Hence, Q<sub>y</sub> = {x ∈ X : P(x, {y}) > 0} = {y}. This set is finite for any y ∈ X. Thus, condition* (*G*2) *is satisfied.*

*Above, we directly showed (without using Theorem 14) that, in Example 5, condition* (*H*2) *also holds.*

#### **6. Conclusions**

Work on the theory of finitely additive Markov chains arose quite naturally in the general theory of random processes and in economic game theory. Ramakrishnan's pioneering work laid the foundations for such a theory. The main condition in this work is that the transition probability of Markov chains can only be finitely additive. However, the structures he created or used (strategies) are quite complex, and they require readers to have a broad outlook in several areas of mathematics.

The authors of this article have long been working on the use of finitely additive measures to study the properties of general Markov chains. However, attention was primarily paid to classical Markov chains with countably additive transition probability. In this case, finitely additive measures appeared as a result of the extension of Markov operators from the space of countably additive measures to the space of finitely additive measures. All of these studies were carried out within the framework of the operator treatment.

We have seen that it is possible to combine the problems of the theory of finitely additive Markov chains with the methods we are developing for studying general Markov chains. The result is the present work. Its distinguishing feature is the absence of concepts and methods of game theory and of the apparatus of random variables generated by finitely additive measures. We used the language and methods of classical functional analysis, which are accessible to a wider readership, and some of our results have simple proofs. Nevertheless, they provide a basic platform for possible future research conducted by other authors in this direction. In particular, the ergodic properties of finitely additive Markov chains can be considered.

**Author Contributions:** Conceptualization, A.Z.; methodology, A.Z. and A.K.; writing—original draft preparation, writing—review and editing, A.Z. and A.K.; visualization, A.K.; project administration, A.Z.; funding acquisition, A.Z. and A.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Russian Foundation of Basic Research, RFBR project number 20-01-00575-a.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

