**Stability Problems for Stochastic Models: Theory and Applications II**

Editors

**Alexander Zeifman Victor Korolev Alexander Sipin**

MDPI • Basel • Beijing • Wuhan • Barcelona • Belgrade • Manchester • Tokyo • Cluj • Tianjin

*Editors* Alexander Zeifman Vologda State University Russia

Victor Korolev Lomonosov Moscow State University Russia

Alexander Sipin Vologda State University Russia

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Mathematics* (ISSN 2227-7390) (available at: https://www.mdpi.com/journal/mathematics/special_issues/Stability_Problems_Stochastic_Models_Theory_Applications_II).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

LastName, A.A.; LastName, B.B.; LastName, C.C. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-3815-0 (Hbk) ISBN 978-3-0365-3816-7 (PDF)**

© 2022 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license, which allows users to download, copy and build upon published articles, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons license CC BY-NC-ND.

## **About the Editors**

**Alexander Zeifman**—Professor, Head of Department of Applied Mathematics, Vologda State University, Vologda, Russia; Senior Researcher, Institute of Informatics Problems, Federal Research Center "Computer Sciences and Control" of the Russian Academy of Sciences, Russia; Chief Researcher, Vologda Research Center of the Russian Academy of Sciences, Russia. Graduate of Vologda State Pedagogical Institute, 1976. Candidate of Science in Physics and Mathematics (PhD), 1981. Doctor of Science in Physics and Mathematics (1994, Institute of Control Sciences, Russian Academy of Sciences). Main research interests: stochastic models, continuous-time Markov chains, bounds on the rate of convergence, perturbation bounds, queueing models, biological models, queueing theory.

**Victor Korolev**—Professor, Head of Department of Mathematical Statistics, Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Moscow, Russia; Leading researcher, Institute of Informatics Problems, Federal Research Center "Computer Sciences and Control" of the Russian Academy of Sciences, Moscow, Russia. Graduate of Faculty of Computational Mathematics, Lomonosov Moscow State University, 1977. Candidate of Science in Physics and Mathematics (PhD), 1981. Doctor of Science in Physics and Mathematics (1994, Lomonosov Moscow State University). Main research interests: Limit theorems of probability theory and their applications in distribution theory, statistics, risk theory, reliability theory. Probability models of real processes in physics, meteorology, financial mathematics and other fields.

**Alexander Sipin**—Professor at the Department of Applied Mathematics, Vologda State University, Institute of Mathematics, Natural and Computer Sciences, Russia. Graduate of Faculty of Mathematics and Mechanics, Leningrad State University (now St. Petersburg State University), Russia, in 1975. Candidate of Science in Physics and Mathematics (PhD), 1979. Doctor of Science in Physics and Mathematics (2016, St. Petersburg State University, Russia). Research interests: Monte Carlo and quasi-Monte Carlo methods, Markov chains, meshless numerical methods for solving boundary value problems.

## **Preface to "Stability Problems for Stochastic Models: Theory and Applications II"**

Most papers published in this Special Issue of *Mathematics* were written by the participants of the XXXVI International Seminar on Stability Problems for Stochastic Models. This seminar was founded by the outstanding Russian mathematician Vladimir Zolotarev (27 February 1931–7 November 2019).

The main theme of the seminar is the development of an approach to limit theorems of probability theory and related fields proposed by V. M. Zolotarev. The main point of this approach is that limit theorems of probability theory are treated as special stability theorems. Zolotarev created the theoretical foundation of the key method used within this approach, namely, the theory of probability metrics. This approach assumes that statements establishing convergence must be accompanied by statements establishing the convergence rate. Zolotarev called 'natural' those convergence conditions that simultaneously serve as estimates of the convergence rate. This approach was developed in the works of Zolotarev, Kalashnikov, Kruglov, Senatov, Korolev, Khokhlov, and their colleagues. By the mid-1970s, the investigation of the continuity and stability of probability and statistical models, such as, say, characterization models for probability distributions and queueing models, had grown into a separate domain of probability theory. In May and November 1974, in Leningrad and Vilnius, two compact symposia on these problems were held. These symposia were initiated and organized by Vladimir Zolotarev. In 1975–1976, the research seminar on stability problems for stochastic models and related topics, headed by V. Zolotarev, was held at the Steklov Mathematical Institute of the Academy of Sciences of the USSR. Later, this weekly seminar moved to the Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University. Together with V. Zolotarev, the seminar was coordinated by V. Kalashnikov and V. Kruglov. In parallel with the weekly seminars, annual international sessions were launched with the wide participation of mathematicians from many countries. Now, this seminar is internationally recognized for the originality and relevance of the problems considered and the results presented.
The seminar formed and developed a breakthrough approach treating limit theorems of probability theory as stability theorems. Within this approach, many deep results were obtained.

Now, the scope of the seminar embraces the following:


and other fields.

Here is the complete list of the International sessions of the Seminar on Stability Problems for Stochastic Models:


XXX. 24–30 September 2012, Svetlogorsk, Russia

XXXI. 23–27 April 2013, Moscow, Russia

XXXII. 15–21 June 2014, Trondheim, Norway

XXXIII. 13–18 June 2016, Svetlogorsk, Russia

XXXIV. 24–28 August 2017, Debrecen, Hungary

XXXV. 24–28 September 2018, Perm, Russia

XXXVI. 22–26 June 2020, Petrozavodsk, Russia (online session); 21–25 June 2021, Petrozavodsk, Russia (offline session).

Most papers published in this Special Issue of *Mathematics* were written by the participants of the XXXVI International Seminar on Stability Problems for Stochastic Models, 21–25 June 2021, Petrozavodsk, Russia.


This issue contains twelve papers by specialists who represent six countries: Belarus, France, Hungary, India, Italy and Russia.

> **Alexander Zeifman, Victor Korolev, and Alexander Sipin** *Editors*

## *Article* **Comparing Distributions of Sums of Random Variables by Deficiency: Discrete Case**

**Vladimir E. Bening 1,2 and Victor Y. Korolev 1,2,3,\***


**Abstract:** In the paper, we consider a new approach to the comparison of the distributions of sums of random variables. Unlike preceding works, for this purpose we use the notion of deficiency that is well known in mathematical statistics. This approach is used, first, to determine the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the (1 − *α*)-quantile of the normalized sum for a given *α* ∈ (0, 1), and second, to determine the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the probability for the normalized sum to fall into a given interval. Both problems are solved under the condition that the possible distributions of the random summands possess coinciding first three moments. In both settings the best distribution delivers the smallest number of summands. Along with distributions of a non-random number of summands, we consider the case of random summation and introduce an analog of deficiency which can be used to compare the distributions of sums with random and non-random numbers of summands. The main mathematical tools used in the paper are asymptotic expansions for the distributions of R-valued functions of random vectors, in particular, normalized sums of independent identically distributed r.v.s and their quantiles. Along with the general case, main attention is paid to the situation where the summed random variables are independent and identically distributed. The approach under consideration is applied to the determination of the distribution of insurance payments providing the least insurance portfolio size under a prescribed Value-at-Risk or non-ruin probability.

**Keywords:** limit theorem; sum of independent random variables; random sum; asymptotic expansion; asymptotic deficiency; kurtosis

#### **1. Introduction**

#### *1.1. The Problem under Consideration and the Structure of the Paper*

The problem considered in the paper is very close to the problem of stochastic ordering and may even be considered a version of this problem. In probability theory and statistics, a stochastic order quantifies the concept of one random variable being "bigger" or "smaller" than another. Many different orders exist and have different applications; see, e.g., the book [1]. Here we propose an approach to establishing stochastic order for the distributions of sums of independent random variables (r.v.s) based on the notion of deficiency that is well known in asymptotic statistics, see, e.g., [2] and later publications [3–5]. Roughly speaking, in statistics the deficiency of a statistical procedure with respect to an 'optimal' procedure is the number of additional observations required to attain the same quality of inference as is guaranteed by the 'optimal' procedure.

In this paper we deal with the case where the deficiency is measured in natural-valued discrete units (number of 'additional' summands) and therefore here we deal with discrete

**Citation:** Bening, V.E.; Korolev, V.Y. Comparing Distributions of Sums of Random Variables by Deficiency: Discrete Case. *Mathematics* **2022**, *10*, 454. https://doi.org/10.3390/ math10030454

Academic Editors: Anatoliy Swishchuk and Antonella Basso

Received: 27 December 2021 Accepted: 28 January 2022 Published: 30 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

case. The notion of deficiency can be extended to the case of a continuous parameter, say, time. That case will be considered in another work.

Along with the general case, in the paper main attention is paid to the situation where the r.v.s being summed are assumed to be independent and identically distributed.

The first problem to be considered below consists in the determination of the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the (1 − *α*)-quantile of the normalized sum for a given *α* ∈ (0, 1). The second problem considered in the paper consists in the determination of the distribution of a separate random variable in the sum that provides the least possible number of summands guaranteeing the prescribed value of the probability for the normalized sum to fall into a given interval. Actually, in both problems we deal with 'fine tuning' of the distribution of a separate summand, since we assume that the different possible distributions of the random summands possess coinciding first three moments, so that they can differ only in their kurtosis. In both settings the best distribution delivers the smallest number of summands.

We also consider the problem where some additional randomization is introduced so that the number of summands in the sum can itself be random. This randomization may be not artificially induced but may also occur when the exact number of summands is a priori unknown and only some 'expected' value of it is available as a parameter of the problem. For this case we introduce an analog of deficiency which can be used to compare the distributions of sums with random and non-random numbers of summands.

Both problems are closely related to the problem of quantification of the accuracy of approximations provided by limit theorems of probability theory. The main mathematical tools used in the paper are asymptotic expansions for the distributions of normalized sums of independent identically distributed r.v.s and their quantiles.

The formal settings mentioned above can be applied to solving practical problems where the models of the observed statistical regularities have the form of distributions of sums of r.v.s and the number of summands plays a substantial role. For example, consider an insurance company whose portfolio consists of a finite number of insurance contracts. Formally, the portfolio is assumed to be a finite set of r.v.s each of which characterizes the income of the company related to a separate contract. Instead of *income* we can speak of *loss* assuming that income is a negative loss or that loss is a negative income.

In these terms, the first setting concerns the problem of determination of the distribution of a possible loss within a separate insurance contract (say, the distribution of an insurance payment) providing the least possible portfolio size and guaranteeing the prescribed Value-at-Risk for the average losses. The approach considered in the paper can be used when the distributions of the summands (possible losses) are known only up to their three first moments and the exact Value-at-Risk is not known for sure. In the second setting the latter requirement is replaced by that of guaranteeing the prescribed 'non-ruin' probability. Within the framework of this example in both settings the problem consists in the description of the best strategy of the insurance company, if by a strategy we mean the choice of the terms of a contract (e.g., the amount of insurance payment related to each possible insurance event), that is, of the distribution of possible loss within a separate contract. Briefly, the problem is to choose an optimal distribution of a separate loss among the distributions that have the same first three moments so that the portfolio size is least possible.

The paper is organized as follows. Section 1.2 contains a short overview of the properties of statistical deficiency. In Section 2 we outline some results concerning the asymptotic expansions for the distributions of R-valued measurable functions of r.v.s and, in particular, for the distributions of normalized sums of r.v.s, as well as for their quantiles. In Section 3 the problem of comparison of the distributions of two sums of independent r.v.s by their deficiency is considered. The notion of asymptotic deficiency is introduced and some formulas for the calculation of asymptotic deficiency are presented. Section 3.1 contains the solution of this problem for these distributions providing a prescribed value

of the (1 − *α*)-quantile for a given *α* ∈ (0, 1). In Section 3.2 this problem is considered for the distributions of sums of independent r.v.s guaranteeing a prescribed probability for an R-valued measurable function of r.v.s, in particular, for a normalized sum of r.v.s, to fall into a given interval. Section 4 contains an example of extension of the results of Section 3 to the case of a random number of summands in the sum (random portfolio size, in terms of the example dealing with an insurance company). In Section 4.1 asymptotic expansions for the asymptotic (1 − *α*)-quantile (called *α*-reserve here) under a random portfolio size are presented and an analog of deficiency of the sum of a random number of summands (or the strategy with a random portfolio size) with respect to the distribution of the sum of a non-random number of summands (or a strategy with a non-random portfolio size) is considered. In Section 4.2 the problem of comparison of these distributions by an analog of deficiency is considered in a special case of three-point distribution of portfolio size.

Everywhere in what follows the set of real numbers is denoted by $\mathbb{R}$ and the set of natural numbers by $\mathbb{N}$. The distribution function of the standard normal law will be denoted by $\Phi(x)$ and its density by $\varphi(x)$:

$$\Phi(x) = \int_{-\infty}^{x} \varphi(y)\,dy, \qquad \varphi(x) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{x^2}{2}\right\}, \quad x \in \mathbb{R}.$$

The distribution of a random vector $(X_1, \dots, X_n)$ will be denoted by $\mathcal{L}(X_1, \dots, X_n)$.

#### *1.2. Asymptotic Deficiency*

Following the classical terminology of [6], consider two decision rules (say, two statistical procedures) $D_n^*$ and $D_n$ whose quality is characterized by the quantities $\pi_n^*$ and $\pi_n$, respectively. Here $n$ is the number of observations $X_1, \dots, X_n$ delivering the information underlying the decision rules. Assume that the rule $D_n^*$ is in some sense optimal whereas the rule $D_n$ is competing. For example, in estimation problems $\pi_n^*$ and $\pi_n$ are usually mean square deviations and $\pi_n^* \le \pi_n$. In hypothesis testing problems $\pi_n^*$ and $\pi_n$ are usually powers of tests, so that $\pi_n^* \ge \pi_n$.

By $m(n)$ denote the number of observations required for the decision rule $D_{m(n)}$ based on $m(n)$ observations $X_1, \dots, X_{m(n)}$ to attain the same quality as the 'best' rule $D_n^*$ based on $n$ observations $X_1, \dots, X_n$. In what follows we will keep to the asymptotic approach, assuming that $n \to \infty$. Following [7], by the asymptotic relative efficiency (a.r.e.) of the rule $D_n$ with respect to the rule $D_n^*$ we will mean the limit

$$e \equiv \lim_{n \to \infty} \frac{n}{m(n)}$$

(if it exists and does not depend on the sequence *m*(*n*)).

Instead of the ratio of the required numbers of observations, the difference $m(n) - n$ can be considered as well, vividly showing the additional number of observations required by the decision rule $D_n$. However, many authors considered the ratio $n/m(n)$, possibly because the asymptotic analysis of its properties is simpler.

The systematic analysis of the asymptotic behavior of the difference $m(n) - n$ was first carried out by Hodges and Lehmann in 1970 [2]. They suggested calling the difference $m(n) - n$ the *deficiency* of the competing decision rule $D_n$ with respect to the rule $D_n^*$ and introduced the notation

$$d\_n = m(n) - n.\tag{1}$$

If the limit $d = \lim_{n\to\infty} d_n$ exists, then it is called the *asymptotic deficiency* of the competing decision rule $D_n$ with respect to the rule $D_n^*$. The number $d$ is often simply called the *deficiency* of $D_n$ with respect to $D_n^*$. Note that if the a.r.e. $e \ne 1$, then $d = \pm\infty$, so that this case is not so interesting. In [2] it was also noticed that for some decision rules (statistical procedures) the case $e = 1$ typically occurs (see, e.g., the book [8]); in these cases the a.r.e. cannot answer the question of which rule is better, whereas the deficiency can clarify the matter because, generally speaking, in this case the asymptotic deficiency can be arbitrary.

So, the deficiency of $D_n$ with respect to $D_n^*$ shows how many additional observations (that is, how much extra information) are required to attain the desired quality if the decision rule $D_n$ is used instead of the 'optimal' decision rule $D_n^*$. Therefore, the notion of deficiency provides natural grounds for the asymptotic comparison of $D_n$ and $D_n^*$ in the case $e = 1$. The study of the asymptotic behavior of the deficiency $d_n$ requires more sophisticated techniques than those used to find the limit $e$. As a rule, these techniques employ the construction of asymptotic expansions (a.e.s) for the corresponding functions characterizing the quality of the decision rules (see, e.g., the books [7–9]).

Since the rules $D_n^*$ and $D_n$ have the quality characteristics $\pi_n^*$ and $\pi_n$, respectively, then, by the definition of the deficiency $d_n = m(n) - n$, for every $n$ we have

$$
\pi\_n^\* = \pi\_{m(n)}.\tag{2}
$$

To solve Equation (2), the integer-valued quantity $m(n)$ should be treated as a variable taking arbitrary real values. For this purpose the function $\pi_{m(n)}$ can be defined for non-integer $m(n)$ by the formula

$$
\pi\_{m(n)} = \left(1 - m(n) + [m(n)]\right)\pi\_{[m(n)]} + \left(m(n) - [m(n)]\right)\pi\_{[m(n)]+1}
$$

(see [2]).

The functions $\pi_n^*$ and $\pi_n$ are usually unknown, so, in practice, their approximations are used. Assume that the a.e.s

$$
\pi\_n^\* = \frac{a}{n^r} + \frac{b}{n^{r+s}} + o(n^{-r-s}),
\tag{3}
$$

and

$$
\pi\_n = \frac{a}{n^r} + \frac{c}{n^{r+s}} + o(n^{-r-s}),
\tag{4}
$$

hold, where $a$, $b$ and $c$ are numbers that do not depend on $n$, and $r > 0$ and $s > 0$ are constants determining the rate of decrease of these quality criteria in $n$. The first terms of these expansions coincide, which means that the a.r.e. of the corresponding rules equals one. It can easily be obtained from relations (1)–(4) that

$$d\_n = \frac{c - b}{ra} n^{1 - s} + o(n^{1 - s}) \tag{5}$$

(see [2] or [7]). Thus, the asymptotic deficiency has the form

$$d = \begin{cases} \pm \infty, & 0 < s < 1, \\ \frac{c - b}{ra}, & s = 1, \\ 0, & s > 1. \end{cases} \tag{6}$$

The asymptotic deficiency possesses the following obvious transitivity property: if there is a third decision rule $\overline{D}_n$ with quality characteristic $\overline{\pi}_n$ admitting an a.e. of the form (4), then the deficiency $\overline{d}_n$ of the rule $\overline{D}_n$ with respect to the rule $D_n^*$ satisfies the equality

$$
\overline{d}_n = \widetilde{d}_n + d_n,
$$

where $\widetilde{d}_n$ is the deficiency of the rule $\overline{D}_n$ with respect to $D_n$ and $d_n$ is the deficiency of $D_n$ with respect to $D_n^*$.

The case $s = 1$ is the most interesting, because in this case the asymptotic deficiency is finite. In the paper [2] some simple examples are given illustrating that this case is quite natural in mathematical statistics (see also the book [8]).
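As a numerical illustration of (1)–(6), the sketch below (with illustrative constants $a$, $b$, $c$, $r$, $s$ and the $o(\cdot)$ terms set to zero) builds the quality sequences (3) and (4), extends $\pi_m$ to non-integer $m$ by the Hodges–Lehmann interpolation above, solves Equation (2) for $m(n)$ by bisection, and compares $d_n = m(n) - n$ with the limit $(c - b)/(ra)$ predicted by (5) and (6) for $s = 1$:

```python
a, b, c, r, s = 1.0, 0.5, 2.0, 1.0, 1.0  # illustrative constants

def pi_star(n: float) -> float:
    # Quality of the 'optimal' rule: a.e. (3) with the o(.) term dropped.
    return a / n**r + b / n**(r + s)

def pi_plain(n: int) -> float:
    # Quality of the competing rule: a.e. (4) with the o(.) term dropped.
    return a / n**r + c / n**(r + s)

def pi_interp(m: float) -> float:
    # Hodges-Lehmann linear interpolation of pi_m to non-integer m.
    k = int(m)
    return (1 - m + k) * pi_plain(k) + (m - k) * pi_plain(k + 1)

def m_of_n(n: int) -> float:
    # Solve pi*_n = pi_{m(n)} (Equation (2)) by bisection: pi_interp is
    # strictly decreasing in m, and pi_interp(n) > pi*_n since c > b.
    target = pi_star(n)
    lo, hi = float(n), 10.0 * n
    for _ in range(200):
        mid = (lo + hi) / 2
        if pi_interp(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 10_000
d_n = m_of_n(n) - n
d_limit = (c - b) / (r * a)  # formula (5)-(6) with s = 1
print(d_n, d_limit)
```

With these constants the predicted asymptotic deficiency is $(c - b)/(ra) = 1.5$, and the numerically solved $d_n$ at $n = 10{,}000$ is already very close to it.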

#### **2. Asymptotic Expansions for the Distributions of Normalized Sums of Random Variables**

We begin with the most general case. Let $n \in \mathbb{N}$. Consider a finite set of r.v.s $X_1, \dots, X_n$. For the time being we do not assume that the r.v.s $X_1, \dots, X_n$ are independent and identically distributed. Let $L_n = L_n(X_1, \dots, X_n)$ be an $\mathbb{R}$-valued measurable function of $X_1, \dots, X_n$. (In what follows, when dealing with the example of the portfolio of an insurance company, we will call this function the *generalized loss*.) In particular, $L_n$ may be of the form $L_n = \sqrt{n}\,T_n$, where $T_n$ is the arithmetic mean,

$$T\_n \equiv \frac{1}{n} \sum\_{i=1}^n X\_i. \tag{7}$$

As has already been said, the problem consists in the description of the distribution of the r.v.s $X_i$ providing the least possible number of summands $n$ and guaranteeing the prescribed value of the $(1 - \alpha)$-quantile of the function $L_n$ for a given $\alpha \in (0, 1)$.

Let *α* ∈ (0, 1) be a small number. Consider the quantity *cα*(*n*) defined by the asymptotic relation

$$\mathbb{P}\left(L_n \ge c_\alpha(n)\right) = \alpha + o(n^{-1}), \ n \to \infty. \tag{8}$$

The quantity $c_\alpha(n)$ is the asymptotic $(1 - \alpha)$-quantile of $L_n$. If $L_n = \sqrt{n}\,T_n$, then $c_\alpha(n)$ can be interpreted as the threshold whose exceedance by $L_n$ is undesirable and is assumed to have the prescribed small probability $\alpha$. In terms of an insurance company, $c_\alpha(n)$ is the asymptotic Value-at-Risk.

By applying the Taylor formula it is not difficult to obtain the following result.

**Lemma 1.** *Assume that there exist a distribution function $G(x)$ and functions $g_1(x)$ and $g_2(x)$ such that*

$$\sup_{x\in\mathbb{R}} \left| \mathbb{P}\left(L_n < x\right) - G(x) - \frac{1}{\sqrt{n}}\,g_1(x) - \frac{1}{n}\,g_2(x) \right| = o(n^{-1}),$$

*where the functions $G(x)$, $g_1(x)$ and $g_2(x)$ are smooth enough. Then the asymptotic $(1 - \alpha)$-quantile $c_\alpha(n)$ of $L_n$ admits the a.e.*

$$c_\alpha(n) = c_\alpha - \frac{g_1(c_\alpha)}{\sqrt{n}\,G'(c_\alpha)} - \frac{1}{n}\left[\frac{G''(c_\alpha)\,g_1^2(c_\alpha)}{2(G'(c_\alpha))^3} + \frac{G'(c_\alpha)\,g_2(c_\alpha) - g_1(c_\alpha)\,g_1'(c_\alpha)}{(G'(c_\alpha))^2}\right] + o(n^{-1}),$$

*where c<sup>α</sup> satisfies the equation G*(*cα*) = 1 − *α.*

Consider the application of this lemma to the case where *X*1, *X*2, ... are independent identically distributed r.v.s such that

$$\mathbb{E}X\_1 = 0, \; \mathbb{E}X\_1^2 = 1, \; \mathbb{E}|X\_1|^{k+\delta} < \infty, \; k \in \mathbb{N}, \; k \ge 3, \; \delta > 0 \tag{9}$$

and the function $L_n$ has the form $L_n = \sqrt{n}\,T_n$ with $T_n$ defined by (7). Here the condition $\mathbb{E}X_1 = 0$ means that the separate losses are centered by their expectations. Assume that the characteristic function $f(t)$ of the r.v. $X_1$ satisfies the Cramér condition $(C)$

$$\limsup\_{|t| \to \infty} |f(t)| < 1. \tag{10}$$

Under conditions (9) and (10), it follows from Theorem 6.3.2 of [10] (see also [9]) that there exist functions $Q_1(x), \dots, Q_{k-2}(x)$ and a constant $C_{k,\delta} \in (0, \infty)$ such that

$$\sup_{x} \left| \mathbb{P}\left(\sqrt{n}\,T_n < x\right) - \Phi(x) - \sum_{i=1}^{k-2} n^{-i/2} Q_i(x) \right| \le \frac{C_{k,\delta}}{n^{(k-2+\delta)/2}}, \ n \in \mathbb{N}. \tag{11}$$

For the definition of the functions $Q_1(x), \dots, Q_{k-2}(x)$ see the book [10]. In particular,

$$Q_1(x) = -(x^2 - 1)\,\varphi(x)\,\frac{\mathbb{E}X_1^3}{6},$$

$$Q_2(x) = -(x^3 - 3x)\,\varphi(x)\,\frac{\mathbb{E}X_1^4 - 3}{24} - (x^5 - 10x^3 + 15x)\,\varphi(x)\,\frac{(\mathbb{E}X_1^3)^2}{72}.\tag{12}$$
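As a sanity check of (11) and (12), the following Monte Carlo sketch takes centered standard exponential summands ($X_i = E_i - 1$ with $E_i$ exponential of rate 1, so $\mathbb{E}X_1 = 0$, $\mathbb{E}X_1^2 = 1$, $\mathbb{E}X_1^3 = 2$, $\mathbb{E}X_1^4 = 9$) and compares the two-term Edgeworth approximation with an empirical estimate of $\mathbb{P}(\sqrt{n}\,T_n < x)$; the choices of $n$, $x$ and the sample size are purely illustrative:

```python
import math
import random

# Moments of X_i = E_i - 1, E_i ~ Exp(1): E X1 = 0, E X1^2 = 1,
# E X1^3 = 2, E X1^4 = 9 (illustrative choice of summand law).
m3, m4 = 2.0, 9.0

def phi(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def Q1(x):
    return -(x**2 - 1) * phi(x) * m3 / 6

def Q2(x):
    return (-(x**3 - 3 * x) * phi(x) * (m4 - 3) / 24
            - (x**5 - 10 * x**3 + 15 * x) * phi(x) * m3**2 / 72)

n, x = 100, 1.5
edgeworth = Phi(x) + Q1(x) / math.sqrt(n) + Q2(x) / n

# The sum of n Exp(1) variables is Gamma(n, 1), so the normalized sum
# sqrt(n) T_n can be sampled directly as (Gamma(n, 1) - n) / sqrt(n).
random.seed(1)
reps = 200_000
hits = sum((random.gammavariate(n, 1.0) - n) / math.sqrt(n) < x
           for _ in range(reps))
empirical = hits / reps
print(edgeworth, empirical)
```

For these parameters both values come out near 0.928, so the $O(n^{-1})$ Edgeworth correction is already accurate at $n = 100$ despite the pronounced skewness of the summands.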

Relations (11) and (12) and Lemma 1 directly imply the a.e. for the asymptotic $(1 - \alpha)$-quantile $c_\alpha(n)$ of $L_n$ presented in the following lemma.

**Lemma 2.** *Let conditions (9) and (10) hold with $k = 4$, $\delta > 0$. Then the asymptotic $(1 - \alpha)$-quantile $c_\alpha(n)$ of $L_n$ admits the a.e.*

$$c_\alpha(n) = u_\alpha + \frac{\mathbb{E}X_1^3}{6\sqrt{n}}(u_\alpha^2 - 1) + \frac{1}{12n}\left[\frac{(\mathbb{E}X_1^3)^2}{3}(5u_\alpha - 2u_\alpha^3) + \frac{\mathbb{E}X_1^4 - 3}{2}(u_\alpha^3 - 3u_\alpha)\right] + o(n^{-1}),$$

*where u<sup>α</sup> is the* (1 − *α*)*-quantile of the standard normal distribution:* Φ(*uα*) = 1 − *α.*

#### **3. The Comparison of the Distributions of Two Normalized Sums of Random Variables**

*3.1. The Asymptotic Deficiency of the Distributions of Summands Providing a Given* (1 − *α*)*-Quantile of the Normalized Sums*

In this section we will present an approach to the comparison of the distributions of two sums of r.v.s in terms of the number of summands. The distribution of the random vector $(X_1, \dots, X_n)$ will be denoted by $\mathcal{L}(X_1, \dots, X_n)$. Consider an $\mathbb{R}$-valued measurable function $L_n = L_n(X_1, \dots, X_n)$ of $X_1, \dots, X_n$.

From Lemma 1 we can easily obtain the following result.

**Lemma 3.** *Consider a sequence $\{\varepsilon_n\}_{n\ge 1}$ such that $\varepsilon_n \to 0$ as $n \to \infty$. Under the conditions of Lemma 1 we have*

$$\sup_{x\in\mathbb{R}} \left| \mathbb{P}\left(L_n(X_1, \dots, X_n) < x + \varepsilon_n\right) - \mathbb{P}\left(L_n(X_1, \dots, X_n) < x\right) - \varepsilon_n G'(x) - \frac{\varepsilon_n^2}{2}\,G''(x) - \frac{\varepsilon_n}{\sqrt{n}}\,g_1'(x) \right| = o\left(\max\left\{\varepsilon_n^2,\ \frac{\varepsilon_n}{\sqrt{n}},\ n^{-1}\right\}\right).$$

Along with the r.v.s $X_1, \dots, X_n$ resulting in the value $L_n(X_1, \dots, X_n)$ of the function $L_n$, consider another set of r.v.s $Y_1, \dots, Y_n$, according to which the value of the function $L_n$ is $L_n(Y_1, \dots, Y_n)$. For example, $L_n(X_1, \dots, X_n)$ may have the form $L_n(X_1, \dots, X_n) = \sqrt{n}\,T_n$ with $T_n$ defined by (7), and $L_n(Y_1, \dots, Y_n)$ may have the form $L_n(Y_1, \dots, Y_n) = \sqrt{n}\,U_n$, where

$$
\mathcal{U}\_n = \frac{1}{n} \sum\_{i=1}^n \mathcal{Y}\_i. \tag{13}
$$

Let the asymptotic $(1 - \alpha)$-quantile $\overline{c}_\alpha(n)$ of $L_n$ correspond to the distribution $\mathcal{L}(Y_1, \dots, Y_n)$:

$$\mathbb{P}\left(L_n(Y_1, \dots, Y_n) \ge \overline{c}_\alpha(n)\right) = \alpha + o(n^{-1}), \ n \to \infty. \tag{14}$$

Assume that the a.e. for the distribution function of *Ln*(*Y*1,...,*Yn*) has the form

$$\mathbb{P}\left(L_n(Y_1, \dots, Y_n) < x\right) = G(x) + \frac{1}{\sqrt{n}}\,g_1(x) + \frac{1}{n}\,\overline{g}_2(x) + o(n^{-1}),\tag{15}$$

where the functions $G(x)$, $g_1(x)$ and $\overline{g}_2(x)$ are smooth enough. The a.e. (15) differs from the a.e. for the distribution function of $L_n(X_1, \dots, X_n)$ established by Lemma 1 only in the term of order $n^{-1}$, which means that the two distributions are rather close. Define the sequence of natural numbers $\{m(n)\}_{n \ge 1}$ by the equality

$$\mathbb{P}\left(L_{m(n)}(Y_1, \dots, Y_{m(n)}) \ge c_\alpha(m(n))\right) = \alpha + o(n^{-1}), \ n \to \infty. \tag{16}$$

If $m(n) - n = d + o(1)$, $d \in \mathbb{R}$, $n \to \infty$, then $d$ is the asymptotic deficiency of the distribution $\mathcal{L}(Y_1, \dots, Y_n)$ with respect to the distribution $\mathcal{L}(X_1, \dots, X_n)$. In other words, $d$ is the asymptotic number of 'additional' r.v.s to be included in the set $Y_1, \dots, Y_n$ in order that the distribution $\mathcal{L}(Y_1, \dots, Y_{m(n)})$ provide the same quality as the distribution $\mathcal{L}(X_1, \dots, X_n)$.

**Theorem 1.** *Assume that the conditions of Lemma 1 and (15) hold and $G'(c_\alpha)\,c_\alpha \ne 0$. Then the asymptotic deficiency $d$ of the distribution $\mathcal{L}(Y_1, \dots, Y_n)$ with respect to the distribution $\mathcal{L}(X_1, \dots, X_n)$ has the form*

$$d = \frac{2\left[g\_2(c\_\alpha) - \overline{g}\_2(c\_\alpha)\right]}{G'(c\_\alpha)c\_\alpha} + o(1).$$

**Proof.** From Lemma 1 and condition (15) it directly follows that

$$\overline{c}\_{\alpha}(n) = c\_{\alpha} - \frac{g\_{1}(c\_{\alpha})}{\sqrt{n}\,G'(c\_{\alpha})} - \frac{1}{n} \left[ \frac{G''(c\_{\alpha})g\_{1}^{2}(c\_{\alpha})}{2(G'(c\_{\alpha}))^{3}} + \frac{G'(c\_{\alpha})\overline{g}\_{2}(c\_{\alpha}) - g\_{1}(c\_{\alpha})g\_{1}'(c\_{\alpha})}{(G'(c\_{\alpha}))^{2}} \right] + o(n^{-1})\tag{17}$$

and therefore

$$\varepsilon\_n \equiv \sqrt{\frac{m(n)}{n}}\,\overline{c}\_{\alpha}(m(n)) - c\_{\alpha}(m(n)) = \frac{d}{2n}\,c\_{\alpha} - \frac{1}{n}\,\frac{g\_2(c\_{\alpha}) - \overline{g}\_2(c\_{\alpha})}{G'(c\_{\alpha})} + o(n^{-1}).\tag{18}$$

Further, taking into account the definitions of *m*(*n*) (see (16)) and $\varepsilon_n$, we have

$$\alpha + o(n^{-1}) = \mathbb{P}\left(L\_{m(n)}(Y\_1, \dots, Y\_{m(n)}) \ge \overline{c}\_\alpha(m(n))\right) = \mathbb{P}\left(L\_{m(n)}(Y\_1, \dots, Y\_{m(n)}) \ge \sqrt{\frac{n}{m(n)}} \left(c\_\alpha(m(n)) + \varepsilon\_n\right)\right). \tag{19}$$

Applying Lemma 3 to the right-hand side of (19) we obtain

$$
\alpha + o(n^{-1}) = \mathbb{P}\left(L\_{m(n)}(Y\_1, \dots, Y\_{m(n)}) \ge c\_{\alpha}(m(n))\right) - \varepsilon\_n G'(c\_{\alpha}) + o(n^{-1}).
$$

Now from (16) and (18) it follows that

$$d = \frac{2\left[g\_2(c\_\alpha) - \overline{g}\_2(c\_\alpha)\right]}{G'(c\_\alpha)c\_\alpha} + o(1).$$

The theorem is proved.

Now consider an example of the application of Theorem 1 to the optimization of the portfolio size of an insurance company. Let the possible losses *X*1, *X*2, ... related to each insurance contract in the portfolio be independent identically distributed r.v.s satisfying conditions (9) and (10). Consider another distribution, under which the possible losses *Y*1, *Y*2, . . . are assumed to be independent identically distributed r.v.s such that

$$\mathbb{E}Y\_1 = 0, \quad \mathbb{E}Y\_1^2 = 1, \quad \mathbb{E}|Y\_1|^{4+\delta} < \infty, \ \delta > 0. \tag{20}$$

Assume that the characteristic function *p*(*t*) of the r.v. *Y*1 satisfies the Cramér (*C*) condition

$$\limsup\_{|t|\to\infty} |p(t)| < 1. \tag{21}$$

For each *n* consider the average losses *Un* defined by (13). Assume that

$$\mathbb{E}X\_1^3 = \mathbb{E}Y\_1^3 \tag{22}$$

(for example, the r.v.s *Xi* and *Yi* are centered by their expectations and the distributions of these centered r.v.s are symmetric). From Lemma 2 and Theorem 1 we directly obtain the following statement.

**Lemma 4.** *Let conditions (9), (10) and (20)–(22) hold. Then the asymptotic* (*as n* → ∞) *deficiency of the distribution* L(*Y*1, ... ,*Yn*) *with respect to the distribution* L(*X*1, ... , *Xn*) (*the 'additional number of contracts'*) *d has the form*

$$d = \frac{\left(\mathbb{E}X\_1^4 - \mathbb{E}Y\_1^4\right)\left(3 - u\_\alpha^2\right)}{12} + o(1).$$

Lemma 4 illustrates that if the distributions are close (in particular, their first three moments coincide), then the deficiency is determined by the difference of the fourth moments, that is, by the kurtosis.
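As a quick numerical sketch (not from the paper), the leading term of Lemma 4 can be evaluated directly; the moment values and the level α below are illustrative assumptions.

```python
from statistics import NormalDist

def deficiency(ex4: float, ey4: float, alpha: float) -> float:
    """Leading term of Lemma 4: d ~ (EX1^4 - EY1^4)(3 - u_alpha^2)/12,
    where u_alpha is the (1 - alpha)-quantile of the standard normal law."""
    u = NormalDist().inv_cdf(1.0 - alpha)
    return (ex4 - ey4) * (3.0 - u * u) / 12.0

# Equal fourth moments: the expansions coincide up to o(1/n), so d vanishes.
print(deficiency(3.0, 3.0, 0.05))   # 0.0
# Illustrative heavier-tailed Y (EY1^4 = 6 > EX1^4 = 3); the sign of d also
# depends on alpha through the factor 3 - u_alpha^2.
print(deficiency(3.0, 6.0, 0.05))
```

Note that the factor 3 − *u*²*α* changes sign at *uα* = √3, i.e., at α ≈ 0.042, so the sign of the deficiency depends on the chosen level.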

*3.2. The Asymptotic Deficiency of the Distributions of Summands Providing a Given Probability for the Normalized Sum to Fall into a Given Interval*

To begin with, in this section we again consider the values of a measurable ℝ-valued function, *Ln*(*X*1, ... , *Xn*) and *Ln*(*Y*1, ... ,*Yn*), on random vectors (*X*1, ... , *Xn*) and (*Y*1, ... ,*Yn*) with the distributions L(*X*1, ... , *Xn*) and L(*Y*1, ... ,*Yn*), respectively. The goal is to provide that the value of *Ln* falls into the interval [*S*1, *S*2) for some given numbers *S*1 < *S*2. As a quality characteristic, consider the probabilities

$$\pi\_n = \mathbb{P}\left(S\_1 \le L\_n(X\_1, \dots, X\_n) < S\_2\right), \quad \overline{\pi}\_n = \mathbb{P}\left(S\_1 \le L\_n(Y\_1, \dots, Y\_n) < S\_2\right). \tag{23}$$

If *Ln*(*X*1, ... , *Xn*) = √*n Tn* (see (7)) and *Ln*(*Y*1, ... ,*Yn*) = √*n Un* (see (13)), that is, normalized sums of r.v.s are considered, then relation (23) means that *πn* and $\overline{\pi}_n$ are the probabilities that the normalized sums of r.v.s fall inside the interval [*S*1, *S*2).

From the definitions of *πn* and $\overline{\pi}_n$ we directly obtain the following result.

**Lemma 5.** *Assume that for some r* > 0 *and s* > 0 *there exist a distribution function H*(*x*) *and functions h*1(*x*)*, h*2(*x*) *and* $\overline{h}_2(x)$ *such that*

$$\sup\_{x \in \mathbb{R}} \left| \mathbb{P} \left( L\_n(X\_1, \dots, X\_n) < x \right) - H(x) - \frac{1}{n^r} h\_1(x) - \frac{1}{n^{r+s}} h\_2(x) \right| = o(n^{-r-s}),$$

$$\sup\_{x \in \mathbb{R}} \left| \mathbb{P} \left( L\_n(Y\_1, \dots, Y\_n) < x \right) - H(x) - \frac{1}{n^r} h\_1(x) - \frac{1}{n^{r+s}} \overline{h}\_2(x) \right| = o(n^{-r-s}),$$

*and, moreover, the functions h*1(*x*)*, h*2(*x*) *and* $\overline{h}_2(x)$ *are measurable. Then πn and* $\overline{\pi}_n$ *admit the a.e.s*

$$\begin{aligned} \pi\_{n} &= H(S\_2) - H(S\_1) + \frac{h\_1(S\_2) - h\_1(S\_1)}{n^r} + \frac{h\_2(S\_2) - h\_2(S\_1)}{n^{r+s}} + o(n^{-r-s}),\\ \overline{\pi}\_{n} &= H(S\_2) - H(S\_1) + \frac{h\_1(S\_2) - h\_1(S\_1)}{n^r} + \frac{\overline{h}\_2(S\_2) - \overline{h}\_2(S\_1)}{n^{r+s}} + o(n^{-r-s}).\end{aligned}$$

**Corollary 1.** *Let* $\varepsilon_n \downarrow 0$ *as n* → ∞ *and* $S_2 = S_1 + \varepsilon_n$*. Assume that the functions H*(*x*)*, h*1(*x*)*, h*2(*x*) *and* $\overline{h}_2(x)$ *are smooth enough and h*1(*S*2) = *h*1(*S*1)*. Then*

$$\begin{aligned} \varepsilon\_n^{-1}\pi\_n &= H'(S\_1) + \frac{\varepsilon\_n}{2}H''(S\_1) + \frac{\varepsilon\_n^2}{6}H'''(S\_1) + o(\varepsilon\_n^2) + \\ &+ \frac{1}{n^r}h\_1'(S\_1) + \frac{\varepsilon\_n}{2n^r}h\_1''(S\_1) + o(\varepsilon\_n n^{-r}) + \frac{1}{n^{r+s}}h\_2'(S\_1) + o(n^{-r-s}\varepsilon\_n^{-1}), \end{aligned}$$

$$\begin{aligned} \varepsilon\_n^{-1}\overline{\pi}\_n &= H'(S\_1) + \frac{\varepsilon\_n}{2}H''(S\_1) + \frac{\varepsilon\_n^2}{6}H'''(S\_1) + o(\varepsilon\_n^2) + \\ &+ \frac{1}{n^r}h\_1'(S\_1) + \frac{\varepsilon\_n}{2n^r}h\_1''(S\_1) + o(\varepsilon\_n n^{-r}) + \frac{1}{n^{r+s}}\overline{h}\_2'(S\_1) + o(n^{-r-s}\varepsilon\_n^{-1}). \end{aligned}$$

Lemma 5, Corollary 1 and formula (6) directly imply the expression for the asymptotic deficiency with quality characteristics (23).

**Theorem 2.** *Let the conditions of Lemma 5 hold with s* = 1*. Then the deficiency dn of the distribution* L(*Y*1, ... ,*Yn*) *with the quality characteristic* $\overline{\pi}_n$ *with respect to the distribution* L(*X*1, ... , *Xn*) *with the quality characteristic πn has the form*

$$d\_n = \frac{\overline{h}\_2(S\_2) - h\_2(S\_2) + h\_2(S\_1) - \overline{h}\_2(S\_1)}{r\left(h\_1(S\_2) - h\_1(S\_1)\right)} + o(1). \tag{24}$$

If $S_2 = S_1 + \varepsilon_n$ with $\varepsilon_n \downarrow 0$ as *n* → ∞ and $h_1'(S_1) \ne 0$, then the formal passage to the limit in (24) yields the formula

$$d\_n = \frac{\overline{h}\_2'(S\_1) - h\_2'(S\_1)}{r\,h\_1'(S\_1)} + o(1).$$

Consider an example of the application of Theorem 2 to the optimization of the portfolio size of an insurance company. Let the possible losses *X*1, *X*2, ... related to each insurance contract in the portfolio be independent identically distributed r.v.s satisfying conditions (9) and (10). Consider another distribution, under which the possible losses *Y*1, *Y*2, ... are assumed to be independent identically distributed r.v.s satisfying conditions (20) and (21). Assume that in (9) and (20) *k* = 3. We are interested in the asymptotic behavior of the average losses *Tn* (see (7)) and *Un* (see (13)). Taking Lemma 5 into account, we obtain the following statement.

**Lemma 6.** *Let conditions* (9)*,* (10)*,* (20) *and* (21) *hold with k* = 3*. Then*

$$\mathbb{P}(\sqrt{n}\,T\_n < x) = \Phi(x) + \frac{Q\_1(x)}{\sqrt{n}} + \frac{Q\_2(x)}{n} + o(n^{-1}),$$

$$\mathbb{P}(\sqrt{n}\,U\_n < x) = \Phi(x) + \frac{\overline{Q}\_1(x)}{\sqrt{n}} + \frac{\overline{Q}\_2(x)}{n} + o(n^{-1}),$$

*uniformly in x* ∈ ℝ*,*

$$\begin{aligned} \pi\_n &= \Phi(S\_2) - \Phi(S\_1) + \frac{Q\_1(S\_2) - Q\_1(S\_1)}{\sqrt{n}} + \frac{Q\_2(S\_2) - Q\_2(S\_1)}{n} + o(n^{-1}), \\\\ \overline{\pi}\_n &= \Phi(S\_2) - \Phi(S\_1) + \frac{\overline{Q}\_1(S\_2) - \overline{Q}\_1(S\_1)}{\sqrt{n}} + \frac{\overline{Q}\_2(S\_2) - \overline{Q}\_2(S\_1)}{n} + o(n^{-1}), \end{aligned}$$

*where the functions Q*1(*x*) *and Q*2(*x*) *are defined in* (12)*,*

$$
\overline{Q}\_1(x) = -(x^2 - 1)\varphi(x)\frac{\mathbb{E}Y\_1^3}{6},
$$

$$
\overline{Q}\_2(x) = -(x^3 - 3x)\varphi(x)\frac{\mathbb{E}Y\_1^4 - 3}{24} - (x^5 - 10x^3 + 15x)\varphi(x)\frac{(\mathbb{E}Y\_1^3)^2}{72}.
$$

**Corollary 2.** *Let* $\varepsilon_n \downarrow 0$ *as n* → ∞ *and* $S_2 = S_1 + \varepsilon_n$*. Assume that the conditions of Lemma 6 hold. Then*

$$\begin{aligned} \varepsilon\_n^{-1}\pi\_n &= \varphi(S\_1) + \frac{\varepsilon\_n}{2}\varphi'(S\_1) + \frac{\varepsilon\_n^2}{6}\varphi''(S\_1) + o(\varepsilon\_n^2) + \\ &+ \frac{1}{\sqrt{n}}Q\_1'(S\_1) + \frac{\varepsilon\_n}{2\sqrt{n}}Q\_1''(S\_1) + o(\varepsilon\_n n^{-1/2}) + \frac{1}{n}Q\_2'(S\_1) + o(n^{-1}\varepsilon\_n^{-1}), \end{aligned}$$

$$\begin{aligned} \varepsilon\_n^{-1}\overline{\pi}\_n &= \varphi(S\_1) + \frac{\varepsilon\_n}{2}\varphi'(S\_1) + \frac{\varepsilon\_n^2}{6}\varphi''(S\_1) + o(\varepsilon\_n^2) + \\ &+ \frac{1}{\sqrt{n}}\overline{Q}\_1'(S\_1) + \frac{\varepsilon\_n}{2\sqrt{n}}\overline{Q}\_1''(S\_1) + o(\varepsilon\_n n^{-1/2}) + \frac{1}{n}\overline{Q}\_2'(S\_1) + o(n^{-1}\varepsilon\_n^{-1}). \end{aligned}$$

Theorem 2, Lemma 6 and formula (5) directly imply the following statement.

**Theorem 3.** *Let, in addition to the conditions of Lemma 6,* $\mathbb{E}X_1^3 = \mathbb{E}Y_1^3$*. Then the deficiency dn of the distribution* L(*Y*1, ... ,*Yn*) *with the quality characteristic* $\overline{\pi}_n$ *with respect to the distribution* L(*X*1, ... , *Xn*) *with the quality characteristic πn* (*the 'additional number of contracts'*) *has the form*

$$d\_n = 2\frac{\overline{Q}\_2(S\_2) - Q\_2(S\_2) + Q\_2(S\_1) - \overline{Q}\_2(S\_1)}{Q\_1(S\_2) - Q\_1(S\_1)} n^{1/2} + o(n^{1/2}).$$

Consider an example where the asymptotic deficiency is finite.

**Corollary 3.** *Let* $\varepsilon_n = \frac{1}{n}$ *and* $S_2 = S_1 + \frac{1}{n}$*,* $\mathbb{E}X_1^3 = \mathbb{E}Y_1^3 = 0$*. Then under the conditions of Lemma 6 we have*

$$\pi\_n = \frac{\varphi(S\_1)}{n} + \frac{\varphi'(S\_1) + 2Q\_2'(S\_1)}{n^2} + o(n^{-2}),$$

$$\overline{\pi}\_n = \frac{\varphi(S\_1)}{n} + \frac{\varphi'(S\_1) + 2\overline{Q}\_2'(S\_1)}{n^2} + o(n^{-2}).$$

*Moreover, the deficiency dn has the form*

$$d\_n = \frac{2\left(\overline{Q}\_2'(S\_1) - Q\_2'(S\_1)\right)}{\varphi(S\_1)} + o(1) = \frac{S\_1^4 - 6S\_1^2 + 3}{12}\left(\mathbb{E}Y\_1^4 - \mathbb{E}X\_1^4\right) + o(1).$$
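The two expressions for the deficiency in Corollary 3 can be cross-checked numerically. The sketch below (with illustrative moment values, not taken from the paper) compares the form based on $\overline{Q}_2' - Q_2'$, computed via a central finite difference, with the closed kurtosis form.

```python
import math

def phi(x: float) -> float:
    # Standard normal density.
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def d_closed(s1: float, ex4: float, ey4: float) -> float:
    """Closed form of Corollary 3: (S1^4 - 6 S1^2 + 3)(EY1^4 - EX1^4)/12."""
    return (s1**4 - 6.0 * s1**2 + 3.0) * (ey4 - ex4) / 12.0

def d_via_q2(s1: float, ex4: float, ey4: float, h: float = 1e-6) -> float:
    """First form of Corollary 3: 2 (Qbar2'(S1) - Q2'(S1)) / phi(S1).
    With EX1^3 = EY1^3 = 0, Qbar2(x) - Q2(x) = -(x^3 - 3x) phi(x) (EY1^4 - EX1^4)/24."""
    diff = lambda x: -(x**3 - 3.0 * x) * phi(x) * (ey4 - ex4) / 24.0
    dprime = (diff(s1 + h) - diff(s1 - h)) / (2.0 * h)  # central difference
    return 2.0 * dprime / phi(s1)

print(d_closed(1.0, 3.0, 6.0))   # -0.5
print(d_via_q2(1.0, 3.0, 6.0))   # ~ -0.5
```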

#### **4. Random Number of Summands**

*4.1. Asymptotic Expansions for the Asymptotic* (1 − *α*)*-Quantile of* ℝ*-Valued Measurable Functions of a Random Number of Random Variables*

In this section we consider the case where an additional randomization can be introduced into the problem, so that the number of summands in the sum can be considered random. This randomization need not be artificially induced; it may also occur naturally when the exact portfolio size is unknown beforehand and only some 'expected' number of summands is available as a parameter of the problem.

Let natural-valued r.v.s *N*1, *N*2, ... and r.v.s *X*1, *X*2, ... be defined on one and the same probability space (Ω, A, P). In what follows we will assume that *n* is the expected value of *Nn*,

$$\mathbb{E}N\_n = n.\tag{25}$$

Assume that for each *n* ≥ 1 the r.v. *Nn* is independent of the sequence *X*1, *X*2, .... As above, for each *n* ≥ 1, consider the value of an ℝ-valued measurable function *Ln* = *Ln*(*X*1, ... , *Xn*). For each *n* ≥ 1 consider the r.v. *LNn* defined as

$$L\_{N\_n}(\omega) \equiv L\_{N\_n(\omega)}(X\_1(\omega), \dots, X\_{N\_n(\omega)}(\omega)), \ \omega \in \Omega.$$

Below we will assume that the following condition holds.

**Condition A.** *There exist* $k \in \mathbb{N}\setminus\{1\}$*,* $\alpha_{i,n} \in \mathbb{R}$*, i* = 1, ... , *k,* $\beta_n > 0$*,* $C_k > 0$*, a differentiable distribution function G*(*x*) *and measurable functions gj*(*x*)*, j* = 1, . . . , *k such that*

$$
\beta\_n \to 0, \max\_{1 \le i \le k} |\alpha\_{i,n}| \to 0
$$

*as n* → ∞ *and*

$$\sup\_{x} \left| \mathbb{P} (L\_n < x) - G(x) - \sum\_{i=1}^k \alpha\_{i,n}\, g\_i(x) \right| \le C\_k \beta\_{n}, \quad n \in \mathbb{N}.$$

**Lemma 7.** *Let the function Ln* = *Ln*(*X*1,..., *Xn*) *satisfy Condition A. Then*

$$\sup\_{x} \left| \mathbb{P} \left( L\_{N\_n} < x \right) - G(x) - \sum\_{i=1}^k g\_i(x)\, \mathbb{E}\alpha\_{i, N\_n} \right| \le C\_k\, \mathbb{E}\beta\_{N\_n}.$$

The elementary proof of this lemma directly follows from the formula of total probability. Consider an example of the application of Lemma 7. Let *X*1, *X*2, ... be independent identically distributed r.v.s satisfying conditions (9) and (10). Assume that the function *Ln* is the normalized arithmetic mean (or, which is the same, the normalized sum) *Ln* = √*n Tn* with *Tn* defined in (7). Then, in accordance with what has been said in Section 2, relation (11) holds, implying the validity of Condition A. From (11), playing the role of Condition A, and Lemma 7 we obtain the following statement.

**Lemma 8.** *Assume that Ln* = √*n Tn with Tn defined in (7) and conditions (9) and (10) hold. Then*

$$\sup\_{x} \left| \mathbb{P} \left( \sqrt{N\_n}\, T\_{N\_n} < x \right) - \Phi(x) - \sum\_{i=1}^{k-2} Q\_i(x)\, \mathbb{E} N\_n^{-i/2} \right| \le C\_{k, \delta}\, \mathbb{E} N\_n^{-(k-2+\delta)/2},$$

*where the functions Qi*(*x*) *are defined in Theorem 6.3.2 of [10].*

Relation (11) and Lemma 8 imply the following statement.

**Lemma 9.** *Let conditions (9) and (10) hold with k* = 4 *and δ* > 0*. Assume that condition (25) holds and*

$$\mathbb{E}N\_n^{-1/2} = \frac{1}{\sqrt{n}} + \frac{a}{n} + o(n^{-1}), \ a \in \mathbb{R},$$

$$\mathbb{E}N\_n^{-1} = \frac{b}{n} + o(n^{-1}), \ \mathbb{E}N\_n^{-(2+\delta)/2} = o(n^{-1}), \ b \in \mathbb{R}.$$

*Then*

$$\sup\_{x} \left| \mathbb{P} \left( \sqrt{n} T\_n < x \right) - \Phi(x) - \frac{Q\_1(x)}{\sqrt{n}} - \frac{Q\_2(x)}{n} \right| = o(n^{-1}),$$

*and*

$$\sup\_{x} \left| \mathbb{P} \left( \sqrt{N\_n} T\_{N\_n} < x \right) - \Phi(x) - \frac{Q\_1(x)}{\sqrt{n}} - \frac{bQ\_2(x) + aQ\_1(x)}{n} \right| = o(n^{-1}).$$

We will use Lemma 9 in order to determine the asymptotic (1 − *α*)-quantile of *Ln* and calculate the asymptotic deficiency.

Recall that, for *α* ∈ (0, 1), the asymptotic (1 − *α*)-quantile of *Ln* is the quantity *cα*(*n*) satisfying the asymptotic equality

$$\mathbb{P}\left(L\_n \ge c\_{\alpha}(n)\right) = \alpha + o(n^{-1}), \ n \to \infty. \tag{26}$$

Correspondingly, we define the asymptotic (1 − *α*)-quantile *c*˜*α*(*n*) of *LNn* by the equation

$$\mathbb{P}\left(L\_{N\_n} \ge \tilde{c}\_{\alpha}(n)\right) = \alpha + o(n^{-1}), \ n \to \infty. \tag{27}$$

From Lemmas 1 and 9 we directly obtain the a.e.s for these asymptotic (1 − *α*)-quantiles.

**Lemma 10.** *Under the conditions of Lemma 9, we have*

$$c\_{\alpha}(n) = u\_{\alpha} + \frac{\mathbb{E}X\_1^3}{6\sqrt{n}}(u\_{\alpha}^2 - 1) + \frac{1}{12n} \left[ \frac{\mathbb{E}^2 X\_1^3}{3}(5u\_{\alpha} - 2u\_{\alpha}^3) + \frac{\mathbb{E}X\_1^4 - 3}{2}(u\_{\alpha}^3 - 3u\_{\alpha}) \right] + o(n^{-1}),$$

$$\begin{split} \tilde{c}\_{\alpha}(n) &= u\_{\alpha} + \frac{\mathbb{E}X\_1^3}{6\sqrt{n}}(u\_{\alpha}^2 - 1) + \\ &+ \frac{1}{12n} \left[ \frac{\mathbb{E}^2 X\_1^3}{3}(5u\_{\alpha} - 2u\_{\alpha}^3) + \frac{b(\mathbb{E}X\_1^4 - 3)}{2}(u\_{\alpha}^3 - 3u\_{\alpha}) + 2a\mathbb{E}X\_1^3(u\_{\alpha}^2 - 1) \right] + o(n^{-1}), \end{split}$$

*where u<sup>α</sup> satisfies the equation* Φ(*uα*) = 1 − *α.*

Now define the sequence *m*(*n*) of natural numbers by the relation

$$\mathbb{P}\left(\sqrt{n}\, L\_{N\_{m(n)}} \ge \sqrt{m(n)}\, c\_{\alpha}(m(n))\right) = \alpha + o(n^{-1}), \ n \to \infty. \tag{28}$$

If

$$m(n) = n + d + o(1),\tag{29}$$

*n* = 1, 2, ..., then *d* can be interpreted as the expected additional number of summands to be included in the sum in order that the function *LNn* exceeds *cα*(*n*) with the same probability as under a non-random number *n* of summands. The quantity *d* will be called the *asymptotic deficiency*.

In the same way that Theorem 1 was proved, we can establish the following statement.

**Theorem 4.** *Assume that*

$$\begin{aligned} \mathbb{E}N\_n &= n, \; \mathbb{E}N\_n^{-1/2} = \frac{1}{\sqrt{n}} + \frac{a}{n} + o(n^{-1}), \; a \in \mathbb{R}, \\\\ \mathbb{E}N\_n^{-1} &= \frac{b}{n} + o(n^{-1}), \; \mathbb{E}N\_n^{-(2+\delta)/2} = o(n^{-1}), \; b \in \mathbb{R}, \end{aligned}$$

*and there exist δ* > 0*, a differentiable distribution function G*(*x*) *and measurable functions g*1(*x*) *and g*2(*x*) *such that*

$$\sup\_{x} \left| \mathbb{P} \left( L\_n < x \right) - G(x) - \frac{g\_1(x)}{\sqrt{n}} - \frac{g\_2(x)}{n} \right| \le \frac{C}{n^{(2+\delta)/2}}$$

*and* $G'(c_\alpha)c_\alpha \ne 0$*. Then the expected number d of additional summands (see* (28) *and* (29)*) in the normalized random sum LNn with respect to the normalized sum Ln has the form*

$$d = \frac{2\left[g\_2(c\_\alpha)(1 - b) - ag\_1(c\_\alpha)\right]}{G'(c\_\alpha)c\_\alpha} + o(1),$$

*where c<sup>α</sup> satisfies the equation G*(*cα*) = 1 − *α.*

Theorem 4 implies the following statement.

**Corollary 4.** *Under the conditions of Lemma 9 the expected additional number of summands d (see (28) and (29)) corresponding to the normalized sum* √*Nn TNn with a random number of summands with respect to the normalized sum* √*n Tn has the form*

$$d = \frac{2\left((1-b)Q\_2(u\_\alpha) - aQ\_1(u\_\alpha)\right)}{\varphi(u\_\alpha)\,u\_\alpha} + o(1).$$

*If additionally* $\mathbb{E}X_1^3 = 0$*, then*

$$d = \frac{(1 - b)\left(3 - u\_\alpha^2\right)\left(\mathbb{E}X\_1^4 - 3\right)}{12} + o(1).$$
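As a sanity check (with illustrative parameter values, not taken from the paper), the two forms of *d* in Corollary 4 can be compared numerically in the symmetric case $\mathbb{E}X_1^3 = 0$, where *Q*1 ≡ 0 and only the *Q*2 term survives.

```python
import math

def phi(x: float) -> float:
    # Standard normal density.
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def q2(x: float, ex4: float) -> float:
    """Q2(x) in the case EX1^3 = 0: -(x^3 - 3x) phi(x) (EX1^4 - 3)/24."""
    return -(x**3 - 3.0 * x) * phi(x) * (ex4 - 3.0) / 24.0

def d_general(u: float, b: float, ex4: float) -> float:
    """d = 2 (1 - b) Q2(u) / (phi(u) u); the a*Q1 term vanishes since Q1 = 0 here."""
    return 2.0 * (1.0 - b) * q2(u, ex4) / (phi(u) * u)

def d_reduced(u: float, b: float, ex4: float) -> float:
    """d = (1 - b)(3 - u^2)(EX1^4 - 3)/12."""
    return (1.0 - b) * (3.0 - u * u) * (ex4 - 3.0) / 12.0

u, b, ex4 = 1.6448536269514722, 0.9, 4.0   # illustrative values; u ~ 95% normal quantile
print(d_general(u, b, ex4), d_reduced(u, b, ex4))
```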

#### *4.2. An Example of Three-Point Distribution of the Number of Summands*

In this section, keeping to the terminology of the example related to optimization of the portfolio size of an insurance company, we will use Corollary 4 to obtain a.e.s for the asymptotic Value-at-Risk (the asymptotic (1 − *α*)-quantile of the normalized average loss, or the asymptotic normalized *α*-reserve) in the case where the portfolio size *Nn* has a special distribution concentrated at three points, symmetric around the central point.

Assume that the portfolio size *Nn* has the distribution of the form

$$\mathbb{P}(N\_n = n - h\_n) = \mathbb{P}(N\_n = n) = \mathbb{P}(N\_n = n + h\_n) = \frac{1}{3},\tag{30}$$

where *hn* ∈ ℕ, *hn* < *n*, *n* = 1, 2, . . ., and

$$\lim\_{n \to \infty} \frac{h\_n}{n} = 0.\tag{31}$$

**Lemma 11.** *Let the random portfolio size Nn have distribution (30) and let condition (31) hold. Then* E*Nn* = *n and, as n* → ∞*,*

$$\mathbb{E}N\_n^{-1/2} = \frac{1}{\sqrt{n}} - \frac{1}{4\sqrt{n}} \left(\frac{h\_n}{n}\right)^2 + O\left(\frac{1}{\sqrt{n}} \left(\frac{h\_n}{n}\right)^3\right),$$

$$\mathbb{E}N\_n^{-1} = \frac{1}{n} + \frac{2}{3n} \left(\frac{h\_n}{n}\right)^2 + O\left(\frac{1}{n} \left(\frac{h\_n}{n}\right)^4\right), \quad \mathbb{E}N\_n^{-3/2} = \frac{1}{n^{3/2}} + O\left(\frac{1}{n^{3/2}} \left(\frac{h\_n}{n}\right)^2\right).$$

**Proof.** The desired statements follow from the relations

$$\mathbb{E}N\_n^{-1} = \frac{3n^2 - h\_n^2}{3n(n^2 - h\_n^2)} = \frac{1}{n} \left(1 - \frac{h\_n^2}{3n^2}\right) \left(1 + \frac{h\_n^2}{n^2} + O\left(\frac{h\_n^4}{n^4}\right)\right) = \frac{1}{n} + \frac{2}{3n} \left(\frac{h\_n}{n}\right)^2 + O\left(\frac{1}{n} \left(\frac{h\_n}{n}\right)^4\right),$$

$$\mathbb{E}N\_n^{-3/2} = \frac{1}{3n^{3/2}} \left(\frac{1}{(1 - h\_n/n)^{3/2}} + 1 + \frac{1}{(1 + h\_n/n)^{3/2}}\right) = \frac{1}{n^{3/2}} + O\left(\frac{1}{n^{3/2}} \left(\frac{h\_n}{n}\right)^2\right).$$


The formula for $\mathbb{E}N_n^{-1/2}$ is established in a similar way.
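The expansions of Lemma 11 are easy to confirm numerically for the three-point law (30); the sizes *n* and *hn* below are illustrative.

```python
def neg_moments(n: int, h: int):
    """Exact E N^{-1} and E N^{-3/2} for P(N = n - h) = P(N = n) = P(N = n + h) = 1/3."""
    inv1 = (1.0 / (n - h) + 1.0 / n + 1.0 / (n + h)) / 3.0
    inv32 = ((n - h) ** -1.5 + n ** -1.5 + (n + h) ** -1.5) / 3.0
    return inv1, inv32

n, h = 10_000, 100                       # so that h/n = 0.01
inv1, inv32 = neg_moments(n, h)
approx1 = 1.0 / n + (2.0 / (3.0 * n)) * (h / n) ** 2    # Lemma 11, E N^{-1}
print(inv1 - approx1)                    # remainder of order (h/n)^4 / n
print(inv32 - n ** -1.5)                 # remainder of order (h/n)^2 / n^{3/2}
```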

Lemmas 10 and 11 imply the following statement.

**Theorem 5.** *Assume that the normalized average loss has the form Ln* = √*n Tn with Tn defined in (7). Let the r.v. Nn be distributed according to (30) and condition (31) hold. Under the conditions of Lemma 9, for the asymptotic *α*-reserve c*˜*α*(*n*) *corresponding to the normalized average loss* √*Nn TNn there holds the relation*

$$
\tilde{c}\_{\alpha}(n) = c\_{\alpha}(n) - \frac{\mathbb{E}X\_1^3\,(u\_{\alpha}^2 - 1)}{24\sqrt{n}} \left(\frac{h\_n}{n}\right)^2 + o(n^{-1}), \ n \to \infty.
$$

**Remark 1.** *In addition to the conditions of Theorem 5, let*

$$h\_n = \gamma n^{\beta} + o(n^{\beta}), \ \gamma \ge 0, \ 0 \le \beta < 1.$$

*Then, as n* → ∞*,*

$$n^{5/2 - 2\beta} \left( c\_\alpha(n) - \tilde{c}\_\alpha(n) \right) \to \frac{\gamma^2}{24} \mathbb{E}X\_1^3 (u\_\alpha^2 - 1).$$

Applying Lemma 9, by simple calculations we obtain the following statement.

**Lemma 12.** *Assume that conditions (9) and (10) hold with k* = 4 *and* 0 < *δ* ≤ 1*. Let conditions (30) and (31) hold. Then*

$$\sup\_{x} \left| \mathbb{P} \left( \sqrt{N\_n}\, T\_{N\_n} < x \right) - \Phi(x) - \left( 1 - \frac{h\_n^2}{4n^2} \right) \frac{Q\_1(x)}{\sqrt{n}} - \left( 1 + \frac{2h\_n^2}{3n^2} \right) \frac{Q\_2(x)}{n} \right| = O \left( \frac{h\_n^{(4+2\delta)/3}}{n^{7(2+\delta)/6}} \right).$$

**Corollary 5.** *Let the conditions of Lemma 12 hold and* $h_n = n^{3/4}$*. Then*

$$\sup\_{x\in\mathbb{R}} \left| \mathbb{P} \left( \sqrt{N\_n}\, T\_{N\_n} < x \right) - \Phi(x) - \frac{1}{\sqrt{n}} Q\_1(x) - \frac{1}{n} \left( Q\_2(x) - \frac{1}{4} Q\_1(x) \right) \right| = o(n^{-1}).$$

Relations (12), Lemmas 10 and 11 yield the following theorem.

**Theorem 6.** *Let the conditions of Corollary 5 hold. Then the asymptotic α-reserves cα*(*n*) *and c*˜*α*(*n*) *related to the normalized average losses* √*n Tn and* √*Nn TNn have the form*

$$c\_{\alpha}(n) = u\_{\alpha} + \frac{\mathbb{E}X\_{1}^{3}}{6\sqrt{n}}(u\_{\alpha}^{2} - 1) + \frac{1}{12n} \left[ \frac{\mathbb{E}^{2}X\_{1}^{3}}{3}(5u\_{\alpha} - 2u\_{\alpha}^{3}) + \frac{\mathbb{E}X\_{1}^{4} - 3}{2}(u\_{\alpha}^{3} - 3u\_{\alpha}) \right] + o(n^{-1}),$$

$$\begin{split} \tilde{c}\_{\alpha}(n) &= u\_{\alpha} + \frac{\mathbb{E}X\_{1}^{3}}{6\sqrt{n}}(u\_{\alpha}^{2} - 1) + \\ &+ \frac{1}{12n} \left[ \frac{\mathbb{E}^{2}X\_{1}^{3}}{3}(5u\_{\alpha} - 2u\_{\alpha}^{3}) + \frac{\mathbb{E}X\_{1}^{4} - 3}{2}(u\_{\alpha}^{3} - 3u\_{\alpha}) - \frac{1}{2}\mathbb{E}X\_{1}^{3}(u\_{\alpha}^{2} - 1) \right] + o(n^{-1}), \end{split}$$

*where u<sup>α</sup> satisfies the equation* Φ(*uα*) = 1 − *α. The corresponding expected additional number d of contracts has the form*

$$d = \frac{Q\_1(u\_\alpha)}{2\varphi(u\_\alpha)u\_\alpha} + o(1) = \frac{\left(1 - u\_\alpha^2\right) \mathbb{E}X\_1^3}{12u\_\alpha} + o(1).$$
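The two forms of *d* in Theorem 6 agree identically, as the following quick numeric sketch confirms; the level α and the value of $\mathbb{E}X_1^3$ are illustrative assumptions, and *Q*1 is taken in the standard Edgeworth form (cf. the formula for $\overline{Q}_1$ in Lemma 6).

```python
from statistics import NormalDist
import math

def phi(x: float) -> float:
    # Standard normal density.
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def q1(x: float, ex3: float) -> float:
    """Q1(x) = -(x^2 - 1) phi(x) EX1^3 / 6."""
    return -(x * x - 1.0) * phi(x) * ex3 / 6.0

def d_q_form(alpha: float, ex3: float) -> float:
    u = NormalDist().inv_cdf(1.0 - alpha)
    return q1(u, ex3) / (2.0 * phi(u) * u)

def d_closed_form(alpha: float, ex3: float) -> float:
    u = NormalDist().inv_cdf(1.0 - alpha)
    return (1.0 - u * u) * ex3 / (12.0 * u)

print(d_q_form(0.05, 0.5), d_closed_form(0.05, 0.5))
```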

#### **5. Conclusions**

The paper deals with an approach to the comparison of distributions of sums of a finite number of independent random variables by deficiency. The notion of the asymptotic deficiency of the distribution of a measurable ℝ-valued function of a random vector with respect to the distribution of the same function of another random vector was introduced. Some formulas for the calculation of the asymptotic deficiency were presented in the cases where the function has the form of a normalized sum of independent identically distributed r.v.s. The formulas for the asymptotic deficiency were obtained as the solution of two problems, one of which deals with the description of the distribution of a separate summand minimizing the number of summands and providing a prescribed value of the (1 − *α*)-quantile of the normalized sum for a given *α* ∈ (0, 1). The second problem deals with minimization of the number of summands guaranteeing a prescribed probability for a normalized sum of r.v.s to fall into a given interval. These results were extended to the case of a random number of summands in the sum (or a random portfolio size, in terms of the example dealing with an insurance company). For this case, an analog of the deficiency of the sum of a random number of summands with respect to the distribution of the sum of a non-random number of summands was introduced. The problem of comparison of these distributions by this analog of deficiency was considered in the special case of a three-point distribution of the portfolio size. The main mathematical tools used in the paper were asymptotic expansions for the distributions of average losses and their quantiles.

**Author Contributions:** Conceptualization, V.E.B. and V.Y.K.; Formal analysis, V.Y.K.; Funding acquisition, V.Y.K.; Investigation, V.E.B. and V.Y.K.; Writing – original draft, V.E.B. and V.Y.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research was supported by the Ministry of Science and Higher Education of the Russian Federation, project No. 075-15-2020-799.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors thank the anonymous referees for their comments and suggestions that improved the paper. We also thank A. K. Gorshenin for his help in formatting the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Equilibrium in a Queueing System with Retrials**

**Julia Chirkova 1,\*,†,‡, Vladimir Mazalov 1,2,†,‡ and Evsey Morozov 1,2,3,†,‡**


**Abstract:** We find an equilibrium in a single-server queueing system with retrials and strategic timing of the customers. We consider a set of customers, each of whom must decide when to arrive to a queueing system during a fixed period of time. In this system, after completion of service, the server seeks a customer blocked in a virtual orbit (an orbital customer) to be served next, unless a new customer captures the server. We develop, in detail, the setting with two and three customers in the set, and formulate and discuss the problem for the general case with an arbitrary number of customers. Numerical examples for the systems with two and three customers are included as well.

**Keywords:** equilibrium arrivals; one-server queueing system; orbit; retrials

#### **1. Introduction**

Retrial queues have been attracting increasing interest because of their importance in modeling modern wireless telecommunication systems. Many papers have been devoted to the steady-state performance analysis of such queues; the most important sources are mentioned here [1–3]. Firstly, we outline the main settings that describe the dynamics of a wide class of retrial queueing systems.

There are many practical situations that can be modeled as a queueing system in which customers are allowed to make a few attempts to be served. For instance, in call centers with a callback option, customers who cannot connect immediately with the operator register their numbers and call back at a later time. These customers can be called *orbital* because it seems natural that registered customers wait for service in a so-called *orbit-queue (orbit)*. As these customers cannot be picked up immediately when the operator becomes available, some seeking time (called the *retrieval time*) is needed to access a registered customer. We note that sometimes the operator may make an *outgoing call*, not being aware of the presence of registered customers in the orbit. A similar situation (with retrial attempts) arises in many service systems where a ticket is issued upon the arrival of a customer, who will then be served at a later time when the server is available.

Now we touch upon the service disciplines that are considered in retrial queueing systems. The most traditional discipline is the so-called *classical retrials* discipline, where the customers blocked in the orbits make retrial attempts *independently*, in which case the retrial rate increases (linearly) as the orbit size increases. Stability analysis of such systems is considered in the books mentioned above (mainly in the Markovian setting), and, in a more general setting, this analysis has been performed in the papers [4–6].

The latter approach, in a generalized form, has been applied to the stability analysis of a wide class of queueing systems (including many retrial systems) in the recent book [7]. The main ingredient of the analysis is to establish the negative drift of the remaining work

**Citation:** Chirkova, J.; Mazalov, V.; Morozov, E. Equilibrium in the Queueing System with Retrials. *Mathematics* **2022**, *10*, 428. https:// doi.org/10.3390/math10030428

Academic Editor: János Sztrik

Received: 31 December 2021 Accepted: 26 January 2022 Published: 28 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

(workload) when the orbit size becomes large. Moreover, a key observation is that, from the point of view of stability, such a retrial system approaches the classic buffered system when the orbit size increases without limit. (We call such a discipline *asymptotically work-conserving* in [5].) The following step to simplify the regenerative stability analysis was realized in the paper [6], in which the positive drift of the *idle time* of the servers is used (instead of the negative drift of the workload) under the assumption that the orbit size increases to infinity in probability. (For more on regenerative stability analysis, see [7].)

Another wide and important class of the retrial models is the queueing systems with the *constant retrial rate*, see Chapter 7 in [7]. These models play an important role in the analysis of modern wireless telecommunication systems. In this regard, we mention the paper [8] in which, to the best of our knowledge, such a model has been used for the first time to model a telephone exchange system. A retrial queueing system with a constant retrial rate is suitable to describe the behavior of the multiple access protocols [9]. The retrial queues with a constant retrial rate have been applied to model TCP (Transmission Control Protocol) traffic related to short HTTP (HyperText Transfer Protocol) connections and to describe an optical-electrical hybrid contention resolution scheme [10]. There is also a modification of the retrial system in which, after each departure, the server seeks the customer in orbit (this is the above-mentioned retrieval time) to be served next. For instance, such a system has been considered in the paper [11], where a logarithmic asymptotic of a large deviation probability of the orbit size during the regeneration period is obtained. Moreover, in this paper we also consider the system with the retrieval time.

The most interesting and newest setting related to retrial systems is the so-called *retrial systems with coupled orbits* [12], which have potential applications in modeling wireless multiple access systems. In particular, they can model relay-assisted cognitive cooperative wireless systems in which the users transmit packets to a common destination node, and a finite number of relay nodes (i.e., orbits) assist in retransmitting the blocked packets; when a direct source-user transmission is blocked, the user forwards the blocked packet to a relay node [13]. Recent progress in the analysis of queueing systems with coupled orbits is presented in the book [7]. (See Chapters 7–9, where motivation and further references can also be found.)

Among the most important previous results in the analysis of retrial systems, we mention the explicit expression for the stationary remaining service time of the server, obtained in the recent paper [14]. It is easy to show that this result is also applicable to the stationary retrial queueing system. The mentioned result has a potential application in the setting in which the customers (both in the input process and in orbit) can select the instant to capture the server. An analysis of such a setting is very close to the main purpose of the present paper. In the context of the present paper, it is important to emphasize that in the conventional queueing theory setting, customers following an input process are unable to make their own decisions, while here we allow this possibility. To the best of our knowledge, this setting is completely new, and this is the main contribution of our paper.

In conventional queueing theory, the structure of the input process and the service process is usually assumed to be predefined and specified by the input rate and the service times of the customers. However, there exists a different approach to queueing which is based on the assumption that the customers (or users) logging into the system are *strategic* ([15–25]). Namely, it is assumed that the user strategy is to select the arrival instant to the system on a time interval [0, *T*]. In this setting, the queue in the system is determined after each player selects (at the initial instant *t* = 0) their random arrival instant in the system. Thus, each user spends some time in the system, and this time is their personal utility function. As a result, a *non-zero-sum game* is obtained, in which we need to find the *Nash equilibrium*. To the best of our knowledge, the paper [15] (by Glazer and Hassin) is the first work that considers the queue as a result of the users' behavior, and the authors denote this system as ?/*M*/1. They further formulate a *non-cooperative game* in which a Poisson-distributed number of customers determine their arrival instants in the

queue of a single-server system, over a (limited) admission interval [0, *T*]. The purpose of the customers is to minimize their waiting time in the system. It is shown in [15] that the *symmetric Nash equilibrium strategy* is mixed. In particular, it was revealed that this strategy is an (absolutely continuous) uniform distribution over the time interval [0, *T*], except for a singularity at zero, and that the density function decreases between zero and *T*. A similar model ?/*M*/*m*/*c* with *m* ≥ 1 identical (exponential) servers and buffer size *c* ≥ 0 for the waiting customers is considered in the paper [22]. Note that the arrival times game with *batch service* has been investigated in [16]. A single-server bufferless system in which the customers minimize a *time-sensitivity function*, instead of their own waiting costs, has been studied in [18]. The paper [20] establishes conditions under which the customers cannot queue before an opening time, and shows that, in the equilibrium, there is a singularity at instant *t* = 0, and that the density is positive only after an instant *te* > 0. A model where the customers may incur tardiness costs in addition to the waiting costs is considered in [21]. The paper [25] considers a model combining tardiness costs, waiting costs, and restrictions on the opening and closing times. (An overview of the existing literature on this problem can also be found in [25].)

In this paper, we apply a game-theoretic approach to a callback queueing system with one server. The queue is formed by strategic players. (In what follows, we use the terms 'customer', 'user', and 'player' as synonyms.) The player's strategy is to choose a moment to enter the system. If the server is busy, then the user is blocked and joins a (virtual) orbit queue. Otherwise, if the server is free, it seeks a blocked user from the orbit during an (exponential) retrieval time. In this setting, we find the optimal strategies of the players. First, we consider the case of two users, then consider in detail the case of three players, and finally formulate this problem for an arbitrary number of players.

Thus, the optimal strategies of the players found for the described retrial system constitute the main contribution of this paper.

The paper is organized as follows. In Section 2, we describe the model in detail. Then, in Section 3, we study the setting with two players. This simplest setting allows one to highlight the main ideas and steps of our analysis. In Section 4, we focus on the system with three players, and the analysis in this case turns out to be much more involved. The analysis of the game with three players is continued in Section 5, where an algorithm for finding an approximation of the equilibrium is given. The developed approach is then extended to a general setting with *N* + 1 players in Section 6. Moreover, we consider a few numerical examples in Sections 3 and 6.

#### **2. Description of the Model**

Now we describe our model in more detail in a general setting. We assume that there exists a single server that serves *N* ('exogenous') customers present in the system at the initial instant *t* = 0. Unlike the conventional queueing theory setting, these customers use some *strategy* to choose an instant to enter the server. By symmetry, this strategy is the same for each user. The strategy is determined by a distribution function, which is the main object of the analysis, and it determines the instant of the attempt to enter the server; see (1) below. This is an important difference from the standard queueing theory setting, where customers are not allowed to make their own decisions but follow the predefined rules describing the dynamics of the system. After the departure of a served customer, the server starts to seek the customer blocked in orbit (if any) to be served next. As mentioned above, this seeking time is exponential with parameter *γ* and mean *τ*, and it is called the retrieval time in the retrial queueing terminology [11]. If an exogenous customer finds the server busy, then they join a (virtual) orbit, and such orbital customers constitute a virtual *orbit queue* (see [12]), which is served in FIFO (First-In-First-Out) order. If, during a retrieval time (when the server is idle), some customer arrives, then they capture the server for an exponentially distributed time with parameter *μ*. Recall that the arrival time is selected according to distribution (1). Thus, the present setting combines some features of both the classic retrial system and a *gated* queue, in which the input gate remains closed until all *N*

customers leave the system. In addition, a similar situation arises in a *polling system* (see, for instance, [26]), in which different queues are served in a fixed order, and the server, returning to a given queue, as a rule finds there a few new customers to be served.

**Remark 1.** *In our setting, the system initially has a finite number of players, and in this case the system is stable in the traditional sense: the number of customers is bounded. However, in our analysis the requirement μ* > *γ on the parameters μ and γ appears (see (8)), which can indeed be treated as a stability condition in the framework of the considered game setting. Moreover, when the number of players is N (see Section 6), the classic stability analysis could be critically important in the asymptotic setting as N* → ∞*.*

#### **3. Two Players Scenario**

Consider the case of two players. To find an equilibrium in this two-person game, we will use the following approach. Suppose that one of the players (for the sake of definiteness, the second player) uses, as a strategy, a random arrival time with the distribution function *F*(*t*) (having density *f*(*t*)) of the following form:

$$F(t) = \begin{cases} p, & 0 \le t < t\_e, \\ p + \int\_{t\_e}^{t} f(x)\,dx, & t\_e \le t \le T. \end{cases} \tag{1}$$

That is, the second player enters the system at the initial moment *t* = 0 with probability *p* ∈ (0, 1), otherwise, they arrive at instant *t* ∈ [*te*, *T*] following distribution *F*(*t*), where *te* > 0 and *T* < ∞ are predefined constants.

Now, we find the best response of the first player to the described strategy of the second player. As the cost function of the first player, we consider their average *sojourn time*, that is, the average total time the player spends in the system. Thus, the objective of the first player is to choose a strategy that minimizes the average sojourn time. Due to the symmetry of the problem, in the equilibrium, the optimal strategy of the first player must coincide with the chosen strategy of their opponent. To achieve this, it is sufficient that the strategy of the second player is chosen in such a way that the cost function of the first player takes a constant value over the interval [*te*, *T*] and at the initial instant *t* = 0 (see [24]). Then the payoff of the first player does not depend on their own strategy.

Next, we find the best response of the first player to the strategy of the second player defined by the relation (1). First, we find its cost function. The average sojourn time, provided the first player enters the system at the instant *t* = 0, is:

$$C(0) = (1 - p)\frac{1}{\mu} + p\Big(\frac{1}{2}\frac{1}{\mu} + \frac{1}{2}\Big(\tau + \frac{2}{\mu}\Big)\Big) = (1 - p)\frac{1}{\mu} + p\Big(\frac{3}{2\mu} + \frac{\tau}{2}\Big).$$

In this expression, we take into account that, with the probability 1 − *p*, the second player does not arrive in the system at the instant *t* = 0. Then the first player will be served first, and the average sojourn time equals the average service time 1/*μ*. If the second player arrives at the instant *t* = 0 (with the probability *p*), then, with probability 1/2, the first player can be selected for service, and their average service time equals 1/*μ*. However, with probability 1/2, the server chooses the second player, and then the first player joins the orbit and waits until the second player ends service.

If 0 < *t* < *te*, then the average sojourn time of a customer that arrives at instant *t* satisfies:

$$\begin{split} \mathcal{C}(t) &= (1-p)\frac{1}{\mu} + p\left( (1-e^{-\mu t})\frac{1}{\mu} + e^{-\mu t}(\frac{1}{\mu} + \tau + \frac{1}{\mu}) \right) \\ &= (1-p)\frac{1}{\mu} + p\left( \frac{1}{\mu} + e^{-\mu t}(\tau + \frac{1}{\mu}) \right). \end{split}$$

To obtain this expression, we take into account that, with probability 1 − *p*, the second player does not arrive at the instant *t* = 0. In this case, the first player is served first, and the average sojourn (service) time equals 1/*μ*. If the second player arrives at *t* = 0 (with probability *p*), then with probability 1 − exp{−*μt*} they will be served within the time interval [0, *t*], and then the first player is served immediately. However, with probability exp{−*μt*}, the second player is still in the server at instant *t*, and then the first player joins the orbit. The first player waits until the second player leaves the server, and then occupies the server after an average time *τ*. We note that the function *C*(*t*) decreases in *t*, and then, in the limit as *t* → 0+, we obtain:

$$\mathcal{C}(0+) = (1-p)\frac{1}{\mu} + p\left(\tau + \frac{2}{\mu}\right) > \mathcal{C}(0).$$

We require the fulfillment of the following condition on *te*: *C*(0) = *C*(*te*), that is:

$$\frac{1}{\mu} + e^{-\mu t\_e}\Big(\tau + \frac{1}{\mu}\Big) = \frac{3}{2\mu} + \frac{\tau}{2},$$

which yields:

$$t\_e = \frac{\log 2}{\mu}.\tag{2}$$

Similarly, for *t* ≥ *te*, we obtain the average sojourn time in the form:

$$\begin{split} C(t) &= p \left( (1 - e^{-\mu t}) \frac{1}{\mu} + e^{-\mu t} \Big(\frac{1}{\mu} + \tau + \frac{1}{\mu}\Big) \right) \\ &+ \left( \int\_{t\_e}^{t} dF(\theta) \left( (1 - e^{-\mu(t - \theta)}) \frac{1}{\mu} + e^{-\mu(t - \theta)} \Big(\frac{1}{\mu} + \tau + \frac{1}{\mu}\Big) \right) + \int\_{t}^{T} \frac{1}{\mu}\, dF(\theta) \right), \end{split}$$

implying,

$$\mathcal{C}(t) = p\left(\frac{1}{\mu} + e^{-\mu t}\Big(\tau + \frac{1}{\mu}\Big)\right) + \left(\frac{1-p}{\mu} + \int\_{t\_e}^t e^{-\mu(t-\theta)}\Big(\tau + \frac{1}{\mu}\Big) dF(\theta)\right). \tag{3}$$

Now, we find the exact shape of the target distribution function *F*(*t*) using the condition *C*′(*t*) = 0. Differentiating, we obtain:

$$-\mu p\Big(\tau + \frac{1}{\mu}\Big)e^{-\mu t} - \mu\Big(\tau + \frac{1}{\mu}\Big)\int\_{t\_e}^{t} e^{-\mu(t-\theta)}dF(\theta) + \Big(\tau + \frac{1}{\mu}\Big)f(t) = 0,$$

implying,

$$\int\_{t\_e}^t e^{\mu\theta} f(\theta)\, d\theta = \frac{1}{\mu} f(t) e^{\mu t} - p.$$

Denoting:

$$g(t) = f(t)e^{\mu t},$$

we can write the equation for function *g*(*t*) in the following form:

$$\int\_{t\_e}^t g(\theta)\,d\theta = \frac{1}{\mu}g(t) - p. \tag{4}$$

In addition, we have the following relation:

$$
g'(t) = \mu g(t),
$$

implying,

$$g(t) = const \cdot e^{\mu t} \text{ and } f(t) = K = const.$$

Substituting *g*(*t*) to (4) we obtain:

$$K = \mu p e^{-\mu t\_e} = \frac{\mu p}{2}.\tag{5}$$
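As a quick numerical sanity check (ours, not part of the paper), one can verify that the constant density *f*(*t*) = *K* = *μpe*<sup>−*μte*</sup> indeed satisfies the integral Equation (4); the values of *μ* and *p* below are arbitrary illustrative choices:

```python
import math

# Illustrative parameter values (assumptions, not from the paper).
mu, p = 2.0, 0.3
t_e = math.log(2) / mu               # Equation (2)
K = mu * p * math.exp(-mu * t_e)     # Equation (5); here K = mu * p / 2

def g(t):
    # g(t) = f(t) e^{mu t} with the constant density f(t) = K
    return K * math.exp(mu * t)

def lhs(t, n=100000):
    # Midpoint-rule approximation of the left side of Equation (4)
    h = (t - t_e) / n
    return h * sum(g(t_e + (j + 0.5) * h) for j in range(n))

def rhs(t):
    # Right side of Equation (4)
    return g(t) / mu - p

for t in (0.5, 1.0, 1.5):
    assert abs(lhs(t) - rhs(t)) < 1e-4
```

The check works for any *t* > *te*, since both sides differ by the constant *p* − *Ke*<sup>*μte*</sup>/*μ*, which vanishes exactly when *K* is given by (5).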

On the other hand, condition:

$$1 - p = \int\_{t\_e}^{T} K\,dt,$$

yields,

$$p = 1 - K(T - t\_e),$$

and then, using expression (5), we finally obtain the probability *p*:

$$p = \frac{1}{1 + \mu(T - t\_e)e^{-\mu t\_e}} = \frac{2}{2 + \mu T - \log 2}. \tag{6}$$

Now it follows from (3) that:

$$\mathcal{C}(t) = \mathcal{C}(t\_e) = p e^{-\mu t\_e}\Big(\tau + \frac{1}{\mu}\Big) + \frac{1}{\mu} = \frac{1}{2 + \mu T - \log 2}\Big(\tau + \frac{1}{\mu}\Big) + \frac{1}{\mu}.$$

The previous analysis can be summarized as the following statement.

**Proposition 1.** *An equilibrium in the two-person queueing game with retrials satisfies relation (1), where f*(*t*) = *K* = *const for t* ∈ [*te*, *T*]*, and parameters p*, *te*, *K satisfy conditions (2), (5), and (6).*

**Example 1.** *Consider a system which accepts the customers within the interval* [0, 2]*, and assume that the service rate μ* = 2 *and the retrieval rate γ* = 1/*τ* = 1*. Then it is easy to calculate that:*

$$t\_e \approx 0.347, \quad f(t) \approx 0.377 \text{ for } t \ge t\_e,$$

*the probability of arrival at instant 0 is p* ≈ 0.377*, and the average sojourn time is:*

$$\mathcal{C}(0) = \mathcal{C}(t) \approx 0.783 \text{ for } t \ge t\_e.$$
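The numbers in Example 1 can be reproduced directly from (2), (5), and (6), and cross-checked by a Monte Carlo simulation of the two-player dynamics. The sketch below is our own illustration; the simulation encodes the model rules of Section 2 for the special case of two players (with only two players, no arrival can occur during a retrieval that is already in progress, which simplifies the event logic):

```python
import math
import random

mu, gamma, T = 2.0, 1.0, 2.0      # service rate, retrieval rate, closing time
tau = 1.0 / gamma                 # mean retrieval time

# Equilibrium quantities from Equations (2), (5), and (6).
t_e = math.log(2) / mu
p = 1.0 / (1.0 + mu * (T - t_e) * math.exp(-mu * t_e))
K = mu * p * math.exp(-mu * t_e)  # constant density on [t_e, T]
C = p * math.exp(-mu * t_e) * (tau + 1.0 / mu) + 1.0 / mu

assert abs(t_e - 0.347) < 1e-3 and abs(p - 0.377) < 1e-3
assert abs(K - 0.377) < 1e-3 and abs(C - 0.783) < 1e-3

def arrival():
    # Equilibrium strategy (1): atom p at 0, uniform density K on [t_e, T].
    return 0.0 if random.random() < p else random.uniform(t_e, T)

def mean_sojourn_once():
    a = sorted([arrival(), arrival()])        # ties at 0 broken arbitrarily
    s1, s2 = random.expovariate(mu), random.expovariate(mu)
    d1 = a[0] + s1                            # departure of the first-served player
    if a[1] < d1:                             # second player blocked -> joins orbit
        d2 = d1 + random.expovariate(gamma) + s2   # retrieval, then service
    else:                                     # server already free on arrival
        d2 = a[1] + s2
    return ((d1 - a[0]) + (d2 - a[1])) / 2.0

random.seed(1)
n = 200000
mean = sum(mean_sojourn_once() for _ in range(n)) / n
assert abs(mean - C) < 0.01   # simulated average sojourn time is about 0.783
```

Since both players use the equilibrium strategy, each player's expected sojourn time equals the constant cost *C*, which the simulated mean confirms.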

#### **4. Three-Player Solution**

Consider the case of three players. Suppose that two of the players (for definiteness, the second and third players) use, as a strategy, a random arrival time with the distribution *F*(*t*) satisfying (1). We assume that customers are taken from the orbit by the server in the order they entered the orbit, i.e., using FIFO discipline. If two customers entered the orbit simultaneously, then the order is assigned at random (that is, each one is selected with probability 1/2).

Now we find the best response of the first player provided they arrive at instant *t* = 0. In this case, the payoff is:

$$\mathcal{C}(0) = (1-p)^2 \frac{1}{\mu} + 2p(1-p)(\frac{1}{\mu} + \frac{1}{2}\mathcal{W}\_1^1(0)) + p^2(\frac{1}{\mu} + \frac{2}{3}\mathcal{W}\_2^0(0)),$$

where,

$$\begin{aligned} \mathcal{W}\_{1}^{0}(0) &= \frac{1}{\mu} + \frac{1}{\gamma},\\ \mathcal{W}\_{2}^{0}(0) &= \frac{1}{\mu} + \frac{1}{\gamma} + \frac{1}{2}\Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big) = \frac{3}{2\mu} + \frac{3}{2\gamma},\\ \mathcal{W}\_{1}^{1}(0) &= \frac{1}{1-p}\int\_{t\_e}^{T} dF(\theta)\bigg(\int\_0^{\theta}\mu e^{-\mu s}\,ds\int\_0^{\theta-s}\gamma e^{-\gamma u}(s+u)\,du + \int\_0^{\theta}\mu e^{-\mu s}\,ds\int\_{\theta-s}^{\infty}\gamma e^{-\gamma u}\Big(s+u+\frac{1}{\mu}\Big)\,du \\ &\qquad + \int\_{\theta}^{\infty}\mu e^{-\mu s}\,s\,ds\bigg) = \Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big) + \frac{1}{\gamma(\mu - \gamma)(1-p)}\int\_{t\_e}^{T}\big(\gamma e^{-\gamma\theta} - \mu e^{-\mu\theta}\big)\,dF(\theta), \end{aligned}$$

and by *W<sup>j</sup> <sup>i</sup>*(*t*) we denote the average time spent in the orbit by a customer arriving at instant *t*, provided that there are *i* customers in the orbit and *j* (exogenous) customers remain outside the system.

Now we find the value *C*(0+) for a customer who arrives in the system at instant *t* = 0+. Then they occupy the server if both remaining players arrive after *te*. If one player arrives at instant *t* = 0, then the customer arriving at *t* = 0+, evidently, joins the orbit. Finally, if both other customers arrive at *t* = 0, then the customer arriving at *t* = 0+ joins the end of the orbit queue. This results in:

$$\mathcal{C}(0+) = (1-p)^2 \frac{1}{\mu} + p(1-p)(\frac{1}{\mu} + \frac{1}{1-p} \mathcal{W}\_1^1(0)) + p^2(\frac{1}{\mu} + \mathcal{W}\_2^0(0+)) > \mathcal{C}(0),\tag{7}$$

where,

$$\mathcal{W}\_2^0(0+) = \frac{1}{\mu} + \frac{1}{\gamma} + \frac{1}{\mu} + \frac{1}{\gamma} = \frac{2}{\mu} + \frac{2}{\gamma} > \mathcal{W}\_2^0(0).$$

The payoff of a customer arriving at instant *te*, provided that no customers arrive in the time interval (0, *te*), satisfies:

$$\begin{aligned} \mathcal{C}(t\_e) &= (1-p)^2\frac{1}{\mu} + 2p(1-p)\Bigg(\frac{1}{\mu} + \int\_{t\_e}^{T}\frac{dF(\theta)}{1-F(t\_e)}\bigg(\int\_{t\_e}^{\theta}\mu e^{-\mu s}ds\int\_0^{\theta-s}\gamma e^{-\gamma\tau}(s - t\_e + \tau)\,d\tau \\ &\quad + \int\_{t\_e}^{\theta}\mu e^{-\mu s}ds\int\_{\theta-s}^{\infty}\gamma e^{-\gamma\tau}\Big(\theta - t\_e + \frac{1}{\mu} + \frac{1}{\gamma}\Big)d\tau + \int\_{\theta}^{\infty}\mu e^{-\mu s}\Big(s - t\_e + \frac{1}{\gamma}\Big)ds\bigg)\Bigg) \\ &\quad + p^2\Bigg(\frac{1}{\mu} + \int\_0^{t\_e}\mu e^{-\mu s}ds\int\_0^{t\_e-s}\gamma e^{-\gamma\tau}d\tau\int\_{t\_e-s-\tau}^{\infty}\mu e^{-\mu v}\Big(s + \tau + v - t\_e + \frac{1}{\gamma}\Big)dv \\ &\quad + \int\_{t\_e}^{\infty}\mu e^{-\mu s}ds\Big(s + \frac{1}{\mu} + \frac{2}{\gamma} - t\_e\Big)\Bigg) \\ &= (1-p)^2\frac{1}{\mu} + 2p(1-p)\Bigg(\frac{1}{\mu} + \Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)e^{-\mu t\_e} + \frac{e^{-\mu t\_e}}{(1-p)(\mu-\gamma)}\int\_{t\_e}^{T}\Big(e^{-\gamma(\theta - t\_e)} - e^{-\mu(\theta - t\_e)}\Big)dF(\theta)\Bigg) \\ &\quad + p^2\Bigg(\frac{1}{\mu} + 2\Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)e^{-\mu t\_e} + \Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)\frac{\gamma}{(\mu-\gamma)^2}\Big(e^{-\gamma t\_e} - e^{-\mu t\_e}\Big) - \Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)\frac{\gamma}{\mu-\gamma}\,t\_e e^{-\mu t\_e}\Bigg). \end{aligned}$$

We assume that:

$$
\mu > \gamma. \tag{8}
$$

Then the payoff value *C*(*te*) decreases in *te*, provided that there are no customers arriving in the system on the interval (0, *te*).

The latter equality and the inequality *C*(0+) > *C*(0) (see (7)) confirm that it is more profitable for a customer to arrive with a delay after the instant *t* = 0 (but not at the instant 0+). Since, in the equilibrium, the payoff should be the same on the strategy support, the instant when a new customer decides to arrive in the system satisfies the equation:

$$
\mathcal{C}(0) = \mathcal{C}(t\_e).
$$

Now we show that an arrival at an instant *t* ∈ (0, *te*) is unprofitable even for one player, when the other players make attempts to occupy the server starting from instant *te*. For such *t* we obtain:

$$\begin{aligned} \mathcal{C}(t) &= (1-p)^2\frac{1}{\mu} + 2p(1-p)\Bigg(\frac{1}{\mu} + \int\_t^T dF(\theta)\bigg(\int\_t^{\theta}\mu e^{-\mu s}ds\int\_0^{\theta-s}\gamma e^{-\gamma\tau}(s - t + \tau)\,d\tau \\ &\quad + \int\_t^{\theta}\mu e^{-\mu s}ds\int\_{\theta-s}^{\infty}\gamma e^{-\gamma\tau}\Big(\theta - t + \frac{1}{\mu} + \frac{1}{\gamma}\Big)d\tau + \int\_{\theta}^{\infty}\mu e^{-\mu s}\Big(s - t + \frac{1}{\gamma}\Big)ds\bigg)\Bigg) \\ &\quad + p^2\Bigg(\frac{1}{\mu} + \int\_0^t\mu e^{-\mu s}ds\int\_0^{t-s}\gamma e^{-\gamma\tau}d\tau\int\_{t-s-\tau}^{\infty}\mu e^{-\mu v}\Big(s + \tau + v - t + \frac{1}{\gamma}\Big)dv \\ &\quad + \int\_t^{\infty}\mu e^{-\mu s}ds\Big(s + \frac{1}{\mu} + \frac{2}{\gamma} - t\Big)\Bigg) \\ &= (1-p)^2\frac{1}{\mu} + p^2\Bigg(\frac{1}{\mu} + 2\Big(\frac{1}{\mu}+\frac{1}{\gamma}\Big)e^{-\mu t} + \Big(\frac{1}{\mu}+\frac{1}{\gamma}\Big)\frac{\gamma}{(\mu-\gamma)^2}\big(e^{-\gamma t} - e^{-\mu t}\big) - \Big(\frac{1}{\mu}+\frac{1}{\gamma}\Big)\frac{\gamma}{\mu-\gamma}\,te^{-\mu t}\Bigg) \\ &\quad + 2p(1-p)\Bigg(\frac{1}{\mu} + \Big(\frac{1}{\mu}+\frac{1}{\gamma}\Big)e^{-\mu t} + \frac{e^{-\mu t}}{(1-p)(\mu-\gamma)}\int\_t^T\big(e^{-\gamma(\theta-t)} - e^{-\mu(\theta-t)}\big)dF(\theta)\Bigg). \end{aligned}$$

Then it is easy to check that the function *C*(*t*) decreases in *t* < *te*, confirming that, even for one player, it is better to avoid an attempt to enter the server earlier than the instant *te*.

Now we consider the situation when the first player enters the system in the interval *t* ∈ [*te*, *T*]. To study it, we define, for instant *t*, the (time-dependent) state probabilities, denoted by *pijk*(*t*), that, at instant *t*, *i* ∈ {0, 1, 2} customers have arrived in the system, *j* ∈ {0, 1} customers are in the server, and *k* ∈ {0, 1} customers are in the orbit. The arrival rate at instant *t* (that is, the rate of exogenous customers) depends on the chosen strategy and the number of customers who have already entered the system up to instant *t*. In an evident notation, these rates are equal to:

$$
\lambda\_0(t) = 2 \frac{f(t)}{1 - F(t)}, \quad \lambda\_1(t) = \frac{f(t)}{1 - F(t)}.
$$

respectively. Now we can write down the corresponding Kolmogorov forward equations for the state probabilities:

$$\begin{array}{l}p'\_{000}(t) = -\lambda\_0(t)p\_{000}(t),\\p'\_{100}(t) = -\lambda\_1(t)p\_{100}(t) + \mu p\_{110}(t),\\p'\_{110}(t) = -(\mu + \lambda\_1(t))p\_{110}(t) + \lambda\_0(t)p\_{000}(t),\\p'\_{201}(t) = -\gamma p\_{201}(t) + \mu p\_{211}(t),\\p'\_{210}(t) = -\mu p\_{210}(t) + \gamma p\_{201}(t) + \lambda\_1(t)p\_{100}(t),\\p'\_{211}(t) = -\mu p\_{211}(t) + \lambda\_1(t)p\_{110}(t),\\p'\_{200}(t) = \mu p\_{210}(t).\end{array} \tag{9}$$

Now we find the state probabilities for *t* = *te* as follows:

$$\begin{aligned} p\_{000}(t\_e) &= (1-p)^2, \\ p\_{100}(t\_e) &= 2p(1-p)(1 - e^{-\mu t\_e}), \\ p\_{110}(t\_e) &= 2p(1-p)e^{-\mu t\_e}, \\ p\_{201}(t\_e) &= p^2\int\_0^{t\_e}\mu e^{-\mu\theta}e^{-\gamma(t\_e - \theta)}d\theta = p^2\mu\frac{e^{-\gamma t\_e} - e^{-\mu t\_e}}{\mu - \gamma}, \\ p\_{210}(t\_e) &= p^2\int\_0^{t\_e}\mu e^{-\mu\theta}\int\_0^{t\_e - \theta}\gamma e^{-\gamma\tau}e^{-\mu(t\_e - (\theta + \tau))}d\tau\,d\theta = p^2\gamma\frac{e^{-\gamma t\_e} - e^{-\mu t\_e}(\mu t\_e - \gamma t\_e + 1)}{(\mu - \gamma)^2}, \\ p\_{211}(t\_e) &= p^2 e^{-\mu t\_e}, \\ p\_{200}(t\_e) &= p^2\int\_0^{t\_e}\mu e^{-\mu\theta}\int\_0^{t\_e - \theta}\gamma e^{-\gamma\tau}\big(1 - e^{-\mu(t\_e - (\theta + \tau))}\big)d\tau\,d\theta \\ &= p^2\Big(1 - e^{-\mu t\_e} - \gamma\frac{e^{-\gamma t\_e} - e^{-\mu t\_e}(\mu t\_e - \gamma t\_e + 1)}{(\mu - \gamma)^2} - \mu\frac{e^{-\gamma t\_e} - e^{-\mu t\_e}}{\mu - \gamma}\Big). \end{aligned}\tag{10}$$
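As a consistency check of (10) (our own, with illustrative parameter values satisfying condition (8)), the seven initial probabilities must sum to one:

```python
import math

# Illustrative parameters (assumptions): mu > gamma, arbitrary p in (0, 1).
mu, gamma, p = 2.0, 1.0, 0.3
t_e = math.log(2) / mu

e_mu = math.exp(-mu * t_e)    # e^{-mu t_e}
e_g = math.exp(-gamma * t_e)  # e^{-gamma t_e}

# Initial conditions (10).
p000 = (1 - p) ** 2
p100 = 2 * p * (1 - p) * (1 - e_mu)
p110 = 2 * p * (1 - p) * e_mu
p201 = p**2 * mu * (e_g - e_mu) / (mu - gamma)
p210 = p**2 * gamma * (e_g - e_mu * (mu * t_e - gamma * t_e + 1)) / (mu - gamma) ** 2
p211 = p**2 * e_mu
p200 = p**2 * (1 - e_mu) - p201 - p210   # expanded form of p_200(t_e) in (10)

probs = (p000, p100, p110, p201, p210, p211, p200)
assert abs(sum(probs) - 1.0) < 1e-12
assert all(q >= 0 for q in probs)
```

The sum is one for any *p* ∈ (0, 1), since the first three terms total (1 − *p*)² + 2*p*(1 − *p*) and the last four total *p*².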

Expressions (10) are indeed the initial conditions for the Cauchy problem for the differential Equations (9). The probability *p* of an arrival at instant *t* = 0 can be found from the normalization condition:

$$p + \int\_{t\_e}^{T} f(t)\,dt = 1.$$

Then the average sojourn time of a player entering the system at instant *t* is:

$$\begin{aligned} \mathcal{C}(t) &= \frac{1}{\mu} \left( p\_{000}(t) + p\_{100}(t) + p\_{201}(t) + p\_{200}(t) \right) \\ &+ p\_{110}(t) (\frac{1}{\mu} + \mathcal{W}\_1^1(t)) + p\_{210}(t) (\frac{1}{\mu} + \mathcal{W}\_1^0(t)) + p\_{211}(t) (\frac{1}{\mu} + \mathcal{W}\_2^0(t)) \\ &= \frac{1}{\mu} + p\_{110}(t)\mathcal{W}\_1^1(t) + p\_{210}(t)\mathcal{W}\_1^0(t) + p\_{211}(t)\mathcal{W}\_2^0(t), \end{aligned}$$

where,

$$\begin{aligned} \mathcal{W}\_{1}^{0}(t) &= \frac{1}{\mu} + \frac{1}{\gamma},\\ \mathcal{W}\_{2}^{0}(t) &= \frac{2}{\mu} + \frac{2}{\gamma},\\ \mathcal{W}\_{1}^{1}(t) &= \frac{1}{1 - F(t)}\int\_t^T dF(\theta)\bigg(\int\_0^{\theta-t}\mu e^{-\mu s}ds\int\_0^{\theta-t-s}\gamma e^{-\gamma\tau}(s+\tau)\,d\tau \\ &\quad + \int\_0^{\theta-t}\mu e^{-\mu s}ds\int\_{\theta-t-s}^{\infty}\gamma e^{-\gamma\tau}\Big(\theta - t + \frac{1}{\mu} + \frac{1}{\gamma}\Big)d\tau + \int\_{\theta-t}^{\infty}\mu e^{-\mu s}\Big(\frac{1}{\gamma}+s\Big)ds\bigg) \\ &= \frac{1}{\mu} + \frac{1}{\gamma} + \frac{1}{(1-F(t))(\mu-\gamma)}\int\_t^T \big(e^{-\gamma(\theta-t)} - e^{-\mu(\theta-t)}\big)\,dF(\theta). \end{aligned}$$

In the equilibrium, the equality *C*(*t*) = *C*(*te*) = *const*, *t* ∈ [*te*, *T*] is satisfied, implying:

$$\begin{aligned} &\Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)\Big(p\_{110}(t) - p\_{110}(t\_e) + p\_{210}(t) - p\_{210}(t\_e) + 2\big(p\_{211}(t) - p\_{211}(t\_e)\big)\Big) \\ &+ \frac{1}{\mu - \gamma}\bigg(\frac{p\_{110}(t)}{1 - F(t)}\int\_t^T\big(e^{-\gamma(\theta - t)} - e^{-\mu(\theta - t)}\big)dF(\theta) - \frac{p\_{110}(t\_e)}{1 - p}\int\_{t\_e}^T\big(e^{-\gamma(\theta - t\_e)} - e^{-\mu(\theta - t\_e)}\big)dF(\theta)\bigg) \\ &= 0. \end{aligned} \tag{11}$$

If condition (11) is met, then the function *C*(*t*) is constant on the time interval [*te*, *T*]. It remains now to require that the condition *C*(*te*) = *C*(0) is satisfied. In other words,

$$\begin{aligned} &(1-p)^2\frac{1}{\mu} + p^2\Big(\frac{2}{\mu} + \frac{1}{\gamma}\Big) + 2p(1-p)\bigg(\frac{1}{\mu} + \frac{1}{2}\Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big) + \frac{1}{2\gamma(\mu - \gamma)(1-p)}\int\_{t\_e}^{T}\big(\gamma e^{-\gamma\theta} - \mu e^{-\mu\theta}\big)dF(\theta)\bigg) \\ &= (1-p)^2\frac{1}{\mu} \\ &\quad + 2p(1-p)\bigg(\frac{1}{\mu} + \Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)e^{-\mu t\_e} + \frac{e^{-\mu t\_e}}{(1-p)(\mu - \gamma)}\int\_{t\_e}^{T}\big(e^{-\gamma(\theta - t\_e)} - e^{-\mu(\theta - t\_e)}\big)dF(\theta)\bigg) \\ &\quad + p^2\bigg(\frac{1}{\mu} + 2\Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)e^{-\mu t\_e} + \Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)\frac{\gamma}{(\mu - \gamma)^2}\big(e^{-\gamma t\_e} - e^{-\mu t\_e}\big) - \Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)\frac{\gamma}{\mu - \gamma}t\_e e^{-\mu t\_e}\bigg) \end{aligned}$$

or:

$$\begin{split} &(\frac{1}{\mu} + \frac{1}{\gamma})(1 - 2e^{-\mu t\_{\varepsilon}}) + \\ &\frac{1}{(\mu - \gamma)}(\frac{1}{\gamma} \int\_{t\_{\varepsilon}}^{T} (\gamma e^{-\gamma \theta} - \mu e^{-\mu \theta}) dF(\theta) - 2e^{-\mu t\_{\varepsilon}} \int\_{t\_{\varepsilon}}^{T} (e^{-\gamma(\theta - t\_{\varepsilon})} - e^{-\mu(\theta - t\_{\varepsilon})}) dF(\theta)) + \\ &p(\frac{1}{\mu} + \frac{1}{\gamma})(\frac{\gamma}{\mu - \gamma} t\_{\varepsilon} e^{-\mu t\_{\varepsilon}} - \frac{\gamma}{(\mu - \gamma)^2} (e^{-\gamma t\_{\varepsilon}} - e^{-\mu t\_{\varepsilon}})) = 0. \end{split} \tag{12}$$

The analysis performed above is summarized in the following statement.

**Proposition 2.** *An equilibrium in the three-person queueing game with retrievals has the form (1), where f*(*t*) *for t* ∈ [*te*, *T*] *is determined as a solution of Equations (9), (11), and (12).*

#### **5. Computing the Equilibrium**

In this section, we describe in detail the numerical solution of the equations obtained above. Let us fix *te* and *T*. We divide the interval [*te*, *T*] into *k* − 1 equal segments. Then we find an approximate solution at the nodes of the grid *K* = {*t*<sup>1</sup> = *te*, *t*<sup>2</sup> = *t*<sup>1</sup> + Δ, ... , *tk* = *tk*−<sup>1</sup> + Δ}, where Δ = (*T* − *te*)/(*k* − 1). We treat the atom at the initial moment and the values of the unknown density *f*(*t*) at the nodes of the grid as the unknowns of the problem:

$$x\_0 = p,\quad x\_1 = f(t\_1),\ \ldots,\ x\_k = f(t\_k) = f(T).$$

Then the derived conditions for the equilibrium can be represented as the difference equations. More precisely, condition (11) becomes:

$$\begin{aligned} &\Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)\Big(p\_{110}(t\_i) - p\_{110}(t\_e) + p\_{210}(t\_i) - p\_{210}(t\_e) + 2\big(p\_{211}(t\_i) - p\_{211}(t\_e)\big)\Big) \\ &+ \frac{p\_{110}(t\_i)}{(\mu - \gamma)\big(1 - p - \sum\_{j=1}^{i-1} x\_j \Delta\big)} \sum\_{j=i}^{k-1}\Big(e^{-\gamma(t\_j - t\_i)} - e^{-\mu(t\_j - t\_i)}\Big)x\_j\Delta \\ &- \frac{p\_{110}(t\_e)}{(\mu - \gamma)(1 - p)} \sum\_{j=1}^{k-1}\Big(e^{-\gamma(t\_j - t\_e)} - e^{-\mu(t\_j - t\_e)}\Big)x\_j\Delta = 0, \quad i = 2, \ldots, k. \end{aligned} \tag{13}$$

Condition (12) takes the form:

$$\begin{aligned} &\Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)\big(1 - 2e^{-\mu t\_e}\big) \\ &+ \frac{1}{\mu - \gamma}\bigg(\frac{1}{\gamma}\sum\_{j=1}^{k-1}\big(\gamma e^{-\gamma t\_j} - \mu e^{-\mu t\_j}\big)x\_j\Delta - 2e^{-\mu t\_e}\sum\_{j=1}^{k-1}\big(e^{-\gamma(t\_j - t\_e)} - e^{-\mu(t\_j - t\_e)}\big)x\_j\Delta\bigg) \\ &+ p\Big(\frac{1}{\mu} + \frac{1}{\gamma}\Big)\Big(\frac{\gamma}{\mu - \gamma}t\_e e^{-\mu t\_e} - \frac{\gamma}{(\mu - \gamma)^2}\big(e^{-\gamma t\_e} - e^{-\mu t\_e}\big)\Big) = 0. \end{aligned} \tag{14}$$

The Kolmogorov forward (difference) equations become:

$$\begin{aligned} p\_{000}(t\_{i+1}) &= p\_{000}(t\_i)\Big(1 - \Delta\frac{2x\_i}{1 - p - \Delta\sum\_{j=1}^{i-1}x\_j}\Big), \\ p\_{100}(t\_{i+1}) &= p\_{100}(t\_i)\Big(1 - \Delta\frac{x\_i}{1 - p - \Delta\sum\_{j=1}^{i-1}x\_j}\Big) + \Delta\mu p\_{110}(t\_i), \\ p\_{110}(t\_{i+1}) &= p\_{110}(t\_i)\Big(1 - \Delta\Big(\mu + \frac{x\_i}{1 - p - \Delta\sum\_{j=1}^{i-1}x\_j}\Big)\Big) + \Delta\frac{2x\_i}{1 - p - \Delta\sum\_{j=1}^{i-1}x\_j}p\_{000}(t\_i), \\ p\_{201}(t\_{i+1}) &= (1 - \Delta\gamma)p\_{201}(t\_i) + \Delta\mu p\_{211}(t\_i), \\ p\_{210}(t\_{i+1}) &= (1 - \Delta\mu)p\_{210}(t\_i) + \Delta\gamma p\_{201}(t\_i) + \Delta\frac{x\_i}{1 - p - \Delta\sum\_{j=1}^{i-1}x\_j}p\_{100}(t\_i), \\ p\_{211}(t\_{i+1}) &= (1 - \Delta\mu)p\_{211}(t\_i) + \Delta\frac{x\_i}{1 - p - \Delta\sum\_{j=1}^{i-1}x\_j}p\_{110}(t\_i), \\ p\_{200}(t\_{i+1}) &= p\_{200}(t\_i) + \Delta\mu p\_{210}(t\_i), \\ &\quad i = 1, \ldots, k-1. \end{aligned} \tag{15}$$
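A minimal sketch (ours) of the explicit Euler recursion (15), started from the initial conditions (10) with the uniform trial density *xi* = (1 − *p*)/(*T* − *te*), illustrates that the scheme conserves total probability; the parameter values are illustrative assumptions:

```python
import math

# Illustrative parameters (assumptions): mu > gamma.
mu, gamma, p, T = 2.0, 1.0, 0.3, 2.0
t_e = math.log(2) / mu
k = 200
delta = (T - t_e) / (k - 1)
x = [0.0] + [(1 - p) / (T - t_e)] * k     # x_1 .. x_k: uniform trial density

# Initial conditions (10) at t_1 = t_e.
e_mu, e_g = math.exp(-mu * t_e), math.exp(-gamma * t_e)
P = {"000": (1 - p) ** 2,
     "100": 2 * p * (1 - p) * (1 - e_mu),
     "110": 2 * p * (1 - p) * e_mu,
     "201": p**2 * mu * (e_g - e_mu) / (mu - gamma),
     "210": p**2 * gamma * (e_g - e_mu * (mu * t_e - gamma * t_e + 1)) / (mu - gamma) ** 2,
     "211": p**2 * e_mu}
P["200"] = 1.0 - sum(P.values())
p200_init = P["200"]

for i in range(1, k):                             # Euler steps (15)
    lam1 = x[i] / (1 - p - delta * sum(x[1:i]))   # f(t_i) / (1 - F(t_i))
    lam0 = 2 * lam1
    P = {"000": P["000"] * (1 - delta * lam0),
         "100": P["100"] * (1 - delta * lam1) + delta * mu * P["110"],
         "110": P["110"] * (1 - delta * (mu + lam1)) + delta * lam0 * P["000"],
         "201": (1 - delta * gamma) * P["201"] + delta * mu * P["211"],
         "210": (1 - delta * mu) * P["210"] + delta * gamma * P["201"]
                + delta * lam1 * P["100"],
         "211": (1 - delta * mu) * P["211"] + delta * lam1 * P["110"],
         "200": P["200"] + delta * mu * P["210"]}

assert abs(sum(P.values()) - 1.0) < 1e-9   # the scheme conserves probability
assert P["200"] > p200_init                # the absorbing state only gains mass
```

Conservation holds exactly because every loss term of one state in (15) reappears as the gain term of another, so the Euler updates cancel in the sum.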

Finally, the normalization condition takes the form:

$$p + \Delta \sum\_{i=1}^{k-1} x\_i - 1 = 0. \tag{16}$$

Now we find a solution as follows. We iterate over the values of *p* on the interval [0, 1]. For each *p*, we iterate over the values of *te* belonging to the interval [0, *T*]. For each given *p* and *te*, we solve the system of difference Equations (13), where the state probabilities satisfy system (15). We look for a pair (*p*, *te*) such that conditions (14) and (16) are satisfied.

To find *p* and *te*, we first partition [0, 1] × [0, *T*] into a coarse grid and look for a node at which the left-hand sides of Equations (14) and (16) change sign. We then seek a solution in a neighborhood of this node. More precisely, for each given *p* and *te*, the solution *x* is found with Algorithm 1. First, we specify the uniform distribution over the interval [*te*, *T*] as the initial approximation of the solution *x*, taking into account that there is an atom at the point *t* = 0 with the given probability *p*. Then we search for a solution *x* that satisfies (13). Step 1 of Algorithm 1 is repeated until the solution stabilizes within a given accuracy *ε*. In practice, the algorithm converges in 2–3 passes.
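The coarse-grid stage described above can be sketched as follows. Here `g14` and `g16` are hypothetical stand-ins for the left-hand sides of Equations (14) and (16) (in practice each evaluation requires solving (13) together with (15)), and for simplicity the sketch picks the grid node minimizing the combined residual instead of tracking sign changes:

```python
import itertools

def coarse_grid_search(g14, g16, T, n=50):
    """Scan a coarse grid on [0, 1] x [0, T] for the most promising (p, t_e)."""
    ps = [i / n for i in range(n + 1)]        # candidate values of p
    tes = [T * i / n for i in range(n + 1)]   # candidate values of t_e
    best, best_val = None, float("inf")
    for p, te in itertools.product(ps, tes):
        val = abs(g14(p, te)) + abs(g16(p, te))
        if val < best_val:
            best, best_val = (p, te), val
    return best

# Toy residuals whose root is (p, t_e) = (0.4, 0.38); with T = 2 the search
# returns a grid node close to that root.
g14 = lambda p, te: p - 0.4
g16 = lambda p, te: te - 0.38
node = coarse_grid_search(g14, g16, T=2.0)
```

A finer search (or a root finder) would then be run in a neighborhood of the returned node, as described in the text.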

At the beginning of each iteration *i* = 2, ... , *k*, we have *x<sub>j</sub>*, *j* = 1, . . . , *i* − 2, and the state probabilities for all steps from 1 to *i* − 1, found at previous iterations. At iteration *i*, we solve Equation (13) for *x<sub>i−1</sub>*, each time calculating the state probabilities at step *i* from the current approximation of *x<sub>i−1</sub>* according to (15). In this case, the sum in the last two terms on the left-hand side of Equation (13) is partially computed from the old values *x<sub>prev</sub>* obtained at the previous pass, while all other components of the equation are computed from the current solution *x*. However, as computational experiments confirm, with each pass of the algorithm the difference between successive solutions decreases.

**Algorithm 1** Finding the solution *x* for given *p* and *te*.

Step 0. Initialization.
**for** *i* = 1 **to** *k* **do**
&nbsp;&nbsp;*x<sub>i</sub>* ← (1 − *p*)/(*T* − *te*)
**end for**
Calculate *p*<sub>000</sub>(*t*<sub>1</sub>), ... , *p*<sub>200</sub>(*t*<sub>1</sub>) from (10).
Step 1. Next approximations.
**do**
&nbsp;&nbsp;*x<sub>prev</sub>* ← *x*
&nbsp;&nbsp;**for** *i* = 2 **to** *k* **do**
&nbsp;&nbsp;&nbsp;&nbsp;Find *x<sub>i−1</sub>* from (13) with new *p*<sub>000</sub>(*t<sub>i</sub>*), ... , *p*<sub>200</sub>(*t<sub>i</sub>*) re-calculated with (15).
&nbsp;&nbsp;**end for**
**while** (|*x* − *x<sub>prev</sub>*| > *ε*)

After completing the algorithm, we obtain the set of values *x<sub>i</sub>*, *i* = 1, ... , *k* − 1, which is enough to find $F(T) \approx p + \sum\_{i=1}^{k-1} x\_i \Delta$. If conditions (14) and (16) are satisfied with a given accuracy *ε*, the current *p*, *te*, and *x* give a solution; otherwise, we change *te* and *p*.
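The inner fixed-point loop of Algorithm 1 has a simple generic structure. In the sketch below, `update_x` is a hypothetical stand-in for solving (13) for one component with the state probabilities re-calculated by (15); a toy contraction is plugged in purely to make the sketch runnable:

```python
def algorithm1(x0, update_x, eps=1e-10, max_passes=200):
    """Repeat Step 1 (component-wise updates) until |x - x_prev| <= eps."""
    x = list(x0)                       # Step 0: initial approximation
    for _ in range(max_passes):
        x_prev = list(x)
        for i in range(len(x)):        # sweep over the grid points
            x[i] = update_x(i, x)
        if max(abs(a - b) for a, b in zip(x, x_prev)) <= eps:
            break
    return x

# Toy contraction whose fixed point is x_i = 0.5 for every i.
sol = algorithm1([0.0] * 5, lambda i, x: 0.5 * (x[i] + 0.5))
```

In the paper's setting the iteration reportedly stabilizes in 2–3 passes; the toy contraction above needs more passes only because it contracts more slowly.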

**Example 2.** *Let T* = 2*, μ* = 2*, γ* = 1*. The computations give the optimal values in the equilibrium: p* ≈ 0.412*, te* ≈ 0.380*. The density of the optimal arrival time on the interval* [*te*, *T*] *is presented in Figure 1. In the figure, the function f*(*t*) *first decreases on the interval* [*te*, 0.833]*, then increases on the interval* [0.833, 1.676]*, and then decreases again on the interval* [1.676, 2]*. The value at the equilibrium is C*(*t*) ≈ 1.133*.*

**Figure 1.** The equilibrium density *f*(*t*) and cost *C*(*t*) for *T* = 2, *μ* = 2, *γ* = 1.

**Example 3.** *If T* = 4*, μ* = 2*, γ* = 1*, then p* ≈ 0.232*, te* ≈ 0.369*, and C*(*t*) ≈ 0.857*. The shape of the equilibrium density is similar to that presented in Figure 1.*

**Example 4.** *If T* = 4*, μ* = 4*, γ* = 1*, then the shape of the equilibrium density is similar to that given in Figure 1, and p* ≈ 0.121*, te* ≈ 0.179*, and C*(*t*) ≈ 0.404*.*

**Example 5.** *If T* = 1*, μ* = 2*, γ* = 1*, then p* ≈ 0.661*, te* ≈ 0.381*, and C*(*t*) ≈ 1.485*. The density of the optimal arrival time on the interval* [*te*, *T*] *is presented in Figure 2. In the figure, the function f*(*t*) *decreases on the interval* [0.381, 1]*.*

**Figure 2.** Equilibrium density *f*(*t*) and cost *C*(*t*) for *T* = 1, *μ* = 2, *γ* = 1.

#### **6.** *N* **Players**

The same approach can be used when considering a service system with more than three players. Suppose that in the queueing system an arrival strategy *F*(*t*) of the form (1) is used, and the *N*-th player is looking for a strategy that gives the best response. In this case, the best response found must coincide with the strategy *F*(*t*). Due to symmetry, it will be the equilibrium in the game.

By the state of the system at an instant *t*, we mean the triple (*k*, *s*, *i*), where *k* is the number of claims that have entered the system, *i* is the number of claims in the orbit, and *s* indicates the state of the server: for *s* = 0 the server is free, and for *s* = 1 the server is busy at the instant *t*. We denote by *W<sub>i</sub><sup>j</sup>*(*t*) the average time, starting from the instant *t*, until a claim in orbit occupies the server, provided that *j* customers have not yet arrived at the system and there are *i* customers in the orbit. Our purpose is to find the average sojourn time of the *N*-th customer when they enter the system at the instant *t*. For *t* = 0, we obtain:

$$\begin{aligned} C(0) &= (1-p)^{N-1} \frac{1}{\mu} + \sum\_{i=1}^{N-1} \binom{N-1}{i} p^i (1-p)^{N-1-i} \left( \frac{1}{\mu} + \mathcal{W}\_i^{N-1-i}(0) \right) \\ &= \frac{1}{\mu} + \sum\_{i=1}^{N-1} \binom{N-1}{i} p^i (1-p)^{N-1-i} \mathcal{W}\_i^{N-1-i}(0). \end{aligned}$$

For 0 < *t* < *te*, the average sojourn time of the *N*-th customer satisfies:

$$\begin{aligned} C(t) &= (1-p)^{N-1} \frac{1}{\mu} + \sum\_{i=1}^{N-1} \binom{N-1}{i} p^i (1-p)^{N-1-i} \left( \frac{1}{\mu} + \mathcal{W}\_i^{N-1-i}(t) \right) \\ &= \frac{1}{\mu} + \sum\_{i=1}^{N-1} \binom{N-1}{i} p^i (1-p)^{N-1-i} \mathcal{W}\_i^{N-1-i}(t). \end{aligned}$$

Finally, for *t* ≥ *te*, the average sojourn time of the *N*-th customer equals:

$$\begin{aligned} \mathcal{C}(t) &= P(\text{idle server}) \frac{1}{\mu} + \\ &\sum\_{k=1}^{N-1} \sum\_{i=0}^{k-1} P(\text{busy server}, k \text{ arrived customers}, i \text{ orbital customers}) \left(\mathcal{W}\_i^{N-1-k}(t) + \frac{1}{\mu}\right) .\end{aligned}$$

It gives:

$$C(t) = \sum\_{k=1}^{N-1} \sum\_{i=0}^{k-1} \left( p\_{k,0,i}(t) \frac{1}{\mu} + p\_{k,1,i}(t) \left( \mathcal{W}\_i^{N-1-k}(t) + \frac{1}{\mu} \right) \right),$$

implying, after some algebra, the following expression:

$$C(t) = \frac{1}{\mu} + \sum\_{k=1}^{N-1} \sum\_{i=0}^{k-1} p\_{k,1,i}(t) \mathcal{W}\_i^{N-1-k}(t). \tag{17}$$

In this expression, *p<sub>k,s,i</sub>*(*t*) denotes the probability of the corresponding state at the instant *t*; these probabilities satisfy the following Kolmogorov backward equations:

$$\begin{aligned} p'\_{k,1,i}(t) &= - \quad (\mu + \lambda\_k(t)) p\_{k,1,i}(t) + \lambda\_{k-1}(t) p\_{k-1,1,i-1}(t) \\ &+ \quad \lambda\_{k-1}(t) p\_{k-1,0,i}(t) + \gamma p\_{k,0,i+1}(t), \end{aligned}$$

$$\begin{array}{rcl}p'\_{k,0,i}(t) = & - & (\gamma + \lambda\_k(t))p\_{k,0,i}(t) + \mu p\_{k,1,i}(t);\\k = & 1,2,\ldots,N-2; \quad i = 0,\ldots,k-1.\end{array}$$

We note that in this case,

$$
\lambda\_k(t) = (N - 1 - k) \frac{f(t)}{1 - F(t)}, \quad k = 1, \dots, N - 2,
$$

is the arrival rate, provided there are *k* customers in the system at the instant *t*.
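This conditional rate is the hazard rate of the strategy *F* thinned by the number of players still outside the system. A minimal sketch, with hypothetical `f` and `F` standing in for the equilibrium density and its cumulative distribution function:

```python
import math

def arrival_rate(k, t, f, F, N):
    """lambda_k(t) = (N - 1 - k) * f(t) / (1 - F(t)), for k = 1, ..., N - 2."""
    return (N - 1 - k) * f(t) / (1.0 - F(t))

# Sanity check with an exponential strategy: its hazard f/(1 - F) equals 1
# for all t, so lambda_k(t) = N - 1 - k.
f = lambda t: math.exp(-t)
F = lambda t: 1.0 - math.exp(-t)
rate = arrival_rate(2, 0.7, f, F, N=6)
```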

Then we find the expression for *W<sub>i</sub><sup>j</sup>*(*t*), the average waiting time a customer spends in the orbit, starting from the instant *t*, until they occupy the server. Substituting *W<sub>i</sub><sup>j</sup>*(*t*) into (17), we proceed as above to find the optimal strategy for the *N*-th player. To do this, we require that the following conditions:

$$\mathcal{C}(0) = \mathcal{C}(t\_{\mathfrak{e}}) = \mathcal{C}(t) = \mathcal{C}^\*, \quad t \in [t\_{\mathfrak{e}}, T],$$

are satisfied. Finally, these conditions allow, in general, one to find the optimal strategy *F*(*t*) along the same steps described above for the particular cases of two and three players.

#### **7. Conclusions**

We considered a single-server retrial queueing system in which the service of customers is handled in a strategic manner, i.e., unlike conventional retrial queues, the customers are generated by players who choose the time to occupy the server. An equilibrium is sought in such a system, i.e., a distribution of arrival times under which the average sojourn time is minimal. The equilibrium is sought in the class of mixed strategies, where players choose the entry time randomly. It is shown that under the optimal strategy a player enters the system with a non-zero probability at the initial moment of time, or otherwise waits a random pause and then uses a distribution density satisfying a system of differential equations. This setting is described in detail for the cases of two and three players and illustrated by a few numerical examples. Moreover, the model for an arbitrary number of players is formulated; a detailed study of this scenario is left for future work.

**Author Contributions:** Conceptualization, J.C., V.M. and E.M.; methodology, J.C., V.M. and E.M.; software, J.C., V.M. and E.M.; validation, J.C., V.M. and E.M.; formal analysis, J.C., V.M. and E.M.; investigation, J.C., V.M. and E.M.; resources, J.C., V.M. and E.M.; data curation, J.C., V.M. and E.M.; writing—original draft preparation, J.C., V.M. and E.M; writing—review and editing, J.C., V.M. and E.M.; visualization, J.C., V.M. and E.M.; supervision, J.C., V.M. and E.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Self-Service System with Rating Dependent Arrivals**

**Alexander Dudin <sup>1,2,</sup>\*, Olga Dudina <sup>1</sup>, Sergei Dudin <sup>1</sup> and Yulia Gaidamaka <sup>2</sup>**


**Abstract:** A multi-server infinite-buffer queueing system with additional servers (assistants) providing help to the main servers when they encounter problems is considered as a model of real-world systems with customer self-service. Such systems are widely used in many areas of human activity. The arrival flow is assumed to be a novel essential generalization of the known Markov Arrival Process (*MAP*) to the case of a dynamic dependence of the parameters of the *MAP* on the rating of the system. The rating is a process defined at any moment by the quality of service of previously arrived customers. The possibility of a customer's immediate departure from the system at the entrance and from the buffer due to impatience is taken into account. The system is analyzed via the use of the results for multi-dimensional Markov chains with level-dependent behavior. A transparent stability condition is derived, as well as expressions for the key performance indicators of the system in terms of the stationary probabilities of the Markov chain. Numerical results are provided.

**Keywords:** multi-server queueing model; rating; self-sufficient servers; self-checkout; assistants; multi-dimensional Markov chains

#### **1. Introduction**

Queueing theory is very useful for modeling various real-world systems: contact centers, airports, banks, telecommunication, and retail networks, in particular. The queueing model considered in this paper has two main novel features: (i) the mechanism of customer arrival depends on the current rating of the system and (ii) self-service of customers via so-called self-service devices (*SSD*) or self-checkouts is taken into account. Both these features are inherent in many real systems, e.g., entertainment systems, contact centers, food services, and retail networks. Thus, they have to be carefully taken into account in system design and management aiming to guarantee the effective operation of a system. Effective operation means earning the maximal profit received by the system via customer service and a high degree of customer satisfaction.

The standard queueing models considered in the literature assume that the arrival flow of customers to the system does not depend on the system state. The flow entering the service may depend on this state via the mechanism of customer admission, which depends on the visibility of a queue, as well as its length and (or) the number of busy servers. In this paper, we assume that the arrival process depends not on the queue length but on the rating of the system.

The rating is now a well-known notion that reflects the customers' satisfaction and influences the choice among the competing service systems in which a customer can obtain service. Ratings of various service systems are now very popular and easily available, e.g., on the Internet. Checking ratings/reviews before making consumption decisions has become a ritual for many of today's customers, see [1]. The rating dynamically evaluates the current quality of customer service in the system. The rating of the system

**Citation:** Dudin, A.; Dudina, O.; Dudin, S.; Gaidamaka, Y. Self-Service System with Rating Dependent Arrivals. *Mathematics* **2022**, *10*, 297. https://doi.org/10.3390/ math10030297

Academic Editor: János Sztrik

Received: 11 December 2021 Accepted: 17 January 2022 Published: 19 January 2022


**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

may cause so-called word-of-mouth advertising and has quite a strong influence on the preferences of the customers and their arrival rate to any system. In turn, this has a high impact on the profit earned by this system, and the ratings have to be taken into account in the management of the system operation. Analysis of queueing systems with an arrival process depending on the rating of the system is of high practical importance.

The existing literature on queues with server ratings is not extensive. In [1], a queueing model with two types of customers is considered. Sophisticated customers are well-informed of service-related information and make their joining-or-balking decisions strategically, whereas naive customers do not have such information and rely on online rating information to make such decisions. The problem of the optimal pricing strategy is solved in [1]. A queueing model containing two competing systems, with customers choosing the system to join via a comparison of the individual ratings of the systems, was recently considered with the use of matrix analytic methods in [2].

In modern retail networks, hotels, banks, airports, etc., there is a robust trend toward extending the use of self-service devices (*SSD*) or self-checkouts. *SSD*s are enabled by new technological interfaces (e.g., Quick Response (QR) codes, image and face recognition, radio frequency identification (RFID)) that allow customers to perform services without a service employee's involvement, see, e.g., [3]. The human operator (administrator, assistant, etc.) is involved in the service only upon the request of a customer who asks for help in resolving certain problems that he/she met during a service, or in the case of violation of the established rules by a customer. The use of *SSD*s is becoming widespread nowadays, in particular, in many retail networks worldwide, for many reasons. The main reason is that it is profitable for both the owners of services and the customers.

The owners save money via the non-payment of salaries to service employees and other operational costs. This creates better opportunities for successful competition with the now very popular online shopping, which many customers associate with the safest (from the perspective of health safety) and most convenient way of shopping. Statistics show that one human operator (administrator, assistant, etc.) can easily control 6–10 *SSD*s. The *SSD*s take up less space than regular cash registers, which allows optimizing the store space. With their help, it is possible to unload the cash register area and to increase its throughput. Among other things, *SSD*s encourage customers to make additional purchases. At one time, McDonald's found out during an experiment that visitors spend an average of 30 percent more on purchases when they are not worried that the person behind the cash register will evaluate their choice. The use of *SSD*s may allow the reduction of the actual and perceived waiting time, which is strongly linked with customer satisfaction. In turn, this should imply higher loyalty of customers and future profit for the owner.

The main profit gained by a customer consists of: (i) having a chance to avoid a long wait in the queue until the human server (cashier) becomes available. Waiting is generally regarded as an undesirable activity that customers must undertake to complete the service. Waiting can lead to both emotional (anger, irritation, frustration, boredom, stress) and behavioral (e.g., abandonment or reneging) responses, especially when it is costly and limits the person's ability to engage in more productive or rewarding ways to spend their time; (ii) getting more control over his/her shopping experience; (iii) obtaining a possibility of better distancing from other buyers, which is very important in the current era of the COVID-19 pandemic. With the continuous improvement in technology and the promotion of self-service retail stores in the market, their numbers will increase. Furthermore, the scales of users and transactions will rapidly increase in the future. For more existing literature about the perspectives and attractiveness of the use of *SSD*s, see, e.g., [4–8].

Initially, the spread of the use of *SSD*s was fully justified by the huge investments of the companies promoting the use of these promising new technologies. Now that they have been implemented in practice, it is necessary to effectively manage the operation of each concrete service system. To this end, besides many administrative problems, a whole bunch of purely mathematical problems has to be resolved. One of these problems is a traditional problem in modeling service systems. Namely, given the actual or expected characteristics of the arrival process, the distribution of the service time of one customer, and the required values of the service level indicators, it is necessary to optimally choose the number of required servers (*SSD*s). Such indicators may be, e.g., the probability that the waiting time of an arbitrary customer will not exceed a value fixed in advance, or the probability of customer abandonment.

It is known that, during the use of an *SSD* for purchases, customers can meet problems related, e.g., to the search for the necessary goods on the shelves, the readability of RFIDs, the correct use of the scales, or damage to the goods. To resolve the potentially arising problems, the stores usually have some additional staff of administrators (assistants, helpers, etc.). They provide help to a customer if at least one of the assistants is not busy. Otherwise, the customer should wait until one of the assistants becomes available. Therefore, the problem of the optimal choice of the number of *SSD*s is supplemented by the problem of optimally matching the number of required assistants to the number of *SSD*s. A redundant number of assistants implies higher, unjustified operational costs. An insufficient number of assistants causes long waiting times for help for a customer. That, in turn, implies a longer total service time of this customer, longer waiting times of other customers, a higher probability of abandonment and reneging by an arbitrary customer, and the loss of potential profit that could be earned by the service of customers. In this paper, we solve the problem of computing the values of the performance indicators under any fixed pair of the numbers of *SSD*s and assistants. The problem is formulated and solved within the framework of matrix queueing theory. The usability of this result for the optimal choice of such a pair is numerically illustrated.

Due to the practical importance of the effective use of *SSD*s, there are a lot of papers devoted to this topic. We mention only a few of them that operate with the notion of customer waiting time. Analysis of the waiting time is one of the standard goals in queueing theory. In [6], the usefulness of queueing theory for the analysis of systems with *SSD*s is noted. The question of the relation between the actual waiting time of a customer and the perceived waiting time, as well as their strong link with customer satisfaction, is discussed. Customer satisfaction is strongly associated with the loyalty of the customers, which is very important for service providers. Therefore, analysis of the ways to increase the loyalty of the customers strongly correlates with an analysis of the actual waiting time of a customer. Such a time is one of the key performance indicators of the majority of queueing systems. Thus, queueing analysis is an important part of solving the problem of the optimal design of systems of *SSD*s. However, the analysis of many queueing systems is quite complicated. This explains why the analysis of these systems is often implemented not via the analytical and algorithmic methods of queueing theory but via computer simulation; namely, computer simulation is used for the experimental study of systems of *SSD*s. In [7], the correlation between waiting time, customer experience, and satisfaction was discussed via the use of certain methods of sociology. The content of [8,9] is similar to [7], and the methods of sociology are also used.

In our paper, we provide an analysis of a queueing model with *SSD*s and assistants. The existing literature on queues with service assistants is quite scarce. The recent paper [10] is devoted to the analysis of a set of *SSD*s described in terms of a tandem queueing model with a single-server first phase and a multi-server second phase. All distributions defining the system operation are exponential. The behavior of the system is described by a two-dimensional Markov chain that is a Quasi-Birth-and-Death process. This process is easily analyzed via the tools of the matrix-geometric method by M. Neuts, see [11].

The queueing model of the self-checkout (self-service) system considered in our paper assumes the existence of two multi-server sub-systems. Let us denote the numbers of servers in these two sub-systems as *N* and *M*, respectively. The first sub-system defines the service process of customers by themselves. Any arriving customer that does not abandon the system (due to a queue that is, in his/her opinion, too long) obtains service in this sub-system and successfully departs from the system if he/she does not encounter service problems. If a problem occurs, a server from the second sub-system has to help in resolving this problem if one is available. If all servers of the second sub-system are busy, the customer that encountered the problem suspends their service until any server of the second sub-system becomes available. After the problem is resolved, the server in the first sub-system resumes the work while the corresponding assistant is released. A problem in service at the first sub-system can occur an arbitrary number of times. After service is completed, the customer departs from the system.

As follows from this brief description, our model does not belong to the class of tandem queues because the service of an arbitrary customer does not strictly consist of at most two sequential services at different queueing systems. This service may be a sequence of alternating services by servers from the first and the second sub-systems. Our model is more similar to the unreliable queue with repairmen. The first sub-system describes the service of customers, and the second sub-system describes the behavior of the pool of repairmen. However, the overwhelming majority of the papers devoted to this subject consider only the joint distribution of the number of non-broken servers and the number of busy repairmen. The duty of the servers to provide service and a possible queue of customers are not taken into account; see the survey [12], paper [13], and references therein. In our model, by contrast, the characteristics of the customers' service quality are the focus of the study.

Models similar to our queueing model (however, without the rating consideration) are considered in the following papers. In [14], the model with *N* servers and *M* = 1 assistant (called in [14] the main servers and the consultant) is considered. Arrivals are defined by quite a general Markov Arrival Process (*MAP*); for the definition, properties, and related research, see [15–18]. Other distributions characterizing the system are assumed to be exponential. The system is comprehensively analyzed using the matrix analytic methods. Extensive illustrative numerical examples that bring out the qualitative nature of the model are presented. In [19], the model with *N* = 1 server and *M* = 1 assistant is analyzed. All parameters of the system depend on the state of a finite-state random environment. All involved distributions are assumed to be exponential. The system is analyzed using the matrix analytic methods. In [20], the model with *N* = 1 server and *M* = 1 assistant is analyzed. The input buffer is finite, and the number of opportunities for each server to ask for help is restricted. Service and help times have so-called phase-type (*PH*) distributions, see [11]. The system is analyzed using the matrix analytic methods. The model considered in [21] assumes an arbitrary number of servers and assistants (called, in this paper, specialist servers); help can be provided by an assistant only after the main service. Service cannot be continued after receiving help. Practically, this means the consideration of a tandem queueing model. The arrival process is the *MAP*, and the help times have *PH* distributions. The system is analyzed using the matrix analytic methods. A model somewhat similar to ours, under quite general assumptions about the arrival process (the Batch Markovian Arrival Process) and service times (phase-type distributions), was recently considered in [22]. In that model, if the server does not succeed in finishing the service of a customer within a certain time, then the so-called backup server joins the server for the service of this customer. When both servers serve a customer, the service speed increases. Another difference is that in [22], after obtaining help from the backup server, the server mandatorily finishes the service. In our model, we suggest that the server can obtain help from the assistant many times, and the server and the assistant do not cooperate in service. While help is being obtained, service is not provided.

Two additional features of the model considered in this paper are the following.

• The model suggests a visible queue. This means that an arriving customer can see the number of customers in the queue and can use this information to decide whether to join the queue or abandon (balk); e.g., queueing systems managed by ticket technology are widely used in service industries, as well as government offices, see [23]. Upon arriving at a ticket queue, each customer is issued a numbered ticket. The number currently being served is displayed. An arriving customer balks if the difference between their ticket number and the displayed number exceeds their patience level. Analysis of a queueing model in [23] was implemented via the use of matrix analytic methods. Systems with a visible queue were also previously considered, e.g., in [24,25]. In [24], an empirical study of queue abandonment by the patients in an emergency department of a hospital is implemented by the methods of econometrics. In [25], the queueing/inventory model with a visible queue was analyzed via the use of matrix analytic methods.

• The model suggests the impatience of customers waiting in the buffer. Abandonment or balking means the refusal of a customer to join the queue because its length is unacceptable to him/her. Another possible reason for customer loss is his/her impatience, or reneging. Initially, a customer joins the queue; however, if his/her waiting time exceeds some critical (deterministic or random) level, the customer departs from the system. The phenomenon of impatience is important and has received quite a lot of attention in the literature; for more references, see, e.g., [26–28]. In [27], in particular, the problem of the server wasting time because an arriving customer decides to join a queue and receives a ticket but then departs from the system without canceling the ticket is considered. In [29], the authors analyzed a model in which customers' patience is exponentially distributed and the system's waiting capacity is unlimited. Such a model is both rich and analyzable enough to provide information that is practically important for call center managers. The distinguishing feature of the paper [30] is that the service time distributions in the considered multi-server queueing model are general. A simple and insightful solution is presented for the loss probability. The solution offered in [30] is exact for exponential services and is an excellent heuristic for general service times. In [31], a single-server variant of the model from [30] is studied. In [32], the customer's loss probability in the *M*/*M*/*c* queue with impatient customers is expressed in a simple formula involving the waiting time probabilities in the *M*/*M*/*c* queue with patient customers. In that paper, a probabilistic derivation of this formula is given, and the possible use of this general formula in the *M*/*M*/*c* retrial queue with impatient customers is outlined. In [33–39], retrial models with impatient customers are also considered. The model considered in [40] assumes that the multi-server queue operates under the influence of a random environment, and the impatience rate depends on the current state of the random environment. In [41–43], multi-server queues with *MAP* or marked *MAP* arrival flows and impatient customers, as models of call centers, were analyzed via the matrix analytic methods.

The remainder of this paper is organized as follows. In Section 2, the mathematical model is completely described. In Section 3, the process describing the dynamics of the system is defined as a continuous-time multi-dimensional Markov chain with level-dependent transitions. The generator of this chain is given. Section 4 contains the ergodicity and non-ergodicity conditions for this Markov chain in a transparent form. Section 5 briefly touches on the question of the computation of the stationary distribution of the Markov chain. In Section 6, expressions for the computation of the key performance indicators of the system in terms of the computed stationary probabilities of the system states are presented. Section 7 contains the results of a numerical experiment, the aim of which is to give insights into the shape of the dependencies of the performance indicators on the numbers of servers and assistants. It is worth noting that the numerical realization of the algorithms elaborated in the paper is implemented not for a toy example with a small number of servers and assistants but for a system with realistic numbers of *SSD*s and assistants. Section 8 concludes the paper.

#### **2. Mathematical Model**

We consider a multi-server queueing system with an infinite buffer having two types of servers. The structure of the system is presented in Figure 1.

**Figure 1.** Structure of the system.

The servers of the first type correspond to self-checkouts (self-service devices). The total number of such servers is equal to *N*. The servers of the second type correspond to administrators (assistants). The total number of assistants is equal to *M*. If a customer is accepted for service, he/she first occupies one first-type server for service during an exponentially distributed time with the parameter *μ*<sub>1</sub>. After this time expires, with probability 1 − *p*, the customer successfully finishes service and departs from the system. With the complementary probability, the customer meets a problem with service and requires some kind of help from an administrator. If at this moment there is a free administrator, this administrator provides help to the customer to resolve the problem during an exponentially distributed time with the parameter *μ*<sub>2</sub>. After that, the customer continues his/her service during an exponentially distributed time with the parameter *μ*<sub>1</sub>. The number of moments during service when a customer asks for the help of an administrator (not necessarily the same one as at the previous moments) is unlimited. If the help of an administrator is required while there is no free administrator, the customer suspends their service and waits until any administrator becomes available.
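As a quick sanity check on this service mechanism: the number of exp(*μ*<sub>1</sub>) service segments of one customer is geometric with mean 1/(1 − *p*), and every segment after the first is preceded by one help episode of mean 1/*μ*<sub>2</sub>. Ignoring any waiting for a free administrator, the mean pure service time can be sketched as follows (the function name and the numbers are illustrative, not from the paper):

```python
def mean_service_time(mu1, mu2, p):
    """Mean pure service time: segments/mu1 + helps/mu2, no queueing for help."""
    segments = 1.0 / (1.0 - p)     # expected number of exp(mu1) service segments
    helps = segments - 1.0         # expected number of help episodes, p/(1 - p)
    return segments / mu1 + helps / mu2

# With mu1 = 2, mu2 = 4, p = 0.5: on average 2 segments and 1 help episode.
m = mean_service_time(2.0, 4.0, 0.5)
```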

The process of customers' arrival to the system is a generalization of the Markovian arrival process (*MAP*), see, e.g., [15–18], described below. Arrivals in the classical *MAP* are governed by the underlying Markov chain $\nu_t$, $t \ge 0$, with the state space $\{1, 2, \dots, W\}$. The generator $D$ of this chain is represented in the additive form $D = D_0 + D_1$, where the components of the matrix $D_1$ define the intensities of transitions of the chain that are accompanied by a customer's arrival. The non-diagonal components of the matrix $D_0$ define the intensities of transitions of the chain that are not accompanied by a customer's arrival. The diagonal components are negative. The moduli of these components define the rates of exit from the corresponding state of the Markov chain.

In contrast to the classical *MAP*, we assume that the transition intensities of the underlying process $\nu_t$ additionally depend on the parameter $r$, $r = \overline{1, R}$, which defines the so-called current rating of the system. If the rating of the system is $r$, then the *MAP* is characterized by the square matrices $D_0^{(r)}$ and $D_1^{(r)}$ of size $W$. The average arrival rate of the *MAP* when the rating of the system is $r$ is denoted as $\lambda_r$ and can be found as $\lambda_r = \theta^{(r)} D_1^{(r)} \mathbf{e}$, $r = \overline{1, R}$, where $\theta^{(r)}$ is the invariant vector of the *MAP* defined by the matrices $D_0^{(r)}$ and $D_1^{(r)}$, and $\mathbf{e} = (1, 1, \dots, 1)^T$. We do not specify the concrete form of the matrices $D_0^{(r)}$ and $D_1^{(r)}$. We only suggest that an increase in the rating cannot imply a decrease in the average arrival rate, i.e., we require that the following inequalities hold true:

$$
\lambda_1 \le \lambda_2 \le \cdots \le \lambda_R.
$$
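To make the rates $\lambda_r$ concrete, the invariant vector $\theta^{(r)}$ and the rate $\lambda_r = \theta^{(r)} D_1^{(r)} \mathbf{e}$ can be computed numerically. The following NumPy sketch is ours, not the authors'; the base matrices and the scaling of all intensities by $r$ are purely illustrative choices that satisfy the monotonicity requirement:

```python
import numpy as np

def avg_rate(D0, D1):
    """lambda = theta D1 e, where theta solves theta (D0 + D1) = 0, theta e = 1."""
    n = D0.shape[0]
    A = (D0 + D1).T            # equations (D0 + D1)^T theta^T = 0
    A[-1, :] = 1.0             # replace one equation by the normalization theta e = 1
    b = np.zeros(n)
    b[-1] = 1.0
    theta = np.linalg.solve(A, b)
    return theta @ D1 @ np.ones(n)

# illustrative (hypothetical) base matrices; rating r scales all intensities by r,
# so the invariant vector is unchanged and lambda_r = r * lambda_base
D0 = np.array([[-3.0, 1.0], [0.5, -2.0]])
D1 = np.array([[2.0, 0.0], [0.5, 1.0]])
rates = [avg_rate(r * D0, r * D1) for r in (1, 2, 3)]
```

With this scaling, the required monotonicity $\lambda_1 \le \lambda_2 \le \lambda_3$ holds automatically.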

The rating of the system can be dynamically changed during the system operation. We assume that if a customer is admitted to the system without waiting in the queue, the current rating $r$, $r = \overline{1, R-1}$, immediately increases by one with the fixed small probability $r_+$. For example, if we fix $r_+ = 0.001$, the service of 1000 customers without waiting in the queue leads, on average, to an increase in the rating by one. If a customer abandons the system without service, the current rating $r$, $r = \overline{2, R}$, decreases by one with the probability $r_-$. For example, if we fix $r_- = 0.01$, the loss of 100 customers leads, on average, to a decrease in the current rating by one. At the moment of a change of the rating, the transition intensities of the underlying process $\nu_t$ immediately adjust their values to the new rating.

The queue is assumed to be visible. This means that an arriving customer can observe the number of customers in the queue and leave the system without service if he/she considers this number inappropriate. We assume that if, at an arbitrary customer arrival epoch, there is no free server and the number of customers in the buffer is $i$, the customer permanently leaves the system with the probability $q_i$, $i \ge 0$. With the complementary probability, the customer joins the queue. We suggest that the limit $q = \lim_{i \to \infty} q_i$ exists, $0 < q \le 1$. Additionally, we assume that customers can be impatient and leave the buffer and depart from the system, independently of each other, after an amount of time that is exponentially distributed with the parameter $\alpha$, $\alpha \ge 0$. In the case of $\alpha = 0$, the customers are patient. Now, let us analyze the described queueing model.

#### **3. Process of the System States**

The behavior of the system under study can be described by the regular irreducible continuous-time Markov chain

$$\xi_t = \{i_t, n_t, r_t, \nu_t\}, \ t \ge 0,$$

where, during the epoch *t*,

• $i_t$ is the number of customers in the system, $i_t \ge 0$;

• $n_t$ is the number of blocked servers, $n_t = \overline{0, \min\{i_t, N\}}$;

• $r_t$ is the current rating of the system, $r_t = \overline{1, R}$;

• $\nu_t$ is the state of the underlying process of the arrival flow, $\nu_t = \overline{1, W}$.
Here and further, the notation *n* = 0, *N* means that the parameter *n* admits values from the set {0, . . . , *N*}.

To formally define the continuous-time Markov chain $\xi_t$, it is necessary to write down, for any pair of states $(i, n, r, \nu)$ and $(i', n', r', \nu')$, the intensity of the transitions between these states.

To avoid bulky notation, following the standard methodology of investigation of multi-dimensional Markov chains having one denumerable component, we enumerate the states of the Markov chain $\xi_t = \{i_t, n_t, r_t, \nu_t\}$ in the direct lexicographic order of the components $\{n_t, r_t, \nu_t\}$ and combine the set of the states with the value $i$ of the component $i_t$ into the so-called level $i$, $i \ge 0$.

Let $Q_{i,j}$ be the matrix constituted by the transition intensities from level $i$ to level $j$, and let $Q$ be the block matrix constituted by the blocks $Q_{i,j}$, $i \ge 0$, $j \ge 0$. It is clear that the matrix $Q$ is the infinitesimal generator of the Markov chain $\xi_t$, $t \ge 0$.

**Theorem 1.** *The generator Q of the Markov chain ξt*, *t* ≥ 0, *has the following block three-diagonal structure*

$$Q = \begin{pmatrix} Q_{0,0} & Q_{0,1} & O & O & \dots \\ Q_{1,0} & Q_{1,1} & Q_{1,2} & O & \dots \\ O & Q_{2,1} & Q_{2,2} & Q_{2,3} & \dots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$

*The non-zero blocks are defined as follows:*

$$Q_{0,0} = \hat{D}_0,$$

$$Q_{i,i} = I_{i+1} \otimes \hat{D}_0 + (-\mu_2 C_i - \mu_1 \tilde{C}_i + \mu_2 C_i E_i^- + p\mu_1 \tilde{C}_i E_i^+) \otimes I_{RW}, \ 0 < i < N,$$

$$Q_{i,i} = I_{N+1} \otimes \hat{D}_0 + q_{i-N} I_{N+1} \otimes \hat{D}_1 ((r_- \mathcal{R}^- + (1-r_-) I_R) \otimes I_W) + (-\mu_2 C_N - \mu_1 \tilde{C}_N + \mu_2 C_N E_N^- + p\mu_1 \tilde{C}_N E_N^+) \otimes I_{RW} - (i-N)\alpha I_{(N+1)RW}, \ i \ge N,$$

$$Q_{i,i+1} = \tilde{E}_i \otimes \hat{D}_1 ((r_+ \mathcal{R}^+ + (1-r_+) I_R) \otimes I_W), \ 0 < i < N,$$

$$Q_{i,i+1} = (1 - q_{i-N}) I_{N+1} \otimes \hat{D}_1, \ i \ge N,$$

$$Q_{i,i-1} = \mu_1 (1-p) \tilde{C}_i \hat{E}_i \otimes I_{RW}, \ 0 < i \le N,$$

$$Q_{i,i-1} = (i-N)\alpha I_{N+1} \otimes ((r_- \mathcal{R}^- + (1-r_-) I_R) \otimes I_W) + \mu_1 (1-p) \tilde{C}_N \otimes I_{RW}, \ i > N,$$

*where*

⊗ *denotes the Kronecker product of matrices, see [44];*

$\hat{D}_0 = \text{diag}\{D_0^{(1)}, D_0^{(2)}, \dots, D_0^{(R)}\}$, *where* diag{...} *denotes the diagonal matrix with the diagonal entries listed in the brackets;*

$\hat{D}_1 = \text{diag}\{D_1^{(1)}, D_1^{(2)}, \dots, D_1^{(R)}\}$;

$C_i = \text{diag}\{0, 1, \dots, \min\{i-1, M\}, \min\{i, M\}\}$, $i = \overline{1, N}$; $\tilde{C}_i = \text{diag}\{i, i-1, \dots, 0\}$, $i = \overline{1, N}$;

$I_K$ *is the identity matrix of the size indicated by the suffix (if the size of the matrix is clear from the context, the suffix can be omitted);*

$O_K$ *is a zero matrix of the size indicated by the suffix (if the size of the matrix is clear from the context, the suffix can be omitted);*

$E_i^-$ *is a square matrix of size* $i+1$ *with all zero entries, except the entries* $(E_i^-)_{l,l-1}$, $l = \overline{2, i+1}$, $i = \overline{1, N}$, *which are equal to 1;*

$E_i^+$ *is a square matrix of size* $i+1$ *with all zero entries, except the entries* $(E_i^+)_{l,l+1}$, $l = \overline{1, i}$, $i = \overline{1, N}$, *which are equal to 1;*

$\tilde{E}_i$ *is a matrix of size* $(i+1) \times (i+2)$ *with all zero entries, except the entries* $(\tilde{E}_i)_{l,l}$, $l = \overline{1, i+1}$, $i = \overline{1, N}$, *which are equal to 1;*

$\hat{E}_i$ *is a matrix of size* $(i+1) \times i$ *with all zero entries, except the entries* $(\hat{E}_i)_{l,l}$, $l = \overline{1, i}$, $i = \overline{1, N}$, *which are equal to 1;*

$\mathcal{R}^+$ *is a matrix of size* $R \times R$ *with all zero entries, except the entries* $(\mathcal{R}^+)_{l,l+1}$, $l = \overline{1, R-1}$, *and* $(\mathcal{R}^+)_{R,R}$, *which are equal to 1;*

$\mathcal{R}^-$ *is a matrix of size* $R \times R$ *with all zero entries, except the entries* $(\mathcal{R}^-)_{l,l-1}$, $l = \overline{2, R}$, *and* $(\mathcal{R}^-)_{1,1}$, *which are equal to 1.*
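The auxiliary matrices above are small and highly structured, so the notation can be sanity-checked by constructing them directly. The following NumPy sketch is ours (the function names and the sizes used in the checks are illustrative, not from the paper):

```python
import numpy as np

def C(i, M):
    # C_i = diag{min{n, M}}, n = 0..i: number of assistants busy when n servers are blocked
    return np.diag([min(n, M) for n in range(i + 1)])

def C_tilde(i):
    # C~_i = diag{i, i-1, ..., 0}: number of servers actually providing service
    return np.diag(list(range(i, -1, -1)))

def E_minus(i):
    # (i+1) x (i+1), ones on the first sub-diagonal: a blocked server is unblocked
    return np.eye(i + 1, i + 1, k=-1)

def E_plus(i):
    # (i+1) x (i+1), ones on the first super-diagonal: one more server becomes blocked
    return np.eye(i + 1, i + 1, k=1)

def E_tilde(i):
    # (i+1) x (i+2), ones on the main diagonal: transition from level i to level i+1
    return np.eye(i + 1, i + 2)

def E_hat(i):
    # (i+1) x i, ones on the main diagonal: transition from level i to level i-1
    return np.eye(i + 1, i)

def R_plus(R):
    # rating goes up: r -> r+1, with the boundary state r = R kept in place
    m = np.eye(R, R, k=1)
    m[R - 1, R - 1] = 1
    return m

def R_minus(R):
    # rating goes down: r -> r-1, with the boundary state r = 1 kept in place
    m = np.eye(R, R, k=-1)
    m[0, 0] = 1
    return m
```

Note that both $\mathcal{R}^+$ and $\mathcal{R}^-$ are stochastic matrices, as is required for them to redistribute the rating.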

**Proof.** The proof of Theorem 1 is implemented via careful analysis of all possible transitions of the Markov chain *ξt*, *t* ≥ 0, and further combining the intensities of these transitions into the blocks of the generator.

The generator *Q* has all negative diagonal entries and non-negative non-diagonal entries. The diagonal entries of the generator *Q* define, up to the sign, the total intensity of leaving the corresponding state of the Markov chain $\xi_t$, $t \ge 0$. In the case $i = 0$, the Markov chain can leave the current state only if the underlying process of the arrival process makes a transition. The intensities of such transitions are defined as the moduli of the diagonal entries of the matrix $\hat{D}_0$.

In the case when the system is not idle but the buffer is empty, the Markov chain $\xi_t$, $t \ge 0$, can also change its state due to the completion of service by a server or the completion of help provided by an assistant. The intensities of such transitions are defined as the diagonal entries of the matrix $(\mu_2 C_i + \mu_1 \tilde{C}_i) \otimes I_{RW}$, $i = \overline{1, N}$.

If the number of customers in the buffer is greater than zero, the Markov chain $\xi_t$, $t \ge 0$, can also change its state when a customer departs from the buffer due to impatience. The intensities of such transitions are defined as the diagonal entries of the matrix $(i-N)\alpha I_{(N+1)RW}$, $i > N$.

The non-diagonal entries of the matrices *Qi*,*i*, *i* ≥ 0, define the intensities of transitions that do not lead to the change of the number of customers in the system *i*. Such transitions are the following:

(1) The underlying process of the arrival process makes a transition that does not imply the acceptance of a new customer to the system (the customer is not generated or is lost). The intensities of such transitions are given as the non-diagonal entries of the matrix $I_{\min\{N,i\}+1} \otimes \hat{D}_0$ and the entries of the matrix $q_{i-N} I_{N+1} \otimes \hat{D}_1((r_- \mathcal{R}^- + (1-r_-)I_R) \otimes I_W)$ for $i \ge N$. Note that here the matrix $(r_- \mathcal{R}^- + (1-r_-)I_R)$ defines the possible change of the system rating due to the loss of a customer;

(2) The number of blocked servers is increased by one. The intensities of such transitions are given as the entries of the matrix $(p\mu_1 \tilde{C}_{\min\{N,i\}} E^+_{\min\{N,i\}}) \otimes I_{RW}$;

(3) The number of blocked servers is decreased by one. The intensities of such transitions are given as the entries of the matrix $(\mu_2 C_{\min\{N,i\}} E^-_{\min\{N,i\}}) \otimes I_{RW}$.

The entries of the matrices $Q_{i,i+1}$, $i \ge 0$, define the intensities of transitions that lead to an increase in the number of customers $i$ in the system by one. This can happen only in the case when the underlying arrival process makes a transition with the generation of a customer, and this customer is admitted to the system. The intensities of such transitions are given as the entries of the matrices $\tilde{E}_i \otimes \hat{D}_1((r_+ \mathcal{R}^+ + (1-r_+)I_R) \otimes I_W)$ in the case $i = \overline{1, N-1}$, and the entries of the matrices $(1-q_{i-N})I_{N+1} \otimes \hat{D}_1$ if $i \ge N$.

The entries of the matrices $Q_{i,i-1}$, $i \ge 1$, define the intensities of transitions that lead to a decrease in the number of customers $i$ in the system by one. This can happen if a customer leaves the system successfully serviced (the intensities of such transitions are given as the entries of the matrices $\mu_1(1-p)\tilde{C}_{\min\{N,i\}}\hat{E}_{\min\{N,i\}} \otimes I_{RW}$) and if a customer leaves the non-empty buffer due to impatience (the intensities of such transitions are given as the entries of the matrices $(i-N)\alpha I_{N+1} \otimes ((r_- \mathcal{R}^- + (1-r_-)I_R) \otimes I_W)$).

The blocks *Qi*,*j*, *i*, *j* ≥ 0, |*i* − *j*| > 1, of the generator *Q* are zero matrices because the customers arrive and leave the system only one-by-one.

#### **4. Ergodicity Condition**

An important step of the analysis of any Markov chain with an infinite state space is establishing the conditions for the ergodicity and non-ergodicity of this Markov chain (the stability condition). The ergodicity and non-ergodicity conditions for the Markov chain under study $\xi_t$ are given by the following theorem.

**Theorem 2.** *(a) If the impatience rate α is positive, the Markov chain ξ<sup>t</sup> is ergodic under all finite values of other parameters of the considered queueing system;*

*(b) If the limiting probability q is equal to 1, the Markov chain ξ<sup>t</sup> is ergodic under all finite values of other parameters of the considered queueing system;*

*(c) If the customers staying in the buffer are patient, i.e., α* = 0*, the Markov chain ξ<sup>t</sup> is ergodic if the following inequality is fulfilled:*

$$
\lambda_1 (1 - q) < \sum_{n=0}^{N} \gamma_n (N - n) \mu_1, \tag{1}
$$

*where* $\gamma_n$ *are the probabilities that, at an arbitrary moment when the system is overloaded, the number of assistants providing help to the servers is equal to* $n$, $n = \overline{0, N}$.

*These probabilities are computed by the formula:*

$$\gamma_n = \left( 1 + \sum_{j=1}^{N} \prod_{l=1}^{j} \frac{p(N-l+1)\mu_1}{l\mu_2} \right)^{-1} \prod_{l=1}^{n} \frac{p(N-l+1)\mu_1}{l\mu_2}, \ n = \overline{0, N}; \tag{2}$$

*(d) If the customers staying in the buffer are patient, i.e., α* = 0*, the Markov chain ξ<sup>t</sup> is non-ergodic if the following inequality is fulfilled:*

$$
\lambda_1 (1 - q) > \sum_{n=0}^{N} \gamma_n (N - n) \mu_1. \tag{3}
$$

**Proof.** Let the generator of a Markov chain be an upper-Hessenbergian matrix, i.e., let it have zero blocks below the first sub-diagonal and other blocks $Q_{i,i+k-1}$, $k \ge 0$. Let the matrix $T_i$ be the diagonal matrix whose diagonal entries coincide with the moduli of the diagonal entries of the matrix $Q_{i,i}$, $i \ge 0$.

If the following limits exist:

$$Y^{(k)} = \lim_{i \to \infty} T_i^{-1} Q_{i,i+k-1}, \; k = 0, 2, 3, \dots, \quad Y^{(1)} = \lim_{i \to \infty} T_i^{-1} Q_{i,i} + I,$$

and the matrix $\sum_{k=0}^{\infty} Y^{(k)}$ is stochastic, then the Markov chain belongs to the class of Asymptotically Quasi-Toeplitz Markov chains (*AQTMC*), see [45].

The generator *Q* of the Markov chain describing the queueing system under study, defined by Theorem 1, has a particular case of the upper-Hessenbergian structure, namely, the block three-diagonal structure.

If $\alpha > 0$, then it can be verified that for this Markov chain the matrices $Y^{(k)}$, $k = 0, 1, 2$, exist and are defined by $Y^{(0)} = I$, $Y^{(k)} = O$, $k = 1, 2$.

If $\alpha = 0$, then it can be verified that the matrices $Y^{(k)}$ exist and are defined by

$$Y^{(0)} = T^{-1} \tilde{Q}^-, \quad Y^{(1)} = T^{-1} \tilde{Q}^0 + I, \quad Y^{(2)} = T^{-1} \tilde{Q}^+,$$

where

$$
\tilde{Q}^- = \mu_1 (1 - p) \tilde{C}_N \otimes I_{RW},
$$

$$
\tilde{Q}^0 = I_{N+1} \otimes \hat{D}_0 + q I_{N+1} \otimes \hat{D}_1 ((r_- \mathcal{R}^- + (1 - r_-) I_R) \otimes I_W) + (-\mu_2 C_N - \mu_1 \tilde{C}_N + \mu_2 C_N E_N^- + p \mu_1 \tilde{C}_N E_N^+) \otimes I_{RW},
$$

$$
\tilde{Q}^+ = (1 - q) I_{N+1} \otimes \hat{D}_1,
$$

and $T$ is the diagonal matrix whose diagonal entries coincide with the moduli of the diagonal entries of the matrix $\tilde{Q}^0$.

Therefore, in both cases, *α* > 0 and *α* = 0, the limits *Y*(*k*), *k* = 0, 1, 2, exist and it is easy to check that their sum is the stochastic matrix. This implies that the considered Markov chain belongs to the class of *AQTMC*. The sufficient condition for the ergodicity of *AQTMC*, see [45], rewritten for the three-block diagonal generator is the fulfillment of the inequality:

$$
\psi Y^{(0)} \mathbf{e} > \psi Y^{(2)} \mathbf{e}, \tag{4}
$$

where the vector $\psi$ is the unique solution of the system of equations

$$
\psi(Y^{(0)} + Y^{(1)} + Y^{(2)}) = \psi, \ \psi \mathbf{e} = 1. \tag{5}
$$

The sufficient condition for the non-ergodicity is the fulfillment of the inequality

$$
\psi Y^{(0)} \mathbf{e} < \psi Y^{(2)} \mathbf{e}.
$$

Because for $\alpha > 0$ we have $Y^{(0)} = I$, $Y^{(k)} = O$, $k = 1, 2$, inequality (4) takes the trivial form 1 > 0. Thus, the chain is ergodic for all finite values of the other parameters of the considered queueing system. Statement (a) of the theorem is proven.

Consider now the case $\alpha = 0$. It can be verified that in this case system (5) and inequality (4) are equivalent to the system

$$
\mathbf{x}(\tilde{Q}^- + \tilde{Q}^0 + \tilde{Q}^+) = \mathbf{0}, \ \mathbf{x}\mathbf{e} = 1, \tag{6}
$$

and the inequality

$$
\mathbf{x}\tilde{Q}^- \mathbf{e} > \mathbf{x}\tilde{Q}^+ \mathbf{e}. \tag{7}
$$

It can be verified that

$$\tilde{Q}^- + \tilde{Q}^0 + \tilde{Q}^+ = I_{N+1} \otimes \left( (1 - q)\hat{D}_1 + \hat{D}_0 + q\hat{D}_1((r_- \mathcal{R}^- + (1 - r_-)I_R) \otimes I_W) \right) + H,$$

where

$$H = \begin{pmatrix} -pN\mu\_1 & pN\mu\_1 & O & \dots & O & O\\ \mu\_2 & -\mu\_2 - p(N-1)\mu\_1 & p(N-1)\mu\_1 & \dots & O & O\\ O & 2\mu\_2 & -2\mu\_2 - p(N-2)\mu\_1 & \dots & O & O\\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots\\ O & O & O & \dots & N\mu\_2 & -N\mu\_2 \end{pmatrix} \otimes I\_{\text{RW}}.$$
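The matrix $H$ (without the factor $\otimes I_{RW}$, which does not affect the properties used below) is the generator of a finite birth–death process for the number of blocked servers, and the vector $(\gamma_0, \dots, \gamma_N)$ of Formula (2) is its stationary vector. This can be checked numerically; the following NumPy sketch is ours, with illustrative parameter values:

```python
import numpy as np

def H_small(N, p, mu1, mu2):
    """The (N+1)x(N+1) factor of H: birth-death generator of the number of blocked servers."""
    H = np.zeros((N + 1, N + 1))
    for n in range(N + 1):
        up = p * (N - n) * mu1   # one more of the N - n serving servers becomes blocked
        down = n * mu2           # an assistant completes help for one of n blocked servers
        if n < N:
            H[n, n + 1] = up
        if n > 0:
            H[n, n - 1] = down
        H[n, n] = -(up + down)
    return H

# illustrative (hypothetical) parameters
N, p, mu1, mu2 = 5, 0.3, 1.0, 2.0
H = H_small(N, p, mu1, mu2)

# gamma_n from Formula (2): normalized products of the birth/death ratios
w = np.ones(N + 1)
for n in range(1, N + 1):
    w[n] = w[n - 1] * p * (N - n + 1) * mu1 / (n * mu2)
gamma = w / w.sum()
```

One can verify that each row of `H` sums to zero (it is a generator) and that `gamma @ H` vanishes, i.e., $\gamma$ solves $\Delta_2 H = \mathbf{0}$.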

Using the so-called mixed product rule for the Kronecker product of matrices, see [44], it is possible to show that a solution of (6) has the representation

$$
\mathbf{x} = \mathbf{\Delta}_1 \otimes \mathbf{\Delta}_2, \tag{8}
$$

where the row vector **Δ**<sup>1</sup> is a solution to the system

$$\mathbf{\Delta}_1 \left[ I_{N+1} \otimes \left( (1-q)\hat{D}_1 + \hat{D}_0 + q\hat{D}_1((r_- \mathcal{R}^- + (1-r_-)I_R) \otimes I_W) \right) \right] = \mathbf{0}, \quad \mathbf{\Delta}_1 \mathbf{e} = 1, \tag{9}$$

and the row vector **Δ**<sup>2</sup> is a solution to the system

$$
\Delta\_2 H = \mathbf{0}, \ \Delta\_2 \mathbf{e} = 1. \tag{10}
$$

By the direct substitution into (9), it is possible to check that vector **Δ**<sup>1</sup> defined by

$$\Delta\_1 = (\theta^{(1)}, \mathbf{0}, \dots, \mathbf{0})\tag{11}$$

is the solution of equation (9).

By the direct substitution into (10), it is possible to check that vector **Δ**<sup>2</sup> is defined by

$$\mathbf{\Delta}_2 = (\gamma_0, \dots, \gamma_N), \tag{12}$$

where the probabilities $\gamma_n$, $n = \overline{0, N}$, are given by Formula (2). Taking into account (8), (11), and (12) in (7), we obtain Formula (1).

It is clear that if *q* = 1, inequality (1) always holds true. This proves statement (b) of the theorem. Statement (c) is also proven. The proof of statement (d) is made analogously to the proof of statement (c).
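For a quick feel of the stability condition, the probabilities $\gamma_n$ of Formula (2) and both sides of inequality (1) can be evaluated directly. The following pure-Python sketch is ours; the parameter values in the example call are illustrative, not taken from the paper:

```python
from math import prod

def gamma(N, p, mu1, mu2):
    """Probabilities gamma_n, n = 0..N, of Formula (2)."""
    w = [prod(p * (N - l + 1) * mu1 / (l * mu2) for l in range(1, n + 1))
         for n in range(N + 1)]  # empty product for n = 0 equals 1
    s = sum(w)                   # the normalizing constant of Formula (2)
    return [x / s for x in w]

def ergodic_patient(lambda1, q, N, p, mu1, mu2):
    """Inequality (1): ergodicity criterion for patient customers (alpha = 0)."""
    g = gamma(N, p, mu1, mu2)
    service_rate = sum(g[n] * (N - n) * mu1 for n in range(N + 1))
    return lambda1 * (1 - q) < service_rate

# illustrative parameters
g = gamma(5, 0.3, 1.0, 2.0)
```

The probabilities $\gamma_n$ satisfy the birth–death balance relation $\gamma_n \, n\mu_2 = \gamma_{n-1} \, p(N-n+1)\mu_1$, which is a convenient consistency check.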

**Remark 1.** *Inequality (1) is intuitively transparent. Usually, the ergodicity condition is equivalent to the requirement that, in the situation when the system is very overloaded, the rate of customers' admission is less than the rate of customers' service. The left-hand side of (1) is the rate of customers' admission to the system. As is seen from (1), this rate here depends only on the arrival rate* $\lambda_1$ *and does not depend on the rates* $\lambda_r$, $r = \overline{2, R}$. *This stems from the fact that the rating of the system, when it is very overloaded, is equal to 1 due to the abandonment (balking) of many customers. The right-hand side of (1) is the rate of customers' departure from the service when the system is overloaded. The values* $\gamma_n$, $n = \overline{0, N}$, *are the probabilities that n assistants provide help to the servers and these servers do not serve customers. Correspondingly, the number of servers providing service with the rate* $\mu_1$ *is equal to* $N - n$. *It follows from the formula of total probability that the right-hand side of (1) indeed is the rate of customers' departure from the service when the system is overloaded.*

#### **5. Computation of the Stationary Distribution of the Markov Chain**

If the ergodicity condition is fulfilled, the stationary probabilities of the Markov chain *ξt*, *t* ≥ 0,

$$\pi(i, n, r, \nu) = \lim\_{t \to \infty} P\{i\_t = i, n\_t = n, r\_t = r, \nu\_t = \nu\},$$

$$i \ge 0, \ n = \overline{0, \min\{i, N\}}, \ r = \overline{1, R}, \ \nu = \overline{1, W};$$

exist.

We form the row vectors *πi*, *i* ≥ 0, of these stationary probabilities enumerated in the lexicographic order of the components (*n*,*r*, *ν*), *n* = 0, min{*i*, *N*}, *r* = 1, *R*, *ν* = 1, *W*.

It is a well-known fact that these stationary probabilities can be found as the solution of the following system:

$$(\pi_0, \pi_1, \dots, \pi_N, \dots) Q = \mathbf{0},$$

$$(\pi_0, \pi_1, \dots, \pi_N, \dots) \mathbf{e} = 1.$$

Because the Markov chain *ξt*, *t* ≥ 0, is a level-dependent quasi-birth-and-death process having one countable component, this system cannot be solved using the standard matrix analytic methods. To solve this system, we recommend using the algorithms from papers [46,47].
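As a minimal illustration of how such equilibrium equations are solved once the state space is made finite (e.g., by truncation at a sufficiently large level), one equation can be replaced by the normalization condition and the resulting linear system solved directly. This NumPy sketch is ours and is not the specialized algorithm of [46,47]; the two-state generator in the example is a hypothetical toy:

```python
import numpy as np

def stationary(Q):
    """Solve pi Q = 0, pi e = 1 for a finite generator matrix Q."""
    n = Q.shape[0]
    A = Q.copy().T                 # transpose: equations Q^T pi^T = 0
    A[-1, :] = 1.0                 # replace one balance equation by normalization
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# toy two-state generator (hypothetical); its stationary vector is (2/3, 1/3)
Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
pi = stationary(Q)
```

For the level-dependent chain of this paper, plain truncation can be slow or inaccurate, which is exactly why the algorithms of [46,47] are recommended.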

#### **6. Performance Indicators**

The probability $p_r$, $r = \overline{1, R}$, that the value of the rating of the system at an arbitrary epoch is equal to $r$ can be found as

$$p_r = \sum_{i=0}^{\infty} \sum_{n=0}^{\min\{i, N\}} \pi(i, n, r) \mathbf{e},$$

where $\pi(i, n, r)$ denotes the row vector of the probabilities $\pi(i, n, r, \nu)$, $\nu = \overline{1, W}$; similarly, $\pi(i, n)$ below denotes the row vector of the probabilities $\pi(i, n, r, \nu)$, $r = \overline{1, R}$, $\nu = \overline{1, W}$.

The average rating $\bar{R}$ of the system can be found as

$$\bar{R} = \sum_{r=1}^{R} r p_r.$$

The average customer's arrival rate *λ* can be found as

$$
\lambda = \sum\_{r=1}^{R} p\_r \lambda\_r.
$$

The average output rate *λout* of successfully serviced customers can be found as

$$\lambda_{out} = \sum_{i=1}^{\infty} \sum_{n=0}^{\min\{i, N\}-1} (\min\{i, N\} - n)(1 - p)\mu_1 \, \pi(i, n) \mathbf{e}.$$

The average number *Ncust* of customers in the system is calculated as

$$N_{cust} = \sum_{i=1}^{\infty} i \, \pi_i \mathbf{e}.$$

The average number *Nbuf* of customers in the buffer is calculated as

$$N_{buf} = \sum_{i=N+1}^{\infty} (i - N) \, \pi_i \mathbf{e}.$$

The average number *Nserv*−<sup>1</sup> of occupied servers is calculated as

$$N_{serv-1} = \sum_{i=1}^{\infty} \min\{i, N\} \, \pi_i \mathbf{e}.$$

The average number *Nblocked* of blocked servers is calculated as

$$N_{blocked} = \sum_{i=1}^{\infty} \sum_{n=1}^{\min\{i, N\}} n \, \pi(i, n) \mathbf{e}.$$

The average number *Nblocked*−<sup>1</sup> of blocked servers that are currently obtaining the help of an assistant is calculated as

$$N_{blocked-1} = \sum_{i=1}^{\infty} \sum_{n=1}^{\min\{i, N\}} \min\{n, M\} \, \pi(i, n) \mathbf{e}.$$

The average number *Nblocked*−<sup>2</sup> of blocked servers that are waiting until an assistant becomes available is calculated as

$$N_{blocked-2} = \sum_{i=M+1}^{\infty} \sum_{n=M+1}^{\min\{i,N\}} (n-M) \, \pi(i,n) \mathbf{e} = N_{blocked} - N_{blocked-1}.$$

The average number *Nserv*−<sup>2</sup> of busy assistants is equal to *Nblocked*−1.

The loss probability *Pent* of an arbitrary customer at the entrance to the system due to the unwillingness to wait in a long queue can be found as

$$P_{ent} = \frac{1}{\lambda} \sum_{i=N}^{\infty} \sum_{n=0}^{N} \sum_{r=1}^{R} q_{i-N} \, \pi(i, n, r) D_1^{(r)} \mathbf{e}.$$

The loss probability *Pimp* of an arbitrary customer due to impatience can be found as

$$P\_{imp} = \frac{\alpha N\_{buf}}{\lambda}.$$

The loss probability *Ploss* of an arbitrary customer can be found as

$$P_{loss} = 1 - \frac{\lambda_{out}}{\lambda} = P_{ent} + P_{imp}.$$
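Once the vectors $\pi_i$ are available, the indicators above reduce to direct sums. As a small sanity check of this kind of computation, the following sketch (ours) evaluates $N_{cust}$ and $N_{buf}$ on the geometric stationary distribution of a single-server M/M/1 queue, used here only as a hypothetical stand-in for the computed probabilities:

```python
# truncated geometric stationary distribution of an M/M/1 queue, rho = 0.5 (illustrative);
# the truncation level L is chosen so that the neglected tail is negligible
rho, L = 0.5, 200
pi = [(1 - rho) * rho ** i for i in range(L + 1)]

N = 1  # one server in this stand-in
N_cust = sum(i * pi[i] for i in range(L + 1))               # average number in system
N_buf = sum((i - N) * pi[i] for i in range(N + 1, L + 1))   # average number in buffer
```

For $\rho = 0.5$ the closed forms $N_{cust} = \rho/(1-\rho) = 1$ and $N_{buf} = \rho^2/(1-\rho) = 0.5$ are reproduced up to the truncation error.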

#### **7. Numerical Example**

The purpose of this example is to illustrate the dependencies of the main performance measures of the system on the number *N* of servers and the number *M* of assistants. Let us assume that the system can have up to 50 servers and up to 10 assistants. Thus, below we vary the parameter *N* from 1 to 50 and the parameter *M* from 1 to 10, with step 1.

We assume the following values of the parameters of the system:


• The probabilities $q_i = q_{i,N}$ that a customer will leave the system having *N* servers at the arrival epoch when the number of customers in the buffer is $i$ are defined as:

$$q\_{i,N} = \begin{cases} \frac{i}{i + 100N} & \text{if } 0 \le i \le N;\\ \frac{i}{i + 400N} & \text{if } N < i \le \max\{10, 2N\};\\ \frac{i}{i + 100N} & \text{if } \max\{10, 2N\} < i \le \max\{20, 5N\};\\ \frac{i}{i + N} & \text{if } \max\{20, 5N\} < i \le \max\{100, 10N\};\\ \frac{i}{i + 0.1N} & \text{otherwise.} \end{cases}$$

• The *MAP* arrival flow of customers when the system has the rating $r$ is defined by the matrices $D_0^{(r)} = rD_0^{base}$ and $D_1^{(r)} = rD_1^{base}$, where the matrices $D_0^{base}$ and $D_1^{base}$ are defined by

$$D_0^{base} = \begin{pmatrix} -2.5 & 0.02\\ 0.001 & -0.8 \end{pmatrix}, \quad D_1^{base} = \begin{pmatrix} 2.46 & 0.02\\ 0.001 & 0.798 \end{pmatrix}.$$

The base arrival flow has the average intensity *λ* = 0.879048, the coefficient of correlation of successive inter-arrival times *ccor* = 0.0557495, and the squared coefficient of variation of inter-arrival times *cvar* = 1.12815.
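These characteristics can be reproduced from the base matrices using the standard moment formulas for a *MAP* (stationary phase vector $\varphi = \theta D_1/\lambda$ after an arrival, inter-arrival moments $E[T^k] = k!\,\varphi(-D_0)^{-k}\mathbf{e}$). The NumPy sketch below is ours; by this computation, the reported value 1.12815 corresponds to the squared coefficient of variation:

```python
import numpy as np

D0 = np.array([[-2.5, 0.02], [0.001, -0.8]])
D1 = np.array([[2.46, 0.02], [0.001, 0.798]])

# stationary vector theta of the generator D0 + D1: theta (D0 + D1) = 0, theta e = 1
A = (D0 + D1).T
A[-1, :] = 1.0
theta = np.linalg.solve(A, np.array([0.0, 1.0]))

lam = theta @ D1 @ np.ones(2)              # average arrival intensity

phi = theta @ D1 / lam                     # phase distribution right after an arrival
Minv = np.linalg.inv(-D0)                  # (-D0)^{-1}

m1 = phi @ Minv @ np.ones(2)               # E[T], equals 1/lam
m2 = 2 * phi @ Minv @ Minv @ np.ones(2)    # E[T^2]
cvar2 = m2 / m1 ** 2 - 1                   # squared coefficient of variation

# correlation of two successive inter-arrival times
m11 = phi @ Minv @ Minv @ D1 @ Minv @ np.ones(2)   # E[T_k T_{k+1}]
ccor = (m11 - m1 ** 2) / (m2 - m1 ** 2)
```

The computed values agree with those stated in the text up to rounding.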

The dependence of the average rating of the system *R*¯ on the parameters *N* and *M* is presented in Figure 2.

**Figure 2.** Dependence of the average rating of the system *R*¯ on *N* and *M*.

As is seen in Figure 2, the average rating increases with the increase in the number of servers and assistants. Here, we do not consider the situations when *M* ≥ *N*. It is evident that in such a situation, adding a new assistant does not improve the system performance, and the rating is not changed. In other cases, adding new servers and (or) assistants leads to a better quality of customer service. As a result, the average rating increases.

Figure 3 illustrates the dependence of the average customer's arrival rate *λ* on the parameters *N* and *M*.

The average arrival rate of customers also increases with the increase in the number of servers and (or) assistants. This is explained by the evident fact that a higher rating of the system leads to a higher arrival rate.

The dependence of the average number *Nbuf* of customers in the buffer on the parameters *N* and *M* is presented in Figure 4.

The dependence of the average number *Nbuf* of customers in the buffer on the parameters *N* and *M* is complicated and hardly predictable intuitively. On the one hand, as for a classical system, the increase in the number of servers leads to a decrease in the waiting time. However, for the considered system, the increase in the number of servers leads to an increase in the arrival rate, which may imply an increase in the waiting time. As one can see from the figure, sometimes the first factor prevails over the second one, and the average number *Nbuf* of customers in the buffer decreases with the growth of *N* and *M*; sometimes the situation changes oppositely.

**Figure 3.** Dependence of the average customer's arrival rate *λ* on *N* and *M*.

**Figure 4.** Dependence of the average number *Nbuf* of customers in the buffer on *N* and *M*.

Figures 5–7 illustrate the dependencies of the average number *Nserv*−<sup>1</sup> of occupied servers, the average number *Nblocked*−<sup>1</sup> of blocked servers that are receiving the help of an assistant, and the average number *Nblocked*−<sup>2</sup> of blocked servers that are waiting until an assistant becomes available on the parameters *N* and *M*.

**Figure 5.** Dependence of the average number *Nserv*−<sup>1</sup> of occupied type-1 servers on *N* and *M*.

**Figure 6.** Dependence of the average number *Nblocked*−<sup>1</sup> on *N* and *M*.

**Figure 7.** Dependence of the average number *Nblocked*−<sup>2</sup> on *N* and *M*.

The average number *Nserv*−<sup>1</sup> of occupied servers essentially increases with the increase in the total number of servers *N*. When the number of assistants *M* increases, the average number of blocked servers decreases if the average arrival rate is constant. Thus, the increase in the number of assistants *M* may lead to the decrease in the number of busy servers. However, with an increasing value of *M*, the rating and the intensity of the arrival flow can also increase, which may cause an increase in the number of busy servers. Therefore, the dependence of the number *Nserv*−<sup>1</sup> on the parameter *M* is hardly predictable.


The average number *Nblocked*−<sup>1</sup> of blocked servers currently receiving the help of assistants increases with the growth of *N* and *M*. This growth is mainly caused by the increase in the average arrival rate. In the considered example, the average number *Nblocked*−<sup>2</sup> of blocked servers that are waiting until an assistant will become available grows when the number of servers *N* grows and decreases with the increasing value of *M*.

The dependence of the loss probability *Pent* of an arbitrary customer at the entrance to the system due to the unwillingness to wait in a long queue on the parameters *N* and *M* is presented in Figure 8.

As is seen from Figure 8, the loss probability of an arbitrary customer at the entrance to the system decreases when the number *N* of servers grows. The dependence of *Pent* on *M* is less essential, and *Pent* may decrease or increase with the growth of *M*.

The dependence of the loss probability *Pimp* of an arbitrary customer due to impatience on the parameters *N* and *M* is presented in Figure 9.

The behavior of the dependence of the loss probability *Pimp* of an arbitrary customer due to impatience on the parameters *N* and *M* is also complicated. Note that the loss probability *Pimp* essentially depends on the average number of customers in the buffer (see Figure 4) and the average arrival rate *λ* (see Figure 3).

**Figure 8.** Dependence of the loss probability *Pent* on *N* and *M*.

**Figure 9.** Dependence of the loss probability *Pimp* on *N* and *M*.

The dependence of the loss probability *Ploss* of an arbitrary customer on the parameters *N* and *M* is presented in Figure 10.

**Figure 10.** Dependence of the loss probability *Ploss* on *N* and *M*.

The loss probability *Ploss* is the sum of loss probabilities *Pent* and *Pimp*. Because in the considered case, the main losses occur due to customers' impatience, the behavior of *Ploss* is similar to the behavior of the probability *Pimp*.

We considered the dependencies of the main performance measures on the number of servers *N* and the number of assistants *M* and can conclude that these dependencies can be quite hard to predict. Therefore, mathematical modeling with the use of the results presented in this paper is required for the exact estimation of the system performance characteristics. It is worth noting that, from the point of view of potential practical applications, the more important problem is to define the optimal number *N* of servers and the optimal number *M* of assistants. To find these optimal values, first of all, it is necessary to choose an appropriate cost criterion. In this paper, we assume that the quality of the system operation is characterized by the following economic criterion:

$$E = E(M, N) = a\_1 \lambda\_{out} - b\_1 \lambda P\_{ent} - b\_2 \lambda P\_{imp} - d\_1 N - d\_2 M.$$

Here, the cost coefficients *a*1, *b*1, *b*2, *d*1, and *d*2 have the following meaning:

*a*1 is the profit obtained by the system for the successful service of one customer;

*b*1 and *b*2 are the charges paid by the system for the loss of a customer at the entrance of the system and due to impatience, correspondingly;

*d*1 and *d*2 are the charges paid by the system for maintaining one server and one assistant per unit time, correspondingly.

Thus, the economic criterion *E* has the meaning of the average profit obtained by the system per unit of time.

In this numerical example, we fix the following cost coefficients:

$a\_1 = 1$, $b\_1 = 2$, $b\_2 = 3$, $d\_1 = 0.05$, $d\_2 = 0.1$.

We aim to maximize the average profit of the system.
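Since the criterion is a simple linear combination of the performance measures, it is easy to evaluate once those measures are known. The sketch below is our illustration, not the authors' code; the values of the performance measures passed in are hypothetical placeholders, whereas in the paper they are computed from the steady-state distribution of the Markov chain.

```python
# Hedged sketch: evaluating the economic criterion
#   E(M, N) = a1*lambda_out - b1*lambda*P_ent - b2*lambda*P_imp - d1*N - d2*M
# with the cost coefficients fixed in the example. All performance values
# passed in below are hypothetical placeholders, not results from the paper.

def economic_criterion(lam_out, lam, p_ent, p_imp, n_servers, m_assistants,
                       a1=1.0, b1=2.0, b2=3.0, d1=0.05, d2=0.1):
    """Average profit obtained by the system per unit of time."""
    return (a1 * lam_out
            - b1 * lam * p_ent
            - b2 * lam * p_imp
            - d1 * n_servers
            - d2 * m_assistants)

# A hypothetical operating point for (N, M) = (40, 4):
print(economic_criterion(lam_out=8.0, lam=10.0, p_ent=0.05, p_imp=0.1,
                         n_servers=40, m_assistants=4))
```

In practice, the optimization reduces to computing this criterion over a grid of (*N*, *M*) points and taking the maximum.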

Figure 11 illustrates the dependence of the economic criterion *E* on the parameters *N* and *M*.

**Figure 11.** Dependence of the values of economic criterion *E* on *N* and *M*.

As is seen from Figure 11, the profit of the system increases substantially when the number of exploited servers grows from 1 to about 30. When the number of servers grows from 30 to 40, the profit still increases, but less substantially. A further increase in the number of servers *N* leads to a slight decrease in the system profit because the cost of adding a new server exceeds the additional profit obtained from the service of customers. The same situation occurs when the number *M* of assistants grows. Thus, it can be verified that in the considered example, the optimal value *E*∗ of the economic criterion *E* is *E*∗ = 5.87082. This optimal value is achieved when the number of servers is *N* = 40 and the number of assistants is *M* = 4.

It is worth noting that the maximal size of the blocks of the generator is defined by the number (*N* + 1)*RW*. One of the goals of this numerical experiment was to demonstrate the feasibility of the elaborated algorithms for more or less realistic values of the number *N* of SSDs and the number *M* of assistants. In this experiment, we fixed *N* = 50, *M* = 10, and the range of the rating as the set {1, ... , 10}, which is quite enough for the modeling of even a quite large real hypermarket. Therefore, the maximal size of the block in this example is 510*W*. Because the total number of points (*N*, *M*) for which computations were performed to show the shape of the considered dependencies and solve the optimization problem is *NM* = 500, we restricted ourselves to the base *MAP* of order 2, just to avoid long computations. For *W* = 2, the maximal size of the block is more than 1000, and computation for 500 points takes several minutes. An increase in *W* does not create any essential problem in computations except an increase in computation time. Note that the use of a *MAP* of order 2 is often enough for good matching of the main characteristics of the *MAP* flow to the corresponding characteristics of even quite bursty real flows.

#### **8. Conclusions**

In this paper, we have considered a queueing model with a finite number *N* of servers and *M* assistants that help the servers when certain service problems occur. This model fits the description of the operation of a huge variety of real-world systems with so-called self-service of customers. We assume a novel description of the arrival process as a generalization of the *MAP* to the case of rating-dependent arrival rates. Rating-dependent arrivals are typical in many real systems with competing service providers. The effects of possible customer abandonment (balking) and impatience (reneging) are taken into consideration. An algorithmic analysis of this queueing model based on the use of level-dependent multi-dimensional Markov chains is implemented. This analysis includes the derivation of conditions for stability and instability of the model, computation of the steady-state distribution of the chain, numerical illustration of the dependence of the main performance measures of the system on *N* and *M*, and the solution of the optimization problem.

The obtained results can be extended to the models with other possible mechanisms for calculating the value of the rating, more complicated distributions of service and help times, unreliable servers or (and) assistants, heterogeneous (experienced and non-experienced) customers, etc.

**Author Contributions:** Conceptualization, S.D., Y.G. and A.D.; methodology, S.D., O.D. and Y.G.; software, S.D. and O.D.; validation, S.D. and O.D.; formal analysis, S.D., Y.G. and A.D.; investigation, A.D.; writing, original draft preparation, Y.G. and A.D.; writing, review and editing A.D. and S.D.; supervision A.D. and Y.G.; project administration O.D. and A.D. All authors read and agreed to the published version of the manuscript.

**Funding:** This paper has been supported by the RUDN University Strategic Academic Leadership Program.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **A Continuous-Time Network Evolution Model Describing 2- and 3-Interactions**

**István Fazekas \* and Attila Barta**

Faculty of Informatics, University of Debrecen, Kassai Street 26, 4028 Debrecen, Hungary; barta.attila@inf.unideb.hu

**\*** Correspondence: fazekas.istvan@inf.unideb.hu

**Abstract:** A continuous-time network evolution model is considered. The evolution of the network is based on 2- and 3-interactions. 2-interactions are described by edges, and 3-interactions are described by triangles. The evolution of the edges and triangles is governed by a multi-type continuous-time branching process. The limiting behaviour of the network is studied by mathematical methods. We prove that the numbers of triangles and edges have the same magnitude on the event of non-extinction, namely *e*<sup>*αt*</sup>, where *α* is the Malthusian parameter. The probability of extinction and the degree process of a fixed vertex are also studied. The results are illustrated by simulations.

**Keywords:** network evolution; random graph; multi-type branching process; continuous-time branching process; 2- and 3-interactions; Malthusian parameter; Poisson process; life-length; extinction

**MSC:** 05C80; 60J85

**Citation:** Fazekas, I.; Barta, A. A Continuous-Time Network Evolution Model Describing 2- and 3-Interactions. *Mathematics* **2021**, *9*, 3143. https://doi.org/10.3390/math9233143

Academic Editors: Alexander Zeifman, Victor Korolev and Alexander Sipin

Received: 18 October 2021 Accepted: 28 November 2021 Published: 6 December 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

#### **1. Introduction**

Network theory has a vast literature. In the book of Barabási [1], the general aspects can be found, while the book of van der Hofstad [2] is devoted to the mathematical models. Any network can be considered as a graph. The nodes of the network are the vertices, and the connections are the edges of the graph. The most famous model is the preferential attachment model proposed by Albert and Barabási [3]. It is a discrete-time network evolution model, and it describes connections of two nodes. In real life, the meaning of a connection can be any interaction or any cooperation.

There are models for the cooperation of more than two units. For example, Backhausz and Móri studied three-interactions in [4]. Their model is generalised for *N*-interactions by Fazekas and Porvázsnyik in [5]. Both of these papers consider cliques where, inside a team, all members cooperate. In some sense the opposite of cliques, i.e., star-like connections, were considered by Fazekas and Perecsényi [6]. In [6], there is no cooperation between two peripheral members of the team, but all of them cooperate with the central member of the team. In contrast to [3], in papers [4–6] the preferential attachment rule is used for certain subgraphs and not for vertices.

We mention that in [7] the Erdős–Rényi graph, the configuration model and the preferential attachment graph were studied when the population was split into two types. The mathematical tool of the analysis in [7] is the theory of multi-type branching processes.

There are several continuous-time network evolution models. Here, we list only some papers using continuous-time branching processes. Early works in this direction are [8,9]. Recently, in [10], multi-type preferential attachment trees were studied. In [10], the results of [11] on multi-type continuous-time branching processes were applied to describe the evolution of the network.

In this paper, we study a new network evolution model. The structure and the rules of the evolution of our model were inspired both by some everyday experiences and by deep scientific results on motifs. On the one hand, we had in mind activities and structures based on personal connections of the actors, where teams of a few persons are important. Thus, we considered friendship, the recruitment of party members and cooperation among them, the recruitment and cooperation of volunteers, cooperation among scientists, informal connections among the employees of a company, etc. In these cases, the network consists of relatively small teams, a person can be a member of several teams at the same time, new teams can be born and they can die, and a newcomer can join the network if he/she joins an existing team.

On the other hand, our model is supported by the theory of motifs and their applications to real-life networks. Here, we list only a few papers on this topic.

In [12], the authors used network motifs: 'patterns of interconnections occurring in complex networks at numbers that are significantly higher than those in randomized networks'. They developed an algorithm for detecting network motifs and found motifs with three or four vertices in biological and technological networks.

In [13], the authors analyse the local structure of several networks such as protein-signaling networks, developmental genetic networks, power grids, protein-structure networks, World Wide Web links, social networks, and word-adjacency networks. For the study, they used motifs on three or four vertices. In [14], the authors found the numbers of all 3- and 4-node subgraphs in both directed and non-directed geometric networks. In [15], a method for the identification of all ordered 3-node substructures and the visualization of their significance profiles is offered.

Therefore, we wanted to study a network that consists of small substructures, where a node can be a member of several substructures at the same time, new substructures can be born and they can die, and a new node can join the network if it joins an existing substructure.

Concerning the mathematical tools, we follow the line of Móri and Rokob [16], where connections of two units were described by edges and the evolution of the edges was governed by a continuous-time branching process. In [17], we extended the model of [16] to 3-interactions. In this paper, our aim is to study networks containing groups of different sizes. For the sake of simplicity, we consider only groups of sizes 2 and 3. In this case, we can obtain explicit formulae for some quantities, and our implicit formulae will also be transparent. The extension of our results to larger groups is possible. We emphasize that in our model all the nodes have the same type. Thus, in contrast to [7,10], the theory of multi-type branching processes is applied to groups of nodes and not just to the nodes.

In our model, when a new member joins the network, it joins an existing team directly. If that team consists of two members, then either a new team of two members or a new team of three members is produced. Similarly, if the new member joins a team of three members, then new teams having two or three members can emerge. Thus, we obtain a two-type continuous-time branching process, in which an individual can be either an edge or a triangle of the network.

The starting individual (that is, the ancestor) can be either an edge or a triangle. It produces offspring at the times given by the driving branching process. These offspring can be edges or triangles, and after their birth they also start their own reproduction processes. An evolution step of the generic triangle is the following. Exactly one vertex is born, but it is connected to the triangle by a random number of edges. The new vertex can be connected to 1, 2 or 3 vertices of the triangle. Therefore, the offspring of a triangle can be a new edge, one new triangle, or three new triangles. The lifetime of a triangle is determined by the number of its offspring. The reproduction process of an edge is similar.

The structure of this paper is the following. In Section 2, a detailed description of our model is given. In Section 3, the general results are presented. These are the survival functions of an edge and of a triangle (Theorem 1), the mean offspring number of an edge and of a triangle (Corollary 1), the Perron root and the Malthusian parameter. As usual, we obtain only an implicit expression for the Malthusian parameter, but our expression is simple and numerically tractable.

In Section 4, asymptotic theorems on the number of edges and triangles (Theorem 2) are proved. Both of them have magnitude *e*<sup>*αt*</sup> on the event of non-extinction, where *α* is the Malthusian parameter. To prove Theorem 2, we use the underlying branching process counted with certain random characteristics and apply the asymptotic theorems of [11].

In Section 5, the generating functions are calculated. Using the generating functions, the probability of extinction is studied. In Section 6, the asymptotic behaviour of the degree of a fixed vertex is considered. Here, we again apply the asymptotic theorems of [11], but with other characteristics than in Section 4. In Section 7, we present some simulation results supporting our theorems. Our figures and tables show that the values obtained by simulation fit the theoretical results well.

The proofs are based on known general results of multi-type continuous-time branching processes. Therefore, for the reader's convenience, in Section 8, we list several results on multi-type Crump–Mode–Jagers processes.

We mention that our model was presented in our conference paper [18]. In that paper some preliminary theoretical results were announced together with some numerical evidence but without mathematical proofs.

#### **2. The Model**

We study the following network evolution model. At the initial time *t* = 0, the network consists of one single object; this object can be either an edge or a triangle. This object is called the ancestor. During the evolution, this ancestor object produces offspring objects, which can be either edges or triangles. Then, these offspring objects produce their own offspring objects, and so on. The reproduction times of any fixed object, including the ancestor, are the points of its own Poisson process with rate 1.

From the theory of branching processes, we apply the following usual assumptions. That is, we suppose that the reproduction processes of different objects are independent. Moreover, we assume that the reproduction processes of the edges are independent copies of the reproduction process of the generic edge. Similarly, the reproduction processes of the triangles are independent copies of the reproduction process of the generic triangle.

First, we explain the evolution of the generic edge. A Poisson process Π2(*t*) with rate 1 gives its reproduction times. At any jumping time of this Poisson process, a new vertex appears, and it is connected to the generic edge by one or two edges. The probability that this new vertex is connected to the generic edge by one new edge is *r*1, where 0 ≤ *r*1 ≤ 1. The other end point of this new edge is chosen from the two vertices of the generic edge uniformly at random. We see that in this case the generic edge always produces one new edge. The other case is when the new vertex is connected to both vertices of the generic edge. Its probability is *r*2 = 1 − *r*1. In this second case, the offspring of the generic edge is a triangle consisting of the generic edge and the two new edges. We emphasize that in this last case the generic edge itself and the new triangle will produce offspring, but the two new edges are not substantive parts of the reproduction process, so they alone will not produce offspring.

The reproduction process of the generic triangle is similar. The Poisson process with rate 1 corresponding to the generic triangle is denoted by Π3(*t*), *t* ≥ 0. The jumping times of Π3(*t*) are the birth times of the generic triangle. At every birth time, a new vertex is born, and it joins the existing graph so that it is connected to our generic triangle by 1, 2 or 3 edges. Denote by *pj* (*j* = 1, 2, 3) the probability that the new vertex is connected to *j* vertices of our generic triangle. The vertices of the generic triangle to be connected to the new vertex are chosen uniformly at random.

By the above definition of the evolution process, at each birth step we add precisely 1 new vertex. When the new vertex is connected to one vertex of the generic triangle, the generic triangle gives birth to one new edge. This event has probability *p*1. However, in the remaining two cases we count only the new triangles and not the new edges. When the new vertex is connected to the generic triangle by two edges, these two edges and one edge of the generic triangle form a new triangle. Therefore, with probability *p*2, the generic triangle produces one child triangle. When the new vertex is connected to the generic triangle by three edges, these edges and the edges of the generic triangle form three new triangles. Thus, with probability *p*3, the generic triangle produces three children triangles.
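The offspring rules above can be summarized in a few lines of code. This is an illustrative sketch under our own naming (the paper contains no code); each function draws the outcome of a single birth event.

```python
import random

# Sketch of one birth event (our illustration, not the authors' code).
# An edge bears a new edge w.p. r1 and a new triangle w.p. r2 = 1 - r1.
# A triangle bears one edge w.p. p1, one triangle w.p. p2,
# and three triangles w.p. p3 = 1 - p1 - p2.

def edge_birth(r1, u=None):
    """Offspring (new_edges, new_triangles) of the generic edge."""
    u = random.random() if u is None else u
    return (1, 0) if u < r1 else (0, 1)

def triangle_birth(p1, p2, u=None):
    """Offspring (new_edges, new_triangles) of the generic triangle."""
    u = random.random() if u is None else u
    if u < p1:
        return (1, 0)      # new vertex attached by one edge
    if u < p1 + p2:
        return (0, 1)      # attached by two edges: one new triangle
    return (0, 3)          # attached by three edges: three new triangles
```

In every case exactly one vertex is added, so the total litter size of a triangle is 1 with probability *q*1 = *p*1 + *p*2 and 3 with probability *q*3 = *p*3, in agreement with the distribution of *ε*3(*i*) below.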

Any edge is called a type 2 object, and any triangle is called a type 3 object. We use subscript 2 for edges and subscript 3 for triangles. Thus, we denote by *ξi*,*j*(*t*) the number of type *j* offspring of the type *i* generic object up to time *t* (*i*, *j* = 2, 3). Recall that *ξi*,*j*, *i*, *j* = 2, 3, are point processes. Then

$$
\xi\_2(t) = \xi\_{2,2}(t) + \xi\_{2,3}(t) \tag{1}
$$

gives the total number of offspring (that is both edges and triangles) of the generic edge up to time *t*. We can also see that

$$
\xi\_3(t) = \xi\_{3,2}(t) + \xi\_{3,3}(t) \tag{2}
$$

is the number of all offspring (edges or triangles) of the generic triangle up to time *t*.

We denote by *τ*3(1), *τ*3(2), ... the birth times of the generic triangle, and by *ε*3(1), *ε*3(2), ... the corresponding total litter sizes. That is, at the *i*th birth event, the generic triangle bears *ε*3(*i*) children, each being either a triangle or an edge. The discrete random variables *ε*3(1), *ε*3(2), ... are independent and identically distributed with distribution P(*ε*3(*i*) = *j*) = *qj*, *j* ≥ 1. By the above evolution process, we have

$$\mathbb{P}(\varepsilon\_3(i) = 1) = q\_1 = p\_1 + p\_2, \text{ } \mathbb{P}(\varepsilon\_3(i) = 3) = q\_3 = p\_3.$$

$$\mathbb{P}(\varepsilon\_3(i) = j) = q\_j = 0, \text{ if } j \notin \{1, 3\}.$$

We assume that the litter sizes are independent of the birth times.

Let *λ*<sup>3</sup> be the life-length of the generic triangle. It is a finite, non-negative random variable. We assume that the reproduction terminates at the death of the individual. Therefore, *ξ*3(*t*) = *ξ*3(*λ*3) for *t* > *λ*3. Then, the reproduction process of a triangle can be formulated as

$$\xi\_3(t) = \sum\_{\tau\_3(i) \le t \wedge \lambda\_3} \varepsilon\_3(i) = S\_3(\Pi\_3(t \wedge \lambda\_3)),\tag{3}$$

where Π3(*t*) is the Poisson process, *S*3(*n*) = *ε*3(1) + ··· + *ε*3(*n*) gives the total number of offspring of the generic triangle before the (*n* + 1)th birth event and by *x* ∧ *y* we denote the minimum of {*x*, *y*}.

**The survival function of the life-length.** Let *L*3(*t*) denote the distribution function of the triangle's life-length *λ*3. Then, the survival function of *λ*<sup>3</sup> is

$$1 - L\_3(t) = \mathbb{P}(\lambda\_3 > t) = \exp\left(-\int\_0^t l\_3(u) du\right),\tag{4}$$

where *l*3(*t*) is the hazard rate of the life-length *λ*3. We suppose that the hazard rate depends on the total number of offspring, so that

$$l\_3(t) = b + c\xi\_3(t)\tag{5}$$

with fixed positive constants *b* and *c*.

Let *λ*2 be the life-length of the generic edge. Then, *ξ*2(*t*) = *ξ*2(*λ*2) for *t* > *λ*2. As the edge always gives birth to one offspring at a time (which can be an edge or a triangle),

$$\xi\_2(t) = \Pi\_2(t \wedge \lambda\_2) \tag{6}$$

is the total number of offspring of the generic edge, where Π2(*t*) is the Poisson process.

We denote by *L*2(*t*) the distribution function of *λ*2. Then, the survival function of the life-length of an edge is

$$1 - L\_2(t) = \exp\left(-\int\_0^t l\_2(u) du\right),\tag{7}$$

where *l*<sup>2</sup> is the hazard rate of the life-length *λ*2. We suppose that *l*<sup>2</sup> is of the form *l*2(*t*) = *b* + *cξ*2(*t*).

We emphasize that we do not delete any edge or any triangle when it dies, because its ingredients can belong to other triangles or edges, too. Thus, dead triangles and edges will be considered as inactive objects not producing new offspring.

In Figure 1, an example is shown for our graph evolution model. For a clear view, it contains only three birth steps after the initial time *t* = 0. The nodes of the ancestor are highlighted in red. The edges are labelled with the birth times *t*. The objects appearing in Figure 1 are described by the labels of their nodes:


**Figure 1.** Example of the graph evolution model with parameter set: *r*1 = 0.1, *p*1 = 0.4, *p*2 = 0.2, *b* = 0.1, *c* = 0.1.

Two more examples are shown in Figure 2 with different parameters. In Figure 2a the ancestor is an edge, while in Figure 2b the ancestor is a triangle.

**Figure 2.** Examples of the graph evolution model with two different parameter sets.

#### **3. General Results**

#### **The survival functions.**

**Theorem 1.** *The survival function for a triangle is*

$$\mathbb{P}(\lambda\_3 > t) = e^{-t(b+1)} e^{\frac{3(p\_1 + p\_2)\left(1 - e^{-ct}\right) + p\_3\left(1 - e^{-3ct}\right)}{3c}}.\tag{8}$$

*The survival function for an edge is*

$$\mathbb{P}(\lambda\_2 > t) = e^{-t(b+1)} e^{\frac{1-e^{-ct}}{c}}.\tag{9}$$

**Proof.** In the first part of the proof, we omit the subscripts 2 and 3 because the calculations are the same for edges and triangles. Let *t* > 0 and assume that Π(*t*) = *k*. Then, the first *k* birth events happened before time *t*. Thus, the birth times *τ*(1), *τ*(2), ... , *τ*(*k*) and the corresponding litter sizes *ε*(1), *ε*(2), ... , *ε*(*k*) are known. Therefore, the reproduction process *ξ*(*u*) is also known for *u* < *t*. By (5), a simple calculation shows that the survival function of an object is

$$1 - L(t) = \exp\left(-\int\_0^t l(u) du\right) = \exp\left(-\left(bt + c\int\_0^t \xi(u) du\right)\right) =$$

$$= \exp\left(-\left(bt + ctS(k) - c(\varepsilon(1)\tau(1) + \dots + \varepsilon(k)\tau(k))\right)\right).$$

Then

$$\mathbb{P}(\lambda > t | \Pi(t) = k, \tau(1), \dots, \tau(k), \varepsilon(1), \dots, \varepsilon(k)) =$$

$$= \exp\left(-(bt + ctS(k) - c(\varepsilon(1)\tau(1) + \dots + \varepsilon(k)\tau(k)))\right).$$

Let *U*<sup>∗</sup><sub>1</sub>, ... , *U*<sup>∗</sup><sub>*k*</sub> be an ordered sample of size *k* from the uniform distribution on [0, 1]. Then, the joint conditional distribution of the birth times *τ*(1), ... , *τ*(*k*), given Π(*t*) = *k*, coincides with the distribution of *tU*<sup>∗</sup><sub>1</sub>, ... , *tU*<sup>∗</sup><sub>*k*</sub>. Therefore

$$\mathbb{P}(\lambda > t | \Pi(t) = k) = \mathbb{E} \exp\left(-\left(bt + ct \sum\_{i=1}^{k} \varepsilon(i) \left(1 - \frac{\tau(i)}{t}\right)\right)\right) =$$

$$= \mathbb{E} \exp\left(-bt + ct \sum\_{i=1}^{k} \varepsilon(i)(U\_i^\* - 1)\right),$$

because *τ*(*i*) = *tU*<sup>∗</sup><sub>*i*</sub>. The litter sizes *ε*(1), ... , *ε*(*k*) are independent identically distributed random variables, which are independent also of *U*<sup>∗</sup><sub>1</sub>, ... , *U*<sup>∗</sup><sub>*k*</sub>. Hence

$$\mathbb{P}(\lambda > t | \Pi(t) = k) = \mathbb{E} \exp\left(-bt + ct \sum\_{i=1}^{k} \varepsilon(i)(U\_i - 1)\right) =$$

$$= e^{-bt}\, \mathbb{E} \prod\_{i=1}^{k} e^{ct\varepsilon(i)(U\_i - 1)} = e^{-bt} \left(\mathbb{E}\_{\varepsilon(i)} \left(\mathbb{E}\_{U\_i} \left(e^{ct\varepsilon(i)U\_i}\right) e^{-ct\varepsilon(i)}\right)\right)^{k} =$$

$$= e^{-bt} \left(\sum\_{j=1}^{\infty} q\_{j} \frac{e^{ctj} - 1}{ctj}\, e^{-ctj}\right)^{k} = e^{-bt} \left(\sum\_{j=1}^{\infty} q\_{j} \frac{1 - e^{-ctj}}{ctj}\right)^{k},$$

where we applied that *Ui* is uniformly distributed. Using this and the total probability theorem, we find

$$\mathbb{P}(\lambda > t) = \sum\_{k=0}^{\infty} \mathbb{P}(\Pi(t) = k)\, \mathbb{P}(\lambda > t | \Pi(t) = k) =$$

$$= \sum\_{k=0}^{\infty} \frac{t^k}{k!} e^{-t} e^{-bt} \left(\sum\_{j=1}^{\infty} q\_j \frac{1 - e^{-ctj}}{ctj} \right)^k =$$

$$= e^{-(b+1)t} \sum\_{k=0}^{\infty} \frac{1}{k!} \left(\sum\_{j=1}^{\infty} q\_j \frac{1 - e^{-ctj}}{cj} \right)^k =$$

$$= e^{-(b+1)t} \exp\left(\sum\_{j=1}^{\infty} q\_j \frac{1 - e^{-ctj}}{cj}\right).$$

Therefore, the survival function for a triangle is

$$\mathbb{P}(\lambda\_3 > t) = e^{-t(b+1)} e^{\frac{3(p\_1+p\_2)\left(1-e^{-ct}\right) + p\_3\left(1-e^{-3ct}\right)}{3c}}$$

Finally, the survival function for an edge is

$$\mathbb{P}(\lambda\_2 > t) = e^{-t(b+1)} e^{\frac{1-e^{-ct}}{c}}.$$
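The closed forms (8) and (9) are easy to evaluate numerically. The sketch below is our own illustration (the parameter values are one admissible choice, matching the Figure 1 example); it checks that both expressions behave as survival functions, starting at 1 and decreasing, since the hazard rate *b* + *cξ*(*t*) is at least *b* > 0.

```python
import math

# Survival functions (8) and (9) of Theorem 1, evaluated numerically.
# Parameter values follow the Figure 1 example (b = c = 0.1, p1 = 0.4,
# p2 = 0.2, p3 = 0.4); they are just one admissible choice.

def survival_triangle(t, b=0.1, c=0.1, p1=0.4, p2=0.2, p3=0.4):
    """P(lambda_3 > t) from (8)."""
    expo = (3 * (p1 + p2) * (1 - math.exp(-c * t))
            + p3 * (1 - math.exp(-3 * c * t))) / (3 * c)
    return math.exp(-t * (b + 1)) * math.exp(expo)

def survival_edge(t, b=0.1, c=0.1):
    """P(lambda_2 > t) from (9)."""
    return math.exp(-t * (b + 1)) * math.exp((1 - math.exp(-c * t)) / c)
```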

**The mean offspring number.** Let us denote by *mi*,*j*(*t*) = E*ξi*,*j*(*t*) the expectation of the number of type *j* offspring of a type *i* mother until time *t*.

**Corollary 1.** *For any t* ≥ 0*, we have*

$$m\_{2,2}(t) = r\_1 F(t), \qquad m\_{2,3}(t) = r\_2 F(t), \tag{10}$$

*where*

$$F(t) = \int\_0^t (1 - L\_2(s)) ds = \int\_0^t e^{-(b+1)s} e^{\frac{1 - e^{-cs}}{c}} ds = \frac{1}{c} \int\_0^{1 - e^{-ct}} (1 - u)^{\frac{b+1}{c} - 1} e^{\frac{u}{c}} du.$$

$$\mathbb{E}\lambda\_2 = \frac{1}{c} \int\_0^1 (1 - u)^{\frac{b+1}{c} - 1} e^{\frac{u}{c}} du. \tag{11}$$

*For any t* ≥ 0*, we have*

$$m\_{3,2}(t) = p\_1 G(t), \qquad m\_{3,3}(t) = (p\_2 + 3p\_3)G(t), \tag{12}$$

*where*

$$G(t) = \int\_0^t (1 - L\_3(s)) ds = \int\_0^t e^{-s(b+1)} e^{\frac{3(p\_1 + p\_2)\left(1 - e^{-cs}\right) + p\_3\left(1 - e^{-3cs}\right)}{3c}} ds =$$

$$= \frac{1}{c} \int\_0^{1 - e^{-ct}} (1 - u)^{\frac{b+1}{c} - 1} e^{\frac{u}{3c} \left(p\_3 u^2 - 3p\_3 u + 3\right)} du.$$

$$\mathbb{E}\lambda\_3 = \frac{1}{c} \int\_0^1 (1 - u)^{\frac{b+1}{c} - 1} e^{\frac{u}{3c} \left(p\_3 u^2 - 3p\_3 u + 3\right)} du. \tag{13}$$

0 < E*λ*2, E*λ*3 < ∞ *because b* ≥ 0*.*

**Proof.** We have

$$m\_{i,j}(t) = \mathbb{E}\xi\_{i,j}(t) = \mathbb{E}(\varepsilon\_{i,j}(1) + \varepsilon\_{i,j}(2) + \dots + \varepsilon\_{i,j}(\Pi(t \wedge \lambda\_i))),$$

where *εi*,*j*(*k*) is the number of type *j* offspring of a type *i* mother at her *k*th birth event. Using Wald's identity, the average number of children is

$$m\_{i,j}(t) = \mathbb{E}(\varepsilon\_{i,j}(1))\mathbb{E}(\Pi(t \wedge \lambda\_i)).\tag{14}$$

Using that Π is a Poisson process with rate 1, and *t* ∧ *λ* is bounded for any *t*, from (14), we obtain that the average number of children is

$$m\_{i,j}(t) = \mathbb{E}(\varepsilon\_{i,j}(1)) \mathbb{E}(t \wedge \lambda\_i) = \mathbb{E}(\varepsilon\_{i,j}(1)) \int\_0^t (1 - L\_i(s)) ds. \tag{15}$$

Now, consider *m*2,2(*t*). Applying (9) and using the substitution *u* = 1 − *e*<sup>−*cs*</sup>, we obtain

$$m\_{2,2}(t) = r\_1 \int\_0^t e^{-(b+1)s} e^{\frac{1-e^{-cs}}{c}} ds = \frac{r\_1}{c} \int\_0^{1-e^{-ct}} (1-u)^{\frac{b+1}{c}-1} e^{\frac{u}{c}} du. \tag{16}$$

If we write *r*2 instead of *r*1, then we obtain *m*2,3(*t*). Thus, we obtained (10). Moreover, with *t* → ∞, we have $\mathbb{E}\lambda\_2 = \int\_0^\infty \mathbb{P}(\lambda\_2 > t) dt$. Thus, (11) follows from (16).

Now, we turn to *m*3,3(*t*). Applying (8) and using the substitution *u* = 1 − *e*<sup>−*cs*</sup>, we obtain

$$\int\_{0}^{t} \mathbb{P}(\lambda\_{3} > s) ds = \int\_{0}^{t} e^{-s(b+1)} e^{\frac{3(p\_{1} + p\_{2})(1 - e^{-cs}) + p\_{3}\left(1 - e^{-3cs}\right)}{3c}} ds =$$

$$= \frac{1}{c} \int\_{0}^{1 - e^{-ct}} (1 - u)^{\frac{b+1}{c} - 1} e^{\frac{u}{3c} \left(p\_{3}u^{2} - 3p\_{3}u + 3(p\_{1} + p\_{2} + p\_{3})\right)} du. \tag{17}$$

As E(*ε*3,3(1)) = *p*2 + 3*p*3, from (15) we obtain *m*3,3(*t*). Using that E(*ε*3,2(1)) = *p*1, we obtain *m*3,2(*t*). Thus, we obtained (12). Moreover, we have $\mathbb{E}\lambda\_3 = \int\_0^\infty \mathbb{P}(\lambda\_3 > t) dt$. Thus, (13) follows from (17) with *t* → ∞.
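As a sanity check, the two ways of computing the expected life-length of an edge, the integral of the survival function (9) over [0, ∞) and the substituted form (11), can be compared numerically. The sketch below is our own illustration (plain trapezoidal quadrature, with the assumed parameter values *b* = *c* = 0.1).

```python
import math

# Cross-check of (11): E(lambda_2) computed as the integral of the survival
# function (9) versus the substituted integral over [0, 1]. Our sketch,
# with b = c = 0.1; plain trapezoidal quadrature on a fine grid.

def trapezoid(f, a, b, n=20000):
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n)))

b_, c_ = 0.1, 0.1

# Integral of the survival function; the tail decays like e^{-(b+1)t},
# so truncating at t = 60 is more than enough.
surv = lambda t: math.exp(-(b_ + 1) * t) * math.exp((1 - math.exp(-c_ * t)) / c_)
e_time = trapezoid(surv, 0.0, 60.0)

# Formula (11) after the substitution u = 1 - e^{-ct}.
integrand = lambda u: (1 - u) ** ((b_ + 1) / c_ - 1) * math.exp(u / c_)
e_formula = trapezoid(integrand, 0.0, 1.0) / c_
```

The two numbers must agree up to quadrature and truncation error.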

Let

$$m\_{i,j}^\*(\kappa) = \int\_0^\infty e^{-\kappa t} m\_{i,j}(dt), \qquad i, j = 2, 3,$$

be the Laplace transform of *mi*,*j*.

**Proposition 1.** *For any κ* ≥ 0*, we have*

$$m\_{2,2}^{\*}(\kappa) = r\_1 A(\kappa), \qquad m\_{2,3}^{\*}(\kappa) = r\_2 A(\kappa), \tag{18}$$

*where*

$$A(\kappa) = \int\_0^\infty e^{-\kappa s} e^{-(b+1)s} e^{\frac{1-e^{-cs}}{c}} ds = \frac{1}{c} \int\_0^1 (1-u)^{\frac{\kappa+b+1}{c}-1} e^{\frac{u}{c}} du. \tag{19}$$

*For any κ* ≥ 0*, we have*

$$m\_{3,2}^{\*}(\kappa) = p\_1 B(\kappa), \qquad m\_{3,3}^{\*}(\kappa) = (p\_2 + 3p\_3)B(\kappa), \tag{20}$$

*where*

$$B(\kappa) = \int\_0^\infty e^{-\kappa s} e^{-s(b+1)} e^{\frac{3(p\_1+p\_2)\left(1-e^{-cs}\right) + p\_3\left(1-e^{-3cs}\right)}{3c}} ds =$$

$$= \frac{1}{c} \int\_0^1 (1-u)^{\frac{\kappa+b+1}{c}-1} e^{\frac{u}{3c}\left(p\_3u^2 - 3p\_3u + 3\right)} du.$$

**Proof.** Apply the definition of *m*<sup>∗</sup>*i*,*j*(*κ*), Corollary 1, and the substitution *u* = 1 − *e*<sup>−*cs*</sup>.

**The Perron root and the Malthusian parameter.** Let

$$M(\kappa) = \begin{pmatrix} m\_{2,2}^\*(\kappa) & m\_{2,3}^\*(\kappa) \\ m\_{3,2}^\*(\kappa) & m\_{3,3}^\*(\kappa) \end{pmatrix} \tag{21}$$

be the matrix of the Laplace transforms. Direct calculation gives that the characteristic roots of *M*(*κ*) are

$$\varrho\_{1,2}(\kappa) = \frac{(p\_2 + 3p\_3)B(\kappa) + r\_1 A(\kappa) \pm \sqrt{((p\_2 + 3p\_3)B(\kappa) - r\_1 A(\kappa))^2 + 4p\_1 B(\kappa)r\_2 A(\kappa)}}{2}. \tag{22}$$

The greater of the values *ϱ*<sub>1</sub>(*κ*) and *ϱ*<sub>2</sub>(*κ*) is called the Perron root, so

$$\varrho(\kappa) = \varrho\_1(\kappa) = \frac{(p\_2 + 3p\_3)B(\kappa) + r\_1 A(\kappa) + \sqrt{((p\_2 + 3p\_3)B(\kappa) - r\_1 A(\kappa))^2 + 4p\_1 B(\kappa)r\_2 A(\kappa)}}{2} \tag{23}$$

is the Perron root.

We assume that our process is supercritical; that is,

$$
\varrho(0) > 1.\tag{24}
$$

For supercriticality, condition

$$\max\{ (p\_2 + 3p\_3)B(0), r\_1 A(0) \} > 1$$

is sufficient.

That value of *κ* for which the Perron root is equal to 1 is called the Malthusian parameter. Thus, using the usual notation of the theory of branching processes, *α* is the Malthusian parameter if *ϱ*(*α*) = 1. In this paper, we assume the existence of the Malthusian parameter. From the relation *ϱ*(*α*) = 1 and (23), we obtain that the Malthusian parameter *α* satisfies the equation

$$r\_1 A(\alpha)(p\_2 + 3p\_3)B(\alpha) - (r\_1 A(\alpha) + (p\_2 + 3p\_3)B(\alpha)) = r\_2 A(\alpha)p\_1 B(\alpha) - 1.\tag{25}$$

Later, we use the eigenvectors of *M*(*α*). To this end, let *α* be the Malthusian parameter, and let (*v*<sub>2</sub>, *v*<sub>3</sub>) be the right eigenvector of *M*(*α*) corresponding to eigenvalue 1 and satisfying the condition *v*<sub>2</sub> + *v*<sub>3</sub> = 1. Then, direct calculation shows that

$$v\_2 = \frac{(r\_1 - 1)A(\alpha)}{(2r\_1 - 1)A(\alpha) - 1}, \qquad v\_3 = \frac{r\_1 A(\alpha) - 1}{(2r\_1 - 1)A(\alpha) - 1}.\tag{26}$$

Again, let *α* be the Malthusian parameter, and let (*u*<sub>2</sub>, *u*<sub>3</sub>) be the left eigenvector of *M*(*α*) satisfying the condition *u*<sub>2</sub>*v*<sub>2</sub> + *u*<sub>3</sub>*v*<sub>3</sub> = 1. Direct calculation shows that

$$u\_2 = \frac{p\_1 B(\alpha) ( (2r\_1 - 1) A(\alpha) - 1)}{p\_1 B(\alpha) (r\_1 - 1) A(\alpha) - (r\_1 A(\alpha) - 1)^2}, \qquad u\_3 = \frac{(1 - r\_1 A(\alpha)) ( (2r\_1 - 1) A(\alpha) - 1)}{p\_1 B(\alpha) (r\_1 - 1) A(\alpha) - (r\_1 A(\alpha) - 1)^2}.\tag{27}$$

#### **4. Asymptotic Theorems on the Number of Triangles and Edges**

In this section, we use Proposition 4 from Section 8, so we should check the conditions given in Section 8. For condition (a) from Section 8, we should guarantee that not all measures *m<sub>i,j</sub>* are concentrated on a lattice. By Corollary 1, these measures are absolutely continuous, and thus condition (a) is satisfied.

Concerning condition (b1), we underline that we suppose the existence of a positive Malthusian parameter *α*. To this end, in this section, we assume that (25) has a finite positive solution *α*. We can check numerically the existence of this value. For (b2), we assume (24). Condition (c) from Section 8 will be checked later in the proofs of the results together with other conditions related to it.
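Such a numerical check can be carried out directly. The sketch below (Python; the parameter values are hypothetical, chosen with *r*<sub>1</sub> + *r*<sub>2</sub> = 1 and *p*<sub>1</sub> + *p*<sub>2</sub> + *p*<sub>3</sub> = 1, and the function names are ours) evaluates *A*(*κ*) and *B*(*κ*) from Proposition 1 by the midpoint rule, solves *ϱ*(*α*) = 1 from (23) by bisection, and then evaluates the eigenvector formulas (26):

```python
import math

# Hypothetical parameter values (r1 + r2 = 1, p1 + p2 + p3 = 1 assumed)
b, c = 0.1, 1.0
r1, r2 = 0.5, 0.5
p1, p2, p3 = 0.5, 0.3, 0.2

def midpoint01(g, n=5000):
    # midpoint rule on (0, 1)
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def A(k):   # Equation (19)
    return midpoint01(lambda u: (1 - u) ** ((k + b + 1) / c - 1) * math.exp(u / c)) / c

def B(k):   # Proposition 1
    return midpoint01(lambda u: (1 - u) ** ((k + b + 1) / c - 1)
                      * math.exp(u / (3 * c) * (p3 * u * u - 3 * p3 * u + 3))) / c

def rho(k):  # Perron root (23)
    dA, dB = r1 * A(k), (p2 + 3 * p3) * B(k)
    return (dB + dA + math.sqrt((dB - dA) ** 2 + 4 * p1 * B(k) * r2 * A(k))) / 2

# rho is decreasing in kappa; bisect rho(alpha) = 1 on (0, 10]
lo, hi = 1e-9, 10.0
for _ in range(50):
    mid = (lo + hi) / 2
    if rho(mid) > 1:
        lo = mid
    else:
        hi = mid
alpha = (lo + hi) / 2

# right eigenvector (26) of M(alpha) for eigenvalue 1, normalised by v2 + v3 = 1
Aa, Ba = A(alpha), B(alpha)
v2 = (r1 - 1) * Aa / ((2 * r1 - 1) * Aa - 1)
v3 = (r1 * Aa - 1) / ((2 * r1 - 1) * Aa - 1)
```

For these particular values the process is supercritical, so the bisection brackets a finite positive *α*; both rows of *M*(*α*)*v* = *v* can then be verified numerically.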

Now, we analyse condition (d). We can see from Corollary 1 that *F*(∞) and *G*(∞) are positive. Thus, we can concentrate on parameters *ri* and *pi*. If *r*<sup>2</sup> = *p*<sup>1</sup> = 0, then (d) is not satisfied; however, in this case, one can study separately the process of edges (it grows at any birth time by 1), and the process of triangles (this is described in [17]). If *r*<sup>1</sup> = 0 and *p*<sup>2</sup> + *p*<sup>3</sup> = 0, then (d) is not satisfied, and the evolution process is an alternating one. If either *r*<sup>2</sup> = 0 or *p*<sup>1</sup> = 0, then (d) is not satisfied.

To guarantee condition (d), in this section, we assume that 0 ≤ *r*<sup>1</sup> < 1, 0 < *p*<sup>1</sup> ≤ 1, and it is excluded that both *r*<sup>1</sup> = 0 and *p*<sup>1</sup> = 1 are satisfied at the same time. In this case, condition (d) from Section 8 is satisfied.

**The denominator in the limit theorem.** For the following theorem, we need the next formulae. In Section 8, we see that the denominator of *m*<sup>Φ</sup><sub>∞</sub> in the limiting expression is independent of Φ, and it is

$$\sum\_{l,j=1}^p u\_l v\_j \int\_0^\infty t e^{-\alpha t} m\_{l,j}(dt).$$

Considering our two-dimensional case, it can be written in the form

$$D(\alpha) = \sum\_{l,j=2}^{3} u\_l v\_j \left( -m\_{l,j}^\*(\alpha) \right)'. \tag{28}$$

Here, *u<sub>i</sub>* and *v<sub>i</sub>* are from Equations (26) and (27). Moreover, by Corollary 1 or by Proposition 1, we have that

$$\left(-m\_{2,2}^\*(\alpha)\right)' = r\_1(-A'(\alpha)), \qquad \left(-m\_{2,3}^\*(\alpha)\right)' = r\_2\left(-A'(\alpha)\right),\tag{29}$$

$$\left(-m\_{3,2}^\*(\alpha)\right)' = p\_1(-B'(\alpha)), \qquad \left(-m\_{3,3}^\*(\alpha)\right)' = \left(p\_2 + 3p\_3\right)(-B'(\alpha)), \tag{30}$$

where

$$-A'(\alpha) = \int\_0^\infty s e^{-\alpha s} e^{-(b+1)s} e^{\frac{1-e^{-cs}}{c}} ds = -\frac{1}{c^2} \int\_0^1 \ln(1-u)(1-u)^{\frac{\alpha+b+1}{c}-1} e^{\frac{u}{c}} du,\tag{31}$$

$$-B'(\alpha) = \int\_0^\infty s e^{-\alpha s} e^{-s(b+1)} e^{\frac{3(p\_1+p\_2)\left(1-e^{-cs}\right) + p\_3\left(1-e^{-3cs}\right)}{3c}} ds = \tag{32}$$

$$= -\frac{1}{c^2} \int\_0^1 \ln(1-u)(1-u)^{\frac{\alpha+b+1}{c}-1} e^{\frac{u}{3c}\left(p\_3u^2 - 3p\_3u + 3\right)} du.$$

Now, we turn to the number of edges and triangles. Recall that an edge is a type 2, and a triangle is a type 3 object.

**Theorem 2.** *Assume that* (24) *is satisfied and* (25) *has a finite positive solution α. Assume that* 0 ≤ *r*<sup>1</sup> < 1*,* 0 < *p*<sup>1</sup> ≤ 1 *and it is excluded that both r*<sup>1</sup> = 0 *and p*<sup>1</sup> = 1 *are satisfied at the same time.*

*Let* <sub>*i*</sub>*E*(*t*) *denote the number of all edges born up to time t if the ancestor of the population was a type i object, i* = 2, 3*. Then*

$$\lim\_{t \to \infty} e^{-\alpha t} \, \_iE(t) = \, \_i\mathcal{W} \frac{v\_i u\_2}{\alpha D(\alpha)}\tag{33}$$

*almost surely for i* = 2, 3*.*

*Let* <sub>*i*</sub>*Ê*(*t*) *denote the number of all edges present at time t if the ancestor of the population was a type i object, i* = 2, 3*. Then*

$$\lim\_{t \to \infty} e^{-\alpha t} \, \_i\hat{E}(t) = \, \_i\mathcal{W} \frac{v\_i u\_2 A(\alpha)}{D(\alpha)}\tag{34}$$

*almost surely for i* = 2, 3*.*

*Let* <sub>*i*</sub>*T*(*t*) *denote the number of all triangles born up to time t if the ancestor of the population was a type i object, i* = 2, 3*. Then*

$$\lim\_{t \to \infty} e^{-\alpha t} \,\_i T(t) = \,\_i \mathcal{W} \frac{v\_i u\_3}{\alpha D(\alpha)} \tag{35}$$

*almost surely for i* = 2, 3*.*

*Let* <sub>*i*</sub>*T̂*(*t*) *denote the number of all triangles present at time t if the ancestor of the population was a type i object, i* = 2, 3*. Then,*

$$\lim\_{t \to \infty} e^{-\alpha t} \,\_i \hat{T}(t) = \,\_i \mathcal{W} \frac{v\_i u\_3 B(\alpha)}{D(\alpha)} \tag{36}$$

*almost surely for i* = 2, 3*.*

*The quantities* <sub>2</sub>*W and* <sub>3</sub>*W are a.s. non-negative,* E(<sub>2</sub>*W*) = E(<sub>3</sub>*W*) = 1*, and* <sub>2</sub>*W and* <sub>3</sub>*W are a.s. positive on the event of survival.*

**Proof.** We apply Proposition 4. To obtain condition (71), it is enough to show that

$$\mathbb{E}\left[\,\_{\alpha}\xi\_{i}(\infty)\,\log^{+}\,\_{\alpha}\xi\_{i}(\infty)\right] < \infty, \qquad i = 2,3,\tag{37}$$

where

$$\_\alpha\xi\_i(\infty) = \int\_0^\infty e^{-\alpha t} \xi\_i(dt), \qquad i = 2, 3,\tag{38}$$

and

$$\xi\_i(t) = \xi\_{i,2}(t) + \xi\_{i,3}(t), \qquad i = 2,3. \tag{39}$$

If *i* = 2, then *ξ*<sub>2</sub>(*t*) is the birth process of an edge, and the children can be both edges and triangles. At each birth, there is exactly one child. Therefore,

$$\_\alpha\xi\_2(\infty) = \int\_0^\infty e^{-\alpha t}\xi\_2(dt) = \sum\_{\tau(i)\le\lambda\_2} 1\cdot e^{-\alpha\tau(i)} \le \sum\_{i=1}^\infty 1\cdot e^{-\alpha\tau(i)} = M,$$

where *τ*(1), *τ*(2), ... are the jump times of the Poisson process Π<sub>2</sub>. In the Poisson process Π<sub>2</sub>(*t*), the interarrival times *τ*(*i*) − *τ*(*i* − 1) are exponentially distributed with rate 1. Therefore, *τ*(*i*) has the gamma distribution *Γ*(*i*, 1). Using this, we have

$$\mathbb{E}(M) = \sum\_{i=1}^{\infty} \mathbb{E}\left(e^{-\alpha\tau(i)}\right) = \sum\_{i=1}^{\infty} \frac{1}{\left(1+\alpha\right)^{i}} = \frac{1}{\alpha}.\tag{40}$$

Let us denote by *η<sub>i</sub>* the interarrival time *τ*(*i*) − *τ*(*i* − 1). Let *η*<sub>0</sub> be an exponentially distributed random variable with rate 1 that is independent of *M*. Then,

$$e^{-\alpha\eta\_0}(1+M) = e^{-\alpha\eta\_0} + e^{-\alpha\eta\_0} \sum\_{i=1}^{\infty} e^{-\alpha(\eta\_1 + \dots + \eta\_i)} = \sum\_{i=0}^{\infty} e^{-\alpha(\eta\_0 + \eta\_1 + \dots + \eta\_i)}.$$

Therefore, the distribution of *e*−*αη*<sup>0</sup> (1 + *M*) coincides with the distribution of *M*. Therefore, using (40), we have

$$\mathbb{E}M^2 = \mathbb{E}\left(e^{-\alpha\eta\_0}(1+M)\right)^2 = \frac{1}{1+2\alpha}\left(1+\frac{2}{\alpha}+\mathbb{E}M^2\right).$$

From this, we find

$$\mathbb{E}M^2 = \frac{\alpha + 2}{2\alpha^2} < \infty.$$

Thus, (37) is true for *i* = 2.
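The identities E*M* = 1/*α* and E*M*<sup>2</sup> = (*α* + 2)/(2*α*<sup>2</sup>) are easy to confirm by simulation. A small Monte Carlo sketch (Python; the choice *α* = 1 and the sample sizes are arbitrary choices of ours):

```python
import math
import random

def sample_M(alpha, n_terms=60):
    # M = sum_i e^{-alpha * tau(i)}, where tau(i) are the jump times of a
    # rate-1 Poisson process; the tail beyond n_terms is negligible
    t, m = 0.0, 0.0
    for _ in range(n_terms):
        t += random.expovariate(1.0)   # Exp(1) interarrival time
        m += math.exp(-alpha * t)
    return m

random.seed(12345)
alpha, reps = 1.0, 40000
xs = [sample_M(alpha) for _ in range(reps)]
m1 = sum(xs) / reps                  # estimates E M = 1/alpha = 1
m2 = sum(x * x for x in xs) / reps   # estimates E M^2 = (alpha+2)/(2 alpha^2) = 3/2
```

With 40,000 replications the estimates typically agree with 1 and 3/2 to about two decimal places.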

If *i* = 3, then *ξ*<sub>3</sub>(*t*) is the birth process of a triangle, and the children can be both edges and triangles. At each birth, there are at most three children. Therefore,

$$\_\alpha\xi\_{3}(\infty) = \int\_{0}^{\infty} e^{-\alpha t}\xi\_{3}(dt) = \sum\_{\tau(i)\le\lambda\_{3}} \varepsilon(i)e^{-\alpha\tau(i)} \le 3\sum\_{i=1}^{\infty} 1\cdot e^{-\alpha\tau(i)} = 3M,$$

where *τ*(1), *τ*(2), ... are the jump times of the Poisson process Π<sub>3</sub>. By the above calculation, E*M*<sup>2</sup> < ∞, so (37) is true for *i* = 3.

If we show that ∫<sub>0</sub><sup>∞</sup> *t*<sup>2</sup>*e*<sup>−*αt*</sup>*m<sub>i,j</sub>*(*dt*) < ∞ for *i*, *j* = 2, 3, then conditions (c) and (*iv*) of Section 8 will be proved. Now, for *i* = 2 and *j* = 2, 3, we have from Corollary 1

$$\int\_0^\infty t^2 e^{-\alpha t} m\_{2,j}(dt) \le \max\{r\_1, r\_2\} \int\_0^\infty t^2 e^{-\alpha t} e^{-t(b+1)} e^{\frac{1-e^{-ct}}{c}} dt \le \int\_0^\infty t^2 e^{-t(\alpha+b+1-1)} dt < \infty$$

because *α* + *b* > 0.

For *i* = 3 and *j* = 2, 3, we have from Corollary 1

$$\int\_0^\infty t^2 e^{-\alpha t} m\_{3,j}(dt) \le \max\{p\_1, p\_2 + 3p\_3\} \int\_0^\infty t^2 e^{-\alpha t} e^{-t(b+1)} e^{(p\_1+p\_2)\frac{1-e^{-ct}}{c} + p\_3\frac{1-e^{-3ct}}{3c}} dt \le$$

$$\leq \max\{p\_1, p\_2 + 3p\_3\} \int\_0^\infty t^2 e^{-t(\alpha+b+1-1)} dt < \infty.$$

Thus, conditions (c) and (*iv*) of Section 8 are proved. Now, we turn to the number of edges.

To obtain (33), let Φ<sub>*x*</sub>(*t*) = 1 if *x* is an edge, and Φ<sub>*x*</sub>(*t*) = 0 if *x* is a triangle. Therefore, EΦ<sub>2</sub>(*t*) = 1 and EΦ<sub>3</sub>(*t*) = 0. Conditions (*i*)–(*iii*) and (*v*) of Section 8 are satisfied. Thus, (69) and (70) imply (33).

To obtain (34), let Φ<sub>*x*</sub>(*t*) = 1 if *x* is an edge and it is present at *t*, and Φ<sub>*x*</sub>(*t*) = 0 if *x* is a triangle. Therefore, EΦ<sub>2</sub>(*t*) = 1 − *L*<sub>2</sub>(*t*) and EΦ<sub>3</sub>(*t*) = 0. Conditions (*i*)–(*iii*) and (*v*) of Section 8 are satisfied. Now,

$$\int\_0^\infty e^{-\alpha t} \mathbb{E} \Phi\_2(t) dt = \int\_0^\infty e^{-\alpha t} (1 - L\_2(t)) dt = A(\alpha).$$

Thus, (69) and (70) imply (34).

Now, we turn to the number of triangles.

To obtain (35), let Φ<sub>*x*</sub>(*t*) = 0 if *x* is an edge, and Φ<sub>*x*</sub>(*t*) = 1 if *x* is a triangle. Therefore, EΦ<sub>2</sub>(*t*) = 0 and EΦ<sub>3</sub>(*t*) = 1. Conditions (*i*)–(*iii*) and (*v*) of Section 8 are satisfied. Thus, (69) and (70) imply (35).

To obtain (36), let Φ<sub>*x*</sub>(*t*) = 0 if *x* is an edge, and Φ<sub>*x*</sub>(*t*) = 1 if *x* is a triangle and it is present at *t*. Therefore, EΦ<sub>2</sub>(*t*) = 0 and EΦ<sub>3</sub>(*t*) = 1 − *L*<sub>3</sub>(*t*). Conditions (*i*)–(*iii*) and (*v*) of Section 8 are satisfied. Now,

$$\int\_0^\infty e^{-\alpha t} \mathbb{E} \Phi\_3(t) dt = \int\_0^\infty e^{-\alpha t} (1 - L\_3(t)) dt = B(\alpha).$$

Thus, (69) and (70) imply (36).

#### **5. Generating Functions and the Probability of Extinction**

**The joint generating function of** Π2(*λ*2)**,** *ξ*22(*λ*2) **and** *ξ*23(*λ*2)**.** Recall that Π<sup>2</sup> is the Poisson process describing the reproduction times of the generic edge and *λ*<sup>2</sup> is its life length. Thus,

$$w\_{i,j,k} = \mathbb{P}(\Pi\_2(\lambda\_2) = i, \xi\_{22}(\lambda\_2) = j, \xi\_{23}(\lambda\_2) = k)$$

is the joint distribution of the offspring size of the generic edge during its whole life and its last reproduction time. We have

$$w\_{i,j,k} = \mathbb{P}(\tau\_i \le \lambda\_2 < \tau\_{i+1}, \xi\_{22}(\tau\_i) = j, \xi\_{23}(\tau\_i) = k),$$

where *τ<sub>i</sub>* is the *i*th jumping time of the Poisson process Π<sub>2</sub>. Thus, *w<sub>i,j,k</sub>* is the probability that the *i*th birth event is the last one that occurred before death, and the total numbers of the two types of offspring up to time *τ<sub>i</sub>* are equal to *j* and *k*, respectively.

Now, consider the sequence

$$
u\_{i,j,k} = \mathbb{P}(\tau\_i \le \lambda\_2, \xi\_{22}(\tau\_i) = j, \xi\_{23}(\tau\_i) = k).
$$

Let *ξ*<sub>2</sub>(*τ*<sub>*i*−1</sub>) = *m*, and assume for a while that *τ<sub>i</sub>* and *τ*<sub>*i*−1</sub> are fixed. Then, using (4) and (5) for the hazard rate, we can calculate that, for fixed *τ<sub>i</sub>* and *τ*<sub>*i*−1</sub>,

$$\mathbb{P}(\lambda\_2 \ge \tau\_i | \lambda\_2 \ge \tau\_{i-1}) = \exp\left(-(b+cm)(\tau\_i - \tau\_{i-1})\right).$$

We know that the increment (*τ<sup>i</sup>* − *τi*−1) is exponential with parameter 1; therefore,

$$\mathbb{P}(\lambda\_2 \ge \tau\_i | \lambda\_2 \ge \tau\_{i-1}) = \mathbb{E}\_{\tau\_i - \tau\_{i-1}} \exp(-(b+cm)(\tau\_i - \tau\_{i-1})) = \frac{1}{1+b+cm}.\tag{41}$$

At each birth step, the new individual can be either an edge or a triangle. Therefore, using the above calculations, the law of total probability, and the independence of the type of the newly born individual and (Π<sub>2</sub>, *λ*<sub>2</sub>), we have the following recursion for *u<sub>i,j,k</sub>*:

$$u\_{i,j,k} = u\_{i-1,j-1,k} \frac{r\_1}{1+b+c(j+k-1)} + u\_{i-1,j,k-1} \frac{r\_2}{1+b+c(j+k-1)}.\tag{42}$$

Now, by the definition of *wi*,*j*,*k*, we can see that

$$\begin{split} w\_{i,j,k} = \mathbb{P}(\tau\_{i} \le \lambda\_{2} < \tau\_{i+1}, \xi\_{22}(\tau\_{i}) = j, \xi\_{23}(\tau\_{i}) = k) = \\ = \mathbb{P}(\lambda\_{2} < \tau\_{i+1} | \tau\_{i} \le \lambda\_{2}, \xi\_{22}(\tau\_{i}) = j, \xi\_{23}(\tau\_{i}) = k) \mathbb{P}(\tau\_{i} \le \lambda\_{2}, \xi\_{22}(\tau\_{i}) = j, \xi\_{23}(\tau\_{i}) = k) = \\ = \frac{b + c(j + k)}{1 + b + c(j + k)} u\_{i,j,k}, \end{split}$$

where, by (41), (*b* + *c*(*j* + *k*))/(1 + *b* + *c*(*j* + *k*)) is the probability that the generic individual dies before the next birth event.
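The competing-exponentials computation behind (41) can also be checked by simulation: given *m* children so far, the time to the next birth is Exp(1) and the residual lifetime is Exp(*b* + *cm*), so death comes first with probability (*b* + *cm*)/(1 + *b* + *cm*). A Python sketch (the values of *b*, *c*, *m* are hypothetical):

```python
import random

b, c, m = 0.5, 1.0, 2          # hypothetical values: hazard b + c*m = 2.5
random.seed(7)
reps = 200000
deaths_first = sum(
    random.expovariate(b + c * m) < random.expovariate(1.0)  # death vs next birth
    for _ in range(reps)
)
p_hat = deaths_first / reps    # should be near (b + c*m)/(1 + b + c*m) = 2.5/3.5
```

The empirical frequency matches the closed-form probability to within Monte Carlo error.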

Let *v<sub>i,j,k</sub>* = *w<sub>i,j,k</sub>*/(*b* + *c*(*j* + *k*)) = *u<sub>i,j,k</sub>*/(1 + *b* + *c*(*j* + *k*)). Then, from (42), we obtain the following recursion for the sequence *v<sub>i,j,k</sub>*:

$$(1 + b + c(j + k))v\_{i,j,k} = v\_{i-1,j-1,k}r\_1 + v\_{i-1,j,k-1}r\_2, \tag{43}$$

where the initial values are

$$v\_{0,0,0} = \frac{1}{1+b} \text{ and } v\_{0,j,k} = 0 \text{ for } j \neq 0 \text{ or } k \neq 0. \tag{44}$$

Now, we calculate the generating function *G*(*x*, *y*, *z*) of the sequence *vi*,*j*,*k*. We have

$$G(x,y,z) = \sum\_{i=0}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} v\_{i,j,k} x^i y^j z^k.$$

First, multiplying with *x<sup>i</sup> yj z<sup>k</sup>* and then taking the sum of both sides of (43), we obtain

$$\begin{aligned} \sum\_{i=1}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} v\_{i,j,k} x^i y^j z^k (1 + b + cj + ck) &= \\ &= r\_1 xy \sum\_{i=1}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} v\_{i-1, j-1, k} x^{i-1} y^{j-1} z^k + r\_2 xz \sum\_{i=1}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} v\_{i-1, j, k-1} x^{i-1} y^j z^{k-1}, \end{aligned}$$

where *v*<sub>0,*j*,*k*</sub> is given by (44), and we define *v<sub>i,j,k</sub>* = 0 if *j* < 0 or *k* < 0. From this equation, we find

$$\begin{aligned} (1+b)\left(G(x,y,z) - \frac{1}{1+b}\right) + ycG\_y'(x,y,z) + zcG\_z'(x,y,z) &= \\ &= r\_1 xy G(x,y,z) + r\_2 xz G(x,y,z). \end{aligned} \tag{45}$$

Let *h*(*t*) = *G*(*x*, *ty*, *tz*). Now, substituting *y* with *ty*, *z* with *tz* in (45), we can obtain the following linear differential equation.

$$h'(t) + h(t) \left(\frac{1+b}{ct} - \frac{r\_1xy + r\_2xz}{c}\right) = \frac{1}{ct} \tag{46}$$

with the initial value condition

$$h(0) = \frac{1}{1+b}.\tag{47}$$

Now, we can use the well-known method for linear differential equations. We obtain that the solution of the initial value problem (46) and (47) is

$$h(t) = t^{-\frac{1+b}{c}} e^{\frac{r\_1xy + r\_2xz}{c}t} \frac{1}{c} \int\_0^t s^{\frac{1+b-c}{c}} e^{-\frac{r\_1xy + r\_2xz}{c}s} ds.$$

With *t* = 1, we obtain that

$$G(x,y,z) = h(1) = \frac{1}{c} \int\_0^1 s^{\frac{1+b-c}{c}} e^{\frac{r\_1xy + r\_2xz}{c}(1-s)} ds.$$
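As a cross-check of this closed form, one can also sum the series for *G* directly from the recursion (43) with initial values (44); since each birth event of an edge contributes exactly one child, *v<sub>i,j,k</sub>* ≠ 0 only when *j* + *k* = *i*. A Python sketch (the parameter values are hypothetical, with *r*<sub>1</sub> + *r*<sub>2</sub> = 1):

```python
import math

b, c, r1, r2 = 0.5, 1.0, 0.6, 0.4   # hypothetical values with r1 + r2 = 1

def G_recursion(x, y, z, imax=60):
    # sum of v_{i,j,k} x^i y^j z^k using (43)-(44); for an edge each birth
    # event has exactly one child, so v_{i,j,k} != 0 only when j + k = i
    level = {(0, 0): 1.0 / (1 + b)}      # current generation, keyed by (j, k)
    total = level[(0, 0)]
    for i in range(1, imax + 1):
        nxt = {}
        for j in range(i + 1):
            k = i - j
            num = r1 * level.get((j - 1, k), 0.0) + r2 * level.get((j, k - 1), 0.0)
            if num:
                v = num / (1 + b + c * (j + k))
                nxt[(j, k)] = v
                total += v * x ** i * y ** j * z ** k
        level = nxt
    return total

def G_integral(x, y, z, n=100000):
    # closed form: (1/c) int_0^1 s^{(1+b-c)/c} e^{(r1 x y + r2 x z)(1-s)/c} ds
    q = r1 * x * y + r2 * x * z
    h = 1.0 / n
    return (h / c) * sum(
        s ** ((1 + b - c) / c) * math.exp(q * (1 - s) / c)
        for s in ((i + 0.5) * h for i in range(n))
    )
```

The truncated series and the integral agree to several decimals; the terms of the series decay faster than geometrically because the denominators 1 + *b* + *c*(*j* + *k*) grow with each generation.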

We need the generating function of *w<sub>i,j,k</sub>* = *v<sub>i,j,k</sub>*(*b* + *c*(*j* + *k*)). It is

$$H(x, y, z) = \sum\_{i=0}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} v\_{i,j,k} (b + c(j+k)) x^i y^j z^k = $$

$$= bG(x, y, z) + cyG\_y'(x, y, z) + czG\_z'(x, y, z). \tag{48}$$

From here, we obtain

**Proposition 2.** *The joint generating function of* Π2(*λ*2)*, ξ*22(*λ*2) *and ξ*23(*λ*2) *is*

$$H(x, y, z) = e^{\frac{r\_1 x y + r\_2 x z}{c}} \frac{1}{c} \int\_0^1 s^{\frac{1 + b - c}{c}} e^{-\frac{r\_1 x y + r\_2 x z}{c}s} [b + (r\_1 x y + r\_2 x z)(1 - s)] ds,\tag{49}$$

*where* −1 ≤ *x*, *y*, *z* ≤ 1*.*

**Corollary 2.** *The generating function of the total offspring distribution of the generic edge is*

$$f\_2(y, z) = H(1, y, z) = e^{\frac{r\_1 y + r\_2 z}{c}} \frac{1}{c} \int\_0^1 s^{\frac{1 + b - c}{c}} e^{-\frac{r\_1 y + r\_2 z}{c}s} [b + (r\_1 y + r\_2 z)(1 - s)] ds. \tag{50}$$
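Two sanity checks on (50): *f*<sub>2</sub>(1, 1) = 1, since *f*<sub>2</sub> is a probability generating function, and *f*<sub>2</sub>(0, 0) = *b*/(1 + *b*), the probability that the generic edge dies before its first birth event. A Python sketch (the parameter values are hypothetical, with *r*<sub>1</sub> + *r*<sub>2</sub> = 1):

```python
import math

def f2(y, z, b=0.5, c=1.0, r1=0.6, r2=0.4, n=100000):
    # Corollary 2, evaluated with the midpoint rule
    q = r1 * y + r2 * z
    h = 1.0 / n
    integral = h * sum(
        s ** ((1 + b - c) / c) * math.exp(-q * s / c) * (b + q * (1 - s))
        for s in ((i + 0.5) * h for i in range(n))
    )
    return math.exp(q / c) * integral / c
```

Both identities hold up to the quadrature error of the midpoint rule.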

**The joint generating function of** Π3(*λ*3)**,** *ξ*32(*λ*3) **and** *ξ*33(*λ*3)**.** Here, we study the offspring of a triangle. To distinguish the notation of this subsection from that of the previous subsection, but avoid too many subscripts, we use a bar. Thus, here *w̄<sub>i,j,k</sub>*, *ū<sub>i,j,k</sub>*, *v̄<sub>i,j,k</sub>*, *Ḡ*(*x*, *y*, *z*) and *H̄*(*x*, *y*, *z*) denote quantities relating to the offspring of the generic triangle. Recall that Π<sub>3</sub> is the Poisson process describing the reproduction times of the generic triangle and *λ*<sub>3</sub> is the life length of the triangle. Thus,

$$\overline{w}\_{i,j,k} = \mathbb{P}(\Pi\_3(\lambda\_3) = i, \xi\_{32}(\lambda\_3) = j, \xi\_{33}(\lambda\_3) = k)$$

is the joint distribution of the offspring size of the generic triangle during its whole life and its last reproduction time. We have

$$
\overline{w}\_{i,j,k} = \mathbb{P}(\tau\_i \le \lambda\_3 < \tau\_{i+1}, \xi\_{32}(\tau\_i) = j, \xi\_{33}(\tau\_i) = k),
$$

where *τ<sub>i</sub>* is the *i*th jumping time of the Poisson process Π<sub>3</sub>. Thus, again, *w̄<sub>i,j,k</sub>* is the probability that the *i*th birth event is the last one that happened before death, and the total numbers of the two types of offspring up to time *τ<sub>i</sub>* are equal to *j* and *k*, respectively. Let

$$
\overline{u}\_{i,j,k} = \mathbb{P}(\tau\_i \le \lambda\_3, \xi\_{32}(\tau\_i) = j, \xi\_{33}(\tau\_i) = k).
$$

Let *ξ*<sub>3</sub>(*τ*<sub>*i*−1</sub>) = *m*, and assume for a while that *τ<sub>i</sub>* and *τ*<sub>*i*−1</sub> are fixed. Then, using (4) and (5) for the hazard rate, we can calculate that, for fixed *τ<sub>i</sub>* and *τ*<sub>*i*−1</sub>,

$$\mathbb{P}(\lambda\_3 \ge \tau\_i | \lambda\_3 \ge \tau\_{i-1}) = \exp\left(-(b+cm)(\tau\_i - \tau\_{i-1})\right).$$

We know that the increment (*τ<sup>i</sup>* − *τi*−1) is exponential with parameter 1; therefore,

$$\mathbb{P}(\lambda\_3 \ge \tau\_i | \lambda\_3 \ge \tau\_{i-1}) = \mathbb{E}\_{\tau\_i - \tau\_{i-1}} \exp(-(b+cm)(\tau\_i - \tau\_{i-1})) = \frac{1}{1+b+cm}.\tag{51}$$

At each birth step, the new individual can be either an edge or a triangle. Therefore, using the above calculations, the law of total probability, and the independence of the type of the newly born individual and (Π<sub>3</sub>, *λ*<sub>3</sub>), we have the following recursion for *ū<sub>i,j,k</sub>*:

$$\begin{split} \overline{u}\_{i,j,k} &= \overline{u}\_{i-1,j-1,k} \frac{p\_1}{1+b+c(j+k-1)} + \\ &+ \overline{u}\_{i-1,j,k-1} \frac{p\_2}{1+b+c(j+k-1)} + \overline{u}\_{i-1,j,k-3} \frac{p\_3}{1+b+c(j+k-3)}.\end{split} \tag{52}$$

Now, by the definition of *wi*,*j*,*k*, we can see that

$$\begin{split} \overline{w}\_{i,j,k} = \mathbb{P}(\tau\_{i} \le \lambda\_{3} < \tau\_{i+1}, \xi\_{32}(\tau\_{i}) = j, \xi\_{33}(\tau\_{i}) = k) = \\ = \mathbb{P}(\lambda\_{3} < \tau\_{i+1} | \tau\_{i} \le \lambda\_{3}, \xi\_{32}(\tau\_{i}) = j, \xi\_{33}(\tau\_{i}) = k) \mathbb{P}(\tau\_{i} \le \lambda\_{3}, \xi\_{32}(\tau\_{i}) = j, \xi\_{33}(\tau\_{i}) = k) = \\ = \frac{b + c(j + k)}{1 + b + c(j + k)} \overline{u}\_{i,j,k}, \end{split}$$

where, by (51), (*b* + *c*(*j* + *k*))/(1 + *b* + *c*(*j* + *k*)) is the probability that the generic individual dies before the next birth event.

Now, let *v̄<sub>i,j,k</sub>* = *w̄<sub>i,j,k</sub>*/(*b* + *c*(*j* + *k*)) = *ū<sub>i,j,k</sub>*/(1 + *b* + *c*(*j* + *k*)). Then, from (52), we obtain the following recursion for the sequence *v̄<sub>i,j,k</sub>*:

$$(1 + b + c(j + k))\overline{v}\_{i,j,k} = \overline{v}\_{i-1,j-1,k}p\_1 + \overline{v}\_{i-1,j,k-1}p\_2 + \overline{v}\_{i-1,j,k-3}p\_3, \tag{53}$$

where the initial values are

$$
\overline{v}\_{0,0,0} = \frac{1}{1+b} \text{ and } \overline{v}\_{0,j,k} = 0 \text{ for } j \neq 0 \text{ or } k \neq 0. \tag{54}
$$

Now, we calculate the generating function *Ḡ*(*x*, *y*, *z*) of the sequence *v̄<sub>i,j,k</sub>*. We have

$$\overline{G}(x,y,z) = \sum\_{i=0}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} \overline{v}\_{i,j,k} x^i y^j z^k.$$

First, multiplying with *x<sup>i</sup> yj z<sup>k</sup>* and then taking the sum of both sides of (53), we obtain

$$\begin{split} \sum\_{i=1}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} \overline{v}\_{i,j,k} x^{i} y^{j} z^{k} (1 + b + cj + ck) &= p\_{1} xy \sum\_{i=1}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} \overline{v}\_{i-1,j-1,k} x^{i-1} y^{j-1} z^{k} + \\ &+ p\_{2} xz \sum\_{i=1}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} \overline{v}\_{i-1,j,k-1} x^{i-1} y^{j} z^{k-1} + p\_{3} xz^{3} \sum\_{i=1}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} \overline{v}\_{i-1,j,k-3} x^{i-1} y^{j} z^{k-3}, \end{split}$$

where *v̄*<sub>0,*j*,*k*</sub> is given by (54), and we define *v̄<sub>i,j,k</sub>* = 0 if *j* < 0 or *k* < 0. From this equation, we find

$$\begin{aligned} (1+b)\left(\overline{G}(x,y,z) - \frac{1}{1+b}\right) + yc\overline{G}\_y'(x,y,z) + zc\overline{G}\_z'(x,y,z) &= \\ &= p\_1 xy \overline{G}(x,y,z) + p\_2 xz \overline{G}(x,y,z) + p\_3 xz^3 \overline{G}(x,y,z). \end{aligned} \tag{55}$$

Let *h̄*(*t*) = *Ḡ*(*x*, *ty*, *tz*). Now, substituting *y* with *ty* and *z* with *tz* in (55), we obtain the following linear differential equation:

$$
\overline{h}'(t) + \overline{h}(t) \left( \frac{1+b}{ct} - \frac{p\_1 xy + p\_2 xz + p\_3 xz^3 t^2}{c} \right) = \frac{1}{ct} \tag{56}
$$

with the initial value condition

$$
\overline{h}(0) = \frac{1}{1+b}.\tag{57}
$$

One can see that the solution of the initial value problem (56) and (57) is

$$\overline{h}(t) = t^{-\frac{1+b}{c}} e^{\frac{p\_1xy + p\_2xz}{c}t + \frac{p\_3xz^3}{3c}t^3} \frac{1}{c} \int\_0^t s^{\frac{1+b-c}{c}} e^{-\frac{p\_1xy + p\_2xz}{c}s - \frac{p\_3xz^3}{3c}s^3} ds.$$

With *t* = 1, we obtain that

$$\overline{G}(x,y,z) = \overline{h}(1) = \frac{1}{c} \int\_0^1 s^{\frac{1+b-c}{c}} e^{\frac{p\_1xy + p\_2xz}{c}(1-s) + \frac{p\_3xz^3}{3c}(1-s^3)} ds.$$

Therefore, the generating function of *w̄<sub>i,j,k</sub>* = *v̄<sub>i,j,k</sub>*(*b* + *c*(*j* + *k*)) is

$$\overline{H}(x, y, z) = \sum\_{i=0}^{\infty} \sum\_{j=0}^{\infty} \sum\_{k=0}^{\infty} \overline{v}\_{i,j,k} (b + c(j+k)) x^i y^j z^k =$$

$$= b\overline{G}(x, y, z) + cy\overline{G}\_y'(x, y, z) + cz\overline{G}\_z'(x, y, z). \tag{58}$$

From here, we obtain

**Proposition 3.** *The joint generating function of* Π3(*λ*3)*, ξ*32(*λ*3) *and ξ*33(*λ*3) *is*

$$\overline{H}(x, y, z) = \tag{59}$$

$$= \frac{1}{c} \int\_0^1 s^{\frac{1+b-c}{c}} e^{\frac{p\_1 xy + p\_2 xz}{c}(1-s) + \frac{p\_3 xz^3}{3c}(1-s^3)} \left[ b + (p\_1 x y + p\_2 x z)(1-s) + p\_3 x z^3 (1-s^3) \right] ds,$$

*where* −1 ≤ *x*, *y*, *z* ≤ 1*.*

**Corollary 3.** *The generating function of the total offspring distribution of the generic triangle is*

$$f\_3(y,z) = \overline{H}(1, y, z) = \tag{60}$$

$$= e^{\frac{p\_1y + p\_2z}{c} + \frac{p\_3z^3}{3c}} \frac{1}{c} \int\_0^1 s^{\frac{1+b-c}{c}} e^{-\frac{p\_1y + p\_2z}{c}s - \frac{p\_3z^3}{3c}s^3} \left[ b + (p\_1y + p\_2z)(1 - s) + p\_3z^3(1 - s^3) \right] ds.$$

**The probability of extinction.** In Theorem 3, we give the probability of extinction. To determine the extinction probability of the process, we consider the well-known embedded multi-type Galton–Watson process. At time *t* = 0, the 0th generation of the Galton–Watson process consists of a single individual, i.e., the ancestor. The first generation consists of all offspring of the ancestor. The offspring of the individuals of the *n*th generation form the (*n* + 1)th generation. Under some assumptions, the extinction of our original process has the same probability as the extinction of this embedded Galton–Watson process. The reproduction process *ξ<sub>i,j</sub>*(*t*) gives the number of type *j* offspring of an ancestor of type *i* up to time *t*. With *t* → ∞, we obtain that the total number of offspring is *ξ<sub>i,j</sub>*(∞). Therefore, Corollary 1 gives us the 2 × 2 matrix of the expected total offspring numbers as

$$\mathbb{M} = \left( m\_{i,j}(\infty) \right)\_{i,j=2}^3.$$

Actually, *mi*,*j*(∞) is the expected offspring number of the embedded Galton–Watson process.

Let *s*<sup>2</sup> and *s*<sup>3</sup> denote the probability of extinction of our process when the ancestor is an edge, resp. triangle.

**Theorem 3.** *Assume that* 0 ≤ *r*<sub>1</sub> < 1*,* 0 < *p*<sub>1</sub> ≤ 1*, and it is excluded that both r*<sub>1</sub> = 0 *and p*<sub>1</sub> = 1 *are satisfied at the same time. Let ϱ be the Perron–Frobenius root of* M*. If ϱ* ≤ 1*, then s*<sub>2</sub> = *s*<sub>3</sub> = 1*. If ϱ* > 1*, then s*<sub>2</sub> < 1 *and s*<sub>3</sub> < 1*. In any case,* (*s*<sub>2</sub>, *s*<sub>3</sub>) *is the smallest non-negative solution of the vector equation*

$$(s\_2, s\_3) = (f\_2(s\_2, s\_3), f\_3(s\_2, s\_3)),$$

*where f*<sup>2</sup> *and f*<sup>3</sup> *are given in Corollaries 2 and 3.*

**Proof.** We apply Theorem 7.1 in Chapter 1 of [19]. By Corollary 1, *m<sub>i,j</sub>*(0) = 0 and *m<sub>i,j</sub>*(*t*) is finite for any *i*, *j*. Therefore, by Theorem 7.1 in Chapter 3 of [19], the extinction of our original process has the same probability as the extinction of the embedded Galton–Watson process. Thus, we can apply Theorem 7.1 in Chapter 1 of [19]. Here, M is the matrix of the expected offspring numbers of the embedded Galton–Watson process. Now, M is positively regular because we assume that 0 ≤ *r*<sub>1</sub> < 1, 0 < *p*<sub>1</sub> ≤ 1, and it is excluded that both *r*<sub>1</sub> = 0 and *p*<sub>1</sub> = 1 are satisfied at the same time. Thus, our result follows from Theorem 7.1 in Chapter 1 of [19].
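Numerically, the smallest non-negative solution can be obtained by iterating the vector map from (0, 0): by the standard monotonicity of probability generating functions, the iterates increase to (*s*<sub>2</sub>, *s*<sub>3</sub>). A Python sketch (the parameter values are hypothetical, with *r*<sub>1</sub> + *r*<sub>2</sub> = 1 and *p*<sub>1</sub> + *p*<sub>2</sub> + *p*<sub>3</sub> = 1):

```python
import math

# Hypothetical parameter values (r1 + r2 = 1, p1 + p2 + p3 = 1)
b, c = 0.2, 1.0
r1, r2 = 0.5, 0.5
p1, p2, p3 = 0.5, 0.3, 0.2

def _mid01(g, n=4000):
    # midpoint rule on (0, 1)
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))

def f2(y, z):
    # Corollary 2
    q = r1 * y + r2 * z
    return math.exp(q / c) / c * _mid01(
        lambda s: s ** ((1 + b - c) / c) * math.exp(-q * s / c) * (b + q * (1 - s)))

def f3(y, z):
    # Corollary 3
    q = p1 * y + p2 * z
    w = p3 * z ** 3
    return math.exp(q / c + w / (3 * c)) / c * _mid01(
        lambda s: s ** ((1 + b - c) / c)
        * math.exp(-q * s / c - w * s ** 3 / (3 * c))
        * (b + q * (1 - s) + w * (1 - s ** 3)))

# iterate the vector map from (0, 0); the iterates increase to the smallest
# non-negative solution of (s2, s3) = (f2(s2, s3), f3(s2, s3))
s2 = s3 = 0.0
for _ in range(100):
    s2, s3 = f2(s2, s3), f3(s2, s3)
```

After the iteration, (*s*<sub>2</sub>, *s*<sub>3</sub>) satisfies the fixed-point equation up to quadrature and iteration error.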

#### **6. The Asymptotic Behaviour of the Degree of a Fixed Vertex**

**The process of the 'good children'.** To describe the degree of a fixed vertex, we introduce a new branching process that we call the process of 'good children'. This process contains those objects that contribute to the degree of the fixed vertex. We can see that a newly born vertex can have 1 or 2 edges if its parent is an edge object and 1, 2 or 3 edges if its parent is a triangle object.

First, we consider the case when the newly born vertex has one edge, and thus, at the beginning, it belongs to an edge object. In this paragraph, we call this edge the 'parent' edge. We fix the newly born vertex. Then, we distinguish those children objects of the 'parent' edge, which contribute to the degree of our fixed vertex. We call a child object of the 'parent' edge a 'good child' if it contains our fixed vertex. We can see that only the 'good children' and their 'good children' offspring can contribute to the degree of the fixed vertex. Then, the distribution of the number of 'good children' at a reproduction event of the 'parent' edge is

$$\mathbb{P}(\tilde{\varepsilon}\_{22} = 0) = 1 - \frac{1}{2}r\_1, \quad \mathbb{P}(\tilde{\varepsilon}\_{22} = 1) = \frac{1}{2}r\_1, \quad \mathbb{P}(\tilde{\varepsilon}\_{23} = 0) = 1 - r\_2, \quad \mathbb{P}(\tilde{\varepsilon}\_{23} = 1) = r\_2,$$

where *ε̃*<sub>22</sub> denotes the number of edge type 'good children' and *ε̃*<sub>23</sub> denotes the number of triangle type 'good children'. We have to consider the reproduction process of the 'good child', which is the following:

$$\tilde{\xi}\_{2,2}(t) = \tilde{\varepsilon}\_{22}(1) + \tilde{\varepsilon}\_{22}(2) + \dots + \tilde{\varepsilon}\_{22}(\Pi(t \wedge \lambda\_2)),\tag{61}$$

$$\tilde{\xi}\_{2,3}(t) = \tilde{\varepsilon}\_{23}(1) + \tilde{\varepsilon}\_{23}(2) + \dots + \tilde{\varepsilon}\_{23}(\Pi(t \wedge \lambda\_2)),\tag{62}$$

where *ξ̃*<sub>2,2</sub>(*t*) denotes the number of all edge type 'good children' and *ξ̃*<sub>2,3</sub>(*t*) denotes the number of all triangle type 'good children' born by the 'parent' edge, *ε̃*<sub>22</sub>(1), *ε̃*<sub>22</sub>(2), ... are i.i.d. copies of *ε̃*<sub>22</sub>, and *ε̃*<sub>23</sub>(1), *ε̃*<sub>23</sub>(2), ... are i.i.d. copies of *ε̃*<sub>23</sub>. Using Corollary 1, we see that the mean numbers of edge type and triangle type 'good children' are

$$\begin{aligned} \tilde{m}\_{2,2}(t) &= \mathbb{E}\tilde{\xi}\_{2,2}(t) = \mathbb{E}(\tilde{\varepsilon}\_{22})\mathbb{E}(\Pi(t \wedge \lambda\_2)) = \frac{1}{2}r\_1 F(t) = \frac{1}{2}m\_{2,2}(t), \\ \tilde{m}\_{2,3}(t) &= \mathbb{E}\tilde{\xi}\_{2,3}(t) = \mathbb{E}(\tilde{\varepsilon}\_{23})\mathbb{E}(\Pi(t \wedge \lambda\_2)) = r\_2 F(t) = m\_{2,3}(t). \end{aligned}$$

Now, consider the second case where the newly born vertex has two edges, and thus the 'parent' object is a single triangle. Let *ε*˜32 and *ε*˜33 denote the number of edge, resp. triangle type 'good children' of the 'parent' triangle. The distribution of the number of 'good children' will be the following

$$\mathbb{P}(\tilde{\varepsilon}\_{32} = 0) = 1 - \frac{1}{3}p\_1, \quad \mathbb{P}(\tilde{\varepsilon}\_{32} = 1) = \frac{1}{3}p\_1,$$

$$\mathbb{P}(\tilde{\varepsilon}\_{33} = 0) = 1 - \frac{2}{3}p\_2 - p\_3, \quad \mathbb{P}(\tilde{\varepsilon}\_{33} = 1) = \frac{2}{3}p\_2, \quad \mathbb{P}(\tilde{\varepsilon}\_{33} = 2) = p\_3.$$

Let *ξ*˜3,2(*t*) denote the number of all edge type 'good children', and *ξ*˜3,3(*t*) denote the number of all triangle type 'good children' born by the 'parent' triangle. We obtain from Corollary 1 that

$$\begin{aligned} \tilde{m}\_{3,2}(t) &= \mathbb{E}\tilde{\xi}\_{3,2}(t) = \mathbb{E}(\tilde{\varepsilon}\_{32})\mathbb{E}(\Pi(t \wedge \lambda\_3)) = \frac{1}{3}p\_1 G(t) = \frac{1}{3}m\_{3,2}(t), \\ \tilde{m}\_{3,3}(t) &= \mathbb{E}\tilde{\xi}\_{3,3}(t) = \mathbb{E}(\tilde{\varepsilon}\_{33})\mathbb{E}(\Pi(t \wedge \lambda\_3)) = \frac{2}{3}(p\_2 + 3p\_3)G(t) = \frac{2}{3}m\_{3,3}(t). \end{aligned}$$

Therefore, from Proposition 1, it is easily seen that the Laplace transforms of the average number of offspring are

$$\tilde{m}\_{2,2}^\*(\kappa) = \frac{1}{2} r\_1 A(\kappa), \quad \tilde{m}\_{2,3}^\*(\kappa) = r\_2 A(\kappa), \quad \tilde{m}\_{3,2}^\*(\kappa) = \frac{1}{3} p\_1 B(\kappa), \quad \tilde{m}\_{3,3}^\*(\kappa) = \frac{2}{3} (p\_2 + 3p\_3) B(\kappa).$$

Let

$$\tilde{M}(\kappa) = \begin{pmatrix} \tilde{m}\_{2,2}^\*(\kappa) & \tilde{m}\_{2,3}^\*(\kappa) \\ \tilde{m}\_{3,2}^\*(\kappa) & \tilde{m}\_{3,3}^\*(\kappa) \end{pmatrix}$$

be the matrix of the previous Laplace transforms. The Perron root, that is, the largest eigenvalue of *M*˜ (*κ*), is

$$\tilde{\varrho}(\kappa) = \frac{\frac{2}{3}(p\_2 + 3p\_3)B(\kappa) + \frac{1}{2}r\_1A(\kappa) + \sqrt{\left(\frac{2}{3}(p\_2 + 3p\_3)B(\kappa) - \frac{1}{2}r\_1A(\kappa)\right)^2 + \frac{4}{3}p\_1B(\kappa)r\_2A(\kappa)}}{2}. \tag{63}$$
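As an illustrative sanity check (not part of the paper), the closed form of the Perron root can be compared with a power iteration on the 2 × 2 matrix; the parameter values and the values of *A*(*κ*), *B*(*κ*) below are hypothetical.

```python
import math

# Hypothetical parameters and hypothetical values A = A(kappa), B = B(kappa).
r1, r2, p1, p2, p3 = 0.3, 0.5, 0.4, 0.3, 0.2
A, B = 0.9, 0.8

a, b = 0.5 * r1 * A, r2 * A                    # first row of the matrix
c, d = p1 * B / 3, 2 * (p2 + 3 * p3) * B / 3   # second row

# Closed-form Perron root, as in (63)
rho = (d + a + math.sqrt((d - a) ** 2 + 4 * b * c)) / 2

# Power iteration: converges to the largest eigenvalue of a positive matrix
x, y = 1.0, 1.0
for _ in range(200):
    x, y = a * x + b * y, c * x + d * y
    n = math.hypot(x, y)
    x, y = x / n, y / n
lam = (a * x + b * y) * x + (c * x + d * y) * y  # Rayleigh quotient

print(abs(rho - lam) < 1e-9)  # -> True
```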

In the following, we assume supercriticality of the 'good children' process; that is, we suppose that *ϱ*˜(0) > 1. We can see that the reproduction process of the 'good children' is supercritical if

$$\max\left\{\frac{1}{2}r\_1A(0), \frac{2}{3}(p\_2 + 3p\_3)B(0)\right\} > 1.$$

We assume the existence of a finite and positive Malthusian parameter of the 'good children' process. Thus, let *α*˜ be the Malthusian parameter; it satisfies the equation *ϱ*˜(*α*˜) = 1. From this equation and from (63), we see that *α*˜ is the solution of

$$\frac{1}{3}(r\_1(p\_2 + 3p\_3) - r\_2p\_1)A(\tilde{\alpha})B(\tilde{\alpha}) - \frac{1}{2}r\_1A(\tilde{\alpha}) - \frac{2}{3}(p\_2 + 3p\_3)B(\tilde{\alpha}) + 1 = 0. \tag{64}$$

Let (*v*˜2, *v*˜3) denote the right eigenvector of *M*˜ (*α*˜) corresponding to the eigenvalue 1, and let (*u*˜2, *u*˜3) be the left eigenvector with the conditions *v*˜2 + *v*˜3 = 1 and *v*˜2*u*˜2 + *v*˜3*u*˜3 = 1. Direct calculations show that

$$\tilde{v}\_2 = \frac{(1 - r\_1)A(\tilde{\alpha})}{\left(1 - \frac{3}{2}r\_1\right)A(\tilde{\alpha}) + 1}, \qquad \tilde{v}\_3 = \frac{1 - \frac{1}{2}r\_1A(\tilde{\alpha})}{\left(1 - \frac{3}{2}r\_1\right)A(\tilde{\alpha}) + 1},$$

$$\tilde{u}\_2 = \frac{\left(\left(1 - \frac{3}{2}r\_1\right)A(\tilde{\alpha}) + 1\right)\frac{1}{3}p\_1B(\tilde{\alpha})}{\frac{1}{3}r\_2A(\tilde{\alpha})p\_1B(\tilde{\alpha}) + \left(\frac{1}{2}r\_1A(\tilde{\alpha}) - 1\right)^2}, \qquad \tilde{u}\_3 = \frac{\left(\left(\frac{3}{2}r\_1 - 1\right)A(\tilde{\alpha}) - 1\right)\left(\frac{1}{2}r\_1A(\tilde{\alpha}) - 1\right)}{\frac{1}{3}r\_2A(\tilde{\alpha})p\_1B(\tilde{\alpha}) + \left(\frac{1}{2}r\_1A(\tilde{\alpha}) - 1\right)^2}.$$
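These formulas can be checked numerically. In the sketch below (not from the paper), the parameter values are hypothetical, *r*2 = 1 − *r*1 is assumed (as the forms of *v*˜2, *v*˜3 suggest), and *A*(*α*˜) is chosen so that (64) holds.

```python
# Hypothetical parameters; r2 = 1 - r1 is an assumption.
r1, p1, p2, p3, B = 0.3, 0.4, 0.3, 0.3, 0.9
r2 = 1 - r1
q = p2 + 3 * p3

# Choose A = A(alpha~) so that Equation (64) holds (it is linear in A).
A = (2 * q * B / 3 - 1) / ((r1 * q - r2 * p1) * B / 3 - r1 / 2)

# Entries of the mean matrix at alpha~; its Perron root is 1 by construction.
a, b = r1 * A / 2, r2 * A
c, d = p1 * B / 3, 2 * q * B / 3

N = (1 - 1.5 * r1) * A + 1
v2, v3 = (1 - r1) * A / N, (1 - r1 * A / 2) / N
den = r2 * A * p1 * B / 3 + (r1 * A / 2 - 1) ** 2
u2 = N * (p1 * B / 3) / den
u3 = ((1.5 * r1 - 1) * A - 1) * (r1 * A / 2 - 1) / den

print(abs(v2 + v3 - 1) < 1e-12)            # normalization -> True
print(abs(u2 * v2 + u3 * v3 - 1) < 1e-12)  # normalization -> True
print(abs(a * v2 + b * v3 - v2) < 1e-12)   # right eigenvector, eigenvalue 1 -> True
print(abs(u2 * a + u3 * c - u2) < 1e-12)   # left eigenvector, eigenvalue 1 -> True
```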

**Limit results for the degree.** We have already mentioned that the 'good children', and only they, can contribute to the degree of the fixed vertex. Thus, its degree is equal to the initial degree plus the number of 'good children'. Let <sup>2</sup>*C*˜(*t*) be the degree of a fixed vertex at time *t* after its birth in the case when the vertex belongs to an edge at its birth. Similarly, <sup>3</sup>*C*˜(*t*) is its degree in the case when the vertex belongs to a triangle at its birth. Up to an additive constant, <sup>*i*</sup>*C*˜(*t*) is the number of 'good children' offspring of an *i* type 'parent' object at time *t*. It is the sum of the number of edge type 'good children' <sup>*i*</sup>*E*˜(*t*) and the number of triangle type 'good children' <sup>*i*</sup>*T*˜(*t*). To apply Proposition 4, we can use the same method as in Theorem 2. Thus, for the edges, we can again use the random characteristic Φ*x*(*t*) = 1 if *x* is an edge and Φ*x*(*t*) = 0 if *x* is a triangle, but the underlying process is the process of 'good children'. The case of the triangles is similar.

Therefore, we have almost surely

$$\lim\_{t \to \infty} e^{-\tilde{\alpha}t} \, \_i\tilde{C}(t) = \lim\_{t \to \infty} e^{-\tilde{\alpha}t} \left( \_i\tilde{E}(t) + \_i\tilde{T}(t) \right) = \_i\tilde{W} \frac{\tilde{v}\_i(\tilde{u}\_2 + \tilde{u}\_3)}{\tilde{\alpha} \tilde{D}(\tilde{\alpha})},$$

for *i* = 2, 3, where <sup>2</sup>*W*˜ and <sup>3</sup>*W*˜ are positive on the event of non-extinction of the 'good children'.

The last case is when the newly born vertex has three edges. Then, three triangles contribute to the degree of that vertex. Let <sup>3</sup>*C*˜˜(*t*) be the degree of this vertex. Then, <sup>3</sup>*C*˜˜(*t*) is the sum of the 'good' offspring of three triangles. Thus, almost surely,

$$\lim\_{t \to \infty} e^{-\tilde{\alpha}t} \, \_3\tilde{\tilde{C}}(t) = \left(\_3\tilde{W}\_1 + \_3\tilde{W}\_2 + \_3\tilde{W}\_3\right) \frac{\tilde{v}\_3(\tilde{u}\_2 + \tilde{u}\_3)}{\tilde{\alpha} \tilde{D}(\tilde{\alpha})},$$

where <sup>3</sup>*W*˜ 1, <sup>3</sup>*W*˜ 2, <sup>3</sup>*W*˜ <sup>3</sup> are independent copies of <sup>3</sup>*W*˜ .

**Checking the conditions of Proposition 4 for the 'good children' process.** To complete the previous reasoning, we should check the conditions of Proposition 4. First, we find the denominator in the limit theorem; that is, we calculate *D*˜ . By Section 8, we see that

$$\tilde{D}(\tilde{\alpha}) = \sum\_{l,j=2}^{3} \tilde{u}\_l \tilde{v}\_j \left( -\tilde{m}\_{l,j}^\*(\tilde{\alpha}) \right)'. \tag{65}$$

Here, *u*˜*<sup>i</sup>* and *v*˜*<sup>i</sup>* are the eigenvectors. Moreover,

$$\left(-\tilde{m}\_{2,2}^\*(\tilde{\alpha})\right)' = \frac{r\_1}{2}\left(-A'(\tilde{\alpha})\right), \qquad \left(-\tilde{m}\_{2,3}^\*(\tilde{\alpha})\right)' = r\_2\left(-A'(\tilde{\alpha})\right), \tag{66}$$

$$\left(-\tilde{m}\_{3,2}^\*(\tilde{\alpha})\right)' = \frac{p\_1}{3}\left(-B'(\tilde{\alpha})\right), \qquad \left(-\tilde{m}\_{3,3}^\*(\tilde{\alpha})\right)' = \frac{2}{3}(p\_2 + 3p\_3)\left(-B'(\tilde{\alpha})\right), \tag{67}$$

where *α*˜ is the Malthusian parameter of the process of 'good children', and *A*′, *B*′ denote the derivatives given in (31) and (32).

Condition (a) of Proposition 4 is true because the measures *m*˜ *<sup>i</sup>*,*<sup>j</sup>* are non-lattice, as they are absolutely continuous. For condition (b1), we assume the existence of a positive Malthusian parameter; that is, we assume that (64) has a finite and positive solution *α*˜. Condition (b2) is true because we assume that *ϱ*˜(0) > 1. Condition (c) is a consequence of Section 4, because *m*˜ *<sup>i</sup>*,*j*(*t*) has the form *c* · *mi*,*j*(*t*), where *c* is a positive number.

To guarantee condition (d), in this section we assume that 0 ≤ *r*<sup>1</sup> < 1 and 0 < *p*<sup>1</sup> ≤ 1, and we exclude the case where *r*<sup>1</sup> = 0 and *p*<sup>1</sup> = 1 hold at the same time. Conditions (i)–(iii) and (v) are true because of the shape of Φ. Conditions (iv) and (vi) are consequences of *ξ*˜*i*,*j*(*t*) ≤ *ξi*,*j*(*t*), as one can see from the proof of Theorem 2.

**The extinction of the degree process.** The extinction of the degree process means that the degree of the vertex does not increase after a certain time, that is, the reproduction process of the 'good children' dies out. The probability of this kind of extinction is the smallest non-negative root (*s*˜2,*s*˜3) of the equation

$$(\tilde{s}\_2, \tilde{s}\_3) = \left(\tilde{f}\_2(\tilde{s}\_2, \tilde{s}\_3), \tilde{f}\_3(\tilde{s}\_2, \tilde{s}\_3)\right),$$

where *f*˜<sup>2</sup> and *f*˜<sup>3</sup> are the generating functions of the total 'good children' distribution of an edge, resp. a triangle. Now, by (61) and (62),

$$\tilde{f}\_2(y, z) = h\_{\Pi\_2(\lambda\_2)} \left( h\_{\tilde{\varepsilon}\_{2,2},\tilde{\varepsilon}\_{2,3}}(y, z) \right),$$

where *h*Π2(*λ*2) is the generating function of Π2(*λ*2), and *hε*˜2,2,*ε*˜2,3 is the joint generating function of *ε*˜2,2 and *ε*˜2,3. Here, by (49),

$$h\_{\Pi\_2(\lambda\_2)}(x) = H(x, 1, 1) = \frac{1}{c} \int\_0^1 s^{\frac{1+b-c}{c}} e^{\frac{(r\_1+r\_2)x}{c}(1-s)} \left[b + (r\_1+r\_2)x(1-s)\right] ds.$$

By direct calculation,

$$h\_{\tilde{\varepsilon}\_{2,2},\tilde{\varepsilon}\_{2,3}}(y,z) = 1 - \frac{1}{2}r\_1 - r\_2 + \frac{1}{2}r\_1 y + r\_2 z.$$

Similarly,

$$\tilde{f}\_3(y, z) = h\_{\Pi\_3(\lambda\_3)} \left( h\_{\tilde{\varepsilon}\_{3,2},\tilde{\varepsilon}\_{3,3}}(y, z) \right),$$

where by (59), the generating function of Π3(*λ*3) is

$$h\_{\Pi\_3(\lambda\_3)}(x) = \overline{H}(x, 1, 1) = \frac{1}{c} \int\_0^1 s^{\frac{1+b-c}{c}} e^{\frac{(p\_1+p\_2)x}{c}(1-s) + \frac{p\_3 x}{3c}(1-s^3)} \left[ b + (p\_1 + p\_2)x(1-s) + p\_3 x(1-s^3) \right] ds.$$

Moreover, the joint generating function of *ε*˜3,2 and *ε*˜3,3 is

$$h\_{\tilde{\varepsilon}\_{3,2},\tilde{\varepsilon}\_{3,3}}(y,z) = 1 - \frac{1}{3}p\_1 - \frac{2}{3}p\_2 - p\_3 + \frac{1}{3}p\_1 y + \frac{2}{3}p\_2 z + p\_3 z^2.$$
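The smallest non-negative root of a fixed-point system of this type can be found by iterating the generating functions from (0, 0); the iterates increase monotonically to the smallest root. A minimal sketch (the polynomial generating functions below are hypothetical stand-ins, not the integral forms derived above):

```python
# Hypothetical joint pgf's of the total numbers of 'good children';
# illustrative only, not the paper's formulas.
def f2(y, z):
    return 0.2 + 0.4 * y + 0.4 * z

def f3(y, z):
    return 0.1 + 0.15 * y + 0.3 * z + 0.45 * z * z

s2, s3 = 0.0, 0.0  # functional iteration started at (0, 0)
for _ in range(10_000):
    s2, s3 = f2(s2, s3), f3(s2, s3)

# For these particular pgf's the system can be solved by hand:
# the smallest root is s3 = 1/3 and s2 = 5/9.
print(round(s2, 6), round(s3, 6))  # -> 0.555556 0.333333
```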

#### **7. Simulations**

In this section, we provide some empirical results for our asymptotic theorems. We generated our process in the programming language Julia. We needed an environment in which priority queues are readily available; using this structure, the running time was reasonable. A more detailed explanation of the algorithm can be found in [20].
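The priority-queue idea can be sketched as follows (a hedged illustration in Python rather than the Julia code of [20]; the reproduction mechanism is a generic single-type birth process, not the exact two-type dynamics of the paper): events sit in a heap keyed by birth time, and each popped birth schedules the birth times of the new individual's offspring.

```python
import heapq
import random

def simulate_births(birth_rate, lifetime_mean, t_max, rng):
    """Event-driven sketch of a continuous-time branching process.
    Each individual reproduces at Poisson(birth_rate) times during an
    exponential lifetime with the given mean. Returns the number of
    individuals born by time t_max."""
    events = [(0.0,)]  # the ancestor is born at time 0
    births = 0
    while events:
        (t,) = heapq.heappop(events)
        if t > t_max:       # heap is time-ordered: all later events exceed t_max
            break
        births += 1
        life = rng.expovariate(1.0 / lifetime_mean)
        s = 0.0
        while True:         # schedule offspring born during this lifetime
            s += rng.expovariate(birth_rate)
            if s > life:
                break
            heapq.heappush(events, (t + s,))
    return births

print(simulate_births(1.2, 2.0, 8.0, random.Random(1)) >= 1)  # -> True
```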

According to Theorem 2, for large *t*, the graphs of the numbers of edges and triangles are approximately straight lines on the logarithmic scale. To obtain empirical evidence for Theorem 2, we investigated the slope, on the logarithmic scale, of the simulated numbers of edges and triangles being born and being present up to time *t*. The initial instability of the single processes (Figure 3) motivated us to exclude the first few observations from the calculations; omitting them was not relevant, because the asymptotic properties can be observed in the later stage of the processes.

**Figure 3.** Measurements of a single process on a logarithmic scale.

For each parameter set, we stored the mentioned measurements only at integer time steps, and then we took the average of 100 simulated processes. In Figure 4, an example is shown for a specific parameter set (*r*<sup>1</sup> = 0.1, *p*<sup>1</sup> = 0.2, *p*<sup>3</sup> = 0.6, *b* = 0.25, *c* = 0.25). The values of the averages are plotted by dots. In each case, we fitted a regression line (plotted as a continuous red line) to the last 9 values. We can see that the fit is very close, thus supporting our theorem.
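The slope estimate used here amounts to ordinary least squares on the log-counts; a minimal stdlib-only sketch (the function name and the synthetic data are ours, not from the paper):

```python
import math

def estimate_malthusian(times, counts, tail=9):
    """Slope of the least-squares line through (t, log count),
    using only the last `tail` observations."""
    ts = list(times)[-tail:]
    ys = [math.log(c) for c in list(counts)[-tail:]]
    n = len(ts)
    mt = sum(ts) / n
    my = sum(ys) / n
    num = sum((t - mt) * (y - my) for t, y in zip(ts, ys))
    den = sum((t - mt) ** 2 for t in ts)
    return num / den

# Synthetic check: counts growing exactly like W * exp(alpha * t)
ts = list(range(1, 15))
cs = [2.5 * math.exp(0.8 * t) for t in ts]
print(round(estimate_malthusian(ts, cs), 6))  # -> 0.8
```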

Our main goal was to obtain a 95% confidence interval for the slope of the linear regression line, as that was our simulated approximation of the Malthusian parameter *α*. Table 1 contains the boundaries of the 95% confidence intervals for *α*. The columns labelled with 2.5% and 97.5% refer to the lower and the upper bounds obtained from simulations, while the column of *α*ˆ refers to the numerical solution of Equation (25).

For each fixed parameter set {*r*1, *p*1, *p*2, *b*, *c*}, we present the confidence intervals calculated from the number of edges being born (*E*) resp. being present (*E*˜) and from the number of triangles being born (*T*) resp. being present (*T*˜) up to time *t* = 14. The confidence intervals containing the numerical Malthusian parameter *α*ˆ are highlighted with the ∗ symbol. We see that each confidence interval is narrow, and it either contains *α*ˆ, or *α*ˆ is very close to the interval. These results show that the approximation is good for moderate values of *t*.

**Figure 4.** The average of 100 processes generated by the same parameter set and the regression line.

Finally, we present some simulation results for Theorem 3, that is, for the probability of extinction of the evolution process. We made the following computer experiment for any fixed parameter set {*r*1, *p*1, *p*2, *b*, *c*} and for type 2 and type 3 ancestors. We started to generate the process. If this process reached 2<sup>10</sup> birth steps, then we stopped it and considered it as a non-extinct process. Otherwise, when the process did not reach 2<sup>10</sup> birth steps, the process died out. Applying the above method, we generated 10<sup>5</sup> processes for each parameter set and counted the relative frequencies of the processes being extinct.

In Table 2, we show some of the results. Column Ancestor contains the type of the ancestor. In the column Numeric we show the numeric solution of the non-linear equation in Theorem 3. We used Julia's trust region method. Column Simulation contains the relative frequencies extracted from the simulations. The simulation results slightly underestimate the numeric values. This is reasonable because we stopped all processes at a fixed time.
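The censoring experiment can be mimicked in a few lines. The two-type offspring laws below are hypothetical (chosen so that the extinction probabilities can be solved by hand), and a fixed birth-step threshold plays the role of the stopping rule described above; censored runs count as non-extinct, so the frequency slightly underestimates the true extinction probability, as noted.

```python
import random

# Hypothetical offspring laws (not the paper's):
# type 2: P(nothing)=0.2, P(one type-2)=0.4, P(one type-3)=0.4
# type 3: P(nothing)=0.1, P(one type-2)=0.15, P(one type-3)=0.3, P(two type-3)=0.45
# For these laws the extinction probabilities are s2 = 5/9 and s3 = 1/3.
def offspring(kind, rng):
    u = rng.random()
    if kind == 2:
        if u < 0.2:
            return 0, 0
        return (1, 0) if u < 0.6 else (0, 1)
    if u < 0.1:
        return 0, 0
    if u < 0.25:
        return 1, 0
    return (0, 1) if u < 0.55 else (0, 2)

def extinct(rng, max_births=1024):
    n2, n3, births = 1, 0, 1   # one type-2 ancestor
    while n2 + n3 > 0:
        if births >= max_births:
            return False       # censored run: counted as non-extinct
        m2 = m3 = 0
        for _ in range(n2):
            a, b = offspring(2, rng)
            m2, m3 = m2 + a, m3 + b
        for _ in range(n3):
            a, b = offspring(3, rng)
            m2, m3 = m2 + a, m3 + b
        n2, n3 = m2, m3
        births += m2 + m3
    return True

rng = random.Random(0)
freq = sum(extinct(rng) for _ in range(10_000)) / 10_000
print(abs(freq - 5 / 9) < 0.03)  # close to the hand-computed value -> True
```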


**Table 1.** The 95% confidence intervals for *α*.

**Table 2.** Comparison of the numeric values of the extinction probabilities and their relative frequencies from 10<sup>5</sup> repetitions.


#### **8. Basic Facts on Branching Processes**

In our paper, we use known results of the theory of continuous-time branching processes. The single type general Crump–Mode–Jagers branching processes have been described e.g., in [21–23]. The general multi-type branching processes have been studied, e.g., in [11,19,24].

Here, we give a short description of the general multi-type branching processes based on [11]. The individuals of this process can be of *p* different types, which we denote by 1, 2, ... , *p*. Any individual *x* is described by the quantities *λx*, *ξx*, Φ*x*, Ψ*x*, ... . The quantities *λx*, *ξx*, Φ*x*, Ψ*x*, ... are independent copies of the quantities *λ*, *ξ*, Φ, Ψ, ... . Thus, we should give the definition of *λ*, *ξ*, Φ, Ψ, ... , which we consider as the quantities corresponding to the generic individual.

The lifetime *λ* is a non-negative random variable which is not necessarily independent from the reproduction. The lifetime distribution is *<sup>L</sup>*(*t*) <sup>=</sup> <sup>P</sup>(*<sup>λ</sup>* <sup>≤</sup> *<sup>t</sup>*). The reproduction process is *ξi*(*t*) = (*ξi*,1(*t*), ... , *ξi*,*p*(*t*)), *t* ≥ 0. Here, the random point process *ξi*,*<sup>j</sup>* describes the births of type *j* offspring of a type *i* mother. *ξi*,*j*(*t*) gives the number of type *j* offspring of a type *i* mother up to time *t*. *ξi*,*<sup>j</sup>* is determined by the birth events and the numbers of offspring. The process starts at time *t* = 0 with one individual called the ancestor and denoted by *x*0. When a child is born, it starts its own reproduction process, and so on. The birth time of the individual *x* is denoted by *σx*.

Let Φ(*t*) be a non-negative random function that describes a certain aspect of the life history of the individual. It is usually assumed that Φ(*t*) = 0 for *t* ≤ 0. Then, Φ(*t*) is called a random characteristic. Let Ψ(*t*) be another random characteristic. Thus, the behaviour of the individual *x* is described by *ξx*, *λx*, Φ*x*, Ψ*x*,....

Let us define the branching process *<sup>x</sup>*0*Z*Φ(*t*) counted by the characteristic Φ as

$$\_{x\_0}Z^{\Phi}(t) = \sum\_{x} \Phi\_{x}(t - \_{x\_0}\sigma\_{x}),$$

where the summation is over all individuals *x*. Here, the left subscript *x*<sup>0</sup> of *Z* and of the birth time *σ<sup>x</sup>* is important, because it denotes that the process starts with ancestor *x*<sup>0</sup>, and the type of *x*<sup>0</sup> influences the evolution of the population.

Let us denote by *mi*,*j*(*t*) the reproduction function, which is the expected reproduction number *mi*,*j*(*t*) = E*ξi*,*j*(*t*).

The following facts are well-known (see [11] or [24]).

We assume the following basic conditions in this section.

(a) Not all of the measures *mi*,*<sup>j</sup>* are concentrated on a lattice.

Let

$$m\_{i,j}^\*(\kappa) = \int\_0^\infty e^{-\kappa t} m\_{i,j}(dt), \qquad i, j = 1, \dots, p,$$

be the Laplace transform of *mi*,*j*. Let *M*(*κ*) be the matrix

$$M(\kappa) = \left(m\_{i,j}^\*(\kappa)\right)\_{i,j=1}^p.$$

(b1) There exists a positive Malthusian parameter *α*, that is, a finite positive value such that *M*(*α*) has finite entries only and the Perron–Frobenius root of *M*(*α*) is equal to 1. Here, the Perron–Frobenius root is the largest eigenvalue of the matrix. Let (*v*1, ... , *vp*) be the right positive eigenvector and (*u*1, ... , *up*) the left positive eigenvector of *M*(*α*) corresponding to the Perron–Frobenius root. We normalize them as ∑<sup>*p*</sup><sub>*i*=1</sub> *vi* = 1 and ∑<sup>*p*</sup><sub>*i*=1</sub> *uivi* = 1.

(b2) The matrix (*mi*,*j*(∞))<sup>*p*</sup><sub>*i*,*j*=1</sub> has an infinite entry, or all of its entries are finite and its Perron–Frobenius root is greater than 1.

(c) The first moment of *e*−*α<sup>t</sup> mi*,*j*(*dt*) is finite and positive; that is,

$$0 < \int\_0^\infty t e^{-\alpha t} m\_{i,j}(dt) < \infty, \qquad i, j = 1, \dots, p.$$

(d) There exists a finite positive integer *K* such that all elements of the *K*th power of the matrix (*mi*,*j*(∞))<sup>*p*</sup><sub>*i*,*j*=1</sub> are positive.

Let

$$\_{\alpha}\xi\_{i,j}(\infty) = \int\_0^\infty e^{-\alpha t} \xi\_{i,j}(dt). \tag{68}$$

**Proposition 4.** *Let α be the Malthusian parameter. Assume that the random characteristic* Φ *satisfies the following conditions:*


*(iv) for some ε* > 0

$$\int\_0^\infty t(\log(1+t))^{1+\varepsilon} e^{-\alpha t} m\_{i,j}(dt) < \infty, \qquad i, j = 1, \dots, p,$$

*and*

*(v) for some ε* > 0

$$\mathbb{E}\sup\_{t\geq 0} \left\{ \max \left\{ t(\log(1+t))^{1+\varepsilon}, 1 \right\} e^{-\alpha t} \Phi(t) \right\} < \infty$$

*for any ancestor.*

*Then,*

$$\lim\_{t \to \infty} e^{-\alpha t} \, \_{x\_0} Z^{\Phi}(t) = \, \_{x\_0}Y\_{\infty} \, v\_i \, m^{\Phi}\_{\infty} \tag{69}$$

*almost surely, where i is the type of x*0*,*

$$m\_{\infty}^{\Phi} = \frac{\sum\_{j=1}^{p} u\_{j} \int\_{0}^{\infty} e^{-\alpha t} \mathbb{E}\Phi\_{j}(t)\, dt}{\sum\_{l,j=1}^{p} u\_{l} v\_{j} \int\_{0}^{\infty} t e^{-\alpha t} m\_{l,j}(dt)}\,, \tag{70}$$

*<sup>x</sup>*0*Y*<sup>∞</sup> *is an a.s. non-negative random variable depending on the type of the ancestor x*<sup>0</sup> *but not depending on the choice of* Φ*.*

*If, in addition, we assume that*

*(vi)*

$$\mathbb{E}\left[\_{\alpha}\xi\_{i,j}(\infty)\, \log^{+} \_{\alpha}\xi\_{i,j}(\infty)\right] < \infty, \qquad i,j = 1,\ldots,p,\tag{71}$$

*then* E(*x*0*Y*∞) = <sup>1</sup>*, <sup>x</sup>*0*Y*<sup>∞</sup> *is positive with positive probability, and <sup>x</sup>*0*Y*<sup>∞</sup> *is a.s. positive on the survival set.*

The proof is a simple consequence of Theorem 2.4 and Proposition 4.1 of [11].

#### **9. Discussion**

In this paper, a new network evolution model was introduced. This model was inspired by networks in which small substructures play an important role. In social life, such a substructure could be a group of friends. In the theory of networks, these substructures are called motifs. In this paper, for the sake of simplicity, we consider only two types of substructures: the edges and the triangles. The novelty of the paper is the usage of a two-type continuous-time branching process to describe these two types of interactions. Thus, unlike [7,10], the theory of multi-type branching processes was applied to certain substructures of the network and not just to the nodes. Our paper extends the former studies [16,17], where only one type of interaction was considered.

In this paper, we proved that the magnitude of the number of triangles on the event of non-extinction is *e*<sup>*αt*</sup>, where *α* is the Malthusian parameter. We obtained similar results for the number of edges. We also studied the degree process of a fixed vertex and the probability of extinction. Our results are similar to the ones obtained for the simpler models in [16,17]. In addition to the mathematical proofs, the results were illustrated by simulations.

In future extensions of the model, more than two types of substructures can be studied using the theory of multi-type branching processes.

**Author Contributions:** Conceptualization and methodology, I.F.; software, A.B.; writing the original draft, I.F.; editing, A.B.; visualization, A.B.; supervision, I.F.; Section 4 is due to I.F., Section 6 is due to A.B. All authors have read and agreed to the published version of the manuscript.

**Funding:** The publication is supported by the EFOP-3.6.1-16-2016-00022 project. The project is co-financed by the European Union and the European Social Fund.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors would like to thank the referees for their helpful remarks.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **A Time-Inhomogeneous Prendiville Model with Failures and Repairs**

**Virginia Giorno \*,† and Amelia G. Nobile †**

Dipartimento di Informatica, Università degli Studi di Salerno, Via Giovanni Paolo II n. 132, 84084 Salerno, Italy; nobile@unisa.it

**\*** Correspondence: giorno@unisa.it

† These authors contributed equally to this work.

**Abstract:** We consider a time-inhomogeneous Markov chain with a finite state-space which models a system in which failures and repairs can occur at random time instants. The system starts from any state *j* (operating, *F*, *R*). Due to a failure, a transition from an operating state to *F* occurs after which a repair is required, so that a transition leads to the state *R*. Subsequently, there is a restore phase, after which the system restarts from one of the operating states. In particular, we assume that the intensity functions of failures, repairs and restores are proportional and that the birth-death process that models the system is a time-inhomogeneous Prendiville process.

**Keywords:** continuous-time Ehrenfest model; first-passage time densities; proportional intensity functions; asymptotic behaviors

**MSC:** 60J28; 60J35; 60K25; 60K20

#### **1. Introduction**

Continuous-time Markov chains (CTMC) are usually used in various application fields related to queueing systems, mathematical biology, physics, and chemistry (cf., for instance, Anderson [1], Iosifescu and Tautu [2], Medhi [3], Bayley [4], van Kampen [5], Taylor and Karlin [6], Sericola [7]). In these cases, the stochastic process describes the evolution in continuous time of a Markov chain with a countable set of states that represent the number of customers in a queue, the number of molecules in a chemical reaction, the size of the population with births/deaths/immigrations/emigrations.

In recent decades, particular attention has been paid to the study of these processes under the effect of random catastrophes that produce a sudden change of the state of a system. After such a failure, one can assume that the system is empty (total catastrophes) and that the dynamics then immediately restart without delay (cf., for instance, Dharmaraja et al. [8], Giorno et al. [9–11], Di Crescenzo et al. [12], Economou and Fakinos [13,14], Chen et al. [15]). In more realistic cases, after a failure the system can be shipped for maintenance; in these cases, due to the extent of the failure, it is reasonable to assume random repair times. To introduce the effect of a catastrophe related to a failure of the system, one adds to the usual assumptions the existence of a non-zero probability of transition to an intermediate state from which the zero state, or another operating state, can be reached at some randomly distributed instants (cf., for instance, Di Crescenzo et al. [16,17], Ye et al. [18], Mytalas and Zazanis [19], Krishna Kumar et al. [20]). In many cases, the times to failure and the times of repair are assumed to be exponential random variables. Some models consider phase-type distributions for failure and repair times (see, for instance, Altiok [21–23], Dallery [24]).

Frequently, time-inhomogeneous Markov chains are used to model real dynamic systems. Research in this area is oriented toward determining the transient and the limiting probability distributions, and toward constructing a continuous-time diffusion approximation (cf., for instance,

**Citation:** Giorno, V.; Nobile, A.G. A Time-Inhomogeneous Prendiville Model with Failures and Repairs. *Mathematics* **2022**, *10*, 251. https:// doi.org/10.3390/math10020251

Academic Editors: Alexander Zeifman, Victor Korolev and Alexander Sipin

Received: 20 December 2021 Accepted: 12 January 2022 Published: 14 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Kendall [25], McNeil and Schach [26], Di Crescenzo et al. [27,28], Giorno et. al. [29,30]). Moreover, some studies on the ergodicity of time-inhomogeneous birth-death chains are considered in Ammar et al. [31], Zeifman et al. [32,33], Satin et al. [34]. For CTMC, the evaluation of first-passage time densities and their moments via analytical and numerical methods plays an important role (cf., for instance, Jouini [35], Giorno and Nobile [36] and references therein).

Various studies have been devoted to stochastic "logistic models" that describe biological population growth in a limited environment or the number of customers in a queueing system with finite capacity. In particular, the logistic model proposed by Prendiville in 1949, and subsequently solved by Takashima in 1956, was applied in biology, in ecology and in queueing systems (cf. Prendiville [37], Takashima [38], Giorno et al. [39], Ricciardi [40]). The Prendiville process can also be viewed as the Ehrenfest model in continuous time (see Karlin and McGregor [41], Flegg et al. [42]). Furthermore, Zheng [43] gives the extension of the Prendiville process to the inhomogeneous case. The Prendiville/Ehrenfest model has also been used to describe queueing systems in the presence of catastrophes (cf. Dharmaraja [8], Giorno [44,45]). Moreover, Parthasarathy and Krishna Kumar [46] and Matis and Kiffe [47] consider stochastic compartment models with Prendiville growth mechanisms.

In the present paper, we consider a time-inhomogeneous birth-death process with a finite state-space, and we assume that failures and repairs can occur at random time instants. Specifically, the state-space of the considered stochastic process, in addition to the operating states, includes two particular states, denoted by *F* and *R*. The system starts from any state *j* (operating, *F*, *R*). Due to a failure that occurs according to a non-stationary exponential distribution, a transition from an operating state to *F* occurs, after which a repair is required, leading from *F* to the state *R*. The repair times are also assumed to be random, and they occur according to a non-stationary exponential distribution. After the system has been repaired, it restarts from one of the operating states.

The plan of the paper is as follows. In Section 2, we describe the stochastic model; we provide the Kolmogorov differential equations for the time-inhomogeneous CTMC with a finite state-space, assuming that the times of failures, repairs, and restores are exponentially distributed. In Section 3, we assume that the failure, repair and restore intensity functions are proportional; we determine the transient probabilities that, starting from an arbitrary state *j* at time *t*0, the system reaches the state *F*, or the state *R*, or one of the operating states 0, 1, . . . , ℓ at time *t*. In Section 4, we analyze the time of first failure and determine its probability density function and related average. In Section 5, we obtain the probability generating function of the operating states of the system and the related conditional mean. In Section 6, the asymptotic behavior of the probabilities and of the related average for the operating states is studied, under the assumption of proportional intensity functions.

#### **2. The Model**

Let {*N*(*t*), *t* ≥ *t*0} be a time-inhomogeneous Markov chain with state-space S = {−2, −1, 0, 1, . . . , ℓ}, where *n* = −2 corresponds to the failure state (*F*), *n* = −1 describes the repair state (*R*), from which the process can work again, and *n* = 0, 1, ... , ℓ correspond to the operating states of the system (see Figure 1). We assume that the arrivals (upward jumps) and departures (downward jumps) at time *t* occur with intensity functions *λn*(*t*) for *n* = 0, 1, ... , ℓ − 1 and *μn*(*t*) for *n* = 1, 2, ... , ℓ, respectively. Moreover, the failures occur according to a non-homogeneous Poisson process, with intensity function *ξn*(*t*), starting from the operating state *n*, with *n* = 0, 1, ... , ℓ. If a failure occurs, then the system goes into the failure state *F*, and further, the completion of a repair occurs according to the intensity function *ϱ*(*t*). After the repair, there is a restore phase, after which the system restarts from an operating state *n*, with intensity function *γn*(*t*) for *n* = 0, 1, ... , ℓ. Several cases can occur: *(a)* after the repair, the system restarts from the state *n* = 0, so that we have *γ*0(*t*) = *γ*(*t*) and *γn*(*t*) = 0 for *n* = 1, 2 ... , ℓ; *(b)* the state from which the system restarts is chosen randomly, by setting *γn*(*t*) = *γ*(*t*) for *n* = 0, 1, 2 ... , ℓ; *(c)* the

intensity functions *γ*0(*t*), *γ*1(*t*), ... , *γ*ℓ(*t*) are chosen by reflecting the priority of one state over the others.

**Figure 1.** The state diagram of the Markov process *N*(*t*) modeling failures and repairs.

Specifically, in any small interval (*t*, *t* + Δ*t*), Δ*t* > 0, we assume that the transitions that regulate *N*(*t*) occur according the following scheme:


where *λn*(*t*), *μn*(*t*), *γn*(*t*), *ξn*(*t*), *ϱ*(*t*) are positive, bounded and continuous functions for *t* ≥ 0. In Buonocore et al. [48], a similar time-homogeneous model is considered in the biological context, assuming that *λn*(*t*) = *λ* for *n* = 0, 1, ... , ℓ − 1, *μn*(*t*) = *μ* for *n* = 1, 2, . . . , ℓ, *γn*(*t*) = *γ* for *n* = 0, 1, . . . , ℓ, *ξn*(*t*) = *ξ* for *n* = 0, 1, . . . , ℓ and *ϱ*(*t*) = *ϱ*. Let

$$p\_{j,n}(t|t\_0) = P\{N(t) = n | N(t\_0) = j\}, \qquad j, n \in \mathcal{S} \tag{1}$$

be the transition probabilities of *N*(*t*). Setting

$$\nu(t) = \sum\_{n=0}^{\ell} \gamma\_n(t),\tag{2}$$

one has:

$$\begin{split} \frac{dp\_{j,-2}(t|t\_0)}{dt} &= \sum\_{n=0}^{\ell} \xi\_n(t) \, p\_{j,n}(t|t\_0) - \varrho(t) \, p\_{j,-2}(t|t\_0), \\ \frac{dp\_{j,-1}(t|t\_0)}{dt} &= -\nu(t) \, p\_{j,-1}(t|t\_0) + \varrho(t) \, p\_{j,-2}(t|t\_0), \\ \frac{dp\_{j,0}(t|t\_0)}{dt} &= \gamma\_0(t) \, p\_{j,-1}(t|t\_0) - \left[\lambda\_0(t) + \xi\_0(t)\right] p\_{j,0}(t|t\_0) + \mu\_1(t) \, p\_{j,1}(t|t\_0), \\ \frac{dp\_{j,n}(t|t\_0)}{dt} &= \gamma\_n(t) \, p\_{j,-1}(t|t\_0) + \lambda\_{n-1}(t) \, p\_{j,n-1}(t|t\_0) \\ &\quad - \left[\lambda\_n(t) + \mu\_n(t) + \xi\_n(t)\right] p\_{j,n}(t|t\_0) + \mu\_{n+1}(t) \, p\_{j,n+1}(t|t\_0), \qquad n = 1,2,\ldots,\ell-1, \\ \frac{dp\_{j,\ell}(t|t\_0)}{dt} &= \gamma\_{\ell}(t) \, p\_{j,-1}(t|t\_0) + \lambda\_{\ell-1}(t) \, p\_{j,\ell-1}(t|t\_0) - \left[\mu\_{\ell}(t) + \xi\_{\ell}(t)\right] p\_{j,\ell}(t|t\_0), \end{split} \tag{3}$$

to solve with the initial conditions

$$\lim\_{t \downarrow t\_0} p\_{j,n}(t|t\_0) = \delta\_{j,n} \qquad j, n \in \mathcal{S}. \tag{4}$$
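As an illustrative numerical sketch (not part of the original paper), system (3) with initial conditions (4) can be integrated directly; the snippet below uses arbitrary constant intensities and checks that total probability is conserved, since the right-hand sides of (3) sum to zero.

```python
# Illustrative sketch: forward equations (3) with constant intensities.
# All parameter values below are arbitrary assumptions.
ell = 3                                  # highest operating state
lam = [1.0, 1.0, 1.0]                    # lambda_n, n = 0, ..., ell-1
mu = [0.8, 0.8, 0.8]                     # mu[k] stands for mu_{k+1}
xi = [0.5, 0.5, 0.5, 0.5]                # xi_n, n = 0, ..., ell
gam = [0.3, 0.3, 0.3, 0.3]               # gamma_n, n = 0, ..., ell
rho = 2.0                                # repair intensity varrho
nu = sum(gam)                            # Equation (2)

def deriv(p):
    """Right-hand side of (3); p = [p_{-2}, p_{-1}, p_0, ..., p_ell]."""
    d = [0.0] * (ell + 3)
    d[0] = sum(xi[n] * p[n + 2] for n in range(ell + 1)) - rho * p[0]
    d[1] = -nu * p[1] + rho * p[0]
    for n in range(ell + 1):
        out = xi[n] + (lam[n] if n < ell else 0.0) + (mu[n - 1] if n > 0 else 0.0)
        d[n + 2] = gam[n] * p[1] - out * p[n + 2]
        if n > 0:
            d[n + 2] += lam[n - 1] * p[n + 1]
        if n < ell:
            d[n + 2] += mu[n] * p[n + 3]
    return d

def rk4(p, t, steps=20000):
    """Classical Runge-Kutta integration of the linear system."""
    h = t / steps
    for _ in range(steps):
        k1 = deriv(p)
        k2 = deriv([x + h / 2 * k for x, k in zip(p, k1)])
        k3 = deriv([x + h / 2 * k for x, k in zip(p, k2)])
        k4 = deriv([x + h * k for x, k in zip(p, k3)])
        p = [x + h / 6 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p

p0 = [0.0] * (ell + 3)
p0[2] = 1.0                              # start in operating state n = 0
p = rk4(p0, 5.0)
print(abs(sum(p) - 1.0) < 1e-9)          # total probability is conserved
```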

For *t* ≥ *t*0, denoting by

$$\mathcal{P}\_j(t|t\_0) = \sum\_{n=0}^{\ell} p\_{j,n}(t|t\_0), \qquad j \in \mathcal{S}, \tag{5}$$

the probability that the system is in an operating state at time *t*, one has:

$$\mathcal{P}\_j(t|t\_0) + p\_{j,-2}(t|t\_0) + p\_{j,-1}(t|t\_0) = 1, \qquad j \in \mathcal{S}.\tag{6}$$

If *ξn*(*t*) = *ξ*(*t*) for *n* = 0, 1, . . . , ℓ and *t* ≥ *t*0, by virtue of (6), one obtains

$$\sum\_{n=0}^{\ell} \xi\_n(t) \, p\_{j,n}(t|t\_0) = \xi(t) \left[1 - p\_{j,-1}(t|t\_0) - p\_{j,-2}(t|t\_0)\right],$$

so that the first two equations of system (3) become:

$$\begin{split} \frac{dp\_{j,-2}(t|t\_0)}{dt} &= \xi(t) \left[1 - p\_{j,-1}(t|t\_0)\right] - \left[\xi(t) + \varrho(t)\right] p\_{j,-2}(t|t\_0), \\ \frac{dp\_{j,-1}(t|t\_0)}{dt} &= -\nu(t) \, p\_{j,-1}(t|t\_0) + \varrho(t) \, p\_{j,-2}(t|t\_0), \end{split} \tag{7}$$

to solve with the initial conditions

$$\lim\_{t \downarrow t\_0} p\_{j,-2}(t|t\_0) = \delta\_{j,-2}, \qquad \lim\_{t \downarrow t\_0} p\_{j,-1}(t|t\_0) = \delta\_{j,-1}.\tag{8}$$

Furthermore, if *ξn*(*t*) = *ξ*(*t*) for *n* = 0, 1, . . . , ℓ and *t* ≥ *t*0, by virtue of (3), one has that the probability P*j*(*t*|*t*0) satisfies the following differential equation

$$\frac{d\mathcal{P}\_j(t|t\_0)}{dt} = -\xi(t)\,\mathcal{P}\_j(t|t\_0) + \nu(t)\,p\_{j,-1}(t|t\_0) \tag{9}$$

to solve with the initial condition

$$\lim\_{t \downarrow t\_0} \mathcal{P}\_j(t|t\_0) = 1 - \delta\_{j,-2} - \delta\_{j,-1}.\tag{10}$$

Equation (9) shows that the probability that the system is in an operating state at time *t* does not depend on the intensity functions *λn*(*t*) and *μn*(*t*) related to the birth-death process without failures and repairs.

#### **3. Proportional Intensity Functions of Failures, Repairs and Restores**

We assume that

$$\begin{aligned} \varrho(t) &= \varrho \, \varphi(t), \quad \xi\_n(t) = \xi \, \varphi(t), \quad \gamma\_n(t) = \gamma\_n \, \varphi(t), \quad n = 0, 1, \ldots, \ell, \\ \nu(t) &= \left(\gamma\_0 + \gamma\_1 + \ldots + \gamma\_\ell\right) \varphi(t), \end{aligned} \tag{11}$$

where *ϕ*(*t*) is a positive, bounded and continuous function for *t* ≥ 0. We denote by

$$\Phi(t|t\_0) = \int\_{t\_0}^{t} \varphi(u) \, du, \qquad t \ge t\_0 \tag{12}$$

and we assume that lim*t*→+∞ Φ(*t*|*t*0) = +∞.

#### *3.1. Asymptotic Behavior of the System*

Let

$$q\_n = \lim\_{t \to +\infty} p\_{j,n}(t|t\_0), \qquad j, n \in \mathcal{S}, \qquad Q = \sum\_{n=0}^{\ell} q\_n = 1 - q\_{-2} - q\_{-1} \tag{13}$$

be the steady-state probabilities of the considered system.

**Proposition 1.** *Under the assumptions (11), one has:*

$$q\_{-2} = \frac{\nu \, \xi}{\nu \varrho + \nu \xi + \varrho \xi}, \qquad q\_{-1} = \frac{\varrho \, \xi}{\nu \varrho + \nu \xi + \varrho \xi}, \qquad Q = \frac{\nu \, \varrho}{\nu \varrho + \nu \xi + \varrho \xi}.\tag{14}$$

**Proof.** It follows from (7), by taking the limit as *t* → +∞.

Note that the last identity in (14) is the probability that the system is in an operating state *n* = 0, 1, . . . , ℓ in the equilibrium regime.
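As an illustrative check (not part of the original paper), the steady-state probabilities (14) can be compared with a long-run integration of system (7); the constant values below are arbitrary, with ϕ(*t*) = 1.

```python
# Sketch: steady state (14) versus a long-run RK4 integration of system (7),
# with phi(t) = 1 and arbitrary constant intensities.
xi, rho, nu = 1.0, 0.6, 4.0
den = nu * rho + nu * xi + rho * xi
q_m2, q_m1, Q = nu * xi / den, rho * xi / den, nu * rho / den   # Equation (14)

def f(p2, p1):
    """Right-hand side of (7) for (p_{j,-2}, p_{j,-1})."""
    return xi * (1.0 - p1) - (xi + rho) * p2, -nu * p1 + rho * p2

p2, p1 = 1.0, 0.0                        # start in the failure state j = -2
h = 1e-3
for _ in range(60000):                   # integrate up to t = 60
    a2, a1 = f(p2, p1)
    b2, b1 = f(p2 + h / 2 * a2, p1 + h / 2 * a1)
    c2, c1 = f(p2 + h / 2 * b2, p1 + h / 2 * b1)
    d2, d1 = f(p2 + h * c2, p1 + h * c1)
    p2 += h / 6 * (a2 + 2 * b2 + 2 * c2 + d2)
    p1 += h / 6 * (a1 + 2 * b1 + 2 * c1 + d1)

print(abs(p2 - q_m2) < 1e-6 and abs(p1 - q_m1) < 1e-6)
print(abs(q_m2 + q_m1 + Q - 1.0) < 1e-12)
```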

#### *3.2. Transient Behavior of the System*

To determine the transient solution of system (7) with initial conditions (8), we denote by *x*<sup>1</sup> and *x*<sup>2</sup> the solutions of the following equation:

$$x^2 + (\nu + \varrho + \xi)\, x + \nu\varrho + \nu\,\xi + \varrho\,\xi = 0$$

and set

$$
\Delta = \left(\nu - \varrho - \xi\right)^2 - 4\varrho\,\xi. \tag{15}
$$

Since *x*1 + *x*2 = −(*ϱ* + *ξ* + *ν*) < 0 and *x*1*x*2 = *ν*(*ϱ* + *ξ*) + *ϱξ* > 0, for Δ ≥ 0 one has that *x*1 < 0 and *x*2 < 0.

**Proposition 2.** *Under the assumptions (11), for t* ≥ *t*<sup>0</sup> *the following results hold: (i) If* Δ > 0*,*

$$\begin{aligned} p\_{j,-2}(t|t\_0) &= q\_{-2} + \left[\delta\_{j,-2} - q\_{-2}\right] Z\_1(t|t\_0) + \left[\xi(1-\delta\_{j,-1}) - (\xi+\varrho)\delta\_{j,-2}\right] Z\_2(t|t\_0), \\ p\_{j,-1}(t|t\_0) &= q\_{-1} + \left[\delta\_{j,-1} - q\_{-1}\right] Z\_1(t|t\_0) + \left[-\nu\delta\_{j,-1} + \varrho\delta\_{j,-2}\right] Z\_2(t|t\_0), \\ \mathcal{P}\_j(t|t\_0) &= Q + \left[1 - Q - \delta\_{j,-2} - \delta\_{j,-1}\right] Z\_1(t|t\_0) + \left[(\xi+\nu)\delta\_{j,-1} - \xi(1-\delta\_{j,-2})\right] Z\_2(t|t\_0), \end{aligned}$$

*with*

$$Z\_1(t|t\_0) = \frac{x\_1 e^{x\_2 \Phi(t|t\_0)} - x\_2 e^{x\_1 \Phi(t|t\_0)}}{x\_1 - x\_2}, \qquad Z\_2(t|t\_0) = \frac{e^{x\_1 \Phi(t|t\_0)} - e^{x\_2 \Phi(t|t\_0)}}{x\_1 - x\_2}.$$

*(ii) If* Δ = 0*,*

$$\begin{split} p\_{j,-2}(t|t\_0) &= q\_{-2} + e^{x\_1 \Phi(t|t\_0)} \Big\{ \delta\_{j,-2} - q\_{-2} + \Phi(t|t\_0) \big[ \xi(1 - \delta\_{j,-1}) - (\xi + \varrho)\, \delta\_{j,-2} - x\_1 \left( \delta\_{j,-2} - q\_{-2} \right) \big] \Big\}, \\ p\_{j,-1}(t|t\_0) &= q\_{-1} + e^{x\_1 \Phi(t|t\_0)} \Big\{ \delta\_{j,-1} - q\_{-1} + \Phi(t|t\_0) \big[ -\nu\, \delta\_{j,-1} + \varrho\, \delta\_{j,-2} - x\_1 \left( \delta\_{j,-1} - q\_{-1} \right) \big] \Big\}, \\ \mathcal{P}\_j(t|t\_0) &= Q + e^{x\_1 \Phi(t|t\_0)} \Big\{ 1 - Q - \delta\_{j,-2} - \delta\_{j,-1} + \Phi(t|t\_0) \big[ (\xi + \nu)\, \delta\_{j,-1} - \xi(1 - \delta\_{j,-2}) - x\_1 \left( 1 - Q - \delta\_{j,-2} - \delta\_{j,-1} \right) \big] \Big\}. \end{split}$$

*(iii) If* Δ < 0*,*

$$\begin{split} p\_{j,-2}(t|t\_{0}) &= q\_{-2} + e^{a\Phi(t|t\_{0})} \Big\{ (\delta\_{j,-2} - q\_{-2}) \cos[b\,\Phi(t|t\_{0})] \\ &\quad + \frac{1}{b} \big[ -a\left(\delta\_{j,-2} - q\_{-2}\right) - (\xi + \varrho)\delta\_{j,-2} + \xi\,(1 - \delta\_{j,-1}) \big] \sin[b\,\Phi(t|t\_{0})] \Big\}, \\ p\_{j,-1}(t|t\_{0}) &= q\_{-1} + e^{a\Phi(t|t\_{0})} \Big\{ (\delta\_{j,-1} - q\_{-1}) \cos[b\,\Phi(t|t\_{0})] \\ &\quad + \frac{1}{b} \big[ -a\left(\delta\_{j,-1} - q\_{-1}\right) - \nu\delta\_{j,-1} + \varrho\delta\_{j,-2} \big] \sin[b\,\Phi(t|t\_{0})] \Big\}, \\ \mathcal{P}\_{j}(t|t\_{0}) &= Q + e^{a\Phi(t|t\_{0})} \Big\{ (1 - Q - \delta\_{j,-2} - \delta\_{j,-1}) \cos[b\,\Phi(t|t\_{0})] \\ &\quad + \frac{1}{b} \big[ -a\left(1 - Q - \delta\_{j,-2} - \delta\_{j,-1}\right) + (\xi + \nu)\delta\_{j,-1} - \xi\left(1 - \delta\_{j,-2}\right) \big] \sin[b\,\Phi(t|t\_{0})] \Big\}, \end{split}$$

*where*

$$a = -\frac{\nu + \varrho + \xi}{2}, \qquad b = \frac{\sqrt{4\varrho\,\xi - (\nu - \varrho - \xi)^2}}{2}.$$

**Proof.** From (7), with conditions (8), one has that *pj*,−2(*t*|*t*0) is the solution of the second-order differential equation

$$\begin{split} \frac{1}{\varphi(t)} \frac{d}{dt} \left[ \frac{1}{\varphi(t)} \frac{d p\_{j,-2}(t|t\_0)}{dt} \right] + (\varrho + \xi + \nu) \frac{1}{\varphi(t)} \frac{d p\_{j,-2}(t|t\_0)}{dt} \\ + \left[ \nu \left( \varrho + \xi \right) + \varrho\,\xi \right] p\_{j,-2}(t|t\_0) - \nu\,\xi = 0, \end{split} \tag{16}$$

to solve with the initial conditions:

$$\lim\_{t \downarrow t\_0} p\_{j,-2}(t|t\_0) = \delta\_{j,-2}, \qquad \lim\_{t \downarrow t\_0} \left[ \frac{1}{\varphi(t)} \frac{d p\_{j,-2}(t|t\_0)}{dt} \right] = \left(1 - \delta\_{j,-1}\right)\xi - \left(\xi + \varrho\right)\delta\_{j,-2}.\tag{17}$$

Similarly, for *pj*,−1(*t*|*t*0) one has

$$\begin{split} \frac{1}{\varphi(t)} \frac{d}{dt} \left[ \frac{1}{\varphi(t)} \frac{d p\_{j,-1}(t|t\_0)}{dt} \right] + (\varrho + \xi + \nu) \frac{1}{\varphi(t)} \frac{d p\_{j,-1}(t|t\_0)}{dt} \\ + \left[ \nu \left( \varrho + \xi \right) + \varrho\,\xi \right] p\_{j,-1}(t|t\_0) - \varrho\,\xi = 0, \end{split} \tag{18}$$

to solve with the initial conditions:

$$\lim\_{t \downarrow t\_0} p\_{j,-1}(t|t\_0) = \delta\_{j,-1}, \qquad \lim\_{t \downarrow t\_0} \left[ \frac{1}{\varphi(t)} \frac{d p\_{j,-1}(t|t\_0)}{dt} \right] = -\nu \,\delta\_{j,-1} + \varrho \,\delta\_{j,-2}.\tag{19}$$

The results follow by solving (16) and (18) with standard techniques, with the initial conditions (17) and (19), respectively; then, recalling Equation (6), one determines P*j*(*t*|*t*0).

In Figures 2–4 the probabilities *pj*,−1(*t*|0), *pj*,−2(*t*|0) and P*j*(*t*|0) are plotted for *ϕ*(*t*) = 1, *ξ* = 1, *ν* = 4 and some choices of the parameter *ϱ*. In particular, Δ = 3.36 in Figure 2, Δ = 0 in Figure 3 and Δ = −3.75 in Figure 4.

**Figure 2.** The probabilities *pj*,−1(*t*|0), *pj*,−2(*t*|0) and P*j*(*t*|0) are plotted for *ϕ*(*t*) = 1 and for *ξ* = 1.0, *ϱ* = 0.6, *ν* = 4.0. In (**a**) *j* = −2 (failure state) and in (**b**) *j* = −1 (repair state).

**Figure 3.** As in Figure 2, for *ϕ*(*t*) = 1 and for *ξ* = 1.0, *ϱ* = 1.0, *ν* = 4.0. In (**a**) *j* = −2 (failure state) and in (**b**) *j* = −1 (repair state).

**Figure 4.** As in Figure 2, for *ϕ*(*t*) = 1 and for *ξ* = 1.0, *ϱ* = 1.5, *ν* = 4.0. In (**a**) *j* = −2 (failure state) and in (**b**) *j* = −1 (repair state).
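The closed-form expressions of Proposition 2 can be checked numerically. The sketch below (an illustration, not part of the original paper) treats case *(i)*, Δ > 0, with ϕ(*t*) = 1 and the parameter values of Figure 2, comparing the formulas with a direct integration of system (7).

```python
import math

# Sketch: Proposition 2, case Delta > 0, versus an RK4 integration of (7);
# phi(t) = 1 so Phi(t|0) = t; values as in Figure 2 (xi = 1, rho = 0.6, nu = 4).
xi, rho, nu = 1.0, 0.6, 4.0
den = nu * rho + nu * xi + rho * xi
q_m2, q_m1 = nu * xi / den, rho * xi / den

disc = (nu + rho + xi) ** 2 - 4 * den    # equals Delta in (15); here 3.36
x1 = (-(nu + rho + xi) + math.sqrt(disc)) / 2
x2 = (-(nu + rho + xi) - math.sqrt(disc)) / 2

def closed_form(t, d2, d1):
    """Proposition 2(i); d2 = delta_{j,-2}, d1 = delta_{j,-1}."""
    Z1 = (x1 * math.exp(x2 * t) - x2 * math.exp(x1 * t)) / (x1 - x2)
    Z2 = (math.exp(x1 * t) - math.exp(x2 * t)) / (x1 - x2)
    p2 = q_m2 + (d2 - q_m2) * Z1 + (xi * (1 - d1) - (xi + rho) * d2) * Z2
    p1 = q_m1 + (d1 - q_m1) * Z1 + (-nu * d1 + rho * d2) * Z2
    return p2, p1

def rk4(t, p2, p1, steps=4000):
    f = lambda a, b: (xi * (1 - b) - (xi + rho) * a, -nu * b + rho * a)
    h = t / steps
    for _ in range(steps):
        a = f(p2, p1)
        b = f(p2 + h / 2 * a[0], p1 + h / 2 * a[1])
        c = f(p2 + h / 2 * b[0], p1 + h / 2 * b[1])
        d = f(p2 + h * c[0], p1 + h * c[1])
        p2 += h / 6 * (a[0] + 2 * b[0] + 2 * c[0] + d[0])
        p1 += h / 6 * (a[1] + 2 * b[1] + 2 * c[1] + d[1])
    return p2, p1

cf = closed_form(1.0, 1.0, 0.0)          # start in the failure state j = -2
num = rk4(1.0, 1.0, 0.0)
print(abs(cf[0] - num[0]) < 1e-7 and abs(cf[1] - num[1]) < 1e-7)
```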

#### **4. Time of First Failure**

We denote by

$$\mathcal{T}\_{j,-2}(t\_0) = \inf\{t > t\_0 : N(t) = -2\}, \qquad j \in \{-1, 0, 1, \dots, \ell\} \tag{20}$$

the random variable that describes the time of first failure of the system, i.e., the time at which the chain enters the state *F* for the first time, starting from the state *j* ∈ {−1, 0, 1, . . . , ℓ} at time *t*0. Let

$$g\_{j,-2}(t|t\_0) = \frac{d}{dt}P(\mathcal{T}\_{j,-2}(t\_0) \le t | N(t\_0) = j), \qquad j \in \{-1, 0, 1, \dots, \ell\} \tag{21}$$

be the density of the time of first failure.

**Proposition 3.** *Under the assumptions (11), for j* ∈ {−1, 0, 1, . . . , ℓ} *one has*

$$g\_{j,-2}(t|t\_0) = \begin{cases} \xi\,\varphi(t)\, \dfrac{\nu\,\delta\_{j,-1}\, e^{-\nu\Phi(t|t\_0)} + \left[ \xi(1-\delta\_{j,-1}) - \nu \right] e^{-\xi\Phi(t|t\_0)}}{\xi-\nu}, & \nu \neq \xi, \\\\ \xi\,\varphi(t)\, e^{-\xi\Phi(t|t\_0)} \left[ 1-\delta\_{j,-1} + \xi\,\Phi(t|t\_0)\, \delta\_{j,-1} \right], & \nu = \xi. \end{cases} \tag{22}$$

**Proof.** We consider a time-inhomogeneous Markov process {*N̂*(*t*), *t* ≥ *t*0} with state space S obtained from *N*(*t*) by making the state −2, corresponding to the failure state *F* of the system, absorbing, and we denote by

$$
\hat{p}\_{j,n}(t|t\_0) = P\{\hat{N}(t) = n | \hat{N}(t\_0) = j\}, \qquad j, n \in \mathcal{S}.\tag{23}
$$

the probability that the system is in state *n* at time *t* and that no failure has yet occurred. Since

$$P\{\mathcal{T}\_{j,-2}(t\_0) \le t\} + \hat{p}\_{j,-1}(t|t\_0) + \sum\_{n=0}^{\ell} \hat{p}\_{j,n}(t|t\_0) = 1, \qquad t \ge t\_0,$$

one has *P*{T*j*,−2(*t*0) ≤ *t*} = *p̂j*,−2(*t*|*t*0), so that for *t* ≥ *t*0 one has

$$g\_{j,-2}(t|t\_0) = \frac{d}{dt}\hat{p}\_{j,-2}(t|t\_0), \qquad j \in \{-1, 0, 1, \dots, \ell\}.\tag{24}$$

Hence, to determine the density of the time of first failure, it is necessary to consider the following differential equations

$$\begin{split} \frac{d\hat{p}\_{j,-2}(t|t\_0)}{dt} &= \xi \, \varphi(t) \left[1 - \hat{p}\_{j,-1}(t|t\_0) - \hat{p}\_{j,-2}(t|t\_0)\right], \\ \frac{d\hat{p}\_{j,-1}(t|t\_0)}{dt} &= -\nu \, \varphi(t) \, \hat{p}\_{j,-1}(t|t\_0), \\ \frac{d\hat{p}\_{j,0}(t|t\_0)}{dt} &= \gamma\_0 \, \varphi(t) \, \hat{p}\_{j,-1}(t|t\_0) - \left[\lambda\_0(t) + \xi \, \varphi(t)\right] \hat{p}\_{j,0}(t|t\_0) + \mu\_1(t) \, \hat{p}\_{j,1}(t|t\_0), \\ \frac{d\hat{p}\_{j,n}(t|t\_0)}{dt} &= \gamma\_n \, \varphi(t) \, \hat{p}\_{j,-1}(t|t\_0) + \lambda\_{n-1}(t) \, \hat{p}\_{j,n-1}(t|t\_0) \\ &\quad - \left[\lambda\_n(t) + \mu\_n(t) + \xi \, \varphi(t)\right] \hat{p}\_{j,n}(t|t\_0) + \mu\_{n+1}(t) \, \hat{p}\_{j,n+1}(t|t\_0), \qquad n = 1, 2, \ldots, \ell - 1, \\ \frac{d\hat{p}\_{j,\ell}(t|t\_0)}{dt} &= \gamma\_\ell \, \varphi(t) \, \hat{p}\_{j,-1}(t|t\_0) + \lambda\_{\ell-1}(t) \, \hat{p}\_{j,\ell-1}(t|t\_0) - \left[\mu\_\ell(t) + \xi \, \varphi(t)\right] \hat{p}\_{j,\ell}(t|t\_0), \end{split} \tag{25}$$

to solve with the initial conditions

$$\lim\_{t \downarrow t\_0} \widehat{p}\_{j,n}(t|t\_0) = \delta\_{j,n}, \quad j, n \in \mathcal{S}, j \neq -2, \qquad \lim\_{t \downarrow t\_0} \widehat{p}\_{-2,n}(t|t\_0) = 0, \quad n \in \mathcal{S}. \tag{26}$$

Proceeding as in Proposition 2, one has:

$$\hat{p}\_{j,-2}(t|t\_0) = \begin{cases} \dfrac{\xi \left[1 - e^{-\nu\Phi(t|t\_0)}\right] - \nu \left[1 - e^{-\xi\Phi(t|t\_0)}\right] + \xi\left(1 - \delta\_{j,-1}\right) \left[e^{-\nu\Phi(t|t\_0)} - e^{-\xi\Phi(t|t\_0)}\right]}{\xi - \nu}, & \nu \neq \xi, \\\\ 1 - e^{-\xi\Phi(t|t\_0)} \left[1 + \xi\, \Phi(t|t\_0)\, \delta\_{j,-1}\right], & \nu = \xi, \end{cases} \tag{27}$$

so that, by virtue of (24), Equation (22) holds.

From (22) it follows that *P*{T*j*,−2(*t*0) < +∞} = 1, so that the system is certain to fail eventually. By virtue of (24), for *j* ∈ {−1, 0, 1, . . . , ℓ} the reliability of the system, i.e., the probability that no failure has occurred up to time *t*, is

$$\begin{split} P\{\mathcal{T}\_{j,-2}(t\_{0}) > t\} &= \int\_{t}^{+\infty} g\_{j,-2}(\tau|t\_{0}) \, d\tau = \int\_{t}^{+\infty} \frac{d}{d\tau} \hat{p}\_{j,-2}(\tau|t\_{0}) \, d\tau = 1 - \hat{p}\_{j,-2}(t|t\_{0}) \\ &= \begin{cases} \dfrac{\xi\,\delta\_{j,-1}\, e^{-\nu\Phi(t|t\_{0})} + \left[\xi\,(1-\delta\_{j,-1}) - \nu\right] e^{-\xi\Phi(t|t\_{0})}}{\xi-\nu}, & \nu \neq \xi, \\\\ \left[ 1 + \xi\,\Phi(t|t\_{0})\,\delta\_{j,-1} \right] e^{-\xi\Phi(t|t\_{0})}, & \nu = \xi. \end{cases} \end{split} \tag{28}$$

Hence, for *j* ∈ {−1, 0, 1, . . . , ℓ} the mean time to first failure is

$$\begin{split} \mathrm{E}\left[\mathcal{T}\_{j,-2}(t\_{0})\right] &= \int\_{t\_{0}}^{+\infty} (t - t\_{0}) \, g\_{j,-2}(t|t\_{0}) \, dt = \int\_{t\_{0}}^{+\infty} P\left\{\mathcal{T}\_{j,-2}(t\_{0}) > t\right\} dt \\ &= \begin{cases} \dfrac{\xi\,\delta\_{j,-1}}{\xi-\nu} \displaystyle\int\_{t\_{0}}^{+\infty} e^{-\nu\Phi(t|t\_{0})}\, dt + \dfrac{\xi\,(1 - \delta\_{j,-1}) - \nu}{\xi-\nu} \displaystyle\int\_{t\_{0}}^{+\infty} e^{-\xi\Phi(t|t\_{0})}\, dt, & \nu \neq \xi, \\\\ \displaystyle\int\_{t\_{0}}^{+\infty} e^{-\xi\Phi(t|t\_{0})} \left[1 + \xi\,\delta\_{j,-1}\,\Phi(t|t\_{0})\right] dt, & \nu = \xi. \end{cases} \end{split} \tag{29}$$

In particular, by setting *ϕ*(*t*) = 1, Equation (29) leads to

$$E[\mathcal{T}\_{j,-2}] = \frac{1}{\nu}\,\delta\_{j,-1} + \frac{1}{\xi}, \qquad j \in \{-1, 0, 1, \dots, \ell\}.$$

In Figure 5 the density of the time of first failure is plotted for *ϕ*(*t*) = 1, *ξ* = 1.0, *ϱ* = 0.6, *ν* = 4.0. If *j* = −1 one has *E*[T−1,−2] = 1.25, whereas *E*[T*j*,−2] = 1 if *j* is an operating state.

**Figure 5.** The density of the time of first failure is plotted for *ϕ*(*t*) = 1 and for *ξ* = 1.0, *ϱ* = 0.6, *ν* = 4.0.
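As an illustrative check (not part of the original paper), the mean time to first failure can be recovered by quadrature of the survival function (28) for ϕ(*t*) = 1 and the parameters of Figure 5, and compared with the closed form E[T*j*,−2] = δ*j*,−1/ν + 1/ξ.

```python
import math

# Sketch: mean time to first failure for phi(t) = 1, checking E[T_{j,-2}] =
# delta_{j,-1}/nu + 1/xi against quadrature of (28); values as in Figure 5.
xi, nu = 1.0, 4.0                        # here nu != xi

def survival(t, d1):
    """Equation (28) with Phi(t|0) = t; d1 = delta_{j,-1}."""
    return (xi * d1 * math.exp(-nu * t)
            + (xi * (1 - d1) - nu) * math.exp(-xi * t)) / (xi - nu)

def mean_time(d1, T=80.0, n=200000):
    """Trapezoidal quadrature of the survival function, as in (29)."""
    h = T / n
    s = 0.5 * (survival(0.0, d1) + survival(T, d1))
    s += sum(survival(k * h, d1) for k in range(1, n))
    return h * s

print(abs(mean_time(1.0) - (1 / nu + 1 / xi)) < 1e-4)   # j = -1: E[T] = 1.25
print(abs(mean_time(0.0) - 1 / xi) < 1e-4)              # operating j: E[T] = 1
```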

#### **5. Operating States and Their Probabilities**

For the birth-death chain {*N*(*t*), *t* ≥ *t*0}, in addition to the assumptions (11), we suppose that the birth and death intensity functions are

$$
\lambda\_n(t) = (\ell - n)\,\lambda(t), \quad n = 0, 1, \ldots, \ell; \qquad \mu\_n(t) = n\,\mu(t), \qquad n = 1, \ldots, \ell,\tag{30}
$$

with *λ*(*t*) and *μ*(*t*) positive, bounded and continuous functions for *t* ≥ 0. Note that the birth-death intensity functions (30) define a time-inhomogeneous Prendiville process {*Ñ*(*t*), *t* ≥ *t*0} with finite state space {0, 1, . . . , ℓ}. The process *Ñ*(*t*) coincides with the process *N*(*t*) in the absence of failures, repairs and restores.

Under the assumptions (11) and (30), the transition probabilities of *N*(*t*) satisfy the following system:

$$\begin{split} \frac{dp\_{j,0}(t|t\_0)}{dt} &= \gamma\_0 \,\varphi(t) \, p\_{j,-1}(t|t\_0) - \left[\ell \,\lambda(t) + \xi \,\varphi(t)\right] p\_{j,0}(t|t\_0) + \mu(t) \, p\_{j,1}(t|t\_0), \\ \frac{dp\_{j,n}(t|t\_0)}{dt} &= \gamma\_n \,\varphi(t) \, p\_{j,-1}(t|t\_0) + \lambda(t) \, (\ell - n + 1) \, p\_{j,n-1}(t|t\_0) \\ &\quad - \left[\lambda(t) \, (\ell - n) + \mu(t) \, n + \xi \,\varphi(t)\right] p\_{j,n}(t|t\_0) + \mu(t) \, (n+1) \, p\_{j,n+1}(t|t\_0), \\ &\hspace{6cm} n = 1, 2, \ldots, \ell - 1, \\ \frac{dp\_{j,\ell}(t|t\_0)}{dt} &= \gamma\_{\ell} \, \varphi(t) \, p\_{j,-1}(t|t\_0) + \lambda(t) \, p\_{j,\ell-1}(t|t\_0) - \left[\ell \, \mu(t) + \xi \,\varphi(t)\right] p\_{j,\ell}(t|t\_0), \end{split} \tag{31}$$

to solve with the initial conditions

$$\lim\_{t \downarrow t\_0} p\_{j,n}(t|t\_0) = \delta\_{j,n}, \qquad j \in \mathcal{S}, n \in \{0, 1, \ldots, \ell\}. \tag{32}$$

Let

$$G\_j(z, t) = \sum\_{n=0}^{\ell} z^n \, p\_{j,n}(t|t\_0), \qquad j \in \mathcal{S} \tag{33}$$

be the probability generating function (PGF) of the operating states of *N*(*t*). From (31) one has:

$$\begin{aligned} \frac{\partial}{\partial t} G\_j(z,t) &+ (z-1) \left[\lambda(t)\,z + \mu(t)\right] \frac{\partial}{\partial z} G\_j(z,t) \\ &= \left[\ell\left(z-1\right)\lambda(t) - \xi\,\varphi(t)\right] G\_j(z,t) + \varphi(t)\,p\_{j,-1}(t|t\_0) \sum\_{i=0}^{\ell} \gamma\_i \, z^i, \qquad j \in \mathcal{S}, \end{aligned} \tag{34}$$

to solve with the conditions

$$G\_j(z, t\_0) = \sum\_{n=0}^{\ell} \delta\_{j,n}\, z^n = \begin{cases} 0, & j = -1, -2, \\ z^j, & j \in \{0, 1, \dots, \ell\}, \end{cases} \qquad G\_j(1, t) = \mathcal{P}\_j(t|t\_0) = 1 - p\_{j,-2}(t|t\_0) - p\_{j,-1}(t|t\_0). \tag{35}$$

**Proposition 4.** *Under the assumptions (11) and (30), the PGF of the operating states of N*(*t*) *is*

$$\begin{split} G\_j(z,t) &= e^{-\xi\,\Phi(t|t\_0)} \sum\_{i=0}^{\ell} \delta\_{j,i} \left[1 + (z-1)\,b\_1(t|t\_0)\right]^i \left[1 + (z-1)\,b\_2(t|t\_0)\right]^{\ell-i} \\ &\quad + \int\_{t\_0}^t du \;\varphi(u) \, p\_{j,-1}(u|t\_0)\, e^{-\xi\,\Phi(t|u)} \left[\frac{1 + (z-1)\,b\_2(t|t\_0)}{1 + (z-1)\,b\_2(u|t\_0)}\right]^{\ell} \\ &\quad \times \sum\_{i=0}^{\ell} \gamma\_i \left[\frac{1 + (z-1)\,b\_1(t|u)}{1 + (z-1)\,b\_2(t|u)}\right]^i, \qquad j \in \mathcal{S}, \end{split} \tag{36}$$

*where* Φ(*t*|*t*0) *is given in (12) and where*

$$b\_1(t|t\_0) = e^{-\left[\Lambda(t|t\_0) + M(t|t\_0)\right]} \left[1 + B(t|t\_0)\right], \qquad b\_2(t|t\_0) = e^{-\left[\Lambda(t|t\_0) + M(t|t\_0)\right]} B(t|t\_0), \tag{37}$$

*with*

$$
\Lambda(t|t\_0) = \int\_{t\_0}^t \lambda(\tau) \, d\tau, \quad M(t|t\_0) = \int\_{t\_0}^t \mu(\tau) \, d\tau, \quad B(t|t\_0) = \int\_{t\_0}^t \lambda(\tau) \, e^{\Lambda(\tau|t\_0) + M(\tau|t\_0)} \, d\tau. \tag{38}
$$

**Proof.** The proof is given in Appendix A.

We remark that 0 ≤ *b*1(*t*|*t*0) ≤ 1 and 0 ≤ *b*2(*t*|*t*0) ≤ 1 for all *t* ≥ *t*0. Furthermore, we note that the function

$$\tilde{G}\_{i}(z,t) = \left[1 + (z-1)\,b\_{1}(t|t\_{0})\right]^{i} \left[1 + (z-1)\,b\_{2}(t|t\_{0})\right]^{\ell-i}, \qquad i \in \{0, 1, \ldots, \ell\},\tag{39}$$

which appears on the right-hand side of (36), is the PGF of the time-inhomogeneous Prendiville process *Ñ*(*t*), characterized by the birth-death intensity functions *λn*(*t*) and *μn*(*t*) given in (30). The transition probabilities of *Ñ*(*t*) are (cf. Zheng [43], Giorno and Nobile [49]):

$$\begin{split} \tilde{p}\_{0,n}(t|t\_0) &= \binom{\ell}{n} [b\_2(t|t\_0)]^n \left[1 - b\_2(t|t\_0)\right]^{\ell-n}, \\ \tilde{p}\_{i,n}(t|t\_0) &= [b\_1(t|t\_0)]^n \left[1 - b\_2(t|t\_0)\right]^{\ell-i} \left[1 - b\_1(t|t\_0)\right]^{i-n} \\ &\times \sum\_{r=\max(0,n-i)}^{\min(\ell-i,n)} \binom{\ell-i}{r} \binom{i}{n-r} \left\{\frac{b\_2(t|t\_0)}{b\_1(t|t\_0)} \frac{1 - b\_1(t|t\_0)}{1 - b\_2(t|t\_0)}\right\}^r, \quad i = 1, 2, \dots, \ell-1, \end{split} \tag{40}$$
 
$$\tilde{p}\_{\ell,n}(t|t\_0) = \binom{\ell}{n} [b\_1(t|t\_0)]^n \left[1 - b\_1(t|t\_0)\right]^{\ell-n}.$$

and the conditional mean and the conditional variance are:

$$\begin{aligned} \mathrm{E}[\tilde{N}(t)|\tilde{N}(t\_0) = i] &= i \, b\_1(t|t\_0) + (\ell - i) \, b\_2(t|t\_0), \\ \mathrm{Var}[\tilde{N}(t)|\tilde{N}(t\_0) = i] &= i \, b\_1(t|t\_0) \left[1 - b\_1(t|t\_0)\right] + (\ell - i) \, b\_2(t|t\_0) \left[1 - b\_2(t|t\_0)\right]. \end{aligned} \tag{41}$$
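The transition law (40) and the moments (41) can be cross-checked numerically. The sketch below (an illustration, not part of the original paper) takes constant rates λ(*t*) = λ and μ(*t*) = μ with arbitrary values, so that Λ, M and B in (38) have elementary closed forms.

```python
from math import comb, exp

# Sketch: Prendiville transition probabilities (40) for constant rates,
# checked for normalization and against the conditional mean (41).
lam, mu, ell, t = 0.7, 1.1, 5, 0.9       # arbitrary values
s = lam + mu
B = lam * (exp(s * t) - 1.0) / s         # B(t|0) from (38), constant rates
b1 = exp(-s * t) * (1.0 + B)             # Equation (37)
b2 = exp(-s * t) * B

def p_tilde(i, n):
    """Equation (40)."""
    if i == 0:
        return comb(ell, n) * b2 ** n * (1 - b2) ** (ell - n)
    if i == ell:
        return comb(ell, n) * b1 ** n * (1 - b1) ** (ell - n)
    w = (b2 / b1) * (1 - b1) / (1 - b2)
    core = sum(comb(ell - i, r) * comb(i, n - r) * w ** r
               for r in range(max(0, n - i), min(ell - i, n) + 1))
    return b1 ** n * (1 - b2) ** (ell - i) * (1 - b1) ** (i - n) * core

for i in range(ell + 1):
    row = [p_tilde(i, n) for n in range(ell + 1)]
    mean = sum(n * p for n, p in enumerate(row))
    assert abs(sum(row) - 1.0) < 1e-12               # each row is a distribution
    assert abs(mean - (i * b1 + (ell - i) * b2)) < 1e-12  # matches (41)
print("ok")
```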

Under the assumptions (11) and (30), the probability that the system *N*(*t*) is in the state *n* = 0 at time *t* can be determined by evaluating (36) at *z* = 0:

$$\begin{split} p\_{j,0}(t|t\_0) = G\_j(0,t) &= e^{-\xi\,\Phi(t|t\_0)} \sum\_{i=0}^{\ell} \delta\_{j,i}\, \widetilde{p}\_{i,0}(t|t\_0) \\ &\quad + \sum\_{i=0}^{\ell} \gamma\_i \int\_{t\_0}^{t} du \, \varphi(u) \, p\_{j,-1}(u|t\_0) \, e^{-\xi\,\Phi(t|u)} \left[ \frac{1 - b\_2(t|t\_0)}{1 - b\_2(u|t\_0)} \right]^{\ell} \left[ \frac{1 - b\_1(t|u)}{1 - b\_2(t|u)} \right]^i, \quad j \in \mathcal{S}, \end{split} \tag{42}$$

where

$$\widetilde{p}\_{i,0}(t|t\_0) = \left[1 - b\_1(t|t\_0)\right]^i \left[1 - b\_2(t|t\_0)\right]^{\ell - i}$$

is obtained from (40). Similarly, the probability that the system *N*(*t*) is in the state *n* = 1 at time *t* follows from (36):

$$\begin{split} p\_{j,1}(t|t\_{0}) &= \frac{dG\_{j}(z,t)}{dz}\Big|\_{z=0} = e^{-\xi\,\Phi(t|t\_{0})}\sum\_{i=0}^{\ell}\delta\_{j,i}\,\widetilde{p}\_{i,1}(t|t\_{0}) \\ &\quad + \int\_{t\_{0}}^{t} du \,\varphi(u) \, p\_{j,-1}(u|t\_{0}) \, e^{-\xi\,\Phi(t|u)} \left[\frac{1-b\_{2}(t|t\_{0})}{1-b\_{2}(u|t\_{0})}\right]^{\ell-1} \left\{\frac{\ell\left[b\_{2}(t|t\_{0})-b\_{2}(u|t\_{0})\right]}{[1-b\_{2}(u|t\_{0})]^{2}} \right. \\ &\quad \times \sum\_{i=0}^{\ell}\gamma\_{i}\left[\frac{1-b\_{1}(t|u)}{1-b\_{2}(t|u)}\right]^{i} + \left. e^{-\left[\Lambda(t|u)+M(t|u)\right]}\,\frac{1-b\_{2}(t|t\_{0})}{1-b\_{2}(u|t\_{0})}\sum\_{i=0}^{\ell} i\,\gamma\_{i}\,\frac{[1-b\_{1}(t|u)]^{i-1}}{[1-b\_{2}(t|u)]^{i+1}}\right\}, \end{split} \tag{43}$$

where, by virtue of (40), one has:

$$\begin{aligned} \widetilde{p}\_{i,1}(t|t\_0) &= \left[1 - b\_1(t|t\_0)\right]^{i-1} \left[1 - b\_2(t|t\_0)\right]^{\ell-i-1} \\ &\times \left\{i b\_1(t|t\_0) \left[1 - b\_2(t|t\_0)\right] + (\ell - i) \, b\_2(t|t\_0) \left[1 - b\_1(t|t\_0)\right] \right\}. \end{aligned}$$

For *r* ∈ ℕ, let us introduce the *r*-th conditional moment of *N*(*t*):

$$\mathrm{E}[N^r(t)|N(t)\geq 0, N(t\_0) = j] = \frac{1}{\mathcal{P}\_j(t|t\_0)} \sum\_{n=0}^{\ell} n^r \, p\_{j,n}(t|t\_0), \qquad j \in \mathcal{S}. \tag{44}$$

From (36), we have

$$\begin{split} \mathrm{E}[N(t)|N(t)\geq 0, N(t\_{0})=j] &= \frac{1}{\mathcal{P}\_{j}(t|t\_{0})} \frac{dG\_{j}(z,t)}{dz}\Big|\_{z=1} \\ &= \frac{1}{\mathcal{P}\_{j}(t|t\_{0})} \left[e^{-\xi\,\Phi(t|t\_{0})} \sum\_{i=0}^{\ell} \delta\_{j,i}\, \mathrm{E}[\tilde{N}(t)|\tilde{N}(t\_{0})=i] + \int\_{t\_{0}}^{t} du \,\varphi(u) \, p\_{j,-1}(u|t\_{0}) \, e^{-\xi\,\Phi(t|u)} \right. \\ &\quad \times \left. \left\{\ell\,\nu \left[b\_{2}(t|t\_{0})-b\_{2}(u|t\_{0})\right] + e^{-\left[\Lambda(t|u)+M(t|u)\right]} \sum\_{i=0}^{\ell} i \,\gamma\_{i}\right\}\right], \qquad j \in \mathcal{S}, \end{split} \tag{45}$$

where E[*Ñ*(*t*)|*Ñ*(*t*0) = *i*] is given in (41).

#### **6. Asymptotic Distribution of Operating States**

To study the asymptotic behavior of the probabilities for the operating states, we assume that the intensity functions of *N*(*t*) are proportional. Specifically, in addition to the conditions (11), we suppose that

$$
\lambda\_n(t) = (\ell - n)\,\lambda\,\varphi(t), \quad n = 0, 1, \ldots, \ell; \qquad \mu\_n(t) = n\,\mu\,\varphi(t), \quad n = 1, \ldots, \ell,\tag{46}
$$

with *ϕ*(*t*) a positive, bounded and continuous function for *t* ≥ 0. Let

$$G(z) = \sum\_{n=0}^{\ell} z^n q\_n \tag{47}$$

be the asymptotic PGF of the operating states of *N*(*t*). From (34) one has

$$(z-1)\left[\lambda\,z+\mu\right]\frac{dG(z)}{dz} = \left[\ell\left(z-1\right)\lambda - \xi\right]G(z) + q\_{-1}\sum\_{i=0}^{\ell}\gamma\_i\, z^i,\tag{48}$$

to solve with the condition

$$G(1) = Q = 1 - q\_{-2} - q\_{-1}.\tag{49}$$

**Proposition 5.** *Under the assumptions (11) and (46), the asymptotic PGF of the operating states is:*

$$\begin{split} G(z) &= (\lambda \, z + \mu)^{\xi/(\lambda + \mu) + \ell}\, (1 - z)^{-\xi/(\lambda + \mu)} \, q\_{-1} \\ &\quad \times \sum\_{i=0}^{\ell} \gamma\_i \int\_z^1 x^i (\lambda \, x + \mu)^{-\xi/(\lambda + \mu) - \ell - 1} (1 - x)^{\xi/(\lambda + \mu) - 1} \, dx. \end{split} \tag{50}$$

**Proof.** The general solution of the differential Equation (48) is:

$$\begin{split} G(z) &= (\lambda z + \mu)^{\xi/(\lambda+\mu)+\ell} \left( 1 - z \right)^{-\xi/(\lambda+\mu)} \\ &\quad \times \left[ -q\_{-1} \sum\_{i=0}^{\ell} \gamma\_{i} \int^{z} x^{i} (\lambda \, x + \mu)^{-\xi/(\lambda+\mu)-\ell-1} (1-x)^{\xi/(\lambda+\mu)-1} \, dx + c \right], \end{split} \tag{51}$$

where *c* is an arbitrary constant. Making use of the condition (49), we note that the term in square brackets on the right-hand side of (51) must vanish as *z* → 1, which determines the constant *c*. Hence, from (51) we obtain (50).

The knowledge of the asymptotic PGF (50) allows one to calculate the asymptotic probabilities of the operating states, as

$$q\_0 = G(0), \qquad q\_n = \frac{1}{n!} \frac{d^n G(z)}{dz^n} \Big|\_{z=0} \qquad n = 1, 2, \dots, \ell,\tag{52}$$

and the *r*-th asymptotic conditional moment of *N*(*t*):

$$\mathrm{E}[N^r|N \ge 0] = \frac{1}{Q} \sum\_{n=0}^{\ell} n^r\, q\_n, \qquad r \in \mathbb{N}.\tag{53}$$

**Proposition 6.** *Under the assumptions (11) and (46), one has:*

$$\begin{split} q\_{0} &= \frac{1}{\lambda + \mu} \left( \frac{\mu}{\lambda + \mu} \right)^{\xi/(\lambda + \mu) + \ell} q\_{-1} \sum\_{i=0}^{\ell} \gamma\_{i} \, B\!\left( i + 1, \frac{\xi}{\lambda + \mu} \right) \\ &\quad \times F\!\left( \frac{\xi}{\lambda + \mu}, \frac{\xi}{\lambda + \mu} + \ell + 1; \frac{\xi}{\lambda + \mu} + i + 1; \frac{\lambda}{\lambda + \mu} \right), \\ q\_{1} &= \frac{1}{\mu} \left( \lambda \, \ell + \xi \right) q\_{0} - \frac{\gamma\_{0}}{\mu}\, q\_{-1}, \\ q\_{2} &= \frac{1}{2 \mu^{2}} \left\{ \left( \lambda \, \ell + \xi \right) \left[ \lambda \left( \ell - 1 \right) + \xi \right] + \xi \, \mu \right\} q\_{0} - \left\{ \frac{\gamma\_{0}}{2 \mu^{2}} \left[ \lambda \left( \ell - 1 \right) + \xi + \mu \right] + \frac{\gamma\_{1}}{2 \mu} \right\} q\_{-1}, \end{split} \tag{54}$$

*where*

$$B(x, y) = \frac{\Gamma(x)\,\Gamma(y)}{\Gamma(x + y)}\tag{55}$$

*denotes the beta function and*

$$F(a,b;c;x) = \sum\_{n=0}^{+\infty} \frac{(a)\_n\,(b)\_n}{(c)\_n}\, \frac{x^n}{n!} \tag{56}$$

*is the Gauss hypergeometric function.*

**Proof.** Since *q*<sup>0</sup> = *G*(0), by setting *z* = 0 in (50) one obtains:

$$q\_0 = \mu^{\ell + \xi/(\lambda + \mu)} \, q\_{-1} \sum\_{i=0}^{\ell} \gamma\_i \int\_0^1 x^i (\lambda \, x + \mu)^{-\xi/(\lambda + \mu) - \ell - 1} (1 - x)^{\xi/(\lambda + \mu) - 1} \, dx.\tag{57}$$

Recalling that (see Gradshteyn and Ryzhik [50], p. 1005 and p. 1008, n. 9.131)

$$\begin{aligned} F(a,b;c;z) &= \frac{1}{B(b,c-b)} \int\_0^1 x^{b-1} \left(1-x\right)^{c-b-1} (1-x\,z)^{-a} \,dx, \qquad \text{Re } c > \text{Re } b > 0, \\ F(a,b;c;z) &= (1-z)^{-a} \,F\left(a,c-b;c; \frac{z}{z-1}\right), \end{aligned}$$

by setting *a* = ℓ + 1 + *ξ*/(*λ* + *μ*), *b* = *i* + 1, *c* = *i* + 1 + *ξ*/(*λ* + *μ*) and *z* = −*λ*/*μ*, for *i* = 0, 1, . . . , ℓ one has

$$\begin{split} \int\_{0}^{1} x^{i} \, (\lambda \, x + \mu)^{-\xi/(\lambda + \mu) - \ell - 1} \, (1 - x)^{\xi/(\lambda + \mu) - 1} \, dx &= \mu^{-\xi/(\lambda + \mu) - \ell - 1}\, B\!\left( i + 1, \frac{\xi}{\lambda + \mu} \right) F\!\left( \frac{\xi}{\lambda + \mu} + \ell + 1, i + 1; \frac{\xi}{\lambda + \mu} + i + 1; -\frac{\lambda}{\mu} \right) \\ &= (\lambda + \mu)^{-\xi/(\lambda + \mu) - \ell - 1} \, B\!\left( i + 1, \frac{\xi}{\lambda + \mu} \right) F\!\left( \frac{\xi}{\lambda + \mu}, \frac{\xi}{\lambda + \mu} + \ell + 1; \frac{\xi}{\lambda + \mu} + i + 1; \frac{\lambda}{\lambda + \mu} \right), \end{split}$$

where the symmetry property *F*(*a*, *b*; *c*; *z*) = *F*(*b*, *a*; *c*; *z*) has been used in the last equality. Hence, the first equation in (54) follows from (57). Moreover, from (50) we have:

$$\frac{dG(z)}{dz} = \left[ \left( \ell + \frac{\xi}{\lambda + \mu} \right) \frac{\lambda}{\lambda \, z + \mu} + \frac{\xi}{\lambda + \mu}\, \frac{1}{1 - z} \right] G(z) - \frac{q\_{-1}}{\left( \lambda \, z + \mu \right) \left( 1 - z \right)} \sum\_{i=0}^{\ell} \gamma\_i\, z^i, \tag{58}$$

so that the second equation in (54) follows from (52) for *n* = 1. Finally, from (58) one has:

$$\begin{split} \frac{d^2G(z)}{dz^2} &= \left[ -\left(\ell + \frac{\xi}{\lambda + \mu} \right) \frac{\lambda^2}{(\lambda \, z + \mu)^2} + \frac{\xi}{\lambda + \mu} \frac{1}{(1 - z)^2} \right] G(z) \\ &\quad + \left[ \left(\ell + \frac{\xi}{\lambda + \mu} \right) \frac{\lambda}{\lambda \, z + \mu} + \frac{\xi}{\lambda + \mu} \frac{1}{1 - z} \right] \frac{dG(z)}{dz} \\ &\quad + \frac{q\_{-1}}{\left(\lambda \, z + \mu\right)\left(1 - z\right)} \left[ \left(\frac{\lambda}{\lambda \, z + \mu} - \frac{1}{1 - z} \right) \sum\_{i = 0}^\ell \gamma\_i\, z^i - \sum\_{i = 0}^\ell i\,\gamma\_i \, z^{i - 1} \right]. \end{split} \tag{59}$$

Hence, by virtue of (52) for *n* = 2, from (59) the last equation in (54) follows.
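As an illustrative numerical check (not part of the original paper), the hypergeometric evaluation used for *q*0 can be compared with direct quadrature of the integral in (57), summing the series (56) directly; all parameter values below are arbitrary.

```python
import math

# Sketch: the closed form for the integral in (57) (beta function times a Gauss
# hypergeometric value) versus midpoint quadrature; arbitrary parameter values.
lam, mu, xi, ell, i = 0.8, 1.3, 2.5, 4, 2
c = xi / (lam + mu)                      # exponent xi/(lambda + mu); here c > 1

def beta(x, y):                          # Equation (55)
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def hyp2f1(a, b, cc, z, terms=200):      # series (56), convergent for |z| < 1
    s, term = 0.0, 1.0
    for n in range(terms):
        s += term
        term *= (a + n) * (b + n) / ((cc + n) * (n + 1)) * z
    return s

closed = ((lam + mu) ** (-c - ell - 1) * beta(i + 1, c)
          * hyp2f1(c, c + ell + 1, c + i + 1, lam / (lam + mu)))

def integrand(x):                        # integrand of (57) for fixed i
    return x ** i * (lam * x + mu) ** (-c - ell - 1) * (1 - x) ** (c - 1)

n = 200000                               # midpoint rule on (0, 1)
h = 1.0 / n
quad = h * sum(integrand((k + 0.5) * h) for k in range(n))
print(abs(quad - closed) / closed < 1e-4)
```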

**Proposition 7.** *Under the assumptions (11) and (46), one obtains:*

$$\mathrm{E}[N \,|\, N \ge 0] = \frac{1}{\lambda+\mu+\xi} \left\{ \lambda \, \ell + \frac{\xi}{\nu} \sum_{i=0}^{\ell} i \, \gamma_i \right\}, \tag{60}$$

*with ν* = *γ*0 + *γ*1 + ... + *γℓ.*

**Proof.** By virtue of (53), from (58) one has

$$\begin{split} \mathrm{E}[N \,|\, N \geq 0] &= \frac{1}{Q} \frac{dG(z)}{dz}\Big|_{z=1} \\ &= \frac{1}{Q} \lim_{z \to 1} \frac{\left[ \left( \ell + \frac{\xi}{\lambda+\mu} \right) \frac{\lambda (1-z)}{\lambda z + \mu} + \frac{\xi}{\lambda+\mu} \right] G(z) - \frac{q_{-1}}{\lambda z + \mu} \sum_{i=0}^{\ell} \gamma_i z^i}{1-z} \\ &= \left( \ell + \frac{\xi}{\lambda+\mu} \right) \frac{\lambda}{\lambda+\mu} - \frac{\xi}{\lambda+\mu} \, \mathrm{E}[N \,|\, N \geq 0] - \frac{q_{-1}}{Q} \frac{\lambda \, \nu}{(\lambda+\mu)^2} + \frac{q_{-1}}{Q \, (\lambda+\mu)} \sum_{i=0}^{\ell} i \, \gamma_i, \end{split}$$

from which (60) follows.

**Example 1.** *We assume that ℓ* = 0*. Under the assumptions (11), the time-inhomogeneous Markov chain N*(*t*) *is shown in Figure 6.*

**Figure 6.** The state diagram of the Markov process *N*(*t*) with ℓ = 0.

*In this case, there is only one operating state, the state zero; the intensity functions of failure ξ*(*t*) = *ξ ϕ*(*t*)*, of repair ϱ*(*t*) = *ϱ ϕ*(*t*) *and of restore γ*0(*t*) = *γ*0 *ϕ*(*t*) *are proportional, and pj*,0(*t*|*t*0) + *pj*,−2(*t*|*t*0) + *pj*,−1(*t*|*t*0) = 1*. From (42), one has:*

$$p_{j,0}(t|t_0) = e^{-\xi \Phi(t|t_0)} \, \delta_{j,0} + \gamma_0 \int_{t_0}^{t} \varrho(u) \, p_{j,-1}(u|t_0) \, e^{-\xi \Phi(t|u)} \, du, \qquad j = -2, -1, 0. \tag{61}$$

*Of course, the conditional mean (45) is equal to zero for all t* ≥ *t*0*. From Proposition 6, one obtains:*

$$q_0 = \frac{1}{\xi} \left( \frac{\mu}{\lambda+\mu} \right)^{\xi/(\lambda+\mu)} q_{-1} \, \gamma_0 \, F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+1; \frac{\xi}{\lambda+\mu}+1; \frac{\lambda}{\lambda+\mu} \right). \tag{62}$$

*Since*

$$F(a, b; b; z) = (1 - z)^{-a},\tag{63}$$

*from (62) one clearly has*

$$q_0 = \frac{q_{-1}}{\xi} \, \gamma_0 = \frac{\varrho \, \gamma_0}{\gamma_0 \, \varrho + \gamma_0 \, \xi + \varrho \, \xi},$$

*which coincides with the probability Q, since ν* = *γ*0*.*

**Example 2.** *We assume that ℓ* = 1*. Under the assumptions (11) and (46), the time-inhomogeneous Markov chain N*(*t*) *is shown in Figure 7.*

**Figure 7.** The state diagram of the Markov chain *N*(*t*) with ℓ = 1.

*In this case, there are two operating states,* 0 *and* 1*, with intensity functions of failure ξ*(*t*) = *ξ ϕ*(*t*)*, of repair ϱ*(*t*) = *ϱ ϕ*(*t*) *and of restores γi*(*t*) = *γi ϕ*(*t*) *for i* = 0, 1*; the birth-death intensity functions are λ*0(*t*) = *λ ϕ*(*t*) *and μ*1(*t*) = *μ ϕ*(*t*)*. By setting ℓ* = 1 *in the first equation of (54) one has*

$$\begin{split} q_0 &= \frac{1}{\xi} \left( \frac{\mu}{\lambda+\mu} \right)^{\xi/(\lambda+\mu)+1} q_{-1} \left[ \gamma_0 \, F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+2; \frac{\xi}{\lambda+\mu}+1; \frac{\lambda}{\lambda+\mu} \right) \right. \\ &\qquad \left. + \, \gamma_1 \, \frac{\lambda+\mu}{\lambda+\mu+\xi} \, F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+2; \frac{\xi}{\lambda+\mu}+2; \frac{\lambda}{\lambda+\mu} \right) \right]. \end{split} \tag{64}$$

*Recalling the Gauss recursion formula (see Gradshteyn and Ryzhik [50], p. 1010, n. 9.137.17)*

$$c \, F(a, b; c; z) - (c - b) \, F(a, b; c+1; z) - b \, F(a, b+1; c+1; z) = 0 \tag{65}$$

*and the relation (63), one obtains:*

$$F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+2; \frac{\xi}{\lambda+\mu}+1; \frac{\lambda}{\lambda+\mu} \right) = \frac{\lambda+\mu}{\lambda+\mu+\xi} \, \frac{\mu+\xi}{\mu} \left( \frac{\mu}{\lambda+\mu} \right)^{-\xi/(\lambda+\mu)}. \tag{66}$$

*Making use of (66) and of the relation (63) in Equation (64), for ℓ* = 1 *it follows that*

$$\begin{aligned} q_0 &= \frac{\mu}{\xi} \, \frac{1}{\lambda+\mu+\xi} \left[ \left( 1 + \frac{\xi}{\mu} \right) \gamma_0 + \gamma_1 \right] q_{-1}, \\ q_1 &= \frac{\lambda+\xi}{\mu} \, q_0 - \frac{\gamma_0}{\mu} \, q_{-1}. \end{aligned} \tag{67}$$

*Of course, q*0 + *q*1 = *Q* = *ϱ ν*/(*ϱ ν* + *ν ξ* + *ϱ ξ*)*, with ν* = *γ*0 + *γ*1*. From (53) we have*

$$\mathrm{E}(N \,|\, N \ge 0) = \frac{q_1}{Q} = \frac{\lambda+\xi}{\mu} \, \frac{q_0}{Q} - \frac{\gamma_0}{\mu} \, \frac{\xi}{\gamma_0+\gamma_1} = \frac{1}{\lambda+\mu+\xi} \left( \lambda + \xi \, \frac{\gamma_1}{\gamma_0+\gamma_1} \right),$$

*which coincides with (60) for ℓ* = 1*.*
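The chain (63)–(67) can be checked numerically. The sketch below (plain Python; the rate values are illustrative assumptions, and *q*−1 is normalised to 1 with *Q* = *q*−1(*γ*0 + *γ*1)/*ξ*, as used implicitly in the step above) verifies the closed form (66) against the Gauss series and confirms that *q*1/*Q* agrees with the general formula (60) for ℓ = 1:

```python
def hyp2f1(a, b, c, z, terms=400):
    # Gauss series; converges since z = lam/(lam + mu) < 1
    s, t = 0.0, 1.0
    for k in range(terms):
        s += t
        t *= (a + k) * (b + k) / ((c + k) * (k + 1)) * z
    return s

# illustrative rates (assumed)
lam, mu, xi, g0, g1 = 1.0, 2.0, 1.5, 0.7, 0.3
a = xi / (lam + mu)
z = lam / (lam + mu)

# closed form (66) against the series definition of F
F66 = hyp2f1(a, a + 2, a + 1, z)
closed66 = (lam + mu) / (lam + mu + xi) * (mu + xi) / mu * (mu / (lam + mu))**(-a)

# q0, q1 from (67), normalising q_{-1} = 1
qm1 = 1.0
q0 = mu / xi / (lam + mu + xi) * ((1 + xi / mu) * g0 + g1) * qm1
q1 = (lam + xi) / mu * q0 - g0 / mu * qm1

# conditional mean q1/Q, with Q = q_{-1} (g0 + g1) / xi, against (60), ell = 1
Q = qm1 * (g0 + g1) / xi
E_direct = q1 / Q
E_60 = (lam + xi * g1 / (g0 + g1)) / (lam + mu + xi)
print(E_direct, E_60)    # both 0.32222...
```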

**Example 3.** *We assume that ℓ* = 2*. Under the assumptions (11) and (46), the time-inhomogeneous Markov chain N*(*t*) *is shown in Figure 8.*

**Figure 8.** The state diagram of the Markov chain *N*(*t*) with ℓ = 2.

*In this case, there are three operating states,* 0*,* 1 *and* 2*, with the intensity functions of failure ξ*(*t*) = *ξ ϕ*(*t*)*, of repair ϱ*(*t*) = *ϱ ϕ*(*t*) *and of restores γi*(*t*) = *γi ϕ*(*t*) *for i* = 0, 1, 2*; the birth-death intensity functions are λn*(*t*) = (2 − *n*) *λ ϕ*(*t*) *for n* = 0, 1 *and μn*(*t*) = *n μ ϕ*(*t*) *for n* = 1, 2*. By setting ℓ* = 2 *in the first equation of (54) one obtains*

$$\begin{split} q_0 &= \frac{1}{\xi} \left( \frac{\mu}{\lambda+\mu} \right)^{\xi/(\lambda+\mu)+2} q_{-1} \left[ \gamma_0 \, F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+3; \frac{\xi}{\lambda+\mu}+1; \frac{\lambda}{\lambda+\mu} \right) \right. \\ &\qquad + \, \gamma_1 \, \frac{\lambda+\mu}{\lambda+\mu+\xi} \, F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+3; \frac{\xi}{\lambda+\mu}+2; \frac{\lambda}{\lambda+\mu} \right) \\ &\qquad \left. + \, 2 \, \gamma_2 \, \frac{(\lambda+\mu)^2}{(\lambda+\mu+\xi) \left[ 2(\lambda+\mu)+\xi \right]} \, F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+3; \frac{\xi}{\lambda+\mu}+3; \frac{\lambda}{\lambda+\mu} \right) \right]. \end{split} \tag{68}$$

*By virtue of (65), one has:*

$$\begin{split} F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+3; \frac{\xi}{\lambda+\mu}+1; \frac{\lambda}{\lambda+\mu} \right) &= \frac{(\lambda+\mu)^2}{2(\lambda+\mu)+\xi} \left( \frac{\mu}{\lambda+\mu} \right)^{-\xi/(\lambda+\mu)} \\ &\times \left[ \frac{\xi}{\mu^2} + \frac{2}{\lambda+\mu+\xi} \left( 1 + \frac{\xi}{\mu} \right) \right], \end{split} \tag{69}$$

$$F\left( \frac{\xi}{\lambda+\mu}, \frac{\xi}{\lambda+\mu}+3; \frac{\xi}{\lambda+\mu}+2; \frac{\lambda}{\lambda+\mu} \right) = \frac{\lambda+\mu}{2(\lambda+\mu)+\xi} \left( \frac{\mu}{\lambda+\mu} \right)^{-\xi/(\lambda+\mu)} \left( 2 + \frac{\xi}{\mu} \right).$$

*Making use of (69) and of the relation (63) in Equation (68), for ℓ* = 2 *it follows that*

$$\begin{split} q_0 &= \frac{\mu^2}{\xi} \, \frac{1}{(\lambda+\mu+\xi) \left[ 2(\lambda+\mu)+\xi \right]} \, q_{-1} \\ &\quad \times \left\{ \gamma_0 \left[ \frac{\xi (\lambda+\mu+\xi)}{\mu^2} + 2 \left( 1 + \frac{\xi}{\mu} \right) \right] + \gamma_1 \left( 2 + \frac{\xi}{\mu} \right) + 2 \, \gamma_2 \right\}, \\ q_1 &= \frac{2\lambda+\xi}{\mu} \, q_0 - \frac{\gamma_0}{\mu} \, q_{-1}, \\ q_2 &= \frac{(\xi+\lambda)(\xi+2\lambda) + \xi \mu}{2 \mu^2} \, q_0 - \left[ \gamma_0 \, \frac{\xi+\lambda+\mu}{2 \mu^2} + \frac{\gamma_1}{2 \mu} \right] q_{-1}. \end{split} \tag{70}$$

*Clearly, q*0 + *q*1 + *q*2 = *Q* = *ϱ ν*/(*ϱ ν* + *ν ξ* + *ϱ ξ*)*, with ν* = *γ*0 + *γ*1 + *γ*2*. Finally, from* (53) *one obtains*

$$\begin{split} \mathrm{E}(N \,|\, N \geq 0) &= \frac{q_1 + 2 \, q_2}{Q} = \left[ \frac{2\lambda+\xi}{\mu} + \frac{(\xi+\lambda)(\xi+2\lambda) + \xi \mu}{\mu^2} \right] \frac{q_0}{Q} \\ &\quad - \left[ \gamma_0 \, \frac{\xi+\lambda+2\mu}{\mu^2} + \frac{\gamma_1}{\mu} \right] \frac{\xi}{\gamma_0+\gamma_1+\gamma_2} = \frac{1}{\lambda+\mu+\xi} \left\{ 2\lambda + \xi \, \frac{\gamma_1 + 2\gamma_2}{\gamma_0+\gamma_1+\gamma_2} \right\}, \end{split}$$

*which coincides with (60) for ℓ* = 2*.*
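The ℓ = 2 formulas can likewise be cross-checked numerically. The sketch below (plain Python; the rate values are illustrative assumptions, with *q*−1 normalised to 1 and *Q* = *q*−1 *ν*/*ξ*, as used implicitly above) confirms that the probabilities (70) sum to *Q* and that the conditional mean matches (60) for ℓ = 2:

```python
# illustrative rates (assumed); q_{-1} normalised to 1
lam, mu, xi = 1.2, 0.8, 0.5
g0, g1, g2 = 0.5, 0.3, 0.2
qm1 = 1.0
nu = g0 + g1 + g2

# q0, q1, q2 from (70)
q0 = (mu**2 / xi) / ((lam + mu + xi) * (2 * (lam + mu) + xi)) * qm1 * (
    g0 * (xi * (lam + mu + xi) / mu**2 + 2 * (1 + xi / mu))
    + g1 * (2 + xi / mu) + 2 * g2)
q1 = (2 * lam + xi) / mu * q0 - g0 / mu * qm1
q2 = ((xi + lam) * (xi + 2 * lam) + xi * mu) / (2 * mu**2) * q0 \
    - (g0 * (xi + lam + mu) / (2 * mu**2) + g1 / (2 * mu)) * qm1

# the three probabilities must sum to Q = q_{-1} nu / xi, and the
# conditional mean (q1 + 2 q2)/Q must match (60) with ell = 2
Q = qm1 * nu / xi
E_direct = (q1 + 2 * q2) / Q
E_60 = (2 * lam + xi * (g1 + 2 * g2) / nu) / (lam + mu + xi)
print(q0 + q1 + q2 - Q, E_direct - E_60)   # both ~0
```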

#### **7. Conclusions**

In the present paper, we have considered a time-inhomogeneous CTMC with a finite state space in which failures and repairs can occur at random times. In addition to the operating states, the state space includes two particular states, denoted by *F* and *R*, representing the failure state and the repair state, respectively. Failures occur according to a non-stationary exponential distribution and produce a transition from an operating state to *F*. Subsequently, a repair is required, which involves a transition from *F* to *R*. The repair times are also assumed to be random, occurring according to a non-stationary exponential distribution. After the repair, the system restarts from one of the operating states.

Assuming that the failures, repairs and restores are characterized by proportional intensity functions, we determine the transition probabilities that, starting from an arbitrary state *j* at time *t*0, the system reaches the state *F*, the state *R*, or one of the operating states at time *t*. The obtained results show that the probability that the system is in an operating state at time *t* does not depend on the intensity functions related to the birth-death process without failures and repairs. In other words, the transition probabilities related to the states *F* and *R*, as well as the transition probability that the system occupies an operating state, are independent of the dynamics existing between the operating states. We determine the density of the time of first failure and the related average. Moreover, we focus on the transition probabilities of the operating states by determining the PGF and the conditional mean. Finally, under the assumption of proportional intensity functions, we analyze the asymptotic behavior of the probabilities of the operating states by calculating the asymptotic PGF and the asymptotic conditional mean.

**Author Contributions:** Conceptualization, V.G. and A.G.N.; methodology, V.G. and A.G.N.; software, V.G. and A.G.N.; validation, V.G. and A.G.N.; formal analysis, V.G. and A.G.N.; investigation, V.G. and A.G.N.; resources, V.G. and A.G.N.; data curation, V.G. and A.G.N.; visualization, V.G. and A.G.N.; supervision, V.G. and A.G.N. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research is partially supported by MIUR—PRIN 2017, Project "Stochastic Models for Complex Systems".

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Acknowledgments:** The authors are members of the research group GNCS of INdAM.

**Conflicts of Interest:** The authors declare that they have no conflict of interest.

#### **Appendix A. Proof of Proposition 4**

Equation (34) with the conditions (35) can be solved by using the method of characteristics (cf., for instance, Williams [51]). We consider the following differential equations:

$$\begin{aligned} \frac{dt}{d\psi} &= 1, \qquad \frac{dz}{d\psi} = (z-1) \left[ \lambda(t) \, z + \mu(t) \right], \\ \frac{dG_j}{d\psi} &= \left[ \ell \, (z-1) \, \lambda(t) - \xi(t) \right] G_j + \varrho(t) \, p_{j,-1}(t|t_0) \sum_{i=0}^{\ell} \gamma_i z^i, \end{aligned} \tag{A1}$$

with the initial conditions:

$$t(s, \psi = t_0) = t_0, \qquad z(s, \psi = t_0) = s, \qquad G_j(s, \psi = t_0) = \sum_{i=0}^{\ell} \delta_{j,i} \, s^i. \tag{A2}$$

The first equation of (A1), with the related initial condition in (A2), leads to *t* = *ψ*. By setting *t* = *ψ* in the second equation of (A1) and by using the second of (A2) one obtains:

$$z - 1 = \frac{(s-1) \, e^{\Lambda(\psi|t_0) + M(\psi|t_0)}}{1 - (s-1) \, B(\psi|t_0)}, \tag{A3}$$

with Λ(*t*|*t*0), *M*(*t*|*t*0) and *B*(*t*|*t*0) defined in (38). Moreover, solving the third equation in (A1) with *t* = *ψ* and *z* obtained from (A3) we have

$$\begin{split} G_j(s, \psi) &= e^{-\xi \Phi(\psi|t_0)} \exp\left\{ \ell \, (s-1) \int_{t_0}^{\psi} \frac{\lambda(u) \, e^{\Lambda(u|t_0) + M(u|t_0)}}{1 - (s-1) \, B(u|t_0)} \, du \right\} \sum_{i=0}^{\ell} \delta_{j,i} \, s^i \\ &+ \int_{t_0}^{\psi} du \, \varrho(u) \, p_{j,-1}(u|t_0) \, e^{-\xi \Phi(\psi|u)} \exp\left\{ \ell \, (s-1) \int_{u}^{\psi} \frac{\lambda(\vartheta) \, e^{\Lambda(\vartheta|t_0) + M(\vartheta|t_0)}}{1 - (s-1) \, B(\vartheta|t_0)} \, d\vartheta \right\} \\ &\times \sum_{i=0}^{\ell} \gamma_i \left[ 1 + \frac{(s-1) \, e^{\Lambda(u|t_0) + M(u|t_0)}}{1 - (s-1) \, B(u|t_0)} \right]^i, \end{split} \tag{A4}$$

where the use of the third of (A2) has been made. From (A3) with *ψ* = *t*, we also obtain

$$s = \frac{1 + (z-1) \, b_1(t|t_0)}{1 + (z-1) \, b_2(t|t_0)}, \tag{A5}$$

with *b*1(*t*|*t*0) and *b*2(*t*|*t*0) defined in (37). By virtue of (A5), one has:

$$\begin{split} (s-1) \int_{t_0}^{t} \frac{\lambda(u) \, e^{\Lambda(u|t_0) + M(u|t_0)}}{1 - (s-1) \, B(u|t_0)} \, du &= \ln\left[ 1 + (z-1) \, b_2(t|t_0) \right], \\ 1 + \frac{(s-1) \, e^{\Lambda(u|t_0) + M(u|t_0)}}{1 - (s-1) \, B(u|t_0)} &= \frac{1 + (z-1) \, b_1(t|u)}{1 + (z-1) \, b_2(t|u)}. \end{split} \tag{A6}$$

Finally, recalling that *ψ* = *t* and making use of (A5) and (A6), from (A4) one derives (36).

#### **References**


## *Article* **Conditions for Existence of Second-Order and Third-Order Filters for Discrete Systems with Additive Noises**

**Mikhail Kamenshchikov 1,2,3**


**Abstract:** The problem of constructing functional optimal observers (filters) for stochastic control systems with additive noises in discrete time is studied in this work. Under the assumption that there is no filter of the first order, necessary and sufficient conditions for the existence of filters of the second and third order are obtained in the canonical basis. Analytical expressions of the transfer function matrix from the input noise to the estimation error are presented. A numerical example is given to compare the performance of filters by the quadratic criterion in the steady state.

**Keywords:** discrete time functional filter; optimal unbiased estimation; steady state

#### **1. Introduction**

The reduced-order filtering problem occupies an important place in the theory of optimal state estimation. Instead of the traditionally used Kalman filter, which forms an estimate of the total system state vector and has an order that coincides with the order of the system, it is proposed to construct its analogue, a functional filter with a reduced dimension. In this case, the computational effort to implement a functional filter is reduced. In addition, the reduced order of the filter simplifies the analysis of the dynamic system.

The problem under study is at the intersection of two classical problems of state estimation theory: the full-order filtering problem for stochastic systems and the functional observer design problem for deterministic systems. The first problem belongs to filtering theory and was solved for the non-linear case (even for the non-stationary case) in 1959–1960 by Stratonovich [1,2], and for the linear case in 1960–1961 by Kalman and Bucy [3,4], for both continuous and discrete time. The solution to the second problem, constructing functional observers for linear stationary fully defined systems, was proposed in 1966 by Luenberger [5]. The further development of the theory of functional observers is reflected in detail in the books by O'Reilly [6] and Korovin and Fomichev [7]. In particular, in [7], the conditions of existence and algorithms for the synthesis of functional observers for linear stationary fully deterministic systems are given for various cases, namely scalar and vector output, and scalar and vector functional. Two methods of solving the functional observer design problem are also proposed: the pseudo-input method and the scalar observer method. Both methods allow one to obtain necessary and sufficient conditions for the existence of functional observers of order *k* (*k* < *ν* − 1, where *ν* is the observability index of the system), which were first proposed in [8,9].

**Citation:** Kamenshchikov, M. Conditions for Existence of Second-Order and Third-Order Filters for Discrete Systems with Additive Noises. *Mathematics* **2022**, *10*, 370. https://doi.org/10.3390/math10030370

Academic Editor: Leonid Piterbarg

Received: 12 December 2021; Accepted: 22 January 2022; Published: 25 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Much attention has been paid to the construction of reduced-order filters for linear systems. Minimizing the quadratic error criterion over the interval and using the solution of the two-point boundary value problem, the reduced-order filter is designed in [10,11]. Based on the quasi-diagonal matrix decomposition and the solution of the Riccati and Lyapunov matrix equations, a method is proposed in [12] for determining the parameters of continuous and digital linear filters of reduced order which ensures their asymptotic stability provided that the estimated system is stabilizable and detectable. In [13–15], the proof of the uniqueness of the optimal unbiased reduced-order filter and the properties of the reduced-order innovation process in continuous and discrete time are proposed. Developing the results obtained in [13], the necessary and sufficient conditions for existence, stability, and convergence of the designed filter are obtained for both continuous and discrete stochastic systems in [14], and for discrete stochastic systems with unknown inputs in [15]. In [16], a method for the synthesis of functional optimal observers in the frequency domain using spectral factorization in continuous and discrete time is proposed, and the transfer function of the filter and the properties of its associated innovation sequence are obtained. In [17], using a model reduction of the original system and solving the Lyapunov equations involved in each iteration of the optimum search algorithm, a simple method for reduced-order H<sup>2</sup> filter design is proposed. An approach to design a sliding mode control-based functional observer for discrete-time stochastic systems, existence conditions, and stability analysis of the proposed observer are given in [18]. In [19], a generalization of the classical unbiasedness condition in the joint problem of stabilization and optimal filtering is presented, and an alternative method for constructing reduced-order filters is proposed, based on reduction to a non-linear optimization problem. Conditions for existence of second-order and third-order filters for systems in continuous time with additive noises are proposed in [20].
In [21,22], the frequency-weighted H2-optimal model order reduction problem is investigated, and algorithms are proposed that construct a reduced-order model which nearly satisfies the first-order optimality conditions.

In practice, reduced-order optimal filters are used in signal processing of inertial navigation systems, in health parameter estimation for an aircraft turbofan engine, in induction motor state estimation, in dynamic image analysis, in restoration of progressive and interlaced video, in separation of heart and respiratory sounds, in meteorology and oceanography applications (see [23] and references therein).

This article proposes an approach to constructing reduced-order filters, which differs from the methods in [13–16] in that the filter order does not necessarily coincide with the dimension of the estimated functional. Unlike the methods in [12,17,21,22], where Lyapunov equations are used to calculate the quality criterion, this article uses the method of integral quadratic performance measures, which makes it possible to determine the dependence on parameters in an explicit form.

The problem formulation is presented in Section 2, where the scalar linear functional of the state vector is estimated from the measured scalar output. Perturbations are white random processes with a priori known probabilistic characteristics, uncorrelated with each other at different times and with the initial state of the system. The root-mean-square error in the steady state is chosen as the criterion of optimality. In Section 3, necessary and sufficient conditions for the existence of filters of the second and third order are obtained using canonical forms. Analytical expressions for the transfer function matrix are given in Section 4. The dependence of the number of parameters of the second-order and third-order filters on the order of the original system is also presented. Section 5 contains an illustrative example comparing second- and third-order filters by the quadratic criterion in the steady state. Section 6 summarizes the article.

The mathematical notations used in this text are listed in Table 1.


**Table 1.** Mathematical notations.

#### **2. Problem Statement**

Consider an *n*–dimensional linear discrete system with stochastic perturbations and with a scalar output:

$$\begin{aligned} x_{i+1} &= A x_i + B u_i + w_i, \\ y_i &= C x_i + v_i, \end{aligned} \qquad i \ge 0, \tag{1}$$

where *xi* ∈ R*n* is the unknown phase vector, *ui* ∈ R*m* is the known input of the system, and *yi* ∈ R is the measured output of the system; *A*, *B*, *C* are constant matrices of appropriate sizes; *wi*, *vi* are discrete, uncorrelated, zero-mean white noise processes of dimensions *n* and 1, respectively, with given covariance matrices E[*wi wj*⊤] = *Qδij*, E[*vi vj*] = *Rδij*; the initial state *x*0 is a random variable uncorrelated with the noises *wi*, *vi*, with E[*x*0] = *x̄*0 and E[(*x*0 − *x̄*0)(*x*0 − *x̄*0)⊤] = *P*0. Here, *Q*, *P*0 are positive semidefinite matrices, and *R* > 0. These assumptions can be represented as

$$\mathrm{E}\left[\begin{pmatrix} w_i \\ v_i \\ x_0 \end{pmatrix} \begin{pmatrix} w_j^{\top} & v_j & x_0^{\top} & 1 \end{pmatrix}\right] = \begin{bmatrix} Q \delta_{ij} & 0 & 0 & 0 \\ 0 & R \delta_{ij} & 0 & 0 \\ 0 & 0 & P_0 + \bar{x}_0 \bar{x}_0^{\top} & \bar{x}_0 \end{bmatrix}, \quad i \ge 0, \ j \ge 0.$$

It is also assumed that the matrices *Q*, *R*, *P*0 are known a priori. The first equation in the system (1) can be understood in the sense of a stochastic difference equation [24].

It is required, based on observation of the output *yi* and the known input *ui*, to construct an unbiased estimate *σ̃i* of the scalar functional

$$\sigma_i = F x_i, \quad i \ge 0, \tag{2}$$

with the known matrix *F* ∈ R1×*n*, providing the minimum of the steady-state mean value of the squared observation error *ei* = *σi* − *σ̃i*:

$$J = \lim\_{i \to \infty} \mathbb{E}[e\_i^2] \to \min. \tag{3}$$

#### **3. Filter Design**

Let matrix *F* have the standard decomposition [6,7]

$$F = PT + VC,$$

where *P* ∈ R1×*k*, *T* ∈ R*k*×*n*, and *V* ∈ R. Then, *σi* = *Pqi* + *Vyi* − *Vvi*, where *qi* = *Txi* ∈ R*k* is an unknown vector to be estimated. To reconstruct it, we use an observer of order *k*

$$\begin{aligned} \widetilde{q}_{i+1} &= N \widetilde{q}_i + T B u_i + M y_i, \qquad \widetilde{q}_0 = T \bar{x}_0, \\ \widetilde{\sigma}_i &= P \widetilde{q}_i + V y_i, \end{aligned} \qquad i \ge 0, \tag{4}$$

where *q̃i* ∈ R*k* is the phase vector of the observer; *N*, *M* are constant matrices of appropriate sizes. In the second equation of the observer, the output *yi* of the original system (1) appears, which makes it possible to obtain an advantage in terms of the quadratic criterion over the filter without it.

Without loss of generality, we make the following standard [6] assumptions regarding the original system (1) and the desired filter (4).

**Assumption 1.** *The pair* {*C*, *A*} *is observable and is given in the second canonical form of observability [7]*

$$A = \begin{pmatrix} 0 & 0 & \dots & 0 & -a_1 \\ 1 & 0 & \dots & 0 & -a_2 \\ \dots & \dots & \dots & \dots & \dots \\ 0 & 0 & \dots & 1 & -a_n \end{pmatrix}, \quad C = \begin{pmatrix} 0 & \dots & 0 & 1 \end{pmatrix}, \tag{5}$$

*where ai are the coefficients of the characteristic polynomial of the matrix A, i.e.,*

$$\alpha(z) = \det(zI_n - A) = z^n + a_n z^{n-1} + \dots + a_1.$$

*The matrix F in the canonical basis has the form:*

$$F = \begin{pmatrix} f\_1 & f\_2 & \dots & f\_n \end{pmatrix}.$$

**Assumption 2.** *The pair* {*P*, *N*} *is observable and is given in the first canonical form of observability [7]*

$$N = \begin{pmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \dots & \dots & \dots & \dots & \dots \\ 0 & 0 & 0 & \dots & 1 \\ -l_1 & -l_2 & -l_3 & \dots & -l_k \end{pmatrix}, \quad P = \begin{pmatrix} 1 & 0 & \dots & 0 \end{pmatrix}, \tag{6}$$

*where li are the coefficients of the characteristic polynomial of the matrix N, i.e.,*

$$\beta(z) = \det(zI_k - N) = z^k + l_k z^{k-1} + \dots + l_1.$$

Let us investigate the question of when linear filters of the second (*k* = 2) and third (*k* = 3) order can estimate the functional (2) from the state vector. In addition, it is assumed that there is no first-order (*k* = 1) filter giving an unbiased estimate for the functional (2). Discrete-time filters of various orders starting from the first order were considered in [25].

**Theorem 1.** *For system* (1) *of order higher than the third* (*n* > 3) *with stochastic perturbations and filters* (4) *of the second and third order, giving an unbiased estimate of the functional* (2) *from the state vector, it is true that:*

*(1) the necessary and sufficient conditions for the existence of a second-order filter have the form*

$$T = \begin{pmatrix} f\_1 & f\_2 & \dots & f\_{n-1} & f\_n - V \\ f\_2 & f\_3 & \dots & f\_n - V & t\_{2n} \end{pmatrix},$$

$$M = \begin{pmatrix} -\sum_{i=1}^{n} a_i f_i - t_{2n} + a_n V \\ -\sum_{i=1}^{n-1} a_i f_{i+1} - (a_n - l_2) \, t_{2n} + l_1 (f_n - V) + a_{n-1} V \end{pmatrix},$$

$$V = f\_n + l\_1 f\_{n-2} + l\_2 f\_{n-1}, \ t\_{2n} = -l\_1 f\_{n-1} - l\_2 (f\_n - V),\ f\_1 f\_3 - f\_2^2 \neq 0,$$

$$l\_1 = \frac{f\_2 a - f\_3^2}{f\_1 f\_3 - f\_2^2}, \ l\_2 = \frac{f\_2 f\_3 - f\_1 a}{f\_1 f\_3 - f\_2^2}, \ 1 - l\_1 > 0,\ 1 - l\_2 + l\_1 > 0,\ 1 + l\_1 + l\_2 > 0,\ \quad \text{(7)}$$

$$a = \begin{cases} f\_4, \text{ if } n > 4, \\ f\_4 - V, \text{ if } n = 4; \end{cases}$$

*where the condition f*1 *f*3 − *f*2<sup>2</sup> ≠ 0 *means that the observer* (4) *of the first order cannot reconstruct the unbiased estimate of the functional* (2)*;*

*(2) the necessary and sufficient conditions for the existence of a third-order filter have the form*

$$T = \begin{pmatrix} f\_1 & f\_2 & \dots & f\_{n-2} & f\_{n-1} & f\_n - V \\ f\_2 & f\_3 & \dots & f\_{n-1} & f\_n - V & t\_{2n} \\ f\_3 & f\_4 & \dots & f\_n - V & t\_{2n} & t\_{3n} \end{pmatrix},$$

$$M = \begin{pmatrix} -\sum_{i=1}^{n} a_i f_i - t_{2n} + a_n V \\ -\sum_{i=1}^{n-1} a_i f_{i+1} - a_n t_{2n} - t_{3n} + a_{n-1} V \\ -\sum_{i=1}^{n-2} a_i f_{i+2} - (a_{n-1} - l_2) \, t_{2n} - (a_n - l_3) \, t_{3n} + l_1 (f_n - V) + a_{n-2} V \end{pmatrix},$$

$$V = f\_n + l\_1 f\_{n-3} + l\_2 f\_{n-2} + l\_3 f\_{n-1},$$

$$t\_{2n} = -l\_1 f\_{n-2} - l\_2 f\_{n-1} - l\_3 (f\_n - V), \ t\_{3n} = -l\_1 f\_{n-1} - l\_2 (f\_n - V) - l\_3 t\_{2n}$$

$$b \neq \frac{f\_3(f\_3^2 - f\_2a) + a(f\_1a - f\_2f\_3)}{f\_1f\_3 - f\_2^2},\tag{8}$$

$$\begin{aligned} l\_1 &= \frac{a(a^2 - f\_3b) + b(f\_2b - f\_3a) + c(f\_3^2 - f\_2a)}{b(f\_1f\_3 - f\_2^2) - f\_3(f\_3^2 - f\_2a) - a(f\_1a - f\_2f\_3)}, \\ l\_2 &= \frac{a(f\_2b - f\_3a) + b(f\_3^2 - f\_1b) + c(f\_1a - f\_2f\_3)}{b(f\_1f\_3 - f\_2^2) - f\_3(f\_3^2 - f\_2a) - a(f\_1a - f\_2f\_3)}, \\ l\_3 &= \frac{a(f\_3^2 - f\_2a) + b(f\_1a - f\_2f\_3) + c(f\_2^2 - f\_1f\_3)}{b(f\_1f\_3 - f\_2^2) - f\_3(f\_3^2 - f\_2a) - a(f\_1a - f\_2f\_3)}, \end{aligned}$$

$$1 - l\_1^2 > 0, \ l\_1^2 - 1 < l\_1 l\_3 - l\_2, \ 1 + l\_3 + l\_2 + l\_1 > 0, \ -1 + l\_3 - l\_2 + l\_1 < 0,\tag{9}$$

$$a = \begin{cases} f\_4, \text{ if } n > 4, \\ f\_4 - V, \text{ if } n = 4; \end{cases} \quad b = \begin{cases} f\_5, \text{ if } n > 5, \\ f\_5 - V, \text{ if } n = 5, \\ t\_{24}, \text{ if } n = 4; \end{cases} \quad c = \begin{cases} f\_6, \text{ if } n > 6, \\ f\_6 - V, \text{ if } n = 6, \\ t\_{25}, \text{ if } n = 5, \\ t\_{34}, \text{ if } n = 4; \end{cases}$$

$$f\_i = -l\_1 f\_{i-3} - l\_2 f\_{i-2} - l\_3 f\_{i-1}, \text{ if } i = 7, \dots, n - 1, \text{ for } n > 7,$$

*where the condition* (8) *for the case n* > 5 *means that the observer* (4) *of the second order cannot reconstruct the unbiased estimate of the functional* (2)*.*

**Proof.** Using the stochastic difference equations of the original system (1) and the observer (4), it is not difficult to obtain that the estimation error *e<sup>q</sup><sub>i</sub>* = *qi* − *q̃i* is described by the equation

$$\begin{aligned} \mathbf{e}\_{i+1}^{q} &= q\_{i+1} - \widetilde{q}\_{i+1} = T\mathbf{x}\_{i+1} - N\widetilde{q}\_{i} - TB\mathbf{u}\_{i} - My\_{i} \\ &= T A\mathbf{x}\_{i} - N(q\_{i} - \mathbf{e}\_{i}^{q}) - MC\mathbf{x}\_{i} + T\mathbf{w}\_{i} - M\mathbf{v}\_{i} \\ &= N\mathbf{e}\_{i}^{q} + (TA - MC - NT)\mathbf{x}\_{i} + T\mathbf{w}\_{i} - M\mathbf{v}\_{i}. \end{aligned} \tag{10}$$

The equation for the error $\varepsilon_i = \sigma_i - \widetilde{\sigma}_i$ has the form

$$\varepsilon\_i = \sigma\_i - \widetilde{\sigma}\_i = F\mathbf{x}\_i - P\widetilde{q}\_i - V\widetilde{y}\_i = PT\mathbf{x}\_i + VC\mathbf{x}\_i - P\widetilde{q}\_i - VC\mathbf{x}\_i - V\mathbf{v}\_i = Pe\_i^q - V\mathbf{v}\_i.\tag{11}$$

Based on the known results [6], we can conclude that the estimates $\widetilde{q}_i$ and $\widetilde{\sigma}_i$ are unbiased for $q_i$ and $\sigma_i$, respectively, if and only if the following conditions are satisfied:

*F* = *PT* + *VC*, *TA* − *MC* − *NT* = 0, *N* is a Schur matrix. (12)

Moreover, if the matrix *N* is a Schur matrix, then [26] the observation error $e_i^q$ in the steady state is a wide-sense stationary random process: its mathematical expectation is constant, and its correlation function depends on a single variable.

Both statements of the theorem are obtained in a similar way from the conditions (12), Assumption 1 about the canonical representations of the original system (5) and Assumption 2 about the canonical representations of the desired filter (6).

**Remark 1.** *Inequalities in Formulas* (7) *and* (9) *are discrete stability constraints for the filter* (4) *obtained using the simplified stability criterion [27] for linear discrete systems.*
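Conditions such as (9) are simple polynomial inequalities and are easy to check numerically. The helper below is a small sketch of our own (not from the paper) for the third-order case; the first test uses the filter coefficients found in the numerical example of Section 5, and the second uses a polynomial with the root $z = 2$ outside the unit circle.

```python
def schur_stable_3(l1, l2, l3):
    """Check the discrete-stability inequalities (9) for the
    characteristic polynomial beta(z) = z^3 + l3*z^2 + l2*z + l1."""
    return (1 - l1**2 > 0
            and l1**2 - 1 < l1 * l3 - l2
            and 1 + l3 + l2 + l1 > 0
            and -1 + l3 - l2 + l1 < 0)

# Coefficients of the optimal third-order filter from Section 5: stable.
print(schur_stable_3(-0.0119, 0.2303, 0.0591))   # True

# z^3 - 2*z^2 has the root z = 2 outside the unit circle: unstable.
print(schur_stable_3(0.0, 0.0, -2.0))            # False
```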

**Remark 2.** *If the condition* (8) *is violated, then there is a set of the degenerate third-order observers whose coefficients of the characteristic polynomial according to Vieta's formulas have the form*

$$l\_1 = z\_1 z\_2 (\mathcal{C}\_1 + z\_1 + z\_2), \ l\_2 = z\_1 z\_2 - (z\_1 + z\_2)(\mathcal{C}\_1 + z\_1 + z\_2), \ l\_3 = \mathcal{C}\_1$$

*and are located at the intersection of the domain of discrete stability of the matrix N and the solution set for the system of equations*

$$z\_1^3 + l\_3 z\_1^2 + l\_2 z\_1 + l\_1 = 0, \quad z\_2^3 + l\_3 z\_2^2 + l\_2 z\_2 + l\_1 = 0,\tag{13}$$

*where* $z_1$, $z_2$ *are the roots of the characteristic polynomial* $z^2 + l_2 z + l_1$ *with coefficients* $l_1$, $l_2$ *satisfying* (7)*; and they are determined by the quadratic formula*

$$z\_{1,2} = \frac{f\_1 a - f\_2 f\_3 \pm \sqrt{(f\_2 f\_3 - f\_1 a)^2 - 4(f\_2 a - f\_3^2)(f\_1 f\_3 - f\_2^2)}}{2(f\_1 f\_3 - f\_2^2)}.$$

$\mathcal{C}_1$ *is a free parameter, which is chosen so that the stability conditions* (9) *are satisfied:*

$$-1 - z\_1 - z\_2 < \mathcal{C}\_1 < 1 - z\_1 - z\_2,$$

*and the variable a is determined according to the second statement of Theorem 1.*

#### **4. Transfer Function Matrix of the Estimation Error System**

This section discusses a method for calculating the optimality criterion (3) by interpreting [28] the steady-state root-mean-square error as the $H_2$ norm of the weighted transfer function matrix of the estimation error system (10) and (11):

$$J = \lim\_{i \to \infty} \mathbb{E}[\varepsilon\_i^2] = \frac{1}{2\pi} \int\_{-\pi}^{\pi} W\_{\varepsilon \bar{u}}(e^{j\theta}) \begin{pmatrix} Q & 0 \\ 0 & R \end{pmatrix} W\_{\varepsilon \bar{u}}^{\mathsf{T}}(e^{-j\theta})\, d\theta,$$

where the transfer function matrix $W_{\varepsilon\bar{u}}(z)$ from the vector noise $\bar{u}_i \triangleq \begin{pmatrix} \mathbf{w}_i^{\mathsf{T}} & v_i \end{pmatrix}^{\mathsf{T}}$ to the estimation error $\varepsilon_i$ must be stable and can be found using the following theorem.
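As a sanity check on this frequency-domain interpretation, the squared $H_2$ norm of a stable scalar transfer function can be computed both as the integral above and, by Parseval's theorem, as the sum of squared impulse-response coefficients. A minimal sketch of our own (a first-order example, not the paper's filter):

```python
import cmath
import math

def h2_sq_freq(num, den, m=4096):
    """Squared H2 norm (1/2pi) * int_{-pi}^{pi} |W(e^{jt})|^2 dt by the
    midpoint rule; num/den are coefficient lists of W(z)'s numerator and
    denominator, highest power first."""
    def poly(c, z):
        acc = 0j
        for a in c:
            acc = acc * z + a  # Horner evaluation
        return acc
    total = 0.0
    for i in range(m):
        t = -math.pi + (i + 0.5) * (2 * math.pi / m)
        z = cmath.exp(1j * t)
        total += abs(poly(num, z) / poly(den, z)) ** 2
    return total / m  # the 1/(2*pi) weight makes the integral a mean

# W(z) = 1/(z - a) has impulse response 1, a, a^2, ... (delayed one step),
# so its squared H2 norm is sum_k a^(2k) = 1/(1 - a^2).
a = 0.5
print(abs(h2_sq_freq([1.0], [1.0, -a]) - 1 / (1 - a * a)) < 1e-9)  # True
```

The midpoint rule converges very fast here because the integrand is smooth and periodic.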

**Theorem 2.** *If the conditions of Theorem 1 are satisfied, then the transfer function matrix* $W_{\varepsilon\bar{u}}(z)$ *of the estimation error system has the form*

$$W\_{\varepsilon\bar{u}}(z) = \frac{1}{\beta(z)} \begin{pmatrix} W\_{\varepsilon w}^{1}(z) & \dots & W\_{\varepsilon w}^{n}(z) & W\_{\varepsilon v}(z) \end{pmatrix},$$

*(1) in which, for the case of a second-order filter:*

$$W\_{\varepsilon w}^{i}(z) = f\_i(z+l\_2) + f\_{i+1}, \ i = 1, \ldots, n-2,$$

$$W\_{\varepsilon w}^{n-1}(z) = f\_{n-1}(z+l\_2) + f\_n - V, \ W\_{\varepsilon w}^{n}(z) = (f\_n - V)(z+l\_2) + t\_{2n},$$

$$W\_{\varepsilon v}(z) = -V\beta(z) + \left(\sum\_{i=1}^{n} a\_i f\_i + t\_{2n} - a\_n V\right)(z+l\_2) + \sum\_{i=1}^{n-1} a\_i f\_{i+1} + (a\_n - l\_2)t\_{2n} - l\_1(f\_n - V) - a\_{n-1}V,$$

$$\beta(z) = z^2 + l\_2 z + l\_1;$$

*(2) in which, for the case of a third-order filter:*

$$\begin{aligned} W\_{\varepsilon w}^{i}(z) &= f\_i(z^2 + l\_3 z + l\_2) + f\_{i+1}(z + l\_3) + f\_{i+2}, \; i = 1, \ldots, n - 3, \\ W\_{\varepsilon w}^{n-2}(z) &= f\_{n-2}(z^2 + l\_3 z + l\_2) + f\_{n-1}(z + l\_3) + f\_n - V, \\ W\_{\varepsilon w}^{n-1}(z) &= f\_{n-1}(z^2 + l\_3 z + l\_2) + (f\_n - V)(z + l\_3) + t\_{2n}, \\ W\_{\varepsilon w}^{n}(z) &= (f\_n - V)(z^2 + l\_3 z + l\_2) + t\_{2n}(z + l\_3) + t\_{3n}, \\ W\_{\varepsilon v}(z) &= -V\beta(z) + \Big(\sum\_{i=1}^{n} a\_i f\_i + t\_{2n} - a\_n V\Big)(z^2 + l\_3 z + l\_2) \\ &\quad + \Big(\sum\_{i=1}^{n-1} a\_i f\_{i+1} + a\_n t\_{2n} + t\_{3n} - a\_{n-1} V\Big)(z + l\_3) \\ &\quad + \sum\_{i=1}^{n-2} a\_i f\_{i+2} + (a\_{n-1} - l\_2)t\_{2n} + (a\_n - l\_3)t\_{3n} - l\_1(f\_n - V) - a\_{n-2}V, \\ \beta(z) &= z^3 + l\_3 z^2 + l\_2 z + l\_1. \end{aligned}$$

**Proof.** The estimation error system (10) and (11) can be written as follows:

$$\begin{aligned} e\_{i+1}^{q} &= N e\_i^{q} + \bar{B}\bar{u}\_i, \quad \varepsilon\_i = P e\_i^{q} + \bar{D}\bar{u}\_i, \quad i \ge 0, \\ \bar{B} &= \begin{pmatrix} T & -M \end{pmatrix}, \quad \bar{D} = \begin{pmatrix} 0 & -V \end{pmatrix}. \end{aligned}$$

For this system, the transfer function matrix from the input $\bar{u}_i$ to the output $\varepsilon_i$ is equal to

$$W\_{\varepsilon\bar{u}}(z) = P(zI\_k - N)^{-1}\bar{B} + \bar{D}.\tag{14}$$

Using Formula (14), the necessary and sufficient existence conditions of a filter of the appropriate order from Theorem 1, and Assumption 2 on the canonical representation of the filter, we obtain both statements of Theorem 2. Moreover, the pair $\{P, N\}$ is observable by Assumption 2, and the pair $\{N, \bar{B}\}$ is controllable by the condition $f_1 f_3 - f_2^2 \neq 0$ for the second-order filter and by the condition (8) for the third-order filter. Consequently, using the properties [29,30] of controllability and observability, we conclude that the specified transfer function matrix is irreducible.

**Remark 3.** *Depending on the order n of the original system* (1) *and the order k of the desired filter* (4)*, the transfer function matrix* $W_{\varepsilon\bar{u}}(z)$ *has the unknown parameters indicated in Table 2.*


**Table 2.** Transfer function matrix parameters.

**Remark 4.** *If the condition* (8) *of Theorem 1 is violated for a third-order filter, then the transfer matrix of the error system can be calculated according to the first statement of Theorem 2.*

There are various ways to find the optimality criterion without calculating the poles of the transfer function. Firstly, the calculation of *J* can be reduced to the calculation of integrals of the form

$$\frac{1}{2\pi} \int\_{-\pi}^{\pi} \left| \frac{b\_0 e^{j\theta k} + b\_1 e^{j\theta(k-1)} + \dots + b\_k}{a\_0 e^{j\theta k} + a\_1 e^{j\theta(k-1)} + \dots + a\_k} \right|^2 d\theta, \tag{15}$$

where the coefficients *ai*, *bi* depend on unknown parameters of the filter (4) according to Theorem 2 and Remark 3. There are special formulas and tables [31–35] for calculating integrals (15).

Secondly, by a bilinear transformation [36], the calculation of *J* can be reduced to the calculation of integrals of the form

$$\frac{1}{2\pi} \int\_{-\infty}^{\infty} \left| \frac{\widetilde{b}\_0 (j\omega)^k + \widetilde{b}\_1 (j\omega)^{k-1} + \dots + \widetilde{b}\_k}{\widetilde{a}\_0 (j\omega)^{k+1} + \widetilde{a}\_1 (j\omega)^k + \dots + \widetilde{a}\_{k+1}} \right|^2 d\omega. \tag{16}$$

There are also special formulas and tables [35,37–39] for calculating integrals (16).

Thirdly, a discrete Lyapunov matrix equation can be used [40] to calculate *J*.
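The third route can be sketched generically: for a Schur matrix $N$, the discrete Lyapunov equation $X = NXN^{\mathsf{T}} + S$ has a unique solution reachable by fixed-point iteration. The helper below is our own illustration with plain-list matrices (not the paper's code); in the filtering context, $S$ would collect the noise terms, e.g., $S = TQT^{\mathsf{T}} + MRM^{\mathsf{T}}$.

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_add(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def dlyap(N, S, iters=500):
    """Fixed-point iteration X <- N X N^T + S; converges when N is Schur."""
    X = [[0.0] * len(N) for _ in N]
    for _ in range(iters):
        X = mat_add(mat_mul(mat_mul(N, X), transpose(N)), S)
    return X

# Scalar sanity check: x = n*x*n + s  =>  x = s / (1 - n^2) = 4/3 here.
X = dlyap([[0.5]], [[1.0]])
print(round(X[0][0], 6))  # 1.333333
```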

#### **5. Numerical Example**

This section presents a numerical example comparing second- and third-order filters in terms of the asymptotic mean-square observation error.

We consider the system (1) and (2) of the fourth order, in which the matrices *A*, *C* are given in the canonical form (5) with $\alpha_1 = 1/16$, $\alpha_2 = 1/2$, $\alpha_3 = 3/2$, $\alpha_4 = 2$; the matrix *B* is the zero matrix; the elements of the matrix *F* are equal to $f_1 = 1$, $f_2 = 1/2$, $f_3 = 1/3$, $f_4 = 1/4$; and the probabilistic characteristics are $Q = P_0 = I_4$, $R = 1$, $\bar{x}_0 = \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}^{\mathsf{T}}$.

There is no first-order filter reconstructing the unbiased estimate of the scalar functional (2). To find the unknown parameter (*V*) of the second-order filter (4), we solve the problem of minimizing the optimality criterion (3), which, according to Section 4, is

$$J(V) = \frac{492,687,360V^4 + 143,928,576V^3 + 55,244,160V^2 - 6,303,956V - 396,985}{768(36V + 1)(36V + 5)(108V - 13)},\tag{17}$$

where the parameter *V* must be such that the characteristic polynomial of the observer is stable, i.e., *V* ∈ (−1/36, 13/108). The function (17) defined over the open interval (−1/36, 13/108) has a global minimum at *V* ≈ 0.1148. Figure 1 shows the graph of the function *J*(*V*).

**Figure 1.** Graph of the function (17). The global minimum of the function *J*(*V*) over the open interval (−1/36, 13/108) is the red point (*V* ≈ 0.1148, *J*(*V*) ≈ 4.1223). The two blue dashed lines are the asymptotes *V* = −1/36 and *V* = 13/108.
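The reported minimum can be reproduced by a direct search over the stability interval. A minimal sketch of our own (a crude grid search, restricted to a closed subinterval of $(-1/36, 13/108)$ to stay away from the poles of the denominator; the rational function is Equation (17) with its numerator powers in descending order):

```python
def J(V):
    """Criterion (17) for the second-order filter."""
    num = (492687360 * V**4 + 143928576 * V**3 + 55244160 * V**2
           - 6303956 * V - 396985)
    den = 768 * (36 * V + 1) * (36 * V + 5) * (108 * V - 13)
    return num / den

# Grid search well inside the stability interval (-1/36, 13/108).
lo, hi, m = -0.02, 0.118, 100000
V_opt = min((lo + (hi - lo) * i / m for i in range(m + 1)), key=J)
print(round(V_opt, 3), round(J(V_opt), 3))  # ~0.115  ~4.122
```

In practice one would refine this with a one-dimensional minimizer; the grid is enough to confirm the values $V \approx 0.1148$ and $J \approx 4.1223$ quoted in the text.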

The numerical values of the second-order filter matrices are

$$P = \begin{pmatrix} 1 & 0 \end{pmatrix}, \ N \approx \begin{pmatrix} 0 & 1 \\ 0.5224 & -0.378 \end{pmatrix},$$

$$T \approx \begin{pmatrix} 1 & 0.5 & 0.3333 & 0.1352 \\ 0.5 & 0.3333 & 0.1352 & 0.123 \end{pmatrix}, \ M \approx \begin{pmatrix} -1.2058 \\ -0.6708 \end{pmatrix}, \ V \approx 0.1148.$$

The steady state mean value of the squared observation error in this case is

$$J \approx 4.1223.$$

If the condition (8) is violated ($t_{24} = 12V^2 - 2V + 7/36$), then, by Remark 2, the degenerate third-order observers have the form (4), in which

$$\begin{aligned} P &= \begin{pmatrix} 1 & 0 & 0 \end{pmatrix}, \; N \approx \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -0.1975 + 0.5224\mathcal{C}\_1 & 0.6653 - 0.378\mathcal{C}\_1 & -\mathcal{C}\_1 \end{pmatrix}, \\ T &\approx \begin{pmatrix} 1 & 0.5 & 0.3333 & 0.1352 \\ 0.5 & 0.3333 & 0.1352 & 0.123 \\ 0.3333 & 0.1352 & 0.123 & 0.0241 \end{pmatrix}, \; M \approx \begin{pmatrix} -1.2058 \\ -0.6708 \\ -0.3763 \end{pmatrix}, \; V \approx 0.1148, \end{aligned}$$

where the free parameter $\mathcal{C}_1$ is chosen so that the stability conditions (9) are satisfied, i.e., $\mathcal{C}_1 \in (\underline{\mathcal{C}}_1, \overline{\mathcal{C}}_1)$ with $\underline{\mathcal{C}}_1 \approx -0.622$ and $\overline{\mathcal{C}}_1 \approx 1.378$. According to Remark 4, the transfer function matrix and the optimality criterion in this case are found in the same way as for the second-order filter.

Therefore, if the condition (8) is violated, then there is a set of degenerate observers whose coefficients of the characteristic polynomial *β*(*z*) are at the intersection of the linear manifold of solutions of the system (13) in which

$$\begin{aligned} z\_1 &= \frac{3 - 36V + \sqrt{3}\sqrt{1 + 432V^2}}{6} \approx 0.558, \\ z\_2 &= \frac{3 - 36V - \sqrt{3}\sqrt{1 + 432V^2}}{6} \approx -0.9361; \end{aligned}$$

and the domain of discrete stability of the matrix *N*.
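The numerical values of $z_1$ and $z_2$ can be checked directly from the closed-form expression above (a small sketch of our own, using $V \approx 0.1148$ as found for the second-order filter):

```python
import math

V = 0.1148
s = math.sqrt(3) * math.sqrt(1 + 432 * V**2)
z1 = (3 - 36 * V + s) / 6
z2 = (3 - 36 * V - s) / 6
# Both roots lie inside the unit circle, as required for stability.
print(round(z1, 3), round(z2, 3))  # 0.558 -0.936
```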

If the condition (8) is satisfied ($t_{24} \neq 12V^2 - 2V + 7/36$), then there exists a functional optimal observer (4) of the third order solving the optimal filtering problem. In this case, to find the unknown variables ($V$, $t_{24}$, $t_{34}$), the problem of minimizing the optimality criterion is solved under the restriction that the parameters keep the characteristic polynomial of the observer stable. Figure 2 illustrates the solution of this problem in the discrete stability regions given by the inequalities (9), in the coordinates ($l_1$, $l_2$, $l_3$) in Figure 2a and in the coordinates ($V$, $t_{24}$, $t_{34}$) in Figure 2b. As one can see, the solution paths of the sequential quadratic programming method [41] from different starting points converge to the common minimum of the optimality criterion (3), which has the following coordinates:

$$l\_1 \approx -0.0119, \quad l\_2 \approx 0.2303, \quad l\_3 \approx 0.0591;$$

$$V \approx 0.373, \quad t\_{24} \approx -0.0636, \quad t\_{34} \approx 0.0361.$$

**Figure 2.** Three convergence paths (blue, green and orange arrows) to the common minimum (red star) of the optimality criterion *J* from various starting points in the discrete stability regions in coordinates (**a**) *l*1, *l*2, *l*3; (**b**) *V*, *t*24, *t*34.

The numerical values of the third-order filter matrices are

$$P = \begin{pmatrix} 1 & 0 & 0 \end{pmatrix}, \ N \approx \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0.0119 & -0.2303 & -0.0591 \end{pmatrix},$$

$$T \approx \begin{pmatrix} 1 & 0.5 & 0.3333 & -0.123 \\ 0.5 & 0.3333 & -0.123 & -0.0636 \\ 0.3333 & -0.123 & -0.0636 & 0.0361 \end{pmatrix}, \ M \approx \begin{pmatrix} -0.503 \\ 0.0776 \\ 0.0528 \end{pmatrix}, \ V \approx 0.373.$$

The optimality criterion in this case is

$$J \approx 2.3179.$$

Thus, the optimality criterion (3) for the third-order filter turned out to be smaller than for the second-order filter. Previously, second- and third-order filters were compared from both practical and theoretical points of view. In the context of satellite signal processing, it has been shown [42] that increasing the order of the filters improves dynamic stress performance. In [17], a smaller value of the $H_2$ norm for a third-order filter than for a second-order filter was obtained in numerical experiments. Moreover, it has recently been explained theoretically [22] that, as the order of the reduced model increases, the deviation in the satisfaction of the optimality conditions further decreases.

#### **6. Conclusions**

Necessary and sufficient conditions for the existence of discrete unbiased filters of the second and third order are proposed. In the canonical basis, analytical expressions are obtained both for the transfer function matrix of the estimation error system and for the coefficients of the characteristic polynomial of functional filters. In a numerical experiment, filters of the second and third order are constructed for a linear stochastic discrete system of the fourth order, and they are compared according to the root-mean-square optimality criterion. It is shown that the third-order filter achieves a smaller value of the quadratic criterion in the steady state than the second-order filter.

**Funding:** This research was funded by the Russian Foundation for Basic Research, grant numbers 20-37-90065 and 20-08-00073.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


### *Article* **Optimal Prefetching in Random Trees**

**Kausthub Keshava, Alain Jean-Marie and Sara Alouf**


**Abstract:** We propose and analyze a model for optimizing the prefetching of documents, in the situation where the connection between documents is discovered progressively. A random surfer moves along the edges of a random tree representing possible sequences of documents, which is known to a controller only up to depth *d*. A quantity *k* of documents can be prefetched between two movements. The question is to determine which nodes of the known tree should be prefetched so as to minimize the probability of the surfer moving to a node not prefetched. We analyzed the model with the tools of Markov decision process theory. We formally identified the optimal policy in several situations, and we identified it numerically in others.

**Keywords:** prefetching; optimization; Markov decision processes; random trees; Galton–Watson

#### **1. Introduction**

Prefetching is a basic technique underlying many computer science applications. Its main purpose is to reduce the time needed to access some information by loading it in advance and concurrently with the process that needs this information. From prefetching of data and code in CPUs and memory architectures, to prefetching of web pages and video segments in Internet-based applications, this technique is ubiquitous. Yet, the technique fundamentally involves a tradeoff between access latency and the consumption of resources (memory, network), and the optimization of this tradeoff is not completely understood.

Clearly, the issue here is randomness: the entity in charge of prefetching, let us call it the "controller", does not know in advance what is the precise data access sequence of the process needing the data. It must therefore make decisions based on the current state of said process and its knowledge of the possible evolution. The adequate formalism for modeling optimal decisions in such a context is that of Markov Decision Processes (MDPs). The principle of using Markov decision processes to optimize prefetching in the context of video applications was first demonstrated in [1,2]. The model was extended in [3,4] and further extended in [5].

The basic principle of these models is that the set of (video) documents to be viewed is represented by a directed graph. The nodes represent the documents, and the edges represent the possible transitions: which documents can be viewed after the viewing of the current document is completed. The edges can be labeled with probabilities or frequencies. A random "surfer" alternates viewing periods and moves to another node/document according to the probabilities of the edges. The controller knows where the surfer stands and knows all about the graph, but does not know which way the surfer will go: only the odds. Its decision is to choose which nodes to download during the time the surfer views the current document. The amount of nodes that can be downloaded is constrained by network resources and is called the "prefetching budget". The amount of storage memory available to the controller is assumed to be sufficient: no memory management is involved in the decision. The criterion to be minimized is typically the average number of times the surfer moves to a document that has not been prefetched: this is a measure of the user's dissatisfaction. The criterion might also involve some measure of the waste of network and memory resources. An optimal policy can, in principle, be computed using dynamic programming.

**Citation:** Keshava, K.; Jean-Marie, A.; Alouf, S. Optimal Prefetching in Random Trees. *Mathematics* **2021**, *9*, 2437. https://doi.org/10.3390/math9192437

Academic Editors: Alexander Zeifman, Victor Korolev and Alexander Sipin

Received: 31 August 2021; Accepted: 22 September 2021; Published: 1 October 2021

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In practical situations, the probabilities that the surfer moves to some new document after viewing the current document are not known a priori. However, these probabilities can be learned from data using Markov models, as in [6–11]. Moreover, the optimal control of a prefetching agent can be approximated using machine learning techniques such as reinforcement learning, as in [2]. A way to evaluate the efficiency of a machine learning algorithm is to test it on a problem for which the exact solution is known. The purpose of this paper is to provide such a benchmark situation, by determining the optimal policy and the minimal possible cost it induces, to which heuristics and learning algorithms can be compared.

While these previous modeling attempts demonstrated that the MDP formalism is flexible enough to take into account many features of a real system, they also illustrate that finding an optimal policy is a complex problem. Indeed, computing an optimal prefetching policy is very hard in general, that is, when the graph of documents does not have a particular property. In [12], the authors studied the *feasibility* variant of the problem. There, it was assumed that the controller has a prefetching budget *k*, representing the number of documents that can be prefetched while some document is viewed. The question is to decide whether *k* is large enough so that there exists a policy that prefetches all nodes of a graph before the random surfer tries to access them. This is a subproblem of the Markov optimization model: if such a policy exists, the MDP model should find it as a policy that realizes a zero cost. The results of [12,13] concluded that finding the minimum possible *k* is difficult when the graph is general. Computing the optimal policy in the corresponding MDP must be even more difficult.

However, if the underlying graph is a tree, it was proven in [12] that the minimal budget *k* that ensures the existence of a costless policy can be computed in polynomial time. The corresponding prefetching strategy is also easy to compute and has the feature of being "connected". This property, which we also call "greedy", means that the controller can choose the documents to download in the set of neighbors (in the document graph) of the documents already downloaded.

The models and results reviewed thus far assumed that the complete space of documents is known to the controller. This ideal situation may be either unrealistic or undesirable. For instance, if the documents are pages on the global web, storing all the knowledge about this graph is probably impossible and also useless since the web surfer will not visit all the graph during a surfing session. Furthermore, since the complexity of the decision grows exponentially with the size of the graph, it may help the controller to limit, on purpose, the size of the known graph to a neighborhood of the current document.

The current literature lacks a model for the optimal prefetching problem, which features a dynamic graph of documents. It also lacks situations where an optimal control can be formally (and not just numerically) identified, even when the underlying graph of documents is static. We fill these gaps in two ways. First, we propose a new optimal prefetching model in which the graph of documents is dynamic. We focused on trees, since those are the simplest graphs, with the potential for having computable solutions as the literature review suggests. Second, we compute exactly the optimal control for some instances of this model. We proceed with an informal description of the model, then we highlight our contribution.

#### *1.1. The Model*

We propose to use the modeling described above, but replace the graph of known documents with a tree of depth *d*. The root of this tree is the current position of the surfer. The tree represents all the possible sequences of *d* moves of the surfer. After the surfer has moved to one of the neighbors of its current position, a discovery phase adds a new generation of documents at depth *d*. The rest of the tree is then ignored: the possibility that the surfer moves back to a node already viewed is neglected, as well as the possibility that several paths exist from one node to another. If any of these possibilities happens in practice, the task will become easier for the controller.

In the discovery phase, we assume that a random number of new documents is attached to every leaf of the current tree, with a uniform distribution between 1 and some integer *p*, which we refer to as the "fanout". In practical graphs of documents, this assumption is not very realistic. The advantages of making such an assumption are that the space of possible configurations will remain finite and that probabilities relative to objects in this space will be easier to write.

As in previous models, the controller is assumed to have a fixed prefetching budget: some integer number *k*. Given a tree of documents with some nodes already downloaded, the problem is to decide which *k* nodes to download so as to minimize the cost. We chose as criterion the stationary probability that the surfer moves to a node that is not prefetched. All these elements are converted in the specification of a Markov decision process, with criterion the infinite-horizon average cost. The model has only three parameters: the depth *d* and fanout *p* of trees and the prefetching budget *k*. The question is whether there is a simple rule based on these three parameters that leads to an optimal decision.
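The decision loop just described can be prototyped directly. Below is a toy simulator of our own (not the authors' code) for the special case $d = 1$, where the state reduces to the list of marks of the root's sons: the controller marks up to $k$ sons (the greedy order used here is an arbitrary illustrative policy), the surfer moves uniformly, a unit cost is paid if the target was not prefetched, and discovery then draws a fresh fanout uniformly in $\{1, \dots, p\}$.

```python
import random

def simulate(p, k, steps, seed=0):
    """Average cost per step on depth-1 trees with fanout p and budget k."""
    rng = random.Random(seed)
    marks = [False] * rng.randint(1, p)   # marks of the current root's sons
    cost = 0
    for _ in range(steps):
        # Prefetch: mark up to k not-yet-prefetched sons.
        budget = k
        for i, m in enumerate(marks):
            if not m and budget > 0:
                marks[i] = True
                budget -= 1
        # The surfer moves to a son chosen uniformly at random.
        if not marks[rng.randrange(len(marks))]:
            cost += 1
        # Discovery: a fresh generation of 1..p unprefetched documents.
        marks = [False] * rng.randint(1, p)
    return cost / steps

# With budget k >= p every son is always prefetched: zero cost.
print(simulate(p=3, k=3, steps=2000))   # 0.0
# With budget 0 every move is a miss.
print(simulate(p=3, k=0, steps=2000))   # 1.0
```

The interesting regime, studied in the rest of the paper, is $0 < k < p$ together with depths $d > 1$, where the choice of *which* nodes to mark matters.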

#### *1.2. Contribution*

The first contribution of this paper is the precise specification of this MDP, in Section 3. This specification is based on sets of trees, presented in Section 2, together with their basic properties.

We then turn to the identification of optimal prefetching policies, in Sections 5–7. The results we obtained include: (a) A bound on the optimal cost in general trees; (b) The characterization of optimal policies in trees with depth 1, arbitrary fanout, and arbitrary budget; (c) The characterization of optimal policies in trees with depth 2, arbitrary fanout, and budget 1; (d) An exploration of the optimal policy in trees with depth 2, budget 2 and fanout less than 5. In the process of obtaining these results, we show, in Section 4, the properties of underlying Markov chains on the "shape of trees", which do not depend on the specific policy used and are of independent interest. We discuss the results and the modeling assumptions in Section 8 and conclude in Section 9.

The main notation that is used throughout the paper is summarized in Table A1 in Appendix A.

#### **2. Preliminaries: Sets of Trees**

This section is devoted to the presentation of mathematical objects that are used in the definition of our problem and in its solution.

The state space of the MDP we are about to construct is a set of marked trees. We introduce it now, together with other sets of trees that will be useful in the analysis. We shall use "with fanout *p*" as a shorthand for "with nodes having between 1 and *p* sons".

#### **Definition 1** (**Trees and marked trees**)**.** *Define:*


These sets are represented mathematically with the following recursive formulas:

$$\mathcal{T}\_{p,0} = \{0\} \tag{1}$$

$$\mathcal{T}\_{p,d} = \{0\} \times \text{SEQ}\_{1\dots p}(\mathcal{T}\_{p,d-1}) \qquad d \ge 1 \tag{2}$$

$$\mathcal{M}\_{p,0} = \{0, 1\} \tag{3}$$

$$\mathcal{M}\_{p,d} = \{0, 1\} \times \text{SEQ}\_{1\dots p}(\mathcal{M}\_{p,d-1}) \qquad d \ge 1 \tag{4}$$

$$\mathcal{M}\_{p,1}^{+} = \mathcal{M}\_{p,0} \times \text{SEQ}\_{1\dots p}(\mathcal{T}\_{p,0}) \tag{5}$$

$$\mathcal{M}\_{p,d}^{+} = \mathcal{M}\_{p,0} \times \text{SEQ}\_{1..p}(\mathcal{M}\_{p,d-1}^{+}) \qquad d \ge 2. \tag{6}$$

In these expressions, $\mathrm{SEQ}_{1..p}(A)$ denotes, in the notation of [14], a sequence of objects in the set $A$, with length between 1 and $p$. In (1), we associate the constant mark "0" with nodes in unmarked trees. At the risk of being confusing, we shall say that a node in marked trees of $\mathcal{M}_{p,d}$ or $\mathcal{M}^+_{p,d}$ is "unmarked" if it has the mark 0. With this convention, we can say that $\mathcal{T}_{p,d} \subset \mathcal{M}^+_{p,d} \subset \mathcal{M}_{p,d}$.

A tree $t$ in $\mathcal{M}_{p,d}$ is represented as follows. If the depth is $d = 0$, then $t = (\mu)$ where $\mu$ is the mark. If $d > 0$, then $t = (\mu, s)$ where $\mu = 0$ or $\mu \in \{0, 1\}$ depending on the set, and $s = (s_1, \dots, s_m)$ is a list of length $m \in [1..p]$. The elements of $s$ are called "subtrees". The root nodes of these subtrees are called "sons" of $t$. The following notation will be useful to designate the components of a tree. Figure 1 illustrates this terminology.

**Definition 2** (**Mark, subtrees, internal nodes, leaves**)**.** *For a tree represented as* $t = (\mu, s)$*, let* $\mu(t) : \mathcal{M}_{p,d} \to \{0, 1\}$ *denote the mark of the root and* $s(t) : \mathcal{M}_{p,d} \to \mathrm{SEQ}_{1..p}(\mathcal{M}_{p,d-1})$ *denote the list of subtrees of* $t$*. Let also* inode$(t)$ *denote the number of internal nodes in* $t$ *and* leaves$(t)$ *denote the number of leaves in* $t$*.*

**Figure 1.** A tree *t* of depth *d* = 3 with fanout *p* = 2. There are 2 subtrees (|*s*(*t*)| = 2), 6 internal nodes (inode(*t*) = 6), and 5 leaves (leaves(*t*) = 5).

The cardinal of the sets defined in Definition 1 is important to know, in case we want to turn to numerical experiments. From the recursive definition of the different sets of trees, the following result is easily established.

**Lemma 1.** *Let* $T_{p,d}$*,* $M_{p,d}$*, and* $M^+_{p,d}$ *denote respectively the cardinals of the sets* $\mathcal{T}_{p,d}$*,* $\mathcal{M}_{p,d}$*, and* $\mathcal{M}^+_{p,d}$*. Then:*

• $T_{p,0} = 1$*,* $T_{p,1} = p$*, and for* $d \ge 2$*,*

$$T\_{p,d} = \sum\_{m=1}^{p} \left(T\_{p,d-1}\right)^{m} = \frac{(T\_{p,d-1})^{p+1} - T\_{p,d-1}}{T\_{p,d-1} - 1};$$

• $M_{p,0} = 2$ *and for* $d \ge 1$*,*

$$M\_{p,d} = 2\sum\_{m=1}^{p} (M\_{p,d-1})^m = 2\,\frac{(M\_{p,d-1})^{p+1} - M\_{p,d-1}}{M\_{p,d-1} - 1};$$

• $M^+_{p,1} = 2p$ *and for* $d \ge 2$*,*

$$M^{+}\_{p,d} = 2\sum\_{m=1}^{p} (M^{+}\_{p,d-1})^m = 2\,\frac{(M^{+}\_{p,d-1})^{p+1} - M^{+}\_{p,d-1}}{M^{+}\_{p,d-1} - 1}.$$

Similar formulas can be established for the generating functions of the size of trees in each set. We shall not develop this analysis further. Table 1 shows the values of $T_{p,d}$, $M_{p,d}$, and $M^+_{p,d}$ for small values of $p$ and $d$. Clearly, these numbers grow extremely fast with $d$. The sets remain manageable for small values of $p$ and $d$. For instance, Figure 2 lists the 6 trees of $\mathcal{T}_{2,2}$.
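The counts of Lemma 1 are easy to confirm by brute-force enumeration. The sketch below (our illustrative code) builds every tree of $\mathcal{T}_{p,d}$ as nested tuples, following the recursive definition (1)–(2), and compares the count with the recursion for $T_{p,d}$.

```python
from itertools import product

def trees(p, d):
    """Enumerate T_{p,d}: each tree is the tuple of its subtrees,
    and a depth-0 tree is the empty tuple."""
    if d == 0:
        return [()]
    subs = trees(p, d - 1)
    # Ordered sequences of 1..p subtrees, as in SEQ_{1..p}.
    return [comb for m in range(1, p + 1) for comb in product(subs, repeat=m)]

def T(p, d):
    """Cardinal of T_{p,d} via the recursion of Lemma 1."""
    return 1 if d == 0 else sum(T(p, d - 1) ** m for m in range(1, p + 1))

print(len(trees(2, 2)), T(2, 2))  # 6 6, matching Figure 2
```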

**Table 1.** Instances of the cardinals of the sets of rooted trees of depth $d$ with fanout $p$: when marks are ignored ($T_{p,d}$), when marks are in $\{0, 1\}$ ($M_{p,d}$), and when leaves have mark 0 and other nodes have marks in $\{0, 1\}$ ($M^+_{p,d}$), for different $d$ and $p$.


**Figure 2.** There are 6 trees in $\mathcal{T}_{2,2}$, the set of rooted trees with depth 2 and fanout 2.

#### **3. The MDP Model**

In this section, we formally describe the five elements of the prefetching MDP. An MDP model is formally defined by a state space, an action space, transitions, costs, and an evaluation criterion. The state is usually made of numerical variables or discrete structures that summarize the information needed for the following definitions. Actions and transitions specify what controls are allowed to the controller, how they modify the states, and with what probabilities. The cost function, which depends on the states and actions, quantifies the impact of actions. The costs incurred at different time steps are aggregated into a numerical criterion: the objective is to minimize this criterion.

#### *3.1. Prefetching Process Flow*

The prefetching process flow is summarized as follows. The current state of the prefetching program is the currently known graph of depth *d*, together with the knowledge of nodes that have been already prefetched. This is represented as a marked tree of depth *d*. The surfer is assumed to stand at the root. The controller then prefetches up to *k* documents, which is represented by marking the corresponding nodes in the tree. Then, the surfer moves randomly to one of the sons, with uniform probabilities among them. If the document corresponding to this node is not already prefetched, then some cost is incurred. Finally, the controller discovers a new generation of nodes. We assume that every possible exploration/discovery of the current subtree is equally likely. After discovery, the controller is back at the beginning of this decision loop.
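The decision loop just described can be sketched in code, representing a tree as a nested tuple `(mark, children)`. All names below (`mark_greedily`, `surf`, `discover`, `cycle`) are illustrative, and the marking step shown is one simple greedy choice, not the paper's optimized controller:

```python
import random

# One decision cycle of the prefetching loop: mark, move, discover.
# Trees are nested tuples (mark, children); a leaf has children == ().

def mark_greedily(tree, k):
    """Mark up to k unmarked sons of the root (one simple greedy action)."""
    mark, sons = tree
    new_sons, budget = [], k
    for m, grandsons in sons:
        if m == 0 and budget > 0:
            new_sons.append((1, grandsons))
            budget -= 1
        else:
            new_sons.append((m, grandsons))
    return (mark, tuple(new_sons))

def surf(tree, rng):
    """The surfer moves uniformly to one son; cost 1 if it was not prefetched."""
    _, sons = tree
    chosen = rng.choice(sons)
    return chosen, 0 if chosen[0] == 1 else 1

def discover(tree, p, rng):
    """Every leaf receives between 1 and p fresh unmarked sons, uniformly."""
    mark, sons = tree
    if not sons:  # a leaf: attach the newly discovered generation
        return (mark, tuple((0, ()) for _ in range(rng.randint(1, p))))
    return (mark, tuple(discover(s, p, rng) for s in sons))

def cycle(tree, p, k, rng):
    """One mark / move / discover round: returns (next state, incurred cost)."""
    sub, cost = surf(mark_greedily(tree, k), rng)
    return discover(sub, p, rng), cost
```

Iterating `cycle` from an initial tree and averaging the costs gives a Monte Carlo estimate of the average cost of this particular greedy policy.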

#### *3.2. State Space and Action Space*

According to the prefetching process flow described previously, the state space *S* of the MDP is $\mathcal{M}^+\_{p,d}$ as defined in Definition 1, since the leaves of a state were just discovered and are not prefetched: their mark is 0. For a given tree *t* ∈ *S*, the action space $A\_t$ is the set of all subsets of the vertices of *t* with cardinality at most *k*. An action $a \in A\_t$ has the effect of marking the nodes in *a*. Some actions make little sense, namely those that mark already marked nodes. We nevertheless include them as possible actions, which greatly simplifies the description of the set $A\_t$. The parameter *k* ≥ 1 is called the prefetching budget.

#### *3.3. Transitions*

We first formalize the transition between the different trees using random variables and set mappings. Then, we quantify the probabilities of these transitions and the costs.

#### **Definition 3** (**Prefetching process random variables**)**.** *Define:*

- $t \in \mathcal{M}^+\_{p,d}$*, the state (marked tree) at the beginning of a decision cycle;*
- $t\_a \in \mathcal{M}\_{p,d}$*, the tree obtained from t by the marking action;*
- $t\_b \in \mathcal{M}\_{p,d-1}$*, the subtree of* $t\_a$ *to which the surfer moves.*

In the evolution of the controlled process, these three random variables will depend on the time step *n*. When necessary, we use the notation *t*(0), *ta*(0), *tb*(0), *t*(1), *ta*(1), *tb*(1), . . ., *t*(*n*), *ta*(*n*), *tb*(*n*), . . . to denote this (random) succession of trees.

**Definition 4** (**Discovery**)**.** *Let* $\mathcal{D} : \mathcal{M}\_{p,d-1} \to \mathcal{P}(\mathcal{M}^+\_{p,d})$ *denote the mapping such that* $\mathcal{D}(t)$ *is the set of trees that can be discovered from tree* $t \in \mathcal{M}\_{p,d-1}$*. Elements of* $\mathcal{D}(t)$ *are in* $\mathcal{M}^+\_{p,d}$ *and are obtained from the tree t by updating its leaves according to the following rule: for a leaf l* = (*μ*) *(i.e., a depth-0 tree) in t, update it as:*

$$(\mu) \rightarrow (\mu, l\_{\text{new}}) \text{ where } l\_{\text{new}} \in \{ ((0)), ((0), (0)), \ldots, (\underbrace{(0), \ldots, (0)}\_{p}) \}. \tag{7}$$

**Definition 5** (**Successors after discovery**)**.** *Let* $\mathcal{SD} : \mathcal{M}^+\_{p,d} \to \mathcal{P}(\mathcal{M}^+\_{p,d})$ *be defined by:*

$$\mathcal{SD}((\mu, s)) = \bigsqcup\_{t\_s \in s} \mathcal{D}(t\_s) \tag{8}$$

*where* $\sqcup$ *refers to the disjoint union of sets.*

The set SD(*t*) contains all trees that are the possible results from a combination of surfer movement and the discovery of new leaves.

For a tree $t = (\mu, s) \in \mathcal{M}^+\_{p,d}$, let $a(t) \in \mathcal{M}\_{p,d}$ denote the tree after marking according to action $a \in A\_t$. Then, $s(a(t))$ is the set of subtrees of $a(t)$, that is, of *t* after marking action *a*. The transition probability of moving from a tree *t* to a tree *t'* under action *a* is:

$$P(t, a, t') = \begin{cases} \frac{1}{|s(t)|\,|\mathcal{D}(t\_b)|} & \text{if } t' \in \mathcal{D}(t\_b) \text{ and } t\_b \in s(a(t)) \\ 0 & \text{otherwise.} \end{cases} \tag{9}$$

#### *3.4. Immediate Cost*

The immediate cost of moving to tree *t'* by choosing action *a* while in tree *t* is:

$$c(t, a, t') = \begin{cases} 0 & \text{if } \mu(t') = 1 \\ 1 & \text{if } \mu(t') = 0. \end{cases}$$

Accordingly, the expected cost incurred when applying action *a* to tree *t* is:

$$c(t, a) = \sum\_{t' \in \mathcal{SD}(a(t))} P(t, a, t')\, c(t, a, t')$$

but a simpler expression is available by substituting the explicit values for the probability term and the cost term, as stated in the next lemma.

**Lemma 2.** *The expected cost can be written as:*

$$c(t, a) = 1 - \frac{1}{|s(t)|} \sum\_{t' \in s(a(t))} \mu(t'). \tag{10}$$

Given a budget or a specific family of policies, there are states in *S* that will never appear in the MDP. Thus, we shall focus only on the "usable states".

**Definition 6** (**Usable states**)**.** *The states in S that are attained through transitions given a specific value of budget k or a family of policies are called usable states. Denote this set of states as* U*.*

For example, the budget-dependent usable states for *k* = 1, *d* = 2, and *p* ≥ 2 do not include states where two nodes at depth 1 (if two exist) are both marked. We will come back to this in Section 6.

#### *3.5. Policies and Criterion*

We choose as evaluation criterion the expected average cost over an infinite horizon. The class of policies over which we optimize is, in the terminology of [15] (Section 2.1.4), that of History-dependent, Randomized strategies (HR). However, the classical theory allows us to focus on stationary strategies only. Some of our definitions are valid in the general class HR. In this context, $\gamma\_n(t)$ denotes the action prescribed by the policy at time step *n* for tree *t*.

#### **Definition 7** (**Sensible policies, greedy policies**)**.** *A policy γ* ∈ *HR is called:*


Sensible policies do not waste the marking budget on already marked nodes, unless they have no other possibility. Among them, greedy policies give priority to marking sons.

The following observation will be useful. Its proof is immediate by unrolling the marking/surfing/discovering cycle.

**Lemma 3.** *Consider a stationary policy γ such that for any t, γ*(*t*) *marks nodes only up to some depth dm. Let* M*<sup>γ</sup> be the Markov chain generated by this policy. Then, the usable states* U*, and in particular the recurrent classes of* M*γ, contain only trees with nodes marked up to depth dm* − 1*.*

**Remark 1.** *Given the rules of surfing (uniform choice) and of discovery (independence of different subtrees), it seems possible to further reduce the size of the state space by exploiting symmetries. For instance, in Figure 2, the fourth and fifth trees will have exactly the same properties. We chose not to exploit these symmetries, because this would lead to an extra complexity in the formalism. Furthermore, it would render the enumeration of state spaces more complex and, as a consequence, complicate the description of the process on tree shapes; see the following section.*

#### **4. The Markov Chain of Tree Shapes**

In this section, we temporarily forget the control part of the MDP and focus on the process of trees generated by the surfing/discovering mechanism. It turns out to contain two Markov chains, which we identify and analyze.

#### *4.1. Definition and Basic Properties*

An important feature of the MDP constructed in Section 3 is that the *shape* of the successive trees does not depend on the marking strategy. In order to formalize this, we first define the shape of trees.

**Definition 8** (**Shape of trees**)**.** *Consider the mapping* $\sigma\_{p,d} : \mathcal{M}\_{p,d} \to \mathcal{T}\_{p,d}$*, defined for all p and d recursively by:*

$$
\sigma\_{p,0}(t) = (0), \qquad \sigma\_{p,d}((\mu, (s\_1, \ldots, s\_m))) = (0, (\sigma\_{p,d-1}(s\_1), \ldots, \sigma\_{p,d-1}(s\_m))), \quad d \ge 1.
$$

*The tree σp*,*d*(*t*) *is called the shape of tree t.*
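Definition 8 translates directly into code: the shape of a marked tree is the same tree with every mark replaced by 0. The nested-tuple representation `(mark, children)` is our own choice for this sketch:

```python
# Shape of a marked tree (Definition 8): replace every mark by 0,
# preserving the tree structure.  Trees are nested tuples (mark, children).

def shape(tree):
    mark, sons = tree
    return (0, tuple(shape(s) for s in sons))

t = (1, ((0, ()), (1, ())))              # a marked tree of depth 1
assert shape(t) == (0, ((0, ()), (0, ())))
```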

We now state the aforementioned property of the shape of trees in the MDP. Observe that, by the definition of the succession of trees *t*, *ta*, *tb*, we have *σp*,*d*(*t*(*n*)) = *σp*,*d*(*ta*(*n*)) for all *n*.

**Proposition 1.** *In the MDP defined in Section 3:*

- *(i) the distribution of the sequence of shapes* $\sigma\_{p,d}(t(n))$ *does not depend on the actions of the controller;*
- *(ii) the sequences* $\sigma\_{p,d}(t(n)) = \sigma\_{p,d}(t\_a(n))$ *and* $\sigma\_{p,d-1}(t\_b(n))$ *are Markov chains.*

**Proof.** The proof of (*i*) follows from the fact that the transition probabilities in (9) do not depend on the action *a*. The Markov nature of the embedded sequences $\sigma\_{p,d}(t(n)) = \sigma\_{p,d}(t\_a(n))$ and $\sigma\_{p,d-1}(t\_b(n))$ is then clear, since the random moves of the surfer and the discoveries depend only on the shape of the current tree.

We proceed with the identification of the stationary distributions of the Markov chains featured in Proposition 1. We first introduce the family of candidate distributions and state their basic properties. We then prove the result about Markov chains.

**Definition 9.** *Let* $\pi\_{p,d} : \mathcal{T}\_{p,d} \to [0, 1]$ *be the sequence of functions defined recursively by:*

$$
\pi\_{p,0}(t) = 1\tag{11}
$$

$$
\pi\_{p,d}(0,(s\_1,\ldots,s\_m)) = \frac{1}{p} \prod\_{k=1}^m \pi\_{p,d-1}(s\_k) \qquad d \ge 1. \tag{12}
$$

**Lemma 4.** *The functions* $\pi\_{p,d}$ *introduced in Definition 9 have the following properties:*

- *(i) for every p and d,* $\pi\_{p,d}$ *is a probability distribution on* $\mathcal{T}\_{p,d}$*;*
- *(ii) for every* $t \in \mathcal{T}\_{p,d}$*,*

$$
\pi\_{p,d}(t) = \frac{1}{p^{\mathrm{inode}(t)}},
$$

*where* inode(*t*) *denotes the number of internal (non-leaf) nodes of t.*

The interpretation of Definition 9 and Lemma 4 (*ii*) is that the probability of a tree *t* is the probability that this tree is generated by a Galton–Watson process with branching probabilities uniform in {1, . . . , *p*}, stopped at generation *d*.
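This interpretation can be checked numerically: enumerating all shapes in $\mathcal{T}\_{p,d}$ and weighting each by $1/p^{\mathrm{inode}(t)}$ gives total probability 1, as Lemma 4 (*i*) asserts. The enumeration below, with shapes represented as nested tuples of children, is our own illustration:

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

# Check of Lemma 4: sum over T_{p,d} of 1 / p^inode(t) equals 1.
# A shape is a tuple of child shapes; the leaf shape is the empty tuple.

@lru_cache(maxsize=None)
def all_shapes(p, d):
    """All tree shapes of depth d with fanout at most p."""
    if d == 0:
        return ((),)                       # the single leaf shape
    subs = all_shapes(p, d - 1)
    return tuple(combo
                 for m in range(1, p + 1)
                 for combo in product(subs, repeat=m))

def inode(t):
    """Number of internal (non-leaf) nodes of a shape."""
    return 0 if t == () else 1 + sum(inode(s) for s in t)

def pi(p, t):
    return Fraction(1, p ** inode(t))      # Lemma 4 (ii)

p, d = 2, 2
assert len(all_shapes(p, d)) == 6          # the six trees of Figure 2
assert sum(pi(p, t) for t in all_shapes(p, d)) == 1
```

Exact `Fraction` arithmetic avoids any floating-point doubt in the check.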

**Proof.** The proof of (*i*) proceeds by induction on *d*. For *d* = 0, the property is trivial. Assume the property proven up to *d*. Then, conditioning on the number of sons of the root and using (12), we have:

$$\begin{split} \sum\_{t \in \mathcal{T}\_{p,d+1}} \pi\_{p,d+1}(t) &= \sum\_{m=1}^{p} \sum\_{s\_1, \ldots, s\_m \in \mathcal{T}\_{p,d}} \pi\_{p,d+1}((0, (s\_1, \ldots, s\_m))) = \sum\_{m=1}^{p} \sum\_{s\_1, \ldots, s\_m \in \mathcal{T}\_{p,d}} \frac{1}{p} \prod\_{j=1}^{m} \pi\_{p,d}(s\_j) \\ &= \sum\_{m=1}^{p} \frac{1}{p} \left( \sum\_{s \in \mathcal{T}\_{p,d}} \pi\_{p,d}(s) \right)^{m} = \sum\_{m=1}^{p} \frac{1}{p} = 1. \end{split}$$

We continue with the proof of (*ii*). Recursively, $\mathrm{inode}(t) = 1 + \sum\_{t' \in s(t)} \mathrm{inode}(t')$. The result then follows from the definition (12).

The following properties of the distribution *πp*,*<sup>d</sup>* will be useful. The proofs of the first two of them are straightforward and omitted.

**Lemma 5.** *Let t be a random tree in* T*p*,*<sup>d</sup> distributed according to πp*,*d. Then,* |*s*(*t*)| *is uniformly distributed in* {1, . . . , *p*}*.*

**Lemma 6.** *Let t be a random tree in* $\mathcal{T}\_{p,d}$ *distributed according to* $\pi\_{p,d}$*. Then, conditioned on the event* |*s*(*t*)| = *m, the subtrees* $s\_1, \ldots, s\_m$ *are independent, each distributed according to* $\pi\_{p,d-1}$*.*

**Proposition 2.** *Consider the Markov chains* $\mathcal{M} = \{\sigma\_{p,d}(t(n)); n \in \mathbb{N}\}$ *and* $\mathcal{M}\_b = \{\sigma\_{p,d-1}(t\_b(n)); n \in \mathbb{N}\}$*:*

- *(i) both chains are ergodic;*
- *(ii) the stationary distribution of* $\mathcal{M}$ *is* $\pi\_{p,d}$*;*
- *(iii) the stationary distribution of* $\mathcal{M}\_b$ *is* $\pi\_{p,d-1}$*.*

**Proof.** The property (*i*) is proven if we can show that both chains are irreducible and aperiodic. Irreducibility follows from the fact that there is a sequence of transitions with nonzero probability leading to, say, the tree that is a chain (all of its internal nodes have exactly one son; call it $c\_{p,d}$): if the discovery phase adds just one leaf to every leaf of $t\_b$ (this happens with positive probability), then after *d* steps, the tree is $c\_{p,d}$, whatever the random surfing moves. The tree $t\_b$ itself is then $c\_{p,d-1}$. Aperiodicity also follows from this construction, since the transition from $c\_{p,d}$ to $c\_{p,d}$ has nonzero probability.

In order to prove (*ii*), we check that the distribution $\pi\_{p,d}$ satisfies the equation $\pi = \pi P$. Since $\mathcal{M}$ is ergodic, this will be the unique solution. We first identify the set of trees that have a positive probability of transitioning to a given tree $t \in \mathcal{T}\_{p,d}$. To that end, we have to reverse the process of the transformation of one tree into another. Reversing the discovery phase, we are led to define $\mathrm{top}(t) \in \mathcal{T}\_{p,d-1}$ as the tree deduced from *t* by removing its leaves. Then, reversing the surfer movement, we conclude that *t'* can be transformed into *t* if and only if *t'* has top(*t*) as one of its subtrees. Let $\mathrm{Prec}(t) \subset \mathcal{T}\_{p,d}$ be this set. For any $t' \in \mathrm{Prec}(t)$, we have:

$$P(t',t) = \frac{\mathrm{card}\{\tau \in s(t') \mid \tau = \mathrm{top}(t)\}}{|s(t')|} \cdot \frac{1}{p^{\mathrm{leaves}(\mathrm{top}(t))}}\,.$$

Accordingly, it is convenient to partition the set Prec(*t*) into "blocks" of states as follows:

$$\begin{aligned} \mathrm{Prec}(t) &= \bigcup\_{m=1}^p \bigcup\_{n=1}^m \mathcal{P}(m, n) \\ \mathcal{P}(m, n) &:= \{ t' \in \mathcal{T}\_{p,d} : |s(t')| = m,\ s(t') \text{ contains } \mathrm{top}(t) \text{ exactly } n \text{ times} \}. \end{aligned}$$

Trees that have the same number of sons and the same number of occurrences of top(*t*) among their sons are grouped together. By construction, we have:

$$P(t',t) = \frac{n}{m} \cdot \frac{1}{p^{\mathrm{leaves}(\mathrm{top}(t))}} \qquad \forall t' \in \mathcal{P}(m,n).$$

This transition probability is therefore constant in the block P(*m*, *n*). Then, we can write:

$$\begin{split} \sum\_{t' \in \mathcal{T}\_{p,d}} \pi\_{p,d}(t')P(t',t) &= \sum\_{t' \in \mathrm{Prec}(t)} \pi\_{p,d}(t')P(t',t) \\ &= \sum\_{m=1}^{p} \sum\_{n=1}^{m} \sum\_{t' \in \mathcal{P}(m,n)} \pi\_{p,d}(t')\, \frac{n}{m}\, \frac{1}{p^{\mathrm{leaves}(\mathrm{top}(t))}} \\ &= \frac{1}{p^{\mathrm{leaves}(\mathrm{top}(t))}} \sum\_{m=1}^{p} \sum\_{n=1}^{m} \frac{n}{m} \sum\_{t' \in \mathcal{P}(m,n)} \pi\_{p,d}(t')\,. \end{split} \tag{13}$$

We evaluate the inner sum, which is the total probability of the block $\mathcal{P}(m, n)$ under the distribution $\pi\_{p,d}$. According to Lemma 6, the distribution of the subtrees of a tree *t'* with $|s(t')| = m$ is that of *m* independent trees in $\mathcal{T}\_{p,d-1}$. Therefore, the probability, conditioned on *m*, that exactly *n* subtrees of *t'* are equal to top(*t*) is the following binomial probability (resulting from picking the *n* locations for the trees top(*t*) among the *m* possibilities):

$$\binom{m}{n}\pi\_{p,d-1}(\text{top}(t))^n(1-\pi\_{p,d-1}(\text{top}(t)))^{m-n}\,. \tag{14}$$

We conclude that:

$$\sum\_{t' \in \mathcal{P}(m,n)} \pi\_{p,d}(t') = \frac{1}{p} \binom{m}{n} \pi\_{p,d-1}(\mathrm{top}(t))^n (1 - \pi\_{p,d-1}(\mathrm{top}(t)))^{m-n}\,. \tag{15}$$

Using this result, we can evaluate the product of the distribution *πp*,*<sup>d</sup>* and the matrix *P* through the following computation:

$$\begin{split} \sum\_{t' \in \mathcal{T}\_{p,d}} \pi\_{p,d}(t')P(t',t) &= \frac{1}{p^{\mathrm{leaves}(\mathrm{top}(t))}} \sum\_{m=1}^{p} \sum\_{n=1}^{m} \frac{n}{m}\, \frac{1}{p} \binom{m}{n} \pi\_{p,d-1}(\mathrm{top}(t))^{n} (1-\pi\_{p,d-1}(\mathrm{top}(t)))^{m-n} \\ &= \frac{1}{p^{1+\mathrm{leaves}(\mathrm{top}(t))}} \sum\_{m=1}^{p} \sum\_{n=1}^{m} \binom{m-1}{n-1} \pi\_{p,d-1}(\mathrm{top}(t))^{n} \left(1-\pi\_{p,d-1}(\mathrm{top}(t))\right)^{m-n} \end{split} \tag{16a}$$

$$= \frac{\pi\_{p,d-1}(\mathrm{top}(t))}{p^{1+\mathrm{leaves}(\mathrm{top}(t))}} \sum\_{m=1}^{p} \sum\_{n=0}^{m-1} \binom{m-1}{n} \pi\_{p,d-1}(\mathrm{top}(t))^{n} \left(1-\pi\_{p,d-1}(\mathrm{top}(t))\right)^{m-1-n}$$

$$= \frac{\pi\_{p,d-1}(\mathrm{top}(t))}{p^{1+\mathrm{leaves}(\mathrm{top}(t))}} \sum\_{m=1}^{p} 1 \tag{17a}$$

$$= \frac{\pi\_{p,d-1}(\mathrm{top}(t))}{p^{1+\mathrm{leaves}(\mathrm{top}(t))}} \cdot p = \frac{1}{p^{\mathrm{inode}(\mathrm{top}(t)) + \mathrm{leaves}(\mathrm{top}(t))}} = \frac{1}{p^{\mathrm{inode}(t)}} = \pi\_{p,d}(t). \tag{17b}$$

We used (13) and (15) to obtain (16a). The binomial expansion theorem was used to obtain (17a). Finally, in (17b), we note that inode(*t*) is the sum of inode(top(*t*)) and leaves(top(*t*)), which gives the desired result, $\pi\_{p,d}(t)$.

Finally, we prove (*iii*). We know that $\sigma(t\_b)$ results from $\sigma(t)$ through the random choice of a son of *t*. Invoking again Lemma 6: conditioned on the event $\{|s(t)| = m\}$, each subtree is equal to $\tau$ with probability $\pi\_{p,d-1}(\tau)$, so a uniformly chosen subtree is also distributed according to $\pi\_{p,d-1}$. Since this does not depend on *m*, the result remains true when the conditioning is removed.

#### *4.2. Application to Greedy Policies*

As an application of Lemma 5, we obtain an upper bound on the optimal cost, which also bounds the cost of any greedy policy. It is based on the following result.

**Lemma 7.** *Consider a random variable t* ∈ T*p*,*<sup>d</sup> distributed as πp*,*d. Define the random variable:*

$$C = \frac{[\,|s(t)| - k\,]^{+}}{|s(t)|}.$$

*Then, C* = 0 *with probability 1 (in particular,* $\mathbb{E}C = 0$*) if* $k \ge p$*, and:*

$$\mathbb{E}C = \frac{1}{p}\, \mathrm{H}\_{pk}, \qquad k \le p, \tag{18}$$

*where:*

$$\mathrm{H}\_{pk} := \sum\_{m=k+1}^{p} \frac{m-k}{m} = p - k - k(\mathrm{H}\_p - \mathrm{H}\_k), \tag{19}$$

*and* $\mathrm{H}\_n = \sum\_{m=1}^{n} 1/m$ *denotes the n-th harmonic number.*

**Proof.** According to Lemma 5, |*s*(*t*)| is uniformly distributed. Then:

$$\begin{split} \mathbb{E}\mathbf{C} &= \sum\_{m=1}^{p} \frac{1}{p} \frac{[m-k]^{+}}{m} = \frac{1}{p} \sum\_{m=k+1}^{p} \frac{m-k}{m} = \frac{1}{p} \operatorname{H}\_{pk} \\ &= \frac{1}{p} \sum\_{m=k+1}^{p} \left(1 - \frac{k}{m}\right) = \frac{1}{p} \Big(p - k - k \sum\_{m=k+1}^{p} \frac{1}{m}\Big) \\ &= \frac{1}{p} \Big(p - k - k(\mathbb{H}\_{p} - \mathbb{H}\_{k})\Big). \end{split}$$
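Identity (19) and the expected cost of Lemma 7 can be verified with exact rational arithmetic; the check below is our own illustration:

```python
from fractions import Fraction

# Exact check of identity (19) and of EC = H_pk / p (Lemma 7, k <= p),
# for all 1 <= k <= p up to a small bound.

def H(n):
    """n-th harmonic number."""
    return sum(Fraction(1, m) for m in range(1, n + 1))

def H_pk(p, k):
    return sum(Fraction(m - k, m) for m in range(k + 1, p + 1))

for p in range(1, 9):
    for k in range(1, p + 1):
        assert H_pk(p, k) == p - k - k * (H(p) - H(k))   # identity (19)
        EC = H_pk(p, k) / p                              # Lemma 7, k <= p
        assert 0 <= EC < 1
```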

We now state the bound announced.

**Proposition 3.** *The optimal expected cost g*∗ *of the MDP described in Section 3 satisfies:*

$$g^\* \le 1 - \frac{k}{p} \left( 1 + \mathrm{H}\_p - \mathrm{H}\_k \right). \tag{20}$$

*The cost of any greedy policy satisfies the same bound.*

**Proof.** Consider the policy that marks *k* sons of the current tree, or all of the sons if there are fewer than *k*. This is a greedy policy in the sense of Definition 7. The average cost of this policy is precisely given by $\mathbb{E}C$ as in Lemma 7, which is equal to the right-hand side of (20). Any greedy policy marks at least the same nodes, so its average cost is no larger. The optimal cost of the MDP is, in turn, no larger than the cost of this specific policy.

#### *4.3. Metrics of Tree Shapes*

We now apply the results of Section 4.1 to evaluate several simple metrics of the tree shapes generated by the MDP. This may be useful, in particular, for estimating the amount of memory needed in a simulation of this process.

#### 4.3.1. Average Number of Nodes

Let $N\_{p,d} = \mathbb{E}(|t|)$ be the average size of a tree *t* distributed according to $\pi\_{p,d}$. According to Lemma 6, we can write, conditional on the event $\{|s(t)| = m\}$: $|t| = 1 + |s\_1| + \ldots + |s\_m|$, so that:

$$\mathbb{E}(|t| \mid |s(t)| = m) = 1 + \sum\_{j=1}^{m} \mathbb{E}(|s\_j|) \ = 1 + m \, N\_{p,d-1}$$

$$N\_{p,d} = \mathbb{E}(|t|) = 1 + \mathbb{E}(|s(t)|) N\_{p,d-1} \ = 1 + \frac{p+1}{2} N\_{p,d-1} \,. \tag{21}$$

Since the initial condition is *Np*,0 = 1, the recurrence in (21) has the solution:

$$N\_{p,d} = \frac{2}{p-1} \left( \left( \frac{p+1}{2} \right)^{d+1} - 1 \right), \quad p \ge 2,\tag{22}$$

with *N*1,*<sup>d</sup>* = *d* + 1. We have proven the following result.

**Lemma 8.** *The average number of nodes in a tree t* ∈ T*p*,*<sup>d</sup> distributed according to πp*,*<sup>d</sup> is given by* (22)*.*
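A quick exact check of the closed form (22) against the recurrence (21), offered as an illustration:

```python
from fractions import Fraction

# Exact check of (22) against the recurrence (21) for the average
# number of nodes N_{p,d}.

def N_rec(p, d):
    n = Fraction(1)                        # N_{p,0} = 1
    for _ in range(d):
        n = 1 + Fraction(p + 1, 2) * n     # recurrence (21)
    return n

def N_closed(p, d):
    if p == 1:
        return Fraction(d + 1)             # N_{1,d} = d + 1
    q = Fraction(p + 1, 2)
    return Fraction(2, p - 1) * (q ** (d + 1) - 1)   # closed form (22)

for p in range(1, 7):
    for d in range(0, 7):
        assert N_rec(p, d) == N_closed(p, d)
```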

#### 4.3.2. Average Number of Leaves

Let $L\_{p,d}$ be the average number of leaves in trees *t* distributed according to $\pi\_{p,d}$. The reasoning of Section 4.3.1 can be reproduced: according to Lemma 6, we can write, conditional on the event $\{|s(t)| = m\}$: $\mathrm{leaves}(t) = \mathrm{leaves}(s\_1) + \ldots + \mathrm{leaves}(s\_m)$. It follows that:

$$L\_{p,d} = \mathbb{E}(|s(t)|) \, L\_{p,d-1} = \frac{p+1}{2} \, L\_{p,d-1}.$$

Since *Lp*,0 = 1, we have the following result.

**Lemma 9.** *The average number of leaves in a tree* $t \in \mathcal{T}\_{p,d}$ *distributed according to* $\pi\_{p,d}$ *is:*

$$L\_{p,d} = \left(\frac{p+1}{2}\right)^d.$$

Note that the maximal number of leaves of a tree in $\mathcal{T}\_{p,d}$ is $p^d$.

4.3.3. Average Number of Nodes Created

Let $C\_{p,d}$ be the average number of nodes created at a given time step in the Markov chain $\{\sigma\_{p,d}(t(n)); n \in \mathbb{N}\}$ in its stationary regime. It is equal to the expected number of nodes created from some tree $t \in \mathcal{T}\_{p,d}$ distributed according to $\pi\_{p,d}$. This number is itself equal to the expected number of leaves in *t*, since *t* results from the addition of leaves to some tree *t'* and is itself distributed according to $\pi\_{p,d}$. We therefore have, with Lemma 9:

$$C\_{p,d} = \left(\frac{p+1}{2}\right)^d.$$

4.3.4. Average Number of Nodes Deleted

Let $D\_{p,d}$ be the average number of nodes deleted at a given time step in the Markov chain $\{\sigma\_{p,d}(t(n)); n \in \mathbb{N}\}$ in its stationary regime. By stationarity, it is expected that $D\_{p,d} = C\_{p,d}$. We verify this with a direct computation.

Consider a tree $t \in \mathcal{T}\_{p,d}$. Conditioned on the event $\{|s(t)| = m\}$ and on the surfer moving to the *j*-th son, the number of nodes deleted is $1 + \sum\_{\ell \neq j} |s\_\ell|$ (the root and the other subtrees). Using Lemma 5 and the fact that each $s\_\ell$ is distributed according to $\pi\_{p,d-1}$ (Lemma 6), we then have:

$$\begin{split} D\_{p,d} &= \sum\_{m=1}^{p} \frac{1}{p} \sum\_{j=1}^{m} \frac{1}{m} \left( 1 + (m-1)N\_{p,d-1} \right) = \frac{1}{p} \sum\_{m=1}^{p} \left( 1 + (m-1)N\_{p,d-1} \right) \\ &= 1 + \frac{p(p-1)}{2p}\, N\_{p,d-1} = 1 + \frac{p-1}{2}\, N\_{p,d-1} = \left( \frac{p+1}{2} \right)^{d}, \end{split}$$

where the last equality results from (22). This is equal to the average number of nodes created *Cp*,*d*, as expected.
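The balance $D\_{p,d} = C\_{p,d}$ can be confirmed with exact arithmetic; the sketch below re-implements the direct sum above:

```python
from fractions import Fraction

# Exact confirmation that nodes deleted balance nodes created:
# D_{p,d}, computed by the direct sum, equals C_{p,d} = ((p+1)/2)^d.

def N(p, d):
    n = Fraction(1)
    for _ in range(d):
        n = 1 + Fraction(p + 1, 2) * n     # recurrence (21)
    return n

def D(p, d):
    return sum(Fraction(1, p) * (1 + (m - 1) * N(p, d - 1))
               for m in range(1, p + 1))

for p in range(1, 7):
    for d in range(1, 7):
        assert D(p, d) == Fraction(p + 1, 2) ** d    # equals C_{p,d}
```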

#### **5. Trees of Depth** *d* **= 1 with an Arbitrary Marking Budget**

In this section, we consider trees of depth 1, and we prove the following result.

**Theorem 1.** *When d* = 1*, any greedy policy is optimal.*

It is quite clear intuitively that, indeed, no reasonable alternative exists. The exercise here is to check that the theory does provide a way to prove the result formally. In the process, we identify arguments that will be useful for the proofs of stronger results.

For the purpose of the forthcoming proof, we rename the elements of $\mathcal{M}^+\_{p,1}$ as $\mathcal{M}^+\_{p,1} = \{t\_{\mu,j} : \mu \in \{0, 1\},\ 1 \le j \le p\}$. In the notation of Section 2, we have $t\_{\mu,j} = (\mu; \underbrace{(0), \ldots, (0)}\_{j \text{ times}})$. Furthermore, observe that the trees $t\_b$ belong to $\mathcal{M}\_{p,0} = \{0, 1\}$: these are trees reduced to a root with a mark.

**Proof.** We shall prove the result using Theorem A1. Define the constant *g* and the function $f : \mathcal{T}\_{p,1} \to \mathbb{R}$ as:

$$g = \frac{\mathrm{H}\_{pk}}{p} \tag{23}$$

$$f(t\_{\mu,j}) = \frac{(j-k)^+}{j}, \qquad \mu \in \{0, 1\}. \tag{24}$$

The symbol $\mathrm{H}\_{pk}$ was defined in (19). We shall check that this pair (*g*, *f*) satisfies the optimality Equation (A1). For every state $s = t\_{\mu,j}$ and every action *a*, we write the quantity to be minimized in the right-hand side of this equation:

$$\begin{split} Q(s,a) &:= c(s,a) + \sum\_{s' \in \mathcal{M}^+\_{p,1}} P(s,a,s') f(s') \\ &= c(t\_{\mu,j}, a) + \sum\_{s' \in \mathcal{M}^+\_{p,1}} P(t\_{\mu,j}, a, s') f(s') \\ &= c(t\_{\mu,j}, a) + \sum\_{\mu' \in \{0,1\}} P(t\_{\mu,j}, a, \mu') \sum\_{j'=1}^{p} \frac{1}{p} f(t\_{\mu',j'}). \end{split} \tag{25}$$

We obtained (25) by conditioning the transition $s \to s'$ on the value of the tree $t\_b$. The new notation $P(t, a, t\_b)$ stands for the probability of moving from *t* to $t\_b$ when action *a* is applied. Given the definition of *f*(·) in (24), we further have:

$$\begin{split} Q(s,a) &= c(t\_{\mu,j}, a) + \sum\_{\mu' \in \{0,1\}} P(t\_{\mu,j}, a, \mu') \left( \sum\_{j'=1}^{p} \frac{1}{p} \frac{(j'-k)^{+}}{j'} \right) \\ &= c(t\_{\mu,j}, a) + \sum\_{\mu' \in \{0,1\}} P(t\_{\mu,j}, a, \mu')\, \frac{\mathrm{H}\_{pk}}{p} \\ &= c(t\_{\mu,j}, a) + g. \end{split} \tag{26}$$

The actions *a* can be grouped according to the number $\ell$ of sons they mark in the tree $t\_a$: this number ranges from 0 to min{*j*, *k*}. When $t\_a$ has $\ell$ sons marked, this determines the cost as:

$$c(t\_{\mu,j}, a) = \frac{j-\ell}{j}.$$

Finally, the minimization with respect to *a* amounts to the following minimization with respect to $\ell$:

$$\min\_{a} \left\{ c(s, a) + \sum\_{s' \in \mathcal{M}\_{p,1}^{+}} P(s, a, s') f(s') \right\} = \min\_{0 \le \ell \le j \wedge k} \left\{ \frac{j - \ell}{j} + g \right\} = \frac{(j - k)^{+}}{j} + g = f(s) + g. \tag{27}$$

The constant *g* and the function *f* therefore solve Equation (A1). This function is bounded since the state space is finite. Therefore, there exists an optimal policy *γ*∗ with cost *g*. Clearly, this policy consists of marking up to *k* sons of any tree: a greedy policy in the sense of Definition 7.

From the proof of Theorem 1 and also from Lemma 7, we have the corollary:

**Corollary 1.** *The average value of any tree of depth 1 is* H*pk*/*p.*

**Remark 2.** *The fact that, in the present case,* $f(s) = \min\_a c(s, a)$ *is a consequence of the fact that the cost of the future tree s' resulting from the transition is actually independent of the action a.*

**Remark 3.** *It was proven in [16] that the finite-horizon, total-cost optimal-value function is given by:*

$$W\_N^\*(t\_{\mu,j}) = \frac{(j-k)^+}{j} + \frac{N-1}{p}\, \mathrm{H}\_{pk},$$

*and is realized by any greedy policy. From this result, we obtained the value* $g = \lim\_{N\to+\infty} W^\*\_N(t\_{\mu,j})/N$ *and the form of the function f, which must satisfy* $f(s) - f(s') = \lim\_{N\to+\infty}(W^\*\_N(s) - W^\*\_N(s'))$*.*
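The finite-horizon value function can be reproduced by a small value-iteration sketch. This is our own reconstruction of the depth-1 dynamics, not code from [16]: a state reduces to the fanout *j*, marking ℓ ≤ min(*j*, *k*) sons costs (*j* − ℓ)/*j* in expectation, and the next fanout is uniform on {1, …, *p*}.

```python
from fractions import Fraction

# Value iteration for the depth-1 MDP, checked against the closed form
# of Remark 3.  The state is the fanout j (the root mark is irrelevant).

def H_pk(p, k):
    return sum(Fraction(m - k, m) for m in range(k + 1, p + 1))

def value_iteration(p, k, N):
    """W*_N(j): optimal expected total cost over N steps."""
    W = {j: Fraction(0) for j in range(1, p + 1)}
    for _ in range(N):
        # expected future value: the next fanout is uniform on {1, ..., p}
        future = sum(Fraction(1, p) * W[jp] for jp in range(1, p + 1))
        W = {j: min(Fraction(j - ell, j) + future
                    for ell in range(0, min(j, k) + 1))
             for j in range(1, p + 1)}
    return W

p, k, N = 4, 2, 5
W = value_iteration(p, k, N)
for j in range(1, p + 1):
    # Remark 3: W*_N = (j - k)^+ / j + (N - 1) H_pk / p
    assert W[j] == Fraction(max(j - k, 0), j) + (N - 1) * H_pk(p, k) / p
```

The minimizer is always ℓ = min(*j*, *k*), i.e., the greedy action, in agreement with Theorem 1.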

#### **6. Trees of Depth** *d* **= 2 with Marking Budget** *k* **= 1**

In this section, we consider trees of depth 2, and we prove the following result.

**Proposition 4.** *When k* = 1 *and d* = 2*, any greedy policy is optimal.*

We begin with some notation and preliminary results. Then, we provide the proof.

#### *6.1. Preliminaries*

As a preliminary, observe that, as a consequence of the marking/moving/discovery cycle, the subtrees of depth 1 that appear have at most one leaf marked. Indeed, at the beginning of the cycle, trees $t \in \mathcal{M}^+\_{p,2}$ have all leaves unmarked. The marking with budget *k* = 1 marks at most one of these leaves. Then, the surfer moves to one subtree $t\_b \in \mathcal{M}\_{p,1}$, which inherits this property. The discovery phase merely adds unmarked leaves at depth 2. With this observation, we can restrict our attention to the usable set $\mathcal{U}$ of trees with at most one leaf marked, since only those can appear recurrently when some stationary policy is applied.

A second preparation is to calculate the average cost under some greedy policy *γ*, which therefore marks one node at depth one in any tree *t*. The choice of this node does not matter. According to Lemma 3, the Markov chain M*<sup>γ</sup>* generated by this policy has recurrent states with marks only at depth 0, that is at the root. Therefore, the cost (10) is always given by *c*(*t*, *γ*(*t*)) = 1 − 1/|*s*(*t*)|. It is then of the form assumed in Lemma 7, and the application of (18) yields the expected cost for policy *γ*:

$$J\_{\gamma} = \frac{1}{p} \left( p - 1 - \left( \mathrm{H}\_{p} - \mathrm{H}\_{1} \right) \right) = 1 - \frac{\mathrm{H}\_{p}}{p}\,. \tag{28}$$

#### *6.2. Notation and Terminology*

When *d* = 2, the trees of interest are simpler than in the general case, and it is convenient to devise an appropriate notation. For trees in $\mathcal{M}^+\_{p,2}$, all subtrees have depth one and unmarked leaves. We shall adopt a simplified notation for such subtrees: (*μ*; *m*) denotes a depth-one tree with root marked *μ* and *m* unmarked leaves. A typical tree of $\mathcal{M}^+\_{p,2}$ is then denoted by $t = (\mu; (\mu\_1, j\_1), \ldots, (\mu\_m, j\_m))$ for some $m \in [1..p]$.

After marking, the subtrees of depth one will have at most one leaf marked: then (*μ*; *m*+) will denote this tree with one marked leaf. Which leaf exactly is marked does not make a difference in the following reasoning.

In the analysis, the number of unmarked sons of a tree is a key criterion. Accordingly, we introduce the following typology for $t = (\mu; (\mu\_1, j\_1), \ldots, (\mu\_m, j\_m)) \in \mathcal{M}^+\_{p,2}$:

$$\text{Type 1: } \sum\_{r=1}^{m} \mu\_r \le m - 1, \qquad \text{Type 2: } \sum\_{r=1}^{m} \mu\_r = m.$$

#### *6.3. Proof*

The proof uses the optimality equations and Theorem A1, as in Section 5. The "*g*" value needed for this was computed as $J\_\gamma$ in (28). The next step is to evaluate the "*f*" function in the optimality Equation (A1) in Theorem A1. It is sufficient to provide a value to the states that are usable in the sense of Definition 6. For the other states, the value of *f* is defined by (A1), since the right-hand side only contains values of reachable states.

The function *f* that is proposed is the following:

$$f(\mu; (\mu\_1, j\_1), \ldots, (\mu\_m, j\_m)) = \begin{cases} -\frac{1}{m} \left( 1 + \sum\_{r=1}^m \mu\_r + \sum\_{r=1}^m \frac{1}{j\_r} \right) & \text{if } \sum\_{r=1}^m \mu\_r \le m - 1 \quad \text{(Type 1)} \\ -1 - \frac{1}{m} \left( 2 \sum\_{r=1}^m \frac{1}{j\_r} + SL\_1(t)\, \frac{\mathrm{H}\_p - p}{p - 1} \right) & \text{if } \sum\_{r=1}^m \mu\_r = m \quad \text{(Type 2)}. \end{cases} \tag{29}$$

In this last line, *SL*1(*t*) = |{*r*|*jr* = 1}| is the number of subtrees of *t* that have exactly one leaf.

**Proof of Proposition 4.** We apply Theorem A1 by checking that the function *f* in (29) and the constant $g = 1 - \mathrm{H}\_p/p$ satisfy the optimality equations. To that end, we first evaluate the expected value of trees $t\_b$ in $\mathcal{M}\_{p,1}$. Denote by $P\_{\text{dis}} : \mathcal{M}\_{p,1} \to \mathcal{M}^+\_{p,2}$ the transition probability from a tree $t\_b$ to a "discovered" tree *t'*. We can write:

$$\tilde{f}(t\_b) := \sum\_{t' \in \mathcal{M}^+\_{p,2}} P\_{\text{dis}}(t\_b, t') f(t').$$

We have, for any *μ* ∈ {0, 1}:

$$\begin{split} \tilde f(\mu; j) &= \sum\_{\ell\_1=1}^{p} \cdots \sum\_{\ell\_j=1}^{p} \frac{1}{p^j}\, f(\mu; (0,\ell\_1), \ldots, (0,\ell\_j)) \\ &= -\frac{1}{j} \sum\_{\ell\_1=1}^{p} \cdots \sum\_{\ell\_j=1}^{p} \frac{1}{p^j} \left( 1 + \sum\_{r=1}^{j} \frac{1}{\ell\_r} \right) \\ &= -\frac{1}{j} \left( 1 + \sum\_{r=1}^{j} \sum\_{\ell\_1=1}^{p} \cdots \sum\_{\ell\_j=1}^{p} \frac{1}{p^j} \frac{1}{\ell\_r} \right) \\ &= -\frac{1}{j} \left( 1 + \sum\_{r=1}^{j} \frac{p^{j-1}}{p^j} \sum\_{\ell\_r=1}^{p} \frac{1}{\ell\_r} \right) = -\frac{1}{j} \left( 1 + \sum\_{r=1}^{j} \frac{\mathrm{H}\_p}{p} \right) \\ &= -\left( \frac{1}{j} + \frac{\mathrm{H}\_p}{p} \right) \end{split} \tag{30}$$

$$[j \ge 2] \qquad \begin{split} \tilde f(\mu; j^+) &= \sum\_{\ell\_1=1}^{p} \cdots \sum\_{\ell\_j=1}^{p} \frac{1}{p^j}\, f(\mu; (0,\ell\_1), \ldots, (1,\ell\_i), \ldots, (0,\ell\_j)) \\ &= -\frac{1}{j} \sum\_{\ell\_1=1}^{p} \cdots \sum\_{\ell\_j=1}^{p} \frac{1}{p^j} \left( 2 + \sum\_{r=1}^{j} \frac{1}{\ell\_r} \right) \\ &= -\left( \frac{2}{j} + \frac{\mathrm{H}\_p}{p} \right) \end{split} \tag{31}$$

$$\begin{split} \tilde f(\mu; 1^+) &= \sum\_{\ell=1}^{p} \frac{1}{p}\, f(\mu; (1,\ell)) \\ &= -\frac{1}{p} \left( p + 2\mathrm{H}\_p + \frac{\mathrm{H}\_p - p}{p - 1} \right) \\ &= -\left( 1 + \frac{2\mathrm{H}\_p}{p} + \frac{\mathrm{H}\_p - p}{p(p-1)} \right). \end{split} \tag{32}$$

From these formulas, the following identities are obtained: for any $\mu$ and $\mu'$,

$$\begin{aligned} \tilde{f}(\mu; j) - \tilde{f}(\mu'; j^+) &= \frac{1}{j}, & j \ge 2, \\ \tilde{f}(\mu; 1) - \tilde{f}(\mu'; 1^+) &= \frac{\mathbb{H}_p}{p} + \frac{\mathbb{H}_p - p}{p(p - 1)} = \frac{\mathbb{H}_p - 1}{p - 1}. \end{aligned}$$

The following result then immediately follows:

**Lemma 10.** *For all $j \in \{1, \dots, p\}$, $\mu, \mu' \in \{0,1\}$ and all $p \ge 2$, $0 \le \tilde{f}(\mu; j) - \tilde{f}(\mu'; j^+) \le 1/2$.*
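Lemma 10 can be sanity-checked numerically from the two identities above, using the values $1/j$ (for $j \ge 2$) and $(\mathbb{H}_p-1)/(p-1)$ (for $j = 1$). A small Python sketch, not part of the paper:

```python
# Check 0 <= tilde f(mu;j) - tilde f(mu';j^+) <= 1/2 over a range of p and j,
# using the closed forms derived just above.
from fractions import Fraction

def H(p):  # harmonic number H_p
    return sum(Fraction(1, i) for i in range(1, p + 1))

def gap(p, j):
    # 1/j for j >= 2; (H_p - 1)/(p - 1) for j = 1
    return Fraction(1, j) if j >= 2 else (H(p) - 1) / (p - 1)

for p in range(2, 30):
    for j in range(1, p + 1):
        assert 0 <= gap(p, j) <= Fraction(1, 2)
```

The bound $1/2$ is attained at $p = 2$, $j = 1$ (and $j = 2$).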

We now proceed with checking that $f$ and $g$ solve the optimality equations. We start with Type 1 trees. Let $t = (\mu; (\mu_1, j_1), \dots, (\mu_m, j_m))$ with $\sum_{r=1}^m \mu_r < m$. The alternative actions are: ($a$) mark the root or any son already marked; ($b_k$) mark an unmarked son $k$; ($c_k$) mark a leaf of subtree $(\mu_k, j_k)$. For actions ($b_k$), we denote by $\mu'_i$ the marks of the sons after marking: $\mu'_i = \mu_i$ for $i \neq k$ and $\mu'_k = \mu_k + 1$. Clearly, $\sum_{r=1}^m \mu'_r = 1 + \sum_{r=1}^m \mu_r$. For actions ($c_k$), which leaf is marked does not matter, so we ignore this information.

The right-hand side of the optimality equation in the cases (*a*), (*bk*), and (*ck*) are respectively:

$$\begin{split} Q(t, a) &= \frac{m - \sum_{r=1}^m \mu_r}{m} + \sum_{r=1}^m \frac{1}{m} \tilde{f}(\mu_r; j_r) \\ Q(t, b_k) &= \frac{m - \sum_{r=1}^m \mu_r - 1}{m} + \sum_{r=1}^m \frac{1}{m} \tilde{f}(\mu'_r; j_r) \\ &= \frac{m - \sum_{r=1}^m \mu_r - 1}{m} - \sum_{r=1}^m \frac{1}{m} \left(\frac{1}{j_r} + \frac{\mathbb{H}_p}{p}\right) \\ &= 1 - \frac{\mathbb{H}_p}{p} - \frac{\sum_{r=1}^m \mu_r + 1}{m} - \frac{1}{m} \sum_{r=1}^m \frac{1}{j_r} = g + f(t) \\ Q(t, c_k) &= \frac{m - \sum_{r=1}^m \mu_r}{m} + \frac{1}{m} \sum_{r=1,\, r \neq k}^m \tilde{f}(\mu_r; j_r) + \frac{1}{m} \tilde{f}(\mu_k; j_k^+) \end{split} \tag{33}$$

Then:

$$\begin{aligned} Q(t, c_k) - Q(t, a) &= \frac{1}{m} \left( \tilde{f}(\mu_k; j_k^+) - \tilde{f}(\mu_k; j_k) \right) \\ Q(t, b_\ell) - Q(t, c_k) &= -\frac{1}{m} + \frac{1}{m} \left( \tilde{f}(\mu'_k; j_k) - \tilde{f}(\mu_k; j_k^+) \right). \end{aligned}$$

Both differences are nonpositive according to Lemma 10. This implies that action ($b_k$) dominates all actions ($c_k$), which in turn dominate action ($a$). Action ($b_k$) therefore realizes the minimum in the right-hand side of the optimality equation. With (33), the right-hand side and the left-hand side coincide.

Next, we consider Type 2 trees. According to the preliminary remark, we can focus our attention on trees with at most one son marked: if such a tree is of Type 2 (all sons marked), then it has only one son. Let then $t = (\mu; (1, j))$, $j \in [1..p]$, be such a tree. The alternative actions are: ($a$) mark the root or the son; ($b$) mark a leaf of the subtree $(1, j)$. The right-hand sides of the optimality equation are, respectively, $Q(t, a) = \tilde{f}(1; j)$ and $Q(t, b) = \tilde{f}(1; j^+)$. From Lemma 10, we know that action ($b$) dominates action ($a$). Further, from Definitions (29) and (32),

$$\begin{split} f(\mu; (1, j)) + g - Q(t, b) &= f(\mu; (1, j)) + 1 - \frac{\mathbb{H}_p}{p} - \tilde{f}(1; j^+) \\ &= -1 - \left( 2 + \frac{\mathbb{H}_p - p}{p - 1} \right) + 1 - \frac{\mathbb{H}_p}{p} + \left( 1 + 2 \frac{\mathbb{H}_p}{p} + \frac{\mathbb{H}_p - p}{p(p - 1)} \right) \\ &= -1 + \frac{\mathbb{H}_p}{p} + \frac{\mathbb{H}_p - p}{p - 1} \left( \frac{1}{p} - 1 \right) = -1 + \frac{\mathbb{H}_p}{p} - \frac{\mathbb{H}_p - p}{p} = 0. \end{split}$$
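The chain of equalities above can be verified with exact rational arithmetic. A Python sketch (not part of the paper; the three parenthesized terms are taken verbatim from the display above):

```python
# Check that f(mu;(1,j)) + g - Q(t,b) = 0 for a range of p, exactly.
from fractions import Fraction

def H(p):  # harmonic number H_p
    return sum(Fraction(1, i) for i in range(1, p + 1))

def residual(p):
    Hp = H(p)
    f_t = -1 - (2 + (Hp - p) / (p - 1))                   # f(mu;(1,j))
    g = 1 - Hp / p                                        # the gain g
    Q_tb = -(1 + 2 * Hp / p + (Hp - p) / (p * (p - 1)))   # tilde f(1;1^+), (32)
    return f_t + g - Q_tb

assert all(residual(p) == 0 for p in range(2, 60))
```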

**Remark 4.** *It was proven in [16] that, for any tree $t = (\mu; (s_1, \dots, s_m))$ in the usable set, namely the set of trees with no sons marked, the finite-horizon total-cost optimal-value function is given by:*

$$W_n^*(t) = n - \frac{1}{m} - \frac{n-2}{p}\, \mathbb{H}_p - \frac{1}{m} \sum_{r=1}^m \frac{1}{|s(s_r)|} \quad \text{for} \quad 2 \le n \le N, \tag{34}$$

*and that this cost is realized by any greedy policy. From this result, the average cost of this policy in the infinite horizon is then:*

$$\lim_{N \to \infty} \frac{W_N^*(t)}{N} = \lim_{N \to \infty} \frac{1}{N} \left( N - \frac{1}{m} - \frac{N-2}{p}\, \mathbb{H}_p - \frac{1}{m} \sum_{r=1}^m \frac{1}{|s(s_r)|} \right) = 1 - \frac{\mathbb{H}_p}{p}.$$

*This matches with* (28)*. Furthermore, it is compatible with the form of the function f in* (29)*. The interpretation of f*(*t*) − *f*(*t* ) *is the difference in the total expected cost when starting from trees t or t . According to* (34)*, the tree-dependent cost for trees with unmarked sons would be:*

$$f(t) = -\frac{1}{|s(t)|} \left( 1 + \sum_{r=1}^{|s(t)|} \frac{1}{|s(s_r)|} \right).$$

*This is indeed the value in* (29) *since* $\sum_r \mu_r = 0$.
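The limiting average cost computed in Remark 4 can be checked numerically. A Python sketch, with illustrative values of $m$ and of the subtree leaf counts $|s(s_r)|$ (any fixed tree gives the same limit):

```python
# Check that W_N^*(t)/N from (34) tends to 1 - H_p/p as N grows.
def H(p):  # harmonic number H_p
    return sum(1.0 / i for i in range(1, p + 1))

p, m = 5, 3
leaf_counts = [2, 4, 5]  # |s(s_r)| for the m subtrees (assumed values)

def W(N):
    return (N - 1.0 / m - (N - 2) / p * H(p)
            - sum(1.0 / c for c in leaf_counts) / m)

limit = 1 - H(p) / p
assert abs(W(10**7) / 10**7 - limit) < 1e-5
```

The tree-dependent terms are $O(1)$, so they vanish at rate $1/N$ in the average.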

#### **7. Trees of Depth** *d* **= 2 with Marking Budget** *k* **= 2**

This section is devoted to the case where the marking budget is *k* = 2. In this case, we do not present general results, but we focus on the case of trees with depth 2. For small values of *p*, we describe the optimal policy, and we conjecture that this policy is optimal for general values of *p*.

We begin with some additional notation, then we introduce the definitions of the policies of interest. This allows us to formulate Conjecture 1. We then present numerical experiments made with small values of *p* supporting this conjecture.

#### *7.1. Preliminary*

Similarly as in Section 6, we argue that usable states are necessarily such that the number of marked sons $\sum_{r=1}^{m} \mu_r$ takes values in $\{0, 1, 2\}$. Indeed, the sons of a tree are the leaves of a tree at the previous time step, and at most two leaves can be marked at any step.

#### *7.2. Notation and Terminology*

We first recall the representation of depth-two trees of $\mathcal{M}^+_{p,2}$ from Section 6.2. Such a tree can be represented as $(\mu, (\mu_1, j_1), \dots, (\mu_m, j_m))$, where $m \in [1, p]$, $\mu, \mu_r \in \{0, 1\}$, and $1 \le j_r \le p$ for all $r \in [1, m]$. Here, $\mu$ and $\mu_r$ are the markings of the root and the sons, $m$ is the number of sons, and $j_r$ is the number of leaves of the depth-one subtree $(\mu_r, j_r)$. Based on the number of marked sons of a tree, we classify trees into two types:

$$\text{Type 1: } \sum\_{r=1}^{m} \mu\_r \le m - 2, \qquad \text{Type 2: } \sum\_{r=1}^{m} \mu\_r \ge m - 1.$$

Type 2 trees are further classified into three subtypes (remember that ∑*<sup>r</sup> μ<sup>r</sup>* cannot exceed 2):

$$\text{Type 2a: } \sum\_{r=1}^{m} \mu\_r = m - 1, \quad \text{Type 2b: } m = 2, \mu\_1 = \mu\_2 = 1, \quad \text{Type 2c: } m = 1, \mu\_1 = 1.$$

We also introduce some shorthand notation for the different possible actions. Let $a(d1, d1)$ represent the action of marking two sons ($d1$ as in "depth 1"), $a(d1, l_{j_c})$ represent the action of marking a son and a leaf of the subtree $(\mu_c, j_c)$ ($l_j$ as in "leaf $j$"), and $a(l_{j_{c_1}}, l_{j_{c_2}})$ represent the action of marking a leaf in each of the subtrees $(\mu_{c_1}, j_{c_1})$ and $(\mu_{c_2}, j_{c_2})$. If $c_1 = c_2$, two leaves are marked in this subtree.

#### *7.3. Policies*

All policies of interest in our study are greedy in the sense of Definition 7. These policies do not specify what happens when some marking budget is left after marking all unmarked sons of a tree. We therefore specify the following variants of the greedy policy by their precise behavior in this situation. We begin with four simple rules; three of them rely on an order defined on the subtrees. The terms "first" and "second" used in the specification are relative to this order:


The cost of policy "greedy depth 1" is known by Lemma 7:

$$J_{\text{greedy depth 1}} = 1 + \frac{1 - 2\mathbb{H}_p}{p}. \tag{35}$$
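This closed form agrees with the exact values quoted in Section 7.4 for $p = 3, 4, 5$, as a quick exact-arithmetic check (not from the paper) confirms:

```python
# Check formula (35) against the values 1/9, 5/24, 43/150 quoted in Section 7.4.
from fractions import Fraction

def H(p):  # harmonic number H_p
    return sum(Fraction(1, i) for i in range(1, p + 1))

def J_greedy_depth1(p):
    return 1 + (1 - 2 * H(p)) / p

assert [J_greedy_depth1(p) for p in (3, 4, 5)] == \
    [Fraction(1, 9), Fraction(5, 24), Fraction(43, 150)]
```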

Finally, we introduce the "greedy finite optimal" policy, a name that we use as a shorthand for "the policy that seems to emerge as optimal with the finite horizon criterion". Its behavior is specified in Table 2. It is explained in Appendix C how the features of this policy are extrapolated from the results obtained with the finite-horizon version of the MDP.

The behavior of the "greedy finite optimal" policy is obvious on Type 1 (at least two unmarked sons) and Type 2c (one son, which is marked) trees. On Type 2a (one unmarked son) and Type 2b (two sons, both marked) trees, it introduces a threshold of 3 or 4 on the size of the subtrees. When the tree is of Type 2a, one mark is left for marking subtrees. If all of them have a size of at most 3, the largest one is marked. On the other hand, if some of them have a size larger than 3, the smallest of these is marked. When the tree is of Type 2b, two marks are left for marking subtrees. If both subtrees have a size larger than 3 ($j_2 \ge j_1 > 3$), the smallest subtree is marked. Otherwise, both subtrees are marked.


**Table 2.** Specification of the greedy finite optimal policy.

A rationale for such a rule is as follows. When a tree has fewer than two unmarked sons, it can be made costless with the two marks of the budget. Therefore, when considering trees of Type 2a, a subtree that has fewer than two leaves has less priority than a subtree with more than three leaves, which is more "vulnerable". Among the vulnerable subtrees, it is better to mark the smallest ones, in order to reduce future expected costs. Trees of Type 2b have, so to speak, one round in advance, since all sons are already marked. Subtrees of a size less than 3 can be "protected" by devoting one mark to them: if the surfer moves to one of them, the budget of the next round will be used to complete the protection. If both subtrees are too large to be fully protected, the budget is again devoted to the smallest one.

We can now state the conjecture that is the focus of this section.

#### **Conjecture 1.** *When k* = 2 *and d* = 2*, the "greedy finite optimal" policy is optimal.*

#### *7.4. Numerical Experiments*

We provide support to Conjecture 1 with results for small *p*. We implemented the policy improvement algorithm [15] (Chapter 8.6), starting with a particular greedy policy. Prior to the implementation of the algorithm, we evaluated the average cost of the four variants of the greedy policy introduced in Section 7.3, for *p* = 3, 4, 5. Of course, the greedy policies realize a zero cost for *p* ≤ 2.
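For readers unfamiliar with the algorithm, the following Python sketch illustrates average-cost policy iteration in the spirit of [15] (Chapter 8.6), on a toy two-state MDP of our own devising, not the prefetching MDP itself. Policy evaluation is done with relative-value iterations (reference state 0); improvement minimizes the one-step cost plus expected bias:

```python
# Sketch of average-cost policy iteration for a unichain MDP.
# cost[s][a]: one-step cost; P[s][a][t]: transition probability s -> t under a.

def evaluate(policy, cost, P, iters=3000):
    """Relative-value evaluation of a fixed policy.
    Returns the estimated gain g and bias h (normalized so h[0] = 0)."""
    n = len(cost)
    h = [0.0] * n
    g = 0.0
    for _ in range(iters):
        q = [cost[s][policy[s]]
             + sum(P[s][policy[s]][t] * h[t] for t in range(n))
             for s in range(n)]
        g = q[0]                          # gain estimate at reference state 0
        h = [q[s] - q[0] for s in range(n)]
    return g, h

def policy_iteration(cost, P):
    n = len(cost)
    policy = [0] * n
    while True:
        g, h = evaluate(policy, cost, P)
        improved = [min(range(len(cost[s])),
                        key=lambda a: cost[s][a]
                        + sum(P[s][a][t] * h[t] for t in range(n)))
                    for s in range(n)]
        if improved == policy:
            return policy, g
        policy = improved

# Toy example: in state 0, action 0 stays (cost 1) and action 1 moves to
# state 1 (cost 2); state 1 has one free action, returning to 0 w.p. 0.1.
cost = [[1.0, 2.0], [0.0]]
P = [[[1.0, 0.0], [0.0, 1.0]], [[0.1, 0.9]]]
policy, g = policy_iteration(cost, P)
assert policy == [1, 0] and abs(g - 2 / 11) < 1e-9  # optimal gain 2/11
```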

The average costs of the five greedy policies for small values of $p$ are summarized in Table 3. The row concerning the "greedy depth 1" policy was evaluated using the exact formula (35), which gave 1/9, 5/24, and 43/150 for $p$ = 3, 4, and 5, respectively. Several observations can be made from the data in this table. First, there is a substantial gain in marking subtrees after all the sons have been marked: the gap between the performance of greedy depth 1 and the group of the other policies is larger than the gaps inside this group. Second, among the four simple policies introduced in Section 7.3, marking the largest subtree achieved the best performance for all values of $p$ tested, and picking the smallest subtree the worst. Finally, choosing the leftmost (or, for that matter, the rightmost or a random) subtree resulted in a performance between these extremes.

**Table 3.** Average cost of different greedy policies.


We used the greedy largest policy as a starting candidate policy for the policy iteration algorithm for *p* = 3, 4, 5, since it gave the best performance among the simple policies. In each case, the algorithm converged in a few iterations, and the resulting policy was

the greedy finite optimal policy. The performance of the policy iteration algorithm is summarized in Table 4. The execution time figures correspond to an implementation in Python 3.9 running on a 1.4 GHz quad-core processor with 8 GB memory and a 1536 MB graphic card.

**Table 4.** Policy iteration performance.


We could have also started the algorithm with the greedy finite optimal policy and checked that it solves the optimality equations. Selecting another policy was also a way of checking that our implementation of policy iteration works properly, as well as measuring "how far" from the optimum the greedy largest policy is.

Observe in Table 3 that the relative performance of the greedy finite optimal policy was more pronounced for *p* = 5 as compared to *p* = 4. The former MDP had a larger number of Type 2 trees. The greedy largest policy prescribed a suboptimal marking scheme for such trees, which explains the greater cost reduction by switching to the greedy finite optimal policy for *p* = 5. The greedy finite optimal policy for *p* = 3 coincided with the greedy largest policy, and hence, the costs were identical.

#### **8. Discussion**

We proposed a stochastic dynamic decision model for prefetching problems, which is simple in the sense that it has only three integer parameters, and yet can help conceive of optimal strategies in practical situations. The simplicity of the model lies in several assumptions that we discuss now.

We first observe that the modeling we proposed does not look practical for large values of the parameters $d$ and $p$. Indeed, Table 1 clearly shows that the state space sizes that could be handled numerically corresponded to small values of these parameters. On the other hand, the formal results obtained so far suggest that such a numerical solution would be needed in practice. We argue, consistent with our introduction, that large values of $d$ are not desirable in practice: it may be better not to know the graph at a larger distance, since the complexity of the decision grows exponentially with the amount of information. A value $d = 2$ may be a good compromise between excessive shortsightedness and an excess of information. Concerning the parameters $k$ and $p$, practical situations should involve cases where these values are not far from each other. Clearly, if $k \ge p$, the problem is easy, whereas if $k \ll p$, all policies will be bad because the controller is overwhelmed by the number of nodes to control. The modeling we propose is relevant if the network is a bottleneck of the system: in other words, if $k$ is not very large. Therefore, $p$ should not be very large either.

The next feature departing from practical cases is about the discovery process. In our model, we assumed a uniform distribution for the new generation of nodes. Practical graphs are known to have different node degree distributions. Here, since we identified this mechanism with a Galton–Watson branching process, it seems possible to use other distributions while conserving the possibility of characterizing the distribution of trees as we did in Section 4. Therefore, evaluating the performance of simple greedy policies and obtaining bounds might be possible with this generalization. Furthermore, the results we obtained with budget *k* = 1 or depth *d* = 1 were probably insensitive to the distribution of the number of sons.

The assumption we made about the movements of the "surfer" in the graph of documents may also be questioned. We assumed a uniform choice between neighboring documents. In addition to simplicity, we argue that this represents the most difficult situation for the controller, since the amount of information available to it is minimal. In the case where nonuniform movement probabilities are known, they can easily be integrated in the MDP, just as in the models reviewed in the Introduction. It is interesting to note that in such a case, models can be imagined where the optimal policy is *not* of the greedy type: consider just a case where the probability of moving to some son happens to be zero (or close to zero). The prefetching budget should then be devoted to marking the other sons and their subtrees. Apart from such obvious situations, it is difficult to imagine cases where an optimal policy would be formally identified, since this policy should weight the movement probabilities with the characteristics of the subtrees.

Going back to the simple model we proposed, the results of Section 7 (the case $d = 2$ and $k = 2$) and their interpretation suggest a general form for a heuristic policy. The first principle is that sons should be marked as a priority. The issue is what to do with the remaining budget. On the one hand, the subtrees that have themselves fewer than $k$ sons are easy to deal with and can be ignored. Among the remaining subtrees, those with the smallest number of sons should be marked first. If any budget remains, the principle can be applied recursively.

#### **9. Conclusions and Perspectives**

Among the results of this paper, we proved that simple greedy policies are optimal in several situations, suggesting that optimal policies are always greedy. One first obvious step in future research will be to prove the following result, generalizing Proposition 4:

**Conjecture 2.** *When k* = 1*, any greedy policy is optimal.*

Next, our research will focus on the more challenging case *k* = 2. The first objective will be to prove Conjecture 1 and then determine how to adapt this policy to larger *d*. Some numerical investigation appears to be possible there for small values of *p*, despite the large size of the state space. Another line of research on formal solutions will focus on the analysis of Markov chains defined by simple policies with the purpose of providing bounds tighter than that of Proposition 3.

On the practical side, several issues need further investigation. The principal one is to efficiently identify the model and the optimal control from practical data. We plan to develop algorithms that would leverage the knowledge gained with the exact solution of simple models, with the objective of reducing learning time and learning errors.

**Author Contributions:** Formal analysis, K.K., A.J.-M., and S.A.; investigation, K.K., A.J.-M., and S.A.; supervision, A.J.-M. and S.A.; writing—original draft, K.K., A.J.-M., and S.A.; writing—review and editing, K.K., A.J.-M., and S.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data is contained within the article.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Abbreviations**

The following abbreviations are used in this manuscript:

CPU Central Processing Unit

MDP Markov Decision Process

#### **Appendix A. Notation**

**Table A1.** Main notation used in the paper.


#### **Appendix B. MDP Facts**

We borrow the following existential theorem from [17] (Theorem 2.1, Chapter V).

**Theorem A1.** *If there exist a bounded function $f(s)$, $s \in S$, and a constant $g$ such that:*

$$g + f(s) = \max_{a} \left[ r(s, a) + \sum_{s' \in S} P(s, a, s') f(s') \right] \tag{A1}$$

*then there exists a stationary policy γ*∗ *such that:*

$$g = \max_{\gamma} \phi_{\gamma}(s) = \phi_{\gamma^*}(s). \tag{A2}$$

Theorem A1 guarantees the existence of an optimal policy, provided that there exist a bounded function $f$ and a constant $g$ satisfying (A1). We use the following theorem from [17] (Theorem 2.2, Chapter V), which proves the existence of such a function and constant.
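For small finite MDPs, the pair $(f, g)$ of Theorem A1 can be computed by relative value iteration, a standard numerical scheme. A Python sketch (illustrative data, not from the paper):

```python
# Relative value iteration for the average-reward optimality equation (A1).
# r[s][a]: reward; P[s][a][t]: transition probability s -> t under action a.

def relative_value_iteration(r, P, iters=500):
    n = len(r)
    f = [0.0] * n
    g = 0.0
    for _ in range(iters):
        q = [max(r[s][a] + sum(P[s][a][t] * f[t] for t in range(n))
                 for a in range(len(r[s])))
             for s in range(n)]
        g = q[0]                             # gain: value at reference state 0
        f = [q[s] - q[0] for s in range(n)]  # bias, normalized so f[0] = 0
    return g, f

# Two-state chain with a single action: reward 1 in state 0, 0 in state 1;
# both states move to either state w.p. 1/2, so the long-run gain is 1/2.
g, f = relative_value_iteration([[1.0], [0.0]],
                                [[[0.5, 0.5]], [[0.5, 0.5]]])
assert abs(g - 0.5) < 1e-12
```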

**Theorem A2.** *Let $\{s_n\}_{n \ge 0}$ and $\{a_n\}_{n \ge 0}$ be the sequences of states and actions of the MDP when a policy $\gamma_\alpha$ is followed. Define:*

$$W^{\gamma_\alpha}(s) := E\left[\sum_{n=0}^{\infty} \alpha^n r(s_n, a_n)\right], \quad 0 < \alpha < 1,$$

*and $W^{\gamma^*_\alpha}(s) := \max_{\gamma_\alpha} W^{\gamma_\alpha}(s)$. For some fixed $s_0 \in S$, if there exists a $B < \infty$ such that $|W^{\gamma^*_\alpha}(s) - W^{\gamma^*_\alpha}(s_0)| < B$ for all $\alpha$ and $s$, then there exist a bounded function $f$ and a constant $g$ satisfying* (A1)*.*

The uniform boundedness property of $W^{\gamma^*_\alpha}$ holds if the expected time to go from any state $s$ to the fixed state $s_0$ under the optimal policy $\gamma^*_\alpha$ is bounded by a finite value. The reader may refer to [17] (Theorem 2.4, Chapter V) for a proof. A sufficient condition for the bounded expected time is that every stationary policy in the MDP yields a unichain.

#### **Appendix C. Finite Horizon MDP for Trees of Depth** *d* **= 2 with Budget** *k* **= 2**

This section is devoted to the findings from the study of the finite-horizon prefetching MDP, in the cases *d* = 2 and *k* = 2 and general *p*. These results are quoted from the unpublished report [16]. Other results for finite-horizon MDP are quoted in Remarks 3 and 4.

The optimal actions for all tree types for the finite horizons $n$ = 3, 4 are specified in Table A2. The optimal actions for the $n = 3$ horizon were computed analytically and confirmed through numerical simulations. For the $n = 4$ horizon, the optimal actions in Table A2 were obtained numerically. The same shorthands for marking actions as in Section 7 are used here.


**Table A2.** Comparison of optimal actions at *n* = 3 and *n* = 4.

The policy for Type 2 trees depends on the exact size of the subtrees, unlike the policy for Type 1 trees, where the sizes of the subtrees are irrelevant. Thresholds, which are functions of $p$, decide the optimal action for Type 2b trees. The thresholds on $j_2$ (with the obvious constraint $j_2 \le p$) that decide the optimal action for certain specifications of Type 2b trees are given below.

$$j_2 \ge \frac{6p^3}{6p^2(\mathbb{H}_p - 3) + 6p(5\mathbb{H}_p - 8) + 38\mathbb{H}_p - 15} \tag{A3}$$

$$j_2 \ge \frac{12p^2}{12p\mathbb{H}_p + 8\mathbb{H}_p - 39} \tag{A4}$$

$$j_2 \ge \frac{4p^2}{2p(2\mathbb{H}_p - 5) + 10\mathbb{H}_p - 7} \tag{A5}$$

$$j_2 \ge \frac{6p}{6\mathbb{H}_p - 11} \tag{A6}$$

We observed a change in optimal actions for Type 2 trees for considerably large fanouts in the numerical experiments for *n* = 4. Since the results of *n* = 4 were obtained numerically, we could identify the critical *p* values after which there would be a change in the optimal action for some *j*2. The precise threshold on *j*<sup>2</sup> for which the optimal action changes was not found due to the complex calculations involved. Comparing the optimal actions, we note a simplification when going from *n* = 3 to *n* = 4. The policy for trees of Types 1, 2a, and 2c is the same. For trees of Type 2b, the number of cases where a switch in optimal actions occurs is smaller in the *n* = 4 than in the *n* = 3 case. When a switch occurs in both cases, the threshold on the value *j*<sup>2</sup> is observed to be larger in the case *n* = 4. Naturally, we could expect that for a given specification of Type 2b trees, the optimal action would remain the same for most of the trees as the horizon increases. With this line of thought, we may conjecture that the threshold values disappear as *n* → ∞ and the optimal actions are the same for all trees of a given type and specification. This is the principle that led to the definition of the greedy finite optimal policy in Table 2.

#### **References**


## *Article* **Convergence Bounds for Limited Processor Sharing Queue with Impatience for Analyzing Non-Stationary File Transfer in Wireless Network**

**Irina Kochetkova 1,2,\*, Yacov Satin 3, Ivan Kovalev 3, Elena Makeeva 1, Alexander Chursin <sup>1</sup> and Alexander Zeifman 2,3**


**Abstract:** Data transmission in wireless networks is usually analyzed under the assumption of stationary rates. Nevertheless, the rates strictly depend on the time of day, as arrival intensities and daily workload profiles confirm. In this article, we consider the process of downloading a file within a single network segment with non-stationary rates for arrivals, file sizes, and losses due to impatience. To model this scenario, a queuing system with elastic traffic and non-stationary intensities is used. Formulas are given for the main characteristics of the model: the probability of blocking a new user, the average number of users in service, and the average queue length. A method for computing bounds on the convergence of the model is proposed, based on the logarithmic norm of linear operators. Bounds on the rate of convergence of the main limiting characteristics of the queue-length process are also established. A numerical analysis illustrating the influence of the parameters is presented.

**Keywords:** queuing system; elastic traffic; impatient claims; non-stationary intensity; convergence analysis; bounds on the rate of convergence; wireless network; file transfer; daily traffic profile; blocking probability

#### **1. Introduction**

Fifth generation (5G) networks will consist of different services with different specifications. Experts have identified network slicing as a key technology for enabling 5G networks [1–4]. Within the framework of network slicing, many models with different principles for slicing radio resources have been proposed [5]. In some cases, it is necessary to distribute slices between several users, and this distribution and the slices themselves can change over time; for example, various options for re-slicing the network are shown in [6–8]. Earlier, we studied models of network slicing in [9,10], but we considered a model in the form of a queuing system with stationary intensities.

The need to re-slice the network arises because all processes are non-stationary and depend on time. For example, a traffic profile describes user activity that differs depending on the time of day. Activity may also depend on the time of year: in the summer, many users go on vacation and fly to other countries, while in winter most users are actively working and, on weekends, stay at home watching movies. Taking this dependence into account, it is necessary to consider models with non-stationary intensities. In our work, we consider an example of downloading user files depending on the time of day. For convenience, we consider one slice of the radio frequency channel taking into account its non-stationary nature, and look at the behavior of the characteristics. As a mathematical model, we take a queuing system with a non-stationary arrival intensity of users, as in [11,12], where such systems with non-stationary intensities are used for the joint service of radio frequencies.

**Citation:** Kochetkova, I.; Satin, Y.; Kovalev, I.; Makeeva, E.; Chursin, A.; Zeifman, A. Convergence Bounds for Limited Processor Sharing Queue with Impatience for Analyzing Non-Stationary File Transfer in Wireless Network. *Mathematics* **2022**, *10*, 30. https://doi.org/10.3390/math10010030

Academic Editor: Daniel-Ioan Curiac

Received: 17 November 2021; Accepted: 17 December 2021; Published: 22 December 2021

This article discusses a special, rather non-standard, inhomogeneous birth and death process, to which one can in principle apply the methods described, for example, in [13,14]; however, due to the specifics of the model, the authors had to resort to some tricks in order to obtain acceptable estimates. In particular, the rate of convergence in the example under consideration turns out to be rather slow, so that the limiting regime adequately describes the situation only at sufficiently large times. Usually, uniformization is used as a method for calculating the transition probabilities of Markov chains, as in [15,16]. However, the methods based on it work very poorly in the case of slow convergence, which takes place for the considered models. In addition, without a prior understanding of when the limiting regime is reached, significant computational effort is required to be at least to some extent confident that the obtained solution is the required one [17].

The goal of the paper is to explore the nonstationary queue-length process based on file transferring in the wireless network and analyze the performance measures of this system model. The remainder of the paper is organized as follows. In Section 2, the queueing system and its performance measures are presented. The convergence analysis for large service rates and arrival rates of the mentioned queueing system are described in Section 3. Application for file transfer in the wireless network and numerical analysis of the considered system model are discussed in Section 4, followed by the conclusions in Section 5.

#### **2. Queuing System**

#### *2.1. Overview and Assumptions*

In this paper, we consider a queuing system with elastic traffic and impatient claims. To describe the flow of requests with a variable number of users, a Poisson flow of the first kind is suitable, with the following parameters: the arrival intensity $\lambda(t)$, the minimum resource requirement $b$, and the length of the transmitted data block $\theta(t)$. Table 1 lists the main parameters of the system model and the corresponding terms of the mathematical model. The system has a resource of volume $C$ and a storage device (queue) of finite capacity $r$. Requests are also impatient: they leave the queue with intensity $\gamma(t)$. We assume that the block length equals $\theta(t)$. The entire volume $C$ is divided equally between the requests in service: if there is one customer, the entire resource is consumed by this customer and the service rate is $C/\theta(t)$; if there are two customers, each receives half the resource and is served at rate $C/(2\theta(t))$, so the total service rate remains $C/\theta(t)$. When $C$ cannot be divided equally between the customers while providing the minimum guaranteed threshold $b$, a new customer enters the queue.

Let $N(t) \in \{1, \dots, \lfloor C/b \rfloor\}$ be the number of requests in service at the moment $t \ge 0$. Hence, $N = \lfloor C/b \rfloor$ is the maximum number of requests that the device can process simultaneously. The state space of the system is:

$$X := \{ n \in \{0, \dots, N, \dots, N + r\} : c(n) \le C \}. \tag{1}$$
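As an illustration, assuming the resource constraint $c(n) \le C$ simply caps the state at $N + r$, the state space can be enumerated as follows (hypothetical parameter values, not from the paper):

```python
# Sketch of the state space (1): up to N = floor(C/b) requests in service,
# plus up to r waiting places.
def state_space(C, b, r):
    N = C // b  # maximum number of requests in simultaneous service
    return list(range(0, N + r + 1))

states = state_space(C=10, b=2, r=3)
assert states == list(range(0, 9))  # N = 5, so states 0..8
```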

**Table 1.** System model parameters.


#### *2.2. Continuous-Time Markov Chain*

It is easy to see that this model can be described by a Markov process $X(t)$, $t \ge 0$, where $X(t)$ denotes the number of customers in the system at time $t$ (the queue-length process). Denote by $p_n(t) = P(X(t) = n)$, $n = 0, 1, \dots, N + r$, the state probabilities.

From the above assumptions, the behavior of the state probabilities is described by the forward Kolmogorov system:

$$p_0'(t) = -\lambda(t)p_0(t) + \frac{C}{\theta(t)}p_1(t), \tag{2}$$

$$p_n'(t) = \lambda(t)p_{n-1}(t) - \left(\frac{C}{\theta(t)} + \lambda(t)\right)p_n(t) + \frac{C}{\theta(t)}p_{n+1}(t), \quad 1 \le n < N, \tag{3}$$

$$p_N'(t) = \lambda(t)p_{N-1}(t) - \left(\frac{C}{\theta(t)} + \lambda(t)\right)p_N(t) + \left(\frac{C}{\theta(t)} + \gamma(t)\right)p_{N+1}(t), \tag{4}$$

$$p_n'(t) = \lambda(t)p_{n-1}(t) - \left(\frac{C}{\theta(t)} + (n-N)\gamma(t) + \lambda(t)\right)p_n(t) + \left(\frac{C}{\theta(t)} + (n+1-N)\gamma(t)\right)p_{n+1}(t), \quad N < n < N+r, \tag{5}$$

$$p_{N+r}'(t) = \lambda(t)p_{N+r-1}(t) - \left(\frac{C}{\theta(t)} + r\gamma(t) + \lambda(t)\right)p_{N+r}(t). \tag{6}$$
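A minimal sketch of the transposed intensity matrix behind system (2)–(6), assuming constant illustrative rates; probability conservation shows up as zero column sums:

```python
import numpy as np

def generator(lam, mu, gamma, N, r):
    """Transposed intensity matrix A of system (2)-(6) at a fixed time t,
    where mu stands for the total service rate C / theta(t)."""
    n_states = N + r + 1
    A = np.zeros((n_states, n_states))
    for n in range(n_states):
        if n < n_states - 1:            # arrival: n -> n + 1 at rate lambda
            A[n + 1, n] += lam
            A[n, n] -= lam
        if n > 0:                       # departure: n -> n - 1 at rate mu + (n - N)^+ gamma
            rate = mu + max(0, n - N) * gamma
            A[n - 1, n] += rate
            A[n, n] -= rate
    return A

# Hypothetical constant rates for illustration.
A = generator(lam=0.5, mu=1.0, gamma=0.01, N=100, r=100)
print(np.allclose(A.sum(axis=0), 0.0))  # True: total probability is conserved
```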

Now, we consider the corresponding nonstationary situation. Namely, we suppose that the queue-length process {*X*(*t*), *t* ≥ 0} is an inhomogeneous continuous-time Markov chain. All possible transition intensities say *qij*(*t*), are supposed to be non-random functions of time. We suppose that all intensity functions are nonnegative and locally integrable on [0, ∞).

Denote by $\mathbf{p}(t) = (p_0(t), p_1(t), \ldots, p_{N+r}(t))^T$ the vector of state probabilities at the moment *t*. Put $a_{ij}(t) = q_{ji}(t)$ for $j \neq i$ and $a_{ii}(t) = -\sum_{j \neq i} a_{ji}(t) = -\sum_{j \neq i} q_{ij}(t)$.

We can consider the forward Kolmogorov system (2)–(6) as a differential equation

$$\frac{d\mathbf{p}(t)}{dt} = A(t)\mathbf{p}(t),\tag{7}$$

in the space of sequences *l*1, where *A*(*t*) is a linear operator from *l*1 to itself, bounded for almost all *t* ≥ 0, generated by the corresponding transposed intensity matrix.


#### *2.3. Performance Measures*

To analyze the system, let us consider some characteristics of the model under study. First, the probability of blocking an incoming request:

$$P\_{block}(t) = p\_{N+r}(t). \tag{9}$$

The average number of serviced applications:

$$\bar{\mathcal{L}}(t) = \sum\_{i=1}^{N} i p\_i(t) + N \cdot \sum\_{i=1}^{r} p\_{N+i}(t). \tag{10}$$

The average number of applications in the queue:

$$Q(t) = \sum\_{i=N+1}^{N+r} (i - N)p\_i(t). \tag{11}$$
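Measures (9)–(11) can be computed directly from a state-probability vector; the toy distribution below is made up to check the formulas by hand:

```python
import numpy as np

def performance_measures(p, N, r):
    """Blocking probability (9), mean number in service (10), and mean
    queue length (11) for a state-probability vector p of length N+r+1."""
    p = np.asarray(p, dtype=float)
    p_block = p[N + r]                                             # (9)
    served = sum(i * p[i] for i in range(1, N + 1)) \
           + N * sum(p[N + i] for i in range(1, r + 1))            # (10)
    queue = sum((i - N) * p[i] for i in range(N + 1, N + r + 1))   # (11)
    return p_block, served, queue

# Toy check with N = 2, r = 2 and a hand-picked distribution:
# served = 1*0.2 + 2*0.3 + 2*(0.25 + 0.15), queue = 1*0.25 + 2*0.15.
p = [0.1, 0.2, 0.3, 0.25, 0.15]
pb, L, Q = performance_measures(p, N=2, r=2)
print(round(pb, 6), round(L, 6), round(Q, 6))  # 0.15 1.6 0.55
```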

#### **3. Convergence Analysis**

*3.1. Definitions of Terms*

We denote the mathematical expectation (the mean) of *X*(*t*) at the moment *t*, given *X*(0) = *k*, by $E(t, k) = E\{X(t) \mid X(0) = k\}$.

The Markov chain *X*(*t*) is called *weakly ergodic* if $\lim_{t \to \infty} \|\mathbf{p}_1(t) - \mathbf{p}_2(t)\| = 0$ for any initial conditions $\mathbf{p}_1(0) = \mathbf{p}^1 \in \Omega$, $\mathbf{p}_2(0) = \mathbf{p}^2 \in \Omega$. In this situation, any $\mathbf{p}_1(t)$ can be considered as a *quasi-stationary distribution* of the chain *X*(*t*).

The Markov chain *X*(*t*) is said to have the limiting mean *φ*(*t*) if |*E*(*t*, *k*) − *φ*(*t*)| → 0 as *t* → ∞ for any *k*.

Recall that the logarithmic norm of an operator function *B*(*t*) from *l*1 to itself is computed as (12):

$$\gamma(B(t))\_1 = \sup\_i \left( b\_{ii}(t) + \sum\_{j \neq i} |b\_{ji}(t)| \right), \tag{12}$$

and the bound

$$\|U(t,s)\| \le e^{\int_s^t \gamma(B(\tau))\, d\tau}, \tag{13}$$

is valid for the Cauchy operator of the corresponding differential equation

$$\frac{d\mathbf{x}}{dt} = B(t)\mathbf{x}.\tag{14}$$

*3.2. Preliminary Considerations*

Let us put $\mu(t) = \frac{C}{\theta(t)}$, and $\mu_n(t) = \mu(t) + \max(0, n - N)\gamma(t)$ for $n \ge 1$. Then

$$A(t) = \begin{pmatrix} -\lambda(t) & \mu_1(t) & 0 & \cdots & 0 & 0 \\ \lambda(t) & -(\mu_1(t) + \lambda(t)) & \mu_2(t) & \cdots & 0 & 0 \\ 0 & \lambda(t) & -(\mu_2(t) + \lambda(t)) & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -(\mu_{N+r-1}(t) + \lambda(t)) & \mu_{N+r}(t) \\ 0 & 0 & 0 & \cdots & \lambda(t) & -\mu_{N+r}(t) \end{pmatrix} \tag{15}$$

As indicated above, the considered method is based on the concept of the logarithmic norm and the corresponding estimates for the Cauchy operator.

Setting $p_0(t) = 1 - \sum_{i=1}^{N+r} p_i(t)$, from (7) we obtain the following equation:

$$\frac{d\mathbf{p}}{dt} = B(t)\mathbf{p} + \mathbf{g}(t), \quad t \ge 0,\tag{16}$$

where $\mathbf{g}(t) = (\lambda(t), 0, 0, \ldots, 0)^T$ and *B*(*t*) is given by (17).

$$B(t) = \begin{pmatrix} -(\mu_1(t) + 2\lambda(t)) & \mu_2(t) - \lambda(t) & -\lambda(t) & \cdots & -\lambda(t) & -\lambda(t) \\ \lambda(t) & -(\mu_2(t) + \lambda(t)) & \mu_3(t) & \cdots & 0 & 0 \\ 0 & \lambda(t) & -(\mu_3(t) + \lambda(t)) & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -(\mu_{N+r-1}(t) + \lambda(t)) & \mu_{N+r}(t) \\ 0 & 0 & 0 & \cdots & \lambda(t) & -\mu_{N+r}(t) \end{pmatrix} \tag{17}$$

The solution to this equation can be represented in the following form:

$$\mathbf{p}(t) = \mathcal{U}^\*(t, 0)\mathbf{p}(0) + \int\_0^t \mathcal{U}^\*(t, \tau)\mathbf{g}(\tau) \,d\tau,\tag{18}$$

where *U*∗(*t*,*s*) is the Cauchy operator of the corresponding homogeneous equation:

$$\frac{d\mathbf{x}}{dt} = B(t)\mathbf{x}.\tag{19}$$

Next, we will consider estimates in "weighted" norms. Suppose *d*1, *d*2, ... , *dN*+*<sup>r</sup>* are positive numbers. Then let

$$D = \begin{pmatrix} d_1 & d_1 & d_1 & \cdots & d_1 \\ 0 & d_2 & d_2 & \cdots & d_2 \\ 0 & 0 & d_3 & \cdots & d_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & d_{N+r} \end{pmatrix} \tag{20}$$

We denote $\|\mathbf{z}\|_{1D} = \|D\mathbf{z}\|_1$. Note that the transformed matrix $B^{**}(t) = DB(t)D^{-1}$ is essentially non-negative, that is, all its off-diagonal elements are non-negative for any *t* ≥ 0. Then we get:

$$B^{**}(t) = DB(t)D^{-1} = \begin{pmatrix} -(\mu_1(t) + \lambda(t)) & \frac{d_1}{d_2}\mu_1(t) & 0 & \cdots & 0 & 0 \\ \frac{d_2}{d_1}\lambda(t) & -(\mu_2(t) + \lambda(t)) & \frac{d_2}{d_3}\mu_2(t) & \cdots & 0 & 0 \\ 0 & \frac{d_3}{d_2}\lambda(t) & -(\mu_3(t) + \lambda(t)) & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -(\mu_{N+r-1}(t) + \lambda(t)) & \frac{d_{N+r-1}}{d_{N+r}}\mu_{N+r-1}(t) \\ 0 & 0 & 0 & \cdots & \frac{d_{N+r}}{d_{N+r-1}}\lambda(t) & -(\mu_{N+r}(t) + \lambda(t)) \end{pmatrix} \tag{21}$$

Let us put the following:

$$\gamma\_{\*\*}(t) = \inf\_{i} \left( |b\_{ii}(t)| - \sum\_{j \neq i} \frac{d\_j}{d\_i} b\_{ji}(t) \right). \tag{22}$$

Then we will get:

$$\gamma(B(t))_{1D} = \gamma\left(DB(t)D^{-1}\right)_1 = \sup_i \left( b_{ii}(t) + \sum_{j \neq i} \frac{d_j}{d_i} b_{ji}(t) \right) = -\gamma_{**}(t). \tag{23}$$

For some positive *δ* we put *d*<sup>1</sup> = 1, *dk*<sup>+</sup><sup>1</sup> = *δdk*, *k* ≥ 1.
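The identity (25) derived below can be verified numerically: assuming constant illustrative rates, build the reduced matrix *B* of (16), transform it with the triangular matrix *D* of (20) for $d_k = \delta^{k-1}$, and compute the logarithmic norm of $B^{**} = DBD^{-1}$ in *l*1 columnwise:

```python
import numpy as np

# Hypothetical constant rates with delta > 1 (the case of Section 3.3).
lam, mu, gamma, N, r, delta = 0.3, 1.0, 0.05, 5, 5, 1.5
S = N + r

# Transposed intensity matrix A of (2)-(6); then, after eliminating
# p_0 = 1 - sum_i p_i, the reduced matrix is B[i, j] = a_{i+1, j+1} - a_{i+1, 0}.
A = np.zeros((S + 1, S + 1))
for n in range(S + 1):
    if n < S:
        A[n + 1, n] += lam
        A[n, n] -= lam
    if n > 0:
        srv = mu + max(0, n - N) * gamma
        A[n - 1, n] += srv
        A[n, n] -= srv
B = A[1:, 1:] - A[1:, [0]]

d = delta ** np.arange(S)                # d_1 = 1, d_{k+1} = delta * d_k
D = np.triu(np.outer(d, np.ones(S)))     # row k filled with d_k from the diagonal on
Bss = D @ B @ np.linalg.inv(D)

# Logarithmic norm in l_1: maximum over columns of b_ii + sum_{j != i} |b_ji|.
log_norm = max(Bss[i, i] + sum(abs(Bss[j, i]) for j in range(S) if j != i)
               for i in range(S))
gamma_star = -log_norm
print(np.isclose(gamma_star, (1 - 1 / delta) * (mu - delta * lam)))  # True
```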

#### *3.3. Bounds on the Rate of Convergence for Large Service Rates*

First, we let *δ* > 1. Then we will get the following:

$$a_1(t) = \mu_1(t) + \lambda(t) - \delta\lambda(t) = \mu(t) - (\delta - 1)\lambda(t),$$

$$\begin{aligned} a_k(t) &= \mu_k(t) + \lambda(t) - \frac{1}{\delta}\mu_{k-1}(t) - \delta\lambda(t) \ge \left(1 - \frac{1}{\delta}\right)\mu_{k-1}(t) - (\delta - 1)\lambda(t) \\ &\ge \left(1 - \frac{1}{\delta}\right)(\mu(t) - \delta\lambda(t)), \quad 2 \le k < N + r, \end{aligned} \tag{24}$$

$$a_{N+r}(t) = \mu_{N+r}(t) - \frac{1}{\delta}\mu_{N+r-1}(t) + \lambda(t) \ge \left(1 - \frac{1}{\delta}\right)(\mu(t) - \delta\lambda(t)).$$

Therefore,

$$\gamma_{**}(t) = \min_i(a_i(t)) = \left(1 - \frac{1}{\delta}\right)(\mu(t) - \delta\lambda(t)). \tag{25}$$

**Theorem 1.** *Let there exist a number δ* > 1 *such that*

$$\int\_0^\infty (\mu(t) - \delta \lambda(t)) dt = +\infty. \tag{26}$$

*Then the Markov chain X*(*t*) *is weakly ergodic and has the following convergence rate bounds:*

$$\|\mathbf{p}^*(t) - \mathbf{p}^{**}(t)\|_{1D} \le e^{-\int_0^t \left(1 - \frac{1}{\delta}\right) (\mu(\tau) - \delta \lambda(\tau))\, d\tau} \|\mathbf{p}^*(0) - \mathbf{p}^{**}(0)\|_{1D}, \tag{27}$$

$$\begin{split} \|\mathbf{p}^*(t) - \mathbf{p}^{**}(t)\| &\le 4\delta^{N+r} e^{-\int_0^t \left(1 - \frac{1}{\delta}\right) (\mu(\tau) - \delta\lambda(\tau))\, d\tau} \|\mathbf{p}^*(0) - \mathbf{p}^{**}(0)\| \\ &\le 8\delta^{N+r} e^{-\int_0^t \left(1 - \frac{1}{\delta}\right) (\mu(\tau) - \delta\lambda(\tau))\, d\tau} \end{split} \tag{28}$$

*for any initial conditions* **p**∗(0), **p**∗∗(0) *and any t* ≥ 0*.*

Let $W = \min_{k \ge 1} \frac{d_k}{k} = \min_{k \ge 0} \frac{\delta^k}{k+1}$. Then $W\|\mathbf{p}\|_{1E} \le \|\mathbf{p}\|_{1D}$.

**Corollary 1.** *Under the conditions of Theorem 1, X*(*t*) *has the limiting mean, say φ*(*t*) = *E*(*t*, 0)*, and the following estimate holds for any j and any t* ≥ 0*:*

$$|E(t,j) - E(t,0)| \le \frac{1 + \delta^{j-1}}{W} e^{-\int_0^t \left(1 - \frac{1}{\delta}\right) (\mu(\tau) - \delta \lambda(\tau))\, d\tau}. \tag{29}$$

*3.4. Bounds on the Rate of Convergence for Large Arrival Rates*

Now consider the case *δ* < 1 and assume that $\delta \in \left[\frac{r-1}{r}, 1\right)$. In this case, we have:

$$\begin{aligned} a_1(t) &\ge \mu(t) + (1 - \delta)\lambda(t), \\ a_k(t) &\ge (1 - \delta)\lambda(t) + \mu_k(t) - \frac{1}{\delta}\mu_{k-1}(t) \ge (1 - \delta)\lambda(t) - \mu(t)\left(\frac{1}{\delta} - 1\right), \quad 2 \le k \le N + r - 1, \\ a_{N+r}(t) &\ge \lambda(t) - \left(\frac{1}{\delta} - 1\right)\mu(t). \end{aligned} \tag{30}$$

Then, similarly to (25), it follows that:

$$\gamma\_{\*\*}(t) = \min(a\_i(t)) = \left(\frac{1}{\delta} - 1\right) (\delta \lambda(t) - \mu(t)). \tag{31}$$

**Theorem 2.** *Let*

$$\int\_0^\infty \left(\delta\lambda(t) - \mu(t)\right) dt = +\infty,\tag{32}$$

*for some* $\delta \in \left[\frac{r-1}{r}, 1\right)$*. Then, the Markov chain X*(*t*) *is weakly ergodic and has the following convergence rate bounds:*

$$\|\mathbf{p}^*(t) - \mathbf{p}^{**}(t)\|_{1D} \le e^{-\int_0^t \left(\frac{1}{\delta} - 1\right) (\delta \lambda(\tau) - \mu(\tau))\, d\tau} \|\mathbf{p}^*(0) - \mathbf{p}^{**}(0)\|_{1D}, \tag{33}$$

*and*

$$\|\mathbf{p}^*(t) - \mathbf{p}^{**}(t)\| \le 8\delta^{N+r} e^{-\int_0^t \left(\frac{1}{\delta} - 1\right) (\delta\lambda(\tau) - \mu(\tau))\, d\tau}, \tag{34}$$

*for any initial conditions* **p**∗(0), **p**∗∗(0) *and any t* ≥ 0*. In addition, the limiting mean exists and bound (29) holds.*

In addition, note the following: if the process is homogeneous (i.e., all intensities are constant), then the conditions of Theorems 1 and 2 are equivalent to the inequalities *μ* > *λ* and *μ* < *λ*, respectively.

It is also worth noting that if all the intensities of the process are 1-periodic, then *X*(*t*) is weakly ergodic and the estimates of Theorem 1 or Theorem 2 hold, provided that:

$$\int\_{0}^{1} \lambda(t) \, dt \neq \int\_{0}^{1} \mu(t) \, dt. \tag{35}$$

#### *3.5. Perturbed CTMC and Bounds*

In this subsection, we apply the general perturbation bounds (in the same way as in [13]) to the models under study. We consider a "perturbed" queue-length process $\bar{X}(t)$, $t \ge 0$, with the corresponding transposed intensity matrix $\bar{A}(t)$, where the "perturbing" matrix $\hat{A}(t) = A(t) - \bar{A}(t)$ is small. That is, we assume that the perturbed queue is of the same nature as the original one. Then the perturbed intensity matrix has the same structure with the corresponding perturbed intensities $\bar{\theta}(t)$, $\bar{\gamma}(t)$, $\bar{\lambda}(t)$. We put $\bar{\mu}_n(t) = \frac{C}{\bar{\theta}(t)} + \max(0, n - N)\bar{\gamma}(t)$ for $n \ge 1$.

We suppose that:

$$\left| \frac{1}{\theta(t)} - \frac{1}{\bar{\theta}(t)} \right| \le \varepsilon, \qquad |\gamma(t) - \bar{\gamma}(t)| = |\hat{\gamma}(t)| \le \varepsilon, \qquad |\lambda(t) - \bar{\lambda}(t)| = |\hat{\lambda}(t)| \le \varepsilon. \tag{36}$$

Hence, we will get the following:

$$\begin{split} |\mu_n(t) - \bar{\mu}_n(t)| = |\hat{\mu}_n(t)| &= \left| \frac{C}{\theta(t)} + \max(0, n - N)\gamma(t) - \frac{C}{\bar{\theta}(t)} - \max(0, n - N)\bar{\gamma}(t) \right| \\ &\le \left| \frac{C}{\theta(t)} - \frac{C}{\bar{\theta}(t)} \right| + \max(0, n - N)|\gamma(t) - \bar{\gamma}(t)| \\ &\le C\varepsilon + r\varepsilon = (C + r)\varepsilon. \end{split} \tag{37}$$

Then, from (8) we obtain the following bound:

$$\|\hat{A}(t)\| \le 2\sup_k |\hat{a}_{kk}(t)| = 2\max\{ |\hat{\lambda}(t)|,\ |\hat{\lambda}(t)| + |\hat{\mu}_n(t)|,\ |\hat{\mu}_{N+r}(t)| \} \le 2(C + r + 1)\varepsilon. \tag{38}$$

Now from Theorem 1 and Corollary 1 in the paper [13] the following bounds of the perturbation follow.

**Theorem 3.** *Suppose that under the conditions of Theorem 1 or Theorem 2 the Markov chain X*(*t*) *is exponentially ergodic, that is,*

$$e^{-\int_s^t \gamma_{**}(\tau)\, d\tau} \le K e^{-\gamma_0(t-s)}, \tag{39}$$

*for some positive K*, *γ*0*. Then the following bounds of the perturbation take place:*

$$\limsup_{t \to \infty} \|\mathbf{p}(t) - \bar{\mathbf{p}}(t)\| \le \frac{2\varepsilon(C + r + 1)(1 + \log(4K) + (N + r)|\log(\delta)|)}{\gamma_0}, \tag{40}$$

*and*

$$\limsup_{t \to \infty} |E(t, 0) - \bar{E}(t, 0)| \le \frac{2(N + r)\varepsilon(C + r + 1)(1 + \log(4K) + (N + r)|\log(\delta)|)}{\gamma_0}, \tag{41}$$

*for any perturbed queue with close intensities satisfying (36).*

#### **4. File Transfer in Wireless Network**

*4.1. Multi-Service Network*

The system model of this work has the form of a cell, in the coverage area of which mobile devices are located (Figure 1). Each of them has its MSISDN (Mobile Subscriber Integrated Services Digital Number)—the mobile subscriber number of a digital network with the integration of services. Each user behaves as follows: the user sends a request to download a file, downloads it, and may then disappear from the system. Disappearance can be caused by leaving the coverage area of the cell, by a change in the type of service, or by the end of the service. From the description of the system model, the following conclusion can be drawn: if we sum up the flows from all users, the durations of the intervals between requests do not depend on the number of users, which is described by a Poisson flow of the first kind.

**Figure 1.** Scheme of System Model.

The users are provided with various services that belong to different categories of data transmission. A more detailed overview of the categories is given in Table 2. Of these, however, we consider only those that can be described by elastic traffic, such as email, file transfer, and others.


**Table 2.** Description of service types.

#### *4.2. Dataset Structure*

The analysis of real traffic is of great importance, because identifying the patterns of its arrival makes it possible to study the system model described in the previous section in a non-stationary mode. Therefore, a task of this paper is formulated as follows: to analyze the traffic on one of the cell towers. A monitoring component collects information about the system. Every hour, for each active device, the number of bits sent and received is added up. At the beginning of the next hour, the amount for the previous hour is written to the registration file. The principle of filling in the data during monitoring is shown in Figure 2. We use the following notation: $S^{up}_{d,t}$ is the sum of bits sent by the device on day *d* at hour *t*. A part of the log file is shown in Table 3. The column START\_HOUR contains the full date and hour when the data was transferred; MASKED\_MSISDN is the masked device identifier of the user; APP\_CLASS is the class of the application that transfers data; UPLOAD is the number of bits sent, denoted $S^{up}_{d,t}$; and DOWNLOAD is the number of received bits, denoted $S^{down}_{d,t}$.

**Figure 2.** The principle of data accumulation during monitoring.

**Table 3.** Part of the log file.


#### *4.3. Daily Traffic Profile*

We consider elastic traffic, which is characterized by such a parameter as the length of the elastic data block. For the considered traffic model, monitoring data were taken with a class "File Transfer", which corresponds to the transfer of data using the FTP protocol.

To get the number of requests per second at a particular hour of the day for the selected application class, we perform aggregation of the form:

$$\lambda_{up}(t) = \frac{\sum_{d} S^{up}_{d,t}}{8 \cdot 1024 \cdot 1024 \cdot l}, \tag{42}$$

$$\lambda_{down}(t) = \frac{\sum_{d} S^{down}_{d,t}}{8 \cdot 1024 \cdot 1024 \cdot l}, \tag{43}$$

where the sums are taken over the available days *d*, the counts $S^{up}_{d,t}$ and $S^{down}_{d,t}$ are measured in bits, and *l* is the average file size, equal to 2 MB. The results of the calculation by Formulas (42) and (43) are shown in Table 4.
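A sketch of aggregation (42) on made-up records (the field names follow Table 3, but the sample values are invented): summing the hourly bit counts over the days and dividing by the number of bits in one average file of *l* MB yields the request rate:

```python
# Aggregation (42): hourly bit counts summed over days, converted to a
# request rate by dividing by the bits in one average file of l MB.
records = [
    {"hour": 14, "upload_bits": 3 * 8 * 1024 * 1024},   # 3 MB on day 1
    {"hour": 14, "upload_bits": 5 * 8 * 1024 * 1024},   # 5 MB on day 2
]

def lambda_up(records, hour, l):
    total_bits = sum(r["upload_bits"] for r in records if r["hour"] == hour)
    return total_bits / (8 * 1024 * 1024 * l)           # files per hour

print(lambda_up(records, 14, 2))  # (3 + 5) MB / 2 MB = 4.0
```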


**Table 4.** Dependence of the number of requests on the time of day.

#### *4.4. Fourier Series Approximation*

From the found values it is necessary to obtain a continuous function describing the fluctuation of the upward and downward traffic flows over the time of day. To do this, we perform an approximation by a Fourier series. For clarity, we will gradually increase the number of terms in the series in order to choose the best option.

Consider the dependence of the intensity of the upward data flow *λup* on time, and let us carry out the approximation by a Fourier series with one, two, and three terms. As a result, we get functions of the form (44)–(46), respectively; for our example, the parameters take the values shown in Table 5:

$$a(t) = a\_0 + a\_1 \cos(wt) + b\_1 \sin(wt)\tag{44}$$

$$a(t) = a\_0 + a\_1 \cos(wt) + b\_1 \sin(wt) + a\_2 \cos(2wt) + b\_2 \sin(2wt) \tag{45}$$

$$\begin{array}{l} a(t) = a\_0 + a\_1 \cos(wt) + b\_1 \sin(wt) + a\_2 \cos(2wt) + b\_2 \sin(2wt) \\ \quad + a\_3 \cos(3wt) + b\_3 \sin(3wt). \end{array} \tag{46}$$
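The coefficients of (45) can be fitted by linear least squares, since the model is linear in $a_0, a_1, b_1, a_2, b_2$; the 24 hourly samples below are synthetic, generated from known coefficients to check the recovery (the paper itself fits real data from Table 4 in Matlab):

```python
import numpy as np

w = 2 * np.pi / 24                       # fundamental frequency: period 24 h
t = np.arange(24)
true = [5.0, -1.0, 2.0, 0.5, -0.3]       # a0, a1, b1, a2, b2
y = (true[0] + true[1] * np.cos(w * t) + true[2] * np.sin(w * t)
     + true[3] * np.cos(2 * w * t) + true[4] * np.sin(2 * w * t))

# Design matrix with columns 1, cos(wt), sin(wt), cos(2wt), sin(2wt).
M = np.column_stack([np.ones_like(t, dtype=float),
                     np.cos(w * t), np.sin(w * t),
                     np.cos(2 * w * t), np.sin(2 * w * t)])
coef, *_ = np.linalg.lstsq(M, y, rcond=None)
print(np.allclose(coef, true))  # True: noiseless data is recovered exactly
```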



Using Matlab, we have built plots of the considered approximations. The plot of the approximation with one term (Figure 3a) differs significantly from the initial data, so we increased the number of terms to two; the resulting function (Figure 3b) reflects the relationship in the real data much more faithfully. According to the plot of the approximation with three terms (Figure 3c), in the intervals with the highest network load the function becomes closer to the real data points; however, on the whole, the graph does not change significantly, so the approximation with two terms can be considered optimal.

**Figure 3.** (**a**) Graph of the approximation function *a*(*t*) for the upward flow with one term; (**b**) Graph of the approximation function *a*(*t*) for the upward flow with two terms; (**c**) Graph of the approximation function *a*(*t*) for the upward flow with three terms.

#### *4.5. Numerical Analysis*

For the numerical analysis, we consider the following example: the volume of the resource block is *C* = 100 Mbps, and the queue capacity is *r* = 100. The size of the transferred file is *θ*(*t*) = *θ* = 10 MB, that is, 80 Mb, and the minimum transfer rate is *b* = 1 Mbps. The arrival intensity is *λ*(*t*) = *λ* · *a*(*t*), where, according to Section 4.4, *a*(*t*) is given by (46) with the parameters indicated in the three-term row of Table 5. The intensity of requests leaving the queue without transferring a block of elastic data due to "impatience" is *γ*(*t*) = *γ* = 10−2.

Let us apply all our bounds to this specific situation.

To apply Theorems 2 and 3, we put $\delta = 0.99$, $d_1 = 1$ and $d_{k+1} = \delta d_k$ for $k \ge 1$. Then we get:

$$\gamma_{**}(t) = \frac{1}{99} \Big( 2.97\big(a_0 + a_1 \cos(wt) + b_1 \sin(wt) + a_2 \cos(2wt) + b_2 \sin(2wt) + a_3 \cos(3wt) + b_3 \sin(3wt)\big) - 1.25 \Big), \tag{47}$$

$$e^{-\int_s^t \gamma_{**}(\tau)\, d\tau} \le e^{-0.016(t-s)}, \tag{48}$$

therefore, one can take $K = 2 \cdot 10^6$ and $\gamma_0 = 0.016$ in (39).

Now we obtain the following bounds on the rate of convergence:

$$\|\mathbf{p}^*(t) - \mathbf{p}^{**}(t)\| \le 3 \cdot 10^6 \cdot e^{-0.016t} \|\mathbf{p}^*(0) - \mathbf{p}^{**}(0)\|, \tag{49}$$

from Theorem 2;

$$\|\mathbf{p}^*(t) - \mathbf{p}^{**}(t)\|_{1D} \le 2 \cdot 10^6 \cdot e^{-0.016t} \|\mathbf{p}^*(0) - \mathbf{p}^{**}(0)\|_{1D}, \tag{50}$$

$$|E(t,j) - E(t,0)| \le \frac{1 + 0.99^{j-1}}{W} e^{-0.016t}, \tag{51}$$

from Theorem 2 and Corollary 1.

The corresponding perturbation bounds are:

$$\limsup_{t \to \infty} \|\mathbf{p}(t) - \bar{\mathbf{p}}(t)\| \le 5 \cdot 10^5 \varepsilon, \tag{52}$$

and

$$\limsup_{t \to \infty} |E(t, 0) - \bar{E}(t, 0)| \le 10^8 \varepsilon, \tag{53}$$

from Theorem 3.

To solve the Cauchy problem, the fourth-order Adams–Moulton method was used, implemented with the IntelliJ IDEA software, the JDK, and the JFreeChart library, whose functionality was used for plotting. The convergence plots were built for the characteristics specified in Section 2.3 and are shown in Figures 4–6. To analyze the characteristics, two scenarios were chosen: (1) at the initial moment of time the system is empty, *X*(0) = 0; (2) at the initial moment the system is completely occupied, that is, all devices are busy and there are no free places in the queue, *X*(0) = 200. Figure 4 shows the graphs for the blocking probability. Since we are considering two scenarios, according to Figure 4a, the starting value of the probability is 0 for the first scenario (the system is empty, so there is no blocking) and 1 for the second scenario (the system is completely busy). We can see that the blocking probabilities for the two scenarios converge by time *t* = 500 and, according to Figure 4b, the limiting values fluctuate within the range 0.000–0.013 with a period of 24. Further, Figure 5 shows that the average number of serviced requests converges to 100; this indicates a high load of the devices in our system, although the blocking probability remains small. Finally, consider the average number of requests in the queue, shown in Figure 6. As Figure 6a shows, the graphs for the two scenarios converge, and the limiting mean value is no less than 64 and no more than 75, as we can see in Figure 6b.
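The experiment can be reproduced in miniature with a standard ODE solver in place of the paper's Java implementation; the sketch below uses constant illustrative rates and a small state space rather than the paper's *N* = *r* = 100 and periodic *a*(*t*), and checks that the two initial-condition scenarios reach the same limiting blocking probability:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative constant rates and a small state space.
lam, mu, gamma, N, r = 0.5, 1.0, 0.01, 5, 5
S = N + r

def rhs(t, p):
    """Right-hand side A p of the Cauchy problem (7)."""
    dp = np.zeros_like(p)
    for n in range(S + 1):
        if n < S:                        # arrival n -> n + 1
            dp[n] -= lam * p[n]
            dp[n + 1] += lam * p[n]
        if n > 0:                        # service / impatience n -> n - 1
            srv = mu + max(0, n - N) * gamma
            dp[n] -= srv * p[n]
            dp[n - 1] += srv * p[n]
    return dp

# Scenario 1: empty system, X(0) = 0; scenario 2: full system, X(0) = N + r.
blocking = []
for start in (0, S):
    p0 = np.zeros(S + 1)
    p0[start] = 1.0
    sol = solve_ivp(rhs, (0.0, 500.0), p0, method="LSODA",
                    rtol=1e-10, atol=1e-12)
    blocking.append(sol.y[S, -1])        # P_block = p_{N+r} at t = 500

# Both scenarios reach the same limiting blocking probability.
print(abs(blocking[0] - blocking[1]) < 1e-6)  # True
```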

**Figure 4.** (**a**) Probability of blocking for *t* ∈ [0, 1304]. (**b**) Approximation of the limiting probability of blocking for *t* ∈ [1304, 1329].

**Figure 5.** (**a**) The mean of applications served *E*(*t*, *k*) for *t* ∈ [0, 1304]. (**b**) Approximation of the limiting mean of applications served *E*(*t*, *k*) for *t* ∈ [1304, 1329].

**Figure 6.** (**a**) The mean of applications in the queue *Q*(*t*) for *t* ∈ [0, 1304]. (**b**) Approximation of the limiting mean of applications in the queue *Q*(*t*) for *t* ∈ [1304, 1329].

#### **5. Conclusions**

In this paper, we have investigated the process of downloading user files over time in the form of a queuing system with elastic traffic and non-stationary intensities. We have developed a method for obtaining convergence bounds for such a model. Estimates of the rate of convergence are obtained using the logarithmic norm of linear operators. As a result, it was found that the rate of convergence is rather low, so the limiting regime begins to represent the situation adequately only at sufficiently large times. We evaluated the characteristics of such a model, namely, the probability of blocking new users, the average number of users downloading data, and the average number of users waiting for the download to start, and obtained upper and lower bounds on their values.

We considered the model in the form of a single network slice and, in the framework of further tasks, we can consider the model in the form of several slices. If we want to scale our system, for example, to add another incoming stream, it will be necessary to solve the problem of redistribution; we will have to adapt our method for the new case. In addition, in our case, we have a one-dimensional random process, and for a new system where it will be multidimensional, we will first need to identify a function—a mapping, to go to the one-dimensional case to apply our method. This complexity can be viewed as a challenge for future research.

**Author Contributions:** Conceptualization, supervision, A.Z. and I.K. (Irina Kochetkova); methodology, Y.S. and I.K. (Irina Kochetkova); software, validation, visualization, I.K. (Ivan Kovalev) and E.M.; investigation, writing, I.K. (Irina Kochetkova), Y.S., I.K. (Ivan Kovalev), E.M., A.C., A.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Russian Science Foundation, grant number 19-11-00020 (recipients I.K. (Irina Kochetkova), Y.S., I.K. (Ivan Kovalev), A.Z., Sections 3–5). This paper has been supported by the RUDN University Strategic Academic Leadership Program (recipients A.C., E.M., Sections 1 and 2).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors are sincerely grateful to Luis M. Correia (IST/INESC-ID, University of Lisbon) for providing the dataset for Section 4.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **The Estimators of the Bent, Shape and Scale Parameters of the Gamma-Exponential Distribution and Their Asymptotic Normality**

**Alexey Kudryavtsev 1,2,\* and Oleg Shestakov 1,2,3,\***


**Abstract:** When modeling real phenomena, special cases of the generalized gamma distribution and the generalized beta distribution of the second kind play an important role. The paper discusses the gamma-exponential distribution, which is closely related to the listed ones. The asymptotic normality of the previously obtained strongly consistent estimators for the bent, shape, and scale parameters of the gamma-exponential distribution at fixed concentration parameters is proved. Based on these results, asymptotic confidence intervals for the estimated parameters are constructed. The statements are based on the method of logarithmic cumulants obtained using the Mellin transform of the considered distribution. An algorithm for filtering out unnecessary solutions of the system of equations for logarithmic cumulants and a number of examples illustrating the results obtained using simulated samples are presented. The difficulties arising from the theoretical study of the estimates of concentration parameters associated with the inversion of polygamma functions are also discussed. The results of the paper can be used in the study of probabilistic models based on continuous distributions with unbounded non-negative support.

**Keywords:** parameter estimation; gamma-exponential distribution; mixed distributions; generalized gamma distribution; generalized beta distribution; method of moments; cumulants; asymptotic normality

#### **1. Introduction**

Gamma and beta classes of distributions play an important role in applied probability theory and mathematical statistics and have proven to be convenient and effective tools for modeling many real processes. The generalized gamma distribution and generalized beta distribution of the second kind are quite wide classes, including distributions that have such useful properties as, for example, infinite divisibility and stability, which makes it possible to use distributions from these classes as asymptotic approximations in various limit theorems. The article discusses the distribution proposed in the Ref. [1], that is closely related to the listed popular distributions.

**Definition 1.** *We say that the random variable ζ has the gamma-exponential distribution GE*(*r*, *ν*,*s*,*t*, *δ*) *with the parameters of bent* 0 ≤ *r* < 1*, shape ν* ≠ 0*, concentration s*, *t* > 0*, and scale δ* > 0*, if its density at z* > 0 *is*

$$g_E(z) = \frac{|\nu| z^{t\nu - 1}}{\delta^{t\nu} \Gamma(s) \Gamma(t)}\, \mathrm{Ge}_{r,\, tr+s}\big({-(z/\delta)^{\nu}}\big), \tag{1}$$

*where E* = (*r*, *ν*,*s*, *t*, *δ*) *and* $\mathrm{Ge}_{\alpha,\beta}(z)$ *is the gamma-exponential function [2]:*

**Citation:** Kudryavtsev, A.; Shestakov, O. The Estimators of the Bent, Shape and Scale Parameters of the Gamma-Exponential Distribution and Their Asymptotic Normality. *Mathematics* **2022**, *10*, 619. https://doi.org/ 10.3390/math10040619

Academic Editors: Alexander Zeifman, Victor Korolev, Alexander Sipin and Stelios Psarakis

Received: 22 December 2021 Accepted: 14 February 2022 Published: 17 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

$$\mathrm{Ge}_{\alpha,\beta}(z) = \sum_{k=0}^{\infty} \frac{z^k}{k!} \Gamma(\alpha k + \beta), \quad z \in \mathbb{R}, \ 0 \le \alpha < 1, \ \beta > 0. \tag{2}$$
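A truncated-series sketch of (2); for *α* = 0 the series collapses to $\Gamma(\beta)e^z$, which gives a simple sanity check (the truncation length is a pragmatic choice, not from the paper):

```python
import math

def ge(a, beta, z, terms=80):
    """Truncated series for the gamma-exponential function (2),
    Ge_{a,beta}(z) = sum_k z^k Gamma(a k + beta) / k!, for 0 <= a < 1."""
    return sum(z**k * math.gamma(a * k + beta) / math.factorial(k)
               for k in range(terms))

# Sanity check: for a = 0 the series equals Gamma(beta) * exp(z).
beta, z = 2.5, -1.3
print(math.isclose(ge(0.0, beta, z), math.gamma(beta) * math.exp(z)))  # True
```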

Function (2) generalizes the transformation introduced by Le Roy [3] to study generating functions of a special form, which corresponds to the case *β* = 1. In addition, Function (2) can be considered (under some assumptions) as a special case of the Srivastava–Tomovski function [4], which generalizes the Mittag–Leffler function [5].

In the Ref. [1] it was shown that the distribution (1) adequately describes Bayesian balance models [6]. This is primarily due to the fact that the distribution with the density (1) can be represented as a scaled mixture of two random variables with generalized gamma distributions.

In turn, the generalized gamma distribution *GG*(*v*, *q*, *θ*) with the density

$$f(x) = \frac{|v| x^{vq - 1} e^{-(x/\theta)^{v}}}{\theta^{vq}\Gamma(q)}, \quad v \neq 0, \ q > 0, \ \theta > 0, \ x > 0, \tag{3}$$

proposed in 1925 by the Italian economist Amoroso [7], has proven its validity in many applied problems that use continuous distributions with an unbounded non-negative support for modeling. The class of distributions (3) is wide enough and includes exponential distribution; *χ*2-distribution; Erlang distribution; gamma distribution; half-normal distribution (the distribution of the maximum of the Brownian motion process); Rayleigh distribution; Maxwell–Boltzmann distribution; *χ*-distribution; Nakagami m-distribution; Wilson–Hilferty distribution; Weibull–Gnedenko distribution, and many others, including scaled and inverse analogs of the above.
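Density (3) can be sanity-checked numerically for hypothetical parameter values: it should integrate to one over (0, ∞), and its mean should equal $\theta\,\Gamma(q + 1/v)/\Gamma(q)$ (a standard moment formula for the generalized gamma distribution with *v* > 0):

```python
import math
from scipy.integrate import quad

def gg_density(x, v, q, theta):
    """Density (3) of the generalized gamma distribution GG(v, q, theta)."""
    return (abs(v) * x**(v * q - 1) * math.exp(-(x / theta)**v)
            / (theta**(v * q) * math.gamma(q)))

# Hypothetical parameters v = 2, q = 1.5, theta = 0.7.
v, q, theta = 2.0, 1.5, 0.7
total, _ = quad(gg_density, 0, math.inf, args=(v, q, theta))
mean, _ = quad(lambda x: x * gg_density(x, v, q, theta), 0, math.inf)
print(math.isclose(total, 1.0, rel_tol=1e-6))                        # True
print(math.isclose(mean, theta * math.gamma(q + 1 / v) / math.gamma(q),
                   rel_tol=1e-6))                                    # True
```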

In addition, it was shown in the Ref. [8] that the distribution (1), as *r* → 1, tends to the generalized beta distribution of the second kind *GB*2(*ν*,*s*, *t*, *δ*) with the density

$$f(x) = \frac{|\nu|(x/\delta)^{t\nu - 1}}{\delta B(s, t) \left(1 + (x/\delta)^{\nu}\right)^{t + s}}, \ \nu \neq 0, \ s > 0, \ t > 0, \ \delta > 0, \ x > 0,\tag{4}$$

proposed in 1984 by McDonald [9]. The distribution (4), used primarily in econometrics and regression analysis, includes the Burr distribution (or Singh–Maddala distribution); Dagum distribution; Pearson distribution; Pareto distribution; Lomax distribution; the Fisher–Snedecor F-distribution, and others.

The estimation of unknown distribution parameters has traditionally occupied an important place in applied mathematical statistics. At the same time, in order to improve the consistency between mathematical models and the real processes being analyzed, researchers consider increasingly complex mathematical abstractions. The relevance of the statistical analysis of the distributions (3) and (4), their particular types, and their mixtures is evidenced by a large number of publications on this topic, for example, the Refs. [10–19].

In the Ref. [1], it was shown that the gamma-exponential distribution has the following properties.

**Lemma 1.** *1. Let the independent random variables λ and μ have the distributions GG*(*v*, *q*, *θ*) *and GG*(*u*, *p*, *α*)*, uv* > 0*, respectively. Then the distribution of λ coincides with GE*(0, *v*, ·, *q*, *θ*)*; the distribution of λ*/*μ for* |*u*| > |*v*| *coincides with GE*(*v*/*u*, *v*, *p*, *q*, *θ*/*α*)*; and the distribution of λ*/*μ for* |*v*| > |*u*| *coincides with GE*(*u*/*v*, −*u*, *q*, *p*, *θ*/*α*)*.*

*2. For* 0 < *r* < 1*, the density gE*(*x*)*, E* = (*r*, *ν*,*s*, *t*, *δ*) *coincides with the density of the ratio of independent random variables with generalized gamma distributions GG*(*ν*, *t*, *δ*) *and GG*(*ν*/*r*,*s*, 1)*.*

The possibility of representing the gamma-exponential distribution as a ratio of random variables having a generalized gamma distribution allows it to be used in a wide range of applied problems [6,20]. In addition, the five-parameter gamma-exponential distribution can be used to model a wide range of real phenomena, due to the wide variety

of its possible densities [20]. Moreover, for a random variable *ζ* with the distribution (1), the following representation is valid [8]:

$$\zeta \stackrel{d}{=} \delta \left( \frac{\lambda}{\mu^r} \right)^{1/\nu}, \tag{5}$$

where the independent random variables *λ* and *μ* have gamma distributions *GG*(1, *t*, 1) and *GG*(1,*s*, 1), respectively. Moreover, if we put *r* = 0, the right-hand side of (5) has the distribution *GG*(*ν*, *t*, *δ*) [8], and for *r* = 1, the right-hand side of (5) has the distribution *GB*2(*ν*,*s*, *t*, *δ*) [19]. Consequently, the gamma-exponential distribution can be viewed as a distribution connecting and generalizing the gamma and beta classes.

In practice, the researcher deals with observable quantities that reflect the evolution of the analyzed real process, and model assumptions are made about the form of their distribution. The problem of estimating unknown parameters from real data thus also arises when a real process is modeled by the gamma-exponential distribution. Due to the representation of the density (1) in terms of the special gamma-exponential Function (2), the maximum likelihood method seems too complicated. The same can be said about the direct method of moments, since the moments of the distribution (1) are represented as products of non-monotone gamma functions [1]:

$$\mathbb{E}\zeta^{m} = \frac{\delta^{m}\Gamma(t+m/\nu)\Gamma(s-mr/\nu)}{\Gamma(t)\Gamma(s)},\ \ t+\frac{m}{\nu} > 0,\ \ s-\frac{mr}{\nu} > 0. \tag{6}$$
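Representation (5) gives a direct sampler for *ζ*, which in turn allows a Monte Carlo sanity check of Formula (6). The helper names and the parameter set below are illustrative, not from the paper:

```python
import math
import random

def ge_moment(m, r, nu, s, t, delta):
    """Moment E[zeta^m] by Formula (6); requires t + m/nu > 0 and s - m*r/nu > 0."""
    return (delta ** m * math.gamma(t + m / nu) * math.gamma(s - m * r / nu)
            / (math.gamma(t) * math.gamma(s)))

def sample_ge(n, r, nu, s, t, delta, rng):
    """Draw n values of zeta via representation (5): delta * (lam / mu**r) ** (1/nu)."""
    return [delta * (rng.gammavariate(t, 1.0) / rng.gammavariate(s, 1.0) ** r) ** (1.0 / nu)
            for _ in range(n)]
```

With, e.g., (*r*, *ν*, *s*, *t*, *δ*) = (0.5, 2.5, 2.4, 1.9, 1.0), the empirical mean of 2·10⁵ draws agrees with `ge_moment(1, ...)` to within about one percent.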

For this reason, in the Refs. [20,21] it was proposed to estimate the parameters of the gamma-exponential distribution using a modified method based on logarithmic moments. In this paper, we consider the estimators for three out of five parameters of the gamma-exponential distribution, constructed by the method of logarithmic cumulants.

The paper is organized as follows. Section 2 is devoted to the description of the method based on logarithmic cumulants; it provides an explicit form of the theoretical logarithmic cumulants, their connection with logarithmic moments, as well as the form of strongly consistent estimators obtained by this method. Section 3 contains auxiliary relations necessary for formulating the main results. Section 4 contains the main results of the paper on the asymptotic normality of the estimators for the unknown parameters. In Section 5, a numerical analysis of the obtained results is carried out using generated samples. The paper also contains the Discussion and Conclusions sections.

#### **2. Estimators for the Parameters of the Gamma-Exponential Distribution**

This section defines the estimators for the parameters of bent *r*, shape *ν*, and scale *δ* of the gamma-exponential distribution (1) for fixed values of the concentration parameters *s* and *t*. These estimators were obtained by equating the sample and theoretical cumulants of the gamma-exponential distribution.

Let us introduce the polygamma functions

$$\psi(z) = \frac{d}{dz} \ln \Gamma(z), \ \psi^{(m)}(z) = \frac{d^{m+1}}{dz^{m+1}} \ln \Gamma(z), \ m = 1, 2, \dots$$

To obtain an explicit form of theoretical logarithmic cumulants, consider the Mellin transform

$$\mathcal{M}\_{\xi}(z) = \int\_0^\infty x^z \, dF\_{\xi}(x), \ z \in \mathbb{C}.$$

We use Lemma 1 and the representation *ζ* <sup>*d*</sup>= *λ*/*μ*, where the independent random variables *λ* and *μ* have distributions *GG*(*ν*, *t*, *δ*) and *GG*(*ν*/*r*,*s*, 1), respectively. For *λ* ∼ *GG*(*ν*, *t*, *δ*), the Mellin transform has the form

$$\mathcal{M}\_{\lambda}(z) = \frac{\delta^z}{\Gamma(t)} \Gamma\left(t + \frac{z}{\nu}\right), \ t + \frac{\text{Re}(z)}{\nu} > 0.$$

Hence, for the ratio of *λ* ∼ *GG*(*ν*, *t*, *δ*) to *μ* ∼ *GG*(*ν*/*r*,*s*, 1)

$$\mathcal{M}\_{\lambda/\mu}(z) = \frac{\delta^z}{\Gamma(t)\Gamma(s)}\Gamma\left(t + \frac{z}{\nu}\right)\Gamma\left(s - \frac{rz}{\nu}\right), \ t + \frac{\text{Re}(z)}{\nu} > 0, \ s - \frac{r\text{Re}(z)}{\nu} > 0,$$

from where we get the characteristic function of the logarithm of *ζ*:

$$\mathbb{E}e^{i y \ln \zeta} = \frac{\delta^{iy}}{\Gamma(t)\Gamma(s)}\Gamma\left(t + \frac{i y}{\nu}\right)\Gamma\left(s - \frac{i r y}{\nu}\right), \ y \in \mathbb{R}.$$

Thus, the cumulants of the random variable ln *ζ* for fixed *s* and *t* have the form

$$\kappa\_1(r,\nu,\delta) = \mathbb{E}\ln\zeta = \frac{\nu\ln\delta + \psi(t) - r\psi(s)}{\nu};$$

$$\kappa\_m(r,\nu) = (-i)^m \frac{d^m}{dy^m} \ln\mathbb{E}e^{iy\ln\zeta}\Big|\_{y=0} = \frac{\psi^{(m-1)}(t) + (-r)^m\psi^{(m-1)}(s)}{\nu^m}, \quad m > 1. \tag{7}$$

The moments of the random variable ln *ζ* can be represented as [22]

$$
\mu\_m(r, \nu, \delta) \equiv \mathbb{E} \ln^m \zeta = B\_m(\kappa\_1(r, \nu, \delta), \kappa\_2(r, \nu), \dots, \kappa\_m(r, \nu)), \tag{8}
$$

where *Bm* is a complete exponential Bell polynomial that can be recurrently defined as

$$B\_{m+1}(x\_1, \dots, x\_{m+1}) = \sum\_{k=0}^{m} \binom{m}{k} B\_{m-k}(x\_1, \dots, x\_{m-k})\, x\_{k+1}, \quad B\_0 = 1.$$
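The recurrence translates directly into code; a short sketch (the function name `bell_moments` is ours) that returns the raw moments *B<sub>m</sub>*(*κ*<sub>1</sub>, ..., *κ<sub>m</sub>*) from a list of cumulants, as in (8):

```python
import math

def bell_moments(kappas):
    """B_0..B_m via B_{m+1} = sum_k C(m, k) B_{m-k} kappa_{k+1}; by (8) these
    are the raw moments corresponding to the cumulants kappas[0..m-1]."""
    B = [1.0]
    for j in range(len(kappas)):
        B.append(sum(math.comb(j, k) * B[j - k] * kappas[k] for k in range(j + 1)))
    return B
```

For the standard normal cumulants (0, 1, 0, 0) this returns the moments (1, 0, 1, 0, 3).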

An explicit form of the necessary relations connecting moments and cumulants can be found in the Ref. [22].

In addition, we will need the following moment characteristics of the logarithm of a random variable with a gamma-exponential distribution, calculated using the Formula (8):

$$\sigma\_m^2(r, \nu, \delta) \equiv \mathbb{D} \ln^m \zeta = \mu\_{2m}(r, \nu, \delta) - \mu\_m^2(r, \nu, \delta); \tag{9}$$

$$\sigma\_{ml}(r,\nu,\delta) \equiv \text{cov}(\ln^m \zeta, \ln^l \zeta) = \mu\_{m+l}(r,\nu,\delta) - \mu\_m(r,\nu,\delta)\mu\_l(r,\nu,\delta). \tag{10}$$

To define the sample logarithmic cumulants, we introduce a notation for the sample logarithmic moments of the random variable *ζ*:

$$L\_m(X) = \frac{1}{n} \sum\_{i=1}^n \ln^m X\_i, \tag{11}$$

where *X* = (*X*1,..., *Xn*) is a sample from the distribution of *ζ*. Let us denote *l* = (*l*1, *l*2, *l*3, *l*4). Consider the functions

$$K\_1(l) \equiv K\_1(l\_1) = (\psi(s))^{-1} l\_1;$$

$$K\_2(l) \equiv K\_2(l\_1, l\_2) = (\psi'(s))^{-1} (l\_2 - l\_1^2);$$

$$K\_3(l) \equiv K\_3(l\_1, l\_2, l\_3) = (\psi''(s))^{-1} (l\_3 - 3l\_2l\_1 + 2l\_1^3);$$

$$K\_4(l) \equiv K\_4(l\_1, l\_2, l\_3, l\_4) = (\psi'''(s))^{-1} (l\_4 - 4l\_3l\_1 - 3l\_2^2 + 12l\_2l\_1^2 - 6l\_1^4).$$
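The bracketed combinations above are the usual sample cumulants of ln *X* built from the logarithmic moments (11); a sketch computing them before division by the polygamma factors (the function name is ours):

```python
import math

def log_cumulants(xs):
    """Sample logarithmic cumulants k_1..k_4 of the data xs; the statistics
    K_m(X) of the text are k_m divided by psi^{(m-1)}(s)."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    l1 = sum(logs) / n
    l2 = sum(y ** 2 for y in logs) / n
    l3 = sum(y ** 3 for y in logs) / n
    l4 = sum(y ** 4 for y in logs) / n
    return (l1,
            l2 - l1 ** 2,
            l3 - 3 * l2 * l1 + 2 * l1 ** 3,
            l4 - 4 * l3 * l1 - 3 * l2 ** 2 + 12 * l2 * l1 ** 2 - 6 * l1 ** 4)
```

A convenient correctness check: *k*<sub>2</sub> and *k*<sub>3</sub> coincide with the second and third central moments of ln *X*, and *k*<sub>4</sub> with the fourth central moment minus 3*k*<sub>2</sub><sup>2</sup>.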

Consider the statistics

$$K\_1(X) \equiv K\_1(L\_1(X));$$

$$K\_2(X) \equiv K\_2(L\_1(X), L\_2(X));\tag{12}$$

$$K\_3(X) \equiv K\_3(L\_1(X), L\_2(X), L\_3(X));$$

$$K\_4(X) \equiv K\_4(L\_1(X), L\_2(X), L\_3(X), L\_4(X));\tag{13}$$

$$K(X) = (K\_1(X), K\_2(X), K\_3(X), K\_4(X)).$$

Note that the statistics *ψ*<sup>(*m*−1)</sup>(*s*)*K<sub>m</sub>*(*X*) are the *m*-th sample logarithmic cumulants of the gamma-exponential distribution.

The method for estimating unknown parameters considered in the paper is based on solving the system for logarithmic cumulants:

$$\kappa\_m(r,\nu,\delta) = \psi^{(m-1)}(s)K\_m(X),\ \ m = 1,2,3,4. \tag{14}$$

To describe the solution of this system, we introduce a number of functions of sample logarithmic cumulants with the arguments *k* = (*k*1, *k*2, *k*3, *k*4):

$$\phi\_m = \frac{\psi^{(m)}(t)}{\psi^{(m)}(s)}; \quad \tau(k) \equiv \tau(k\_2, k\_4) = \phi\_1^2 k\_4 + \phi\_3(k\_4 - k\_2^2); \tag{15}$$

$$R\_{\pm}(k) \equiv R\_{\pm}(k\_2, k\_4) = \sqrt{\frac{\phi\_1 k\_4 \pm k\_2 \sqrt{\tau(k)}}{k\_2^2 - k\_4}};\tag{16}$$

$$V\_{\pm}(k) \equiv V\_{\pm}(k\_2, k\_4) = \sqrt{\frac{\phi\_1 k\_2 \pm \sqrt{\tau(k)}}{k\_2^2 - k\_4}};\tag{17}$$

$$D\_{\pm}(k) \equiv D\_{\pm}(k\_1, k\_2, k\_4) = \exp\left\{\psi(s)k\_1 + \frac{\psi(s)R\_{\pm}(k) - \psi(t)}{V\_{\pm}(k)}\right\};\tag{18}$$

$$\Delta\_{\pm}(k) \equiv \Delta\_{\pm}(k\_2, k\_3, k\_4) = \left| \frac{\phi\_2 - R\_{\pm}^{3}(k)}{V\_{\pm}^{3}(k)} - k\_3 \right|;$$

$$\Delta(k) \equiv \Delta(k\_2, k\_3, k\_4) = \min\{\Delta\_+(k), \Delta\_-(k)\};$$

$$R\_{\Delta}(k) \equiv R\_{\Delta}(k\_2, k\_3, k\_4) = \sqrt{\frac{\phi\_1 k\_4 - \text{sgn}(\Delta\_+(k) - \Delta\_-(k))\, k\_2 \sqrt{\tau(k)}}{k\_2^2 - k\_4}};$$

$$V\_{\Delta}(k) \equiv V\_{\Delta}(k\_2, k\_3, k\_4) = \sqrt{\frac{\phi\_1 k\_2 - \text{sgn}(\Delta\_+(k) - \Delta\_-(k)) \sqrt{\tau(k)}}{k\_2^2 - k\_4}};$$

$$D\_{\Delta}(k) \equiv D\_{\Delta}(k\_1, k\_2, k\_3, k\_4) = \exp\left\{\psi(s)k\_1 + \frac{\psi(s)R\_{\Delta}(k) - \psi(t)}{V\_{\Delta}(k)}\right\}.$$

The system (14) has several solutions. It was shown in the Ref. [23] that the estimators for the parameters of bent *r*, shape *ν*, and scale *δ* have the form

$$\hat{r}(X) = R\_{\Delta}(K(X));\tag{19}$$

$$\hat{\nu}(X) = V\_{\Delta}(K(X));\tag{20}$$

$$\hat{\delta}(X) = D\_{\Delta}(K(X)), \tag{21}$$

and the following statement holds.

**Lemma 2.** *For fixed parameters s and t of the distribution GE*(*r*, *ν*,*s*, *t*, *δ*)*, the estimators (19)–(21) for the parameters* 0 ≤ *r* < 1*, ν* > 0 *and δ* > 0 *are strongly consistent.*

**Remark 1.** *If it is known that ν* < 0*, one should consider the estimator ν*ˆ(*X*) = −*V*Δ(*K*(*X*)) *instead of V*Δ(*K*(*X*))*, and the estimator*

$$\hat{\delta}(X) = \exp\left\{ \psi(s)K\_1(X) - \frac{\psi(s)R\_{\Delta}(K(X)) - \psi(t)}{V\_{\Delta}(K(X))} \right\} \tag{22}$$

*instead of D*Δ(*K*(*X*))*.*

#### **3. Auxiliary Relations**

In what follows, we will need the derivatives of Functions (16)–(18) expressed in terms of the functions *φ<sub>m</sub>* and *τ* defined in (15). Note that

$$R\_{k\_2,\pm}(k) \equiv \frac{\partial R\_{\pm}}{\partial k\_2}(k\_2, k\_4) = \mp\frac{k\_4\left(\phi\_1^2 k\_2^2 + \tau(k) \pm 2\phi\_1 k\_2 \sqrt{\tau(k)}\right)}{2\left(k\_2^2 - k\_4\right)^{3/2}\sqrt{\tau(k)}\sqrt{\phi\_1 k\_4 \pm k\_2\sqrt{\tau(k)}}};$$

$$R\_{k\_4,\pm}(k) \equiv \frac{\partial R\_{\pm}}{\partial k\_4}(k\_2, k\_4) = \pm\frac{k\_2\left(\phi\_1^2 k\_2^2 + \tau(k) \pm 2\phi\_1 k\_2 \sqrt{\tau(k)}\right)}{4\left(k\_2^2 - k\_4\right)^{3/2}\sqrt{\tau(k)}\sqrt{\phi\_1 k\_4 \pm k\_2\sqrt{\tau(k)}}};$$

$$V\_{k\_2,\pm}(k) \equiv \frac{\partial V\_{\pm}}{\partial k\_2}(k\_2, k\_4) = \mp\frac{\left(k\_2\left(\phi\_1^2 k\_4 + \tau(k)\right) \pm \phi\_1\left(k\_2^2 + k\_4\right)\sqrt{\tau(k)}\right)}{2\left(k\_2^2 - k\_4\right)^{3/2}\sqrt{\tau(k)}\sqrt{\phi\_1 k\_2 \pm \sqrt{\tau(k)}}};$$

$$V\_{k\_4,\pm}(k) \equiv \frac{\partial V\_{\pm}}{\partial k\_4}(k\_2, k\_4) = \pm\frac{\left(\phi\_1^2 k\_2^2 + \tau(k) \pm 2\phi\_1 k\_2\sqrt{\tau(k)}\right)}{4\left(k\_2^2 - k\_4\right)^{3/2}\sqrt{\tau(k)}\sqrt{\phi\_1 k\_2 \pm \sqrt{\tau(k)}}};\tag{23}$$

$$D\_{k\_1,\pm}(k) \equiv \frac{\partial D\_{\pm}}{\partial k\_1}(k\_1, k\_2, k\_4) = \psi(s)\exp\left\{\psi(s)k\_1 + \frac{\psi(s)R\_{\pm}(k) - \psi(t)}{V\_{\pm}(k)}\right\};$$

$$D\_{k\_2,\pm}(k) \equiv \frac{\partial D\_{\pm}}{\partial k\_2}(k\_1, k\_2, k\_4) = \exp\left\{\psi(s)k\_1 + \frac{\psi(s)R\_{\pm}(k) - \psi(t)}{V\_{\pm}(k)}\right\} \times \frac{\psi(t)V\_{k\_2,\pm}(k) + \psi(s)R\_{k\_2,\pm}(k)V\_{\pm}(k) - \psi(s)R\_{\pm}(k)V\_{k\_2,\pm}(k)}{V\_{\pm}^2(k)};$$

$$D\_{k\_4,\pm}(k) \equiv \frac{\partial D\_{\pm}}{\partial k\_4}(k\_1, k\_2, k\_4) = \exp\left\{\psi(s)k\_1 + \frac{\psi(s)R\_{\pm}(k) - \psi(t)}{V\_{\pm}(k)}\right\} \times \frac{\psi(t)V\_{k\_4,\pm}(k) + \psi(s)R\_{k\_4,\pm}(k)V\_{\pm}(k) - \psi(s)R\_{\pm}(k)V\_{k\_4,\pm}(k)}{V\_{\pm}^2(k)}.$$

Using the formula for the derivative of a composite function, we obtain

$$\frac{\partial R\_{\pm}}{\partial l\_1}(l) = -\frac{2l\_1}{\psi'(s)} R\_{k\_2,\pm}(K\_2(l), K\_4(l)) - \frac{4l\_3 - 24l\_2 l\_1 + 24l\_1^3}{\psi'''(s)} R\_{k\_4,\pm}(K\_2(l), K\_4(l));$$

$$\frac{\partial R\_{\pm}}{\partial l\_2}(l) = \frac{1}{\psi'(s)} R\_{k\_2,\pm}(K\_2(l), K\_4(l)) - \frac{6l\_2 - 12l\_1^2}{\psi'''(s)} R\_{k\_4,\pm}(K\_2(l), K\_4(l));$$

$$\frac{\partial R\_{\pm}}{\partial l\_3}(l) = -\frac{4l\_1}{\psi'''(s)} R\_{k\_4,\pm}(K\_2(l), K\_4(l)); \qquad \frac{\partial R\_{\pm}}{\partial l\_4}(l) = \frac{1}{\psi'''(s)} R\_{k\_4,\pm}(K\_2(l), K\_4(l));$$

$$\frac{\partial V\_{\pm}}{\partial l\_1}(l) = -\frac{2l\_1}{\psi'(s)} V\_{k\_2,\pm}(K\_2(l), K\_4(l)) - \frac{4l\_3 - 24l\_2 l\_1 + 24l\_1^3}{\psi'''(s)} V\_{k\_4,\pm}(K\_2(l), K\_4(l));$$

$$\frac{\partial V\_{\pm}}{\partial l\_2}(l) = \frac{1}{\psi'(s)} V\_{k\_2,\pm}(K\_2(l), K\_4(l)) - \frac{6l\_2 - 12l\_1^2}{\psi'''(s)} V\_{k\_4,\pm}(K\_2(l), K\_4(l));$$

$$\frac{\partial V\_{\pm}}{\partial l\_3}(l) = -\frac{4l\_1}{\psi'''(s)} V\_{k\_4,\pm}(K\_2(l), K\_4(l)); \qquad \frac{\partial V\_{\pm}}{\partial l\_4}(l) = \frac{1}{\psi'''(s)} V\_{k\_4,\pm}(K\_2(l), K\_4(l)); \tag{24}$$

$$\frac{\partial D\_{\pm}}{\partial l\_1}(l) = \frac{1}{\psi(s)} D\_{k\_1,\pm}(K\_1(l), K\_2(l), K\_4(l)) - \frac{2l\_1}{\psi'(s)} D\_{k\_2,\pm}(K\_1(l), K\_2(l), K\_4(l)) - \frac{4l\_3 - 24l\_2 l\_1 + 24l\_1^3}{\psi'''(s)} D\_{k\_4,\pm}(K\_1(l), K\_2(l), K\_4(l));$$

$$\frac{\partial D\_{\pm}}{\partial l\_2}(l) = \frac{1}{\psi'(s)} D\_{k\_2,\pm}(K\_1(l), K\_2(l), K\_4(l)) - \frac{6l\_2 - 12l\_1^2}{\psi'''(s)} D\_{k\_4,\pm}(K\_1(l), K\_2(l), K\_4(l));$$

$$\frac{\partial D\_{\pm}}{\partial l\_3}(l) = -\frac{4l\_1}{\psi'''(s)} D\_{k\_4,\pm}(K\_1(l), K\_2(l), K\_4(l)); \qquad \frac{\partial D\_{\pm}}{\partial l\_4}(l) = \frac{1}{\psi'''(s)} D\_{k\_4,\pm}(K\_1(l), K\_2(l), K\_4(l)),$$

where the partial derivatives of the functions *R*±(*k*), *V*±(*k*) and *D*±(*k*) are defined in the relations (23).

#### **4. Asymptotic Normality of the Estimators for the Parameters of the Gamma-Exponential Distribution**

Further arguments are based on the following statements [24].

**Lemma 3.** *In* R*n, the random vector Xn converges in distribution to the random vector X if and only if each linear combination of the components of Xn converges in distribution to the same linear combination of the components of X.*

**Lemma 4.** *Suppose that in* R*m,*

$$\sqrt{n}(T\_{n1}, \dots, T\_{nm}) \Longrightarrow N(\mu, \Sigma), \quad n \to \infty,$$

*with* Σ *a covariance matrix. Let g*(*t*) = *g*(*t*1, ... , *tm*) *be a real-valued function with a nonzero differential at t* = *μ. Put*

$$d = \left( \frac{\partial g}{\partial t\_1} \Big|\_{t=\mu}, \dots, \frac{\partial g}{\partial t\_m} \Big|\_{t=\mu} \right).$$

*Then*

$$\sqrt{n}\, g(T\_{n1}, \dots, T\_{nm}) \Longrightarrow N(g(\mu), d\Sigma d^T).$$
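Lemma 4 is the classical delta method. A small Monte Carlo illustration (entirely ours, not from the paper) in the scalar case: for means of Exp(1) variables (*μ* = *σ* = 1) and *g*(*t*) = *t*², the limit standard deviation is |*g*′(1)| · 1 = 2:

```python
import math
import random

def scaled_deviations(reps, n, rng):
    """Values sqrt(n) * (g(T_n) - g(mu)) for g(t) = t^2 and T_n the mean of
    n Exp(1) draws; by the delta method their sd tends to |g'(1)| = 2."""
    out = []
    for _ in range(reps):
        m = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append(math.sqrt(n) * (m * m - 1.0))
    return out
```

With a few thousand replications of samples of size 400, the empirical standard deviation of these values is close to 2, as the lemma predicts.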

Let us formulate the statements about the asymptotic normality of the estimators (19)–(21) with fixed concentration parameters *s* and *t*.

Denote

$$
\Sigma = \begin{pmatrix}
\sigma\_1^2(r,\nu,\delta) & \sigma\_{12}(r,\nu,\delta) & \sigma\_{13}(r,\nu,\delta) & \sigma\_{14}(r,\nu,\delta) \\
\sigma\_{12}(r,\nu,\delta) & \sigma\_2^2(r,\nu,\delta) & \sigma\_{23}(r,\nu,\delta) & \sigma\_{24}(r,\nu,\delta) \\
\sigma\_{13}(r,\nu,\delta) & \sigma\_{23}(r,\nu,\delta) & \sigma\_3^2(r,\nu,\delta) & \sigma\_{34}(r,\nu,\delta) \\
\sigma\_{14}(r,\nu,\delta) & \sigma\_{24}(r,\nu,\delta) & \sigma\_{34}(r,\nu,\delta) & \sigma\_4^2(r,\nu,\delta)
\end{pmatrix};\tag{25}
$$

$$d\_{R\_{\pm}} = \left( \frac{\partial R\_{\pm}}{\partial l\_1}(l) \Big|\_{l=\mu}, \frac{\partial R\_{\pm}}{\partial l\_2}(l) \Big|\_{l=\mu}, \frac{\partial R\_{\pm}}{\partial l\_3}(l) \Big|\_{l=\mu}, \frac{\partial R\_{\pm}}{\partial l\_4}(l) \Big|\_{l=\mu} \right), \tag{26}$$

where the variances *σ*<sup>2</sup><sub>*m*</sub>(*r*, *ν*, *δ*) and the covariances *σ<sub>ml</sub>*(*r*, *ν*, *δ*) are defined in (9) and (10), respectively, and the partial derivatives *∂R*<sub>±</sub>/*∂l<sub>k</sub>*(*l*) are defined in (24).

Let *φm*, *m* = 1, 3 be defined in (15); *R*±(*K*2(*X*), *K*4(*X*)) be defined in (16); *Km*(*X*), *m* = 2, 4, be defined in (12) and (13). The following statement holds.

**Theorem 1.** *Suppose that r*<sup>2</sup> ≠ (*φ*<sub>3</sub> − *φ*<sub>1</sub><sup>2</sup>)/(2*φ*<sub>1</sub>)*.*

*1. Let r*<sup>2</sup> > *φ*3/*φ*1*. Then the estimator r*ˆ(*X*) *for the unknown parameter r has the form r*ˆ(*X*) = *R*+(*K*2(*X*), *K*4(*X*))*, and when n* → ∞ *has the property of asymptotic normality:*

$$\sqrt{n}\frac{\hat{r}(X) - r}{\sqrt{d\_{R\_+} \Sigma d\_{R\_+}^T}} \Longrightarrow N(0, 1).$$

*2. Let r*<sup>2</sup> < *φ*3/*φ*1*. Then the estimator r*ˆ(*X*) *for the unknown parameter r has the form r*ˆ(*X*) = *R*−(*K*2(*X*), *K*4(*X*)) *and when n* → ∞ *has the property of asymptotic normality:*

$$
\sqrt{n}\frac{\hat{r}(X) - r}{\sqrt{d\_{R\_{-}}\Sigma d\_{R\_{-}}^{T}}} \implies N(0, 1).
$$

**Proof of Theorem 1.** The sample logarithmic moments *L<sub>m</sub>*(*X*) defined in (11) are arithmetic means of independent identically distributed random variables, with means *μ<sub>m</sub>*(*r*, *ν*, *δ*) defined in (8) and variances *σ*<sup>2</sup><sub>*m*</sub>(*r*, *ν*, *δ*)/*n*, where *σ*<sup>2</sup><sub>*m*</sub>(*r*, *ν*, *δ*) is defined in (9). Therefore, as *n* → ∞, the statistics *L<sub>m</sub>*(*X*), *m* = 1, 2, 3, 4, together with any of their linear combinations, are asymptotically normal, with the corresponding limit means depending on *μ<sub>m</sub>*(*r*, *ν*, *δ*) and with variances determined by the covariance matrix Σ given in (25).

In addition, under the conditions of the theorem, the components of the vector *dR*<sup>±</sup> defined in (26) are finite and the function *R*±(*K*2(*l*), *K*4(*l*)) has a nonzero differential at the point *μ* = (*μ*1(*r*, *ν*, *δ*),..., *μ*4(*r*, *ν*, *δ*)).

Thus, all conditions of Lemmas 3 and 4 are satisfied. Hence,

$$\sqrt{n}\, R\_{\pm}(K\_{2}(X), K\_{4}(X)) \Longrightarrow N\left(R\_{\pm}(K\_{2}(\mu), K\_{4}(\mu)),\ d\_{R\_{\pm}} \Sigma d\_{R\_{\pm}}^{T}\right).$$

Consider the limiting mean *R*±(*K*2(*μ*), *K*4(*μ*)). Note that when *n* → ∞

$$K\_2(X) \longrightarrow \frac{\phi\_1 + r^2}{\nu^2} \ \text{a.s.}; \qquad K\_4(X) \longrightarrow \frac{\phi\_3 + r^4}{\nu^4} \ \text{a.s.}$$

Therefore, for the function *τ*(*k*) defined in (15),

$$\tau(K\_2(X), K\_4(X)) \longrightarrow \frac{(\phi\_1 r^2 - \phi\_3)^2}{\nu^4} \ \text{a.s.}$$

when *n* → ∞.

Let *r*<sup>2</sup> > *φ*3/*φ*1. Then

$$R\_+(K\_2(X), K\_4(X)) \longrightarrow R\_+(K\_2(\mu), K\_4(\mu)) = r \ \text{a.s.},$$

and the statistic *r*ˆ(*X*) = *R*+(*K*2(*X*), *K*4(*X*)) is a strongly consistent estimator for *r*. Since

$$R\_{-}(K\_{2}(\mu), K\_{4}(\mu)) = \sqrt{\frac{2\phi\_{1}\phi\_{3} + (\phi\_{3} - \phi\_{1}^{2})r^{2}}{\phi\_{1}^{2} - \phi\_{3} + 2\phi\_{1}r^{2}}},$$

the statistic *R*<sub>−</sub>(*K*<sub>2</sub>(*X*), *K*<sub>4</sub>(*X*)) estimates the function *R*<sub>−</sub>(*K*<sub>2</sub>(*μ*), *K*<sub>4</sub>(*μ*)) ≠ *r* and does not satisfy the statement of Lemma 2.

If *r*<sup>2</sup> < *φ*<sub>3</sub>/*φ*<sub>1</sub>, we similarly conclude that the statistic *r*ˆ(*X*) = *R*<sub>−</sub>(*K*<sub>2</sub>(*X*), *K*<sub>4</sub>(*X*)) is a strongly consistent estimator for *r*, and that the statistic *R*<sub>+</sub>(*K*<sub>2</sub>(*X*), *K*<sub>4</sub>(*X*)) does not satisfy the statement of Lemma 2.

Let

$$\begin{split} d\_{V\_{\pm}} &= \left( \frac{\partial V\_{\pm}}{\partial l\_{1}}(l) \Big|\_{l=\mu}, \frac{\partial V\_{\pm}}{\partial l\_{2}}(l) \Big|\_{l=\mu}, \frac{\partial V\_{\pm}}{\partial l\_{3}}(l) \Big|\_{l=\mu}, \frac{\partial V\_{\pm}}{\partial l\_{4}}(l) \Big|\_{l=\mu} \right); \\ d\_{D\_{\pm}} &= \left( \frac{\partial D\_{\pm}}{\partial l\_{1}}(l) \Big|\_{l=\mu}, \frac{\partial D\_{\pm}}{\partial l\_{2}}(l) \Big|\_{l=\mu}, \frac{\partial D\_{\pm}}{\partial l\_{3}}(l) \Big|\_{l=\mu}, \frac{\partial D\_{\pm}}{\partial l\_{4}}(l) \Big|\_{l=\mu} \right), \end{split}$$

where the partial derivatives *∂V*±/*∂lk*(*l*) and *∂D*±/*∂lk*(*l*) are defined in (24).

Let *φm*, *m* = 1, 3 be defined in (15); *V*±(*K*2(*X*), *K*4(*X*)) be defined in (17); *D*±(*K*2(*X*), *K*4(*X*)) be defined in (18); *Km*(*X*), *m* = 2, 4, be defined in (12) and (13); and the matrix Σ be defined in (25).

Theorems 2 and 3 are proved in a completely similar way to Theorem 1.

**Theorem 2.** *Suppose that r*<sup>2</sup> ≠ (*φ*<sub>3</sub> − *φ*<sub>1</sub><sup>2</sup>)/(2*φ*<sub>1</sub>)*, ν* > 0*.*

*1. Let r*<sup>2</sup> > *φ*3/*φ*1*. Then the estimator ν*ˆ(*X*) *for the unknown parameter ν has the form ν*ˆ(*X*) = *V*+(*K*2(*X*), *K*4(*X*))*, and when n* → ∞ *has the property of asymptotic normality:*

$$\sqrt{n}\frac{\hat{\nu}(X) - \nu}{\sqrt{d\_{V\_+} \Sigma d\_{V\_+}^T}} \Longrightarrow N(0, 1).$$

*2. Let r*<sup>2</sup> < *φ*3/*φ*1*. Then the estimator ν*ˆ(*X*) *for the unknown parameter ν has the form ν*ˆ(*X*) = *V*−(*K*2(*X*), *K*4(*X*))*, and when n* → ∞ *has the property of asymptotic normality:*

$$\sqrt{n}\frac{\hat{\nu}(X) - \nu}{\sqrt{d\_{V\_-} \Sigma d\_{V\_-}^T}} \Longrightarrow N(0, 1).$$

**Theorem 3.** *Suppose that r*<sup>2</sup> ≠ (*φ*<sub>3</sub> − *φ*<sub>1</sub><sup>2</sup>)/(2*φ*<sub>1</sub>)*, ν* > 0*.*

*1. Let r*<sup>2</sup> > *φ*<sub>3</sub>/*φ*<sub>1</sub>*. Then the estimator δ*ˆ(*X*) *for the unknown parameter δ has the form δ*ˆ(*X*) = *D*<sub>+</sub>(*K*<sub>2</sub>(*X*), *K*<sub>4</sub>(*X*))*, and when n* → ∞ *has the property of asymptotic normality:*

$$
\sqrt{n}\frac{\hat{\delta}(X) - \delta}{\sqrt{d\_{D\_+} \Sigma d\_{D\_+}^T}} \Longrightarrow N(0, 1).
$$

*2. Let r*<sup>2</sup> < *φ*<sub>3</sub>/*φ*<sub>1</sub>*. Then the estimator δ*ˆ(*X*) *for the unknown parameter δ has the form δ*ˆ(*X*) = *D*<sub>−</sub>(*K*<sub>2</sub>(*X*), *K*<sub>4</sub>(*X*))*, and when n* → ∞ *has the property of asymptotic normality:*

$$
\sqrt{n}\frac{\hat{\delta}(X) - \delta}{\sqrt{d\_{D\_-} \Sigma d\_{D\_-}^T}} \Longrightarrow N(0, 1).
$$

**Remark 2.** *By analogy with the arguments of the Ref. [23] concerning the statement of Lemma 2, if it is known that ν* < 0*, it is easy to show that the statements of Theorems 2 and 3 hold for the statistics ν*ˆ(*X*) = −*V*<sub>Δ</sub>(*K*(*X*)) *and δ*ˆ(*X*) *defined in (22), with the corresponding modification of the vectors of derivatives d*<sub>*V*±</sub> *and d*<sub>*D*±</sub>*.*

Denote

$$s\_{mm}(X) \equiv \sigma\_m^2(\hat{r}(X), \hat{\nu}(X), \hat{\delta}(X));\tag{27}$$

$$s\_{ml}(X) = s\_{lm}(X) \equiv \sigma\_{ml}(\hat{r}(X), \hat{\nu}(X), \hat{\delta}(X));\tag{28}$$

$$d\_r^{[m]}(X) \equiv \frac{\partial \hat{r}(X)}{\partial l\_m};\ d\_{\nu}^{[m]}(X) \equiv \frac{\partial \hat{\nu}(X)}{\partial l\_m};\ d\_{\delta}^{[m]}(X) \equiv \frac{\partial \hat{\delta}(X)}{\partial l\_m},\tag{29}$$

where *σ*<sup>2</sup><sub>*m*</sub>(*r*, *ν*, *δ*) and *σ<sub>ml</sub>*(*r*, *ν*, *δ*) are defined in (9) and (10), respectively, and the estimators *r*ˆ(*X*), *ν*ˆ(*X*) and *δ*ˆ(*X*) satisfy the conditions of Theorems 1–3.

**Corollary 1.** *Suppose that the conditions of Theorems 1–3 are met; then,*

$$\sqrt{n}\frac{\hat{r}(X) - r}{\sqrt{\sum\_{m=1}^{4} \sum\_{l=1}^{4} d\_{r}^{[m]}(X) s\_{ml}(X) d\_{r}^{[l]}(X)}} \Longrightarrow N(0, 1);$$

$$\sqrt{n}\frac{\hat{\nu}(X) - \nu}{\sqrt{\sum\_{m=1}^{4} \sum\_{l=1}^{4} d\_{\nu}^{[m]}(X) s\_{ml}(X) d\_{\nu}^{[l]}(X)}} \Longrightarrow N(0, 1);$$

$$\sqrt{n}\frac{\hat{\delta}(X) - \delta}{\sqrt{\sum\_{m=1}^{4} \sum\_{l=1}^{4} d\_{\delta}^{[m]}(X) s\_{ml}(X) d\_{\delta}^{[l]}(X)}} \Longrightarrow N(0, 1),$$

*when n* → ∞*, where s<sub>ml</sub>*(*X*)*, d*<sub>*r*</sub><sup>[*m*]</sup>(*X*)*, d*<sub>*ν*</sub><sup>[*m*]</sup>(*X*)*, d*<sub>*δ*</sub><sup>[*m*]</sup>(*X*) *are defined in (27)–(29).*

**Proof of Corollary 1.** Due to the strong consistency of the estimators *r*ˆ(*X*), *ν*ˆ(*X*), and *δ*ˆ(*X*), the quadratic form

$$\sum\_{m=1}^{4} \sum\_{l=1}^{4} d\_r^{[m]}(X) s\_{ml}(X) d\_r^{[l]}(X)$$

converges almost surely to the normalizing quantity from Theorem 1. Therefore, by Slutsky's theorem, we obtain the statement of Corollary 1 for the estimator of the parameter *r*. Similarly, we obtain the statements for the estimators of the parameters *ν* and *δ*.

Based on Corollary 1, it is possible to construct asymptotic confidence intervals for unknown parameters of the gamma-exponential distribution.

By *uγ*, we denote the (1 + *γ*)/2-quantile of the standard normal distribution.

**Corollary 2.** *Suppose that the conditions of Theorems 1–3 are met; then, asymptotic confidence intervals with the confidence level γ based on the estimators r*ˆ(*X*)*, ν*ˆ(*X*)*, δ*ˆ(*X*) *for the unknown parameters r, ν, δ have the form*

$$(A\_r(X), B\_r(X)) = \left(\hat{r}(X) - \frac{u\_\gamma}{\sqrt{n}} C\_r(X),\ \hat{r}(X) + \frac{u\_\gamma}{\sqrt{n}} C\_r(X)\right);$$

$$(A\_\nu(X), B\_\nu(X)) = \left(\hat{\nu}(X) - \frac{u\_\gamma}{\sqrt{n}} C\_\nu(X),\ \hat{\nu}(X) + \frac{u\_\gamma}{\sqrt{n}} C\_\nu(X)\right);$$

$$(A\_\delta(X), B\_\delta(X)) = \left(\hat{\delta}(X) - \frac{u\_\gamma}{\sqrt{n}} C\_\delta(X),\ \hat{\delta}(X) + \frac{u\_\gamma}{\sqrt{n}} C\_\delta(X)\right),$$

*where*

$$C\_{r}(X) = \sqrt{\sum\_{m=1}^{4} \sum\_{l=1}^{4} d\_{r}^{[m]}(X) s\_{ml}(X) d\_{r}^{[l]}(X)};$$

$$C\_{\nu}(X) = \sqrt{\sum\_{m=1}^{4} \sum\_{l=1}^{4} d\_{\nu}^{[m]}(X) s\_{ml}(X) d\_{\nu}^{[l]}(X)};$$

$$C\_{\delta}(X) = \sqrt{\sum\_{m=1}^{4} \sum\_{l=1}^{4} d\_{\delta}^{[m]}(X) s\_{ml}(X) d\_{\delta}^{[l]}(X)},$$

*and s<sub>ml</sub>*(*X*)*, d*<sub>*r*</sub><sup>[*m*]</sup>(*X*)*, d*<sub>*ν*</sub><sup>[*m*]</sup>(*X*)*, d*<sub>*δ*</sub><sup>[*m*]</sup>(*X*) *are defined in (27)–(29).*

**Proof of Corollary 2.** The proof is based on the relation

$$\mathbb{P}\left(|\hat{r}(X) - r| < \frac{u\_{\gamma}}{\sqrt{n}} C\_{r}(X)\right) = \mathbb{P}\left(\frac{\sqrt{n}}{C\_{r}(X)} |\hat{r}(X) - r| < u\_{\gamma}\right) \asymp 2\Phi(u\_{\gamma}) - 1 = \gamma,$$

from which we obtain the form of the confidence interval (*Ar*(*X*), *Br*(*X*)). Similarly, asymptotic confidence intervals for the parameters *ν* and *δ* are obtained.
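Computing the intervals of Corollary 2 only requires the normal quantile *u<sub>γ</sub>*; a minimal helper (ours, not from the paper), where `c_value` stands for the corresponding normalizing quantity *C*(*X*):

```python
from statistics import NormalDist

def asymptotic_ci(estimate, c_value, n, gamma):
    """(A, B) = estimate -/+ (u_gamma / sqrt(n)) * C(X), with u_gamma the
    (1 + gamma)/2-quantile of the standard normal distribution."""
    u = NormalDist().inv_cdf((1.0 + gamma) / 2.0)
    half = u * c_value / n ** 0.5
    return estimate - half, estimate + half
```

For *γ* = 0.95 the quantile is *u<sub>γ</sub>* ≈ 1.96, so the half-width of the interval is about 1.96 *C*(*X*)/√*n*.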

#### **5. Numerical Analysis of Theoretical Results**

Let us consider the problem of obtaining numerical values of estimates for the parameters of bent *r*, shape *ν*, and scale *δ* of the gamma-exponential distribution *GE*(*r*, *ν*,*s*, *t*, *δ*) for fixed values of concentration parameters *s* and *t*.

The method for obtaining estimators for the parameters *r* and *ν* is based on solving the system of equations [23], where the theoretical logarithmic cumulants (7) of the second and fourth orders are equated to their sample counterparts:

$$\frac{\phi\_1 + r^2}{\nu^2} = \mathcal{K}\_2(X);\tag{30}$$

$$\frac{\phi\_3 + r^4}{\nu^4} = \mathcal{K}\_4(X). \tag{31}$$

Note that the solutions [23]

$$\hat{r}\_{\pm}^{2}(X) = \frac{\phi\_{1}K\_{4}(X) \pm K\_{2}(X)\sqrt{\tau(K(X))}}{K\_{2}^{2}(X) - K\_{4}(X)};\tag{32}$$

$$\hat{\nu}\_{\pm}^{2}(X) = \frac{\phi\_{1}K\_{2}(X) \pm \sqrt{\tau(K(X))}}{K\_{2}^{2}(X) - K\_{4}(X)}$$

do not uniquely determine the estimators for the parameters *r* and *ν*: the sign of *r* is known in advance (*r* ∈ [0, 1)), whereas *ν* can be either positive or negative. In addition, numerical experiments show that, for a fixed sample, the expression (32) can yield admissible estimates for either sign in front of the radical. For this reason, when processing real data, one should use an algorithm for filtering out the extraneous solutions of the system.

The algorithm for choosing the "correct" solution (*r*ˆ(*X*), *ν*ˆ(*X*)) of the system is as follows. At the first stage, one should try to determine the sign in front of the radical in the relation (32), using the domain of the parameter *r* ∈ [0, 1). According to the condition of Theorem 1, it is necessary to compare the value of *φ*<sub>3</sub>/*φ*<sub>1</sub> with one. If *φ*<sub>3</sub>/*φ*<sub>1</sub> ≥ 1, one should choose *r*ˆ(*X*) = *R*<sub>−</sub>(*K*(*X*)), where the function *R*<sub>−</sub>(*k*) is defined in (16). If *φ*<sub>3</sub>/*φ*<sub>1</sub> < 1, the values of the right-hand sides of (32) are calculated for the given sample. If the value of *r*ˆ<sup>2</sup><sub>±</sub>(*X*) for one of the signs in front of the radical does not belong to the interval [0, 1), the corresponding solution is eliminated, and the solution with the opposite sign in front of the radical is chosen as the estimate of *r*.

At the second stage, one should additionally use an equation similar to (30) and (31):

$$\frac{\phi\_2 - r^3}{\nu^3} = K\_3(X).$$

Since the estimators *r*ˆ(*X*) and *ν*ˆ(*X*), along with the statistics *K*3(*X*), are continuous functions of sample logarithmic moments,

$$\left| \frac{\phi\_2 - \hat{r}^3(X)}{\hat{\nu}^3(X)} - K\_3(X) \right| \longrightarrow 0 \ \text{a.s.}$$

when *n* → ∞.

For a fixed sample size *n*, this relation makes it possible to determine the "correct" solution of the system using the values

$$\Delta\_{\pm,+} = \left| \frac{\phi\_2 - R\_{\pm}^3(K(X))}{V\_{\pm}^3(K(X))} - K\_3(X) \right| \quad \text{and} \quad \Delta\_{\pm,-} = \left| \frac{\phi\_2 - R\_{\pm}^3(K(X))}{-V\_{\pm}^3(K(X))} - K\_3(X) \right|,$$

based on the following criterion: among the admissible pairs (*R*±(*K*(*X*)), ±*V*±(*K*(*X*))), the one with the smallest value of Δ is chosen as the solution (*r̂*(*X*), *ν̂*(*X*)).


The estimate *δ̂*(*X*) of the scale parameter *δ* is found from the equation for the first logarithmic cumulant

$$\ln \delta + \frac{\psi(t) - r\psi(s)}{\nu} = L_1(X),$$

by substituting the solution (*r̂*(*X*), *ν̂*(*X*)) found by the above algorithm for (*r*, *ν*); the estimate has the form

$$\hat{\delta}(X) = \exp\left\{L_1(X) + \frac{\psi(s)\hat{r}(X) - \psi(t)}{\hat{\nu}(X)}\right\}.\tag{33}$$
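The two-stage selection described above can be sketched in code. The sketch below is illustrative only and is not part of the paper: the root values *R*±(*K*(*X*)) from (32) (passed as `None` when a root is not real), the corresponding positive values *V*±(*K*(*X*)), and the statistics *φ*1, *φ*2, *φ*3 and *K*3(*X*) are assumed to have been computed beforehand, and the second-stage criterion is taken to be minimization of Δ, as in the examples of Tables 1–8.

```python
def select_solution(phi1, phi2, phi3, r_plus, r_minus, v_plus, v_minus, k3):
    """Two-stage choice of (r_hat, nu_hat); an illustrative sketch only.

    r_plus / r_minus are the values R_+(K(X)), R_-(K(X)) from (32)
    (None if the root is not real), v_plus / v_minus the corresponding
    positive values of V(K(X)), and k3 the statistic K_3(X).
    """
    # Stage 1: use the domain r in [0, 1) and the condition of Theorem 1.
    if phi3 / phi1 >= 1.0:
        admissible = [(r_minus, v_minus)]
    else:
        admissible = [(r, v)
                      for (r, v) in ((r_plus, v_plus), (r_minus, v_minus))
                      if r is not None and 0.0 <= r < 1.0]
    # Stage 2: minimize Delta = |(phi2 - r^3)/nu^3 - K_3(X)| over the
    # remaining candidates and over the sign of nu.
    best, best_delta = None, float("inf")
    for r, v in admissible:
        if r is None:
            continue
        for nu in (v, -v):
            delta = abs((phi2 - r ** 3) / nu ** 3 - k3)
            if delta < best_delta:
                best, best_delta = (r, nu), delta
    return best
```

With illustrative inputs for which *φ*3/*φ*1 ≥ 1, the first stage leaves only the minus branch, and the second stage fixes the sign of *ν* by the smaller value of Δ.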

Let us present some numerical results illustrating the method of choosing the "correct" estimates of the bent parameter *r*, the shape parameter *ν*, and the scale parameter *δ* at fixed concentration parameters *s* and *t* of the gamma-exponential distribution (1).

Tables 1–8 provide examples of numerical values of parameter estimates obtained using the algorithm for eliminating extraneous solutions and constructed from samples of size *n* from the model distribution (1) with a parameter set *E* = (*r*, *ν*, *s*, *t*, *δ*), together with examples of the boundary values of asymptotic confidence intervals with the confidence level *γ* = 0.95 for these estimates.

A simulation of pseudo-random samples from the gamma-exponential distribution is based on the relation (5) and is carried out using standard tools in any programming language that has the ability to generate samples from the gamma distribution.

Table 1 shows the values of the estimates of the parameters *r*, *ν* and *δ* obtained from a sample of size *n* from a model distribution with the parameter set *E* = (0.5; 2.5; 2.4; 1.9; 1.0). For this set of parameters, the inequality *φ*3/*φ*1 ≥ 1 holds, so the first stage of the algorithm for eliminating extraneous solutions gives the estimate *r̂*(*X*) = *R*−(*K*(*X*)). At the second stage, since Δ−,+ < Δ−,−, the pair (*R*−(*K*(*X*)), *V*−(*K*(*X*))) is selected as the solution (*r̂*(*X*), *ν̂*(*X*)). The estimate *δ̂*(*X*) is obtained from the relation (33). Table 2 shows the values of the estimates (*r̂*(*X*), *ν̂*(*X*), *δ̂*(*X*)) obtained in the previous step and the boundaries of the corresponding confidence intervals.


**Table 1.** Examples of estimates for the parameters of the model distribution *GE*(0.5; 2.5; 2.4; 1.9; 1.0).

**Table 2.** Examples of estimates for the parameters and boundaries of confidence intervals for a model distribution *GE*(0.5; 2.5; 2.4; 1.9; 1.0).


Table 3 shows the values of the estimates of the parameters *r*, *ν* and *δ* obtained from a sample of size *n* from a model distribution with the parameter set *E* = (0.7; −1.8; 3.6; 3.9; 1.5). For this set of parameters, the inequality *φ*3/*φ*1 < 1 holds, while the values of the statistic *R*+(*K*(*X*)) lie outside the domain of the parameter *r*, so the first stage of the algorithm for eliminating extraneous solutions gives the estimate *r̂*(*X*) = *R*−(*K*(*X*)). At the second stage, since Δ−,− < Δ−,+, the pair (*R*−(*K*(*X*)), −*V*−(*K*(*X*))) is selected as the solution (*r̂*(*X*), *ν̂*(*X*)). The estimate *δ̂*(*X*) is obtained from the relation (33). Table 4 shows the values of the estimates (*r̂*(*X*), *ν̂*(*X*), *δ̂*(*X*)) obtained in the previous step, and the boundaries of the corresponding confidence intervals.

**Table 3.** Examples of estimates for the parameters of the model distribution *GE*(0.7; −1.8; 3.6; 3.9; 1.5).


**Table 4.** Examples of estimates for the parameters and boundaries of confidence intervals for a model distribution *GE*(0.7; −1.8; 3.6; 3.9; 1.5).


Table 5 shows the values of the estimates of the parameters *r*, *ν* and *δ* obtained from a sample of size *n* from a model distribution with the parameter set *E* = (0.8; 1.3; 0.3; 1.4; 2.5). For this set of parameters, the expression under the outer radical in *R*−(*K*(*X*)) is negative, so the first stage of the algorithm for eliminating extraneous solutions gives the estimate *r̂*(*X*) = *R*+(*K*(*X*)). At the second stage, since Δ+,+ < Δ+,−, the pair (*R*+(*K*(*X*)), *V*+(*K*(*X*))) is selected as the solution (*r̂*(*X*), *ν̂*(*X*)). The estimate *δ̂*(*X*) is obtained from the relation (33). Table 6 shows the values of the estimates (*r̂*(*X*), *ν̂*(*X*), *δ̂*(*X*)) obtained in the previous step, and the boundaries of the corresponding confidence intervals.


**Table 5.** Examples of estimates for the parameters of the model distribution *GE*(0.8; 1.3; 0.3; 1.4; 2.5).

**Table 6.** Examples of estimates for the parameters and boundaries of confidence intervals for a model distribution *GE*(0.8; 1.3; 0.3; 1.4; 2.5).


Table 7 shows the values of the estimates of the parameters *r*, *ν* and *δ* obtained from a sample of size *n* from a model distribution with the parameter set *E* = (0.6; −2.9; 2.1; 3.9; 0.5). For this set of parameters, the inequality *φ*3/*φ*1 < 1 holds, while the values of both statistics *R*+(*K*(*X*)) and *R*−(*K*(*X*)) lie in the interval [0, 1). Therefore, at the second stage, since Δ+,− = min{Δ+,+, Δ+,−, Δ−,+, Δ−,−}, the pair (*R*+(*K*(*X*)), −*V*+(*K*(*X*))) is selected as the solution (*r̂*(*X*), *ν̂*(*X*)). The estimate *δ̂*(*X*) is obtained from the relation (33). Table 8 shows the values of the estimates (*r̂*(*X*), *ν̂*(*X*), *δ̂*(*X*)) obtained in the previous step, and the boundaries of the corresponding confidence intervals.

**Table 7.** Examples of estimates for the parameters of the model distribution *GE*(0.6; −2.9; 2.1; 3.9; 0.5).


**Table 8.** Examples of estimates for the parameters and boundaries of confidence intervals for a model distribution *GE*(0.6; −2.9; 2.1; 3.9; 0.5).


**Remark 3.** *In some cases, when processing real data by the above methods, the lengths of the confidence intervals may grow without bound. This indicates that the conditions of Theorems 1–3 are violated, that is, either r*² = (*φ*3 − *φ*1²)/(2*φ*1) *or r*² = *φ*3/*φ*1*.*

#### **6. Discussion**

The majority of models of real processes that use continuous distributions with unbounded non-negative support operate with special cases of the generalized gamma distribution, proposed in the 1920s by the Italian economist Amoroso within the framework of the study of dynamic equilibrium theory [7], and with special cases of the generalized beta distribution of the second kind, proposed in the 1980s by McDonald as a generalization of the well-known beta-type distributions used to model profitability [9].

The study of probabilistic and statistical properties of distributions from the gamma and beta classes is very important. For example, in the Refs. [10–14], it was proposed to use the generalized gamma distribution and its particular cases in problems of processing radar signals and images, evaluating the concentration of harmful gases in industrial areas, studying the periods of remission of cancer patients, analyzing neurotransmission and anorexia. The results of the Refs. [15–19] concerning the generalized beta distribution of the second kind and its representatives are used for meteorological research, analysis of infectious diseases, climatic phenomena and profitability, the study of physiological characteristics, and consumer price indexes, and can also be used in the theory of reliability when modeling the time of failure.

This article considers the gamma-exponential distribution, a generalization of the Amoroso distribution that gives the McDonald distribution in the limit. Thus, it can be argued that the results of the article will be in demand when studying various models that allow descriptions of real processes using continuous distributions with non-negative unbounded support.

The main difficulty in the study of the gamma-exponential distribution is that the density (1) is expressed in terms of the special gamma-exponential function (2). This makes it difficult to study the probabilistic and statistical properties of the distribution using classical methods such as, for example, the maximum likelihood method. In addition, the moments (6) of the gamma-exponential distribution may not exist for some parameter values and are products of nonmonotonic gamma functions whose arguments depend on several parameters at once. This significantly complicates not only the application of the method of moments, but also the interpretation of the parameters as characteristics of the mean, spread, asymmetry, and so forth. The latter cannot be considered a disadvantage of this distribution, since in practice the value of each characteristic is influenced by many factors of a different nature.

The results of this paper concern the estimation of the bent, shape and scale parameters of the gamma-exponential distribution under the assumption that the concentration parameters are known and fixed. This formulation naturally arises in the case of using the gamma-exponential distribution to study scale mixtures of Rayleigh, Maxwell–Boltzmann, Fréchet (Weibull–Gnedenko), Lévy (with zero bias) distributions, and some others. However, a natural question arises about the form of statistical estimates in the case when all five parameters are unknown.

The difficulty in applying the considered method, based on equating theoretical logarithmic cumulants to their sample counterparts, lies in the fact that the concentration parameters enter the equations as arguments of polygamma functions.

There are many papers related to the study of polygamma functions. For example, the Refs. [25,26] provide the estimates of polygamma functions and inverse polygamma functions in terms of elementary functions, Riemann and Hurwitz zeta functions, and Bernoulli numbers, and also investigate the monotonicity properties of expressions associated with polygamma functions. However, the usefulness of these results for the statistical estimation of the arguments of polygamma functions is not obvious yet.

Thus, the problem of developing effective theoretical methods of inverting polygamma functions is an urgent and, apparently, unsolved problem. However, due to the strict monotonicity and continuity of the polygamma functions, the possibility of numerical inversion is obvious, which will allow for an estimation of all five parameters of the gamma-exponential distribution using computer technologies. The solution to this problem is a direction of further research of the authors.

#### **7. Conclusions**

The paper considers the problem of estimating the parameters of the gamma-exponential distribution, which is a generalization of and an intermediate link between the generalized gamma distribution and the generalized beta distribution of the second kind. A method for estimating the unknown parameters based on logarithmic cumulants is discussed. An algorithm for eliminating extraneous solutions of the system based on logarithmic cumulants is described. The asymptotic normality of the strongly consistent estimators for the bent, shape and scale parameters of the gamma-exponential distribution at fixed concentration parameters is proved. Based on this result, asymptotic confidence intervals for the estimated parameters are constructed. The results are illustrated by numerical examples constructed on the basis of model samples from the gamma-exponential distribution, implemented using the representation of the gamma-exponential distribution as a fractional-scale mixture of gamma distributions. Possible applications of the results to the analysis of processes described by continuous distributions with non-negative unbounded support are discussed.

**Author Contributions:** Conceptualization, A.K. and O.S.; methodology, A.K. and O.S.; formal analysis, A.K. and O.S.; investigation, A.K. and O.S.; writing—original draft preparation, A.K. and O.S.; writing—review and editing, A.K. and O.S.; supervision, A.K. and O.S.; funding acquisition, O.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Ministry of Science and Higher Education of the Russian Federation, project No. 075-15-2020-799.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Monte Carlo Algorithms for the Extracting of Electrical Capacitance**

**Andrei Kuznetsov \* and Alexander Sipin**

Department of Applied Mathematics, Vologda State University, Lenina 15, 160000 Vologda, Russia; cac1909@mail.ru

**\*** Correspondence: pm\_kan@mail.ru or kuznetsovan@vogu35.ru

**Abstract:** We present new Monte Carlo algorithms for extracting mutual capacitances for a system of conductors embedded in inhomogeneous isotropic dielectrics. We represent capacitances as functionals of the solution of the external Dirichlet problem for the Laplace equation. Unbiased and low-biased estimators for the capacitances are constructed on the trajectories of the Random Walk on Spheres or the Random Walk on Hemispheres. The calculation results show that the accuracy of these new algorithms does not exceed the statistical error of estimators, which is easily determined in the course of calculations. The algorithms are based on mean value formulas for harmonic functions in different domains and do not involve a transition to a difference problem. Hence, they do not need a lot of storage space.

**Keywords:** capacitance; Dirichlet boundary value problem; Monte Carlo method; unbiased estimator; von Neumann-Ulam scheme

#### **1. Introduction**

The problems of finding potentials and mutual capacitances for complex three-dimensional objects have become widespread with the development of high-frequency electrical engineering. In the case of one or two conductors, they can still be solved analytically, but solving problems for systems of many conductors of complex shape causes significant difficulties. The higher the operating frequency, the greater the impact of parasitic capacitance and inductance on the system. This is true for radio-frequency communication devices, as well as for very-large-scale integration circuits and multilayer printed-circuit boards [1,2].

In inhomogeneous media with permittivity *ε*(*x*) the electrostatic potential *ϕ*(*x*) satisfies the boundary value problem:

$$\left.\begin{aligned}
&\Delta\varphi = 0, \quad x \in \mathbf{R}^3 \setminus (\Gamma_i \cup \Gamma_d); \qquad \varphi(x) \xrightarrow[|x|\to\infty]{} 0;\\
&\varphi|_{\Gamma_i} = \varphi_i, \quad \varphi_i = \mathrm{const};\\
&\varphi_+(x) = \varphi_-(x), \quad x \in \Gamma_d;\\
&\varepsilon_+\frac{\partial\varphi_+(x)}{\partial n} = \varepsilon_-\frac{\partial\varphi_-(x)}{\partial n}, \quad x \in \Gamma_d;\\
&\oint_{\Gamma_i}\varepsilon\frac{\partial\varphi}{\partial n}\,dS = -q_i.
\end{aligned}\right\}\tag{1}$$

Here, Γ*i* denotes the conductor surfaces, Γ*d* is the union of the dielectric interfaces, *n* is the external normal to Γ*d*, |*x*| is the Euclidean length of *x*, *ϕ*+ and *ϕ*− are the values of the potential on the two sides of a dielectric interface, *ε*+ and *ε*− are the permittivities on the two sides of a dielectric interface, *ϕi* are the values of the potential on Γ*i*, *dS* is the differential element of area, and *qi* is the charge on Γ*i* (Figure 1).

**Citation:** Kuznetsov, A.; Sipin, A. Monte Carlo Algorithms for the Extracting of Electrical Capacitance. *Mathematics* **2021**, *9*, 2922. https://doi.org/10.3390/math9222922

Academic Editor: Tuan Phung-Duc

Received: 28 October 2021; Accepted: 14 November 2021; Published: 17 November 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

**Figure 1.** Domain for the boundary value problem.

Charges depend linearly on potentials [3]: $q_i = \sum_{j=1}^{m} C_{ij}\varphi_j$. Here, $C_{ij}$ is the mutual electrostatic capacitance of the conductors $i$ and $j$; it is known that $C_{ij} = C_{ji}$. Hence, $C_{ij}$ is equal to the charge $q_i$ when $\varphi_k = 0$ for all $k \neq j$ and $\varphi_j = 1$.
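The linear relation between charges and potentials can be illustrated with a toy numeric example; the matrix below is made up for illustration and is not from the paper. Setting the potential of the *j*-th conductor to one and all others to zero reads off the *j*-th column of the capacitance matrix:

```python
def charges(C, phi):
    # q_i = sum_j C_ij * phi_j for a given capacitance matrix C.
    m = len(phi)
    return [sum(C[i][j] * phi[j] for j in range(m)) for i in range(m)]

# Illustrative symmetric capacitance matrix (C_ij = C_ji), made-up values.
C = [[ 5.0, -2.0, -1.0],
     [-2.0,  6.0, -3.0],
     [-1.0, -3.0,  7.0]]

# phi_2 = 1 and phi_1 = phi_3 = 0 recovers the second column of C.
q = charges(C, [0.0, 1.0, 0.0])
```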

The analytical solution is only available for simple geometries [3] and cannot be used for real-life tasks. Another way is to use pattern-matching algorithms, but they depend on the available patterns and on the quality of the geometry approximation with patterns [2,4].

The methods used most for computing capacitances in complicated three-dimensional geometries are the boundary-element technique (for example, [5–7]) and Monte Carlo methods (for example, [8–11]).

The boundary element method is used to solve the system of integral equations of potential theory for the charge density on the surfaces of the conductors. The charge on a conductor is then calculated by integrating the density. The main drawbacks of these methods are the necessity of approximating the conductors' surfaces, high random-access memory requirements, and an additional computational error when the equations are solved by an iterative technique.

The Monte Carlo method is used to solve the Dirichlet boundary value problem (1). The capacitance is calculated using the Gaussian formula through the normal derivative of the potential. Monte Carlo algorithms for a boundary value problem are based on the representation of its solution in the form of the mathematical expectation of some random variable, which in mathematical statistics is called an unbiased estimator. A common drawback of Monte Carlo methods is the necessity of a large number of simulations, but they are usually highly parallelizable and have low random-access memory requirements.

There are various mean value formulas for the potential, which determine both the estimator itself and the type of Random Walk along whose trajectories it is calculated.

One of the first works on using the Monte Carlo method for real-life capacitance extraction is [8]. This article describes the Random Walk on Cubes method for rectilinear conductors in a homogeneous medium. The proposed algorithm uses the mean value theorem for the potential at the center of a cube. To simplify the procedure for modeling a Random Walk, the problem was discretized. The development of the method of Random Walk on Cubes in various directions (multiple dielectrics, non-Manhattan polygonal shapes, optimizations) can be found, for example, in [12–14]. Besides the statistical error of the Monte Carlo approximations, Walk on Cubes has an additional bias because of the approximation of Green's function for cubes by a Fourier series.

In [15], Random Walk on Boundary was described for calculating a conductor's capacitance in free space; in [9], Random Walk on Spheres and Walk on Boundary were used for estimating electrostatic properties of molecules, including cases with different (constant) permittivities. These methods were extended in [16] to the analysis of multi-dielectric integrated circuits of arbitrary geometry from a scanning electron microscopy image. Besides the statistical error of the Monte Carlo approximations, there is an additional bias due to various discretizations that cannot be estimated in the course of the calculations.

In this article, we discuss Monte Carlo algorithms that do not require discretization of the boundary value problem and consequently contain no approximation error. Due to this, it is possible to estimate the error of the approximate solution of the problem during the calculations. Furthermore, Random Walks in unbounded regions may, with positive probability, fail to reach the boundary of the conductors in finite time. Forced termination of the trajectory leads to a bias in the estimate of the potential, which authors usually do not take into account. Our proposed algorithms are free from this drawback.

In our previous works [10,11], we developed algorithms for mutual capacitance calculation in homogeneous media on trajectories of a Walk on Spheres and in inhomogeneous media on trajectories of a Walk on Hemispheres, when the dielectric interfaces are polyhedral. We summarize the main results of those works here.

In this paper, we also consider a new version of the Walk on Hemispheres and its application to the calculation of electrostatic capacitances for systems with various dielectric interfaces, including non-Manhattan geometries.

Using examples of conductor systems for which the capacitances can be calculated analytically [3], it is shown that the accuracy of the Monte Carlo approximation is within the statistical error. In more complex examples, the simulation results are compared with the capacitances computed by the programs FastCap2 and FFTCap [6,17,18].

The paper is organized as follows. Section 2 introduces a description of the problem. Section 3 describes different kinds of unbiased estimators for the capacitance. It begins with a description of previously proposed algorithms in Sections 3.1 and 3.2, followed by the description of a new version of the Walk on Hemispheres in Section 3.3, and finishes with a description of the generic algorithm for capacitance extraction with these methods in Section 3.4. Section 4 contains the numerical results for capacitance extraction, where we compare the results of the proposed algorithms with analytical solutions or other programs. Section 5 concludes the paper.

#### **2. Integral Representation for the Capacitance**

Using Gauss's theorem we have

$$C_{ij} = -\oint_{\Gamma} \varepsilon \frac{\partial \varphi}{\partial n}\, dS, \tag{2}$$

where Γ is a surface containing the *i*-th conductor inside and separating it from the other conductors and interfaces.

Using the Poisson formula, we obtain the following representation for the normal derivative of the potential on the shell Γ:

$$\frac{\partial \varphi(x)}{\partial n} = \frac{1}{4\pi r^2} \oint_{S_r} \frac{3}{r^2}\, (y - x, n)\, \varphi(y)\, d_y S, \tag{3}$$

where $x$ is a point on the shell around the $i$-th conductor, $r$ is the distance from the point $x$ to the nearest conductor or interface, $S_r$ is the sphere of radius $r$ centered at the point $x$, and $y$ is a point on $S_r$ (Figure 2).

**Figure 2.** First steps of the Random Walk on Spheres for the estimation of $C_{1j}$. Here $\Gamma_i$ are the conductor surfaces, and $\Gamma$ is a shell around the first conductor.

Finally, replacing the normal derivative *∂ϕ*/*∂n* by its integral representation in the ball, which lies entirely in the region with the dielectric constant *ε*, we obtain an integral representation of the mutual capacitance of the *i*-th and *j*-th conductors:

$$C_{ij} = -\frac{1}{\sigma_{\Gamma}} \oint_{\Gamma} \frac{\varepsilon}{r^{2}} \oint_{S_{r}} \frac{3\sigma_{\Gamma}}{4\pi r^{2}}\, (y - x, n)\, \varphi(y)\, d_{y}S\, d_{x}S, \tag{4}$$

where $\sigma_{\Gamma}$ is the surface area of Γ.

#### **3. Unbiased Estimators for the Capacitance**

Using Formula (4), we obtain an unbiased estimator for the capacitance $C_{ij}$:

$$\xi = \frac{3\varepsilon(X)\sigma_{\Gamma}}{r}\,(\omega, n)\,\varphi(X + r\omega). \tag{5}$$

Here, a random point *X* is uniformly distributed on Γ, *r* = *r*(*X*), and *ω* is an isotropic vector (random unit vector). It remains to estimate the potential at the point *Y* = *X* +*r*(*X*)*ω*. This can be done using the mean value formula

$$\varphi(x) = \int_{Q} \varphi(y)\, P(x, dy), \quad x \in Q,\tag{6}$$

where $Q = \mathbf{R}^3 \setminus D$, and $D$ is the set of interior points of all conductors. The unbiased estimators for $\varphi(Y)$ are constructed on the trajectories of a Random Walk $\{Y_k\}_{k=0}^{\infty}$, $Y_0 = Y$, in the space $Q$. The kernel $P(x, dy)$ must be stochastic or sub-stochastic; it determines the distribution of the next point of the Random Walk given the current point.

Let

$$\xi_0 = \frac{3\varepsilon(X)\sigma_{\Gamma}}{r}\,(\omega, n). \tag{7}$$

If at time $k$ the "weight" $W_k = P(Y_k, Q) < 1$, then the current value of the estimator is multiplied by the "weight": $\xi_{k+1} = W_k \xi_k$. The Random Walk stops at the time $\nu$ at which it reaches the $\delta$-boundary of the conductors, that is, when the distance $\operatorname{dist}(Y_k, \partial D)$ from the point $Y_k$ to the boundary of the conductors becomes less than $\delta$; hence, the walk must satisfy the condition $P\{\nu < \infty\} = 1$. We define the estimator $\xi_{\delta} = \xi_{\nu}$ if $\operatorname{dist}(Y_{\nu}, \Gamma_j) < \delta$, and zero otherwise. If the boundary $\partial D$ is smooth enough, then $|C_{ij} - E\xi_{\delta}| < c\delta$ for some constant $c$. In practice, the estimator $\xi_{\delta}$ can be simulated in a reasonable time only if $E\nu < \infty$. Having obtained a sufficient number of realizations of the estimator $\xi_{\delta}$ and calculated their arithmetic mean, we obtain an approximate value of the capacitance $C_{ij}$.

We will now describe some of the types of Random Walks used to calculate the capacitances of conductors.

#### *3.1. Random Walk on Spheres for the External Dirichlet Problem*

Random Walk on Spheres (WoS) is used to solve the external Dirichlet problem for the Laplace equation [19], and allows for the calculation of the capacitances of conductors in a homogeneous medium [10]. Let all the conductors lie inside a sphere *SR* of radius *R* centered at the origin. Let *ρ*(*y*) be a continuous function such that *c* · dist(*y*, *∂D*) ≤ *ρ*(*y*) ≤ dist(*y*, *∂D*) for some constant *c* > 0.

By the mean value theorem for harmonic functions, we obtain $\varphi(x) = E\varphi(x + \rho(x)\omega)$ for $x \in \mathbf{R}^3 \setminus D$. Let $\{\omega_k\}_{k=1}^{\infty}$ be a sequence of independent isotropic vectors. Then we get the Random Walk

$$Y_{k+1} = Y_k + \rho(Y_k)\,\omega_{k+1}, \qquad Y_0 = Y, \qquad W_k = 1, \qquad k = 0, 1, 2, \dots$$

To restrict the region of the Random Walk, we use the Poisson formula for |*x*| > *R*:

$$\varphi(x) = \frac{1}{4\pi R} \oint_{S_R} \frac{|x|^2 - R^2}{|x - y|^3}\, \varphi(y)\, d_y S.$$

Namely, if $|Y_k| > R$, then the "weight" is $W_k = R/|Y_k|$, and $Y_{k+1}$ is distributed on the sphere $S_R$ with the density

$$p(Y_{k}, y) = \frac{|Y_{k}|^{2} - R^{2}}{|Y_{k} - y|^{3}} \cdot \frac{|Y_{k}|}{4\pi R^{2}}\tag{8}$$

(Figure 3).

**Figure 3.** Random Walk on Spheres. Return on external sphere *SR*.

It is proved in [19] that the Random Walk on Spheres reaches the $\delta$-neighborhood of the boundary of the conductors in a finite time; formulas for simulating the Random Walk are also given there.
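As a self-contained illustration (not from the paper), the weighted Walk on Spheres with returns from $S_R$ can be checked on the simplest exterior problem: a single spherical conductor of radius $a$ held at unit potential, for which $\varphi(x) = a/|x|$ is known analytically. One standard way to sample the return point with the density (8), used below as an assumption (the paper refers to [19] for the exact formulas), is to invert the one-dimensional distribution of the cosine of the angle to the axis through the current point; $\delta$ is the stopping parameter.

```python
import math
import random

def _isotropic():
    # Uniform random direction on the unit sphere.
    z = random.uniform(-1.0, 1.0)
    t = random.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(max(0.0, 1.0 - z * z))
    return (s * math.cos(t), s * math.sin(t), z)

def _return_point(x, R):
    # Sample a point on S_R with the density (8), given |x| > R, by
    # inverting the CDF of mu = cos(angle to the axis through x).
    d = math.sqrt(x[0]**2 + x[1]**2 + x[2]**2)
    u = random.random()
    inv = 2.0 * R * u / (d * d - R * R) + 1.0 / (d + R)
    mu = (d * d + R * R - 1.0 / (inv * inv)) / (2.0 * d * R)
    mu = max(-1.0, min(1.0, mu))
    # Orthonormal frame with e3 along x.
    e3 = (x[0] / d, x[1] / d, x[2] / d)
    a = (1.0, 0.0, 0.0) if abs(e3[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = (e3[1]*a[2] - e3[2]*a[1], e3[2]*a[0] - e3[0]*a[2], e3[0]*a[1] - e3[1]*a[0])
    n1 = math.sqrt(e1[0]**2 + e1[1]**2 + e1[2]**2)
    e1 = (e1[0]/n1, e1[1]/n1, e1[2]/n1)
    e2 = (e3[1]*e1[2] - e3[2]*e1[1], e3[2]*e1[0] - e3[0]*e1[2], e3[0]*e1[1] - e3[1]*e1[0])
    ph = random.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(max(0.0, 1.0 - mu * mu))
    return tuple(R * (s * math.cos(ph) * e1[i] + s * math.sin(ph) * e2[i] + mu * e3[i])
                 for i in range(3))

def wos_potential(start, a=1.0, R=2.0, delta=1e-3, n_traj=20000, max_steps=10000):
    # Weighted Walk on Spheres estimate of phi(start) for a unit-potential
    # conducting sphere of radius a; the exact answer is a / |start|.
    total = 0.0
    for _ in range(n_traj):
        y, w = start, 1.0
        for _ in range(max_steps):
            d = math.sqrt(y[0]**2 + y[1]**2 + y[2]**2)
            if d - a < delta:          # absorbed on the conductor
                total += w
                break
            if d > R:                  # return on S_R with weight R/|y|
                w *= R / d
                y = _return_point(y, R)
                continue
            om = _isotropic()          # ordinary WoS step of radius d - a
            rho = d - a
            y = (y[0] + rho * om[0], y[1] + rho * om[1], y[2] + rho * om[2])
    return total / n_traj
```

For a start point at distance 1.5 from the center of a unit sphere, the estimate should be close to the exact value 2/3, up to the statistical error and the $O(\delta)$ bias.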

#### *3.2. Random Walk on Hemispheres*

The Random Walk on Hemispheres (WoH) algorithm was proposed in [20] for solving various boundary value problems for the Laplace and Poisson equations. It allows for the calculation of capacitances when dielectric interfaces are polyhedral [11]. In cases when surfaces of the conductors are also polyhedral, the algorithm gives unbiased statistical estimators of the capacitances. We will now briefly describe this algorithm.

Let all the conductors and dielectric interfaces lie inside a sphere $S_R$ of radius $R$ centered at the origin. If $|Y_k| > R$, then $Y_{k+1} \in S_R$ with the distribution density (8), and the "weight" is $W_k = R/|Y_k|$.

Now, let $Y_k \in \Gamma_{d_l}$, where $\Gamma_{d_l}$ is a component of the dielectric interface. We choose the maximal $r$ such that $0 < r < \operatorname{dist}(Y_k, D \cup \Gamma_d \setminus \Gamma_{d_l})$ and the part of $\Gamma_{d_l}$ lying in the sphere $S_r(Y_k)$ is planar. The sphere is divided into two parts $S_r^{+}(Y_k)$ and $S_r^{-}(Y_k)$, lying in media with permittivities $\varepsilon^{+}$ and $\varepsilon^{-}$, respectively. The point $Y_{k+1}$ is uniformly distributed on $S_r^{+}(Y_k)$ with probability $\varepsilon^{+}/(\varepsilon^{+} + \varepsilon^{-})$ or on $S_r^{-}(Y_k)$ with probability $\varepsilon^{-}/(\varepsilon^{+} + \varepsilon^{-})$ (Figure 4).

**Figure 4.** Random Walk on Hemispheres. Exit from interface.
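The interface transition can be sketched as follows. This is an illustrative fragment, not the paper's code: the interface plane is taken as $z = 0$, the sphere as the unit sphere, and the side is chosen with probabilities proportional to the permittivities, after which the next point is drawn uniformly on the chosen hemisphere.

```python
import math
import random

def interface_step(eps_plus, eps_minus):
    # Choose the half-sphere: the "+" side with probability eps+/(eps+ + eps-).
    side = 1.0 if random.random() < eps_plus / (eps_plus + eps_minus) else -1.0
    # Uniform point on the unit upper hemisphere (uniform in z and azimuth),
    # then flip to the chosen side; the interface plane is z = 0 here.
    z = random.uniform(0.0, 1.0)
    t = random.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(t), s * math.sin(t), side * z)
```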

If $Y_k \notin \Gamma_d$ and $|Y_k| \le R$, then $Y_{k+1}$ is distributed on a sphere or on a hemisphere. The center $\widetilde{Y}_k$ of the hemisphere must lie in a plane containing a face of the conductor surface or of the interface and is the orthogonal projection of $Y_k$ onto this plane.

The hemisphere radius is $r_k = |Y_k - \widetilde{Y}_k|/\beta$, where $0 < \beta < 1$ is a fixed constant. The hemisphere must be contained in a medium with the permittivity $\varepsilon(Y_k)$. The distribution density of the point $Y_{k+1}$ on the hemisphere is the normal derivative of Green's function for the half-ball:

$$p(Y_{k}, y) = \begin{cases} \dfrac{2 r_k \beta}{4\pi} \left( \dfrac{1}{|Y_k - y|^3} - \dfrac{1}{\big(\beta\, |Y_k^{*} - y|\big)^3} \right), & y \in H, \\[3mm] \dfrac{r_k}{4\pi}\left(1 - \beta^2\right) \left( \dfrac{1}{|Y_k - y|^3} - \dfrac{1}{|\overline{Y}_k - y|^3} \right), & y \in S. \end{cases} \tag{9}$$

Here $H$ is the planar part and $S$ is the spherical part of the hemisphere boundary. The point $\overline{Y}_k$ is symmetric to $Y_k$ relative to the plane $H$. The point $Y_k^{*}$ lies outside of the sphere $S$ and is inverse to the point $Y_k$ ($|\widetilde{Y}_k - Y_k| \cdot |\widetilde{Y}_k - Y_k^{*}| = r_k^2$) (Figure 5).

**Figure 5.** Random Walk on Hemispheres. Symmetrical points on hemisphere.
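The auxiliary points entering the density (9) can be constructed directly. The sketch below is illustrative; the plane is assumed to be given by a point and a unit normal, which is a representation chosen here, not taken from the paper. It computes the hemisphere center $\widetilde{Y}_k$, the radius $r_k$, the reflection $\overline{Y}_k$, and the inverse point $Y_k^{*}$, so that $|\widetilde{Y}_k - Y_k| \cdot |\widetilde{Y}_k - Y_k^{*}| = r_k^2$ holds.

```python
def image_points(y, p0, n, beta):
    # y: current point Y_k; p0: a point of the plane H; n: unit normal of H;
    # beta in (0, 1): the fixed constant of the walk.
    h = sum((y[i] - p0[i]) * n[i] for i in range(3))    # signed distance to H
    center = tuple(y[i] - h * n[i] for i in range(3))    # projection ~Y_k
    r = abs(h) / beta                                    # hemisphere radius r_k
    reflected = tuple(y[i] - 2.0 * h * n[i] for i in range(3))  # bar(Y)_k
    # Inversion in the sphere of radius r centered at ~Y_k:
    # |~Y_k - Y_k| * |~Y_k - Y_k^*| = r^2.
    factor = (r * r) / (h * h)
    inverse = tuple(center[i] + factor * (y[i] - center[i]) for i in range(3))
    return center, r, reflected, inverse
```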

If it is impossible to construct such a hemisphere, then $Y_{k+1}$ is distributed uniformly on a sphere of radius $r_k = \operatorname{dist}(Y_k, D \cup \Gamma_d)$ centered at $Y_k$.

Von Neumann's Acceptance-Rejection Method can be used to simulate density (9). To do this, we write the density in the form

$$p(Y_{k}, y) = \frac{1}{4\pi} \frac{\cos \varphi_{Y_k y}}{|Y_{k} - y|^{2}} \cdot k_{1}(Y_{k}, y), \tag{10}$$

where $\varphi_{Y_k y}$ is the angle between the vector $y - Y_k$ and the external normal to the surface of the hemisphere at the point $y$.

The first factor in this formula is the distribution density of the point $Z$ on the surface of the hemisphere for which the vector $\omega = (Z - Y_k)/|Z - Y_k|$ is isotropic. The second factor does not exceed the constant $M = \max(M_1, M_2)$, where

$$M_1 = \frac{2}{\beta} \sqrt{1 + \beta^2} \left( 1 - \left( \frac{\beta}{\sqrt{1 + \beta^2}} \right)^3 \right), \qquad M_2 = \sqrt{1 + \beta^2}\, \frac{1 + \beta}{1 - \beta} \left( 1 - \left( \frac{1 - \beta}{1 + \beta} \right)^3 \right).$$

To select the next point of the Random Walk, we simulate an isotropic vector $\omega$ and a random variable $\alpha$ uniformly distributed on $[0, 1]$. Then we find the point $Z$ at which the ray emerging from $Y_k$ in the direction $\omega$ crosses the hemisphere. If $\alpha M < k_1(Y_k, Z)$, then $Y_{k+1} = Z$; otherwise, the simulation is repeated until the inequality holds (Figure 6).

**Figure 6.** Random Walk on Hemispheres. Jump on hemisphere.
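The acceptance-rejection step has the same shape for any bounded ratio $k_1 \le M$. The following generic one-dimensional illustration is not the paper's density (9); it samples from a density proportional to $\mathrm{base}(x) \cdot k_1(x)$ by proposing from the base distribution and accepting when $\alpha M < k_1$:

```python
import random

def accept_reject(sample_base, k1, M, rng=random):
    # Von Neumann acceptance-rejection: propose Z ~ base, accept when
    # alpha * M < k1(Z); the accepted Z has density proportional to
    # base(z) * k1(z). Requires k1 <= M everywhere.
    while True:
        z = sample_base()
        if rng.random() * M < k1(z):
            return z

# Illustrative target: density proportional to (1 + x) on [0, 1],
# i.e. uniform proposals with k1(x) = 1 + x and M = 2.
def sample_linear():
    return accept_reject(random.random, lambda x: 1.0 + x, 2.0)
```

The mean of the target density $(1 + x)/1.5$ on $[0, 1]$ is $5/9$, which the sample mean should approach.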

#### *3.3. Random Walk on Hemispheres for a Convex Dielectric Interfaces (RWHC)*

Let *γ* be a connected convex part of some dielectric interface Γ*dl* lying inside a sphere *Sr*(*x*) of radius *r* centered at point *x*. For all *y* ∈ Γ*dl* we choose the direction of the normal vector *ny* so that the surface Γ*dl* lies in the half-space (*z* − *y*, *ny*) ≤ 0. The surface *γ* divides the sphere into two parts *S*<sup>+</sup> and *S*−, lying in media with permittivities *ε*<sup>+</sup> and *ε*−, respectively, and (*z* − *y*, *ny*) ≤ 0 for all *z* ∈ *S*−.

The potential *ϕ*(*x*) is a harmonic function in the part of the ball bounded by the surfaces *S*− and *γ*, and also in the part bounded by *S*+ and *γ*. Using the second Green's formula for a harmonic function in a bounded domain, we obtain the following theorem.

**Theorem 1.** *Let <sup>λ</sup>* <sup>=</sup> *<sup>ε</sup>*+/*ε*<sup>−</sup> *and let <sup>ϕ</sup>xy be the angle between vectors ny*, *<sup>y</sup>* <sup>−</sup> *<sup>x</sup>*. *Then the potential ϕ*(*y*) *satisfies the mean value formulas:*

$$\begin{split} \varphi(x) &= \frac{1}{1+\lambda}\cdot\frac{1}{2\pi r^2}\int_{S^-}\varphi(y)\,d_yS + \frac{\lambda}{1+\lambda}\cdot\frac{1}{2\pi r^2}\int_{S^+}\varphi(y)\,d_yS + \\ &\quad + \frac{1-\lambda}{1+\lambda}\cdot\frac{1}{2\pi}\int_{\gamma}\frac{\cos\varphi_{xy}}{|x-y|^2}\,\varphi(y)\,d_yS, \quad x\in\gamma, \end{split} \tag{11}$$

$$\begin{split} \varphi(x) &= \frac{1}{4\pi r^2}\int_{S^-}\varphi(y)\,d_yS + \lambda\cdot\frac{1}{4\pi r^2}\int_{S^+}\varphi(y)\,d_yS + \\ &\quad + (1-\lambda)\cdot\frac{1}{4\pi}\int_{\gamma}\frac{\cos\varphi_{xy}}{|x-y|^2}\,\varphi(y)\,d_yS, \quad x\notin\gamma, \quad \varepsilon(x)=\varepsilon^-, \end{split} \tag{12}$$

$$\begin{split} \varphi(x) &= \frac{1}{4\pi r^2}\int_{S^+}\varphi(y)\,d_yS + \frac{1}{\lambda}\cdot\frac{1}{4\pi r^2}\int_{S^-}\varphi(y)\,d_yS - \\ &\quad - \left(1-\frac{1}{\lambda}\right)\cdot\frac{1}{4\pi}\int_{\gamma}\frac{\cos\varphi_{xy}}{|x-y|^2}\,\varphi(y)\,d_yS, \quad x\notin\gamma, \quad \varepsilon(x)=\varepsilon^+. \end{split} \tag{13}$$

If *λ* < 1, Formula (11) defines a stochastic kernel. To simulate the transition from the surface *γ*, we choose with probability *λ*/(1 + *λ*) a random direction *ω* satisfying the condition (*ω*, *nx*) > 0 and define *Y* = *x* + *rω*. With probability 1/(1 + *λ*), we simulate a random direction *ω* satisfying the condition (*ω*, *nx*) < 0 and calculate *Y* = *x* + *rω*. If *Y* ∉ *S*−, we replace *Y* with the point *Z* ∈ *γ* that is visible from *x* in the direction *ω* (Figure 7).

**Figure 7.** RWHC. Jump from convex part of interface.
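For *λ* < 1, the transition from the interface can be sketched as below; the geometric helpers (the conditional isotropic directions, the membership test for *S*−, and the projection onto *γ*) are assumed to be supplied by the caller:

```python
import random

def step_from_interface(x, r, lam, sample_dir_plus, sample_dir_minus,
                        in_S_minus, project_to_gamma):
    """One transition from a point x on the interface, following Formula (11).

    lam = eps_plus / eps_minus (< 1).  Assumed helpers:
      sample_dir_plus / sample_dir_minus -- isotropic direction omega with
        (omega, n_x) > 0 or (omega, n_x) < 0, respectively;
      in_S_minus(Y) -- membership test for the spherical part S^-;
      project_to_gamma(x, omega) -- the point Z of gamma visible from x
        in the direction omega."""
    if random.random() < lam / (1.0 + lam):
        # with probability lam/(1+lam): jump into the S^+ side
        omega = sample_dir_plus(x)
        return [xi + r * wi for xi, wi in zip(x, omega)]
    # with probability 1/(1+lam): jump toward the S^- side
    omega = sample_dir_minus(x)
    Y = [xi + r * wi for xi, wi in zip(x, omega)]
    if not in_S_minus(Y):
        return project_to_gamma(x, omega)
    return Y
```

The mixture weights *λ*/(1 + *λ*) and 1/(1 + *λ*) match the coefficients of the *S*<sup>+</sup> and *S*− integrals in (11).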

If *λ* < 1, Formula (12) also defines a stochastic kernel. To simulate the transition from *x*, we simulate a random direction *ω* and calculate *Y* = *x* + *rω*. If *Y* ∉ *S*−, then with probability 1 − *λ* we replace *Y* with the point *Z* ∈ *γ* that is visible from *x* in the direction *ω* (Figure 8).

**Figure 8.** RWHC. Jump from dielectric with higher permittivity.

If *λ* > 1, Formula (13) defines a stochastic kernel, provided that any ray outgoing from the point *x* intersects *γ* in at most one point. The modeling procedure is similar to the algorithm for Formula (12).

Thus, Formulas (11)–(13) make it possible to simulate transitions from a region with a higher dielectric constant to a region with a lower one. To pass from a point *x* through the interface Γ*dl*, it is sufficient to take *r* ≤ dist(*x*, *D* ∪ Γ*<sup>d</sup>* \ Γ*dl*) such that *Sr*(*x*) ∩ Γ*dl* ≠ ∅. Reverse transitions can be realized using, for example, formulas for solving the external and internal Dirichlet problems for standard domains. The exit from a "bad" point *x* can be performed by Random Walk on Spheres or Hemispheres in the set *Q*(*x*) = {*y* | *ε*(*y*) = *ε*(*x*)}. As always, from distant points of the external medium there is a transition to the sphere *SR*.

#### *3.4. Algorithm for Mutual Capacitance Calculation*

On this basis, the algorithm for capacitance estimation can be described as follows:


#### **4. Results**

#### *4.1. Mutual Capacitance of Two Spheres in Free Space*

The mutual capacitance of two spheres can be calculated analytically [3]. When the spheres are not nested:

$$\begin{aligned} \mathbb{C}\_{1,1} &= 4\pi\varepsilon r\_1 r\_2 \sinh a \sum\_{n=1}^{\infty} \frac{1}{r\_2 \sinh(na) + r\_1 \sinh[(n-1)a]}; \\ \mathbb{C}\_{1,2} &= -4\pi\varepsilon \frac{r\_1 r\_2 \sinh a}{d} \sum\_{n=1}^{\infty} \frac{1}{\sinh(na)}; \\ \mathbb{C}\_{2,2} &= 4\pi\varepsilon r\_1 r\_2 \sinh a \sum\_{n=1}^{\infty} \frac{1}{r\_1 \sinh(na) + r\_2 \sinh[(n-1)a]}; \\ \cosh a &= \frac{d^2 - r\_1^2 - r\_2^2}{2r\_1 r\_2} \end{aligned}$$

where *r*<sup>1</sup> and *r*<sup>2</sup> are the radii and *d* is the distance between the sphere centers. For nested spheres (*r*<sup>2</sup> > *r*1):

$$\begin{aligned} \mathbb{C}\_{1,1} &= 4\pi \varepsilon r\_1 r\_2 \sinh a \sum\_{n=1}^{\infty} \frac{1}{r\_2 \sinh(na) - r\_1 \sinh[(n-1)a]}; \\ \mathbb{C}\_{1,2} &= -\mathbb{C}\_{1,1}; \\ \cosh a &= -\frac{d^2 - r\_1^2 - r\_2^2}{2r\_1 r\_2}. \end{aligned}$$

In Table 1, the results of mutual capacitance estimation for two non-nested spheres using Walk on Spheres are presented. Here and below, Δ is the error estimate, calculated as three times the square root of the ratio of the sample variance to the number of trajectories, *Time* is the "wall time" of the calculation, and *Memory* is the peak memory usage. Calculations were performed on one personal computer (PC) with an "AMD Ryzen 7 2700 Eight-Core Processor 3.20 GHz" central processing unit (CPU). Monte Carlo simulations were run in parallel by eight worker processes on one PC using the Message Passing Interface, and the reported memory usage is the peak for one worker process. FastCap2 and FFTCap are 32-bit single-threaded applications, so no parallel execution was used for them. It should also be noted that we have not used additional optimizations, so the calculation time could be improved in the Monte Carlo case, for example, by using a different pseudo-random number generator or by optimizing the distance calculations.
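The error estimate Δ used throughout ("three times the square root of the ratio of the sample variance to the number of trajectories") can be written as, for example (the choice of the unbiased sample variance here is an assumption):

```python
import math

def statistical_error(sample):
    """Delta = 3 * sqrt(sample variance / number of trajectories),
    i.e. three standard errors of the sample mean (a "3 sigma" bound)."""
    n = len(sample)
    mean = sum(sample) / n
    variance = sum((x - mean) ** 2 for x in sample) / (n - 1)  # unbiased estimate
    return 3.0 * math.sqrt(variance / n)
```

For a normal estimator this corresponds to a confidence level of about 99.7%.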


**Table 1.** Mutual capacitance estimation for two non-nested spheres, *Ci*,*j*/4*πε*0.


Usually, the order of the bias is the same as the order of *δ*, so, in this and the following examples, the error of the methods is equal to the statistical error. We will say that the results of the estimation are *matched* when the modulus of the difference between the Monte Carlo estimate and the reference solution is not more than the statistical error (|*ref* − *est*| ≤ Δ). As we can see in Table 1, the analytical solution and our estimate are matched, so the algorithm works correctly.

In Table 2, the results of mutual capacitance estimation for two *nested* spheres using Walk on Spheres (WoS) are presented. There is no formula for *C*<sup>22</sup> in the case of two nested spheres in [3], so we do not show the estimation results for this value.


**Table 2.** Mutual capacitance estimation for two nested spheres, *Ci*,*j*/4*πε*0.


The estimation results in Table 2 are within the statistical error, so the analytical solution and our estimate are matched.

#### *4.2. Capacitance of "Coated" Sphere*

In Tables 3 and 4, the results of capacitance estimation using RWHC are presented for a conductive sphere of radius *a* encased in a concentric spherical dielectric of radius *b* with relative permittivity *ε*. In this example, the external sphere radius is equal to the dielectric shell radius. The analytical solution for this case is $\frac{4\pi\varepsilon\varepsilon_0 ab}{\varepsilon a + b - a}$ [3]. We also compare the results with FastCap2 (FC2) [18] (the corresponding sphere discretization was made using the spheregen tool from [21] with refine depth 5).
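Assuming the closed form *C* = 4*πε*0*εab*/(*εa* + *b* − *a*) quoted above (which reduces to the bare-sphere value 4*πε*0*a* when *ε* = 1), the tabulated quantity *C*/4*πε*0 is simply:

```python
def coated_sphere_capacitance(a, b, eps):
    """Capacitance of a conducting sphere of radius a inside a concentric
    dielectric shell of outer radius b and relative permittivity eps,
    in units of 4*pi*eps0:  C / (4*pi*eps0) = eps*a*b / (eps*a + b - a)."""
    return eps * a * b / (eps * a + b - a)
```

In the limit *ε* → ∞ the value tends to *b*, the capacitance of a bare sphere of the shell radius, which is a useful sanity check against the high-permittivity rows of Table 3.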


**Table 3.** Capacitance estimation for coated sphere, *C*/4*πε*0.

Comparing the values in columns 2, 4, and 5 of Table 3, we conclude that the analytical solution and the RWHC estimate are matched in all cases. However, comparing columns 2 and 3, we can see that the estimation error for FC2 grows with *ε* (as stated in [7]). So we can state that RWHC works correctly with high permittivities too, but a larger number of simulations may be needed to obtain an estimate with the desired statistical error.


**Table 4.** Capacitance estimation for coated sphere. Time and memory usage.

Table 4 shows that RWHC takes more time than boundary-element-based methods such as FC2, but uses much less memory.

#### *4.3. Mutual Capacitance of Two Spheres in Spherical Dielectric*

In Tables 5 and 6, the results of mutual capacitance estimation using FC2 (the corresponding sphere discretization was made using the spheregen tool from [21] with refine depth 5) and RWHC for two conductive spheres in a spherical dielectric (Figure 9) are presented. In this case, the external sphere radius is set to the dielectric shell radius.


**Figure 9.** Two spheres in dielectrical shell.


**Table 5.** Mutual capacitance estimation for two spheres in dielectric shell, *Ci*,*j*/4*πε*0.

In this case we have no analytical solution for reference, so we compare our results with FC2. However, we also have no error estimate for the FC2 result, so we cannot guarantee that the difference will be within the statistical margin of error. Comparing the results in columns 2 and 3 of Table 5, we can say that the estimates are matched, but this is not true for columns 2 and 4. In the previous cases we ascertained that RWHC is matched with the analytical solution, and in this case the results in columns 3 and 4 are matched. So we can state that, in this case, the estimation error of FC2 is larger than that of RWHC with 10<sup>8</sup> trajectories.

**Table 6.** Mutual capacitance estimation for two spheres in dielectric shell. Time and memory usage.


In Table 6 we can see that RWHC is better in both time and memory. This is due to the fact that, for FC2, the spheres must be approximated by a large number of panels. Moreover, with the refinement depth we used, we have almost reached the memory limit of the original 32-bit fastcap application, so we cannot compare with these results at a finer discretization.

#### *4.4. Mutual Capacitance of Two Spheres in Spherical Dielectrics*

In Tables 7 and 8, the results of mutual capacitance estimation using FC2 (the corresponding sphere discretization was made using the spheregen tool from [21] with refine depth 5) and RWHC for two conductive spheres, each in its own spherical dielectric (Figure 10), are presented.


**Figure 10.** Two spheres in dielectrical shells.

**Table 7.** Mutual capacitance estimation for two spheres in dielectric shells, *Ci*,*j*/4*πε*0.


There are no reference analytical solutions in this case either. Using the results from Table 7, we have reached the same conclusion as in the previous case.

**Table 8.** Mutual capacitance estimation for two spheres in dielectric shells. Time and memory usage.


*4.5. Mutual Capacitance of Parallel "Pins"*

In Tables 9 and 10, the results of mutual capacitance estimation using FFTCap [17,18] (the corresponding discretization was made using the cubegen tool from [18] with 5 panels per side) and Walk on Hemispheres for 81 conductive "pins" placed at the points of a uniform lattice (Figure 11) are presented. The full capacitance matrix has dimensions 81 × 81, so we show only a few values in the table.


**Figure 11.** 9 × 9 conductive pins.

**Table 9.** Mutual capacitance of 9 × 9 conductive pins, *Ci*,*j*/4*πε*0.


In this case the objects have flat faces, so this task is "good" for FFTCap and we can take its solution as the reference. The results in Table 9 show that the estimates are matched. We can also see that, although it is matched within the statistical error, the WoH estimate for *i* = 1, *j* = 81 has a larger relative statistical error than in the other cases (about 6%). This is due to the fact that these conductors are located in opposite corners of the lattice, so only a small number of the trajectories started near the first conductor end on the 81st. In this case, to obtain an estimate with the desired statistical error, more simulations may be required. For example, with 10<sup>8</sup> trajectories we get the value −6.0515 · 10<sup>−3</sup> and a statistical error of 1.087 · 10<sup>−4</sup> (less than 2%).

We can also estimate the difference from the FFTCap results by a norm: let *A* be the FFTCap result matrix and *B* the WoH result matrix for 10<sup>7</sup> trajectories from each conductor; then ‖*A* − *B*‖*<sup>F</sup>*/‖*A*‖*<sup>F</sup>* ≈ 0.009, where ‖*A*‖*<sup>F</sup>* is the Frobenius norm of the matrix *A*.
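The relative Frobenius-norm difference used here can be computed as, for example (matrices represented as lists of rows):

```python
import math

def relative_frobenius_difference(A, B):
    """||A - B||_F / ||A||_F for two equal-size matrices (lists of rows)."""
    diff = math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(A, B) for a, b in zip(ra, rb)))
    norm = math.sqrt(sum(a * a for row in A for a in row))
    return diff / norm
```

This gives a single scalar summary of the agreement over the whole 81 × 81 capacitance matrix.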

**Table 10.** Mutual capacitance of 9 × 9 conductive pins. Time and memory usage.


#### *4.6. Mutual Capacitance of Rectangular Parallelepipeds in Dielectric Shells*

In Tables 11 and 12, the results of mutual capacitance estimation using FC2 (the corresponding discretization was made using the tool from [21] with 10 and 15 panels per side) and Walk on Hemispheres for three conductive rectangular parallelepipeds in parallelepipedal dielectrics (Figure 12) are presented.


**Figure 12.** Rectangular parallelepipeds in dielectric shells.

**Table 11.** Mutual capacitance of rectangular parallelepipeds in dielectric shells, *Ci*,*j*/4*πε*0.



As before, we can compare the RWHC results with the FC2 estimate. In Table 11, the results for FC2 and RWHC with 10<sup>8</sup> trajectories are not matched when *i* = 2, *j* = 1. Because the WoH results are matched for 10<sup>6</sup> and 10<sup>8</sup> trajectories, and the FC2 results with 15 panels per side are closer to our estimate than those with 10 panels per side, we may assume that this discrepancy is related to the FC2 estimation error, as in the example of Section 4.3.

**Table 12.** Mutual capacitance of rectangular parallelepipeds in dielectric shells. Time and memory usage.


#### *4.7. Mutual Capacitance of "Woven Bus"*

In Tables 13 and 14, the results of mutual capacitance estimation using FFTCap [18] (the corresponding discretization was made using the wovengen tool from [21] with 10 panels per side) and Walk on Hemispheres for a 9 × 9 woven bus [7] (Figure 13) are presented.


**Figure 13.** 9 × 9 woven bus.

**Table 13.** Mutual capacitance of "woven bus", *Ci*,*j*/4*πε*0.


The results in Table 13 are matched. The difference from the FFTCap results can also be evaluated by a norm: let *A* be the FFTCap result matrix and *B* the WoH result matrix for 10<sup>8</sup> trajectories from each conductor; then ‖*A* − *B*‖*<sup>F</sup>*/‖*A*‖*<sup>F</sup>* ≈ 0.001.

**Table 14.** Mutual capacitance of "woven bus". Time and memory usage.


#### **5. Conclusions**

We developed some new numerical algorithms for extracting capacitances. These algorithms do not use the approximation of the Laplace operator by its difference counterpart. Their computational error is determined by the sum of the statistical error and the estimator bias. The statistical error is determined in the course of the calculations. The systematic error of the estimator is equal to the error of approximating the potential at points lying near the boundary of the conductor by its values at the boundary. This error is controlled by the parameter *δ*.

The Random Walk on Spheres algorithm is universal in the case of a homogeneous dielectric. It works for conductors with any geometry.

The Random Walk on Hemispheres is applied when dielectric interfaces are polyhedral. In cases when surfaces of the conductors are also polyhedral, the algorithm gives unbiased statistical estimators of the capacitances. The accuracy of this algorithm is equal to the statistical error of the estimators, which is easily determined in the course of calculations.

The Modified Random Walk on Hemispheres algorithm works for convex dielectric interfaces.

Computational experiments show that the algorithms are effective. For systems where capacitances are calculated analytically [3], it is shown that the accuracy of the Monte Carlo approximation is within the statistical error (see Tables 1–3). In more complex examples, to prove that the Monte Carlo estimation results are correct, we have matched them with the results of the calculation of the capacitances using the non-Monte Carlo methods implemented in the FastCap2 and FFTCap programs [18] (see Tables 5, 7, 9, 11 and 13). The algorithm also works correctly in cases when the ratio of the permittivities is 100 or more (see Table 3).

Monte Carlo simulation times for the different cases were presented along with the numerical results, but they are not final and could be improved, even with the same PC configuration, for example, by using another implementation of the pseudo-random number generator or a distance-calculation function optimized for the particular task.

**Author Contributions:** Investigation, A.K., A.S. All authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research was funded by Russian Science Foundation grant 19-11-00020.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data sharing is not applicable to this article.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Resource Retrial Queue with Two Orbits and Negative Customers**

**Ekaterina Lisovskaya, Ekaterina Fedorova, Radmir Salimzyanov and Svetlana Moiseeva \***

Institute of Applied Mathematics and Computer Science, National Research Tomsk State University, 634050 Tomsk, Russia; ekaterina\_lisovs@mail.ru (E.L.); moiskate@mail.ru (E.F.); radmir.salimzyanov@stud.tsu.ru (R.S.)

**\*** Correspondence: smoiseeva@mail.ru; Tel.: +7-913-815-3262

**Abstract:** In this paper, a multi-server retrial queue with two orbits is considered. There are two arrival processes of positive customers (of two types) and one process of negative customers. Every positive customer requires some amount of resource, the total capacity of which is limited in the system. The service time does not depend on the customer's resource requirement and is exponentially distributed with a parameter depending on the customer's type. If there is not a sufficient amount of resource for the arriving customer, the customer goes to one of the two orbits, according to his type. The duration of the customer's delay in the orbit is exponentially distributed. A negative customer removes all the customers being served at the moment of his arrival and leaves the system. The objects of the study are the number of customers in each orbit and the number of customers of each type being served in the stationary regime. The method of asymptotic analysis under the long delay of customers in the orbits is applied for the study. Numerical analysis of the obtained results is performed to show the influence of the system parameters on its performance measures.

**Keywords:** retrial queue; negative customers; resource heterogeneous queue; asymptotic analysis

#### **1. Introduction**

The theory of queuing systems with repeated calls (retrial queues) is an important branch of modern teletraffic theory, whose relevance is due to wide practical applications, such as the performance evaluation and design of broadcast, radio, and cellular networks, as well as local networks with random multiple-access protocols. In the monographs [1–3], a detailed survey of recent applications of queuing models in telecommunications, modern computer networks, and information systems is presented.

The retry phenomenon is an integral feature of data transmission systems, and ignoring it in theoretical research can lead to significant errors in engineering decisions. Many multimedia and service applications on subscriber devices can automatically generate such requests, without any relative restrictions. Such unaccounted traffic consumes the channel resource in excess of the planned amount. Overflows begin to appear on sections of the network, which leads to service rejections and thus generates more repeated calls.

A large number of publications have been devoted to the study of retrial queues. The most extensive reviews of significant results, up to 2008, are presented in the monographs [4,5].

Queuing models with negative customers [6–8], or G-queues, are useful for the analysis of multiprocessor computer systems, neural networks, communication systems, and manufacturing [9,10]. In the simplest version, a negative customer (negative arrival) has the effect of deleting a positive (ordinary) customer or customers according to some strategy. For example:

• a negative arrival eliminates all the customers in the system or in its part, e.g., under the service or in the buffer (catastrophes);

**Citation:** Lisovskaya, E.; Fedorova, E.; Salimzyanov, R.; Moiseeva, S. Resource Retrial Queue with Two Orbits and Negative Customers. *Mathematics* **2022**, *10*, 321. https:// doi.org/10.3390/math10030321

Academic Editors: Alexander Zeifman, Victor Korolev and Alexander Sipin

Received: 25 November 2021 Accepted: 18 January 2022 Published: 20 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).


Negative arrivals are interpreted as viruses, orders, inhibitor signals, etc. A detailed overview of G-queues is presented in Reference [11]. Retrial queuing systems with negative arrivals have been considered, for example, in References [12–15]. The effects of negative customers considered in these papers are close to the effect of breakdowns. Retrial queues with breakdowns have been studied in References [16–19].

Other features of modern data transmission systems are random amounts of transmitted data and requests for additional resources. Often, in telecommunication systems, calls come from different sources and have different service time characteristics and different priorities, or they need more than one service device, etc. These features make the system analysis more complex. Identifying these aspects and analyzing their influence on the systems allows one to optimize networks for loss reduction. Such mathematical models are called queuing systems with a random volume of customers, or resource queuing systems [20–28]. Resource queuing systems are applied in modern wireless communication networks, cloud computing systems, technical devices, and next-generation data transmission networks. It is known that, in classical queuing theory, the evaluation of almost all performance characteristics leads to the analysis of the stochastic process of the number of customers in the system. However, this is insufficient if we would like to determine the buffer space capacity of a communication network's node which guarantees small losses of transmitted data [20,21,23]. Incoming customers can request some resources (for example, an amount of memory). The requests may be random or deterministic. In queuing models, the total amount of resource is usually limited by a constant value *R* > 0, which is called the buffer space capacity of the system. The buffer space is occupied by a customer at the arrival epoch and is entirely released at the service completion epoch. If the value *R* is finite, this leads to additional losses of customers. The complexity of the study of resource systems is due to the fact that a universal approach does not exist. We use asymptotic methods [15,27,29], which give asymptotic expressions for the studied system characteristics that are acceptable for practical usage. A retrial queuing system with limited processor sharing (close to resource systems) is considered in Reference [30].

In this paper, the model under study is a non-classical retrial queuing system with nonhomogeneous customers. The main feature of the research is the consideration of possible failures in the system. We apply the theory of G-queues [6] to model breakdowns in real networks. A breakdown is represented by negative customers arriving at the queuing system and removing all served customers, if any are present. So, we study a queuing system with all the features mentioned above: repeated calls, negative customers, and resources. Such a model can be applied, for example, to 5G New Radio systems. The key feature and the main problem of the 5G New Radio network is that people, cars, buildings, etc., are signal blockers, which causes service interruptions. For this reason, the scientific community faces the task of analyzing the performance of these systems and improving it in the future. Currently, studies of "basic" mathematical models of such networks are known. However, as this technology enters the daily lives of subscribers, we need to propose and analyze the most appropriate mathematical models.

The considered mathematical model is described in Section 2. In Section 3, the method of asymptotic analysis is proposed and applied to the study. The numerical analysis is presented in Section 4. It includes a comparison of the asymptotic and simulated distributions, numerical examples for various values of the model parameters, and the calculation of some performance characteristics, such as the probability of first-time rejection (in other words, the joining probability) and the buffer space utilization. Problems and discussion of the applicability of the obtained approximations are presented in the conclusion.

#### **2. Mathematical Model**

Let us consider the multi-server retrial queue (Figure 1) with two positive arrival processes and one negative arrival process. We call customers arriving in the *k*-th positive arrival process customers of the *k*-th type (*k* ∈ {1, 2}). All the processes are Poisson with parameters *λ*1, *λ*<sup>2</sup> (for the first and the second type of positive customers) and *α* (for the negative ones). Positive customers need service, and the service laws are exponential with parameters *μ*<sup>1</sup> and *μ*<sup>2</sup> for the first and the second type of customers, respectively. In addition, each positive customer requires a deterministic amount of resource (*x*<sup>1</sup> or *x*2, respectively); therefore, a customer occupies some amount of resource during his service. The service time does not depend on the customer's resource requirement. The service unit has a limited capacity of resource, which equals *R*. The number of servers *N* is limited but large enough. If a customer cannot be served at the arrival moment (there is not a sufficient amount of the resource), he goes to the corresponding orbit (a virtual place). The duration of the customer's delay in the orbit is exponentially distributed with parameter *σ*<sup>1</sup> or *σ*2, respectively. A negative customer deletes all customers being served in the service unit at the moment of his arrival.
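The dynamics just described can be summarized by listing, for each state (*n*1, *n*2, *i*1, *i*2), the outgoing transitions and their rates (a sketch; the parameter names in `params` are illustrative):

```python
def transitions(state, params):
    """Outgoing transitions (rate, next_state) of the Markov chain X(t).

    `params` holds lambda1, lambda2, mu1, mu2, sigma1, sigma2, alpha,
    x1, x2, R.  The resource constraint is x1*n1 + x2*n2 <= R."""
    n1, n2, i1, i2 = state
    p = params
    fits1 = p['x1'] * (n1 + 1) + p['x2'] * n2 <= p['R']
    fits2 = p['x1'] * n1 + p['x2'] * (n2 + 1) <= p['R']
    out = []
    # positive arrivals: served if the resource suffices, otherwise to orbit
    out.append((p['lambda1'], (n1 + 1, n2, i1, i2) if fits1 else (n1, n2, i1 + 1, i2)))
    out.append((p['lambda2'], (n1, n2 + 1, i1, i2) if fits2 else (n1, n2, i1, i2 + 1)))
    # service completions
    if n1 > 0:
        out.append((n1 * p['mu1'], (n1 - 1, n2, i1, i2)))
    if n2 > 0:
        out.append((n2 * p['mu2'], (n1, n2 - 1, i1, i2)))
    # retrials from the orbits succeed only if the resource suffices
    if i1 > 0 and fits1:
        out.append((i1 * p['sigma1'], (n1 + 1, n2, i1 - 1, i2)))
    if i2 > 0 and fits2:
        out.append((i2 * p['sigma2'], (n1, n2 + 1, i1, i2 - 1)))
    # a negative customer removes every customer currently in service
    if n1 + n2 > 0:
        out.append((p['alpha'], (0, 0, i1, i2)))
    return out
```

These are exactly the flows balanced by the stationary equations derived below.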

**Figure 1.** Resource retrial queue with two orbits and negative customers.

The goal of the paper is to study four-dimensional stochastic process

$$X(t) = (N\_1(t), N\_2(t), I\_1(t), I\_2(t)),$$

where *Nk*(*t*) is the number of *k*-type customers in the service unit at the moment *t*, and *Ik*(*t*) is the number of customers in the *k*-th orbit at the moment *t*. Then, the state space has the form:

$$\mathbb{X} = \{(n\_1, n\_2, i\_1, i\_2) : \mathbf{x}\_1 \mathbf{n}\_1 + \mathbf{x}\_2 \mathbf{n}\_2 \le \mathbf{R}, \quad i\_k \ge 0, \quad k = 1, 2\},$$

where *nk* is a value of the process *Nk*(*t*), and *ik* is a value of the process *Ik*(*t*), where *k* = 1, 2.
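For finite *R*, the server part of the state space is a finite set that is easy to enumerate (a sketch, assuming *x*1, *x*2 > 0; for numerical work the orbit counts *i*1, *i*2 would additionally be truncated):

```python
def server_states(x1, x2, R):
    """All pairs (n1, n2) with x1*n1 + x2*n2 <= R (the server part of X)."""
    states = []
    n1 = 0
    while x1 * n1 <= R:
        n2 = 0
        while x1 * n1 + x2 * n2 <= R:
            states.append((n1, n2))
            n2 += 1
        n1 += 1
    return states
```

For example, with *x*1 = *x*2 = 1 and *R* = 2 this yields the six pairs with *n*1 + *n*2 ≤ 2.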

Traditionally, continuous-time Markov chains are represented by a transition graph. In Figure 2, we depict such a graph for the considered Markov chain *X*(*t*). The ovals in the graph represent the states, and the arrows show the possible transitions and their intensities. In addition, next to each state, we show the condition under which it exists, keeping in mind that the central state (an arbitrary state) in the graph is (*n*1, *<sup>n</sup>*2, *<sup>i</sup>*1, *<sup>i</sup>*2) <sup>∈</sup> <sup>X</sup>.

Let us consider in more detail the possible events that cause a change in the state of the considered Markov chain:


**Figure 2.** Graph of the input/output transitions of central state.

So, we will find the stationary probabilities

$$p(n\_1, n\_2, i\_1, i\_2) = \Pr\{N\_1(t) = n\_1, N\_2(t) = n\_2, I\_1(t) = i\_1, I\_2(t) = i\_2\}.$$

Let us write the following system of equations for them:

$$\begin{aligned}
&p(n_1,n_2,i_1,i_2)\big[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+i_1\sigma_1 I((n_1+1)x_1+n_2x_2\le R,\;i_1>0)\\
&\qquad\qquad +i_2\sigma_2 I(n_1x_1+(n_2+1)x_2\le R,\;i_2>0)+\alpha I(n_1+n_2\ne 0)\big]=\\
&\lambda_1\,p(n_1-1,n_2,i_1,i_2)I(n_1>0)+\lambda_2\,p(n_1,n_2-1,i_1,i_2)I(n_2>0)\\
&+\lambda_1\,p(n_1,n_2,i_1-1,i_2)I((n_1+1)x_1+n_2x_2>R,\;i_1>0)\\
&+\lambda_2\,p(n_1,n_2,i_1,i_2-1)I(n_1x_1+(n_2+1)x_2>R,\;i_2>0)\\
&+(n_1+1)\mu_1\,p(n_1+1,n_2,i_1,i_2)I((n_1+1)x_1+n_2x_2\le R)\\
&+(n_2+1)\mu_2\,p(n_1,n_2+1,i_1,i_2)I(n_1x_1+(n_2+1)x_2\le R)\\
&+(i_1+1)\sigma_1\,p(n_1-1,n_2,i_1+1,i_2)I(n_1>0)\\
&+(i_2+1)\sigma_2\,p(n_1,n_2-1,i_1,i_2+1)I(n_2>0)\\
&+\alpha\sum_{(k_1,k_2,i_1,i_2)\in\mathbb{X}}p(k_1,k_2,i_1,i_2)I(k_1+k_2\ne 0)\,I(n_1+n_2=0),
\end{aligned}\tag{1}$$

where

$$I(A) = \begin{cases} 1, \text{if } A \text{ is true,} \\ 0, \text{if } A \text{ is false.} \end{cases}$$

From System (1), we write several equations for different states.

$$\text{(a)}\qquad \text{For } n\_1\mathbf{x}\_1 + n\_2\mathbf{x}\_2 = 0:$$

$$p(0,0,i_1,i_2)[\lambda_1+\lambda_2+i_1\sigma_1+i_2\sigma_2]=p(1,0,i_1,i_2)\mu_1+p(0,1,i_1,i_2)\mu_2+\alpha\sum_{(k_1,k_2,i_1,i_2)\in\mathbb{X}}p(k_1,k_2,i_1,i_2)I(k_1+k_2\ne 0).$$

$$\text{(b)}\qquad \text{For } [(n\_1+1)x\_1 + n\_2x\_2 \le R] \cap [n\_1x\_1 + (n\_2+1)x\_2 \le R]:$$

$$\begin{aligned}
p(n_1,n_2,i_1,i_2)[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+i_1\sigma_1+i_2\sigma_2+\alpha]&=\\
p(n_1-1,n_2,i_1,i_2)\lambda_1 I(n_1>0)+p(n_1,n_2-1,i_1,i_2)\lambda_2 I(n_2>0)&+\\
p(n_1+1,n_2,i_1,i_2)(n_1+1)\mu_1+p(n_1,n_2+1,i_1,i_2)(n_2+1)\mu_2&+\\
p(n_1-1,n_2,i_1+1,i_2)(i_1+1)\sigma_1 I(n_1>0)&+\\
p(n_1,n_2-1,i_1,i_2+1)(i_2+1)\sigma_2 I(n_2>0).&
\end{aligned}$$

$$\text{(c)}\qquad \text{For } [(n\_1+1)\mathbf{x}\_1 + n\_2\mathbf{x}\_2 > R] \cap [n\_1\mathbf{x}\_1 + (n\_2+1)\mathbf{x}\_2 > R]:$$

$$\begin{aligned}
p(n_1,n_2,i_1,i_2)[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]&=\\
p(n_1-1,n_2,i_1,i_2)\lambda_1 I(n_1>0)+p(n_1,n_2-1,i_1,i_2)\lambda_2 I(n_2>0)&+\\
p(n_1,n_2,i_1-1,i_2)\lambda_1 I(i_1>0)+p(n_1,n_2,i_1,i_2-1)\lambda_2 I(i_2>0)&+\\
p(n_1-1,n_2,i_1+1,i_2)(i_1+1)\sigma_1 I(n_1>0)&+\\
p(n_1,n_2-1,i_1,i_2+1)(i_2+1)\sigma_2 I(n_2>0).&
\end{aligned}$$

$$\text{(d)}\qquad \text{For } [(n\_1+1)\mathbf{x}\_1 + n\_2\mathbf{x}\_2 \le R] \cap [n\_1\mathbf{x}\_1 + (n\_2+1)\mathbf{x}\_2 > R]:$$

$$\begin{aligned} p(n_1,n_2,i_1,i_2)[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+i_1\sigma_1+\alpha] = \\ p(n_1-1,n_2,i_1,i_2)\lambda_1 I(n_1>0) + p(n_1,n_2-1,i_1,i_2)\lambda_2 I(n_2>0) + \\ p(n_1,n_2,i_1,i_2-1)\lambda_2 I(i_2>0) + p(n_1+1,n_2,i_1,i_2)(n_1+1)\mu_1 + \\ p(n_1-1,n_2,i_1+1,i_2)(i_1+1)\sigma_1 I(n_1>0) + \\ p(n_1,n_2-1,i_1,i_2+1)(i_2+1)\sigma_2 I(n_2>0). \end{aligned}$$

$$\text{(e)}\qquad \text{For } [(n\_1+1)\mathbf{x}\_1 + n\_2\mathbf{x}\_2 > R] \cap [n\_1\mathbf{x}\_1 + (n\_2+1)\mathbf{x}\_2 \le R]:$$

$$\begin{aligned} p(n_1,n_2,i_1,i_2)[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+i_2\sigma_2+\alpha] = \\ p(n_1-1,n_2,i_1,i_2)\lambda_1 I(n_1>0) + p(n_1,n_2-1,i_1,i_2)\lambda_2 I(n_2>0) + \\ p(n_1,n_2,i_1-1,i_2)\lambda_1 I(i_1>0) + p(n_1,n_2+1,i_1,i_2)(n_2+1)\mu_2 + \\ p(n_1-1,n_2,i_1+1,i_2)(i_1+1)\sigma_1 I(n_1>0) + \\ p(n_1,n_2-1,i_1,i_2+1)(i_2+1)\sigma_2 I(n_2>0). \end{aligned}$$

Because 0 ≤ *ik* < ∞, Equations (a)–(e) form a system of infinite dimension. Therefore, to solve the system of difference Equations (a)–(e), we use the method of characteristic transforms, which allows us to find solutions of complex equations of queueing theory in a simpler way. We introduce the following partial characteristic functions:

$$H(n_1, n_2, u_1, u_2) = \sum_{i_1=0}^{\infty} e^{ju_1 i_1} \sum_{i_2=0}^{\infty} e^{ju_2 i_2} p(n_1, n_2, i_1, i_2), \tag{2}$$

where $j = \sqrt{-1}$.

Note that

$$h(u_1,u_2) = \sum_{n_1=0}^{N}\sum_{n_2=0}^{N} H(n_1,n_2,u_1,u_2) = \mathrm{E}\left\{e^{ju_1 I_1 + ju_2 I_2}\right\}$$

is the characteristic function of the two-dimensional process (*I*1(*t*), *I*2(*t*)) of the number of customers in the orbits.
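Numerically, the partial characteristic functions (2) are easy to evaluate for a truncated state space. A sketch with numpy, using a hypothetical joint distribution `p[n1, n2, i1, i2]` (the array shape and values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.random((2, 2, 4, 4))
p /= p.sum()                      # normalize to a probability distribution

def H(u1, u2):
    """H(n1, n2, u1, u2) = sum_{i1,i2} e^{j u1 i1} e^{j u2 i2} p(n1, n2, i1, i2)."""
    i = np.arange(p.shape[2])
    phase = np.exp(1j * u1 * i)[:, None] * np.exp(1j * u2 * i)[None, :]
    return np.tensordot(p, phase, axes=([2, 3], [0, 1]))  # shape (n1, n2)

def h(u1, u2):
    """Characteristic function of (I1(t), I2(t)): the sum of H over n1, n2."""
    return H(u1, u2).sum()

print(abs(h(0.0, 0.0)))           # equals 1 at the origin
```

As expected for a characteristic function, *h*(0, 0) = 1.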

Using Notation (2), Equations (a)–(e) are rewritten as follows:

$$\text{(a)}\qquad \text{For } n_1\mathbf{x}_1 + n_2\mathbf{x}_2 = 0:$$

$$\begin{aligned} [\lambda_1+\lambda_2]H(0,0,u_1,u_2) - j\sigma_1\frac{\partial H(0,0,u_1,u_2)}{\partial u_1} - j\sigma_2\frac{\partial H(0,0,u_1,u_2)}{\partial u_2} = \\ \mu_1 H(1,0,u_1,u_2) + \mu_2 H(0,1,u_1,u_2) + \alpha \sum_{(k_1,k_2,i_1,i_2)\in X} H(k_1,k_2,u_1,u_2)I(k_1+k_2\neq 0). \end{aligned}$$

$$\text{(b)}\qquad \text{For } [(n\_1+1)\mathbf{x}\_1 + n\_2\mathbf{x}\_2 \le R] \cap [n\_1\mathbf{x}\_1 + (n\_2+1)\mathbf{x}\_2 \le R]:$$

$$\begin{aligned} [\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]H(n_1,n_2,u_1,u_2) - j\sigma_1\frac{\partial H(n_1,n_2,u_1,u_2)}{\partial u_1} - j\sigma_2\frac{\partial H(n_1,n_2,u_1,u_2)}{\partial u_2} = \\ \lambda_1 H(n_1-1,n_2,u_1,u_2) + \lambda_2 H(n_1,n_2-1,u_1,u_2) + (n_1+1)\mu_1 H(n_1+1,n_2,u_1,u_2) \\ + (n_2+1)\mu_2 H(n_1,n_2+1,u_1,u_2) - j\sigma_1 e^{-ju_1}\frac{\partial H(n_1-1,n_2,u_1,u_2)}{\partial u_1} - j\sigma_2 e^{-ju_2}\frac{\partial H(n_1,n_2-1,u_1,u_2)}{\partial u_2}. \end{aligned}$$

$$\begin{array}{cc} \text{(c)} & \text{For } [(n\_1+1)\mathbf{x}\_1 + n\_2\mathbf{x}\_2 > R] \cap [n\_1\mathbf{x}\_1 + (n\_2+1)\mathbf{x}\_2 > R] : \end{array}$$

$$\begin{aligned} [\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]H(n_1,n_2,u_1,u_2) = \\ \lambda_1 H(n_1-1,n_2,u_1,u_2) + \lambda_2 H(n_1,n_2-1,u_1,u_2) + \lambda_1 e^{ju_1}H(n_1,n_2,u_1,u_2) + \lambda_2 e^{ju_2}H(n_1,n_2,u_1,u_2) \\ - j\sigma_1 e^{-ju_1}\frac{\partial H(n_1-1,n_2,u_1,u_2)}{\partial u_1} - j\sigma_2 e^{-ju_2}\frac{\partial H(n_1,n_2-1,u_1,u_2)}{\partial u_2}. \end{aligned}$$

$$\text{(d)}\qquad \text{For } [(n_1+1)\mathbf{x}_1 + n_2\mathbf{x}_2 \le R] \cap [n_1\mathbf{x}_1 + (n_2+1)\mathbf{x}_2 > R]:$$

$$\begin{aligned} [\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]H(n_1,n_2,u_1,u_2) - j\sigma_1\frac{\partial H(n_1,n_2,u_1,u_2)}{\partial u_1} = \\ \lambda_1 H(n_1-1,n_2,u_1,u_2) + \lambda_2 H(n_1,n_2-1,u_1,u_2) + \lambda_2 e^{ju_2}H(n_1,n_2,u_1,u_2) \\ + (n_1+1)\mu_1 H(n_1+1,n_2,u_1,u_2) - j\sigma_1 e^{-ju_1}\frac{\partial H(n_1-1,n_2,u_1,u_2)}{\partial u_1} - j\sigma_2 e^{-ju_2}\frac{\partial H(n_1,n_2-1,u_1,u_2)}{\partial u_2}. \end{aligned}$$

$$\text{(e)}\qquad \text{For } [(n\_1+1)\mathbf{x}\_1 + n\_2\mathbf{x}\_2 > R] \cap [n\_1\mathbf{x}\_1 + (n\_2+1)\mathbf{x}\_2 \le R]:$$

$$\begin{aligned} [\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]H(n_1,n_2,u_1,u_2) - j\sigma_2\frac{\partial H(n_1,n_2,u_1,u_2)}{\partial u_2} = \\ \lambda_1 H(n_1-1,n_2,u_1,u_2) + \lambda_2 H(n_1,n_2-1,u_1,u_2) + \lambda_1 e^{ju_1}H(n_1,n_2,u_1,u_2) \\ + (n_2+1)\mu_2 H(n_1,n_2+1,u_1,u_2) - j\sigma_1 e^{-ju_1}\frac{\partial H(n_1-1,n_2,u_1,u_2)}{\partial u_1} - j\sigma_2 e^{-ju_2}\frac{\partial H(n_1,n_2-1,u_1,u_2)}{\partial u_2}. \end{aligned}$$

Let us denote the matrix **H**(*u*1, *u*2) = {*H*(*n*1, *n*2, *u*1, *u*2)}*n*1,*n*<sup>2</sup> . So, the following equation can be written:

$$\begin{aligned} (\mathbf{A} + \lambda_1 e^{ju_1}\mathbf{B}_1 + \lambda_2 e^{ju_2}\mathbf{B}_2)\mathbf{H}(u_1,u_2) + \\ j\sigma_1(\mathbf{C}_1 - e^{-ju_1}\mathbf{D}_1)\frac{\partial \mathbf{H}(u_1,u_2)}{\partial u_1} + j\sigma_2(\mathbf{C}_2 - e^{-ju_2}\mathbf{D}_2)\frac{\partial \mathbf{H}(u_1,u_2)}{\partial u_2} = 0, \end{aligned} \tag{3}$$

where **A**, **B1**, **B2**, **C1**, **C2**, **D1**, **D2** are the following operators:

$$\mathbf{AH}(u_1,u_2) = \begin{cases} -[\lambda_1+\lambda_2]H(n_1,n_2,u_1,u_2) + \mu_1 H(n_1+1,n_2,u_1,u_2) + \mu_2 H(n_1,n_2+1,u_1,u_2) \\ \quad + \alpha \sum\limits_{(k_1,k_2,i_1,i_2)\in X} H(k_1,k_2,u_1,u_2)I(k_1+k_2\neq 0), & \text{(a)} \\ -[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]H(n_1,n_2,u_1,u_2) + \lambda_1 H(n_1-1,n_2,u_1,u_2) \\ \quad + \lambda_2 H(n_1,n_2-1,u_1,u_2) + (n_1+1)\mu_1 H(n_1+1,n_2,u_1,u_2) \\ \quad + (n_2+1)\mu_2 H(n_1,n_2+1,u_1,u_2), & \text{(b)} \\ -[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]H(n_1,n_2,u_1,u_2) + \lambda_1 H(n_1-1,n_2,u_1,u_2) \\ \quad + \lambda_2 H(n_1,n_2-1,u_1,u_2), & \text{(c)} \\ -[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]H(n_1,n_2,u_1,u_2) + \lambda_1 H(n_1-1,n_2,u_1,u_2) \\ \quad + \lambda_2 H(n_1,n_2-1,u_1,u_2) + (n_1+1)\mu_1 H(n_1+1,n_2,u_1,u_2), & \text{(d)} \\ -[\lambda_1+\lambda_2+n_1\mu_1+n_2\mu_2+\alpha]H(n_1,n_2,u_1,u_2) + \lambda_1 H(n_1-1,n_2,u_1,u_2) \\ \quad + \lambda_2 H(n_1,n_2-1,u_1,u_2) + (n_2+1)\mu_2 H(n_1,n_2+1,u_1,u_2). & \text{(e)} \end{cases}$$

$$\mathbf{B_1H}(u_1,u_2) = \begin{cases} 0, & \text{(a), (b), (d)} \\ H(n_1,n_2,u_1,u_2). & \text{(c), (e)} \end{cases}$$

$$\mathbf{B_2H}(u_1,u_2) = \begin{cases} 0, & \text{(a), (b), (e)} \\ H(n_1,n_2,u_1,u_2). & \text{(c), (d)} \end{cases}$$

$$\mathbf{C\_1H}(u\_1, u\_2) = \begin{cases} H(n\_1, n\_2, u\_1, u\_2), & \text{(a), (b), (d)}\\ 0. & \text{(c), (e)} \end{cases}$$

$$\mathbf{C\_2H}(u\_1, u\_2) = \begin{cases} H(n\_1, n\_2, u\_1, u\_2), & \text{(a), (b), (e)}\\ 0. & \text{(c), (d)} \end{cases}$$

$$\mathbf{D\_1H}(u\_1, u\_2) = \begin{cases} 0, & \text{(a)}\\ H(n\_1 - 1, n\_2, u\_1, u\_2). & \text{(b), (c), (d), (e)} \end{cases}$$

$$\mathbf{D\_2H}(u\_1, u\_2) = \begin{cases} 0, & \text{(a)}\\ H(n\_1, n\_2 - 1, u\_1, u\_2). & \text{(b), (c), (d), (e)} \end{cases}$$

The operator **E** denotes the summation of the equations over all values of *n*1, *n*2. Obviously,

$$\mathbf{E}(\mathbf{A} + \lambda\_1 \mathbf{B\_1} + \lambda\_2 \mathbf{B\_2}) = 0,$$

$$\mathbf{E}(\mathbf{C\_1} - \mathbf{D\_1}) = 0, \quad \mathbf{E}(\mathbf{C\_2} - \mathbf{D\_2}) = 0.$$

So, we have the following equation:

$$\begin{aligned} \mathbf{E}(\lambda_1(e^{ju_1}-1)\mathbf{B}_1 + \lambda_2(e^{ju_2}-1)\mathbf{B}_2)\mathbf{H}(u_1,u_2) + \\ j\sigma_1(1-e^{-ju_1})\mathbf{E}\mathbf{D}_1\frac{\partial \mathbf{H}(u_1,u_2)}{\partial u_1} + j\sigma_2(1-e^{-ju_2})\mathbf{E}\mathbf{D}_2\frac{\partial \mathbf{H}(u_1,u_2)}{\partial u_2} = 0. \end{aligned} \tag{4}$$

#### **3. Asymptotic Analysis Method**

Since it is not possible to find an explicit solution of the system of Equations (3) and (4), we propose the asymptotic analysis method [15,27,29]. In this paper, we use the asymptotic condition of long delay (*σ*<sup>1</sup> → 0 and *σ*<sup>2</sup> → 0). The practical meaning of the long delay condition is that the service time is much shorter than the time between repeated calls. The algorithm of the proposed method includes the following steps.

1. First-order asymptotics:
	- (a) introduce an infinitesimal parameter *ε* and the asymptotic function notation (5);
	- (b) rewrite Equations (3) and (4) in the asymptotic notation;
	- (c) derive a limit solution of the asymptotic equations as *ε* → 0;
	- (d) using the inverse substitutions, obtain the first-order asymptotic characteristic function (10), which gives the asymptotic mean of the considered process.
2. Second-order asymptotics:
	- (a) using the result of the first-order asymptotic analysis 1.(d), rewrite the characteristic function as (11);
	- (b) rewrite Equations (3) and (4) in this notation;
	- (c) introduce an infinitesimal parameter *ε*<sup>2</sup> and the new asymptotic function notation (13);
	- (d) rewrite the equations obtained in 2.(b) in the asymptotic notation;
	- (e) approximate the asymptotic functions by their second-degree Maclaurin series with respect to *ε* as in (14);
	- (f) derive a limit solution of the asymptotic equations as *ε* → 0;
	- (g) using the inverse substitutions, obtain the second-order asymptotic characteristic function, which gives the asymptotic variance of the considered process.

#### *3.1. First-Order Asymptotics*

First of all, we denote *σ<sup>k</sup>* = *γkσ*, where *σ* → 0 and *γ<sup>k</sup>* = *const*. In the first-order asymptotic analysis method, we use the following notations:

$$
\sigma = \varepsilon, \quad u\_k = \varepsilon w\_k, \quad \mathbf{H}(u\_1, u\_2) = \mathbf{F}(w\_1, w\_2, \varepsilon), \tag{5}
$$

where *ε* is infinitesimal, and **F**(*w*1, *w*2,*ε*) is an asymptotic function.

Substituting Notations (5) into Equations (3) and (4), we obtain the following asymptotic equations:

$$\begin{cases} (\mathbf{A} + \lambda_1 e^{j\varepsilon w_1}\mathbf{B}_1 + \lambda_2 e^{j\varepsilon w_2}\mathbf{B}_2)\mathbf{F}(w_1,w_2,\varepsilon) \\ \quad + j\gamma_1(\mathbf{C}_1 - e^{-j\varepsilon w_1}\mathbf{D}_1)\dfrac{\partial \mathbf{F}(w_1,w_2,\varepsilon)}{\partial w_1} + j\gamma_2(\mathbf{C}_2 - e^{-j\varepsilon w_2}\mathbf{D}_2)\dfrac{\partial \mathbf{F}(w_1,w_2,\varepsilon)}{\partial w_2} = 0, \\ \mathbf{E}(\lambda_1(e^{j\varepsilon w_1}-1)\mathbf{B}_1 + \lambda_2(e^{j\varepsilon w_2}-1)\mathbf{B}_2)\mathbf{F}(w_1,w_2,\varepsilon) \\ \quad + j\gamma_1(1-e^{-j\varepsilon w_1})\mathbf{E}\mathbf{D}_1\dfrac{\partial \mathbf{F}(w_1,w_2,\varepsilon)}{\partial w_1} + j\gamma_2(1-e^{-j\varepsilon w_2})\mathbf{E}\mathbf{D}_2\dfrac{\partial \mathbf{F}(w_1,w_2,\varepsilon)}{\partial w_2} = 0. \end{cases} \tag{6}$$

Let us take the limit lim*ε*→<sup>0</sup> **F**(*w*1, *w*2,*ε*) = **F**(*w*1, *w*2). Then, System (6) takes the form:

$$\begin{cases} & (\mathbf{A} + \lambda\_1 \mathbf{B}\_1 + \lambda\_2 \mathbf{B}\_2) \mathbf{F}(w\_1, w\_2) + j\gamma\_1 (\mathbf{C}\_1 - \mathbf{D}\_1) \frac{\partial \mathbf{F}(w\_1, w\_2)}{\partial w\_1} + \\ & j\gamma\_2 (\mathbf{C}\_2 - \mathbf{D}\_2) \frac{\partial \mathbf{F}(w\_1, w\_2)}{\partial w\_2} = 0, \\ & \mathbf{E}(\lambda\_1 w\_1 \mathbf{B}\_1 + \lambda\_2 w\_2 \mathbf{B}\_2) \mathbf{F}(w\_1, w\_2) + j\gamma\_1 w\_1 \mathbf{E} \mathbf{D}\_1 \frac{\partial \mathbf{F}(w\_1, w\_2)}{\partial w\_1} + \\ & j\gamma\_2 w\_2 \mathbf{E} \mathbf{D}\_2 \frac{\partial \mathbf{F}(w\_1, w\_2)}{\partial w\_2} = 0. \end{cases} \tag{7}$$

Obviously, the solution of Equation (7) has the form

$$\mathbf{F}(w_1, w_2) = \mathbf{R}\exp\{jw_1\kappa_1 + jw_2\kappa_2\},\tag{8}$$

where **R** = [*R*(*n*1, *n*2)] is the matrix of the stationary probabilities of the states of the process (*N*1(*t*), *N*2(*t*)), and *κ*1, *κ*<sup>2</sup> are the normalized means of the number of customers in the orbits, which are calculated from the following equations (obtained by substituting (8) into (7)):

$$\begin{cases} \left[ (\mathbf{A} + \lambda\_1 \mathbf{B}\_1 + \lambda\_2 \mathbf{B}\_2) - \kappa\_1 \gamma\_1 (\mathbf{C}\_1 - \mathbf{D}\_1) - \kappa\_2 \gamma\_2 (\mathbf{C}\_2 - \mathbf{D}\_2) \right] \mathbf{R} = 0, \\\ \mathbf{E} [\lambda\_1 \mathbf{B}\_1 - \kappa\_1 \gamma\_1 \mathbf{D}\_1] \mathbf{R} = 0, \\\ \mathbf{E} [\lambda\_2 \mathbf{B}\_2 - \kappa\_2 \gamma\_2 \mathbf{D}\_2] \mathbf{R} = 0, \\\ \mathbf{E} \mathbf{R} = 1. \end{cases} \tag{9}$$
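Numerically, the first equation of (9) determines **R**, for fixed *κ*1, *κ*2, as a null vector normalized by **ER** = 1. A hedged sketch of this numerical step, using a toy generator matrix `G` in place of the paper's operator (which is not reproduced here):

```python
import numpy as np

# Toy CTMC generator standing in for the operator in System (9); rows sum to zero.
G = np.array([[-1.0, 1.0, 0.0],
              [0.5, -1.5, 1.0],
              [0.0, 2.0, -2.0]])

# Stationary vector r satisfies r @ G = 0 with the normalization sum(r) = 1;
# stack the transposed generator with a row of ones and solve by least squares.
A = np.vstack([G.T, np.ones(3)])
b = np.concatenate([np.zeros(3), [1.0]])
r, *_ = np.linalg.lstsq(A, b, rcond=None)

print(r.sum())                     # 1.0 up to rounding; r @ G is (numerically) zero
```

In the full method, this null-space computation is iterated together with the two scalar equations of (9) that fix *κ*1 and *κ*2.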

After returning to Substitutions (5), we obtain the first-order approximation of the characteristic function

$$\mathbf{H}(u_1, u_2) = \mathbf{F}(w_1, w_2, \varepsilon) \approx \mathbf{F}(w_1, w_2) = \mathbf{R}\exp\left\{ju_1\frac{\kappa_1}{\sigma} + ju_2\frac{\kappa_2}{\sigma}\right\},\tag{10}$$

where *κ*1/*σ* and *κ*2/*σ* are the normalized means of the processes *I*1(*t*) and *I*2(*t*).

#### *3.2. Second-Order Asymptotics*

The first step of the second-order asymptotic analysis is rewriting the characteristic functions using the result of the first-order asymptotics (10) as follows:

$$\mathbf{H}(u_1,u_2) = \mathbf{H}^{(2)}(u_1,u_2)\cdot\exp\left\{j\frac{u_1}{\sigma_1}\gamma_1\kappa_1 + j\frac{u_2}{\sigma_2}\gamma_2\kappa_2\right\},\tag{11}$$

where **H**<sup>(2)</sup>(*u*1, *u*2) is the matrix of characteristic functions of the two-dimensional centered stochastic process (*I*1(*t*) − *κ*1/*σ*, *I*2(*t*) − *κ*2/*σ*).

Substituting (11) into Equations (3) and (4), we have

$$\begin{cases} (\mathbf{A} + \lambda_1 e^{ju_1}\mathbf{B}_1 + \lambda_2 e^{ju_2}\mathbf{B}_2)\mathbf{H}^{(2)}(u_1,u_2) \\ \quad + j\sigma_1(\mathbf{C}_1 - e^{-ju_1}\mathbf{D}_1)\dfrac{\partial \mathbf{H}^{(2)}(u_1,u_2)}{\partial u_1} - \gamma_1\kappa_1(\mathbf{C}_1 - e^{-ju_1}\mathbf{D}_1)\mathbf{H}^{(2)}(u_1,u_2) \\ \quad + j\sigma_2(\mathbf{C}_2 - e^{-ju_2}\mathbf{D}_2)\dfrac{\partial \mathbf{H}^{(2)}(u_1,u_2)}{\partial u_2} - \gamma_2\kappa_2(\mathbf{C}_2 - e^{-ju_2}\mathbf{D}_2)\mathbf{H}^{(2)}(u_1,u_2) = 0, \\ \mathbf{E}(\lambda_1(e^{ju_1}-1)\mathbf{B}_1 + \lambda_2(e^{ju_2}-1)\mathbf{B}_2)\mathbf{H}^{(2)}(u_1,u_2) \\ \quad + j\sigma_1(1-e^{-ju_1})\mathbf{E}\mathbf{D}_1\dfrac{\partial \mathbf{H}^{(2)}(u_1,u_2)}{\partial u_1} - \gamma_1\kappa_1(1-e^{-ju_1})\mathbf{E}\mathbf{D}_1\mathbf{H}^{(2)}(u_1,u_2) \\ \quad + j\sigma_2(1-e^{-ju_2})\mathbf{E}\mathbf{D}_2\dfrac{\partial \mathbf{H}^{(2)}(u_1,u_2)}{\partial u_2} - \gamma_2\kappa_2(1-e^{-ju_2})\mathbf{E}\mathbf{D}_2\mathbf{H}^{(2)}(u_1,u_2) = 0. \end{cases} \tag{12}$$

The next step of the analysis is introducing the following notations (as in Section 3.1):

$$
\sigma\_k = \sigma \gamma\_k, \quad \sigma = \varepsilon^2, \quad u\_k = \varepsilon \cdot w\_k, \quad \mathbf{H}^{(2)}(u\_1, u\_2) = \mathbf{F}^{(2)}(w\_1, w\_2, \varepsilon). \tag{13}
$$

Substituting (13) into Equation (12), we obtain the following system of asymptotic equations:

$$\begin{cases} (\mathbf{A} + \lambda_1 e^{j\varepsilon w_1}\mathbf{B}_1 + \lambda_2 e^{j\varepsilon w_2}\mathbf{B}_2)\mathbf{F}^{(2)}(w_1,w_2,\varepsilon) \\ \quad + j\varepsilon\gamma_1(\mathbf{C}_1 - e^{-j\varepsilon w_1}\mathbf{D}_1)\dfrac{\partial \mathbf{F}^{(2)}(w_1,w_2,\varepsilon)}{\partial w_1} - \gamma_1\kappa_1(\mathbf{C}_1 - e^{-j\varepsilon w_1}\mathbf{D}_1)\mathbf{F}^{(2)}(w_1,w_2,\varepsilon) \\ \quad + j\varepsilon\gamma_2(\mathbf{C}_2 - e^{-j\varepsilon w_2}\mathbf{D}_2)\dfrac{\partial \mathbf{F}^{(2)}(w_1,w_2,\varepsilon)}{\partial w_2} - \gamma_2\kappa_2(\mathbf{C}_2 - e^{-j\varepsilon w_2}\mathbf{D}_2)\mathbf{F}^{(2)}(w_1,w_2,\varepsilon) = 0, \\ \mathbf{E}(\lambda_1(e^{j\varepsilon w_1}-1)\mathbf{B}_1 + \lambda_2(e^{j\varepsilon w_2}-1)\mathbf{B}_2)\mathbf{F}^{(2)}(w_1,w_2,\varepsilon) \\ \quad + j\varepsilon\gamma_1(1-e^{-j\varepsilon w_1})\mathbf{E}\mathbf{D}_1\dfrac{\partial \mathbf{F}^{(2)}(w_1,w_2,\varepsilon)}{\partial w_1} - \gamma_1\kappa_1(1-e^{-j\varepsilon w_1})\mathbf{E}\mathbf{D}_1\mathbf{F}^{(2)}(w_1,w_2,\varepsilon) \\ \quad + j\varepsilon\gamma_2(1-e^{-j\varepsilon w_2})\mathbf{E}\mathbf{D}_2\dfrac{\partial \mathbf{F}^{(2)}(w_1,w_2,\varepsilon)}{\partial w_2} - \gamma_2\kappa_2(1-e^{-j\varepsilon w_2})\mathbf{E}\mathbf{D}_2\mathbf{F}^{(2)}(w_1,w_2,\varepsilon) = 0. \end{cases} \tag{14}$$

The solution of System (14) will be found in the following multiplicative form:

$$\mathbf{F}^{(2)}(w\_1, w\_2, \varepsilon) = \Phi(w\_1, w\_2) \cdot (\mathbf{R} + j\varepsilon w\_1 \mathbf{f\_1} + j\varepsilon w\_2 \mathbf{f\_2}) + O(\varepsilon^2),\tag{15}$$

where Φ(*w*1, *w*2) is an unknown scalar function, and **f1**, **f2** are unknown operators to be determined below.

Substituting solution (15) into System (14) and using Maclaurin series, we have

$$\begin{cases} (\mathbf{A} + \lambda_1(1+j\varepsilon w_1)\mathbf{B}_1 + \lambda_2(1+j\varepsilon w_2)\mathbf{B}_2)\Phi(w_1,w_2)(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) \\ \quad + j\varepsilon\gamma_1(\mathbf{C}_1 - (1-j\varepsilon w_1)\mathbf{D}_1)\left[\dfrac{\partial\Phi(w_1,w_2)}{\partial w_1}(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) + j\varepsilon\,\Phi(w_1,w_2)\mathbf{f}_1\right] \\ \quad - \gamma_1\kappa_1(\mathbf{C}_1 - (1-j\varepsilon w_1)\mathbf{D}_1)\Phi(w_1,w_2)(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) \\ \quad + j\varepsilon\gamma_2(\mathbf{C}_2 - (1-j\varepsilon w_2)\mathbf{D}_2)\left[\dfrac{\partial\Phi(w_1,w_2)}{\partial w_2}(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) + j\varepsilon\,\Phi(w_1,w_2)\mathbf{f}_2\right] \\ \quad - \gamma_2\kappa_2(\mathbf{C}_2 - (1-j\varepsilon w_2)\mathbf{D}_2)\Phi(w_1,w_2)(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) = 0, \\ \mathbf{E}\left[\lambda_1\left(j\varepsilon w_1 + \dfrac{(j\varepsilon w_1)^2}{2}\right)\mathbf{B}_1 + \lambda_2\left(j\varepsilon w_2 + \dfrac{(j\varepsilon w_2)^2}{2}\right)\mathbf{B}_2\right]\Phi(w_1,w_2)(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) \\ \quad + j\varepsilon\gamma_1\left(j\varepsilon w_1 - \dfrac{(j\varepsilon w_1)^2}{2}\right)\mathbf{E}\mathbf{D}_1\left[\dfrac{\partial\Phi(w_1,w_2)}{\partial w_1}(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) + j\varepsilon\,\Phi(w_1,w_2)\mathbf{f}_1\right] \\ \quad - \gamma_1\kappa_1\left(j\varepsilon w_1 - \dfrac{(j\varepsilon w_1)^2}{2}\right)\mathbf{E}\mathbf{D}_1\Phi(w_1,w_2)(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) \\ \quad + j\varepsilon\gamma_2\left(j\varepsilon w_2 - \dfrac{(j\varepsilon w_2)^2}{2}\right)\mathbf{E}\mathbf{D}_2\left[\dfrac{\partial\Phi(w_1,w_2)}{\partial w_2}(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) + j\varepsilon\,\Phi(w_1,w_2)\mathbf{f}_2\right] \\ \quad - \gamma_2\kappa_2\left(j\varepsilon w_2 - \dfrac{(j\varepsilon w_2)^2}{2}\right)\mathbf{E}\mathbf{D}_2\Phi(w_1,w_2)(\mathbf{R} + j\varepsilon w_1\mathbf{f}_1 + j\varepsilon w_2\mathbf{f}_2) = 0. \end{cases}$$

After some transformations and taking the limit as *ε* → 0, we obtain:

$$\begin{cases} (\lambda_1 w_1\mathbf{B}_1 + \lambda_2 w_2\mathbf{B}_2)\mathbf{R} + (\mathbf{A} + \lambda_1\mathbf{B}_1 + \lambda_2\mathbf{B}_2)(w_1\mathbf{f}_1 + w_2\mathbf{f}_2) \\ \quad + \gamma_1\dfrac{\partial\Phi(w_1,w_2)/\partial w_1}{\Phi(w_1,w_2)}(\mathbf{C}_1 - \mathbf{D}_1)\mathbf{R} - \gamma_1\kappa_1\big(w_1\mathbf{D}_1\mathbf{R} + (\mathbf{C}_1-\mathbf{D}_1)(w_1\mathbf{f}_1 + w_2\mathbf{f}_2)\big) \\ \quad + \gamma_2\dfrac{\partial\Phi(w_1,w_2)/\partial w_2}{\Phi(w_1,w_2)}(\mathbf{C}_2 - \mathbf{D}_2)\mathbf{R} - \gamma_2\kappa_2\big(w_2\mathbf{D}_2\mathbf{R} + (\mathbf{C}_2-\mathbf{D}_2)(w_1\mathbf{f}_1 + w_2\mathbf{f}_2)\big) = 0, \\ \mathbf{E}\left[\dfrac{w_1^2}{2}\lambda_1\mathbf{B}_1\mathbf{R} + \dfrac{w_2^2}{2}\lambda_2\mathbf{B}_2\mathbf{R} + w_1^2\lambda_1\mathbf{B}_1\mathbf{f}_1 + w_1w_2\lambda_2\mathbf{B}_2\mathbf{f}_1 + w_1w_2\lambda_1\mathbf{B}_1\mathbf{f}_2 + w_2^2\lambda_2\mathbf{B}_2\mathbf{f}_2\right] \\ \quad + \gamma_1 w_1\dfrac{\partial\Phi(w_1,w_2)/\partial w_1}{\Phi(w_1,w_2)}\mathbf{E}\mathbf{D}_1\mathbf{R} - \gamma_1\kappa_1\mathbf{E}\mathbf{D}_1\left(-\dfrac{w_1^2}{2}\mathbf{R} + w_1^2\mathbf{f}_1 + w_1w_2\mathbf{f}_2\right) \\ \quad + \gamma_2 w_2\dfrac{\partial\Phi(w_1,w_2)/\partial w_2}{\Phi(w_1,w_2)}\mathbf{E}\mathbf{D}_2\mathbf{R} - \gamma_2\kappa_2\mathbf{E}\mathbf{D}_2\left(-\dfrac{w_2^2}{2}\mathbf{R} + w_1w_2\mathbf{f}_1 + w_2^2\mathbf{f}_2\right) = 0. \end{cases} \tag{16}$$

The solution of System (16) has the form:

$$\Phi(w\_1, w\_2) = \exp\left(\frac{(jw\_1)^2}{2}K\_{11} + jw\_1 jw\_2 K\_{12} + \frac{(jw\_2)^2}{2}K\_{22}\right),\tag{17}$$

where *K*<sup>11</sup> and *K*<sup>22</sup> are the normalized variances of the stochastic process (*I*1(*t*), *I*2(*t*)), and *K*<sup>12</sup> is their normalized covariance. To find *K*11, *K*12, and *K*22, we substitute (17) into System (16):

$$\begin{cases} (\lambda_1 w_1\mathbf{B}_1 + \lambda_2 w_2\mathbf{B}_2)\mathbf{R} + (\mathbf{A} + \lambda_1\mathbf{B}_1 + \lambda_2\mathbf{B}_2)(w_1\mathbf{f}_1 + w_2\mathbf{f}_2) \\ \quad - \gamma_1(w_1K_{11} + w_2K_{12})(\mathbf{C}_1-\mathbf{D}_1)\mathbf{R} - \gamma_1\kappa_1\big(w_1\mathbf{D}_1\mathbf{R} + (\mathbf{C}_1-\mathbf{D}_1)(w_1\mathbf{f}_1+w_2\mathbf{f}_2)\big) \\ \quad - \gamma_2(w_1K_{12} + w_2K_{22})(\mathbf{C}_2-\mathbf{D}_2)\mathbf{R} - \gamma_2\kappa_2\big(w_2\mathbf{D}_2\mathbf{R} + (\mathbf{C}_2-\mathbf{D}_2)(w_1\mathbf{f}_1+w_2\mathbf{f}_2)\big) = 0, \\ \mathbf{E}\left[\dfrac{w_1^2}{2}\lambda_1\mathbf{B}_1\mathbf{R} + \dfrac{w_2^2}{2}\lambda_2\mathbf{B}_2\mathbf{R} + w_1^2\lambda_1\mathbf{B}_1\mathbf{f}_1 + w_1w_2\lambda_2\mathbf{B}_2\mathbf{f}_1 + w_1w_2\lambda_1\mathbf{B}_1\mathbf{f}_2 + w_2^2\lambda_2\mathbf{B}_2\mathbf{f}_2\right] \\ \quad - \gamma_1 w_1(w_1K_{11}+w_2K_{12})\mathbf{E}\mathbf{D}_1\mathbf{R} - \gamma_1\kappa_1\mathbf{E}\mathbf{D}_1\left(-\dfrac{w_1^2}{2}\mathbf{R} + w_1^2\mathbf{f}_1 + w_1w_2\mathbf{f}_2\right) \\ \quad - \gamma_2 w_2(w_1K_{12}+w_2K_{22})\mathbf{E}\mathbf{D}_2\mathbf{R} - \gamma_2\kappa_2\mathbf{E}\mathbf{D}_2\left(-\dfrac{w_2^2}{2}\mathbf{R} + w_1w_2\mathbf{f}_1 + w_2^2\mathbf{f}_2\right) = 0. \end{cases} \tag{18}$$

To solve (18), let us collect the coefficients of the different powers of *w*1, *w*2:

$$\begin{cases} (\lambda_1\mathbf{B}_1 - \gamma_1\kappa_1\mathbf{D}_1 - \gamma_1K_{11}(\mathbf{C}_1-\mathbf{D}_1) - \gamma_2K_{12}(\mathbf{C}_2-\mathbf{D}_2))\mathbf{R} \\ \quad + (\mathbf{A} + \lambda_1\mathbf{B}_1 + \lambda_2\mathbf{B}_2 - \gamma_1\kappa_1(\mathbf{C}_1-\mathbf{D}_1) - \gamma_2\kappa_2(\mathbf{C}_2-\mathbf{D}_2))\mathbf{f}_1 = 0, \\ (\lambda_2\mathbf{B}_2 - \gamma_2\kappa_2\mathbf{D}_2 - \gamma_1K_{12}(\mathbf{C}_1-\mathbf{D}_1) - \gamma_2K_{22}(\mathbf{C}_2-\mathbf{D}_2))\mathbf{R} \\ \quad + (\mathbf{A} + \lambda_1\mathbf{B}_1 + \lambda_2\mathbf{B}_2 - \gamma_1\kappa_1(\mathbf{C}_1-\mathbf{D}_1) - \gamma_2\kappa_2(\mathbf{C}_2-\mathbf{D}_2))\mathbf{f}_2 = 0, \\ \mathbf{E}[(\lambda_1\mathbf{B}_1 + \gamma_1(\kappa_1 - 2K_{11})\mathbf{D}_1)\mathbf{R} + 2(\lambda_1\mathbf{B}_1 - \gamma_1\kappa_1\mathbf{D}_1)\mathbf{f}_1] = 0, \\ \mathbf{E}[(\lambda_2\mathbf{B}_2 + \gamma_2(\kappa_2 - 2K_{22})\mathbf{D}_2)\mathbf{R} + 2(\lambda_2\mathbf{B}_2 - \gamma_2\kappa_2\mathbf{D}_2)\mathbf{f}_2] = 0, \\ \mathbf{E}[-K_{12}(\gamma_1\mathbf{D}_1 + \gamma_2\mathbf{D}_2)\mathbf{R} + (\lambda_2\mathbf{B}_2 - \gamma_2\kappa_2\mathbf{D}_2)\mathbf{f}_1 + (\lambda_1\mathbf{B}_1 - \gamma_1\kappa_1\mathbf{D}_1)\mathbf{f}_2] = 0. \end{cases} \tag{19}$$

The first and second equations in (19) form an inhomogeneous system of linear algebraic equations with respect to **f1** and **f2**. The determinant of the matrix of the system is equal to zero, while the rank of the extended matrix is equal to the rank of the matrix of coefficients; i.e., the system has infinitely many solutions. Comparing the first and second equations of System (19) with the first equation of (9) (in the first-order asymptotics), we can write:

$$\mathbf{f_1} = C\mathbf{R} + \mathbf{g_1}, \quad \mathbf{f_2} = C\mathbf{R} + \mathbf{g_2},$$

where

$$\mathbf{E}\mathbf{g_1} = 0, \quad \mathbf{E}\mathbf{g_2} = 0.$$

So, we have the following system for finding **g1** and **g2**:

$$\begin{cases} (\lambda_1\mathbf{B}_1 - \gamma_1\kappa_1\mathbf{D}_1 - \gamma_1K_{11}(\mathbf{C}_1-\mathbf{D}_1) - \gamma_2K_{12}(\mathbf{C}_2-\mathbf{D}_2))\mathbf{R} \\ \quad + (\mathbf{A} + \lambda_1\mathbf{B}_1 + \lambda_2\mathbf{B}_2 - \gamma_1\kappa_1(\mathbf{C}_1-\mathbf{D}_1) - \gamma_2\kappa_2(\mathbf{C}_2-\mathbf{D}_2))\mathbf{g}_1 = 0, \\ (\lambda_2\mathbf{B}_2 - \gamma_2\kappa_2\mathbf{D}_2 - \gamma_1K_{12}(\mathbf{C}_1-\mathbf{D}_1) - \gamma_2K_{22}(\mathbf{C}_2-\mathbf{D}_2))\mathbf{R} \\ \quad + (\mathbf{A} + \lambda_1\mathbf{B}_1 + \lambda_2\mathbf{B}_2 - \gamma_1\kappa_1(\mathbf{C}_1-\mathbf{D}_1) - \gamma_2\kappa_2(\mathbf{C}_2-\mathbf{D}_2))\mathbf{g}_2 = 0, \\ \mathbf{E}[(\lambda_1\mathbf{B}_1 + \gamma_1(\kappa_1 - 2K_{11})\mathbf{D}_1)\mathbf{R} + 2(\lambda_1\mathbf{B}_1 - \gamma_1\kappa_1\mathbf{D}_1)\mathbf{g}_1] = 0, \\ \mathbf{E}[(\lambda_2\mathbf{B}_2 + \gamma_2(\kappa_2 - 2K_{22})\mathbf{D}_2)\mathbf{R} + 2(\lambda_2\mathbf{B}_2 - \gamma_2\kappa_2\mathbf{D}_2)\mathbf{g}_2] = 0, \\ \mathbf{E}[-K_{12}(\gamma_1\mathbf{D}_1 + \gamma_2\mathbf{D}_2)\mathbf{R} + (\lambda_2\mathbf{B}_2 - \gamma_2\kappa_2\mathbf{D}_2)\mathbf{g}_1 + (\lambda_1\mathbf{B}_1 - \gamma_1\kappa_1\mathbf{D}_1)\mathbf{g}_2] = 0. \end{cases} \tag{20}$$

After returning to Substitutions (13), we obtain that the second-order approximation of the characteristic function has the following form:

$$\mathbf{H}(u_1,u_2) = \mathbf{H}^{(2)}(u_1,u_2)\cdot\exp\left\{ju_1\frac{\kappa_1}{\sigma} + ju_2\frac{\kappa_2}{\sigma}\right\} \approx \mathbf{F}^{(2)}\left(\frac{u_1}{\varepsilon},\frac{u_2}{\varepsilon}\right)\cdot\exp\left\{ju_1\frac{\kappa_1}{\sigma} + ju_2\frac{\kappa_2}{\sigma}\right\}.$$

So, the asymptotic characteristic function of the considered multi-dimensional process has the following matrix form:

$$\mathbf{H}(u\_1, u\_2) = \mathbf{R} \cdot \exp\left\{ j u\_1 \frac{\kappa\_1}{\sigma} + j u\_2 \frac{\kappa\_2}{\sigma} + \frac{(j u\_1)^2}{2} \frac{K\_{11}}{\sigma} + j u\_1 j u\_2 \frac{K\_{12}}{\sigma} + \frac{(j u\_2)^2}{2} \frac{K\_{22}}{\sigma} \right\}. \tag{21}$$

Therefore, the two-dimensional process (*I*1(*t*), *I*2(*t*)) of the number of customers in the orbits is asymptotically Gaussian with mean vector (*κ*1/*σ*, *κ*2/*σ*) and covariance matrix

$$\mathbf{cov} = \left( \begin{array}{cc} \frac{K\_{11}}{\sigma} & \frac{K\_{12}}{\sigma} \\ \frac{K\_{12}}{\sigma} & \frac{K\_{22}}{\sigma} \end{array} \right).$$
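The Gaussian approximation implied by (21) can be sketched in code; the values of *κ*1, *κ*2, *K*11, *K*12, *K*22 below are hypothetical placeholders for solutions of Systems (9) and (20):

```python
import numpy as np

kappa = np.array([0.5, 0.2])           # kappa1, kappa2 (assumed values)
K = np.array([[0.6, 0.1],
              [0.1, 0.3]])             # K11, K12; K12, K22 (assumed values)
sigma = 0.01

mean = kappa / sigma                   # (kappa1/sigma, kappa2/sigma)
cov = K / sigma                        # covariance matrix of Section 3.2

def gaussian_pdf(x):
    """Two-dimensional normal density with the asymptotic mean and covariance."""
    d = x - mean
    inv = np.linalg.inv(cov)
    norm = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * d @ inv @ d)

print(gaussian_pdf(mean) > gaussian_pdf(mean + 5))  # the density peaks at the mean
```

In practice, this density is discretized over the integer grid of orbit sizes before being compared with the simulated distribution.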

#### **4. Numerical Examples**

In this section, we present the comparison of asymptotic and simulated distributions for some values of the model parameters. In addition, some performance characteristics of the model are evaluated.

For the presentation of the numerical examples, we assume the following values of the retrial queue parameters:

$$\lambda\_1 = 1, \mu\_1 = 1, \mathbf{x}\_1 = 1,$$

$$\lambda\_2 = 0.3, \mu\_2 = 1, \mathbf{x}\_2 = 3,$$

$$\alpha = 0.1, R = 5, \sigma_1 = 2 \cdot \sigma, \sigma_2 = 1 \cdot \sigma, \sigma = 0.01.$$

We have developed software applications for simulating the considered model and for computing the asymptotic results.

We use discrete-event simulation. The state of the system is changed by events, such as:


The condition for stopping the simulation is the completion of service of at least 10<sup>7</sup> customers of each arrival process.
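A minimal Gillespie-style sketch of such a discrete-event simulation, using the parameter values of this section (negative arrivals are omitted for brevity, so this is a simplification of the full model, and all function names are our own):

```python
import random

lam = (1.0, 0.3); mu = (1.0, 1.0); x = (1, 3); R = 5
sigma = 0.01; sig = (2 * sigma, 1 * sigma)   # retrial rates sigma_k = gamma_k * sigma

def fits(n, k):
    """Can a class-k customer start service without exceeding the resource R?"""
    need = list(n); need[k] += 1
    return need[0] * x[0] + need[1] * x[1] <= R

def simulate(steps, seed=1):
    n = [0, 0]                               # customers in service, by class
    i = [0, 0]                               # customers in orbits, by class
    random.seed(seed)
    for _ in range(steps):
        # Build the list of competing exponential clocks.
        rates = []
        for k in (0, 1):
            rates.append(('arr', k, lam[k]))          # new arrival
            rates.append(('dep', k, n[k] * mu[k]))    # service completion
            rates.append(('ret', k, i[k] * sig[k]))   # retrial from orbit
        total = sum(r for _, _, r in rates)
        u = random.random() * total
        for ev, k, r in rates:
            if u < r:
                break
            u -= r
        if ev == 'arr':
            (n if fits(n, k) else i)[k] += 1          # serve if it fits, else orbit
        elif ev == 'dep':
            n[k] -= 1
        elif ev == 'ret' and fits(n, k):
            n[k] += 1; i[k] -= 1
        assert n[0] * x[0] + n[1] * x[1] <= R         # resource constraint holds
    return n, i

print(simulate(10000))
```

A blocked retrial leaves the state unchanged (the customer stays in its orbit), which matches the indicator conditions on the retrial rates in Equation (1).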

Note that the structural parameters of the system, such that *R* > *x*1, *x*2, can take any values; as *R* grows relative to *x*1, *x*2, the number of system states increases, which entails an increase in the time spent on analytically calculating the probability distribution. The proposed parameter values allow for clearly demonstrating the dimension of the system and the influence of the system load parameters, such as *α*, *λk*, *μk*, *k* = 1, 2, on the main performance characteristics.

First, we compare asymptotic and simulated distributions for various values of *σ* (the infinitesimal variable of the asymptotic analysis). In Figure 3, the two-dimensional asymptotic probability distribution of the number of customers in orbits is presented for *σ* = 0.01.

**Figure 3.** The two-dimensional asymptotic probability distribution of the number of customers in the orbits.

In Figures 4–6, you may find the comparison of the asymptotic and simulated one-dimensional distributions for various values of *σ*. Curves 1 and 2 are the probability distributions of the number of customers in the first and the second orbits, respectively.

**Figure 4.** Comparison of the asymptotic and simulated distributions for *σ* = 0.01.

**Figure 5.** Comparison of the asymptotic and simulated distributions for *σ* = 0.03.

**Figure 6.** Comparison of the asymptotic and simulated distributions for *σ* = 0.05.

The mean and the variance of the considered processes are calculated (Figures 7 and 8).

**Figure 7.** Asymptotic and simulated means.

**Figure 8.** Asymptotic and simulated variances.

In addition, the relative errors of the asymptotic means and variances (in comparison with the simulated results) are computed (Tables 1 and 2).
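The relative errors behind Tables 1 and 2 are computed as follows (the sample values are illustrative, not taken from the tables):

```python
def relative_error(asymptotic, simulated):
    """|asymptotic - simulated| / simulated."""
    return abs(asymptotic - simulated) / simulated

print(relative_error(51.0, 50.0))  # 0.02
```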

**Table 1.** Table of relative errors of means.


**Table 2.** Table of relative errors of variances.


By analyzing Tables 1 and 2, we conclude that the asymptotic formula can be used for *σ* < 0.05.

Another measure for comparing distributions is the Kolmogorov distance

$$\delta = \max\_{i \ge 0} \left| \sum\_{l=0}^{i} \left[ \tilde{p}(l) - p(l) \right] \right|,$$

where *p*(*l*) is the probability distribution of the process *I*1(*t*) or *I*2(*t*) calculated using the asymptotic formula, and *p*˜(*l*) is the corresponding empirical distribution based on the simulation. The values of the Kolmogorov distances for our example are presented in Table 3.
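The Kolmogorov distance above is a one-line computation over cumulative sums; a sketch with toy probability vectors (not the paper's data):

```python
def kolmogorov_distance(p, p_emp):
    """delta = max_i | sum_{l <= i} (p_emp(l) - p(l)) |."""
    acc, delta = 0.0, 0.0
    for a, b in zip(p, p_emp):
        acc += b - a                 # running difference of the CDFs
        delta = max(delta, abs(acc))
    return delta

p_asym = [0.5, 0.3, 0.2]             # asymptotic distribution (toy values)
p_sim = [0.4, 0.4, 0.2]              # simulated distribution (toy values)
print(kolmogorov_distance(p_asym, p_sim))  # ≈ 0.1
```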

**Table 3.** The Kolmogorov distances between asymptotic and simulated distributions.


Another direction of the numerical analysis is the computation of the model performance characteristics. The main practical characteristic is the joining probability *Pk* (the probability that an arriving customer goes to an orbit):

$$P_k = \sum_{(n_1,n_2)\in\mathbb{B}_k} R(n_1,n_2),$$

where

$$\mathbb{B}_1 = \{ (n_1, n_2) : x_1(n_1 + 1) + x_2 n_2 > R \}, \\ \mathbb{B}_2 = \{ (n_1, n_2) : x_1 n_1 + x_2(n_2 + 1) > R \}.$$
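Given the stationary matrix *R*(*n*1, *n*2), the joining probabilities reduce to sums over the blocking sets; a sketch with a hypothetical stationary distribution (the parameter values follow this section):

```python
import numpy as np

x1, x2, R_cap = 1, 3, 5

# Toy stationary distribution R(n1, n2); in the full method it comes from System (9).
Rmat = np.random.default_rng(1).random((6, 2))
Rmat /= Rmat.sum()

n1, n2 = np.indices(Rmat.shape)
P1 = Rmat[(n1 + 1) * x1 + n2 * x2 > R_cap].sum()   # arriving class-1 customer is blocked
P2 = Rmat[n1 * x1 + (n2 + 1) * x2 > R_cap].sum()   # arriving class-2 customer is blocked

print(0.0 <= P1 <= 1.0, 0.0 <= P2 <= 1.0)
```

Since a class-2 customer requires more resource here, its blocking set contains that of class 1, so *P*2 ≥ *P*1 for any stationary distribution.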

In the following tables, the dependence of *Pk* on the values of the system parameters is presented.

From Figures 9 and 10, we see that the parameter *α* does not have much impact on the joining probability *P*, so we need to take into account only the system load (*λ*/*μ*) when choosing the amount of resource *R*. Tables 4–7 can be used for practical purposes.

**Table 4.** Values of joining probability *P*1.


**Table 5.** Values of joining probability *P*2.


**Table 6.** Values of joining probability *P*1.



**Table 7.** Values of joining probability *P*2.

**Figure 9.** Dependence of *P* on *α*.

**Figure 10.** Dependence of *P* on *λ*2.

#### **5. Conclusions**

In this paper, a resource multi-server retrial queue with different service rates for the different types of arriving customers and with server breakdowns modeled by a negative arrival process is analyzed. The capacity of the system resources is limited. The stationary distribution of the number of customers in the orbits and in the service unit is obtained under the asymptotic condition of a long delay in the orbits. The comparison of the asymptotic results with the simulation results shows the high accuracy of our approach. Finally, we provide numerical examples to show the impact of the model parameters on some performance characteristics. Such an analysis of queueing systems with a limited total amount of resources can be used in the design of communication network nodes.

In the future, we plan to study multi-server resource retrial queuing systems with non-Poisson arrival processes and random amounts of required resources. In addition, the study of queuing systems in which the service time depends on the required resource also seems interesting.

**Author Contributions:** Conceptualization, E.L., E.F. and S.M.; methodology, E.L., R.S., E.F. and S.M.; software, E.L., R.S.; writing, original draft preparation, E.F.; writing, review and editing, E.L., R.S., E.F. and S.M.; supervision S.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


#### *Article* **Bounds on the Rate of Convergence for *M<sup>X</sup><sub>t</sub>* /*M<sup>X</sup><sub>t</sub>* /1 Queueing Models**

**Alexander Zeifman 1,2,3,4,\*, Yacov Satin <sup>1</sup> and Alexander Sipin <sup>1</sup>**


**Abstract:** We apply the method of differential inequalities to the computation of upper bounds on the rate of convergence to the limiting regime for one specific class of (in)homogeneous continuous-time Markov chains. Such an approach seems very general; the corresponding description and bounds were considered earlier for finite Markov chains with intensity functions analytic in time. Here we generalize this method to locally integrable intensity functions. Special attention is paid to the situation of a countable Markov chain. To obtain these estimates, we investigate the corresponding forward system of Kolmogorov differential equations as a differential equation in the space of sequences *l*1.

**Keywords:** inhomogeneous continuous-time Markov chain; weak ergodicity; rate of convergence; sharp bounds; differential inequalities; forward Kolmogorov system

#### **1. Introduction**

In this paper we consider the problem of finding upper bounds on the rate of convergence for some (in)homogeneous continuous-time Markov chains.

To obtain these estimates, we investigate the corresponding forward system of Kolmogorov differential equations.

Consideration is given to classic inhomogeneous birth–death processes and to special inhomogeneous chains with transition intensities that do not depend on the current state. Namely, let {*X*(*t*), *t* ≥ 0} be an inhomogeneous continuous-time Markov chain with the state space X = {0, 1, 2, . . . }. Denote by *pij*(*s*, *t*) = *P*{*X*(*t*) = *j*|*X*(*s*) = *i*}, *i*, *j* ≥ 0, 0 ≤ *s* ≤ *t*, the transition probabilities of *X*(*t*) and by *pi*(*t*) = *P*{*X*(*t*) = *i*} the probability that *X*(*t*) is in state *i* at time *t*. Let **p**(*t*) = (*p*0(*t*), *p*1(*t*), . . . )*<sup>T</sup>* be the probability distribution vector at instant *t*. Throughout the paper it is assumed that in a small time interval *h* the possible transitions and their associated probabilities are

$$p\_{ij}(t, t+h) = \begin{cases} q\_{ij}(t)h + \alpha\_{ij}(t, h), & \text{if } j \neq i, \\ 1 - \sum\limits\_{k \in \mathcal{X}, k \neq i} q\_{ik}(t)h + \alpha\_{i}(t, h), & \text{if } j = i, \end{cases}$$

where $\sup\_{i \ge 0} \sum\_{j \ge 0} |\alpha\_{ij}(t, h)| = o(h)$ for any *t* ≥ 0. We also suppose that the transition intensities *qij*(*t*) ≥ 0 are arbitrary non-random functions of *t*, locally integrable on [0, ∞), and, moreover, that there exists a positive number *L* such that

$$\sup\_{i \in \mathcal{X}} \left( \sum\_{k \in \mathcal{X}, k \neq i} q\_{ik}(t) \right) \le L < \infty, \tag{1}$$

**Citation:** Zeifman, A.; Satin, Y.; Sipin, A. Bounds on the Rate of Convergence for *M<sup>X</sup> <sup>t</sup>* /*M<sup>X</sup> <sup>t</sup>* /1 Queueing Models. *Mathematics* **2021**, *9*, 1752. https://doi.org/10.3390/ math9151752

Academic Editor: Tuan Phung-Duc

Received: 6 July 2021 Accepted: 23 July 2021 Published: 25 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

for almost all *t* ≥ 0. Then the probabilistic dynamics of the process *X*(*t*) is given by the forward Kolmogorov system

$$\frac{d}{dt}\mathbf{p}(t) = A(t)\mathbf{p}(t),\tag{2}$$

where *A*(*t*) is the transposed intensity matrix i.e., *aij*(*t*) = *qji*(*t*), *i*, *j* ∈ X .

We can consider (2) as the differential equation with bounded operator function in the space of sequences *l*<sup>1</sup> (see details, for instance in [1]) and apply all results of [2].

Throughout this paper, by $\| \cdot \|$ (or by $\| \cdot \|\_1$ if ambiguity is possible) we denote the *l*1-norm, i.e., $\|\mathbf{p}(t)\| = \sum\_{i \in \mathcal{X}} |p\_i(t)|$ and $\|A(t)\| = \sup\_{j \in \mathcal{X}} \sum\_{i \in \mathcal{X}} |a\_{ij}(t)|$. Let Ω be the set of all stochastic vectors, i.e., *l*<sup>1</sup> vectors with non-negative coordinates and unit norm. Then $\|A(t)\| \le 2L$ for almost all *t* ≥ 0, and **p**(*s*) ∈ Ω implies **p**(*t*) ∈ Ω for any 0 ≤ *s* ≤ *t*.

Recall that a Markov chain *X*(*t*) is called *weakly ergodic* if $\|\mathbf{p}^\*(t) - \mathbf{p}^{\*\*}(t)\| \to 0$ as *t* → ∞ for any initial conditions **p**∗(0) and **p**∗∗(0), where **p**∗(*t*) and **p**∗∗(*t*) are the corresponding solutions of (2).

We consider, as in [3], four classes of Markov chains *X*(*t*) with the following transition intensities:

(i) *qij*(*t*) = 0 for any *<sup>t</sup>* ≥ 0 if |*<sup>i</sup>* − *<sup>j</sup>*| > 1 and both *qi*,*i*+1(*t*) = *<sup>λ</sup>i*(*t*) and *qi*,*i*−1(*t*) = *<sup>μ</sup>i*(*t*) may depend on *i*;

(ii) *qi*,*i*−*k*(*t*) = 0 for *<sup>k</sup>* > 1, *qi*,*i*−1(*t*) = *<sup>μ</sup>i*(*t*) may depend on *<sup>i</sup>*; and *qi*,*i*+*k*(*t*), *<sup>k</sup>* ≥ 1, depend only on *k* and not on *i*;

(iii) *qi*,*i*+*k*(*t*) = 0 for *<sup>k</sup>* > 1, *qi*,*i*+1(*t*) = *<sup>λ</sup>i*(*t*) may depend on *<sup>i</sup>*; and *qi*,*i*−*k*(*t*), *<sup>k</sup>* ≥ 1, depend only on *k* and not on *i*;

(iv) both *qi*,*i*−*k*(*t*) and *qi*,*i*+*k*(*t*), *<sup>k</sup>* ≥ 1, depend only on *<sup>k</sup>* and do not depend on *<sup>i</sup>*.

Each such process can be considered as the queue-length process for the corresponding queueing system *M<sup>X</sup> <sup>t</sup>* /*M<sup>X</sup> <sup>t</sup>* /1.

Then type (i) transitions describe Markovian queues with possibly state-dependent arrival and service intensities (for example, the classic *Mn*(*t*)/*Mn*(*t*)/1 queue); type (ii) transitions allow consideration of Markovian queues with state-independent batch arrivals and state-dependent service intensity; type (iii) transitions lead to Markovian queues with possibly state-dependent arrival intensity and state-independent batch service; type (iv) transitions describe Markovian queues with state-independent batch arrivals and batch service. We can refer to them as *M<sup>X</sup> <sup>t</sup>* /*M<sup>X</sup> <sup>t</sup>* /1 queueing models following the original paper [4]; see also [3,5,6].
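For intuition, the generator of a truncated class (iv) chain can be assembled directly. The following sketch is illustrative (the function name and the rate lists `a`, `b` are our own conventions): it builds the transposed intensity matrix *A* = *Q*<sup>T</sup> that drives the forward system (2), at a fixed time, for constant batch rates.

```python
# Sketch: the transposed intensity matrix A = Q^T of a class (iv) chain,
# truncated to states 0..n-1.  a[k-1] is the rate of an arrival batch of
# size k, b[k-1] the rate of a service batch of size k (state-independent).
def transposed_intensity_matrix(a, b, n):
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):                        # column j = source state
        for k, rate in enumerate(a, start=1):
            if j + k < n:                     # up-jump j -> j+k (truncated)
                A[j + k][j] += rate
        for k, rate in enumerate(b, start=1):
            if j - k >= 0:                    # down-jump j -> j-k
                A[j - k][j] += rate
        # diagonal balances the column so that column sums vanish
        A[j][j] = -sum(A[i][j] for i in range(n) if i != j)
    return A
```

Every column sums to zero, as a generator (transposed) must; with single arrivals and single services the construction reduces to an ordinary birth–death generator.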

The paper is organized as follows. Section 2 introduces a description of the problem. Section 3 considers the explicit form of the reduced intensity matrices. In Section 4, we obtain upper bounds for the rate of convergence. Section 5 concludes the paper.

#### **2. Preliminaries**

The problem of estimating the rate of convergence, like the very fact of convergence, is very important for studying the long-run (limiting) behavior of continuous-time Markov chains with time-varying intensities; see the detailed discussion, examples and references in [7]. The simplest and most convenient method for studying the rate of convergence to the limiting regime is that of the logarithmic norm; see, for example, [1,3,8].

However, there are situations in which this approach does not give good results.

Next, we show the possibility of using a different approach in such cases, namely the method of differential inequalities.

Another (but similar) approach is to use piecewise-linear Lyapunov functions; see, for example, [9–12].

Consider here the two simplest examples of bounding the rate of convergence for differential equations.

Let firstly

$$\frac{d\mathbf{x}}{dt} = P\mathbf{x}$$

be a system of differential equations with **x** = (*x*1, *x*2)*<sup>T</sup>* and

$$P = \begin{pmatrix} -5 & 8 \\ 2 & -5 \end{pmatrix}.$$

Put *d*<sup>1</sup> = 1, *d*<sup>2</sup> = 2, and **z** = *D***x** = (*d*1*x*1, *d*2*x*2)*<sup>T</sup>*. Then $\frac{d\mathbf{z}}{dt} = DPD^{-1}\mathbf{z}$, and both column sums of *P*<sup>∗</sup> = *DPD*−<sup>1</sup> equal −1. Hence the logarithmic norm $\gamma(P^\*) = \sup\_i \left( p^\*\_{ii} + \sum\_{j \neq i} p^\*\_{ji} \right)$ equals −1, and we obtain a sharp upper bound on the rate of convergence: $\|\mathbf{z}(t)\| \le e^{-t} \|\mathbf{z}(0)\|$. Such a situation is typical if the matrix of the considered system is essentially non-negative (i.e., all off-diagonal elements are non-negative for any *t* ≥ 0). Note that the corresponding eigenvalues of *P* are −1 and −9.
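These claims are easy to verify numerically; a minimal sketch (the helper `log_norm_l1` is our own, not from the source):

```python
# Check of the first example: D = diag(1, 2) makes both column sums of
# D P D^{-1} equal to -1, so the l1 logarithmic norm of P* is exactly -1,
# matching the dominant eigenvalue -1 of P.
import math

P = [[-5.0, 8.0], [2.0, -5.0]]
d = [1.0, 2.0]
# entries of P* = D P D^{-1} are d_i * P[i][j] / d_j
Pstar = [[d[i] / d[j] * P[i][j] for j in range(2)] for i in range(2)]

def log_norm_l1(M):
    # gamma(M) = max over columns j of M[j][j] + sum_{i != j} |M[i][j]|
    n = len(M)
    return max(M[j][j] + sum(abs(M[i][j]) for i in range(n) if i != j)
               for j in range(n))

# eigenvalues of the 2x2 matrix P via trace and determinant
tr = P[0][0] + P[1][1]
det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)
eigs = sorted([(tr - disc) / 2.0, (tr + disc) / 2.0])
```

Here `log_norm_l1(Pstar)` returns −1 and `eigs` is [−9, −1], in agreement with the text.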

On the other hand, let

$$P = \begin{pmatrix} -3 & 8 \\ -2 & -3 \end{pmatrix}.$$

Then the corresponding eigenvalues of *P* are −3 ± 4*i*, while the "weighted" logarithmic norm of *P* is not less than 1. In principle, here it is also possible to transform the matrix so that the logarithmic norm attains the exact value (−3), see [2], but the corresponding transformation will be complex and difficult to implement. The best result (*Ce*−3*<sup>t</sup>* ) here can be obtained using a Lyapunov function (which does not work well in a countable situation), but the use of differential inequalities gives us an estimate like *Ce*−(3−*ε*)*<sup>t</sup>* for any positive *ε*; see the corresponding description below, in Section 4. This approach deals with the sums of the columns for various combinations of the signs of the coordinates of the solutions of the system. It was first proposed in our recent papers; see [3] for the case of a finite Markov chain with analytic (in *t*) intensities.
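The column-sum mechanics behind this can be exhibited concretely. A minimal numerical sketch with hand-picked, purely illustrative weights (not taken from [2,3]): for each sign pattern of the solution's coordinates one can choose weights, with matching signs, that push all weighted column sums below −(3 − *ε*); here *ε* = 1/2.

```python
# For P = [[-3, 8], [-2, -3]] and each sign pattern of (x1, x2), weights d
# with sign(d_k) = sign(x_k) are chosen so that every column sum of the
# weighted matrix (d_i * P[i][j] / d_j) is <= -(3 - eps); here eps = 0.5.
P = [[-3.0, 8.0], [-2.0, -3.0]]
eps = 0.5

def column_sums(M, d):
    n = len(M)
    return [sum(d[i] * M[i][j] / d[j] for i in range(n)) for j in range(n)]

# hand-picked weights per sign pattern of the solution's coordinates
weights = {(+1, +1): [1.0, 16.0 / eps],
           (+1, -1): [1.0, -eps / 16.0],
           (-1, -1): [-1.0, -16.0 / eps],
           (-1, +1): [-1.0, eps / 16.0]}
```

Shrinking *ε* forces ever more extreme weight ratios, which is exactly why the constant in front of the exponential grows as the rate approaches 3.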

In this paper, it is shown that this method can be applied in the more general situation of locally integrable intensities and, which is most difficult, to a countable chain, which does not lend itself to direct reasoning and requires rather fine approximation estimates.

#### **3. Explicit Forms of the Reduced Intensity Matrices**

Due to the normalization condition $p\_0(t) = 1 - \sum\_{i \ge 1} p\_i(t)$, we can rewrite system (2) as follows:

$$\frac{d}{dt}\mathbf{z}(t) = B(t)\mathbf{z}(t) + \mathbf{f}(t),\tag{3}$$

where

$$\mathbf{f}(t) = \left( a\_{10}(t), a\_{20}(t), \dots \right)^T, \quad \mathbf{z}(t) = \left( p\_1(t), p\_2(t), \dots \right)^T,$$

$$B(t) = \begin{pmatrix} a\_{11} - a\_{10} & a\_{12} - a\_{10} & \cdots & a\_{1r} - a\_{10} & \cdots \\ a\_{21} - a\_{20} & a\_{22} - a\_{20} & \cdots & a\_{2r} - a\_{20} & \cdots \\ \vdots & \vdots & \ddots & \vdots & \\ a\_{r1} - a\_{r0} & a\_{r2} - a\_{r0} & \cdots & a\_{rr} - a\_{r0} & \cdots \\ \vdots & \vdots & & \vdots & \ddots \end{pmatrix}. \tag{4}$$
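The reduction (3)–(4) is mechanical; a small sketch, assuming *A* is given as a dense list-of-lists for a finite truncation (the function name is ours):

```python
# Sketch: forming f and B of the reduced system (3) from a (truncated)
# transposed intensity matrix A, using p_0 = 1 - sum_{i>=1} p_i:
# f_i = a_{i0} and b_{ij} = a_{ij} - a_{i0} for i, j >= 1.
def reduce_system(A):
    n = len(A)
    f = [A[i][0] for i in range(1, n)]
    B = [[A[i][j] - A[i][0] for j in range(1, n)] for i in range(1, n)]
    return B, f
```

For instance, for a birth–death chain with λ = 1, μ = 2 truncated to three states, A = [[−1, 2, 0], [1, −3, 2], [0, 1, −2]] gives f = (1, 0) and B = [[−4, 1], [1, −2]].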

Let **y**(*t*) = **z**∗(*t*) − **z**∗∗(*t*) be the difference of two solutions of system (3), with **y**(*t*) = (*y*1(*t*), *y*2(*t*), . . . )*<sup>T</sup>*. Then, in contrast to the coordinates of the vector **p**(*t*), the coordinates of the vector **y**(*t*) have arbitrary signs.

Consider now the 'homogeneous' system

$$\frac{d}{dt}\mathbf{y}(t) = B(t)\mathbf{y}(t),\tag{5}$$

corresponding to (3). As was first noticed in [13], it is more convenient to study the rate of convergence using the transformed version *B*∗(*t*) of *B*(*t*) given by *B*∗(*t*) = *TB*(*t*)*T*−1, where *T* is the upper triangular matrix of the form

$$T = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 & \cdots \\ 0 & 1 & 1 & \cdots & 1 & \cdots \\ 0 & 0 & 1 & \cdots & 1 & \cdots \\ \vdots & \vdots & \vdots & \ddots & \cdots \\ 0 & 0 & 0 & \cdots & 1 & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix} . \tag{6}$$

Let **u**(*t*) = *T***y**(*t*). Then the system (5) can be rewritten in the form

$$\frac{d}{dt}\mathbf{u}(t) = B^\*(t)\mathbf{u}(t),\tag{7}$$

where **u**(*t*) = (*u*1(*t*), *u*2(*t*),...) *<sup>T</sup>* is the vector with the coordinates of arbitrary signs. If one of the two matrices *B*∗(*t*) or *B*(*t*) is known, the other is also (uniquely) defined.
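On a finite truncation, *T* takes tail sums and *T*<sup>−1</sup> takes adjacent differences; a quick sanity check (the two helper functions below are our own illustrative implementations):

```python
# u = T y takes tail sums u_i = y_i + y_{i+1} + ...; the inverse recovers
# y_i = u_i - u_{i+1} (with the convention u_{n+1} = 0 on a truncation).
def T_apply(y):
    return [sum(y[i:]) for i in range(len(y))]

def T_inverse(u):
    return [u[i] - (u[i + 1] if i + 1 < len(u) else 0.0)
            for i in range(len(u))]
```

Applying `T_inverse` after `T_apply` returns the original vector, which is the finite analogue of the unique correspondence between *B*∗(*t*) and *B*(*t*).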

The approach based on the differential inequalities (see [3]) seems to be the most general. On the other hand, if *B*∗(*t*) is essentially non-negative (i.e., all off-diagonal elements are non-negative for any *t* ≥ 0), then the method based on the logarithmic norm gives the same results, but in a much more visual form, see [3].

Let us write out the form of the matrix *B*∗(*t*) for each class of chains; in more detail, the corresponding transformations can be seen in [3].

For *X*(*t*) belonging to class (i) (inhomogeneous birth–death process) one has

$$B^\*(t) = TB(t)T^{-1} = \begin{pmatrix} -(\lambda\_0 + \mu\_1) & \mu\_1 & 0 & \cdots & \\ \lambda\_1 & -(\lambda\_1 + \mu\_2) & \mu\_2 & \cdots & \\ & \ddots & \ddots & \ddots & \\ & & \lambda\_{r-1} & -(\lambda\_{r-1} + \mu\_r) & \mu\_r \\ & & & \ddots & \ddots \end{pmatrix}. \tag{8}$$

For *X*(*t*) belonging to class (ii) (which corresponds to the queueing system with batch arrivals and single services), one has

$$B^\*(t) = TB(t)T^{-1} = \begin{pmatrix} -\left( \sum\_{k \ge 1} a\_k + \mu\_1 \right) & \mu\_1 & 0 & \cdots \\ a\_1 & -\left( \sum\_{k \ge 1} a\_k + \mu\_2 \right) & \mu\_2 & \cdots \\ a\_2 & a\_1 & -\left( \sum\_{k \ge 1} a\_k + \mu\_3 \right) & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix}, \tag{9}$$

where $a\_k = q\_{i,i+k}(t)$ denotes the common intensity of arrivals in batches of size *k*.

For *X*(*t*) belonging to class (iii) (which corresponds to the queueing system with single arrivals and group services), one has *B*∗(*t*) = *TB*(*t*)*T*−<sup>1</sup> =

$$\begin{pmatrix} -(\lambda\_0 + b\_1) & b\_1 - b\_2 & b\_2 - b\_3 & \cdots \\ \lambda\_1 & -\left( \lambda\_1 + \sum\_{l \le 2} b\_l \right) & b\_1 - b\_3 & \cdots \\ 0 & \lambda\_2 & -\left( \lambda\_2 + \sum\_{l \le 3} b\_l \right) & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix}, \tag{10}$$

where $b\_l = q\_{i,i-l}(t)$ denotes the common intensity of services of groups of size *l*.

Finally, for *X*(*t*) belonging to class (iv) (which corresponds to the queueing system with state-independent batch arrivals and group services), one has

$$B^\*(t) = TB(t)T^{-1} = \begin{pmatrix} -\left( \sum\_{k \ge 1} a\_k + b\_1 \right) & b\_1 - b\_2 & b\_2 - b\_3 & \cdots \\ a\_1 & -\left( \sum\_{k \ge 1} a\_k + \sum\_{l \le 2} b\_l \right) & b\_1 - b\_3 & \cdots \\ a\_2 & a\_1 & -\left( \sum\_{k \ge 1} a\_k + \sum\_{l \le 3} b\_l \right) & \ddots \\ \vdots & \ddots & \ddots & \ddots \end{pmatrix}, \tag{11}$$

where

$$T^{-1} = \begin{pmatrix} 1 & -1 & 0 & \cdots & 0 & \cdots \\ 0 & 1 & -1 & \cdots & 0 & \cdots \\ 0 & 0 & 1 & \cdots & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots & \cdots \\ 0 & 0 & 0 & \cdots & 1 & \cdots \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \end{pmatrix}.$$

**Remark 1.** *Generally speaking, for models of the first and second classes, the matrix B*∗(*t*) *is always essentially non-negative; at the same time, for models of the third and fourth classes, this requires some additional assumptions. Under essential non-negativity of B*∗(*t*) *all bounds on the rate of convergence can be obtained via logarithmic norm, see [3]. However, in the general case, this approach may not work, and the method of differential inequalities described in our previous papers, see [3,14], would be more effective.*

Thus, in this paper we will consider chains of the third and fourth classes with a countable state space. For simplicity of calculations, we will additionally assume that the size of the simultaneously arriving and/or servicing group of customers does not exceed some fixed number, say *R*, i.e., that all *qij*(*t*) = 0 for |*i* − *j*| > *R* and any *t* ≥ 0.

Let {*di*, *i* ≥ 1} be a sequence of non-zero numbers such that inf*<sup>k</sup>* |*dk*| = *d* > 0. Denote by *D* = *diag*(*d*1, *d*2, ...) the corresponding diagonal matrix, with the off-diagonal elements equal to zero. Let **w**(*t*) = *D***u**(*t*) in (7), then we obtain the following equation

$$\frac{d}{dt}\mathbf{w}(t) = B^{\ast\ast}(t)\mathbf{w}(t),\tag{12}$$

where

$$B^{\*\*}(t) = DB^\*(t)D^{-1} = \left(b^{\*\*}\_{i\bar{j}}(t)\right)\_{i,j \ge 1}.\tag{13}$$

If we write out $B^\*(t) = \left( b^\*\_{ij}(t) \right)\_{i,j \ge 1}$, then

$$b\_{ij}^{\*\*}(t) = \frac{d\_i}{d\_j} b\_{ij}^\*(t), \quad |i - j| \le R,\tag{14}$$

and our assumption implies $b^{\*\*}\_{ij}(t) = b^\*\_{ij}(t) = 0$ for any *t* ≥ 0 if |*i* − *j*| > *R*.

#### **4. Upper Bounds on the Rate of Convergence**

Let us first consider a general finite system of linear differential equations, which we will write in the form

$$\frac{d}{dt}\mathbf{x}(t) = B^\*(t)\mathbf{x}(t), \ t \ge 0,\tag{15}$$

where **x**(*t*) = (*x*1(*t*),..., *xS*(*t*)) *<sup>T</sup>*, and let *D* now be the corresponding *finite* diagonal matrix.

The simplest situation, with coefficients *b*∗ *ij*(*t*) analytic in *t*, has been studied in [3,14,15]. The method of estimating under such an assumption is based on the fact that, in this case, on any finite interval each coordinate has a finite number of sign changes, which means that the semiaxis can be divided into intervals on each of which the signs of the coordinates are constant. Consider such an interval (*t*1, *t*2). Choose the signs of the *dk*-s so that all *dkxk*(*t*) > 0. Hence $\|\mathbf{w}(t)\| = \|D\mathbf{x}(t)\| = \sum\_{k=1}^{S} d\_k x\_k(t) \ge d \|\mathbf{x}(t)\|\_1$ can be considered as the corresponding norm.

Let $\sum\_{i=1}^{S} b^{\*\*}\_{ij}(t) \le -\alpha\_D(t)$ for any *j*; then

$$\frac{d}{dt} \|\mathbf{w}(t)\| = \frac{d \left( \sum\_k w\_k \right)}{dt} = \sum\_{i,j} b^{\*\*}\_{ij}(t) w\_j(t) \le -\alpha\_D(t) \|\mathbf{w}(t)\|. \tag{16}$$

Then

$$\|\mathbf{w}(t)\| = \|D\mathbf{x}(t)\|\_1 \le e^{-\int\_s^t \alpha\_D(\tau) d\tau} \|D\mathbf{x}(s)\|\_1, \quad t\_1 < s < t < t\_2, \tag{17}$$

for the corresponding matrix *D* and corresponding function *αD*(*t*). Hence, we have

$$\|\mathbf{x}(t)\|\_1 \le \frac{\max |d\_k|}{\min |d\_m|} e^{-\int\_s^t \alpha\_D(\tau) d\tau} \|\mathbf{x}(s)\|\_1, \tag{18}$$

for any *t*<sup>1</sup> < *s* < *t* < *t*2, and by continuity, for all *t*<sup>1</sup> ≤ *s* < *t* ≤ *t*2.

Let now *s*, *t* be arbitrary, 0 ≤ *s* ≤ *t* < ∞. Then for any interval with fixed signs of the coordinates we have bound (18) with the corresponding *D* and *αD*(*t*). Let now $\alpha^\*(t) = \min \alpha\_D(t)$ and $d^\*(S) = d^\* = \max \frac{|d\_k|}{|d\_m|}$, where the minimum and maximum are taken over all possible combinations of coordinate signs of the solution **x**(*t*), for *any* 0 ≤ *s* ≤ *t*. Then we obtain the following general estimate:

$$\|\mathbf{x}(t)\|\_1 \le d^\*(S) e^{-\int\_s^t \alpha^\*(\tau) d\tau} \|\mathbf{x}(s)\|\_1. \tag{19}$$

Let there exist positive numbers *M*, *β* such that

$$e^{-\int\_s^t \alpha^\*(\tau) d\tau} \le M e^{-\beta (t-s)}, \quad 0 \le s \le t. \tag{20}$$

Consider now an arbitrary interval [0, *t* ∗]; if our original coefficients are locally integrable, they can be approximated arbitrarily accurately by continuous functions. In turn, a continuous function can be approximated arbitrarily accurately by an analytic function. As a result, instead of the integrable *B*∗(*t*), we obtain an analytic *B*¯ <sup>∗</sup>(*t*) such that

$$\int\_{0}^{t^\*} \|B^\*(\tau) - \bar{B}^\*(\tau)\| d\tau \le \varepsilon. \tag{21}$$

Denote now by *W*(*t*,*s*) and *W*¯ (*t*,*s*) the Cauchy operators for (15) and the respective system with matrix *B*¯ <sup>∗</sup>(*t*). Then, if (20) holds, in accordance with Lemma 3.2.3 [2] (see [2], pp. 110–111) we obtain

$$\begin{split} \| W(t,s) - \bar{W}(t,s) \| \le M d^\* e^{-\beta (t-s)} \left( e^{M d^\* \int\_s^t \| B^\*(\tau) - \bar{B}^\*(\tau) \| d\tau} - 1 \right) \\ \le M d^\* e^{-\beta (t-s)} \left( e^{M d^\* \varepsilon} - 1 \right). \end{split} \tag{22}$$

Hence we have the following statement.

**Lemma 1.** *Let all $b^\*\_{ij}(t)$ be locally integrable on* [0, ∞)*. Let inequality (20) hold. Then*

$$\|\mathbf{x}(t)\|\_1 \le d^\*(S) M e^{-\beta (t-s)} \|\mathbf{x}(s)\|\_1, \tag{23}$$

*for any solution of (15) and any* 0 ≤ *s* ≤ *t.*

Let us now return to a countable system (7) and consider the corresponding truncated system

$$\frac{d}{dt}\mathbf{u}(n,t) = B^\*(n,t)\mathbf{u}(n,t),\tag{24}$$

where $B^\*(n, t) = \left( b^\*\_{ij}(t) \right)\_{i,j=1}^{n}$.

Below we will identify the finite vector with entries (*a*1, ... , *an*) and the infinite vector with the same first *n* coordinates and the others equal to zero.

Rewrite system (24) as

$$\frac{d}{dt}\mathbf{u}(n,t) = B^\*(t)\mathbf{u}(n,t) + \left(B^\*(n,t) - B^\*(t)\right)\mathbf{u}(n,t). \tag{25}$$

Denote by *V*(*t*,*s*) and *V*(*n*, *t*,*s*) the Cauchy operators for (7) and (24), respectively. Suppose that *n* > *S*, and that, in addition

$$\mathbf{u}(0) = \mathbf{u}(n,0) = \mathbf{u}(S,0), \quad \|\mathbf{u}(0)\|\_1 \le 1. \tag{26}$$

Then one has from (7)

$$\mathbf{u}(t) = V(t)\mathbf{u}(0) = V(t)\mathbf{u}(n,0). \tag{27}$$

On the other hand, from (25) we have


$$\mathbf{u}(n,t) = V(t)\mathbf{u}(n,0) + \int\_0^t V(t,\tau)(B^\*(n,\tau) - B^\*(\tau))\mathbf{u}(n,\tau) \,d\tau. \tag{28}$$

Hence in any norm we obtain the bound

$$\| \mathbf{u}(t) - \mathbf{u}(n, t) \| \le \int\_0^t \| V(t, \tau) \| \, \| (B^\*(n, \tau) - B^\*(\tau)) \mathbf{u}(n, \tau) \| \, d\tau. \tag{29}$$

Denote $\sup \frac{|d\_k|}{|d\_m|} = \hat{d} < \infty$, where the supremum is taken over all possible combinations of coordinate signs of the solution **u**(*t*) of (7), under the assumption |*k* − *m*| = 1.

Put now *D*∗ = *diag*(*d*∗(1), *d*∗(2),...).

Note that according to (14) the matrix *B*∗∗(*t*) has nonzero entries only on the main diagonal and at most *R* diagonals above and below it. Then

$$\| B^\*(t) \|\_{1D^\*} = \| B^{\*\*}(t) \|\_1 \le K = 2L \hat{d}^R, \tag{30}$$

for almost all *t* ≥ 0. Then

$$\|\|V(t,s)\|\|\_{1D^\*} \le e^{K(t-s)} \le e^{Kt^\*}.\tag{31}$$

On the other hand, all elements of the first *n* − *R* columns of the matrix (*B*∗(*n*, *τ*) − *B*∗(*τ*)) are zeros for any *τ* ≥ 0. Hence, all the first *n* − *R* coordinates of the corresponding vector (*B*∗(*n*, *τ*) − *B*∗(*τ*))**u**(*n*, *τ*) are also zeros, and

$$\|(B^\*(n,\tau) - B^\*(\tau))\mathbf{u}(n,\tau)\|\_{1D^\*} \le K \sum\_{k=n-R}^{n} \hat{d}^k |u\_k(n,\tau)|. \tag{32}$$

Put $D^{\*\*} = diag(\hat{d}^2, \hat{d}^4, \dots)$ and **w**∗(*t*) = *D*∗∗**u**(*t*). Then, instead of (30) and (31), we have

$$\| B^\*(t) \|\_{1D^{\*\*}} = \| D^{\*\*} B^\*(t) D^{\*\*-1} \|\_1 \le K^\* = 2L \hat{d}^{2R}, \tag{33}$$

and

$$||V(t,s)||\_{1D^{\*\*}} \le e^{K^\*(t-s)} \le e^{K^\*t^\*},\tag{34}$$

respectively.

Then

$$\|\mathbf{u}(n,t)\|\_{1D^{\*\*}} = \sum\_{k=1}^{n} \hat{d}^{2k} |u\_k(n,t)| \le e^{K^\* t^\*} \sum\_{k=1}^{S} \hat{d}^{2k} |u\_k(n,0)| \le e^{K^\* t^\*} \hat{d}^{2S} \sum\_{k=1}^{S} |u\_k(n,0)|. \tag{35}$$

Then (35) and (26) imply the bound

$$\hat{d}^{2n-2R} \sum\_{k=n-R}^{n} |u\_k(n,t)| \le \sum\_{k=1}^{n} \hat{d}^{2k} |u\_k(n,t)| \le e^{K^\* t^\*} \hat{d}^{2S}. \tag{36}$$

Then

$$\sum\_{k=n-R}^{n} \hat{d}^k |u\_k(n,t)| \le \hat{d}^n \sum\_{k=n-R}^{n} |u\_k(n,t)| \le e^{K^\* t^\*} \hat{d}^{2S+2R-n}. \tag{37}$$

Finally, for the right-hand side of (29) we have the bound

$$\int\_{0}^{t} \|V(t,\tau)\|\_{1D^{\*}} \|(B^{\*}(n,\tau) - B^{\*}(\tau))\mathbf{u}(n,\tau)\|\_{1D^{\*}} \,d\tau \le e^{Kt^{\*}} K t^{\*} e^{K^{\*}t^{\*}} \hat{d}^{2S+2R-n},\tag{38}$$

which tends to zero as *n* → ∞.

Hence we have the following statement.

**Lemma 2.** *Let the assumptions of Lemma 1 be fulfilled for any S. Then, under assumption (26), for any fixed ε* > 0 *and t* <sup>∗</sup> > 0*, we have* $\|\mathbf{u}(t) - \mathbf{u}(n,t)\|\_{1D^\*} < \varepsilon$ *for sufficiently large n and any t* ∈ [0, *t* ∗]*.*

As a result, Lemmas 1 and 2 guarantee an estimate of the form

$$\|\mathbf{u}(t)\|\_1 \le M e^{-\beta t} \|\mathbf{u}(0)\|\_{1D^\*}. \tag{39}$$

Consider now two arbitrary solutions **p**∗(*t*) and **p**∗∗(*t*) of the forward Kolmogorov system (2) with the corresponding initial conditions **p**∗(0) and **p**∗∗(0). Denote by **p**∗ <sup>0</sup> (*t*) and **p**∗∗ <sup>0</sup> (*t*) the respective vector functions with coordinates 1, 2, . . . (i.e., without the zero coordinate). One can write **u**(*t*) = *T*(**p**∗ <sup>0</sup> (*t*) − **p**∗∗ <sup>0</sup> (*t*)). Then (see for instance [8]) the following inequality holds: $\|\mathbf{p}^\*(t) - \mathbf{p}^{\*\*}(t)\|\_1 \le \frac{2}{d} \|\mathbf{u}(t)\|\_1$.

Finally we obtain the following statement.

**Theorem 1.** *Let the assumptions of Lemma 1 hold for any natural S. Then X*(*t*) *is weakly ergodic and the following bound on the rate of convergence holds:*

$$\|\mathbf{p}^\*(t) - \mathbf{p}^{\*\*}(t)\|\_1 \le \frac{2M}{d} e^{-\beta t} \|\mathbf{p}\_0^\*(0) - \mathbf{p}\_0^{\*\*}(0)\|\_{1D^\*}.\tag{40}$$

**Remark 2.** *A specific model (which belongs to both classes* (iii) *and* (iv)*) was investigated in [16] by the method described here.*

Namely, in that paper, the queueing model with single arrivals of intensity *λ*(*t*) and service of groups of two customers with intensity *μ*(*t*) was considered. Hence

$$B^\*(t) = \begin{pmatrix} -\lambda(t) & -\mu(t) & \mu(t) & 0 & \cdots \\ \lambda(t) & -(\lambda(t) + \mu(t)) & 0 & \mu(t) & \cdots \\ 0 & \lambda(t) & -(\lambda(t) + \mu(t)) & 0 & \ddots \\ \vdots & \ddots & \ddots & \ddots & \ddots \end{pmatrix}.$$

Let *δ* > 1 be a positive number. Put

$d\_1 = 1$, $d\_2 = 1/\delta$, $d\_k = \delta^{k-2}$, *k* ≥ 3, if all coordinates of the solution are positive; $|d\_k| = \delta^{k-1}$, *k* ≥ 1, otherwise.

Then one has,

$$\alpha^\*(t) \ge \min \left[ \lambda(t) \left( 1 - \delta^{-1} \right), \ \mu(t)(1 + \delta) - \lambda(t) \left( \delta^2 - 1 \right), \ \mu(t) \left( 1 - \delta^{-1} \right) - \lambda(t)(\delta - 1) \right]. \tag{41}$$

Moreover, $d = \delta^{-1}$, $\hat{d} = \delta$, and $d^\*\_k = \delta^{k-1}$ for *k* ≥ 1.

In particular, if the process *X*(*t*) is homogeneous, i.e., *λ*(*t*) = *λ* and *μ*(*t*) = *μ* are positive numbers, then $\int\_0^{\infty} \alpha^\*(t) dt = +\infty$ is equivalent to *α*<sup>∗</sup> > 0, and this is equivalent to 0 < *λ* < *μ*. Put $\delta = \sqrt{\mu / \lambda}$. Hence,

$$\alpha^\* = \min\left[\left(\sqrt{\mu} - \sqrt{\lambda}\right)^2, \lambda\left(1 - \sqrt{\frac{\lambda}{\mu}}\right)\right].\tag{42}$$
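A quick numerical cross-check, assuming the three-term lower bound on *α*∗(*t*) as we read inequality (41) (this reading is an assumption): with *δ* = √(*μ*/*λ*), its minimum coincides with formula (42).

```python
# With delta = sqrt(mu/lam), the minimum of the three expressions in our
# reading of bound (41) coincides with formula (42); checked numerically.
import math

def alpha_star_terms(lam, mu):
    delta = math.sqrt(mu / lam)
    return (lam * (1.0 - 1.0 / delta),
            mu * (1.0 + delta) - lam * (delta ** 2 - 1.0),
            mu * (1.0 - 1.0 / delta) - lam * (delta - 1.0))

def alpha_star_42(lam, mu):
    # formula (42)
    return min((math.sqrt(mu) - math.sqrt(lam)) ** 2,
               lam * (1.0 - math.sqrt(lam / mu)))
```

For instance, for *λ* = 1, *μ* = 4 both expressions give 1/2, and the agreement holds for any 0 < *λ* < *μ*.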

In the paper [16], a specific example with periodic intensities was considered. Namely, let *λ*(*t*) = 2 + sin 2*πt* and *μ*(*t*) = 4 − cos 2*πt*. Put *δ* = 11/10. Then $\int\_0^1 \alpha^\*(t) dt \ge \frac{1}{22} > 0$, *X*(*t*) is exponentially weakly ergodic and has the 1-periodic limiting mean (a Markov chain has the limiting mean *m*(*t*) if lim*t*→∞(*m*(*t*) − *E*(*t*, *k*)) = 0 for any *k*, where *E*(*t*, *k*) is the mathematical expectation of *X*(*t*) under the initial condition *X*(0) = *k*). Now, applying the known truncation technique (see the detailed discussion and bounds in [8]), one can compute all probability characteristics of the queue-length process *X*(*t*). Some of the corresponding graphs are shown in Figures 1–4; see the detailed discussion in [16].
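The lower bound on the integral over one period can be checked by direct numerical integration; the expression for *α*∗(*t*) used below is our reading of the three-term bound (41) and should be treated as an assumption.

```python
# Midpoint-rule check that the integral over one period of the lower bound
# for alpha*(t) (our reading of (41)) exceeds 1/22, with delta = 11/10,
# lambda(t) = 2 + sin(2 pi t), mu(t) = 4 - cos(2 pi t).
import math

delta = 1.1

def lam(t):
    return 2.0 + math.sin(2.0 * math.pi * t)

def mu(t):
    return 4.0 - math.cos(2.0 * math.pi * t)

def alpha_star_lower(t):
    return min(lam(t) * (1.0 - 1.0 / delta),
               mu(t) * (1.0 + delta) - lam(t) * (delta ** 2 - 1.0),
               mu(t) * (1.0 - 1.0 / delta) - lam(t) * (delta - 1.0))

n = 10000
integral = sum(alpha_star_lower((k + 0.5) / n) for k in range(n)) / n
```

The computed value comfortably exceeds 1/22, consistent with the exponential weak ergodicity claimed for this example.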

**Figure 1.** The means *E*(*t*, 0) and *E*(*t*, 100) for *t* ∈ [0, 28]; this figure shows the rate of convergence.

**Figure 2.** The means *E*(*t*, 0) and *E*(*t*, 100) for *t* ∈ [28, 29]; this figure shows the approximation of the limiting mean.

**Figure 3.** Probability *p*2(*t*) for *t* ∈ [0, 28] and initial conditions *X*(0) = 0 and *X*(0) = 100; this figure shows the rate of convergence.

**Figure 4.** Probability *p*2(*t*) for *t* ∈ [28, 29] and initial conditions *X*(0) = 0 and *X*(0) = 100; this figure shows approximation of the limiting probability *p*2(*t*).

#### **5. Conclusions**

In this paper, we have substantiated one of the most general methods for studying the rate of convergence to the limiting characteristics of weakly ergodic continuous-time Markov chains. Namely, the applicability of the method of differential inequalities is shown for countable inhomogeneous processes whose intensities depend non-smoothly on time. Thus, when studying continuous-time models from queueing theory, biology, physics and other sciences, guaranteed estimates of the rate of convergence allow us both to make sure that the influence of the initial conditions disappears with increasing time and to compute the main characteristics of the system in order to control them.

**Author Contributions:** Investigation, A.Z., Y.S. and A.S. All authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by Russian Science Foundation under grant 19-11-00020.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data sharing is not applicable to this article.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


#### MDPI

St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Mathematics* Editorial Office E-mail: mathematics@mdpi.com www.mdpi.com/journal/mathematics
