*Article* **Open Markov Type Population Models: From Discrete to Continuous Time**

**Manuel L. Esquível 1,\*, Nadezhda P. Krasii <sup>2</sup> and Gracinda R. Guerreiro <sup>1</sup>**


**Abstract:** We address the problem of finding a natural continuous time Markov type process—in open populations—that best captures the information provided by an open Markov chain in discrete time which is usually the sole possible observation from data. Given the open discrete time Markov chain, we single out two main approaches: In the first one, we consider a calibration procedure of a continuous time Markov process using a transition matrix of a discrete time Markov chain and we show that, when the discrete time transition matrix is *embeddable* in a continuous time one, the calibration problem has optimal solutions. In the second approach, we consider semi-Markov processes—and open Markov schemes—and we propose a direct extension from the discrete time theory to the continuous time one by using a known structure representation result for semi-Markov processes that decomposes the process as a sum of terms given by the products of the random variables of a discrete time Markov chain by time functions built from an adequate increasing sequence of stopping times.

**Keywords:** Markov chains; open population Markov chain models; Semi-Markov processes

#### **1. Introduction**

After the first works introducing homogeneous open Markov population models in [1] followed by those in [2] and then in [3], further expanded by several authors and exposed in [4] and then in [5], the study of open populations in a finite state space in discrete time with a Markov chain structure became well established.

Following the pioneering work of Gani, introducing in [6] what now is known as *Cyclic Open Markov* population models, there were further extensions in [7], for non-homogeneous Markov chains and then, for cyclic non-homogeneous Markov systems or equivalently for non-homogeneous open Markov population processes, by the authors of [8,9]. Let us stress that continuous time non-homogeneous Markov systems have been studied lately in [10]. Furthermore, the recent work in [11] develops an approach to open Markov chains in discrete time, allowing a particle physics interpretation, with three components: a state space of the Markov chain, where distributions are studied by means of moment generating functions; an exit *reservoir*, which is tantamount to a cemetery state; and an incoming flow of particles, defined as a stochastic process in discrete time whose properties (e.g., stationarity) condition the distribution law of the particles in the state space.

Discrete time non-homogeneous semi-Markov systems or equivalently open semi-Markov population models were introduced and studied in [12,13]. The study of open populations in a finite state space in continuous time and governed by Markov laws has already been carried out in [14] and the references therein, and extensions to a general state space have been given in [15–17]. The continuous time framework has also been addressed, for instance, in [18–20], for the case of semi-Markov processes and for non-homogeneous semi-Markov systems [21]. We may also refer to a framework of open Markov chains with finite state space (see [22] and references therein) that has already seen applications in actuarial or financial problems, as, for instance, in [23,24], but also in population dynamics (see [25]). The weaker formalism of open Markov schemes in discrete time, developed in [26], allows for influxes of new elements in the population to be given as general time series models.

**Citation:** Esquível, M.L.; Krasii, N.P.; Guerreiro, G.R. Open Markov Type Population Models: From Discrete to Continuous Time. *Mathematics* **2021**, *9*, 1496. https://doi.org/10.3390/math9131496

Academic Editors: Panagiotis-Christos Vassiliou and Andreas C. Georgiou

Received: 31 May 2021 Accepted: 23 June 2021 Published: 25 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Another example was motivated by the study of a continuous time non homogeneous Markov chain model for Long Term Care, based on an estimated Markov chain transition matrix with a finite state space, in [27], by means of a method for calibrating the intensities on the continuous time Markov chain using the discrete time transition matrix in the context of usual existence theorems for ordinary differential equations (ODE); this method will be considered, in Section 3.2, in the more general context of Caratheodory existence theorems for ODE.

The main contribution of the present work is to extend results on open Markov chains in discrete time to some continuous time process of Markov type using different methods of associating a continuous process to an observed process in discrete time. One of these methods—presented in Sections 3.2 and 3.3—is by calibration of the transition intensities. Another method considered for open Markov schemes—in Section 4.2 and also, briefly, for some particular cases, in Section 4.3—is to exploit a natural representation of the continuous time Markov type process, in Formula (2) of Section 2.

#### **2. From Discrete Time to Continuous Time via a Structural Approach**

We present the main ideas on a structural representation for continuous time processes of Markov type that are crucial to our approach. The structure of continuous time processes (for instance, Markov, semi-Markov, and Markov type scheme processes) allows us to consider a fairly general representation formula, Formula (2), decoupling the continuous time process as a discrete time process and a sequence of time functions depending on the sequence of the jump stopping times.

Consider a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$, a continuous time stochastic process $(Y_t)_{t \ge 0}$ defined on this probability space and $\mathbb{F} = (\mathcal{F}_t)_{t \ge 0}$ the natural filtration associated to this process, that is, such that $\mathcal{F}_t := \sigma(Y_s : s \le t)$ is the $\sigma$-algebra generated by the variables of the process until time $t$. Consider also a sequence of random variables $(Z_n)_{n \ge 0}$ taking values in a finite state space $\mathbf{\Theta} = \{\theta_1, \theta_2, \dots, \theta_r\}$, the sequence being adapted to the filtration, and $0 \equiv \tau_0 < \tau_1 < \tau_2 < \cdots < \tau_n < \cdots$ an increasing sequence of $\mathbb{F}$-stopping times, denoted by $\mathcal{T}$, satisfying the following hypothesis:

**Hypothesis 1.** *Almost surely,* $\lim_{n \to +\infty} \tau_n = +\infty$ *and, for any* $T \in \mathbb{R}_+$ *and almost all* $\omega \in \Omega$*:*

$$\#\{k \ge 1 : \tau\_k(\omega) \le T\} < +\infty. \tag{1}$$

This hypothesis means that in every compact time interval [0, *T*], for almost all *ω* ∈ Ω, there is only a finite number of stopping times realizations *τk*(*ω*) in this interval.

**Hypothesis 2.** *The continuous time process* (*Yt*)*t*≥<sup>0</sup> *admits a representation given, for t* ≥ 0*, by*

$$Y_t = \sum_{n=0}^{+\infty} Z_n \mathbb{1}_{[\tau_n, \tau_{n+1}[}(t),\tag{2}$$

*that is, a hypothesis on the structure of the continuous time process* (*Yt*)*t*≥0*.*
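As a minimal sketch of how the representation in Formula (2) can be evaluated, the following Python fragment returns the value of a step path at a given time from a finite list of jump times and visited states. The function name `evaluate_path` and the sample data are hypothetical, and the sketch assumes the right-continuous convention in which the path takes the value $Z_n$ from $\tau_n$ up to, but not including, $\tau_{n+1}$.

```python
from bisect import bisect_right


def evaluate_path(jump_times, states, t):
    """Value Y_t of the step path built from jump times and visited states.

    jump_times: increasing list [tau_0, tau_1, ...] with tau_0 = 0
    states:     list [Z_0, Z_1, ...] of the same length
    Both inputs are hypothetical illustration data; after the last recorded
    jump the path is held at the last recorded state.
    """
    if len(jump_times) != len(states):
        raise ValueError("need one state per jump time")
    if t < jump_times[0]:
        raise ValueError("t must not precede tau_0")
    # bisect_right counts the jump times <= t, so n satisfies tau_n <= t < tau_{n+1}.
    n = bisect_right(jump_times, t) - 1
    return states[n]


taus = [0.0, 0.7, 1.9, 2.4]
zs = ["theta_1", "theta_3", "theta_2", "theta_1"]
print([evaluate_path(taus, zs, t) for t in (0.0, 0.5, 0.7, 2.0, 3.0)])
```

Only finitely many jump times are stored, which is consistent with Hypothesis 1 on any compact interval $[0, T]$.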

It is well known (see [28] (pp. 367–379) and [29] (pp. 317–320)) that if $(Z_n)_{n \ge 0}$ is a Markov chain and the time intervals $(\tau_{n+1} - \tau_n)_{n \ge 0}$ are exponentially distributed then $(Y_t)_{t \ge 0}$ can be taken to be a continuous time **homogeneous** Markov chain. If $(Z_n)_{n \ge 0}$ is a Markov chain and the time intervals $(\tau_{n+1} - \tau_n)_{n \ge 0}$ have a distribution that can depend on the present state as well as on the one visited next then $(Y_t)_{t \ge 0}$ can be taken to be a **semi-Markov** process (see [30] (pp. 261–262) and [31] (pp. 295–299), for brief references). In the case of a semi-Markov process, a nice result of Ronald Pyke (see [32] (p. 1236)), reproduced ahead in Theorem A7, guarantees that when the state space is finite the process is *regular*, implying that almost all paths of such a semi-Markov process are step functions over $[0, +\infty[$ and so the paths satisfy Formula (1). In another important case (see Theorems A5 and A6 ahead, or [30] (pp. 262–266) and [31] (pp. 195–244)), adequate hypotheses on the distribution of the stopping times and on the sequence $(Z_n)_{n \ge 0}$ imply that $(Y_t)_{t \ge 0}$ will be a **non-homogeneous** Markov chain process in continuous time, whose trajectories are step functions also satisfying Formula (1). The representation in Formula (2) thus covers the cases of homogeneous and non-homogeneous Markov processes in continuous time as well as semi-Markov processes, providing a desired connection between a continuous time process and a discrete one that is a component of the former. We observe that there is a practical justification for Hypothesis 1, namely, the *identifiability* of the process; as can be read in [33] (p. 3): "*. . . Actually, in real systems the transition from one observable state into another takes some time*." This being so, the existence of accumulation points in a compact interval would preclude estimation procedures, for instance of the distribution of the sequence $(\tau_{n+1} - \tau_n)_{n \ge 1}$.
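The homogeneous case described above can be illustrated by a short simulation: a discrete time chain $(Z_n)$ supplies the visited states and exponentially distributed sojourn times supply the jump times $(\tau_n)$. This is only a sketch under stated assumptions; the function `simulate_jumps`, the embedded transition matrix and the rates are hypothetical illustration values, not taken from the references.

```python
import random


def simulate_jumps(embedded, rates, z0, horizon, rng):
    """Simulate jump times tau_n and states Z_n on [0, horizon].

    embedded: transition matrix of the discrete time chain (Z_n), as row lists
    rates:    rates[i] is the exponential rate of the sojourn time in state i
    All numerical inputs are hypothetical illustration values.
    """
    taus, zs = [0.0], [z0]
    t, z = 0.0, z0
    while True:
        t += rng.expovariate(rates[z])  # exponential sojourn in the current state
        if t > horizon:
            return taus, zs
        # next state drawn from the row of the embedded chain
        z = rng.choices(range(len(embedded)), weights=embedded[z])[0]
        taus.append(t)
        zs.append(z)


embedded = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
rates = [1.0, 2.0, 0.5]
taus, zs = simulate_jumps(embedded, rates, 0, 10.0, random.Random(2021))
print(len(taus) - 1, "jumps in [0, 10]")
```

Every simulated path has finitely many jumps before the horizon, in agreement with Formula (1). Replacing `rng.expovariate` by a sojourn law depending on the current and next states would give a sketch of the semi-Markov case.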

#### **3. From Discrete to Continuous Time Markov Chains: A Calibration Approach**

In this section, we consider a calibration approach in order to determine a set of probability densities that best approaches a sequence of discrete time transition matrices with respect to a quadratic loss function. We then show that *embeddable* stochastic matrices, according to Definition 1, are solutions of the calibration problem. For the reader's convenience, we recall in the first appendix the most important results on continuous time Markov chains with finite state space that are relevant for our study with emphasis on the crucial non-accumulation property of the jump times of a continuous time Markov chain (see Theorem A6 ahead). We will start by recalling the main information on embeddable chains. We then present one of the main contributions of this work, that is, a general result on the optimization problem of calibration and its relations with embeddable properties of discrete time Markov chains.

#### *3.1. The Embedding of a Discrete Time Markov Chain in a Continuous One*

The embedding of the discrete time Markov chain in a continuous one following the guidelines, for instance, in [34–40], can be considered as a method to connect a discrete time process with a continuous one. For notations on non-homogeneous continuous time Markov chains see Section 3.2.

**Definition 1** (Embeddable stochastic matrix (see [38]))**.** *A stochastic matrix* $R$ *is said to be embeddable if there exists a time* $t_R > 0$ *and a family of stochastic matrices* $P(s,t)$ *continuously defined in the set of times* $\{(s,t) \in \mathbb{R}^2 : 0 \le s \le t \le t_R\}$ *such that*

$$\begin{cases} P(s,t) = P(s,u)P(u,t) & 0 \le s \le u \le t \le t_R \\ P(s,s) = I & 0 \le s \le t_R \\ P(0,t_R) = R. \end{cases} \tag{3}$$

We observe that by Theorem A2 ahead, the condition in Formulas (3) is tantamount to the definition of a continuous time Markov chain with transition probabilities given by *P*(*s*, *t*).
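To make Definition 1 concrete, here is a hedged sketch for the simplest case: a two-state stochastic matrix $R$ with off-diagonal entries $p$ and $q$ such that $0 < p + q < 1$, embedded in a homogeneous chain with constant intensities and, as an assumption of the sketch, clock time normalized so that $t_R = 1$. The helper names are hypothetical; the closed form used for the two-state transition probabilities is the standard one.

```python
import math


def two_state_generator(p, q):
    """Rates (a, b) of Q = [[-a, a], [b, -b]] with exp(Q) = [[1-p, p], [q, 1-q]].

    Requires 0 < p + q < 1, that is, det R = 1 - p - q in (0, 1).
    """
    s = p + q
    if not 0.0 < s < 1.0:
        raise ValueError("this sketch needs 0 < p + q < 1")
    total_rate = -math.log(1.0 - s)  # a + b
    return p * total_rate / s, q * total_rate / s


def two_state_transition(a, b, elapsed):
    """P(s, s + elapsed) = exp(elapsed * Q) for the two-state generator."""
    s = a + b
    e = math.exp(-s * elapsed)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]


def mat_mul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


a, b = two_state_generator(0.2, 0.3)
R = two_state_transition(a, b, 1.0)  # P(0, t_R) with t_R = 1
split = mat_mul(two_state_transition(a, b, 0.4), two_state_transition(a, b, 0.6))
print(R, split)
```

Here `R` recovers the matrix with $p = 0.2$ and $q = 0.3$, and `split` checks the first condition in Formula (3) for $s = 0$, $u = 0.4$, $t = 1$.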

**Remark 1** (Intrinsic time for embeddable chains)**.** *Goodman in [41]—aiming at a more general result for the Kolmogorov differential equations—showed that with the change of time given by* *ϕ*(*u*) := − log det *P*(0, *u*)*—which amounts to a change in the matrix coefficients of P*(*s*, *t*)*—we have that*

$$t_R = -\log \det R.\tag{4}$$

*This remarkable representation for the embedding time* $t_R$ *will be useful for a result in Section 3.2 devoted to the calibration approach. It has also been used for estimation in [42] (p. 330).*
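Formula (4) is straightforward to evaluate numerically. The sketch below computes the determinant by Gaussian elimination and returns $-\log \det R$; the function names and the sample matrix are hypothetical. It rejects matrices with non-positive determinant, since the logarithm in Formula (4) is then undefined.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def intrinsic_embedding_time(R):
    """Embedding time t_R = -log det R of Formula (4)."""
    d = determinant(R)
    if d <= 0.0:
        raise ValueError("det R must be positive for Formula (4) to make sense")
    return -math.log(d)


R = [[0.8, 0.2], [0.3, 0.7]]
t_R = intrinsic_embedding_time(R)
print(t_R)
```

Because the determinant is multiplicative, the same function applied to the $n$ fold product of $R$ by itself returns $n\, t_R$.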

See the work in [35] for a definition similar to Definition 1 and for a summary of many important results on this subject. The characterization of an embeddable stochastic matrix in a form useful for practical purposes was recently achieved in [43]. More useful results were obtained in [44]. The connections between this kind of embedding and the other approaches, for the association of a discrete time Markov chain and a continuous time process, deserve further study.

#### *3.2. Continuous Time Markov Chains Calibration with a Discrete Time Markov Transition Matrix*

The calibration of transition intensities of a non-homogeneous Markov chain, with a discrete time Markov chain transition matrix estimated from data, was proposed in [27]. In this section, we establish a general formulation of the existence and unicity result that subsumes the approach and we establish a connection with the *embedding* approach of Section 3.1. Notation and needed essential results on non-homogeneous Markov processes in continuous time are recalled in Appendix A.

The procedure for calibration of intensities consists in finding the intensities of a non homogeneous continuous time Markov chain using a probability transition matrix of a discrete time Markov chain and a given loss function—having as arguments the transition probabilities of the continuous time Markov chain and some function of the transition matrix of the discrete time Markov chain—in such a way that the loss function is minimized.

Before stating the theorem on the calibration of intensities we discuss some motivation for this approach. It may happen that a phenomenon that could be dealt with, due to its characteristics, by a continuous time Markov chain model can only be observed at regularly spaced time intervals. This is the case of the periodic assessments of the healthcare status of patients that can change at any time but are only the object of a comprehensive evaluation on, say, a weekly basis. With the data originated by these observations we can only determine transition probabilities for a defined period, say, a week, and, most importantly, we cannot determine the time stamps for the patient status change. The question naturally poses itself: is it possible to associate, in some canonical way, to an estimated discrete time Markov chain transition matrix a process in continuous time that encompasses the discrete time process? First steps in this direction are provided by Theorem 1 that we now present and the following Theorems 2 and 3.

We formulate Theorem 1 in the context of Caratheodory's general existence theory of solutions of ordinary differential equations that we briefly recall. One reason for this choice is that according to [41] (p. 169) and we quote: "...*This fact gives further evidence in support of the view that Caratheodory equations occupy a natural place in the theory of non-stationary Markov chains."* Another reason is the fact that Caratheodory existence theory is particularly suited for regime switching models and these models are the object of Theorem 3 ahead. Following the work in [45] (pp. 41–44), we consider the definition of an **extended solution** for a Cauchy problem of a differential equation,

$$Y'(t) = f(t, Y(t)), \quad Y(0) = \xi, \tag{5}$$

or formulated in an equivalent form,

$$Y(t) = \xi + \int_0^t f(s, Y(s))\, ds,\tag{6}$$

for $f(t,y) : I \times \mathcal{D} \to \mathbb{R}^r$ a not necessarily continuous function, with $I \subset [0, +\infty[$ and $\mathcal{D} \subset \mathbb{R}^r$, to be an **absolutely continuous** function $Y(t)$ (see [46], pp. 144–150) such that $f(t, Y(t)) \in \mathcal{D}$ for $t \in I$ and Formula (5) is verified for all $t \in I$ possibly with the exception of a set of null Lebesgue measure. The well-known Caratheodory existence theorem (see [45], p. 43) ensures the existence of an extended solution with a given initial condition, given in a neighborhood of the initial time, under the conditions that $f(t,y)$ is measurable in the variable $t$, for fixed $y$, and continuous in the variable $y$, for fixed $t$, and moreover that there exists a Lebesgue integrable function $m(t)$, defined on a neighborhood of the initial time, let us say $I$, such that $|f(t,y)| \le m(t)$ for $(t,y) \in I \times \mathcal{D}$. The question of unicity of the solution is usually dealt with either directly, using Theorem 18.4.13 in [47] (p. 337), or using Osgood's uniqueness theorem, as exposed, for instance, in [48] (p. 58) or in [49] (pp. 149–151), to conclude that the extended solution, which we know to exist by Caratheodory's theorem, is unique in the sense that two solutions may only differ on a set of Lebesgue measure equal to zero. For our purposes we need an existence and unicity theorem for ordinary differential equations with solutions depending continuously on a parameter, such as the general result of Theorem 4.2 in [45] (p. 53), with an omitted proof that follows from a lengthy previous exposition of related matters. For completeness we now establish a result that is suited to our purposes as it deals with the particular type of Kolmogorov equations for continuous time Markov chains.

**Theorem 1** (Calibration of intensities with Caratheodory's type ODE existence theorem hypothesis)**.** *Let, for* $1 \le n \le N$*,* $\mathbf{R}_{\tau_n} = \left[ r_{ij}^{(\tau_n)} \right]_{i,j=1,\dots,r}$ *be the generic element of a sequence of numerical transition matrices taken at a sequence of increasing dates* $(\tau_n)_{1 \le n \le N}$*. Consider a set of intensities* $\mathbf{Q}(t,\lambda) = [q(t,i,j,\lambda)]_{i,j=1,\dots,r}$*, with* $\lambda \in \mathbf{\Lambda} \subset \mathbb{R}^d$ *being a parameter and* $\mathbf{\Lambda}$ *being a compact set, satisfying the following conditions:*


$$-q(u,i,i,\lambda) \le M(u) \text{ and } \quad \int\_s^t M(u) du < +\infty. \tag{7}$$

*Then, we have*


$$\mathcal{O}(s\_0, \lambda) := \sum\_{i,j=1,\ldots,r} \sum\_{n=1}^{N} \left( p(s\_0, i, \tau\_n, j, \lambda) - r\_{ij}^{(\tau\_n)} \right)^2. \tag{8}$$

*Then, for the optimization problem* $\inf_{\lambda \in \mathbf{\Lambda}} \mathcal{O}(s_0, \lambda)$ *there exists* $\lambda_0 \in \mathbf{\Lambda}$ *such that*

$$\mathcal{O}(s_0, \lambda_0) = \min_{\lambda \in \Lambda} \mathcal{O}(s_0, \lambda),\tag{9}$$

*the unique minimum being attained at possibly several points λ*<sup>0</sup> ∈ **Λ***.*

**Proof.** We will prove, simultaneously, the existence of the probability transition matrix, the unicity in the extended solution sense and the continuous dependence on the parameter $\lambda \in \mathbf{\Lambda}$ following the lines of the proof of the result known as Hostinsky's representation (see [29], pp. 348–349). As we suppose that $\mathbf{\Lambda}$ is compact, the continuity of $\mathbf{P}(s_0, t, \lambda)$, as a function of $\lambda \in \mathbf{\Lambda}$ for every fixed $t$, will be enough to establish the second thesis.

We want to determine an extended solution of the Kolmogorov forward equation given in Formula (A11), that is an extended solution of

$$\begin{cases} \mathbf{P}'_t(s_0, t, \lambda) = \mathbf{P}(s_0, t, \lambda)\mathbf{Q}(t, \lambda) \\ \mathbf{P}(t, t) = \mathbf{I}, \end{cases} \tag{10}$$

an equation which, as seen in Formula (A12), can be read in integral form as,

$$\mathbf{P}(s_0, t, \lambda) = \mathbf{I} + \int_{[s_0, t]} \mathbf{P}(s_0, s, \lambda)\mathbf{Q}(s, \lambda)\, ds.\tag{11}$$

As previously said, we will now follow the general idea of successive approximations in the proof of the Picard–Lindelöf theorem for proving existence and unicity of solutions of ordinary differential equations for the forward Kolmogorov equation. By replacing **P**(*s*0,*s*, *λ*) in the right-hand member of Equation (11) by this right-hand member we get,

$$\mathbf{P}(s_0, t, \lambda) = \mathbf{I} + \int_{[s_0, t]} \mathbf{Q}(s, \lambda)\, ds + \int_{[s_0, t]} \int_{[s_0, t_1]} \mathbf{P}(s_0, t_2, \lambda)\mathbf{Q}(t_1, \lambda)\mathbf{Q}(t_2, \lambda)\, dt_2\, dt_1,$$

and, by induction, we obtain

$$\begin{split} \mathbf{P}(\mathbf{s}\_{0},t,\lambda) &= \mathbf{I} + \int\_{[s\_{0},t]} \mathbf{Q}(s,\lambda)ds + \\ &+ \sum\_{n=2}^{k} \int\_{[s\_{0},t]} \int\_{[t\_{1},t]} \cdots \int\_{[t\_{n-1},t]} \mathbf{Q}(t\_{1},\lambda)\mathbf{Q}(t\_{2},\lambda) \cdots \mathbf{Q}(t\_{n},\lambda)dt\_{n} \cdots dt\_{1} + \\ &+ \int\_{[s\_{0},t]} \int\_{[t\_{1},t]} \cdots \int\_{[t\_{k-1},t]} \mathbf{P}(s\_{0},t\_{k},\lambda)\mathbf{Q}(t\_{1},\lambda)\mathbf{Q}(t\_{2},\lambda) \cdots \mathbf{Q}(t\_{k},\lambda)dt\_{k} \cdots dt\_{1}. \end{split}$$

Now, considering the function $M(t)$ in the third hypothesis stated above about the intensity matrix, and since $M(t)$ is integrable over any compact set, Lemma A1 (see also Lemma 8.4.1 in [29], p. 348) gives, for the $(i, j)$ component of the $r \times r$ matrix,

$$\begin{aligned} & \left| \left[ \int_{[s_0, t]} \int_{[t_1, t]} \cdots \int_{[t_{k-1}, t]} \mathbf{P}(s_0, t_k, \lambda) \mathbf{Q}(t_1, \lambda) \mathbf{Q}(t_2, \lambda) \cdots \mathbf{Q}(t_k, \lambda)\, dt_k \cdots dt_1 \right]_{ij} \right| \leq \\ & \leq \left| r^k \int_{[s_0, t]} \int_{[t_1, t]} \cdots \int_{[t_{k-1}, t]} M(t_1) M(t_2) \cdots M(t_k)\, dt_k \cdots dt_1 \right| \\ &= \frac{\left( r \int_{[s_0, t]} M(s)\, ds \right)^k}{k!} . \end{aligned}$$

Finally, as

$$\lim\_{k \to +\infty} \frac{\left(r \int\_{[s\_0, t]} M(s) ds\right)^k}{k!} = 0,$$

we have that the series for which the sum represents $\mathbf{P}(s_0, t, \lambda)$, that is,

$$\mathbf{P}(s_0, t, \lambda) = \mathbf{I} + \sum_{n=1}^{+\infty} \left( \int_{[s_0, t]} \int_{[t_1, t]} \cdots \int_{[t_{n-1}, t]} \mathbf{Q}(t_1, \lambda)\mathbf{Q}(t_2, \lambda) \cdots \mathbf{Q}(t_n, \lambda)\, dt_n \cdots dt_1 \right),$$

is a series—of absolutely continuous functions of the variable *t* which are also continuous as functions of the parameter *λ* ∈ **Λ**—converging normally and so the sum is an absolutely continuous function of the variable *t* and continuous function of the parameter *λ*. With a similar reasoning applied to the backward Kolmogorov equation we also have that **P**(*s*, *t*0, *λ*) is absolutely continuous in the variable *s* and, obviously, continuous as a function of the parameter *λ* ∈ **Λ**. We observe that it was stated in [41], pp. 166–167 (with a reference to a proof in [50] and proved also in [51]), that the separate absolute continuity of **P**(*s*, *t*, *λ*) in the variables *s* and *t* ensures the uniqueness of the solution.
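The forward equation in Formula (10) can also be integrated numerically. The following sketch uses a classical fourth-order Runge–Kutta scheme, which is a standard numerical method swapped in for illustration and not the successive approximation argument of the proof. The intensity family `intensity` is a hypothetical two-state example, chosen because its transition probabilities are available in closed form for comparison.

```python
import math


def mat_mul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def add_scaled(A, B, c):
    """Entrywise A + c * B."""
    return [[x + c * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def solve_forward(intensity, s0, t, steps=200):
    """Approximate P(s0, t) solving P' = P Q(t), P(s0, s0) = I, by RK4."""
    r = len(intensity(s0))
    P = [[1.0 if i == j else 0.0 for j in range(r)] for i in range(r)]
    h = (t - s0) / steps
    for m in range(steps):
        u = s0 + m * h
        k1 = mat_mul(P, intensity(u))
        k2 = mat_mul(add_scaled(P, k1, h / 2), intensity(u + h / 2))
        k3 = mat_mul(add_scaled(P, k2, h / 2), intensity(u + h / 2))
        k4 = mat_mul(add_scaled(P, k3, h), intensity(u + h))
        for K, w in ((k1, 1.0), (k2, 2.0), (k3, 2.0), (k4, 1.0)):
            P = add_scaled(P, K, w * h / 6)
    return P


def intensity(u, a=0.5, b=1.0):
    """Hypothetical non-homogeneous intensities Q(u) = (1 + u) [[-a, a], [b, -b]]."""
    g = 1.0 + u
    return [[-a * g, a * g], [b * g, -b * g]]


P = solve_forward(intensity, 0.0, 1.0)
# Closed form for this family: the integral of (1 + u) over [0, 1] equals 1.5.
exact_01 = 0.5 * (1.0 - math.exp(-1.5 * 1.5)) / 1.5
print(P[0][1], exact_01)
```

Since the rows of the intensity matrix sum to zero, the scheme keeps the rows of the approximation summing to one up to rounding.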

**Remark 2** (An alternative path for the existence result)**.** *We observe that, for every fixed value of the parameter* $\lambda$*, by a direct application of Caratheodory's existence theorem to the forward and backward Kolmogorov equations in Theorem A3, we obtain a probability transition matrix* $P(s, t, \lambda) = [p(s, i, t, j, \lambda)]_{i,j=1,\dots,r}$*, such that conditions in Definition A2 and the Chapman–Kolmogorov equations in Theorem A1 are verified, that in addition has entries absolutely continuous in* $s$ *and* $t$ *and such that Kolmogorov's equations are satisfied almost everywhere. With this approach the continuous dependence of the probability transition matrix on the parameter* $\lambda$ *requires further proof.*

**Remark 3** (On the parametrized intensities and transition probabilities)**.** *In a first application to Long-Term Care of a simpler version of Theorem 1 presented in [27], we chose as intensities a parametrized family of Gompertz–Makeham type (see, for instance, [52], p. 62) with a three dimensional parameter. We observe that, in its present formulation, Theorem 1 contemplates the case of a set of intensities, and of associated transition probabilities, not necessarily with the same functional form with varying parameters but merely with a finite set of different functional forms indexed by the parameters.*

**Remark 4** (Only one transition matrix observation)**.** *In the case where we only have one estimated transition matrix* **R***, we can consider the sequence of n step transition matrices given by the n fold product of the matrix* **R** *by itself. This situation will be addressed in Theorem 2 ahead, in the case of homogeneous Markov chains and in Theorem 3 for the non-homogeneous case.*

*We also observe that in the case of a multidimensional parameter set* $\mathbf{\Lambda}$ *(say* $r_1$*) and even in a reasonable state space of the discrete time Markov chain (say with* $r_2$ *states) the optimization problem of Formula* (8) *may require adequate algorithms to be solved as the number of variables is of the order of* $r_1 \times r_2 \times (r_2 - 1)$*. In [27] we opted for a modified grid search coupled with the numerical solutions of the Kolmogorov equations in order to recover the transition probabilities of the continuous time Markov chain.*
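The calibration problem of Formula (8) can be sketched with a plain grid search over a compact parameter set. This is not the modified grid search of [27]: the one-parameter intensity family, the synthetic observations and all function names below are hypothetical, and the closed form of the two-state transition probabilities replaces a numerical solution of the Kolmogorov equations.

```python
import math


def transition(lam, elapsed):
    """P(0, elapsed) for hypothetical intensities Q(lam) = [[-lam, lam], [2 lam, -2 lam]]."""
    e = math.exp(-3.0 * lam * elapsed)
    return [[(2.0 + e) / 3.0, (1.0 - e) / 3.0],
            [(2.0 - 2.0 * e) / 3.0, (1.0 + 2.0 * e) / 3.0]]


def loss(lam, dates, observed):
    """Quadratic loss in the spirit of Formula (8), with s_0 = 0."""
    total = 0.0
    for tau, R in zip(dates, observed):
        P = transition(lam, tau)
        total += sum((P[i][j] - R[i][j]) ** 2 for i in range(2) for j in range(2))
    return total


def grid_search(dates, observed, grid):
    """Return the grid point with the smallest loss."""
    return min(grid, key=lambda lam: loss(lam, dates, observed))


dates = [1.0, 2.0, 3.0]
observed = [transition(0.4, tau) for tau in dates]  # synthetic data, true value 0.4
grid = [0.01 * k for k in range(1, 201)]            # compact parameter set [0.01, 2]
best = grid_search(dates, observed, grid)
print(best, loss(best, dates, observed))
```

With real data the observed matrices would not lie exactly in the parametrized family, so the minimal loss would be positive and the grid resolution would limit the accuracy of the minimizer.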

**Remark 5** (On the unicity of the solution of the calibration problem)**.** *The unicity in law of the solution of the calibration problem deserves discussion. If there are several minimizers of the calibration problem, to each of these minimizers corresponds an intensity and to each intensity a possibly different law for the stopping times of the continuous time Markov chain, as these laws are determined by the intensities (see Remark A2). The existence of criteria allowing us to identify a distribution of inter-arrival times that stochastically dominates all other solutions is an open problem.*

We can establish a connection between the approach in Section 3.1 and Theorem 1 on calibration above, showing first—in Theorem 2—that, if a matrix is embeddable in a homogeneous continuous time Markov chain—with intensities depending continuously on a parameter—for a fixed value of the parameter, then this continuous time Markov chain solves the calibration problem in an optimum way. We recall that the continuous time Markov chain is homogeneous if, for all 0 ≤ *s*, *t* the transition probabilities satisfy

$$P(s, s+t) = P(0, t),$$

and that the intensities matrix is constant as a function of time (see [41] (pp. 165–166) for definitions in this context).

**Theorem 2** (Discrete chains embeddable in **homogeneous** continuous chains can be optimally calibrated)**.** *Suppose that the matrix* $\mathbf{R}$ *is embeddable and let* $t_{\mathbf{R}}$ *and the transition probabilities* $P(s, t, \lambda_1)$ *satisfy Definition 1 in the case of a homogeneous continuous time Markov chain for some family of intensities* $\mathbf{Q}(\lambda_1)$ *where* $\lambda_1 \in \mathbf{\Lambda}$ *is a given parameter. Then, with* $\tau_n := n\, t_{\mathbf{R}}$ *for* $n \ge 1$ *and* $\mathbf{R}_{\tau_n} := \mathbf{R}^{(n)}$ *(the* $n$ *fold product of the matrix* $\mathbf{R}$ *by itself) we have that the optimization problem* $\inf_{\lambda \in \mathbf{\Lambda}} \mathcal{O}(\lambda)$ *with respect to the loss function given by Formula* (8) *has an optimal solution* $P(s, t, \lambda_1)$ *such that*

$$\mathcal{O}(\lambda\_1) = \min\_{\lambda \in \Lambda} \mathcal{O}(\lambda) = 0.$$

**Proof.** It is enough to observe that by Formulas (3) in Definition 1 we have, as *τ*<sup>2</sup> − *τ*<sup>1</sup> = *τ*1,

$$\begin{aligned} P(0, \tau_2, \lambda_1) &= P(0, \tau_1, \lambda_1) P(\tau_1, \tau_2, \lambda_1) = P(0, \tau_1, \lambda_1) P(0, \tau_2 - \tau_1, \lambda_1) = \\ &= P(0, \tau_1, \lambda_1) P(0, \tau_1, \lambda_1) = \mathbf{R}^{(2)} = \mathbf{R}_{\tau_2}, \end{aligned}$$

and, by induction, that $P(0, \tau_n, \lambda_1) = \mathbf{R}_{\tau_n}$ and so in Formula (8) we have that $\mathcal{O}(\lambda_1) = 0$.
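The statement of Theorem 2 can be checked numerically on a small example. In the sketch below a three-state matrix $\mathbf{R}$ is embeddable by construction, being defined as the exponential of a constant intensity matrix with $t_{\mathbf{R}} = 1$; the intensity values, the truncation order of the exponential series and the function names are all hypothetical choices for illustration.

```python
def mat_mul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def mat_exp(Q, t, terms=40):
    """Partial sum of exp(tQ) = I + sum over n >= 1 of (tQ)^n / n!."""
    r = len(Q)
    total = [[1.0 if i == j else 0.0 for j in range(r)] for i in range(r)]
    term = [row[:] for row in total]
    for n in range(1, terms + 1):
        term = [[x * t / n for x in row] for row in mat_mul(term, Q)]
        total = [[x + y for x, y in zip(rt, rm)] for rt, rm in zip(total, term)]
    return total


def calibration_loss(Q, R, t_R, N):
    """Loss of Formula (8) with tau_n = n * t_R and targets R^(n), n = 1..N."""
    r = len(R)
    target = [[1.0 if i == j else 0.0 for j in range(r)] for i in range(r)]
    loss = 0.0
    for n in range(1, N + 1):
        target = mat_mul(target, R)      # n fold product of R by itself
        model = mat_exp(Q, n * t_R)      # P(0, tau_n) of the homogeneous chain
        loss += sum((model[i][j] - target[i][j]) ** 2
                    for i in range(r) for j in range(r))
    return loss


Q = [[-0.5, 0.3, 0.2], [0.1, -0.4, 0.3], [0.2, 0.2, -0.4]]
R = mat_exp(Q, 1.0)
zero_loss = calibration_loss(Q, R, 1.0, 3)
other_loss = calibration_loss([[2 * x for x in row] for row in Q], R, 1.0, 3)
print(zero_loss, other_loss)
```

The embedding intensities give a loss that vanishes up to rounding, while a different intensity matrix, here twice the original, gives a strictly positive loss.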

**Remark 6** (On the skeletons of a homogeneous continuous time Markov chain)**.** *Another possible way to extend results from discrete time to continuous time is the approach of skeletons of Kingman and other authors (see [53,54], for instance). As we are more interested in nonhomogeneous continuous time Markov chains we do not pursue this approach in the present work.*

We now address the case of non-homogeneous Markov chains. In Theorem 3, we show that if every element of a sequence, with no gaps, of matrix powers of a discrete time Markov chain is embeddable then there is a regime switching process of Markov type that solves optimally the calibration problem.

**Theorem 3** (*Power-embeddable* discrete chains can be optimally calibrated)**.** *Suppose that all the powers* $\mathbf{R}^{(n)} = \left[ r_{ij}^{(n)} \right]_{i,j=1,\dots,r}$*, for* $1 \le n \le N$*, of a discrete time Markov chain transition matrix* $\mathbf{R}$ *are embeddable and let* $P_n(s, t, \lambda_n)$ *be the transition probabilities of the embedding continuous time Markov chain for* $\mathbf{R}^{(n)}$ *given in their intrinsic time (defined in Remark 1) in such a way that the respective embedding times verify* $t_{\mathbf{R}^{(n)}} = -n \log \det \mathbf{R}$ *(according to Formula (4)). We suppose that the intensities* $\mathbf{Q}_n(t, \lambda_n)$ *for each of the transition probabilities* $P_n(s, t, \lambda_n)$ *depend on parameters* $\lambda_n \in \mathbf{\Lambda}$*, possibly different but all in a common parameter set* $\mathbf{\Lambda}$*. With the convention* $t_{\mathbf{R}^{(0)}} = 0$*, and*

$$\lambda(t) := \lambda_n, \quad t_{\mathbf{R}^{(n-1)}} \le t \le t_{\mathbf{R}^{(n)}},$$

*let* $\widetilde{P}(s, t, \lambda(t))$ *be defined by*

$$\widetilde{P}(s, t, \lambda(t)) := P_n(s, t, \lambda_n), \quad 0 = t_{\mathbf{R}^{(0)}} \le s \le t_{\mathbf{R}^{(n)}}, \quad t_{\mathbf{R}^{(n-1)}} \le t \le t_{\mathbf{R}^{(n)}}, \quad s \le t, \tag{12}$$

*and thus satisfying* $\widetilde{P}(0, t_{\mathbf{R}^{(n)}}, \lambda(t)) = P_n(0, t_{\mathbf{R}^{(n)}}, \lambda_n) = \mathbf{R}^{(n)}$*. Then, we have that the optimization problem* $\inf_{\lambda \in \mathbf{\Lambda}} \mathcal{O}(\lambda)$ *with respect to the loss function given by*

$$\mathcal{O}(\lambda) := \sum_{i,j=1,\ldots,r} \sum_{n=1}^{N} \left( \widetilde{P}(0, t_{\mathbf{R}^{(n)}}, \lambda(t))_{ij} - r_{ij}^{(n)} \right)^2,\tag{13}$$

*has an optimal solution* $\widetilde{P}(s, t, \lambda(t))$ *such that*

$$\mathcal{O}(\lambda(t)) = \min_{\lambda \in \Lambda} \mathcal{O}(\lambda) = 0.$$

**Proof.** We observe that the definition in Formula (12) is coherent—see Figure 1—and then it is a simple verification with the definitions proposed.

**Remark 7** (An associated *regime switching* process)**.** *The function P*'(*s*, *t*, *λ*(*t*)) *defined in Formula* (12) *was obtained by superimposing different transition probabilities for different Markov chains in continuous time. A natural question is to determine if there is—based on these different transitions probabilities—a regime switching Markov chain in continuous time that bears some connection with P*'(*s*, *t*, *λ*(*t*))*. From a brief analysis of Figure 1 we can guess the natural definition of a regime switching Markov chain based on the probabilities Pn*(*s*, *t*, *λn*)*. Let*

$$P(s, t, \lambda(t)) := \mathbf{P}\_{\mathbb{H}}(s, t, \lambda\_{\mathbb{H}}), \ t\_{\mathbb{R}^{(n-1)}} \le s \le t \le t\_{\mathbb{R}^{(n)}}.\tag{14}$$

*Formula* (14) *has the following interpretation. For each* 1 ≤ *n* ≤ *N, consider continuous time Markov chain processes* (*X*<sub>*t*</sub><sup>*n*</sup>)<sub>*t*∈[*t*<sub>*R*<sup>(*n*−1)</sup></sub>, *t*<sub>*R*<sup>(*n*)</sup></sub>]</sub> *with transition probabilities P*<sub>*n*</sub>(*s*, *t*, *λ*<sub>*n*</sub>) *defined in the domains* ℛ<sub>*n*</sub> := {(*s*, *t*) ∈ ℝ<sup>2</sup> : *t*<sub>*R*<sup>(*n*−1)</sup></sub> ≤ *s* ≤ *t* ≤ *t*<sub>*R*<sup>(*n*)</sup></sub>}*, with the convention t*<sub>*R*<sup>(0)</sup></sub> = 0*. The regime switching process* (*Y*<sub>*t*</sub>)<sub>*t*∈[0, *t*<sub>*R*<sup>(*N*)</sup></sub>]</sub> *is such that (compare with Formula* (2)*):*

$$Y\_t = X\_t^n, \quad t \in [t\_{R^{(n-1)}}, t\_{R^{(n)}}],$$

*that is, the process* (*Y*<sub>*t*</sub>)<sub>*t*∈[0, *t*<sub>*R*<sup>(*N*)</sup></sub>]</sub> *is obtained by gluing together the paths of the processes* (*X*<sub>*t*</sub><sup>*n*</sup>)<sub>*t*∈[*t*<sub>*R*<sup>(*n*−1)</sup></sub>, *t*<sub>*R*<sup>(*n*)</sup></sub>]</sub>*, which are bona fide continuous time Markov processes in each of their (non-random) time intervals* [*t*<sub>*R*<sup>(*n*−1)</sup></sub>, *t*<sub>*R*<sup>(*n*)</sup></sub>]*. It is clear that P*(*s*, *t*, *λ*(*t*)) *can be interpreted as a transition probability only when restricted to some domain* ℛ<sub>*n*</sub> *and that, in general, it will not be a transition probability in the whole interval* [0, *t*<sub>*R*<sup>(*N*)</sup></sub>]*.*

**Figure 1.** A representation of *P̃*(*s*, *t*, *λ*(*t*)) in Formula (12) for the first three initial times.

**Remark 8.** *The regime switching process defined in Remark 7 deserves further study. We may, nevertheless, define transition probabilities P̄*(*s*, *t*, *λ*(*t*)) *for t*<sub>*R*<sup>(*k*−1)</sup></sub> ≤ *s* ≤ *t*<sub>*R*<sup>(*k*)</sup></sub> ≤ *t* ≤ *t*<sub>*R*<sup>(*k*+1)</sup></sub>*, with properties to be thoroughly investigated, by considering*

$$\overline{P}(s, t, \lambda(t)) := P\_k(s, t\_{R^{(k)}}, \lambda\_k) \cdot P\_{k+1}(t\_{R^{(k)}}, t, \lambda\_{k+1}).$$

*3.3. Conclusions on the Relations between Embeddable Matrices, Calibration, and Open Markov Chain Models*

From Theorems 1–3, the following conclusions can be drawn. Given a discrete time Markov transition matrix,

• if the matrix is *embeddable* (according to Definition 1 of Section 3.1), there is a homogeneous Markov chain in continuous time, unique in law, that solves the calibration problem optimally; the uniqueness is a consequence of Remark A2, which shows that the laws of the stopping times (*τ*<sub>*n*</sub>)<sub>*n*≥0</sub> in the representation of Formula (A13) depend only on the intensities, and these are uniquely determined whenever the discrete time Markov chain is embeddable.

• if the matrix is *power-embeddable* (that is, if all the matrices of a finite sequence with no gaps of powers of the matrix are embeddable), then there is a unique regime switching continuous time non-homogeneous Markov chain, in the sense of Remark 7, that solves the calibration problem optimally. In this case, the uniqueness has a justification similar to that of the previous case, that is, the laws of the stopping times depend only on the intensities, and these are determined by the fact that the matrix is power-embeddable.

As a consequence, for our purposes, it appears of fundamental importance to determine if a discrete time Markov chain transition matrix is embeddable and to determine—if possible, explicitly—the embedding continuous time Markov chain. Regarding this problem the results in [43,55] deserve further consideration.

**Remark 9** (Applying Theorems 1–3)**.** *Suppose that the transition matrix of a discrete time Markov chain* (*Z*<sub>*n*</sub>)<sub>*n*≥1</sub> *is embeddable in a continuous time Markov chain* (*X*<sub>*t*</sub>)<sub>*t*≥0</sub>*. We have, for this continuous time process and for a determined sequence of stopping times* (*τ*<sub>*n*</sub>)<sub>*n*≥1</sub>*, the representation given in Formula* (A13) *of Theorem A5, that is,*

$$X\_t = \sum\_{n=0}^{+\infty} X\_{\tau\_n} \mathbb{1}\_{[\tau\_n, \tau\_{n+1}]}(t).$$

*Now, as the theorems referred to allow us to consider that the process* (*Z*<sub>*n*</sub>)<sub>*n*≥1</sub> *is suitably approximated by* (*X*<sub>*t*</sub>)<sub>*t*≥0</sub>*, we can also consider that the continuous time process defined by*

$$\widetilde{X}\_t := \sum\_{n=0}^{+\infty} Z\_{\tau\_n} \mathbb{1}\_{[\tau\_n, \tau\_{n+1}]}(t), \tag{15}$$

*is an approximation of* (*Z*<sub>*n*</sub>)<sub>*n*≥1</sub> *in continuous time. For processes with a structural representation similar to that of the process* (*X̃*<sub>*t*</sub>)<sub>*t*≥0</sub>*, we propose in Section 4.3 a method to extend the open populations methodology from discrete to continuous time.*

#### **4. More on Open Continuous Time Processes from Discrete Ones**

In this section, we discuss an extension of the formalism of open Markov chains to the case of semi-Markov processes (sMp) and other continuous time processes, namely, the open Markov chain schemes introduced in [26]. For the reader's convenience we present in Appendix B a short summary on sMp and, in Section 4.1, a review of the main results on the open Markov chain formalism for discrete time. Finally, we propose the second main contribution of this work, that is, an extension of the open Markov chain formalism in discrete time to continuous time in the case of sMp. We also briefly refer to the case of open Markov schemes that, in some particular instances, can be dealt with as in the sMp case.

#### *4.1. Open Markov Chain Modeling in Discrete Time: A Short Review*

We now detail and comment on the results on discrete time open Markov chains that will be used in this paper. The study of open Markov chain models presented next relies on results and notations that were introduced in [56] and further developed in [22], and that we reproduce for the reader's convenience. We will suppose that, in general, the transition matrix of the Markov chain model may be written in the following form:

$$\mathbf{P} = \begin{bmatrix} \mathbf{K} & \mathbf{U}\_1 \\ \mathbf{0} & \mathbf{V} \end{bmatrix} \tag{16}$$

where **K** is a *k* × *k* transition matrix between transient states, **U**<sup>1</sup> a *k* × (*r* − *k*) matrix of transitions between the transient and the recurrent states, and **V** a (*r* − *k*) × (*r* − *k*) matrix of transitions between the recurrent states. A straightforward computation then shows that

$$\mathbf{P}^{(n)} = \begin{bmatrix} \mathbf{K}^{(n)} & \mathbf{U}\_n \\ \mathbf{0} & \mathbf{V}^{(n)} \end{bmatrix}, \quad n \in \mathbb{N}$$

with **U**<sub>*n*</sub> = **U**<sub>*n*−1</sub>**V** + **K**<sup>(*n*−1)</sup>**U**<sub>1</sub> = ∑<sub>*i*=0</sub><sup>*n*−1</sup> **K**<sup>(*i*)</sup> **U**<sub>1</sub> **V**<sup>(*n*−1−*i*)</sup>. We write the vector of the initial classification, for a time period *i*, as

$$\mathbf{c}\_{i}^{\mathsf{T}} = \begin{bmatrix} \mathbf{t}\_{i}^{\mathsf{T}} \| \mathbf{r}\_{i}^{\mathsf{T}} \end{bmatrix}, \quad i \in \mathbb{N} \tag{17}$$

with **t**<sub>*i*</sub> the vector of the initial allocation probabilities for the transient states and **r**<sub>*i*</sub> the vector of the initial allocation probabilities for the recurrent states. We suppose that at each epoch *i* ≥ 0 there is an influx of new elements in the classes of the population, whose evolution is governed by the Markov chain transition matrix, and that this influx is Poisson distributed with parameter *λ*<sub>*i*</sub>. It is a consequence of the *randomized sampling* principle (see [57], pp. 216–217) that, if the incoming populations are distributed among the classes according to the multinomial distribution, then the sub-populations in the transient classes have independent Poisson distributions, with parameters given by the product of the Poisson parameter by the probability of the incoming new member being allocated to the given class. With Formulas (16) and (17), we now notice that the vector of the Poisson parameters, for the population sizes in each state at an integer time *N*, may be written as

$$\boldsymbol{\lambda}\_{N}^{++\top} = \left[\sum\_{i=1}^{N} \lambda\_{i} \mathbf{t}\_{i}^{\top} \mathbf{K}^{(N-i)} \Bigg| \sum\_{i=1}^{N} \lambda\_{i} \left(\mathbf{t}\_{i}^{\top} \mathbf{U}\_{N-i} + \mathbf{r}\_{i}^{\top} \mathbf{V}^{(N-i)}\right)\right].\tag{18}$$
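The block structure behind Formula (18) can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-state chain (two transient states, one absorbing state) and constant allocation vectors; the matrices and the helper names `matmul`, `matpow`, `U`, `poisson_parameters`, and `poisson_parameters_direct` are illustrative and not taken from the paper.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matpow(A, n):
    """n-th power of a square matrix, for n >= 0."""
    result = [[float(i == j) for j in range(len(A))] for i in range(len(A))]
    for _ in range(n):
        result = matmul(result, A)
    return result

# Hypothetical example: k = 2 transient states and r - k = 1 absorbing state.
K = [[0.5, 0.2], [0.1, 0.6]]
U1 = [[0.3], [0.3]]
V = [[1.0]]
P = [K[0] + U1[0], K[1] + U1[1], [0.0, 0.0] + V[0]]

def U(n):
    """U_n from the recursion U_n = U_{n-1} V + K^(n-1) U_1, with U_0 = 0."""
    Un = [[0.0] * len(V) for _ in K]
    for m in range(1, n + 1):
        UV = matmul(Un, V)
        KU = matmul(matpow(K, m - 1), U1)
        Un = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(UV, KU)]
    return Un

def poisson_parameters(lams, t, r):
    """The two blocks of Formula (18) for constant allocation vectors t and r."""
    N = len(lams)
    transient, recurrent = [0.0] * len(K), [0.0] * len(V)
    for i, lam in enumerate(lams, start=1):
        m = N - i
        tK = matmul([t], matpow(K, m))[0]
        tU = matmul([t], U(m))[0]
        rV = matmul([r], matpow(V, m))[0]
        transient = [x + lam * y for x, y in zip(transient, tK)]
        recurrent = [x + lam * (y + z) for x, y, z in zip(recurrent, tU, rV)]
    return transient + recurrent

def poisson_parameters_direct(lams, c):
    """The same vector computed as sum_i lambda_i c^T P^(N-i)."""
    N = len(lams)
    total = [0.0] * len(P)
    for i, lam in enumerate(lams, start=1):
        row = matmul([c], matpow(P, N - i))[0]
        total = [x + lam * y for x, y in zip(total, row)]
    return total

lams = [4.0, 5.0, 6.0, 7.0]
by_blocks = poisson_parameters(lams, t=[0.6, 0.3], r=[0.1])
direct = poisson_parameters_direct(lams, c=[0.6, 0.3, 0.1])
```

Both computations agree up to rounding, and the components add up to the sum of the Poisson intensities because the full matrix is stochastic.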

We observe that the first block corresponds to the transient states and the second block, the one in the right-hand side, corresponds to the recurrent states. From now on, as a first restricting hypothesis, we will also suppose that the transition matrix of the transient states, **K**, is diagonalizable and so

$$\mathbf{K} = \sum\_{j=1}^{k} \eta\_j \boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^{\top},$$

with (*η*<sub>*j*</sub>)<sub>*j*∈{1,...,*k*}</sub> the eigenvalues, (***α***<sub>*j*</sub>)<sub>*j*∈{1,...,*k*}</sub> the left eigenvectors and (***β***<sub>*j*</sub>)<sub>*j*∈{1,...,*k*}</sub> the right eigenvectors of matrix **K**. We observe that *j* ∈ {1, ..., *k*} corresponds to a transient state if and only if |*η*<sub>*j*</sub>| < 1. We may write the powers of **K** as

$$\mathbf{K}^{(n)} = \sum\_{j=1}^{k} \eta\_{j}^{n} \boldsymbol{\alpha}\_{j} \boldsymbol{\beta}\_{j}^{\top},\tag{19}$$

and so, as a consequence of (18), for the vector of the Poisson parameters corresponding only to the transient states, ***λ***<sub>*N*</sub><sup>+⊤</sup>, we have

$$
\lambda\_N^{+\top} = \sum\_{i=1}^N \lambda\_i \,\mathbf{t}\_i^{\top} \,\mathbf{K}^{(N-i)} = \sum\_{j=1}^k \sum\_{i=1}^N \lambda\_i \,\eta\_j^{N-i} \,\mathbf{t}\_i^{\top} \,\boldsymbol{\alpha}\_j \,\boldsymbol{\beta}\_j^{\top}. \tag{20}
$$
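To make Formulas (19) and (20) concrete, here is a small sketch that evaluates both sides of Formula (20) for a hypothetical 2 × 2 matrix **K** whose spectral data were worked out by hand. The numerical values, and the names `spectral_side`, `direct_side`, and `rebuild_K`, are assumptions made for illustration only.

```python
# Hypothetical transient block and its spectral data, computed by hand:
# K = sum_j eta_j * alpha_j * beta_j^T, with beta_i^T alpha_j = 1 if i == j else 0.
K = [[0.5, 0.2], [0.1, 0.6]]
etas = [0.7, 0.4]
alphas = [[1.0, 1.0], [2.0, -1.0]]          # column factors alpha_j
betas = [[1 / 3, 2 / 3], [1 / 3, -1 / 3]]   # row factors beta_j^T

def vecmat(v, A):
    """Row vector times matrix."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def rebuild_K():
    """Reassemble K from its spectral decomposition."""
    return [[sum(e * a[r] * b[c] for e, a, b in zip(etas, alphas, betas))
             for c in range(2)] for r in range(2)]

def spectral_side(lams, ts):
    """Right-hand side of Formula (20)."""
    N = len(lams)
    out = [0.0, 0.0]
    for eta, alpha, beta in zip(etas, alphas, betas):
        for i, (lam, t) in enumerate(zip(lams, ts), start=1):
            weight = lam * eta ** (N - i) * sum(x * a for x, a in zip(t, alpha))
            out = [o + weight * b for o, b in zip(out, beta)]
    return out

def direct_side(lams, ts):
    """Middle term of Formula (20): sum_i lambda_i t_i^T K^(N-i)."""
    out = [0.0, 0.0]
    for lam, t in zip(lams, ts):
        out = vecmat(out, K)                        # earlier terms gain one factor K
        out = [o + lam * x for o, x in zip(out, t)]  # newest term enters with K^0
    return out

lams = [4.0, 5.0, 6.0]
ts = [[0.6, 0.4], [0.5, 0.5], [0.7, 0.3]]
lhs, rhs = direct_side(lams, ts), spectral_side(lams, ts)
```

The spectral form replaces matrix powers by scalar powers of the eigenvalues, which is what makes the asymptotic analysis tractable.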

The main result describing the asymptotic behaviour, established in [22], is the following.

**Theorem 4** (Asymptotic behavior of Poisson parameters of an open Markov chain with Poisson distributed influxes)**.** *Let a Markov chain driven system have a diagonalizable transition matrix between the transient states* **K** = ∑<sub>*j*=1</sub><sup>*k*</sup> *η*<sub>*j*</sub>***α***<sub>*j*</sub>***β***<sub>*j*</sub><sup>⊤</sup>*, written in its spectral decomposition form. Suppose the system to be fed by Poisson inputs with intensities* (*λ*<sub>*i*</sub>)<sub>*i*∈ℕ</sub> *and such that the vector of initial classification of the inputs in the transient states converges to a fixed value, that is,* lim<sub>*i*→+∞</sub> **t**<sub>*i*</sub><sup>⊤</sup> = **t**<sub>∞</sub><sup>⊤</sup> ≠ **0***. Then, with* ***λ***<sub>*n*</sub><sup>+⊤</sup> *the vector of Poisson parameters of the transient sub-populations at date n* ∈ ℕ*, we have the following:*

*1. If* lim<sub>*n*→+∞</sub> *λ*<sub>*n*</sub> = *λ* ∈ ℝ<sub>+</sub>*, then*

$$
\lambda\_{\infty}^{+} = \lim\_{n \to +\infty} \lambda\_{n}^{+\top} = \sum\_{j=1}^{k} \frac{\lambda}{1 - \eta\_{j}} \mathbf{t}\_{\infty}^{\top} \boldsymbol{\alpha}\_{j} \boldsymbol{\beta}\_{j}^{\top}.\tag{21}
$$

*2. If* lim<sub>*n*→+∞</sub> *λ*<sub>*n*</sub> = +∞ *and there exists a constant C* > 0 *such that*

$$\max\_{1 \le i \le n} \left| \frac{\lambda\_i - \lambda\_{i+1}}{\lambda\_n} \right| \le C$$

*then*

$$\lim\_{n \to +\infty} \frac{\lambda\_n^{+\top}}{\lambda\_n} = \sum\_{j=1}^k \frac{1}{1 - \eta\_j} \mathbf{t}\_{\infty}^{\top} \boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^{\top}. \tag{22}$$

**Remark 10.** *We observe that the proportions in the Markov chain transient classes, in both statements of Theorem 4, only depend on the eigenvalues η*<sub>*j*</sub>*, j* = 1, ..., *k. In fact, whenever Formula* (21) *is used to compute proportions, these proportions do not depend on the value of λ, as we have that*

$$\sum\_{j=1}^{k} \frac{\lambda}{1 - \eta\_{j}} \mathbf{t}\_{\infty}^{\mathsf{T}} \boldsymbol{\alpha}\_{j} \boldsymbol{\beta}\_{j}^{\mathsf{T}} = \lambda \left[ \mathbf{t}\_{\infty}^{\mathsf{T}} \cdot \left( \sum\_{j=1}^{k} \frac{1}{1 - \eta\_{j}} \boldsymbol{\alpha}\_{j} \boldsymbol{\beta}\_{j}^{\mathsf{T}} \right) \right].$$

*and the term in the right-hand side multiplying λ is a vector with dimension equal to the number of transient classes k, which is equal to the dimension of the square matrix* **K***. Thus, when computing proportions by normalizing this vector with the sum of its components, the factor λ* ≠ 0 *cancels.*
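A short numerical sketch of Theorem 4 (first statement) and of Remark 10 follows. It assumes constant inputs (*λ*<sub>*n*</sub> = *λ* and **t**<sub>*n*</sub> = **t**<sub>∞</sub> for all *n*) and a hypothetical 2 × 2 matrix **K** with hand-computed spectral data; with constant inputs, Formula (20) gives the one-step recursion used in `iterate`. All names and values are illustrative.

```python
# Hypothetical transient block with eigenvalues 0.7 and 0.4 (both of modulus < 1).
K = [[0.5, 0.2], [0.1, 0.6]]
etas = [0.7, 0.4]
alphas = [[1.0, 1.0], [2.0, -1.0]]
betas = [[1 / 3, 2 / 3], [1 / 3, -1 / 3]]

def vecmat(v, A):
    """Row vector times matrix."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def limit_vector(lam, t_inf):
    """Right-hand side of Formula (21)."""
    out = [0.0] * len(t_inf)
    for eta, alpha, beta in zip(etas, alphas, betas):
        weight = lam / (1.0 - eta) * sum(x * a for x, a in zip(t_inf, alpha))
        out = [o + weight * b for o, b in zip(out, beta)]
    return out

def iterate(lam, t_inf, steps):
    """Poisson parameters after `steps` epochs, via v_N = v_{N-1} K + lam * t_inf."""
    v = [0.0] * len(t_inf)
    for _ in range(steps):
        v = vecmat(v, K)
        v = [x + lam * y for x, y in zip(v, t_inf)]
    return v

def proportions(v):
    """Normalize a vector by the sum of its components."""
    total = sum(v)
    return [x / total for x in v]

t_inf = [0.6, 0.4]
finite = iterate(10.0, t_inf, 200)
limit = limit_vector(10.0, t_inf)
same_proportions = (proportions(limit_vector(3.0, t_inf)),
                    proportions(limit_vector(50.0, t_inf)))
```

For this example the limiting proportions are 7/15 and 8/15 whatever the value of *λ*, in line with Remark 10.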

#### *4.2. Open sMp from Discrete Time Open Markov Chains*

Let us suppose that the successive Poisson distributions of the influx of new members in the population are independent of the random time at which the influx of new members in the population occurs. For the notations used, see Appendix B. Consider a sMp given by the representation in Formula (A17), that is,

$$X\_t = \sum\_{n=0}^{+\infty} Z\_n \mathbb{1}\_{[\tau\_n, \tau\_{n+1}]}(t),$$

in which (*Z*<sub>*n*</sub>)<sub>*n*≥0</sub> is the embedded Markov chain and (*τ*<sub>*n*</sub>)<sub>*n*≥0</sub> are the jump times of the process. We now propose a way to extend to sMps the known method for studying open Markov chains in discrete time.

**(1)** In applications we usually consider that the influx of new members in the population is modeled by Poisson random variables that, at each time *t*, have a parameter *λ*(*t*). Thus, Formula (20) may be rewritten as

$$\lambda\_N^{+\top} = \sum\_{i=1}^{i:t\_i \le N} \lambda(t\_i) \,\mathbf{t}\_i^{\top} \mathbf{K}^{(N-i)} = \sum\_{j=1}^k \sum\_{i=1}^{i:t\_i \le N} \lambda(t\_i) \,\eta\_j^{N-i} \,\mathbf{t}\_i^{\top} \,\boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^{\top},\tag{23}$$

where usually we can take *t*<sub>*i*</sub> = *i* since, in a discrete time Markov chain, the actual time stamp is irrelevant: we only consider the sequence of epochs *i* ≥ 0.

**(2)** In a sMp the only difference with respect to a discrete time Markov chain is that the dates *τ*<sub>*i*</sub> corresponding to each epoch *i* are random; altogether, the structure of the changes in the sub-populations in the transient states is governed by the transition matrix of the Markov chain. In a sMp, the only possible observable changes are those that occur at the random times where it jumps; therefore, we will suppose that **the influxes of the new members of the population only occur at these random times**. As a consequence, the vector of the Poisson parameters in the transient classes is random, since it depends on the random times at which we consider influxes, and so Formula (23) becomes

$$\lambda\_N^{+\top}(\omega) = \sum\_{i=1}^{i:\tau\_i(\omega)\le N} \lambda(\tau\_i(\omega)) \,\mathbf{t}\_i^{\top} \mathbf{K}^{(N-i)} = \sum\_{j=1}^{k} \sum\_{i=1}^{i:\tau\_i(\omega)\le N} \lambda(\tau\_i(\omega)) \,\eta\_j^{N-i} \,\mathbf{t}\_i^{\top} \,\boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^{\top}. \tag{24}$$

**(3)** The parameters of interest will be the expected values of the random variables ***λ***<sub>*N*</sub><sup>+⊤</sup>(*ω*), together with the corresponding asymptotic behavior of these expected values when *N* grows indefinitely; these expected values can be computed whenever the joint laws of (*τ*<sub>0</sub>, *τ*<sub>1</sub>, ..., *τ*<sub>*i*</sub>) are known, for *i* ≥ 0. In fact, we observe that by Formula (24) we have

$$\begin{split} \mathbb{E}\left[\lambda\_{N}^{+\top} \,\Big|\, \tau\_{1},\ldots,\tau\_{i},\ldots\right] &= \mathbb{E}\left[\sum\_{j=1}^{k}\sum\_{i=1}^{i:\tau\_{i}\leq N}\lambda(\tau\_{i})\,\eta\_{j}^{N-i}\,\mathbf{t}\_{i}^{\top}\,\boldsymbol{\alpha}\_{j}\boldsymbol{\beta}\_{j}^{\top} \,\Big|\, \tau\_{1},\ldots,\tau\_{i},\ldots\right] = \\ &= \sum\_{j=1}^{k}\sum\_{i=1}^{i:\tau\_{i}\leq N}\lambda(\tau\_{i})\,\eta\_{j}^{N-i}\,\mathbf{t}\_{i}^{\top}\,\boldsymbol{\alpha}\_{j}\boldsymbol{\beta}\_{j}^{\top}.\end{split}$$

This formula has two consequences. The first one is that given an arbitrary strictly increasing sequence of dates 0 = *t*<sup>0</sup> < *t*<sup>1</sup> < ··· < *ti* < . . . we have

$$\mathbb{E}\left[\lambda\_N^{+\top} \,\Big|\, \tau\_1 = t\_1, \dots, \tau\_i = t\_i, \dots \right] = \sum\_{j=1}^k \sum\_{i=1}^{i:t\_i \le N} \lambda(t\_i) \,\eta\_j^{N-i} \,\mathbf{t}\_i^{\top} \,\boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^{\top},$$

thus justifying the assumption that, given a strictly increasing and non-accumulating sequence of stopping time dates (*τ*<sub>1</sub> = *t*<sub>1</sub>, ..., *τ*<sub>*i*</sub> = *t*<sub>*i*</sub>, ...), we can proceed as with the usual open Markov chain model in discrete time. The second consequence deserving mention is that, in order to compute the expected value of the vector of parameters of the transient classes sub-populations while preserving the Poisson distribution of the influx of new members, we compute

$$\mathbb{E}\left[\lambda\_{N}^{+\top}\right] = \mathbb{E}\left[\mathbb{E}\left[\lambda\_{N}^{+\top} \,\Big|\, \tau\_{1},\ldots,\tau\_{i},\ldots\right]\right] = \mathbb{E}\left[\sum\_{j=1}^{k}\sum\_{i=1}^{i:\tau\_{i}\leq N}\lambda(\tau\_{i})\,\eta\_{j}^{N-i}\,\mathbf{t}\_{i}^{\top}\,\boldsymbol{\alpha}\_{j}\boldsymbol{\beta}\_{j}^{\top}\right],$$

using the joint laws of (*τ*1,..., *τi*) for *i* ≥ 0, laws we will suppose to be given.
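Steps (1) to (3) can be sketched as a small Monte Carlo computation. The sketch rests on several assumptions that are not in the text: a hypothetical transient block **K**, a constant allocation vector, an intensity *λ*(*t*) = 10 + 5e<sup>−*t*</sup>, and jump dates *τ*<sub>*i*</sub> drawn uniformly on (*i* − 1, *i*], so that *τ*<sub>*i*</sub> ≤ *N* exactly when *i* ≤ *N* and the exponents *N* − *i* in Formula (24) stay non-negative. The names `lambda_plus`, `sample_taus`, and `expected_lambda_plus` are illustrative.

```python
import math
import random

K = [[0.5, 0.2], [0.1, 0.6]]  # hypothetical transient block

def vecmat(v, A):
    """Row vector times matrix."""
    return [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(len(A[0]))]

def lambda_plus(taus, N, lam, t):
    """Formula (24) for one realization of the jump dates (constant allocation t)."""
    acc, used = [0.0] * len(t), 0
    for tau in taus:
        if tau > N:                  # only jump dates up to N enter the sum
            break
        acc = vecmat(acc, K)         # earlier influxes make one more transition
        acc = [a + lam(tau) * x for a, x in zip(acc, t)]
        used += 1
    if used > N:
        raise ValueError("more than N jumps before date N: N - i would be negative")
    for _ in range(N - used):        # align the exponents to N - i
        acc = vecmat(acc, K)
    return acc

def sample_taus(N, rng):
    """Assumed jump law: tau_i uniform on (i - 1, i]; dates after N never enter the sum."""
    return [(i - 1) + (1.0 - rng.random()) for i in range(1, N + 1)]

def expected_lambda_plus(N, lam, t, n_samples=2000, seed=0):
    """Average of the conditional values over simulated jump dates (tower property)."""
    rng = random.Random(seed)
    total = [0.0] * len(t)
    for _ in range(n_samples):
        value = lambda_plus(sample_taus(N, rng), N, lam, t)
        total = [a + b for a, b in zip(total, value)]
    return [a / n_samples for a in total]

lam = lambda s: 10.0 + 5.0 * math.exp(-s)
estimate = expected_lambda_plus(N=40, lam=lam, t=[0.6, 0.4])
```

For large *N* the estimate is close to the value given by Formula (21) with *λ* = 10, because the early, randomly dated influxes are damped by the powers of **K**; for small *N* the law of the jump dates matters.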

Theorem 6, in the following, is one possible extension of the open Markov chain formalism to the sMp case taking as a starting point a discrete time Markov chain. To prove this result we will need Theorem 5—a generalization of Lebesgue dominated convergence theorem with varying measures—that we quote from Theorem 3.5 in [58] (p. 390).

**Theorem 5** (Lebesgue dominated convergence theorem with varying measures)**.** *Consider* (*X*, ℬ(*X*)) *a locally compact, separable topological space endowed with its Borel σ-algebra. Suppose that the sequence of probability measures* (*μ*<sub>*n*</sub>)<sub>*n*≥1</sub>*, each one of them defined in* (*X*, ℬ(*X*))*, converges weakly to μ on* (*X*, ℬ(*X*)) *and that the sequence of measurable functions* (*f*<sub>*n*</sub>)<sub>*n*≥1</sub> *converges continuously to f. Suppose additionally that, for some sequence of measurable functions* (*g*<sub>*n*</sub>)<sub>*n*≥1</sub> *defined on X:*

*1. For all t* ∈ *X and n* ≥ 1*, we have that* |*f*<sub>*n*</sub>(*t*)| ≤ *g*<sub>*n*</sub>(*t*)*.*

*2. With the function g defined on X by*

$$g(t) := \inf\_{(t\_n)\_{n \ge 1},\ \lim\_{n \to +\infty} t\_n = t} \left\{ \liminf\_{n \to +\infty} g\_n(t\_n) \right\}$$

*we have that*

$$\limsup\_{n \to +\infty} \int g\_{n}(t)\, d\mu\_{n}(t) \le \int g(t)\, d\mu(t) < +\infty.$$

*Then, we have*

$$\lim\_{n \to +\infty} \int f\_n(t)\, d\mu\_n(t) = \int f(t)\, d\mu(t) < +\infty.$$

As said, we will suppose that we only observe the influx of the new members of the population into the sMp classes at the random times where it jumps (accounting, of course, for the state before the jump and the state after the jump), which is a hypothesis that makes sense from the perspective that we usually observe trajectories of the process. We then have the following extension of Theorem 4 to the case of sMp.

**Theorem 6** (On the stability of open sMp transient states)**.** *Let a sMp be given by the representation in Formula* (A17)*, that is,*

$$X\_t = \sum\_{n=0}^{+\infty} Z\_n \mathbb{1}\_{[\tau\_n, \tau\_{n+1}]}(t),$$

*in which* (*Z*<sub>*n*</sub>)<sub>*n*≥0</sub> *is the embedded Markov chain and* (*τ*<sub>*i*</sub>)<sub>*i*≥0</sub> *are the jump times of the process. For the embedded Markov chain* (*Z*<sub>*n*</sub>)<sub>*n*≥0</sub>*, consider the notations of Section 4.2 and of Theorem 4 in this subsection. Suppose that the influx of new members in the population is modeled by Poisson random variables that at each time t* ∈ [0, +∞[ *have a parameter λ*(*t*)*, with λ a continuous function. Suppose, furthermore, that the following hypotheses are verified.*


$$\lim\_{i \to +\infty} \lambda(t\_i) = \lambda\_{\infty} \tag{25}$$

*Then, we have that the asymptotic behavior of the expected value vector of parameters of Poisson distributed sub-populations in the transient classes of an open sMp, submitted to a Poisson influx of new members at the jump times of the sMp, is given by*

$$\lim\_{N \to +\infty} \mathbb{E}\left[\lambda\_{N}^{+\top}\right] = \lim\_{N \to +\infty} \mathbb{E}\left[\sum\_{j=1}^{k} \sum\_{i=1}^{i:\tau\_i \leq N} \lambda\left(\tau\_{i}\right) \eta\_{j}^{N-i} \,\mathbf{t}\_{i}^{\top} \boldsymbol{\alpha}\_{j} \boldsymbol{\beta}\_{j}^{\top}\right] = \sum\_{j=1}^{k} \frac{\lambda\_{\infty}}{1 - \eta\_{j}} \mathbf{t}\_{\infty}^{\top} \boldsymbol{\alpha}\_{j} \boldsymbol{\beta}\_{j}^{\top}.\tag{26}$$

**Proof.** For each *n* ≥ 1, let *F*<sub>(*τ*<sub>1</sub>,...,*τ*<sub>*n*</sub>)</sub> be the joint distribution function of (*τ*<sub>1</sub>, ..., *τ*<sub>*n*</sub>). We want to compute the following limit of expectations:

$$\begin{split} \lim\_{N \to +\infty} \mathbb{E}\left[\lambda\_N^{+\top}\right] &= \lim\_{N \to +\infty} \mathbb{E}\left[\lambda\_N^{+\top} \, \, \tau\_1 < \cdots < \tau\_i \le N\right] = \\ &= \lim\_{N \to +\infty} \int\_{0 < t\_1 < \cdots < t\_i \le N} \lambda\_N^{+\top} \, dF\_{(\tau\_1, \ldots, \tau\_n)}(t\_1, \ldots, t\_n) = \\ &= \lim\_{N \to +\infty} \int\_{0 < t\_1 < \cdots < t\_i \le N} \left( \sum\_{j=1}^k \sum\_{i=1}^{i:t\_i \le N} \lambda(t\_i) \, \eta\_j^{N-i} \, \mathbf{t}\_i^{\top} \, \boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^{\top} \right) dF\_{(\tau\_1, \ldots, \tau\_n)}(t\_1, \ldots, t\_n), \end{split} \tag{27}$$

and we observe that by Theorem 4 and by the first hypothesis, for every sequence of positive real numbers (*t*<sub>*i*</sub>)<sub>*i*≥1</sub> such that lim<sub>*i*→+∞</sub> *t*<sub>*i*</sub> = +∞ and *t*<sub>1</sub> < *t*<sub>2</sub> < ··· < *t*<sub>*i*</sub> < ..., we have that

$$\lim\_{N \to +\infty} \left( \sum\_{j=1}^{k} \sum\_{i=1}^{i : t\_i \leq N} \lambda \left( t\_i \right) \eta\_j^{N-i} \,\mathbf{t}\_i^{\top} \boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^{\top} \right) = \sum\_{j=1}^{k} \frac{\lambda\_{\infty}}{1 - \eta\_j} \mathbf{t}\_{\infty}^{\top} \boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^{\top}.\tag{28}$$

The limit in the last term of Formula (27) requires a result of Lebesgue convergence theorem type but with varying measures. For the purpose of applying Theorem 5, we first introduce the adequate context and notations. Consider the space *X* = [0, +∞[<sup>ℵ<sub>0</sub></sup>, defined to be the space of infinite sequences of numbers in [0, +∞[, that is,

$$X = \{ \mathbf{t} = (t\_1, \dots, t\_{i}, \dots) : \forall i \ge 1, \ t\_i \in [0, +\infty[ \,\}.$$

Recall that with the metric *d* given by

$$\forall \mathbf{t} = (t\_1, \dots, t\_i, \dots), \ \mathbf{t}' = (t'\_1, \dots, t'\_{i}, \dots) \in X, \quad d(\mathbf{t}, \mathbf{t}') := \sum\_{i=1}^{+\infty} \frac{\min(1, \left| t\_i - t'\_i \right|)}{2^i},$$

*X* is a metric space, locally compact, separable and complete (see, for instance, [59], pp. 9–10). We will consider *X* = [0, +∞[<sup>ℵ<sub>0</sub></sup> endowed with the Borel *σ*-algebra ℬ(*X*) generated by the family 𝒫<sub>*f*</sub> given by

$$\mathcal{P}\_f = \left\{ A\_{i\_1} \times A\_{i\_2} \times \dots \times A\_{i\_p} \; : \; p \ge 1, \; A\_{i\_1}, \dots, A\_{i\_p} \in \mathcal{B}([0, +\infty[) \right\},$$

with ℬ([0, +∞[) the Borel *σ*-algebra of [0, +∞[. We now take (*τ*<sub>*i*</sub>)<sub>*i*≥0</sub> the sequence of the jump times of the process represented in Formula (A17). First, we define the sequence of measures (*μ*<sub>*n*</sub>)<sub>*n*≥1</sub> where, for each *n* ≥ 1, *μ*<sub>*n*</sub> is defined on the measurable space ([0, +∞[<sup>*n*</sup>, ℬ([0, +∞[<sup>*n*</sup>)) by considering, for *A*<sub>1</sub> × *A*<sub>2</sub> × ··· × *A*<sub>*n*</sub> with *A*<sub>*i*</sub> ∈ ℬ([0, +∞[), that

$$\mu\_{n}(A\_1 \times A\_2 \times \cdots \times A\_{n}) = \mathbb{P}\left[\tau\_1 \in A\_1, \ldots, \tau\_{n} \in A\_{n}\right] = \int\_{t\_1 \in A\_1, \ldots, t\_n \in A\_n} dF\_{(\tau\_1, \ldots, \tau\_{n})}(t\_1, \ldots, t\_{n}).\tag{29}$$

Thus, *μ*<sub>*n*</sub> is the joint probability law of (*τ*<sub>1</sub>, ..., *τ*<sub>*n*</sub>) and the integral in the last term of Formula (27) is exactly an integration with respect to the measure *μ*<sub>*n*</sub>. As a consequence of Formula (29), the sequence (*μ*<sub>*n*</sub>)<sub>*n*≥1</sub> verifies the compatibility conditions of the Kolmogorov extension theorem (see [60], p. 46), and so there is a probability measure *μ*, defined on (*X*, ℬ(*X*)), having as finite dimensional distributions the measures of the sequence (*μ*<sub>*n*</sub>)<sub>*n*≥1</sub>.

Now, for each *n* ≥ 1, we can consider *μ̃*<sub>*n*</sub>, the extension of *μ*<sub>*n*</sub> to the measurable space (*X*, ℬ(*X*)), in the following way:

$$\forall A \in \mathcal{B}(X), \quad \widetilde{\mu}\_n(A) = \int\_{\{\mathbf{t} = (t\_1, \dots, t\_i, \dots) \in A \,:\, t\_1, \dots, t\_n \in [0, +\infty[\}} dF\_{(\tau\_1, \dots, \tau\_n)}(t\_1, \dots, t\_n). \tag{30}$$

In fact, with this definition the restriction of *μ̃*<sub>*n*</sub> to ℬ([0, +∞[<sup>*n*</sup>) is exactly *μ*<sub>*n*</sub>. An important observation is the following. Consider *A* := *A*<sub>*i*<sub>1</sub></sub> × *A*<sub>*i*<sub>2</sub></sub> × ··· × *A*<sub>*i*<sub>*p*</sub></sub> ∈ 𝒫<sub>*f*</sub>. Then, for *m* ≥ *i*<sub>*p*</sub>, we have that

$$\begin{split} \widetilde{\mu}\_{m}(A) &= \int\_{\{\mathbf{t} = (t\_{1}, \ldots, t\_{i}, \ldots) \in A \,:\, t\_{1}, \ldots, t\_{m} \in [0, +\infty[\}} dF\_{(\tau\_{1}, \ldots, \tau\_{m})}(t\_{1}, \ldots, t\_{m}) = \\ &= \int\_{\{\mathbf{t} = (t\_{1}, \ldots, t\_{i}, \ldots) \in A \,:\, t\_{1}, \ldots, t\_{i\_p} \in [0, +\infty[\}} dF\_{(\tau\_{1}, \ldots, \tau\_{i\_{p}})}(t\_{1}, \ldots, t\_{i\_{p}}) = \\ &= \widetilde{\mu}\_{i\_{p}}(A) = \mu\_{i\_{p}}(A) = \mu(A), \end{split} \tag{31}$$

thus showing that for every *A* ∈ 𝒫<sub>*f*</sub> the sequence (*μ̃*<sub>*m*</sub>(*A*))<sub>*m*≥1</sub> converges to *μ*(*A*). Now, by Theorem 2.2 in [59] (p. 17), as 𝒫<sub>*f*</sub> is a *π*-system and every open set in the metric space (*X*, *d*) is a countable union of elements of 𝒫<sub>*f*</sub>, we have that the sequence (*μ̃*<sub>*m*</sub>)<sub>*m*≥1</sub> converges weakly to *μ*. In order to apply Theorem 5 to compute the limit, we may consider two approaches to deal with the fact that ***λ***<sub>*N*</sub><sup>+⊤</sup> is a vector of finite dimension *k*. Either we proceed component-wise or we consider norms. Let us follow the second path. Define, for integer *N*, and some constant *M*,

$$f\_N(\mathbf{t}) = f\_N(t\_1, \dots, t\_{i}, \dots) := \sum\_{i=1}^{i:t\_i \le N} \lambda(t\_i) \,\eta\_j^{N-i} \,\mathbf{t}\_i^\top \,\boldsymbol{\alpha}\_j \boldsymbol{\beta}\_{j}^\top,$$

and also,

$$g\_N(\mathbf{t}) \equiv g := \left\| \sum\_{j=1}^k \frac{\lambda\_\infty}{1 - \eta\_j} \mathbf{t}\_\infty^\top \boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^\top \right\| + M,$$

in such a way that *f*<sub>*N*</sub>(**t**) ≤ *g*; such a choice of *M* is possible as a consequence of Formula (28). We can verify that the sequence (*f*<sub>*N*</sub>)<sub>*N*≥1</sub> converges continuously to a function *f* by using Theorem 4.1.1 in [22] (p. 373). In fact, let us consider a sequence (**t**<sup>*N*</sup>)<sub>*N*≥1</sub> converging to some **t**<sup>∞</sup> = (*t*<sub>1</sub><sup>∞</sup>, ..., *t*<sub>*i*</sub><sup>∞</sup>, ...) in the metric space (*X*, *d*). With **t**<sup>*N*</sup> = (*t*<sub>1</sub><sup>*N*</sup>, ..., *t*<sub>*i*</sub><sup>*N*</sup>, ...) we surely have that lim<sub>*N*→+∞</sub> *t*<sub>*i*</sub><sup>*N*</sup> = *t*<sub>*i*</sub><sup>∞</sup> for all *i* ≥ 1. As a consequence of the continuity of *λ* and of Theorem 4.1.1 in [22] (p. 373), we have that

$$\lim\_{N \to +\infty} f\_N(\mathbf{t}^N) = \lim\_{N \to +\infty} \sum\_{i=1}^{i:t\_i^N \le N} \lambda\left(t\_i^N\right) \eta\_j^{N-i} \,\mathbf{t}\_i^\top \boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^\top = \sum\_{j=1}^k \frac{\lambda\left(\lim\_{i \to +\infty} t\_i^\infty\right)}{1 - \eta\_j} \mathbf{t}\_\infty^\top \boldsymbol{\alpha}\_j \boldsymbol{\beta}\_j^\top =: f\left(\mathbf{t}^\infty\right).$$

It is clear now that the sequences (*f*<sub>*N*</sub>)<sub>*N*≥1</sub>, (*g*<sub>*N*</sub>)<sub>*N*≥1</sub> and (*μ̃*<sub>*n*</sub>)<sub>*n*≥1</sub> satisfy, together with *μ*, the hypotheses of Theorem 5, and so the announced result in Formula (26) follows.

**Remark 11** (Alternative proof for the weak convergence of the sequence (*μ̃*<sub>*n*</sub>)<sub>*n*≥1</sub>)**.** *There is another proof of the weak convergence of the sequence* (*μ̃*<sub>*m*</sub>)<sub>*m*≥1</sub> *to μ that we now present. We proceed by showing that the sequence* (*μ̃*<sub>*n*</sub>)<sub>*n*≥1</sub> *is relatively compact, as a consequence of Prokhorov's theorem (see [59], pp. 59–63), because, as we will show next, this sequence is tight. Let an arbitrary* 0 < *ε* < 1 *be given and consider a sequence of positive numbers* (*ξ*<sub>*i*</sub>)<sub>*i*≥1</sub> *such that, by Chebyshev's inequality and using the fact that the stopping times τ*<sub>*i*</sub> *have finite integrals,*

$$\mathbb{P}\left[\tau\_i > \xi\_i\right] \le \frac{\mathbb{E}\left[\tau\_i\right]}{\xi\_i},$$

*in such a way that*

$$\sum\_{i=1}^{+\infty} \frac{\mathbb{E}\left[\tau\_i\right]}{\xi\_i} < \varepsilon.$$

*Now consider the Borel set K*<sub>*ε*</sub> = ∏<sub>*i*=1</sub><sup>+∞</sup> [0, *ξ*<sub>*i*</sub>] ⊂ *X, which is compact by Tychonoff's theorem. We now have that*

$$\begin{split} \widetilde{\mu}\_{n}(K\_{\varepsilon}) &= \int\_{\{\mathbf{t} = (t\_{1}, \dots, t\_{i}, \dots) \in K\_{\varepsilon} \,:\, t\_{1}, \dots, t\_{n} \in [0, +\infty[\}} dF\_{(\tau\_{1}, \dots, \tau\_{n})}(t\_{1}, \dots, t\_{n}) = \\ &= \int\_{\prod\_{i=1}^{n} [0, \xi\_{i}]} dF\_{(\tau\_{1}, \dots, \tau\_{n})}(t\_{1}, \dots, t\_{n}) = \\ &= \mathbb{P}\left[ (\tau\_{1}, \dots, \tau\_{n}) \in \prod\_{i=1}^{n} [0, \xi\_{i}] \right] = \mathbb{P}\left[ \bigcap\_{i=1}^{n} \{\tau\_{i} \le \xi\_{i} \} \right] = 1 - \mathbb{P}\left[ \bigcup\_{i=1}^{n} \{\tau\_{i} > \xi\_{i} \} \right] \ge \\ &\ge 1 - \sum\_{i=1}^{n} \frac{\mathbb{E}\left[ \tau\_{i} \right]}{\xi\_{i}} \ge 1 - \sum\_{i=1}^{+\infty} \frac{\mathbb{E}\left[ \tau\_{i} \right]}{\xi\_{i}} \ge 1 - \varepsilon, \end{split}$$

*thus showing that the sequence of probability measures* (*μ̃*<sub>*n*</sub>)<sub>*n*≥1</sub> *is tight in the measurable space* (*X*, ℬ(*X*))*. As said, by Prokhorov's theorem, this implies that the sequence* (*μ̃*<sub>*n*</sub>)<sub>*n*≥1</sub> *is relatively compact, that is, for every subsequence of* (*μ̃*<sub>*n*</sub>)<sub>*n*≥1</sub>*, there exists a further subsequence and a probability measure such that this subsequence converges weakly to the said probability measure. Now, as, by construction, the probability measure μ has as finite dimensional distributions the probability measures* (*μ̃*<sub>*n*</sub>)<sub>*n*≥1</sub>*, we can say that, for n* ≥ 1*, the finite dimensional distributions of μ̃*<sub>*n*</sub> *converge weakly to the finite dimensional distributions of μ. As a consequence, following the observation in [59] (p. 58), the sequence* (*μ̃*<sub>*n*</sub>)<sub>*n*≥1</sub> *converges weakly to μ.*
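One explicit way to pick the sequence (*ξ*<sub>*i*</sub>)<sub>*i*≥1</sub> of Remark 11 is *ξ*<sub>*i*</sub> = 2<sup>*i*+1</sup> 𝔼[*τ*<sub>*i*</sub>]/*ε*, for which the series of bounds sums to *ε*/2. The sketch below checks this numerically; the linear growth of the expected jump dates and the name `tightness_bounds` are hypothetical choices made for illustration.

```python
def tightness_bounds(mean_tau, eps, n_terms):
    """xi_i = 2**(i + 1) * E[tau_i] / eps, so that E[tau_i] / xi_i = eps / 2**(i + 1)."""
    return [2 ** (i + 1) * mean_tau(i) / eps for i in range(1, n_terms + 1)]

# Hypothetical expected jump dates: a mean sojourn time of 1.5 between jumps.
mean_tau = lambda i: 1.5 * i
eps = 0.05

xis = tightness_bounds(mean_tau, eps, 60)
# Partial sum of the bounds E[tau_i] / xi_i; the full series equals eps / 2 < eps.
partial_sum = sum(mean_tau(i) / xi for i, xi in enumerate(xis, start=1))
```

Any other choice with a summable sequence of ratios below *ε* works equally well; the geometric choice merely makes the bound explicit.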

**Remark 12** (Applying Theorem 6)**.** *If we manage to estimate a discrete time Markov chain transition matrix and if we manage to fit some function f, such that* lim<sub>*t*→+∞</sub> *f*(*t*) = *λ*<sub>∞</sub>*, to the number of new incoming members in the population at a set of non-accumulating, non-evenly spaced dates (as done with a statistical procedure in [22] or with a simple fitting in [25]), then Theorem 6 allows us to get the asymptotic expected number of elements in the transient classes of a sMp having the estimated chain as its embedded Markov chain.*

#### *4.3. Open Continuous Time Processes from Open Markov Schemes*

We may follow the approach of open Markov schemes in [26] and define a process in continuous time after getting a process in random discrete times describing, at least on average, the evolution of the elements in each transient class. Let us briefly recall the main idea. A population model is driven by a Markov chain defined by a sequence of initial distributions given, for *n* ≥ 1, by (**q**<sup>*n*</sup>)<sup>⊤</sup> = (*q*<sub>1</sub><sup>*n*</sup>, *q*<sub>2</sub><sup>*n*</sup>, ..., *q*<sub>*r*</sub><sup>*n*</sup>) and a transition matrix **P** = [*p*<sub>*ij*</sub>], 1 ≤ *i*, *j* ≤ *r*. After one transition, the new values of the proportions in all states can be recovered from **P**<sup>⊤</sup>**q** = (**q**<sup>⊤</sup>**P**)<sup>⊤</sup> and, after *n* transitions, from (**P**<sup>(*n*)</sup>)<sup>⊤</sup>**q** = (**q**<sup>⊤</sup>**P**<sup>(*n*)</sup>)<sup>⊤</sup>. We want to account for the evolution of the **expected** number of elements in each class supposing that, at each **random** date *τ*<sub>*k*</sub>, a random number *X*<sub>*τ*<sub>*k*</sub></sub> of new elements enters the population. Just after the second cohort enters the population, a first transition occurs in the first cohort, driven by the Markov chain law, and so on and so forth. Table 1 summarizes this accounting process in which, at each step *k*, we distribute multinomially the new random arrivals *X*<sub>*τ*<sub>*k*</sub></sub> according to the probability vector **q**<sup>*k*</sup> and the elements in each class are redistributed according to the Markov chain transition matrix **P**.


**Table 1.** Accounting of *n* Markov cohorts each with an initial distribution.

If we suppose that each new set of individuals in the population, a cohort, evolves independently from any one of the already existing sets of individuals but according to the same Markov chain model, we may recover the total **expected** number of elements in each class at date *τ<sub>n</sub>* by computing the sum:

$$\mathbf{K}\_n = \sum\_{k=1}^n \mathbb{E}\left[X\_{\tau\_k}\right](\mathbf{q}^k)^\mathsf{T} \mathbf{P}^{(n-k)}.\tag{32}$$

Each vector component corresponds precisely to the **expected** number of elements in each class. In order to further study the properties of (**K**<sub>*n*</sub>)<sub>*n*≥1</sub>, given the properties of a stochastic process 𝕏 = (*X*<sub>*τ<sub>k</sub>*</sub>)<sub>*k*≥1</sub>, we will randomize formula (32) by considering, instead, for *n* ≥ 1:

$$\mathbf{K}\_n = \sum\_{k=1}^n X\_{\tau\_k}(\mathbf{q}^k)^\mathsf{T} \mathbf{P}^{(n-k)},\tag{33}$$

and we observe that, in any case, the expected value 𝔼[**K**<sub>*n*</sub>] of the randomized vector in formula (33) is the vector given by formula (32). It is known that if the vector of classification probabilities is constant, **c**<sup>*k*</sup> = **c**, and if the arrival process (*X*<sub>*τ<sub>k</sub>*</sub>)<sub>*k*≥1</sub> is an ARMA, ARIMA, or SARIMA process, then the populations in each of the transient classes can be described by the sum of a deterministic trend, an ARMA process and an evanescent process, that is, a centered process (*Y<sub>k</sub>*)<sub>*k*≥1</sub> such that lim<sub>*k*→+∞</sub> 𝔼[|*Y<sub>k</sub>*|<sup>2</sup>] = 0 (see Theorems 3.1 and 3.2 in [26]).
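The cohort accounting behind formula (32) can be sketched numerically. The following is a minimal illustration, assuming a homogeneous chain, so that **P**<sup>(*n*−*k*)</sup> is the matrix power of **P**; the function names, the matrix and the arrival means are hypothetical choices made only for this example. It evaluates the sum directly and through the equivalent one-step recursion **K**<sub>*n*</sub> = **K**<sub>*n*−1</sub>**P** + 𝔼[*X*<sub>*τ<sub>n</sub>*</sub>](**q**<sup>*n*</sup>)<sup>⊤</sup>.

```python
import numpy as np


def expected_counts_recursive(mean_arrivals, q_list, P):
    """Expected class counts K_1, ..., K_n via K_n = K_{n-1} P + E[X_n] q^n."""
    P = np.asarray(P, dtype=float)
    K = np.zeros(P.shape[0])
    history = []
    for mean_x, q in zip(mean_arrivals, q_list):
        # Older cohorts make one transition; the new cohort is then classified by q.
        K = K @ P + mean_x * np.asarray(q, dtype=float)
        history.append(K)
    return np.array(history)


def expected_counts_direct(mean_arrivals, q_list, P, n):
    """Direct evaluation of the sum in formula (32) for a given n >= 1."""
    P = np.asarray(P, dtype=float)
    total = np.zeros(P.shape[0])
    for k in range(1, n + 1):
        q_k = np.asarray(q_list[k - 1], dtype=float)
        total += mean_arrivals[k - 1] * (q_k @ np.linalg.matrix_power(P, n - k))
    return total


# Hypothetical data: three classes, the third one absorbing.
P = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.6, 0.3],
              [0.0, 0.0, 1.0]])
mean_arrivals = [10.0, 12.0, 9.0, 11.0]   # E[X_{tau_k}] for k = 1, ..., 4
q_list = [[0.8, 0.2, 0.0]] * 4            # constant classification vector

K_rec = expected_counts_recursive(mean_arrivals, q_list, P)
K_dir = expected_counts_direct(mean_arrivals, q_list, P, n=4)
print(K_rec[-1], K_dir)
```

Both evaluations agree, and since the example matrix is stochastic the components of **K**<sub>*n*</sub> add up to the total expected number of arrivals.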

The step process in continuous time naturally associated with the discrete time one would then be defined, for *t* ≥ 0, by

$$\mathbf{K}\_t := \sum\_{n=0}^{+\infty} \mathbf{K}\_n \mathbf{1}\_{[\tau\_n, \tau\_{n+1}[}(t) = \sum\_{n=0}^{+\infty} \left( \sum\_{k=1}^n X\_{\tau\_k}(\mathbf{q}^k)^\mathsf{T} \mathbf{P}^{(n-k)} \right) \mathbf{1}\_{[\tau\_n, \tau\_{n+1}[}(t).$$

In order to study this process we will have to take advantage of the properties of the arrival process and of the family of stopping times (*τ<sub>k</sub>*)<sub>*k*≥0</sub>. It should be noticed that if the process 𝕏 = (*X<sub>t</sub>*)<sub>*t*≥0</sub> is Poisson distributed and the laws of the sequence (*τ<sub>k</sub>*)<sub>*k*≥0</sub> are known, then it is possible to determine the expected value of **K**<sub>*t*</sub> for *t* ≥ 0 with a result similar to Theorem 6.
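Evaluating such a step process at a given time only requires locating the last jump date not exceeding *t*. A minimal sketch follows, assuming the jump dates and the vectors **K**<sub>*n*</sub> are already available as arrays; the function name and the data are hypothetical, and the path is simply held at its last value after the last recorded date.

```python
import numpy as np


def step_process_value(t, jump_times, K_values):
    """Value at time t of the step process equal to K_values[n] on [tau_n, tau_{n+1}[.

    jump_times: increasing dates tau_0 = 0 < tau_1 < ... < tau_N
    K_values:   array of shape (N + 1, r); K_values[n] is the vector K_n
    """
    jump_times = np.asarray(jump_times, dtype=float)
    if t < jump_times[0]:
        raise ValueError("t must not be smaller than tau_0")
    # side="right" counts the dates <= t, so n is the last index with tau_n <= t.
    n = np.searchsorted(jump_times, t, side="right") - 1
    return np.asarray(K_values)[n]


# Hypothetical dates and class counts (two classes).
taus = [0.0, 0.7, 1.5, 3.2]
K_vals = [[0, 0], [5, 1], [7, 4], [8, 6]]

print(step_process_value(0.7, taus, K_vals))    # value on [tau_1, tau_2[
print(step_process_value(1.49, taus, K_vals))   # still on [tau_1, tau_2[
```

The right-continuous convention matches the half-open intervals [*τ<sub>n</sub>*, *τ*<sub>*n*+1</sub>[ used in the definition.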

#### **5. Conclusions**

In this work, we studied several ways to associate, to an open Markov chain process in discrete time (which is often the sole accessible fruit of observation), a continuous time Markov or semi-Markov process that bears some natural relation with the discrete time process. Furthermore, we expect that association to allow the extension of the study of open populations from the discrete to the continuous time model. For that purpose, we consider three approaches: the first, for the continuous time Markov chains; the second, for the semi-Markov case; and the third, for the open Markov schemes (see [26]). For the semi-Markov case, under the hypothesis that we only observe the influx of new individuals in the population at the times of the random jumps, in the main result we determine the expected value of the vector of parameters of the conditional Poisson distributions in the transient classes when the influx of new members is Poisson distributed. The third approach, dealing with open Markov schemes, is similar to the second one whenever we consider a similar context hypothesis, that is, incoming new members of the population with known distributions and observation of this influx of new individuals at the times of the random jumps. In the case of the first approach, that is, for the case of Markov chains in continuous time, we propose a calibration procedure for which the embeddable Markov chains provide optimal solutions. In this case also, the study of open population models relies on the main result proved for the semi-Markov case approach. Future work encompasses applications to real data and the determination of criteria to assess the quality of the association of the continuous model to the observed discrete time model.

**Author Contributions:** All authors contributed equally to this work. All authors have read and agreed to the published version of the manuscript.

**Funding:** For the second author, this work was done under partial financial support of RFBR (Grant n. 19-01-00451). For the first and third author this work was partially supported through the project of the Centro de Matemática e Aplicações, UID/MAT/00297/2020 financed by the Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology). The APC was funded by the insurance company Fidelidade.

**Acknowledgments:** This work was published with financial support from the insurance company Fidelidade. The authors would like to thank Fidelidade for this generous support and also for their interest in the development of models for insurance problems in Portugal. The authors express gratitude to Professor Panagiotis C.G. Vassiliou for his enlightening comments on a previous version of this work and to the referees for their comments, corrections and questions, in particular, for the one question that motivated the inclusion of Remark 5.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Some Essential Results on Continuous Time Markov Chains**

In this exposition of the most relevant results pertinent to our purposes, we follow mainly the references [29–31]. As this exposition is a mere reminder of needed notions and results, the proofs are omitted unless the result is essential for our purposes.

**Definition A1** (Continuous time Markov chain)**.** *Let* ℐ *be some finite set; for instance,* **Θ** = {*θ*<sub>1</sub>, *θ*<sub>2</sub>, ... , *θ<sub>r</sub>*} *of Section 2. A stochastic process* (*X<sub>t</sub>*)<sub>*t*≥0</sub> *is a continuous time Markov chain with state space* ℐ *if and only if the following Markov property is verified, namely, for all i*<sub>0</sub>, *i*<sub>1</sub>, ... , *i<sub>n</sub>* ∈ ℐ *and* 0 = *t*<sub>0</sub> < *t*<sub>1</sub> < ··· < *t<sub>n</sub>* < ··· *we have that*

$$\mathbb{P}\left[X\_{t\_n} = i\_n \mid X\_{t\_{n-1}} = i\_{n-1}, \dots, X\_{t\_1} = i\_1, X\_{t\_0} = i\_0\right] = \mathbb{P}\left[X\_{t\_n} = i\_n \mid X\_{t\_{n-1}} = i\_{n-1}\right].$$

We observe that, by force of the Markov property in Definition A1, the law of a continuous time Markov chain depends only on the following transition probabilities. Let *I* be the identity matrix with dimension #ℐ and let the Kronecker delta be given by

$$
\delta\_i^j = \begin{cases} 0 & i \neq j \\ 1 & i = j. \end{cases}
$$

**Definition A2** (Transition probabilities)**.** *Let* ℐ *be the state space of* (*X<sub>t</sub>*)<sub>*t*≥0</sub>*, a continuous time Markov chain. The transition probabilities are defined by*

$$\forall i, j \in \mathcal{I}, \; s < t, \; p(s, i, t, j) = \mathbb{P}\left[X\_t = j \mid X\_s = i\right] \text{ and } p(t, i, t, j) = \delta\_i^j.$$

*Let* ℒ(ℝ<sup>#ℐ</sup>) *be the space of square matrices with coefficients in* ℝ*. The transition probability matrix function P* : ℝ<sub>+</sub> × ℝ<sub>+</sub> → ℒ(ℝ<sup>#ℐ</sup>) *is defined by*

$$\forall i, j \in \mathcal{I}, \; s < t, \; \mathbf{P}(s, t) = [p(s, i, t, j)]\_{i, j \in \mathcal{I}} \text{ and } \mathbf{P}(t, t) = I. \tag{A1}$$

Transition probabilities of Markov processes in general satisfy a very important functional equation that results from the Markov property.

**Theorem A1** (Chapman-Kolmogorov equations)**.** *Consider a NH-CT-MC as given in Definition A1. Let P be its transition probability matrix function as given in Definition A2. We then have*

$$\forall s, u, t, \; 0 \le s < u < t, \; \mathbf{P}(s, t) = \mathbf{P}(s, u)\mathbf{P}(u, t). \tag{A2}$$
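The Chapman-Kolmogorov identity can be checked numerically in the simplest situation. The sketch below assumes a time homogeneous chain, for which *P*(*s*, *t*) = exp((*t* − *s*)*Q*) with a constant intensity matrix *Q*; the matrix values are hypothetical, and the matrix exponential is a plain scaling and squaring routine written for this illustration rather than a library call.

```python
import numpy as np


def matrix_exponential(A, squarings=10, terms=20):
    """exp(A) by scaling and squaring with a truncated Taylor series."""
    A = np.asarray(A, dtype=float) / 2.0 ** squarings
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k          # A^k / k!
        result = result + term
    for _ in range(squarings):       # undo the scaling: exp(A) = exp(A / 2^m)^(2^m)
        result = result @ result
    return result


def transition(s, t, Q):
    """P(s, t) = exp((t - s) Q) for a homogeneous chain (assumption of this sketch)."""
    return matrix_exponential((t - s) * np.asarray(Q, dtype=float))


# Hypothetical intensity matrix: non-negative off-diagonal entries, zero row sums.
Q = np.array([[-1.0, 0.6, 0.4],
              [0.5, -1.5, 1.0],
              [0.2, 0.3, -0.5]])

s, u, t = 0.0, 0.8, 2.5
lhs = transition(s, t, Q)
rhs = transition(s, u, Q) @ transition(u, t, Q)
print(np.max(np.abs(lhs - rhs)))     # discrepancy at rounding-error level
```

The same check applies to a non-homogeneous chain once *P*(*s*, *t*) has been computed by any other means.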

As an application of the celebrated existence theorem of Kolmogorov (in the form exposed in [61], pp. 8–10) we have that, under a set of natural hypotheses, there exists a NH-CT-MC such as the one in Definition A1.

**Theorem A2** (On the existence of NH-CT-MC)**.** *Let p*<sub>0</sub> *be an initial probability over* ℐ*. Consider a matrix valued function P* : ℝ<sub>+</sub> × ℝ<sub>+</sub> → ℒ(ℝ<sup>#ℐ</sup>) *denoted by P*(*s*, *t*) = [*p*(*s*, *i*, *t*, *j*)]<sub>*i*,*j*∈ℐ</sub> *and satisfying Formulas* (A3) *and* (A4) *below, that is,*

*1. For all s* < *t and for all i* ∈ ℐ

$$\sum\_{j \in \mathcal{I}} p(s, i, t, j) = 1.\tag{A3}$$

*2. Formula* (A2) *in Theorem A1, namely,*

$$\forall s, u, t, \; 0 \le s < u < t, \; \mathbf{P}(s, t) = \mathbf{P}(s, u)\mathbf{P}(u, t). \tag{A4}$$

*Define, for all i*<sub>0</sub>, *i*<sub>1</sub>, ... , *i<sub>n</sub>* ∈ ℐ *and* 0 = *t*<sub>0</sub> < *t*<sub>1</sub> < ··· < *t<sub>n</sub>* < ··· *, the function*

$$\begin{split} \nu\_{t\_0, t\_1, \ldots, t\_n}(i\_0, i\_1, \ldots, i\_n) &= \\ &= p\_0(i\_0) p(t\_0, i\_0, t\_1, i\_1) p(t\_1, i\_1, t\_2, i\_2) \cdots p(t\_{n-1}, i\_{n-1}, t\_n, i\_n), \end{split} \tag{A5}$$

*and extend this definition to all possible t*<sub>0</sub>, *t*<sub>1</sub>, ... , *t<sub>n</sub>*, ... *by considering the adequate ordering permutation σ of* {0, 1, 2, ... , *n*} *such that we have t*<sub>*σ*(0)</sub> < *t*<sub>*σ*(1)</sub> < ··· < *t*<sub>*σ*(*n*)</sub> *and setting*

$$\nu\_{t\_{\sigma(0)}, t\_{\sigma(1)}, \dots, t\_{\sigma(n)}} (i\_0, i\_1, \dots, i\_n) = \nu\_{t\_0, t\_1, \dots, t\_n} (i\_{\sigma^{-1}(0)}, i\_{\sigma^{-1}(1)}, \dots, i\_{\sigma^{-1}(n)}) . \tag{A6}$$

*Then,* (*ν*<sub>*t*<sub>0</sub>,*t*<sub>1</sub>,...,*t<sub>n</sub>*</sub>)<sub>*t*<sub>0</sub>,*t*<sub>1</sub>,...,*t<sub>n</sub>*, *n*≥1</sub> *is a family of probability measures satisfying the compatibility conditions of the Kolmogorov existence theorem and so there exists a probability measure ℙ over the canonical probability space* (Ω, 𝒜) *(with* Ω = ℐ<sup>ℝ<sub>+</sub></sup> *and* 𝒜 = 𝒫(ℐ)<sup>ℝ<sub>+</sub></sup>*) such that if the stochastic process* (*X<sub>t</sub>*)<sub>*t*≥0</sub> *is defined by*

$$\forall \omega = (i\_t)\_{t \ge 0} \in \Omega, \ X\_t(\omega) = i\_t,$$

*then,*

$$\forall i, j \in \mathcal{I}, \; s < t, \; p(s, i, t, j) = \mathbb{P}\left[X\_t = j \mid X\_s = i\right] \text{ and } p(t, i, t, j) = \delta\_i^j, \tag{A7}$$

*that is,* (*X<sub>t</sub>*)<sub>*t*≥0</sub> *has P*(*s*, *t*) = [*p*(*s*, *i*, *t*, *j*)]<sub>*i*,*j*∈ℐ</sub>*, together with P*(*t*, *t*) = *I, as its transition probabilities.*

A natural and useful way of defining transition probabilities is by means of the transition intensities that act like differential coefficients of transition probability functions.

**Definition A3** (Transition intensities)**.** *Let* ℒ(ℝ<sup>#ℐ</sup>) *be the space of square matrices with coefficients in* ℝ*. A function Q* : ℝ<sub>+</sub> → ℒ(ℝ<sup>#ℐ</sup>) *denoted by*

$$\mathbf{Q}(t) = [q(t, i, j)]\_{i, j \in \mathcal{I}},$$

*is a transition intensity iff for almost all t* ≥ 0 *it verifies*

*(i)* ∀*i* ∈ ℐ, *t* ≥ 0, *q*(*t*, *i*, *i*) ≤ 0*;*

*(ii)* ∀*i* ∈ ℐ, *t* ≥ 0, *q*(*t*, *i*, *j*) − *q*(*t*, *i*, *i*) ≥ 0*;*

*(iii)* ∀*i* ∈ ℐ, ∑<sub>*j*∈ℐ</sub> *q*(*t*, *i*, *j*) = 0*.*

There is a way to write differential equations—the Kolmogorov backward and forward equations—useful for recovering the transition probability matrix from the intensities matrix and to study important properties of these transition probabilities.

**Theorem A3** (Backward and Forward Kolmogorov equations)**.** *Suppose that P*(*s*, *t*) *is continuous at s, that is,*

$$\lim\_{t \downarrow 0} P(0, t) = I \text{ and } \lim\_{t \downarrow s} P(s, t) = \lim\_{t \uparrow s} P(t, s) = I. \tag{A8}$$

*If there exists Q such that*

$$\begin{split} \mathbf{Q}(t) &= \lim\_{k+h \downarrow 0, \, k \ge 0, \, h \ge 0} \frac{\mathbf{P}(t-k, t+h) - I}{k+h} = \lim\_{h \downarrow 0, h > 0} \frac{\mathbf{P}(t, t+h) - I}{h} = \\ &= \lim\_{k \downarrow 0, k > 0} \frac{\mathbf{P}(t-k, t) - I}{k} , \end{split} \tag{A9}$$

*then we have the backward Kolmogorov (matrix) equation:*

$$\frac{\partial}{\partial s}\mathbf{P}(s,t) = -\mathbf{Q}(s)\mathbf{P}(s,t),\ \mathbf{P}(s,s) = I,\tag{A10}$$

*and the forward Kolmogorov (matrix) equation:*

$$\frac{\partial}{\partial t}\mathbf{P}(s,t) = \mathbf{P}(s,t)\mathbf{Q}(t),\ \mathbf{P}(t,t) = I. \tag{A11}$$

**Remark A1.** *The general theory of Markov processes shows that the condition that P*(*s*, *t*) *is continuous in both s and t is sufficient to ensure the existence of the matrix intensities Q given in Formula* (A9) *(see [31], p. 232). By means of a change of time, Goodman (see [41]) proved that the existence of solutions of the Kolmogorov equations is amenable to an application of Carathéodory's existence theorem for differential equations.*

Given transition intensities satisfying an integrability condition there are transition probabilities uniquely associated with these transition intensities.

**Theorem A4** (Transition probabilities from intensities)**.** *Let Q be a transition intensity as in Definition A3 such that Theorem A3 holds. Then, we have that*

$$\mathbf{P}(s,t) = \mathbf{I} + \int\_{s}^{t} \mathbf{Q}(u)\mathbf{P}(u,t) du \text{ and } \mathbf{P}(s,t) = \mathbf{I} + \int\_{s}^{t} \mathbf{P}(s,u)\mathbf{Q}(u) du. \tag{A12}$$
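To see how transition probabilities are recovered from intensities in practice, the forward equation can be integrated numerically. The sketch below is only an illustration under stated assumptions: a two-state chain with *Q*(*u*) = *g*(*u*)*Q*<sub>0</sub>, *g*(*u*) = 1 + *u*, chosen because these matrices commute and a closed form is available for comparison; the solver is a classical fourth order Runge-Kutta scheme and all names are hypothetical.

```python
import numpy as np


def forward_kolmogorov(Q, s, t, steps=2000):
    """Integrate dP/dt = P(s, t) Q(t), P(s, s) = I, with the classical RK4 scheme."""
    P = np.eye(Q(s).shape[0])
    h = (t - s) / steps
    u = s
    for _ in range(steps):
        k1 = P @ Q(u)
        k2 = (P + 0.5 * h * k1) @ Q(u + 0.5 * h)
        k3 = (P + 0.5 * h * k2) @ Q(u + 0.5 * h)
        k4 = (P + h * k3) @ Q(u + h)
        P = P + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        u += h
    return P


# Hypothetical two-state example: Q(u) = (1 + u) * Q0 with rates a = 2 and b = 1.
a, b = 2.0, 1.0
Q0 = np.array([[-a, a],
               [b, -b]])


def Q(u):
    return (1.0 + u) * Q0


s, t = 0.5, 2.0
P_num = forward_kolmogorov(Q, s, t)

# Closed form for this commuting case: Q0 @ Q0 = -(a + b) Q0, hence
# P(s, t) = I + Q0 * (1 - exp(-(a + b) G)) / (a + b), with G the integral of 1 + u.
G = (t - s) + (t ** 2 - s ** 2) / 2.0
P_exact = np.eye(2) + Q0 * (1.0 - np.exp(-(a + b) * G)) / (a + b)
print(np.max(np.abs(P_num - P_exact)))
```

Because the rows of *Q*(*u*) sum to zero, each Runge-Kutta stage preserves the row sums of *P*, so the numerical solution remains a stochastic matrix up to rounding.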

The existence of a NH-CT-MC can also be guaranteed by a constructive procedure that we now present and that is most useful for simulation.

**Remark A2** (Constructive definition)**.** *Given a transition intensity Q define*

$$p^\*(t,i,j) = \begin{cases} \frac{1-\delta\_i^j}{-q(t,i,i)}q(t,i,j) & q(t,i,i) \neq 0\\ \delta\_i^j & q(t,i,i) = 0. \end{cases}$$

*Given X*<sub>0</sub> = *i, let τ*<sub>1</sub> *be the time of the first jump, with exponential distribution function*


$$F\_{\tau\_1}(t) = \mathbb{P}\left[\tau\_1 \le t\right] = 1 - \exp\left(\int\_0^t q(u, i, i) du\right),$$

*and*

$$\mathbb{P}\left[X\_{s\_1} = j \,|\,\tau\_1 = s\_1, \, X\_0 = i\right] = p^\*(s\_1, i, j),$$

*and so Xt* = *i for* 0 ≡ *τ*<sup>0</sup> ≤ *t* < *τ*1*. We note that this distribution of the stopping time is mandatory as a consequence of a general result on the distribution of sojourn times of a continuous time Markov chain (see Theorem 2.3.15 in [31], p. 221).*

*3. Given that τ*<sub>1</sub> = *s*<sub>1</sub> *and X*<sub>*s*<sub>1</sub></sub> = *j, let τ*<sub>2</sub> *be the time of the second jump, with exponential distribution function*

$$F\_{\tau\_2 \mid \tau\_1 = s\_1}(t) = \mathbb{P}\left[\tau\_2 \le t \mid \tau\_1 = s\_1\right] = 1 - \exp\left(\int\_0^t q(u + s\_1, j, j) du\right),$$

*and*

$$\mathbb{P}\left[X\_{s\_2} = k \,|\,\tau\_1 = s\_1, \ X\_0 = i, \ \tau\_2 = s\_2, \ X\_{s\_1} = j\right] = p^\*(s\_1 + s\_2, j, k),$$

*and so Xt* = *j for τ*<sup>1</sup> ≤ *t* < *τ*2*.*
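The constructive procedure lends itself directly to simulation. The following sketch treats only the time homogeneous special case, an assumption made to keep the example short: with constant intensities the waiting time in state *i* is exponential with rate −*q*(*i*, *i*) and the jump probabilities *p*<sup>\*</sup> do not depend on time. The function names and the intensity matrix are hypothetical.

```python
import numpy as np


def jump_probabilities(Q):
    """Matrix p*(i, j) of the construction, for a time independent intensity matrix Q."""
    Q = np.asarray(Q, dtype=float)
    p_star = np.eye(Q.shape[0])      # rows with q(i, i) = 0 keep the Kronecker delta
    for i in range(Q.shape[0]):
        if Q[i, i] != 0:
            p_star[i] = Q[i] / (-Q[i, i])
            p_star[i, i] = 0.0
    return p_star


def simulate_path(Q, initial_state, horizon, rng):
    """Jump dates and visited states of one trajectory on [0, horizon]."""
    Q = np.asarray(Q, dtype=float)
    p_star = jump_probabilities(Q)
    times, states = [0.0], [initial_state]
    t, i = 0.0, initial_state
    while Q[i, i] != 0:                           # a state with q(i, i) = 0 is absorbing
        t += rng.exponential(1.0 / -Q[i, i])      # waiting time with rate -q(i, i)
        if t > horizon:
            break
        i = int(rng.choice(Q.shape[0], p=p_star[i]))
        times.append(t)
        states.append(i)
    return np.array(times), np.array(states)


# Hypothetical intensities; the third state is absorbing.
Q = np.array([[-1.0, 0.6, 0.4],
              [0.5, -1.5, 1.0],
              [0.0, 0.0, 0.0]])
rng = np.random.default_rng(12345)
times, states = simulate_path(Q, initial_state=0, horizon=20.0, rng=rng)
print(times, states)
```

In the non-homogeneous case the exponential draw would be replaced by sampling from the distribution functions displayed above, for instance by inverting the integrated intensity.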

The following result ensures that the preceding construction yields the desired result.

**Theorem A5** (The continuous time Markov chain)**.** *Let the intensities satisfy the condition given by Formula* (A12) *in Theorem A4. Then, given the times* (*τ<sub>n</sub>*)<sub>*n*≥1</sub>*, we have that, with the sequence* (*Y<sub>n</sub>*)<sub>*n*≥1</sub> *defined by Y<sub>n</sub>* = *X*<sub>*τ<sub>n</sub>*</sub>*, the process defined by:*

$$X\_t = \sum\_{n=0}^{+\infty} Y\_n \mathbb{1}\_{[\tau\_n, \tau\_{n+1}[}(t) = \sum\_{n=0}^{+\infty} X\_{\tau\_n} \mathbb{1}\_{[\tau\_n, \tau\_{n+1}[}(t) \tag{A13}$$

*is a continuous time Markov chain with transition probabilities P given by Definition A2 and transition intensities Q given by Definition A3 and Theorem A3.*

**Proof.** This theorem is stated and proved, in the general case of continuous time Markov processes, in [31] (p. 229).

**Lemma A1.** *Let q* : ℝ<sub>+</sub> → ℝ *be a measurable function integrable over every bounded interval of* ℝ<sub>+</sub>*. Then, we have that*

$$\int\_{s}^{t} \int\_{s\_1}^{t} \cdots \int\_{s\_{n-1}}^{t} q(s\_1)q(s\_2)\dots q(s\_n)ds\_n \dots ds\_2 ds\_1 = \frac{\left(\int\_{s}^{t} q(u)du\right)^n}{n!},$$

*for all* 0 ≤ *s* ≤ *t, n* ≥ 1*.*

**Proof.** Let us observe that, for *n* = 2, we have that

$$\begin{aligned} \left(\int\_s^t q(u) du\right)^2 &= \int\_s^t \int\_s^t q(v)q(u) du dv = \\ &= \int\_s^t \int\_s^t \mathbb{1}\_{\{u \le v\}} q(v)q(u) du dv + \int\_s^t \int\_s^t \mathbb{1}\_{\{v \le u\}} q(v)q(u) du dv. \end{aligned}$$

By induction we have, for all *n* ≥ 1, summing over the permutations *σ* ∈ 𝔖<sub>*n*</sub>,

$$\begin{aligned} &\left(\int\_s^t q(u) du\right)^n = \\ &= \sum\_{\sigma \in \mathfrak{S}\_n} \int\_s^t \cdots \int\_s^t \mathbb{1}\_{\{u\_{\sigma(1)} \le u\_{\sigma(2)} \le \dots \le u\_{\sigma(n)}\}} q(u\_1) \ldots q(u\_n) du\_n \ldots du\_1 = \\ &= n! \int\_s^t \cdots \int\_s^t \mathbb{1}\_{\{u\_1 \le u\_2 \le \dots \le u\_n\}} q(u\_1) \ldots q(u\_n) du\_n \ldots du\_1 = \\ &= n! \int\_s^t \int\_{u\_1}^t \cdots \int\_{u\_{n-1}}^t q(u\_1) q(u\_2) \ldots q(u\_n) du\_n \ldots du\_2 du\_1, \end{aligned}$$

as all the integrals in the sum are equal by the symmetry of the integrand function and, for the last equality, by Fubini's theorem.
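As a quick sanity check of Lemma A1, consider the illustrative choice (not taken from the text) *n* = 2, *s* = 0, *t* = 1 and *q*(*u*) = *u*. The ordered double integral and the right-hand side of the lemma can both be computed by hand:

$$\int\_0^1 \int\_{s\_1}^1 s\_1 s\_2 \, ds\_2 \, ds\_1 = \int\_0^1 s\_1 \, \frac{1 - s\_1^2}{2} \, ds\_1 = \frac{1}{4} - \frac{1}{8} = \frac{1}{8} = \frac{\left(\int\_0^1 u \, du\right)^2}{2!} = \frac{(1/2)^2}{2}.$$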

**Remark A3** (On a fundamental condition)**.** *The condition on q stated in Lemma A1 and reformulated in Formula* (7) *is the key to the proof of important results. In fact we have that this condition is sufficient to ensure that the associated Markov process has no discontinuities of the second type (see [31], p. 227) and, most important for the goals in this work, that the trajectories of the associated Markov process are step functions, that is, any trajectory has only a finite number of jumps in any compact subinterval of* [0, +∞[*; we will detail this last part of the remark in Theorem A6.*

Under the perspective of our main motivation the following result is crucial.

**Theorem A6** (The non accumulation property of the jump times of a Markov chain)**.** *Let the intensities satisfy condition given by the statement of Lemma A1. Then, given the times* (*τn*)*n*≥1*, we have that:*

$$\mathbb{P}\left[\sum\_{n=1}^{+\infty} \tau\_n = +\infty\right] = 1,\tag{A14}$$

*and so the trajectories of the process are step functions.*

**Proof.** The property in Formula (A14) does not have an immediate proof. We present a proof based on a result in [62] (p. 160), stating that the condition given by:

$$\lim\_{h \downarrow 0} \sup\_{t, i} \sum\_{j \neq i} p(t, i, t + h, j) = 0,\tag{A15}$$

guarantees that the process has a stochastic equivalent that is a step process, meaning that for any trajectory *ω* the set of jumps of this trajectory has no limit points in the interval [0, *ζ*(*ω*)[, with *ζ*(*ω*) being the end date of the trajectory. This result is based on a thorough analysis (see [62], pp. 149–159) of the conditions for a Markov process not to have discontinuities of the second type, meaning that the right-hand side and left-hand side limits exist for every date point and every trajectory. Now, with

$$q(t) := \max\_{1 \le i \le \#\mathcal{I}} |q(t, i, i)|,$$

by virtue of the condition on *q* in Lemma A1 (reformulated more precisely in Formula (7) of the statement of Theorem 1), we have that:

$$p(t, i, t+h, j) \le \sum\_{k=1}^{+\infty} \frac{\left(\#\mathcal{I} \int\_t^{t+h} q(u) du\right)^k}{k!}.$$

Therefore, for almost all *t* ∈ [0, *T*],

$$\begin{aligned} \lim\_{h \downarrow 0} \sup\_{t,i} \sum\_{j \neq i} p(t, i, t + h, j) &\le (\#\mathcal{I} - 1) \lim\_{h \downarrow 0} \sup\_{t} \sum\_{k=1}^{+\infty} \frac{\left(\#\mathcal{I} \cdot \int\_{t}^{t+h} q(u) du\right)^k}{k!} = \\ &= (\#\mathcal{I} - 1) \lim\_{h \downarrow 0} \sup\_{t} \sum\_{k=1}^{+\infty} \frac{\left(h \cdot \#\mathcal{I} \cdot \frac{1}{h} \int\_{t}^{t+h} q(u) du\right)^k}{k!} = 0, \end{aligned}$$

as the series is uniformly convergent and, for almost all *t* ∈ [0, *T*],

$$\lim\_{h \downarrow 0} \frac{1}{h} \int\_{t}^{t+h} q(u) du = q(t),$$

by Lebesgue's differentiation theorem.

**Remark A4** (Negative properties)**.** *The following negative properties suggest the alternative calibration approach that we propose in Section 3.2. Given* (*X*<sub>*τ<sub>n</sub>*</sub>)<sub>*n*≥0</sub>*, the successive states occupied by the process, we observe that*


#### **Appendix B. Semi-Markov Processes: A Short Review**

For the reader's convenience we present a short summary of the most important results on semi-Markov processes (sMp) needed in this work, following [63] (pp. 189–200). The main foundational references for the theory of sMp are [32,64,65]. Important developments can be read in [33,66,67]. Among the many works with relevance for applications we refer, for instance, to [68–73]. Let us consider a complete probability space (Ω, ℱ, ℙ). The approach of Markov and semi-Markov processes via kernels is fruitful and so we are led to the following definitions and results, for which we now follow, mainly, the works in [67] (pp. 7–15) and in [33]. Consider a general measurable state space (**Θ**, A(**Θ**)). The *σ*-algebra A(**Θ**) may be seen as the observable sets of the state space of the process **Θ**.

**Definition A4** (Semi-Markov transition kernel)**.** *A map Q* : **Θ** × A(**Θ**) × [0, +∞[→ [0, 1] *such that* (*x*, *B*, *t*) → *Q*(*x*, *B*, *t*) *is a semi-Markov transition kernel if it satisfies the following properties.*

	- *(ii.1) For fixed θ* ∈ **Θ** *and t* > 0*, the map Q*(*θ*, ·, *t*) : A(**Θ**) → [0, 1] *is a measure and we have Q*(*θ*, **Θ**, *t*) ≤ 1*; if Q*(*θ*, **Θ**, *t*) = 1 *we have that Q*(·, ·, *t*) *is a stochastic kernel.*
	- *(ii.2) For a fixed T* ∈ A(**Θ**) *we have that Q*(·, *T*, *t*) : **Θ** → [0, 1] *is measurable with respect to* A(**Θ**)*.*

Now, consider *Q* a semi-Markov transition kernel, a continuous time stochastic process (*Y<sub>t</sub>*)<sub>*t*≥0</sub> defined on this probability space and 𝔽 = (ℱ<sub>*t*</sub>)<sub>*t*≥0</sub> the natural filtration associated with this process, i.e., ℱ<sub>*t*</sub> := *σ*(*Y<sub>s</sub>* : *s* ≤ *t*) is the *σ*-algebra generated by the variables of the process until time *t*. We now consider a sequence of random variables (*Z<sub>n</sub>*)<sub>*n*≥0</sub> taking values in a state space **Θ** (for our purposes, in general, a finite state space **Θ** = {*θ*<sub>1</sub>, *θ*<sub>2</sub>, ... , *θ<sub>r</sub>*} and sometimes an infinite one **Θ** = {*θ*<sub>1</sub>, *θ*<sub>2</sub>, ... , *θ<sub>r</sub>*, ... }), the sequence being adapted to the filtration 𝔽. We also consider 0 ≡ *τ*<sub>0</sub> < *τ*<sub>1</sub> < *τ*<sub>2</sub> < ··· < *τ<sub>n</sub>* < ···, an increasing sequence of 𝔽-stopping times, denoted by 𝒯, and Δ<sub>*n*</sub> := *τ<sub>n</sub>* − *τ*<sub>*n*−1</sub> for *n* ≥ 1.

**Definition A5** (Markov renewal process)**.** *A two dimensional discrete time process* (*Z<sub>n</sub>*, Δ<sub>*n*</sub>)<sub>*n*≥0</sub> *with state space* **Θ** × [0, +∞[ *verifying,*

$$\mathbb{P}\left[Z\_{n+1} = \theta\_j, \Delta\_n \le t \,|\, Z\_0, \dots, Z\_n, \Delta\_1, \Delta\_2, \dots, \Delta\_n\right] = \mathbb{P}\left[Z\_{n+1} = \theta\_j, \Delta\_n \le t \,|\, Z\_n\right],$$

*for all θ<sub>j</sub>* ∈ **Θ***, t* ≥ 0 *and almost surely, that is, a homogeneous two dimensional Markov chain, is a Markov renewal process if its transition probabilities are given by:*

$$\mathcal{Q}(\theta, T, t) = \mathbb{P}[Z\_{n+1} \in T, \,\Delta\_n \le t \,|\, Z\_n = \theta].$$

**Remark A5** (Markov chains and Markov renewal processes)**.** *The transition probabilities of a Markov renewal process do not depend on the second component; as such, a Markov renewal process is a process of a different type from a general two dimensional Markov chain. The first component of a Markov renewal process is a Markov chain, denoted the embedded Markov chain, with transition probabilities given by:*

$$P(\theta, T) = \mathcal{Q}(\theta, T, +\infty) = \lim\_{t \to +\infty} \mathcal{Q}(\theta, T, t) = \mathbb{P}[Z\_{n+1} \in T \, | \, Z\_n = \theta].$$

**Definition A6** (Markov renewal times)**.** *The Markov renewal times* (*τ<sub>n</sub>*)<sub>*n*≥0</sub> *of the Markov renewal process are defined by*

$$\tau\_n = \sum\_{k=1}^n \Delta\_k,$$

*and the probability distribution functions F<sub>θ</sub> of the Markov renewal times depend on the states of the embedded Markov chain, as, by definition, we have*

$$F\_{\theta}(t) := \mathcal{Q}(\theta, \Theta, t) = \mathbb{P}[\Delta\_n \le t \, | \, Z\_n = \theta].$$

**Proposition A1.** *Consider a general measurable state space* (**Θ**, A(**Θ**))*. Let Q be a semi-Markov transition kernel and P the associated stochastic kernel according to Definition A4. Then, there exists a function Fθ*(*γ*, *t*) *such that:*

$$\mathcal{Q}(\theta, T, t) = \int\_{T} F\_{\theta}(\gamma, t) P(\theta, d\gamma). \tag{A16}$$

**Proof.** As we have, for *θ* ∈ **Θ** and *T* ∈ A(**Θ**), that *P*(*θ*, *T*) = *Q*(*θ*, *T*, +∞), we may conclude that *Q*(*θ*, *T*, *t*) ≤ *P*(*θ*, *T*) and so the measure *Q*(*θ*, ·, *t*) is absolutely continuous with respect to the probability measure *P*(*θ*, ·) on (**Θ**, A(**Θ**)); therefore, by the Radon–Nikodym theorem, there exists a density *F<sub>θ</sub>*(*γ*, *t*) verifying Formula (A16).

**Remark A6** (Semi-Markov kernel for discrete state space)**.** *In the case of a discrete state space, say* **Θ** = {*θ*<sub>1</sub>, *θ*<sub>2</sub>, ... , *θ<sub>r</sub>*, ... }*, we may consider* A(**Θ**) = P(**Θ**)*, the maximal σ-algebra of all the subsets of* **Θ***, and, with this condition, a semi-Markov kernel Q is defined by a matrix function Q* = [*q*(*i*, *j*, *t*)]<sub>*i*,*j*≥1, *t*≥0</sub> *such that*


**Definition A7** (Semi-Markov process)**.** *The process* (*Yt*)*t*≥<sup>0</sup> *is a semi-Markov process if:*

*(i) The process admits a representation given, for t* ≥ 0*, by*

$$Y\_t = \sum\_{n=0}^{+\infty} Z\_n \mathbb{1}\_{[\tau\_n, \tau\_{n+1}[}(t). \tag{A17}$$

*(ii) The following property is verified:*


$$\begin{aligned} \mathbb{P}\left[Z\_{n+1} = \theta\_j, \tau\_{n+1} - \tau\_n \le t \, | \, Z\_0, \dots, Z\_n, \tau\_1, \tau\_2, \dots, \tau\_n \right] &= \\ \mathbb{P}\left[Z\_{n+1} = \theta\_j, \tau\_{n+1} - \tau\_n \le t \, | \, Z\_n \right], \end{aligned} \tag{A18}$$

*for all θ<sub>j</sub>* ∈ **Θ***, t* ≥ 0 *and almost surely, as it is a conditional expectation.*

**Proposition A2** (The sMp as a Markov chain)**.** *The process* (*Zn*, *τn*)*n*≥<sup>0</sup> *is a Markov chain with state space* **Θ** × [0, +∞[ *and with semi-Markov transition kernel given by:*

$$q(i,j,t) := \mathbb{P}\left[Z\_{n+1} = \theta\_j, \tau\_{n+1} - \tau\_n \le t \, | \, Z\_n = \theta\_i\right]. \tag{A19}$$

**Proposition A3** (The embedded Markov chain of the Mrp)**.** *The process* (*Zn*)*n*≥<sup>0</sup> *is a Markov chain with state space* **Θ** *with transition probabilities given by:*

$$p(i,j) := q(i,j,+\infty) = \mathbb{P}\left[Z\_{n+1} = \theta\_j \, | \, Z\_n = \theta\_i\right],\tag{A20}$$

*and is denoted as the embedded Markov chain of the Mrp.*

**Proposition A4** (The conditional distribution function of the time between two successive jumps)**.** *Let Q* = [*q*(*i*, *j*, *t*)]<sub>*i*,*j*∈{1,2,...,*r*}, *t*≥0</sub> *be the semi-Markov kernel as in Proposition A2. Let the times between successive jumps be* Δ<sub>*n*</sub> := *τ<sub>n</sub>* − *τ*<sub>*n*−1</sub> *and let the conditional distribution function of the time between two successive jumps be given by*

$$F\_{ij}(t) := \mathbb{P}\left[\Delta\_n \le t \,|\, Z\_n = \theta\_i, \, Z\_{n+1} = \theta\_j\right]. \tag{A21}$$

*Then, the semi-Markov kernel verifies,*

$$q(i,j,t) := \mathbb{P}\left[Z\_{n+1} = \theta\_j, \Delta\_n \le t \,|\, Z\_n = \theta\_i\right] = p(i,j)F\_{ij}(t),\tag{A22}$$

*with p*(*i*, *j*) *as defined in Proposition A3.*

**Proof.** It is a consequence of Proposition A1.
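The factorization *q*(*i*, *j*, *t*) = *p*(*i*, *j*)*F<sub>ij</sub>*(*t*) suggests a two-step way of simulating a sMp: draw the next state from the embedded chain, then draw the sojourn time from *F<sub>ij</sub>*. The sketch below is an illustration under assumptions that are not in the text: a three-state embedded chain and exponential *F<sub>ij</sub>* whose means depend on both the departure and the arrival state. All names and values are hypothetical.

```python
import numpy as np


def simulate_semi_markov(P, sojourn_sampler, initial_state, n_jumps, rng):
    """Jump dates tau_0, ..., tau_n and states Z_0, ..., Z_n of a semi-Markov path."""
    P = np.asarray(P, dtype=float)
    states = [initial_state]
    times = [0.0]
    for _ in range(n_jumps):
        i = states[-1]
        j = int(rng.choice(P.shape[0], p=P[i]))    # next state from the embedded chain
        delta = sojourn_sampler(i, j, rng)         # sojourn time drawn from F_ij
        states.append(j)
        times.append(times[-1] + delta)
    return np.array(times), np.array(states)


def state_at(t, times, states):
    """Y_t = Z_n for tau_n <= t < tau_{n+1}."""
    n = np.searchsorted(times, t, side="right") - 1
    return int(states[n])


# Hypothetical embedded chain (no self transitions) and mean sojourn times.
P_emb = np.array([[0.0, 0.7, 0.3],
                  [0.4, 0.0, 0.6],
                  [0.5, 0.5, 0.0]])
mean_sojourn = np.array([[0.0, 1.0, 3.0],
                         [0.5, 0.0, 2.0],
                         [1.5, 0.8, 0.0]])


def exponential_sojourn(i, j, rng):
    # F_ij taken as exponential with mean depending on both i and j (an assumption).
    return rng.exponential(mean_sojourn[i, j])


rng = np.random.default_rng(2021)
times, states = simulate_semi_markov(P_emb, exponential_sojourn, 0, 50, rng)
print(state_at(2.0, times, states))
```

Any other family *F<sub>ij</sub>* can be plugged in through the sampler argument, which is where a sMp departs from a continuous time Markov chain.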

**Remark A7** (Homogeneous Markov chains as semi-Markov processes)**.** *Let* (*X<sub>t</sub>*)<sub>*t*≥0</sub> *be a homogeneous Markov chain in continuous time with state space* **Θ** = {*θ*<sub>1</sub>, *θ*<sub>2</sub>, ... , *θ<sub>r</sub>*, ... } *and with time independent transition intensities given by Q*(*t*) = [*q*(*i*, *j*)]<sub>*i*,*j*≥1</sub> *(see Definition A3). Then, by the well known results on homogeneous Markov chains (see [29] pp. 317, 318) and by the representation given by Formula* (A22)*, we have that*

$$q(i,j,t) = \begin{cases} \frac{q(i,j)}{-q(i,i)} \left(1 - e^{q(i,i)t}\right) & i \neq j \text{ and } q(i,i) \neq 0, \\ 0 & i = j \text{ or } q(i,i) = 0, \end{cases} \tag{A23}$$

*is the semi-Markov kernel of a sMp. Comparing Formula* (A23) *with Formulas* (A21) *and* (A22)*, we can see that the main difference between a sMp and a continuous time Markov process is that, in the sMp case, the conditional distribution function of the time between two successive jumps depends not only on the initial state of the jump but also on the final state, while in the homogeneous Markov chain case the dependence is only on the initial state of the jump.*
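The kernel of a homogeneous chain can be tabulated directly from its intensity matrix. The sketch below assumes the usual form for a homogeneous chain, with jump probabilities *q*(*i*, *j*)/(−*q*(*i*, *i*)) and exponential sojourn times of rate −*q*(*i*, *i*); the function name and the matrix are hypothetical. Summing a row over *j* gives 1 − exp(*q*(*i*, *i*)*t*), which depends only on the departure state, in line with the remark.

```python
import numpy as np


def semi_markov_kernel(Q, t):
    """Matrix [q(i, j, t)] of a homogeneous continuous time Markov chain."""
    Q = np.asarray(Q, dtype=float)
    kernel = np.zeros_like(Q)
    for i in range(Q.shape[0]):
        if Q[i, i] != 0:
            # jump probability q(i, j) / (-q(i, i)) times the sojourn law 1 - exp(q(i, i) t)
            kernel[i] = Q[i] / (-Q[i, i]) * (1.0 - np.exp(Q[i, i] * t))
            kernel[i, i] = 0.0
    return kernel


# Hypothetical intensity matrix.
Q = np.array([[-1.0, 0.6, 0.4],
              [0.5, -1.5, 1.0],
              [0.2, 0.3, -0.5]])

kernel = semi_markov_kernel(Q, t=1.2)
sojourn_law = kernel.sum(axis=1)           # distribution function of the sojourn time at t
embedded = semi_markov_kernel(Q, t=1e3)    # numerically the limit t -> +infinity
print(sojourn_law)
print(embedded)
```

For large *t* the kernel converges to the transition matrix of the embedded chain, whose rows sum to one.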

**Definition A8** (The sojourn time distribution in a state)**.** *The sojourn time distribution in the state θ<sup>i</sup>* ∈ **Θ** = {*θ*1, *θ*2,..., *θr*,... }*, is defined by:*

$$H\_i(t) := \sum\_{j=1}^{+\infty} q(i, j, t) = \sum\_{j=1}^{+\infty} p(i, j) F\_{ij}(t). \tag{A24}$$

*Its mean value represents the mean sojourn time in state θ<sub>i</sub> of the sMp* (*Y<sub>t</sub>*)<sub>*t*≥0</sub>*.*

**Definition A9** (Regular sMp)**.** *A sMp* (*Y<sub>t</sub>*)<sub>*t*≥0</sub> *is regular if, with N*(*t*) *the number of jumps of the process in the time interval* ]0, *t*] *given by:*

$$N(t) := \sup\{n \ge 0 : \tau\_n \le t\},\tag{A25}$$

*it verifies, for all t* > 0 *and all θ<sub>i</sub>* ∈ **Θ***,*

$$\mathbb{P}\_{i}[N(t) < +\infty] := \mathbb{P}[N(t) < +\infty \mid Z\_0 = \theta\_i] = 1. \tag{A26}$$

**Proposition A5** (Jump times of a regular sMp do not have accumulation points)**.** *Let the sMp* (*Y<sub>t</sub>*)<sub>*t*≥0</sub> *be regular. Then, almost surely,* lim<sub>*n*→+∞</sub> *τ<sub>n</sub>* = +∞ *and, for any T* ∈ ℝ<sub>+</sub> *and almost all ω* ∈ Ω*:*

$$\#\{k \ge 1 : \tau\_k(\omega) \le T\} < +\infty. \tag{A27}$$

*This means that in every compact time interval* [0, *T*]*, for almost all ω* ∈ Ω *there is only a finite number of times τk*(*ω*) *in this interval.*

The following fundamental theorem ensures that, for a sMp with finite state space, the sequence of stopping times does not accumulate in a compact interval.

**Theorem A7** (A sufficient condition for regularity of a sMp)**.** *Let α* > 0 *and β* > 0 *be constants such that, for every state θ<sub>i</sub>, the sojourn time distribution in this state H<sub>i</sub>*(*t*) *defined in Definition A8 verifies:*

$$H\_i(\alpha) < 1 - \beta.$$

*Then, the sMp is regular. In particular, any sMp with a finite state space is regular.*

**Proof.** See [74] (p. 88).

**Remark A8** (On the estimation of sMp)**.** *The estimation of sMp is dealt with, for instance, in [75,76].*

#### **References**

