Article

Estimation–Calibration of Continuous-Time Non-Homogeneous Markov Chains with Finite State Space

by Manuel L. Esquível 1,*, Nadezhda P. Krasii 2 and Gracinda R. Guerreiro 1
1 Department of Mathematics, NOVA FCT, and NOVA Math, Universidade Nova de Lisboa, Quinta da Torre, 2829-516 Monte de Caparica, Portugal
2 Department of Higher Mathematics, Don State Technical University, Gagarin Square 1, Rostov-on-Don 344000, Russia
* Author to whom correspondence should be addressed.
Mathematics 2024, 12(5), 668; https://doi.org/10.3390/math12050668
Submission received: 2 January 2024 / Revised: 20 February 2024 / Accepted: 21 February 2024 / Published: 24 February 2024

Abstract

We propose a method for fitting transition intensities to a sufficiently large set of trajectories of a continuous-time non-homogeneous Markov chain with a finite state space. Starting with simulated data computed with Gompertz–Makeham transition intensities, we apply the proposed method to fit piecewise linear intensities and then compare the transition probabilities corresponding to both the Gompertz–Makeham transition intensities and the fitted piecewise linear intensities; the main comparison result is that the order of magnitude of the average fitting error per unit time—chosen as a year—is always less than 1%, thus validating the methodology proposed.

1. Introduction with a Literature Review

This study follows on from a previous article (see [1]) in which we developed a way to calibrate a continuous-time Markov chain model using data obtained from the Portuguese National Network of Continuing Care. The calibration methodology used in that work, although very effective, is not completely satisfactory, as it rests on a series of ad hoc processes with reduced guarantees of reproducibility and robustness.
In the present work, we intend to develop simpler and more robust means of estimating and calibrating intensities for non-homogeneous continuous-time Markov chains (see [2] for a recent introduction to these processes and their applications). For this purpose, we first develop the following two subjects. The first deals with Markov chain regime switching achieved by considering an abrupt change in the intensities, for instance, intensities with jumps. The second complements the first: supposing that we replace regular intensities by irregular ones—such as piecewise linear intensities, in principle with more easily estimable parameters—we study the effect on the transition probabilities of replacing the original intensities by sufficiently close alternative intensities. These two streams of ideas are connected not only to one another but also to the estimation–calibration techniques to be studied.
We now present a review of the literature, mainly covering the subject of estimation and calibration of continuous-time non-homogeneous Markov chains with finite state space relevant for health insurance and long-term care (LTC), existing results for the Kolmogorov ordinary differential equations, as well as works where one can find some similarities between non-homogeneous Markov chains and semi-Markov jump linear systems.
A well-established approach in the study of continuous-time Markov chains for applications, namely in multiple-state models—the transition intensity approach (see [3] p. 126 or [4] p. 189)—consists of specifying the intensities, solving the Kolmogorov ODE and using the transition probabilities obtained for computations. The intensities should be estimated from the data. This is the approach that we adopt in this work.
The statistics of homogeneous Markov chains has already received several very thorough analyses. A very well organised and complete one is provided in Billingsley’s monograph [5] that treats, in the first part, the discrete-time homogeneous Markov chains and in the second part the continuous-time chains by resorting to the canonical embedded process. A companion reference is article [6] that provides a very complete set of references on the subject until 1961. In order to obtain consistency and asymptotic normality results for the maximum likelihood estimators, the author assumes, as usual, stringent regularity assumptions in particular on the intensities.
The statistics of Markov chain models for multiple-state models is usually performed under simplifying assumptions on the model. For instance, in [7], the intensities are supposed to be constant in selected time intervals, and observations are chosen for which the exact age belongs to a given selected time interval. Another set of simplifying assumptions is proposed in [8] (pp. 126–128): at first, the transition functions are approximated in a one-year period interval by a one-sided Stieltjes interval, and then, using these approximations, the transition intensities, with adequate analytical properties, are obtained by minimising a sum-of-squares objective function. The method proposed in [4] (pp. 147–169) also has two steps; in the first step, the transition intensities are supposed to be constant in one-year period intervals and are estimated with a maximum likelihood approach. There is then a second step, called graduation—a method generally described in [9]—that fits the parameters of an exponential functional intensity using generalised linear models. The method is applied to real data, and it becomes clear that several particular ad hoc assumptions in the method are inevitable in order to deal with specific properties of the data. The simplifying assumption of transition intensities constant in each one-year period is also taken in [10] (pp. 683–690), where a detailed treatment of an example is also presented; in a commentary, the authors also refer to the need for a graduation procedure to obtain the final intensities.
The excellent review work [11] illustrates the manner in which multiple-state Markov and semi-Markov models can be used for the actuarial modelling of health insurance policies. The bivariate character of the Markov process naturally associated with a semi-Markov model is useful whenever the durational effects are not negligible but, in contrast, is technically much more difficult to handle than the univariate Markov process. Considering discrete-time semi-Markov processes, the authors in [12] study semi-Markov jump linear systems—hybrid dynamical systems consisting of a family of subsystem modes and a semi-Markov process that orchestrates the switching between them—with bounded sojourn times, in order to provide sufficient criteria for the stability and stabilisation problems with respect to a specified approximation error. The companion work [13] enlarges the previous model by considering delay and, by means of a novel Lyapunov–Krasovskii functional and using the probability structure of the semi-Markov switching signal, presents sufficient stability conditions for the considered systems in terms of a set of linear matrix inequalities and a proper semi-Markov switching condition. It thus becomes clear that a natural extension of our work would be to consider semi-Markov models instead of Markov chains.
The estimation–calibration methodology we propose in this work is applied to continuous piecewise linear intensities. Since in health insurance and long-term-care multiple-state models, the intensities are usually of Gompertz–Makeham type (see [4] pp. 21, 24, 101), we previously showed that the distance between two transition probability matrices—in the sense of some matrix norm—is bounded by the same distance between the correspondent intensity matrices, thus showing that the Gompertz–Makeham functional form for the intensities is not really necessary.
We now refer to some works with contributions to the topic of estimation–calibration of multiple-state models, also called multi-state models. The work [14] proposes a review of multiple-state models via continuous-time Markov chains, signalling the usual approach for the non-homogeneous case of considering piecewise constant intensities. It is, at first reading, a most interesting review paper with applications to real data comparing different model analyses. Ref. [15] deals with a nonparametric approach to statistical inference in non-homogeneous Markov processes based on counting processes for transition intensities—namely, using the so-called Nelson–Aalen estimator or the kernel smoothing estimator of Ramlau-Hansen—presenting a case study using this methodology. Ref. [16] can be seen as a continuation of the previously referred work. Besides reviewing methods for non-parametric estimation of transition probabilities, the authors study the case where semi-parametric Cox-type regression models are specified for the transition intensities whenever the development of the time-dependent covariates is specified. An illustration of the methods with data from a randomised clinical trial in patients with liver cirrhosis is also presented. Ref. [17] is an ancillary reference for graduating the transition intensities in a multiple-state model for permanent health insurance applications based on generalised linear models—with a random component based on independent Poisson response variables—in the case where the intensities are supposed to depend on some secondary variables. The work in [18] follows the preceding paper in the main intent of proposing a graduation method for the transition intensities of a non-homogeneous continuous-time Markov chain model. In the work [19], a comparison between discrete-time and continuous-time homogeneous Markov chain models is presented in order to assess the effect of unevenly spaced observations. Since the authors want to incorporate covariates in the model, this study also deals with a series of multinomial logit regressions for the discrete-time model and proportional hazard regressions for the covariates through transition intensity functions for the continuous-time model. Ref. [20] proposes a simplified multiple-state model and develops a generic estimation method for calculating the transition probabilities in a one-year multiple-state model based on disability prevalence rates; multiple logistic regression models are employed to estimate disability prevalence rates and the one-year recovery rates. In doing so, the authors assume three conditions on the ratio between the mortality rates of inactive and active people—and several other conditions used in the literature—that allow the necessary computations in the case treated, which concerns cross-sectional data measuring the disabled status of an individual at one point in time. The work [21] introduces a semi-parametric model that employs a logit function to capture the treatment intensities across two groups, aiming to estimate transition intensity rates within the framework of an illness–death model. Parameter estimation is conducted through an EM algorithm coupled with profile likelihood. Simulation studies presented in the text indicate that the proposed method is straightforward to apply and produces results comparable to those of the parametric model.
The study [22] examines the impact of part-time and full-time employment on health by employing a three-state Markov model—using piecewise constant forces, where the transition intensities are graduated using generalised linear models and assumed, at the start, to be equal per age level—and generalised linear models to refine the initial raw rates. Integration of the corresponding Chapman–Kolmogorov equations allows a comprehensive solution to be derived. As an application of the model, the effectiveness of a partial early-retirement incentive in the Netherlands is evaluated. The refined rates obtained indicate that working part-time does not necessarily correlate with improved health among the elderly.
In the present work, we also establish a result on regime-switching Markov chains, exploring the possibility of having, in the whole time period under study, intensities with several different functional forms (linear, exponential, etc.) in different subintervals of that time period. Our study of ordinary differential equations (ODE) with regimes—which began with the work [23] and was further developed in [24]—is based on general results of existence and uniqueness of solutions of ODE with a non-regular second member, due to Caratheodory and Wintner for existence and to Osgood for uniqueness, among others. With a non-regular second member, the regimes appear clearly in the solutions of the Kolmogorov equations due, for example, to possible discontinuities in the entries of the intensity matrix.
The general theory of existence of solutions of Kolmogorov equations is exhaustively treated in the works of Feinberg, Shiryaev and Mandava (see [25,26,27]). Their powerful results apply to jump processes with values in a general Borel space and so are also transferable to non-homogeneous Markov chains with a finite state space. Since we are dealing with this finite state space—due to our interest in health insurance and long-term-care multiple-state models—we chose to present a simpler approach that requires only classical existence theorems for ordinary differential equations, namely Caratheodory's existence theorem and Osgood's uniqueness theorem.
We now succinctly describe the contents of this work.
  • In Section 2, we develop the subject of regime-switching Markov chains. The results obtained can be applied to the consideration of discontinuous intensity matrices.
  • In Section 3, we deal with the approximation of matrices of transition probabilities given an approximation of the correspondent matrices of intensities.
  • Section 4, Section 5 and Section 6 detail, with an example, the methodology for estimation–calibration proposed and present an analysis of the results obtained.
  • In Section 7, we provide a discussion of the results obtained in the example treated, and in Section 8, we summarise all the results obtained in this work.
There are three main contributions of this work. The first is the proposal of a method to estimate the parameters of a set of transition intensities from ideal observed data. The second is a result on regime-switching Markov chains that establishes the possibility of considering transition intensities made up of different sorts of functional forms, with each one of the functional forms depending on different sets of parameters. Finally, the third contribution is a result that quantifies the norm of the difference of two probability transition matrices in terms of the norm of the corresponding matrices of transition intensities; this last result justifies the choice of arbitrary functional forms for the transition intensities in ways more adequate for parameter estimation.

2. Regime Switching Markov Chains

In this section, we develop the formalism of regime switching for Markov chains, in which the transition probabilities are derived from intensities that, at a certain point in time, can change either in functional form or in the parameters. The consideration of discontinuous piecewise linear intensities suggests the study of Markov chains in continuous time with regimes. Let us state some preliminary notations and results for context purposes (see [28]). Firstly, we recall the definition of an intensity matrix Q ( t , θ ) .
Definition 1.
Let $\mathcal{L}(\mathbb{R}^{d\times d})$ be the space of $d\times d$ square matrices with coefficients in $\mathbb{R}$. A function $Q:[0,+\infty[\,\to\mathcal{L}(\mathbb{R}^{d\times d})$, denoted by
$$Q(t,\theta)=\left[\mu^{\theta}_{ij}(t)\right]_{i,j=1,\dots,d},$$
with $\theta\in\Theta\subseteq\mathbb{R}^{p}$ a parameter, is a transition intensity matrix if, for almost all $t\geq 0$, it verifies:
(i) $\forall i=1,\dots,d,\ \forall t\geq 0,\ \mu^{\theta}_{ii}(t)\leq 0$;
(ii) $\forall i,j=1,\dots,d,\ i\neq j,\ \forall t\geq 0,\ \mu^{\theta}_{ij}(t)\geq 0$;
(iii) $\forall i=1,\dots,d,\ \forall t\geq 0,\ \sum_{j=1}^{d}\mu^{\theta}_{ij}(t)=0$.
Secondly, we recall the Kolmogorov ordinary differential Equations (ODE) for non-homogeneous continuous-time Markov chains. These equations, upon integration, give the matrix of transition probabilities P ( x , t , θ ) as a function of the matrix of intensities Q ( t , θ ) , where θ Θ is a parameter. The forward Kolmogorov ODE can be represented in the following form:
$$\frac{\partial P}{\partial t}(x,t,\theta)=P(x,t,\theta)\,Q(t,\theta),\qquad P(x,x)=I,$$
or, in integrated form, by
$$P(x,t,\theta)=I+\int_{[x,t]}P(x,s,\theta)\,Q(s,\theta)\,ds.$$
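As a minimal numerical sketch of how the forward equation in Formula (1) can be integrated in practice, the following code uses a fixed-step Euler scheme on a hypothetical two-state intensity matrix of our own; it is an illustration, not one of the models used later in the paper.

```python
# A minimal sketch (our illustration, not the paper's code) of integrating the
# forward Kolmogorov ODE dP/dt = P(x,t) Q(t), P(x,x) = I, with a fixed-step
# Euler scheme. The two-state intensity matrix q(t) below is hypothetical.
import numpy as np

def q(t):
    # Time-dependent intensity matrix: off-diagonals >= 0, rows sum to zero.
    a = 0.05 + 0.01 * t     # hypothetical intensity for the transition 1 -> 2
    b = 0.02                # hypothetical intensity for the transition 2 -> 1
    return np.array([[-a, a],
                     [b, -b]])

def solve_kolmogorov(q, x, t_end, n_steps=10_000):
    """Integrate dP/dt = P Q(t) from time x to t_end with Euler steps."""
    P = np.eye(q(x).shape[0])          # initial condition P(x, x) = I
    ts = np.linspace(x, t_end, n_steps + 1)
    h = ts[1] - ts[0]
    for t in ts[:-1]:
        P = P + h * P @ q(t)           # Euler update
    return P

P = solve_kolmogorov(q, x=0.0, t_end=10.0)
print(P)              # an approximation of the transition probability matrix P(0, 10)
print(P.sum(axis=1))  # row sums remain equal to 1, as they should for probabilities
```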
Finally, let us now deal with regime-switching Markov chains.
The general motivation for the study of regime switching in ODE can be seen in [23,24]. Let us elaborate on the motivation to study regime-switching Markov chains. Suppose that we have two continuous-time Markov chains, having the same state space, with intensities of different functional forms—such as piecewise constant, affine or of the Gompertz–Makeham type, etc.—depending on two sets of parameters, say $\Theta_1$ and $\Theta_2$, respectively, defined in two contiguous time intervals, say $[0,t_1]$ and $[t_1,t_2]$. The result we prove shows that there exists a well-defined Markov chain in the interval $[0,t_2]$, with intensities depending on a set of parameters $\Theta:=\Theta_1\cup\Theta_2$, such that the transition probabilities of this Markov chain—obtained by solving the Kolmogorov equations—coincide with the transition probabilities of the first initial Markov chain in $[0,t_1]$ and also coincide with the transition probabilities of the second initial Markov chain in $[t_1,t_2]$. This result will grant us greater latitude in the choice of the functional forms of the intensities for estimation purposes, since it will be possible to partition a time interval of interest into two or more disjoint intervals and to have intensities, in each one of the intervals, possibly of different functional forms and with different sets of parameters. In Theorem 4, we consider extended solutions of an ODE in the sense of Carathéodory. For that, following [29] (pp. 41–44), we consider the definition of an extended solution of a differential equation.
Definition 2
(Extended solution of an ODE). For $f(t,\mathbf{y}):I\times D\to\mathbb{R}^{d\times d}$ a not necessarily continuous function, with $I\subseteq[0,+\infty[$ and $D\subseteq\mathbb{R}^{d\times d}$, and a differential equation given by
$$Y'(t)=f(t,Y(t)),\qquad Y(0)=y_0\in\mathbb{R}^{d\times d},$$
or, in the equivalent integral form, with $du$ the appropriate Lebesgue measure in $\mathbb{R}$,
$$Y(t)=y_0+\int_0^t f(u,Y(u))\,du,$$
an extended solution $Y(t)$ of the ODE in Formula (3) is an absolutely continuous function $Y(t)$ such that $f(t,Y(t))\in D$ for $t\in I$ and Formula (3)—or, equivalently, Formula (4)—is verified for all $t\in I$ almost everywhere (a.e.), that is, possibly with the exception of a set of null Lebesgue measure in $[0,+\infty[$.
We now recall Caratheodory’s existence theorem—see [29] (p. 43) for the unidimensional result and [30] (pp. 28–29) for the multidimensional result, with a proof via Schauder’s fixed point theorem—in the context of the model we are studying, a theorem that ensures the existence of an extended solution under general conditions.
Theorem 1
(Caratheodory's existence theorem). Suppose that $f(t,\mathbf{y}):I\times D\to\mathbb{R}^{d\times d}$, with $I=[0,u[$ an open set of $[0,+\infty[$ and $D$ an open set of $\mathbb{R}^{d\times d}$, verifies that:
(i) $f(t,\mathbf{y})$ is measurable in the variable $t$, for fixed $\mathbf{y}$, and continuous in the variable $\mathbf{y}$, for fixed $t$, for $(t,\mathbf{y})\in I\times D$;
(ii) for each compact set $K\subset D$ and $T>0$, there exists a Lebesgue integrable function $\lambda(t)$ such that $\|f(t,\mathbf{y})\|\leq\lambda(t)$ for $(t,\mathbf{y})\in[0,T]\times K$.
Then, for every $(t_0,y_0)\in I\times D$ such that $Y(t_0)=y_0$, that is, a given initial condition of the equation in Formula (3), there exists an extended solution according to Definition 2, defined in a neighbourhood of $(t_0,y_0)$.
Despite the fact that Theorem 1 only guarantees the local existence of an extended solution, it is always possible to consider a maximal extension of this solution, possibly to a larger time interval (see [30] pp. 29–30).
Theorem 2
(Maximal time interval for existence). With the notations and under the hypotheses of Theorem 1, any existing solution $Y$ admits a continuation $\tilde{Y}$ to a maximal time interval of existence, say $[a,b]$, such that, $\partial D$ being the boundary of $D$:
$$\lim_{t\to a}\tilde{Y}(t)\in\partial D\quad\text{and}\quad\lim_{t\to b}\tilde{Y}(t)\in\partial D.$$
Remark 1
(Applying Caratheodory's theorem to the Kolmogorov ODE). The Kolmogorov ODE for continuous-time Markov chains, in Formula (1), falls under this formalism in the following way:
$$f(t,\mathbf{y},\theta)=Q(t,\theta)\cdot\mathbf{y},$$
which is essentially the equation in Formula (3) with the possibility of dependence on a parameter $\theta\in\Theta$. Consider a matrix norm $\|\cdot\|$ in the sense of [31] (p. 340), that is, a submultiplicative norm—such as the $l_1$ norm and the $l_2$ norm, also known as the Frobenius norm—and observe that, since
$$\|f(t,\mathbf{y},\theta)\|=\|Q(t,\theta)\cdot\mathbf{y}\|\leq\|Q(t,\theta)\|\,\|\mathbf{y}\|,$$
and since any norm of a probability intensity matrix is bounded, we can apply Caratheodory's theorem to the Kolmogorov ODE under the condition that there exists a Lebesgue integrable function $\lambda(t)$ such that
$$\sup_{\theta\in\Theta}\|Q(t,\theta)\|\leq\lambda(t),$$
for all $t\in[0,T]$. We will see in Remark 3 that the condition in Formula (7) is also sufficient to ensure the unicity of the extended solutions.
Remark 2
(Existence of extended solutions: alternative proof). We could also quote Wintner's theorem, referred to in [32], which states that if $\|f(t,\mathbf{y})\|\leq N(t)\,L(\|\mathbf{y}\|)$ with $N$ and $L$ piecewise continuous, positive and $L$ non-decreasing, such that for some $c>0$
$$\int_{c}^{+\infty}\frac{1}{L(s)}\,ds=+\infty,$$
then the ODE in Formula (3) has a solution for a given initial condition. We observe that the quoted theorem remains valid—under the assumption that $f(t,\mathbf{y})$ is continuous with the possible exception of points of a null Lebesgue set of the time variable—by considering extended solutions instead of usual solutions with a continuous derivative, and with $L(t)$ and $N(t)$ both satisfying the hypotheses of Wintner's theorem for the parameters of the intensity functions, each taking two (or several) distinct values in two (or several) complementary intervals of the time domain.
We now quote a unicity result—from [30] p. 30—applicable whenever an extended solution in the Caratheodory sense exists.
Theorem 3
(Unicity of extended solutions). Suppose that $f(t,\mathbf{y}):I\times D\to\mathbb{R}^{d\times d}$, with $I=[0,u[$ an open set of $[0,+\infty[$ and $D$ an open set of $\mathbb{R}^{d\times d}$, verifies the conditions in Theorem 1 and, moreover, that for each compact set $K\subset D$ and $T>0$, there exists a Lebesgue integrable function $\lambda_K(t)$ such that
$$\|f(t,\mathbf{y}_1)-f(t,\mathbf{y}_2)\|\leq\lambda_K(t)\,\|\mathbf{y}_1-\mathbf{y}_2\|,$$
for $(t,\mathbf{y}_1),(t,\mathbf{y}_2)\in[0,T]\times K$. Then, for every $(t_0,y_0)\in I\times D$ such that $Y(t_0)=y_0$, that is, a given initial condition of the equation in Formula (3), there exists a unique extended solution $Y$ according to Definition 2, defined in a neighbourhood of $(t_0,y_0)$. The domain of definition of $Y$ is open, and $Y$ is continuous in this domain.
Remark 3
(Applying the unicity result to the Kolmogorov ODE). With the interpretation given in Formula (5), and if the norm is a matrix norm, similarly to what we had in Formula (6), a sufficient condition for the unicity of the extended solutions of the Kolmogorov ODE is that, for each compact set $K\subset D$ and $T>0$, there exists a Lebesgue integrable function $\lambda_K(t)$ such that $\|Q(t,\theta)\|\leq\lambda_K(t)$, thus implying that
$$\|f(t,\mathbf{y}_1,\theta)-f(t,\mathbf{y}_2,\theta)\|=\|Q(t,\theta)\cdot(\mathbf{y}_1-\mathbf{y}_2)\|\leq\|Q(t,\theta)\|\,\|\mathbf{y}_1-\mathbf{y}_2\|\leq\lambda_K(t)\,\|\mathbf{y}_1-\mathbf{y}_2\|,$$
which is the hypothesis bound in Formula (8) for Theorem 3.
Remark 4
(On the unicity of the extended solutions). Either directly using Theorem 18.4.13 in [33] (p. 337) or using Osgood's uniqueness theorem—as presented, for instance, in [34] (p. 58) or in [35] (pp. 149–151)—we may also conclude that the extended solution, which we know to exist, is unique, in the sense that two solutions may only differ on a set of Lebesgue measure equal to zero.
Remark 5
(On the numerical computation of extended solutions). We observe that these existence and uniqueness results are essential for a numerical integration of the ODE, but that no result on numerical convergence is implied—in the existence and uniqueness results above—for the regime switching ODE with discontinuous coefficients. Nevertheless, the Lipschitz condition with respect to the y variable—such as the one in Formula (8)—is sufficient for the convergence of the Euler method (see [36] p. 74).
The next theorem is a simple example of a regime-switching result for continuous-time Markov chains. The extension of this result to more than two regimes is straightforward. We consider the Kolmogorov ODE in a time interval [ 0 , T ] .
Theorem 4
(Regime-switching continuous-time Markov chains). Let $\|\cdot\|$ denote a matrix norm, let $\Theta$ denote a parameter set and let $Q_1(t,\theta)$, defined for $t\in[0,t_1]$, and $Q_2(t,\theta)$, defined for $t\in[t_1,T]$, be two intensity matrices such that, for $\lambda(t)$ an integrable function defined in $[0,T]$, we have, for $t\in[0,T]$,
$$\max\left\{\sup_{\theta\in\Theta}\|Q_1(t,\theta)\|,\ \sup_{\theta\in\Theta}\|Q_2(t,\theta)\|\right\}\leq\lambda(t).$$
Then, there exists $\tilde{P}(t,\theta)$ such that:
1. In $[0,t_1]$, we have that $\tilde{P}\equiv P_1$ a.e. in $t$, where $P_1$ is a solution of the Cauchy problem $\frac{\partial P_1}{\partial t}(t,\theta)=P_1\,Q_1(t,\theta)$ with the usual initial conditions;
2. In $[t_1,T]$, we have that $\tilde{P}\equiv P_2$ a.e. in $t$, where $P_2$ is a solution of the Cauchy problem $\frac{\partial P_2}{\partial t}(t,\theta)=P_2\,Q_2(t,\theta)$ with the initial conditions given by $P_1(t_1,\theta)$;
3. $\tilde{P}$ is a transition probability matrix.
Proof. 
Let $Q_1(t,\theta)=\left[\mu^{1,\theta}_{ij}(t)\right]_{i,j=1,\dots,d}$ and $Q_2(t,\theta)=\left[\mu^{2,\theta}_{ij}(t)\right]_{i,j=1,\dots,d}$. If we define
$$Q(t,\theta)=\left[\mu^{\theta}_{ij}(t)\right]_{i,j=1,\dots,d}:=Q_1(t,\theta)\,\mathbb{1}_{[0,t_1]}(t)+Q_2(t,\theta)\,\mathbb{1}_{]t_1,T]}(t),$$
we will have that
$$\mu^{\theta}_{ij}(t)=\mu^{1,\theta}_{ij}(t)\,\mathbb{1}_{[0,t_1]}(t)+\mu^{2,\theta}_{ij}(t)\,\mathbb{1}_{]t_1,T]}(t),$$
and we can immediately verify that $Q(t,\theta)$ is an intensity matrix on $[0,T]$ according to Definition 1. Moreover, since by Formula (10) and the definition in Formula (11) we have that
$$\|Q(t,\theta)\|\leq\lambda(t),$$
we can let $\tilde{P}(t,\theta)$ be the unique solution of the Kolmogorov equation $\frac{\partial\tilde{P}}{\partial t}=\tilde{P}\,Q$ on $[0,T]$ with the usual initial conditions. It is then clear that $\tilde{P}(t,\theta)$ is a transition probability matrix. Furthermore, if we define
$$\hat{P}(t,\theta):=P_1(t,\theta)\,\mathbb{1}_{]0,t_1[}(t)+P_2(t,\theta)\,\mathbb{1}_{]t_1,T[}(t),$$
we will then have, using the hypothesis, that
$$\frac{\partial\hat{P}}{\partial t}(t,\theta)\overset{\mathrm{a.e.}}{=}\frac{\partial P_1}{\partial t}(t,\theta)\,\mathbb{1}_{]0,t_1[}(t)+\frac{\partial P_2}{\partial t}(t,\theta)\,\mathbb{1}_{]t_1,T[}(t)\overset{\mathrm{a.e.}}{=}P_1\,Q_1(t,\theta)\,\mathbb{1}_{]0,t_1[}(t)+P_2\,Q_2(t,\theta)\,\mathbb{1}_{]t_1,T[}(t)\overset{\mathrm{a.e.}}{=}\left(P_1(t,\theta)\,\mathbb{1}_{]0,t_1[}(t)+P_2(t,\theta)\,\mathbb{1}_{]t_1,T[}(t)\right)\left(Q_1(t,\theta)\,\mathbb{1}_{[0,t_1]}(t)+Q_2(t,\theta)\,\mathbb{1}_{]t_1,T]}(t)\right)\overset{\mathrm{a.e.}}{=}\hat{P}(t,\theta)\,Q(t,\theta),$$
which shows that $\hat{P}(t,\theta)\equiv\tilde{P}(t,\theta)$ a.e. in $t$ and, as a consequence, that $\tilde{P}(t,\theta)\equiv\hat{P}(t,\theta)\equiv P_1$ in $[0,t_1]$ a.e. and $\tilde{P}(t,\theta)\equiv\hat{P}(t,\theta)\equiv P_2$ in $[t_1,T]$ a.e. □
We present in Figure 1 graphical representations of transition probabilities obtained by numerical integration of Kolmogorov equations with discontinuous piecewise linear intensities for a four-state Markov chain with intensity matrix given in Formula (16).
State one is the healthy state, state four is an absorbing state corresponding to death, and states two and three are intermediate dependent states. This representation, aside from being an illustration of a regime-switching Markov chain, also illustrates the possible extreme effects of a regime switch produced by discontinuous entries of the intensity matrix. The subject of Markov chains with regimes has, as can easily be observed, an interest independent of the objective that motivates us to study it. However, in the context of the present work, it can be a way to justify a more efficient and robust parameter estimation (or calibration) process by an adequate choice of functional forms for the intensities.
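To make the construction in Theorem 4 concrete, the following sketch—with two hypothetical two-state intensity regimes of our own choosing—glues $Q_1$ and $Q_2$ at $t_1$, integrates the Kolmogorov ODE once over the whole interval, and checks that the result agrees with the two-stage integration whose second stage starts from $P_1(t_1,\theta)$.

```python
# A sketch of Theorem 4 with two hypothetical two-state intensity regimes: the
# glued intensity is integrated once over [0, T] and compared with the two-stage
# integration whose second stage starts from P1(t1).
import numpy as np

def q1(t):  # regime on [0, t1] (hypothetical affine intensity)
    a = 0.05 + 0.002 * t
    return np.array([[-a, a], [0.03, -0.03]])

def q2(t):  # regime on [t1, T] (abrupt change of functional form at t1)
    return np.array([[-0.20, 0.20], [0.03, -0.03]])

t1, T = 5.0, 10.0

def q_glued(t):
    return q1(t) if t <= t1 else q2(t)

def euler(q, x, t_end, P0, n=20_000):
    ts = np.linspace(x, t_end, n + 1)
    h = ts[1] - ts[0]
    P = P0.copy()
    for t in ts[:-1]:
        P = P + h * P @ q(t)
    return P

P_glued = euler(q_glued, 0.0, T, np.eye(2))     # one pass with the glued intensity
P1 = euler(q1, 0.0, t1, np.eye(2))              # first regime on [0, t1]
P2 = euler(q2, t1, T, P1)                       # second regime, started from P1(t1)
print(np.max(np.abs(P_glued - P2)))             # small, up to the discretisation error
```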

3. On the Approximation of Intensities and Corresponding Transition Probabilities

One way to simplify the estimation of intensities of Markov chain models in continuous time—relevant for health insurance and long-term-care models—is to replace the usual Gompertz–Makeham type intensities—containing exponential and linear terms and therefore being difficult to estimate—by continuous piecewise linear intensities. In this sense, it is important to have a result that controls the distance between two matrices of transition probabilities resulting from the integration of the Kolmogorov equations for the correspondent two matrices of intensities.
It is known (see, for example, [37] p. 264 and [1]) that we can represent the transition probabilities in the Hostinsky form:
$$P(x,t,\theta)=I+\sum_{n=1}^{+\infty}\int_{[x,t]}\int_{[t_1,t]}\cdots\int_{[t_{n-1},t]}Q(t_1,\theta)\,Q(t_2,\theta)\cdots Q(t_n,\theta)\,dt_n\cdots dt_1,$$
where the right-hand member only depends on the intensities and where the series converges uniformly. The following theorem—akin to a multidimensional Gronwall-type inequality—is a natural result.
Theorem 5
(Dependence of the transition probabilities on the intensities). Let $\|\cdot\|$ be a matrix norm and let $Q_1(t,\theta)$ and $Q_2(t,\theta)$ be two matrices of intensities with norm bounded by $M>0$ in $[0,T]$. Define
$$\epsilon_{Q_1,Q_2}:=\sup_{t\in[0,T],\ \theta\in\Theta}\|Q_1(t,\theta)-Q_2(t,\theta)\|.$$
Then, we have that
$$\sup_{t\in[x,T],\ \theta\in\Theta}\|P_1(x,t,\theta)-P_2(x,t,\theta)\|\leq\epsilon_{Q_1,Q_2}\,\frac{e^{M|T-x|}-1}{M},$$
where $P_1(x,u,\theta)$ and $P_2(x,u,\theta)$ are the solutions of the Kolmogorov equations—given in Formula (1)—with matrices of intensities $Q_1(t,\theta)$ and $Q_2(t,\theta)$, respectively.
Proof. 
The proof of a result of this type for an ordinary differential equation $y'(t)=f(t,y(t))$ satisfying the condition
$$|f(t,y_1(t))-f(t,y_2(t))|\leq\lambda(t)\,|y_1(t)-y_2(t)|,$$
where $\lambda$ is integrable, is immediate from the integral representation of the differential equation. So we will use the integral representation given by Formula (12). The following well-known result (see [38] p. 217 and, for a proof, [28] p. 348) will be instrumental.
Lemma 1.
Let $q:\mathbb{R}^{+}\to\mathbb{R}$ be a measurable function integrable over every bounded interval of $\mathbb{R}^{+}$. Then, we have that
$$\int_{s}^{t}\int_{s_1}^{t}\cdots\int_{s_{n-1}}^{t}q(s_1)\,q(s_2)\cdots q(s_n)\,ds_n\cdots ds_2\,ds_1=\frac{\left(\int_{s}^{t}q(u)\,du\right)^{n}}{n!},$$
for all $0\leq s\leq t$, $n\geq 1$.
Let us show, by induction, that if $\|Q_1(t,\theta)\|\leq M$ and $\|Q_2(t,\theta)\|\leq M$ for some $0<M<+\infty$ then, using the hypothesis in Formula (13), we have that
$$\left\|\prod_{k=1}^{n}Q_1(t_k,\theta)-\prod_{k=1}^{n}Q_2(t_k,\theta)\right\|\leq\max_{k=1,\dots,n}\left\|Q_1(t_k,\theta)-Q_2(t_k,\theta)\right\|\cdot M^{n-1}\leq\epsilon_{Q_1,Q_2}\,M^{n-1}.$$
In fact, for the first order bound we have that
$$Q_1(t_1,\theta)\,Q_1(t_2,\theta)-Q_2(t_1,\theta)\,Q_2(t_2,\theta)=Q_1(t_1,\theta)\,Q_1(t_2,\theta)-Q_1(t_1,\theta)\,Q_2(t_2,\theta)+Q_1(t_1,\theta)\,Q_2(t_2,\theta)-Q_2(t_1,\theta)\,Q_2(t_2,\theta)=Q_1(t_1,\theta)\left(Q_1(t_2,\theta)-Q_2(t_2,\theta)\right)+\left(Q_1(t_1,\theta)-Q_2(t_1,\theta)\right)Q_2(t_2,\theta),$$
and then it follows, using the matrix norm hypothesis, that
$$\left\|Q_1(t_1,\theta)\,Q_1(t_2,\theta)-Q_2(t_1,\theta)\,Q_2(t_2,\theta)\right\|\leq M\left\|Q_1(t_2,\theta)-Q_2(t_2,\theta)\right\|+M\left\|Q_1(t_1,\theta)-Q_2(t_1,\theta)\right\|\leq M\cdot\max_{k=1,2}\left\|Q_1(t_k,\theta)-Q_2(t_k,\theta)\right\|.$$
Consider now, for clarity, the next induction step, the second order bound.
$$Q_1(t_1,\theta)\,Q_1(t_2,\theta)\,Q_1(t_3,\theta)-Q_2(t_1,\theta)\,Q_2(t_2,\theta)\,Q_2(t_3,\theta)=Q_1(t_1,\theta)\,Q_1(t_2,\theta)\,Q_1(t_3,\theta)-Q_1(t_1,\theta)\,Q_2(t_2,\theta)\,Q_2(t_3,\theta)+Q_1(t_1,\theta)\,Q_2(t_2,\theta)\,Q_2(t_3,\theta)-Q_2(t_1,\theta)\,Q_2(t_2,\theta)\,Q_2(t_3,\theta)=Q_1(t_1,\theta)\left(Q_1(t_2,\theta)\,Q_1(t_3,\theta)-Q_2(t_2,\theta)\,Q_2(t_3,\theta)\right)+\left(Q_1(t_1,\theta)-Q_2(t_1,\theta)\right)Q_2(t_2,\theta)\,Q_2(t_3,\theta).$$
Again it follows, from the matrix norm hypothesis and using the previous order one bound, that
$$\left\|Q_1(t_1,\theta)\,Q_1(t_2,\theta)\,Q_1(t_3,\theta)-Q_2(t_1,\theta)\,Q_2(t_2,\theta)\,Q_2(t_3,\theta)\right\|\leq M\left\|Q_1(t_2,\theta)\,Q_1(t_3,\theta)-Q_2(t_2,\theta)\,Q_2(t_3,\theta)\right\|+M^{2}\left\|Q_1(t_1,\theta)-Q_2(t_1,\theta)\right\|\leq M^{2}\cdot\max_{k=1,2,3}\left\|Q_1(t_k,\theta)-Q_2(t_k,\theta)\right\|.$$
The induction proof is now clear. Now, by using Formulas (12) and (15) and Lemma 1, we have the following bound for the norm of the difference of the two transition probability matrices:
$$\begin{aligned}\left\|P_1(x,t,\theta)-P_2(x,t,\theta)\right\|&\leq\sum_{n=1}^{+\infty}\int_{[x,t]}\int_{[t_1,t]}\cdots\int_{[t_{n-1},t]}\left\|\prod_{k=1}^{n}Q_1(t_k,\theta)-\prod_{k=1}^{n}Q_2(t_k,\theta)\right\|dt_n\cdots dt_1\\&\leq\frac{1}{M}\sum_{n=1}^{+\infty}\int_{[x,t]}\int_{[t_1,t]}\cdots\int_{[t_{n-1},t]}\epsilon_{Q_1,Q_2}\,M^{n}\,dt_n\cdots dt_1\\&\leq\frac{1}{M}\sum_{n=1}^{+\infty}\int_{[x,t]}\int_{[t_1,t]}\cdots\int_{[t_{n-1},t]}\left(\epsilon_{Q_1,Q_2}^{1/n}\,M\right)\times\cdots\times\left(\epsilon_{Q_1,Q_2}^{1/n}\,M\right)dt_n\cdots dt_1\\&=\frac{1}{M}\,\epsilon_{Q_1,Q_2}\sum_{n=1}^{+\infty}\frac{M^{n}|t-x|^{n}}{n!}=\epsilon_{Q_1,Q_2}\,\frac{e^{M|T-x|}-1}{M},\end{aligned}$$
thus proving the result, in Formula (14). □
A result like the one given by Theorem 5 is expected to simplify the estimation of the parameters θ Θ that allow the fitting of a Markov chain model to real data coming, for example, from multi-state models for health insurance or long-term care, which was the case discussed in [1].
Remark 6
(Applying Theorem 5). The applicability of Formula (14) requires that $M|T-x|\leq 1$; if not, the result is of no use. The usefulness of the result relies on the possibility of localising the computation of the solutions of the Kolmogorov ODE. Once time units are chosen—let us say, one year—this is achieved by solving, successively, the Kolmogorov differential equations in time intervals $[x_k,x_{k+1}]$ of length $|x_{k+1}-x_k|\leq 1$. In doing so, the final values of the transition probability matrix in one interval must be the initial values of the Kolmogorov ODE for the transition probability matrix in the immediately following time interval. Supposing the adequate hypotheses for the existence and unicity of solutions of the Kolmogorov forward equations, we may then conclude that, if two intensity matrices are close to one another in some small time interval, the corresponding transition probability matrices will be close to one another in the same small time interval.
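A minimal sketch of this localisation, with a hypothetical two-state intensity of our own: the forward equation is solved year by year, each one-year solve starting from the matrix produced by the previous one, and the chained result is compared with a single solve over the whole horizon.

```python
# A sketch of the localisation in Remark 6: the forward Kolmogorov ODE is solved
# year by year on [k, k+1], the final matrix of each year being the initial matrix
# of the next, and the chained result is compared with a single solve over [0, T].
# The two-state intensity below is a hypothetical illustration.
import numpy as np

def q(t):
    a = 0.05 + 0.01 * t
    return np.array([[-a, a], [0.02, -0.02]])

def euler(q, x, t_end, P0, n=2_000):
    ts = np.linspace(x, t_end, n + 1)
    h = ts[1] - ts[0]
    P = P0.copy()
    for t in ts[:-1]:
        P = P + h * P @ q(t)
    return P

T = 10
P = np.eye(2)
for k in range(T):                  # successive one-year solves on [k, k+1]
    P = euler(q, k, k + 1, P)       # the final value of one year starts the next
print(P)
print(euler(q, 0.0, float(T), np.eye(2), n=20_000))   # direct solve, for comparison
```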
An illustrative example of the usefulness of this result is given in Figure 2 for which the intensity matrix considered is the following.
$$\begin{pmatrix}\mu_{11}(t)&1.20135\times 10^{-5}\,e^{0.117(t+50)}+\frac{1}{200}&1.2\cdot\mu_{12}(t)&0.05\cdot\mu_{34}(t)\\0.7\cdot\mu_{12}(t)&\mu_{22}(t)&5.49958\times 10^{-6}\,e^{0.128(t+50)}+\frac{3}{500}&0.5\cdot\mu_{34}(t)\\0.36\cdot\mu_{12}(t)&0.3\cdot\mu_{23}(t)&\mu_{33}(t)&4.08902\times 10^{-6}\,e^{0.139(t+50)}+\frac{7}{1000}\\0&0&0&0\end{pmatrix}$$
That is, all the intensities are of Gompertz–Makeham type; moreover, $\mu_{12}(t)$, $\mu_{23}(t)$ and $\mu_{34}(t)$ were defined first, and then all the others were defined proportionally to these three; the determination of the parameters of these intensities from the data is the goal of an estimation–calibration procedure. The coefficients of $\mu_{12}(t)$, $\mu_{23}(t)$ and $\mu_{34}(t)$ were chosen having in mind a unit time of one year and an LTC model starting at the age of 50 years and going on until 100 years of age. The proportions between $\mu_{12}(t)$, $\mu_{23}(t)$, $\mu_{34}(t)$ and all the others can be calibrated using a discrete-time transition matrix, if available. The linear interpolations of the intensities were given at the six following points: $t=0,15,30,40,45,50$. The differences between the linearly interpolated intensities and the original Gompertz–Makeham intensities $\mu_{12}(t)$, $\mu_{23}(t)$ and $\mu_{34}(t)$ are shown in Figure 3.
The analysis of Figure 3 together with Figure 2 conveys an illustration of Theorem 5 and Remark 6 in the particular case of the approximation of Gompertz–Makeham intensities by linear interpolated ones.
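As a small numerical companion to this illustration (our sketch, not the authors' code), the Gompertz–Makeham intensity $\mu_{12}(t)$ of Formula (16) can be replaced by its continuous piecewise linear interpolation at the knots $t=0,15,30,40,45,50$ and the sup-norm distance between the two curves computed; this is the kind of quantity $\epsilon_{Q_1,Q_2}$ that enters the bound of Theorem 5.

```python
# Piecewise linear interpolation of the Gompertz-Makeham intensity mu_12(t) of
# Formula (16) at the knots t = 0, 15, 30, 40, 45, 50, and the sup-norm distance
# between the original and the interpolated curves.
import numpy as np

def mu12_gm(t):
    return 1.20135e-5 * np.exp(0.117 * (t + 50.0)) + 1.0 / 200.0

knots = np.array([0.0, 15.0, 30.0, 40.0, 45.0, 50.0])
values = mu12_gm(knots)

def mu12_pl(t):
    # Continuous piecewise linear interpolation through the knot values.
    return np.interp(t, knots, values)

grid = np.linspace(0.0, 50.0, 5_001)
eps = np.max(np.abs(mu12_gm(grid) - mu12_pl(grid)))
print(eps)
```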

4. Constructive Definition of CT-MC

The existence of a non-homogeneous continuous-time Markov chain can also be guaranteed by a known constructive procedure that we now present for completeness; it is most useful for simulation, and we will use it to define the estimation–calibration procedure proposed in this work. A reference for the following algorithmic definition of a Markov chain in continuous time is [37] (p. 266). For a proof of Theorem 6 below, see [38] (pp. 221–233). Let $\theta\in\Theta$ be a parameter.
Definition 3
(Constructive definition). Given a transition intensity matrix,
$$Q(t,\theta)=\left[\mu^{\theta}_{ij}(t)\right]_{i,j=1,\dots,d},$$
define
$$p(t,i,j)=\begin{cases}\left(1-\delta_{ij}\right)\dfrac{\mu^{\theta}_{ij}(t)}{-\mu^{\theta}_{ii}(t)},&\mu^{\theta}_{ii}(t)\neq 0,\\[4pt]\delta_{ij},&\mu^{\theta}_{ii}(t)=0,\end{cases}$$
with $\delta_{ij}$ Kronecker's delta. Let $X_0=i$, according to some initial distribution on $\{1,2,\dots,d\}$.
1. The jump sequence $(\tau_n)_{n\geq 0}$ of stopping times is defined by induction as follows: $\tau_0\equiv 0$.
2. $\tau_1$, the sojourn time in state $i$, which is also the time of the first jump, has an exponential distribution function given by
$$F_{\tau_1}(t)=\mathbb{P}\left[\tau_1\leq t\right]=1-\exp\left(\int_0^t\mu^{\theta}_{ii}(u)\,du\right).$$
We note that this distribution of the stopping time is mandatory as a consequence of a general result on the distribution of sojourn times of a continuous-time Markov chain (see Theorem 2.3.15 in [38] p. 221).
3. Given that the process is in state $i$, it may jump to state $j$ at time $\tau_1=s_1$ with probability $p(t,i,j)$ defined in Formula (17), that is,
$$\mathbb{P}\left[X_{s_1}=j\mid\tau_1=s_1,\ X_0=i\right]=p(s_1,i,j),$$
and so $X_t=i$ for $0=\tau_0\leq t<\tau_1$.
4. Given that $\tau_1=s_1$ and $X_{s_1}=j$, $\tau_2$, the time of the second jump, has an exponential distribution function,
$$F_{\tau_2\mid\tau_1=s_1}(t)=\mathbb{P}\left[\tau_2\leq t\mid\tau_1=s_1\right]=1-\exp\left(\int_0^t\mu^{\theta}_{jj}(u+s_1)\,du\right)$$
and
$$\mathbb{P}\left[X_{s_2}=k\mid\tau_1=s_1,\ X_0=i,\ \tau_2=s_2,\ X_{s_1}=j\right]=p(s_1+s_2,j,k),$$
and so $X_t=j$ for $\tau_1\leq t<\tau_2$.
The following theorem ensures that the preceding construction yields the desired result.
Theorem 6
(The continuous-time Markov chain). Let the intensity matrix be norm bounded by a Lebesgue integrable function in $[0,T]$. Then, given the jump times $(\tau_n)_{n\geq 1}$ and the sequence $(Y_n)_{n\geq 1}$ defined by $Y_n=X_{\tau_n}$, the process defined by
$$X_t=\sum_{n=0}^{+\infty}Y_n\,\mathbb{1}_{[\tau_n,\tau_{n+1}[}(t)=\sum_{n=0}^{+\infty}X_{\tau_n}\,\mathbb{1}_{[\tau_n,\tau_{n+1}[}(t)$$
is a continuous-time Markov chain with transition probabilities $P$ and transition intensities $Q$.
Proof. 
This theorem is stated and proved, in the general case of continuous-time Markov processes in [38] (p. 229). □
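The construction above is easy to turn into a simulator. The sketch below is a discretised approximation of Definition 3, assuming a hypothetical three-state intensity matrix of our own (not the matrix of Section 5): in each small time step the chain leaves its current state with probability approximately $-\mu^{\theta}_{ii}(t)\,h$ and, on leaving, chooses the destination with the jump probabilities $p(t,i,j)$ of Formula (17); trajectories are recorded in the (state, time spent, state, …) format used later.

```python
# A discretised sketch of the constructive definition (Definition 3), with a
# hypothetical three-state intensity matrix of our own. In each small step of
# length h the chain leaves state i with probability about -mu_ii(t) * h and, on
# leaving, picks the destination with the jump probabilities p(t, i, j) of
# Formula (17). Trajectories are returned in the (state, sojourn, state, ...) format.
import numpy as np

rng = np.random.default_rng(0)

def Q(t):
    # Hypothetical non-homogeneous intensity matrix; state 3 is absorbing.
    a = 0.04 + 0.002 * t
    return np.array([[-(a + 0.01), a, 0.01],
                     [0.02, -(0.02 + 0.02), 0.02],
                     [0.0, 0.0, 0.0]])

def simulate(T=50.0, h=1.0 / 365.0, start=0):
    state, entered, t, path = start, 0.0, 0.0, []
    while t < T and Q(t)[state, state] < 0.0:        # stop at T or in an absorbing state
        q = Q(t)
        if rng.random() < -q[state, state] * h:      # does the chain leave its state in (t, t+h]?
            probs = q[state].clip(min=0.0)
            probs = probs / probs.sum()              # jump probabilities p(t, state, j)
            path += [state + 1, t - entered]         # record the (1-indexed) state and its sojourn
            state, entered = int(rng.choice(len(probs), p=probs)), t
        t += h
    return path + [state + 1, t - entered]           # last visited state and its (censored) sojourn

print(simulate())
```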

5. Estimation–Calibration Procedure

In the following, we consider the set of procedures that allow us to obtain transition intensities from simulated data and then, by integration of the forward Kolmogorov ODE, the transition probabilities; these so-called estimated transition probabilities will be compared to the original transition probabilities that were used to generate the simulated data. The procedures comprise both non-parametric statistical estimation by kernel methods and the fitting of piecewise linear functions to data with additional constraints, a procedure more akin to calibration.
We will consider an ideal sample of complete data, represented in Figure 4; each line represents a trajectory: on the left-hand side we have the initial state, followed by the time spent in that state, then the new state and the time spent in that state, and so on.
Using the procedure detailed in Section 4, these data were generated with a full intensity matrix, that is a matrix of the form,
$$\begin{pmatrix}-(\mu_{12}+\mu_{13}+\mu_{14})&\mu_{12}&\mu_{13}&\mu_{14}\\\mu_{21}&-(\mu_{21}+\mu_{23}+\mu_{24})&\mu_{23}&\mu_{24}\\\mu_{31}&\mu_{32}&-(\mu_{31}+\mu_{32}+\mu_{34})&\mu_{34}\\0&0&0&0\end{pmatrix}$$
with the intensities given by
$$\begin{aligned}\mu_{12}&=3.47\cdot 10^{-6}\,e^{0.138(t+65)}+\tfrac{1}{2500}=\mu_{21},&\mu_{13}&=0.5\cdot\mu_{12},&\mu_{23}&=1.5\cdot\mu_{12},\\\mu_{14}&=0.0000758\,e^{0.087(t+65)}+\tfrac{1}{2000},&\mu_{24}&=1.4\cdot\mu_{14},&\mu_{34}&=1.8\cdot\mu_{14},\\\mu_{21}&=\mu_{12},&\mu_{31}&=0.1\cdot\mu_{21},&\mu_{32}&=0.4\cdot\mu_{21}.\end{aligned}$$
This set of intensities, used to generate the full data sample, was supposed to determine an LTC model with four states; the relations between the intensities reflect the qualitative relations describing the forces of transition that we suppose hold in this particular model.
For estimation–calibration purposes—following the results on the distance of transition probabilities—we will suppose that we are given the most tractable functional form of the intensities, depending on some parameters to be estimated. For instance, a continuous piecewise linear functional form, for which we can have, for $i\neq j$,
$$\mu^{\theta}_{ij}(t)=\sum_{k=1}^{r}\left(\theta^{k,1}_{ij}+\theta^{k,2}_{ij}\,t\right)\mathbb{1}_{[t_k,t_{k+1}[}(t),$$
with $0=t_0<t_1<\cdots<t_{r+1}=T$, and possibly some conditions on the parameters $\theta^{k,1}_{ij}$ and $\theta^{k,2}_{ij}$ if the intensity $\mu^{\theta}_{ij}(t)$ is supposed to be piecewise linear and continuous.
We stress again that the values in Figure 4 were simulated. For LTC, for example, real data will, at best, be given with time stamps of one day; the order of approximation will therefore have to be chosen by balancing precision—a sufficiently narrow interval around a given time—against having enough observations to perform the estimation.
Let us detail a methodology to identify the parameter $\theta\in\Theta$, inspired by the constructive definition of the Markov chain in Section 4 (Definition 3). The general idea of the methodology is as follows.
(i) Given a state $i$, we have to find a fitting for the distribution of the random times of $i\to i$ jumps (the time spent in state $i$). According to Formula (18), these times have the exponential-type distribution determined by $\mu^{\theta}_{ii}(t)$.
(ii) For every other state $j$, by using $\mathbb{P}\left[X_{s_1}=j\mid\tau_1=s_1,\ X_0=i\right]$, possibly with an approximation, by Formula (19), we can obtain an approximation of $p(s_1,i,j)$.
(iii) By using Formula (17) and the approximation obtained for $p(s_1,i,j)$, we can obtain an approximation for $\mu^{\theta}_{ij}(t)$.
(iv) Finally, we will fit a continuous piecewise linear intensity to $\mu^{\theta}_{ij}(t)$.
Let us detail the procedures for applying the methodology just described.
  • Recall that an observed trajectory has the following structure: (first state, time spent in state, second state, time spent in state, third state …). The maximum length of a trajectory in our sample is 11. Select all the trajectories of length greater than 3 that start at state i = 1 . If the next state is also i = 1 , the time spent in state—in this case, in state i = 1 —is the first part of the sample for obtaining μ 11 θ ( t ) . Select all the trajectories of length greater than 5 for which the second state is i = 1 ; this set of trajectories already contains the previous considered set of trajectories and so, if the third state is also i = 1 , the sum of the time spent in the first state and the time spent in the second state will be the second part of the sample for obtaining μ i i θ ( t ) . Repeat, successively, the procedure for all trajectories of length greater than 7, then of length greater than 9 and finally of length greater than 11 to obtain the full sample for the intensity μ 11 θ ( t ) .
  • Fit a smooth kernel distribution to the sample obtaining the intensity μ 11 θ ( t ) .
  • Repeat the procedure used for obtaining the sample for μ 11 θ ( t ) , but this time, select the transitions 1 2 , that is, the transitions from state i = 1 to state i = 2 . Fit a smooth kernel distribution to these data.
  • Now, we look for an estimate of $p(t,i,j)$ given by Formulas (17) and (19). For that, we will consider rounding the sojourn times—say, to the nearest unit, in order to have enough observations—and then group all observations of jumps from the first state according to this rounding. Consider then the observations towards state $i=2$. We will then have that
$$p(s_1,1,2)=\mathbb{P}\left[X_{s_1}=2\mid\tau_1=s_1,\ X_0=1\right]\approx\frac{\mathbb{P}\left[X_{s_1}=2,\ s_1-0.5\leq\tau_1<s_1+0.5\right]}{\mathbb{P}\left[s_1-0.5\leq\tau_1<s_1+0.5\right]}=\frac{\mathbb{P}\left[s_1-0.5\leq\tau_1<s_1+0.5\mid X_{s_1}=2\right]\cdot\mathbb{P}\left[X_{s_1}=2\right]}{\mathbb{P}\left[s_1-0.5\leq\tau_1<s_1+0.5\right]},$$
    and most of the terms in the last member of the formula will be estimable from the observations by using the smooth kernel distributions.
  • Resorting to Formula (17), we can compute values for $\mu^{\theta}_{12}(t)$ and fit a piecewise linear intensity. That is, using again Formula (17), since $\mu^{\theta}_{11}(s_1)\neq 0$, we have that, for an arbitrary time $t=s_1$,
$$\mu^{\theta}_{12}(s_1)\approx-\mu^{\theta}_{11}(s_1)\,\frac{p(s_1,1,2)}{1-\delta_{12}}=-\mu^{\theta}_{11}(s_1)\cdot p(s_1,1,2).$$
  • We consider a set of values of μ 12 θ ( s 1 ) , μ 12 θ ( s 2 ) , , μ 12 θ ( s k ) and then fit the multidimensional parameter θ Θ to these values (see Formula (23) for the case of a continuous piecewise linear intensity functional form).
  • These procedures are to be repeated in order to obtain the parameters of $\mu^{\theta}_{2j}$ for $j\neq 2$ and of $\mu^{\theta}_{3j}$ for $j\neq 3$.
  • The intensities $\mu^{\theta}_{jj}$ for $j=1,2,3$ are obtained in the usual way—from the row-sum condition in Definition 1—and are necessarily continuous piecewise linear, since they are sums of continuous piecewise linear functions. A compressed sketch of this pipeline is given just after this list.
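The following sketch compresses the procedure above into code, under simplifying assumptions of ours: `trajectories` is a list of (state, sojourn, state, …) records as in Figure 4, the sojourn-time density is estimated with a Gaussian kernel, its hazard rate gives $-\mu^{\theta}_{11}$, $p(s,1,2)$ is estimated by grouping sojourn times rounded to the year, and a continuous piecewise linear intensity with fixed knots is fitted by least squares; all helper names are ours, not the authors' code.

```python
# A compressed, hypothetical sketch of the estimation-calibration pipeline of this section.
import numpy as np
from scipy.stats import gaussian_kde

def sojourns_from_state(trajectories, i=1):
    """Collect (sojourn time in state i, state entered next) pairs."""
    pairs = []
    for tr in trajectories:
        for k in range(0, len(tr) - 2, 2):
            if tr[k] == i:
                pairs.append((tr[k + 1], tr[k + 2]))
    return pairs

def estimate_mu12(trajectories, grid):
    pairs = sojourns_from_state(trajectories, i=1)
    times = np.array([s for s, _ in pairs])
    kde = gaussian_kde(times)                              # smooth kernel fit of the sojourn times
    dens = kde(grid)
    cdf = np.array([kde.integrate_box_1d(0.0, g) for g in grid])
    hazard = dens / np.clip(1.0 - cdf, 1e-9, None)         # hazard rate, i.e. -mu_11(t)
    # p(s, 1, 2): among sojourns rounded to the year s, the fraction that jumped to state 2.
    p12 = np.array([np.mean([dest == 2 for s, dest in pairs if round(s) == round(g)] or [0.0])
                    for g in grid])
    return hazard * p12                                    # mu_12(s) ~ -mu_11(s) * p(s, 1, 2)

def fit_piecewise_linear(grid, values, knots):
    """Least-squares fit of a continuous piecewise linear function (hat-function basis)."""
    basis = np.column_stack([np.interp(grid, knots, np.eye(len(knots))[k])
                             for k in range(len(knots))])
    coef, *_ = np.linalg.lstsq(basis, values, rcond=None)
    return lambda t: np.interp(t, knots, coef)

# Hypothetical usage, with trajectories produced, e.g., by the simulator sketched in Section 4:
# grid = np.arange(0.5, 27.0, 1.0)
# knots = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 27.0])
# mu12_hat = fit_piecewise_linear(grid, estimate_mu12(trajectories, grid), knots)
```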

6. Results of the Estimation–Calibration Procedure

We present the results from the estimation procedure developed in accordance with the methodology proposed in Section 5.
The estimated matrix structure is a full matrix such as the one given in Formula (21), but with piecewise linear intensities, which are not very elucidative in themselves. It is preferable to look at a graphical representation of these intensities. In Figure 5, we have the estimated intensities (in blue) and the fitted piecewise linear intensities (in red). We can observe that, despite observable differences, the fitting is reasonably good, with the exception of the intensity $\mu_{21}$. This may be due to the fact that we only had seven observations in the sample, which is consistent with LTC data. We also observe that a fitting with an exponential-type density would give an unsatisfactory result. In order to evaluate the quality of this fitting, we present in Table 1 the maximum distance between the computed approximate intensities and the fitted piecewise linear intensities.
We observe that the error for μ 31 is quite large compared to the other errors; this could be due to the fact that the estimation was performed with only 20 observations, again consistent with LTC data.
In Figure 6, we can compare qualitatively the original transition probabilities with the ones obtained as the solution of the forward Kolmogorov ODE with the estimated piecewise linear intensities. A first qualitative observation is that the general behaviour of the estimated and fitted intensities is similar. In order to compare quantitatively the original transition probabilities with the ones obtained as the solution of the Kolmogorov ODE with the estimated piecewise linear intensities, we present, in Figure 7, the difference between the original transition probabilities and the estimated transition probabilities for each of the three transient states. It is clear that there are substantial differences. To explain these differences, we have at least two cumulative sources of error. The first error source is induced by the fact that an estimation–calibration procedure was applied to 5000 trajectories generated from the original transition probabilities; the second error source comes from the fact that, while the original transition probabilities are produced from intensities of the Gompertz–Makeham type, the estimated transition probabilities are produced by continuous piecewise linear intensities fitted to the simulated data.
In order to detail the average error per year coming from the estimation procedure we can compute
$$\Delta_{ij}:=\frac{1}{27}\int_{0}^{27}\left|p_{ij}(t)-\tilde{p}_{ij}(t)\right|dt,$$
with $p_{ij}(t)$ the original transition probabilities and $\tilde{p}_{ij}(t)$ the estimated transition probabilities. The results are presented in Table 2.
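A minimal sketch of this metric, with hypothetical stand-in curves in place of the two ODE solutions, is the following.

```python
# The average absolute difference per year between an original transition
# probability and an estimated one, approximated by a midpoint Riemann sum on
# [0, 27]. The two curves below are hypothetical stand-ins for the ODE solutions.
import numpy as np

def average_error_per_year(p, p_tilde, T=27.0, n=27_000):
    ts = np.linspace(0.0, T, n, endpoint=False) + T / (2.0 * n)   # midpoints of n sub-intervals
    return float(np.mean(np.abs(p(ts) - p_tilde(ts))))            # approximates (1/T) * integral

p = lambda t: np.exp(-0.05 * t)          # stand-in for an original p_ij(t)
p_tilde = lambda t: np.exp(-0.052 * t)   # stand-in for the estimated p~_ij(t)
print(average_error_per_year(p, p_tilde))
```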
The main conclusion is that the order of magnitude of the average error per unit time—chosen as a year since it is the time duration of an interval where transition probabilities in health insurance and long-term-care models can have a significant impact—is always less than 1 % . Of course, we have to be careful of the extremes of these errors that are visible in Figure 7.
All computations and graphic representations were created with a Mac mini (M1 2020) equipped with macOS Monterey 12.5.1 with Wolfram Mathematica 12, version 12.3.1.0. The estimation–calibration procedures use either native functions or small routines that require reasonable execution times of the order of a second.

7. Discussion

The methodology proposed in this work gave us the continuous piecewise linear intensities depicted in Figure 5. The use of the continuous piecewise linear functional form was intentional, although not necessary; a better fit to the data, most probably with a larger number of parameters, could be computed and could possibly provide a better final result—qualitatively, in Figure 6, and quantitatively, with the metrics given both by the average error per year, as in Table 2, and by the analysis of local discrepancies between the original transition probabilities and the transition probabilities resulting from the estimation methodology, as in Figure 7. The intention of using the continuous piecewise linear functional form for the intensities was to illustrate the possibility of an estimation–calibration procedure relying on a small number of parameters. Whenever faced with the estimation–calibration of intensities for real-data modelling, a situation where there are no known predetermined intensities generating the data, the choice of the functional form is secondary with respect to the quality of the model fitting.

8. Conclusions

In this work, we proposed a methodology for the estimation–calibration of continuous-time non-homogeneous Markov chains with finite state space. We presented an application of the methodology to a Monte Carlo simulated set of trajectories generated from intensities of Gompertz–Makeham type and obtained, by the methodology, estimated continuous piecewise linear intensities. We compared the corresponding transition probabilities—obtained by solving the forward Kolmogorov ODE for the original Gompertz–Makeham intensities and for the continuous piecewise linear intensities—obtaining an average error per year that is always less than 1%.
In order to justify the methodology, we presented a result on regime-switching Markov chains, proving the existence of a Markov chain process, in a given time interval, obtained by glueing together different intensity matrices defined on the different intervals of a partition of the time interval of the Markov chain process; this result shows that it is possible to consider intensities of different functional forms for different subintervals of the time interval where the whole Markov chain process is defined.
We also presented a result that shows that it is possible to bound the distance between the matrices of transition probabilities corresponding to different intensity matrices by the distance between these intensity matrices. This result shows that, with respect to the transition probabilities, we can consider changes in the functional form of the intensities—in an intensity matrix—as long as the distance between the original intensity matrix and the altered one is small enough.
In future work, we intend to improve the methodology in order to control the quality of the process by adequate tests and to improve the algorithm used.

Author Contributions

Conceptualization, M.L.E.; methodology, M.L.E., N.P.K. and G.R.G.; software, M.L.E., N.P.K. and G.R.G.; validation, M.L.E., N.P.K. and G.R.G.; formal analysis, M.L.E., N.P.K. and G.R.G.; investigation, M.L.E., N.P.K. and G.R.G.; data curation, M.L.E., N.P.K. and G.R.G.; writing—original draft preparation, M.L.E.; writing—review and editing, M.L.E., N.P.K. and G.R.G.; supervision, M.L.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially funded by national funds through the FCT—Fundação para a Ciência e Tecnologia, I.P., under the scope of the projects UIDB/00297/2020 and UIDP/00297/2020—Center for Mathematics and Applications.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors are indebted to Albert N. Shiryaev for drawing attention to very important work on the subject and to the referees for their suggestions for improvement.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
MCCT  Markov chain in continuous time
ODE   ordinary differential equations

References

  1. Esquível, M.; Guerreiro, G.; Oliveira, M.; Real, P. Calibration of Transition Intensities for a Multistate Model: Application to Long-Term Care. Risks 2021, 9, 37. [Google Scholar] [CrossRef]
  2. Vassiliou, P.C.G. Non-Homogeneous Markov Chains and Systems—Theory and Applications; CRC Press: Boca Raton, FL, USA, 2023; pp. xxi+450. [Google Scholar] [CrossRef]
  3. Pitacco, E. Health Insurance; European Actuarial Academy (EAA) Series; Basic Actuarial Models; Springer: Cham, Switzerland, 2014; pp. xii+162. [Google Scholar] [CrossRef]
Figure 1. Solutions of the Kolmogorov ODE for discontinuous linear intensities. $p_{ij}$: $j = 1$ (blue); $j = 2$ (orange); $j = 3$ (green); $j = 4$ (red); $\sum_{j=1}^{4} p_{ij}$ (purple). Lower left, $p_{i3}$: $i = 1$ (blue); $i = 2$ (orange). Lower right, $p_{i4}$: $i = 1$ (blue); $i = 2$ (orange); $i = 3$ (green).
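For readers who wish to reproduce a figure of this kind, the following minimal sketch solves the Kolmogorov forward equation $\frac{\partial}{\partial t}P(s,t) = P(s,t)\,Q(t)$, $P(s,s)=I$, for a four-state chain with discontinuous piecewise linear intensities. The intensity values, the jump location and the time horizon are illustrative placeholders, not the ones used to produce Figure 1.

```python
# A minimal sketch (not the authors' code) of solving the Kolmogorov forward
# equation dP(s,t)/dt = P(s,t) Q(t), P(s,s) = I, for a 4-state chain whose
# off-diagonal intensities mu_ij(t) are piecewise linear with a jump in slope.
import numpy as np
from scipy.integrate import solve_ivp

n = 4  # number of states

def mu(i, j, t):
    """Illustrative discontinuous piecewise linear intensity from state i to j."""
    if i >= j:                                 # only upward transitions, as an example
        return 0.0
    slope = 0.002 if t < 5.0 else 0.006        # discontinuity of the slope at t = 5
    return 0.01 * (j - i) + slope * t

def Q(t):
    """Intensity (generator) matrix at time t."""
    q = np.array([[mu(i, j, t) for j in range(n)] for i in range(n)])
    np.fill_diagonal(q, 0.0)
    np.fill_diagonal(q, -q.sum(axis=1))        # rows sum to zero
    return q

def forward_rhs(t, p_flat):
    P = p_flat.reshape(n, n)
    return (P @ Q(t)).ravel()

s, T = 0.0, 10.0
sol = solve_ivp(forward_rhs, (s, T), np.eye(n).ravel(), max_step=0.05)
P_sT = sol.y[:, -1].reshape(n, n)
print(P_sT)
print(P_sT.sum(axis=1))                        # each row should be close to 1
```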
Figure 2. Comparing the transition probabilities from the Gompertz–Makeham intensities (left-hand side) and from the corresponding six-point linear interpolations (right-hand side). $p_{ij}$: $j = 1$ (blue); $j = 2$ (orange); $j = 3$ (green); $j = 4$ (red); $\sum_{j=1}^{4} p_{ij}$ (purple).
Figure 3. The sign-reversed differences between the Gompertz–Makeham intensities and the corresponding six-point linear interpolations for $\mu_{12}(t)$, $\mu_{23}(t)$ and $\mu_{34}(t)$.
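The next sketch illustrates, for assumed Gompertz–Makeham parameters $A$, $B$, $c$ (placeholders, not the values used in the paper), the six-point linear interpolation underlying the comparisons of Figures 2 and 3 and the resulting difference between the two intensity curves.

```python
# A minimal sketch of replacing a Gompertz-Makeham intensity by its linear
# interpolation through six equally spaced points; the parameters A, B, c
# are illustrative placeholders only.
import numpy as np

def gm_intensity(t, A=5e-4, B=7.6e-5, c=1.09):
    """Gompertz-Makeham form mu(t) = A + B * c**t."""
    return A + B * np.power(c, t)

T = 10.0
knots = np.linspace(0.0, T, 6)                  # six interpolation points
values = gm_intensity(knots)

def pl_intensity(t):
    """Piecewise linear interpolation of the Gompertz-Makeham intensity."""
    return np.interp(t, knots, values)

grid = np.linspace(0.0, T, 501)
diff = pl_intensity(grid) - gm_intensity(grid)  # sign-reversed difference, as in Figure 3
print(np.abs(diff).max())                       # worst-case interpolation error
```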
Figure 4. A set of simulated trajectories of a 4-state continuous-time Markov chain.
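Trajectories such as those in Figure 4 can be generated, for instance, by thinning: candidate jump times are proposed at a dominating constant rate and accepted with probability equal to the ratio of the current total exit intensity to that rate. The sketch below is an assumption on our part, not the simulator used by the authors, and employs an illustrative intensity.

```python
# A minimal thinning sketch for simulating a finite-state non-homogeneous
# Markov chain; lam_max must dominate the total exit intensity of every
# state over [t0, T] for the acceptance-rejection argument to be valid.
import numpy as np

rng = np.random.default_rng(2024)
N_STATES = 4

def mu(i, j, t):
    """Illustrative upward-only intensity from state i to state j at time t."""
    return 0.01 * (j - i) + 0.002 * t if j > i else 0.0

def simulate_path(t0, T, start_state, lam_max):
    """Return (jump times, visited states) on [t0, T]."""
    t, state = t0, start_state
    times, states = [t0], [start_state]
    while t < T:
        t += rng.exponential(1.0 / lam_max)               # candidate jump time
        if t >= T:
            break
        rates = np.array([mu(state, j, t) for j in range(N_STATES)])
        total = rates.sum()
        if total == 0.0:                                   # absorbing state reached
            break
        if rng.uniform() < total / lam_max:                # accept the candidate
            state = int(rng.choice(N_STATES, p=rates / total))
            times.append(t)
            states.append(state)
    return times, states

# e.g. a bundle of trajectories all started in state 0
paths = [simulate_path(0.0, 10.0, 0, lam_max=0.5) for _ in range(1000)]
```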
Figure 5. Intensity data recovered from the simulated trajectories (blue) and the fitted piecewise linear intensities (red).
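One standard way to recover raw intensity data of the kind shown in Figure 5 from a set of trajectories is the occurrence–exposure estimator on yearly bins, to whose bin values a piecewise linear curve can then be fitted. The sketch below shows that standard estimator under this assumption; the fitting procedure actually used is the one described in the main text.

```python
# A minimal occurrence-exposure sketch: for each yearly bin, the raw estimate is
#   hat_mu_ij = (# of i -> j transitions in the bin) / (time spent in state i in the bin).
import numpy as np

def occurrence_exposure(paths, n_states, T, bin_width=1.0):
    """paths: iterable of (jump times, visited states) pairs on [0, T]."""
    edges = np.arange(0.0, T + bin_width, bin_width)
    n_bins = len(edges) - 1
    occ = np.zeros((n_bins, n_states, n_states))   # transition counts per bin
    expo = np.zeros((n_bins, n_states))            # exposure times per bin
    for times, states in paths:
        times = list(times) + [T]                  # close the last sojourn at T
        for k in range(len(states)):
            start, end, i = times[k], times[k + 1], states[k]
            for b in range(n_bins):                # spread the sojourn over the bins
                lo, hi = edges[b], edges[b + 1]
                expo[b, i] += max(0.0, min(end, hi) - max(start, lo))
            if k + 1 < len(states):                # record the jump i -> j
                b = min(int(times[k + 1] // bin_width), n_bins - 1)
                occ[b, i, states[k + 1]] += 1
    with np.errstate(divide="ignore", invalid="ignore"):
        rates = np.where(expo[:, :, None] > 0, occ / expo[:, :, None], 0.0)
    return edges, rates                            # rates[b, i, j] ~ hat_mu_ij on bin b
```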
Figure 6. Comparing the original transition probabilities used for the simulation (left) with the transition probabilities computed from the estimated piecewise linear intensities (right). $p_{ij}$: $j = 1$ (blue); $j = 2$ (orange); $j = 3$ (green); $j = 4$ (red); $\sum_{j=1}^{4} p_{ij}$ (purple).
Figure 7. The differences between the original and the estimated transition probabilities, $p_{ij} - \tilde{p}_{ij}$: $j = 1$ (blue); $j = 2$ (orange); $j = 3$ (green); $j = 4$ (red).
Table 1. For each intensity: (1) the maximum distance between the computed approximate intensities and the fitted piecewise linear intensities; (2) this maximum distance normalised by the maximum value of the estimated intensities.

       $\mu_{12}$   $\mu_{13}$   $\mu_{14}$
(1)    0.025381     0.159100     1.165812
(2)    0.000667     0.004186     0.030679
       $\mu_{21}$   $\mu_{23}$   $\mu_{24}$
(1)    0.003661     0.020507     0.397256
(2)    0.000963     0.005396     0.104541
       $\mu_{31}$   $\mu_{32}$   $\mu_{34}$
(1)    7.398272     0.000024     0.086817
(2)    3.52299      0.000011     0.04134
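As an illustration of how the two rows of Table 1 can be read for a single intensity evaluated on a common grid, the short sketch below computes the maximum absolute distance (1) and its normalised counterpart (2); the numerical arrays are placeholders, not the values behind the table.

```python
# A minimal reading of the two quantities in Table 1 for one intensity mu_ij:
# (1) max |mu_raw - mu_fit| over the grid, (2) that maximum divided by max |mu_raw|.
import numpy as np

mu_raw = np.array([0.010, 0.012, 0.020, 0.035])   # placeholder raw (approximate) estimates
mu_fit = np.array([0.011, 0.013, 0.018, 0.034])   # placeholder fitted piecewise linear values

d_max = np.max(np.abs(mu_raw - mu_fit))           # row (1) of Table 1
d_norm = d_max / np.max(np.abs(mu_raw))           # row (2) of Table 1
print(d_max, d_norm)
```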
Table 2. Average error per year between the estimated and the original transition probabilities.

$\Delta_{11}$   $\Delta_{12}$   $\Delta_{13}$   $\Delta_{14}$
0.004647        0.001806        0.000896        0.007118
$\Delta_{21}$   $\Delta_{22}$   $\Delta_{23}$   $\Delta_{24}$
0.002394        0.002140        0.002624        0.004081
$\Delta_{31}$   $\Delta_{32}$   $\Delta_{33}$   $\Delta_{34}$
0.000896        0.001258        0.005907        0.003898
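Reading $\Delta_{ij}$ as the average, over a uniform yearly grid, of the absolute differences $|p_{ij}(0,t) - \tilde{p}_{ij}(0,t)|$ (an assumption on our part; the precise definition is the one given in the main text), quantities of the kind reported in Table 2 can be obtained as follows.

```python
# A minimal sketch of a per-year error summary between two families of
# transition probabilities evaluated on the same uniform yearly grid.
import numpy as np

def average_error_per_year(P_orig, P_fit):
    """P_orig, P_fit: arrays of shape (n_years + 1, n, n) holding p_ij(0, t)
    on a uniform yearly grid; returns the matrix of per-year errors Delta_ij."""
    return np.abs(P_orig - P_fit).mean(axis=0)
```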