Phase-Type Models in Life Insurance: Fitting and Valuation of Equity-Linked Benefits

Asmussen, Søren; Laub, Patrick J.; Yang, Hailiang

doi:10.3390/risks7010017

Open AccessArticle

Phase-Type Models in Life Insurance: Fitting and Valuation of Equity-Linked Benefits

by

Søren Asmussen

^1,*

,

Patrick J. Laub

²

and

Hailiang Yang

³

¹

Department of Mathematics, Aarhus University, 8000 Aarhus, Denmark

²

Institut de Science Financière et d’Assurances, Université Lyon 1, 69007 Lyon, France

³

Department of Statistics & Actuarial Science, Hong Kong University, Hong Kong 999077, China

^*

Author to whom correspondence should be addressed.

Risks 2019, 7(1), 17; https://doi.org/10.3390/risks7010017

Submission received: 11 December 2018 / Revised: 31 January 2019 / Accepted: 4 February 2019 / Published: 11 February 2019

Download

Browse Figures

Versions Notes

Abstract

:

Phase-type (PH) distributions are defined as distributions of lifetimes of finite continuous-time Markov processes. Their traditional applications are in queueing, insurance risk, and reliability, but more recently, also in finance and, though to a lesser extent, to life and health insurance. The advantage is that PH distributions form a dense class and that problems having explicit solutions for exponential distributions typically become computationally tractable under PH assumptions. In the first part of this paper, fitting of PH distributions to human lifetimes is considered. The class of generalized Coxian distributions is given special attention. In part, some new software is developed. In the second part, pricing of life insurance products such as guaranteed minimum death benefit and high-water benefit is treated for the case where the lifetime distribution is approximated by a PH distribution and the underlying asset price process is described by a jump diffusion with PH jumps. The expressions are typically explicit in terms of matrix-exponentials involving two matrices closely related to the Wiener-Hopf factorization, for which recently, a Lévy process version has been developed for a PH horizon. The computational power of the method of the approach is illustrated via a number of numerical examples.

Keywords:

generalized Coxian distribution; EM algorithm, guaranteed minimum death benefit; high-water benefit; jump diffusion, matrix-analytic methods; reserves; Wiener-Hopf factorization

1. Introduction

A random variable (r.v.)

τ

is phase-type (PH) with representation

(α, T)

if it is distributed as the lifetime

inf {t \geq 0 : J_{t} = †}

of a killed (terminating) time-homogenous Markov process

J =

{J_{t}}_{t \geq 0}

with

p < \infty

states, initial distribution

α

(a row vector), and generator

T

, where † is an additional absorbing state. PH distributions originate from queueing theory and the work of A.K. Erlang, A. Jensen and M.F. Neuts. However, in recent decades, they have found numerous applications to different areas. The main reason for this is that many calculations that are explicit for exponential distributions are often computationally tractable with PH assumptions and that, in addition, PH distributions are dense so that a given distribution F on

[0, \infty)

can be approximated arbitrarily well by a PH distribution provided one takes p large enough. For surveys of the theory and applications, see (Asmussen 2003; Asmussen and Albrecher 2010; Bladt and Nielsen 2017). The present paper is concerned with some applications to life insurance, an area where PH distributions have been far less employed than in the related areas of non-life insurance. There are many papers in ruin theory that assume that claim sizes are PH distributed; see (Cai and Li 2005b) and the review paper (Bladt 2005). Some papers assume that the claim time is PH distributed; see (Stanford et al. 2000). (Zadeh et al. 2014) considered disability insurance using a PH model. (Zadeh and Stanford 2016) studied credibility theory using PH distributions. In finance, PH distributions have also been used extensively. For example, (Asmussen et al. 2004) considered option pricing under an exponential phase-type Lévy model; (Cai and Li 2005a) dealt with applications of PH distributions to risk measures; and (Egami and Yamazaki 2014) considered PH approximations of Lévy models. On the contrary, the only references of applications of PH distributions to life insurance we know of are (Lin and Liu 2007) and (Zadeh et al. 2014).

The motivation for the present study is some valuation problems of equity-linked products in life insurance considered in Gerber et al. (2012, 2013, 2015) (GSY). Let

τ = τ_{x}

be the remaining lifetime of an insured of age x when signing the contract, then the payment of this class of benefits has the form

Ψ ({\bar{S}}_{τ}, S_{τ})

where

S_{t}

is the price at time t of some stock or stock index and

{\bar{S}}_{t} = {max}_{v \leq t} S_{t}

the running maximum. An example of such a benefit is the guaranteed minimum death benefit (GMDB)

max (S_{τ}, K)

, but there are many others; see the list in Section 4. Defining

X_{t}

as

X_{t} = log (S_{t} / S_{0})

,

M_{t} = {\bar{X}}_{t}

as the running maximum of X, and

D_{t} = M_{t} - X_{t}

as the displacement from the maximum, we then have:

Ψ ({\bar{S}}_{τ}, S_{τ}) = Φ (M_{τ}, D_{τ}) where Φ (m, d) = Ψ (S_{0} e^{m}, S_{0} e^{m - d)})

(1)

(note that

X_{t} = M_{t} - D_{t}

), and the fair value of the benefit is the expected discounted payment.

A widely-used model (Cont and Tankov 2004) assumes that

X =

{X_{t}}_{t \geq 0}

is a Lévy process, and the key vehicle in GSY for the computations for such expectations is the classical Wiener-Hopf factorization:

Lemma 1.

For a Lévy process

X_{t}

and an independent exponential time τ, the r.v.’s

M_{τ} = {max}_{0 \leq s \leq τ} X_{s}

and

D_{τ} = M_{τ} - X_{τ}

are independent. Further,

D_{τ} \overset{d}{=} - {\underset{̲}{X}}_{τ} = - min_{0 \leq s \leq τ} X_{s} .

For the geometric Brownian motion model where X is a Brownian motion (BM) with drift,

M_{τ}

and

D_{τ}

both have an exponential distribution, and for more general jump diffusions, a number of computational procedures have been developed (see Section 5 for references). In GSY, they approximate the density/tail of

τ_{x}

by a linear combination of exponential terms, which allows them to reduce the computations to simple integrations using Lemma 1.

The approximation of the distribution of

τ_{x}

by a linear combination of exponential terms in GSY uses an ad hoc optimization procedure. The main problem there is that it is very difficult to make the linear combination of exponential densities to be a density function; in particular, very often, in the tail part, the linear combination of exponential densities can be negative. Since, in the application to valuing equity-linked insurance products, the distribution at the tail part plays an important role, this motivated us to consider using the PH distribution to approximate the future lifetime distribution. For PH distributions, more classical statistical methods using maximum likelihood have, however, been developed (Asmussen et al. 1996), and in view of the denseness of PH distributions and the fact that they have proven a computationally-tractable generalization of the exponential distribution in other settings, a natural idea is therefore to use a PH approximation

τ

to

τ_{x}

. The first step in implementing this for pricing equity-linked benefits is to develop a version of the Wiener-Hopf factorization for the case where

τ

is PH and independent of X. This has been done in a companion paper (Asmussen and Ivanovs 2019), and the contribution of the present paper is to provide the further steps such as PH fitting of human mortality and computations of prices of equity-linked benefits for BM and jump diffusions.

2. Preliminaries

The vector of killing rates of a PH

(α, T)

r.v.

τ

is given by

t = - T 1

, where

1

is a column vector with all entries equal to one. The density is

α e^{T z} t

, and the tail

P (τ > z)

is

α e^{T z} 1

. Here, the exponential

e^{R}

of a matrix is defined as

\sum_{0}^{\infty} R^{n} / n!

. We shall often use formulas like:

\int_{a}^{b} e^{R x} d x = R^{- 1} (e^{R b} - e^{R a}) = (e^{R b} - e^{R a}) R^{- 1}, \int_{0}^{\infty} e^{R x} d x = - R^{- 1},

that are immediate analogues of standard formulas for univariate exponentials. For example, taking

R = - s I + T

gives the Laplace transform

E [e^{- s τ}]

as

α {(s I - T)}^{- 1} t

. Of course, the existence of

R^{- 1}

has to be verified, which for a PH generator

R = T

is a standard fact.

The subclass of generalized Coxian (GC) distributions will play a particular role. Here, the states are run through in lexicographical order, with exponential

(λ_{i})

holding time in state i, and it is possible to enter or exit from any state (see Figure 1), where

q_{i}

is the probability of exit after a visit to i and

r_{i} = 1 - q_{i}

the probability of going to the next state. The structure of Coxian distributions is similar, except that only State 1 can be entered so that

α_{1} = 1

. Coxian distributions are in particular used in (Lin and Liu 2007) and (Zadeh et al. 2014).

Remark 1.

Though computational tractability and denseness are the main reasons for the popularity of PH distributions, other advantages have also sometimes been advocated. For example, whereas in the majority of examples, the PH modeling is purely descriptive, there are others where attempts have been made to give the Markov states a physical meaning like in compartment models and, in the life insurance context, in (Lin and Liu 2007) and (Zadeh et al. 2014). This makes the number p of phases quite large there, of order 200, whereas we have taken the descriptive point of view. The fit obviously becomes better when increasing p, but how good it needs to be depends very much on what it will be used for. In this paper, we consider the valuation problem, that is to calculate the net premium (expected discounted payoff) for equity-linked insurance products, and our numerical examples will show that taking

p = 50

will be sufficient for the precision one needs in this context.

3. PH Fits of Human Mortality Data

The distribution of human lifetimes of course varies somewhat according to the period where the data were collected, geographical and social conditions, etc., but the basic shapes are much alike. We have therefore concentrated on a single example, the illustrative life table in Appendix 2A of (Bowers et al. 1997). The form of the distribution, as illustrated via the number of deaths in consecutive years, is given in Figure 2.

We use an EM algorithm to fit various types of phase-type distributions to this life table. This idea was originally developed in (Asmussen et al. 1996), with the case of censored data and other implementation details given in (Olsson 1996). Olsson (Olsson 1998) has written a C implementation of this algorithm called EMpht, which is available online. The overwhelming computational burden is in the E step, where many complicated integrals involving matrix exponentials must be evaluated. This is done approximately by converting the problem into a system of ODEs, which are approximately solved via a Runge-Kutta method. While the original C implementation is still very adequate for creating fits with a small number of phases p, it can be improved upon for larger p.

We wrote a new implementation using the Julia language. It allows for the E step to be evaluated in three different ways: (i) using an ODE solver (as in (Asmussen et al. 1996; Olsson 1996)), (ii) using quadrature methods, and (iii) using the uniformization method of Okamura et al. Okamura et al. (2011a, 2011b). We used this last method to generate the fits found later in this paper; the code we used to create the figures below will be made available online (Asmussen et al. 2019). Each fit was given one hour execution time restricted to one thread on the processor Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20 GHz. The EM algorithm could have been terminated much earlier without a major difference to the quality of the fits, but as we were comparing the various values of p, we wanted to ensure no individual fit was judged poorly for simply being stuck in a local maxima.

The PH representation in terms of

α, T

is heavily overparameterized. There is an abundance of sets

α, T

that leads to the same distribution. Consider, e.g., the PH with

α = \frac{1}{p} 1^{⊤}

(where

^{⊤}

stands for transposition) and where

T

is a diagonal matrix with

(- 1, - 2, \dots, - p)

on the diagonal (i.e., a hyperexponential distribution). If the diagonal elements in

T

are permuted in any of the

p!

possibilities, then the distribution is unchanged. This example indicates that the maximum

ℓ^{*}

of the likelihood function ℓ is attained at several parameter values so that the choice of the initial parameter set will determine which one is found by the EM algorithm. The presence of local maxima with values

< ℓ^{*}

cannot be excluded, but we are not aware of serious cases where the EM algorithm has been stuck in one of these.

In life insurance calculations, the relevant lifetime is not the overall one

τ_{0}

, but rather the remaining lifetime

τ = τ_{x}

of the insured at the time when he/she signs the contract. That is, if the age of the insured at that time is x, the relevant distribution is that of

τ_{0} - x

given

τ_{0} \geq x

. As a typical example, we took

x = 35

and fitted PH distributions with various structure and various numbers p of phases. Note that while we fit the distribution of

τ_{0} - x

, which starts at zero, in the following, we plot

τ_{0}

given

τ_{0} \geq x

(i.e., the x axes begin at

x = 35

and not zero).

As a first illustration, we fit both a general structure PH and a generalized Coxian (GC) one with

p = 20

phases. The results are in Figure 3, with the densities of the fitted PH distributions in the left panel together with the data and the corresponding hazard rates in the right panel.

The fits of the two structures are almost identical (but the shape of the density is somewhat off the data, which indicates that a larger valuer of p could be appropriate). The general structure leads to a system of ODEs of dimension

q = p^{2} + p

, whereas the corresponding number for GC is only

q = 2 p - 1

. Due to this curse of dimensionality and the marginal differences of the fits, we have only considered GC structures for higher values of p.

Figure 4 shows GC fits for

p = 35

and

p = 40

. As expected,

p = 40

improves upon

p = 35

.

In Figure 5, we complement Figure 4 by plots of the tails. Such plots are frequently used to illustrate the quality of fits, but in our minds, a good agreement is not an argument in itself: two tails may come out quite similar in a comparison even if the shapes of the densities are rather different. Finally, Figure 6 gives GC fits with

p = 50, 75, 100

.

Which value of p should one then work with? There exists hardly a single criterion for answering such questions, i.e., assessing the quality of the fits to the data. Some concerns may be aesthetic, such as avoiding negative values of the density (cf. Appendix A below), support outside the obvious range of the data (cf. kernel estimates), or an excessive number of parameters (as is the case for most PH fits as compared to, e.g., the classical Gompertz-Makeham fit of life table data). Other concerns may have to do with the features of the distribution which are considered especially important, be it the overall shape of the density (caught well by maximum likelihood), the cumulative distribution function (c.d.f.), although c.d.f.’s may look close even for rather different distributions, the tail (where maximum likelihood fails and other tools such as the Hill estimator may be more appropriate), the hazard rate (as showing up in numerous life insurance contexts), or the behavior in a more narrow range (cf. a life insurance contract signed at the start of employment and running up to the time of retirement). For the present purposes and related ones like applications in queueing and insurance risk, the main point is, however, that PH fitting is not done for its own sake, but for the purpose of facilitating some calculation, here of the prices of equity-linked products, in queueing, say of waiting time characteristics, etc. In the rest of the paper, we therefore study the performance of the PH approach and the various fits and return to a discussion of the issue in the concluding Section 8.

4. Valuation of Benefits

The main application of the results in this paper is valuing equity-linked insurance products, in particular GMDBs in various deferred annuities. The time of payment of this kind of product is at the time

τ

of death of the policyholder, often denoted

τ_{x}

, where x is his/her age at

t = 0

when he/she signs the contract. The amount of the payment depends on the price

S_{τ}

of a stock (or stock index) at that time and possibly also on the history of the stock price as, e.g., in (1). There are many different products (benefit functions) in the market. One example is the GMDB with benefit function being:

Ψ ({\bar{S}}_{τ}, S_{τ}) = max (S_{τ}, K) = S_{τ} + {[K - S_{τ}]}_{+},

(2)

where K is the guaranteed amount, e.g.,

K = 0.9 S_{0}

. The second term on the right-hand side (r.h.s.) of (2) is the payoff of a life-contingent put option. Another popular product is the so-called high-water benefit (HWB) with the payoff function being:

Ψ ({\bar{S}}_{τ}, S_{τ}) = {\bar{S}}_{τ} .

Another form of the HWB function is:

Ψ ({\bar{S}}_{τ}, S_{τ}) = max (a {\bar{S}}_{τ}, S_{τ})

where

0 < a < 1

. Since:

max (a {\bar{S}}_{τ}, S_{τ}) = S_{τ} + max (a {\bar{S}}_{τ} - S_{τ}, 0),

(3)

the second term on the r.h.s. of (3) is the payoff of a life-contingent fractional lookback option.

In practice, the policyholders can withdraw. Note that the payment of the GMDB can be written as the payoff of a put option (plus a share of underlying stock). When the stock price rises, put options become less valuable; hence, policies may lapse. This motivates considering the benefit function:

Ψ ({\bar{S}}_{τ}, S_{τ}) = I_{{{\bar{S}}_{τ} < H}} {[K - S_{τ}]}_{+},

where

I_{{.}}

is the indicator function and H is a “barrier”. This is a barrier option payoff form.

Since we can approximate the distribution of

τ_{x}

by a PH distribution, the valuation problem becomes calculating:

E [e^{- δ τ} Ψ ({\bar{S}}_{τ}, S_{τ})],

(4)

where

δ

denotes a discounting factor and

τ

is PH r.v. One example is to take

δ

as the force of interest, but the company may want to use a lower

δ

in its technical reserve to ensure a conservative pricing.

5. The Factorization: Implementation for Jump Diffusions

Our basic assumption is that X is a jump diffusion with Brownian parameters

μ, σ^{2}

, upward jumps at rate

λ_{1}

and with a PH

(p_{1}, α_{1}, T_{1})

distribution of the sizes, and similar parameters

λ_{2}, p_{2}, α_{2}, T_{2}

for the downward jumps.

We will first need to introduce the equivalent time-reversed PH representation

(α^{*}, T^{*})

of

τ

. It is obtained by considering:

J_{t}^{*} = \{\begin{matrix} J_{(τ - t) -}, & if t < τ, \\ †, & otherwise . \end{matrix}

It turns out (Asmussen 2003, III.5) that

J^{*}

is a terminating time-homogeneous Markov process with initial distribution

α^{*}

and generator

T^{*}

given by:

α^{*} = E τ \cdot t^{⊤} Δ_{ν}, T^{*} = Δ_{ν}^{- 1} T^{⊤} Δ_{ν},

(5)

where

Δ_{ν}

denotes the diagonal matrix with the positive vector

ν = - α T^{- 1}

on the diagonal. To see this, consider a doubly-infinite stationary Markovian arrival process (Asmussen 2003, XI.1) with generator

T + D

, where

D = t α

gives the rates of phase changes with arrivals, and note that the corresponding stationary distribution of the phase process is proportional to

ν

. Now, just use the standard time reversion for non-killed Markov processes. Considering, for the ease of exposition, the case

δ = 0

of no discounting, the form of the factorization identity of (Asmussen and Ivanovs 2019) that is most convenient for our purposes is then:

\begin{matrix} P (M_{τ} \in d x, D_{τ} \in d y) & = \sum_{k = 1}^{p} \frac{1}{c_{k}} P_{α}^{*} (M_{τ} \in d x, J_{{\bar{σ}}_{τ}} = k) P_{α^{*}}^{*} (M_{τ} \in d y, J_{{\bar{σ}}_{τ}} = k), \end{matrix}

(6)

where

P^{*}

is the probability measure where

τ

is generated by the time-reversed representation and X evolves as

- X

did under

P

,

{\bar{σ}}_{τ}

is the time at which the maximum is attained, and

c_{k} = P_{α} (J_{{\bar{σ}}_{τ}} = k) = P_{α^{*}}^{*} (J_{{\bar{σ}}_{τ}} = k)

.

The distributions of

M_{τ}, D_{τ}

are particularly simple if X is BM

(μ, σ^{2})

and

τ

exponential

(λ)

. In fact, they are then exponential with rates

ρ^{+}

, respectively

ρ^{-}

, where:

ρ^{\pm} = \mp \frac{μ}{σ^{2}} + \sqrt{\frac{μ^{2}}{σ^{4}} + \frac{2 λ}{σ^{2}}} .

If more generally,

X (t)

is a jump diffusion with both up- and down-ward PH jumps, there is an abundance of results and algorithms giving the distributions in various representations or their Laplace transforms. See, among others, (Bean et al. 2005; Breuer 2008; Dieker and Mandjes 2011; Horváth and Telek 2017; Jacobsen 2005; Jiang and Pistorius 2008; Lewis and Mordecki 2008; Mordecki 2002; Pistorius 2006). The most appropriate form for our purposes is that for (computable) matrices

U, U^{*}

of dimension

p (1 + p_{1}) \times p (1 + p_{1})

, respectively

p (1 + p_{2}) \times p (1 + p_{2})

, one has:

\begin{matrix} P_{i} (M_{τ} \in d x, J_{\bar{σ}} = k) & = e_{i}^{⊤} e^{U x} e_{k} \cdot u_{k} \end{matrix}

(7)

\begin{matrix} P_{j}^{*} (M_{τ} \in d y, J_{{\bar{σ}}_{τ}} = k) & = e_{j}^{⊤} e^{U^{*} y} e_{k} \cdot u_{k}^{*} \end{matrix}

(8)

where

u_{k}, u_{k}^{*}

are the

k^{th}

element of the vectors

- U e

, respectively

- U^{*} e

; this occurs already in (Asmussen 1995), as well as several later sources, but we have in fact used a new and (we feel) somewhat more straightforward approach; see Appendix B.

Multiplying (7), (8) by

α_{i}

, respectively

α_{j}^{*}

and summing over

i, j

, the factorization (6) then takes the form:

\begin{matrix} P (M_{τ} \in d x, D_{τ} \in d y) = \sum_{k = 1}^{p} r_{k} \cdot α e^{U x} e_{k} \cdot α^{*} e^{- U^{*} y} e_{k} \end{matrix}

where

r_{k} = u_{k} u_{k}^{*} / c_{k}

. The modification with discounting is explained in Section 4 of (Asmussen and Ivanovs 2019) and becomes:

\begin{matrix} E [e^{- δ τ}; M_{τ} \in d x, D_{τ} \in d y] = \sum_{k = 1}^{p} r_{k} \cdot α e^{U_{δ} x} e_{k} \cdot α^{*} e^{- U_{δ}^{*} y} e_{k} \end{matrix}

(9)

where

U_{δ}

,

U_{δ}^{*}

are calculated as

U, U^{*}

with

T

replaced by

T - δ I

(here, the

r_{k}

remain unchanged, i.e.,

u_{k}

,

u_{k}^{*}

and

c_{k}

are computed for the case

δ = 0

). This can be rewritten in matrix form as follows:

Corollary 1.

(i) For

x, y \geq 0

,

\begin{matrix} E [e^{- δ τ}; M_{τ} \in d x, D_{τ} \in d y] = α e^{U_{δ} x} Δ_{r} e^{{U_{δ}^{*}}^{⊤} y} {α^{*}}^{⊤} = α^{*} e^{U_{δ}^{*} y} Δ_{r} e^{{U_{δ}}^{⊤} x} α^{⊤}; \end{matrix}

(ii) For functions f and g,

E [e^{- δ τ} f (M_{τ}) g (D_{τ})] = α F_{U_{δ}} Δ_{r} G_{U_{δ}^{*}}^{⊤} {α^{*}}^{⊤} = α^{*} G_{U_{δ}^{*}} Δ_{r} G_{U_{δ}}^{⊤} α^{⊤}

where

F_{U_{δ}}

,

G_{U_{δ}^{*}}

are the matrices

\int_{0}^{\infty} f (x) e^{U_{δ} x} d x

, respectively

\int_{0}^{\infty} g (y) e^{U_{δ}^{*} y} d y

.

Proof.

Since

α^{*} e^{- U_{δ}^{*} y} e_{k} \in R^{1}

, it may be replaced by its transpose

e_{k}^{⊤} e^{- {U_{δ}^{*}}^{⊤} y} {α^{*}}^{⊤}

in (9), and this gives the first identity in (i); the second follows by transposition. Part (ii) then follows by integration. ▯

For computations, the matrix form rather than the sum form is convenient for keeping the code short and transparent, and in many software packages like MATLAB, it also is faster. In a number of our examples, the matrices

F_{U}

,

G_{U^{*}}

come out in a simple form, avoiding integration. Here is a first example:

Example 1.

For the HWB, we need to compute:

E [e^{- δ τ} max (a {\bar{S}}_{τ}, S_{τ})] = S_{0} E [e^{- δ \bar{τ}} e^{M_{τ}} \cdot e^{- δ \underset{̲}{τ}} max (a, e^{- D_{τ}})] .

(10)

Thus (assuming

S_{0} = 1

w.l.o.g.), we have

f (x) = e^{x}

,

g (y) = max (a, e^{- y})

in Part (ii) of Corollary 1, and we get:

\begin{matrix} F_{U_{δ}} & = \int_{0}^{\infty} e^{x} e^{U_{δ} x} d x = - {(I + U_{δ})}^{- 1}, \\ G_{U_{δ}^{*}} & = a \int_{- log a}^{\infty} e^{U_{δ}^{*} y} d y + \int_{0}^{- log a} e^{- y} e^{U_{δ}^{*} y} d y \\ = - a e^{- U_{δ}^{*} log a} {U_{δ}^{*}}^{- 1} + (I - a e^{- U_{δ}^{*} log a}) {(I - U_{δ}^{*})}^{- 1} . \end{matrix}

⋄

Corollary 2.

Define

S_{δ} = \int_{0}^{\infty} e^{U_{δ} y} Δ_{r} e^{{U_{δ}^{*}}^{⊤} y} d y

. Then:

E [e^{- δ τ}; X_{τ} \in d x] = \{\begin{matrix} α e^{U_{δ} x} S_{δ} {α^{*}}^{⊤} & f o r x > 0 \\ α^{*} e^{- U_{δ}^{*} x} S_{δ}^{⊤} α^{⊤} & f o r x < 0 . \end{matrix}

Proof.

For

x > 0

, we have:

E [e^{- δ τ}; X_{τ} \in d x] = \int_{0}^{\infty} E [e^{- δ τ}; M_{τ} \in d x + y, D_{τ} \in d y],

and so, the result follows immediately by Part (ii) of Corollary 1. The proof for

x < 0

is similar; one only has to note that if

S_{δ}^{*}

is defined as

S_{δ}

with

U_{δ}

and

U_{δ}^{*}

interchanged, then

S_{δ}^{*} = S_{δ}^{⊤}

. ▯

Remark 2.

The distribution of

X_{τ}

restricted to

(0, \infty)

is in fact (defective) PH, but in general, with a different generator

\tilde{U}

than

U = U_{0}

(for this one would have needed that

S_{δ} {α^{*}}^{⊤}

be proportional to

- U e

). This follows from the general fact that if

α, s \geq 0

and

T

is a phase generator, but possibly

s \neq - T e

, then

α e^{T x} s = \tilde{α} e^{\tilde{T} x} \tilde{t}

where:

\tilde{α} = α Δ_{v}, \tilde{T} = Δ_{v}^{- 1} T Δ_{v}, \tilde{t} = - \tilde{T} e = Δ_{v}^{- 1} s with v = - T^{- 1} s,

and here,

\tilde{T}

is a phase generator; all needed positivity properties follow from

S_{δ}

and

- T^{- 1}

being non-negative.

Similar remarks apply to the negative part of

X_{τ}

. These observations may be seen as an extension to jump diffusions at PH times of the standard fact that Brownian motion at an independent exponential time has a two-sided exponential distribution (often called the Laplace distribution). A similar result for matrix-exponential distributions is on p. 478 of (Bladt and Nielsen 2017), but is in terms of Laplace transforms rather than densities.

Example 2.

For the GMDB, the price is (taking

S_{0} = 1

):

\begin{matrix} E [e^{- δ τ} max (S_{τ}, K)] = K E [e^{- δ τ}; X_{τ} \leq log K] + E [e^{- δ τ + X_{τ}}; X_{τ} > log K] . \end{matrix}

In the most common case

K < S_{0}

, the first term equals:

\begin{matrix} K \int_{- \infty}^{log K} α^{*} e^{- U_{δ}^{*} x} S_{δ}^{⊤} α^{⊤} d x = - K α^{*} {U_{δ}^{*}}^{- 1} e^{- U_{δ}^{*} log K} S_{δ}^{⊤} α^{⊤} \end{matrix}

when

S_{0} = 1

, whereas the second is:

\begin{matrix} \int_{log K}^{0} e^{x} α^{*} e^{- U_{δ}^{*} x} S_{δ}^{⊤} α^{⊤} d x + \int_{0}^{\infty} e^{x} α e^{U_{δ} x} S_{δ} {α^{*}}^{⊤} d x \\ = α^{*} {(I - U_{δ}^{*})}^{- 1} (I - K e^{- U_{δ}^{*} log K}) S_{δ}^{⊤} α^{⊤} - α {(I + U_{δ})}^{- 1} S_{δ} {α^{*}}^{⊤} . \end{matrix}

⋄

Example 3.

The calculation of reserves:

V_{t} = E [e^{- δ (τ - t)} Φ (M_{τ}, D_{τ}) | X_{s}, 0 \leq s \leq t, τ > t]

is closely related to price calculations. However, the details may be somewhat more cumbersome since for a fixed t, the expressions for

M_{τ}, D_{τ}

have two components, the quantities

m_{0} = {max}_{s \leq t} X_{s}

and

d_{0} = m_{0} - X_{t}

, which are known at time t, and the unknown random ones:

m_{1} = max_{t \leq s < τ} (X_{s} - X_{t}), d_{1} = m_{1} - x_{1} where x_{1} = X_{τ} - X_{t} .

In terms of these and the relations:

\begin{matrix} M_{τ} & = max (m_{0}, x_{0} + m_{1}), \\ D_{τ} & = max (m_{0}, x_{0} + m_{1}) - x_{0} - x_{1} = max (d_{0}, d_{1}) - x_{1} \\ = {[d_{0} - m_{1}]}^{+} + d_{1}, \end{matrix}

we can write

Φ (M_{τ}, D_{τ}) = \tilde{Φ} (m_{1}, d_{1})

, where:

\tilde{Φ} (x, y) = Φ (max (m_{0}, x_{0} + x), {[d_{0} - x]}^{+} + y) .

Since the conditional distribution of

τ - t

given

τ > t

is PH

(α_{t}, T)

where

α_{t} = α e^{T t} / α e^{T t} e

, this gives:

V_{t} = \int_{0}^{\infty} \int_{0}^{\infty} \tilde{Φ} (x, y) α_{t} e^{U_{δ} x} Δ_{r} e^{{U_{δ}^{*}}^{⊤} y} {α^{*}}^{⊤} d y d x

(11)

(note that for the reversal, we need not calculate

α^{*}

by replacing

α

by

α_{t}

in (5), but can use the same

α^{*}

for all t, cf. Section 2.2 of (Asmussen and Ivanovs 2019)).

Example 4.

For the GMDB, we have:

Φ (M_{τ}, D_{τ}) = max (K, e^{X_{τ}}) = max (K, e^{x_{0} + x_{1}}) = e^{x_{0}} max (\tilde{K}, e^{x_{1}})

where

\tilde{K} = K e^{- x_{0}}

. Combining with Remark 2, we therefore arrive at the same expression as in Example 2 when

K e^{- x_{0}} \leq 1

, only with K replaced by

\tilde{K}

and

α

by

α_{t}

. The case

K e^{- x_{0}} > 1

may arise by randomness, but is just a minor modification.

Example 5.

The details for the HWB reserves are somewhat more complicated than for the GMDB. Here, one needs to consider

e^{M_{τ}} max (a, e^{- D_{τ}})

, corresponding to:

\tilde{Φ} (x, y) = exp \{max (m_{0}, x_{0} + m_{1})\} max (a, exp \{- {[d_{0} - m_{1}]}^{+} - d_{1}\} .

One approach (potentially the easier one) may be simply to insert in (11) and do the two-dimensional numerical integration. However, in fact, one can reduce to a one-dimensional numerical integration similar to Corollary 2. To this end, note that

e^{M_{τ}} max (a, e^{- D_{τ}})

after simple derivations comes out as:

{\begin{cases} e^{x_{0} + m_{1}} a & if m_{1} > d_{0}, d_{1} > - log a, & (12 a) \\ e^{x_{0} + m_{1} - d_{1}} & if m_{1} > d_{0}, d_{1} \leq - log a, & (12 b) \\ e^{m_{0} a} & if m_{1} \leq d_{0}, d_{0} - m_{1} + d_{1} > - log a, & (12 c) \\ e^{m_{0}} e^{- d_{0} + m_{1} - d_{1}} & if m_{1} \leq d_{0}, d_{0} - m_{1} + d_{1} \leq - log a . & (12 d) \end{cases}

of these four possibilities, (12d) (and to a slightly lesser extent, (12c)) is the most complicated one with which to deal. The contribution to the reserve from (12c) is zero if

m_{0} < d_{0} + log a

, since then, the defining region is empty. Otherwise, it becomes:

e^{- δ t + m_{0} - d_{0}} \int_{m_{0} + log a}^{d_{0}} \int_{0}^{x - d_{0} - log a} e^{x - y} α_{t} e^{U_{δ} x} Δ_{r} e^{{U_{δ}^{*}}^{⊤} y} {α^{*}}^{⊤} d y d x

which after two integrations reduces to an expression, which, though complicated, only involves known matrices and their exponentials, except that:

S_{δ} (b, c) = \int_{b}^{c} e^{U_{δ} x} Δ_{r} e^{{U_{δ}^{*}}^{⊤} x} d x

is left to be determined by numerical integration. ⋄

6. Numerical Examples

For the underlying stock price process

S_{t} = S_{0} e^{X_{t}}

, the most popular model is the geometric Brownian motion model. Since this model cannot capture the asymmetric leptokurtic features and the volatility smile, many modified models have been proposed in the literature; one of these is the jump diffusion model, cf. (Kou 2002). Empirical studies show that the daily return distribution tends to have larger kurtosis than the distribution of monthly returns, and the jump diffusion model can capture this feature; see (Das and Foresi 1996). As an illustrative example, we suppose that the yearly interest rate is

r = 3 %

and the yearly volatility

σ = 0.25

. The jumps follow a compound Poisson process with upward jump rate

λ_{1} = 3

with the jump sizes being exponential with rate

ν_{1} = 50

(mean 1/50) and downward jump rate

λ_{2} = 2

with jump size being exponential with rate

ν_{2} = 30

. When considering the valuation of an insurance product alone, it is not necessary to use a risk-neutral valuation approach. However, since in our setting, the underlying involves an equity, it is natural to use an equivalent martingale measure

Q

, defined by the requirement

log E_{Q} e^{X (1)} = r

. This does not define

Q

uniquely since the market is incomplete due to the jumps and the mortality risk. Our choice was to let the parameters

σ^{2}, λ_{1}, λ_{2}, ν_{1}, ν_{2}

remain unchanged, which gives the risk-neutral Brownian drift as:

μ = r - \frac{σ^{2}}{2} - \frac{λ_{1}}{ν_{1} - 1} + \frac{λ_{2}}{ν_{2} + 1} .

For the illustrative example, we have:

μ = 0.03 - \frac{{0.25}^{2}}{2} - \frac{3}{50 - 1} + \frac{2}{30 + 1} = 0.002042 .

It is left to compute the

U_{δ}, U_{δ}^{*}

. Writing:

U = U (T, μ, σ^{2}, λ_{1}, λ_{2}, ν_{1}, ν_{2})

we obtain:

U^{*} = U (T^{*}, - μ, σ^{2}, λ_{2}, λ_{1}, ν_{2}, ν_{1}), U_{δ} = U (T - δ I, μ, σ^{2}, λ_{1}, λ_{2}, ν_{1}, ν_{2}),

and thus, it suffices to consider

U

. Essentially, two types of algorithms are available. One set is based on finding roots in the complex plane of equations of the form

κ (s) = a

where

κ (s) = log E e^{s X (1)}

is the Lévy exponent of X. The other uses fixed-point equations of the form

U = ψ (U)

and iteration,

U = {lim}_{n \to \infty} U_{n}

, with

U_{n} = ψ (U_{n - 1})

and

U_{0}

suitably chosen.

We give a few examples of previous algorithms of both types in Appendix B. When implementing these with

τ

having one of the GC representations fitted in Section 3, our experience was that the root-finding method was only feasible for a rather small number p of phases. More precisely, for

p = 50

or

p = 75

, meaningless results were produced, with warnings of the matrix

H

being close to singular. This is understandable since many eigenvalues of the relevant matrices are almost identical. Since the iterative algorithms ran without problems also for large values of p, we have therefore concentrated on these. In fact, we have developed a new iterative algorithm, which may be simpler than previous ones. It is presented in Appendix B, but for now, we concentrate on some numerical examples.

Example 6.

In Figure 7, we have plotted the marginal densities of

M_{τ}

,

D_{τ}

, and

X_{τ}

.

Example 7.

Figure 8 gives the joint distribution of

M_{τ}

and

D_{τ}

(computed by Corollary 1) for our jump diffusion with the parameters given above.

Example 8.

To consider the sensitivity of the algorithms to the value p of chosen phases for

τ

, we took

δ = r = 0.03

or 0.00 and computed the price (10) for

p =

20, 35, 40, 50, 75, 100 and the GC fits presented in Section 3 for two products, the HWB with

a = 0.85

and the GMDB with

K = 0.85

; the relevant formulas are available from Examples 1 and 2. The results are in Table 1. We also complemented with a Monte Carlo simulation estimate with an associated 95% confidence interval based on

R = 10^{6}

replications, where

τ

was generated from the discrete life table distribution with an additional randomization within the year.

A related question is how varying different fits can be for a fixed p due to issues such as non-uniqueness of PH representations and possibly local maxima of the likelihood. We illustrate this point in Table 2, which is a parallel of Table 1. We fix p at 50, but use five different initial values (seeds) for the EM algorithm, all allowed the same running time. The fitted densities are in Figure 9.

Our conclusion is that the differences are too minor to be of any concern, and certainly, using a single fit with

p \geq 50

will produce acceptable results. Insisting on more than 2–3-digit precision in price estimates does not make sense due to issues such as model uncertainty and that payments will be adjusted later via bonus and dividend arrangements.

The programming was done in MATLAB on a 2014 MacBook Air with a 1.4 GHz Intel Core i5 processor. As an example of execution times, MATLAB’s tic-toc command reported approximately two seconds for each HWB or GMDB value using the PH methodology. The figures for the Monte Carlo values were much higher, approximately 30,000 replications per minute (recall that

10^{6}

replications were used). The programming is very simple, with the main effort lying in writing a routine for computing

U

for a given set of input parameters. This amounted to 50 lines of code in our implementation. Given this ease, we did some further price computations, even if our aim here is not to illustrate pricing aspects, but rather the computational ones.

Example 9.

We first consider varying the discount rate

δ

, but not the force of interest r, cf. the remarks following (4) (that in particular motivate taking

δ \leq r

). The results for the HWB are in Table 3 and show the expected picture, that the price is decreasing in the discount rate.

Example 10.

The next example is discounting with the force of interest, i.e., taking

r = δ

and varying r. The numerical values are in Table 4, and the picture is again that the price is decreasing in

r = δ

. The intuition is less clear-cut here, since the Brownian drift

μ

is increasing and thus pulls in the opposite direction.

Example 11.

Next, we give an illustration of a feature of the factorization: if

τ

is exponential, then

M_{τ}

and

D_{τ}

are independent. cf. Lemma 1. In the PH case, there is dependence caused by the common value of

\bar{τ} = k

. How does this influence the price? A numerical illustration is in Table 5, where the entries are the ratios between the correct price with dependence and the incorrect one computed assuming independence, i.e., as:

\begin{matrix} E e^{- δ \bar{τ}} e^{M_{τ}} \cdot E e^{- δ \underset{̲}{τ}} max (a, e^{- D_{τ}}) \\ = \int_{0}^{\infty} e^{x} α e^{U_{δ} x} u d x \cdot (a \int_{- log a}^{\infty} α^{*} e^{U_{δ}^{*} y} u^{*} d y + \int_{0}^{- log a} e^{- y} α^{*} e^{U_{δ}^{*} y} u^{*} d y) \\ = - α {(I + U_{δ})}^{- 1} u (- a α^{*} e^{- U_{δ}^{*} log a} {U_{δ}^{*}}^{- 1} u^{*} + α^{*} (I - a e^{- U_{δ}^{*} log a}) {(I - U_{δ}^{*})}^{- 1} u^{*}) . \end{matrix}

The other settings are as in Table 3. Table 5 shows that in some cases, dependence indeed influences the price quite a bit.

Example 12.

As a follow-up of Example 3, we finally consider the evaluation of reserves. Two paths of the jump diffusion were simulated, and the reserve was calculated for each value of t using the formulas of Examples 3, 4 and 5. The results are given in the graphs of Figure 10. The upper panel gives the stock price

S_{a - 35} = e^{X_{a - 35}}

(assuming

S_{35} = 1

) and the lower one the ratio

V_{a - 35} / e^{X_{a - 35}}

as function of the age a of the insured.

When

δ = r

,

\begin{matrix} V_{t} & = E [e^{- δ (τ - t)} max (K, e^{X_{τ}}) | τ > t, X_{t}] \\ \geq E [e^{- δ (τ - t)} e^{X_{τ}}) | τ > t, X_{t}] = e^{X_{t}} \end{matrix}

because of the martingale property. Hence, the ratio in the lower panel should always be at least one. This is also confirmed from the figure, as well as the expectation that the ratio should be close to one when

X_{t}

is large.

In the calculations, we used the fact that

α^{*}

and

U^{*}

need not be recalculated for each t, cf. the remarks following (11). This gave a substantial speed-up.

7. Erlangization and Extrapolation

A multitude of papers in finance, insurance, and queuing theory explore the idea of Erlangization or Canadization, that is approximation of a deterministic time T by an Erlang

(q)

time

E = E_{q, q / T}

with the same mean. Here, q is the number of stages and

q / T

is the rate, so that

E_{q, q / T}

is the sum of q mean

T / q

exponential r.v.’s. This idea has various applications: explicit identities (see, e.g., (Asmussen et al. 2002; Carr 1998; Stanford et al. 2005)); this is used in (Kuznetsov et al. 2011) together with the traditional Wiener-Hopf factorization to provide an algorithm simulating

(M_{E}, X_{E})

to approximate

(M_{T}, X_{T})

and as a numerical tool for calculating

f (T)

as the limit of

E f (E_{q, q / T})

(or just

f (E_{q, q / T})

) under suitable continuity conditions on f; see, e.g., (Asmussen et al. 2002; Carr 1998; Leung et al. 2015). This last procedure uses the fact that

E_{q, q / T}

converges in probability to T as

q \to \infty

. In (Asmussen et al. 2002), it is combined with extrapolation, which is based on the asymptotics:

E f (E_{q, q / T}) = f (T) + \frac{C}{q} + O (q^{- 2})

(13)

valid for f smooth and some

C = C (f)

. This C is typically not available, but (13) suggests eliminating C by estimating

f (T)

using:

q E f (E_{q, q / T}) - (q - 1) E f (E_{(q - 1), q / T})

(14)

which converges to

f (T)

at the improved rate

O (q^{- 2})

. The numerical examples of (Asmussen et al. 2002) indicate that often, calculations with q set as small as 2–4 will give very good precision.

The relevance for the present purposes is that the time horizon of many life insurance contracts is not the remaining lifetime

τ

of the insured, but rather

T \land τ

, where T is a fixed deterministic number. For example, when the age of the insured is 35 at the time of underwriting,

T = 45

corresponds to contract expiry at age 80 if the insured is still alive. Some calculations can be found in Section 10 of (Gerber et al. 2012) for

τ

exponential and a call or put option, but rely heavily on this particular structure. For our life insurance applications, it is crucial that

E_{q, q / T} \land τ

is again PH when

E_{q, q / T}

and

τ

are independent. The details involve the Kronecker product ⊗ and sum ⊕ so that:

A \oplus B = A \otimes I + I \otimes B, (A \oplus B) (v \otimes h) = A v \otimes h + v \otimes B .

For two Markov process generators, the Kronecker sum is the generator of independent versions, and so,

E_{q, q / T} \land τ

is PH with generator

E \oplus T

of dimension

n q \times n q

where

E = E_{q, q / T}

is the PH generator of

E_{q, q / T}

. Written out with lexicographical order of the states:

E \oplus T = (\begin{matrix} T - β I & β I & 0 & \dots & 0 & 0 \\ 0 & T - β I & β I & \dots & 0 & 0 \\ ⋮ & ⋱ & \dots \\ 0 & 0 & 0 & \dots & T - β I & β I \\ 0 & 0 & 0 & \dots & 0 & T - β I \end{matrix}) .

Example 13.

We consider a HWB or GMDB contract signed at age 35 of the insured and expiring at the time

τ

of death or at age 70, whatever comes first. We took

r = δ = 3 %

and used the GC fit with

p = 50

, leading to the numerical values given in Table 6; in comparison, Monte Carlo simulation with

R =

50,000 replications gave the 95% confidence interval 1.636 ± 0.034 for the HWB and 1.097 ± 0.029 for the GMDB.

The extrapolation is done via (14), with the required smoothness being obvious for the GMDB. For the HWB,

d \mapsto max (a, e^{- d})

is not smooth at

d = - log a

, but some rough heuristics (that we omit) suggest that (13) remains in force for this special f. Certainly, extrapolation appears to produce sensible results also for the GMDB. In both cases, it seems that

q =

4–6 is enough to obtain sufficiently precise estimates, whereas without extrapolation, one would need at least

q = 10

. This is a considerable improvement upon simple Erlangization, since the dimension of the matrices in the computations (where inversion is the heavy part) is

2 p q

. This means that for

p = 50

, one would need to invert matrices of dimension 400 when

q = 4

, but 1000 when

q = 10

, and since the inversion of an

n \times n

matrix has complexity

n^{3}

(or slightly less with sophisticated methods), this corresponds to reducing computation time by a factor of about

{2.5}^{3} = 15.6

.

8. Conclusions

In this paper, we have considered pricing calculations for life insurance products like the guaranteed minimum death benefit and the high-water benefit. The payout occurs at the death time

τ

of the insured, and its size is linked to an asset price S either via its current value

S_{τ}

or to features of its past evolution during the time of the contract such as the maximum

{\bar{S}}_{τ}

.

The key observation is that for the most popular model of S, an exponential jump diffusion, the distributions of

S_{τ}

and/or

{\bar{S}}_{τ}

are available when the jumps and

τ

are phase-type. This has long been known in the geometric Brownian motion setting without jumps when

τ

is exponential and is essentially just the Wiener-Hopf factorization of Brownian motion. There is also an extensive literature for PH jumps, and for

τ

itself being PH, the relevant Wiener-Hopf factorization has recently been developed in (Asmussen and Ivanovs 2019).

The program thus involves two steps, first fitting a PH distribution of

τ

to lifetime data and next implementing the pricing calculations.

The shape of lifetime data with a steep decline after the mode makes the fitting of PH distributions non-trivial, necessitating the number p of phases to be relatively high. However, we already made the points that a good fit is one that produces reliable estimates of the prices of the life insurance products, and that giving more than 2–3-digit precision in price estimates does not make sense due to issues such as uncertainty about the model and its parameters (think, e.g., of longevity risk) and bonus/dividend arrangements. We found

p = 50

to be more than sufficient in the sense of providing appropriate accuracy of the calculated price of life insurance products and could well have worked with smaller values. An alternative is developed in GSY, to use matrix-exponentials fitted at a selected number of points, but we show in Appendix A that this approach has its pitfalls.

Fitting of so high-dimensional PH distributions is slow, though we developed in part some new software giving substantial speed-up. However, given this step has been passed, pricing calculations are both very easy to program and run very fast, also compared to Monte Carlo simulation, and this may be the main advantage of our approach. A limitation is of course that we need the pay-out to be of the form (1), and there are many products for which this is not the case. Alternatives such as Monte Carlo simulation and traditional methods based on Thiele-type differential equations may then be more suitable. Therefore, we do not claim the superiority of our own approach for all purposes, but have rather tried to clarify its advantages and disadvantages.

Author Contributions

Conceptualization and methodology, S.A. and H.Y.; software and validation, S.A. and P.J.L.; writing–original draft preparation, all authors; writing–review and editing, P.J.L.

Funding

Hailiang Yang was funded by Research Grants Council of the Hong Kong Special Administrative Region (Project No. HKU 17305018). Patrick J. Laub conducted this research with the support of the ANR under the “Investments for the Future” program and the support of DAMI—Data Analytics and Models for Insurance—Chair under the aegis of the Fondation du Risque, a joint initiative by UCBL (Université Lyon 1, SAF EA2429) and BNP Paribas Cardif.

Acknowledgments

We are grateful to Jevgenijs Ivanovs for useful discussions and comments.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; nor in the decision to publish the results.

Appendix A. Fits Using Sum-of-Exponentials

We here briefly reinspect the approach of (Siu et al. 2015) for fitting a distribution with tail of the form:

{\bar{F}}_{a, b} (t) = \sum_{i = 1}^{ℓ} a_{i} e^{- b_{i} t}

to lifetime data where

a

is the vector

(a_{1}, \dots, a_{ℓ})

(some

a_{i}

may be negative) and similarly for

b

. Such a tail is of matrix-exponential form, meaning

\bar{F} (t) =

a e^{Δ t} s

for suitable

a, Δ, s

(not restricted to the Markov process interpretation in PH tails). In fact, the only restriction is that

Δ

must be diagonalizable. We refer to (Bladt and Nielsen 2017) for further background on matrix-exponential distributions.

In (Siu et al. 2015), the vectors

a

and

b

are found by minimizing:

\sum_{t = 1}^{n} {(_{t} p_{x} - {\bar{F}}_{a, b} (t))}^{2}

(A1)

for some fixed ℓ and n, where

_{t} p_{x}

is the survival probability (derived from the life table) of a person aged x surviving until age

x + t

. In other words, they minimize the mean squared error between the empirical and theoretical survival probabilities at ages x,

x + 1

, ...,

x + n

.

In (Siu et al. 2015), the values

x = 30

and

n = 25

are used as examples, and so, the age span considered begins at 30 and goes until 55. We fit this type of approximation to our life table with similar parameters and plotted the resulting densities; we used

ℓ = 10

exponential terms, the same as in (Siu et al. 2015), but we set the starting age as

x = 35

rather than

x = 30

.

The relevant ages for life insurance are often greater than 60 years old, and so, we create a sequence of fits with

n = 25, 50, 75

(shown in Figure A1 and Figure A2a) to try to approximate accurately an increasing portion of the life table. For each fit, we plot time across the whole age range of 35–110, even though one cannot expect a sum-of-exponentials fit to be accurate outside of the time period

t \in (x, x + n)

.

Figure A1. Sum-of-exponentials fits in age range 35–60.

Figure A2. Sum-of-exponentials fits in age ranges 35–85 (left) and 35–110 (right).

In Figure A1 and Figure A2a, the fits are fine in the selected age range

(x, x + n)

, but fail miserably beyond. This is not necessarily a drawback if the fit has been made for the pricing of a product where

x + n

is the upper limit of the relevant age range. The picture in Figure A2b is more rough, but presumably repairable by increasing ℓ beyond

ℓ = 10

. Another phenomenon that is observed is the possibility of densities that are negative and exhibit weird oscillations. This is seen most markedly in Figure A2b.

This behavior is a result of the fitting criterion (A1), which only pays attention to the selected grid

x, x + 1, \dots, x + n

, and not to what goes on in between the grid points.

Appendix B. Algorithms for Computing U

As an example of the root-finding algorithms, we state the following particularly simple example, which is a special case of Corollary 5.2 of (Asmussen 1995):

Proposition A1.

Assume X is BM

(μ, σ^{2})

and that τ is GC, as in Figure 1. Then,

P_{i} (M_{τ} \in d y; J_{\bar{σ}} = k) = e_{i}^{⊤} e^{U y} e_{k} d y

where:

$U = H Δ_{- s} H^{- 1}$ ;
$s$ is the vector with elements $s_{j} = - μ / σ^{2} + \sqrt{μ^{2} / σ^{4} + 2 λ_{j} / σ^{2}}$ ;
$H$ is the matrix with columns $h_{1}, \dots, h_{n}$ ;
$h_{j}$ is the eigenvector of $T + λ_{j} I$ corresponding to eigenvalue zero, with elements $h_{i j}$ given by $h_{i j} = 0$
for $i > j$ , $h_{1 j} = 1$ and:

$h_{i j} = \prod_{k = 1}^{i - 1} \frac{λ_{k} - λ_{j}}{λ_{k} q_{k}} for 1 < i \leq j .$

In particular, $P (J_{\bar{σ}} = k) = - α U^{- 1} e_{k} \cdot u_{k}$ where $u_{k}$ is the $k^{th}$ element of $- U e$ .

Remark A1.

For a GC distribution, the characteristic polynomial, its roots, and the eigenvalues are explicit because of the upper triangular form of the generator matrix. This may of course be a curiosity in the computer age, and it is probably faster to code the program of Proposition A1 via matrix software than to use the explicit formulas. However, continuing in the same vein, note that not only is the determinant of an upper triangular matrix explicit, but also the inverse. Indeed, let

k

be the vector of diagonal elements of

H

and write

H = Δ_{k} (I + N)

, i.e.,

N = Δ_{1 / k} (H - I)

. Then,

N

is nilpotent and:

H^{- 1} = {(I - N)}^{- 1} Δ_{1 / k} = \sum_{k = 1}^{n - 1} {(- 1)}^{k} N^{k} Δ_{1 / k} .

The iterative algorithms are hardly simpler for the GC case than for a general PH structure. An example (Theorem 4.1 of (Asmussen 1995)) follows.

Proposition A2.

Assume X is BM

(μ, σ^{2})

and that τ is PH

(α, T)

. Define

λ_{i} = - T_{i i}

,

ω_{i} = - μ / σ^{2} + \sqrt{μ^{2} / σ^{4} + 2 λ_{j} / σ^{2}}, η_{i} = μ / σ^{2} + \sqrt{μ^{2} / σ^{4} + 2 λ_{j} / σ^{2}} .

Then,

U

is the solution of the fixed-point problem

U = ψ (U)

where

ψ (V)

is the matrix whose

i^{th}

row is the

i^{th}

row of the matrix:

\frac{2}{σ^{2}} (Δ_{λ} + T) {(η_{i} I - V)}^{- 1} - Δ_{ω} .

Further,

U = {lim}_{n \to \infty} U_{n}

with

U_{n} = ψ (U_{n - 1})

and

U_{0} = - Δ_{ω}

.

In our numerical examples, we have used an algorithm that is somewhat related, but also contains some genuinely new ideas. It is applicable to more general models than jump diffusions and thereby also in other application areas than life insurance. For this reason, the details will be provided elsewhere, and we give here only a very brief outline. The basic idea is illustrated in Figure A3, which contains a realization of the jump diffusion X up to the PH time

τ

. The underlying Markov process has two states, blue b and red r for

τ

, two cyan c and magenta m for the upward jumps, and one green g for the downward ones. The phases when X is at a maximum are shown in the left-hand vertical bar and have four possible states b, r, cb, mb, cr, mr with legends given in the upper right corner.

Figure A3. Jump diffusion and embedded Markov random walk.

The basic observation for the algorithm is that this process

ξ (x)

of phases at maxima x (and thereby, the overall maximum

M_{τ}

) is the same as for the embedded discrete time killed Markov random walk marked by black dots. Its values are the values in the graph of X either at the start of a Brownian segment, just before the end, or at the maximum, or at phase changes in jumps. Here, the graph of X is understood to comprise the vertical parts from jumps, and Brownian segments are separated either by state changes in the Markov process underlying the phase horizon

τ

, or by jumps. The generator

U

of

ξ

is then shown to satisfy a certain fixed-point equation that is solved iteratively. Computationally, the main effort in each step is to invert a matrix

U_{n}

whose dimension is the number

p (1 + p^{+})

of states of

ξ

.

References

Asmussen, Søren. 1995. Stationary distributions for fluid flow models with or without brownian noise. Stochastic Models 11: 21–49. [Google Scholar] [CrossRef]
Asmussen, Søren. 2003. Applied Probability and Queues. In Stochastic Modelling and Applied Probability Series, 2nd ed. Berlin: Springer, vol. 51. [Google Scholar]
Asmussen, Søren, and Hansjörg Albrecher. 2010. Ruin Probabilities. In Advanced Series on Statistical Science and Applied Probability, 2nd ed. Singapore: World Scientific, vol. 14. [Google Scholar]
Asmussen, Søren, Florin Avram, and Martijn R. Pistorius. 2004. Russian and american put options under exponential phase-type lévy models. Stochastic Processes and Their Applications 109: 79–111. [Google Scholar] [CrossRef]
Asmussen, Søren, Florin Avram, and Miguel Usabel. 2002. Erlangian approximations for finite-horizon ruin probabilities. ASTIN Bulletin 32: 267–81. [Google Scholar] [CrossRef]
Asmussen, Søren, and Jevgenijs Ivanovs. 2019. A factorization of a Lévy process over a phase-type horizon. Stochastic Models. accepted. [Google Scholar] [CrossRef]
Asmussen, Søren, Patrick J. Laub, and Hailiang Yang. 2019. Online accompaniment for “Phase-Type Models in Life Insurance: Fitting and Valuation of Equity-Linked Benefits”. Available online: https://github.com/Pat-Laub/PhaseTypeLifeInsurance (accessed on 4 February 2019).
Asmussen, Søren, Olle Nerman, and Marita Olsson. 1996. Fitting phase-type distributions via the EM algorithm. Scandinavian Journal of Statistics 23: 419–41. [Google Scholar]
Bean, Nigel G., Małgorzata M. O’Reilly, and Peter G. Taylor. 2005. Algorithms for return probabilities for stochastic fluid flows. Stochastic Models 21: 149–84. [Google Scholar] [CrossRef]
Bladt, Mogens. 2005. A review on phase-type distributions and their use in risk theory. ASTIN Bulletin 35: 145–67. [Google Scholar] [CrossRef]
Bladt, Mogens, and Bo Friis Nielsen. 2017. Matrix-Exponential Distributions in Applied Probability. Berlin: Springer. [Google Scholar]
Bowers, Newton L., Hans U. Gerber, James C. Hickman, Donald A. Jones, and Cecil J. Nesbitt. 1997. Actuarial Mathematics, 2nd ed.Schaumburg: The Society of Actuaries. [Google Scholar]
Breuer, Lothar. 2008. First passage times for Markov additive processes with positive jumps of phase type. Journal of Applied Probability 45: 779–99. [Google Scholar] [CrossRef]
Cai, Jun, and Haijun Li. 2005a. Conditional tail expectations for multivariate phase type distributions. Journal of Applied Probability 42: 810–25. [Google Scholar] [CrossRef]
Cai, Jun, and Haijun Li. 2005b. Multivariate risk model of phase type. Insurance: Mathematics and Economics 36: 137–52. [Google Scholar] [CrossRef]
Carr, Peter. 1998. Randomization and the american put. The Review of Financial Studies 11: 597–626. [Google Scholar] [CrossRef]
Cont, Rama, and Peter Tankov. 2004. Financial Modelling with Jump Processes. London: Chapman & Hall, CRC. [Google Scholar]
Das, Sanjiv Ranjan, and Silverio Foresi. 1996. Exact solutions for bond and option prices with systematic jump risk. Review of Derivatives Research 1: 7–24. [Google Scholar] [CrossRef]
Dieker, Antonius B., and Michel Mandjes. 2011. Extremes of markov-additive processes with one-sided jumps, with queueing applications. Methodology and Computing in Applied Probability 13: 221–67. [Google Scholar] [CrossRef]
Egami, Masahiko, and Kazutoshi Yamazaki. 2014. Phase-type fitting of scale functions for spectrally negative Lévy processes. Journal of Computational and Applied Mathematics 264: 1–22. [Google Scholar] [CrossRef]
Gerber, Hans U., Elias S. W. Shiu, and Hailiang Yang. 2012. Valuing equity-linked death benefits and other contingent options: A discounted density approach. Insurance: Mathematics and Economics 51: 73–92. [Google Scholar] [CrossRef]
Gerber, Hans U., Elias S. W. Shiu, and Hailiang Yang. 2013. Valuing equity-linked death benefits in jump diffusion models. Insurance: Mathematics and Economics 53: 615–23. [Google Scholar] [CrossRef]
Gerber, Hans U., Elias S. W. Shiu, and Hailiang Yang. 2015. Geometric stopping of a random walk and its application to valuation of equity-linked death benefits. Insurance: Mathematics and Economics 64: 313–25. [Google Scholar]
Horváth, Gábor, and Miklós Telek. 2017. Matrix-analytic solutions of infinite, finite and level-dependent second-order fluid queues. Queueing Systems: Theory and Applications 87: 325–43. [Google Scholar] [CrossRef]
Jacobsen, Martin. 2005. The time to ruin for a class of markov additive risk process with two-sided jumps. Advances in Applied Probability 37: 963–92. [Google Scholar] [CrossRef]
Jiang, Zhengjun, and Martijn R. Pistorius. 2008. On perpetual american put valuation and first-passage in a regime-switching model with jumps. Finance and Stochastics 12: 331–55. [Google Scholar] [CrossRef]
Kou, Steven G. 2002. A jump diffusion model for option pricing. Management Science 48: 1086–101. [Google Scholar] [CrossRef]
Kuznetsov, Alexey, Andreas E. Kyprianou, Juan C. Pardo, and Kees van Schaik. 2011. A Wiener–Hopf Monte Carlo simulation technique for Lévy processes. The Annals of Applied Probability 21: 2171–90. [Google Scholar] [CrossRef]
Leung, Tim, Kazutoshi Yamazaki, and Hongzhong Zhang. 2015. An analytic recursive method for optimal multiple stopping: Canadization and phase-type fitting. International Journal of Theoretical and Applied Finance 18: 31. [Google Scholar] [CrossRef]
Lewis, Alan L., and Ernesto Mordecki. 2008. Wiener-Hopf factorization for Lévy processes having positive jumps with rational transforms. Journal of Applied Probability 45: 118–34. [Google Scholar] [CrossRef]
Lin, X. Sheldon, and Xiaoming Liu. 2007. Markov aging process and phase-type law of mortality. North American Actuarial Journal 11: 92–109. [Google Scholar] [CrossRef]
Mordecki, Ernesto. 2002. The distribution of the maximum of a Lévy process with positive jumps of phase-type. Theory of Stochastic Processes 8: 309–16. [Google Scholar]
Okamura, Hiroyuki, Tadashi Dohi, and Kishor S. Trivedi. 2011a. Improvement of expectation-maximization algorithm for phase-type distributions with grouped and truncated data. Applied Stochastic Models in Business and Industry 29: 141–56. [Google Scholar] [CrossRef]
Okamura, Hiroyuki, Tadashi Dohi, and Kishor S. Trivedi. 2011b. A refined EM algorithm for PH distributions. Performance Evaluation 68: 938–54. [Google Scholar] [CrossRef]
Olsson, Marita. 1996. Estimation of phase-type distributions from censored data. Scandinavian Journal of Statistics 23: 443–60. [Google Scholar]
Olsson, Marita. 1998. The Empht-Programme. Available online: http://home.math.au.dk/asmus/dl/EMusersguide.ps (accessed on 4 February 2019).
Pistorius, Martijn R. 2006. On maxima and ladder processes for a dense class of Lévy process. Journal of Applied Probability 43: 208–20. [Google Scholar] [CrossRef]
Siu, Chi Chung, Sheung Chi Yam, and Hailiang Yang. 2015. Valuing equity-linked deaths benefits in a regime-switching framework. ASTIN Bulletin 45: 355–95. [Google Scholar] [CrossRef]
Stanford, David A., Guy Latouche, Douglas G. Woolford, Dennis Boychuk, and Alek Hunchak. 2005. Erlangized fluid queues with application to uncontrolled fire perimeter. Stochastic Models 21: 631–42. [Google Scholar] [CrossRef]
Stanford, David A., Krzysztof J. Stroiński, and Karen Lee. 2000. Ruin probabilities based at claim instants for some non-Poisson claim processes. Insurance: Mathematics and Economics 26: 251–67. [Google Scholar] [CrossRef]
Zadeh, Amin Hassan, Bruce L. Jones, and David A. Stanford. 2014. The use of phase-type models for disability insurance calculations. Scandinavian Actuarial Journal 8: 714–28. [Google Scholar] [CrossRef]
Zadeh, Amin Hassan, and David A. Stanford. 2016. Bayesian and Bühlmann credibility for phase-type distributions with a univariate risk parameter. Scandinavian Actuarial Journal 2016: 338–55. [Google Scholar] [CrossRef]

Figure 1. A GC distribution.

Figure 2. Life table data.

Figure 3. Probability density functions and hazard rate functions for the life table and phase-type (PH) fits using the general and the generalized Coxian (GC) structures with

p = 20

phases.

Figure 3. Probability density functions and hazard rate functions for the life table and phase-type (PH) fits using the general and the generalized Coxian (GC) structures with

p = 20

phases.

Figure 4. Probability density functions and hazard rate functions for the life table and phase-type fits using the generalized Coxian structures with

p = 35, 40

phases.

Figure 4. Probability density functions and hazard rate functions for the life table and phase-type fits using the generalized Coxian structures with

p = 35, 40

phases.

Figure 5. Tails of the life table and GC fits with

p = 35, 40

phases.

Figure 5. Tails of the life table and GC fits with

p = 35, 40

phases.

Figure 6. Probability density functions and hazard rate functions for the life table and GC fits with

p = 50, 75, 100

phases.

Figure 6. Probability density functions and hazard rate functions for the life table and GC fits with

p = 50, 75, 100

phases.

Figure 7. Densities of (a)

M_{τ}

, (b)

D_{τ}

, and (c)

X_{τ}

.

Figure 7. Densities of (a)

M_{τ}

, (b)

D_{τ}

, and (c)

X_{τ}

.

Figure 8. Joint distribution of

M_{τ}

and

D_{τ}

. The joint density is in the left panel and a contour plot in the right.

Figure 8. Joint distribution of

M_{τ}

and

D_{τ}

. The joint density is in the left panel and a contour plot in the right.

Figure 9. Fitted densities for five different seeds (against the life table data).

Figure 10. Asset price paths and excess ratio of associated GMDB reserves.

Table 1. Price estimates for

δ = r

and varying p. HWB, high-water benefit; GMDB, guaranteed minimum death benefit.

Table 1. Price estimates for

δ = r

and varying p. HWB, high-water benefit; GMDB, guaranteed minimum death benefit.

p	20	35	40	50	75	100	Monte Carlo
HWB, $δ = r = 0$	2.703	2.703	2.704	2.704	2.704	2.704	2.665 ± 0.008
HWB, $δ = r = 0.03$	1.698	1.699	1.699	1.699	1.699	1.699	1.680 ± 0.008
GMDB, $δ = r = 0$	1.468	1.468	1.468	1.468	1.468	1.468	1.467 ± 0.008
GMDB, $δ = r = 0.03$	1.080	1.080	1.079	1.079	1.079	1.079	1.079 ± 0.010

Table 2. Price estimates for

δ = r

and varying seeds.

Table 2. Price estimates for

δ = r

and varying seeds.

Seed	1	2	3	4	5	Monte Carlo
HWB, $δ = r = 0$	2.704	2.704	2.703	2.702	2.704	2.665 ± 0.008
HWB, $δ = r = 0.03$	1.699	1.699	1.699	1.698	1.699	1.680 ± 0.008
GMDB, $δ = r = 0$	1.468	1.468	1.468	1.467	1.468	1.467 ± 0.008
GMDB, $δ = r = 0.03$	1.079	1.079	1.079	1.078	1.080	1.079 ± 0.010

Table 3. HWB price estimates for

r = 0.03

and varying

δ

.

Table 3. HWB price estimates for

r = 0.03

and varying

δ

.

$δ$	0.00	0.01	0.02	0.03
Price	6.24	3.99	2.58	1.70

Table 4. HWB price estimates for varying

r = δ

.

Table 4. HWB price estimates for varying

r = δ

.

$r = δ$	0.00	0.01	0.02	0.03	0.05
Price	2.70	2.23	1.92	1.70	1.44

Table 5. Ratios between prices with and without dependence.

$δ$	0.00	0.01	0.02	0.03	0.05
Ratio	0.97	0.88	0.78	0.68	0.51

Table 6. Erlangized price estimates.

q	HWB Simple	HWB Extrapolated	GMDB Simple	GMDB Extrapolated
1	1.523	1.523	1.092	1.092
2	1.583	1.642	1.097	1.102
3	1.606	1.652	1.097	1.099
4	1.618	1.655	1.097	1.096
5	1.626	1.657	1.097	1.095
6	1.631	1.658	1.096	1.095
7	1.635	1.658	1.096	1.094
8	1.638	1.659	1.096	1.094
9	1.640	1.659	1.096	1.094
10	1.642	1.659	1.095	1.094

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Asmussen, S.; Laub, P.J.; Yang, H. Phase-Type Models in Life Insurance: Fitting and Valuation of Equity-Linked Benefits. Risks 2019, 7, 17. https://doi.org/10.3390/risks7010017

AMA Style

Asmussen S, Laub PJ, Yang H. Phase-Type Models in Life Insurance: Fitting and Valuation of Equity-Linked Benefits. Risks. 2019; 7(1):17. https://doi.org/10.3390/risks7010017

Chicago/Turabian Style

Asmussen, Søren, Patrick J. Laub, and Hailiang Yang. 2019. "Phase-Type Models in Life Insurance: Fitting and Valuation of Equity-Linked Benefits" Risks 7, no. 1: 17. https://doi.org/10.3390/risks7010017

APA Style

Asmussen, S., Laub, P. J., & Yang, H. (2019). Phase-Type Models in Life Insurance: Fitting and Valuation of Equity-Linked Benefits. Risks, 7(1), 17. https://doi.org/10.3390/risks7010017

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Phase-Type Models in Life Insurance: Fitting and Valuation of Equity-Linked Benefits

Abstract

1. Introduction

2. Preliminaries

3. PH Fits of Human Mortality Data

4. Valuation of Benefits

5. The Factorization: Implementation for Jump Diffusions

6. Numerical Examples

7. Erlangization and Extrapolation

8. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

Appendix A. Fits Using Sum-of-Exponentials

Appendix B. Algorithms for Computing U

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI