Article

Predicting Human Mortality: Quantitative Evaluation of Four Stochastic Models

by
Anastasia Novokreshchenova
Dipartimento di Statistica e Matematica Applicata, corso Unione Sovietica 218 bis, Torino 10134, Italy
Risks 2016, 4(4), 45; https://doi.org/10.3390/risks4040045
Submission received: 29 August 2016 / Revised: 13 November 2016 / Accepted: 25 November 2016 / Published: 2 December 2016

Abstract

In this paper, we quantitatively compare the forecasts from four different mortality models. We consider one discrete-time model proposed by Lee and Carter (1992) and three continuous-time models: the Wills and Sherris (2011) model, the Feller process and the Ornstein-Uhlenbeck (OU) process. The first two models estimate the whole surface of mortality simultaneously, while in the latter two, each generation is modelled and calibrated separately. We calibrate the models to UK and Australian population data. We find that all the models show relatively similar absolute total error for a given dataset, except the Lee-Carter model, whose performance differs significantly. To evaluate the forecasting performance we therefore look at two alternative measures: the relative error between the forecasted and the actual mortality rates, and the percentage of actual mortality rates which fall within a prediction interval. In terms of the prediction intervals, the results are more divergent since each model implies a different structure for the variance of mortality rates. According to our experiments, the Wills and Sherris model produces superior results in terms of the prediction intervals. However, in terms of the mean absolute error, the OU and the Feller processes perform better. The forecasting performance of the Lee-Carter model depends mostly on the choice of the dataset.

1. Introduction

One of the main issues facing financial and governmental institutions in the current economic climate is forecasting mortality among the elderly population. The affected parties include public pension schemes, private pension funds and life insurance businesses, which face the greatest risk from increasing life expectancy across developed countries.
Over the last few decades it has become widely accepted that mortality can be more accurately measured by the use of stochastic models (see [1]), since they are better able to capture the uncertainty inherent in the problem. For any given individual, the probability of death naturally increases with age; however, as life expectancy increases over time, we observe improvements in mortality rates. Due to these effects, “dynamic mortality” has been introduced to produce models with age and time dependence. One of the seminal works, which became a benchmark within the industry, is the model of Lee and Carter [2], who model the central death rate as a function of two variables. Since the publication of their work, several extensions of the Lee-Carter model have been proposed. For example, Renshaw and Haberman [3] considered a model that allows for a cohort effect, and Blake and Dowd [4] proposed a two-factor model for mortality rates. Traditionally, mortality models are used for forecasting mortality for older generations (ages over 50), since improvements in mortality and longer life expectancy at these ages (phenomena referred to in the literature as longevity risk) contribute most to the uncertainty in the value of financial instruments offered by pension funds. However, Plat [5] has recently suggested a model that can fit mortality to a wider range of ages (20–89). In [6] this model has been extended to fit even younger ages (5–89).
A fairly recent stream of actuarial literature has dealt with the phenomenon of stochastic mortality by modelling the instantaneous mortality intensity as a stochastic process. Recent works include Milevsky and Promislow [7], Dahl [8], Biffis [9], Denuit and Devolder [10], Luciano and Vigna [11] and Schrager [12]. The mathematical framework in these models has been adapted from the credit risk literature, where it is used to value securities subject to the risk of default. This approach exploits the similarities between the time to default and the remaining lifetime, and between the short-term interest rate and the force of mortality. Moreover, if the intensity process is affine, then the survival function for an individual can be derived in closed form. This is extremely useful when pricing mortality-linked financial products, such as endowments, annuities, variable annuities and other forms of mortality-linked financial securities.
Luciano and Vigna [11] have studied the applicability of the affine processes, such as Ornstein-Uhlenbeck and Feller, for modelling mortality intensities. The approach is focused on fitting the survival curve for which closed-form solutions are available. The future projections for survival probabilities are made, their closeness to the historical values is discussed, but not evaluated quantitatively.
Another continuous-time mortality model we consider in this work is the one proposed by Wills and Sherris (2011) [13] for the Australian population. As with the Lee and Carter (1992) [2] model, it is able to capture the whole “mortality surface” across age and period. Moreover, it takes account of the correlation structure between different generations. This is important for the portfolios of life offices, which often contain contracts written on individuals from different cohorts. The authors have shown that the multiple risk factors implied by the model reflect the actual correlation structure between generations inferred from the data and that the model is suitable for pricing financial instruments (see Wills and Sherris [13,14]).
The advantages of continuous time mortality models mean that it is important to study how well continuous time processes can predict future mortality. There are numerous papers comparing the performance of mortality models—[5,6,15,16,17,18] are among them. Nevertheless, most of them have focused on discrete stochastic mortality models. For example, Cairns et al. [15] examined the in-sample fits of eight different discrete time stochastic mortality models. However, as noted in Dowd et al. [16], it is quite possible for a model to provide a good in-sample fit to historical data and produce forecasts that appear plausible ex ante, but still produce poor ex-post forecasts, that is, forecasts that differ significantly from the subsequently realised outcomes. Consequently, a “good” model should produce forecasts that perform well out-of-sample which can be evaluated using backtesting methods.
Lee and Miller [19] evaluated the performance of the Lee-Carter model by examining the behaviour of forecast errors and plots of “percentile error distributions”, although they did not report any formal test results. In contrast, Dowd et al. [16] formally evaluate the forecasting performance of six different stochastic mortality models applied to male mortality data for England and Wales. They use a backtesting procedure to test the stability of forecasts over different time horizons and conclude that the investigated models perform adequately, and that there is little difference between them.
The framework for backtesting stochastic mortality models in Dowd et al. [16] is a very general one. The “backtests” might involve the use of plots whose goodness of fit is interpreted informally, as well as formal statistical tests of predictions. The evaluation can be done for different metrics (the forecasted variable) of interest – possible metrics include mortality rates, life expectancy, future survival rates, the prices of annuities and other life-contingent financial instruments.
This paper focuses on the forecasting performance of several continuous-time models, making a novel contribution to the literature. More specifically, we concentrate on the following continuous-time mortality models: the Ornstein-Uhlenbeck process, the Feller process and the Wills and Sherris model.
To compare the performance of these models to a benchmark, we also include the Lee and Carter model in our experiments. We evaluate the in-sample goodness of fit using statistical techniques, including the Bayesian information criterion (BIC) and an analysis of the fitted residuals. To assess the forecasting performance of each model we employ out-of-sample backtesting methods using mortality rates as the metric. This is done first by computing the relative error between the forecast and actual mortality rates and then by looking at the percentage of observed mortality rates which fall within a prediction interval. However, the same backtesting procedure using different metrics might be relevant for different purposes.
For our analysis we employ the data of the British and Australian population as they are among the countries where the market for mortality derivatives has started to emerge. According to [20], the annuity markets are relatively well developed in the UK and US. Some product innovations, such as variable annuities with guaranteed withdrawal lifetime benefits have been introduced in Australia, Japan and Europe. Multiple mortality and longevity derivatives (such as q-forwards, s-forwards, longevity and survivor bonds and swaps) have been suggested in the literature as well, see [14,21]. In [14,20] the authors study the securitisation of longevity risk for the Australian pension industry. In [22] natural hedging of longevity risk with application to the UK population is analysed.
This paper is organised as follows: in Section 2 we present some notation and description of the data that will be used in the subsequent analysis. In Section 3 we provide an overview of the Lee-Carter model, which we will use as a benchmark for our comparisons. Section 4 provides the Wills and Sherris model setup. Section 5 describes time-homogeneous affine processes. Section 6 calibrates the four models to the UK female dataset and Section 7 compares the results of this calibration for the four models. Section 8 discusses the robustness of the simulation results on the male and female datasets for the British and Australian populations. Section 9 concludes.

2. Notation and Data Description

Throughout the paper we use the following notation. Define m ( x , t ) to be the observed central death rate in year t for lives initially aged x as a number of deaths divided by the population exposure:
$$ m(x,t) = \frac{D(x,t)}{E(x,t)}. \tag{1} $$
Here $E(x,t)$ is the average size of the population aged x last birthday during year t and $D(x,t)$ is the number of deaths during year t recorded as age x last birthday at the date of death. The observed central death rate can be calculated directly from the data.
Another measure of mortality is the force of mortality $\mu(x,t)$. It is interpreted as the instantaneous death rate at exact time t for individuals aged exactly x at time t. The probability of death between t and t + dt for small dt is then approximately $\mu(x,t)\,dt$. Thus, assuming that the force of mortality remains constant over a year, that is $\mu(x+s, t+u) = \mu(x,t)$ for $0 \le s, u < 1$, we can approximate the force of mortality $\mu(x,t)$ by the mortality rate $m(x,t)$.
A typical dataset consists of a number of deaths, D ( x , t ) , and the corresponding exposures, E ( x , t ) , over a range of years t and ages x. The data for the UK we use in this study contains the number of deaths and the population exposure. It was taken from the Human Mortality Database (Human Mortality Database. University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany)). We consider female population aged 50–99 (which is relevant to the pension fund industry) during the years 1970–2009.
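As a minimal sketch of how the central death rates are obtained from such a dataset, the snippet below divides deaths by exposures cell by cell and takes logs. The counts are hypothetical toy values, not Human Mortality Database figures, and the function name `central_death_rates` is an illustrative assumption.

```python
import math

def central_death_rates(deaths, exposures):
    """Return m[x][t] = D[x][t] / E[x][t] (ages in rows, years in columns)."""
    rates = []
    for d_row, e_row in zip(deaths, exposures):
        if len(d_row) != len(e_row):
            raise ValueError("deaths and exposures must have the same shape")
        rates.append([d / e for d, e in zip(d_row, e_row)])
    return rates

# Hypothetical toy counts: 2 ages (rows) x 3 years (columns).
deaths = [[50.0, 48.0, 45.0], [120.0, 115.0, 110.0]]
exposures = [[10000.0, 10000.0, 10000.0], [8000.0, 8000.0, 8000.0]]

m = central_death_rates(deaths, exposures)
# Log rates are the input of the Lee-Carter model.
log_m = [[math.log(v) for v in row] for row in m]
```

Under the piecewise-constant assumption above, the same matrix `m` also serves as the approximation to the force of mortality.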

3. Lee-Carter Model

In this section we describe the general characteristics of the well-known Lee-Carter model [2], its estimation procedure and its forecasting technique. The Lee-Carter mortality model is widely used in academia as well as in industry. It was proposed by Lee and Carter in 1992 specifically for US mortality data covering the years 1933–1987. However, it has since been applied as a benchmark model to mortality data from many countries and time periods. It has been shown (see [23]) that the Lee-Carter model is a special type of multivariate random walk with drift (RWD), in which the covariance matrix depends on the drift vector. To estimate the model parameters, principal component analysis (PCA) with a single component is applied to the census data.
Let $m_{xt}$ denote the log of the mortality rate in age group x ($x = 1, \dots, A$) at time t ($t = 1, \dots, T$) for one country. The mortality rate is modelled as follows:
$$ m_{xt} = \alpha_x + \beta_x \kappa_t + \epsilon_{xt}, \tag{2} $$
where $\epsilon_{xt}$ is a set of random disturbances and $\alpha_x$, $\beta_x$ and $\kappa_t$ are parameters to be estimated:
$\alpha_x$ is the average mortality curve across ages;
$\beta_x$ is a set of parameters representing the sensitivity of the mortality rate at age x to changes in $\kappa_t$;
$\kappa_t$ is a time-varying parameter representing a common risk factor;
$\epsilon_{xt}$ is a zero-mean Gaussian error, $\epsilon_{xt} \sim N(0, \sigma^2)$.
The parametrisation in (2) is not unique, since the likelihood function associated with the model above has an infinite number of equivalent maxima, each of which would produce identical forecasts, see Lee and Carter [2]. In practice, model identification requires imposing constraints. Lee and Carter adopt the constraints $\sum_t \kappa_t = 0$ and $\sum_x \beta_x = 1$.
The constraint $\sum_t \kappa_t = 0$ implies that the parameter $\alpha_x$ is simply the empirical average over time of the age profile in age group x: $\alpha_x = \bar{m}_x$. We can therefore rewrite the model in terms of the mean-centered log-mortality rate, $\tilde{m}_{xt} = m_{xt} - \bar{m}_x$. Thus, Equation (2) becomes a multiplicative fixed effects model for the centered age profile:
$$ \tilde{m}_{xt} \sim N(\bar{\mu}_{xt}, \sigma^2), \qquad E(\tilde{m}_{xt}) = \bar{\mu}_{xt} = \beta_x \kappa_t. \tag{3} $$
As a result, we use A + T parameters (with A and T being the total number of ages and the total number of years considered) to approximate the $A \times T$ elements of the mortality matrix, where each row represents an age and each column represents a year of observation. The age-specific parameter $\beta_x$ is fixed over time for all x and the time-specific parameter $\kappa_t$ is fixed over age groups for all t.
The parameters $\beta_x$ and $\kappa_t$ can be found easily using the singular value decomposition (SVD) of the matrix of centered age profiles, which we denote by Z: $\tilde{m} = Z = B L U'$. The estimate for $\beta_x$ is the first column of B, $b_1$ (a normalised eigenvector of the matrix $Z Z'$), and the estimate for $\kappa_t$ is $\lambda_1 u_1$, where $u_1$ is the first column of the matrix U (a normalised eigenvector of the matrix $Z' Z$) and $\lambda_1$ is the first element of the diagonal matrix L (the largest singular value of Z, whose square is the largest eigenvalue associated with these eigenvectors). Typically, for low-mortality populations, the approximation $Z \approx \lambda_1 b_1 u_1'$ accounts for more than 90% of the variance of $m_{xt}$, see [23].
To forecast future mortality, Lee and Carter assume that $\alpha_x$ and $\beta_x$ remain constant over time, while the time factor $\kappa_t$ is viewed as a stochastic process. They find that a random walk with drift is the most appropriate model for their data:
$$ \hat{\kappa}_t = \hat{\kappa}_{t-1} + \theta + \xi_t, \qquad \xi_t \sim N(0, \sigma_{rw}^2), \tag{4} $$
where θ is known as the drift parameter. Its maximum likelihood estimate is simply $\hat{\theta} = (\hat{\kappa}_T - \hat{\kappa}_1)/(T - 1)$, which depends only on the first and last components of the $\kappa_t$ vector.
We can forecast $\hat{\kappa}_t$ at time T + h with data available up to period T as follows:
$$ \hat{\kappa}_{T+h} = \hat{\kappa}_T + h\hat{\theta} + \sum_{l=1}^{h} \xi_{T+l-1}. \tag{5} $$
From this model, we can obtain forecast point estimates, which follow a straight line as a function of h with slope θ ^ :
$$ E[\hat{\kappa}_{T+h} \mid \hat{\kappa}_1, \dots, \hat{\kappa}_T] = \hat{\kappa}_T + h\hat{\theta}. \tag{6} $$
To make a point estimate forecast for log-mortality we plug the obtained expression for κ ^ T + h into the vectorised version of expression (3):
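$$ \mu_{T+h} = \bar{m} + \hat{\beta}_x \hat{\kappa}_{T+h} = \bar{m} + \hat{\beta}_x [\hat{\kappa}_T + h\hat{\theta}], \tag{7} $$
where $\hat{\beta}_x = b_1$ and $\hat{\kappa}_T = \lambda_1 u_1$ are the estimates of $\beta_x$ and $\kappa_T$, respectively, obtained using the SVD.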
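The estimation and forecasting steps of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: it is pure Python, and it replaces a full SVD by power iteration on $Z Z'$, which only recovers the leading pair $(b_1, \lambda_1 u_1)$. The function names and the toy surface are hypothetical, and the identification constraint $\sum_x \beta_x = 1$ is not imposed (the vector $b_1$ has unit length instead).

```python
import math

def fit_lee_carter(log_m, n_iter=200):
    """Rank-one fit of log_m[x][t] (ages in rows, years in columns)."""
    n_ages, n_years = len(log_m), len(log_m[0])
    # alpha_x is the time average of log-mortality at age x.
    alpha = [sum(row) / n_years for row in log_m]
    z = [[log_m[x][t] - alpha[x] for t in range(n_years)] for x in range(n_ages)]
    # Power iteration on Z Z' converges to its leading eigenvector b_1.
    b = [1.0 / math.sqrt(n_ages)] * n_ages
    for _ in range(n_iter):
        w = [sum(z[x][t] * b[x] for x in range(n_ages)) for t in range(n_years)]
        b_new = [sum(z[x][t] * w[t] for t in range(n_years)) for x in range(n_ages)]
        norm = math.sqrt(sum(v * v for v in b_new))
        if norm == 0.0:
            break  # all centred rates are zero; nothing to extract
        b = [v / norm for v in b_new]
    # kappa_t = lambda_1 * u_1 equals Z' b_1.
    kappa = [sum(z[x][t] * b[x] for x in range(n_ages)) for t in range(n_years)]
    return alpha, b, kappa

def forecast_log_mortality(alpha, b, kappa, h):
    """Point forecast alpha_x + beta_x * (kappa_T + h * theta), as in Equation (7)."""
    theta = (kappa[-1] - kappa[0]) / (len(kappa) - 1)  # drift estimate
    kappa_h = kappa[-1] + h * theta
    return [alpha[x] + b[x] * kappa_h for x in range(len(alpha))]

# Hypothetical toy surface that is exactly rank one after centring.
alpha_true = [-5.0, -4.0, -3.0]
beta_true = [0.2, 0.3, 0.5]
kappa_true = [3.0, 1.0, -1.0, -3.0]
log_m = [[alpha_true[x] + beta_true[x] * k for k in kappa_true] for x in range(3)]

alpha_hat, beta_hat, kappa_hat = fit_lee_carter(log_m)
forecast = forecast_log_mortality(alpha_hat, beta_hat, kappa_hat, h=1)
```

Because only the product $\hat{\beta}_x \hat{\kappa}_t$ enters the forecast, the choice of normalisation for $\hat{\beta}_x$ does not change the projected log-mortality.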

4. Wills and Sherris Model

Wills and Sherris suggested a stochastic longevity model where the force of mortality for age x at time t has the following dynamics (see [13,14]):
$$ d\mu(x,t) = (a(x+t) + b)\,\mu(x,t)\,dt + \sigma\,\mu(x,t)\,dW(x,t), \qquad 0 < x < \omega, \quad 0 < t < \omega - x. \tag{8} $$
In the above expression the drift parameter is an affine function of the current age $(x + t)$, while the volatility function is a constant. Applying Itô's lemma, we find the solution to the SDE (8):
$$ \mu(x,t) = \mu(x,0) \exp\left\{ \frac{a}{2} t^2 + \left(a x + b - \frac{1}{2}\sigma^2\right) t + \sigma W(x,t) \right\}, $$
which can be written as follows:
$$ \ln\frac{\mu(x,t)}{\mu(x,0)} = \left( \frac{a}{2} t + a x + b - \frac{1}{2}\sigma^2 \right) t + \sigma W(x,t). $$
For all ages $x_1, \dots, x_N$, we consider a multivariate random vector of mortality rates:
$$ \underline{\mu}(x,t) = \begin{bmatrix} \mu(x_1,t) \\ \vdots \\ \mu(x_N,t) \end{bmatrix}. $$
The dynamics $d\underline{\mu}(x,t)$ are assumed to be driven by the multivariate Wiener process $d\underline{W}(x,t)$, with mean zero and the instantaneous correlation matrix given by:
$$ D = \begin{bmatrix} \delta_{11} & \cdots & \delta_{1N} \\ \vdots & \ddots & \vdots \\ \delta_{N1} & \cdots & \delta_{NN} \end{bmatrix}. $$
This means that the Wiener processes are independent between time periods but correlated between ages, and the multivariate Wiener process $d\underline{W}(x,t)$ can be expressed in terms of the independent Wiener processes $d\underline{Z}(x,t) = [dZ_1(t), \dots, dZ_N(t)]'$ as $d\underline{W}(x,t) = D\,d\underline{Z}(x,t)$.
Thus, the model described by Equation (8), becomes a system of equations where the dependence between the ages is captured by the δ x , i term:
$$ d\mu(x,t) = (a(x+t) + b)\,\mu(x,t)\,dt + \sigma\,\mu(x,t) \sum_{i=1}^{N} \delta_{x,i}\,dZ_i(t), \qquad x = x_1, \dots, x_N. $$
Using the fact that the distribution of the changes in the force of mortality follows a normal distribution, we can find the parameters a ^ , b ^ and σ ^ by means of maximum likelihood estimation. In particular,
$$ \Delta\mu(x,t) \sim N\big( (a(x+t) + b)\,\mu(x,t),\ \sigma\,\mu(x,t) \big). $$
To estimate the covariance matrix of $d\underline{W}(x,t)$, we apply principal component analysis (PCA) to the standardised residuals of the model. For each year, they are the realisations of the random vector $d\underline{W}(x,t)$:
$$ r(x,t) = \frac{\Delta\hat{\mu}(x,t)/\hat{\mu}(x,t) - (\hat{a}(x+t) + \hat{b})}{\hat{\sigma}}. $$
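The estimation step above can be sketched under one simplifying assumption: with yearly steps ($\Delta t = 1$), the relative change $\Delta\mu/\mu$ is normal with mean $a(x+t) + b$ and standard deviation σ, so the maximum likelihood estimates of a and b coincide with ordinary least squares on the current age, and $\hat{\sigma}^2$ is the mean squared residual. The function names and the toy cohorts below are hypothetical, and this is not necessarily how the paper's estimates were computed.

```python
import math

def fit_wills_sherris(initial_ages, mu):
    """mu[i][t]: rate at time t of the cohort aged initial_ages[i] at t = 0."""
    s, y = [], []
    for x, path in zip(initial_ages, mu):
        for t in range(len(path) - 1):
            s.append(float(x + t))                       # current age x + t
            y.append((path[t + 1] - path[t]) / path[t])  # relative change
    n = len(y)
    s_bar, y_bar = sum(s) / n, sum(y) / n
    s_var = sum((v - s_bar) ** 2 for v in s)
    a = sum((sv - s_bar) * (yv - y_bar) for sv, yv in zip(s, y)) / s_var
    b = y_bar - a * s_bar
    raw = [yv - (a * sv + b) for sv, yv in zip(s, y)]
    sigma = math.sqrt(sum(r * r for r in raw) / n)       # ML estimate (divides by n)
    # Standardised residuals r(x, t); left at zero if the fit is exact.
    resid = [r / sigma for r in raw] if sigma > 0.0 else [0.0] * n
    return a, b, sigma, resid

def toy_cohort(x, mu0, a, b, shocks):
    """Hypothetical cohort path with prescribed relative shocks."""
    path = [mu0]
    for t, e in enumerate(shocks):
        path.append(path[-1] * (1.0 + a * (x + t) + b + e))
    return path

shocks = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01]
ages = [50, 55, 60]
mu = [toy_cohort(x, 0.005, 0.001, -0.03, shocks) for x in ages]
a_hat, b_hat, sigma_hat, resid = fit_wills_sherris(ages, mu)
```

The vector `resid` is what would then be arranged by age and year and passed to PCA to estimate the age-correlation structure.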

5. Time-Homogeneous Affine Processes

Mortality intensity has recently been modelled as a stochastic process (see Cairns [1]). In this field, an important stream of literature focuses on describing death arrival as the first jump time of a Poisson process with stochastic intensity. This approach is called doubly stochastic. Milevsky and Promislow [7] have used a stochastic force of mortality whose expectation at any future date has a Gompertz specification. Dahl [8], Biffis [9], Denuit and Devolder [10] and Schrager [12] have modelled the stochastic force of mortality with the same mathematical tools that are used in the credit risk literature to model the time to default. Under this setting, the remaining lifetime of an individual, τ, is a doubly stochastic stopping time with intensity μ.
Let the mortality process $\mu_x(t)$ represent the mortality intensity of an individual belonging to generation x at (calendar) time t, and let τ be the time of death of an individual of generation x. Then the survival probability from time t to time $T \ge t$, $S_x(t,T)$, is defined under the probability measure P, conditional on survival up to time t, as:
$$ S_x(t,T) = P(\tau \ge T \mid \tau > t). $$
A doubly stochastic stopping time is the analogue of the first jump time of a Poisson process, where the intensity is a stochastic process. If τ is the first jump time of a Poisson process with parameter μ, then
$$ P(\tau > t) = e^{-\mu t}. $$
Similarly, if τ is doubly stochastic with intensity $\mu_x$, then the survival function $S_x(t,T)$ of an individual alive at time t is given by
$$ S_x(t,T) = P(\tau > T \mid \mathcal{F}_t) = E\left[ e^{-\int_t^T \mu_x(u)\,du} \,\middle|\, \mathcal{F}_t \right], \tag{11} $$
where $\mathcal{F}_t$ describes the information at time t.
In general, the expectation in (11) is not easy to calculate. However, if the intensity process is affine (see Duffie, Filipovic and Schachermayer [24]), then it is possible to provide the closed form for the survival probability:
$$ S_x(t,T) = e^{\alpha(T-t) + \beta(T-t)\,\mu_x(t)}, \tag{12} $$
where the functions $\alpha(\cdot)$ and $\beta(\cdot)$ satisfy generalised Riccati ODEs, which can be solved analytically or at least numerically. The closed-form expression (12) for survival probabilities in the affine framework makes it possible to price financial instruments written on the underlying population, such as endowments, annuities, variable annuities and other forms of mortality-linked financial securities. Due to this result, the processes selected for the mortality intensity in applications are typically affine.
Luciano and Vigna [11] proposed and tested time-homogeneous non-mean-reverting affine processes for the intensity of mortality, which are natural generalisations of the Gompertz law of mortality. They consider the Ornstein-Uhlenbeck process, the Ornstein-Uhlenbeck process with jumps and the Feller process. They provide analytical solutions for the survival function (12) for these processes and discuss the appropriateness of using them in modelling mortality. Calibrations on historical data show that, despite their simple form, these processes fit mortality intensity dynamics very well. Another study shows how to use these processes to delta-gamma hedge mortality and interest rate risk, see Luciano, Regis and Vigna [22].

5.1. The Ornstein-Uhlenbeck Processes

The SDE for the Ornstein-Uhlenbeck (OU) process without mean-reversion, with the associated solutions of the Riccati ODE α ( · ) and β ( · ) , is the following:
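$$
\begin{aligned}
d\mu_x(t) &= a\,\mu_x(t)\,dt + \sigma\,dW_x(t), \\
\alpha(t) &= \frac{\sigma^2}{2a^2}\,t - \frac{\sigma^2}{a^3}\,e^{at} + \frac{\sigma^2}{4a^3}\,e^{2at} + \frac{3\sigma^2}{4a^3}, \\
\beta(t) &= \frac{1}{a}\left(1 - e^{at}\right),
\end{aligned} \tag{13}
$$
where $a > 0$, $\sigma > 0$.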
We calibrate the parameters of the OU process by applying the maximum likelihood method to the mortality intensities.
Assume that the dynamics of the mortality intensity are described by the OU process without mean reversion as given by SDE (13). Then the conditional probability density of an observation $\mu_{i+1}$, given a previous observation $\mu_i$ (with a time step δ between them), has the form (here we omit x, which denotes a certain generation):
$$ f(\mu_{i+1} \mid \mu_i; a, \hat{\sigma}) = \frac{1}{\sqrt{2\pi\hat{\sigma}^2}} \exp\left( -\frac{(\mu_{i+1} - \mu_i e^{a\delta})^2}{2\hat{\sigma}^2} \right), $$
where $\hat{\sigma}^2 = \sigma^2\,\dfrac{e^{2a\delta} - 1}{2a}$ is the variance of the one-step innovation implied by SDE (13).
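The log-likelihood function of a set of observations $\bar{\mu} = (\mu_1, \mu_2, \dots, \mu_n)$ can be derived from the conditional density function:
$$ \mathcal{L}(\bar{\mu}; a, \hat{\sigma}) = \sum_{i=1}^{n} \ln f(\mu_{i+1} \mid \mu_i; a, \hat{\sigma}) = -\frac{n}{2}\ln(2\pi) - n\ln(\hat{\sigma}) - \frac{1}{2\hat{\sigma}^2} \sum_{i=1}^{n} (\mu_{i+1} - \mu_i e^{a\delta})^2. $$
Setting the derivative with respect to a to zero gives $e^{a\delta} = \sum_i \mu_{i+1}\mu_i / \sum_i \mu_i^2$, so the maximum likelihood conditions yield the following equations for the parameters:
$$ a = \frac{1}{\delta} \ln\left( \frac{\sum_{i=1}^{n} \mu_{i+1}\,\mu_i}{\sum_{i=1}^{n} \mu_i^2} \right), \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (\mu_{i+1} - \mu_i e^{a\delta})^2. $$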
The OU process can, in general, produce negative paths. The probability of $\mu_x$ turning negative is
$$ P(\mu_x(t) \le 0) = \Phi\left( -\frac{\mu_x(0)\,e^{at}}{\sigma\sqrt{\dfrac{e^{2at} - 1}{2a}}} \right) = \Phi\big(\zeta(\sigma, a)\big), \tag{16} $$
where Φ is the distribution function of a standard normal.
The function $\zeta(\sigma, a) = -\mu_x(0)\,e^{at} \Big/ \left( \sigma\sqrt{(e^{2at} - 1)/(2a)} \right)$ is increasing in σ and decreasing in a, and so is the probability of negative values of μ. In mortality modelling applications the probability that $\mu(t)$ takes negative values is very small, because in practice the estimated value of σ is small and the estimated value of a is comparatively large. In our calibration we check that the obtained parameters are such that there are no negative intensities. Otherwise, to keep the mortality intensity positive, it is possible to impose a restriction based on Equation (16) during the parameter search, such that the probability (16) is negligible (see Luciano and Vigna [11] for more details).
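A minimal sketch of this calibration, assuming the one-step mean $\mu_i e^{a\delta}$ and one-step variance $\sigma^2 (e^{2a\delta} - 1)/(2a)$ implied by SDE (13), is given below. The function names and the sample parameter values are hypothetical, and the code assumes the estimated a is non-zero.

```python
import math

def fit_ou(mu, delta=1.0):
    """Closed-form ML estimates (a, sigma) for d mu = a mu dt + sigma dW."""
    n = len(mu) - 1  # number of observed transitions
    num = sum(mu[i + 1] * mu[i] for i in range(n))
    den = sum(mu[i] ** 2 for i in range(n))
    a = math.log(num / den) / delta
    growth = math.exp(a * delta)
    # Variance of the one-step innovation.
    var_step = sum((mu[i + 1] - mu[i] * growth) ** 2 for i in range(n)) / n
    # Invert var_step = sigma^2 (e^{2 a delta} - 1) / (2 a) for sigma.
    sigma = math.sqrt(var_step * 2.0 * a / (math.exp(2.0 * a * delta) - 1.0))
    return a, sigma

def prob_negative(mu0, a, sigma, t):
    """P(mu(t) <= 0) = Phi(zeta(sigma, a)), as in Equation (16)."""
    if sigma == 0.0:
        return 0.0
    sd = sigma * math.sqrt((math.exp(2.0 * a * t) - 1.0) / (2.0 * a))
    zeta = -mu0 * math.exp(a * t) / sd
    return 0.5 * (1.0 + math.erf(zeta / math.sqrt(2.0)))  # standard normal CDF

# Hypothetical noise-free intensity path growing at rate 0.09 per year.
series = [0.01 * math.exp(0.09 * i) for i in range(30)]
a_hat, sigma_hat = fit_ou(series)

# Probability of a negative intensity after 10 years for two volatility levels.
p_small = prob_negative(0.01, 0.09, 0.0005, 10.0)
p_large = prob_negative(0.01, 0.09, 0.01, 10.0)
```

Comparing `p_small` with `p_large` illustrates the monotonicity in σ described above: a larger volatility gives a larger probability of negative intensities.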

5.2. The Feller Process

The fourth model is the Feller process which is described by the following SDE with the associated solutions of the Riccati ODEs α ( · ) and β ( · ) :
$$
\begin{aligned}
d\mu(t) &= a\,\mu(t)\,dt + \sigma\sqrt{\mu(t)}\,dW(t), \\
\alpha(t) &= 0, \\
\beta(t) &= \frac{1 - e^{bt}}{c + d\,e^{bt}},
\end{aligned} \tag{17}
$$
where $a > 0$, $\sigma \ge 0$, the boundary conditions are $\alpha(0) = 0$ and $\beta(0) = 0$, and the coefficients are:
$$ b = -\sqrt{a^2 + 2\sigma^2}, \qquad c = \frac{b + a}{2}, \qquad d = \frac{b - a}{2}. $$
The solution to Equation (17) has the form:
$$ \mu(t) = \mu(0)\,e^{at} + \sigma \int_0^t e^{a(t-u)} \sqrt{\mu(u)}\,dW(u). $$
The Feller process is a special case of the Cox, Ingersoll and Ross (1985) process [25] without mean reversion. The latter was proposed as a model of the short rate in financial markets and is referred to as the CIR model. It is described by the following SDE:
$$ dr(t) = a\,(b - r(t))\,dt + \sigma\sqrt{r(t)}\,dW(t), $$
where $b > 0$ is the mean-reversion level. Thus, the model suggests that $r(t)$ is pulled towards b at a speed controlled by a. If the condition $2ab > \sigma^2$ holds and $r(0) > 0$, then the CIR process remains strictly positive almost surely, and the process has a stationary (marginal) distribution, which is a gamma distribution. The maximum likelihood estimation of the parameter vector $\theta = (a, b, \sigma)$ is based on the transition density. Given $r_t$ at time t, the density of $r_{t+\Delta t}$ at time $t + \Delta t$ is
$$ p(r_{t+\Delta t} \mid r_t; \theta) = c\,e^{-u-v} \left( \frac{v}{u} \right)^{q/2} I_q\big(2\sqrt{uv}\big), $$
where
$$ c = \frac{2a}{\sigma^2 (1 - e^{-a\Delta t})}, \qquad u = c\,r_t\,e^{-a\Delta t}, \qquad v = c\,r_{t+\Delta t}, \qquad q = \frac{2ab}{\sigma^2} - 1, $$
and $I_q(2\sqrt{uv})$ is the modified Bessel function of the first kind of order q. Then the likelihood function of the time series $(r_1, \dots, r_N)$ with the time between two observations $\Delta t = 1$ is
$$ L(\theta) = \prod_{i=1}^{N-1} p(r_{i+1} \mid r_i; \theta), $$
from which the log-likelihood function of the CIR process is derived:
$$ \ln L(\theta) = (N-1)\ln c + \sum_{i=1}^{N-1} \left[ -u_i - v_{i+1} + 0.5\,q \ln\frac{v_{i+1}}{u_i} + \ln I_q\big(2\sqrt{u_i v_{i+1}}\big) \right], $$
where $u_i = c\,r_i\,e^{-a\Delta t}$ and $v_{i+1} = c\,r_{i+1}$.
There are two approaches to calibrating affine mortality processes to the historical data. One is to match the survival function (Equation (12)) using the solutions of the Riccati ODEs for the OU (Equation (13)) and the Feller (Equation (17)) processes to the set of observed survival probabilities. Another approach is to maximise the likelihood function of the transition density. In this work we employ the second approach as described in [26] since both for the OU process and the Feller process the transition density is known in closed-form.
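For illustration, the closed-form survival probabilities that the first calibration approach would match can be evaluated as below. This is a sketch under stated assumptions: it uses the Riccati solutions of Equations (13) and (17) inside Equation (12), takes the negative root for the Feller coefficient b (either sign of the square root yields the same β), and uses hypothetical function names and parameter values.

```python
import math

def ou_survival(mu_t, a, sigma, tau):
    """S_x(t, t + tau) for the OU intensity without mean reversion."""
    alpha = (sigma ** 2 / (2.0 * a ** 2)) * tau \
        - (sigma ** 2 / a ** 3) * math.exp(a * tau) \
        + (sigma ** 2 / (4.0 * a ** 3)) * math.exp(2.0 * a * tau) \
        + 3.0 * sigma ** 2 / (4.0 * a ** 3)
    beta = (1.0 - math.exp(a * tau)) / a
    return math.exp(alpha + beta * mu_t)

def feller_survival(mu_t, a, sigma, tau):
    """S_x(t, t + tau) for the Feller intensity (alpha is identically zero)."""
    b = -math.sqrt(a * a + 2.0 * sigma * sigma)
    c = (b + a) / 2.0
    d = (b - a) / 2.0
    beta = (1.0 - math.exp(b * tau)) / (c + d * math.exp(b * tau))
    return math.exp(beta * mu_t)

# Hypothetical parameters: intensity 0.01 today, five-year horizon.
s_ou = ou_survival(0.01, 0.1, 0.0005, 5.0)
s_feller = feller_survival(0.01, 0.1, 0.02, 5.0)
```

With σ set to zero both functions reduce to the deterministic survival probability $\exp\{-\mu_x(t)(e^{a\tau} - 1)/a\}$, which gives a quick consistency check of the two formulas.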

6. Models Calibration

In this section we work with the UK female dataset, which describes the mortality of the population aged 50–99 for the years 1970–2009. We first divide the data into two sets: the estimation dataset, containing 30 years of observations from 1970 to 1999, and the backtesting dataset, containing the last 10 years of observations from 2000 to 2009. We estimate the model parameters on the estimation dataset, then make 10-year predictions of mortality rates and evaluate how well the forecast compares with the actual mortality rates for the period 2000–2009.
For the Lee-Carter and the Wills and Sherris models we use the whole surface of mortality to calibrate the models. Then, to compare the performance of the models with each other, we choose 19 generations. To obtain reliable estimation results and to make the comparison fairer between the models which simulate the whole mortality surface (the Lee-Carter and the Wills and Sherris models) and the ones which model each generation separately (the OU and the Feller processes), we take all generations in the data for which the backtesting period is at least 10 years long. This results in 19 generations, aged 42–60 in the year 1970. We obtain the mortality rates for the corresponding generations from the surface by taking the relevant diagonal of the matrix. For the OU and the Feller processes, however, we calibrate the parameters for each of the 19 generations separately. We calculate the parameters on the estimation period (1970–1999) and then use them to forecast mortality for the next 10 years. Thus, we build forecasts for these generations and compute the relative error of prediction, as well as the percentage of the actual mortality rates which fall within the prediction interval in the test period 2000–2009.

6.1. Calibration of the Lee-Carter Model

First of all, we compute the average of the log mortality $m_{xt}$ for every age over the time period 1970–1999 for the estimation dataset and subtract it in order to obtain mean-centered log-mortality rates, $\tilde{m}_{xt} = m_{xt} - \bar{m}_x$. The average of the log mortality for the whole dataset is shown in Figure 1. Then, we perform the SVD on the matrix $\tilde{m}_{xt}$ and obtain estimates for the parameters, namely the two vectors $\hat{\beta}_x$ and $\hat{\kappa}_t$. The actual centered mortality and its SVD approximation are illustrated in Figure 2.
The obtained ML estimates for the drift and the variance of the innovations are $\hat{\theta}_{ML} = 0.5992$ and $\hat{\sigma}^2_{rw} = 0.9154$, respectively. Using these parameters we can compute the forecast for $\kappa_t$ as given by Equation (4) and its forecast point estimate as described in Equation (6). Figure 3a shows the estimated vector $\kappa_t$ and its forecast for the next 10 years (in red). Then, we calculate the forecast for log-mortality as given in Equation (7). Figure 3b shows the mortality for the 10 years forecast by the Lee-Carter model. The forecast corresponds well to the observed mortality rates for the UK female population presented in Figure 1b. However, we can see that the cohort effect (the diagonal trends in the data present in Figure 1b) is not captured by the Lee-Carter forecast of mortality.

6.2. Calibration of the Wills and Sherris Model

The analysis of the fit is based on the assumption that the residuals are independent and identically distributed normal variables with mean 0 and standard deviation 1. Figure 4a shows the graph of residuals for the UK female population aged 50–99, years 1970–1999 (the estimation dataset). The plot, together with the descriptive statistics of the residuals in Table 1, supports the hypothesis that the residuals follow a standard normal distribution, with mean close to zero and standard deviation very close to one. The table also contains the value of the log-likelihood function, the BIC and the value of the chi-square statistic.
The Bayesian information criterion (BIC) is defined as
$$ BIC = -2\ln(\hat{L}) + k\ln(n), $$
where:
  • n – the number of observations (sample size);
  • k – the number of free parameters to be estimated;
  • $\hat{L}$ – the maximized value of the likelihood function of the model.
Pearson’s chi-square statistic, defined as $\chi^2 = \sum_{i} (O_i - E_i)^2 / E_i$, where the sum runs over the observations, allows us to evaluate the goodness of fit by testing whether or not an observed frequency distribution differs from the theoretical one. We compare the computed value of $\chi^2$ with the critical value of the statistic, with the degrees of freedom defined as
$$ df = \text{number of observations} - \text{number of independent parameters} - 1. $$
The obtained value of $\chi^2$ is 1.5835. This is compared to the chi-square distribution with 217 degrees of freedom ($49 \times 29 - (49^2/2 + 3) - 1 \approx 217$). Higher values of the $\chi^2$ statistic suggest a poorer fit. Since the calculated value is very low, the test confirms a very good fit to the data.
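The two goodness-of-fit measures above are simple to compute; the following sketch shows them side by side. The function names and the numerical inputs in the examples are hypothetical and are not taken from Table 1; only the degrees-of-freedom arithmetic uses the figures quoted in the text.

```python
import math

def bic(log_likelihood, k, n):
    """BIC = -2 ln(L_hat) + k ln(n)."""
    return -2.0 * log_likelihood + k * math.log(n)

def pearson_chi2(observed, expected):
    """Sum over cells of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical inputs, for illustration only.
bic_value = bic(log_likelihood=-100.0, k=3, n=100)
chi2_value = pearson_chi2([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])

# Degrees of freedom as quoted in the text: observations - parameters - 1.
n_obs = 49 * 29
n_par = 49 ** 2 / 2 + 3
df = n_obs - n_par - 1
```

A lower BIC indicates a better trade-off between fit and number of parameters, which is why it can be compared across the models calibrated to the same dataset.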
To capture the correlation structure between ages we calculate the eigenvectors of the matrix of the obtained residuals using principal component analysis. Table 2 summarises the percentage of the observed variation explained by these vectors. The observed age-correlation matrix has a total of 49 eigenvalues. In our experiments we take the first 30 eigenvectors to approximate the correlation matrix, as they account for 98.9% of the variation.
Figure 4b shows the mortality surface for the test period built with the Wills and Sherris model using the parameters obtained on the estimation dataset. By comparing the forecast with the actual mortality rates (Figure 1b), we can see that the model gives projections which are similar to the real data, although we see more variation in the simulated mortality intensities. To obtain a reliable prediction of mortality rates for a particular generation, we perform Monte Carlo simulations of the mortality surface for the test period and extract from each simulated surface the diagonal corresponding to that generation. Then we estimate the mean of the Monte Carlo simulations for the given generation, together with the 90% prediction interval.
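The post-processing step described above can be sketched in two parts: following one generation along the diagonal of an age-by-year surface, and summarising a set of simulated paths by their mean and an equal-tailed 90% interval. All function names, the array layout `surface[age_index][year_index]` and the toy inputs are assumptions made for this illustration; the simulation of the surface itself is not reproduced here.

```python
import statistics


def cohort_diagonal(surface, start_age_index, horizon):
    """Follow one cohort through surface[age_index][year_index]."""
    return [surface[start_age_index + k][k] for k in range(horizon)]


def percentile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] + weight * (sorted_values[upper] - sorted_values[lower])


def summarise_paths(paths, level=0.90):
    """Per-year (mean, lower bound, upper bound) across simulated cohort paths."""
    tail = (1.0 - level) / 2.0
    summary = []
    for t in range(len(paths[0])):
        values = sorted(path[t] for path in paths)
        summary.append((statistics.fmean(values),
                        percentile(values, tail),
                        percentile(values, 1.0 - tail)))
    return summary


# Toy surface: entry = 10 * age_index + year_index, so the diagonal is easy to read.
toy_surface = [[10 * i + j for j in range(3)] for i in range(3)]
toy_cohort = cohort_diagonal(toy_surface, 0, 3)

# Toy "simulations": 101 one-step paths with values 0, 1, ..., 100.
toy_summary = summarise_paths([[float(v)] for v in range(101)])
```

The interval here is taken from empirical quantiles of the simulated values; other constructions are possible and the source does not specify which one was used.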

6.3. Calibration of the OU-Process

We calibrate the model on 19 generations and evaluate the goodness of fit by means of the BIC and an analysis of the residuals. For each generation x with a series of length N, we use the first $n = N - 10$ observations to estimate the parameters a and σ, and the last 10 observations to backtest the results. For instance, for the UK data, if we consider individuals who were 50 in the year 1970 and we have data until the year 2009, we have 40 years of observations. The first thirty years of observations (1970–1999) are then the estimation data and the last ten years (2000–2009) are the backtesting data.
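The split into estimation and backtesting data can be written in a few lines. The helper name `split_cohort_series` is hypothetical; the years reproduce the UK example from the paragraph above.

```python
def split_cohort_series(series, n_test=10):
    """Return (estimation part, backtesting part), holding out the last n_test points."""
    if len(series) <= n_test:
        raise ValueError("series must be longer than the backtesting window")
    return series[:-n_test], series[-n_test:]


observation_years = list(range(1970, 2010))   # 40 yearly observations
estimation_years, backtest_years = split_cohort_series(observation_years)
print(estimation_years[0], estimation_years[-1], backtest_years[0], backtest_years[-1])
```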
After obtaining the parameters, we use the following simulation equation to generate paths of the mortality intensities. This expression is an exact solution of the SDE (13) over a time step $\delta$:
$$\mu_{i+1} = \mu_i e^{a\delta} + \sigma \sqrt{\frac{e^{2a\delta} - 1}{2a}}\, N_{0,1}.$$
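A minimal sketch of this simulation step is given below. It assumes that the SDE has the form dμ = aμ dt + σ dW, which is the form implied by the residual definition and by the OU variance quoted in Section 7.2, so that the one-step mean is μ e^{aδ} and the one-step variance is σ²(e^{2aδ} − 1)/(2a). The function name and all parameter values are hypothetical and are not the calibrated values of Table 3.

```python
import math
import random


def simulate_ou_path(mu0, a, sigma, n_steps, delta=1.0, rng=None):
    """Exact-in-distribution simulation of d(mu) = a*mu dt + sigma dW on a grid of step delta."""
    if a == 0.0:
        raise ValueError("a must be non-zero for this formula")
    if rng is None:
        rng = random.Random()
    growth = math.exp(a * delta)
    # One-step standard deviation: sigma * sqrt((e^{2 a delta} - 1) / (2 a)).
    step_sd = sigma * math.sqrt((math.exp(2.0 * a * delta) - 1.0) / (2.0 * a))
    path = [mu0]
    for _ in range(n_steps):
        path.append(path[-1] * growth + step_sd * rng.gauss(0.0, 1.0))
    return path


# Hypothetical parameters, for illustration only.
ou_paths = [
    simulate_ou_path(mu0=0.005, a=0.09, sigma=0.0005, n_steps=10, rng=random.Random(seed))
    for seed in range(200)
]
deterministic_path = simulate_ou_path(mu0=0.01, a=0.1, sigma=0.0, n_steps=3)
```

Setting σ to zero recovers the deterministic exponential growth μ₀e^{ak}, which is a convenient sanity check on the drift term.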
From the mortality intensities one can easily obtain the survival probabilities by means of the analytical Formula (12) with $t = T_{j+1} - T_j = 1$. In this way, the expression for a one-year survival curve, with α and β being constants, is:
$$S_x(T_{j+1}, T_j) = e^{\alpha + \beta \mu_x^{j}}, \quad \alpha = \frac{\sigma^2}{2a^2} - \frac{\sigma^2}{a^3} e^{a} + \frac{\sigma^2}{4a^3} e^{2a} + \frac{3\sigma^2}{4a^3}, \quad \beta = \frac{1}{a}\left(1 - e^{a}\right).$$
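The one-year survival probability can be evaluated directly from these constants. The sketch below assumes the signs α = σ²/(2a²) − (σ²/a³)e^a + (σ²/(4a³))e^{2a} + 3σ²/(4a³) and β = (1 − e^a)/a; the function names and the parameter values are hypothetical.

```python
import math


def ou_alpha_beta(a, sigma):
    """Constants of the one-year survival probability exp(alpha + beta * mu)."""
    s2 = sigma ** 2
    alpha = (s2 / (2.0 * a ** 2)
             - s2 / a ** 3 * math.exp(a)
             + s2 / (4.0 * a ** 3) * math.exp(2.0 * a)
             + 3.0 * s2 / (4.0 * a ** 3))
    beta = (1.0 - math.exp(a)) / a
    return alpha, beta


def ou_one_year_survival(mu, a, sigma):
    """One-year survival probability for current intensity mu."""
    alpha, beta = ou_alpha_beta(a, sigma)
    return math.exp(alpha + beta * mu)


# Hypothetical parameters, for illustration only.
example_alpha, example_beta = ou_alpha_beta(a=0.1, sigma=0.01)
example_survival = ou_one_year_survival(mu=0.01, a=0.1, sigma=0.01)
print(example_alpha, example_beta, example_survival)
```

For a > 0 the coefficient β is negative, so a higher current intensity lowers the survival probability, while α is a small positive convexity adjustment that vanishes when σ = 0.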
However, in our study we focus on mortality intensities. We obtain the parameters using the estimation dataset as described above, after which we use them to generate paths and to forecast mortality intensities. Finally, we calculate the error between the forecasted mortality curve and the actual mortality rates.
The residuals of the model are the realisations of the random component $dW(t)$, which should follow the standard normal distribution if the parameters are estimated correctly:
$$\frac{\Delta\mu - a\mu}{\sigma} \sim N(0, 1).$$
We use the Kolmogorov-Smirnov statistic to test the hypothesis that the errors come from a standard normal distribution.
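The residual check can be sketched with the standard library alone. The residuals are computed as (Δμ − aμ)/σ on yearly data, and the Kolmogorov-Smirnov distance to the standard normal distribution is compared with the large-sample 5% critical value, approximated here by 1.358/√n. That approximation, the function names and the synthetic samples are assumptions of this sketch; the source does not say which implementation of the test was used.

```python
import math
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def ou_residuals(mu, a, sigma):
    """Standardised yearly increments (mu[i+1] - mu[i] - a*mu[i]) / sigma."""
    return [(mu[i + 1] - mu[i] - a * mu[i]) / sigma for i in range(len(mu) - 1)]


def ks_statistic(sample):
    """Largest gap between the empirical CDF and the standard normal CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    distance = 0.0
    for i, x in enumerate(ordered, start=1):
        cdf = STANDARD_NORMAL.cdf(x)
        distance = max(distance, i / n - cdf, cdf - (i - 1) / n)
    return distance


def rejects_standard_normal(sample, coefficient=1.358):
    """Approximate 5% test: reject when the KS distance exceeds coefficient / sqrt(n)."""
    return ks_statistic(sample) > coefficient / math.sqrt(len(sample))


# Synthetic samples: evenly spaced normal quantiles, and the same sample shifted by 3.
n_points = 100
ideal_sample = [STANDARD_NORMAL.inv_cdf((i - 0.5) / n_points) for i in range(1, n_points + 1)]
shifted_sample = [x + 3.0 for x in ideal_sample]
print(rejects_standard_normal(ideal_sample), rejects_standard_normal(shifted_sample))
```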
In Table 3 we report the obtained parameters for 7 selected generations. As expected, the parameter a increases with age, which means that the average mortality intensity is larger for older generations. The parameter σ also grows with age, which is consistent with there being more uncertainty in mortality rates at older ages.
Figure 5a shows the residuals of the model. According to the Kolmogorov-Smirnov test, the hypothesis that the errors of the model are standard normal is not rejected at the 5% significance level. Figure 5b illustrates the historical mortality intensities (in blue), 1000 MC simulations (in yellow), the average of the simulations (in red) and the confidence intervals (in green) for the entire period (1970–2009). The graph is drawn for the generation aged 51 in the year 1970.

6.4. Calibration of the Feller Process

We have maximised the log-likelihood function stated in Equation (24), assuming that the mean-reversion parameter is zero. Table 4 reports the obtained parameters and the value of the maximised log-likelihood function for each generation. The obtained parameter values correspond well with previous work, such as Luciano and Vigna [11].
The simulation of future mortality is performed by discretising Equation (17) with a time step equal to one year:
$$\mu_{t+1} = \mu_t + a\mu_t + \sigma\sqrt{\mu_t}\, N(0, 1).$$
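The one-year discretisation can be sketched as follows, assuming the scheme μ_{t+1} = μ_t + aμ_t + σ√μ_t · N(0, 1). Because a Gaussian shock can push the discretised intensity below zero, this sketch floors the value under the square root at zero; that safeguard is an assumption of the example and is not described in the source. The function name and the parameter values are hypothetical.

```python
import math
import random


def simulate_feller_path(mu0, a, sigma, n_steps, rng=None):
    """Yearly Euler steps: mu_next = mu + a*mu + sigma*sqrt(mu)*N(0, 1)."""
    if rng is None:
        rng = random.Random()
    path = [mu0]
    for _ in range(n_steps):
        mu = path[-1]
        # Assumption of this sketch: floor at zero so the square root is defined.
        diffusion = sigma * math.sqrt(max(mu, 0.0))
        path.append(mu + a * mu + diffusion * rng.gauss(0.0, 1.0))
    return path


# Hypothetical parameters, for illustration only.
feller_path = simulate_feller_path(mu0=0.005, a=0.09, sigma=0.005, n_steps=10,
                                   rng=random.Random(0))
feller_deterministic = simulate_feller_path(mu0=0.01, a=0.1, sigma=0.0, n_steps=3)
```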

7. Comparison of the Four Models

We compare the performance of the models for the 19 generations, identified by their age in 1970. For each generation, we forecast mortality rates over the period 2000–2009 and compute the relative error of prediction, as well as the percentage of the observed mortality rates in the test period which fall within the prediction interval. The forecast and the prediction bounds are obtained using 15,000 Monte Carlo simulations.
In this section we define the tests of the mean relative error and the prediction intervals. A model can perform very well with respect to the percentage of the mortality rates which fall within the prediction bounds, while at the same time having a high relative error, if its variance grows rapidly and, therefore, the model produces wide prediction bounds. We say that a model is precise if its forecasts of mortality are consistent with respect to the prediction interval and that a model is accurate if its mean absolute forecast errors are small. Of course, it is desirable for a good model to be both accurate and precise. To interpret the results of the experiments, it is important to understand the form of the variance implied by each model which we discuss in this section as well.

7.1. Relative Error

For each $x$, $t$ the relative error is defined as follows:
$$error_x(t) = \frac{\mu_x^{predicted}(t) - \mu_x^{actual}(t)}{\mu_x^{actual}(t)} \quad \forall x, t.$$
Since the longevity risk corresponds to lower-than-expected mortality rates, we define the error so that it is positive if the forecast of mortality exceeds the historical values (actual values are lower-than-expected), and negative in the opposite case.
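A minimal sketch of this error measure follows, with the sign convention described above: the error is positive when the forecast exceeds the observed rate. The function names and the toy rates are hypothetical.

```python
def relative_errors(predicted, actual):
    """(predicted - actual) / actual for each point; positive means over-prediction."""
    return [(p - o) / o for p, o in zip(predicted, actual)]


def mean_absolute(values):
    """Mean of absolute values, as used for the mean absolute relative error."""
    return sum(abs(v) for v in values) / len(values)


# Toy rates (hypothetical): one over-prediction and one under-prediction of 10%.
toy_actual = [0.010, 0.020]
toy_predicted = [0.011, 0.018]
toy_errors = relative_errors(toy_predicted, toy_actual)
toy_mean_abs_error = mean_absolute(toy_errors)
print(toy_errors, toy_mean_abs_error)
```

Taking absolute values before averaging prevents errors of opposite sign from cancelling, which is why the tables report mean absolute errors.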
We compute the relative error for 19 generations of females aged 42–60 in the base year 1970. Thus, in the test period, for which the error graphs are plotted, they are 30 years older, that is, aged 72–90.
The results of the experiments are presented in Figures 6 and 7 and Tables 5 and 6. The graphs of the mean absolute errors in Figure 7 illustrate the results shown in Tables 5 and 6. We can see that most of the errors fall in the range $[-0.1; 0.1]$. The exception is the OU process for the generation aged 60 in 1970, especially for the later years of the projection. The error for this generation in the Feller process forecast is also large: its absolute mean is 0.0427 (Table 5). This increases the mean absolute errors of these processes for this generation shown in Table 5. Figure 7b shows the relative absolute error for each generation, averaged over the years of the test period. We see that the error is smaller for younger ages. All the models show a high error for the generation aged 50 in the year 1970. This might be due to the cohort effect which is generally present in the UK data. The largest error for this generation is produced by the Lee-Carter model. The graph of the relative absolute errors by year, averaged over generations (Figure 7a), shows an increasing trend for all 4 models, especially for the Lee-Carter model (red line). This effect is due to the fact that the variances of the projected mortality rates increase with projection time. However, this does not happen at the same rate in the different models.
The errors of the Lee-Carter model are mostly positive (Figure 6a), and we can observe a relative increase of the errors over time for each generation, indicating that the Lee-Carter model has tended to predict mortality rates that are too high. The errors of the Wills and Sherris model exhibit two patterns for different generations (Figure 6c). They are negative for the older generations (aged 60, 57, 54, 51 and 48 in 1970) and positive for the two youngest ones (aged 45 and 42 in 1970). We note that the older generations belong to the lower diagonal of the initial mortality matrix, while the younger two belong to the upper diagonal of the matrix. Thus, it may be that the Wills and Sherris model tends to overestimate mortality for the younger generations and to underestimate it for the older ones.
According to Table 5 and Table 6, the OU process exhibits the lowest mean absolute error for the UK female data, followed by the Feller process, the Wills and Sherris model and the Lee-Carter model.

7.2. Discussion on the Variances

As stated in the description of the model, the variance of the mortality intensity $\mu(t)$, conditional on time 0, in the OU specification has the form $\sigma^2 \frac{e^{2at} - 1}{2a}$, where t is the time elapsed. For the Feller process, when the intensity $\mu(t)$ is specified by the CIR process of the form:
$$d\mu(t) = (b + a\mu(t))\,dt + \sigma\sqrt{\mu(t)}\,dW(t),$$
with $a > 0$, $b > 0$, $\sigma > 0$, the distribution of the mortality intensity at time t, conditional on time 0, is given by a non-central chi-square distribution:
$$\mu(t) \sim \frac{\sigma^2 (e^{at} - 1)}{4a}\, \chi_d^2(\nu),$$
where $\chi_d^2(\nu)$ denotes the density of a non-central chi-square random variable with d degrees of freedom:
$$d = \frac{4b}{\sigma^2},$$
and the non-centrality parameter ν is
$$\nu = \frac{4a e^{at}}{\sigma^2 (e^{at} - 1)}\, \mu(0).$$
The $\chi_d^2(\nu)$ distribution has a variance $Var\left[\chi_d^2(\nu)\right] = 2(d + 2\nu)$. Thus, the intensity $\mu(t)$ has a variance:
$$\frac{\sigma^2 (e^{at} - 1)}{2a} \left( \frac{4b}{\sigma^2} + \frac{8a e^{at}}{\sigma^2 (e^{at} - 1)}\, \mu(0) \right).$$
In the Feller specification, the parameter b is not defined and, hence, the number of degrees of freedom d is not defined either. However, we can see that, other parameters being equal, the variance of the OU process should grow faster in time than the variance of the CIR process, as it has an $e^{2at}$ term rather than $e^{at}$.
The Wills and Sherris model assumes that the changes in the force of mortality follow a normal distribution:
$$\Delta\mu(x, t) \sim N\big((a(x + t) + b)\mu,\ \sigma\mu\big).$$
Thus, the variance of the mortality intensity grows in time, since at each time step it is multiplied by the mortality rate from the year before.
For the Lee-Carter model, the variance of the logarithm of the mortality rate at time t, for each age x, is $\beta_x^2 \hat{\sigma}_{rw}^2 t$, where $\hat{\sigma}_{rw}^2$ is the variance of the random walk process $\kappa_t$, in our case equal to 0.9154, and t is the time elapsed.
To better understand the variance of the distribution of mortality rates at each point in time, we show graphs of the standard deviation in Figure 8. The variance of each model grows with time; however, for the Wills and Sherris model it grows substantially faster than for the other three models. The second fastest growing variance is that of the Feller process, followed by the OU process, and, finally, the Lee-Carter model shows the slowest growth in variance.
Regarding the covariance/correlation across generations and ages, each model employs a different structure. Lee-Carter is a one-factor model, which results in mortality improvements at all ages being perfectly correlated. The Wills and Sherris model is designed to capture correlation between the ages. In practice, it amplifies the effect of the variance growth over time, since in reality the correlation increases with age (see Wills and Sherris [13]). In fact, in the simulation procedure the Wiener process is multiplied by the instantaneous correlation matrix D, which describes the correlation structure between ages. The OU and the Feller processes in the current study do not take into account the correlation between generations. However, they can also be extended to the case of multiple generations. In [27] it is described how the OU process can be extended to the case of two generations whose changes in mortality intensities are correlated with an instantaneous correlation coefficient.

7.3. Discussion on the Number of Parameters

The number of estimated parameters is different for each procedure. The Wills and Sherris model estimates only 3 parameters for the whole dataset, plus the eigenvectors used to approximate the correlation matrix (in our case 30 eigenvectors), while both the OU and the Feller processes fit 2 parameters for each generation. To calibrate the Lee-Carter model, we have to estimate $A + T = 50 + 30 = 80$ parameters. To predict mortality for each generation in the dataset for which the estimation part is not shorter than 10 years, we have to estimate the OU and the Feller processes for 19 generations, resulting in 38 parameters each; for the Wills and Sherris model we have used $3 + 30 = 33$ parameters.

7.4. Prediction Intervals

The results of the experiments based on prediction intervals are presented in Figure 9 and Table 7. Figure 9 illustrates the 90% prediction intervals built for each model with 15,000 MC simulations for the generation aged 57 in 1970. We can see that the width of the intervals differs between the models: it is the smallest for the OU process, medium for the Lee-Carter model and the Feller process, and the widest for the Wills and Sherris model, especially at the older ages.
According to Table 7, the Wills and Sherris model shows the best results in terms of the percentage of the actual future mortality rates which fall within the prediction bounds: for 15 out of 19 generations the percentage reaches 100%. The Lee-Carter model and the Feller process give comparably good results, while the OU process shows the worst result, with the average percentage of the actual mortality within the prediction intervals being 53% (Table 7).
Note that although the variance of the Lee-Carter model is smaller than those produced by the OU process and the Feller process, the Lee-Carter model shows better results.
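The coverage measure used in this section is the share of observed rates that lie between the lower and upper prediction bounds. A minimal sketch, with a hypothetical function name and toy inputs, is:

```python
def interval_coverage(actual, lower, upper):
    """Fraction of actual values lying inside [lower, upper], point by point."""
    inside = sum(1 for x, lo, hi in zip(actual, lower, upper) if lo <= x <= hi)
    return inside / len(actual)


# Toy inputs (hypothetical): the first two observations fall inside the bounds,
# the last two lie above the upper bound.
observed_rates = [0.010, 0.012, 0.015, 0.019]
lower_bounds = [0.009, 0.010, 0.011, 0.012]
upper_bounds = [0.011, 0.013, 0.014, 0.016]
toy_coverage = interval_coverage(observed_rates, lower_bounds, upper_bounds)
print(toy_coverage)
```

With a ten-year test period, each generation's coverage is a multiple of 0.1, which matches the granularity of the values reported in the appendix tables.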

8. Robustness of Simulation Results

Here we evaluate the performance of the approach described above on the 4 datasets. They are:
  • UK Females
  • UK Males
  • Australian Females
  • Australian Males
The experiments in this section are made using the same time and the age periods—1970–2009 and 50–99. The generations aged 42–60 in the year 1970 are chosen in the same manner as in Section 6 and Section 7.
The results of the estimation are presented in Table 8 and Table 9. More detailed results of the estimation for the 4 datasets are included in the Appendix as tables and plots of the errors for each of the 19 generations. According to Table 8 and Table 9, the results for the UK male data with regard to accuracy are the same as for the UK females: the OU and the Feller processes produce the smallest errors, while the Lee-Carter and the Wills and Sherris models show the largest errors. However, the mean of the absolute error for the Lee-Carter model in this case is 3 times larger than for the UK female estimation. This model also shows a very poor result in terms of precision (with the percentage of the actual mortality within the prediction interval being only 4.21%).
The Australian female data is the only dataset for which the Lee-Carter model shows good results, in terms of both precision and accuracy. The OU and the Feller processes, on the contrary, produce large errors for this dataset, especially for the generations aged 45–55. This may be explained by the fact that mortality in Australia is lower for people in their 40s and 50s in comparison to their UK counterparts and, as a consequence, mortality intensities are larger at older ages. This can be seen from the plots of the mortality curves for the generations aged 51 and 54 in the year 1970 (Figure 10). The more pronounced convex shape of the mortality curves for the Australian population makes the error (which is calculated for the last 10 years of the observations) larger, as the prediction underestimates the actual mortality intensity. We would suggest that including a correlation coefficient in the OU and the Feller processes to describe the dependence between the generations could improve the calibration results for these procedures, by taking into account the fact that if the mortality of the generations aged 45–55 is rather low, it would imply an increase in the mortality intensity at the older ages.
It is worth noting that for all datasets the errors produced by the Wills and Sherris model, the OU process and the Feller process exhibit similar patterns (Figure 7b, Figure 11b, Figure 12b and Figure 13b), while the errors produced by the Lee-Carter model have a different pattern. This may be explained by the fact that the first three procedures model the changes in mortality intensity for a cohort, while the Lee-Carter model describes the central mortality rate itself.
On the whole, we can say that the results are data dependent. However, from the estimation results on the four datasets, we can conclude that the Wills and Sherris model performs best in terms of precision, but it is one of the worst in terms of accuracy. The Lee-Carter model shows a better fit to the Australian population dataset than to the British one, for both males and females. The OU process and the Feller process provide rather good results in terms of accuracy, while they often rank low in terms of precision.

9. Conclusions

In this study we have calibrated 4 mortality models to the UK and Australian populations and have quantitatively compared their accuracy and precision in forecasting mortality rates. To evaluate this we have used two measures – first, we looked at the relative errors between the forecasted and the observed mortality rates and second, we investigated the percentage of the observed mortality rates which fell within the projected prediction intervals. Our experiments compare one discrete-time model, proposed by Lee and Carter, and three continuous-time models—the Wills and Sherris model, the Ornstein-Uhlenbeck process and the Feller process. The first two models estimate the whole surface of mortality across ages and years simultaneously, while the latter two model each generation separately. One major advantage of the OU and the Feller processes is that they belong to the affine class of mortality models and so allow closed-form expressions for survival probabilities, which is useful for pricing many financial securities. On the other hand, the Wills and Sherris model allows the dependencies between generations to be captured, which may be useful for life offices who have portfolios written on multiple cohorts.
The choice of the model may depend on the goal and the data available. In our experiments with the UK female data, the Wills and Sherris model performs best in terms of the prediction interval, followed by the Lee-Carter model. In terms of the mean absolute error, the OU and the Feller processes are better. Thus, for the UK data, models which capture the whole mortality surface are more precise, meaning that their forecast prediction intervals are more likely to include the observed mortality rates. Models for a single generation, on the other hand, tend to be more accurate, meaning that their mean absolute errors between the forecast and observed mortality are smaller. For the UK male data the results are rather similar; the main difference is that the LC model in this case provides a much worse result in terms of both precision and accuracy.
However, the results are different for the Australian dataset. In this case, the Lee-Carter model and the OU process are the best in terms of accuracy, both for males and females. The Wills and Sherris model shows a good result with respect to the precision measure for Australia as well, followed by the LC model for the females and the Feller process for the males.
Based on our experiments, different models appear to be preferred for specific generations and years. We believe that our analysis and the results discussed in this paper are useful for the insurance industry. In particular, we provide potentially useful insights into different mortality modelling frameworks and allow practitioners to choose a model that suits their specific needs.

Acknowledgments

The author thanks Luca Regis, assistant professor at IMT Lucca, for assistance with the setup of the experiments. I am also grateful to the participants of the ASF 2016 and ICASQF 2016 conferences, especially to Andrew Hunt, research actuary at Pacific Life Re, and Lewis Ramsden, postgraduate research student at the University of Liverpool, for their comments on the earlier version of the manuscript.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

Table A1. Mean (over 10 years) of the absolute errors for each generation, UK male data.

| Age in 1970 | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 60 | 0.0494 | 0.0612 | 0.0758 | 0.0560 |
| 59 | 0.0445 | 0.0634 | 0.0737 | 0.0506 |
| 58 | 0.0227 | 0.0739 | 0.0320 | 0.0222 |
| 57 | 0.0288 | 0.0941 | 0.0342 | 0.0257 |
| 56 | 0.0360 | 0.1148 | 0.0425 | 0.0315 |
| 55 | 0.0602 | 0.1601 | 0.0670 | 0.0506 |
| 54 | 0.0511 | 0.1746 | 0.0549 | 0.0423 |
| 53 | 0.0583 | 0.2275 | 0.0399 | 0.0340 |
| 52 | 0.0527 | 0.2031 | 0.0368 | 0.0344 |
| 51 | 0.0536 | 0.1455 | 0.0143 | 0.0188 |
| 50 | 0.0356 | 0.0676 | 0.0249 | 0.0228 |
| 49 | 0.0408 | 0.1205 | 0.0204 | 0.0178 |
| 48 | 0.0579 | 0.1350 | 0.0347 | 0.0306 |
| 47 | 0.0585 | 0.1369 | 0.0280 | 0.0260 |
| 46 | 0.0738 | 0.1086 | 0.0456 | 0.0410 |
| 45 | 0.0680 | 0.1169 | 0.0274 | 0.0278 |
| 44 | 0.0924 | 0.1065 | 0.0659 | 0.0598 |
| 43 | 0.0976 | 0.1291 | 0.0735 | 0.0614 |
| 42 | 0.1196 | 0.1404 | 0.1007 | 0.0904 |
| Mean (Rank) | 0.0580 (3) | 0.1252 (4) | 0.0470 (2) | 0.0391 (1) |
Table A2. Mean (over 19 generations) of the absolute errors for each year, UK male data.

| Year | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 2000 | 0.0318 | 0.0643 | 0.0283 | 0.0268 |
| 2001 | 0.0418 | 0.0790 | 0.0342 | 0.0312 |
| 2002 | 0.0354 | 0.0793 | 0.0268 | 0.0226 |
| 2003 | 0.0378 | 0.0735 | 0.0267 | 0.0271 |
| 2004 | 0.0621 | 0.1237 | 0.0485 | 0.0413 |
| 2005 | 0.0576 | 0.1248 | 0.0449 | 0.0356 |
| 2006 | 0.0759 | 0.1575 | 0.0611 | 0.0503 |
| 2007 | 0.0713 | 0.1646 | 0.0566 | 0.0449 |
| 2008 | 0.0632 | 0.1632 | 0.0494 | 0.0349 |
| 2009 | 0.1027 | 0.2224 | 0.0932 | 0.0768 |
| Mean (Rank) | 0.0580 (4) | 0.1252 (1) | 0.0470 (3) | 0.0391 (2) |
Table A3. Percentage of the actual mortality rates which falls within a 90% prediction interval, UK male data.

| Age in 1970 | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 60 | 1.0000 | 0.1000 | 0.1000 | 0.5000 |
| 59 | 1.0000 | 0.1000 | 0.1000 | 0.6000 |
| 58 | 1.0000 | 0 | 0.6000 | 0.9000 |
| 57 | 1.0000 | 0 | 0.7000 | 1.0000 |
| 56 | 1.0000 | 0 | 0.3000 | 0.7000 |
| 55 | 1.0000 | 0 | 0 | 0.5000 |
| 54 | 1.0000 | 0 | 0.4000 | 0.7000 |
| 53 | 1.0000 | 0 | 0.6000 | 0.9000 |
| 52 | 1.0000 | 0 | 0.4000 | 0.7000 |
| 51 | 1.0000 | 0.1000 | 1.0000 | 1.0000 |
| 50 | 1.0000 | 0.4000 | 0.8000 | 0.9000 |
| 49 | 1.0000 | 0 | 0.7000 | 0.9000 |
| 48 | 1.0000 | 0 | 0.5000 | 1.0000 |
| 47 | 1.0000 | 0 | 0.5000 | 0.6000 |
| 46 | 0.9000 | 0 | 0.2000 | 0.6000 |
| 45 | 1.0000 | 0 | 0.6000 | 1.0000 |
| 44 | 0.8000 | 0.1000 | 0.1000 | 0.6000 |
| 43 | 0.5000 | 0 | 0 | 0.5000 |
| 42 | 0.1000 | 0 | 0 | 0 |
| Mean (Rank) | 0.9105 (1) | 0.0421 (4) | 0.4000 (3) | 0.7158 (2) |
Table A4. Mean (over 10 years) of the absolute errors for each generation, Australian female data.

| Age in 1970 | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 60 | 0.1215 | 0.0697 | 0.0746 | 0.0983 |
| 59 | 0.1113 | 0.0386 | 0.0461 | 0.0834 |
| 58 | 0.1191 | 0.0573 | 0.0347 | 0.0777 |
| 57 | 0.1201 | 0.0368 | 0.0372 | 0.0869 |
| 56 | 0.1405 | 0.0591 | 0.0561 | 0.1049 |
| 55 | 0.1270 | 0.0198 | 0.0570 | 0.1018 |
| 54 | 0.1637 | 0.0215 | 0.1123 | 0.1490 |
| 53 | 0.1533 | 0.0229 | 0.1158 | 0.1469 |
| 52 | 0.1713 | 0.0269 | 0.1402 | 0.1709 |
| 51 | 0.1736 | 0.0253 | 0.1633 | 0.1791 |
| 50 | 0.1955 | 0.1026 | 0.1802 | 0.2019 |
| 49 | 0.1438 | 0.0255 | 0.1269 | 0.1533 |
| 48 | 0.1434 | 0.0300 | 0.1030 | 0.1390 |
| 47 | 0.1622 | 0.0344 | 0.1735 | 0.1814 |
| 46 | 0.1551 | 0.0311 | 0.1665 | 0.1839 |
| 45 | 0.1227 | 0.0151 | 0.1331 | 0.1453 |
| 44 | 0.0986 | 0.0356 | 0.1044 | 0.1206 |
| 43 | 0.0905 | 0.0331 | 0.0833 | 0.1017 |
| 42 | 0.0850 | 0.0373 | 0.0787 | 0.0944 |
| Mean (Rank) | 0.1368 (3) | 0.0380 (4) | 0.1046 (1) | 0.1326 (2) |
Table A5. Mean (over 19 generations) of the absolute errors for each year, Australian female data.

| Year | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 2000 | 0.0322 | 0.0400 | 0.0303 | 0.0308 |
| 2001 | 0.0436 | 0.0253 | 0.0354 | 0.0391 |
| 2002 | 0.0978 | 0.0394 | 0.0728 | 0.0887 |
| 2003 | 0.1041 | 0.0309 | 0.0710 | 0.0941 |
| 2004 | 0.1207 | 0.0316 | 0.0825 | 0.1110 |
| 2005 | 0.1284 | 0.0322 | 0.0877 | 0.1205 |
| 2006 | 0.1664 | 0.0358 | 0.1231 | 0.1612 |
| 2007 | 0.2025 | 0.0412 | 0.1588 | 0.2007 |
| 2008 | 0.2421 | 0.0606 | 0.1992 | 0.2441 |
| 2009 | 0.2296 | 0.0434 | 0.1849 | 0.2361 |
| Mean (Rank) | 0.1368 (4) | 0.0380 (1) | 0.1046 (2) | 0.1326 (3) |
Table A6. Percentage of the actual mortality rates which falls within a 90% prediction interval, Australian female data.

| Age in 1970 | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 60 | 1.0000 | 0.4000 | 0.3000 | 0.4000 |
| 59 | 1.0000 | 0.7000 | 0.4000 | 0.4000 |
| 58 | 1.0000 | 0.4000 | 0.6000 | 0.4000 |
| 57 | 1.0000 | 0.7000 | 0.6000 | 0.5000 |
| 56 | 1.0000 | 0.6000 | 0.2000 | 0.2000 |
| 55 | 1.0000 | 1.0000 | 0.4000 | 0.3000 |
| 54 | 0.7000 | 1.0000 | 0.1000 | 0.2000 |
| 53 | 0.7000 | 0.9000 | 0.2000 | 0.2000 |
| 52 | 0.5000 | 1.0000 | 0.1000 | 0.1000 |
| 51 | 0.5000 | 1.0000 | 0.1000 | 0.2000 |
| 50 | 0.3000 | 0.4000 | 0 | 0.1000 |
| 49 | 0.7000 | 0.9000 | 0.1000 | 0.2000 |
| 48 | 0.6000 | 1.0000 | 0.2000 | 0.5000 |
| 47 | 0.6000 | 0.9000 | 0.2000 | 0.2000 |
| 46 | 0.6000 | 1.0000 | 0.1000 | 0.2000 |
| 45 | 0.8000 | 1.0000 | 0.2000 | 0.4000 |
| 44 | 0.8000 | 1.0000 | 0.4000 | 0.5000 |
| 43 | 1.0000 | 1.0000 | 0.7000 | 0.7000 |
| 42 | 1.0000 | 1.0000 | 0.7000 | 0.9000 |
| Mean (Rank) | 0.7789 (2) | 0.8368 (1) | 0.2947 (4) | 0.3474 (3) |
Table A7. Mean (over 10 years) of the absolute errors for each generation, Australian male data.

| Age in 1970 | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 60 | 0.0940 | 0.0930 | 0.0801 | 0.0945 |
| 59 | 0.0439 | 0.0534 | 0.0386 | 0.0346 |
| 58 | 0.1171 | 0.0553 | 0.0980 | 0.1133 |
| 57 | 0.0879 | 0.0376 | 0.0586 | 0.0783 |
| 56 | 0.0974 | 0.0231 | 0.0657 | 0.0867 |
| 55 | 0.1238 | 0.0522 | 0.1385 | 0.1357 |
| 54 | 0.0949 | 0.0622 | 0.0805 | 0.0958 |
| 53 | 0.1144 | 0.0737 | 0.1163 | 0.1257 |
| 52 | 0.0928 | 0.0860 | 0.0817 | 0.0955 |
| 51 | 0.0900 | 0.0806 | 0.0880 | 0.0968 |
| 50 | 0.1338 | 0.0179 | 0.1543 | 0.1531 |
| 49 | 0.0725 | 0.1036 | 0.0916 | 0.0906 |
| 48 | 0.0693 | 0.0807 | 0.0541 | 0.0658 |
| 47 | 0.0581 | 0.0967 | 0.0578 | 0.0663 |
| 46 | 0.0499 | 0.0874 | 0.0562 | 0.0617 |
| 45 | 0.0460 | 0.0784 | 0.0458 | 0.0500 |
| 44 | 0.0387 | 0.0860 | 0.0470 | 0.0486 |
| 43 | 0.0356 | 0.1111 | 0.0628 | 0.0591 |
| 42 | 0.0357 | 0.0742 | 0.0422 | 0.0432 |
| Mean (Rank) | 0.0787 (3) | 0.0712 (1) | 0.0767 (2) | 0.0840 (4) |
Table A8. Mean (over 19 generations) of the absolute errors for each year, Australian male data.

| Year | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 2000 | 0.0279 | 0.0365 | 0.0287 | 0.0285 |
| 2001 | 0.0362 | 0.0516 | 0.0370 | 0.0367 |
| 2002 | 0.0623 | 0.0404 | 0.0590 | 0.0615 |
| 2003 | 0.0536 | 0.0609 | 0.0521 | 0.0525 |
| 2004 | 0.0615 | 0.0597 | 0.0580 | 0.0640 |
| 2005 | 0.0561 | 0.0970 | 0.0478 | 0.0578 |
| 2006 | 0.0858 | 0.0913 | 0.0825 | 0.0926 |
| 2007 | 0.1132 | 0.0866 | 0.1076 | 0.1239 |
| 2008 | 0.1526 | 0.0814 | 0.1497 | 0.1652 |
| 2009 | 0.1380 | 0.1067 | 0.1448 | 0.1571 |
| Mean (Rank) | 0.0787 (3) | 0.0712 (1) | 0.0767 (2) | 0.0840 (4) |
Table A9. Percentage of the actual mortality rates which falls within a 90% prediction interval, Australian male data.

| Age in 1970 | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 60 | 1.0000 | 0.2000 | 0.4000 | 0.6000 |
| 59 | 1.0000 | 0.4000 | 0.7000 | 1.0000 |
| 58 | 1.0000 | 0.2000 | 0.2000 | 0.3000 |
| 57 | 1.0000 | 0.3000 | 0.4000 | 0.8000 |
| 56 | 1.0000 | 0.6000 | 0.4000 | 0.5000 |
| 55 | 1.0000 | 0.1000 | 0 | 0 |
| 54 | 1.0000 | 0.2000 | 0.4000 | 0.5000 |
| 53 | 1.0000 | 0.1000 | 0.2000 | 0.3000 |
| 52 | 1.0000 | 0 | 0.3000 | 0.4000 |
| 51 | 1.0000 | 0.2000 | 0.3000 | 0.3000 |
| 50 | 1.0000 | 1.0000 | 0 | 0.2000 |
| 49 | 1.0000 | 0 | 0.4000 | 0.7000 |
| 48 | 1.0000 | 0.2000 | 0.6000 | 0.8000 |
| 47 | 1.0000 | 0.1000 | 0.4000 | 0.6000 |
| 46 | 1.0000 | 0.5000 | 0.7000 | 0.7000 |
| 45 | 1.0000 | 0.6000 | 0.7000 | 0.8000 |
| 44 | 1.0000 | 0.4000 | 0.7000 | 0.8000 |
| 43 | 1.0000 | 0.1000 | 0.7000 | 0.7000 |
| 42 | 1.0000 | 0.6000 | 0.9000 | 0.9000 |
| Mean (Rank) | 1.0000 (1) | 0.3053 (4) | 0.4421 (3) | 0.5737 (2) |

References

  1. A.J.G. Cairns, D. Blake, and K. Dowd. “Pricing death: Framework for the valuation and securitization of mortality risk.” ASTIN Bull. 36 (2005): 79–120.
  2. R.D. Lee, and L.R. Carter. “Modelling and forecasting U.S. mortality.” J. Am. Stat. Assoc. 87 (1992): 659–671.
  3. A.E. Renshaw, and S. Haberman. “A cohort-based extension to the Lee-Carter model for mortality reduction factors.” Insur. Math. Econ. 38 (2006): 556–570.
  4. A.J.G. Cairns, D. Blake, and K. Dowd. “A two-factor model for stochastic mortality with parameter uncertainty: Theory and calibration.” J. Risk Insur. 73 (2006): 687–718.
  5. R. Plat. “On stochastic mortality modelling.” Insur. Math. Econ. 45 (2009): 393–404.
  6. C. O’Hare, and Y. Li. “Explaining young mortality.” Insur. Math. Econ. 50 (2012): 12–25.
  7. M.A. Milevsky, and D. Promislow. “Mortality Derivatives and the Option to Annuitize.” Insur. Math. Econ. 29 (2001): 299–318.
  8. M. Dahl. “Stochastic mortality in life insurance: Market reserves and mortality-linked insurance contracts.” Insur. Math. Econ. 35 (2004): 113–136.
  9. E. Biffis. “Affine processes for dynamic mortality and actuarial valuations.” Insur. Math. Econ. 37 (2005): 443–468.
  10. M. Denuit, and P. Devolder. Continuous Time Stochastic Mortality and Securitization of Longevity Risk. Louvain-la-Neuve Working Paper 06-02; Louvain-la-Neuve, Belgium: Institut des Sciences Actuarielles, Université Catholique de Louvain, 2006.
  11. E. Luciano, and E. Vigna. “Mortality risk via affine stochastic intensities: Calibration and empirical relevance.” Belgian Actuar. Bull. 8 (2008): 5–16.
  12. D. Schrager. “Affine stochastic mortality.” Insur. Math. Econ. 38 (2006): 81–97.
  13. S. Wills, and M. Sherris. Integrating Financial and Demographic Longevity Risk Models: An Australian Model for Financial Applications. UNSW Australian School of Business Research Paper No. 2008ACTL05; Kensington NSW, Australia: UNSW Australian School, 2011.
  14. S. Wills, and M. Sherris. “Securitization, Structuring and Pricing of Longevity Risk.” Insur. Math. Econ. 46 (2010): 173–185.
  15. A.J.G. Cairns, D. Blake, K. Dowd, G.D. Coughlan, D. Epstein, A. Ong, and I. Balevich. “A quantitative comparison of stochastic mortality models using data from England and Wales and the United States.” N. Am. Actuar. J. 13 (2009): 1–35.
  16. K. Dowd, A.J.G. Cairns, D. Blake, G.D. Coughlan, D. Epstein, and M. Khalaf-Allah. “Evaluating the goodness of fit of stochastic mortality models.” Insur. Math. Econ. 47 (2010): 255–265.
  17. K. Dowd, A.J.G. Cairns, D. Blake, G.D. Coughlan, D. Epstein, and M. Khalaf-Allah. “Backtesting stochastic mortality models: An ex-post evaluation of multi-period-ahead density forecasts.” N. Am. Actuar. J. 14 (2010): 281–298.
  18. R. Giacometti, M. Bertocchi, S.T. Rachev, and F.J. Fabozzi. “A comparison of the Lee-Carter model and AR-ARCH model for forecasting mortality rates.” Insur. Math. Econ. 50 (2012): 85–93.
  19. R.D. Lee, and T. Miller. “Evaluating the performance of the Lee-Carter method for forecasting mortality.” Demography 38 (2001): 537–549.
  20. N. Ngai, and M. Sherris. “Longevity risk management for life and variable annuities: The effectiveness of static hedging using longevity bonds and derivatives.” Insur. Math. Econ. 49 (2011): 100–114.
  21. D. Blake, and W. Burrows. “Survivor bonds: Helping to hedge mortality risk.” J. Risk Insur. 68 (2001): 339–348.
  22. E. Luciano, L. Regis, and E. Vigna. “Delta-Gamma hedging of mortality and interest rate risk.” Insur. Math. Econ. 50 (2012): 402–412.
  23. F. Girosi, and G. King. “Understanding the Lee-Carter Mortality Forecasting Method.” Available online: http://gking.harvard.edu/files/abs/lc-abs.shtml (accessed on 29 November 2016).
  24. D. Duffie, D. Filipovic, and W. Schachermayer. “Affine processes and applications in finance.” Ann. Appl. Probab. 13 (2003): 984–1053.
  25. J.C. Cox, J.E. Ingersoll, and S.A. Ross. “A Theory of the Term Structure of Interest Rates.” Econometrica 53 (1985): 385–407.
  26. K. Kladivko. “Maximum Likelihood Estimation of the Cox-Ingersoll-Ross Process: The Matlab Implementation.” In Proceedings of the Technical Computing Conference, Prague, Czech Republic, 19–24 July 2007.
  27. P. Jevtic, E. Luciano, and E. Vigna. “Mortality surface by means of continuous time cohort models.” Insur. Math. Econ. 53 (2013): 122–133.
Figure 1. Observed mortality rates for the UK female population. (a) Estimation data set, 1970–1999; (b) Backtesting data set, 2000–2009.
Figure 2. Actual centered mortality and its approximation. (a) Mean-centered log-mortality rate; (b) Approximation by 1-factor SVD.
Figure 3. Results of the Lee-Carter model. (a) Estimation and forecast of κ_t; (b) Lee-Carter forecast for 2000–2009.
Figure 4. Results of the Wills and Sherris model. (a) Fitted residuals, UK females, 1970–1999; (b) Mortality forecast for the test period.
Figure 5. Results of the OU-process for UK female generation aged 51 in the year 1970. (a) Fitted residuals of the model (1971–2009); (b) Historical mortality and simulated paths (1971–2009).
Figure 6. Relative error of each model for every generation. (a) Lee-Carter model; (b) OU-process; (c) Wills-Sherris model; (d) Feller process.
Figure 7. Mean relative absolute error of each model, UK female data. (a) Relative absolute error, average by year; (b) Relative absolute error, average by generation.
Figure 8. Standard deviation of each model for every generation. (a) Lee-Carter model; (b) OU-process; (c) Wills-Sherris model; (d) Feller process.
Figure 9. Actual mortality rate and 90% prediction intervals for the generation aged 57, for each model. (a) Lee-Carter model; (b) OU-process; (c) Wills-Sherris model; (d) Feller process.
Figure 10. Observed mortality curves for the UK and Australian generations aged 51 (a) and 54 (b). (a) Generation aged 51 in the year 1970; (b) Generation aged 54 in the year 1970.
Figure 11. Mean relative absolute error of each model, UK male data. (a) Relative absolute error, average by year; (b) Relative absolute error, average by generation.
Figure 12. Mean relative absolute error of each model, Australian female data. (a) Relative absolute error, average by year; (b) Relative absolute error, average by generation.
Figure 13. Mean relative absolute error of each model, Australian male data. (a) Relative absolute error, average by year; (b) Relative absolute error, average by generation.
Table 1. Parameter estimates and residual descriptive statistics for the Wills and Sherris model fitted to UK female mortality rates, 1970–1999.

Parameter Estimates

| Parameter | Value |
|---|---|
| a | 0.0007032 |
| b | 0.0850 |
| σ | 0.0385 |
| Log-likelihood | 7246.4 |
| BIC | −14480 |

Residual Descriptive Statistics

| Statistic | Value |
|---|---|
| Mean | 1.3192 × 10^(−15) |
| Minimum | −5.1384 |
| Maximum | 2.9886 |
| Standard Deviation | 1.0004 |
| Standard Error | 0.0125 |
| Confidence Level | 0.0003 |
| χ² | 1.5835 |
Table 2. Percentage of the observed variation in residuals explained by the eigenvectors using PCA.

| Number of Eigenvectors | % of Observed Variation |
|---|---|
| 1 | 28.1 |
| 5 | 55.8 |
| 10 | 75.4 |
| 15 | 86.5 |
| 20 | 93.1 |
| 25 | 96.8 |
| 30 | 98.9 |
| 35 | 100 |
Table 3. ML parameters of the OU-process and maximised log-likelihood.

| Generation Age in 1970 | a | σ | Max Log-Likelihood | BIC |
|---|---|---|---|---|
| 60 | 0.1024 | 0.0020 | 138.1895 | −269.5766 |
| 57 | 0.0999 | 0.0015 | 146.5591 | −286.3159 |
| 54 | 0.0951 | 0.0008 | 164.1945 | −321.5865 |
| 51 | 0.0894 | 0.0008 | 165.3501 | −323.8979 |
| 48 | 0.0845 | 0.0004 | 168.9370 | −331.2095 |
| 45 | 0.0841 | 0.0004 | 155.7339 | −305.0300 |
| 42 | 0.0815 | 0.0004 | 136.3726 | −266.5631 |
Table 4. ML parameters of the Feller process and maximised log-likelihood.

| Generation Age in 1970 | a | σ | Max Log-Likelihood | BIC |
|---|---|---|---|---|
| 60 | 0.0984 | 0.0073 | 147.1586 | −287.5148 |
| 57 | 0.0955 | 0.0061 | 156.6162 | −306.4299 |
| 54 | 0.0922 | 0.0043 | 171.9529 | −337.1034 |
| 51 | 0.0869 | 0.0046 | 172.7420 | −338.6817 |
| 48 | 0.0833 | 0.0036 | 170.9023 | −335.1403 |
| 45 | 0.0826 | 0.0034 | 155.2175 | −303.9972 |
| 42 | 0.0801 | 0.0031 | 139.2860 | −272.3900 |
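
The BIC columns of Tables 3 and 4 appear to follow the standard definition BIC = k ln(n) − 2 ln L. The sketch below is illustrative only: the parameter count k = 2 and the sample size n = 30 are assumptions inferred from the reported figures, not values stated with the tables, and the function name `bic` is hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Assumed values (inferred, not stated in the source): k = 2 parameters, n = 30 observations.
K_PARAMS = 2
N_OBS = 30

# Maximised log-likelihoods for the generation aged 60 in 1970 (Tables 3 and 4).
ou_bic = bic(138.1895, K_PARAMS, N_OBS)
feller_bic = bic(147.1586, K_PARAMS, N_OBS)

print(f"OU-process BIC:     {ou_bic:.4f}")
print(f"Feller process BIC: {feller_bic:.4f}")
```

Under these assumptions the computed values agree with the tabulated BIC for that generation to four decimal places; a lower BIC indicates the preferred fit.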
Table 5. Mean (over 10 years) of the absolute errors for each generation, UK female data.

| Age in 1970 | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 60 | 0.0383 | 0.0387 | 0.0837 | 0.0428 |
| 59 | 0.0346 | 0.0245 | 0.0543 | 0.0251 |
| 58 | 0.0396 | 0.0232 | 0.0562 | 0.0248 |
| 57 | 0.0457 | 0.0256 | 0.0430 | 0.0269 |
| 56 | 0.0374 | 0.0313 | 0.0573 | 0.0218 |
| 55 | 0.0420 | 0.0611 | 0.0209 | 0.0313 |
| 54 | 0.0515 | 0.0599 | 0.0220 | 0.0376 |
| 53 | 0.0379 | 0.0965 | 0.0160 | 0.0383 |
| 52 | 0.0362 | 0.0696 | 0.0260 | 0.0320 |
| 51 | 0.0320 | 0.0337 | 0.0169 | 0.0404 |
| 50 | 0.1008 | 0.1630 | 0.0919 | 0.1164 |
| 49 | 0.0388 | 0.0249 | 0.0256 | 0.0405 |
| 48 | 0.0353 | 0.0441 | 0.0383 | 0.0588 |
| 47 | 0.0258 | 0.0406 | 0.0257 | 0.0393 |
| 46 | 0.0212 | 0.0258 | 0.0251 | 0.0409 |
| 45 | 0.0279 | 0.0296 | 0.0263 | 0.0298 |
| 44 | 0.0267 | 0.0299 | 0.0193 | 0.0228 |
| 43 | 0.0312 | 0.0509 | 0.0221 | 0.0285 |
| 42 | 0.0334 | 0.0611 | 0.0232 | 0.0220 |
| Mean (Rank) | 0.0388 (3) | 0.0492 (4) | 0.0365 (1) | 0.0379 (2) |
Table 6. Mean (over 19 generations) of the absolute errors for each year, UK female data.

| Year | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 2000 | 0.0314 | 0.0380 | 0.0379 | 0.0318 |
| 2001 | 0.0248 | 0.0417 | 0.0367 | 0.0247 |
| 2002 | 0.0267 | 0.0320 | 0.0191 | 0.0187 |
| 2003 | 0.0493 | 0.0316 | 0.0241 | 0.0462 |
| 2004 | 0.0332 | 0.0473 | 0.0321 | 0.0204 |
| 2005 | 0.0409 | 0.0401 | 0.0254 | 0.0311 |
| 2006 | 0.0326 | 0.0642 | 0.0405 | 0.0281 |
| 2007 | 0.0457 | 0.0537 | 0.0366 | 0.0475 |
| 2008 | 0.0626 | 0.0479 | 0.0483 | 0.0690 |
| 2009 | 0.0403 | 0.0951 | 0.0645 | 0.0614 |
| Mean (Rank) | 0.0388 (3) | 0.0492 (4) | 0.0365 (1) | 0.0379 (2) |
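
The "Mean (Rank)" row of Table 6 can be recomputed from the yearly entries. The following is a minimal sketch, assuming the rank is simply the ascending order of the column means (smallest error ranked first); the helper names `mean` and `rank_ascending` are illustrative and not taken from the paper.

```python
# Yearly mean absolute errors copied from Table 6 (UK female data, 2000-2009).
errors = {
    "Wills and Sherris": [0.0314, 0.0248, 0.0267, 0.0493, 0.0332,
                          0.0409, 0.0326, 0.0457, 0.0626, 0.0403],
    "Lee-Carter":        [0.0380, 0.0417, 0.0320, 0.0316, 0.0473,
                          0.0401, 0.0642, 0.0537, 0.0479, 0.0951],
    "OU-Process":        [0.0379, 0.0367, 0.0191, 0.0241, 0.0321,
                          0.0254, 0.0405, 0.0366, 0.0483, 0.0645],
    "Feller":            [0.0318, 0.0247, 0.0187, 0.0462, 0.0204,
                          0.0311, 0.0281, 0.0475, 0.0690, 0.0614],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def rank_ascending(scores):
    """Map each key to its rank, where rank 1 is the smallest score."""
    ordered = sorted(scores, key=scores.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


means = {model: mean(values) for model, values in errors.items()}
ranks = rank_ascending(means)

for model in errors:
    print(f"{model}: mean error {means[model]:.4f}, rank {ranks[model]}")
```

Because the tabulated yearly values are already rounded, the recomputed means may differ from the printed ones in the last decimal place.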
Table 7. Percentage of the actual mortality rates that fall within a 90% prediction interval (expressed as a fraction), UK female data.

| Age in 1970 | Wills and Sherris | Lee-Carter | OU-Process | Feller |
|---|---|---|---|---|
| 60 | 1.0000 | 0.5000 | 0.2000 | 0.7000 |
| 59 | 1.0000 | 0.9000 | 0.3000 | 0.8000 |
| 58 | 1.0000 | 0.9000 | 0.3000 | 0.8000 |
| 57 | 1.0000 | 0.9000 | 0.5000 | 0.8000 |
| 56 | 1.0000 | 0.8000 | 0.1000 | 0.9000 |
| 55 | 1.0000 | 0.3000 | 0.6000 | 0.8000 |
| 54 | 1.0000 | 0.7000 | 0.8000 | 0.7000 |
| 53 | 1.0000 | 0.2000 | 0.8000 | 0.7000 |
| 52 | 1.0000 | 0.6000 | 0.5000 | 0.7000 |
| 51 | 1.0000 | 0.9000 | 0.8000 | 0.7000 |
| 50 | 0.5000 | 0 | 0 | 0 |
| 49 | 0.9000 | 1.0000 | 0.7000 | 0.5000 |
| 48 | 1.0000 | 0.7000 | 0.4000 | 0.5000 |
| 47 | 1.0000 | 0.9000 | 0.8000 | 0.8000 |
| 46 | 1.0000 | 1.0000 | 0.4000 | 0.5000 |
| 45 | 1.0000 | 1.0000 | 0.6000 | 0.9000 |
| 44 | 1.0000 | 1.0000 | 0.7000 | 1.0000 |
| 43 | 0.9000 | 0.8000 | 0.7000 | 0.8000 |
| 42 | 0.9000 | 0.7000 | 0.9000 | 0.9000 |
| Mean (Rank) | 0.9579 (1) | 0.7263 (2) | 0.5316 (4) | 0.7105 (3) |
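
Each entry of Table 7 is the share of a generation's ten backtesting years in which the observed rate lies inside the model's 90% prediction interval. The sketch below shows one way such a coverage figure could be computed; the function name `coverage` and all numbers in the example are hypothetical and are not taken from the paper's data.

```python
def coverage(actual, lower, upper):
    """Fraction of observations lying inside [lower, upper], bounds inclusive."""
    if not (len(actual) == len(lower) == len(upper)):
        raise ValueError("actual, lower and upper must have the same length")
    hits = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    return hits / len(actual)


# Hypothetical 10-year example for one generation (illustrative numbers only).
actual = [0.010, 0.011, 0.012, 0.013, 0.015, 0.016, 0.018, 0.019, 0.021, 0.023]
lower = [0.009, 0.010, 0.011, 0.012, 0.013, 0.014, 0.015, 0.016, 0.017, 0.018]
upper = [0.012, 0.013, 0.014, 0.015, 0.016, 0.017, 0.0175, 0.0185, 0.020, 0.021]

share = coverage(actual, lower, upper)
print(f"Share of observed rates inside the interval: {share:.4f}")
```

With ten test years per generation, the coverage can only take values in steps of 0.1, which matches the granularity of the table entries.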
Table 8. Mean of the absolute errors for each dataset over 19 generations (rank of accuracy).

| Model | UK, Females | UK, Males | Australia, Females | Australia, Males |
|---|---|---|---|---|
| Wills and Sherris | 0.0388 (3) | 0.0580 (3) | 0.1368 (4) | 0.0787 (3) |
| Lee-Carter | 0.0492 (4) | 0.1252 (4) | 0.0380 (1) | 0.0712 (1) |
| OU-process | 0.0365 (1) | 0.0470 (2) | 0.1046 (2) | 0.0767 (2) |
| Feller | 0.0379 (2) | 0.0391 (1) | 0.1326 (3) | 0.0840 (4) |
Table 9. Percentage within a 90% prediction interval for each dataset (rank of precision).

| Model | UK, Females | UK, Males | Australia, Females | Australia, Males |
|---|---|---|---|---|
| Wills and Sherris | 0.9579 (1) | 0.9105 (1) | 0.7789 (2) | 1.0000 (1) |
| Lee-Carter | 0.7263 (2) | 0.0421 (4) | 0.8368 (1) | 0.3053 (4) |
| OU-process | 0.7105 (3) | 0.4000 (3) | 0.2947 (4) | 0.4421 (3) |
| Feller | 0.5316 (4) | 0.7158 (2) | 0.3474 (3) | 0.5737 (2) |

Share and Cite

MDPI and ACS Style

Novokreshchenova, A. Predicting Human Mortality: Quantitative Evaluation of Four Stochastic Models. Risks 2016, 4, 45. https://doi.org/10.3390/risks4040045

