Modified Munich Chain-Ladder Method

Merz, Michael; Wüthrich, Mario V.

doi:10.3390/risks3040624

by Michael Merz ¹,* and Mario V. Wüthrich ²,†

¹ Faculty of Business Administration, University of Hamburg, 20146 Hamburg, Germany

² ETH Zurich, RiskLab, Department of Mathematics, 8092 Zurich, Switzerland

* Author to whom correspondence should be addressed.

† Swiss Finance Institute SFI Professor.

Risks 2015, 3(4), 624-646; https://doi.org/10.3390/risks3040624

Submission received: 30 September 2015 / Accepted: 1 December 2015 / Published: 21 December 2015

(This article belongs to the Special Issue Applying Stochastic Models in Practice: Empirics and Numerics)

Abstract:

The Munich chain-ladder method for claims reserving was introduced by Quarg and Mack on an axiomatic basis. We analyze these axioms, and we define a modified Munich chain-ladder method which is based on an explicit stochastic model. This stochastic model then allows us to consider claims prediction and prediction uncertainty for the Munich chain-ladder method in a consistent way.

Keywords:

Munich chain-ladder method; claims reserving; prediction uncertainty; mean-square error of prediction; multivariate Gaussian model; paid and incurred claims

1. Introduction

The Munich chain-ladder method was introduced by Quarg and Mack [1] on a purely axiomatic basis, and in 2003 it was awarded the Gauss prize by Deutsche Aktuarvereinigung (DAV) and Deutsche Gesellschaft für Versicherungs- und Finanzmathematik (DGVFM), see [1]. However, it is still not known whether there is a non-trivial, interesting stochastic model that fulfills these axioms, nor is much known about the prediction uncertainty in the Munich chain-ladder method. Liu and Verrall [2] propose to use the bootstrap to estimate the prediction uncertainty in the Munich chain-ladder method; however, this requires the existence of a model that fulfills the Munich chain-ladder axioms. The aim of this paper is to study the axioms of the Munich chain-ladder method and to define a modified Munich chain-ladder method which is based on an explicit stochastic model. This explicit stochastic model gives a rigorous mathematical foundation for the analysis of claims prediction and its uncertainty.

There are two different ways to view the Munich chain-ladder method. The first way is to define a stochastic model which has the required structure of the Munich chain-ladder factors; this is the approach taken in [1]. The second way is to define a general chain-ladder model and to derive estimators in this model that have the Munich chain-ladder factor structure; this is the approach taken in [3]. Here, we analyze both of these views and we show how the second way leads to a modified Munich chain-ladder method. This analysis is done within the family of multivariate log-normal models. The first main result is that within this family of models, there is, in general, no interesting Munich chain-ladder model, see Theorem 2 below. The resulting Munich chain-ladder predictor always has an approximation error which is quantified in Theorem 3 below. Based on these findings, we define a modified Munich chain-ladder model for which we can derive optimal predictors and the corresponding prediction uncertainty.

Organization of the Paper

In the next section, we consider stochastic models which simultaneously fulfill the chain-ladder assumptions for cumulative payments and incurred claims. In Theorem 1, we see that such models only permit rather restricted correlation structures. For these restricted chain-ladder models, we then study the optimal one-step ahead prediction in Section 3. This optimal one-step ahead prediction can directly be compared to the Munich chain-ladder axioms which are introduced in Section 4. In Theorem 2, we find that, in general, the Munich chain-ladder axioms are not fulfilled in our modeling framework. This motivates a modified Munich chain-ladder method which is presented in Section 5. For this modified version, we derive optimal predictors and study prediction uncertainty in Section 6. These results are compared numerically in Section 7. The numerical study is based on the original data set of Quarg and Mack [1].

2. Chain-Ladder Models

We denote cumulative payments of accident year i and development year j by $P_{i, j}$ and the corresponding incurred claims by $I_{i, j}$, for $i = 0, \dots, J$ and $j = 0, \dots, J$. We define the following sets of information

$B_{j}^{P} = \{P_{i, k}; k \leq j, 0 \leq i \leq J\}, \quad B_{j}^{I} = \{I_{i, k}; k \leq j, 0 \leq i \leq J\} \quad \text{and} \quad B_{j} = B_{j}^{P} \cup B_{j}^{I} .$

Assumption 1 (distribution-free chain-ladder model).

(A1): We assume that the random vectors $(P_{i, 0}, \dots, P_{i, J}, I_{i, 0}, \dots, I_{i, J})$ have strictly positive components and are independent for different accident years $i = 0, \dots, J$ .
(A2): There exist parameters $f_{j}^{P}, f_{j}^{I}, σ_{j}^{P}, σ_{j}^{I} > 0$ such that for $0 \leq j \leq J - 1$ and $0 \leq i \leq J$

$\begin{matrix} E [P_{i, j + 1}| B_{j}^{P}] = f_{j}^{P} P_{i, j} & and & Var (P_{i, j + 1}| B_{j}^{P}) = {(σ_{j}^{P})}^{2} P_{i, j}^{2}, \\ E [I_{i, j + 1}| B_{j}^{I}] = f_{j}^{I} I_{i, j} & and & Var (I_{i, j + 1}| B_{j}^{I}) = {(σ_{j}^{I})}^{2} I_{i, j}^{2} . \end{matrix}$

These assumptions correspond to assumptions PE, PV, IE, IV and PIU in [1], except that we make a modification in the variance assumptions PV and IV. We make this change because it substantially simplifies our considerations (we come back to this in Remark 1 below). Assumption 1 states that cumulative payments ${(P_{i, j})}_{i, j}$ and incurred claims ${(I_{i, j})}_{i, j}$ fulfill the distribution-free chain-ladder model assumptions simultaneously. Our first aim is to show that there is a non-trivial stochastic model that fulfills the chain-ladder model assumptions simultaneously for cumulative payments and incurred claims. To this end, we define an explicit distributional model. The distributions are chosen such that the analysis becomes as simple as possible. We will see that assumption (A2) requires careful treatment.

We choose a continuous and strictly increasing link function $g : R_{+} \to R$ with $\lim_{x \to 0} g (x) = - \infty$ and $\lim_{x \to \infty} g (x) = \infty$. The standard example is the log-link function given by

$g (x) = log x \quad \text{for } x > 0,$

(1)

but the results derived in this section hold true for any link function of this type. The log-link function has the advantage of closed-form solutions. For a general link function g (as introduced above), we define the transformed age-to-age ratios for $0 \leq j \leq J$ and $0 \leq i \leq J$ by

$ξ_{i, j}^{P} = g (\frac{P_{i, j}}{P_{i, j - 1}}) \quad \text{and} \quad ξ_{i, j}^{I} = g (\frac{I_{i, j}}{I_{i, j - 1}}),$

where we set fixed initial values $P_{i, - 1} = I_{i, - 1} = ν_{i}$ according to given volume measures $ν_{i} > 0$. To simplify the exposition, we introduce vector notation: for $0 \leq i \leq J$ we set

$Ξ_{i} = {(ξ_{i, 0}^{P}, \dots, ξ_{i, J}^{P}, ξ_{i, 0}^{I}, \dots, ξ_{i, J}^{I})}^{'} .$
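The transformation to age-to-age ratios can be illustrated with a minimal sketch for a single accident year under the log-link. The function names, the claims figures and the volume measure below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def transformed_ratios(cumulative, volume, link=math.log):
    """Return xi_j = g(C_j / C_{j-1}) for j = 0, 1, ..., with C_{-1} = volume."""
    if volume <= 0 or any(c <= 0 for c in cumulative):
        raise ValueError("volume and cumulative claims must be strictly positive")
    previous = [volume] + list(cumulative[:-1])
    return [link(c / p) for c, p in zip(cumulative, previous)]


def cumulative_from_log_ratios(xi, volume):
    """Invert the log-link transform: C_j = volume * exp(xi_0 + ... + xi_j)."""
    out, running = [], 0.0
    for x in xi:
        running += x
        out.append(volume * math.exp(running))
    return out


# Hypothetical cumulative payments P_{i,0}, P_{i,1}, P_{i,2} and volume nu_i.
paid = [120.0, 180.0, 198.0]
nu = 100.0
xi_paid = transformed_ratios(paid, nu)
rebuilt = cumulative_from_log_ratios(xi_paid, nu)
```

Because the link function is strictly increasing, the transformation is invertible, so the ratios carry the same information as the cumulative claims themselves.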

Assumption 2 (multivariate (log-)normal chain-ladder model I).

(B1): We assume that the random vectors $Ξ_{i}$ are independent for different accident years $i = 0, \dots, J$ .
(B2): There exists a parameter vector $θ = {(θ_{0}^{P}, \dots, θ_{J}^{P}, θ_{0}^{I}, \dots, θ_{J}^{I})}^{'} \in R^{2 (J + 1)}$ and a positive definite covariance matrix $Σ \in R^{2 (J + 1) \times 2 (J + 1)}$ such that we have for $0 \leq i \leq J$

$Ξ_{i} \sim N (θ, Σ) .$

For the log-link Equation (1), we obtain the log-normal chain-ladder model, and for a general link g a general link ratio model. We have the following identities for the generated σ-algebras

$σ \{P_{i, k}; k \leq j, 0 \leq i \leq J\} = σ \{ξ_{i, k}^{P}; k \leq j, 0 \leq i \leq J\} .$

Therefore, by an abuse of notation, we use $B_{j}^{P}$ for both sets of information, and analogously for $B_{j}^{I}$ and $B_{j}$. From this, we immediately see that assumptions (A1) and (B1) agree with each other. Due to the independence of different accident years i we have for $* \in \{P, I\}$

$\begin{matrix} ξ_{i, j + 1}^{*} |_{B_{j}^{*}} & \overset{(d)}{=} & ξ_{i, j + 1}^{*} |_{\{ξ_{i, 0}^{*}, \dots, ξ_{i, j}^{*}\}}, \\ (ξ_{i, j + 1}^{P}, ξ_{i, j + 1}^{I}) |_{B_{j}} & \overset{(d)}{=} & (ξ_{i, j + 1}^{P}, ξ_{i, j + 1}^{I}) |_{\{ξ_{i, 0}^{P}, \dots, ξ_{i, j}^{P}, ξ_{i, 0}^{I}, \dots, ξ_{i, j}^{I}\}} . \end{matrix}$

For $* \in \{P, I\}$ we write $θ_{[j]}^{*} = {(θ_{0}^{*}, \dots, θ_{j}^{*})}^{'} \in R^{j + 1}$ and let $Σ_{[j]}^{*} \in R^{(j + 1) \times (j + 1)}$ be the (positive definite) covariance matrix of the random vector $ξ_{i, [j]}^{*} = {(ξ_{i, 0}^{*}, \dots, ξ_{i, j}^{*})}^{'}$. Moreover, let $Σ_{j, j + 1}^{*} \in R^{j + 1}$ denote the covariance vector between $ξ_{i, [j]}^{*}$ and $ξ_{i, j + 1}^{*}$, and let ${(s_{j + 1}^{*})}^{2} \in R_{+}$ be the variance of component $ξ_{i, j + 1}^{*}$.

Lemma 1. Under Assumption 2 we have for $* \in \{P, I\}$, $0 \leq j \leq J - 1$ and $0 \leq i \leq J$

$ξ_{i, j + 1}^{*} |_{B_{j}^{*}} \sim N (θ_{j + 1}^{*} + {(Σ_{j, j + 1}^{*})}^{'} {(Σ_{[j]}^{*})}^{- 1} (ξ_{i, [j]}^{*} - θ_{[j]}^{*}), {(s_{j + 1}^{*, post})}^{2}),$

with ${(s_{j + 1}^{*, post})}^{2} = {(s_{j + 1}^{*})}^{2} - {(Σ_{j, j + 1}^{*})}^{'} {(Σ_{[j]}^{*})}^{- 1} Σ_{j, j + 1}^{*}$.

Proof. This is a standard result for multivariate Gaussian distributions, see Result 4.6 in [4]. ☐

Using Lemma 1, we can calculate the conditionally expected claims for given link function g. We have for $0 \leq j \leq J - 1$ and $0 \leq i \leq J$

$E [P_{i, j + 1}| B_{j}^{P}] = P_{i, j} E [g^{- 1} (g (\frac{P_{i, j + 1}}{P_{i, j}}))| B_{j}^{P}] = P_{i, j} E [g^{- 1} (ξ_{i, j + 1}^{P})| B_{j}^{P}],$

(2)

$E [I_{i, j + 1}| B_{j}^{I}] = I_{i, j} E [g^{- 1} (g (\frac{I_{i, j + 1}}{I_{i, j}}))| B_{j}^{I}] = I_{i, j} E [g^{- 1} (ξ_{i, j + 1}^{I})| B_{j}^{I}] .$

(3)

In a similar way, we obtain for the conditional variances

$Var (P_{i, j + 1}| B_{j}^{P}) = P_{i, j}^{2} (E [{(g^{- 1} (ξ_{i, j + 1}^{P}))}^{2}| B_{j}^{P}] - {E [g^{- 1} (ξ_{i, j + 1}^{P})| B_{j}^{P}]}^{2}),$

(4)

$Var (I_{i, j + 1}| B_{j}^{I}) = I_{i, j}^{2} (E [{(g^{- 1} (ξ_{i, j + 1}^{I}))}^{2}| B_{j}^{I}] - {E [g^{- 1} (ξ_{i, j + 1}^{I})| B_{j}^{I}]}^{2}) .$

(5)

We have assumed that Σ is positive definite. This implies that ${(Σ_{[j]}^{*})}^{- 1}$ is also positive definite for $* \in \{P, I\}$, so that ${(Σ_{j, j + 1}^{*})}^{'} {(Σ_{[j]}^{*})}^{- 1}$ vanishes only if $Σ_{j, j + 1}^{*}$ is the zero vector. We then see from Lemma 1 that, in general, the last terms in Equations (2)–(5) depend on $ξ_{i, [j]}^{P}$ and $ξ_{i, [j]}^{I}$, respectively. Therefore, these last terms are not constant w.r.t. information $B_{j}^{P}$ and $B_{j}^{I}$, respectively, and Assumption 1 (A2) is not fulfilled unless both $Σ_{j, j + 1}^{P}$ and $Σ_{j, j + 1}^{I}$ are equal to the zero vector. This immediately gives the next theorem.

Theorem 1. Assume that Assumption 2 is fulfilled for a general link function g as introduced above. The model fulfills Assumption 1 if and only if

$Σ_{[J]}^{P} = diag ({(s_{0}^{P})}^{2}, \dots, {(s_{J}^{P})}^{2}) \quad \text{and} \quad Σ_{[J]}^{I} = diag ({(s_{0}^{I})}^{2}, \dots, {(s_{J}^{I})}^{2}) .$

(6)

Under Equation (6), we have in the special case of the log-link $g (x) = log x$ and for $0 \leq j \leq J - 1$ and $0 \leq i \leq J$

$\begin{matrix} E [P_{i, j + 1}| B_{j}^{P}] & = & P_{i, j} exp \{θ_{j + 1}^{P} + {(s_{j + 1}^{P})}^{2} / 2\}, \\ Var (P_{i, j + 1}| B_{j}^{P}) & = & P_{i, j}^{2} exp \{2 θ_{j + 1}^{P} + {(s_{j + 1}^{P})}^{2}\} (exp \{{(s_{j + 1}^{P})}^{2}\} - 1) . \end{matrix}$

Analogous statements hold true for incurred claims $I_{i, j + 1}$, conditioned on $B_{j}^{I}$.
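The log-link moments of Theorem 1 are the usual log-normal mean and variance, scaled by the last observation. The following sketch evaluates them; the function names and the numerical inputs are illustrative assumptions, not part of the paper.

```python
import math


def chain_ladder_params(theta_next, s_next):
    """Log-link chain-ladder factor and variance parameter.

    f = exp(theta_{j+1} + s_{j+1}^2 / 2) and sigma^2 = f^2 * (exp(s_{j+1}^2) - 1).
    """
    f = math.exp(theta_next + 0.5 * s_next ** 2)
    sigma2 = f ** 2 * math.expm1(s_next ** 2)
    return f, sigma2


def one_step_moments(current, theta_next, s_next):
    """Conditional mean and variance of the next cumulative claim given the current one."""
    f, sigma2 = chain_ladder_params(theta_next, s_next)
    return f * current, sigma2 * current ** 2


# Hypothetical inputs: P_{i,j} = 1000, theta_{j+1} = 0.1, s_{j+1} = 0.2.
mean_next, var_next = one_step_moments(1000.0, 0.1, 0.2)
```

Note that the conditional variance is proportional to the squared current claim, which is the modified variance assumption of Assumption 1.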

Remark 1.

The previous theorem says that the covariance structure Equation (6) is a necessary condition to obtain the chain-ladder model of Assumption 1. This holds for general link functions g, see Equations (2)–(5), and under Gaussian age-to-age ratios. The resulting variance properties differ from the classical ones of Quarg and Mack [1]. However, our argument does not use the variance assumption in a crucial way (it is already sufficient to consider Equations (2) and (3)); the variance assumption merely ensures that, under Assumption 2, the analysis admits an analytically tractable closed-form solution. Therefore, we expect this result to hold true in broader generality.

Under the assumptions of Theorem 1, the process ${(P_{i, j})}_{0 \leq j \leq J}$ has the Markov property, and we obtain the following chain-ladder parameters for the log-link $g (x) = log x$

$f_{j}^{*} = exp \{θ_{j + 1}^{*} + {(s_{j + 1}^{*})}^{2} / 2\} \quad \text{and} \quad {(σ_{j}^{*})}^{2} = {(f_{j}^{*})}^{2} (exp \{{(s_{j + 1}^{*})}^{2}\} - 1),$

(7)

with $* \in \{P, I\}$. Moreover, the covariance matrix Σ under Theorem 1 is given by

$Σ = (\begin{matrix} Σ_{[J]}^{P} = diag ({(s_{0}^{P})}^{2}, \dots, {(s_{J}^{P})}^{2}) & A \\ A^{'} & Σ_{[J]}^{I} = diag ({(s_{0}^{I})}^{2}, \dots, {(s_{J}^{I})}^{2}) \end{matrix}),$

(8)

for an appropriate matrix $A \in R^{(J + 1) \times (J + 1)}$ such that Σ is positive definite.

Lemma 2. A symmetric matrix Σ of the form Equation (8) is positive definite if and only if the matrix

$S_{[J]}^{P} = Σ_{[J]}^{I} - A^{'} {(Σ_{[J]}^{P})}^{- 1} A$ is positive definite,

or, equivalently, if and only if the matrix

$S_{[J]}^{I} = Σ_{[J]}^{P} - A {(Σ_{[J]}^{I})}^{- 1} A^{'}$ is positive definite.

Proof. This lemma is a standard result in linear algebra about Schur complements, see Section C.4.1 in [5]. ☐
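The positive-definiteness condition of Lemma 2 is easy to check numerically because the diagonal blocks are diagonal matrices. The sketch below computes the Schur complement $S_{[J]}^{P}$ and tests it with a Cholesky factorisation; the helper names and the example values of the variances and of A are hypothetical.

```python
import math


def schur_complement_paid(var_paid, var_inc, A):
    """S^P = Sigma^I - A' (Sigma^P)^{-1} A for diagonal Sigma^P and Sigma^I.

    var_paid[k] = (s_k^P)^2, var_inc[l] = (s_l^I)^2, A[k][l] = Cov(xi_k^P, xi_l^I).
    """
    n = len(var_paid)
    return [
        [
            (var_inc[k] if k == l else 0.0)
            - sum(A[m][k] * A[m][l] / var_paid[m] for m in range(n))
            for l in range(n)
        ]
        for k in range(n)
    ]


def is_positive_definite(M, tol=1e-12):
    """Cholesky attempt on a symmetric matrix; fails iff M is not positive definite."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if acc <= tol:
                    return False
                L[i][i] = math.sqrt(acc)
            else:
                L[i][j] = acc / L[j][j]
    return True


# Hypothetical variances and two candidate cross-covariance matrices A.
var_paid = [0.04, 0.01]
var_inc = [0.09, 0.02]
A_ok = [[0.03, 0.0], [0.005, 0.008]]
A_bad = [[0.07, 0.0], [0.0, 0.0]]   # covariance too large for these variances

S_ok = schur_complement_paid(var_paid, var_inc, A_ok)
S_bad = schur_complement_paid(var_paid, var_inc, A_bad)
```

In this example `A_ok` yields an admissible covariance matrix Σ, whereas `A_bad` violates the Cauchy-Schwarz bound between $ξ_{i, 0}^{P}$ and $ξ_{i, 0}^{I}$ and is rejected.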

The matrices $S_{[J]}^{*}$ are called Schur complements of $Σ_{[J]}^{*}$ in Σ, for $* \in \{P, I\}$. One may still impose more structure on the matrix $A = {(a_{k, l})}_{0 \leq k, l \leq J}$; for instance, a lower-left-triangular matrix is often a reasonable choice, i.e., $a_{k, l} = 0$ for all $k < l$. For the time being, we allow for any matrix A such that Σ is positive definite. This leads to the following model assumptions.

Assumption 3 (multivariate (log-)normal chain-ladder model II).

(C1): We assume that the random vectors $Ξ_{i}$ are independent for different accident years $i = 0, \dots, J$ .
(C2): There exists a parameter vector $θ = {(θ_{0}^{P}, \dots, θ_{J}^{P}, θ_{0}^{I}, \dots, θ_{J}^{I})}^{'} \in R^{2 (J + 1)}$ and a matrix Σ of the form Equation (8) with positive definite Schur complements $S_{[J]}^{P}$ and $S_{[J]}^{I}$ such that we have for $0 \leq i \leq J$

$Ξ_{i} \sim N (θ, Σ) .$

Corollary 1.

The model of Assumption 3 fulfills the distribution-free chain-ladder model of Assumption 1 for any link function g (as introduced above). The chain-ladder parameters are given by Equation (7) in the special case of the log-link function Equation (1).

The previous corollary states that we have found a class of non-trivial stochastic models that fulfill the distribution-free chain-ladder assumptions simultaneously for cumulative payments and incurred claims. Note that an appropriate choice of matrix A in Equation (8) allows for dependence modeling between cumulative payments and incurred claims; this will be crucial in the sequel.

3. One-Step Ahead Prediction

Formulas (2) and (3) and Theorem 1 provide the best prediction of $P_{i, j + 1}$ based on $B_{j}^{P}$ and the best prediction of $I_{i, j + 1}$ based on $B_{j}^{I}$, respectively, under Assumption 3. The basic idea behind the Munich chain-ladder method is to consider best predictions based on both sets of information $B_{j} = B_{j}^{P} \cup B_{j}^{I}$, that is, how does the prediction of, say, cumulative payments $P_{i, j + 1}$ improve by enlarging the information from $B_{j}^{P}$ to $B_{j}$? This is similar to the considerations in [3]. In this section, we start with the special case of “one-step ahead prediction”; the general case is presented in Section 6, below. We write $θ_{[j]} = {(θ_{0}^{P}, \dots, θ_{j}^{P}, θ_{0}^{I}, \dots, θ_{j}^{I})}^{'} \in R^{2 (j + 1)}$ and let $Σ_{[j]} \in R^{2 (j + 1) \times 2 (j + 1)}$ be the (positive definite) covariance matrix of the random vector $ξ_{i, [j]} = {(ξ_{i, 0}^{P}, \dots, ξ_{i, j}^{P}, ξ_{i, 0}^{I}, \dots, ξ_{i, j}^{I})}^{'}$. Moreover, let $Σ_{j, j + 1}^{(*)} \in R^{2 (j + 1)}$ denote the covariance vector between $ξ_{i, [j]}$ and $ξ_{i, j + 1}^{*}$ for $* \in \{P, I\}$. Note that in contrast to Lemma 1 we replace $Σ_{j, j + 1}^{*}$ by $Σ_{j, j + 1}^{(*)}$, i.e., we set the upper index in brackets.

Lemma 3. Under Assumption 3 we have for $* \in \{P, I\}$, $0 \leq j \leq J - 1$ and $0 \leq i \leq J$

$ξ_{i, j + 1}^{*} |_{B_{j}} \sim N (θ_{j + 1}^{*} + {(Σ_{j, j + 1}^{(*)})}^{'} Σ_{[j]}^{- 1} (ξ_{i, [j]} - θ_{[j]}), {(s_{j + 1}^{(*), post})}^{2}),$

with ${(s_{j + 1}^{(*), post})}^{2} = {(s_{j + 1}^{*})}^{2} - {(Σ_{j, j + 1}^{(*)})}^{'} Σ_{[j]}^{- 1} Σ_{j, j + 1}^{(*)}$.

Proof. This is a standard result for multivariate Gaussian distributions, see Result 4.6 in [4]. ☐

The previous lemma shows that the conditional expectation of $ξ_{i, j + 1}^{P}$, given $B_{j}$, is linear in the observations $ξ_{i, [j]}$. This will be crucial. An easy consequence of the previous lemma is the following corollary for the special case of the log-link.

Corollary 2 (one-step ahead prediction for log-link). Under Assumption 3 we have the following prediction for the log-link $g (x) = log x$ and for $0 \leq j \leq J - 1$ and $0 \leq i \leq J$

$\begin{matrix} E [P_{i, j + 1}| B_{j}] & = & P_{i, j} exp \{θ_{j + 1}^{P} + {(Σ_{j, j + 1}^{(P)})}^{'} Σ_{[j]}^{- 1} (ξ_{i, [j]} - θ_{[j]}) + {(s_{j + 1}^{(P), post})}^{2} / 2\} \\ & = & f_{j}^{P} P_{i, j} γ_{j}^{P} (ξ_{i, [j]}) = E [P_{i, j + 1}| B_{j}^{P}] γ_{j}^{P} (ξ_{i, [j]}), \end{matrix}$

where, for $* \in \{P, I\}$,

$\begin{matrix} γ_{j}^{*} (ξ_{i, [j]}) & = & exp \{β_{j}^{*} (ξ_{i, [j]}) - {(Σ_{j, j + 1}^{(*)})}^{'} Σ_{[j]}^{- 1} Σ_{j, j + 1}^{(*)} / 2\}, \\ β_{j}^{*} (ξ_{i, [j]}) & = & {(Σ_{j, j + 1}^{(*)})}^{'} Σ_{[j]}^{- 1} (ξ_{i, [j]} - θ_{[j]}) . \end{matrix}$

Analogous statements hold true for incurred claims $I_{i, j + 1}$.

The term $γ_{j}^{P} (ξ_{i, [j]})$ gives the correction if we experience not only $B_{j}^{P}$ but also $B_{j}^{I}$. This increased information also leads to a reduction of prediction uncertainty of size

${(s_{j + 1}^{P})}^{2} \mapsto {(s_{j + 1}^{(P), post})}^{2} = {(s_{j + 1}^{P})}^{2} - {(Σ_{j, j + 1}^{(P)})}^{'} Σ_{[j]}^{- 1} Σ_{j, j + 1}^{(P)} \leq {(s_{j + 1}^{P})}^{2} .$

Example 1 (log-link). The analysis of the correction term $γ_{j}^{P} (ξ_{i, [j]})$ is not straightforward. Therefore, we consider an explicit example for the case $J = 2$ and $j = 0, 1$. In this case, the covariance matrix Σ under Assumption 3 is given by

Σ = Σ_{[2]} = (\begin{matrix} \begin{matrix} {(s_{0}^{P})}^{2} & 0 & 0 \\ 0 & {(s_{1}^{P})}^{2} & 0 \\ 0 & 0 & {(s_{2}^{P})}^{2} \end{matrix} & \begin{matrix} a_{0, 0} & a_{0, 1} & a_{0, 2} \\ a_{1, 0} & a_{1, 1} & a_{1, 2} \\ a_{2, 0} & a_{2, 1} & a_{2, 2} \end{matrix} \\ \begin{matrix} a_{0, 0} & a_{1, 0} & a_{2, 0} \\ a_{0, 1} & a_{1, 1} & a_{2, 1} \\ a_{0, 2} & a_{1, 2} & a_{2, 2} \end{matrix} & \begin{matrix} {(s_{0}^{I})}^{2} & 0 & 0 \\ 0 & {(s_{1}^{I})}^{2} & 0 \\ 0 & 0 & {(s_{2}^{I})}^{2} \end{matrix} \end{matrix}) .

Case $j = 0$ . We start the analysis for $j = 0$ , i.e., given information $B_{0}$ .

Σ_{[1]} = (\begin{matrix} \begin{matrix} {(s_{0}^{P})}^{2} & 0 \\ 0 & {(s_{1}^{P})}^{2} \end{matrix} & \begin{matrix} a_{0, 0} & a_{0, 1} \\ a_{1, 0} & a_{1, 1} \end{matrix} \\ \begin{matrix} a_{0, 0} & a_{1, 0} \\ a_{0, 1} & a_{1, 1} \end{matrix} & \begin{matrix} {(s_{0}^{I})}^{2} & 0 \\ 0 & {(s_{1}^{I})}^{2} \end{matrix} \end{matrix}) and Σ_{[0]}^{- 1} = \frac{1}{{(s_{0}^{P} s_{0}^{I})}^{2} - a_{0, 0}^{2}} (\begin{matrix} {(s_{0}^{I})}^{2} & - a_{0, 0} \\ - a_{0, 0} & {(s_{0}^{P})}^{2} \end{matrix}) .

Moreover, $Σ_{0, 1}^{(P)} = {(0, a_{1, 0})}^{'}$. This provides credibility weight ${(α_{[0]}^{P})}^{'} \in R^{2}$ given by

$α_{[0]}^{P} = {(Σ_{0, 1}^{(P)})}^{'} Σ_{[0]}^{- 1} = \frac{1}{{(s_{0}^{P} s_{0}^{I})}^{2} - a_{0, 0}^{2}} (- a_{1, 0} a_{0, 0}, a_{1, 0} {(s_{0}^{P})}^{2}),$

and posterior variance

${(s_{1}^{(P), post})}^{2} = {(s_{1}^{P})}^{2} - {(Σ_{0, 1}^{(P)})}^{'} Σ_{[0]}^{- 1} Σ_{0, 1}^{(P)} = {(s_{1}^{P})}^{2} - \frac{{(a_{1, 0} s_{0}^{P})}^{2}}{{(s_{0}^{P} s_{0}^{I})}^{2} - a_{0, 0}^{2}} .$

Observe that $a_{1, 0} = Cov (ξ_{i, 1}^{P}, ξ_{i, 0}^{I})$ is the crucial term in the credibility weight $α_{[0]}^{P}$. If the two random variables $ξ_{i, 1}^{P}$ and $ξ_{i, 0}^{I}$ are uncorrelated, then $a_{1, 0} = 0$ and we cannot learn from observation $ξ_{i, 0}^{I}$ to improve the prediction of $ξ_{i, 1}^{P}$. The predictor for the log-link $g (x) = log x$ is given by

$E [P_{i, 1}| B_{0}] = P_{i, 0} exp \{θ_{1}^{P} + β_{0}^{P} (ξ_{i, [0]}) + {(s_{1}^{(P), post})}^{2} / 2\} = f_{0}^{P} P_{i, 0} γ_{0}^{P} (ξ_{i, [0]}),$

with

$β_{0}^{P} (ξ_{i, [0]}) = α_{[0]}^{P} (ξ_{i, [0]} - θ_{[0]}) = - \frac{a_{1, 0} a_{0, 0}}{{(s_{0}^{P} s_{0}^{I})}^{2} - a_{0, 0}^{2}} (ξ_{i, 0}^{P} - θ_{0}^{P}) + \frac{a_{1, 0} {(s_{0}^{P})}^{2}}{{(s_{0}^{P} s_{0}^{I})}^{2} - a_{0, 0}^{2}} (ξ_{i, 0}^{I} - θ_{0}^{I}) .$

It is also remarkable that observation $ξ_{i, 0}^{P}$ is used to improve the prediction of $ξ_{i, 1}^{P}$, though these two random variables are uncorrelated under Assumption 3. This comes from the fact that if $a_{0, 0} \neq 0$ then $ξ_{i, 0}^{P}$ is used to adjust $ξ_{i, 0}^{I}$.
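The closed-form quantities of the case $j = 0$ can be evaluated directly. The sketch below computes the credibility weight, the posterior variance and the correction factor $γ_{0}^{P}$; the function names and all numerical values are hypothetical and only meant to illustrate the formulas.

```python
import math


def weights_j0(s0_p, s0_i, s1_p, a00, a10):
    """Credibility weight alpha_[0]^P and posterior variance for the case j = 0."""
    det = (s0_p * s0_i) ** 2 - a00 ** 2            # determinant of Sigma_[0]
    if det <= 0:
        raise ValueError("Sigma_[0] must be positive definite")
    alpha = (-a10 * a00 / det, a10 * s0_p ** 2 / det)
    post_var = s1_p ** 2 - (a10 * s0_p) ** 2 / det
    return alpha, post_var


def gamma_j0(xi0_p, xi0_i, theta0_p, theta0_i, s0_p, s0_i, s1_p, a00, a10):
    """Correction factor gamma_0^P = exp(beta_0^P - (variance reduction) / 2)."""
    alpha, post_var = weights_j0(s0_p, s0_i, s1_p, a00, a10)
    beta = alpha[0] * (xi0_p - theta0_p) + alpha[1] * (xi0_i - theta0_i)
    return math.exp(beta - 0.5 * (s1_p ** 2 - post_var))


# Hypothetical parameters: standard deviations and covariances a_{0,0}, a_{1,0}.
alpha0, post_var0 = weights_j0(s0_p=0.2, s0_i=0.3, s1_p=0.1, a00=0.03, a10=0.01)
# Incurred ratio above its mean, paid ratio at its mean.
g_high = gamma_j0(0.0, 0.5, 0.0, 0.2, 0.2, 0.3, 0.1, 0.03, 0.01)
# No cross-covariance a_{1,0}: incurred information cannot improve the prediction.
g_none = gamma_j0(0.0, 0.5, 0.0, 0.2, 0.2, 0.3, 0.1, 0.03, 0.0)
```

With $a_{1, 0} = 0$ both weights vanish and the correction factor equals one, in line with the observation that nothing can then be learned from $ξ_{i, 0}^{I}$.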

Case $j = 1$ . This case is more involved. Set

\begin{matrix} b_{0, 0} & = & {(s_{0}^{I})}^{2} - a_{0, 0}^{2} / {(s_{0}^{P})}^{2} - a_{1, 0}^{2} / {(s_{1}^{P})}^{2}, b_{0, 1} = - a_{0, 0} a_{0, 1} / {(s_{0}^{P})}^{2} - a_{1, 1} a_{1, 0} / {(s_{1}^{P})}^{2}, \\ b_{1, 1} & = & {(s_{1}^{I})}^{2} - a_{0, 1}^{2} / {(s_{0}^{P})}^{2} - a_{1, 1}^{2} / {(s_{1}^{P})}^{2}, \\ c_{0, 0} & = & \frac{b_{0, 1} a_{0, 1} - b_{1, 1} a_{0, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{2}}, c_{0, 1} = \frac{- b_{0, 0} a_{0, 1} + b_{0, 1} a_{0, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{2}}, \\ c_{1, 0} & = & \frac{b_{0, 1} a_{1, 1} - b_{1, 1} a_{1, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{1}^{P})}^{2}}, c_{1, 1} = \frac{- b_{0, 0} a_{1, 1} + b_{0, 1} a_{1, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{1}^{P})}^{2}} . \end{matrix}

We have the following inverse matrix for $Σ_{[1]}$, see Appendix B for the full inverse matrix,

Σ_{[1]}^{- 1} = (\begin{matrix} \begin{matrix} * & * \\ * & * \end{matrix} & \begin{matrix} c_{0, 0} & c_{0, 1} \\ c_{1, 0} & c_{1, 1} \end{matrix} \\ \begin{matrix} c_{0, 0} & c_{1, 0} \\ c_{0, 1} & c_{1, 1} \end{matrix} & \begin{matrix} \frac{b_{1, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} & \frac{- b_{0, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} \\ \frac{- b_{0, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} & \frac{b_{0, 0}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} \end{matrix} \end{matrix}) .

Moreover, $Σ_{1, 2}^{(P)} = {(0, 0, a_{2, 0}, a_{2, 1})}^{'}$ is the covariance vector between $ξ_{i, [1]}$ and $ξ_{i, 2}^{P}$. This provides credibility weight ${(α_{[1]}^{P})}^{'} = {({(Σ_{1, 2}^{(P)})}^{'} Σ_{[1]}^{- 1})}^{'} = Σ_{[1]}^{- 1} Σ_{1, 2}^{(P)} \in R^{4}$ given by

$α_{[1]}^{P} = (a_{2, 0} c_{0, 0} + a_{2, 1} c_{0, 1}, a_{2, 0} c_{1, 0} + a_{2, 1} c_{1, 1}, \frac{b_{1, 1} a_{2, 0} - b_{0, 1} a_{2, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}}, \frac{- b_{0, 1} a_{2, 0} + b_{0, 0} a_{2, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}}),$

and posterior variance

${(s_{2}^{(P), post})}^{2} = {(s_{2}^{P})}^{2} - {(Σ_{1, 2}^{(P)})}^{'} Σ_{[1]}^{- 1} Σ_{1, 2}^{(P)} = {(s_{2}^{P})}^{2} - \frac{b_{1, 1} a_{2, 0}^{2} - 2 b_{0, 1} a_{2, 1} a_{2, 0} + b_{0, 0} a_{2, 1}^{2}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} .$

We again see that the crucial terms are $a_{2, 0} = Cov (ξ_{i, 2}^{P}, ξ_{i, 0}^{I})$ and $a_{2, 1} = Cov (ξ_{i, 2}^{P}, ξ_{i, 1}^{I})$. If these two covariances are zero, then the incurred claims observations do not help to improve the prediction of $ξ_{i, 2}^{P}$. Therefore, we assume that at least one of these two covariances is different from zero. The predictor for the log-link $g (x) = log x$ is given by

$E [P_{i, 2}| B_{1}] = P_{i, 1} exp \{θ_{2}^{P} + β_{1}^{P} (ξ_{i, [1]}) + {(s_{2}^{(P), post})}^{2} / 2\} = f_{1}^{P} P_{i, 1} γ_{1}^{P} (ξ_{i, [1]}),$

with

$\begin{matrix} β_{1}^{P} (ξ_{i, [1]}) & = & (a_{2, 0} c_{0, 0} + a_{2, 1} c_{0, 1}) (ξ_{i, 0}^{P} - θ_{0}^{P}) + (a_{2, 0} c_{1, 0} + a_{2, 1} c_{1, 1}) (ξ_{i, 1}^{P} - θ_{1}^{P}) \\ & & + \frac{b_{1, 1} a_{2, 0} - b_{0, 1} a_{2, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} (ξ_{i, 0}^{I} - θ_{0}^{I}) + \frac{- b_{0, 1} a_{2, 0} + b_{0, 0} a_{2, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} (ξ_{i, 1}^{I} - θ_{1}^{I}) . \end{matrix}$

(9)

Again $ξ_{i, 0}^{P}$ and $ξ_{i, 1}^{P}$ are used to adjust $ξ_{i, 0}^{I}$ and $ξ_{i, 1}^{I}$ through $a_{0, 0}$, $a_{0, 1}$ and $a_{1, 0}$, $a_{1, 1}$, respectively, which are integrated into $c_{0, 0}$, $c_{0, 1}$ and $c_{1, 0}$, $c_{1, 1}$, respectively.
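The closed-form expressions of the case $j = 1$ can be cross-checked against a direct solution of the linear system $Σ_{[1]} α = Σ_{1, 2}^{(P)}$. The following sketch does this with a small Gauss-Jordan solver; the helper names and the parameter values are hypothetical and serve only as a numerical sanity check of the formulas.

```python
def solve(M, v):
    """Solve M x = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(v)
    aug = [list(row) + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def case_j1_closed_form(s_p, s_i, A):
    """Credibility weight alpha_[1]^P and posterior variance via the b- and c-terms."""
    v0, v1 = s_p[0] ** 2, s_p[1] ** 2
    b00 = s_i[0] ** 2 - A[0][0] ** 2 / v0 - A[1][0] ** 2 / v1
    b01 = -A[0][0] * A[0][1] / v0 - A[1][1] * A[1][0] / v1
    b11 = s_i[1] ** 2 - A[0][1] ** 2 / v0 - A[1][1] ** 2 / v1
    det = b00 * b11 - b01 ** 2
    c00 = (b01 * A[0][1] - b11 * A[0][0]) / (det * v0)
    c01 = (-b00 * A[0][1] + b01 * A[0][0]) / (det * v0)
    c10 = (b01 * A[1][1] - b11 * A[1][0]) / (det * v1)
    c11 = (-b00 * A[1][1] + b01 * A[1][0]) / (det * v1)
    a20, a21 = A[2][0], A[2][1]
    alpha = [
        a20 * c00 + a21 * c01,
        a20 * c10 + a21 * c11,
        (b11 * a20 - b01 * a21) / det,
        (-b01 * a20 + b00 * a21) / det,
    ]
    reduction = (b11 * a20 ** 2 - 2 * b01 * a21 * a20 + b00 * a21 ** 2) / det
    return alpha, s_p[2] ** 2 - reduction


def case_j1_direct(s_p, s_i, A):
    """Same quantities obtained by solving Sigma_[1] alpha = Sigma_{1,2}^{(P)} directly."""
    sigma1 = [
        [s_p[0] ** 2, 0.0, A[0][0], A[0][1]],
        [0.0, s_p[1] ** 2, A[1][0], A[1][1]],
        [A[0][0], A[1][0], s_i[0] ** 2, 0.0],
        [A[0][1], A[1][1], 0.0, s_i[1] ** 2],
    ]
    cov = [0.0, 0.0, A[2][0], A[2][1]]
    alpha = solve(sigma1, cov)
    return alpha, s_p[2] ** 2 - sum(c * a for c, a in zip(cov, alpha))


# Hypothetical standard deviations and lower-left-triangular matrix A = (a_{k,l}).
S_P = [0.2, 0.1, 0.05]
S_I = [0.3, 0.15, 0.08]
A = [[0.03, 0.0, 0.0], [0.01, 0.008, 0.0], [0.002, 0.003, 0.001]]

alpha_cf, var_cf = case_j1_closed_form(S_P, S_I, A)
alpha_num, var_num = case_j1_direct(S_P, S_I, A)
```

Both routes should agree up to floating-point error; the direct solve also generalizes to larger j, where closed-form expressions become unwieldy.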

4. Munich Chain-Ladder Model

In Corollary 2, we have derived the best prediction under Assumption 3 for the log-link. This best prediction is understood relative to the mean-square error of prediction, and it crucially depends on the choice of the link function g. Since our model fulfills the chain-ladder model Assumption 1 for any link function g according to Corollary 1, it can also be considered as the best prediction for given information $B_{j}$ in the distribution-free chain-ladder model for other link function choices g. The Munich chain-ladder method tackles the problem from a different viewpoint in that it extends the distribution-free chain-ladder model Assumption 1, so that it enforces the best prediction to have a pre-specified form. We define this extended model in Assumption 4, below, and then study under which circumstances our distributional model from Assumption 3 fulfills these Munich chain-ladder model assumptions. Define the residuals

ε_{i, j}^{I | P} = \frac{I_{i, j} - E [I_{i, j}| B_{j}^{P}]}{Var {(I_{i, j}| B_{j}^{P})}^{1 / 2}} and ε_{i, j}^{P | I} = \frac{P_{i, j} - E [P_{i, j}| B_{j}^{I}]}{Var {(P_{i, j}| B_{j}^{I})}^{1 / 2}} .

The adapted Munich chain-ladder assumptions of Quarg and Mack [1] are given by:

Assumption 4 (Munich chain-ladder model). Assume in addition to Assumption 1 that there exist constants $λ^{P}, λ^{I} \in (- 1, 1)$ such that for $0 \leq j \leq J - 1$ and $0 \leq i \leq J$

$E [P_{i, j + 1}| B_{j}] = f_{j}^{P} P_{i, j} + λ^{P} Var {(P_{i, j + 1}| B_{j}^{P})}^{1 / 2} ε_{i, j}^{I | P},$

and

$E [I_{i, j + 1}| B_{j}] = f_{j}^{I} I_{i, j} + λ^{I} Var {(I_{i, j + 1}| B_{j}^{I})}^{1 / 2} ε_{i, j}^{P | I} .$

Remark 2. The idea behind these additional assumptions is that one corrects for high and low incurred-paid and paid-incurred ratios via the residuals $ε_{i, j}^{I | P}$ and $ε_{i, j}^{P | I}$ because, for instance for cumulative payments, we have

$ε_{i, j}^{I | P} = \frac{I_{i, j} - E [I_{i, j}| B_{j}^{P}]}{Var {(I_{i, j}| B_{j}^{P})}^{1 / 2}} = \frac{\frac{I_{i, j}}{P_{i, j}} - E [\frac{I_{i, j}}{P_{i, j}}| B_{j}^{P}]}{Var {(\frac{I_{i, j}}{P_{i, j}}| B_{j}^{P})}^{1 / 2}} = \frac{Q_{i, j}^{- 1} - E [Q_{i, j}^{- 1}| B_{j}^{P}]}{Var {(Q_{i, j}^{- 1}| B_{j}^{P})}^{1 / 2}},$

with incurred-paid ratio $Q_{i, j}^{- 1} = I_{i, j} / P_{i, j}$. Therefore, the additional assumptions in Assumption 4 exactly provide PQ and IQ of Quarg and Mack [1]. If we choose the log-link for Assumption 3 then the incurred-paid ratio $Q_{i, j}^{- 1}$ is turned into a difference on the log scale, that is, $log (Q_{i, j}^{- 1}) = log I_{i, j} - log P_{i, j} = \sum_{l = 0}^{j} ξ_{i, l}^{I} - \sum_{l = 0}^{j} ξ_{i, l}^{P}$. The aim of this section is to analyze under which circumstances these Munich chain-ladder corrections lead to the optimal predictors provided in Corollary 2. Below we will see that the constants $λ^{P}$ and $λ^{I}$ are crucial: they measure the (positive) correlation between the cumulative payments and the incurred-paid ratio correction (and similarly for incurred claims), see also Section 2.2.2 in [1]. Moreover, $λ^{P}$ and $λ^{I}$ receive an explicit meaning in Theorem 3, below.

The tower property of conditional expectations $E [P_{i, j + 1} | B_{j}^{P}] = E [E [P_{i, j + 1} | B_{j}] | B_{j}^{P}]$ implies under Assumption 4, because the residual $ε_{i, j}^{I | P}$ has conditional mean zero given $B_{j}^{P}$,

$E [P_{i, j + 1}| B_{j}^{P}] = f_{j}^{P} P_{i, j} + λ^{P} Var {(P_{i, j + 1}| B_{j}^{P})}^{1 / 2} E [ε_{i, j}^{I | P}| B_{j}^{P}] = f_{j}^{P} P_{i, j} .$

Therefore, Assumption 4 does not contradict Assumption 1. As mentioned in Remark 2, we now analyze Assumption 4 from the viewpoint of the multivariate (log-)normal chain-ladder model of Assumption 3. We therefore need to analyze the correction term defined in the Munich chain-ladder model

$λ^{P} Var {(P_{i, j + 1}| B_{j}^{P})}^{1 / 2} ε_{i, j}^{I | P} = λ^{P} σ_{j}^{P} P_{i, j} ε_{i, j}^{I | P},$

(10)

and compare it to the optimal correction term obtained from Lemma 3 and Corollary 2, respectively. We start with the log-link $g (x) = log x$ and then provide the general result in Theorem 2, below. For the log-link we have the following representation of incurred claims

$I_{i, j} = ν_{i} exp \{\sum_{l = 0}^{j} ξ_{i, l}^{I}\} .$

(11)

Therefore, for $ε_{i, j}^{I | P}$ we need to determine the conditional distribution of $\sum_{l = 0}^{j} ξ_{i, l}^{I}$, given $ξ_{i, [j]}^{P}$.

Lemma 4. Under Assumption 3, we have

$\sum_{l = 0}^{j} ξ_{i, l}^{I} |_{B_{j}^{P}} \sim N (\sum_{l = 0}^{j} θ_{l}^{I} + {(a_{0 : j}^{I})}^{'} {(Σ_{[j]}^{P})}^{- 1} (ξ_{i, [j]}^{P} - θ_{[j]}^{P}), {(s_{0 : j}^{I, post})}^{2}),$

with covariance vector $a_{0 : j}^{I} = {(\sum_{l = 0}^{j} a_{0, l}, \dots, \sum_{l = 0}^{j} a_{j, l})}^{'} \in R^{j + 1}$ for $A = {(a_{k, l})}_{0 \leq k, l \leq J}$, and posterior variance ${(s_{0 : j}^{I, post})}^{2} = \sum_{l = 0}^{j} {(s_{l}^{I})}^{2} - {(a_{0 : j}^{I})}^{'} {(Σ_{[j]}^{P})}^{- 1} a_{0 : j}^{I}$.

Proof. This is a standard result for multivariate Gaussian distributions, see Result 4.6 in [4]. ☐
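Since $Σ_{[j]}^{P}$ is diagonal under Assumption 3, the conditional moments of Lemma 4 reduce to simple sums. A minimal sketch, with hypothetical function name and parameter values, reads as follows.

```python
def lemma4_conditional(theta_inc, s_inc, theta_paid, s_paid, xi_paid, A):
    """Conditional mean and variance of sum_{l<=j} xi_l^I given xi_0^P, ..., xi_j^P.

    j is inferred from len(xi_paid); A[k][l] = Cov(xi_k^P, xi_l^I); s_* are
    standard deviations of the (mutually uncorrelated) paid/incurred components.
    """
    n = len(xi_paid)                                         # n = j + 1
    a = [sum(A[k][l] for l in range(n)) for k in range(n)]   # vector a_{0:j}^I
    mean = sum(theta_inc[:n]) + sum(
        a[k] * (xi_paid[k] - theta_paid[k]) / s_paid[k] ** 2 for k in range(n)
    )
    var = sum(s ** 2 for s in s_inc[:n]) - sum(
        a[k] ** 2 / s_paid[k] ** 2 for k in range(n)
    )
    return mean, var


# Hypothetical parameters for j = 1.
theta_inc, s_inc = [0.4, 0.1], [0.3, 0.15]
theta_paid, s_paid = [0.2, 0.3], [0.2, 0.1]
A = [[0.03, 0.0], [0.01, 0.008]]

# Paid ratios exactly at their means: the conditional mean is the prior mean.
mean_at_prior, var_post = lemma4_conditional(
    theta_inc, s_inc, theta_paid, s_paid, [0.2, 0.3], A
)
```

The posterior variance does not depend on the observed paid ratios, only on the covariance structure.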

Example 2 (log-link). We consider the log-link $g (x) = log x$. In this case, we have from Equation (11) and using Lemma 4 for the residual of the correction term

$\begin{matrix} ε_{i, j}^{I | P} & = & \frac{exp \{\sum_{l = 0}^{j} ξ_{i, l}^{I}\} - E [exp \{\sum_{l = 0}^{j} ξ_{i, l}^{I}\}| B_{j}^{P}]}{Var {(exp \{\sum_{l = 0}^{j} ξ_{i, l}^{I}\}| B_{j}^{P})}^{1 / 2}} \\ & = & \frac{exp \{\sum_{l = 0}^{j} (ξ_{i, l}^{I} - θ_{l}^{I}) - {(a_{0 : j}^{I})}^{'} {(Σ_{[j]}^{P})}^{- 1} (ξ_{i, [j]}^{P} - θ_{[j]}^{P}) - {(s_{0 : j}^{I, post})}^{2} / 2\} - 1}{{(exp \{{(s_{0 : j}^{I, post})}^{2}\} - 1)}^{1 / 2}} . \end{matrix}$

This implies for the Munich chain-ladder model Assumption 4, where we also use Equation (7),

$f_{j}^{P} P_{i, j} + λ^{P} Var {(P_{i, j + 1}| B_{j}^{P})}^{1 / 2} ε_{i, j}^{I | P} = f_{j}^{P} P_{i, j} + λ^{P} σ_{j}^{P} P_{i, j} ε_{i, j}^{I | P} = f_{j}^{P} P_{i, j} γ_{j}^{P, MCL} (ξ_{i, [j]}),$

with Munich chain-ladder correction factor defined by

$γ_{j}^{P, MCL} (ξ_{i, [j]}) = 1 + λ^{P} \sqrt{\frac{e^{{(s_{j + 1}^{P})}^{2}} - 1}{e^{{(s_{0 : j}^{I, post})}^{2}} - 1}} (e^{\sum_{l = 0}^{j} (ξ_{i, l}^{I} - θ_{l}^{I}) - {(a_{0 : j}^{I})}^{'} {(Σ_{[j]}^{P})}^{- 1} (ξ_{i, [j]}^{P} - θ_{[j]}^{P}) - \frac{{(s_{0 : j}^{I, post})}^{2}}{2}} - 1) .$

(12)
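Equation (12) can be evaluated once the conditional mean and variance of $\sum_{l = 0}^{j} ξ_{i, l}^{I}$ from Lemma 4 are available. The sketch below takes these two quantities as inputs; the function name, argument names and numbers are hypothetical.

```python
import math


def mcl_correction_factor(lam, s_next_paid, sum_xi_inc, cond_mean, cond_var):
    """Munich chain-ladder correction factor of Equation (12) for the log-link.

    lam         : correlation parameter lambda^P
    s_next_paid : standard deviation s_{j+1}^P
    sum_xi_inc  : observed sum of incurred ratios xi_0^I + ... + xi_j^I
    cond_mean   : conditional mean of that sum given the paid ratios (Lemma 4)
    cond_var    : posterior variance (s_{0:j}^{I,post})^2 (Lemma 4)
    """
    scale = math.sqrt(math.expm1(s_next_paid ** 2) / math.expm1(cond_var))
    centred = math.exp(sum_xi_inc - cond_mean - 0.5 * cond_var) - 1.0
    return 1.0 + lam * scale * centred


# Hypothetical inputs: incurred ratios above and below their conditional mean.
g_up = mcl_correction_factor(0.6, 0.1, sum_xi_inc=0.8, cond_mean=0.5, cond_var=0.0576)
g_down = mcl_correction_factor(0.6, 0.1, sum_xi_inc=0.2, cond_mean=0.5, cond_var=0.0576)
```

For positive $λ^{P}$ the factor exceeds one when incurred claims are high relative to what the payments suggest, and falls below one in the opposite case.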

We analyze this Munich chain-ladder correction factor for $j = 1$. It is given by

$\begin{matrix} γ_{1}^{P, MCL} (ξ_{i, [1]}) & = & 1 + λ^{P} \sqrt{\frac{e^{{(s_{2}^{P})}^{2}} - 1}{e^{{(s_{0 : 1}^{I, post})}^{2}} - 1}} \\ & & \times (e^{(ξ_{i, 0}^{I} - θ_{0}^{I}) + (ξ_{i, 1}^{I} - θ_{1}^{I}) - \frac{a_{0, 0} + a_{0, 1}}{{(s_{0}^{P})}^{2}} (ξ_{i, 0}^{P} - θ_{0}^{P}) - \frac{a_{1, 0} + a_{1, 1}}{{(s_{1}^{P})}^{2}} (ξ_{i, 1}^{P} - θ_{1}^{P}) - \frac{{(s_{0 : 1}^{I, post})}^{2}}{2}} - 1) . \end{matrix}$

(13)

We compare this to the best prediction under Assumption 3 in the case $j = 1$ characterized by Equation (9) and under the additional assumptions that $a_{2, 0} = 0$ and $a_{2, 1} \neq 0$. In this case we obtain from Equations (7) and (9) the correction term

$\begin{matrix} γ_{1}^{P} (ξ_{i, [1]}) & = & exp \{β_{1}^{P} (ξ_{i, [1]}) - \frac{b_{0, 0} a_{2, 1}^{2}}{2 (b_{0, 0} b_{1, 1} - b_{0, 1}^{2})}\} \\ & = & exp \{\frac{a_{2, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} [- b_{0, 1} (ξ_{i, 0}^{I} - θ_{0}^{I}) + b_{0, 0} (ξ_{i, 1}^{I} - θ_{1}^{I})] \\ & & + a_{2, 1} [c_{0, 1} (ξ_{i, 0}^{P} - θ_{0}^{P}) + c_{1, 1} (ξ_{i, 1}^{P} - θ_{1}^{P})] - \frac{b_{0, 0} a_{2, 1}^{2}}{2 (b_{0, 0} b_{1, 1} - b_{0, 1}^{2})}\} . \end{matrix}$

(14)

Note that Equations (13) and (14) differ. This can, for instance, be seen because all terms in the sum

\sum_{l = 0}^{j} (ξ_{i, l}^{I} - θ_{l}^{I})

in

γ_{1}^{P, MCL} (ξ_{i, [1]})

are equally weighted, whereas for the best predictor we consider a weighted sum

- b_{0, 1} (ξ_{i, 0}^{I} - θ_{0}^{I}) + b_{0, 0} (ξ_{i, 1}^{I} - θ_{1}^{I})

in

γ_{1}^{P} (ξ_{i, [1]})

. We conclude that, in general, Assumption 3 does not imply that the Munich chain-ladder model Assumption 4 is fulfilled.

The (disappointing) conclusion from Example 2 is that within the family of models fulfilling Assumption 3 with log-link

g (x) = log x

there does not exist (a general) interesting example satisfying the Munich chain-ladder model Assumption 4. Exceptions can only be found for rather artificial covariance matrices Σ; for instance, a choice with

A = 0

would fulfill the Munich chain-ladder model Assumption 4. But this latter choice is not of interest because it requires

λ^{P} = λ^{I} = 0

(which does not support the empirical findings of [1] that these correlation parameters should be positive). The result of Example 2 can be generalized to any link function as the next theorem shows.

Theorem 2.

Assume that cumulative payments

P_{i, j}

and incurred claims

I_{i, j}

fulfill Assumption 3 for a given continuous and strictly increasing link function

g : R_{+} \to R

with

{lim}_{x \to 0} g (x) = - \infty

and

{lim}_{x \to \infty} g (x) = \infty

. In general, this model does not fulfill the Munich chain-ladder model Assumption 4, except for special choices of Σ.

Proof.

The optimal one-step ahead prediction for given link function g is given by, see also Lemma 3,

E [P_{i, j + 1}| B_{j}] = P_{i, j} E [g^{- 1} (ξ_{i, j + 1}^{P})| B_{j}],

with

ξ_{i, j + 1}^{P} |_{B_{j}} \sim N (θ_{j + 1}^{P} + {(Σ_{j, j + 1}^{(P)})}^{'} Σ_{[j]}^{- 1} (ξ_{i, [j]} - θ_{[j]}), {(s_{j + 1}^{(P), post})}^{2}) .

From the latter, we observe that observation

ξ_{i, [j]}^{I}

is considered in a linear fashion

c^{'} ξ_{i, [j]}^{I}

for an appropriate vector

c \in R^{j + 1}

, which typically is different from zero (for

A \neq 0

) and which does not point into the direction of

{(1, \dots, 1)}^{'} \in R^{j + 1}

, i.e., we consider a weighted sum of the components of

ξ_{i, [j]}^{I}

(with non-identical weights).

On the other hand, the correction terms from the Munich chain-ladder assumption for a given link function g are given by, see also Equation (10),

\begin{matrix} λ^{P} σ_{j}^{P} P_{i, j} ε_{i, j}^{I | P} & = & λ^{P} σ_{j}^{P} P_{i, j} \frac{g^{- 1} (ν_{i}) \prod_{l = 0}^{j} g^{- 1} (ξ_{i, l}^{I}) - E [I_{i, j}| B_{j}^{P}]}{Var {(I_{i, j}| B_{j}^{P})}^{1 / 2}} \\ = & λ^{P} σ_{j}^{P} P_{i, j} \frac{g^{- 1} (ν_{i}) exp \{\sum_{l = 0}^{j} log (g^{- 1} (ξ_{i, l}^{I}))\} - E [I_{i, j}| B_{j}^{P}]}{Var {(I_{i, j}| B_{j}^{P})}^{1 / 2}} . \end{matrix}

Thus, the only link function g which considers the components of

ξ_{i, [j]}^{I}

in a linear fashion is the log-link

g (x) = log x

. For the log-link we get

\begin{matrix} λ^{P} σ_{j}^{P} P_{i, j} ε_{i, j}^{I | P} & = & λ^{P} σ_{j}^{P} P_{i, j} \frac{exp \{ν_{i} + \sum_{l = 0}^{j} ξ_{i, l}^{I}\} - E [I_{i, j}| B_{j}^{P}]}{Var {(I_{i, j}| B_{j}^{P})}^{1 / 2}} . \end{matrix}

From this we see that all components of

ξ_{i, [j]}^{I}

are considered with identical weights, and, therefore, it differs from the optimal one-step ahead prediction (if the latter uses non-identical weights). This is exactly what we have seen in Example 2 and proves the theorem. ☐

In Theorem 4.1 of [3], the Munich chain-ladder structure has been found as a best linear approximation to

E [P_{i, j + 1}| B_{j}]

in the following way

\begin{matrix} {\hat{E}}^{linear} [P_{i, j + 1}| B_{j}] & = & \underset{X = c_{1} P_{i, j} + c_{2} I_{i, j}; c_{1}, c_{2} \in L (B_{j}^{P})}{argmin} E [{(X - P_{i, j + 1})}^{2}| B_{j}^{P}] \\ = & f_{j}^{P} P_{i, j} + Corr (P_{i, j + 1}, I_{i, j}| B_{j}^{P}) Var {(P_{i, j + 1}| B_{j}^{P})}^{1 / 2} ε_{i, j}^{I | P}, \end{matrix}

(15)

where

L (B_{j}^{P})

is the space of

B_{j}^{P}

-measurable random variables. Note that this approximates the exact conditional expectation

E [P_{i, j + 1}| B_{j}]

and it gives an explicit meaning to parameter

λ^{P} \in (- 1, 1)

(which typically is non-constant in j), see also Section 2.2.2 in [1].

Theorem 3

(approximation error of MCL predictor). Under Assumption 3 and the log-link choice

g (x) = log x

we have approximation error for the Munich chain-ladder predictor

{\hat{E}}^{linear} [P_{i, j + 1}| B_{j}]

given by the difference

{\hat{E}}^{linear} [P_{i, j + 1}| B_{j}] - E [P_{i, j + 1}| B_{j}] = f_{j}^{P} P_{i, j} (γ_{j}^{P, MCL} (ξ_{i, [j]}) - γ_{j}^{P} (ξ_{i, [j]})),

where

γ_{j}^{P, MCL} (ξ_{i, [j]})

is given in Equation (12) with

λ^{P}

replaced by

Corr (P_{i, j + 1}, I_{i, j} | B_{j}^{P})

and

γ_{j}^{P} (ξ_{i, [j]})

is given in Corollary 2.

Proof.

This proof follows from Example 2. ☐

Remark 3.

In Theorem 2, we have seen that, in general, the Munich chain-ladder model Assumption 4 is not fulfilled for chain-ladder models satisfying Assumption 3. If, nevertheless, we would like to use an estimator that has Munich chain-ladder structure, we should use it in the sense of best-linear approximation Equation (15) to the best prediction

E [P_{i, j + 1}| B_{j}]

. Theorem 3 gives the approximation error of this approach for the log-link choice.

5. The Modified Munich Chain-Ladder Method

In the sequel, we concentrate on the model of Assumption 3 with log-link function

g (x) = log x

. This provides the chain-ladder model specified in the second part of Theorem 1 and the one-step ahead prediction given in Corollary 2. The issues that we still need to consider are the following: (i) We would like to extend the one-step ahead predictions to get the predictions of

P_{i, J}

and

I_{i, J}

, i.e., the final values of each accident year

i = 1, \dots, J

; (ii) Typically, model parameters are not known and need to be estimated; (iii) We should specify the prediction uncertainty. In order to achieve these goals, we choose a Bayesian modeling framework.

We remark that we consider tasks (ii) and (iii) in a Bayesian framework which turns out to be rather straightforward. Alternatively, one could also consider these questions from a frequentist’s viewpoint. In this case, (ii) is solved by maximum likelihood estimation and (iii) can be assessed either with bootstrap methods or by (asymptotic) results for maximum likelihood estimates. Our experience is that in many cases these different assessments lead to rather similar values if one uses non-informative priors in the Bayesian approach.

Assumption 5

((Bayesian) modified Munich chain-ladder model). Choose log-link

g (x) = log x

and assume the following: There is given a fixed covariance matrix Σ of the form Equation (8) having positive definite Schur complements

S_{[J]}^{P}

and

S_{[J]}^{I}

.

Conditionally, given parameter vector $Θ = {(Θ_{0}^{P}, \dots, Θ_{J}^{P}, Θ_{0}^{I}, \dots, Θ_{J}^{I})}^{'}$ , the random vectors $Ξ_{i}$ are independent for different accident years $i = 0, \dots, J$ with

${Ξ_{i}|}_{Θ} \sim N (Θ, Σ) .$

The parameter vector Θ has prior distribution

$Θ \sim N (θ, T),$

with prior mean $θ \in R^{2 (J + 1)}$ and symmetric positive definite prior covariance matrix $T \in R^{2 (J + 1) \times 2 (J + 1)}$ .

We first merge all accident years

i = 0, \dots, J

to one random vector

Ξ = {(Ξ_{0}^{'}, \dots, Ξ_{J}^{'})}^{'},

which has conditional distribution

{Ξ|}_{Θ} \sim N (B Θ, Σ^{+}),

for an appropriate matrix

B \in R^{2 {(J + 1)}^{2} \times 2 (J + 1)}

and covariance matrix

Σ^{+} = diag (Σ, \dots, Σ)

. The following lemma is crucial; we refer to Corollary 4.3 in [6].

Lemma 5.

Under Assumption 5 the random vector

ζ = {(Ξ^{'}, Θ^{'})}^{'}

has a multivariate Gaussian distribution given by

ζ = (\begin{matrix} Ξ \\ Θ \end{matrix}) \sim N (μ = (\begin{matrix} B θ \\ θ \end{matrix}), S = (\begin{matrix} Σ^{+} + B T B^{'} & B T \\ T B^{'} & T \end{matrix})) .

An easy consequence of Lemma 5 is the following marginal distribution

Ξ \sim N (B θ, Σ^{+} + B T B^{'}) .

This shows that, in the Bayesian multivariate normal model with Gaussian priors, we can completely “integrate out” the hierarchy of parameters Θ. However, we keep the hierarchy of parameters in order to obtain Bayesian parameter estimates for Θ.
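
As a minimal sketch of Lemma 5, the following NumPy snippet assembles B, Σ^+ and the joint moments (μ, S) of ζ for a small case J = 1. The helper name `joint_moments` and the numerical values of θ, Σ and T are hypothetical and chosen only for illustration. We also assume that Ξ stacks the accident years in the order i = 0, …, J, so that B consists of J + 1 stacked identity matrices.

```python
import numpy as np


def joint_moments(theta, Sigma, T, n_years):
    """Return (B, mu, S) for zeta = (Xi', Theta')' in the form of Lemma 5.

    Assumption: Xi stacks Xi_0, ..., Xi_J, so B is J + 1 stacked identities.
    """
    d = len(theta)                                   # d = 2 (J + 1)
    B = np.kron(np.ones((n_years, 1)), np.eye(d))    # E[Xi | Theta] = B Theta
    Sigma_plus = np.kron(np.eye(n_years), Sigma)     # diag(Sigma, ..., Sigma)
    mu = np.concatenate([B @ theta, theta])
    S = np.block([[Sigma_plus + B @ T @ B.T, B @ T],
                  [T @ B.T, T]])
    return B, mu, S


# Hypothetical inputs for J = 1 (two accident years, d = 4).
J = 1
theta = np.array([7.0, 0.9, 7.8, 0.5])
Sigma = np.diag([0.25, 0.03, 0.27, 0.02])
T = 10.0 * np.eye(4)
B, mu, S = joint_moments(theta, Sigma, T, J + 1)
```

In this toy case ζ has dimension n = 2 (J + 1)^2 + 2 (J + 1) = 12, and the upper-left block of S is the covariance of the marginal distribution of Ξ.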

Denote the dimension of ζ by

n = 2 {(J + 1)}^{2} + 2 (J + 1)

. Choose

t, v \in N

with

t + v = n

. Denote by

P_{t} \in R^{t \times n}

and

P_{v} \in R^{v \times n}

the projections such that we obtain a disjoint decomposition of the components of ζ

ζ \mapsto (ζ_{t}, ζ_{v}) = (P_{t} ζ, P_{v} ζ) .

(16)

The random vector

{(ζ_{t}^{'}, ζ_{v}^{'})}^{'}

has a multivariate Gaussian distribution with expected values

μ_{t} = E [ζ_{t}] = P_{t} μ and μ_{v} = E [ζ_{v}] = P_{v} μ,

and with covariance matrices

S_{t} = Cov (ζ_{t}) = P_{t} S P_{t}^{'}, S_{v} = Cov (ζ_{v}) = P_{v} S P_{v}^{'}, S_{v, t}^{'} = S_{t, v} = Cov (ζ_{t}, ζ_{v}) = P_{t} S P_{v}^{'} .

The projections in Equation (16) only describe a permutation of the components of ζ. In complete analogy to Lemma 1 we have the following lemma.

Lemma 6.

Under Assumption 5, the random vector

ζ_{v} |_{{ζ_{t}}}

has a multivariate Gaussian distribution with the first two conditional moments given by

\begin{matrix} μ_{v}^{post} & = & E [ζ_{v}| ζ_{t}] = μ_{v} + S_{v, t} {(S_{t})}^{- 1} (ζ_{t} - μ_{t}), \\ S_{v}^{post} & = & Cov (ζ_{v}| ζ_{t}) = S_{v} - S_{v, t} {(S_{t})}^{- 1} S_{t, v} . \end{matrix}
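
The conditioning step of Lemma 6 can be sketched in a few lines of NumPy. The function name `condition_gaussian` and the index-based selection of observed components (in place of the projection matrices P_t and P_v) are our own illustrative choices; the bivariate numbers are hypothetical.

```python
import numpy as np


def condition_gaussian(mu, S, idx_t, z_t):
    """Posterior mean and covariance of the unobserved components (Lemma 6).

    idx_t lists the positions of the observed components zeta_t in zeta;
    the remaining positions form zeta_v.
    """
    idx_t = np.asarray(idx_t)
    idx_v = np.setdiff1d(np.arange(len(mu)), idx_t)
    S_t = S[np.ix_(idx_t, idx_t)]
    S_vt = S[np.ix_(idx_v, idx_t)]
    S_v = S[np.ix_(idx_v, idx_v)]
    # Solve linear systems instead of forming the inverse of S_t explicitly.
    mu_post = mu[idx_v] + S_vt @ np.linalg.solve(S_t, z_t - mu[idx_t])
    S_post = S_v - S_vt @ np.linalg.solve(S_t, S_vt.T)
    return idx_v, mu_post, S_post


# Hypothetical bivariate check: observe the first component, predict the second.
mu = np.array([0.0, 0.0])
S = np.array([[1.0, 0.5],
              [0.5, 1.0]])
idx_v, mu_post, S_post = condition_gaussian(mu, S, [0], np.array([2.0]))
```

For unit variances and correlation 0.5, observing the value 2 shifts the mean of the second component to 1 and reduces its variance to 0.75, in line with the two formulas of the lemma.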

This lemma allows us to estimate the parameters and calculate the predictions at time J, conditionally given observations

\begin{matrix} D_{J}^{P} & = & \{P_{i, j}; 0 \leq i \leq J, 0 \leq j \leq J; i + j \leq J\}, \\ D_{J}^{I} & = & \{I_{i, j}; 0 \leq i \leq J, 0 \leq j \leq J; i + j \leq J\}, \\ D_{J} & = & D_{J}^{P} \cup D_{J}^{I} . \end{matrix}

Choose

t = | D_{J} |

and

v = n - t

and denote by

P_{t}

the projection of ζ onto the components

ξ_{i, j}^{P}

and

ξ_{i, j}^{I}

with

i + j \leq J

. These are exactly the components that generate information

D_{J}

. Lemma 6 allows us to calculate the posterior distribution of

ζ_{v}

, conditionally given

D_{J}

. We split this calculation into two parts, one for parameter estimation and one for claims prediction. We consider therefore the following projection

P_{Θ} \in R^{2 (J + 1) \times v} with P_{Θ} ζ_{v} = Θ .

This projection extracts the parameter vector Θ from the unobserved components

ζ_{v}

.

Corollary 3

(parameter estimation). Under Assumption 5, the Bayesian estimator for the parameter vector Θ is at time J given by

θ^{post} = E [Θ| D_{J}] = P_{Θ} μ_{v}^{post} .

This can now be compared to the individual estimates

θ^{(*), post} = E [Θ| D_{J}^{*}],

(17)

where for

* \in \{P, I\}

we either condition on

D_{J}^{P}

or on

D_{J}^{I}

.

6. Claims Prediction and Prediction Uncertainty

For the prediction of the total claim amount of accident year i, we have two different possibilities: either we use the predictor of cumulative payments

P_{i, J}

or the one of incurred claims

I_{i, J}

. Naturally, these two predictors differ and the Munich chain-ladder method exactly aims at diminishing this difference by including the incurred-paid and paid-incurred ratios, see Remark 2 and [1]. Choose the log-link

g (x) = log x

, then we calculate for

i = 1, \dots, J

the best predictors

E [P_{i, J}| D_{J}] = P_{i, J - i} E [exp \{\sum_{l = J - i + 1}^{J} ξ_{i, l}^{P}\}| D_{J}],

and

E [I_{i, J}| D_{J}] = I_{i, J - i} E [exp \{\sum_{l = J - i + 1}^{J} ξ_{i, l}^{I}\}| D_{J}] .

Assume again that

ζ_{t}

exactly corresponds to the observations in

D_{J}

. Then we define for

i = 1, \dots, J

and

* \in \{P, I\}

the linear maps

G_{i}^{*} \in R^{1 \times v} with G_{i}^{*} ζ_{v} = \sum_{l = J - i + 1}^{J} ξ_{i, l}^{*} .

This is the sum of the unobserved components of accident year i at time J for cumulative payments and incurred claims, respectively.

Theorem 4

(modified Munich chain-ladder (mMCL) predictors). Under Assumption 5, the Bayesian predictors for the total claim amount of accident year

i = 1, \dots, J

at time J are

{\hat{P}}_{i, J}^{mMCL} = E [P_{i, J}| D_{J}] = P_{i, J - i} exp \{G_{i}^{P} μ_{v}^{post} + G_{i}^{P} S_{v}^{post} {(G_{i}^{P})}^{'} / 2\},

and

{\hat{I}}_{i, J}^{mMCL} = E [I_{i, J}| D_{J}] = I_{i, J - i} exp \{G_{i}^{I} μ_{v}^{post} + G_{i}^{I} S_{v}^{post} {(G_{i}^{I})}^{'} / 2\} .

The conditional mean-square error of prediction is given by

\begin{matrix} {msep}_{\sum_{i = 1}^{J} P_{i, J} | D_{J}} (\sum_{i = 1}^{J} {\hat{P}}_{i, J}^{mMCL}) & = & Var (\sum_{i = 1}^{J} P_{i, J}| D_{J}) \\ = & \sum_{i, k = 1}^{J} {\hat{P}}_{i, J}^{mMCL} {\hat{P}}_{k, J}^{mMCL} (exp \{G_{i}^{P} S_{v}^{post} {(G_{k}^{P})}^{'}\} - 1), \end{matrix}

and analogously for incurred claims

{msep}_{\sum_{i = 1}^{J} I_{i, J} | D_{J}} (\sum_{i = 1}^{J} {\hat{I}}_{i, J}^{mMCL})

.

This can now again be compared to the individual predictors

{\hat{P}}_{i, J}^{HCL} = E [P_{i, J}| D_{J}^{P}] and {\hat{I}}_{i, J}^{HCL} = E [I_{i, J}| D_{J}^{I}],

(18)

and the corresponding conditional mean-square errors of prediction. Note that these individual predictors correspond to the predictors in the model of Hertig [7] under Gaussian prior assumptions for the (unknown) mean parameters. Predictors and prediction uncertainty of Equation (18) can (easily) be obtained from Theorem 4 using the particular choice

A = 0

in Σ.
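
The formulas of Theorem 4 translate directly into matrix operations. The following sketch assumes that the posterior moments μ_v^post and S_v^post are already available (for instance from Lemma 6) and that the maps G_i are stacked as rows of a matrix G. The function name `mmcl_predict` and all numerical inputs are hypothetical.

```python
import numpy as np


def mmcl_predict(last_obs, G, mu_post, S_post):
    """Log-normal predictors and conditional msep in the form of Theorem 4.

    last_obs[i] is the last observed diagonal value of an accident year and
    row i of G is the map G_i that sums its unobserved log-link ratios.
    """
    last_obs = np.asarray(last_obs, dtype=float)
    C = G @ S_post @ G.T                       # C[i, k] = G_i S_post G_k'
    pred = last_obs * np.exp(G @ mu_post + 0.5 * np.diag(C))
    msep = pred @ (np.exp(C) - 1.0) @ pred     # double sum over i and k
    return pred, msep


# Hypothetical toy input: two accident years, three unobserved log-link ratios.
mu_post = np.array([0.02, 0.10, 0.02])         # posterior means of zeta_v
S_post = np.diag([0.0004, 0.0025, 0.0004])     # posterior covariance of zeta_v
G = np.array([[1.0, 0.0, 0.0],                 # first year: one ratio missing
              [0.0, 1.0, 1.0]])                # second year: two ratios missing
pred, msep = mmcl_predict([2348.0, 4494.0], G, mu_post, S_post)
```

With a non-diagonal S_post the off-diagonal entries of C are non-zero, so the msep of the total also contains the cross terms between accident years.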

Before we give a numerical example, we briefly describe these predictors. The likelihood function of Assumption 5 is given by

\begin{matrix} L (Ξ, Θ) & = & \frac{1}{{(2 π)}^{2 {(J + 1)}^{2} / 2} \det {(Σ^{+})}^{1 / 2}} exp \{- \frac{1}{2} {(Ξ - B Θ)}^{'} {(Σ^{+})}^{- 1} (Ξ - B Θ)\} \\ \times \frac{1}{{(2 π)}^{2 (J + 1) / 2} \det {(T)}^{1 / 2}} exp \{- \frac{1}{2} {(Θ - θ)}^{'} T^{- 1} (Θ - θ)\} . \end{matrix}

Under the additional assumption of diagonal matrices

\begin{matrix} Σ & = & diag ({(s_{0}^{P})}^{2}, \dots, {(s_{J}^{P})}^{2}, {(s_{0}^{I})}^{2}, \dots, {(s_{J}^{I})}^{2}), \\ T & = & diag ({(t_{0}^{P})}^{2}, \dots, {(t_{J}^{P})}^{2}, {(t_{0}^{I})}^{2}, \dots, {(t_{J}^{I})}^{2}), \end{matrix}

(19)

we obtain log-likelihood (we drop all normalizing constants)

log L (Ξ, Θ) \propto - \frac{1}{2} \sum_{j = 0}^{J} [\sum_{i = 0}^{J} \frac{{(ξ_{i, j}^{P} - Θ_{j}^{P})}^{2}}{{(s_{j}^{P})}^{2}} + \frac{{(Θ_{j}^{P} - θ_{j}^{P})}^{2}}{{(t_{j}^{P})}^{2}} + \sum_{i = 0}^{J} \frac{{(ξ_{i, j}^{I} - Θ_{j}^{I})}^{2}}{{(s_{j}^{I})}^{2}} + \frac{{(Θ_{j}^{I} - θ_{j}^{I})}^{2}}{{(t_{j}^{I})}^{2}}] .

From this, we see that the Bayesian estimators of the parameters are for

j = 0, \dots, J

and

* \in \{P, I\}

under Equation (19) given by, see also Corollary 3,

E [Θ_{j}^{*}| D_{J}] = z_{j}^{*} {\hat{Θ}}_{j}^{*} + (1 - z_{j}^{*}) θ_{j}^{*},

with prior mean

θ_{j}^{*}

, and empirical mean

{\hat{Θ}}_{j}^{*}

and credibility weight

z_{j}^{*}

given by

{\hat{Θ}}_{j}^{*} = \frac{1}{J - j + 1} \sum_{i = 0}^{J - j} ξ_{i, j}^{*} and z_{j}^{*} = \frac{J - j + 1}{J - j + 1 + {(s_{j}^{*} / t_{j}^{*})}^{2}} .

If we now let the prior information become non-informative, i.e.,

t_{j}^{*} \to \infty

, we obtain estimate

lim_{t_{j}^{*} \to \infty} E [Θ_{j}^{*}| D_{J}] = {\hat{Θ}}_{j}^{*},

(20)

and posterior variances

{(s_{j}^{*})}^{2} / (J - j + 1)

. In view of Theorem 4, this provides under Equation (19) and in the non-informative prior limit

lim_{t_{j}^{P} \to \infty} {\hat{P}}_{i, J}^{mMCL} = P_{i, J - i} \prod_{j = J - i + 1}^{J} exp \{{\hat{Θ}}_{j}^{P} + \frac{{(s_{j}^{P})}^{2}}{2} (1 + \frac{1}{J - j + 1})\} = P_{i, J - i} \prod_{j = J - i}^{J - 1} {\hat{f}}_{j}^{P},

(21)

where the latter identity defines the chain-ladder parameter estimates

{\hat{f}}_{j}^{P}

for our model. This is exactly the chain-ladder predictor obtained in Hertig’s log-normal chain-ladder model, see formula (5.9) in [7]. The corresponding result also holds true for incurred claims under Equation (19).
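
To illustrate the credibility formula and the variance correction in Equation (21), the following sketch evaluates them for development year j = 1 of the cumulative payments in Table A1. The function name `posterior_mean`, the prior mean 0 and the large prior standard deviation t = 10^6 (used to mimic the non-informative limit) are assumptions made for illustration only. The sample standard deviation serves as a plug-in for s_1^P, as in the example of Section 7.

```python
import math
import statistics

# Cumulative payments P_{i,0} and P_{i,1}, i = 0, ..., 5, taken from Table A1.
P0 = [576, 866, 1412, 2286, 1868, 1442]
P1 = [1804, 1948, 3758, 5292, 3778, 4010]

J, j = 6, 1
n = J - j + 1                                    # observations in column j
xi = [math.log(b / a) for a, b in zip(P0, P1)]   # log-link ratios xi_{i,1}^P

theta_hat = statistics.mean(xi)                  # empirical mean
s = statistics.stdev(xi)                         # sample standard deviation


def posterior_mean(theta_hat, prior_mean, s, t, n):
    """Credibility-weighted mean z * theta_hat + (1 - z) * prior_mean."""
    z = n / (n + (s / t) ** 2)
    return z * theta_hat + (1.0 - z) * prior_mean


# A very large prior standard deviation t mimics the non-informative limit.
theta_post = posterior_mean(theta_hat, prior_mean=0.0, s=s, t=1e6, n=n)

# Development factor with the log-normal variance correction of Equation (21).
f_hat = math.exp(theta_post + 0.5 * s**2 * (1.0 + 1.0 / n))
```

Up to rounding, the three quantities agree with the entries 0.9163, 0.1600 and 2.5376 reported for j = 1 in Table 1.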

As was investigated by Quarg and Mack [1], see also Remark 2 above and Figure 1 below, we expect positive dependence between cumulative payment residuals and incurred-paid ratios (and between incurred claims residuals and paid-incurred ratios). This will be reflected by a covariance matrix choice Σ that does not have diagonal form Equation (19) but a general off-diagonal matrix A in Equation (8) such that the Schur complements are positive definite (see Assumption 5). In this case, the best predictors are provided by Theorem 4. They do not have a simple form (though their calculation is straightforward using matrix algebra). We will compare these predictors to the Munich chain-ladder predictors Equation (15) which are non-optimal in our context (see Theorem 3).

7. Example

We provide an explicit example which is based on the original data of Quarg and Mack [1]; the data are provided in Appendix A. For this data set, we calculate Hertig’s chain-ladder (HCL) reserves according to Equations (18) and (21), the reserves in the modified Munich chain-ladder (mMCL) method of Theorem 4, and the (non-optimal) log-normal Munich chain-ladder (LN–MCL) reserves of Equation (15) (according to Assumption 4). These reserves are based on the Bayesian multivariate log-normal framework of Assumption 5 with log-link

g (x) = log x

. For comparison purposes, we also provide the classical chain-ladder (CL) reserves together with the Quarg and Mack Munich chain-ladder reserves (QM–MCL); these two latter methods differ from our results because of the different variance assumption in Assumption 1. In order to have comparability between the different approaches, we choose non-informative priors

t_{j}^{P}, t_{j}^{I} \to \infty

in the former Bayesian methods, see also Equations (20) and (19).

First, we need to estimate the parameters in the log-normal model of Assumption 5. For

s_{j}^{P}

and

s_{j}^{I}

, we choose the sample standard deviations of the observed log-link ratios

ξ_{i, j}^{P}

and

ξ_{i, j}^{I}

,

i + j \leq J

, with the usual exponential extrapolation for the last period

j = 6

. Using these sample estimators, we calculate the posterior means

θ_{j}^{(P), post}

and

θ_{j}^{(I), post}

using Corollary 3 under choice

A = 0

. In the non-informative prior limit, these posterior means are given by Equation (21) (and similarly for incurred claims). This then allows one to calculate Hertig’s chain-ladder parameters

{\hat{f}}_{j}^{P}

and

{\hat{f}}_{j}^{I}

, see Equation (21). These parameters are provided in Table 1. Note that these chain-ladder factors differ from the ones in the classical chain-ladder model because of the different variance assumptions.

**Table 1.** Sample standard deviations $s_{j}^{P}$ and $s_{j}^{I}$ ; posterior means $θ_{j}^{(P), post}$ and $θ_{j}^{(I), post}$ obtained from Corollary 3, see also Equation (21); and Hertig’s chain-ladder estimates ${\hat{f}}_{j}^{P}$ and ${\hat{f}}_{j}^{I}$ according to Equation (21).
d.y. j	0	1	2	3	4	5	6
$θ_{j}^{(P), post}$	7.2195	0.9163	0.1203	0.0296	0.0216	0.0205	0.0137
$s_{j}^{P}$	0.4972	0.1600	0.0515	0.0069	0.0036	0.0101	0.0036
${\hat{f}}_{j}^{P}$	1,573	2.5376	1.1296	1.0301	1.0219	1.0208	1.0138
$θ_{j}^{(I), post}$	7.8404	0.5151	0.0137	0.0003	0.0115	−0.0090	−0.0037
$s_{j}^{I}$	0.5182	0.1503	0.0406	0.0146	0.0022	0.0180	0.0022
${\hat{f}}_{j}^{I}$	2,963	1.6959	1.0148	1.0004	1.0116	0.9912	0.9963

Using these parameters, we calculate the HCL reserves (prediction minus the last observed cumulative payments

P_{i, J - i}

at time J), and for comparison purposes we provide the classical CL reserves. These results are provided in Table 2. The main observation is that there are quite substantial differences between the HCL reserves from cumulative payments of 6,205 and the HCL reserves from incurred claims of 7,730, see Table 2. This also holds true for the classical CL reserves 5,938 versus 7,503. This gap mainly comes from the last accident year

i = 6

because incurred claims observation

I_{6, 0}

is comparatively high. We also note that the HCL reserves are more conservative than the classical CL ones. This mainly comes from the variance correction that enters the mean of log-normal random variables, see Equation (21).
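
As a rough check of the HCL paid reserves, the following sketch multiplies the last observed diagonal of Table A1 by the rounded factors of Table 1. We read the factor printed in column j of Table 1 as the one that develops column j − 1 into column j, which is an interpretation on our part. Because the published factors are rounded to four decimals, individual accident years can differ from Table 2 by about one unit.

```python
# Hertig chain-ladder factors for payments from Table 1 (columns j = 1, ..., 6).
f_hat = {1: 2.5376, 2: 1.1296, 3: 1.0301, 4: 1.0219, 5: 1.0208, 6: 1.0138}

# Last observed cumulative payments P_{i, J-i} on the diagonal of Table A1.
diagonal = {1: 2348, 2: 4494, 3: 5850, 4: 4648, 5: 4010, 6: 2044}

J = 6
reserves = {}
for i, p_last in diagonal.items():
    ultimate = float(p_last)
    # Develop from the last observed column J - i up to the final column J.
    for j in range(J - i + 1, J + 1):
        ultimate *= f_hat[j]
    reserves[i] = ultimate - p_last

total = sum(reserves.values())
```

The total obtained in this way is close to the HCL paid reserves of 6,205 quoted above.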

To bridge this gap between the cumulative payments and the incurred claims methods we study the other reserving methods. We start with the LN–MCL method under the log-normal assumptions of Assumption 4. First we determine the correlation parameters. We use the estimators of Section 3.1.2 in [1] with changed variance functions. This provides estimates

{\hat{λ}}^{P} = 49 %

and

{\hat{λ}}^{I} = 45 %

. Note that this exactly corresponds to the positive linear dependence illustrated in Figure 1; Quarg and Mack [1] obtain under their (changed) variance assumption 64% and 44%, respectively, which is in line with our findings. Using these estimates we can then calculate the reserves in our LN–MCL method and in Quarg-Mack’s QM–MCL method. The results are provided in Table 2. We observe that the gap between the cumulative payments reserves of 6,729 and the incurred claims reserves of 7,140 becomes more narrow due to the correction factors. The same holds true for QM–MCL with reserves 6,847 and 7,120, respectively. Moreover, both models LN–MCL and QM–MCL provide rather close results, though their model assumptions differ in the variance assumption.

**Table 2.** Resulting reserves from the Hertig’s chain-ladder (HCL) method based on paid and incurred; from the log-normal Munich chain-ladder (LN–MCL) method based on paid and incurred; from the modified Munich chain-ladder (mMCL) paid method; the classical chain-ladder (CL) method based on paid and incurred (inc.); and the Quarg and Mack Munich chain-ladder (QM–MCL) method paid and incurred.
a.y. i	HCL		LN-MCL		mMCL	CL		QM-MCL
a.y. i	paid	inc.	paid	inc.	paid	paid	inc.	paid	inc.
1	32	97	35	95	16	32	97	35	96
2	157	92	92	147	115	158	88	103	135
3	337	286	262	346	375	332	276	269	326
4	416	201	289	330	382	408	191	289	302
5	925	459	656	688	906	924	466	646	655
6	4,339	6,594	5,395	5,534	5,130	4,084	6,385	5,505	5,606
total	6,205	7,730	6,729	7,140	6,924	5,938	7,503	6,847	7,120

Figure 1. (lhs) Incurred-paid residuals obtained from

Q_{i, j}^{- 1} = I_{i, j} / P_{i, j}

, see Remark 2, versus claims payments residuals obtained from

P_{i, j + 1}

, straight line has slope

{\hat{λ}}^{P} = 49 %

; (rhs) paid-incurred residuals obtained from

Q_{i, j} = P_{i, j} / I_{i, j}

versus incurred claims residuals obtained from

I_{i, j + 1}

, straight line has slope

{\hat{λ}}^{I} = 45 %

.

Finally, we study the modified Munich chain-ladder method mMCL of Assumption 5, see Theorem 4. We therefore need to specify the off-diagonal matrix

A = {(a_{k, l})}_{0 \leq k, l \leq J}

, see Equation (8). A first idea to calibrate this matrix A is to use correlation estimate

{\hat{λ}}^{P} = 49 %

from the LN-MCL method. A crude approximation using Theorem 3 provides

49 % = {\hat{λ}}^{P} \approx Corr (P_{i, j + 1}, I_{i, j}| B_{j}^{P}) \approx \frac{\sum_{k = 0}^{j} a_{j + 1, k}}{σ_{j + 1}^{P} {(\sum_{k = 0}^{j} {(σ_{k}^{I})}^{2})}^{1 / 2}} = \frac{\sum_{k = 0}^{j} Corr (ξ_{i, j + 1}^{P}, ξ_{i, k}^{I}) σ_{k}^{I}}{{(\sum_{k = 0}^{j} {(σ_{k}^{I})}^{2})}^{1 / 2}} .

From this, we see that in our numerical example we need comparatively high correlations, for instance,

Corr (ξ_{i, j + 1}^{P}, ξ_{i, k}^{I}) \geq 40 %

would be in line with

{\hat{λ}}^{P} = 49 %

. The difficulty with this choice is that the resulting matrix Σ of type Equation (8) is not positive definite. Therefore, we need to choose smaller correlations. We make the following choice for all

i, j \geq 0

Corr (ξ_{i, j + m}^{P}, ξ_{i, j}^{I}) = \{\begin{matrix} 40 % & for m = 1, \\ 30 % & for m = 2, \\ 20 % & for m = 3, \end{matrix}

(22)

and 0% otherwise. This provides a positive definite choice for Σ of type Equation (8) in our example. This choice means that we can learn from incurred claims observations

ξ_{i, j}^{I}

(which relate to residuals

ε_{i, j}^{I | P}

) for cumulative payments observations

ξ_{i, j + m}^{P}

with development lags

m = 1, 2, 3

, but no other conclusions can be drawn from other observations. Note that, in this example, we only use correlation choices Equation (22), but no similar choice for

Corr (ξ_{i, j + m}^{I}, ξ_{i, j}^{P})

is made. The reason is that if we choose positive correlations for the latter, in general, Σ is not positive definite. This shows that requirement Equation (8) is rather restrictive and we expect that data usually does not satisfy Assumption 1, because both plots in Figure 1 show a positive slope.
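
The positive definiteness statements above can be checked numerically. Since Σ = D R D for the diagonal matrix D of standard deviations and the correlation matrix R, it suffices to check R. The sketch below does this for J = 6 under three correlation choices: the lagged choice of Equation (22), a uniform 40% for all pairs with k ≤ j, and a two-sided variant that additionally uses the same lagged values for Corr(ξ^I_{j+m}, ξ^P_j). The last variant and the helper names are our own assumptions, because the text only speaks of "a similar choice". Within each of the two blocks the log-link ratios are taken to be uncorrelated, as in the diagonal blocks of Appendix B.

```python
import numpy as np


def block_correlation(C):
    """Correlation matrix of (xi^P, xi^I): identity blocks and cross block C."""
    k = C.shape[0]
    return np.block([[np.eye(k), C], [C.T, np.eye(k)]])


def is_positive_definite(R):
    """True if the smallest eigenvalue of the symmetric matrix R is positive."""
    return bool(np.linalg.eigvalsh(R).min() > 0.0)


n = 7                                  # development years j = 0, ..., 6
lagged = np.zeros((n, n))              # entry (j + m, j): Corr(xi^P_{j+m}, xi^I_j)
for m, rho in {1: 0.4, 2: 0.3, 3: 0.2}.items():
    lagged += rho * np.eye(n, k=-m)

flat = 0.4 * np.tril(np.ones((n, n)), k=-1)    # 40% for every pair with k <= j

pd_lagged = is_positive_definite(block_correlation(lagged))
pd_flat = is_positive_definite(block_correlation(flat))
pd_two_sided = is_positive_definite(block_correlation(lagged + lagged.T))
```

Only the lagged choice passes the check. The eigenvalues of the block matrix are 1 ± σ, where σ runs over the singular values of the cross block, so positive definiteness requires every singular value to be below one; the lagged block has row and column sums of at most 0.9, which guarantees this.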

The resulting mMCL reserves

{\hat{R}}_{i}^{mMCL} = {\hat{P}}_{i, J}^{mMCL} - P_{i, J - i},

according to Theorem 4, are provided in Table 2. Correlation choice Equation (22) means that we learn from incurred claims, which are above average for accident year

i = 6

. This substantially increases the mMCL reserves based on cumulative payments to 6,924. Note that we do not provide the values for incurred claims: positive definiteness of Σ restricts

Corr (ξ_{i, j + m}^{I}, ξ_{i, j}^{P}) = 0

(under Equation (22)) which implies that we obtain almost identical values to the HCL incurred reserves for

{\hat{I}}_{i, J}^{mMCL} - P_{i, J - i}

.

Finally, we analyze the prediction uncertainty measured by the square-rooted conditional mean square error of prediction. The results are provided in Table 3.

The prediction uncertainties of the HCL reserves and of the mMCL reserves were calculated according to Theorem 4. For the former (HCL reserves), we simply need to set

A = 0

. We see that the uncertainties in the modified version for cumulative payments are reduced from 1,249 to 1,208 because correlations Equation (22) imply that we can learn from incurred claims for cumulative payments. For incurred claims, they remain (almost) invariant because of choices

a_{k, l} = 0

for

k < l

.

We can now calculate the prediction uncertainty for the LN–MCL method (which is still an open problem). Within Assumption 5, we know that the mMCL predictor is optimal, therefore, we obtain prediction uncertainty for the LN–MCL method

{msep}_{\sum_{i} P_{i, J} | D_{J}} (\sum_{i} {\hat{P}}_{i, J}^{MCL}) = {msep}_{\sum_{i} P_{i, J} | D_{J}} (\sum_{i} {\hat{P}}_{i, J}^{mMCL}) + {(\sum_{i} {\hat{P}}_{i, J}^{MCL} - \sum_{i} {\hat{P}}_{i, J}^{mMCL})}^{2},

(23)

and similarly for incurred claims. The second term in Equation (23) is the approximation error because the LN–MCL predictor is non-optimal within Assumption 5.
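
Equation (23) can be evaluated directly from the rounded figures reported in Table 3. The following sketch does so; the dictionary names are illustrative, and the inputs are the published mMCL reserves and msep^{1/2} values together with the LN–MCL reserves.

```python
import math

# Reserves and msep^(1/2) of the mMCL method, quoted from Table 3.
mmcl = {"paid": (6924.0, 1208.0), "incurred": (7730.0, 1565.0)}
# LN-MCL reserves, quoted from Table 3.
ln_mcl_reserves = {"paid": 6729.0, "incurred": 7140.0}

rmsep_ln_mcl = {}
for kind, (reserve, rmsep) in mmcl.items():
    # The difference of reserves equals the difference of predicted ultimates,
    # because both methods subtract the same observed payments.
    bias = ln_mcl_reserves[kind] - reserve
    rmsep_ln_mcl[kind] = math.sqrt(rmsep**2 + bias**2)
```

Rounded to integers, this gives 1,224 for paid and 1,673 for incurred, the two starred entries of Table 3.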

**Table 3.** Resulting reserves and square-rooted conditional mean square error of prediction of the different chain-ladder methods. ^* is calculated from Equation (23).
	Reserves	msep^1/2
Hertig’s chain-ladder HCL paid	6,205	1,249
Hertig’s chain-ladder HCL incurred	7,730	1,565
log-normal Munich chain-ladder LN–MCL paid	6,729	1,224^*
log-normal Munich chain-ladder LN–MCL incurred	7,140	1,673^*
modified Munich chain-ladder mMCL paid	6,924	1,208
modified Munich chain-ladder mMCL incurred	7,730	1,565
classical chain-ladder CL paid	5,938	994
classical chain-ladder CL incurred	7,503	995
Quarg-Mack Munich chain-ladder QM–MCL paid	6,847	n/a
Quarg-Mack Munich chain-ladder QM–MCL incurred	7,120	n/a

To summarize, under assumption Equation (8) the modified Munich chain-ladder method for cumulative payments provides, in our example, claims reserves that lie between the HCL paid and the HCL incurred reserves (as requested). Moreover, it provides the smallest prediction uncertainty among the methods based on multivariate normal distributions. This is because, in contrast to the HCL paid and HCL incurred methods, it simultaneously considers the entire information

D_{J}

, and because there is no bias (approximation error), in contrast to LN–MCL paid and LN–MCL incurred. These conclusions always rely on the validity of Assumption 5, which is the weakness of the method because real data typically require covariance matrix choices different from Equation (8).

8. Conclusions

We have studied the Munich chain-ladder axioms of Quarg and Mack [1] under the moment assumptions of Assumption 1. In a multivariate log-normal modeling framework, this provides rather restrictive covariance matrix Σ requirements, see Equation (8), so that Assumption 1 is simultaneously fulfilled for cumulative payments and incurred claims. For instance, a reasonable choice of Σ for the data of Quarg and Mack [1] will differ from structure Equation (8), see Section 7 where a simultaneous choice of Equation (22) for cumulative payments and a similar choice for incurred claims would lead to a covariance matrix Σ that is not positive definite.

If Equation (8) holds, then there exists a consistent Munich chain-ladder framework, see Assumption 3, for which we can analyze claims reserves and their prediction uncertainty, see Theorem 4. Moreover, the Munich chain-ladder predictor is non-optimal in this framework, see Theorem 2, and the approximation error is provided in Theorem 3.

Author Contributions

Both authors have contributed to this document to a similar extent.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix

A. Data of Quarg and Mack [1]

**Table A1.** Observed cumulative payments $P_{i, j}$ , $i + j \leq 6$ , source Quarg and Mack [1].
a.y. i/d.y. j	0	1	2	3	4	5	6
0	576	1,804	1,970	2,024	2,074	2,102	2,131
1	866	1,948	2,162	2,232	2,284	2,348
2	1,412	3,758	4,252	4,416	4,494
3	2,286	5,292	5,724	5,850
4	1,868	3,778	4,648
5	1,442	4,010
6	2,044

**Table A2.** Observed incurred claims $I_{i, j}$ , $i + j \leq 6$ , source Quarg and Mack [1].
a.y. i/d.y. j	0	1	2	3	4	5	6
0	978	2,104	2,134	2,144	2,174	2,182	2,174
1	1,844	2,552	2,466	2,480	2,508	2,454
2	2,904	4,354	4,698	4,600	4,644
3	3,502	5,958	6,070	6,142
4	2,812	4,882	4,852
5	2,642	4,406
6	5,022

B. Inverse Matrix Σ_[1]

Consider the matrix

Σ_{[1]} = (\begin{matrix} \begin{matrix} {(s_{0}^{P})}^{2} & 0 \\ 0 & {(s_{1}^{P})}^{2} \end{matrix} & \begin{matrix} a_{0, 0} & a_{0, 1} \\ a_{1, 0} & a_{1, 1} \end{matrix} \\ \begin{matrix} a_{0, 0} & a_{1, 0} \\ a_{0, 1} & a_{1, 1} \end{matrix} & \begin{matrix} {(s_{0}^{I})}^{2} & 0 \\ 0 & {(s_{1}^{I})}^{2} \end{matrix} \end{matrix}) .

Set

\begin{matrix} b_{0, 0} & = & {(s_{0}^{I})}^{2} - a_{0, 0}^{2} / {(s_{0}^{P})}^{2} - a_{1, 0}^{2} / {(s_{1}^{P})}^{2}, \\ b_{1, 1} & = & {(s_{1}^{I})}^{2} - a_{0, 1}^{2} / {(s_{0}^{P})}^{2} - a_{1, 1}^{2} / {(s_{1}^{P})}^{2}, \\ b_{0, 1} & = & - a_{0, 0} a_{0, 1} / {(s_{0}^{P})}^{2} - a_{1, 1} a_{1, 0} / {(s_{1}^{P})}^{2} . \end{matrix}

The inverse matrix of $Σ_{[1]}$ is given by

\begin{matrix} {(Σ_{[1]})}^{- 1} & = & (\begin{matrix} \begin{matrix} \frac{1}{{(s_{0}^{P})}^{2}} + \frac{b_{0, 0} a_{0, 1}^{2} - 2 b_{0, 1} a_{0, 0} a_{0, 1} + b_{1, 1} a_{0, 0}^{2}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{4}} & \frac{b_{0, 0} a_{0, 1} a_{1, 1} - b_{0, 1} (a_{0, 0} a_{1, 1} + a_{1, 0} a_{0, 1}) + b_{1, 1} a_{1, 0} a_{0, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{2} {(s_{1}^{P})}^{2}} \\ \frac{b_{0, 0} a_{0, 1} a_{1, 1} - b_{0, 1} (a_{0, 0} a_{1, 1} + a_{1, 0} a_{0, 1}) + b_{1, 1} a_{1, 0} a_{0, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{2} {(s_{1}^{P})}^{2}} & \frac{1}{{(s_{1}^{P})}^{2}} + \frac{b_{0, 0} a_{1, 1}^{2} - 2 b_{0, 1} a_{1, 1} a_{1, 0} + b_{1, 1} a_{1, 0}^{2}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{1}^{P})}^{4}} \end{matrix} & \begin{matrix} \frac{b_{0, 1} a_{0, 1} - b_{1, 1} a_{0, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{2}} & \frac{- b_{0, 0} a_{0, 1} + b_{0, 1} a_{0, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{2}} \\ \frac{b_{0, 1} a_{1, 1} - b_{1, 1} a_{1, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{1}^{P})}^{2}} & \frac{- b_{0, 0} a_{1, 1} + b_{0, 1} a_{1, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{1}^{P})}^{2}} \end{matrix} \\ \begin{matrix} \frac{b_{0, 1} a_{0, 1} - b_{1, 1} a_{0, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{2}} & \frac{b_{0, 1} a_{1, 1} - b_{1, 1} a_{1, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{1}^{P})}^{2}} \\ \frac{- b_{0, 0} a_{0, 1} + b_{0, 1} a_{0, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{0}^{P})}^{2}} & \frac{- b_{0, 0} a_{1, 1} + b_{0, 1} a_{1, 0}}{(b_{0, 0} b_{1, 1} - b_{0, 1}^{2}) {(s_{1}^{P})}^{2}} \end{matrix} & \begin{matrix} \frac{b_{1, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} & \frac{- b_{0, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} \\ \frac{- b_{0, 1}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} & \frac{b_{0, 0}}{b_{0, 0} b_{1, 1} - b_{0, 1}^{2}} \end{matrix} \end{matrix}) . \end{matrix}
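
The closed form can be checked numerically. The quantities $b_{0, 0}$, $b_{1, 1}$ and $b_{0, 1}$ are the entries of the Schur complement of the upper-left block of $Σ_{[1]}$, and every denominator contains its determinant $b_{0, 0} b_{1, 1} - b_{0, 1}^{2}$. The following minimal sketch builds $Σ_{[1]}$ and the closed-form inverse and verifies that their product is the identity matrix. The function names and the parameter values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def sigma_1(sP, sI, a):
    """Build the 4x4 matrix Sigma_[1]; a[k][l] stands for a_{k,l}."""
    p0, p1 = sP[0] ** 2, sP[1] ** 2
    i0, i1 = sI[0] ** 2, sI[1] ** 2
    return [
        [p0, 0.0, a[0][0], a[0][1]],
        [0.0, p1, a[1][0], a[1][1]],
        [a[0][0], a[1][0], i0, 0.0],
        [a[0][1], a[1][1], 0.0, i1],
    ]


def sigma_1_inverse(sP, sI, a):
    """Closed-form inverse of Sigma_[1] via the Schur complement entries b."""
    p0, p1 = sP[0] ** 2, sP[1] ** 2
    a00, a01, a10, a11 = a[0][0], a[0][1], a[1][0], a[1][1]
    b00 = sI[0] ** 2 - a00 ** 2 / p0 - a10 ** 2 / p1
    b11 = sI[1] ** 2 - a01 ** 2 / p0 - a11 ** 2 / p1
    b01 = -a00 * a01 / p0 - a11 * a10 / p1
    det = b00 * b11 - b01 ** 2  # determinant of the Schur complement
    # Upper-left block
    ul00 = 1 / p0 + (b00 * a01 ** 2 - 2 * b01 * a00 * a01 + b11 * a00 ** 2) / (det * p0 ** 2)
    ul11 = 1 / p1 + (b00 * a11 ** 2 - 2 * b01 * a11 * a10 + b11 * a10 ** 2) / (det * p1 ** 2)
    ul01 = (b00 * a01 * a11 - b01 * (a00 * a11 + a10 * a01) + b11 * a10 * a00) / (det * p0 * p1)
    # Upper-right block (the lower-left block is its transpose)
    ur00 = (b01 * a01 - b11 * a00) / (det * p0)
    ur01 = (-b00 * a01 + b01 * a00) / (det * p0)
    ur10 = (b01 * a11 - b11 * a10) / (det * p1)
    ur11 = (-b00 * a11 + b01 * a10) / (det * p1)
    return [
        [ul00, ul01, ur00, ur01],
        [ul01, ul11, ur10, ur11],
        [ur00, ur10, b11 / det, -b01 / det],
        [ur01, ur11, -b01 / det, b00 / det],
    ]


def matmul(x, y):
    """Plain matrix product of two square lists of lists."""
    n = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(n)) for c in range(n)] for r in range(n)]


def max_deviation_from_identity(m):
    """Largest absolute entry of m - Id."""
    n = len(m)
    return max(abs(m[r][c] - (1.0 if r == c else 0.0)) for r in range(n) for c in range(n))


# Hypothetical parameter values, for illustration only.
sP, sI = (1.0, 0.8), (1.2, 0.9)
a = [[0.3, 0.1], [0.2, 0.25]]
product = matmul(sigma_1(sP, sI, a), sigma_1_inverse(sP, sI, a))
deviation = max_deviation_from_identity(product)
print(f"max |Sigma * Sigma^-1 - Id| = {deviation:.2e}")
```

The check requires $b_{0, 0} b_{1, 1} - b_{0, 1}^{2} \neq 0$, which holds whenever $Σ_{[1]}$ is positive definite.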

References

1. G. Quarg, and T. Mack. “Munich chain ladder.” Blätter DGVFM XXVI (2004): 597–630.
2. H. Liu, and R. Verrall. “Bootstrap estimation of the predictive distributions of reserves using paid and incurred claims.” Variance 4/2 (2010): 121–135.
3. M. Merz, and M.V. Wüthrich. “A credibility approach to the Munich chain-ladder method.” Blätter DGVFM XXVII (2006): 619–628.
4. R.A. Johnson, and D.W. Wichern. Applied Multivariate Statistical Analysis, 2nd ed. Upper Saddle River, NJ, USA: Prentice-Hall, 1988.
5. S. Boyd, and L. Vandenberghe. Convex Optimization. Cambridge, UK: Cambridge University Press, 2004.
6. M.V. Wüthrich, and M. Merz. “Stochastic claims reserving manual: Advances in dynamic modeling.” Available online: http://papers.ssrn.com/sol3/papers.cfm?abstract-id=2649057 (accessed on 21 December 2015).
7. J. Hertig. “A statistical approach to the IBNR-reserves in marine insurance.” ASTIN Bull. 15/2 (1985): 171–183.

© 2015 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/4.0/).
