*Article* **A Likelihood Approach to Bornhuetter–Ferguson Analysis**

#### **Valandis Elpidorou 1, Carolin Margraf 2, María Dolores Martínez-Miranda 3,\* and Bent Nielsen <sup>4</sup>**


Received: 23 October 2019; Accepted: 3 December 2019; Published: 10 December 2019

**Abstract:** A new Bornhuetter–Ferguson method is suggested herein. This is a variant of the traditional chain ladder method. The actuary can adjust the relative ultimates using externally estimated relative ultimates. These correspond to linear constraints on the Poisson likelihood underpinning the chain ladder method. Adjusted cash flow estimates were obtained as constrained maximum likelihood estimates. The statistical derivation of the new method is provided in the generalised linear model framework. A related approach in the literature, combining unconstrained and constrained maximum likelihood estimates, is presented in the same framework and compared theoretically. A data illustration is described using a motor portfolio from a Greek insurer.

**Keywords:** chain ladder; Bornhuetter–Ferguson; maximum likelihood; exponential families; canonical parameters; prior knowledge

#### **1. Introduction**

While high dimensional data and its validation is important for many machine learning experiments, it is also true that many machine learning applications combine mathematical statistical methods with prior knowledge. The difficulty is to include this prior knowledge to upgrade the statistical analysis without violating the fundamental principles of mathematical statistics. These kinds of applications are omnipresent in insurance reserving, which often cite the original paper of Bornhuetter and Ferguson from 1972. This combination of prior knowledge and mathematical statistics is the purpose of this paper, where we are able to make it while sticking to the classical maximum likelihood technique of mathematical statistics.

The chain ladder method is the basic actuarial tool for reserving in general insurance. This method is based on the paid run-off triangle and provides estimates for the ultimate reserve along with development factors that are used for determining cash flow. In practice, the actuary usually adjusts the ultimates using additionally available information. With the Bornhuetter et al. (1972) method the chain ladder ultimates are adjusted using prior knowledge while the adjusted cash flow is proportional to the original chain ladder cash flow. Mack (2000) gave a credibility interpretation of the Bornhuetter–Ferguson method.

The adjustment of the ultimates can be done in two ways. Either by correcting the levels of the ultimates or the relative levels of the ultimates. By this, we distinguish between the situation where the actuary has an estimate for the ultimate for a given policy year and the situation where the actuary is more comfortable with the forecast that the ultimate for a given policy year is 10% higher, say, than in the previous year. Such an estimate could, for instance, come from chain ladder analysis of incurred data. Indeed, we provide an empirical illustration where this is the case. The levels approach is most common in the literature; see for instance (Mack 2000, 2006), Taylor (2000), Verrall (2004), Wüthrich and Merz (2008) and Heberle and Thomas (2016). The relative levels approach is more recent; see (Martínez-Miranda et al. 2013, 2015).

There are potentially two concerns with the traditional Bornhuetter–Ferguson correction. It may move the reserves too much, and the cash flow distribution is not adjusted in light of the external information. Verrall (2004) addressed this in a Bayesian setup and Mack (2006) proposed an alternative approach where new weights are computed by combining actual payments and the externally estimated reserves.

Our proposal is related to that of Mack (2006), but with weights derived from a likelihood function. Adjusting relative ultimates as opposed to level ultimates is natural when working with the likelihood function in the same way as traditional chain ladder development factors are concerned with relative effects. A feature of our approach is, therefore, that external information is linked directly to the parameters of the underlying Poisson model and it is possible to express the Bornhuetter–Ferguson adjustment in terms of adjustments to the development factors. Another feature of this approach is that we can evaluate how much the adjustment moves the reserves and establish inequalities relating our approach and the traditional Bornhuetter–Ferguson adjustments.

A fundamental interpretation of the Bornhuetter–Ferguson method arises when combining chain ladder with credibility formulas. Credibility formulas have been investigated in reserving by, for instance, de Vylder (1982), Mack (2000). and more recently, Bühlmann and Moriconi (2015). We have been particularly influenced by Mack (2000), who gives a credibility formula showing that adjusting the ultimates with prior knowledge yields a partial adjustment of the reserves. He then continues to show that the iterations of the credibility formula leads to the Benktander (1976) approach. These ideas are taken a step further by Gigante et al. (2013), whereas Taylor (2000) and Wüthrich and Merz (2008) give general overviews of the Bornhuetter–Ferguson method. Our first contribution is to show that the credibility formula also applies when adjusting the relative levels of the ultimates.

It is useful to recall that the chain ladder method has the nice interpretation as maximum likelihood in a Poisson model. Kremer (1985) (see also Mack 1991) showed that the chain ladder forecasts are maximum likelihood. These forecasts are the product of observed accident year row sums and functions of the development factors; see (4). Renshaw and Verrall (1998) showed that the development factors themselves are maximum likelihood estimators in a conditional Poisson model conditioning on row sums, while Kuang et al. (2009) showed that they are also maximum likelihod in the unconditional Poisson model. The maximum likelihood result means that it is possible to compute the chain ladder estimates using generalised linear model methods. In practice the Poisson assumption is not realistic as the paid data typically have considerable over-dispersion; see for instance England and Verrall (2002). Nonetheless, the chain ladder method provides good reserve estimates that are, at least, anchored in a quasi-likelihood.

The main idea of our approach is to impose the externally estimated relative ultimates on the Poisson likelihood. Initially, it is useful to work with the standard parametrisation of the generalised linear model as opposed to the development factors. We can then formulate the relative ultimates' constraint as a linear constraint on the parameters and derive maximum likelihood estimators. Subsequently, we translate these estimators into adjusted development factors.

The constrained maximum likelihood approach satisfies a monotonicity result. If, for instance, all the relative ultimates are increased relative to the chain ladder ultimates, then it follows that the reserves are increased. However, these new reserves increase less than the traditional Bornhuetter–Ferguson reserves that would arise by combining the adjusted relative ultimates with the chain ladder development factors.

In this paper we focus on classical mathematical statistics through the maximum likelihood method. Recent work in reserving has emerged in the literature using modern machine learning techniques. Kuo (2019) proposes deep neural networks to joint modelling paid data and total claims outstanding, claiming that no manual input is required during model updates or forecasting. Additionally, using neural networks Gabrielli and Wüthrich (2018) develop a "stochastic simulation

machine" that generates individual claims histories of non-life insurance claims. Another individual claims approach but based on the classification and regression trees is suggested by De Felice and Moriconi (2019). Chukhrova and Johannssen (2017) describe a state space model for cumulative payments (that extend the chain ladder method) in combination with the Kalman filter.

The rest of the paper is organised as follows. In Section 2 we first describe the chain ladder forecasts in terms of the development factors and certain weights. Using this formulation we describe two Bornhuetter–Ferguson approaches in the literature: the interpretation offered by Mack (2000) that uses levels of ultimates, and the approach of Martínez-Miranda et al. (2013) using relative ultimates. From these two approaches the future cash flow is not influenced by the external information. We then present our proposed Bornhuetter–Ferguson reserves which are determined by a Poisson likelihood constrained by the external information. The formal derivation of our proposal is provided in Section 3 in the generalised, lineal model framework. Our proposal is derived as the solution of a constrained maximum likelihood approach, where the constraint is given by the imposed external information. Later, in Section 3.5 we show that the approach by Martínez-Miranda et al. (2013) is a mixed approach which combines unconstrained and constrained maximum likelihood estimators. Reserve forecasts and cash flow are described from our proposal, the mixed approach and the traditional chain ladder method. Only our proposed cash flow is affect by the external information, which is explicitly shown in terms of some pseudo development factors introduced in Section 3.6. In Section 4 we illustrate our proposal using a motor portfolio from a Greek insurer. These data include both paid and incurred triangles. In addition, an external estimate of the reserve is available so that this example nicely illustrates the practical issues that lead to the use of the Bornhuetter–Ferguson method. Conclusions and final remarks are provided in Section 5.

#### **2. The Bornhuetter–Ferguson Problem**

We present two standard Bornhuetter–Ferguson approaches. For now we will not formulate a statistical model, but just use the standard chain ladder formulas.

#### *2.1. Data Structure*

Consider a standard run-off triangle of paid amounts. The dimension is denoted *k* and we use the incremental form of the triangle. Each entry is denoted *Yij* so that *i* is the accident year index and *j* is the development year index. The indices vary in the upper triangle with indices 1 ≤ *i*, *j* ≤ *k* and *i* + *j* − 1 ≤ *k*. This is the area *I* in Figure 1. The objective is to forecast values of *Yij* in the lower triangle with indices 1 ≤ *i*, *j* ≤ *k* and *k* + 1 ≤ *i* + *j* − 1 ≤ 2*k* − 1. This is the area *J* in Figure 1.

In the analysis we will be interested in row sums, column sums and rectangular sums

$$R\_i = \sum\_{j=1}^{k+1-i} Y\_{ij\prime} \qquad \mathbb{C}\_j = \sum\_{i=1}^{k+1-j} Y\_{ij\prime} \qquad S\_r = \sum\_{\ell=1}^r \sum\_{j=1}^{k-r} Y\_{\ell j\prime} \tag{1}$$

for 1 ≤ *i*, *j* ≤ *k* and 1 ≤ *r* ≤ *k* − 1. In practice the payments *Yij* may be negative. This is at odds with the Poisson model interpretation of the chain ladder method which requires the payments to be non-negative. With the Poisson formulation we need the further requirement that the rectangular sums, the row sums and the column sums are positive Kuang et al. (2009) (Theorem 2); that is,

$$\mathcal{S}\_1, \dots, \mathcal{S}\_{k-1}, \mathcal{R}\_2, \dots, \mathcal{R}\_k, \mathcal{C}\_2, \dots, \mathcal{C}\_k > 0. \tag{2}$$

We will assume these constraints throughout the paper, while noting that constraints in (2) may be satisfied even if some payments *Yij* are negative.

**Figure 1.** Illustration of data layout.

#### *2.2. Chain Ladder Method*

The chain ladder method is computed from row sums or cumulative payments *Ri* as in (1) and development factors

$$F\_{\vec{\gamma}} = \frac{\sum\_{i=1}^{k+1-j} \sum\_{\ell=1}^{j} \mathbf{Y}\_{i\ell}}{\sum\_{i=1}^{k+1-j} \sum\_{\ell=1}^{j-1} \mathbf{Y}\_{i\ell}} \quad \text{for } j = 2, \dots, k. \tag{3}$$

The development factors are larger than one under the constraint (2) to *Ri*, *Cj*, *Sr*; see Theorem 2 in Kuang et al. (2009). The chain ladder forecasts of the amounts in the lower triangle are then

$$\widetilde{Y}\_{i\bar{j}} = R\_i (F\_{\bar{j}} - 1) \prod\_{\ell=k+2-i}^{\bar{j}-1} F\_{\ell}. \tag{4}$$

From this we compute the reserve for accident year *i*, for *i* = 2, . . . , *k*, as

$$V\_i = \sum\_{j=k+2-i}^{k} \widetilde{Y}\_{ij} = R\_i(F\_i^{prod} - 1) \qquad \text{where} \qquad F\_i^{prod} = \prod\_{\ell=k+2-i}^{k} F\_{\ell \prime} \tag{5}$$

and the predicted ultimate payment as; see also Section 2.1.3 of England and Verrall (2002),

$$\text{all}\_{i} = R\_{i} + V\_{i} = R\_{i}F\_{i}^{prod} \quad \text{for } i = 2, \ldots, k. \tag{6}$$

If we use the convention that empty products are unity, this matches with *U*<sup>1</sup> = *R*<sup>1</sup> and *V*<sup>1</sup> = 0, so that the in-sample prediction of the sum of the payments for accident year one equals the observation.

It will be convenient to express the above formulas in terms of certain weights. Thus, define weights, for *i* = 2, . . . , *k*, *j* = *k* + 2 − *i*,..., *k*,

$$\mathcal{W}\_{ij} \quad = \quad (F\_j - 1) \frac{\prod\_{\ell=k+2-i}^{j-1} F\_\ell}{F\_i^{\text{prod}}} = \frac{F\_j - 1}{\prod\_{\ell=j}^k F\_\ell} \, \tag{7}$$

$$\mathcal{W}\_i \quad = \frac{F\_i^{prod} - 1}{F\_i^{prod}} = \sum\_{j=k+2-i}^k \mathcal{W}\_{ij}. \tag{8}$$

These are numbers between zero and unity when the development factors are larger than unity. The weights *Wi* approach unity when the product of the development factors approaches infinity. We can then write the forecasts for each cell and each row in the lower triangle as

$$
\widetilde{Y}\_{i\mathbf{j}} = \mathcal{U}\_{\mathbf{i}} \mathcal{W}\_{i\mathbf{j}\prime} \qquad V\_{\mathbf{i}} = \mathcal{U}\_{\mathbf{i}} \mathcal{W}\_{i\prime} \qquad \mathbf{i} = \mathbf{2}, \dots, k. \tag{9}
$$

These formulas show how the reserve *Vi* can be found as a fraction of the predicted ultimate *Ui*, while *Yij* indicates how the cash flow is distributed.

The chain ladder is maximum likelihood in a Poisson model that will be presented in Section 3. A feature of the likelihood function (25) is that it is symmetrical in the indices for accident year *i* and development year *j*. This observation leads to a new expression for the forecast of the reserve, which will be proved in the Appendix A. Traditionally, we forecast by computing row sums *Ri* of the data and multiplying by the column wise forward factors *Fj* as in (4). Alternatively, we can compute columns sums *Cj* as in (1) and row-wise forward factors

$$\mathbf{G}\_{i} = \frac{\sum\_{j=1}^{k+1-i} \sum\_{\ell=1}^{i} \mathbf{Y}\_{\ell j}}{\sum\_{j=1}^{k+1-i} \sum\_{\ell=1}^{i-1} \mathbf{Y}\_{\ell j}} \quad \text{for } i = 2, \dots, k \tag{10}$$

and combine these to get the forecasts for the lower triangle, proved in the Appendix A,

$$\widetilde{Y}\_{ij} = R\_i(F\_j - 1) \prod\_{\ell=k+2-i}^{j-1} F\_\ell = C\_j(G\_i - 1) \prod\_{\ell=k+2-j}^{i-1} G\_\ell. \tag{11}$$

#### *2.3. Bornhuetter–Ferguson Using Levels of Ultimates*

This section presents the Bornhuetter–Ferguson interpretation offered by Mack (2000); see also England and Verrall (2002) and Alai et al. (2009).

England and Verrall present the Bornhuetter–Ferguson idea as follows. Suppose we replace the chain ladder ultimate *Ui* by an externally estimated reserve *Ulevel <sup>i</sup>* in the Formula (9). Then we get the level-based Bornhuetter–Ferguson reserve

$$V\_i^{BF, level} = \mathcal{U}\_i^{level} \mathcal{W}\_i \quad \text{for } i = 2, \dots, k. \tag{12}$$

Thus, the Bornhuetter–Ferguson reserve is the proportion *Wi* of the externally estimated level of the ultimate. In a similar fashion the Bornhuetter–Ferguson cash flow is given by

$$
\hat{Y}\_{ij}^{BF, level} = \mathcal{U}\_i^{level} \mathcal{W}\_{ij} \quad \text{for } i = 2, \dots, k, j = k + 2 - i, \dots, k. \tag{13}
$$

The predicted ultimate payout turns out to be a convex combination of the chain ladder reserve *Ui* and the externally generated number *Ulevel <sup>i</sup>* . To see this, use the Formulas (6) and (8) to write the cumulated payments as *Ri* <sup>=</sup> *Ui*/*Fprod <sup>i</sup>* = *Ui*(1 − *Wi*). It then follows that

$$\mathcal{U}\_{i}^{BF, level} = R\_i + V\_i^{BF, level} = \mathcal{U}\_{i}(1 - \mathcal{W}\_{i}) + \mathcal{U}\_{i}^{level}\mathcal{W}\_{i}.\tag{14}$$

Mack (2000) refers to this a credibility formula and traces it back to Benktander (1976). He points out that it can be iterated by replacing *Ulevel <sup>i</sup>* by *<sup>U</sup>BF*,*level <sup>i</sup>* . Another consequence is the following ordering, assuming 0 < *Wi* < 1,

$$
\mathcal{U}I\_i < \mathcal{U}I\_i^{\text{level}} \qquad \Rightarrow \qquad \mathcal{U}I\_i < \mathcal{U}\_i^{BF, \text{level}} < \mathcal{U}\_i^{\text{level}}.\tag{15}
$$

#### *2.4. Bornhuetter–Ferguson Using Relative Ultimates*

This section presents the Bornhuetter–Ferguson approach of Martínez-Miranda et al. (2013). The idea is now to replace the relative ultimates rather than levels of ultimates. We then rewrite (9) as

$$W\_i = R\_1 \frac{lI\_i}{lI\_1} W\_i \quad \text{for } i = 2, \dots, k\_\prime \tag{16}$$

recalling that *U*<sup>1</sup> = *R*1. We now replace *Ui*/*U*<sup>1</sup> by some external measure *Urel <sup>i</sup>* /*Urel* <sup>1</sup> , which only provides information about the relative ultimates, such as the figure for year *i* being 10% higher than that for year *i* − 1. This results in the relative level-based Bornhuetter–Ferguson reserve

$$V\_i^{BF,rel} = R\_1 \frac{U\_i^{rel}}{U\_1^{rel}} W\_i \quad \text{for } i = 2, \dots, k. \tag{17}$$

The corresponding cash flow is then

$$
\hat{Y}\_{ij}^{BF,rel} = R\_1 \frac{\mathcal{U}\_i^{rel}}{\mathcal{U}\_1^{rel}} \mathcal{W}\_{ij} \quad \text{for } i = 2, \dots, k, j = k + 2 - i, \dots, k. \tag{18}
$$

The relative Bornhuetter–Ferguson reserve also satisfies an actuarial credibility formula. To see this define *U*<sup>1</sup> = *R*1, write *Ri* = *R*1(*Ri*/*U*1) and combine it with *Ri* = *Ui*(1 − *Wi*) as before, to get

$$\mathcal{U}\_{i}^{BF,rel} = \mathcal{R}\_{i} + V\_{i}^{BF,rel} = \mathcal{R}\_{1} \left\{ \frac{\mathcal{U}\_{i}}{\mathcal{U}\_{1}} (1 - \mathcal{W}\_{i}) + \frac{\mathcal{U}\_{i}^{rel}}{\mathcal{U}\_{1}^{rel}} \mathcal{W}\_{i} \right\}. \tag{19}$$

Once again, we have the ordering, for *i* = 2, . . . *k* and assuming 0 < *Wi* < 1,

$$\frac{\text{U}I\_i}{\text{U}I\_1} < \frac{\text{U}I\_i^{\text{rel}}}{\text{U}I\_1^{\text{rel}}} \qquad \Rightarrow \qquad \text{U}\_i < \text{U}I\_i^{\text{BF,rel}}.\tag{20}$$

Martínez-Miranda et al. (2013) suggested that the relative external numbers could be computed from an incurred triangle. They extended this further to allow for reporting delays using a double chain ladder method. However, in the present paper we focus on the consequences of a Bornhuetter–Ferguson correction rather than how the external numbers are generated.

#### *2.5. Proposed Bornhuetter–Ferguson Reserves*

With the above approaches the future cash flow is determined by the chain ladder method through the weights *Wij* and not influenced by the external information. As argued by Verrall (2004) and Mack (2006) it may be desirable that the cash flow is also influenced by the external information. Our proposal allows the cash flow to be determined by a Poisson likelihood, constrained by the external information. Before we give the derivation it is useful to give a brief overview of the results.

The proposed Bornhuetter–Ferguson approach evolves around the chain ladder reserving Formula (11) involving column sums *Cj* and row-wise forward factors *Gi*. Suppose we have externally given relative ultimates *Urel <sup>i</sup>* /*Urel* <sup>1</sup> for *<sup>i</sup>* = 2, ... , *<sup>k</sup>*, with the convention that *<sup>U</sup>rel <sup>i</sup>* /*Urel* <sup>1</sup> = 1 for *i* = 1. We then construct Bornhuetter–Ferguson row-wise forward factors

$$\Gamma\_i^{rel} = \frac{\sum\_{\ell=1}^i \left( \mathcal{U}\_{\ell}^{rel} / \mathcal{U}\_1^{rel} \right)}{\sum\_{\ell=1}^{i-1} \left( \mathcal{U}\_{\ell}^{rel} / \mathcal{U}\_1^{rel} \right)} \qquad \text{for } i = 2, \dots, k. \tag{21}$$

In Section 3.3 we show that Γ*rel <sup>i</sup>* naturally comes from a constrained likelihood. For now, the intuition is that the traditional development factors *Fj*, and *Gi*, are relative effects computed as ratios of sums over data rectangles of different sizes. This compensates for the fact that the data are only available in triangular form, so that column and row lengths are unbalanced. Once we impose the relative ultimates, which are relative row effects, then the unbalanced row lengths are essentially eliminated and we can capture relative row effects in a simpler fashion, as shown in (21).

The Bornhuetter–Ferguson forecasts of individual payments and of reserves are, then,

$$\tilde{Y}\_{\hat{i}\hat{j}} = \mathbb{C}\_{\hat{i}}(\Gamma\_{i}^{rel} - 1) \prod\_{\ell=k+2-j}^{\hat{i}-1} \Gamma\_{\ell}^{rel}, \qquad V\_{\hat{i}}^{rel} = \sum\_{j=k+2-i}^{k} \tilde{Y}\_{\hat{i}\hat{j}}.\tag{22}$$

#### **3. Generalised Linear Model Framework**

We present a Generalised Linear Model framework for Bornhuetter–Ferguson analysis. The usual chain ladder estimators are maximum likelihood in a Poisson model; see Kremer (1985). In practice, reserving data have considerable over-dispersion; see England and Verrall (2002), so that Poisson likelihood becomes a quasi likelihood. In the present paper this distinction is not so important as we will only be concerned with point forecasts. Now, if we maximise the likelihood while imposing constraints from external relative levels of ultimates, we get a closed form cash flow forecast that adapts to both data and the imposed constraints.

Next we describe the unconstrained and constrained Poisson likelihood. The first one provides the chain ladder forecasts without external information in Section 3.2, and the second one provides our proposed forecasts in Section 3.3. The approach of Martínez-Miranda et al. (2013) is shown in Section 3.5 as a mixed approach that combines constrained and unconstrained maximum likelihood estimates. Our proposed forecasts have an equivalent expression involving new column-wise development factors provided in Section 3.6. These development factors will be different for different accident years. For this reason we refer to them as pseudo development factors. This kind of formulation is also possible for the mixed approach, but with the standard chain ladder development factors, as we show in Section 3.7. A monotonicity result provided in Section 3.8 gives some insight about the effect that the imposed external information has on our proposed forecasts and those from the mixed approach.

#### *3.1. Statistical Model*

We assume that the incremental observations *Yij* are independent Poissons with log expectation E*Yij* = exp(*μij*), where the predictor is given by

$$
\mu\_{ij} = \alpha\_i + \beta\_j + \delta. \tag{23}
$$

Here *α<sup>i</sup>* is the level of the accident year effect, *β<sup>j</sup>* is the level of the development year effect and *δ* is an overall level. The parametrisation presented in (23) does not identify the distribution, so we switch to the invariant parametrisation of Kuang et al. (2009); that is,

$$
\mu\_{ij} = \mu\_{11} + \sum\_{\ell=2}^{i} \Delta \alpha\_{\ell} + \sum\_{\ell=2}^{j} \Delta \beta\_{\ell \prime} \tag{24}
$$

with the convention that empty sums are zero. Here Δ*α<sup>i</sup>* = *α<sup>i</sup>* − *αi*−<sup>1</sup> is the relative accident year effect and Δ*β<sup>j</sup>* = *β<sup>j</sup>* − *βj*−<sup>1</sup> is the relative development year effect, while the overall level is determined by *μ*11. The Poisson log likelihood function is

$$\ell(\mu\_{11}, \Delta \alpha\_i, \Delta \beta\_j) = \sum\_{1 \le i, j, i+j-1 \le k} \{\mu\_{i\}} Y\_{i\nkern-1.7mu\text{-}\mu} - \exp(\mu\_{i\nkern-1.7mu\text{-}\mu}) - \log(Y\_{i\nkern-1.7mu\text{-}\mu})\}.\tag{25}$$

This is a regular exponential family with canonical parameters *μ*11, Δ*αi*, Δ*βj*.

#### *3.2. The Chain Ladder*

The chain ladder arises by maximising the unconstrained likelihood. Theorem 3 in Kuang et al. (2009) shows that the maximum likelihood estimators are

$$
\Delta \widehat{a}\_i = \Delta \log R\_i + \log F\_{k+2-i} \quad \text{for } i = 2, \dots, k,\tag{26}
$$

$$
\Delta \hat{\boldsymbol{\beta}}\_{j} = \quad \Delta \log \mathbf{C}\_{j} + \log \mathbf{G}\_{k+2-j} \quad \text{for } j = 2, \dots, k,\tag{27}
$$

$$
\hat{\mu}\_{11} \quad = \quad \log R\_1 - \sum\_{j=2}^{k} \log F\_j. \tag{28}
$$

When inserting these estimators into Equation (23) we get estimators *<sup>μ</sup>ij*. In turn, the relative ultimates are estimated by

$$\frac{\mathcal{U}\_{i}}{\mathcal{U}\_{1}} = \frac{\sum\_{j=1}^{k} \exp(\hat{\mu}\_{ij})}{\sum\_{j=1}^{k} \exp(\hat{\mu}\_{1j})} = \exp(\sum\_{\ell=2}^{i} \Delta \hat{a}\_{\ell}) \quad \text{for } i = 2, \dots, k,\tag{29}$$

which are the relative ultimates entering in Equation (16). It is convenient to define *U*<sup>1</sup> = *R*1, as this says that the ultimate for first accident year equals the claims observed. With this definition, we find that *R*<sup>1</sup> = *U*<sup>1</sup> is the maximum likelihood estimator for the expected ultimates E*R*<sup>1</sup> for the first accident year. In turn, the maximum likelihood estimators for the ultimate levels satisfy

$$\mathcal{U}\_{\hat{i}} = \mathcal{U}\_{1} \frac{\mathcal{U}\_{\hat{i}}}{\mathcal{U}\_{1}} = \mathcal{U}\_{1} \exp(\sum\_{\ell=2}^{i} \Delta \hat{a}\_{\ell}) \quad \text{for } i = 2, \ldots, k. \tag{30}$$

Now, insert the expression for <sup>Δ</sup>*α<sup>i</sup>* in (26) to get

$$\mathcal{U}\_{l} = \mathcal{U}\_{1} \prod\_{\ell=2}^{i} (\frac{R\_{\ell}}{R\_{\ell-1}} F\_{k+2-\ell}) = \mathcal{U}\_{1} \frac{R\_{i}}{R\_{1}} \prod\_{\ell=2}^{i} F\_{k+2-\ell} = R\_{i} \prod\_{\ell=k+2-i}^{k} F\_{\ell \prime} \tag{31}$$

which are the ultimates in (6). Thus, in both cases the ultimate formulas are closely linked to the estimated relative accident year effects <sup>Δ</sup>*αi*.

An additional result from Theorem 3 in Kuang et al. (2009) is that the forward factors *Fj* and *Gi* can be viewed as maximum likelihood estimators for certain combinations of the canonical parameters Δ*β<sup>j</sup>* and Δ*αi*, respectively. These combinations are, for *i*, *j* = 2, . . . , *k*,

$$\Phi\_{\hat{l}} = \frac{\sum\_{\ell=1}^{j} \exp\left(\sum\_{h=2}^{\ell} \Delta \beta\_{h}\right)}{\sum\_{\ell=1}^{j-1} \exp\left(\sum\_{h=2}^{\ell} \Delta \beta\_{h}\right)}, \qquad \Gamma\_{\hat{l}} = \frac{\sum\_{\ell=1}^{j} \exp\left(\sum\_{h=2}^{\ell} \Delta a\_{h}\right)}{\sum\_{\ell=1}^{j-1} \exp\left(\sum\_{h=2}^{\ell} \Delta a\_{h}\right)},\tag{32}$$

with the convention that empty sums are zero. The development factors are the corresponding maximum likelihood estimators; that is, *Fj* = Φ *<sup>j</sup>* and *Gi* = Γ*i*.

#### *3.3. Imposing External Information on the Relative Ultimates*

Suppose some external values are available for the relative ultimates, *Urel <sup>i</sup>* /*Urel* 1 . Equivalently, we have external values for the relative accident year effects Δ*α*† *<sup>i</sup>* ; that is,

$$
\Delta a\_i^\dagger = \log \left( \mathcal{U}\_i^{rel} / \mathcal{U}\_{i-1}^{rel} \right). \tag{33}
$$

We could impose these as a constraint on the likelihood (25). The constraint is linear and the likelihood remains that of a regular exponential family.

The constrained maximum likelihood estimators have a simple analytical form. In line with the parameters Γ*<sup>i</sup>* defined in (32), define

$$\Gamma\_i^\dagger = \frac{\sum\_{\ell=1}^i \exp\left(\sum\_{l=2}^\ell \Delta a\_{l\bar{l}}^\dagger\right)}{\sum\_{\ell=1}^{i-1} \exp\left(\sum\_{l=2}^\ell \Delta a\_{l\bar{l}}^\dagger\right)}.\tag{34}$$

We then have the following result, which is proven in the Appendix A.

**Theorem 1.** *Consider the Poisson likelihood (25) with known* Δ*α<sup>i</sup>* = Δ*α*† *<sup>i</sup> for <sup>i</sup>* = 2, ... , *<sup>k</sup> and define* <sup>Γ</sup>† *<sup>i</sup> as (32), computed using* Δ*α*† *<sup>i</sup> . The constrained maximum likelihood estimator is unique if and only if Cj* > 0 *for all j* = 1, . . . *k and given by*

$$
\Delta \dot{\beta}\_j^\dagger = \Delta \log \mathbb{C}\_j + \log \Gamma\_{k+2-j}^\dagger \tag{35}
\\
\tag{35}
$$

$$
\widehat{\mu}\_{11}^{\dagger} = \log \mathbb{C}\_1 - \log \{ 1 + \sum\_{i=2}^{k} \exp(\sum\_{\ell=2}^{i} \Delta a\_{\ell}^{\dagger}) \} = \log \mathbb{C}\_1 - \sum\_{\ell=2}^{k} \log \Gamma\_{\ell}^{\dagger}.\tag{36}
$$

As a consequence, the out-of-sample forecast from the constrained chain ladder has a simple explicit form, as shown in the following result, which is proven in the Appendix A. The result resembles the forecast in the unrestricted chain ladder computed from column sums and row-wise development factors as described in (11).

**Theorem 2.** *Consider the setup in Theorem 1. Point forecasts for the lower triangle are given by*

$$\widetilde{Y}\_{ij}^{\dagger} = \mathbb{C}\_{\dot{\jmath}}(\Gamma\_i^{\dagger} - 1) \prod\_{\ell=k+2-j}^{i-1} \Gamma\_\ell^{\dagger}. \tag{37}$$

We can now compute a Bornhuetter–Ferguson reserve based on Theorem 2. For each accident year we get

$$V\_i^\dagger = \sum\_{j=k+2-i}^k \check{Y}\_{ij}^\dagger. \tag{38}$$

In the case where we impose external relative ultimates, the above expressions reduce to those presented previously in (22). In the above expression the notation reflects that the external information is concerned with the relative accident year parameters Δ*α*† *<sup>i</sup>* . Now, suppose the relative ultimates *Urel <sup>i</sup>* /*Urel* <sup>1</sup> are taken as given. We then apply the Formula (30) to get cumulated relative accident parameters exp(∑*<sup>i</sup>* -<sup>=</sup><sup>2</sup> <sup>Δ</sup>*α*† - ) = *Urel <sup>i</sup>* /*Urel* <sup>1</sup> . Inserting this in the expression (34) for <sup>Γ</sup>† *<sup>i</sup>* in (34) gives

$$\Gamma\_i^\dagger = \frac{\sum\_{\ell=1}^i \exp\left(\sum\_{h=2}^\ell \Delta a\_h^\dagger\right)}{\sum\_{\ell=1}^{i-1} \exp\left(\sum\_{h=2}^\ell \Delta a\_h^\dagger\right)} = \frac{\sum\_{\ell=1}^i \wr I\_\ell^{rel} / \wr I\_1^{rel}}{\sum\_{\ell=1}^{i-1} \wr I\_\ell^{rel} / \wr I\_1^{rel}} = \Gamma\_i^{rel},\tag{39}$$

which is the expression for Γ*rel <sup>i</sup>* in (21). Since <sup>Γ</sup>† *<sup>i</sup>* = <sup>Γ</sup>*rel <sup>i</sup>* we see that the point forecast *<sup>Y</sup>*† *ij* in (37) equals the point forecast *<sup>Y</sup>ij* in (22). In turn the reserve *<sup>V</sup>*† *<sup>i</sup>* in (38) equals the reserve *<sup>V</sup>rel <sup>i</sup>* in (22).

#### *3.4. Implementation in GLM Software*

The constrained model can also be estimated using ready-made algorithms for generalised linear models. The analysis presented above shows that the constrained model is a regular exponential family so the algorithms should perform well.

For the implementation we organise the triangle *Y* as a vector **Y**, say, of dimension *k*(*k* + 1)/2. A design matrix **X** can be constructed from the formula (24). It has dimension {*k*(*k* + 1)/2} × (2*k* − 1) and the row corresponding to entry *i*, *j* is given by

$$X'\_{ij} = \{1, 1\_{(2^{\underline{\varsigma}} \stackrel{\leftarrow}{=})}, \dots, 1\_{(k \stackrel{\leftarrow}{=})}, 1\_{(2^{\underline{\varsigma}} \stackrel{\leftarrow}{=})}, \dots, 1\_{(k \stackrel{\leftarrow}{=})}\},\tag{40}$$

where the indicator function 1(*m*≤*i*) takes the value unity if *m* ≤ *i* and zero otherwise. The unrestricted model is then estimated through a generalised linear model regression of **Y** on **X** using the Poisson distribution with a log-link function.

In the constrained model the parameters *θknown* = (Δ*α*† <sup>2</sup>, ... , <sup>Δ</sup>*α*† *<sup>k</sup>* ) are known. Deleting the corresponding columns from **X** gives a design matrix **X***reduced* with *k* columns. The deleted columns are collected as **X***known*, say. The model is then estimated as a generalised linear model regression of **Y** on **X***reduced* using the Poisson distribution with a log-link function and offset given by **X***knownθknown*.

#### *3.5. A Mixed Approach*

By now we have two maximum likelihood approaches: the classical chain ladder and the restricted maximum likelihood approach derived above. These give different point forecasts for the lower triangle. A third type of point forecast arises from the Bornhuetter–Ferguson double chain ladder (BDCL) method in Martínez-Miranda et al. (2013). In the following it is shown how the three are connected.

Let us first summarise the results we obtained so far in terms of the log likelihood. In the classical chain ladder approach, we maximise the unrestricted likelihood in (25), which leads to the unrestricted estimator

$$\dot{\xi}^{\dot{\zeta}} = \max\_{\dot{\xi}} \ell(\dot{\xi}) = (\hat{\mu}\_{11}, \Delta \hat{a}\_i, \Delta \dot{\beta}\_j)'. \tag{41}$$

The restricted likelihood from Section 3.3 with restriction Δ*α<sup>i</sup>* = Δ*α*† *<sup>i</sup>* has a restricted likelihood maximum likelihood estimator given by

$$\hat{\xi}^{\dagger} = \max\_{\xi:\Delta a = \Delta a^{\dagger}} \ell(\xi) = (\hat{\mu}\_{11}^{\dagger}, \Delta a\_{i}^{\dagger}, \Delta \hat{\beta}\_{j}^{\dagger})^{\prime}. \tag{42}$$

Notice, that if Δ*α*† *<sup>i</sup>* <sup>=</sup> <sup>Δ</sup>*αi*, then *<sup>μ</sup>*† <sup>11</sup> <sup>=</sup> *<sup>μ</sup>*<sup>11</sup> and <sup>Δ</sup>*<sup>β</sup>* † *<sup>j</sup>* = Δ*β j*.

A third estimator is achieved by mixing the above estimators. This combines the unrestricted estimators for *μ*<sup>11</sup> and *β<sup>j</sup>* with the given Δ*α*† *<sup>i</sup>* , such that

$$
\hat{\xi}^{\dagger} = (\hat{\mu}\_{11}, \Delta a\_i^{\dagger}, \Delta \hat{\beta}\_j)'. \tag{43}
$$

In the following, parameters resulting from this mixed approach will be marked with the index "‡," just as parameters resulting from the constrained method will be marked with "†." The forecast for future payments computed from *ξ* ‡ is

$$\widehat{Y}\_{ij}^{\dagger} = \exp\left(\widehat{\mu}\_{11} + \sum\_{h=2}^{i} \Delta a\_{h}^{\dagger} + \sum\_{h=2}^{j} \Delta \widehat{\beta}\_{h}\right). \tag{44}$$

In the Appendix A we prove the identities

$$\hat{Y}\_{ij}^{\ddagger} = \hat{Y}\_{ij} \frac{\exp\left(\sum\_{h=2}^{i} \Delta a\_h^{\ddagger}\right)}{\exp\left(\sum\_{h=2}^{i} \Delta \hat{a}\_h\right)} = \hat{Y}\_{ij}^{\ddagger} \frac{\sum\_{\ell=2}^{k+1-j} \exp\left(\sum\_{h=2}^{\ell} \Delta a\_h^{\ddagger}\right)}{\sum\_{\ell=2}^{k+1-j} \exp\left(\sum\_{h=2}^{\ell} \Delta \hat{a}\_h\right)}.\tag{45}$$

In the case when the known accident parameters are derived by applying chain ladder on the incurred data, such that Δ*α*† *<sup>i</sup>* <sup>=</sup> <sup>Δ</sup>*αinc <sup>i</sup>* , this method gives exactly the same results as the Bornhuetter–Ferguson double chain ladder (BDCL) method in Martínez-Miranda et al. (2013).

The log likelihood function evaluated in the three points satisfies

$$\ell(\widehat{\xi}) \ge \ell(\widehat{\xi}^{\dagger}) \ge \ell(\widehat{\xi}^{\dagger}).$$

The first inequality holds since *ξ* is maximum likelihood, while *ξ* † is restricted maximum likelihood. The second inequality holds since *ξ* ‡ satisfies the restriction, but it is not maximum likelihood.

#### *3.6. Pseudo Development Factors*

It is common practice to think about the classical chain ladder method in terms of row sums *Ri* and column wise development factors *Fj* given in (1) and (3). For the restricted maximum likelihood approach there are no natural development factors in a maximum likelihood sense. Since development factors are important in daily actuarial work it is of interest to develop pseudo-development factors that keep the chain ladder pattern.

In the classical chain ladder, the forecasts for the lower triangle are computed using the Formula (11) by forwarding the row sums *Ri* using the factors *Fj*. However, in this classical setting the predicted value for the row sum equals the row sum. In the likelihood analysis, this stems from a likelihood equation of the type *Ri* = E(*Ri*); see Equation (20) in Kuang et al. (2009). Thus, we can also interpret the chain ladder forecast as forwarding the predicted row sums.

Once we have imposed external information on the relative ultimates, then the forecast changes and we break the link to the original row sums and development factors. We can, however, construct pseudo forecasts of the row sums and pseudo forward factors that satisfy a relationship like (11) but with the new forecasts.

Under the constraint that <sup>Δ</sup>*<sup>α</sup>* <sup>=</sup> <sup>Δ</sup>*α*† we compute estimates *<sup>μ</sup>*† <sup>11</sup> and Δ*β* † *<sup>j</sup>* using (35) and (36) in Theorem 1. From these we compute pseudo forward factors from (32); that is,

$$F\_j^\dagger = \frac{\sum\_{\ell=1}^j \exp\left(\sum\_{h=2}^\ell \Delta \hat{\beta}\_h^\dagger\right)}{\sum\_{\ell=1}^{j-1} \exp\left(\sum\_{h=2}^\ell \Delta \hat{\beta}\_h^\dagger\right)},\tag{46}$$

and a pseudo first row sum from (28) as

$$\log R\_1^\dagger = \hat{\mu}\_{11}^\dagger + \sum\_{j=2}^k \log F\_{j}^\dagger \,\tag{47}$$

and then the remaining pseudo row sums from (26) as

$$
\Delta \log R\_i^\dagger = \Delta a\_i^\dagger - \log F\_{k+2-i}^\dagger. \tag{48}
$$

We show in the Appendix A that the forecast from (37) can be computed as

$$
\tilde{Y}\_{ij}^{\dagger} = \mathbb{R}\_i^{\dagger} (F\_j^{\dagger} - 1) \prod\_{\ell=k+2-i}^{j-1} F\_\ell^{\dagger}. \tag{49}
$$

*Risks* **2019**, *7*, 119

The above formulas for predicted reserve and the cash flow can also be written in the credibility format we saw in (17) and (18). To see this introduce the weights

$$\mathcal{W}\_{ij}^{\dagger} = (F\_j^{\dagger} - 1) \frac{\prod\_{\ell=k+2-i}^{j-1} F\_\ell^{\dagger}}{F\_i^{prod \dagger}}, \qquad \mathcal{W}\_i^{\dagger} = \frac{F\_i^{prod \dagger} - 1}{F\_i^{prod \dagger}}.$$

where, as before, *Fprod*† *<sup>i</sup>* <sup>=</sup> <sup>∏</sup>*<sup>k</sup>* -<sup>=</sup>*k*+2−*<sup>i</sup> <sup>F</sup>*† -. Introducing the ultimates and relative ultimates

$$\boldsymbol{\mathcal{U}}\_{i}^{\dagger} = \boldsymbol{\mathcal{R}}\_{i}^{\dagger} \boldsymbol{F}\_{i}^{prod \dagger} \prime \qquad \frac{\boldsymbol{\mathcal{U}}\_{i}^{\dagger}}{\boldsymbol{\mathcal{U}}\_{i-1}^{\dagger}} = \frac{\boldsymbol{\mathcal{R}}\_{i}^{\dagger}}{\boldsymbol{\mathcal{R}}\_{i-1}^{\dagger}} \boldsymbol{F}\_{k+2-i}^{\dagger} = \exp(\boldsymbol{\Delta} \boldsymbol{a}\_{i}^{\dagger})$$

we can then write the predicted reserve and cash flow as

$$
\widetilde{Y}\_{ij}^{\dagger} = \mathcal{U}\_i^{\dagger} \mathcal{W}\_{ij\prime}^{\dagger} \qquad V\_i^{\dagger} = \mathcal{U}\_i^{\dagger} \mathcal{W}\_i^{\dagger} \dots
$$

#### *3.7. Chain Ladder Forecasts with the Mixed Approach*

In the mixed approach we follow a similar procedure to satisfy a relationship like (11) in order to obtain the new forecasts. The difference to the constrained method is that we can keep the forward factors from the unconstrained chain ladder model, *Fj*. However, we need to construct pseudo row sums *R*‡ *<sup>i</sup>* as follows.

We fix the pseudo first row sum as

$$\log R\_1^\ddagger = \log R\_{1\prime} \tag{50}$$

and then compute the remaining pseudo row sums from (26) as

$$
\Delta \log R\_i^\ddagger = \Delta a\_i^\ddagger - \log F\_{k+2-i}.\tag{51}
$$

We show in the Appendix A that the forecast from (44) can be computed as

$$
\hat{Y}\_{ij}^{\ddagger} = R\_i^{\ddagger} (F\_{\hat{\jmath}} - 1) \prod\_{\ell=k+2-i}^{j-1} F\_{\ell}. \tag{52}
$$

The forecast can be written in terms of weights, as before. Since the cash flow is derived from the chain ladder development factors, the weights are as defined in (7) and (8). In particular we have the ultimates and relative ultimates

$$
\mu \mathcal{U}\_i^\ddagger = \mathcal{R}\_i^\ddagger \mathcal{F}\_i^{prod}, \qquad \frac{\mathcal{U}\_i^\ddagger}{\mathcal{U}\_{i-1}^\ddagger} = \frac{\mathcal{R}\_i^\ddagger}{\mathcal{R}\_{i-1}^\ddagger} F\_{k+2-i} = \exp(\Delta a\_i^\dagger).
$$

We can then write the forecast of future payments and the cash flow as

$$
\tilde{Y}\_{ij}^{\ddagger} = \mathcal{U}\_i^{\ddagger} \mathcal{W}\_{ij\prime} \qquad V\_i^{\ddagger} = \mathcal{U}\_i^{\ddagger} \mathcal{W}\_i. \tag{53}
$$

#### *3.8. Monotonicity*

The idea of the Bornhuetter–Ferguson approach is to first compute the chain ladder, and then adjust it by imposing values for the ultimates. This is a quite complicated approach and it is not immediately clear what the effect is. However, when all adjustments are in the same direction it is actually possible to show a monotonicity result for the effect of the Bornhuetter–Ferguson adjustment.

Let us consider the case when the known accident parameters, Δ*α*† *<sup>i</sup>* , are bigger than the accident parameters we obtain from the chain ladder method on paid data, <sup>Δ</sup>*αi*. The following theorem, proved in the Appendix A, shows monotonicity results regarding the remaining parameters given in Theorem 1, and the resulting forecasts we obtain from the constrained approach in Section 3.3, *<sup>Y</sup>*† *ij*, and the mixed approach in Section 3.5, *<sup>Y</sup>*‡ *ij*.

**Theorem 3.** *Suppose* Δ*α*† *<sup>i</sup>* <sup>&</sup>gt; <sup>Δ</sup>*α<sup>i</sup> for all* <sup>2</sup> <sup>≤</sup> *<sup>i</sup>* <sup>≤</sup> *k. Then,*

$$(a) \quad \Gamma\_i^\dagger > G\_i \text{ for all } 2 \le i \le k;$$


To interpret this, suppose all imposed relative ultimates *Urel <sup>i</sup>* /*Urel <sup>i</sup>*−<sup>1</sup> <sup>=</sup> exp(Δ*α*† *<sup>i</sup>* ) are larger than the chain ladder forecasts of the relative ultimates *Ui*/*Ui*−<sup>1</sup> <sup>=</sup> exp(Δ*αi*). Suppose also that the imposed relative ultimates are taken from incurred estimates as in the Bornhuetter–Ferguson double chain ladder (BDCL) method in Martínez-Miranda et al. (2013). We then get that the point forecasts for the lower triangle are ordered so that the Bornhuetter–Ferguson double chain ladder forecast *<sup>Y</sup>*‡ *ij* is larger than the Bornhuetter–Ferguson-restricted maximum likelihood forecast *<sup>Y</sup>*† *ij*, which is larger than the chain ladder forecast *Yij*. This will be the situation in the empirical illustration in Section 4.

#### **4. Empirical Illustration**

We illustrate the new methods by an example where the external knowledge comes from incurred payments. In practice, the external knowledge may also come from incurred counts, from other business lines or from other sources.

We used data from a Greek non-life insurer for motor third party liability, aggregated over bodily injury and property damage. The data are presented as cumulative run-off triangles for accident years from 2005 to 2013. Table 1 shows payments, while Table 2 shows incurred amounts.

**Table 1.** Payments in Euros.


**Table 2.** Incurred amounts in Euros.


Table 3 shows parameter estimates for the paid data computed using the chain ladder and the Bornhuetter–Ferguson constrained model. For the moment we focus on the canonical parameters Δ*α<sup>i</sup>* for the relative accident year effect, Δ*β<sup>j</sup>* for the relative development year effect and *μ*<sup>11</sup> for the overall level. First, the chain ladder estimates are reported as <sup>Δ</sup>*αi*, <sup>Δ</sup>*<sup>β</sup> <sup>j</sup>* and <sup>Δ</sup>*μ*11. Second, for the constrained model we first applied chain ladder to the incurred data. The estimates for the relative accident year effect are reported as Δ*α*† *<sup>i</sup>* . The estimates Δ*β* † *<sup>j</sup>* and <sup>Δ</sup>*μ*† <sup>11</sup> were then computed from the paid data using Theorem 1. We note that the ordering Δ*α*† *<sup>i</sup>* <sup>&</sup>gt; <sup>Δ</sup>*α<sup>i</sup>* applies for these data for all *<sup>i</sup>* <sup>=</sup> 2, ..., *<sup>k</sup>* <sup>=</sup> 9. Thus, the monotonicity results from Theorem 3 apply. In particular, we see that Δ*β* † *<sup>j</sup>* > Δ*β <sup>j</sup>* for all *j* = 2, ..., *k* = 9 and *<sup>μ</sup>*† <sup>11</sup> <sup>&</sup>lt; *<sup>μ</sup>*<sup>11</sup> in Table 3.

**Table 3.** Estimates.


A third approach is to use the mixed approach outlined in Section 3.5. Here we use the external estimate Δ*α*† *<sup>i</sup>* for the relative accident year effects along with the chain ladder estimates Δ*β <sup>j</sup>* and <sup>Δ</sup>*μ*11. When the external estimate is based on the incurred data, as in here, this is the same as the Bornhuetter–Ferguson double chain ladder (BDCL) approach of Martínez-Miranda et al. (2013).

Table 4 presents the estimated (pseudo) forward factors and the (pseudo) row sums. For the chain ladder, we have the observed row sums *Ri* and the traditional forward factors *Fj* computed by (1) and (3). For the Bornhuetter–Ferguson constrained model we have the pseudo row sums *R*† *<sup>i</sup>* and the pseudo forward factors *F*† *<sup>j</sup>* computed by (46)–(48). For the mixed approach we have the pseudo row sums *R*‡ *<sup>i</sup>* computed by (50) and (51) and the traditional forward factors *Fj*. Once again we see that the monotonicity results from Theorem 3 apply so that *R*‡ *<sup>i</sup>* > *<sup>R</sup>*† *<sup>i</sup>* and *<sup>F</sup>*† *<sup>j</sup>* > *Fj*.


**Table 4.** Row sums and forward factors.

Table 5 shows the reserves resulting from the classical chain ladder method, ∑*<sup>k</sup> <sup>i</sup>*=<sup>2</sup> *Vi* from (5); the constrained approach, ∑*<sup>k</sup> <sup>i</sup>*=<sup>2</sup> *V*† *<sup>i</sup>* from (38); and the mixed approach, <sup>∑</sup>*<sup>k</sup> <sup>i</sup>*=<sup>2</sup> *<sup>V</sup>*‡ *<sup>i</sup>* from (53). We see that the ordering from Theorem 3 applies. For comparison we note that this portfolio was evaluated at 137 million by an external actuary, with the comment that this figure may be slightly too low. This valuation is based on the information that since 2009, the case reserves incurred were gradually increased, but the gap between incurred and paid reserves was not fully closed as of 2014. In light of this, the Bornhuetter–Ferguson constrained method appears to apply rather well in this situation.

**Table 5.** Reserves in million Euros.


#### **5. Conclusions**

The paper introduces a Bornhuetter–Ferguson approach that replaces the relative ultimates rather than levels of ultimates. This approach has been suggested in the Bornhuetter–Ferguson double chain ladder (BDCL) method in Martínez-Miranda et al. (2013). The traditional Bornhuetter–Ferguson method uses chain ladder weights, whereas we have estimated weights.

We made use of the fact that the chain ladder method has a nice interpretation as maximum likelihood in a Poisson model, and we formulated the relative ultimates constraint as a linear constraint on the parameters and derived maximum likelihood estimators. Furthermore, we followed this approach to reproduce the results of the BDCL method in a mixed approach, combining the constrained method with the classical chain ladder.

Monotonicity results compare the constrained method, the mixed approach and the original chain ladder results. An example illustrates the mentioned results with data from a Greek general insurer. The example shows that, when comparing all methods mentioned above, including chain ladder, the reserve given by the constrained method is in fact the closest estimate to the number given by an external expert.

Our proposal incorporates prior knowledge in a transparent way, keeping the standard principles of maximum likelihood and its well known mathematical properties. In this sense we recommend our approach over traditional Bornuetter-Ferguson adjustments as a formal statistical method for the same purpose, which keeps the simplicity and the intuition of traditional reserving. This is further shown in the convenient formulation of the forecasts in terms of the pseudo development factors provided above. Apart from this, one would also benefit from the practical advantages of using maximum likelihood that include standard inference and distribution forecasting. This cannot be done with such a level of formality in the classical approach, while for the BDCL method it has been done using intense bootstrap techniques. Another advantage of our approach is that it can, unlike the BDCL method, be applied using only one triangle, usually the payments triangle. On the other hand, this has the

disadvantage of not being able to distinguish between IBNR and RBNS reserves as the BDCL method does. Another limitation of our proposal is that it cannot handle negative cells, as it is sometimes the case in the payments triangle. Further refinements are required to deal with this problem.

An outstanding problem is to provide distribution forecasts of Bornhuetter–Ferguson adjusted reserves. In practice the data will have considerable over-dispersion. By modelling that we could complement the point forecasts with distribution forecasts. Recently, Harnau and Nielsen (2018) developed an asymptotic distribution theory for the chain ladder within an over-dispersed Poisson framework. The present situation is a special case of their setup so it could potentially be extended with Bornhuetter–Ferguson adjustments. That was beyond the scope of this paper though.

Finally, because there is a full statistical model specification incorporating prior knowledge, one could implement the same type of cash-flow data validation as in Agbeko et al. (2014), based on back-testing (see also De Felice and Moriconi 2019). However, this approach has several drawbacks, more so for small datasets. Controversy also exits about which error criteria should be considered. We did not consider empirical validation in this paper and focused on theoretical statistical properties when comparing reserving methods under the generalised linear models framework. A recent discussion on empirical validation methods in reserving can be found in Matinek (2019). These can be potentially used in the context of this paper.

**Author Contributions:** C.M. and B.N. developed the theory and R-code. V.E. provided the data and practical insights. M.D.M.-M. prepared the final version of the manuscript.

**Funding:** This research was funded by the programme for Economic Modelling, Oxford and European Research Council, grant AdG 694262, and the Spanish Ministry of Economy and Competitiveness, grant MTM2016-76969P, which includes support from the European Regional Development Fund (ERDF).

**Acknowledgments:** The authors thank three anonymous reviewers for helpful comments. Discussions with J.P. Nielsen are gratefully acknowledged.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **Appendix A. Proofs of Theorems**

**Proof of Equation (11).** Consider the Poisson model. The predictor is given in (24) and has the form *μij* = *μ*<sup>11</sup> + ∑*<sup>i</sup>* -<sup>=</sup><sup>2</sup> Δ*α*- + ∑*<sup>j</sup>* -<sup>=</sup><sup>2</sup> Δ*β*-, while the log likelihood is as in (25). This results in maximum likelihood estimators <sup>Δ</sup>*αi*, <sup>Δ</sup>*<sup>β</sup> <sup>j</sup>* and *<sup>μ</sup>*<sup>11</sup> presented in (26)–(28), and in turn, the development factor *Fj* is maximum likelihood estimator for Φ*<sup>j</sup>* given in (32). When combining these we get the chain ladder forecast *<sup>Y</sup>ij* <sup>=</sup> *Ri*(*Fj* <sup>−</sup> <sup>1</sup>) <sup>∏</sup>*j*−<sup>1</sup> -<sup>=</sup>*k*+2−*<sup>i</sup> <sup>F</sup>*in (11).

The derivations sketched above are symmetric in row and column. Suppose we transpose the data triangle by swapping rows and columns, so that *i*, *j* become *j*, *i* and *Ri*, *Cj* become *Cj*, *Ri*, while <sup>Δ</sup>*αi*, <sup>Δ</sup>*β<sup>j</sup>* become <sup>Δ</sup>*βj*, <sup>Δ</sup>*αi*. Correspondingly, <sup>Δ</sup>*αi*, <sup>Δ</sup>*<sup>β</sup> <sup>j</sup>* and *Gi*, *Fj* become Δ*β j*, <sup>Δ</sup>*α<sup>i</sup>* and *Fj*, *Gi*; and then, second expression for the chain ladder forecast arises; that is, *<sup>Y</sup>ij* <sup>=</sup> *Cj*(*Gi* <sup>−</sup> <sup>1</sup>) <sup>∏</sup>*i*−<sup>1</sup> -<sup>=</sup>*k*+2−*<sup>j</sup> <sup>G</sup>*-.

**Proof of Theorem 1.** *The likelihood.* When the Δ*α*† *<sup>i</sup>* s are known the likelihood is

$$\ell(\mu\_{11}, \Delta \beta) = \mu\_{11} \sum\_{j=1}^{k} \mathbb{C}\_{j} + \sum\_{j=2}^{k} \Delta \beta\_{j} \sum\_{\ell=j}^{k} \mathbb{C}\_{j} - \kappa(\mu\_{11}, \Delta \beta) + h(data), \ell$$

where *h* is a function of the data, not depending on the unknown parameters, while

$$\kappa(\mu\_{11}, \Delta \beta) = \sum\_{i=1}^{k} \sum\_{j=1}^{k+1-i} \exp(\mu\_{ij}) = \sum\_{i=1}^{k} \sum\_{j=1}^{k+1-i} \exp(\mu\_{11} + \sum\_{\ell=2}^{i} \Delta a\_{\ell}^{\dagger} + \sum\_{\ell=2}^{j} \Delta \beta\_{\ell}),$$

is the cumulant generating function. Empty sums are zero.

*Uniqueness of the estimator.* For a full exponential family the likelihood has a maximum if and only if the natural statistic is interior to its convex support, and then the maximum likelihood estimator

is unique Barndorff-Nielsen (1978) (Theorem 9.13). The natural statistic *T*† *<sup>k</sup>* = <sup>∑</sup>*i*,*j*∈I (*Yij*, *<sup>C</sup>*2, ... , *Ck*) arises through a bijective, linear mapping of (*C*1, ... , *Ck*) . Since *Yij* ≥ 0 by the Poisson assumption, *Cj* ≥ 0, with *Cj* = 0 as a possible outcome. Since *C*1, ... , *Ck* are based on unrelated observations, the interior of the convex support is given by the condition that *Cj* > 0 for all *j* = 1, . . . , *k*.

*Likelihood equations.* Since the exponential family is regular, the *k* likelihood equations are *T*† *<sup>k</sup>* = E*T*† *<sup>k</sup>* Barndorff-Nielsen (1978) (Corollary 9.6). Since <sup>∑</sup>*<sup>k</sup> <sup>i</sup>*=<sup>1</sup> <sup>∑</sup>*k*+1−*<sup>i</sup> <sup>j</sup>*=<sup>1</sup> *Yij* <sup>=</sup> <sup>∑</sup>*<sup>k</sup> <sup>j</sup>*=<sup>1</sup> *Cj*, this in turn implies the equations

$$\mathbf{C}\_{j} = \mathbb{E}\mathbf{C}\_{j}, \qquad \text{ for } j = 1, \ldots k. \tag{A1}$$

.

*Estimating the level*. The expression for *<sup>μ</sup>*† <sup>11</sup> arises from the first likelihood equation

$$\mathbb{C}\_1 = \mathbb{E}\mathbb{C}\_1 = \exp(\mu\_{11}) \sum\_{i=1}^k \exp(a\_i - a\_1),$$

since the parameters *<sup>α</sup><sup>i</sup>* <sup>−</sup> *<sup>α</sup>*<sup>1</sup> <sup>=</sup> <sup>∑</sup>*<sup>i</sup>* -<sup>=</sup><sup>2</sup> Δ*α*are known.

*Estimating the development parameters*. The expression for Δ*β* † *<sup>j</sup>* arises by combining the (*j* − 1)th and *j*th likelihood equations

$$\frac{\mathbb{C}\_{j}}{\mathbb{C}\_{j-1}} = \frac{\mathbb{E}\mathbb{C}\_{j}}{\mathbb{E}\mathbb{C}\_{j-1}} = \frac{\exp(\mu\_{11} + \beta\_{j} - \beta\_{1}) \sum\_{i=1}^{k+1-j} \exp(\alpha\_{i} - \alpha\_{1})}{\exp(\mu\_{11} + \beta\_{j-1} - \beta\_{1}) \sum\_{i=1}^{k+2-j} \exp(\alpha\_{i} - \alpha\_{1})}.$$

Recalling the expression for Γ*<sup>i</sup>* in (10) this reduces to

$$\frac{\mathcal{C}\_{j}}{\mathcal{C}\_{j-1}} = \frac{\exp\left(\Delta\beta\_{j}\right)}{\Gamma\_{k+2-j}},$$

which has the desired solution.

**Proof of Theorem 2.** Use the expressions from Theorem 1 to get

$$\begin{split} \tilde{Y}\_{ij}^{\dagger} &= \ \exp(\hat{\mu}\_{11}^{\dagger} + a\_{i}^{\dagger} - a\_{1}^{\dagger} + \hat{\beta}\_{j}^{\dagger} - \hat{\beta}\_{1}^{\dagger}) \\ &= \ \ \frac{\mathbb{C}\_{1}}{\prod\_{\ell=2}^{k} \Gamma\_{\ell}^{\dagger}} (\Gamma\_{i}^{\dagger} - 1) \left(\prod\_{\ell=2}^{i-1} \Gamma\_{\ell}^{\dagger}\right) \frac{\mathbb{C}\_{j}}{\mathbb{C}\_{1}} \prod\_{\ell=2}^{j} \Gamma\_{k+2-\ell}^{\dagger} = \mathbb{C}\_{j} (\Gamma\_{i}^{\dagger} - 1) \frac{\prod\_{\ell=k+2-j}^{k} \Gamma\_{\ell}^{\dagger}}{\prod\_{\ell=i}^{k} \Gamma\_{\ell}^{\dagger}} \end{split}$$

We get the desired result by simplifying the last fraction using *i* > *k* + 2 − *j*.

**Proof of Equation (45).** *First identity*. Combine the forecasts; see (44).

$$\hat{Y}\_{ij}^{\dagger} = \exp(\hat{\mu}\_{11} + \sum\_{h=2}^{i} \Delta a\_{h}^{\dagger} + \sum\_{h=2}^{j} \Delta \hat{\beta}\_{h}), \qquad \hat{Y}\_{ij} = \exp(\hat{\mu}\_{11} + \sum\_{h=2}^{i} \Delta \hat{a}\_{h} + \sum\_{h=2}^{j} \Delta \hat{\beta}\_{h}).$$

*Second identity*. From (11) we have *<sup>Y</sup>ij* <sup>=</sup> *Cj*(*Gi* <sup>−</sup> <sup>1</sup>) <sup>∏</sup>*i*−<sup>1</sup> -<sup>=</sup>*k*+2−*<sup>j</sup> <sup>G</sup>*-. Write *Gi* = *Ni*/*Ni*−<sup>1</sup> where *N<sup>i</sup>* = ∑*i* -<sup>=</sup><sup>1</sup> exp(∑- *<sup>h</sup>*=<sup>2</sup> <sup>Δ</sup>*αh*) and *<sup>N</sup><sup>i</sup>* <sup>−</sup> *<sup>N</sup>i*−<sup>1</sup> <sup>=</sup> exp(∑*<sup>i</sup> <sup>h</sup>*=<sup>2</sup> <sup>Δ</sup>*αh*). Then, we get

$$
\widetilde{Y}\_{ij} = \mathbb{C}\_{\boldsymbol{\gamma}} \frac{\widehat{\boldsymbol{N}}\_{i} - \widehat{\boldsymbol{N}}\_{i-1}}{\widehat{\boldsymbol{N}}\_{i-1}}\_{\ell=k+2-j} \prod\_{\ell=k+2-j}^{i-1} \frac{\widehat{\boldsymbol{N}}\_{\ell}}{\widehat{\boldsymbol{N}}\_{\ell-1}} = \mathbb{C}\_{\boldsymbol{\gamma}} \frac{\widehat{\boldsymbol{N}}\_{i} - \widehat{\boldsymbol{N}}\_{i-1}}{\widehat{\boldsymbol{N}}\_{k+1-j}} = \mathbb{C}\_{\boldsymbol{\gamma}} \frac{\exp\left(\sum\_{h=2}^{i} \Delta \widehat{\boldsymbol{a}}\_{h}\right)}{\sum\_{\ell=1}^{k+1-j} \exp\left(\sum\_{h=2}^{\ell} \Delta \widehat{\boldsymbol{a}}\_{h}\right)}.
$$

Correspondingly, we get from (37), that

$$\tilde{Y}\_{ij}^{\dagger} = \mathbb{C}\_{j} \frac{\exp\left(\sum\_{h=2}^{i} \Delta \alpha\_{h}^{\dagger}\right)}{\sum\_{\ell=1}^{k+1-j} \exp\left(\sum\_{h=2}^{\ell} \Delta \alpha\_{h}^{\dagger}\right)}.$$

*Risks* **2019**, *7*, 119

Then, combine the expressions for *<sup>Y</sup>*† *ij* and *<sup>Y</sup>*† *ij*.

**Proof of Equation (49).** The point forecast is *Y*˜ † *ij* <sup>=</sup> exp(*μ*† <sup>11</sup> <sup>+</sup> <sup>∑</sup>*<sup>i</sup> <sup>h</sup>*=<sup>2</sup> <sup>Δ</sup>*α*† *<sup>h</sup>* <sup>+</sup> <sup>∑</sup>*<sup>j</sup> <sup>h</sup>*=<sup>2</sup> Δ*β* † *<sup>h</sup>*). Insert the expression for *<sup>μ</sup>*† <sup>11</sup> from (47), for <sup>Δ</sup>*α*† *<sup>i</sup>* from (48) and exp(∑*<sup>j</sup> <sup>h</sup>*=<sup>2</sup> Δ*β* † *h*)=(*F*† *<sup>j</sup>* <sup>−</sup> <sup>1</sup>) <sup>∏</sup>*j*−<sup>1</sup> -<sup>=</sup><sup>2</sup> *<sup>F</sup>*† - , which follows from (46), to get

$$\tilde{Y}\_{ij}^{\dagger} = \begin{array}{c} \frac{R\_1^{\dagger}}{\prod\_{\ell=2}^k F\_\ell^{\dagger}} \left( \frac{R\_i^{\dagger}}{R\_1^{\dagger}} \prod\_{\ell=2}^i F\_{k+2-\ell}^{\dagger} \right) (F\_j^{\dagger}-1) \prod\_{\ell=2}^{j-1} F\_\ell^{\dagger}. \end{array}$$

Equation (49) follows by reducing common factors and noting that *j* > *k* + 2 − *i*.

**Proof of Equation (52).** The point forecast is *<sup>Y</sup>*‡ *ij* <sup>=</sup> exp(*μ*<sup>11</sup> <sup>+</sup> <sup>∑</sup>*<sup>i</sup> <sup>h</sup>*=<sup>2</sup> <sup>Δ</sup>*α*† *<sup>h</sup>* <sup>+</sup> <sup>∑</sup>*<sup>j</sup> <sup>h</sup>*=<sup>2</sup> Δ*β h*), as given in (44). Insert the expression for *<sup>μ</sup>*<sup>11</sup> from (28), the expression for <sup>Δ</sup>*α*† *<sup>i</sup>* from (51) and exp(∑*<sup>j</sup> <sup>h</sup>*=<sup>2</sup> Δ*β h*) = (*Fj* <sup>−</sup> <sup>1</sup>) <sup>∏</sup>*j*−<sup>1</sup> -<sup>=</sup><sup>2</sup> *F*-, which follows from (32) noting that *Fj* = Φ *<sup>j</sup>*, to get

$$\widetilde{Y}\_{ij}^{\ddagger} = \begin{array}{c} \frac{R\_1}{\prod\_{\ell=2}^k F\_\ell} \left( \frac{R\_i^{\ddagger}}{R\_1} \prod\_{\ell=2}^i F\_{k+2-\ell} \right) (F\_{\bar{j}} - 1) \prod\_{\ell=2}^{j-1} F\_\ell. \end{array}$$

Equation (52) follows by reducing common factors and noting that *j* > *k* + 2 − *i*.

**Proof of Theorem 3.** (*a*) We show that Γ*<sup>i</sup>* defined in (32) increases in the Δ*αi*'s. Write Γ*<sup>i</sup>* = *Ni*/*Ni*−<sup>1</sup> where *Ni* = ∑*<sup>i</sup>* -<sup>=</sup><sup>1</sup> exp(∑- *<sup>h</sup>*=<sup>2</sup> Δ*αh*). Thus, we must show that the derivative of Γ*<sup>i</sup>* with respect to Δ*α<sup>n</sup>* is positive for all *n* ≤ *i* and zero otherwise. It suffices to consider the numerator of that derivative, which is *<sup>N</sup>*˙ *iNi*−<sup>1</sup> <sup>−</sup> *NiN*˙ *<sup>i</sup>*−1. Now,

$$\dot{N}\_{\dot{i}} = \frac{\partial N\_{\dot{i}}}{\partial \Delta \alpha\_{\text{fl}}} = \sum\_{\ell=\text{n}}^{\dot{i}} \exp(\sum\_{\text{h}=2}^{\ell} \Delta \alpha\_{\text{h}}) = N\_{\dot{i}} - N\_{\text{n}-1}\omega$$

for *<sup>n</sup>* <sup>≤</sup> *<sup>i</sup>* and zero otherwise. This implies *<sup>N</sup>*˙ *iNi*−<sup>1</sup> <sup>−</sup> *NiN*˙ *<sup>i</sup>*−<sup>1</sup> <sup>=</sup> *Nn*−1(*Ni* <sup>−</sup> *Ni*−1), noting that the cases where *n* < *i* and *n* = *i* have to be checked separately. The desired result now follows by noting that *Nn*−<sup>1</sup> and *Ni* − *Ni*−<sup>1</sup> are both positive.

(*b*) Using (35) and (*a*) we get

$$
\Delta \bar{\beta}\_j^\dagger = \Delta \log \mathcal{C}\_j + \log \Gamma\_{k+2-j}^\dagger \\
> \Delta \log \mathcal{C}\_j + \log \mathcal{G}\_{k+2-j} = \Delta \bar{\beta}\_{j'},
$$

where the last equality is of a similar type as (35) and comes from Theorem 3 in Kuang et al. (2009).

(*c*) Using (36) and (*a*) we get

$$
\widehat{\mu}\_{11}^{\dagger} = \log \mathbb{C}\_1 - \sum\_{\ell=2}^k \log \Gamma\_\ell^{\dagger} < \log \mathbb{C}\_1 - \sum\_{\ell=2}^k \log G\_\ell = \widehat{\mu}\_{11},
$$

where the last equality comes from from Theorem 3 in Kuang et al. (2009).

(*d*) First, we compare the new reserve *<sup>Y</sup>*† *ij* with <sup>Δ</sup>*α*† *<sup>i</sup>* known to the old reserve *Yij* from CL. Since <sup>1</sup> ≤ *Gi* < <sup>Γ</sup>† *<sup>i</sup>* for 2 ≤ *i* ≤ *k*, by (11), (*a*) and (37),

$$\widetilde{Y}\_{ij} = \mathbb{C}\_{\dot{\boldsymbol{\beta}}}(\mathbf{G}\_{\dot{\boldsymbol{\epsilon}}} - \mathbf{1}) \prod\_{\ell=k+2-j}^{i-1} \mathbf{G}\_{\ell} < \mathbb{C}\_{\dot{\boldsymbol{\beta}}}(\Gamma\_{i}^{\dagger} - \mathbf{1}) \prod\_{\ell=k+2-j}^{i-1} \Gamma\_{\ell}^{\dagger} = \widetilde{Y}\_{ij}^{\dagger}.$$

Second, we compare the new reserve *<sup>Y</sup>*† *ij*, using <sup>Δ</sup>*α*† *<sup>i</sup>* , *<sup>μ</sup>*† <sup>11</sup> and Δ*β* † *<sup>j</sup>* , to the mixed reserve *<sup>Y</sup>*‡ *ij*, using Δ*α*† *<sup>i</sup>* , *<sup>μ</sup>*<sup>11</sup> and <sup>Δ</sup>*<sup>β</sup> j*. From (45) we have

$$
\widetilde{Y}\_{ij}^{\ddagger} = \widetilde{Y}\_{ij}^{\dagger} \frac{\sum\_{\ell=2}^{k+1-j} \exp\left(\sum\_{h=2}^{\ell} \Delta \alpha\_h^{\dagger}\right)}{\sum\_{\ell=2}^{k+1-j} \exp\left(\sum\_{h=2}^{\ell} \Delta \widehat{\alpha}\_h\right)}.
$$

Since Δ*α*† *<sup>i</sup>* <sup>&</sup>gt; <sup>Δ</sup>*αi*, for all 2 <sup>≤</sup> *<sup>i</sup>* <sup>≤</sup> *<sup>k</sup>* it follows that *<sup>Y</sup>*‡ *ij* > *<sup>Y</sup>*† *ij*.


$$\frac{R\_i^\ddagger}{R\_i^\ddagger} = \frac{Y\_{ij}^\ddagger}{Y\_{ij}^\ddagger} \frac{(F\_j^\ddagger - 1)\prod\_{\ell=k+2-i}^{j-1} F\_\ell^\ddagger}{(F\_j - 1)\prod\_{\ell=k+2-i}^{j-1} F\_\ell}.$$

Then, apply the orderings *<sup>Y</sup>*‡ *ij* > *<sup>Y</sup>*† *ij* and *<sup>F</sup>*† *<sup>j</sup>* > *Fj* from (*d*),(*e*).

$$(\text{g}) \text{ Use (32), (52) to get } R\_i^\sharp / R\_{i\bar{j}} = \hat{Y}\_i^\sharp / \hat{Y}\_{i\bar{j}} \text{ for all } k + 2 - i \le j \le k. \text{ Apply } (f). \quad \square$$

#### **References**


Taylor, Gregory. 2000. *Loss Reserving: An Actuarial Perspective*. Norwel: Kluwer Academic Publishers. [CrossRef]


c 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
