Credibility Distribution Estimation with Weighted or Grouped Observations

Pitselis, Georgios

doi:10.3390/risks12010010


by Georgios Pitselis ¹,²

¹ Department of Statistics & Insurance Science, University of Piraeus, 80 Karaoli & Dimitriou Str. T. K., 18534 Piraeus, Greece

² Department of Mathematics & Statistics, Concordia University, 1455 De Maisonneuve Blvd. W., Montreal, QC H3G 1M8, Canada

Risks 2024, 12(1), 10; https://doi.org/10.3390/risks12010010

Submission received: 17 November 2023 / Revised: 26 December 2023 / Accepted: 26 December 2023 / Published: 3 January 2024

(This article belongs to the Special Issue Statistics, Stochastic Modelling and Quantitative Risk Management for Insurance)


Abstract

In non-life insurance practice, actuaries are often faced with the challenge of predicting the number of claims and claim amounts to be incurred at any given time, which serve to implement fair pricing and reserves given the nature of the risk. This paper extends Jewell’s credible distribution in terms of forecasting the distribution of individual risk in cases where the observations are weighted or are grouped in intervals. More specifically, we show how empirical distribution functions can be embedded within Bühlmann’s and Straub’s credibility model. The optimal projection theorem is applied for credibility estimation and more insight into the derivation of the credibility distribution estimators is also provided. In addition, distribution credibility estimators are established and numerical illustrations are presented herein. Two examples of distribution credibility estimation are given, one with insurance loss data and the other with industry financial data.

Keywords: credibility distribution estimation; empirical Bayes

1. Introduction

In actuarial science, one of the fundamental problems is that of predicting the future claims of an individual risk given the past experience of a collective of heterogeneous risks. Credibility is a ratemaking technique that forecasts future premiums for a group of insurance contracts for which some experience is available, while considerably more experience is available for a collection of contracts that are similar but not exactly the same.

In the insurance industry, some legislated rules indicate that some changes over time occurred across the claim distribution. Therefore, it is essential to examine these changes at different points of the distribution. An empirical distribution function provides a way to model and sample cumulative probabilities for a data sample that does not fit a standard probability distribution. Its value at a given point is equal to the proportion of observations from the sample that are less than or equal to that point.
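As a minimal sketch of the definition above, the empirical distribution function at a point can be computed as the share of sample values not exceeding that point. The function name `ecdf` and the claim amounts are illustrative assumptions, not taken from the paper.

```python
from bisect import bisect_right


def ecdf(sample, x):
    """Proportion of observations in `sample` that are <= x."""
    ordered = sorted(sample)
    # bisect_right counts how many ordered values are <= x
    return bisect_right(ordered, x) / len(ordered)


losses = [120.0, 80.0, 200.0, 150.0]  # hypothetical claim amounts
print(ecdf(losses, 150.0))  # 3 of the 4 values are <= 150, so 0.75
```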

In non-life insurance practice, actuaries are often faced with the challenge of predicting the number of claims and the claim amounts to be incurred at any given time, which serve to implement fair pricing and reserves given the nature of the risk. Actuaries usually deal with events that are uncertain and their economic consequences. The aim of this paper is to carry out the credibility estimation of empirical distribution functions in measuring and managing these uncertainties.

In the first part of this paper, we extend the work of Jewell (1974b) in terms of forecasting the distribution of individual risk in cases where the observations are weighted, for both the non-homogeneous and homogeneous models. Here, the weights (sizes) $w_{ij}$, $i = 1, \dots, n_{j}$, $j = 1, \dots, K$, change over time. The contract j might result from grouping and averaging $w_{ij}$ independent and identically distributed observations $S_{lij}$, $l = 1, \dots, w_{ij}$, during the year i, i.e., $X_{ij} = \frac{1}{w_{ij}} \sum_{l = 1}^{w_{ij}} S_{lij}$, and then taking the conditional mean of the indicator function, $E [I (X_{ij} \leq x) | Θ_{j}]$. Alternatively, in the case of raw data, the contract j might result from grouping and averaging the indicator functions within the year i, $\bar{I} (S_{ij} \leq x) = \frac{1}{w_{ij}} \sum_{l = 1}^{w_{ij}} I (S_{lij} \leq x)$, and then taking $E [\bar{I} (S_{ij} \leq x) | Θ_{j}]$.
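The two constructions generally give different values. The following sketch contrasts the indicator of the averaged observation with the average of the indicators; the helper names and the two raw observations are hypothetical.

```python
def indicator_of_average(s, x):
    """I(X_ij <= x), where X_ij is the average of the raw observations in s."""
    return 1.0 if sum(s) / len(s) <= x else 0.0


def average_of_indicators(s, x):
    """Share of the raw observations in s that are <= x."""
    return sum(1.0 for v in s if v <= x) / len(s)


s = [1.0, 5.0]  # hypothetical raw observations within one year (w_ij = 2)
first = indicator_of_average(s, 2.0)    # the average is 3.0, which exceeds 2.0
second = average_of_indicators(s, 2.0)  # one of the two observations is <= 2.0
print(first, second)
```

The paper proceeds with the first construction, so the later sketches work with indicators of the averaged observations.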

Here, we proceed with the former, treating credibility distribution estimation as point estimation of $F_{X_{ij} | Θ_{j}} (x | Θ_{j}) = E [I (X_{ij} \leq x) | Θ_{j}]$. Optimal linearized estimators of $F_{X_{ij} | Θ_{j}} (x | Θ_{j})$ are obtained by the classical least squares approach as well as by the optimal projection theorem of random variables on planes as presented by De Vylder (1976, 1996).

In the second part of this paper, we consider credibility distribution estimation based on grouped data formed by aggregating the individual observations of a variable into groups. The empirical distribution based on grouped data can be constructed by obtaining the point values of the empirical distribution function whenever possible. Then, we approximate the distribution function by connecting those points with straight lines and apply premium estimation in a credibility framework. An alternative model of credibility estimation, analogous to the Bühlmann and Straub (1970) model, is also obtained.

Related Works

Bühlmann (1967) and Bühlmann and Straub (1970) established the theoretical foundation of modern credibility theory, presented as a distribution-free credibility estimation. The method was extended in the regression model by Hachemeister (1975), where the credibility premium depends linearly on a number of risk characteristics.

Jewell (1974a) has shown that credibility is exactly Bayesian for a certain exponential family of distributions with natural conjugate priors. Furthermore, Landsman and Makov (1998, 1999) extended the results on the exponential family to the exponential dispersion family. The following key references are related to new developments in credibility estimation: Makov et al. (1996), Christiansen and Schinzinger (2016), Tsai and Lin (2017), Gong et al. (2018), Xacur and Garrido (2018), Tsai and Wu (2020), Tsai and Zhang (2019), Bozikas and Pitselis (2020, 2021), Youn et al. (2021), Wang et al. (2021), Yan and Song (2022), and Kim et al. (2022).

Credibility distribution estimation is closely connected to the area of quantile credibility estimation. The quantile function is the inverse of the distribution function. It specifies the value of the random variable such that the probability that the variable is less than or equal to that value equals the given probability. Kim and Jeon (2013) proposed a credibility theory by truncating the loss data based on quantiles. Some other references related to quantile estimation are: Pitt (2006), Pitselis (2009, 2013, 2017), Kudryavtsev (2009), Gebizlioglu and Yagci (2008), Denuit (2008) and Landsman (1996).

Jewell (1974b) extended the classical Bühlmann (1967) model to the problem of forecasting the distribution of individual risk based upon collective statistics and individual experience data and solved the problem by finding a Bayesian conditional distribution. Jewell (1974b) also obtained additional insight into the nature of credibility estimation by assuming that the true value of $Θ_{j}$ is known, and obtained credible distributions and credible densities by carrying out simulations for some conjugate prior families of distributions (e.g., Poisson–gamma, etc.). He also considered the problem of finding a credibility approximation to the true distribution of the next observation.

Korwar and Hollander (1973) defined a sequence of empirical Bayes estimators for estimating a distribution function. Zehnwirth (1981) established the asymptotic optimality of the empirical Bayes distribution function created from the Bayes rule relative to the Dirichlet process prior with unknown parameter. Cai et al. (2015) combined Bühlmann’s credibility theory and Ferguson’s (1973) nonparametric Bayes analysis to develop a completely nonparametric estimation for loss distributions and established a unified distribution-free approach to experience rating for arbitrary premium principles.

This paper is organized as follows. In Section 2, both the linearized non-homogeneous and homogeneous estimators in the weighted credibility distribution model are obtained, and the credibility parameters are estimated. Optimal credibility distribution estimators are also obtained using the optimal projection theorem. In Section 3, the credibility distribution estimation for grouped data is presented. In Section 4, an alternative model of credibility distribution estimation is obtained when the observations are grouped in intervals. Applications to real data are presented in Section 5, one with insurance loss data and the other with industry financial data. Some concluding remarks are presented in Section 6.

2. Weighted Credibility Distribution Estimation

In the following, we consider the credibility model with several contracts and weighted observations. For an insurance portfolio, $X_{ij}$ are the average losses of $w_{ij}$ observations for contract $j = 1, \dots, K$ and period $i = 1, \dots, n_{j}$. For industry portfolios, $X_{ij}$ denotes the average returns (losses/gains) of $w_{ij}$ firms in portfolio $j = 1, \dots, K$ for period $i = 1, \dots, n_{j}$.

2.1. Assumptions

We have the following assumptions:

(i): The contracts are independent and the variables $Θ_{1}, \dots, Θ_{K}$ are identically distributed.
(ii): $F_{X_{i j} | Θ_{j}} (x | Θ_{j}) = P (X_{i j} \leq x | Θ_{j}) = E [I (X_{i j} \leq x) | Θ_{j}]$ , where $I (X_{i j} \leq x)$ is an indicator function that is equal to 1 if $X_{i j} \leq x$ and 0 otherwise.
(iii): $\mathrm{Cov} [I (X_{ij} \leq x), I (X_{rj} \leq x) | Θ_{j}] = δ_{ir} \frac{1}{w_{ij}} σ_{F_{x}}^{2} (Θ_{j})$, where $δ_{ir} = 1$ if $i = r$ and 0 otherwise.

2.2. Structural Parameters

The structural parameters $F_{X_{ij}} (x)$, $s_{F_{x}}^{2}$, and $a_{F_{x}}$ are as follows:

(SP1): $F_{X_{ij}} (x) = E [F_{X_{ij} | Θ_{j}} (x | Θ_{j})] = E [P (X_{ij} \leq x | Θ_{j})]$;
(SP2): $s_{F_{x}}^{2} = E [σ_{F_{x}}^{2} (Θ_{j})]$;
(SP3): $a_{F_{x}} = \mathrm{Var} \{E [I (X_{ij} \leq x) | Θ_{j}]\} = \mathrm{Var} [F_{X_{ij} | Θ_{j}} (x | Θ_{j})]$.

2.3. Notation

Here, we present the weighted empirical distribution function as well as some notations that are useful for the derivation of the credibility distribution estimation.

\begin{matrix} F_{n_{w j}} (x) = \sum_{i = 1}^{n_{j}} \frac{w_{ij}}{w_{. j}} I (X_{ij} \leq x), \quad w_{. j} = \sum_{i = 1}^{n_{j}} w_{ij}, \\ F_{n_{w w}} (x) = \sum_{j = 1}^{K} \frac{w_{. j}}{w_{. .}} F_{n_{w j}} (x), \quad w_{. .} = \sum_{j = 1}^{K} w_{. j}, \\ F_{n_{w z}} (x) = \sum_{j = 1}^{K} \frac{Z_{j}^{F_{x}}}{z_{.}^{F_{x}}} F_{n_{w j}} (x), \quad z_{.}^{F_{x}} = \sum_{j = 1}^{K} Z_{j}^{F_{x}}, \\ Z_{j}^{F_{x}} = \frac{a_{F_{x}} w_{. j}}{a_{F_{x}} w_{. j} + s_{F_{x}}^{2}} . \end{matrix}

(1)
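The quantities in (1) can be sketched in a few lines of Python. The function names, the toy portfolio, and the values chosen for $a_{F_{x}}$ and $s_{F_{x}}^{2}$ are assumptions made for illustration only; in practice the structural parameters would be estimated.

```python
def weighted_ecdf(xs, ws, x):
    """F_{n_w j}(x): weighted share of observations X_ij <= x in one contract."""
    return sum(w for xi, w in zip(xs, ws) if xi <= x) / sum(ws)


def credibility_factor(w_total, a, s2):
    """Z_j = a * w_.j / (a * w_.j + s2)."""
    return a * w_total / (a * w_total + s2)


def collective_ecdfs(data, weights, x, a, s2):
    """Return (F_{n_ww}(x), F_{n_wz}(x), [Z_1, ..., Z_K])."""
    f_j = [weighted_ecdf(xs, ws, x) for xs, ws in zip(data, weights)]
    w_j = [sum(ws) for ws in weights]
    z_j = [credibility_factor(w, a, s2) for w in w_j]
    f_ww = sum(w * f for w, f in zip(w_j, f_j)) / sum(w_j)
    f_wz = sum(z * f for z, f in zip(z_j, f_j)) / sum(z_j)
    return f_ww, f_wz, z_j


# Hypothetical portfolio: K = 2 contracts with three periods each.
data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
weights = [[1.0, 1.0, 2.0], [2.0, 2.0, 4.0]]
f_ww, f_wz, z = collective_ecdfs(data, weights, x=2.0, a=0.05, s2=0.2)
print(f_ww, f_wz, z)
```

In this toy example the second contract carries twice the total weight of the first, so it receives the larger credibility factor.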

Lemma 1 (Expectation and Covariance Relations). Based on the above assumptions, we obtain the following expressions for the expectations and covariances:

\begin{matrix} E [I (X_{ij} \leq x)] = E [F_{n_{w j}} (x)] = E [F_{n_{w w}} (x)] = E [F_{n_{w z}} (x)] = F_{X_{ij}} (x), \end{matrix}

(2)

\begin{matrix} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{rj} \leq x)] = δ_{ir} \frac{1}{w_{ij}} s_{F_{x}}^{2} + a_{F_{x}}, \end{matrix}

(3)

\begin{matrix} \mathrm{Cov} [I (X_{ij} \leq x), F_{n_{w j}} (x)] = \mathrm{Cov} [F_{n_{w j}} (x), F_{n_{w j}} (x)] = \frac{1}{w_{. j}} s_{F_{x}}^{2} + a_{F_{x}} = \frac{a_{F_{x}}}{Z_{j}^{F_{x}}}, \end{matrix}

(4)

\begin{matrix} \mathrm{Cov} [I (X_{ij} \leq x), F_{n_{w z}} (x)] = \mathrm{Cov} [F_{n_{w j}} (x), F_{n_{w z}} (x)] = \mathrm{Cov} [F_{n_{w z}} (x), F_{n_{w z}} (x)] = \frac{a_{F_{x}}}{z_{.}^{F_{x}}}, \end{matrix}

(5)

\begin{matrix} \mathrm{Cov} [F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{i j^{'}} \leq x)] = δ_{j j^{'}} a_{F_{x}} . \end{matrix}

(6)

Proof.

Relation (2) is straightforward. Relation (3) results from

\begin{matrix} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{rj} \leq x)] \\ = E (\mathrm{Cov} [I (X_{ij} \leq x), I (X_{rj} \leq x) | Θ_{j}]) + \mathrm{Cov} (E [I (X_{ij} \leq x) | Θ_{j}], E [I (X_{rj} \leq x) | Θ_{j}]) \\ = δ_{ir} \frac{1}{w_{ij}} s_{F_{x}}^{2} + a_{F_{x}} . \end{matrix}

The first part of (4) results from

\begin{matrix} \mathrm{Cov} [I (X_{ij} \leq x), F_{n_{w j}} (x)] = \mathrm{Cov} (I (X_{ij} \leq x), \sum_{r = 1}^{n_{j}} \frac{w_{rj}}{w_{. j}} I (X_{rj} \leq x)) \\ = \sum_{r = 1}^{n_{j}} \frac{w_{rj}}{w_{. j}} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{rj} \leq x)] = \sum_{r = 1}^{n_{j}} \frac{w_{rj}}{w_{. j}} (δ_{ir} \frac{1}{w_{ij}} s_{F_{x}}^{2} + a_{F_{x}}) \\ = \frac{w_{ij}}{w_{. j}} \frac{1}{w_{ij}} s_{F_{x}}^{2} + a_{F_{x}} \sum_{r = 1}^{n_{j}} \frac{w_{rj}}{w_{. j}} \\ = \frac{1}{w_{. j}} s_{F_{x}}^{2} + a_{F_{x}} . \end{matrix}

Similarly, we can prove the second and third parts of (4). For the proof of the first part of relation (5), we have

\begin{matrix} \mathrm{Cov} [I (X_{ij} \leq x), F_{n_{w z}} (x)] = \mathrm{Cov} (I (X_{ij} \leq x), \sum_{j^{'} = 1}^{K} \frac{Z_{j^{'}}^{F_{x}}}{z_{.}^{F_{x}}} F_{n_{w j^{'}}} (x)) \\ = \sum_{j^{'} = 1}^{K} \sum_{r = 1}^{n_{j^{'}}} \frac{Z_{j^{'}}^{F_{x}}}{z_{.}^{F_{x}}} \frac{w_{r j^{'}}}{w_{. j^{'}}} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{r j^{'}} \leq x)] \\ = \frac{Z_{j}^{F_{x}}}{z_{.}^{F_{x}}} \sum_{r = 1}^{n_{j}} \frac{w_{rj}}{w_{. j}} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{rj} \leq x)] + \sum_{j^{'} \neq j}^{K} \sum_{r = 1}^{n_{j^{'}}} \frac{Z_{j^{'}}^{F_{x}}}{z_{.}^{F_{x}}} \frac{w_{r j^{'}}}{w_{. j^{'}}} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{r j^{'}} \leq x)] \\ = \frac{Z_{j}^{F_{x}}}{z_{.}^{F_{x}}} \mathrm{Cov} [I (X_{ij} \leq x), F_{n_{w j}} (x)] = \frac{Z_{j}^{F_{x}}}{z_{.}^{F_{x}}} \frac{a_{F_{x}}}{Z_{j}^{F_{x}}} \\ = \frac{a_{F_{x}}}{z_{.}^{F_{x}}}, \end{matrix}

where the sum over $j^{'} \neq j$ vanishes because the contracts are independent. In the same way, we can prove the second and third parts of (5). Finally, (6) can be proved as

\begin{matrix} \mathrm{Cov} [F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{i j^{'}} \leq x)] \\ = E \{\mathrm{Cov} [F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{i j^{'}} \leq x) | Θ_{j}]\} + \mathrm{Cov} \{E [F_{X_{ij} | Θ_{j}} (x | Θ_{j}) | Θ_{j}], E [I (X_{i j^{'}} \leq x) | Θ_{j}]\} \\ = δ_{j j^{'}} a_{F_{x}} . \end{matrix}

□
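The last equality in (4) is used repeatedly in what follows. As a short worked check, using only the definition of $Z_{j}^{F_{x}}$ in (1), it can be verified as:

```latex
\frac{a_{F_{x}}}{Z_{j}^{F_{x}}}
  = a_{F_{x}} \cdot \frac{a_{F_{x}} w_{. j} + s_{F_{x}}^{2}}{a_{F_{x}} w_{. j}}
  = a_{F_{x}} + \frac{s_{F_{x}}^{2}}{w_{. j}} .
```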

As in Bühlmann and Straub (1970), the following theorems provide the optimal linearized non-homogeneous and homogeneous credibility estimators, together with some useful estimators of the structural parameters.

Theorem 1 (Linearized non-homogeneous credibility distribution estimator). Under the assumptions (i)–(iii), the optimal linearized non-homogeneous estimator of $F_{X_{ij}} (x | Θ_{j})$ is given by

\begin{matrix} F_{X_{ij}}^{Cred} (x | Θ_{j}) = Z_{j}^{F_{x}} F_{n_{w j}} (x) + (1 - Z_{j}^{F_{x}}) F_{X_{ij}} (x), \end{matrix}

(7)

with $F_{n_{w j}} (x)$ and $Z_{j}^{F_{x}}$ as in (1).

Proof.

We have to find $c_{0}^{j}, c_{11}^{j}, \dots, c_{n_{K} K}^{j}$ in

\begin{matrix} g_{j} [I (X_{11} \leq x), \dots, I (X_{n_{K} K} \leq x)] = c_{0}^{j} + \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} I (X_{il} \leq x), \end{matrix}

(8)

such that

\begin{matrix} Q = E {(P (X_{ij} \leq x | Θ_{j}) - c_{0}^{j} - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} I (X_{il} \leq x))}^{2} \end{matrix}

(9)

is minimum. Setting the derivative of (9) with respect to $c_{0}^{j}$ equal to zero, we have

\begin{matrix} \frac{\partial Q}{\partial c_{0}^{j}} = - 2 E (F_{X_{ij} | Θ_{j}} (x | Θ_{j}) - c_{0}^{j} - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} I (X_{il} \leq x)) = 0 \\ \Rightarrow c_{0}^{j} = E [F_{X_{ij} | Θ_{j}} (x | Θ_{j})] - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} E [I (X_{il} \leq x)] \\ = F_{X_{ij}} (x) - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} F_{X_{ij}} (x) . \end{matrix}

(10)

Substituting the value of $c_{0}^{j}$ in (9) and differentiating with respect to $c_{r l^{'}}^{j}$, we obtain

\begin{matrix} \frac{\partial Q}{\partial c_{r l^{'}}^{j}} = \frac{\partial}{\partial c_{r l^{'}}^{j}} E {[F_{X_{ij} | Θ_{j}} (x | Θ_{j}) - F_{X_{ij}} (x) - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} (I (X_{il} \leq x) - F_{X_{ij}} (x))]}^{2} \\ = - 2 E ([F_{X_{ij} | Θ_{j}} (x | Θ_{j}) - F_{X_{ij}} (x) - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} (I (X_{il} \leq x) - F_{X_{ij}} (x))] [I (X_{r l^{'}} \leq x) - F_{X_{ij}} (x)]) \\ = 0 \end{matrix}

\begin{matrix} \Rightarrow \mathrm{Cov} [F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{r l^{'}} \leq x)] = \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} \mathrm{Cov} [I (X_{il} \leq x), I (X_{r l^{'}} \leq x)] . \end{matrix}

(11)

For $l^{'} = j$, and using the independence of the contracts, the right-hand side of (11) becomes

\begin{matrix} \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} \mathrm{Cov} [I (X_{il} \leq x), I (X_{rj} \leq x)] \\ = \sum_{i = 1}^{n_{j}} c_{ij}^{j} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{rj} \leq x)] + \sum_{l \neq j}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} \mathrm{Cov} [I (X_{il} \leq x), I (X_{rj} \leq x)] \\ = \sum_{i = 1}^{n_{j}} c_{ij}^{j} (δ_{ir} \frac{1}{w_{rj}} s_{F_{x}}^{2} + a_{F_{x}}) . \end{matrix}

Then, by (6), (11) implies that

\begin{matrix} a_{F_{x}} = a_{F_{x}} \sum_{i = 1}^{n_{j}} c_{ij}^{j} + c_{rj}^{j} \frac{1}{w_{rj}} s_{F_{x}}^{2} = a_{F_{x}} c_{. j}^{j} + c_{rj}^{j} \frac{1}{w_{rj}} s_{F_{x}}^{2}, \end{matrix}

(12)

where $c_{. j}^{j} = \sum_{i = 1}^{n_{j}} c_{ij}^{j}$. Multiplying (12) by $w_{rj}$ and summing over $r = 1, \dots, n_{j}$, we obtain

\begin{matrix} c_{. j}^{j} = \frac{a_{F_{x}} w_{. j}}{a_{F_{x}} w_{. j} + s_{F_{x}}^{2}} = Z_{j}^{F_{x}} \Rightarrow c_{ij}^{j} = \frac{w_{ij}}{w_{. j}} Z_{j}^{F_{x}} . \end{matrix}

For $l^{'} \neq j$, the left-hand side of (11) vanishes by (6), and the same argument gives $c_{i l^{'}}^{j} = 0$ for all $i$. Then, (8) becomes

\begin{matrix} F_{X_{ij}}^{Cred} (x | Θ_{j}) = F_{X_{ij}} (x) - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} F_{X_{ij}} (x) + \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} I (X_{il} \leq x) \\ = F_{X_{ij}} (x) - \sum_{i = 1}^{n_{j}} c_{ij}^{j} F_{X_{ij}} (x) - \sum_{l \neq j}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} F_{X_{ij}} (x) \\ + \sum_{i = 1}^{n_{j}} c_{ij}^{j} I (X_{ij} \leq x) + \sum_{l \neq j}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} I (X_{il} \leq x) \\ = Z_{j}^{F_{x}} \sum_{i = 1}^{n_{j}} \frac{w_{ij}}{w_{. j}} I (X_{ij} \leq x) + (1 - Z_{j}^{F_{x}}) F_{X_{ij}} (x), \end{matrix}

which provides (7). □
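A minimal sketch of the estimator in (7) follows. It assumes that $a_{F_{x}}$, $s_{F_{x}}^{2}$ and the collective distribution function $F_{X_{ij}} (x)$ are known; the uniform collective distribution, the function names and the data are hypothetical choices for illustration.

```python
def weighted_ecdf(xs, ws, x):
    """F_{n_w j}(x) from (1)."""
    return sum(w for xi, w in zip(xs, ws) if xi <= x) / sum(ws)


def cred_nonhomogeneous(xs, ws, x, a, s2, collective_cdf):
    """Equation (7): Z_j * F_{n_w j}(x) + (1 - Z_j) * F_X(x)."""
    w_total = sum(ws)
    z = a * w_total / (a * w_total + s2)
    return z * weighted_ecdf(xs, ws, x) + (1.0 - z) * collective_cdf(x)


def uniform_cdf(x):
    """Hypothetical collective distribution: uniform on [0, 10]."""
    return min(max(x / 10.0, 0.0), 1.0)


est = cred_nonhomogeneous([1.0, 2.0, 3.0], [1.0, 1.0, 2.0], 2.0,
                          a=0.05, s2=0.2, collective_cdf=uniform_cdf)
print(est)  # Z_j = 0.5 here, so the result is halfway between 0.5 and 0.2
```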

Theorem 2 (Linearized homogeneous credibility distribution estimator). Under the assumptions (i)–(iii), the optimal linearized homogeneous estimator of $F_{X_{ij}} (x | Θ_{j})$ is given by

\begin{matrix} F_{X_{ij} | Θ_{j}}^{Cred} (x | Θ_{j}) = Z_{j}^{F_{x}} F_{n_{w j}} (x) + (1 - Z_{j}^{F_{x}}) F_{n_{w z}} (x), \end{matrix}

(13)

with $F_{n_{w j}} (x)$, $F_{n_{w z}} (x)$ and $Z_{j}^{F_{x}}$ as defined in (1).

Proof.

Letting

\begin{matrix} g_{j} [I (X_{11} \leq x), \dots, I (X_{n_{K} K} \leq x)] = \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} I (X_{il} \leq x), \end{matrix}

(14)

we have to minimize

\begin{matrix} Q = E {(F_{X_{ij} | Θ_{j}} (x | Θ_{j}) - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} I (X_{il} \leq x))}^{2}, \end{matrix}

(15)

subject to the unbiasedness condition

\begin{matrix} E [F_{X_{ij} | Θ_{j}} (x | Θ_{j})] = \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} E [I (X_{il} \leq x)], \end{matrix}

(16)

which is equivalent to the restriction $\sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} = 1$. Introducing the Lagrange multiplier $2 λ$, we minimize

\begin{matrix} Q = E {(F_{X_{ij} | Θ_{j}} (x | Θ_{j}) - F_{X_{ij}} (x) - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} [I (X_{il} \leq x) - F_{X_{ij}} (x)])}^{2} \\ - 2 λ (\sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} F_{X_{ij}} (x) - F_{X_{ij}} (x)) . \end{matrix}

(17)

From (16), we obtain

\begin{matrix} F_{X_{ij}} (x) = E [F_{X_{ij} | Θ_{j}} (x | Θ_{j})] = \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} E [I (X_{il} \leq x)] = F_{X_{ij}} (x) \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} \\ \Rightarrow \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} = 1 . \end{matrix}

(18)

Differentiating (17) with respect to $c_{i^{'} l^{'}}^{j}$ and setting the derivative equal to zero, we obtain

\begin{matrix} \mathrm{Cov} [F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{i^{'} l^{'}} \leq x)] - \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} \mathrm{Cov} [I (X_{il} \leq x), I (X_{i^{'} l^{'}} \leq x)] + λ F_{X_{ij}} (x) = 0 \end{matrix}

\begin{matrix} \Rightarrow δ_{j l^{'}} a_{F_{x}} + λ F_{X_{ij}} (x) = \sum_{i = 1}^{n_{l^{'}}} c_{i l^{'}}^{j} \mathrm{Cov} [I (X_{i l^{'}} \leq x), I (X_{i^{'} l^{'}} \leq x)] \\ = \sum_{i = 1}^{n_{l^{'}}} c_{i l^{'}}^{j} (a_{F_{x}} + δ_{i i^{'}} \frac{1}{w_{i^{'} l^{'}}} s_{F_{x}}^{2}) \\ = c_{. l^{'}}^{j} a_{F_{x}} + c_{i^{'} l^{'}}^{j} \frac{1}{w_{i^{'} l^{'}}} s_{F_{x}}^{2} . \end{matrix}

(19)

Multiplying both sides by $w_{i^{'} l^{'}}$ and taking the sum over $i^{'}$, we obtain for each $l^{'}$:

\begin{matrix} c_{. l^{'}}^{j} = \frac{[δ_{j l^{'}} a_{F_{x}} + λ F_{X_{ij}} (x)] w_{. l^{'}}}{a_{F_{x}} w_{. l^{'}} + s_{F_{x}}^{2}} = [δ_{j l^{'}} + \frac{λ}{a_{F_{x}}} F_{X_{ij}} (x)] Z_{l^{'}}^{F_{x}} . \end{matrix}

(20)

Substituting (20) into (19), we obtain

\begin{matrix} c_{i^{'} l^{'}}^{j} = \frac{[δ_{j l^{'}} a_{F_{x}} + λ F_{X_{ij}} (x)] (1 - Z_{l^{'}}^{F_{x}})}{s_{F_{x}}^{2}} w_{i^{'} l^{'}} \end{matrix}

(21)

\begin{matrix} \Rightarrow 1 = \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} = \sum_{l = 1}^{K} c_{. l}^{j} = \sum_{l = 1}^{K} (δ_{jl} + \frac{λ}{a_{F_{x}}} F_{X_{ij}} (x)) Z_{l}^{F_{x}} \\ = \frac{λ}{a_{F_{x}}} F_{X_{ij}} (x) \sum_{l = 1}^{K} Z_{l}^{F_{x}} + \sum_{l \neq j}^{K} Z_{l}^{F_{x}} δ_{jl} + δ_{jj} Z_{j}^{F_{x}} = \frac{λ}{a_{F_{x}}} F_{X_{ij}} (x) z_{.}^{F_{x}} + Z_{j}^{F_{x}} \\ \Rightarrow λ F_{X_{ij}} (x) = a_{F_{x}} \frac{(1 - Z_{j}^{F_{x}})}{z_{.}^{F_{x}}} . \end{matrix}

(22)

Then, using $a_{F_{x}} (1 - Z_{l}^{F_{x}}) / s_{F_{x}}^{2} = Z_{l}^{F_{x}} / w_{. l}$, the optimal linearized homogeneous estimator of $F_{X_{ij}} (x | Θ_{j})$ becomes

\begin{matrix} F_{X_{ij}}^{Cred} (x | Θ_{j}) = \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} c_{il}^{j} I (X_{il} \leq x) \\ = \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} \frac{[δ_{jl} a_{F_{x}} + λ F_{X_{ij}} (x)] (1 - Z_{l}^{F_{x}})}{s_{F_{x}}^{2}} w_{il} I (X_{il} \leq x) \\ = \sum_{l = 1}^{K} \sum_{i = 1}^{n_{l}} [δ_{jl} + \frac{(1 - Z_{j}^{F_{x}})}{z_{.}^{F_{x}}}] Z_{l}^{F_{x}} \frac{w_{il}}{w_{. l}} I (X_{il} \leq x) \\ = (1 - Z_{j}^{F_{x}}) \sum_{l = 1}^{K} \frac{Z_{l}^{F_{x}}}{z_{.}^{F_{x}}} F_{n_{w l}} (x) + Z_{j}^{F_{x}} \sum_{i = 1}^{n_{j}} \frac{w_{ij}}{w_{. j}} I (X_{ij} \leq x), \end{matrix}

resulting in (13). □
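The homogeneous estimator (13) replaces the unknown collective distribution by the credibility-weighted mean $F_{n_{w z}} (x)$. A minimal sketch, again with hypothetical data and with $a_{F_{x}}$ and $s_{F_{x}}^{2}$ treated as given, is:

```python
def cred_homogeneous(data, weights, j, x, a, s2):
    """Equation (13): Z_j * F_{n_w j}(x) + (1 - Z_j) * F_{n_wz}(x)."""
    # Weighted empirical distribution function of every contract at x
    f = [sum(w for xi, w in zip(xs, ws) if xi <= x) / sum(ws)
         for xs, ws in zip(data, weights)]
    # Credibility factor of every contract
    z = [a * sum(ws) / (a * sum(ws) + s2) for ws in weights]
    # Credibility-weighted collective estimate F_{n_wz}(x)
    f_wz = sum(zl * fl for zl, fl in zip(z, f)) / sum(z)
    return z[j] * f[j] + (1.0 - z[j]) * f_wz


# Hypothetical portfolio: K = 2 contracts with three periods each.
data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
weights = [[1.0, 1.0, 2.0], [2.0, 2.0, 4.0]]
est_0 = cred_homogeneous(data, weights, 0, 2.0, a=0.05, s2=0.2)
est_1 = cred_homogeneous(data, weights, 1, 2.0, a=0.05, s2=0.2)
print(est_0, est_1)
```

Both estimates are pulled from the individual values (0.5 and 0.25) toward the common value $F_{n_{w z}} (2) = 5/14$.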

The following theorem proves that $F_{n_{w z}} (x)$ has a smaller variance than $F_{n_{w j}} (x)$, i.e., based on the heterogeneity and the fluctuation of the risk, $F_{n_{w z}} (x)$ has minimal mean square error.

Theorem 3. The variance $\mathrm{Var} (\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} I (X_{ij} \leq x))$ is minimized over all $c_{ij}$ such that $\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} = 1$ by $c_{ij} = \frac{w_{ij}}{w_{. j}} \frac{Z_{j}^{F_{x}}}{z_{.}^{F_{x}}}$.

Proof.

We have to minimize the following quantity

\begin{matrix} Q = E {(\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} I (X_{ij} \leq x) - E [\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} I (X_{ij} \leq x)])}^{2} \\ - 2 λ F_{X_{ij}} (x) (\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} - 1) . \end{matrix}

(23)

Taking the derivative of (23) with respect to $c_{i^{'} j^{'}}$ for $i^{'} = 1, \dots, n_{j^{'}}$ and $j^{'} = 1, \dots, K$, we obtain

\begin{matrix} \frac{\partial Q}{\partial c_{i^{'} j^{'}}} = 2 E ([\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} I (X_{ij} \leq x) - E [\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} I (X_{ij} \leq x)]] [I (X_{i^{'} j^{'}} \leq x) - F_{X_{ij}} (x)]) \\ - 2 λ F_{X_{ij}} (x) = 0 \\ \Rightarrow E (\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} [I (X_{ij} \leq x) - F_{X_{ij}} (x)] [I (X_{i^{'} j^{'}} \leq x) - F_{X_{ij}} (x)]) = λ F_{X_{ij}} (x) \\ \Rightarrow \sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{ij} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{i^{'} j^{'}} \leq x)] = λ F_{X_{ij}} (x) . \end{matrix}

(24)

This is the same as

\begin{matrix} \sum_{j \neq j^{'}}^{K} \sum_{i = 1}^{n_{j}} c_{ij} \mathrm{Cov} [I (X_{ij} \leq x), I (X_{i^{'} j^{'}} \leq x)] \\ + \sum_{i = 1}^{n_{j^{'}}} c_{i j^{'}} \mathrm{Cov} [I (X_{i j^{'}} \leq x), I (X_{i^{'} j^{'}} \leq x)] = λ F_{X_{ij}} (x) \\ \Rightarrow \sum_{i = 1}^{n_{j^{'}}} c_{i j^{'}} (a_{F_{x}} + δ_{i i^{'}} \frac{s_{F_{x}}^{2}}{w_{i j^{'}}}) = λ F_{X_{ij}} (x), \end{matrix}

(25)

since the contracts are independent. Writing $i$ and $j$ for $i^{'}$ and $j^{'}$, this gives

\begin{matrix} a_{F_{x}} \sum_{i = 1}^{n_{j}} c_{ij} + c_{ij} \frac{s_{F_{x}}^{2}}{w_{ij}} = λ F_{X_{ij}} (x) \\ \Rightarrow a_{F_{x}} w_{ij} c_{. j} + c_{ij} s_{F_{x}}^{2} = λ F_{X_{ij}} (x) w_{ij} \\ \Rightarrow c_{ij} = \frac{w_{ij} [λ F_{X_{ij}} (x) - a_{F_{x}} c_{. j}]}{s_{F_{x}}^{2}} \end{matrix}

(26)

\begin{matrix} \Rightarrow \sum_{i = 1}^{n_{j}} a_{F_{x}} w_{ij} c_{. j} + \sum_{i = 1}^{n_{j}} c_{ij} s_{F_{x}}^{2} = \sum_{i = 1}^{n_{j}} λ F_{X_{ij}} (x) w_{ij} \\ \Rightarrow a_{F_{x}} w_{. j} c_{. j} + c_{. j} s_{F_{x}}^{2} = λ F_{X_{ij}} (x) w_{. j} \\ \Rightarrow c_{. j} = \frac{w_{. j} λ F_{X_{ij}} (x)}{a_{F_{x}} w_{. j} + s_{F_{x}}^{2}} = λ F_{X_{ij}} (x) \frac{Z_{j}^{F_{x}}}{a_{F_{x}}} . \end{matrix}

(27)

We know that

\begin{matrix} \sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} c_{i j} = 1 \Rightarrow \sum_{j = 1}^{K} c_{. j} = 1 \\ \Rightarrow \sum_{j = 1}^{K} λ F_{X_{i j}} (x) \frac{Z_{j}^{F_{x}}}{a_{F_{x}}} = λ F_{X_{i j}} (x) \frac{z_{.}^{F_{x}}}{a_{F_{x}}} = 1 \Rightarrow λ = \frac{a_{F_{x}}}{F_{X_{i j}} (x) z_{.}^{F_{x}}} . \end{matrix}

We therefore obtain

\begin{matrix} c_{. j} = λ F_{X_{ij}} (x) \frac{Z_{j}^{F_{x}}}{a_{F_{x}}} = \frac{a_{F_{x}} F_{X_{ij}} (x) Z_{j}^{F_{x}}}{a_{F_{x}} F_{X_{ij}} (x) z_{.}^{F_{x}}} = \frac{Z_{j}^{F_{x}}}{z_{.}^{F_{x}}} \end{matrix}

(28)

and from (26), we have

\begin{matrix} c_{ij} = \frac{w_{ij} (λ F_{X_{ij}} (x) - a_{F_{x}} c_{. j})}{s_{F_{x}}^{2}} = \frac{w_{ij} (\frac{a_{F_{x}}}{z_{.}^{F_{x}}} - \frac{a_{F_{x}} Z_{j}^{F_{x}}}{z_{.}^{F_{x}}})}{s_{F_{x}}^{2}} = \frac{w_{ij}}{z_{.}^{F_{x}}} \frac{a_{F_{x}} (1 - Z_{j}^{F_{x}})}{s_{F_{x}}^{2}} = \frac{w_{ij}}{w_{. j}} \frac{Z_{j}^{F_{x}}}{z_{.}^{F_{x}}} . \end{matrix}

□
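Theorem 3 can be checked numerically. The sketch below evaluates the variance of a linear combination directly from relation (3) and the independence of the contracts, and then compares the coefficients of Theorem 3 with randomly perturbed coefficients that still sum to one. The weights, the parameter values and the perturbation scheme are assumptions made for illustration; this is a sanity check, not a proof.

```python
import random


def linear_variance(c, weights, a, s2):
    """Var(sum_ij c_ij I(X_ij <= x)) implied by relation (3), contract by contract."""
    total = 0.0
    for cj, wj in zip(c, weights):
        total += s2 * sum(ci * ci / wi for ci, wi in zip(cj, wj))
        total += a * sum(cj) ** 2
    return total


def optimal_weights(weights, a, s2):
    """c_ij = (w_ij / w_.j) * (Z_j / z_.) as in Theorem 3."""
    z = [a * sum(wj) / (a * sum(wj) + s2) for wj in weights]
    return [[wi / sum(wj) * zj / sum(z) for wi in wj]
            for wj, zj in zip(weights, z)]


weights = [[1.0, 1.0, 2.0], [2.0, 2.0, 4.0]]  # hypothetical w_ij
a, s2 = 0.05, 0.2                             # hypothetical structural parameters
c_opt = optimal_weights(weights, a, s2)
v_opt = linear_variance(c_opt, weights, a, s2)

rng = random.Random(0)
not_better = 0
for _ in range(1000):
    # Perturb every coefficient, then rescale so that the sum is one again.
    c = [[ci + rng.uniform(-0.05, 0.05) for ci in cj] for cj in c_opt]
    total = sum(sum(cj) for cj in c)
    c = [[ci / total for ci in cj] for cj in c]
    if linear_variance(c, weights, a, s2) >= v_opt - 1e-15:
        not_better += 1
print(v_opt, not_better)
```

The minimum variance found this way agrees with the value $a_{F_{x}} / z_{.}^{F_{x}}$ given in (5).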

Theorem 4. Under assumptions (i)–(iii), the quadratic loss for the credibility distribution estimator is given by

\begin{matrix} E {[F_{X_{ij}}^{Cred} (x | Θ_{j}) - F_{X_{ij}} (x | Θ_{j})]}^{2} = a_{F_{x}} (1 - Z_{j}^{F_{x}}) . \end{matrix}

(29)

Proof.

We have

\begin{matrix} E {[F_{X_{ij}}^{Cred} (x | Θ_{j}) - F_{X_{ij}} (x | Θ_{j})]}^{2} \\ = E {(Z_{j}^{F_{x}} [F_{n_{w j}} (x) - F_{X_{ij}} (x)] - [F_{X_{ij}} (x | Θ_{j}) - F_{X_{ij}} (x)])}^{2} \\ = {(Z_{j}^{F_{x}})}^{2} \mathrm{Var} [F_{n_{w j}} (x)] + \mathrm{Var} [F_{X_{ij}} (x | Θ_{j})] - 2 Z_{j}^{F_{x}} \mathrm{Cov} [F_{n_{w j}} (x), F_{X_{ij}} (x | Θ_{j})] \\ = Z_{j}^{F_{x}} (\frac{a_{F_{x}} w_{. j}}{a_{F_{x}} w_{. j} + s_{F_{x}}^{2}}) (\frac{s_{F_{x}}^{2}}{w_{. j}} + a_{F_{x}}) + a_{F_{x}} - 2 Z_{j}^{F_{x}} a_{F_{x}} = Z_{j}^{F_{x}} a_{F_{x}} + a_{F_{x}} - 2 Z_{j}^{F_{x}} a_{F_{x}}, \end{matrix}

that provides (29). □
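The algebra behind (29) can be verified with exact rational arithmetic. The following sketch plugs the variance and covariance expressions from Lemma 1 into the expansion of the quadratic loss; the numerical values of $a_{F_{x}}$, $s_{F_{x}}^{2}$ and $w_{. j}$ are hypothetical.

```python
from fractions import Fraction


def quadratic_loss(a, s2, w_total):
    """Expand E[(F^Cred - F(x | Theta_j))^2] term by term; return (loss, Z_j)."""
    z = a * w_total / (a * w_total + s2)
    var_ecdf = s2 / w_total + a  # Var[F_{n_w j}(x)], relation (4)
    var_true = a                 # Var[F(x | Theta_j)], definition (SP3)
    cov = a                      # Cov[F_{n_w j}(x), F(x | Theta_j)], from (6)
    return z * z * var_ecdf + var_true - 2 * z * cov, z


a, s2, w_total = Fraction(1, 20), Fraction(1, 5), Fraction(4)
loss, z = quadratic_loss(a, s2, w_total)
print(loss, a * (1 - z))  # both equal 1/40
```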

2.4. Optimal Projection Theorem

In the following, De Vylder’s (1976, 1996) optimal projection theorem of random variables in the plane is applied in order to derive the optimal estimators of $F_{X_{ij}} (x)$ and $F_{X_{ij}} (x | Θ_{j})$. In practice, $F_{X_{ij}} (x)$ is replaced by $F_{n_{w z}} (x)$ in (7).

Theorem 5. The optimal estimator of $F_{X_{ij}} (x)$ in the plane $H_{F} (I (X_{ij} \leq x), i = 1, \dots, n_{j}, j = 1, \dots, K)$ is

\begin{matrix} F_{X_{ij}} {(x)}_{Proj} = \mathrm{Proj} [F_{X_{ij}} (x) | H_{F} (I (X_{ij} \leq x), i = 1, \dots, n_{j}, j = 1, \dots, K)] \\ = F_{n_{w z}} (x) . \end{matrix}

Proof.

Directly from (2) and (5). □

Theorem 6. The optimal credibility estimator of $F_{X_{ij}} (x | Θ_{j})$ based on $I (X_{11} \leq x), \dots, I (X_{n_{K} K} \leq x)$ is

\begin{matrix} F_{X_{ij} | Θ_{j}}^{Cred} (x | Θ_{j}) = \mathrm{Proj} [F_{X_{ij} | Θ_{j}} (x | Θ_{j}) | H_{F} (I (X_{ij} \leq x), i = 1, \dots, n_{j}, j = 1, \dots, K)] \\ = Z_{j}^{F_{x}} F_{n_{w j}} (x) + (1 - Z_{j}^{F_{x}}) F_{n_{w z}} (x) . \end{matrix}

(30)

Proof.

In order to prove (30), it is sufficient to verify the unbiasedness and covariance conditions of the optimal projection theorem of random variables on planes not through the origin (see De Vylder (1976, 1996)), that is,

\begin{matrix} E [F_{X_{ij} | Θ_{j}}^{Cred} (x | Θ_{j})] = E [F_{X_{ij} | Θ_{j}} (x | Θ_{j})] = F_{X_{ij}} (x) \end{matrix}

and

\begin{matrix} \mathrm{Cov} [F_{X_{ij} | Θ_{j}}^{Cred} (x | Θ_{j}) - F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{i j^{'}} \leq x)] = const . \end{matrix}

The unbiasedness condition results from (2) and

\begin{matrix} E [F_{X_{ij} | Θ_{j}}^{Cred} (x | Θ_{j})] = E (Z_{j}^{F_{x}} F_{n_{w j}} (x) + (1 - Z_{j}^{F_{x}}) F_{n_{w z}} (x)) = F_{X_{ij}} (x) . \end{matrix}

The covariance condition results from the independence of the contracts and the covariance relations of Lemma 1, which give

\begin{matrix} \mathrm{Cov} [F_{X_{ij} | Θ_{j}}^{Cred} (x | Θ_{j}) - F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{i j^{'}} \leq x)] \\ = \mathrm{Cov} (Z_{j}^{F_{x}} F_{n_{w j}} (x) + (1 - Z_{j}^{F_{x}}) F_{n_{w z}} (x) - F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{i j^{'}} \leq x)) \\ = Z_{j}^{F_{x}} δ_{j j^{'}} \mathrm{Cov} [F_{n_{w j}} (x), I (X_{ij} \leq x)] + (1 - Z_{j}^{F_{x}}) \mathrm{Cov} [F_{n_{w z}} (x), I (X_{i j^{'}} \leq x)] \\ - \mathrm{Cov} [F_{X_{ij} | Θ_{j}} (x | Θ_{j}), I (X_{i j^{'}} \leq x)] \\ = Z_{j}^{F_{x}} δ_{j j^{'}} \frac{a_{F_{x}}}{Z_{j}^{F_{x}}} + (1 - Z_{j}^{F_{x}}) \frac{a_{F_{x}}}{z_{.}^{F_{x}}} - δ_{j j^{'}} a_{F_{x}} = (1 - Z_{j}^{F_{x}}) \frac{a_{F_{x}}}{z_{.}^{F_{x}}} . \end{matrix}

(31)

□

2.5. Unbiased Estimators

Below, we provide unbiased estimators analogous to the Bühlmann and Straub (1970) model.

Lemma 2. The following estimators of the structural parameters $F_{X_{ij}} (x)$, $s_{F_{x}}^{2}$ and $a_{F_{x}}$, presented in Section 2.2, are unbiased:

\begin{matrix} {\hat{F}}_{X_{ij}} (x) = F_{n_{w w}} (x) \quad \text{or} \quad {\hat{F}}_{X_{ij}} (x) = F_{n_{w z}} (x), \end{matrix}

(32)

\begin{matrix} {\hat{s}}_{F_{x}}^{2} = \frac{\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} w_{ij} {[I (X_{ij} \leq x) - F_{n_{w j}} (x)]}^{2}}{\sum_{j = 1}^{K} (n_{j} - 1)}, \end{matrix}

(33)

\begin{matrix} {\hat{a}}_{F_{x}} = \frac{w_{. .}}{w_{. .}^{2} - \sum_{j = 1}^{K} w_{. j}^{2}} [\sum_{j = 1}^{K} w_{. j} {[F_{n_{w j}} (x) - F_{n_{w w}} (x)]}^{2} - (K - 1) {\hat{s}}_{F_{x}}^{2}] . \end{matrix}

(34)

Based on De Vylder (1978), an unbiased estimator of $a_{F_{x}}$ can take the form

\begin{matrix} {\hat{a}}_{F_{x}} = \frac{\sum_{j = 1}^{K} Z_{j}^{F_{x}} {[F_{n_{w j}} (x) - F_{n_{w z}} (x)]}^{2}}{(K - 1)} \quad \text{(pseudo-estimator)} . \end{matrix}

(35)

Proof.

The unbiasedness of

{\hat{F}}_{X_{i j}} (x)

is straightforward and is omitted. The unbiasedness of

{\hat{s}}_{F}^{2}

follows from

\begin{matrix} \sum_{j = 1}^{K} (n_{j} - 1) E ({\hat{s}}_{F_{x}}^{2}) = E [\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} w_{i j} {[I (X_{i j} \leq x) - F_{n_{w j}} (x)]}^{2}] \\ = & \sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} [w_{i j} V a r [I (X_{i j} \leq x)] + V a r [F_{n_{w j}} (x)] - 2 C o v [I (X_{i j} \leq x), F_{n_{w j}} (x)]] \\ = & \sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} w_{i j} (\frac{s_{F_{x}}^{2}}{w_{i j}} + a_{F_{x}} + \frac{a_{F_{x}}}{Z_{j}^{F_{x}}} - 2 \frac{a_{F_{x}}}{Z_{j}^{F_{x}}}) = \sum_{j = 1}^{K} (n_{j} - 1) s_{F_{x}}^{2}, \end{matrix}

(36)

resulting in (33). For the proof of the unbiasedness of (34), we refer to Bühlmann and Straub (1970). Finally, the unbiasedness of

{\hat{a}}_{F_{x}}

in (35) results from

\begin{matrix} (K - 1) E ({\hat{a}}_{F_{x}}) = E [\sum_{j = 1}^{K} Z_{j}^{F_{x}} {[F_{n_{w j}} (x) - F_{n_{w z}} (x)]}^{2}] \\ = & \sum_{j = 1}^{K} Z_{j}^{F_{x}} (V a r [F_{n_{w j}} (x)] + V a r [F_{n_{w z}} (x)] - 2 C o v [F_{n_{w j}} (x), F_{n_{w z}} (x)]) \\ = & \sum_{j = 1}^{K} Z_{j}^{F_{x}} [(\frac{a_{F_{x}}}{Z_{j}^{F_{x}}}) + (\frac{a_{F_{x}}}{z_{.}^{F_{x}}}) - 2 (\frac{a_{F_{x}}}{z_{.}^{F_{x}}})] = (K - 1) a_{F_{x}} \end{matrix}

which implies (35). □
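As an illustration of how the estimators (32)–(34) combine into a credibility distribution estimate at a single threshold x, the following minimal Python sketch may help. It assumes, by analogy with (39), that the weighted empirical distribution is the weight-share of observations not exceeding x and that the credibility factor has the form w_.j a / (w_.j a + s^2); the function names and the toy portfolio are hypothetical and are not taken from the paper.

```python
def weighted_ecdf(xs, ws, x):
    """Weighted empirical distribution F_{n_wj}(x) for one contract."""
    return sum(w for v, w in zip(xs, ws) if v <= x) / sum(ws)


def credibility_cdf(contracts, x):
    """contracts: list of (observations, weights) pairs, one per contract.

    Returns (cred, Z, F_ww, s2_hat, a_hat) at the threshold x.
    """
    K = len(contracts)
    w_j = [sum(ws) for _, ws in contracts]                 # w_.j
    w_tot = sum(w_j)                                       # w_..
    F_j = [weighted_ecdf(xs, ws, x) for xs, ws in contracts]
    F_ww = sum(w * F for w, F in zip(w_j, F_j)) / w_tot    # estimator (32)

    # Estimator (33): weighted within-contract squared deviations.
    within = sum(
        w * ((1.0 if v <= x else 0.0) - F) ** 2
        for (xs, ws), F in zip(contracts, F_j)
        for v, w in zip(xs, ws)
    )
    s2_hat = within / sum(len(xs) - 1 for xs, _ in contracts)

    # Estimator (34); a negative value is set to zero.
    c = w_tot / (w_tot ** 2 - sum(w ** 2 for w in w_j))
    between = sum(w * (F - F_ww) ** 2 for w, F in zip(w_j, F_j))
    a_hat = max(c * (between - (K - 1) * s2_hat), 0.0)

    # Assumed form of the credibility factor: Z_j = w_.j a / (w_.j a + s^2).
    Z = [w * a_hat / (w * a_hat + s2_hat) if a_hat > 0 else 0.0 for w in w_j]
    cred = [z * F + (1 - z) * F_ww for z, F in zip(Z, F_j)]
    return cred, Z, F_ww, s2_hat, a_hat


# Hypothetical toy portfolio: two contracts, three weighted observations each.
toy = [([1, 2, 3], [1, 1, 2]), ([2, 4, 5], [2, 1, 1])]
cred, Z, F_ww, s2_hat, a_hat = credibility_cdf(toy, x=3)
```

Each credibility estimate lies between the individual value and the collective value, and when the estimate of a is zero every contract receives the collective value.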

3. Credible Distribution for Grouped Data

Grouped data are formed by aggregating the individual observations of a variable into groups. For example, a histogram is a density approximation for grouped data. The construction of the empirical distribution based on grouped data can be achieved by obtaining the point values of the empirical distribution function whenever possible. Then, we can approximate the distribution function by connecting those point values with straight lines.

The empirical distribution for grouped data is evaluated at a point x. We consider the case where x is at a group boundary and the case where x lies between two boundaries.

3.1. Empirical Distribution for Grouped Data at Boundary

For contract j, let the group boundaries be

c_{0 j} < c_{1 j} < \dots < c_{n j}

, where

c_{0 j} = 0

and

c_{n + 1, j} = \infty

. Let

m_{i j}

be the number of observations in the interval

(c_{i - 1, j}, c_{i j})

,

i = 1, 2, \dots, n_{j}

,

j = 1, 2, \dots, K

and

m_{. j} = \sum_{i = 1}^{n_{j}} m_{i j}

be the total number of observations for contract j. For grouped data, the empirical distribution function at each group boundary

c_{r j}

is defined as

\begin{matrix} F_{m j} (c_{r j}) = \frac{1}{m_{. j}} \sum_{i = 1}^{r} m_{i j} . \end{matrix}

(37)

For grouped data, there is no problem if the distribution function has to be estimated at a boundary. When all of the information is available, working with the empirical estimate of the distribution function is straightforward (see Klugman et al. (2012)). We have the following assumptions:

3.1.1. Assumptions

(i*): The contracts are independent and the variables $Θ_{1}, \dots, Θ_{K}$ are identically distributed. The observations $X_{i j}$ have finite variance,
(ii*): $E [I (X_{i j} \leq x) | Θ_{j}] = E [F_{m j} (x | Θ_{j})] = F_{X_{i j} | Θ_{j}} (x | Θ_{j}),$
(iii*): $V a r [I (X_{i j} \leq x) | Θ_{j}] = \frac{1}{m_{i j}} σ_{x}^{2} (Θ_{j})$ and $V a r [F_{m j} (x | Θ_{j})] = \frac{1}{m_{. j}} σ_{x}^{2} (Θ_{j}) .$

3.1.2. Structural Parameters

\begin{matrix} μ_{x} = E [F_{m j} (x)] = F_{X_{i j}} (x), s_{x}^{2} = E [σ_{x}^{2} (Θ_{j})], a_{x} = V a r {E [I (X_{i j} \leq x) | Θ_{j}]} . \end{matrix}

(38)

3.1.3. Notation

Here, we adopt the following notation:

\begin{matrix} F_{m j} (x) = \sum_{i = 1}^{n_{j}} \frac{m_{i j}}{m_{. j}} I (X_{i j} \leq x), F_{m m} (x) = \sum_{j = 1}^{K} \frac{m_{. j}}{m_{. .}} F_{m j} (x), F_{m z} (x) = \sum_{j = 1}^{K} \frac{Z_{j}^{x}}{z_{.}^{x}} F_{m j} (x), \\ m_{. j} = \sum_{i = 1}^{n_{j}} m_{i j}, m_{. .} = \sum_{j = 1}^{K} m_{. j}, z_{.}^{x} = \sum_{j = 1}^{K} Z_{j}^{x}, Z_{j}^{x} = \frac{m_{. j} a_{x}}{m_{. j} a_{x} + s_{x}^{2}} . \end{matrix}

(39)

Based on the above assumptions, a credibility distribution estimator for

F_{X_{i j}} (x | Θ_{j})

is obtained as

\begin{matrix} F_{X_{i j}}^{C r e d} (x | Θ_{j}) = Z_{j}^{x} F_{m j} (x) + (1 - Z_{j}^{x}) F_{X_{i j}} (x) . \end{matrix}

(40)

With the following theorem, we can obtain the credibility distribution estimator of

F_{X_{i j}} (x | Θ_{j})

.

Theorem 7.

Under the assumptions

(i^{*})

–

(i i i^{*})

, the credibility factor in (40) is given by

\begin{matrix} Z_{j}^{x} = \frac{m_{. j} a_{x}}{m_{. j} a_{x} + s_{x}^{2}}, \end{matrix}

with

a_{x}

as in (38) and

m_{. j}

as in (39).

Proof.

The proof of the theorem can be obtained by minimizing the expression

\begin{matrix} Q = E {(F_{X_{i j}} (x | Θ_{j}) - Z_{j}^{x} F_{m j} (x) - (1 - Z_{j}^{x}) F_{X_{i j}} (x))}^{2}, \end{matrix}

with respect to

Z_{j}^{x}

. □
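The minimization can be written out in a few lines. The sketch below reads Q as the mean squared error between the conditional distribution and the linear estimator, writes Z for the credibility factor of contract j, m for the total count of contract j and μ_x for the collective distribution value in (38), and uses only assumptions (ii*) and (iii*) together with the law of total variance.

```latex
\begin{aligned}
\operatorname{Var}[F_{mj}(x)]
  &= E\{\operatorname{Var}[F_{mj}(x \mid \Theta_j)]\}
   + \operatorname{Var}\{E[F_{mj}(x \mid \Theta_j)]\}
   = \frac{s_x^2}{m} + a_x, \\
\operatorname{Cov}[F_{X_{ij}}(x \mid \Theta_j),\, F_{mj}(x)]
  &= \operatorname{Var}[F_{X_{ij}}(x \mid \Theta_j)] = a_x, \\
Q(Z)
  &= E\big\{\big[(F_{X_{ij}}(x \mid \Theta_j) - \mu_x) - Z\,(F_{mj}(x) - \mu_x)\big]^2\big\}
   = a_x - 2 Z a_x + Z^2 \Big(a_x + \frac{s_x^2}{m}\Big), \\
\frac{dQ}{dZ} = 0
  &\;\Longrightarrow\;
  Z = \frac{a_x}{a_x + s_x^2 / m} = \frac{m\, a_x}{m\, a_x + s_x^2}.
\end{aligned}
```

The second derivative of Q with respect to Z is positive, so the stationary point is a minimum.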

3.1.4. Credibility Estimators

Lemma 3.

The credibility point estimators of

F_{X_{i j}} (x)

,

s_{x}^{2}

and

a_{x}

are given as follows:

\begin{matrix} {\hat{F}}_{X_{i j}} (x) & = & F_{m m} (x), \quad \text{or} \quad {\hat{F}}_{X_{i j}} (x) = F_{m z} (x) \\ {\hat{s}}_{x}^{2} & = & \frac{\sum_{j = 1}^{K} \sum_{i = 1}^{n_{j}} m_{i j} {[I (X_{i j} \leq x) - F_{m j} (x)]}^{2}}{\sum_{j = 1}^{K} (n_{j} - 1)} \\ {\hat{a}}_{x} & = & (\frac{m_{. .}}{m_{. .}^{2} - \sum_{j = 1}^{K} m_{. j}^{2}}) [\sum_{j = 1}^{K} m_{. j} {[F_{m j} (x) - F_{m m} (x)]}^{2} - (K - 1) {\hat{s}}_{x}^{2}] \\ \text{or} \\ {\hat{a}}_{x} & = & \frac{\sum_{j = 1}^{K} Z_{j}^{x} {[F_{m j} (x) - F_{m z} (x)]}^{2}}{(K - 1)} . \end{matrix}

Proof.

The proof is similar to that of Lemma 2. □
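The last form of the estimator of a_x in Lemma 3 is a pseudo-estimator: the credibility factors on its right-hand side depend on a_x itself, so it defines a_x only implicitly. One common way to evaluate such an expression is fixed-point iteration, sketched below in Python. The paper does not state how the pseudo-estimator was computed, so the iteration scheme, the starting value, the stopping rule and the function name are assumptions made for illustration only.

```python
def pseudo_estimate_a(F_j, m_j, s2, a_start=1.0, tol=1e-12, max_iter=1000):
    """Fixed-point iteration for the pseudo-estimator of a_x.

    F_j : grouped empirical distributions F_mj(x), one value per contract
    m_j : total observation counts m_.j, one value per contract
    s2  : an estimate of s_x^2, taken as given here
    """
    K = len(F_j)
    a = a_start
    for _ in range(max_iter):
        if a <= 0.0:
            return 0.0  # no detectable heterogeneity between contracts
        Z = [m * a / (m * a + s2) for m in m_j]            # Z_j^x as in (39)
        z_tot = sum(Z)
        F_mz = sum(z * F for z, F in zip(Z, F_j)) / z_tot  # credibility-weighted mean
        a_new = sum(z * (F - F_mz) ** 2 for z, F in zip(Z, F_j)) / (K - 1)
        if abs(a_new - a) < tol:
            return a_new
        a = a_new
    return a  # last iterate if the tolerance was not reached


# Hypothetical inputs: two contracts with equal counts.
a_x = pseudo_estimate_a(F_j=[1.0, 0.5], m_j=[4, 4], s2=0.25)
```

The iteration may converge slowly, or settle at zero, when the contracts are nearly homogeneous, which is why a maximum number of iterations is imposed.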

3.2. Empirical Distribution for Grouped Data at Value x between Boundaries

Now, suppose that the value of x is between the boundaries

c_{i - 1, j}

and

c_{i j}

. Then, for contract j, the empirical distribution function is given by

\begin{matrix} F_{m j} (x) = \{\begin{matrix} 0, & x \leq c_{0}, \\ \frac{(c_{i j} - x) F_{m j} (c_{i - 1, j}) + (x - c_{i - 1, j}) F_{m j} (c_{i j})}{c_{i j} - c_{i - 1, j}}, & c_{i - 1, j} \leq x \leq c_{i j}, \\ 1, & x > c_{n} . \end{matrix} \end{matrix}

(41)

This function is differentiable at all values except for the group boundaries. Based on (41), we can obtain the following

\begin{matrix} E [F_{m j} (x | Θ_{j})] = \frac{(c_{i j} - x) F_{X_{i j}} (c_{i - 1, j} | Θ_{j}) + (x - c_{i - 1, j}) F_{X_{i j}} (c_{i j} | Θ_{j})}{c_{i j} - c_{i - 1, j}} \end{matrix}

and

\begin{matrix} F_{X_{i j}} (x) = E [F_{m j} (x)] = \frac{(c_{i j} - x) F_{X_{i j}} (c_{i - 1, j}) + (x - c_{i - 1, j}) F_{X_{i j}} (c_{i j})}{c_{i j} - c_{i - 1, j}} . \end{matrix}

Note that the above estimator is biased for the true distribution function at x, although it is an unbiased estimator of the true interpolated value (see Klugman et al. (2012)).

The conditional variance of the empirical distribution is

\begin{matrix} V a r [F_{m j} (x | Θ_{j})] & = & \frac{{(c_{i j} - x)}^{2} V a r [F_{m j} (c_{i - 1, j} | Θ_{j})] + {(x - c_{i - 1, j})}^{2} V a r [F_{m j} (c_{i j} | Θ_{j})]}{{(c_{i j} - c_{i - 1, j})}^{2}} \\ & & + \frac{2 (c_{i j} - x) (x - c_{i - 1, j}) C o v [F_{m j} (c_{i - 1, j} | Θ_{j}), F_{m j} (c_{i j} | Θ_{j})]}{{(c_{i j} - c_{i - 1, j})}^{2}}, \end{matrix}

where

\begin{matrix} V a r [F_{m j} (c_{i - 1, j} | Θ_{j})] = \frac{1}{m_{. j}} F_{X_{i j}} (c_{i - 1, j} | Θ_{j}) [1 - F_{X_{i j}} (c_{i - 1, j} | Θ_{j})], \end{matrix}

\begin{matrix} V a r [F_{m j} (c_{i j} | Θ_{j})] = \frac{1}{m_{. j}} F_{X_{i j}} (c_{i j} | Θ_{j}) [1 - F_{X_{i j}} (c_{i j} | Θ_{j})] \end{matrix}

and

\begin{matrix} C o v [F_{m j} (c_{i - 1, j} | Θ_{j}), F_{m j} (c_{i j} | Θ_{j})] & = & \frac{1}{m_{. j}} (F_{X_{i j}} (\min \{c_{i - 1, j}, c_{i j}\} | Θ_{j}) \\ & & - F_{X_{i j}} (c_{i - 1, j} | Θ_{j}) F_{X_{i j}} (c_{i j} | Θ_{j})) . \end{matrix}

Then, we can proceed as in Section 3.1 for obtaining the credibility distribution estimator of

F_{X_{i j}} (x | Θ_{j})

, when the value of x is between boundaries.
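The boundary values in (37) and the linear interpolation in (41) can be illustrated with a short Python sketch. It assumes a single contract with finite, strictly increasing boundaries (an infinite last boundary is not handled), and the function name and the example counts are hypothetical.

```python
from bisect import bisect_left


def grouped_ecdf(boundaries, counts, x):
    """Empirical distribution for grouped data, following (37) and (41).

    boundaries : increasing finite boundaries c_0 < c_1 < ... < c_n
    counts     : counts[i-1] is the number of observations between
                 c_{i-1} and c_i
    """
    if len(boundaries) != len(counts) + 1:
        raise ValueError("need exactly one more boundary than counts")
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return 1.0
    total = float(sum(counts))
    # Cumulative values F(c_0), ..., F(c_n) at the boundaries, as in (37).
    cum = [0.0]
    for m in counts:
        cum.append(cum[-1] + m / total)
    i = bisect_left(boundaries, x)  # index with c_{i-1} < x <= c_i
    lo, hi = boundaries[i - 1], boundaries[i]
    # Linear interpolation between adjacent boundaries, as in (41).
    return ((hi - x) * cum[i - 1] + (x - lo) * cum[i]) / (hi - lo)


# Hypothetical grouped sample: 10 observations in three intervals.
bounds = [0, 10, 20, 40]
counts = [2, 5, 3]
values = [grouped_ecdf(bounds, counts, x) for x in (10, 15, 30)]
```

At a boundary the interpolation weight on the neighbouring boundary vanishes, so the function returns the value in (37) there.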

4. Alternative Credibility Distribution Approach for Grouped Data

For grouped data, the previous approaches yield credibility point estimates. If we want to find the credibility estimation in the framework of Bühlmann and Straub (1970), we may apply the concept of uniform distribution within each interval

(c_{i - 1, j}, c_{i j})

and the first two moments can be estimated from

\begin{matrix} {\hat{μ}}_{j}^{(k)} = \sum_{i = 1}^{r} (\frac{m_{i j}}{m_{. j}}) (\frac{c_{i j}^{k + 1} - c_{i - 1, j}^{k + 1}}{(k + 1) (c_{i j} - c_{i - 1, j})}), \end{matrix}

for

k = 1, 2

. Thus, for contract j, the empirical estimate of the mean

(k = 1)

is the weighted average of the interval midpoints where the weight

\frac{m_{i j}}{m_{. j}}

for an interval is the proportion of the observations that are in the interval (histogram), i.e.,

\begin{matrix} {\hat{μ}}_{j} = \sum_{i = 1}^{r} \frac{m_{i j}}{m_{. j}} (\frac{c_{i j} + c_{i - 1, j}}{2}) . \end{matrix}

Letting

C_{i j} = \frac{c_{i j} + c_{i - 1, j}}{2}

and assuming that

E (C_{i j} | Θ_{j}) = μ (Θ_{j})

and

Cov (C_{r j}, C_{i j} | Θ_{j}) = δ_{r i} \frac{1}{m_{i j}} σ^{2} (Θ_{j})

, the credibility estimation based on grouped data can be obtained similarly as in the Bühlmann and Straub (1970) model

\begin{matrix} μ^{C r e d} (Θ_{j}) = Z_{j} μ_{j} + (1 - Z_{j}) μ, \end{matrix}

(42)

with parameters

\begin{matrix} μ = E [μ (Θ_{j})], s^{2} = E [σ^{2} (Θ_{j})], a = V a r [μ (Θ_{j})], Z_{j} = \frac{a m_{. j}}{a m_{. j} + s^{2}} . \end{matrix}

(43)

Theorem 8.

The following are unbiased estimators for μ,

s^{2}

, and a:

\begin{matrix} \hat{μ} = {\bar{C}}_{. .} = \sum_{j = 1}^{K} \frac{m_{. j}}{m_{. .}} {\bar{C}}_{. j}, \quad \text{with} \quad {\hat{μ}}_{j} = {\bar{C}}_{. j} = \sum_{i = 1}^{n_{j}} \frac{m_{i j}}{m_{. j}} C_{i j}, \end{matrix}

\begin{matrix} {\hat{s}}^{2} = \frac{1}{K} \sum_{j = 1}^{K} {\hat{s}}_{j}^{2}, \quad \text{with} \quad {\hat{s}}_{j}^{2} = {\hat{μ}}_{j}^{(2)} - {({\hat{μ}}_{j})}^{2} \end{matrix}

and

\begin{matrix} \hat{a} = (\frac{m_{. .}}{m_{. .}^{2} - \sum_{j = 1}^{K} m_{. j}^{2}}) [\sum_{j = 1}^{K} m_{. j} ({\hat{μ}}_{j} - \hat{μ})^{2} - (K - 1) {\hat{s}}^{2}], \end{matrix}

or

\begin{matrix} \hat{a} = \frac{1}{K - 1} \sum_{j = 1}^{K} Z_{j} {({\bar{C}}_{. j} - {\bar{C}}_{z})}^{2}, \quad \text{where} \quad {\bar{C}}_{z} = \sum_{j = 1}^{K} \frac{Z_{j}}{z_{.}} {\bar{C}}_{. j} . \end{matrix}

Proof.

See Bühlmann and Straub (1970) and De Vylder (1978). □
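The grouped-data moments and the estimators of Theorem 8 can be traced through a small Python sketch. It assumes that all contracts share the same finite boundaries (the paper allows contract-specific boundaries), uses the first form of the estimator of a, and sets a negative estimate to zero; the function names and the toy counts are hypothetical.

```python
def grouped_moment(boundaries, counts, k):
    """k-th raw moment under a uniform distribution within each interval."""
    total = float(sum(counts))
    return sum(
        (m / total) * (hi ** (k + 1) - lo ** (k + 1)) / ((k + 1) * (hi - lo))
        for lo, hi, m in zip(boundaries[:-1], boundaries[1:], counts)
    )


def grouped_credibility_means(boundaries, counts_by_contract):
    """Credibility means as in (42), from grouped data with common boundaries."""
    K = len(counts_by_contract)
    m_j = [float(sum(c)) for c in counts_by_contract]      # m_.j
    m_tot = sum(m_j)                                       # m_..
    mu_j = [grouped_moment(boundaries, c, 1) for c in counts_by_contract]
    s2_j = [grouped_moment(boundaries, c, 2) - mu ** 2
            for c, mu in zip(counts_by_contract, mu_j)]
    mu_hat = sum(m * mu for m, mu in zip(m_j, mu_j)) / m_tot
    s2_hat = sum(s2_j) / K
    c = m_tot / (m_tot ** 2 - sum(m ** 2 for m in m_j))
    between = sum(m * (mu - mu_hat) ** 2 for m, mu in zip(m_j, mu_j))
    a_hat = max(c * (between - (K - 1) * s2_hat), 0.0)     # truncated at zero
    Z = [a_hat * m / (a_hat * m + s2_hat) if a_hat > 0 else 0.0 for m in m_j]
    cred = [z * mu + (1 - z) * mu_hat for z, mu in zip(Z, mu_j)]
    return cred, Z, mu_hat, s2_hat, a_hat


# Hypothetical toy data: two contracts, two intervals, 10 observations each.
cred, Z, mu_hat, s2_hat, a_hat = grouped_credibility_means(
    [0, 10, 20], [[5, 5], [2, 8]]
)
```

For k = 1 the moment reduces to the weighted average of interval midpoints, so the individual means in this toy example are 10 and 13.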

5. Numerical Illustrations

In this section, we use two datasets, one with motor insurance claims data and a second with monthly financial returns data.

5.1. Numerical Example with Insurance Data

The dataset is provided by Insurance Europe (2022) and includes a database with figures on the European insurance industry during the period 2004–2020 for 32 EU countries. Our numerical illustration is based on a complete dataset of 10 selected countries for the years 2004–2018. Our dataset also contains the motor claims paid and the number of motor claims for each country and each year. The selected countries are the following: Austria (AT); Germany (DE); Finland (FI); Greece (GR); Hungary (HR); Italy (IT); Norway (NO); Poland (PL); Portugal (PT); and Sweden (SE). Table 1 shows the summary statistics of the motor claim amounts and the claim numbers for countries

j = 1, \dots, 10

and years

i = 1, \dots, 15

.

Table 2 illustrates the results of the credibility distribution function for motor claim amounts during the years 2004–2018. In more detail, the upper part of the table shows the individual empirical distribution

{\hat{F}}_{n_{w j}} (x)

of claim amounts

X_{i j} \leq x

(x = 320, 800, 1000, 2000, 3000, 23,800, 23,896, 23,897) and the corresponding credibility distribution estimators

{\hat{F}}_{X_{i j}}^{C r e d} (x | Θ_{j})

are shown in the middle part of the table. The estimated credibility factors

{\hat{Z}}_{j}^{F_{x}}

, as well as the estimated parameters

{\hat{F}}_{n_{w w}} (x)

,

{\hat{s}}_{F_{x}}^{2}

,

{\hat{a}}_{F_{x}}

, are presented in the lower part of Table 2. Note that

{\hat{F}}_{n_{w j}} (x) = 0

means that all claims of contract j satisfy

X_{i j} > x

, and

{\hat{F}}_{n_{w j}} (x) = 1

means that all claims of contract j satisfy

X_{i j} \leq x

.

In Table 2, we observe a lack of monotonicity of the estimated credibility distributions for all contracts. In order to obtain monotonicity, we proceed as in Cai et al. (2015) by restricting the credibility factor

Z_{j}^{F}

to be a constant free of x. The results are shown in Table 3. Although monotonicity has been restored, from a risk management perspective (which serves fair pricing and reserving given the nature of the risk) more investigation is required, especially at the points where monotonicity breaks down.

Remark 1.

Another way of obtaining monotonicity of the estimated credibility distribution functions is to sort the resulting credibility estimates. In the relevant literature, there are methods for extracting a monotone function from non-monotonic data. One such method is monotonic regression, which achieves monotonicity and smoothness of the regression by introducing a regularization term and solving a constrained optimization problem. Some key references are Friedman and Tibshirani (1984), Mukerjee (1988), Shively et al. (2009) and Zhang (2004). These approaches could similarly be applied to our model.
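To make the idea of extracting a monotone function concrete, the sketch below implements the pool-adjacent-violators algorithm, which computes the least-squares non-decreasing fit to a sequence. This is plain isotonic regression without a smoothness or regularization term, so it is a simpler stand-in and not the method of any of the references cited above; the function name is hypothetical. The example input is the DE column of the credibility estimates in Table 2 at x = 320, 800 and 1000.

```python
def pava(values, weights=None):
    """Least-squares non-decreasing fit via pool-adjacent-violators."""
    if weights is None:
        weights = [1.0] * len(values)
    blocks = []  # each block holds [weighted mean, total weight, length]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            w12 = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w12, w12, n1 + n2])
    fitted = []
    for mean, _, length in blocks:
        fitted.extend([mean] * length)
    return fitted


# DE column of Table 2 (credibility estimates at x = 320, 800, 1000).
de_estimates = [0.01256, 0.01348, 0.00583]
de_monotone = pava(de_estimates)
de_sorted = sorted(de_estimates)  # the sorting alternative from the remark
```

For this input the algorithm pools all three values into their common mean, whereas sorting keeps the three distinct values and only reorders them; which behaviour is preferable depends on the application.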

By letting the values of motor claims be larger than x = 23,800 and less than or equal to x = 23,897 (x = 23,897 is the maximum threshold of contract DE, which is the contract with the largest values of motor claims, as shown in Table 1), the values of the estimated credibility distribution

{\hat{F}}_{X_{i j}}^{C r e d} (x | Θ_{j})

remain the same up to the fifth decimal place. By letting x > 23,897, the estimated credibility distribution goes to 1 (see Table 2).

Remark 2.

As in the Bühlmann and Straub (1970) model,

{\hat{a}}_{F_{x}}

can possibly be negative. This means that there is no detectable difference between the risks. In this case we put

{\hat{a}}_{F_{x}} = 0

, as in our cases for x = 23,800, 23,896, 23,897.

Figure 1 displays the individual empirical distribution in each contract. Note that the red bullets indicate the corresponding credibility estimate at specific points presented in Table 2.

Credibility Coefficients for Motor Claims Data

In the following, we provide an intuitive interpretation for the form of the credibility distribution estimator given in Theorem 1 for motor claims by presenting the following coefficients in Table 4, which were derived based on the lower part of Table 2. These are: the coefficient of variation

B R V = \frac{\sqrt{{\hat{a}}_{F_{x}}}}{{\hat{F}}_{n_{w w}} (x)}

, which is a good measure of the heterogeneity of the portfolio (i.e., of the between-risk variability), and the average within-risk coefficient of variation

W R V = \frac{{\hat{s}}_{F_{x}}}{{\hat{F}}_{n_{w w}} (x)}

, which is a good measure of the within-risk variability. The smaller the credibility coefficient

C C = \frac{{\hat{s}}_{F_{x}}^{2}}{{\hat{a}}_{F_{x}}}

, the greater the

{\hat{Z}}_{j}^{F_{x}} .

Remark 3.

The results of Table 2 and Remark 2, for x = 23,800,

{\hat{a}}_{F_{x}} =

0 and

{\hat{s}}_{F_{x}}^{2}

= 61,020, imply that BRV = 0 and

C C = \infty

. Similarly, setting x = 23,900,

{\hat{a}}_{F_{x}}

= 0 and

{\hat{s}}_{F_{x}}^{2} =

1,736,923 implies that BRV = 0 and

C C = \infty

.
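The three coefficients can be computed directly from the estimated structural parameters, as the following Python sketch shows. The function name is hypothetical, the handling of a zero estimate of a (infinite credibility coefficient) follows Remarks 2 and 3, and the collective distribution value used in the example is a placeholder because the remark does not report it.

```python
import math


def credibility_coefficients(F_ww, s2_hat, a_hat):
    """Return (BRV, WRV, CC) from the estimated structural parameters."""
    if F_ww <= 0:
        raise ValueError("F_ww must be positive for the ratios to be defined")
    a_hat = max(a_hat, 0.0)  # negative estimates are set to zero
    brv = math.sqrt(a_hat) / F_ww   # between-risk coefficient of variation
    wrv = math.sqrt(s2_hat) / F_ww  # within-risk coefficient of variation
    cc = s2_hat / a_hat if a_hat > 0 else math.inf
    return brv, wrv, cc


# Values from Remark 3 at x = 23,800; F_ww = 1.0 is a placeholder.
brv, wrv, cc = credibility_coefficients(F_ww=1.0, s2_hat=61020.0, a_hat=0.0)
```

If the credibility factor has the form w_.j a / (w_.j a + s^2), it can be rewritten as w_.j / (w_.j + CC), which is why a smaller credibility coefficient yields a larger credibility factor and an infinite one yields a factor of zero.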

5.2. Example of Credibility Distribution Estimation with Financial Data

The dataset was created (see Fama and French (2022)) as follows: each NYSE, AMEX, and NASDAQ stock was assigned to an industry portfolio at the end of June of year t based on its four-digit SIC code at that time. Compustat SIC codes have been used for the fiscal year ending in the calendar year

t - 1

. Whenever Compustat SIC codes are not available, CRSP SIC codes for June of year t were used. Then, returns from July of year t to June of year

t + 1

are computed. The weights are the number of firms in portfolios.

In particular, the dataset consists of monthly returns from July 1926 to July 2022 and contains value returns for 10 industry portfolios. The credibility distribution for each of these portfolios needs to be estimated. We treat the return as a random variable X, where positive values represent a profit (P) and negative values represent a loss (L). The 10 industry portfolios are as follows:

(1): NoDur: Consumer non-durables—food, tobacco, textiles, apparel, leather, and toys.
(2): Durbl: Consumer durables—cars, TVs, furniture, household appliances.
(3): Manuf: Manufacturing—machinery, trucks, planes, chemicals, office furniture, and paper.
(4): Enrgy: Oil, gas, and coal extraction and products.
(5): HiTec: Business equipment—computers, software, and electronic equipment.
(6): Telcm: Telephone and television transmission.
(7): Shops: Wholesale, retail, and some services (laundries, repair shops).
(8): Hlth: Healthcare, medical equipment, and drugs.
(9): Utils: Utilities.
(10): Other: Other—mines, construction, building material, transportation, hotels, bus service, entertainment, and finance.

Table 5 provides some descriptive statistics of the (P/L) monthly returns of the 10 industry portfolios. The number of observations in each portfolio is

n = 1155

.

Table 6 illustrates the results of the credibility distribution function for monthly returns of the 10 industry portfolios from July 1926 to July 2022. In more detail, the upper part of the table shows the individual empirical distribution

{\hat{F}}_{n_{w j}} (x)

of the returns

X_{i j} \leq x,

(

x = - 15, - 10, - 5, 0, 10, 15, 34.17, 59, 60, 79.79

) and the corresponding credibility distribution estimators

{\hat{F}}_{X_{i j} | Θ_{j}}^{C r e d} (x | Θ_{j})

are shown in the middle part of the table. The estimated credibility factors

{\hat{Z}}_{j}^{F_{x}}

, as well as the estimated parameters

{\hat{F}}_{n_{w w}} (x)

,

{\hat{s}}_{F_{x}}^{2}

,

{\hat{a}}_{F_{x}}

, are presented in the lower part of Table 6. The monotonicity of the estimated distribution function is shown in Table 6. By letting the values of returns be larger than x = 59 and less than or equal to x = 79.79 (x = 79.79 is the maximum threshold of portfolio Durbl, which is the portfolio with the largest return values, as shown in Table 5), the values of the estimated credibility distribution

{\hat{F}}_{X_{i j}}^{C r e d} (x | Θ_{j})

remain the same up to the fifth decimal place. By letting

x > 79.79

,

{\hat{F}}_{X_{i j}}^{C r e d} (x > 79.79 | Θ_{j})

goes to 1 (see Table 6).

Figure 2 displays the individual empirical distribution in each contract. Again, note that the red bullets indicate the corresponding credibility estimate at specific points presented in Table 6.

Credibility Coefficients for Industry Portfolios Data

Here, we provide an intuitive interpretation for the form of the credibility distribution estimator for the monthly returns for the 10 industry portfolios, by presenting the following credibility coefficients. Table 7 illustrates the coefficient of variation

B R V = \frac{\sqrt{{\hat{a}}_{F_{x}}}}{{\hat{F}}_{n_{w w}} (x)}

, the average within-risk coefficient of variation

W R V = \frac{{\hat{s}}_{F_{x}}}{{\hat{F}}_{n_{w w}} (x)}

, and the credibility coefficient

C C = \frac{{\hat{s}}_{F_{x}}^{2}}{{\hat{a}}_{F_{x}}}

for the industry portfolio data.

Remark 4.

The results of Table 6 and Remark 2, for x = 50,

{\hat{a}}_{F_{x}} =

0 and

{\hat{s}}_{F_{x}}^{2} =

0.0469, imply that BRV = 0 and

C C = \infty

.

5.3. Example of Credibility Distribution Estimation with Financial Grouped Data

The empirical distribution function for the grouped data was depicted by the step function of the Fama and French (2022) data. The grouping (see Table 8) is a subjective element in this fit, and other analysts might choose different groupings. The total number of observations in each portfolio is the same (

m_{. j} = 1155

).

Table 9 illustrates the results of the credibility distribution function for monthly returns of the 10 industry portfolios from July 1926 to July 2022. In more detail, the upper part of the table shows the individual empirical distribution

{\hat{F}}_{m j} (x)

of returns

X_{i j} \leq x,

(

x = - 15, - 10, - 5, 0, 10, 15

) and the corresponding credibility distribution estimators

{\hat{F}}_{X_{i j} | Θ_{j}}^{C r e d} (x | Θ_{j})

are shown in the middle part of the table. The estimated credibility factors

Z_{j}^{x}

, as well as the estimated parameters

{\hat{F}}_{m m} (x)

,

{\hat{s}}_{x}^{2}

,

{\hat{a}}_{x}

, are presented in the lower part of Table 9. The monotonicity of the estimated distribution function is shown in Table 9, but the convergence to one of the estimated credibility distribution for grouped data should be further investigated.

Figure 3 displays the smoothed individual empirical distribution for grouped data in each contract. Again, the red bullets indicate the corresponding credibility estimate at specific points presented in Table 9.

Credibility Coefficients for Financial Grouped Data

Table 10 illustrates the coefficient of variation

B R V

, the average within-risk coefficient of variation

W R V

, and the credibility coefficient

C C

for the industry portfolios of grouped data.

5.4. Example of the Classical Credibility Estimation with Financial Grouped Data

For grouped data, the previous approach gives a credibility point estimate. If we want to derive the classical credibility estimation, we can apply the concept of uniform distribution within each interval of returns and take the interval midpoints as the value of return. The weights are the number of observations in each interval. Table 11 shows the individual average return for the 10 industry portfolios

{\hat{μ}}_{j}

, the credibility estimation of returns for these portfolios

μ^{C r e d} (Θ_{j})

, along with the credibility factor

Z_{j}

and the estimated parameters

\hat{μ}

,

\hat{a}

and

{\hat{s}}^{2}

.

6. Concluding Remarks

The objective of this paper was to present a credibility distribution model that adequately describes insurance losses and can be used for risk management purposes.

The main contribution of the paper is that it embeds the empirical distribution into credibility modeling in the form of the Bühlmann and Straub (1970) model. In the first part of the paper, we present the model of the weighted credibility distribution, and in the second part, a model that applies to data grouped in intervals.

With our models, we examine two datasets, one with motor claim amounts and the number of motor claims from 10 selected European countries during the period 2004–2018, and a second with monthly returns from July 1926 to July 2022 for 10 industry portfolios. To apply our credibility distribution model to grouped data, we grouped the second dataset (Fama/French financial data) into intervals of returns. Under this setting, the grouping is subjective, the weights are the number of observations within each interval, and the total weight in each portfolio is the same.

The monotonicity (or non-monotonicity) and the convergence to one of the estimated distribution functions are shown numerically in Table 2, Table 3, Table 6 and Table 9. From a theoretical point of view, the monotonicity and the convergence of the estimated distribution functions need further investigation. Furthermore, sufficient conditions for the asymptotic optimality of the empirical credibility distribution estimators could also be investigated in future work.

Funding

This research received no external funding.

Data Availability Statement

The datasets that were used in this study are available online at the following links: https://www.insuranceeurope.eu/statistics (accessed on 10 September 2022) and https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html (accessed on 20 September 2022).

Acknowledgments

This work has been partly supported by the University of Piraeus Research Center. The author thanks the anonymous referees for their suggestions and comments that helped to improve the paper.

Conflicts of Interest

The author declares no conflict of interest.

References

Bozikas, Apostolos, and Georgios Pitselis. 2020. Incorporating crossed classification credibility into the Lee–Carter model for multi-population mortality data. Insurance: Mathematics and Economics 93: 353–68.
Bozikas, Apostolos, and Georgios Pitselis. 2021. Multi-population mortality modelling and forecasting: A hierarchical credibility regression approach. European Actuarial Journal 11: 231–67.
Bühlmann, Hans. 1967. Experience rating and credibility. ASTIN Bulletin 4: 199–207.
Bühlmann, Hans, and Erwin Straub. 1970. Glaubwürdigkeit für Schadensätze. Mitteilungen der Vereinigung Schweizerischer Versicherungsmathematiker 70: 111–33.
Cai, Xiaoqiang, Limin Wen, Xianyi Wu, and Xian Zhou. 2015. Credibility Estimation of distribution functions with applications to experience rating in general insurance. North American Actuarial Journal 19: 311–35.
Christiansen, Marcus, and Edo Schinzinger. 2016. A Credibility Approach for Combining Likelihoods Generalized Linear Models. Astin Bulletin 46: 531–69.
Denuit, Michel. 2008. Comonotonic approximations to quantiles of life annuity conditional expected present value. Insurance: Mathematics and Economics 42: 831–38.
De Vylder, Etienne F. 1976. Geometrical Credibility. Scandinavian Actuarial Journal 3: 121–49.
De Vylder, Etienne F. 1978. Parameter Estimation in Credibility Theory. ASTIN Bulletin 10: 99–112.
De Vylder, Etienne F. 1996. Advanced Risk Theory-A Self-Contained Introduction. Brussels: Editions de L’Universite de Bruxelles.
Fama, Eugene F., and Kenneth R. French. 2022. CRSP Data. Available online: https://mba.tuck.dartmouth.edu/pages/faculty/ken.french/data_library.html (accessed on 20 September 2022).
Ferguson, Thomas. 1973. A Bayesian analysis of some non-parametric problems. Annals of Statistics 1: 209–30.
Friedman, Jerome, and Robert Tibshirani. 1984. The monotone smoothing of scatterplots. Technometrics 26: 243–50.
Gebizlioglu, Omer L., and Banu Yagci. 2008. Tolerance intervals for quantiles of bivariate risks and risk measurement. Insurance: Mathematics and Economics 42: 1022–27.
Gong, Yikai (Maxwell), Zhuangdi Li, Maria Milazzo, Kristen Moore, and Matthew Provencher. 2018. Credibility methods for individual life insurance. Risks 6: 144.
Hachemeister, Charles A. 1975. Credibility for regression models with application to trend. In Credibility, Theory and Applications. Edited by P. Kahn. New York: Academic Press, Inc., pp. 307–48.
Insurance Europe. 2022. Available online: https://www.insuranceeurope.eu/statistics (accessed on 10 September 2022).
Jewell, William S. 1974a. Credible means are exact Bayesian for exponential families. ASTIN Bulletin 8: 77–90.
Jewell, William S. 1974b. The Credible distribution. ASTIN Bulletin 7: 237–69.
Kim, Joseph H.T., and Yongho Jeon. 2013. Credibility theory based on trimming. Insurance: Mathematics and Economics 53: 46–57.
Kim, Minwoo, Himchan Jeong, and Dipak Dey. 2022. Approximation of Zero-Inflated Poisson Credibility Premium via Variational Bayes Approach. Risks 10: 54.
Klugman, Stuart A., Harry Panjer, and Gordon E. Willmot. 2012. Loss Models: From Data to Decisions. New York: Wiley.
Korwar, Ramesh M., and Myles Hollander. 1973. Contributions to the theory of Dirichlet processes. Annals of Statistics 1: 705–11.
Kudryavtsev, Andrey. 2009. Using quantile regression for rate-making. Insurance: Mathematics and Economics 45: 296–304.
Landsman, Zinoviy. 1996. Sample quantiles and additive statistics: Information, sufficiency, estimation. Journal of Statistical Planning and Inference 52: 93–108.
Landsman, Zinoviy M., and Udi E. Makov. 1998. Exponential dispersion models and credibility. Scandinavian Actuarial Journal 1: 89–96.
Landsman, Zinoviy M., and Udi E. Makov. 1999. Credibility evaluations for exponential dispersion families. Insurance: Mathematics and Economics 24: 33–9.
Makov, Udi E., Adrian F. M. Smith, and Yu H. Liu. 1996. Bayesian methods in actuarial science. Journal of the Royal Statistical Society Series D 45: 503–15.
Mukerjee, Hari. 1988. Monotone nonparametric regression. Annals of Statistics 16: 741–50.
Pitselis, Georgios. 2009. Solvency Supervision based on a total balance sheet approach. Journal of Computational and Applied Mathematics 233: 83–96.
Pitselis, Georgios. 2013. Quantile credibility models. Insurance: Mathematics and Economics 52: 477–89.
Pitselis, Georgios. 2017. Risk measures in a quantile regression credibility framework with Fama/French data applications. Insurance: Mathematics and Economics 74: 122–34.
Pitt, David G. W. 2006. Regression quantile analysis of claim termination rates for income protection insurance. Annals of Actuarial Science 1: 345–57.
Shively, Thomas S., Thomas W. Sager, and Stephen G. Walker. 2009. A Bayesian Approach to Non-Parametric Monotone Function Estimation. Journal of the Royal Statistical Society Series B: Statistical Methodology 71: 159–75.
Tsai, Cary Chi-Liang, and Adelaide Di Wu. 2020. Bühlmann credibility-based approaches to modelling mortality rates for multiple populations. North American Actuarial Journal 24: 290–315.
Tsai, Cary Chi-Liang, and Tzuling Lin. 2017. Incorporating the Bühlmann credibility into mortality models to improve forecasting performances. Scandinavian Actuarial Journal 5: 419–40.
Tsai, Cary Chi-Liang, and Ying Zhang. 2019. A multi-dimensional Bühlmann credibility approach to modelling multi-population mortality rates. Scandinavian Actuarial Journal 5: 406–31.
Wang, Wei, Limin Wen, Zhixin Yang, and Quan Yuan. 2021. Quantile Credibility Models with Common Effects. Risks 8: 100.
Xacur, Oscar Alberto Quijano, and José Garrido. 2018. Bayesian credibility for GLMs. Insurance: Mathematics and Economics 83: 180–89.
Yan, Yujie, and Kai-Sheng Song. 2022. A general optimal approach to Bühlmann credibility theory. Insurance: Mathematics and Economics 104: 262–82.
Youn, Ahn Jae, Himchan Jeong, and Yang Lu. 2021. On the ordering of credibility factors. Insurance: Mathematics and Economics 101: 626–38.
Zehnwirth, Benjamin. 1981. A Note on the Asymptotic Optimality of the Empirical Bayes Distribution Function. Annals of Statistics 9: 221–24.
Zhang, Jin-Ting. 2004. A simple and efficient monotone smoother using smoothing splines. Journal of Nonparametric Statistics 16: 779–96.

Figure 1. Individual empirical distribution and credibility distribution point estimates (in red) for motor claims per contract.

Figure 2. Individual empirical distribution and credibility distribution point estimates (in red) for industry portfolios per contract.

Figure 3. Individual empirical distribution and credibility distribution point estimates (in red) for grouped data per contract.

Table 1. Summary statistics for 10 selected European countries.

Motor Claims and Number of Motor Claims for the Years 2004–2018
$X_{i j}$ : Motor claims amount in millions for the years $i = 1, \dots, 15$ and countries $j = 1, \dots, 10$
Country	AT	DE	FI	GR	HR	IT	NO	PL	PT	SE
Min.	1923	18,789	692	438	207	12,791	1062	1582	987	311
1st Qu.	1989	19,322	803	530	226	12,968	1205	1935	1187	912
Median	2032	20,222	914	978	248	15,239	1396	2219	1261	987
Mean	2104	20,692	909	840	256	15,112	1359	2322	1218	1098
3rd Qu.	2201	21,828	1024	1083	288	16,492	1514	2620	1282	1504
Max.	2430	23,897	1141	1224	330	18,210	1637	3235	1362	1670
$w_{i j}$ : Weights–number of motor claims
Min.	1,177,269	8,673,000	368,898	403,604	170,205	3,389,677	562,981	1,307,003	643,713	890,304
1st Qu.	1,238,392	9,002,000	420,122	427,265	199,179	3,467,180	659,063	1,465,112	712,819	1,043,110
Median	1,279,586	9,247,000	497,201	474,875	204,421	4,541,671	758,814	1,749,483	837,694	1,098,411
Mean	1,282,093	9,220,067	493,392	475,977	208,084	4,317,710	728,372	2,177,715	800,425	1,111,064
3rd Qu.	1,323,949	9,425,500	573,942	515,984	218,964	5,026,480	783,394	1,891,870	876,323	1,168,650
Max.	1,396,250	9,750,000	641,513	580,985	238,904	5,249,558	908,663	4,515,087	959,781	1,328,331

Note: $X_{i j}$ are the average claims per year and $w_{i j}$ represents the number of motor claims that correspond to each $X_{i j}$.

Table 2. Credibility distribution estimation for motor claims.

Motor Claim Amounts and Number of Motor Claims from 10 Selected European Countries
during the Period 2004–2018
Individual empirical distribution with claim amount $X_{i j} \leq x,$ $(x$ = 320, 800, 1000, 2000, 3000, 23,800, 23,896, 23,897)
Country	AT	DE	FI	GR	HR	IT	NO	PL	PT	SE
${\hat{F}}_{n_{w j}} (320)$	0.0000	0.0000	0.0000	0.0000	0.9235	0.0000	0.0000	0.0000	0.0000	0.0624
${\hat{F}}_{n_{w j}} (800)$	0.0000	0.0000	0.1584	0.4206	1.0000	0.0000	0.0000	0.0000	0.0000	0.1940
${\hat{F}}_{n_{w j}} (1000)$	0.0000	0.0000	0.5286	0.5501	1.0000	0.0000	0.0000	0.0000	0.1428	0.5359
${\hat{F}}_{n_{w j}} (2000)$	0.2518	0.0000	1.0000	1.0000	1.0000	0.0000	1.0000	0.2128	1.0000	1.0000
${\hat{F}}_{n_{w j}} (3000)$	1.0000	0.0000	1.0000	1.0000	1.0000	0.0000	1.0000	0.7278	1.0000	1.0000
${\hat{F}}_{n_{w j}}$ (23,800)	1.0000	0.9339	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000
${\hat{F}}_{n_{w j}}$ (23,896)	1.0000	0.9339	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000
${\hat{F}}_{n_{w j}}$ (23,897)	1	1	1	1	1	1	1	1	1	1
Credibility distribution estimation with claim amount $X_{i j} \leq x,$ $(x$ = 320, 800, 1000, 2000, 3000, 23,800, 23,896, 23,897)
Country	AT	DE	FI	GR	HR	IT	NO	PL	PT	SE
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (320 \mid Θ_{j})$	0.01256	0.01256	0.01256	0.01256	0.01256	0.01256	0.01256	0.01256	0.01256	0.01256
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (800 \mid Θ_{j})$	0.02790	0.01348	0.04299	0.06155	0.06539	0.01980	0.03015	0.02490	0.02983	0.05827
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (1000 \mid Θ_{j})$	0.02754	0.00583	0.23820	0.24210	0.25270	0.01136	0.03722	0.01939	0.10500	0.33390
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (2000 \mid Θ_{j})$	0.25080	0.00105	0.93630	0.93420	0.86420	0.00223	0.95570	0.21300	0.95950	0.97040
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (3000 \mid Θ_{j})$	0.95520	0.00313	0.89460	0.89140	0.79400	0.00660	0.92490	0.71150	0.93090	0.94880
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (23{,}800 \mid Θ_{j})$	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (23{,}896 \mid Θ_{j})$	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707	0.9707
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (23{,}897 \mid Θ_{j})$	1	1	1	1	1	1	1	1	1	1
Credibility factor $X_{i j} \leq x,$ $(x$ = 320, 800, 1000, 2000, 3000, 23,800, 23,896, 23,897)
Parameter	${\hat{Z}}_{1}^{F_{x}}$	${\hat{Z}}_{2}^{F_{x}}$	${\hat{Z}}_{3}^{F_{x}}$	${\hat{Z}}_{4}^{F_{x}}$	${\hat{Z}}_{5}^{F_{x}}$	${\hat{Z}}_{6}^{F_{x}}$	${\hat{Z}}_{7}^{F_{x}}$	${\hat{Z}}_{8}^{F_{x}}$	${\hat{Z}}_{9}^{F_{x}}$	${\hat{Z}}_{10}^{F_{x}}$
x = 320	0	0	0	0	0	0	0	0	0	0
x = 800	0.1727	0.6002	0.0744	0.0719	0.0328	0.4128	0.1060	0.2618	0.1153	0.1532
x = 1000	0.6020	0.9158	0.3679	0.3596	0.1971	0.8359	0.4622	0.7198	0.4857	0.5672
x = 2000	0.9669	0.9953	0.9182	0.9155	0.8256	0.9899	0.9431	0.9802	0.9479	0.9619
x = 3000	0.9340	0.9903	0.8448	0.8400	0.6965	0.9794	0.8893	0.9600	0.8983	0.9246
x = 23,800	0	0	0	0	0	0	0	0	0	0
x = 23,896	0	0	0	0	0	0	0	0	0	0
x = 23,897	0	0	0	0	0	0	0	0	0	0
Parameter estimation $X_{i j} \leq x,$ (x = 320, 800, 1000, 2000, 3000, 23,800, 23,896, 23,897)
x = 320	${\hat{F}}_{n_{w w}} (x) = 0.01256$ ${\hat{a}}_{F_{x}} = 0.00000$ ${\hat{s}}_{F_{x}}^{2}$ = 456,695
x = 800	${\hat{F}}_{n_{w w}} (x) = 0.03372$ ${\hat{a}}_{F_{x}} = 0.00412$ ${\hat{s}}_{F_{x}}^{2}$ = 379,951
x = 1000	${\hat{F}}_{n_{w w}} (x) = 0.06920$ ${\hat{a}}_{F_{x}} = 0.02147$ ${\hat{s}}_{F_{x}}^{2}$ = 273,007
x = 2000	${\hat{F}}_{n_{w w}} (x) = 0.22117$ ${\hat{a}}_{F_{x}} = 0.09854$ ${\hat{s}}_{F_{x}}^{2}$ = 64,969
x = 3000	${\hat{F}}_{n_{w w}} (x) = 0.32113$ ${\hat{a}}_{F_{x}} = 0.14983$ ${\hat{s}}_{F_{x}}^{2}$ = 203,743
x = 23,800	${\hat{F}}_{n_{w w}} (x) = 0.97070$ ${\hat{a}}_{F_{x}} = 0.00000$ ${\hat{s}}_{F_{x}}^{2}$ = 1,610,559
x = 23,896	${\hat{F}}_{n_{w w}} (x) = 0.97070$ ${\hat{a}}_{F_{x}} = 0.00000$ ${\hat{s}}_{F_{x}}^{2}$ = 1,610,559
x = 23,897	${\hat{F}}_{n_{w w}} (x) = 1.00000$ ${\hat{a}}_{F_{x}} = 0.00000$ ${\hat{s}}_{F_{x}}^{2}$ = 1,736,923

Table 3. Credibility distribution estimation for motor claims.

Motor Claim Amounts and Number of Motor Claims from 10 Selected European Countries
during the Period 2004–2018, $Z_{F}$ Free of $x$
Individual empirical distribution with claim amount $X_{i j} \leq x,$ $(x$ = 320, 800, 1000, 2000, 3000, 23,800, 23,896, 23,897)
Country	AT	DE	FI	GR	HR	IT	NO	PL	PT	SE
${\hat{F}}_{n_{w j}} (320)$	0.0000	0.0000	0.0000	0.0000	0.9235	0.0000	0.0000	0.0000	0.0000	0.0624
${\hat{F}}_{n_{w j}} (800)$	0.0000	0.0000	0.1584	0.4206	1.0000	0.0000	0.0000	0.0000	0.0000	0.1940
${\hat{F}}_{n_{w j}} (1000)$	0.0000	0.0000	0.5286	0.5501	1.0000	0.0000	0.0000	0.0000	0.1428	0.5359
${\hat{F}}_{n_{w j}} (2000)$	0.2518	0.0000	1.0000	1.0000	1.0000	0.0000	1.0000	0.2128	1.0000	1.0000
${\hat{F}}_{n_{w j}} (3000)$	1.0000	0.0000	1.0000	1.0000	1.0000	0.0000	1.0000	0.7278	1.0000	1.0000
${\hat{F}}_{n_{w j}}$ (23,800)	1.0000	0.9339	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000
${\hat{F}}_{n_{w j}}$ (23,896)	1.0000	0.9339	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000	1.0000
${\hat{F}}_{n_{w j}}$ (23,897)	1	1	1	1	1	1	1	1	1	1
Credibility distribution estimation with claim amount $X_{i j} \leq x,$ $(x$ = 320, 800, 1000, 2000, 3000, 23,800, 23,896, 23,897)
Country	AT	DE	FI	GR	HR	IT	NO	PL	PT	SE
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (320 \mid Θ_{j})$	0.00706	0.00190	0.00966	0.00974	0.11476	0.00347	0.00871	0.00541	0.00845	0.03262
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (800 \mid Θ_{j})$	0.01896	0.00511	0.06246	0.12047	0.14212	0.00931	0.02338	0.01452	0.02269	0.09829
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (1000 \mid Θ_{j})$	0.03891	0.01049	0.17511	0.17703	0.17362	0.01911	0.04798	0.02979	0.09327	0.25723
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (2000 \mid Θ_{j})$	0.23458	0.03352	0.40072	0.39581	0.30854	0.06106	0.46002	0.21640	0.47592	0.53495
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (3000 \mid Θ_{j})$	0.61831	0.04866	0.47764	0.47336	0.39729	0.08866	0.52932	0.55269	0.54317	0.59463
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (23{,}800 \mid Θ_{j})$	0.98352	0.93947	0.97745	0.97727	0.97398	0.99191	0.97968	0.98738	0.98028	0.98250
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (23{,}896 \mid Θ_{j})$	0.98352	0.93947	0.97745	0.97727	0.97398	0.99191	0.97968	0.98738	0.98028	0.98250
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (23{,}897 \mid Θ_{j})$	1	1	1	1	1	1	1	1	1	1
${\hat{F}}_{n_{w w}} (320) = 0.01256$ , ${\hat{F}}_{n_{w w}} (800) = 0.03372,$ ${\hat{F}}_{n_{w w}} (1000) = 0.06920,$ ${\hat{F}}_{n_{w w}} (2000) = 0.22117$ ,
${\hat{F}}_{n_{w w}} (3000) = 0.32113$ , ${\hat{F}}_{n_{w w}}$ (23,800) = 0.97070, ${\hat{F}}_{n_{w w}}$ (23,896) = 0.97070, ${\hat{F}}_{n_{w w}}$ (23,897) = 1.000
Parameter estimation free of x
${\hat{a}}_{F} = 0.008457228$ ${\hat{s}}_{F}^{2} = 208898.6$
Credibility factor	${\hat{Z}}_{1}^{F}$	${\hat{Z}}_{2}^{F}$	${\hat{Z}}_{3}^{F}$	${\hat{Z}}_{4}^{F}$	${\hat{Z}}_{5}^{F}$	${\hat{Z}}_{6}^{F}$	${\hat{Z}}_{7}^{F}$	${\hat{Z}}_{8}^{F}$	${\hat{Z}}_{9}^{F}$	${\hat{Z}}_{10}^{F}$
$Z_{j}^{F}$ free of x	0.43775	0.84846	0.23054	0.22423	0.11218	0.72391	0.30667	0.56942	0.32708	0.40288
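
The entries of Table 3 can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the standard Bühlmann–Straub form $Z_j = \hat{a} w_{\bullet j} / (\hat{a} w_{\bullet j} + \hat{s}^2)$ and approximating the total weight of Austria (AT) as 15 times its mean weight from Table 1. The function and variable names are illustrative and do not come from the article.

```python
def credibility_factor(a_hat, s2_hat, total_weight):
    """Buhlmann-Straub credibility factor Z_j = a*w_j / (a*w_j + s^2)."""
    return a_hat * total_weight / (a_hat * total_weight + s2_hat)


def credibility_estimate(z, individual, collective):
    """Credibility-weighted average of individual and collective estimates."""
    return z * individual + (1.0 - z) * collective


# Structural parameter estimates reported in Table 3 (free of x).
A_HAT = 0.008457228
S2_HAT = 208898.6

# Assumption: total weight of AT = number of years * mean yearly weight (Table 1).
N_YEARS = 15
MEAN_WEIGHT_AT = 1_282_093
total_weight_at = N_YEARS * MEAN_WEIGHT_AT

z_at = credibility_factor(A_HAT, S2_HAT, total_weight_at)

# Individual and collective empirical distribution values at x = 2000 (Table 3).
f_individual_at = 0.2518
f_collective = 0.22117
f_cred_at = credibility_estimate(z_at, f_individual_at, f_collective)

print(f"Z_AT = {z_at:.5f}, F_cred(2000 | AT) = {f_cred_at:.5f}")
```

Under these assumptions the sketch gives values close to the tabulated 0.43775 and 0.23458 for AT; small differences can arise because the mean weight in Table 1 is rounded.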

Table 4. Credibility coefficients.

Credibility Coefficients
x	320	800	1000	2000	23,800	23,900
WRV	23,262.8543	5787.8587	3718.7216	1152.5669	254.4523	1317.9237
BRV	8.448499	4.140172	3.170754	1.976505	0	0
CC	7,581,705	1,954,336	1,375,506	340,045	∞	∞
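
The CC row of Table 4 is numerically consistent with the squared ratio of WRV to BRV. The sketch below checks this; the relation is inferred from the tabulated numbers only (the defining formula belongs to the body of the article and is not restated in the table), and the function name is hypothetical.

```python
import math


def credibility_coefficient(wrv, brv):
    """Squared ratio (WRV / BRV)^2; infinite when BRV is zero.

    Assumption: this form is inferred from the tabulated values,
    not quoted from the article.
    """
    if brv == 0:
        return math.inf
    return (wrv / brv) ** 2


# (WRV, BRV) pairs copied from Table 4, keyed by x.
table4 = {
    320: (23262.8543, 8.448499),
    800: (5787.8587, 4.140172),
    1000: (3718.7216, 3.170754),
    2000: (1152.5669, 1.976505),
    23800: (254.4523, 0.0),
}

cc = {x: credibility_coefficient(wrv, brv) for x, (wrv, brv) in table4.items()}

for x, value in cc.items():
    print(x, value)
```

A BRV of zero corresponds to the ∞ entries in the table.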

Table 5. Summary statistics for 10 industry portfolios.

Monthly Returns and Number of Firms in Portfolios from July 1926–July 2022
$X_{i j}$ : Value returns for $i = 1, \dots, 1155$ , $j = 1, \dots, 10$
Portfolio	NoDur	Durbl	Manuf	Enrgy	HiTec	Telcm	Shops	Hlth	Utils	Other
Min.	−24.6900	−34.800	−29.820	−34.490	−33.870	−21.5600	−30.240	−34.080	−33.0500	−30.0200
1st Qu.	−1.3900	−2.855	−2.000	−2.470	−2.600	−1.5200	−2.030	−1.920	−1.6850	−2.0750
Median	1.0800	0.980	1.350	0.890	1.320	0.9000	1.090	1.100	1.0500	1.2800
Mean	0.9524	1.158	1.002	1.027	1.116	0.8198	1.014	1.072	0.8738	0.9013
3rd Qu.	3.6450	4.845	4.235	4.590	5.030	3.2400	4.090	4.060	3.6200	4.1850
Max.	34.1700	79.790	57.200	38.990	53.490	28.1700	42.450	37.130	43.4600	58.6700
$w_{i j}$ : Weight–number of firms in portfolios for $i = 1, \dots, 1155$ , $j = 1, \dots, 10$
Min.	87.0	37.0	125.0	45.0	18.0	4.00	41.0	4.0	21.0	110.0
1st Qu.	136.0	56.0	313.0	55.0	44.0	8.00	84.0	18.0	72.0	156.0
Median	173.0	92.0	449.0	116.0	358.0	41.00	276.0	122.0	102.0	1002.0
Mean	230.2	101.2	507.7	131.1	428.1	53.76	298.4	237.2	106.9	887.5
3rd Qu.	334.0	148.0	772.5	173.0	797.5	99.00	472.5	509.0	179.5	1619.0
Max.	547.0	213.0	967.0	404.0	1465.0	189.00	823.0	868.0	204.0	2249.0

Note: $X_{i j}$ denotes the values of monthly returns per year and $w_{i j}$ represents the number of firms in portfolios that correspond to each $X_{i j}$.

Table 6. Credibility distribution estimation for industry portfolios.

Monthly returns for 10 industry portfolios from July 1926–July 2022
Individual empirical distribution with returns $X_{i j} \leq x,$ $(x = -15, -10, -5, 0, 10, 15, 34.17, 59, 60, > 79.79)$
Portfolios	NoDur	Durbl	Manuf	Enrgy	HiTec	Telcm	Shops	Hlth	Utils	Other
${\hat{F}}_{n_{w j}} (- 15)$	0.00309	0.01670	0.00745	0.01000	0.01840	0.00498	0.00630	0.00213	0.00177	0.00952
${\hat{F}}_{n_{w j}} (- 10)$	0.02113	0.04486	0.02680	0.04053	0.05738	0.02831	0.02171	0.01005	0.01472	0.03030
${\hat{F}}_{n_{w j}} (- 5)$	0.07267	0.13872	0.10760	0.12174	0.16253	0.10611	0.10199	0.10553	0.07147	0.11525
${\hat{F}}_{n_{w j}} (0)$	0.38587	0.42946	0.40825	0.41781	0.43866	0.39040	0.39827	0.37940	0.39013	0.40579
${\hat{F}}_{n_{w j}} (10)$	0.97618	0.93303	0.96468	0.95040	0.90779	0.97479	0.95921	0.97628	0.98455	0.96935
${\hat{F}}_{n_{w j}} (15)$	0.99500	0.97700	0.99300	0.98200	0.96200	0.99800	0.99500	0.99700	0.99600	0.99200
${\hat{F}}_{n_{w j}} (34.17)$	0.99959	0.99676	0.99896	0.99929	0.99985	1.00000	0.99980	0.99996	0.99982	0.99974
${\hat{F}}_{n_{w j}} (59)$	1.000000	0.9996	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000
${\hat{F}}_{n_{w j}} (60)$	1.000000	0.9996	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000
${\hat{F}}_{n_{w j}} (x > 79.79)$	1.000000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000
Credibility distribution estimation with returns $X_{i j} \leq x,$ $(x = -15, -10, -5, 0, 10, 15, 34.17, 59, 60, > 79.79)$
Portfolios	NoDur	Durbl	Manuf	Enrgy	HiTec	Telcm	Shops	Hlth	Utils	Other
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (-15 \mid Θ_{j})$	0.00496	0.01270	0.00771	0.00952	0.01650	0.00762	0.00700	0.00426	0.00537	0.00946
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (-10 \mid Θ_{j})$	0.02260	0.04073	0.02711	0.03813	0.05501	0.02939	0.02283	0.01323	0.01926	0.03032
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (-5 \mid Θ_{j})$	0.07919	0.13160	0.10810	0.12000	0.15820	0.10980	0.10360	0.10480	0.08371	0.11520
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (0 \mid Θ_{j})$	0.40100	0.40800	0.40500	0.40700	0.41400	0.40400	0.40300	0.40000	0.40400	0.40600
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (10 \mid Θ_{j})$	0.97500	0.93600	0.96500	0.94900	0.91000	0.97100	0.95900	0.97500	0.98100	0.96900
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (15 \mid Θ_{j})$	0.99440	0.97890	0.99280	0.98280	0.96320	0.99530	0.99460	0.99630	0.99470	0.99190
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (34.17 \mid Θ_{j})$	0.99952	0.99951	0.99951	0.99952	0.99953	0.99952	0.99952	0.99952	0.99952	0.99953
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (59 \mid Θ_{j})$	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (60 \mid Θ_{j})$	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999	0.99999
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (x > 79.79 \mid Θ_{j})$	1.000000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000	1.00000
Credibility factor with returns $X_{i j} \leq x,$ $(x = -15, -10, -5, 0, 10, 15, 34.17, 59, 60, > 79.79)$
Parameter	${\hat{Z}}_{1}^{F_{x}}$	${\hat{Z}}_{2}^{F_{x}}$	${\hat{Z}}_{3}^{F_{x}}$	${\hat{Z}}_{4}^{F_{x}}$	${\hat{Z}}_{5}^{F_{x}}$	${\hat{Z}}_{6}^{F_{x}}$	${\hat{Z}}_{7}^{F_{x}}$	${\hat{Z}}_{8}^{F_{x}}$	${\hat{Z}}_{9}^{F_{x}}$	${\hat{Z}}_{10}^{F_{x}}$
x = −15	0.68105	0.48412	0.82488	0.54873	0.79887	0.33276	0.73462	0.68754	0.49796	0.89169
x = −10	0.84475	0.70514	0.92310	0.75602	0.91008	0.55964	0.87584	0.84866	0.71652	0.95451
x = −5	0.84567	0.70659	0.92359	0.75731	0.91066	0.56137	0.87660	0.84955	0.71794	0.95481
x = 0	0.20416	0.10132	0.36138	0.12747	0.32303	0.05653	0.24956	0.20908	0.10648	0.49727
x = 10	0.92860	0.85111	0.96632	0.88105	0.96031	0.75234	0.94401	0.93057	0.85799	0.98045
x = 15	0.91814	0.83134	0.96115	0.86463	0.95426	0.72373	0.93565	0.92037	0.83897	0.97740
x = 34.17	0.09671	0.04494	0.19106	0.05747	0.16608	0.02440	0.12188	0.09937	0.04738	0.29220
x = 59	0	0	0	0	0	0	0	0	0	0
x = 60	0	0	0	0	0	0	0	0	0	0
$x > 79.79$	0	0	0	0	0	0	0	0	0	0
Parameter estimation $X_{i j} \leq x,$ $(x = -15, -10, -5, 0, 10, 15, 34.17, 59, 60, > 79.79)$
x = −15	${\hat{F}}_{n_{w w}} (x) = 0.00894$ ${\hat{a}}_{F_{x}} = 2.118$ $\times 10^{- 5}$ ${\hat{s}}_{F_{x}}^{2} = 2.637$
x = −10	${\hat{F}}_{n_{w w}} (x) = 0.03077$ ${\hat{a}}_{F_{x}} = 0.00018110$ ${\hat{s}}_{F_{x}}^{2} = 8.848$
x = −5	${\hat{F}}_{n_{w w}} (x) = 0.11460$ ${\hat{a}}_{F_{x}} = 0.00062020$ ${\hat{s}}_{F_{x}}^{2} = 30.09$
x = 0	${\hat{F}}_{n_{w w}} (x) = 0.40530$ ${\hat{a}}_{F_{x}} = 6.935$ $\times 10^{- 5}$ ${\hat{s}}_{F_{x}}^{2} = 71.87$
x = 10	${\hat{F}}_{n_{w w}} (x) = 0.95820$ ${\hat{a}}_{F_{x}} = 0.00057730$ ${\hat{s}}_{F_{x}}^{2} = 11.80$
x = 15	${\hat{F}}_{n_{w w}} (x) = 0.98810$ ${\hat{a}}_{F_{x}} = 0.00014610$ ${\hat{s}}_{F_{x}}^{2} = 3.463$
x = 34.17	${\hat{F}}_{n_{w w}} (x) = 0.99952$ ${\hat{a}}_{F_{x}} = 5.731$ $\times 10^{- 8}$ ${\hat{s}}_{F_{x}}^{2} = 0.1423$
x = 59	${\hat{F}}_{n_{w w}} (x) = 0.99999$ ${\hat{a}}_{F_{x}} = 0.00000000$ ${\hat{s}}_{F_{x}}^{2} = 0.00424$
x = 60	${\hat{F}}_{n_{w w}} (x) = 0.99999$ ${\hat{a}}_{F_{x}} = 0.00000000$ ${\hat{s}}_{F_{x}}^{2} = 0.00424$
$x > 79.79$	${\hat{F}}_{n_{w w}} (x) = 1.00000$ ${\hat{a}}_{F_{x}} = 0.00000000$ ${\hat{s}}_{F_{x}}^{2} = 0.00000$

Table 7. Credibility coefficients.

Credibility Coefficients for Industry Portfolios
x	−15	−10	−5	0	10	15	50
WRV	181.642532	96.670744	47.865927	20.916895	3.584964	1.883325	21.656408
BRV	0.51478450	0.43735262	0.21731078	0.02054692	0.02507521	0.01223275	0
CC	124504.25	48856.99	48516.61	1036337.42	20439.98	23702.94	∞

Table 8. Credibility distribution estimation for grouped data.

10 Industry Portfolios
Grouped Monthly Returns in 10 Intervals from July 1926 to July 2022
$m_{i j}$ : Number of data points in the interval for each portfolio
Interval of return	NoDur	Durbl	Manuf	Enrgy	HiTec	Telcm	Shops	Hlth	Utils	Other
$- 35 \leq X_{i j} \leq - 20$	4	12	6	4	10	1	6	4	3	8
$- 20 \leq X_{i j} \leq - 13$	2	20	19	16	21	9	13	8	13	18
$- 13 \leq X_{i j} \leq - 6$	60	105	84	92	108	61	73	70	64	94
$- 6 \leq X_{i j} \leq - 4$	58	86	67	80	77	64	78	77	68	65
$- 4 \leq X_{i j} \leq - 1$	199	188	195	210	181	205	197	204	200	195
$- 1 \leq X_{i j} \leq 2$	383	245	273	274	234	394	300	304	351	272
$2 \leq X_{i j} \leq 8$	406	357	434	364	380	362	388	408	391	415
$8 \leq X_{i j} \leq 10$	20	55	30	52	58	29	52	42	28	48
$10 \leq X_{i j} \leq 22$	20	72	41	56	81	29	43	33	33	34
$22 \leq X_{i j} \leq 80$	3	15	6	7	5	1	5	5	4	6
Total # of observations	1155	1155	1155	1155	1155	1155	1155	1155	1155	1155
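
For a value of x that falls inside an interval, the grouped empirical distribution has to be interpolated. The following is a minimal sketch, assuming linear interpolation of the cumulative counts within the interval (the usual ogive; whether this coincides exactly with the article's estimator is an assumption). The function name and the list layout are illustrative, and the counts are the NoDur column of Table 8.

```python
def grouped_ecdf(x, boundaries, counts):
    """Empirical distribution for grouped data with linear interpolation
    of the cumulative count inside the interval that contains x."""
    total = sum(counts)
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return 1.0
    cumulative = 0.0
    for lower, upper, count in zip(boundaries[:-1], boundaries[1:], counts):
        if x >= upper:
            # Whole interval lies at or below x.
            cumulative += count
        else:
            # x lies inside this interval: take the proportional share.
            cumulative += count * (x - lower) / (upper - lower)
            break
    return cumulative / total


# Interval boundaries and NoDur counts from Table 8.
BOUNDARIES = [-35, -20, -13, -6, -4, -1, 2, 8, 10, 22, 80]
NODUR_COUNTS = [4, 2, 60, 58, 199, 383, 406, 20, 20, 3]

for x in (-15, -10, -5, 0, 10, 15):
    print(x, round(grouped_ecdf(x, BOUNDARIES, NODUR_COUNTS), 5))
```

For the NoDur portfolio these values agree with the first column of Table 9 to about four decimals, which suggests that the tabulated grouped distribution is consistent with this interpolation.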

Table 9. Grouped credibility distribution estimation.

Grouped monthly returns in 10 intervals for 10 industry portfolios
from July 1926–July 2022
Individual empirical distribution with returns $X_{i j} \leq x$ , $(x = - 15, - 10, - 5, 0, 10, 15)$
Portfolios	NoDur	Durbl	Manuf	Enrgy	HiTec	Telcm	Shops	Hlth	Utils	Other
${\hat{F}}_{m j} (- 15)$	0.00470	0.02276	0.01694	0.01336	0.02165	0.00643	0.01323	0.00841	0.01064	0.01800
${\hat{F}}_{m j} (- 10)$	0.02746	0.06667	0.05281	0.05145	0.06691	0.03129	0.04354	0.03636	0.03760	0.05739
${\hat{F}}_{m j} (- 5)$	0.08225	0.15580	0.12340	0.13160	0.15370	0.08918	0.11340	0.10430	0.09870	0.13200
${\hat{F}}_{m j} (0)$	0.39020	0.42660	0.40000	0.42710	0.41130	0.40810	0.40430	0.40200	0.40260	0.40750
${\hat{F}}_{m j} (10)$	0.98009	0.92468	0.95931	0.94545	0.92554	0.97403	0.95844	0.96710	0.96797	0.96537
${\hat{F}}_{m j} (15)$	0.98730	0.95065	0.97410	0.96566	0.95476	0.98449	0.97395	0.97900	0.97987	0.97763
Credibility distribution estimation with returns $X_{i j} \leq x$ , $(x = -15, -10, -5, 0, 10, 15)$
Portfolios	NoDur	Durbl	Manuf	Enrgy	HiTec	Telcm	Shops	Hlth	Utils	Other
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (-15 \mid Θ_{j})$	0.00472	0.02270	0.01690	0.01340	0.02160	0.00645	0.01320	0.00842	0.01060	0.01810
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (-10 \mid Θ_{j})$	0.02764	0.06649	0.05276	0.05141	0.06673	0.03143	0.04357	0.03646	0.03769	0.0573
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (-5 \mid Θ_{j})$	0.08268	0.15540	0.12330	0.13140	0.15330	0.08953	0.11350	0.10450	0.09893	0.13180
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (0 \mid Θ_{j})$	0.39200	0.42470	0.40080	0.42520	0.41100	0.40810	0.40470	0.40260	0.40310	0.40705
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (10 \mid Θ_{j})$	0.97986	0.92499	0.95929	0.94556	0.92584	0.97386	0.95842	0.96700	0.96786	0.96529
${\hat{F}}_{X_{i j}}^{\mathrm{Cred}} (15 \mid Θ_{j})$	0.98717	0.95084	0.97409	0.96572	0.95492	0.98439	0.97394	0.97895	0.97981	0.97759
Parameter estimation $X_{i j} \leq x$ , $(x = - 15, - 10, - 5, 0, 10, 15)$
x = −15	${\hat{F}}_{m m} (x) = 0.013617$ ${\hat{a}}_{x} = 3.8391$ $\times 10^{- 5}$ ${\hat{s}}_{x}^{2} = 7.9089$ $\times 10^{- 7}$ ${\hat{Z}}_{j}^{x} = 0.997944$
x = −10	${\hat{F}}_{m m} (x) = 0.047149$ ${\hat{a}}_{x} = 0.000196743$ ${\hat{s}}_{x}^{2} = 1.7862$ $\times 10^{- 5}$ ${\hat{Z}}_{j}^{x} = 0.991002$
x = −5	${\hat{F}}_{m m} (x) = 0.11843$ ${\hat{a}}_{x} = 0.0006369784$ ${\hat{s}}_{x}^{2} = 7.6405$ $\times 10^{- 5}$ ${\hat{Z}}_{j}^{x} = 0.950530$
x = 0	${\hat{F}}_{m m} (x) = 0.407970$ ${\hat{a}}_{x} = 0.000118016$ ${\hat{s}}_{x}^{2} = 0.1327$ $\times 10^{- 3}$ ${\hat{Z}}_{j}^{x} = 0.9881473$
x = 10	${\hat{F}}_{m m} (x) = 0.956798$ ${\hat{a}}_{x} = 0.000362038$ ${\hat{s}}_{x}^{2} = 3.5504$ $\times 10^{- 5}$ ${\hat{Z}}_{j}^{x} = 0.990288$
x = 15	${\hat{F}}_{m m} (x) = 0.972741$ ${\hat{a}}_{x} = 0.000146344$ ${\hat{s}}_{x}^{2} = 1.2874$ $\times 10^{- 5}$ ${\hat{Z}}_{j}^{x} = 0.9912793$

Table 10. Credibility coefficients.

Credibility Coefficients for Industry Portfolios for Grouped Data
x	−15	−10	−5	0	10	15
WRV	0.06530954	0.08963808	0.11209926	0.05720951	0.00622752	0.00368866
BRV	0.45502292	0.29749328	0.15540177	0.09426977	0.01988643	0.01243626
CC	0.02060092	0.09078849	0.52034766	0.36829137	0.09806570	0.08797441

Table 11. Classical credibility model for grouped data.

Individual Average Return for the 10 Industry Portfolios
Portfolios	NoDur	Durbl	Manuf	Enrgy	HiTec	Telcm	Shops	Hlth	Utils	Other
${\hat{μ}}_{j}$	1.15022	1.52554	1.20217	1.18052	1.34199	1.01862	1.27099	1.25368	1.12597	1.06277
Credibility estimation of returns for the 10 industry portfolios
$μ {(Θ_{j})}^{C r e d}$	1.18166	1.36974	1.20769	1.19685	1.27776	1.11571	1.24219	1.23351	1.16951	1.13784
Credibility parameter estimation
	$\hat{μ} = 1.213247$ $\hat{a} = 4.807552$ ${\hat{s}}^{2} = 5528.034$ $Z_{j} = 0.501114$
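
As a worked check of Table 11, assume that each portfolio contributes $n = 1155$ observations (the total per portfolio in Table 8; treating it as the exposure in the credibility factor is an assumption). The reported parameters then give

$$Z_{j} = \frac{\hat{a}\, n}{\hat{a}\, n + {\hat{s}}^{2}} = \frac{4.807552 \times 1155}{4.807552 \times 1155 + 5528.034} \approx 0.501114,$$

and, for the NoDur portfolio,

$$μ {(Θ_{j})}^{Cred} = Z_{j}\, {\hat{μ}}_{j} + (1 - Z_{j})\, \hat{μ} \approx 0.501114 \times 1.15022 + 0.498886 \times 1.213247 \approx 1.18166,$$

which agrees with the tabulated value.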


© 2024 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
