Article

Dependence Modelling for Heavy-Tailed Multi-Peril Insurance Losses

Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
*
Author to whom correspondence should be addressed.
Risks 2024, 12(6), 97; https://doi.org/10.3390/risks12060097
Submission received: 28 April 2024 / Revised: 10 June 2024 / Accepted: 13 June 2024 / Published: 16 June 2024
(This article belongs to the Special Issue Statistical Modelling in Risk Management)

Abstract
The Danish fire loss dataset records commercial fire losses under three insurance coverages: building, contents, and profits. Existing research has primarily focused on the heavy-tail behaviour of the losses but has ignored the relationship among the different insurance coverages. In this paper, we aim to model the aggregate loss for all three coverages. To study the pairwise dependence of claims from all types of coverage, an independent model, a hierarchical model, and some copula-based models are proposed for the frequency component. Meanwhile, we apply composite distributions to capture the heavy-tailed severity component. It is shown that consideration of dependence for the multi-peril frequencies (i) significantly enhances model goodness-of-fit and (ii) provides more accurate risk measures of the aggregate losses for all types of coverage in total.

1. Introduction

Insurance provides financial compensation to individuals or companies after a particular event occurs. The insurance business fundamentally relies on diversification effects: the risks of the policyholders are pooled together and supported by the premiums collected from them, which are primarily determined by the expected claim amount of each policyholder. In this regard, two components must be considered for effective risk management: the heavy-tail behaviour of insurance claims and the possible dependence among different insurance coverages. Heavy-tail behaviour can undermine the effectiveness of risk mitigation; while risk pooling is inherently based on total expected claim amounts, a single excessively large claim could threaten the solvency of the insurance portfolio. Such impacts can be more substantial if the portfolio provides multiple types of coverage and the claims from different coverages are positively correlated. In this paper, we focus on a reinsurance dataset, the Danish multi-peril commercial fire losses, aiming to incorporate both the dependency among different insurance coverages and the heavy-tailed losses when modelling the monthly aggregate loss for the company.
There have been many approaches to handling the heavy-tail behaviour of insurance claims. One example is the peaks-over-threshold (POT) approach; however, for a reinsurance company, it is important to model the financial loss over its entire range rather than focusing only on the extreme cases. Alternatively, a finite mixture model, which is a linear combination of multiple distributions, can be used to handle heavy-tail behaviour; see, for example, (Hong and Martin 2018; Miljkovic and Grün 2016). A unique advantage of such models is their ability to construct a multimodal distribution. Different from finite mixture models, a composite model splices and combines random variables (usually continuous random variables) subject to continuity and differentiability conditions at the splicing points. It allows different distributions with desirable distributional properties to be fitted on different ranges of the data, which is especially useful for accommodating the heavy-tailed nature of the data. For example, (Cooray and Cheng 2015; Pigeon and Denuit 2011) focus on composite lognormal–Pareto models, and (Scollnik and Sun 2012) applied composite Weibull–Pareto models. Considering model complexity and the ability to capture realistic loss behaviour, we utilize composite models for the loss severities (claim amounts) in this paper.
In addition to heavy tails, the dependency among risks is another behaviour that cannot be ignored. Two different methods have been applied in this study: the hierarchical and the copula-based modelling frameworks. Using multilevel modelling techniques, the hierarchical framework links different events through the shared belief that risks from a common environment are not independently distributed. For instance, (Fung et al. 2023) proposed a hierarchical modelling approach that models the number of certain climate events and, subsequently, the associated claim counts. Recently, (Jeong 2024) considered a multivariate Tweedie distribution where the correlated random effects are modelled only through their moments. An alternative way of studying dependency is the copula method, which allows one to flexibly connect random variables through a dependence structure. (Lee and Shi 2019) suggested a copula-based collective risk model for describing various dependencies in longitudinal insurance claims data, (Oh et al. 2021) provided a copula-based collective risk model for microlevel multi-year claims data, and (Jeong et al. 2023) considered a factor copula model to capture dependence among claim counts from multiple lines of business.
There are existing studies in the literature that analyze the heavy-tailed behaviour of the Danish fire losses. For example, (McNeil 1997) applied the generalized Pareto distribution to the total losses over all coverages and tested its goodness-of-fit, and (Resnick 1997) suggested alternative methodologies to study the tail behaviour. However, there is a lack of studies focusing on the dependency among losses under different coverages, which is needed, from a practical perspective, to appropriately price and set reserves for multi-peril insurance products. We conduct a comprehensive study addressing both issues for effective risk management of the aggregate Danish fire loss on a monthly basis. More specifically, under the framework of collective risk models, we first model the claim frequencies from the various types of insurance coverage and then study the aggregate loss by summing the losses from all insurance coverages, using composite models for the loss amounts.
In this study, we propose three different types of frequency models: a fully independent model as a benchmark, a hierarchical model, and some copula-based models. The proposed hierarchical and copula-based models incorporate the dependency among the different coverages. Meanwhile, several two-component composite models are implemented for the severity component. After conducting statistical and risk analyses, and comparing against the fully independent benchmark model, we conclude that the models with a dependency structure significantly improve the goodness-of-fit and provide more accurate risk measures of the aggregate losses for all types of coverage in total. However, our proposed models have limitations. They do not consider the dependency between the frequency and the severity; in the literature, for example, (Vernic et al. 2021) proposed a Sarmanov distribution for modelling dependence between the frequency and the average severity of insurance claims. Additionally, models other than copulas could be used to capture the dependence between the claim counts of different types.
The remainder of this article is organized as follows. Section 2 introduces the dataset that motivates our research. Section 3 provides a statistical framework to model the dependent claim frequency from multiple types of insurance coverage. Section 4 provides a framework for the severity component to capture the heavy-tail behaviour of insurance claims. Section 5 provides the estimation results from different models, associated with their implications in risk management. Section 6 concludes this article.

2. Data Exploration

We start with an introduction of the Danish multi-peril fire loss dataset, which is available in the R package CASdatasets. It was recorded by the Copenhagen Reinsurance Company in Denmark and contains 2167 commercial fire loss records from 1980 to 1990. Each recorded claim includes the loss amounts of three sections: building, contents, and profits, which are adjusted for inflation using 1985 as the base year. The building, contents, and profits columns show the losses in millions of Danish Krone, and the total column is the sum of the three. Table 1 shows the first 5 rows of the dataset.
We recall that we are interested in analyzing the Danish reinsurance aggregate loss on a monthly basis. The collective risk modelling framework is utilized to model the total loss for each single coverage; the aggregate loss of interest is then the sum of the losses from the three coverages. In this regard, we aggregate the claim numbers over the 132 months to obtain the observations, using the following notation:
  • M_t: number of reported accidents during month t = 1, \ldots, 132;
  • N_{jt}: number of claims from the jth line of insurance during month t, where j = 1, 2, 3 represents the building, contents, and profits coverages, respectively;
  • Y_{jtk}: kth individual loss amount from the jth line of insurance during month t, for k = 1, \ldots, N_{jt};
  • S_{jt}: aggregate loss amount from the jth line of insurance during month t, which is defined as
    S_{jt} := \sum_{k=1}^{N_{jt}} Y_{jtk}, \quad N_{jt} > 0,
    and 0 otherwise. We use a compound risk model (CRM) to describe S_{jt};
  • S_t: aggregate loss for all lines of insurance during month t, which is defined as
    S_t := S_{1t} + S_{2t} + S_{3t}.
For the jth line of insurance, j = 1 represents the damage to the building, j = 2 the related contents, and j = 3 the profit line. Since we aggregate the data monthly, we examine whether the aggregate losses S_{1t}, S_{2t}, and S_{3t} can be assumed to be time-independent. Figure 1 shows the boxplots of the claim numbers in the three insurance lines for different months.
The plots show no significant seasonal effect. We also checked, in a separate exploratory analysis not included in this article, that claims in the current month do not affect claims in the following month. Therefore, it is innocuous to treat S_{1t}, S_{2t}, S_{3t}, and S_t as i.i.d. samples of the aggregate claim random variables S_1, S_2, S_3, and S, respectively. However, we can detect some dependence among the claim numbers from the three lines of business.
We calculate the Pearson correlation coefficient for each pair of lines of claim numbers. The claim numbers of buildings are highly related to those of contents, with a Pearson coefficient of 0.8766. For the contents and profits coverages, the coefficient is 0.7455, while the relationship between buildings and profits is not as strong, with a coefficient of 0.5744. Overall, this exploratory analysis shows the necessity of modelling the dependence among the claim counts from multiple coverages.
It is known that losses in property insurance are mostly heavy-tailed, and the given dataset is no exception. In this regard, (McNeil 1997; Resnick 1997) performed extreme value analyses on this dataset, applying the peaks-over-threshold approach and estimating the parameters of generalized Pareto distributions. In each business line, we observe several data points with relatively large losses. In Table 2, for all business lines, the averages of the observed losses are higher than the corresponding third quartiles.

3. Dependence Modelling for Multivariate Claim Frequencies

To model possible dependence among the claim frequencies from the three types of coverage, we consider three types of models: a fully independent model (Section 3.1), a binomial thinning model (Section 3.2), and copula-based models (Section 3.3). The independent model is used as a benchmark model for comparison. Other models consider the dependence among the claim numbers in different lines. The binomial thinning model utilizes a hierarchical framework to model such a dependence. However, it only allows fixed dependency structures between any two margins. Copula-based models are used to construct the joint distribution flexibly.

3.1. Benchmark Model: Independent Frequency Model

The fully independent frequency model assumes independent relationships among the margins of the frequencies (and, subsequently, severities) from multiple types of coverage.
We recall that the claim number data are over-dispersed. To capture this behaviour, we model the numbers of building, contents, and profit claims, as well as the number of reported accidents, using negative binomial random variables N_1, N_2, N_3, and M, respectively, instead of the Poisson distribution, which implicitly assumes equi-dispersion. The negative binomial also performs better than the Poisson on our dataset in terms of in-sample goodness-of-fit measures such as AIC, BIC, and the log-likelihood. For all j = 1, 2, 3, we assume N_j \sim \mathrm{NB}(\lambda_j, r_j) and M \sim \mathrm{NB}(\lambda, r) with the following parameterization:
f_{N_j}(n_j) = \frac{\Gamma(r_j + n_j)}{\Gamma(r_j)\,\Gamma(n_j + 1)} \left( \frac{\lambda_j}{r_j + \lambda_j} \right)^{n_j} \left( \frac{r_j}{r_j + \lambda_j} \right)^{r_j},
f_M(m) = \frac{\Gamma(r + m)}{\Gamma(r)\,\Gamma(m + 1)} \left( \frac{\lambda}{r + \lambda} \right)^{m} \left( \frac{r}{r + \lambda} \right)^{r},
where r and r_j are the size parameters of the negative binomial distributions. Instead of using the probability as the second parameter, we use \lambda_j and \lambda, which stand for the means of the random variables N_j and M. The likelihood function of the negative binomial parameters is given by:
L(\theta \mid D) = \prod_{t=1}^{132} f_{N_1, N_2, N_3, M}(n_{1t}, n_{2t}, n_{3t}, m_t; \theta)
  \overset{\text{ind.}}{=} \prod_{t=1}^{132} f_M(m_t; r, \lambda) \cdot f_{N_1}(n_{1t}; r_1, \lambda_1) \cdot f_{N_2}(n_{2t}; r_2, \lambda_2) \cdot f_{N_3}(n_{3t}; r_3, \lambda_3)
  = \prod_{t=1}^{132} \frac{\Gamma(r + m_t)}{\Gamma(m_t + 1)\,\Gamma(r)} \frac{\lambda^{m_t}\, r^{r}}{(r + \lambda)^{r + m_t}} \cdot \prod_{j=1}^{3} \frac{\Gamma(r_j + n_{jt})}{\Gamma(n_{jt} + 1)\,\Gamma(r_j)} \frac{\lambda_j^{n_{jt}}\, r_j^{r_j}}{(r_j + \lambda_j)^{r_j + n_{jt}}},
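As a sketch of how this independent-model likelihood can be maximized numerically: the paper uses R's optim, and the following is an analogous Python version with scipy, fitted to synthetic counts rather than the Danish data, so all parameter values here are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom

# NB(lambda, r) with mean lambda and size r corresponds to
# scipy's nbinom(n=r, p=r/(r+lambda)).
def nb_logpmf(k, lam, r):
    return nbinom.logpmf(k, r, r / (r + lam))

def neg_loglik(params, counts):
    """Independent-model negative log-likelihood; columns of counts are M, N1, N2, N3."""
    lams, rs = params[:4], params[4:]
    return -sum(nb_logpmf(counts[:, j], lams[j], rs[j]).sum() for j in range(4))

# Illustrative synthetic monthly counts over 132 months (not the Danish data):
rng = np.random.default_rng(42)
true_lams, true_r = [16.0, 9.0, 6.0, 2.0], 5.0
counts = np.column_stack([nbinom.rvs(true_r, true_r / (true_r + l), size=132,
                                     random_state=rng) for l in true_lams])

fit = minimize(neg_loglik, x0=np.array(true_lams + [true_r] * 4),
               args=(counts,), method="L-BFGS-B", bounds=[(1e-6, None)] * 8)
# In this mean/size parameterization, the MLE of each mean is the sample mean.
print(fit.x[:4], counts.mean(axis=0))
```

A useful by-product of this parameterization is that the fitted means can be checked directly against the sample means of the count columns.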
where θ is a vector of all the parameters for the negative binomial distributions for the reported accident numbers and the claim numbers from each coverage. We let D denote the available data.

3.2. Binomial Thinning Model

To investigate the possible dependence of the frequencies in this dataset, we note that, by definition, the claim numbers (N_j, j = 1, 2, 3) cannot be larger than the number of reported accidents (M). In this case, the binomial distribution is a natural fit for N_j given M, with the N_j's conditionally independent given M.
More specifically, we use the negative binomial distribution to model M due to the observed over-dispersion, namely M \sim \mathrm{NB}(\lambda, r). We also set N_j \mid M = m \sim \mathrm{Binomial}(m, \lambda_j/\lambda) for the three sources of claim numbers, so that the joint distribution of (M, N_1, N_2, N_3) can be expressed as:
f_{N_1, N_2, N_3, M}(n_1, n_2, n_3, m) = f_{N_1 \mid M}(n_1 \mid m) \cdot f_{N_2 \mid M}(n_2 \mid m) \cdot f_{N_3 \mid M}(n_3 \mid m) \cdot f_M(m)
  = \prod_{j=1}^{3} \binom{m}{n_j} \frac{\lambda_j^{n_j} (\lambda - \lambda_j)^{m - n_j}}{\lambda^{m}} \cdot \frac{\Gamma(r + m)}{\Gamma(m + 1)\,\Gamma(r)} \left( \frac{\lambda}{r + \lambda} \right)^{m} \left( \frac{r}{r + \lambda} \right)^{r}.
We note that the marginal distribution of N_j is negative binomial with size parameter r and mean \lambda_j, as shown below:
f_{N_j}(n_j) = \sum_{m=n_j}^{\infty} f_{N_j \mid M}(n_j \mid m)\, f_M(m)
  = \sum_{m=n_j}^{\infty} \frac{\Gamma(r+m)}{\Gamma(m+1)\,\Gamma(r)} \left( \frac{\lambda}{r+\lambda} \right)^{m} \left( \frac{r}{r+\lambda} \right)^{r} \binom{m}{n_j} \frac{\lambda_j^{n_j} (\lambda-\lambda_j)^{m-n_j}}{\lambda^{m}}
  = \frac{\Gamma(r+n_j)}{\Gamma(n_j+1)\,\Gamma(r)} \sum_{m=n_j}^{\infty} \binom{r+m-1}{m-n_j} \left( \frac{\lambda-\lambda_j}{\lambda} \right)^{m-n_j} \left( \frac{\lambda_j}{\lambda} \right)^{n_j} \left( \frac{\lambda}{\lambda+r} \right)^{m} \left( \frac{r}{\lambda+r} \right)^{r}
  = \frac{\Gamma(r+n_j)}{\Gamma(n_j+1)\,\Gamma(r)} \left( \frac{\lambda_j}{\lambda_j+r} \right)^{n_j} \left( \frac{r}{\lambda_j+r} \right)^{r} \sum_{m=n_j}^{\infty} \binom{r+m-1}{m-n_j} \frac{(\lambda_j+r)^{n_j+r} (\lambda-\lambda_j)^{m-n_j}}{(\lambda+r)^{m+r}}
  = \frac{\Gamma(r+n_j)}{\Gamma(n_j+1)\,\Gamma(r)} \left( \frac{\lambda_j}{\lambda_j+r} \right)^{n_j} \left( \frac{r}{\lambda_j+r} \right)^{r}, \quad n_j = 0, 1, \ldots
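The marginal result above can also be verified by simulation. The following sketch draws M from a negative binomial with mean λ and size r, thins it binomially, and checks that the resulting N_j has the mean λ_j and variance λ_j + λ_j²/r of an NB(λ_j, r) margin; the parameter values are illustrative, not the fitted ones.

```python
import numpy as np
from scipy.stats import binom, nbinom

rng = np.random.default_rng(0)
lam, r = 16.0, 5.0   # mean and size of the accident count M (illustrative)
lam_j = 9.0          # target mean of one line's claim count N_j

# Hierarchy: M ~ NB(mean lam, size r), then N_j | M = m ~ Binomial(m, lam_j / lam).
m = nbinom.rvs(r, r / (r + lam), size=200_000, random_state=rng)
n_j = binom.rvs(m, lam_j / lam, random_state=rng)

# Binomial thinning keeps the negative binomial family: N_j ~ NB(mean lam_j, size r),
# whose variance is lam_j + lam_j**2 / r.
print(n_j.mean())   # close to lam_j = 9.0
print(n_j.var())    # close to 9.0 + 81.0 / 5.0 = 25.2
```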
The last step follows directly from the previous one, since the summand is the probability mass function of a negative binomial distribution and therefore sums to one. Another way to show that the marginal distribution of N_j is negative binomial is to use probability generating or characteristic functions.
Compared with the independent model, the binomial thinning model accounts for the dependency among the three lines of business. However, an obvious drawback is its inflexible dependence structure, since the dependence between any two margins is fully determined by the marginal distributions.

3.3. Copula-Based Frequency Model

To overcome this drawback of the binomial thinning model, one can use copulas, originally introduced by (Sklar 1959), whereby the joint distribution H of N_1, \ldots, N_k can be written as a combination of a copula C and the corresponding marginal distributions F_1, \ldots, F_k as follows:
H(n_1, \ldots, n_k) = P(N_1 \le n_1, \ldots, N_k \le n_k) = C(F_1(n_1), \ldots, F_k(n_k)).
As we consider the frequencies from three types of insurance coverage, one can write the joint probability of the claim frequencies via a copula C as follows:
P(N_1 = n_1, N_2 = n_2, N_3 = n_3) = C(F_1(n_1), F_2(n_2), F_3(n_3)) - C(F_1(n_1 - 1), F_2(n_2), F_3(n_3)) - C(F_1(n_1), F_2(n_2 - 1), F_3(n_3)) - C(F_1(n_1), F_2(n_2), F_3(n_3 - 1)) + C(F_1(n_1 - 1), F_2(n_2 - 1), F_3(n_3)) + C(F_1(n_1 - 1), F_2(n_2), F_3(n_3 - 1)) + C(F_1(n_1), F_2(n_2 - 1), F_3(n_3 - 1)) - C(F_1(n_1 - 1), F_2(n_2 - 1), F_3(n_3 - 1)).
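The rectangle inclusion-exclusion above can be implemented directly. The following sketch evaluates the joint pmf under an exchangeable Gaussian copula with negative binomial margins (illustrative parameters, not the fitted Danish values) and checks that the probabilities over a grid covering nearly all of the support sum to roughly one.

```python
import numpy as np
from itertools import product
from scipy.stats import multivariate_normal, nbinom, norm

sigma = 0.5  # exchangeable correlation parameter (illustrative)
corr = sigma * np.ones((3, 3)) + (1 - sigma) * np.eye(3)
mvn = multivariate_normal(mean=np.zeros(3), cov=corr)

# Illustrative NB(mean lam_j, size r_j) margins (not the fitted Danish values):
margins = [(3.0, 5.0), (2.0, 4.0), (1.0, 3.0)]

def F(j, x):
    lam, r = margins[j]
    return nbinom.cdf(x, r, r / (r + lam))  # equals 0 for x < 0

def joint_pmf(n1, n2, n3):
    """P(N1=n1, N2=n2, N3=n3) via the 2^3-term rectangle inclusion-exclusion."""
    n = (n1, n2, n3)
    signs, corners = [], []
    for shift in product((0, 1), repeat=3):  # shift 1 means evaluating F_j(n_j - 1)
        signs.append((-1) ** sum(shift))
        corners.append([F(j, n[j] - shift[j]) for j in range(3)])
    z = norm.ppf(np.clip(corners, 1e-12, 1 - 1e-12))
    return float(np.dot(signs, mvn.cdf(z)))  # Gaussian copula C at each corner

# The pmf over a grid covering nearly all of the support sums to ~1:
p = sum(joint_pmf(a, b, c) for a in range(16) for b in range(14) for c in range(11))
print(round(p, 3))
```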
To maintain consistency in our analysis, we again assume the same marginal distributions for the N_j, that is, N_j \sim \mathrm{NB}(\lambda_j, r_j) for j = 1, 2, 3. Regarding the copula families, we use the following three-dimensional copulas:
  • Gaussian:
C(u_1, u_2, u_3) = \Phi_{3 \mid \Sigma}\left( \Phi^{-1}(u_1), \Phi^{-1}(u_2), \Phi^{-1}(u_3) \right),
where \Phi_{3 \mid \Sigma} is the joint distribution function of the trivariate normal distribution with mean 0 and (exchangeable) covariance matrix \Sigma = \sigma J_3 J_3^\top + (1 - \sigma) I_3, where J_3 = (1, 1, 1)^\top, I_3 is the identity matrix of size 3, \sigma \in (-1/2, 1) so that \Sigma is positive definite, and \Phi^{-1} is the quantile function of a standard normal random variable. All pairwise correlations in the correlation matrix are implicitly assumed to be the same, which means that the Gaussian copula has an exchangeable structure.
  • Gumbel:
C(u_1, u_2, u_3) = \exp\left( -\left[ (-\log u_1)^{\sigma_G} + (-\log u_2)^{\sigma_G} + (-\log u_3)^{\sigma_G} \right]^{1/\sigma_G} \right),
    where \sigma_G \ge 1 is the parameter of the Gumbel copula. A larger \sigma_G value indicates that any pairwise marginals are more positively related.
  • Joe:
C(u_1, u_2, u_3) = 1 - \left\{ 1 - \left[ 1 - (1 - u_1)^{\sigma_J} \right] \left[ 1 - (1 - u_2)^{\sigma_J} \right] \left[ 1 - (1 - u_3)^{\sigma_J} \right] \right\}^{1/\sigma_J},
    where \sigma_J \ge 1 is the parameter of the Joe copula. Similar to the Gumbel copula, a larger \sigma_J induces a stronger positive dependency between any pairwise marginals.

4. Composite Models for Heavy-Tailed Severities

Several positive continuous distributions can be used to study the claim amounts distribution. While some distributions such as gamma and lognormal are good candidates for modelling the low-cost range, they might not be able to capture the heavy-tail behaviour. In this regard, we consider some two-component composite models to model both the body and tail parts in a balanced way.
A two-component composite model combines the body part of a light-tailed distribution with the tail part of a heavy-tailed distribution. Different from mixture distributions, there is no overlap between the supports of the components. By denoting the light-tailed and heavy-tailed density/distribution function pairs as g_1(y)/G_1(y) and g_2(y)/G_2(y), respectively, one can write the density of a composite random variable with two components as follows:
g_{\mathrm{comp}}(y) = \begin{cases} \dfrac{1}{1 + \phi} \dfrac{g_1(y)}{G_1(u)}, & y < u; \\[1ex] \dfrac{\phi}{1 + \phi} \dfrac{g_2(y)}{1 - G_2(u)}, & y \ge u, \end{cases}
where ϕ is the weight parameter, and u is the threshold to separate the two components. The cumulative distribution function of a composite model can be expressed as follows:
G_{\mathrm{comp}}(y) = \begin{cases} \dfrac{1}{1 + \phi} \dfrac{G_1(y)}{G_1(u)}, & y < u; \\[1ex] \dfrac{1}{1 + \phi} + \dfrac{\phi}{1 + \phi} \dfrac{G_2(y) - G_2(u)}{1 - G_2(u)}, & y \ge u. \end{cases}
Regarding the estimation scheme, one can use the maximum likelihood estimation to find the optimal parameters for the body and tail distributions. We note that the threshold u and weight parameter ϕ are not estimated but determined to guarantee the continuity and differentiability of the composite distribution at the threshold with the following constraints:
  • Continuity:
\lim_{y \to u^-} g_{\mathrm{comp}}(y) = \lim_{y \to u^+} g_{\mathrm{comp}}(y) \;\Longrightarrow\; \phi = \frac{\lim_{y \to u^-} g_1(y) / G_1(u)}{\lim_{y \to u^+} g_2(y) / (1 - G_2(u))} = \frac{g_1(u) \left( 1 - G_2(u) \right)}{g_2(u)\, G_1(u)};
  • Differentiability:
\frac{1}{1 + \phi} \lim_{y \to u^-} \frac{\mathrm{d}}{\mathrm{d}y} \frac{g_1(y)}{G_1(u)} = \frac{\phi}{1 + \phi} \lim_{y \to u^+} \frac{\mathrm{d}}{\mathrm{d}y} \frac{g_2(y)}{1 - G_2(u)} \;\Longrightarrow\; \frac{\mathrm{d}}{\mathrm{d}u} \ln \frac{g_1(u)}{g_2(u)} = 0.
For example, assume that Y_1 and Y_2 follow gamma and inverse-gamma distributions, that is, Y_1 \sim \mathrm{G}(\alpha_1, \theta_1) and Y_2 \sim \mathrm{IG}(\alpha_2, \theta_2), with the following density functions:
g_1(y_1) = \frac{(y_1/\theta_1)^{\alpha_1}\, e^{-y_1/\theta_1}}{y_1\, \Gamma(\alpha_1)}, \quad y_1 > 0,
g_2(y_2) = \frac{(\theta_2/y_2)^{\alpha_2}\, e^{-\theta_2/y_2}}{y_2\, \Gamma(\alpha_2)}, \quad y_2 > 0.
With Equations (15) and (16), one can find the threshold and weight parameters as functions of the distribution parameters as follows:
0 = \frac{\mathrm{d}}{\mathrm{d}u} \ln \frac{g_1(u)}{g_2(u)} = \frac{\mathrm{d}}{\mathrm{d}u} \ln \left[ \frac{(u/\theta_1)^{\alpha_1}\, e^{-u/\theta_1}}{u\, \Gamma(\alpha_1)} \Big/ \frac{(\theta_2/u)^{\alpha_2}\, e^{-\theta_2/u}}{u\, \Gamma(\alpha_2)} \right] = \frac{\mathrm{d}}{\mathrm{d}u} \left[ \alpha_1 \ln u - \frac{u}{\theta_1} + \alpha_2 \ln u + \frac{\theta_2}{u} \right] = \frac{\alpha_1 + \alpha_2}{u} - \frac{1}{\theta_1} - \frac{\theta_2}{u^2},
where the terms not depending on u have been dropped inside the derivative. Taking the larger root of the resulting quadratic gives
u = \frac{(\alpha_1 + \alpha_2) + \sqrt{(\alpha_1 + \alpha_2)^2 - 4\theta_2/\theta_1}}{2/\theta_1}, \qquad \phi = \frac{g_1(u) \left( 1 - G_2(u) \right)}{g_2(u)\, G_1(u)}.
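These closed-form expressions for u and φ are easy to verify numerically. The sketch below computes the threshold and weight for a gamma body with an inverse-gamma tail under illustrative parameter values (chosen so the discriminant is positive) and confirms that the spliced density is continuous at u.

```python
import numpy as np
from scipy.stats import gamma, invgamma

# Illustrative parameters (chosen so (a1 + a2)**2 > 4 * th2 / th1):
a1, th1 = 2.0, 1.0   # gamma body G(alpha1, theta1)
a2, th2 = 2.0, 2.0   # inverse-gamma tail IG(alpha2, theta2)

# Threshold: larger root of u**2 / th1 - (a1 + a2) * u + th2 = 0.
s = a1 + a2
u = th1 * (s + np.sqrt(s**2 - 4 * th2 / th1)) / 2

body, tail = gamma(a1, scale=th1), invgamma(a2, scale=th2)

# Weight from the continuity condition phi = g1(u)(1 - G2(u)) / (g2(u) G1(u)):
phi = body.pdf(u) * tail.sf(u) / (tail.pdf(u) * body.cdf(u))

def g_comp(y):
    """Spliced gamma / inverse-gamma density; continuous at u by construction."""
    return np.where(y < u,
                    body.pdf(y) / ((1 + phi) * body.cdf(u)),
                    phi * tail.pdf(y) / ((1 + phi) * tail.sf(u)))

print(u, phi)
print(g_comp(u - 1e-9), g_comp(u + 1e-9))  # the two branches agree at u
```

With these values, u works out to 2 + √2, and the body and tail branches of the density meet at that point by construction.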
Other composite models considered are listed in Table 3 with corresponding equations to determine the threshold values u. We note that the weight parameter ϕ is given by (15).

5. Empirical Analysis and Implications for Risk Management

5.1. Estimation Results

Estimating the parameters of the benchmark model is straightforward: one can directly maximize the joint log-likelihood with a numerical routine, for example, the optim function in R. Table 4 shows the estimated values for the binomial thinning model and the benchmark independent frequency model. The point estimates of \lambda, \lambda_1, \lambda_2, and \lambda_3 under the two models are similar. Additionally, the standard errors of the mean parameters are smaller than those of the dispersion parameters r, r_1, r_2, and r_3. We observe a significant improvement in the log-likelihood from incorporating dependence via the common factor. The improvements in AIC and BIC are even greater, as the binomial thinning model is more parsimonious than the independent model.
As mentioned in the previous section, the joint distribution of a copula-based model combines marginal distributions with a copula function. Here, we use the inference functions for margins (IFM) method, so that the marginal distributions estimated in the independent model are taken as given, while only the copula part is additionally estimated. Table 5 shows the estimated copula parameters and the log-likelihood values of each of the copula models. We note that the parameter estimated for the Gaussian copula model implies positive relationships among the three lines. Comparing the log-likelihoods, the Gaussian copula outperforms the others.
For the severity components, we use several composite models. For the body part (modelled with a light-tailed distribution), we consider the gamma and exponential distributions. For the tail part (modelled with a heavy-tailed distribution), we use the inverse-gamma, Pareto, and lognormal distributions. Table 6, Table 7 and Table 8 report the model selection criteria for the various composite models fitted to the building, contents, and profits severity data, respectively.
In the case of building losses, the gamma and lognormal (G and LN) and gamma and Pareto (G and Pa) combinations are shown to have the best goodness-of-fit. Likewise, we find that gamma and lognormal (G and LN) is the best for modelling contents losses, and gamma and Pareto (G and Pa) fits the profits losses well. Table 9 shows the point estimates of the three composite distributions' parameters, given the best combination for each coverage, along with the splicing points and the corresponding weight parameter values implied by the parameter estimates. We note that some transformation is required to make the weight parameter interpretable. For example, in the case of building losses, we can interpret 1/(1 + \phi) = 0.2433 as the proportion of Y that comes from the body part, the gamma distribution; on the other hand, \phi/(1 + \phi) = 0.7567 of Y comes from the tail part, the lognormal distribution. A larger weight parameter value indicates that the composite model is more heavy-tailed, and vice versa. The splicing point parameter u marks the change of distribution components; for the building coverage, the splicing point is 2.08943, which means that building losses greater than 2.08943 million Danish Krone are modelled by the lognormal distribution. Additionally, we observe that the losses from the profit line are more heavy-tailed than the losses from the other two lines.

5.2. Empirical Findings for Risk Management

In the insurance industry, estimating the risk level of a product or portfolio is critical for determining appropriate levels of premium and reserve. We recall that S_j and S stand for the aggregate loss amount from the jth line and the aggregate loss for all lines of insurance, as defined in (1) and (2), respectively. It is straightforward to see that E[S] = \sum_{j=1}^{3} E[S_j] due to the additivity of expectation. However, such a property generally does not hold for other types of risk measures, so it is important to analyze the risk level of the total claims S directly, rather than summing up the risk levels of S_1, S_2, and S_3. For our risk analysis, we use the following well-known risk measures:
  • Value at Risk (VaR): \mathrm{VaR}_\alpha(Y) = \min\{ y \in \mathbb{R} : F_Y(y) \ge \alpha \}, \alpha \in [0, 1];
  • Tail Value at Risk (TVaR): \mathrm{TVaR}_\alpha(Y) = E[\, Y \mid Y \ge \mathrm{VaR}_\alpha(Y) \,], \alpha \in [0, 1];
  • Proportional Hazard (PH) risk measure (Wang 1995): \mathrm{PH}_\alpha(X) = \int_0^\infty (1 - F(y))^{1/\alpha}\, \mathrm{d}y, \alpha \ge 1;
  • Dual Power (DP) risk measure: \mathrm{DP}_\beta(X) = \int_0^\infty \left[ 1 - (F(y))^\beta \right] \mathrm{d}y, \beta \ge 1.
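All four risk measures can be estimated empirically from a simulated sample. The following sketch implements them in Python (the grid-based numerical integration for PH and DP is an implementation choice, not from the paper) and sanity-checks them on a unit exponential sample, where VaR at 0.95 is ln 20, TVaR at 0.95 is ln 20 + 1, PH with α = 2 equals 2, and DP with β = 2 equals 1.5.

```python
import numpy as np

def var_(y, alpha):
    """Empirical VaR: smallest y with F(y) >= alpha."""
    return np.quantile(y, alpha, method="inverted_cdf")

def tvar(y, alpha):
    """Empirical TVaR: average of the losses at or above VaR_alpha."""
    q = var_(y, alpha)
    return y[y >= q].mean()

def ph(y, alpha, grid=4096):
    """PH risk measure: integral of (1 - F(y))**(1/alpha) over y >= 0."""
    ys = np.linspace(0.0, y.max(), grid)
    surv = 1.0 - np.searchsorted(np.sort(y), ys, side="right") / y.size
    return float(np.sum(surv ** (1.0 / alpha)) * (ys[1] - ys[0]))

def dp(y, beta, grid=4096):
    """Dual power risk measure: integral of 1 - F(y)**beta over y >= 0."""
    ys = np.linspace(0.0, y.max(), grid)
    cdf = np.searchsorted(np.sort(y), ys, side="right") / y.size
    return float(np.sum(1.0 - cdf ** beta) * (ys[1] - ys[0]))

# Sanity check on a unit exponential sample; the four values should be close to
# ln 20 (about 3.00), ln 20 + 1 (about 4.00), 2.0, and 1.5, respectively:
rng = np.random.default_rng(1)
y = rng.exponential(1.0, size=200_000)
print(var_(y, 0.95), tvar(y, 0.95), ph(y, 2.0), dp(y, 2.0))
```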
While TVaR is not a coherent risk measure unless the underlying distribution is continuous, it is innocuous to treat the TVaRs of S_1, S_2, S_3, and S as coherent here: we mainly focus on the tail part, where the claim amounts are strictly positive and the underlying distributions are continuous. We also note that (Wang 1994) showed that a risk measure obtained by integrating a transformed survival function is coherent when the transformation is concave, so that PH and DP are both coherent.
For comparison of the calculated risk measures under each of the models, we apply a Monte Carlo simulation to numerically evaluate the risk measure values. More specifically, under each model specification, with the estimated parameters shown in Section 5.1, we simulate 100,000 replicates of the number of accidents M, the claim numbers N_1, N_2, and N_3 for the three business lines, and, subsequently, the aggregate amounts S_1, S_2, and S_3.
The simulation for the independent model is straightforward: by independence, we draw the claim numbers N_1, N_2, and N_3 for the three business lines directly from their negative binomial distributions. Unlike the independent model, the binomial thinning model requires first simulating the reported number of accidents M; binomial random draws with size parameter M then yield the claim numbers N_1, N_2, and N_3 for the three lines of business. The logic for the copula models is similar: we first generate trivariate uniform random numbers from the copula functions and then obtain the claim numbers N_1, N_2, and N_3 by applying the inverses of the marginal distributions. Once the claim frequencies are generated, the severity components are generated subsequently. For example, given N_1, uniform random numbers are generated N_1 times and converted to individual severities via the inverse distribution (quantile) function of the composite distribution for building losses; these values are then summed to give S_1.
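The simulation routine for the binomial thinning model can be sketched as follows. The frequency parameters are illustrative rather than the fitted ones, and a plain standard lognormal stands in for the fitted composite severity distribution, so the resulting values are purely for demonstration.

```python
import numpy as np
from scipy.stats import binom, lognorm, nbinom

rng = np.random.default_rng(7)
n_sim = 100_000
lam, r = 16.0, 5.0                   # accident-count parameters (illustrative)
lam_j = np.array([9.0, 6.0, 2.0])    # line means (illustrative)

# Step 1: reported accidents M ~ NB(mean lam, size r).
m = nbinom.rvs(r, r / (r + lam), size=n_sim, random_state=rng)

# Step 2: line frequencies N_j | M ~ Binomial(M, lam_j / lam), conditionally independent.
n = np.column_stack([binom.rvs(m, lj / lam, random_state=rng) for lj in lam_j])

# Step 3: compound sums; a standard lognormal severity stands in for the
# fitted composite distribution.
def aggregate(counts):
    """Sum counts[i] i.i.d. severities for each simulated month i."""
    sev = lognorm.rvs(1.0, size=counts.sum(), random_state=rng)
    idx = np.repeat(np.arange(counts.size), counts)
    return np.bincount(idx, weights=sev, minlength=counts.size)

s = sum(aggregate(n[:, j]) for j in range(3))  # S = S1 + S2 + S3
print(s.mean())              # ~ (9 + 6 + 2) * exp(0.5) by Wald's identity
print(np.quantile(s, 0.95))  # simulated VaR_0.95 of S
```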
Figure 2, Figure 3 and Figure 4 show the scatterplots of pairwise combinations of building, contents, and profit claims for the observed and simulated frequency data. The plots of the observed data show apparent positive relationships among the marginal frequencies. As expected, the independent model cannot capture such dependence. Among the other models, the binomial thinning model shows a substantial linear relationship between the building and contents claim numbers, which is the most similar to the observed one. In the case of Figure 3, however, the Joe copula best captures the relationship between the building and profit frequencies.
Lastly, Table 10 shows the approximated risk measures under the different models. The independent model reproduces relatively smaller values of VaR and TVaR for the aggregated claims S = S_1 + S_2 + S_3, whereas the calculated risk measure values for each coverage, S_1, S_2, and S_3, are more or less the same regardless of the chosen model. This is quite natural: regardless of the (assumed) dependence structure, the marginal distributions of N_1, N_2, and N_3 (and, subsequently, of S_1, S_2, and S_3) are the same. As a result, the TVaR of S under the independent model is severely underestimated compared with the observed (or empirical) TVaR, while the other, dependent models reproduce the empirical TVaR of S with smaller deviations. This implies that considering the possible dependence among different types of insurance coverage is essential for effective enterprise risk management.

6. Conclusions and Discussions

In conclusion, we connected the different coverages by jointly modelling the loss amounts incurred under each coverage due to a fire accident while taking into account their heavy-tail behaviour. From the insurance perspective, we assessed several risk measures, each with its own interpretation; for example, the Value at Risk at level \alpha can be interpreted as the assets that should be reserved to reduce the probability of insolvency to 1 - \alpha. Compared with the fully independent model, both dependence modelling frameworks performed better from the statistical and insurance perspectives. Specifically, the binomial thinning model captured the behaviour of the observed claim numbers better than the independent model according to the calculated model evaluation criteria, and both the binomial thinning and copula-based models provided more reasonable and consistent risk measures.
Additionally, we presented two modelling frameworks to capture the dependency: binomial thinning and copula-based. Although the approximated risk measure results do not single out a best model, they do illustrate the flexibility of copula-based models: while the binomial thinning model imposes a fixed dependence structure, employing different copula functions allows the dependence to be captured through different joint distributions.
There are some concerns and limitations of the current research. Firstly, while we used copulas with discrete random variables, (Genest and Nešlehová 2007) discussed the limitations of applying copulas to discrete random variables. Thus, we should interpret the dependence measures carefully because of the lack of uniqueness of the copula functions; nevertheless, the approach remains effective when the probability mass of the discrete random variables is spread widely enough over their support. Secondly, we implicitly assumed that the frequency and severity components are independent, whereas some of the existing literature shows the presence of dependence between the frequency and severity components, including, but not limited to, (Jeong and Valdez 2020) and (Vernic et al. 2021).
For future research, Geenens (2020) proposed a method for modelling dependence among discrete random variables using an idea similar to the copula approach; building on and refining it would allow more rigorous analyses of dependent multivariate discrete data. One could also study the dependence between the frequency and severity components on top of the dependence among the claims from multiple types of coverage.

Author Contributions

Conceptualization, H.J. and Y.L.; methodology, H.J. and Y.L.; software, T.Y. and H.J.; validation, T.Y.; formal analysis, T.Y.; investigation, T.Y. and H.J.; data curation, T.Y. and H.J.; writing—original draft preparation, T.Y. and H.J.; writing—review and editing, H.J. and Y.L.; visualization, T.Y.; supervision, H.J. and Y.L.; project administration, H.J. and Y.L.; funding acquisition, H.J. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by NSERC Discovery grant number R611467/R611851 and CANSSI GSE, Scholarship number R619645.

Data Availability Statement

The research data used in this article are available in the R package CASdatasets. The R code to reproduce the results in this article is available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
VaR   Value at Risk
TVaR  Tail Value at Risk
PH    Proportional Hazard
DP    Dual Power

References

  1. Cooray, Kahadawala, and Chin-I Cheng. 2015. Bayesian estimators of the lognormal–Pareto composite distribution. Scandinavian Actuarial Journal 2015: 500–15. [Google Scholar] [CrossRef]
  2. Fung, Tsz Chai, Himchan Jeong, and George Tzougas. 2023. Investigating the effect of climate-related hazards on claim frequency prediction in motor insurance. SSRN Electronic Journal, SSRN 4638074. [Google Scholar] [CrossRef]
  3. Geenens, Gery. 2020. Copula modeling for discrete random vectors. Dependence Modeling 8: 417–40. [Google Scholar] [CrossRef]
  4. Genest, Christian, and Johanna Nešlehová. 2007. A Primer on Copulas for Count Data. Astin Bulletin 37: 475–515. [Google Scholar] [CrossRef]
  5. Hong, Liang, and Ryan Martin. 2018. Dirichlet process mixture models for insurance loss data. Scandinavian Actuarial Journal 2018: 545–54. [Google Scholar] [CrossRef]
  6. Jeong, Himchan, and Emiliano A. Valdez. 2020. Predictive compound risk models with dependence. Insurance: Mathematics and Economics 94: 182–95. [Google Scholar] [CrossRef]
  7. Jeong, Himchan, George Tzougas, and Tsz Chai Fung. 2023. Multivariate claim count regression model with varying dispersion and dependence parameters. Journal of the Royal Statistical Society Series A: Statistics in Society 186: 61–83. [Google Scholar] [CrossRef]
  8. Jeong, Himchan. 2024. Tweedie multivariate semi-parametric credibility with the exchangeable correlation. Insurance: Mathematics and Economics 115: 13–21. [Google Scholar] [CrossRef]
  9. Lee, Gee Y., and Peng Shi. 2019. A dependent frequency–severity approach to modeling longitudinal insurance claims. Insurance: Mathematics and Economics 87: 115–29. [Google Scholar] [CrossRef]
  10. McNeil, Alexander J. 1997. Estimating the Tails of Loss Severity Distributions Using Extreme Value Theory. ASTIN Bulletin 27: 117–37. [Google Scholar] [CrossRef]
  11. Miljkovic, Tatjana, and Bettina Grün. 2016. Modeling loss data using mixtures of distributions. Insurance: Mathematics and Economics 70: 387–96. [Google Scholar] [CrossRef]
  12. Oh, Rosy, Himchan Jeong, Jae Youn Ahn, and Emiliano A. Valdez. 2021. A multi-year microlevel collective risk model. Insurance: Mathematics and Economics 100: 309–28. [Google Scholar] [CrossRef]
  13. Pigeon, Mathieu, and Michel Denuit. 2011. Composite Lognormal–Pareto model with random threshold. Scandinavian Actuarial Journal 2011: 177–92. [Google Scholar] [CrossRef]
  14. Resnick, Sidney I. 1997. Discussion of the Danish Data on Large Fire Insurance Losses. ASTIN Bulletin 27: 139–51. [Google Scholar] [CrossRef]
  15. Scollnik, David P., and Chenchen Sun. 2012. Modeling with Weibull-Pareto models. North American Actuarial Journal 16: 260–72. [Google Scholar] [CrossRef]
  16. Sklar, Abe. 1959. Fonctions de répartition à n dimensions et leurs marges. Publications de l’Institut Statistique de l’Université de Paris VIII: 229–31. [Google Scholar]
  17. Vernic, Raluca, Catalina Bolancé, and Ramon Alemany. 2021. Sarmanov distribution for modeling dependence between the frequency and the average severity of insurance claims. Insurance: Mathematics and Economics 102: 111–25. [Google Scholar] [CrossRef]
  18. Wang, Shaun. 1994. Premium Calculation by Transforming the Layer Premium Density. ASTIN Bulletin 26: 71–92. [Google Scholar] [CrossRef]
  19. Wang, Shaun. 1995. Insurance Pricing and Increased Limits Ratemaking by Proportional Hazards Transforms. Insurance: Mathematics and Economics 17: 43–54. [Google Scholar] [CrossRef]
Figure 1. Exploration of seasonal effects with the Danish reinsurance dataset.
Figure 2. Observed and simulated building vs. contents claims.
Figure 3. Observed and simulated building vs. profits claims.
Figure 4. Observed and simulated contents vs. profits claims.
Table 1. Excerpt from the Danish Fire Dataset.
Date            Building      Contents      Profits       Total
3 January 1980  1.09809663    0.58565150    0.00000000    1.683748
4 January 1980  1.75695461    0.33674960    0.00000000    2.093704
5 January 1980  1.73258126    0.00000000    0.00000000    1.732581
7 January 1980  0.00000000    1.30537600    0.47437775    1.779754
7 January 1980  1.24450952    3.36749600    0.00000000    4.612006
Table 2. Summary of loss amount for three business lines.
Source    Min.      1st Qu.   Median    Mean      3rd Qu.   Max.
Building  0.02319   0.96618   1.32013   1.98668   1.97860   152.41321
Contents  0.00083   0.29000   0.57570   1.70178   1.44648   132.01320
Profits   0.00408   0.10011   0.26619   0.85180   0.67929   61.93265
Table 3. Threshold values u for various composite models ( G : gamma, E : exponential, IG : Inverse-gamma, LN : lognormal, and P a : Pareto).
G and IG: head $\frac{(x/\theta_1)^{\alpha_1} e^{-x/\theta_1}}{x\,\Gamma(\alpha_1)}$, tail $\frac{(\theta_2/x)^{\alpha_2} e^{-\theta_2/x}}{x\,\Gamma(\alpha_2)}$, threshold $u = \frac{\alpha_1 + \alpha_2 + \sqrt{(\alpha_1+\alpha_2)^2 - 4\theta_2/\theta_1}}{2/\theta_1}$.
G and LN: head $\frac{(x/\theta_1)^{\alpha_1} e^{-x/\theta_1}}{x\,\Gamma(\alpha_1)}$, tail $\frac{1}{x\sigma_2\sqrt{2\pi}} \exp\!\left(-\frac{(\ln x - \mu_2)^2}{2\sigma_2^2}\right)$, threshold solving $0 = \frac{\alpha_1}{u} - \frac{1}{\theta_1} + \frac{\ln u - \mu_2}{u\sigma_2^2}$.
G and Pa: head $\frac{(x/\theta_1)^{\alpha_1} e^{-x/\theta_1}}{x\,\Gamma(\alpha_1)}$, tail $\frac{\alpha_2\theta_2^{\alpha_2}}{(x+\theta_2)^{\alpha_2+1}}$, threshold $u = \frac{\alpha_1+\alpha_2-\theta_2/\theta_1 + \sqrt{(\alpha_1+\alpha_2-\theta_2/\theta_1)^2 + 4(\theta_2/\theta_1)(\alpha_1-1)}}{2/\theta_1}$.
E and IG: head $\frac{e^{-x/\theta_1}}{\theta_1}$, tail $\frac{(\theta_2/x)^{\alpha_2} e^{-\theta_2/x}}{x\,\Gamma(\alpha_2)}$, threshold $u = \frac{\alpha_2+1+\sqrt{(\alpha_2+1)^2 - 4\theta_2/\theta_1}}{2/\theta_1}$.
E and LN: head $\frac{e^{-x/\theta_1}}{\theta_1}$, tail $\frac{1}{x\sigma_2\sqrt{2\pi}} \exp\!\left(-\frac{(\ln x - \mu_2)^2}{2\sigma_2^2}\right)$, threshold solving $0 = -\frac{1}{\theta_1} + \frac{1}{u} + \frac{\ln u - \mu_2}{u\sigma_2^2}$.
E and Pa: head $\frac{e^{-x/\theta_1}}{\theta_1}$, tail $\frac{\alpha_2\theta_2^{\alpha_2}}{(x+\theta_2)^{\alpha_2+1}}$, threshold $u = (\alpha_2+1)\theta_1 - \theta_2$.
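As a quick sanity check on the threshold formulas above, the sketch below (with illustrative parameter values) verifies numerically that the exponential-Pareto threshold u = (α2 + 1)θ1 − θ2 is the point where the log-density slopes of the head and tail coincide:

```python
# Numeric check of the E-Pa threshold u = (alpha2 + 1) * theta1 - theta2:
# at x = u the log-density slopes of the exponential head and the Pareto
# tail coincide (illustrative parameter values).
theta1, alpha2, theta2 = 0.5, 2.0, 0.4

u = (alpha2 + 1) * theta1 - theta2

def dlog_exp(x):
    """d/dx log f_E(x) for f_E(x) = exp(-x/theta1) / theta1."""
    return -1.0 / theta1

def dlog_pareto(x):
    """d/dx log f_Pa(x) for f_Pa(x) = a2 * t2^a2 / (x + t2)^(a2 + 1)."""
    return -(alpha2 + 1) / (x + theta2)

print(u, dlog_exp(u), dlog_pareto(u))  # the two slopes match at u
```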
Table 4. Parameter estimates for the binomial thinning and independent frequency models.
            Binomial Thinning Model                   Independent Frequency Model
            Est.    95% CI Low  95% CI Up  Std.Err    Est.    95% CI Low  95% CI Up  Std.Err
λ1          15.08   14.24       15.91      0.43       15.08   14.21       15.95      0.44
r1          -       -           -          -          20.74   8.96        32.52      6.01
λ2          12.72   11.97       13.47      0.2        12.72   11.92       13.52      0.41
r2          -       -           -          -          17.59   7.55        27.64      5.12
λ3          4.67    4.27        5.07       0.20       4.67    4.11        5.22       0.28
r3          -       -           -          -          3.62    1.94        5.30       0.86
λ           16.42   15.53       17.30      0.45       16.42   15.53       17.30      0.45
r           25.32   10.03       40.62      7.80       25.24   10.05       40.43      7.75
log L       −1183.47                                  −1516.57
AIC         2382.94                                   3043.14
BIC         2406.00                                   3057.56
Table 5. The estimates and log-likelihood of copula models.
                 Gaussian Copula   Gumbel Copula   Joe Copula
Est. parameter   0.70452           1.83147         2.17170
log L            −1015.953         −1021.079       −1033.461
Table 6. Log-likelihoods of composite models for building severity.
        G and IG   G and Pa   G and LN   E and IG   E and Pa   E and LN
log L   −2800.93   −2771.15   −2771.14   −3181.33   −3220.69   −3220.72
AIC     5609.87    5550.30    5550.29    6368.65    6447.37    6447.43
BIC     5632.25    5572.68    5572.67    6385.44    6464.16    6464.22
Table 7. Log-likelihoods of composite models for content severity.
        G and IG   G and Pa   G and LN   E and IG   E and Pa   E and LN
log L   −2187.88   −2039.52   −2037.59   −2102.97   −2102.81   −2102.16
AIC     4383.77    4087.04    4083.18    4211.95    4211.61    4210.32
BIC     4405.47    4108.74    4104.88    4228.23    4227.89    4226.60
Table 8. Log-likelihoods of composite models for profit severity.
        G and IG   G and Pa   G and LN   E and IG   E and Pa   E and LN
log L   −309.19    −297.19    −427.81    −305.93    −304.53    −304.48
AIC     626.38     602.39     863.62     617.86     615.06     614.97
BIC     644.07     620.08     881.31     631.13     628.33     628.24
Table 9. Parameter estimates for the severity components.
              Building: G and LN    Contents: G and LN    Profits: G and Pa
Head Dist.    α1 = 3.71085          α1 = 1.98766          α1 = 1.55072
              θ1 = 0.37198          θ1 = 0.21591          θ1 = 0.10144
Tail Dist.    μ2 = 331.88884        μ2 = 1.34871          α2 = 1.41237
              σ2 = 13.20987         σ2 = 1.69228          θ2 = 0.37195
u             2.08943               0.47466               0.11282
φ             0.32151               1.34244               2.92302
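The fitted composite severities splice a head density below the threshold u with a heavy tail above it. The sketch below shows the general splicing mechanics for a gamma head and a Pareto tail; the parameters, threshold, and head weight w are illustrative (not the fitted values above), and the continuity conditions used in the paper's composite models are not imposed here.

```python
import numpy as np
from scipy import integrate, stats

# Spliced ("composite") severity density: gamma head below a threshold u,
# Pareto tail above it.  All parameter values here are illustrative.
a1, th1 = 1.55, 0.10   # gamma head: shape, scale
a2, th2 = 1.41, 0.37   # Pareto tail parameters
u, w = 0.11, 0.6       # threshold and head probability mass (hypothetical)

head = stats.gamma(a1, scale=th1)

def tail_pdf(x):       # Pareto density  a2 * th2^a2 / (x + th2)^(a2 + 1)
    return a2 * th2**a2 / (x + th2) ** (a2 + 1)

def tail_sf(x):        # Pareto survival (th2 / (x + th2))^a2
    return (th2 / (x + th2)) ** a2

def composite_pdf(x):
    x = np.asarray(x, dtype=float)
    below = w * head.pdf(x) / head.cdf(u)        # head renormalised to [0, u]
    above = (1 - w) * tail_pdf(x) / tail_sf(u)   # tail renormalised to (u, inf)
    return np.where(x <= u, below, above)

# The two pieces carry mass w and 1 - w, so the density integrates to one
m1, _ = integrate.quad(composite_pdf, 0, u)
m2, _ = integrate.quad(composite_pdf, u, np.inf)
print(m1 + m2)
```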
Table 10. Values of risk measures under different models.
Measure     Model         Building (S1)  Content (S2)  Profit (S3)  Aggregate (S)
VaR 0.90    Observations  43.36753       37.61073      8.927676     84.95829
            Independent   45.83948       40.13501      8.853674     82.54968
            Bin. Thin.    45.55415       39.81174      8.330259     87.44851
            Gaussian      46.00964       39.93148      8.925007     88.03041
            Gumbel        45.94382       40.4155       8.80955      88.64008
            Joe           46.00862       40.25366      8.792391     88.37843
TVaR 0.90   Observations  67.63288       61.7309       17.34992     130.9614
            Independent   62.29311       64.19727      21.77994     114.7784
            Bin. Thin.    62.18439       63.46745      22.57028     122.2398
            Gaussian      62.57564       63.71817      24.13904     123.6911
            Gumbel        62.72085       64.09827      24.19948     124.4939
            Joe           63.20349       63.48082      22.36632     122.8164
TVaR 0.95   Observations  88.59588       82.65819      24.1481      171.8545
            Independent   75.19079       82.92127      32.76977     140.1614
            Bin. Thin.    75.28895       82.01287      34.9973      149.5173
            Gaussian      75.53303       82.18786      37.40043     151.7549
            Gumbel        75.80739       82.24379      37.66422     152.5675
            Joe           76.64548       81.35734      34.09076     149.5723
PH 2        Observations  51.93275       42.81760      11.31421     93.62481
            Independent   54.19547       56.37977      28.35901     102.4392
            Bin. Thin.    58.84183       56.89981      40.01965     114.9348
            Gaussian      53.31081       50.2014       63.63473     134.7637
            Gumbel        53.52596       49.71846      55.63991     126.7038
            Joe           60.19573       47.35576      38.58851     111.7715
DP 3        Observations  43.22733       35.96571      8.24774      82.29961
            Independent   41.57237       35.75419      9.25285      76.3109
            Bin. Thin.    41.50927       35.52621      9.46211      79.59646
            Gaussian      41.69046       35.60265      9.98022      80.09476
            Gumbel        41.69769       35.82457      9.98923      80.20223
            Joe           41.82949       35.57191      9.42534      79.60271
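The PH and DP rows above are distortion risk measures, obtained by integrating a distorted survival function; with a concave distortion g (here g(s) = s^(1/2) for PH 2 and g(s) = 1 − (1 − s)^3 for DP 3, following the proportional hazard and dual power transforms), the measure loads above the expected loss. A minimal empirical sketch on a hypothetical lognormal loss sample, not the paper's data:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical lognormal loss sample (a stand-in, not the paper's data)
X = rng.lognormal(mean=3.0, sigma=0.8, size=100_000)

def distortion_measure(sample, g):
    """Empirical distortion risk measure: integral of g(empirical survival)."""
    xs = np.sort(sample)
    n = len(xs)
    sf = 1.0 - np.arange(1, n + 1) / n   # empirical survival at each xs[i]
    # g(1) = 1 covers [0, xs[0]); then a step-function integral above it
    return xs[0] + np.sum(g(sf[:-1]) * np.diff(xs))

ph2 = distortion_measure(X, lambda s: np.sqrt(s))        # PH transform, index 2
dp3 = distortion_measure(X, lambda s: 1 - (1 - s) ** 3)  # dual power, index 3
print(X.mean(), ph2, dp3)  # both distorted measures exceed the mean loss
```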

Share and Cite

MDPI and ACS Style

Yan, T.; Lu, Y.; Jeong, H. Dependence Modelling for Heavy-Tailed Multi-Peril Insurance Losses. Risks 2024, 12, 97. https://doi.org/10.3390/risks12060097
