Model Efficiency and Uncertainty in Quantile Estimation of Loss Severity Distributions

Brazauskas, Vytaras; Upretee, Sahadeb

doi:10.3390/risks7020055

by Vytaras Brazauskas * and Sahadeb Upretee

Department of Mathematical Sciences, University of Wisconsin-Milwaukee, P.O. Box 413, Milwaukee, WI 53201, USA

* Author to whom correspondence should be addressed.

Risks 2019, 7(2), 55; https://doi.org/10.3390/risks7020055

Submission received: 29 January 2019 / Revised: 23 April 2019 / Accepted: 26 April 2019 / Published: 15 May 2019

(This article belongs to the Special Issue Estimation of Risk Measures from Data -- Estimators, Computation, Robustness and Elicitability)


Abstract

Quantiles of probability distributions play a central role in the definition of risk measures (e.g., value-at-risk, conditional tail expectation) which in turn are used to capture the riskiness of the distribution tail. Estimates of risk measures are needed in many practical situations such as in pricing of extreme events, developing reserve estimates, designing risk transfer strategies, and allocating capital. In this paper, we present the empirical nonparametric and two types of parametric estimators of quantiles at various levels. For parametric estimation, we employ the maximum likelihood and percentile-matching approaches. Asymptotic distributions of all the estimators under consideration are derived when data are left-truncated and right-censored, which is a typical loss variable modification in insurance. Then, we construct relative efficiency curves (REC) for all the parametric estimators. Specific examples of such curves are provided for exponential and single-parameter Pareto distributions for a few data truncation and censoring cases. Additionally, using simulated data we examine how wrong quantile estimates can be when one makes incorrect modeling assumptions. The numerical analysis is also supplemented with standard model diagnostics and validation (e.g., quantile-quantile plots, goodness-of-fit tests, information criteria) and presents an example of when those methods can mislead the decision maker. These findings pave the way for further work on RECs with potential for them being developed into an effective diagnostic tool in this context.

Keywords:

data truncation and censoring; empirical estimator; maximum likelihood; model uncertainty; percentile matching; quantile estimation

1. Introduction

Quantiles of probability distributions play a central role in the definition of risk measures (e.g., value-at-risk, conditional tail expectation), which in turn are used to capture the riskiness of the distribution tail. Estimates of risk measures are needed in many practical situations such as in pricing of extreme events, developing reserve estimates, designing risk transfer strategies, and allocating capital. When solving such problems, the first highly consequential task is to find point estimates of quantiles and to assess their variability. In this context, the empirical nonparametric approach is the simplest one to use (see Jones and Zitikis 2003), but it lacks efficiency due to the scarcity of sample data in the tails. On the other hand, parametric estimators can significantly improve quantile estimators’ efficiency (see Brazauskas and Kaiser 2004; Kaiser and Brazauskas 2006). Moreover, the parametric approach can accommodate truncation and censoring that are common features of insurance loss data. Of course, the main drawback of parametric estimators is that they are sensitive to initial modeling assumptions, which creates model uncertainty^1.

There is a growing number of studies on various aspects of model risk in modeling, measuring and pricing risks. Cairns (2000) was the first author to systematically study model risk in insurance. He discussed different sources of model risk, including parameter uncertainty and model uncertainty, and presented methods to treat these uncertainties coherently. Hartman et al. (2017) focused on parameter uncertainty and analyzed its impact in different sectors of insurance practice, namely, life insurance, health insurance, and property/casualty insurance. They also gave a comprehensive review of the literature concerning parameter uncertainty. A recent article by Hong et al. (2018) shows how typical claim predictions change when the model is uncertain. In particular, they illustrate such effects by using standard model selection tools such as the Akaike Information Criterion to determine the “best” regression subset of covariates, and then apply the selected model for claim prediction. Bignozzi et al. (2015) and Samanthi et al. (2017) are two recent examples of theoretical and practical investigations, respectively, of the effects of the data dependence assumption on subsequent risk measuring. Also, an extensive simulation study involving estimation of upper quantiles of lognormal, log-logistic, and log-double exponential distributions under model and parameter uncertainty was conducted by Modarres et al. (2002). Their overall conclusion was that when modeling is done by assuming one of the three families and treating the other two as possible misspecification, the least severe effect on upper quantile estimates occurs when the lognormal distribution is assumed.

Further, there is even more interest in this topic in the financial risk management literature. Model uncertainty within the risk aggregation problems has been recently studied by Embrechts et al. (2015) and Cambou and Filipović (2017), and for value-at-risk estimation by Alexander and Sarabia (2012). Cont et al. (2010) and Glasserman and Xu (2014) linked financial risk measurement procedures, model risk, and robustness. The first paper suggests the use of the classical robust statistics techniques for managing model risk, while the second pursues model distance and entropy based techniques to derive the worst-case risk measurements (relative to measurements from a baseline model). Finally, Aggarwal et al. (2016) and Black et al. (2018) provide comprehensive accounts on model risk identification, measurement, and management in practice. These authors develop a model risk framework, identify distinct model cultures within an organization, review common methods and challenges for quantifying model risk, and discuss difficulties that arise in mapping model errors to actual financial impact.

The implied conclusion in many academic and practice-oriented papers on model risk is that it can be reduced or mitigated by using all or a combination of the following: performing model validation, fitting multiple models, and applying various stress tests or sensitivity analysis. This idea was in part adopted in the case studies of Brazauskas and Kleefeld (2016), which were based on well-known (real) reinsurance data. What was discovered by these authors, however, is that fitting multiple models and using extensive model validation for each of them may not be sufficient if data are left-truncated. That is, they used quantile-quantile plots, Kolmogorov-Smirnov (KS) and Anderson-Darling (AD) tests, Akaike and Bayesian information criteria (AIC and BIC) and concluded that six different models were acceptable for each of the 12 data sets analyzed. However, when all those models were used to estimate the 90% and 95% quantiles (value-at-risk measures) for ground-up loss, for some data sets they resulted in similar estimates, which would be expected, while for others they were far apart, which is counterintuitive. Moreover, using left-truncated operational risk data, Yu and Brazauskas (2017) have shown that even shifted parametric models (which might seem like a plausible option but nonetheless incorrectly account for data truncation) can pass those standard model validation tests. Next, due to the presence of deductibles and policy limits in insurance contracts, data truncation and censoring are unavoidable modifications of the loss severity variable. This suggests that quantile and, more generally, risk measure estimation requires careful thinking and analysis.

In this paper, we present the empirical nonparametric, maximum likelihood, and percentile-matching estimators of ground-up loss distribution quantiles (at various levels). Asymptotic distributions of these estimators are derived when data are left-truncated and right-censored. Relative efficiency curves (REC) for all the estimators are then constructed, and plots of such curves are provided for exponential and single-parameter Pareto distributions. Then, we generate a sample of 50 observations from a left-truncated and right-censored Pareto I model and using that data set investigate how biased quantile estimates can be when one makes incorrect distributional assumptions or relies on a wrong modeling approach. The numerical analysis is also supplemented with standard model diagnostics and validation (e.g., quantile-quantile plots and KS and AD tests) and demonstrates how those methods can mislead the decision maker. In addition, we examine the information provided by RECs and conclude that such curves have strong potential for being developed into an effective diagnostic tool in this context.

The rest of the paper is organized as follows. In Section 2, nonparametric and parametric quantile estimators are defined and their asymptotic distributions are specified when the underlying random variable is left-truncated and right-censored. The next section presents two illustrative examples of RECs for exponential and single-parameter Pareto distributions. Specifically, RECs of maximum likelihood, percentile matching, and empirical estimators of quantiles of these distributions are plotted. Section 4 studies the effects of distribution choice and modeling approach on estimates of quantiles. Concluding remarks are offered in Section 5. Finally, the appendix provides two asymptotic theorems of mathematical statistics and a detailed description of how to construct RECs. These results are essential to analytic derivations in the paper, and we recommend that the reader review them first.

2. Quantile Estimation

Insurance contracts have coverage modifications that need to be taken into account when modeling the underlying loss severity variable. In this section, we specify the estimators of quantiles of the ground-up distribution and derive their asymptotic distributions when the loss variable is affected by left truncation (due to deductible) and right censoring (due to policy limit). We consider three types of estimators: empirical (Section 2.1), maximum likelihood, MLE (Section 2.2), and percentile matching, PM (Section 2.3).

To present the estimators and their properties, let us start with notation and assumptions. Suppose we observe n continuous independent identically distributed (i.i.d.) random variables

X_{1}^{*}, \dots, X_{n}^{*}

, where each

X^{*}

is equal to the ground-up variable X, if X exceeds threshold t (

t \geq 0

) but is capped at upper limit u (

u > t

). That is,

X^{*}

is a mixed discrete-continuous random variable that satisfies the following conditional event relationship:

X^{*} \overset{d}{=} \min \{X, u\} \mid X > t,

where

\overset{d}{=}

denotes “equal in distribution.” Also, let us denote the probability density function (pdf), cumulative distribution function (cdf), and quantile function (qf) of X as f, F, and

F^{- 1}

, respectively. Then, the cdf

F_{*}

, pdf

f_{*}

, qf

F_{*}^{- 1}

of

X^{*}

are related to F, f,

F^{- 1}

and given by:

F_{*} (x^{*} | t; u) = \begin{cases} 0, & x^{*} \leq t; \\ \frac{F (x^{*}) - F (t)}{1 - F (t)}, & t < x^{*} < u; \\ 1, & x^{*} \geq u, \end{cases}

(1)

f_{*} (x^{*} | t; u) = \begin{cases} \frac{f (x^{*})}{1 - F (t)}, & t < x^{*} < u; \\ \frac{1 - F (u^{-})}{1 - F (t)}, & x^{*} = u; \\ 0, & \text{elsewhere}, \end{cases}

(2)

F_{*}^{- 1} (s | t; u) = \begin{cases} F^{- 1} (s + (1 - s) F (t)), & 0 \leq s < \frac{F (u) - F (t)}{1 - F (t)}; \\ u, & \frac{F (u) - F (t)}{1 - F (t)} \leq s \leq 1 . \end{cases}

(3)

Note that we are interested in estimating the pth quantile of X (i.e.,

F^{- 1} (p)

) based on the observed data

X_{1}^{*} = x_{1}^{*}, \dots, X_{n}^{*} = x_{n}^{*}

. Thus, Theorems A1 and A2 in Appendix A.1 and the REC construction of Appendix A.2 have to be applied to functions (1)–(3), not F, f,

F^{- 1}

.
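Formulas (1) and (3) translate directly into code. The following is a minimal Python sketch, not the authors' implementation: the function names `cdf_star` and `qf_star` are hypothetical, and the exponential ground-up model (x_0 = 100, θ = 500, t = 500, u = 2500) is borrowed from Section 3.1 purely for illustration.

```python
import math

def cdf_star(x, F, t, u):
    """Equation (1): cdf of the observed variable X*, given the ground-up cdf F."""
    if x <= t:
        return 0.0
    if x >= u:
        return 1.0
    return (F(x) - F(t)) / (1.0 - F(t))

def qf_star(s, F, F_inv, t, u):
    """Equation (3): quantile function of X*."""
    s_max = (F(u) - F(t)) / (1.0 - F(t))  # probability of an uncensored observation
    if s < s_max:
        return F_inv(s + (1.0 - s) * F(t))
    return u

# Illustrative ground-up model (an assumption for this sketch): exponential
x0, theta, t, u = 100.0, 500.0, 500.0, 2500.0
F = lambda x: 1.0 - math.exp(-(x - x0) / theta) if x > x0 else 0.0
F_inv = lambda s: x0 - theta * math.log(1.0 - s)

q = qf_star(0.5, F, F_inv, t, u)
print(q, cdf_star(q, F, t, u))
```

Evaluating `cdf_star` at `qf_star(0.5, ...)` recovers 0.5 up to floating-point error, while any level at or above (F(u) - F(t)) / (1 - F(t)) maps to the censoring point u.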

2.1. Empirical Approach

As mentioned earlier, the empirical approach is restricted to the range of observed data. Indeed, based on

x_{1}^{*}, \dots, x_{n}^{*}

, the empirical estimator

{\hat{F}}_{EMP} (t) = 0

. Thus, it cannot take full advantage of formulas (1)–(3), and yields a biased estimator that works within a limited range of quantile levels. In this case, the

F^{- 1} (p)

estimator is

{\hat{F}}_{EMP}^{- 1} (p) = x_{(⌈ n p ⌉)}^{*} < u

, and as follows from Theorem A1,

{\hat{F}}_{EMP}^{- 1} (p) \text{ is } \mathcal{AN} \left( F_{*}^{- 1} (p), \frac{1}{n} \frac{p (1 - p)}{f_{*}^{2} (F_{*}^{- 1} (p))} \right), \quad 0 < p < \frac{F (u) - F (t)}{1 - F (t)} .

(4)

To see that this estimator is positively biased, i.e., any (estimable) quantile of the observable variable

X^{*}

is never below the corresponding quantile of the unobservable variable X (which is what we want to estimate), notice that for the mean parameter in (4), we have

F_{*}^{- 1} (p) = F^{- 1} (p + (1 - p) F (t)) \geq F^{- 1} (p), 0 < p < \frac{F (u) - F (t)}{1 - F (t)},

with the inequality being strict unless

F (t) = 0

. The inequality holds because

F^{- 1}

is strictly increasing (loss severities are non-negative absolutely continuous random variables) and

(1 - p) F (t) \geq 0

.
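As a small illustration of the empirical estimator and its asymptotic bias, consider the sketch below. The sample values and the function name `empirical_quantile` are hypothetical; the exponential parameters are those of Section 3.1 and are used only to make the bias concrete.

```python
import math

def empirical_quantile(data, p, u):
    """Empirical estimator x*_(ceil(n p)); usable only while that order statistic is below u."""
    xs = sorted(data)
    # round() guards against floating-point noise in n * p before taking the ceiling
    k = math.ceil(round(len(xs) * p, 10))  # 1-indexed rank
    if k < 1 or xs[k - 1] >= u:
        raise ValueError("quantile level outside the estimable range")
    return xs[k - 1]

# Hypothetical observed sample with t = 500 and u = 2500 (two censored values)
sample = [520.0, 610.0, 700.0, 950.0, 1300.0, 1800.0, 2500.0, 2500.0]
print(empirical_quantile(sample, 0.5, u=2500.0))  # the 4th order statistic

# Limits under an exponential ground-up model (x0 = 100, theta = 500, t = 500)
x0, theta, t, p = 100.0, 500.0, 500.0, 0.5
target_star = t - theta * math.log(1.0 - p)   # F_*^{-1}(p): what the estimator converges to
target_true = x0 - theta * math.log(1.0 - p)  # F^{-1}(p): what we want to estimate
print(target_star - target_true)              # the gap equals t - x0
```

Levels whose order statistic coincides with u are rejected, which mirrors the restriction on p in (4).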

2.2. MLE Approach

Parametric methods use the observed data

x_{1}^{*}, \dots, x_{n}^{*}

and fully recognize its distributional properties. The MLE approach is one of the most common estimation techniques. It takes into account (1)–(3) and finds parameter estimates by maximizing the following log-likelihood function:

\begin{aligned} \log L (θ | x_{1}^{*}, \dots, x_{n}^{*}) &= \log \left[ \prod_{i = 1}^{n} f_{*} (x_{i}^{*} | t; u) \right] \\ &= \log \left[ \prod_{i = 1}^{n} {\left[ \frac{f (x_{i}^{*})}{1 - F (t)} \right]}^{\mathbf{1} \{t < x_{i}^{*} < u\}} {\left[ \frac{1 - F (u^{-})}{1 - F (t)} \right]}^{\mathbf{1} \{x_{i}^{*} = u\}} \right] \\ &= \sum_{i = 1}^{n} \log [f (x_{i}^{*})] \, \mathbf{1} \{t < x_{i}^{*} < u\} - n \log [1 - F (t)] + \log [1 - F (u^{-})] \sum_{i = 1}^{n} \mathbf{1} \{x_{i}^{*} = u\}, \end{aligned}

(5)

where

\mathbf{1} \{\cdot\}

denotes the indicator function.
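The log-likelihood (5) can be coded once for any continuous ground-up model. The sketch below is an illustration under stated assumptions: `log_likelihood` and `exp_loglik` are hypothetical names, the four-point sample is made up, and F is assumed continuous so that F(u^-) = F(u).

```python
import math

def log_likelihood(data, pdf, cdf, t, u):
    """Log-likelihood (5) for data left-truncated at t and right-censored at u."""
    n = len(data)
    n_cens = sum(1 for x in data if x == u)                # indicator 1{x* = u}
    ll = sum(math.log(pdf(x)) for x in data if t < x < u)  # uncensored observations
    ll -= n * math.log(1.0 - cdf(t))                       # truncation adjustment
    if n_cens:
        ll += n_cens * math.log(1.0 - cdf(u))              # censored observations
    return ll

# Hypothetical sample; the last value is censored at u = 2500
sample = [600.0, 800.0, 1000.0, 2500.0]
x0, t, u = 100.0, 500.0, 2500.0

def exp_loglik(theta):
    """Log-likelihood of an exponential(x0, theta) candidate model."""
    pdf = lambda x: math.exp(-(x - x0) / theta) / theta
    cdf = lambda x: 1.0 - math.exp(-(x - x0) / theta)
    return log_likelihood(sample, pdf, cdf, t, u)

print(exp_loglik(500.0), exp_loglik(1000.0))
```

For models without a closed-form maximizer, a function like this could be passed to a numerical optimizer; the exponential and Pareto cases of Section 3 do not need one.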

Once parameter MLEs,

{\hat{θ}}_{1}, \dots, {\hat{θ}}_{k}

, are available, the pth quantile estimate is found by plugging those MLE values into the parametric expression of

F^{- 1} (p) = h (θ_{1}, \dots, θ_{k})

. Let us denote this estimator as

{\hat{F}}_{MLE}^{- 1} (p) = h ({\hat{θ}}_{1}, \dots, {\hat{θ}}_{k})

. Then, as follows from the MLE’s asymptotic distribution and the delta method,

{\hat{F}}_{MLE}^{- 1} (p) \text{ is } \mathcal{AN} \left( F^{- 1} (p), \frac{1}{n} d_{θ} I_{θ}^{- 1} d_{θ}^{'} \right), \quad 0 < p < 1,

(6)

where

d_{θ} = (\partial h / \partial {\hat{θ}}_{1}, \dots, \partial h / \partial {\hat{θ}}_{k}) |_{(θ_{1}, \dots, θ_{k})}

, and the entries of

I_{θ}

are given by (A7) with g replaced by (2). Note that (6) is defined for

0 < p < 1

, while (4) for

0 < p < \frac{F (u) - F (t)}{1 - F (t)} \leq 1

.

2.3. PM Approach

A popular alternative to the MLE approach for estimation of loss distribution parameters is percentile matching (PM). To estimate k unknown parameters with the PM method and using the ordered data

x_{(1)}^{*} \leq \dots \leq x_{(n)}^{*}

, one has to solve the following system of equations with respect to

θ_{1}, \dots, θ_{k}

:

F_{*}^{- 1} (p_{1}) = x_{(⌈ n p_{1} ⌉)}^{*}, F_{*}^{- 1} (p_{2}) = x_{(⌈ n p_{2} ⌉)}^{*}, \dots, F_{*}^{- 1} (p_{k}) = x_{(⌈ n p_{k} ⌉)}^{*},

where

p_{1} < \dots < p_{k} < \frac{F (u) - F (t)}{1 - F (t)}

and

x_{(⌈ n p_{k} ⌉)}^{*} < u

. Once parameter PMs,

{\tilde{θ}}_{1}, \dots, {\tilde{θ}}_{k}

, are available, the pth quantile estimate is found by plugging those PM values into

F^{- 1} (p) = h (θ_{1}, \dots, θ_{k})

. Let us denote this estimator as

{\hat{F}}_{PM}^{- 1} (p) = h ({\tilde{θ}}_{1}, \dots, {\tilde{θ}}_{k})

. Then, as follows from Theorem A2 and the delta method,

{\hat{F}}_{PM}^{- 1} (p) \text{ is } \mathcal{AN} \left( F^{- 1} (p), \frac{1}{n} d_{θ} D_{θ}^{*} Σ_{θ} {(D_{θ}^{*})}^{'} d_{θ}^{'} \right), \quad 0 < p < 1,

(7)

where

d_{θ} = (\partial h / \partial {\tilde{θ}}_{1}, \dots, \partial h / \partial {\tilde{θ}}_{k}) |_{(θ_{1}, \dots, θ_{k})}

and

D_{θ}^{*}

is specified in Theorem A2. The entries of

Σ_{θ}

are given by (A1) with g and

G^{- 1}

replaced by expressions (2) and (3), respectively. Note that (7) is defined for

0 < p < 1

, while (4) for

0 < p < \frac{F (u) - F (t)}{1 - F (t)} \leq 1

.

3. RECs for Exponential and Pareto Models

In this section, we provide examples of RECs for exponential and single-parameter Pareto distributions under several data-truncation and censoring scenarios. For each model, we choose the (biased) empirical estimator of

F^{- 1} (p)

as the benchmark estimator. Then, using formulas (4), (6), and (7), we evaluate

{ARE}_{p}

’s for the MLE and PM estimators with respect to the empirical estimator, as well as

{ARE}_{p}

of PM with respect to MLE. The three definitions of

{ARE}_{p}

’s are given by Equations (A8)–(A10).

3.1. Exponential Distribution

Let

X_{1}, X_{2}, \dots

be i.i.d. exponentially distributed random variables with cdf

F (x) = 1 - e^{- (x - x_{0}) / θ}, x \geq x_{0}

, pdf

f (x) = (1 / θ) e^{- (x - x_{0}) / θ}, x > x_{0}

, and qf

F^{- 1} (s) = x_{0} - θ log (1 - s), 0 \leq s \leq 1

, and where

x_{0} \geq 0

is known and

θ > 0

is an unknown scale parameter. According to the model setup of Section 2, however, the

X_{i}

’s are unobservable. The data are generated by variables

X_{1}^{*}, \dots, X_{n}^{*}

which are i.i.d. with cdf, pdf, and qf given by (1), (2), and (3), respectively. This implies that when

X_{i}

’s are exponentially distributed, we have

F_{*} (x^{*} | t; u) = \begin{cases} 0, & x^{*} \leq t; \\ 1 - e^{- (x^{*} - t) / θ}, & t < x^{*} < u; \\ 1, & x^{*} \geq u, \end{cases}

f_{*} (x^{*} | t; u) = \begin{cases} (1 / θ) e^{- (x^{*} - t) / θ}, & t < x^{*} < u; \\ e^{- (u - t) / θ}, & x^{*} = u; \\ 0, & \text{elsewhere}, \end{cases}

F_{*}^{- 1} (s | t; u) = \begin{cases} - θ \log (1 - s) + t, & 0 \leq s < 1 - e^{- (u - t) / θ}; \\ u, & 1 - e^{- (u - t) / θ} \leq s \leq 1 . \end{cases}

Now, for the empirical estimator

{\hat{F}}_{EMP}^{- 1} (p) = x_{(⌈ n p ⌉)}^{*}

, the asymptotic result (4) becomes

{\hat{F}}_{EMP}^{- 1} (p) \text{ is } \mathcal{AN} \left( - θ \log (1 - p) + t, \frac{θ^{2}}{n} \frac{p}{1 - p} \right), \quad 0 < p < 1 - e^{- (u - t) / θ} .

(8)

The statement (8) shows that the asymptotic bias of

{\hat{F}}_{EMP}^{- 1} (p)

is

t - x_{0}

.

Further, MLE of

θ

is found by maximizing the log-likelihood (5) which in this case is

\log L (θ | x_{1}^{*}, \dots, x_{n}^{*}) = - \log θ \sum_{i = 1}^{n} \mathbf{1} \{t < x_{i}^{*} < u\} - \frac{1}{θ} \sum_{i = 1}^{n} \left[ (x_{i}^{*} - t) \, \mathbf{1} \{t < x_{i}^{*} < u\} + (u - t) \, \mathbf{1} \{x_{i}^{*} = u\} \right] .

It yields a closed-form solution for

θ

:

{\hat{θ}}_{MLE} = \frac{\sum_{i = 1}^{n} \left[ (x_{i}^{*} - t) \, \mathbf{1} \{t < x_{i}^{*} < u\} + (u - t) \, \mathbf{1} \{x_{i}^{*} = u\} \right]}{\sum_{i = 1}^{n} \mathbf{1} \{t < x_{i}^{*} < u\}} .

This in turn implies that

{\hat{F}}_{MLE}^{- 1} (p) = x_{0} - {\hat{θ}}_{MLE} log (1 - p)

, and the asymptotic result (6) becomes

{\hat{F}}_{MLE}^{- 1} (p) \text{ is } \mathcal{AN} \left( x_{0} - θ \log (1 - p), \frac{θ^{2}}{n} \frac{\log^{2} (1 - p)}{1 - e^{- (u - t) / θ}} \right), \quad 0 < p < 1 .

(9)
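The closed-form MLE and the asymptotic variance in (9) are easy to evaluate numerically. The following sketch uses hypothetical function names and a made-up four-point sample; plugging the estimate into the variance formula is a common practical shortcut and is an assumption of this example, not a step taken from the text.

```python
import math

def exp_mle_theta(data, t, u):
    """Closed-form MLE of theta under left truncation at t and right censoring at u."""
    n_unc = sum(1 for x in data if t < x < u)
    n_cens = sum(1 for x in data if x == u)
    total = sum(x - t for x in data if t < x < u) + (u - t) * n_cens
    return total / n_unc

def exp_quantile(p, theta, x0):
    """Ground-up quantile F^{-1}(p) = x0 - theta * log(1 - p)."""
    return x0 - theta * math.log(1.0 - p)

def exp_mle_asy_var(p, theta, n, t, u):
    """Asymptotic variance of the MLE-based quantile estimator, as in (9)."""
    return theta**2 / n * math.log(1.0 - p)**2 / (1.0 - math.exp(-(u - t) / theta))

sample = [600.0, 800.0, 1000.0, 2500.0]  # hypothetical data; last value is censored
theta_hat = exp_mle_theta(sample, t=500.0, u=2500.0)
q95 = exp_quantile(0.95, theta_hat, x0=100.0)
se = math.sqrt(exp_mle_asy_var(0.95, theta_hat, len(sample), 500.0, 2500.0))
print(theta_hat, q95, se)
```

Note that the censored observation contributes u - t to the numerator but is not counted in the denominator.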

Furthermore, since for the exponential distribution there is only one unknown parameter

θ

, its PM estimator is derived by solving a single equation,

F_{*}^{- 1} (p_{1}) = x_{(⌈ n p_{1} ⌉)}^{*}

. Note that

p_{1}

has to be chosen from the range

0 < p_{1} < 1 - e^{- (u - t) / θ}

(equivalently,

x_{(⌈ n p_{1} ⌉)}^{*} < u

). In this case, the resulting estimator is also explicit and given by

{\hat{θ}}_{PM} = \frac{t - x_{(⌈ n p_{1} ⌉)}^{*}}{log (1 - p_{1})} .

Subsequently,

{\hat{F}}_{PM}^{- 1} (p) = x_{0} - {\hat{θ}}_{PM} log (1 - p)

, and the asymptotic result (7) becomes

{\hat{F}}_{PM}^{- 1} (p) \text{ is } \mathcal{AN} \left( x_{0} - θ \log (1 - p), \frac{θ^{2}}{n} \frac{p_{1}}{1 - p_{1}} {\left[ \frac{\log (1 - p)}{\log (1 - p_{1})} \right]}^{2} \right), \quad 0 < p < 1 .

(10)
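The PM estimator for the exponential model needs only one order statistic. A minimal sketch follows, with a hypothetical function name and a made-up ten-point sample; p_1 = 0.80 is the level used later in the paper.

```python
import math

def exp_pm_theta(data, p1, t, u):
    """Percentile-matching estimator of theta based on the order statistic x*_(ceil(n p1))."""
    xs = sorted(data)
    # round() guards against floating-point noise in n * p1 before taking the ceiling
    k = math.ceil(round(len(xs) * p1, 10))
    x_p1 = xs[k - 1]
    if x_p1 >= u:
        raise ValueError("p1 too large: the matched order statistic is censored")
    return (t - x_p1) / math.log(1.0 - p1)

# Hypothetical observed sample with t = 500 and u = 2500 (two censored values)
sample = [550.0, 600.0, 700.0, 900.0, 1100.0, 1300.0, 1600.0, 2000.0, 2500.0, 2500.0]
theta_pm = exp_pm_theta(sample, p1=0.80, t=500.0, u=2500.0)
q95_pm = 100.0 - theta_pm * math.log(1.0 - 0.95)  # plug-in quantile with x0 = 100
print(theta_pm, q95_pm)
```

With n = 10 and p_1 = 0.80 the matched observation is the 8th order statistic, here 2000.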

Finally, we have everything in place for computation of ARE

_{p}

. Since

{\hat{F}}_{EMP}^{- 1} (p)

is our benchmark estimator which is biased, formulas (A8) and (A9) will be modified by replacing estimators’ variances with their mean-square errors (MSE). The MSE ratios based on (8)–(10) are:

ARE ({\hat{F}}_{MLE}^{- 1} (p), {\hat{F}}_{EMP}^{- 1} (p)) = \frac{\frac{θ^{2}}{n} \frac{p}{1 - p} + {(t - x_{0})}^{2}}{\frac{θ^{2}}{n} \frac{{log}^{2} (1 - p)}{1 - e^{- (u - t) / θ}}}, 0 < p < 1 - e^{- (u - t) / θ},

(11)

ARE ({\hat{F}}_{PM}^{- 1} (p), {\hat{F}}_{EMP}^{- 1} (p)) = \frac{\frac{θ^{2}}{n} \frac{p}{1 - p} + {(t - x_{0})}^{2}}{\frac{θ^{2}}{n} \frac{{log}^{2} (1 - p)}{{log}^{2} (1 - p_{1})} \frac{p_{1}}{1 - p_{1}}}, 0 < p < 1 - e^{- (u - t) / θ},

(12)

ARE ({\hat{F}}_{PM}^{- 1} (p), {\hat{F}}_{MLE}^{- 1} (p)) = \frac{\frac{θ^{2}}{n} \frac{{log}^{2} (1 - p)}{1 - e^{- (u - t) / θ}}}{\frac{θ^{2}}{n} \frac{{log}^{2} (1 - p)}{{log}^{2} (1 - p_{1})} \frac{p_{1}}{1 - p_{1}}} = \frac{(1 - p_{1}) {log}^{2} (1 - p_{1})}{p_{1} (1 - e^{- (u - t) / θ})}, 0 < p < 1 .

(13)

Note that for

p \geq 1 - e^{- (u - t) / θ}

, the ratios (11) and (12) are infinite because

{\hat{F}}_{EMP}^{- 1} (p)

is undefined. Also, in (13), the probability level

p_{1}

has to be chosen from the range

0 < p_{1} < 1 - e^{- (u - t) / θ}

.
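Ratios (11)–(13) can be evaluated on a grid of p to trace out the RECs. The sketch below is illustrative only: the function names are hypothetical, and the sample size n = 50 is an assumption chosen to match the simulation study of Section 4.

```python
import math

def _uncensored_mass(theta, t, u):
    """1 - exp(-(u - t)/theta): probability that an observation is not censored."""
    return 1.0 - math.exp(-(u - t) / theta)

def are_mle_vs_emp(p, n, theta, x0, t, u):
    """Equation (11): MSE of the empirical estimator over variance of the MLE."""
    c = _uncensored_mass(theta, t, u)
    if p >= c:
        return math.inf  # empirical estimator undefined at this level
    mse_emp = theta**2 / n * p / (1.0 - p) + (t - x0)**2
    var_mle = theta**2 / n * math.log(1.0 - p)**2 / c
    return mse_emp / var_mle

def are_pm_vs_emp(p, p1, n, theta, x0, t, u):
    """Equation (12): MSE of the empirical estimator over variance of the PM estimator."""
    if p >= _uncensored_mass(theta, t, u):
        return math.inf
    mse_emp = theta**2 / n * p / (1.0 - p) + (t - x0)**2
    var_pm = theta**2 / n * (math.log(1.0 - p) / math.log(1.0 - p1))**2 * p1 / (1.0 - p1)
    return mse_emp / var_pm

def are_pm_vs_mle(p1, theta, t, u):
    """Equation (13): does not depend on p, n, or x0."""
    return (1.0 - p1) * math.log(1.0 - p1)**2 / (p1 * _uncensored_mass(theta, t, u))

# Parameters of the heavier-tailed exponential case; n = 50 is an assumption
n, theta, x0, t, u = 50, 500.0, 100.0, 500.0, 2500.0
for p in (0.50, 0.90, 0.95):
    print(p, are_mle_vs_emp(p, n, theta, x0, t, u), are_pm_vs_emp(p, 0.80, n, theta, x0, t, u))
print(are_pm_vs_mle(0.80, theta, t, u))
```

A quick consistency check is that (12) equals the product of (11) and (13) at any admissible p and p_1.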

In Figure 1, RECs of quantile estimators of the exponential (

x_{0} = 100, θ

) distribution are plotted for the left-truncation level

t = 500

and right-censoring at

u = 2500

. In the first column of plots, the distribution is lighter tailed (

θ = 250

) with

F (t) = 0.7981

,

F (u) = 0.9999

, and

F_{*} (u | t; u) = 0.9995

. In the second column of plots, the distribution has a heavier tail (

θ = 500

) with

F (t) = 0.5507

,

F (u) = 0.9918

, and

F_{*} (u | t; u) = 0.9817

. Due to the high bias of the empirical estimator (which goes to ∞ as

p \to 0

), the vertical axes are plotted on the logarithmic scale to minimize visual distortions. Comparison of plots across the rows reveals a couple of patterns: first, in the top row it is clearly visible that a combination of heavier tail and a slightly smaller percentage of actually observed data

F_{*} (u | t; u)

shifts all curves significantly upward (especially for small p); second, as is evident from all plots, the efficiency of PM estimators increases monotonically for

0 < p_{1} < 0.80

and then starts to decrease for

0.80 < p_{1} < 1

(i.e., the curves pm

_{4}

are above those of pm

_{3}

which are above pm

_{2}

, etc., but pm

_{6}

are below pm

_{5}

). Thus the

p_{1} \approx 0.80

level is optimal for PM estimation. This fact is in agreement with the complete sample optimality result (see discussion in Section 3.1 of Brazauskas (2009)).
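The optimality of p_1 ≈ 0.80 can be checked numerically, because the only p_1-dependent part of (13) is the factor (1 - p_1) log^2(1 - p_1) / p_1. The grid search below is a rough sketch with hypothetical names; it ignores the upper bound on p_1 imposed by censoring, which is not binding for the two exponential cases considered here.

```python
import math

def pm_efficiency_factor(p1):
    """The p1-dependent factor (1 - p1) * log^2(1 - p1) / p1 from equation (13)."""
    return (1.0 - p1) * math.log(1.0 - p1)**2 / p1

# Coarse grid over (0, 1); a finer grid or a root finder would sharpen the answer
grid = [i / 1000 for i in range(1, 1000)]
best_p1 = max(grid, key=pm_efficiency_factor)
print(best_p1, pm_efficiency_factor(best_p1))
```

The maximizer lands close to 0.80, in line with the ordering of the PM curves described above.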

3.2. Pareto Distribution

Let

X_{1}, X_{2}, \dots

be i.i.d. random variables distributed according to a single-parameter Pareto distribution with cdf

F (x) = 1 - {(x_{0} / x)}^{α}, x > x_{0}

, pdf

f (x) = (α / x_{0}) {(x_{0} / x)}^{α + 1}, x > x_{0}

, and qf

F^{- 1} (s) = x_{0} {(1 - s)}^{- 1 / α}, 0 \leq s \leq 1

. Here

x_{0} > 0

is known and

α > 0

is an unknown shape parameter, thus justifying the single-parameter characterization. As before,

X_{i}

’s are unobservable and the data are generated by variables

X_{1}^{*}, \dots, X_{n}^{*}

which are i.i.d. with cdf, pdf, and qf given by (1), (2), and (3), respectively. This implies that when

X_{i}

’s are Pareto distributed, we have

F_{*} (x^{*} | t; u) = \begin{cases} 0, & x^{*} \leq t; \\ 1 - {(t / x^{*})}^{α}, & t < x^{*} < u; \\ 1, & x^{*} \geq u, \end{cases}

f_{*} (x^{*} | t; u) = \begin{cases} (α / t) {(t / x^{*})}^{α + 1}, & t < x^{*} < u; \\ {(t / u)}^{α}, & x^{*} = u; \\ 0, & \text{elsewhere}, \end{cases}

F_{*}^{- 1} (s | t; u) = \begin{cases} t {(1 - s)}^{- 1 / α}, & 0 \leq s < 1 - {(t / u)}^{α}; \\ u, & 1 - {(t / u)}^{α} \leq s \leq 1 . \end{cases}

Next, for the empirical estimator

{\hat{F}}_{EMP}^{- 1} (p) = x_{(⌈ n p ⌉)}^{*}

, the asymptotic result (4) becomes

{\hat{F}}_{EMP}^{- 1} (p) \text{ is } \mathcal{AN} \left( t {(1 - p)}^{- 1 / α}, \frac{{(t / α)}^{2}}{n} \frac{p}{{(1 - p)}^{1 + 2 / α}} \right), \quad 0 < p < 1 - {(t / u)}^{α} .

(14)

As evident from the statement (14) and the fact that

t {(1 - p)}^{- 1 / α} \geq x_{0} {(1 - p)}^{- 1 / α}

(since

t \geq x_{0}

), this estimator is asymptotically (positively) biased.

Further, MLE of

α

is found by maximizing the log-likelihood (5) which in this case is

\begin{aligned} \log L (α | x_{1}^{*}, \dots, x_{n}^{*}) = {} & \sum_{i = 1}^{n} \log (α / x_{i}^{*}) \, \mathbf{1} \{t < x_{i}^{*} < u\} \\ & - α \sum_{i = 1}^{n} \left[ \log (x_{i}^{*} / t) \, \mathbf{1} \{t < x_{i}^{*} < u\} + \log (u / t) \, \mathbf{1} \{x_{i}^{*} = u\} \right] . \end{aligned}

It yields a closed-form solution for

α

:

{\hat{α}}_{MLE} = \frac{\sum_{i = 1}^{n} \mathbf{1} \{t < x_{i}^{*} < u\}}{\sum_{i = 1}^{n} \left[ \log (x_{i}^{*} / t) \, \mathbf{1} \{t < x_{i}^{*} < u\} + \log (u / t) \, \mathbf{1} \{x_{i}^{*} = u\} \right]} .

This in turn implies that

{\hat{F}}_{MLE}^{- 1} (p) = x_{0} {(1 - p)}^{- 1 / {\hat{α}}_{MLE}}

, and the asymptotic result (6) becomes

{\hat{F}}_{MLE}^{- 1} (p) \text{ is } \mathcal{AN} \left( x_{0} {(1 - p)}^{- 1 / α}, \frac{1}{n} \frac{\log^{2} (1 - p)}{{(1 - p)}^{2 / α}} \frac{{(x_{0} / α)}^{2}}{1 - {(t / u)}^{α}} \right), \quad 0 < p < 1 .

(15)
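The Pareto MLE has the same structure as the exponential one, with logarithms of the data in place of the raw values. The sketch below uses hypothetical function names and a made-up sample; evaluating (15) at the estimate is an assumption made for illustration.

```python
import math

def pareto_mle_alpha(data, t, u):
    """Closed-form MLE of alpha under left truncation at t and right censoring at u."""
    n_unc = sum(1 for x in data if t < x < u)
    n_cens = sum(1 for x in data if x == u)
    denom = sum(math.log(x / t) for x in data if t < x < u) + n_cens * math.log(u / t)
    return n_unc / denom

def pareto_quantile(p, alpha, x0):
    """Ground-up quantile F^{-1}(p) = x0 * (1 - p)^(-1/alpha)."""
    return x0 * (1.0 - p) ** (-1.0 / alpha)

def pareto_mle_asy_var(p, alpha, n, x0, t, u):
    """Asymptotic variance of the MLE-based quantile estimator, as in (15)."""
    return (math.log(1.0 - p)**2 / (1.0 - p) ** (2.0 / alpha)
            * (x0 / alpha)**2 / (1.0 - (t / u) ** alpha) / n)

sample = [600.0, 800.0, 1000.0, 2500.0]  # hypothetical data; last value is censored
alpha_hat = pareto_mle_alpha(sample, t=500.0, u=2500.0)
q95 = pareto_quantile(0.95, alpha_hat, x0=100.0)
se = math.sqrt(pareto_mle_asy_var(0.95, alpha_hat, len(sample), 100.0, 500.0, 2500.0))
print(alpha_hat, q95, se)
```

As in the exponential case, the censored observation enters only the denominator sum, through log(u / t).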

Furthermore, similar to the exponential distribution case, PM estimator of

α

is derived by solving a single equation,

F_{*}^{- 1} (p_{1}) = x_{(⌈ n p_{1} ⌉)}^{*}

, where

x_{(⌈ n p_{1} ⌉)}^{*} < u

. The resulting estimator is given by

{\hat{α}}_{PM} = \frac{log (1 - p_{1})}{log (t / x_{(⌈ n p_{1} ⌉)}^{*})} .

Subsequently,

{\hat{F}}_{PM}^{- 1} (p) = x_{0} {(1 - p)}^{- 1 / {\hat{α}}_{PM}}

, and the asymptotic result (7) becomes

{\hat{F}}_{PM}^{- 1} (p) \text{ is } \mathcal{AN} \left( x_{0} {(1 - p)}^{- 1 / α}, \frac{{(x_{0} / α)}^{2}}{n} \frac{p_{1} \log^{2} (1 - p)}{(1 - p_{1}) \log^{2} (1 - p_{1})} {(1 - p)}^{- 2 / α} \right), \quad 0 < p < 1 .

(16)

Note that

p_{1}

has to be chosen from the range

0 < p_{1} < 1 - {(t / u)}^{α}

(equivalently,

x_{(⌈ n p_{1} ⌉)}^{*} < u

).
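The Pareto PM estimator can be sketched in the same way as its exponential counterpart. The function name and the ten-point sample are hypothetical.

```python
import math

def pareto_pm_alpha(data, p1, t, u):
    """Percentile-matching estimator of alpha based on the order statistic x*_(ceil(n p1))."""
    xs = sorted(data)
    # round() guards against floating-point noise in n * p1 before taking the ceiling
    k = math.ceil(round(len(xs) * p1, 10))
    x_p1 = xs[k - 1]
    if x_p1 >= u:
        raise ValueError("p1 too large: the matched order statistic is censored")
    return math.log(1.0 - p1) / math.log(t / x_p1)

# Hypothetical observed sample with t = 500 and u = 2500 (two censored values)
sample = [550.0, 600.0, 700.0, 900.0, 1100.0, 1300.0, 1600.0, 2000.0, 2500.0, 2500.0]
alpha_pm = pareto_pm_alpha(sample, p1=0.80, t=500.0, u=2500.0)
q95_pm = 100.0 * (1.0 - 0.95) ** (-1.0 / alpha_pm)  # plug-in quantile with x0 = 100
print(alpha_pm, q95_pm)
```

Both logarithms in the ratio are negative, so the estimate is positive whenever the matched order statistic exceeds t.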

Finally, for computation of ARE

_{p}

, formulas (A8) and (A9) are modified the same way as in Section 3.1. The MSE ratios based on (14)–(16) are:

\begin{aligned} ARE ({\hat{F}}_{MLE}^{- 1} (p), {\hat{F}}_{EMP}^{- 1} (p)) &= \frac{\frac{{(t / α)}^{2}}{n} \frac{p}{{(1 - p)}^{1 + 2 / α}} + {[t {(1 - p)}^{- 1 / α} - x_{0} {(1 - p)}^{- 1 / α}]}^{2}}{\frac{1}{n} \frac{\log^{2} (1 - p)}{{(1 - p)}^{2 / α}} \frac{{(x_{0} / α)}^{2}}{1 - {(t / u)}^{α}}} \\ &= \frac{p / (1 - p) + n α^{2} {(1 - x_{0} / t)}^{2}}{{(x_{0} / t)}^{2} \log^{2} (1 - p) / (1 - {(t / u)}^{α})}, \quad 0 < p < 1 - {(t / u)}^{α}, \end{aligned}

(17)

\begin{aligned} ARE ({\hat{F}}_{PM}^{- 1} (p), {\hat{F}}_{EMP}^{- 1} (p)) &= \frac{\frac{{(t / α)}^{2}}{n} \frac{p}{{(1 - p)}^{1 + 2 / α}} + {[t {(1 - p)}^{- 1 / α} - x_{0} {(1 - p)}^{- 1 / α}]}^{2}}{\frac{{(x_{0} / α)}^{2}}{n} \frac{p_{1} \log^{2} (1 - p)}{(1 - p_{1}) \log^{2} (1 - p_{1})} {(1 - p)}^{- 2 / α}} \\ &= \frac{{(t / x_{0})}^{2} [p / (1 - p) + n α^{2} {(1 - x_{0} / t)}^{2}]}{(p_{1} / (1 - p_{1})) {(\log (1 - p) / \log (1 - p_{1}))}^{2}}, \quad 0 < p < 1 - {(t / u)}^{α}, \end{aligned}

(18)

\begin{aligned} ARE ({\hat{F}}_{PM}^{- 1} (p), {\hat{F}}_{MLE}^{- 1} (p)) &= \frac{\frac{1}{n} \frac{\log^{2} (1 - p)}{{(1 - p)}^{2 / α}} \frac{{(x_{0} / α)}^{2}}{1 - {(t / u)}^{α}}}{\frac{{(x_{0} / α)}^{2}}{n} \frac{p_{1} \log^{2} (1 - p)}{(1 - p_{1}) \log^{2} (1 - p_{1})} {(1 - p)}^{- 2 / α}} \\ &= \frac{(1 - p_{1}) \log^{2} (1 - p_{1})}{p_{1} (1 - {(t / u)}^{α})}, \quad 0 < p < 1 . \end{aligned}

(19)

For

p \geq 1 - {(t / u)}^{α}

, the ratios (17) and (18) are infinite. In (19), the probability level

p_{1}

has to be chosen from the range

0 < p_{1} < 1 - {(t / u)}^{α}

.
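The simplified forms of (17)–(19) are convenient for plotting RECs. The sketch below evaluates them at a few levels; function names are hypothetical and n = 50 is an assumed sample size taken from Section 4.

```python
import math

def pareto_are_mle_vs_emp(p, n, alpha, x0, t, u):
    """Simplified form of (17)."""
    c = 1.0 - (t / u) ** alpha  # probability of an uncensored observation
    if p >= c:
        return math.inf         # empirical estimator undefined at this level
    num = p / (1.0 - p) + n * alpha**2 * (1.0 - x0 / t)**2
    den = (x0 / t)**2 * math.log(1.0 - p)**2 / c
    return num / den

def pareto_are_pm_vs_emp(p, p1, n, alpha, x0, t, u):
    """Simplified form of (18)."""
    if p >= 1.0 - (t / u) ** alpha:
        return math.inf
    num = (t / x0)**2 * (p / (1.0 - p) + n * alpha**2 * (1.0 - x0 / t)**2)
    den = p1 / (1.0 - p1) * (math.log(1.0 - p) / math.log(1.0 - p1))**2
    return num / den

def pareto_are_pm_vs_mle(p1, alpha, t, u):
    """Simplified form of (19): free of p, n, and x0."""
    return (1.0 - p1) * math.log(1.0 - p1)**2 / (p1 * (1.0 - (t / u) ** alpha))

# Heavy-tailed Pareto case (alpha = 1.50); n = 50 is an assumption
n, alpha, x0, t, u = 50, 1.50, 100.0, 500.0, 2500.0
for p in (0.25, 0.50, 0.90):
    print(p, pareto_are_mle_vs_emp(p, n, alpha, x0, t, u),
          pareto_are_pm_vs_emp(p, 0.80, n, alpha, x0, t, u))
print(pareto_are_pm_vs_mle(0.80, alpha, t, u))
```

As with the exponential model, (18) equals the product of (17) and (19), which gives a simple sanity check on any implementation.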

In Figure 2, RECs of quantile estimators of the Pareto (

x_{0} = 100, α

) distribution are plotted for the left-truncation level

t = 500

and right-censoring at

u = 2500

. In the first column of plots, the distribution is heavy tailed (

α = 1.50

) with

F (t) = 0.9106

,

F (u) = 0.9920

, and

F_{*} (u | t; u) = 0.9106

. In the second column of plots, the distribution has an even heavier tail (

α = 1.25

) with

F (t) = 0.8663

,

F (u) = 0.9821

, and

F_{*} (u | t; u) = 0.8663

. Due to the high bias of the empirical estimator (which goes to ∞ as

p \to 0

), the vertical axes are plotted on the logarithmic scale to minimize visual distortions. Comparison of plots shows the same ordering of PM curves as those under the exponential distribution assumption. The choice of

p_{1} \approx 0.80

is also optimal for PM estimation. A change from a heavy to an even heavier tail and a decrease in the percentage of actually observed data

F_{*} (u | t; u)

results in less pronounced shifts of the Pareto-based RECs, but they are much higher than the exponential RECs. Thus, since both models are truncated and censored at the same

t = 500

and

u = 2500

, this suggests that the significant differences in the RECs between the distributions can be used to construct a model selection method. This idea will be further discussed in Section 4.

4. Evaluation of Model Uncertainty

In this section, using simulated data we demonstrate how model uncertainty can emerge in a surprising way and examine how wrong quantile estimates can be when one makes incorrect modeling assumptions. In particular, we generate

n = 50

observations from the exponential distribution of Section 3.1 (with

x_{0} = 100

,

θ = 500

,

t = 500

,

u = 2500

), fit the exponential model using MLE and PM (

p_{1} = 0.80

) estimators to it, and perform standard model diagnostics (e.g., quantile-quantile plots) and validation (e.g., Kolmogorov-Smirnov and Anderson-Darling tests). As expected, the exponential distribution is not rejected by any of the tests. Then, using the same data we repeat the exercise by assuming a Pareto distribution, and find that it also passes all the tests. In both cases, we additionally compute AIC and BIC values, which under the incorrect Pareto assumption are better than the ones under the correct exponential assumption. Next, to make sure that this conclusion was not random, we simulate

n = 50

observations from the Pareto distribution of Section 3.2 (with

x_{0} = 100

,

α = 1.50

,

t = 500

,

u = 2500

), fit and validate both models, and find yet again that both distributional assumptions are acceptable. This exercise shows that standard model diagnostic methods can mislead the decision maker. That would not be a major issue if quantile estimates based on incorrect modeling assumptions were close to the true values of the quantiles; however, that is not the case. For completeness, we include the empirical estimates of quantiles although they are known to be incorrect. Below we provide the details of the described exercises so the interested reader can reproduce the results.

The data sets were simulated using R with a seed of 200 (it is used to initialize the random number generator). They are presented in Table 1, where censored observations are italicized.
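The simulation step can be sketched with inverse-transform sampling from the observed-data quantile functions of Section 3. This is an illustration only: it uses Python's `random` module rather than R, so seeding with 200 will not reproduce the data of Table 1, and all function names are hypothetical.

```python
import math
import random

def exp_qf_star(s, theta, t, u):
    """Quantile function of X* for the exponential model (does not depend on x0)."""
    return min(t - theta * math.log(1.0 - s), u)

def pareto_qf_star(s, alpha, t, u):
    """Quantile function of X* for the single-parameter Pareto model."""
    return min(t * (1.0 - s) ** (-1.0 / alpha), u)

def simulate_observed(qf_star, n, rng):
    """Inverse-transform sampling: apply the quantile function to uniform draws."""
    return [qf_star(rng.random()) for _ in range(n)]

rng = random.Random(200)  # Python's generator, not R's: draws differ from Table 1
exp_data = simulate_observed(lambda s: exp_qf_star(s, 500.0, 500.0, 2500.0), 50, rng)
par_data = simulate_observed(lambda s: pareto_qf_star(s, 1.50, 500.0, 2500.0), 50, rng)

# Number of censored observations in each simulated data set
print(sum(x == 2500.0 for x in exp_data), sum(x == 2500.0 for x in par_data))
```

Taking the minimum with u is equivalent to the two-branch definition of the quantile function, since the uncapped expression reaches u exactly at the censoring level.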

In Figure 3, the quantile-quantile plots (QQ-plots) are provided. The plots are parameter free. That is, since the exponential and Pareto distributions are location-scale and log-location-scale families, respectively, their QQ-plots can be constructed without first estimating model parameters. Note also that only uncensored observations can be used in these plots (i.e., no observations equal to

u = 2500

). As is evident from Figure 3, the points in all graphs form a (roughly) straight line; thus both distributions are acceptable for both data sets.

To formally evaluate the appropriateness of the fitted model to data, we perform KS and AD goodness-of-fit tests. The models are fitted using two parameter estimation methods, MLE and PM ($p_{1} = 0.80$), to check the sensitivity of overall conclusions to model fitting procedures. The values of the test statistics along with the corresponding p-values are reported in Table 2. (The p-values are computed using parametric bootstrap with 1000 simulation runs. For a brief description of the parametric bootstrap procedure, see, for example, Section 20.4.5 of Klugman et al. (2012).) We can see that, except for one isolated case (Pareto data, Pareto model, PM estimation), the p-values are above 0.10 for both distributions, both parameter estimation methods, and both tests. Thus, the fitted exponential and Pareto models are acceptable for both data sets. In addition, the table contains AIC and BIC values, which can be used as model selection tools. Based on these metrics (smaller values are better), the Pareto model would be chosen for both data sets. Of course, the decision to accept Pareto when data came from an exponential distribution is incorrect.
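As an illustration of the MLE fitting step, the sketch below fits both models to the exponential data set of Table 1. It assumes the standard conditional likelihood for left-truncated (at t) and right-censored (at u) data, for which both MLEs have closed forms; these expressions are textbook results and are not quoted from this section. The helper names `fit_exponential` and `fit_pareto` are hypothetical. AIC and BIC are shown for the exponential fit only.

```python
import math

# Exponential data set of Table 1; values equal to 2500 are censored.
exp_data = [501, 501, 502, 502, 540, 551, 556, 556, 567, 599, 632, 642, 644, 646,
            672, 675, 699, 711, 728, 745, 750, 805, 829, 854, 869, 874, 889, 923,
            961, 1012, 1034, 1046, 1054, 1102, 1107, 1169, 1178, 1190, 1253, 1392,
            1430, 1450, 1470, 1901, 1965, 2351, 2465, 2500, 2500, 2500]
t, u = 500.0, 2500.0

def fit_exponential(data, t, u):
    """MLE of theta and the maximized log-likelihood under truncation at t, censoring at u."""
    n_uncensored = sum(1 for x in data if x < u)
    exposure = sum(min(x, u) - t for x in data)        # total excess over t, censored included
    theta = exposure / n_uncensored
    loglik = -n_uncensored * math.log(theta) - exposure / theta
    return theta, loglik

def fit_pareto(data, t, u):
    """MLE of alpha for the single-parameter Pareto under the same data modifications."""
    n_uncensored = sum(1 for x in data if x < u)
    return n_uncensored / sum(math.log(min(x, u) / t) for x in data)

theta_hat, loglik = fit_exponential(exp_data, t, u)
alpha_hat = fit_pareto(exp_data, t, u)
aic = -2.0 * loglik + 2.0 * 1                          # one estimated parameter
bic = -2.0 * loglik + 1 * math.log(len(exp_data))
```

Under these assumptions the results agree, up to rounding, with the exponential-data MLE entries of Table 2 (595.57 and 1.491) and with the exponential-model AIC and BIC (696.62 and 698.53).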

Next, to see whether it really matters which model we select at this stage of the analysis, we have to examine the true probability models that generated the data and check how far off target our upper quantile estimates are. For the data sets of Table 1, the underlying distributions are exponential ($x_{0} = 100, θ = 500$) and Pareto ($x_{0} = 100, α = 1.50$), with the quantile functions given by:

F^{-1}(p) = 100 - 500 \log(1 - p) \quad \text{(exponential)}, \qquad F^{-1}(p) = 100 \, (1 - p)^{-1/1.50} \quad \text{(Pareto)}.

Thus, the true values of the 90%, 95% and 99% quantiles (estimation targets) are:

\text{Exponential:} \quad F^{-1}(0.90) = 1251, \quad F^{-1}(0.95) = 1598, \quad F^{-1}(0.99) = 2403.

\text{Pareto:} \quad F^{-1}(0.90) = 464, \quad F^{-1}(0.95) = 737, \quad F^{-1}(0.99) = 2154.

The quantile estimation results are summarized in Table 3. There, we clearly see that the parametric estimates of the quantiles based on the correctly identified model are fairly close to their targets, but those based on the incorrect model are significantly off their targets. Also, the empirical estimates are far off target (${\hat{F}}^{-1}(0.99) = 2500$ for the exponential data set is a lucky coincidence, not a rule).
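The effect of the model choice on the quantile estimates can be checked directly by plugging the MLE values of Table 2 into the two quantile functions. The short sketch below does this for the exponential data set; the function names are hypothetical, and because the inputs are the rounded estimates from Table 2, the results may differ from Table 3 by a unit in the last digit.

```python
import math

def exp_quantile(p, theta, x0=100.0):
    """Quantile function of the exponential (x0, theta) model."""
    return x0 - theta * math.log(1.0 - p)

def pareto_quantile(p, alpha, x0=100.0):
    """Quantile function of the single-parameter Pareto (x0, alpha) model."""
    return x0 * (1.0 - p) ** (-1.0 / alpha)

levels = (0.90, 0.95, 0.99)
targets = [exp_quantile(p, theta=500.0) for p in levels]      # true model behind the data
fit_exp = [exp_quantile(p, theta=595.57) for p in levels]     # exponential MLE, Table 2
fit_par = [pareto_quantile(p, alpha=1.491) for p in levels]   # Pareto MLE, Table 2

for p, target, q_exp, q_par in zip(levels, targets, fit_exp, fit_par):
    print(f"p={p:.2f}  target={target:6.0f}  "
          f"exponential fit={q_exp:6.0f} ({q_exp / target - 1:+.0%})  "
          f"Pareto fit={q_par:6.0f} ({q_par / target - 1:+.0%})")
```

The relative errors printed in parentheses make the contrast between the correctly and incorrectly specified models explicit at each level.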

Finally, in Table 4 we present estimated RECs, given by (11) and (17), at selected quantile levels. The curves are estimated using the MLE values from Table 2 and show how many times more efficient the parametric approach is than the empirical one in estimating a quantile. Note that, as was seen in Figure 1 and Figure 2, RECs based on PM estimators have the same shapes as those based on MLE, just rescaled by a constant (smaller than one). Thus, PM-based conclusions would not differ from those based on MLE, and one method of analysis is sufficient. What stands out from these computations is the vast difference between the corresponding exponential and Pareto RECs when they are estimated using the same data set (especially for small p). We conjecture that with some additional work one can develop an effective diagnostic tool to differentiate between the models.

5. Concluding Remarks

The relative efficiency curves, REC, were introduced as a practical tool for comparison of two competing statistical procedures, when data are complete. In this paper, we have redesigned and extended such curves to the left-truncated and right-censored data scenarios that are common in insurance analytics. Our illustrations have focused on the parametric (MLE and PM) and empirical nonparametric approaches for estimation of quantiles that are key inputs for further risk analysis (e.g., contract pricing, risk measurement, capital allocation). Further, we have developed specific examples of RECs for exponential and single-parameter Pareto distributions under a few data truncation and censoring scenarios. Then, using simulated exponential and Pareto data we have examined how wrong quantile estimates can be when incorrect modeling assumptions are made. The numerical analysis involved application of standard model diagnostics and validation (e.g., QQ-plots, KS and AD tests, AIC and BIC criteria) and has demonstrated how those methods can mislead the decision maker. Finally, the newly developed RECs have been applied to study the discrepancies between the quality of quantile estimates of the fitted exponential and Pareto distributions. Our conclusion is that RECs have strong potential for being developed into an effective diagnostic tool in this context.

Author Contributions

Both authors have contributed equally to various aspects of the paper, including methodology, analysis, simulations, writing, and reviewing.

Funding

This research received no external funding.

Acknowledgments

The authors are very appreciative of valuable suggestions, insightful queries, and useful comments provided by three anonymous referees, which helped to significantly improve the paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In the appendix, we provide some theoretical results that are key to analytic derivations in the paper. Specifically, in Appendix A.1, the asymptotic normality theorems for sample quantiles and percentile-matching (PM) estimators of model parameters are presented. The construction of the relative efficiency curves (REC) is described in Appendix A.2. Note that more detailed presentations of parts of this material are available in Brazauskas (2009) and Yu and Brazauskas (2017).

Suppose we have a sample of independent and identically distributed (i.i.d.) continuous random variables, $X_{1}, \dots, X_{n}$, with the cumulative distribution function (cdf) G, probability density function (pdf) g, and quantile function (qf) $G^{-1}$. Let the cdf, pdf, and qf be given in a parametric form, and suppose that they are indexed by a k-dimensional parameter $θ = (θ_{1}, \dots, θ_{k})$. Further, let $X_{(1)} \leq \dots \leq X_{(n)}$ denote the ordered sample values. Also, throughout the paper the notation $AN$ is used for “asymptotically normal.”

Appendix A.1. Asymptotic Theorems

The empirical estimator of a population quantile, say $G^{-1}(p)$, is the corresponding sample quantile $X_{(⌈ n p ⌉)}$, where $⌈ \cdot ⌉$ denotes the “rounding up” operation. We start with the asymptotic normality result for sample quantiles. (Complete technical details are available in Section 2.3.3 of Serfling (1980).)

Theorem A1.

[Asymptotic Normality of Sample Quantiles] Let $0 < p_{1} < \dots < p_{k} < 1$, with $k > 1$, and suppose that pdf g is continuous. Then the k-variate vector of sample quantiles $(X_{(⌈ n p_{1} ⌉)}, \dots, X_{(⌈ n p_{k} ⌉)})$ is $AN$ with the mean vector $(G^{-1}(p_{1}), \dots, G^{-1}(p_{k}))$ and the variance-covariance matrix $\frac{1}{n} {[σ_{i j}^{2}]}_{i, j = 1}^{k}$ with

σ_{i j}^{2} = \frac{p_{i} (1 - p_{j})}{g (G^{-1} (p_{i})) \, g (G^{-1} (p_{j}))} \quad \text{for } i \leq j, \qquad σ_{j i}^{2} = σ_{i j}^{2} .

(A1)

In the univariate case $(k = 1)$, the sample quantile

X_{(⌈ n p ⌉)} \text{ is } AN \left( G^{-1} (p), \frac{1}{n} \frac{p (1 - p)}{g^{2} (G^{-1} (p))} \right) .

(A2)
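To make (A2) concrete, the following sketch compares the asymptotic variance with a Monte Carlo estimate for one illustrative case. The choice of the standard exponential distribution (for which the variance in (A2) reduces to p / ((1 - p) n)), the sample size, the number of replications, and the function names are all assumptions made for illustration.

```python
import math
import random
import statistics

def sample_quantile(xs, p):
    """Empirical quantile X_(ceil(n p)), using 1-based order statistics."""
    ordered = sorted(xs)
    return ordered[math.ceil(len(ordered) * p) - 1]

def asymptotic_variance(p, pdf, quantile, n):
    """Variance in (A2): p (1 - p) / (n * g(G^{-1}(p))**2)."""
    return p * (1.0 - p) / (n * pdf(quantile(p)) ** 2)

# Illustrative model: standard exponential, g(x) = exp(-x), G^{-1}(p) = -log(1 - p).
def pdf(x):
    return math.exp(-x)

def quantile(p):
    return -math.log(1.0 - p)

n, p, reps = 400, 0.90, 2000
rng = random.Random(1)
draws = [sample_quantile([rng.expovariate(1.0) for _ in range(n)], p) for _ in range(reps)]

theory = asymptotic_variance(p, pdf, quantile, n)
simulated = statistics.variance(draws)
print(f"asymptotic variance: {theory:.5f}, simulated variance: {simulated:.5f}")
```

The two numbers should be of similar size for moderately large n; the agreement deteriorates as p approaches 0 or 1, where the density at the quantile is small relative to the sampling noise.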

The main drawback of statistical inference based on the empirical nonparametric approach is that it is restricted to the range of observed data. For the problems encountered with claim severity data, this is a major limitation. Therefore, a more appropriate alternative is to estimate distribution quantiles parametrically, which requires first estimating the model parameters and then substituting those estimates into $G^{-1}(p)$. The most common technique for parameter estimation is MLE. Its asymptotic distribution is well known and available, for example, in Section 4.2 of Serfling (1980).

Percentile matching is a popular alternative to the MLE approach for estimation of loss distribution parameters (see Klugman et al. (2012), Section 13.1). If the distribution has k unknown parameters, $(θ_{1}, \dots, θ_{k})$, PM estimators are found by matching $G^{-1}(p_{i})$ with $X_{(⌈ n p_{i} ⌉)}$, $i = 1, \dots, k$, and then solving the resulting system of equations with respect to $θ_{1}, \dots, θ_{k}$. Assuming the system of equations has a unique solution, it is clear that PM estimators of $θ_{1}, \dots, θ_{k}$ are functions of $X_{(⌈ n p_{1} ⌉)}, \dots, X_{(⌈ n p_{k} ⌉)}$. Let us denote these estimators as ${\tilde{θ}}_{i} = h_{i}^{*} (X_{(⌈ n p_{1} ⌉)}, \dots, X_{(⌈ n p_{k} ⌉)})$, $i = 1, \dots, k$. Their joint asymptotic normality follows, with some modifications, from Theorem A1 and the delta method (see, e.g., Section 3.3 of Serfling (1980)).

Theorem A2.

[Asymptotic Normality of PMs] Let $\tilde{θ} = ({\tilde{θ}}_{1}, \dots, {\tilde{θ}}_{k})$ denote the PM estimator of parameter $θ = (θ_{1}, \dots, θ_{k})$. Then,

({\tilde{θ}}_{1}, \dots, {\tilde{θ}}_{k}) \text{ is } AN \left( (θ_{1}, \dots, θ_{k}), \frac{1}{n} D_{θ}^{*} Σ_{θ} {(D_{θ}^{*})}^{'} \right),

(A3)

where the entries of $Σ_{θ}$ are given by (A1) and $D_{θ}^{*} = {[d_{i j}^{*}]}_{k \times k}$ is the Jacobian of the transformations $h_{1}^{*}, \dots, h_{k}^{*}$ evaluated at $(θ_{1}, \dots, θ_{k})$, that is, $d_{i j}^{*} = \partial h_{i}^{*} / \partial X_{(⌈ n p_{j} ⌉)} |_{(θ_{1}, \dots, θ_{k})}$.
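As a worked illustration of (A3), consider the simplest special case. The setting below (complete data, a single parameter, and an exponential model with cdf $G(x) = 1 - e^{-x/θ}$) is an assumption chosen for transparency and is not one of the truncated and censored scenarios treated in the paper.

```latex
% Assumed setting: complete data, k = 1, G(x) = 1 - e^{-x/θ} for x > 0.
G^{-1}(p_{1}) = -θ \log(1 - p_{1}), \qquad g(G^{-1}(p_{1})) = \frac{1 - p_{1}}{θ}.

% Matching G^{-1}(p_1) with the sample quantile and solving for θ:
\tilde{θ} = h_{1}^{*}(X_{(⌈ n p_{1} ⌉)}) = \frac{X_{(⌈ n p_{1} ⌉)}}{-\log(1 - p_{1})},
\qquad d_{11}^{*} = \frac{-1}{\log(1 - p_{1})}.

% The single entry of Σ_θ, from the univariate variance in (A2):
σ_{11}^{2} = \frac{p_{1}(1 - p_{1})}{g^{2}(G^{-1}(p_{1}))} = \frac{θ^{2} p_{1}}{1 - p_{1}}.

% Substituting into (A3):
\tilde{θ} \text{ is } AN\left( θ, \; \frac{θ^{2}}{n} \, \frac{p_{1}}{(1 - p_{1}) \log^{2}(1 - p_{1})} \right).
```

In this special case the asymptotic variance depends on the matching level $p_{1}$ only through the factor $p_{1} / ((1 - p_{1}) \log^{2}(1 - p_{1}))$, which is what makes some choices of $p_{1}$ more efficient than others.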

Appendix A.2. Relative Efficiency Curves

For complete data, the relative efficiency curve, REC, was introduced by Brazauskas (2009). It is constructed by using asymptotic properties of quantile estimators. Suppose two asymptotically normal estimators of a fixed quantile of the underlying distribution are available. Plotting the ratio of their variances versus the probability level of the quantile yields an REC. Such a curve provides information about the accuracy of one estimator relative to another when both are designed to estimate the same (fixed but arbitrary) quantile of the distribution. If one or both estimators are biased, the REC is constructed by replacing their variances with the mean-square errors.

Next, for a fixed probability level p, $0 < p < 1$, consider the empirical nonparametric and parametric estimators of the population quantile $G_{θ}^{-1}(p)$. Then, as follows from (A2), the empirical estimator ${\hat{G}}_{EMP}^{-1}(p) = X_{(⌈ n p ⌉)}$ satisfies:

{\hat{G}}_{EMP}^{-1} (p) \text{ is } AN \left( G_{θ}^{-1} (p), \frac{1}{n} \frac{p (1 - p)}{g_{θ}^{2} (G_{θ}^{-1} (p))} \right) .

(A4)

For MLE and PM estimators, we use their asymptotic distributions in conjunction with the delta method (where $G_{θ}^{-1}(p)$ is viewed as a function of $θ$, say, $h(θ)$) and arrive at the following results:

{\hat{G}}_{MLE}^{-1} (p) \text{ is } AN \left( G_{θ}^{-1} (p), \frac{1}{n} d_{θ} I_{θ}^{-1} d_{θ}^{'} \right)

(A5)

and

{\hat{G}}_{PM}^{-1} (p) \text{ is } AN \left( G_{θ}^{-1} (p), \frac{1}{n} d_{θ} D_{θ}^{*} Σ_{θ} {(D_{θ}^{*})}^{'} d_{θ}^{'} \right) .

(A6)

Here $I_{θ} = {[I_{i j}]}_{i, j = 1}^{k}$ is the Fisher information matrix, with the entries given by

I_{i j} = E \left[ \frac{\partial \log g (X)}{\partial θ_{i}} \cdot \frac{\partial \log g (X)}{\partial θ_{j}} \right],

(A7)

matrices $D_{θ}^{*}$ and $Σ_{θ}$ are the same as those specified in (A3), and $d_{θ} = (\partial h / \partial {\hat{θ}}_{1}, \dots, \partial h / \partial {\hat{θ}}_{k}) |_{(θ_{1}, \dots, θ_{k})}$. Note that the asymptotic variances of ${\hat{G}}_{MLE}^{-1}(p)$ and ${\hat{G}}_{PM}^{-1}(p)$ also depend on p.

Now, the asymptotic relative efficiency, ARE, of ${\hat{G}}_{EMP}^{-1}(p)$ relative to ${\hat{G}}_{MLE}^{-1}(p)$ is

{ARE}_{p} := ARE ({\hat{G}}_{EMP}^{-1} (p), {\hat{G}}_{MLE}^{-1} (p)) = \frac{g_{θ}^{2} (G_{θ}^{-1} (p))}{p (1 - p)} \, d_{θ} I_{θ}^{-1} d_{θ}^{'} \quad \text{for } 0 < p < 1,

(A8)

and relative to ${\hat{G}}_{PM}^{-1}(p)$ it is

{ARE}_{p} := ARE ({\hat{G}}_{EMP}^{-1} (p), {\hat{G}}_{PM}^{-1} (p)) = \frac{g_{θ}^{2} (G_{θ}^{-1} (p))}{p (1 - p)} \, d_{θ} D_{θ}^{*} Σ_{θ} {(D_{θ}^{*})}^{'} d_{θ}^{'} \quad \text{for } 0 < p < 1 .

(A9)

For comparison of ${\hat{G}}_{PM}^{-1}(p)$ relative to ${\hat{G}}_{MLE}^{-1}(p)$, we have

{ARE}_{p} := ARE ({\hat{G}}_{PM}^{-1} (p), {\hat{G}}_{MLE}^{-1} (p)) = \frac{d_{θ} I_{θ}^{-1} d_{θ}^{'}}{d_{θ} D_{θ}^{*} Σ_{θ} {(D_{θ}^{*})}^{'} d_{θ}^{'}} \quad \text{for } 0 < p < 1 .

(A10)

Plotting the points $(p, {ARE}_{p})$ yields corresponding relative efficiency curves, where ${ARE}_{p}$ is defined by (A8), (A9), or (A10).
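The construction can be illustrated with a case where (A8)-(A10) have simple closed forms. The sketch below assumes complete data and an exponential model with cdf $G(x) = 1 - e^{-x/θ}$, so that $h(θ) = -θ \log(1 - p)$, $d_{θ} = -\log(1 - p)$, and $I_{θ} = 1/θ^{2}$; this setting and the function names are illustrative assumptions, and the resulting values are not those of the truncated and censored RECs (11)-(13) discussed in the main text.

```python
import math

def are_emp_vs_mle(p):
    """(A8) for the assumed model: (1 - p) * log(1 - p)**2 / p; theta cancels."""
    return (1.0 - p) * math.log(1.0 - p) ** 2 / p

def are_pm_vs_mle(p, p1):
    """(A10) for the assumed model with one matching level p1; the ratio is free of p."""
    return (1.0 - p1) * math.log(1.0 - p1) ** 2 / p1

def are_emp_vs_pm(p, p1):
    """(A9), obtained as (A8) divided by (A10)."""
    return are_emp_vs_mle(p) / are_pm_vs_mle(p, p1)

levels = [0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95]
rec = [(p, are_emp_vs_mle(p), are_emp_vs_pm(p, p1=0.80)) for p in levels]

for p, vs_mle, vs_pm in rec:
    print(f"p={p:.2f}  ARE(EMP, MLE)={vs_mle:.4f}  ARE(EMP, PM)={vs_pm:.4f}")
```

Because (A10) does not depend on p in this case, the PM-based curve is the MLE-based curve divided by a constant, which is the same rescaling property noted for Figure 1 and Figure 2. Values of (A8) below one indicate that the empirical estimator has the larger asymptotic variance.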

References

Aggarwal, Ankur, Michael B. Beck, Matthew Cann, Tim Ford, Dan Georgescu, Nirav Morjaria, Andrew D. Smith, Yvonne Taylor, Andreas Tsanakas, Louise Witts, et al. 2016. Model risk—Daring to open up the black box. British Actuarial Journal 21: 229–96.
Alexander, Carol, and Jose M. Sarabia. 2012. Quantile uncertainty and value-at-risk model risk. Risk Analysis 32: 1293–308.
Bignozzi, Valeria, Giovanni Puccetti, and Ludger Rüschendorf. 2015. Reducing model risk via positive and negative dependence assumptions. Insurance: Mathematics and Economics 61: 17–26.
Black, Rob, Andreas Tsanakas, Andrew D. Smith, Michael B. Beck, Iain D. Maclugash, Jasvir Grewal, Louise Witts, Nirav Morjaria, R. J. Green, and Zhixin Lim. 2018. Model risk: Illuminating the black box. British Actuarial Journal 23: 1–58.
Brazauskas, Vytaras, and Thomas Kaiser. 2004. Discussion of “Empirical estimation of risk measures and related quantities” by Jones and Zitikis. North American Actuarial Journal 8: 114–17.
Brazauskas, Vytaras, and Andreas Kleefeld. 2016. Modeling severity and measuring tail risk of Norwegian fire claims. North American Actuarial Journal 20: 1–16.
Brazauskas, Vytaras. 2009. Quantile estimation and the statistical relative efficiency curve. Metron LXVII: 289–301.
Cairns, Andrew J. 2000. A discussion of parameter and model uncertainty in insurance. Insurance: Mathematics and Economics 27: 313–30.
Cambou, Mathieu, and Damir Filipović. 2017. Model uncertainty and scenario aggregation. Mathematical Finance 27: 534–67.
Cont, Rama, Romain Deguest, and Giacomo Scandolo. 2010. Robustness and sensitivity analysis of risk measurement procedures. Quantitative Finance 10: 593–606.
Embrechts, Paul, Bin Wang, and Ruodu Wang. 2015. Aggregation-robustness and model uncertainty of regulatory risk measures. Finance and Stochastics 19: 763–90.
Glasserman, Paul, and Xingbo Xu. 2014. Robust risk measurement and model risk. Quantitative Finance 14: 1–30.
Hartman, Brian, Robert Richardson, and Rylan Bateman. 2017. Parameter Uncertainty. Technical Report. Ottawa: Casualty Actuarial Society, Canadian Institute of Actuaries, Society of Actuaries.
Hong, Liang, Todd Kuffner, and Ryan Martin. 2018. On prediction of future insurance claims when the model is uncertain. Variance 12: 90–99.
Jones, Bruce L., and Ricardas Zitikis. 2003. Empirical estimation of risk measures and related quantities (with discussion). North American Actuarial Journal 7: 44–54. Discussion: 8: 114–17; Reply: 8: 117–18.
Kaiser, Thomas, and Vytaras Brazauskas. 2006. Interval estimation of actuarial risk measures. North American Actuarial Journal 10: 249–68.
Klugman, Stuart A., Harry H. Panjer, and Gordon E. Willmot. 2012. Loss Models: From Data to Decisions, 4th ed. New York: Wiley.
Modarres, Reza, Tapan K. Nayak, and Joseph L. Gastwirth. 2002. Estimation of upper quantiles under model and parameter uncertainty. Computational Statistics and Data Analysis 39: 529–54.
Samanthi, Ranadeera, Wei Wei, and Vytaras Brazauskas. 2017. Comparing the riskiness of dependent portfolios via nested L-statistics. Annals of Actuarial Science 11: 237–52.
Serfling, Robert J. 1980. Approximation Theorems of Mathematical Statistics. New York: Wiley.
Yu, Daoping, and Vytaras Brazauskas. 2017. Model uncertainty in operational risk modeling due to data truncation: A single risk case. Risks 5: 49.

1	Note that some authors use ‘model risk’ instead of ‘model uncertainty’ to describe the same phenomenon.

Figure 1. Relative efficiency curves (RECs) of quantile estimators of the exponential ($x_{0} = 100, θ$) distribution for $t = 500$, $u = 2500$, $n = 100$, and $θ = 250$ (left column), $θ = 500$ (right column). Level $p_{1} =$ 0.05 (pm$_{1}$), 0.10 (pm$_{2}$), 0.25 (pm$_{3}$), 0.50 (pm$_{4}$), 0.75 (pm$_{5}$), 0.90 (pm$_{6}$). Top row: Plots of formulas (11) and (12). Bottom row: Plots of formula (13).


Figure 2. RECs of quantile estimators of the Pareto ($x_{0} = 100, α$) distribution for $t = 500$, $u = 2500$, $n = 100$, and $α = 1.50$ (left column), $α = 1.25$ (right column). Level $p_{1} =$ 0.05 (pm$_{1}$), 0.10 (pm$_{2}$), 0.25 (pm$_{3}$), 0.50 (pm$_{4}$), 0.75 (pm$_{5}$), 0.90 (pm$_{6}$). Top row: Plots of formulas (17) and (18). Bottom row: Plots of formula (19).


Figure 3. Exponential and Pareto quantile-quantile plots for the data sets of Table 1. The dashed line represents the “best” fit line. Left column: $y = 485 + 550 x$ (top) and $y = 485 + 525 x$ (bottom). Right column: $y = 6.27 + 0.58 x$ (top and bottom).


Table 1. Left-truncated (at $t = 500$) and right-censored (at $u = 2500$) data simulated from the exponential ($x_{0} = 100, θ = 500$) and Pareto ($x_{0} = 100, α = 1.50$) distributions.


Exponential Data:	501, 501, 502, 502, 540, 551, 556, 556, 567, 599, 632, 642, 644, 646, 672, 675, 699, 711, 728, 745,
	750, 805, 829, 854, 869, 874, 889, 923, 961, 1012, 1034, 1046, 1054, 1102, 1107, 1169, 1178,
	1190, 1253, 1392, 1430, 1450, 1470, 1901, 1965, 2351, 2465, 2500, 2500, 2500.
Pareto Data:	516, 526, 535, 542, 550, 570, 593, 603, 605, 608, 609, 661, 674, 688, 694, 728, 734, 751, 751, 768,
	778, 782, 786, 797, 825, 836, 836, 847, 940, 962, 968, 1034, 1080, 1115, 1118, 1120, 1134, 1137,
	1175, 1213, 1224, 1271, 1379, 1725, 1861, 2000, 2500, 2500, 2500, 2500.

Table 2. Parameter estimates, goodness-of-fit measures, and information criteria for the exponential and Pareto models fitted to the data sets of Table 1.

Assumed Model	Parameter Estimate	Kolmogorov-Smirnov (p-Value *)	Anderson-Darling (p-Value *)	AIC	BIC
Exponential Data
Exponential	${\hat{θ}}_{MLE} = 595.57$	0.077 (0.914)	1.099 (0.317)	696.62	698.53
Exponential	${\hat{θ}}_{PM} = 554.23$	0.076 (0.637)	0.942 (0.374)	696.86	698.78
Pareto	${\hat{α}}_{MLE} = 1.491$	0.095 (0.371)	0.898 (0.344)	679.29	681.20
Pareto	${\hat{α}}_{PM} = 1.572$	0.109 (0.538)	1.112 (0.262)	682.92	684.83
Pareto Data
Exponential	${\hat{θ}}_{MLE} = 579.33$	0.109 (0.573)	0.564 (0.670)	695.99	697.90
Exponential	${\hat{θ}}_{PM} = 443.01$	0.102 (0.610)	1.006 (0.329)	696.12	698.03
Pareto	${\hat{α}}_{MLE} = 1.487$	0.128 (0.354)	1.025 (0.294)	678.29	680.20
Pareto	${\hat{α}}_{PM} = 1.816$	0.195 (0.000)	2.525 (0.000)	681.44	683.35

* The p-values are computed using parametric bootstrap with 1000 simulation runs.

Table 3. Parametric and empirical estimates of the 90%, 95% and 99% quantiles for the exponential and Pareto data sets of Table 1.

Data Set	Assumed Model	Estimator	${\hat{F}}^{- 1} (0.90)$	${\hat{F}}^{- 1} (0.95)$	${\hat{F}}^{- 1} (0.99)$
Exponential	Exponential	MLE	1471	1884	2843
	Exponential	PM	1387	1774	2674
	Pareto	MLE	468	746	2194
	Pareto	PM	436	679	1902
	Empirical	$- - -$	2004	2484	2500
Pareto	Exponential	MLE	1434	1836	2768
	Exponential	PM	1123	1431	2146
	Pareto	MLE	471	750	2216
	Pareto	PM	356	522	1269
	Empirical	$- - -$	1875	2500	2500

Table 4. Estimated Pareto and exponential RECs (MLE and empirical) at selected quantile levels. Model parameter estimates are from Table 2, based on the data of Table 1.

Quantile Level p	Exponential Data: Exponential Model	Exponential Data: Pareto Model	Pareto Data: Exponential Model	Pareto Data: Pareto Model
0.05	8293	615,077	8792	611,390
0.10	1971	145,899	2089	145,025
0.25	267	19,631	283	19,513
0.50	47	3413	50	3393
0.75	13	877	14	872
0.90	6	344	6	342
0.95	4	228	5	227

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

