A Hypothesis Test for the Long-Term Calibration in Rating Systems with Overlapping Time Windows

Kurth, Patrick; Nendel, Max; Streicher, Jan

doi:10.3390/risks12080131


by Patrick Kurth ¹, Max Nendel ²,* and Jan Streicher ¹,²

¹ Landesbank Baden-Württemberg, 70173 Stuttgart, Germany

² Center for Mathematical Economics, Bielefeld University, 33615 Bielefeld, Germany

* Author to whom correspondence should be addressed.

Risks 2024, 12(8), 131; https://doi.org/10.3390/risks12080131

Submission received: 3 July 2024 / Revised: 9 August 2024 / Accepted: 13 August 2024 / Published: 16 August 2024


Abstract:

We present a statistical test for the long-term calibration in rating systems that can deal with overlapping time windows as required by the guidelines of the European Banking Authority (EBA), which apply to major financial institutions in the European System. In accordance with regulation, rating systems are to be calibrated and validated with respect to the long-run default rate. The consideration of one-year default rates on a quarterly basis leads to correlation effects which drastically influence the variance of the long-run default rate. In a first step, we show that the long-run default rate is approximately normally distributed. We then perform a detailed analysis of the correlation effects caused by the overlapping time windows and solve the problem of an unknown distribution of default probabilities.

Keywords:

hypothesis test; credit risk; rating system; validation; backtesting; long-run default rate; EBA guidelines; correlation effects

JEL Classification:

C12; C52; G28; G32

1. Introduction

Financial institutions use statistical models to estimate the default risk of obligors in order to manage credit risks. According to Basel II, banks are allowed to estimate risk parameters that are used to calculate regulatory capital with their own models. The legal framework for the use of such models in the internal ratings-based (IRB) approach is laid down in the Capital Requirements Regulation (CRR), see CRR (2013). In Cucinelli et al. (2018), the authors conclude that the IRB regulatory framework has promoted stronger risk management practices and strengthened banks’ overall resilience. The CRR imposes specific requirements for the models, e.g., that “institutions shall estimate PDs [probabilities of default] by obligor grade from long run averages of one-year default rates”, see Article 180. In 2017, the European Banking Authority published the “guidelines on PD estimation, LGD [loss given default] estimation and the treatment of defaulted exposures” (EBA-GL) specifying the CRR requirement, see EBA GL (2017). Paragraph 81 EBA-GL states that “institutions should calculate the observed average default rates as the arithmetic average of all one year default rates”. This requirement was additionally specified in EBA (2016) and ECB (2024). Here, the observed one-year default rate at a given reference date is defined as the percentage of defaulters in the 12-month period after the reference date, so that the observed long-run average default rate depends on the choice of the reference dates. Paragraph 80 EBA-GL allows institutions to choose “between an approach based on overlapping and an approach based on non-overlapping one-year time windows”. Overlapping one-year time windows occur when the time interval between two reference dates is less than one year. Due to computational simplicity, it is of course convenient to continue working with non-overlapping time windows and appropriately adjust the long-term default rate. In many cases, these types of adjustments are rather on the conservative side, see, e.g., Jing et al. (2008) for an empirical study and Li (2016); Zhou (2001) for theoretical analyses in the context of asset correlation. We also refer to Caprioli et al. (2023) for a sensitivity analysis of credit portfolio value at risk with respect to asset correlations.

On the other hand, the approach using overlapping time windows provides more information on defaults due to both potential short-term contracts that cannot be observed during one-year periods and possible variations between average default rates using non-overlapping time windows on different reference dates. Especially in such constellations, it is “the ECB’s understanding that overlapping one-year time windows should preferably be used”, see ECB (2024). For this reason, dealing with overlapping time windows is relevant for all major banks in the eurozone using the IRB approach. Furthermore, it is favored by most financial institutions that handle portfolios with only few observed defaults, where the estimation and validation of PDs are particularly challenging, cf. Caprioli et al. (2023); Pluto and Tasche (2011); Tasche (2013). Another advantage of this approach lies in the fact that the bias caused by a specific choice of reference dates can be reduced, e.g., when calculating the long-run average default rate as the arithmetic mean of the one-year default rates on a quarterly basis.

Paragraph 87 EBA-GL requires institutions to use a statistical test of calibration at the level of rating grades and for certain portfolios. Classically, the literature dealing with calibration tests assumes a binomially distributed default rate, see, e.g., Deutsche Bundesbank (2003) and Tasche (2008) for a discussion of different hypothesis tests in this context. We also refer to Coppens et al. (2016) for the consideration of different PDs within the same rating grade and Blochwitz et al. (2004, 2006) for a modified binomial test accounting for correlated defaults. However, when considering overlapping time windows, the assumption of a binomial distribution can no longer be maintained. For an overview of existing statistical tests, including a hybrid test that represents a combination of exact and approximate testing procedures, we refer to Aussenegg et al. (2011).

More generally, Monte Carlo methods can be used to construct tests for distributions that cannot be determined analytically. For the use of Monte Carlo methods, precise knowledge of the distribution of the probabilities of default within the portfolio is essential. However, in our case, these probabilities are unknown and estimated by the underlying model so they cannot be used to determine the test statistic. On the other hand, analytical tests are desirable since they satisfy requirements such as replicability and reproducibility. We refer to Blöchlinger (2012, 2017) for hypothesis tests in analytic form that take into account correlation effects, replacing the i.i.d. assumption by a conditional i.i.d. assumption. In this context, we also refer to Tasche (2003) for a one-observation-based inference on the adequacy of probability of default forecasts with dependent default events.

In this paper, we therefore present a statistical test that can be used to verify supervisory calibration requirements in the case of overlapping time windows. A major challenge lies in the analysis of the correlation effects that are caused by the overlapping time windows. On the level of individual rating grades, the variance of the test statistic is already determined by the null hypothesis. This, however, is not the case on a portfolio level, which is why we focus in great detail on a conservative estimate for the variance by solving a related minimization problem. We then present a conservative calibration test that can deal with the unknown variance in the test statistic.

The rest of the paper is structured as follows. In Section 2.1, we introduce the terminology and notation for the formulation of the hypothesis test in Section 2.2. In Section 2.3, we show that the long-run default rate is approximately normally distributed with respect to random effects in default realization. Thereafter, we focus on the analysis of the correlation effects that arise due to the overlapping time periods in Section 2.4 and Section 2.5. Taking these correlation effects into account is essential for determining the variance of the long-run default rate, see Section 2.6. In Section 3 we formulate the hypothesis test in detail, first at the level of individual rating grades, cf. Section 3.1 and then at portfolio level, cf. Section 3.2. We conclude with a discussion on the parameters of the test and further considerations in Section 4. A closed form solution to the minimization problem related to the estimation of the variance is derived in Appendix A. In Appendix B, we propose an alternative way to estimate the variance without solving an optimization problem.

2. Setup and Preliminaries

2.1. Setup and Notation

In this section, we introduce the terminology and notation used in this paper and give a formal description of the statistical test that is studied in this work. We begin with the general setup by considering the default state of an individual obligor. Throughout, a default state over a one-year time horizon is described by a Bernoulli-distributed random variable

x \sim B (1, p)

, where

p \in \{{PD}_{1}, \dots, {PD}_{m}\},

m \in N

, and

0 < {PD}_{1} < \dots < {PD}_{m} < 1

are the default probabilities of the rating grades in the underlying master scale. In the following, we also use the notations

{PD}_{min}

and

{PD}_{max}

for the default probabilities

{PD}_{1}

and

{PD}_{m}

, respectively.

Next, we introduce the long-run default rate for a given history of reference dates

{RD}_{1} < \dots < {RD}_{N}

with

N \in N

. For all

t = 1, \dots, N

, we are given a number

n_{t} \in N_{0}

of customers within the portfolio at the reference date

{RD}_{t}

, and write

n_{min} : = min ({n_{t} | t = 1, \dots, N} ∖ {0}) and n_{max} : = max {n_{t} | t = 1, \dots, N}

for the minimal and maximal number of existing customers at any reference date throughout the history of the portfolio, respectively. Throughout, we assume that

n_{max} \neq 0

or, equivalently, that

n_{t} \neq 0

for some

t = 1, \dots, N

.

Let

q \in N

be the number of reference dates within a one-year time horizon starting from an arbitrary reference date. To be more precise, we assume that

{RD}_{t + 1} = {RD}_{t} + \frac{1}{q} for all t = 1, \dots, N - 1 .

According to the EBA guidelines EBA GL (2017), the long-run default rate should be computed at least on a quarterly basis, i.e.,

q = 4

, or on an annual basis, i.e.,

q = 1

, in which case an analysis of the possible bias caused by neglecting quarterly data has to be performed. The main interest of our analysis lies in the case

q > 1

leading to correlation effects caused by overlapping time windows. For

s, t = 1, \dots, N

with

s > t

, we define

w_{t, s} : = max {0, {RD}_{t} + 1 - {RD}_{s}}

for the size of the overlap of the observation periods.
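The overlap sizes can be made concrete with a small sketch. The following Python snippet is an illustration only: the function name `overlap` and the use of the integer indices t and s in place of the dates RD_t and RD_s are assumptions made here, relying on the equidistant spacing RD_{t+1} = RD_t + 1/q stated above.

```python
from fractions import Fraction


def overlap(t: int, s: int, q: int) -> Fraction:
    """Overlap w_{t,s} = max(0, RD_t + 1 - RD_s) for equidistant reference dates.

    With RD_{t+1} = RD_t + 1/q, the difference RD_s - RD_t equals (s - t)/q.
    """
    if s <= t:
        raise ValueError("the overlap is only defined for s > t")
    return max(Fraction(0), 1 - Fraction(s - t, q))


# Quarterly reference dates (q = 4): overlaps for lags of 1, 2, 3, and 4 quarters.
quarterly = [overlap(1, 1 + lag, 4) for lag in range(1, 5)]
```

For quarterly reference dates, the overlaps for lags of one to four quarters are 3/4, 1/2, 1/4, and 0, so only the q - 1 = 3 nearest later reference dates share part of the observation period.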

Since new customers are usually added over time and the business relationship ends for others, we assign to each customer during the history of reference dates a number from 1 to

M \in N

, and define

Λ_{t} \subseteq {1, \dots, M}

as the set of obligors at reference date

{RD}_{t}

for all

t = 1, \dots, N

. In particular,

⋃_{t = 1}^{N} Λ_{t} = {1, \dots, M}

,

|Λ_{t}| = n_{t}

, and

n_{max} \leq M

. For

t = 1, \dots, N

, we then define the one-year default rate among the considered customers at reference date

{RD}_{t}

by

\begin{matrix} X_{t} = \frac{1}{n_{t}} \sum_{j \in Λ_{t}} x_{t, j}, \end{matrix}

(1)

where

x_{t, j} \sim B (1, p_{t, j})

is the one-year default state and

p_{t, j} \in \{{PD}_{1}, \dots, {PD}_{m}\}

the probability of default over a one-year time horizon of customer

j \in Λ_{t}

at the reference date

{RD}_{t}

. Since

n_{t} = 0

cannot be excluded, we use the convention

\frac{0}{0} : = 0

.

The realized default rate on a reference date

{RD}_{t}

is denoted by

{DR}_{t}

, i.e.,

{DR}_{t}

is the realization of the random variable

X_{t}

for

t = 1, \dots, N

. Moreover, we define

R_{N} : = \{t \in {1, \dots, N} | n_{t} > 0\}

as the set of indices of reference dates at which the portfolio contains at least one customer, and denote its cardinality by

R (N)

. Since we assume that

n_{max} \neq 0

, it follows that

R (N) > 0

, and we define the realized long-run default rate as

LRDR : = \frac{1}{R (N)} \sum_{t = 1}^{N} {DR}_{t},

which is a realization of the long-run default rate

\begin{matrix} Z : = \frac{1}{R (N)} \sum_{t = 1}^{N} X_{t} . \end{matrix}

(2)

We emphasize that the long-run default rate is the arithmetic mean of the one-year default rates and not the arithmetic mean of the individual default states of all customers for all reference dates, which intuition might suggest. Due to the chosen convention

\frac{0}{0} = 0

, it follows that

X_{t} = 0

if

n_{t} = 0

, i.e., reference dates at which the portfolio contains no customers are completely disregarded in the computation of the long-run default rate. Nevertheless, these reference dates have to be included in the timeline since they describe an overlapping period.
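The difference between the long-run default rate and a pooled default rate can be illustrated with a short Python sketch. The function name `long_run_default_rate` and the example figures are hypothetical and chosen only for illustration; the code follows the definition of LRDR given above, including the convention 0/0 := 0 and the normalization by R(N).

```python
def long_run_default_rate(defaults: list[int], customers: list[int]) -> float:
    """Realized long-run default rate LRDR from per-date counts.

    defaults[t] is the number of defaulters and customers[t] the number n_t
    of customers at reference date RD_{t+1} (0-based list index).
    """
    if len(defaults) != len(customers):
        raise ValueError("both lists must cover the same reference dates")
    # One-year default rates DR_t with the convention 0/0 := 0.
    rates = [d / n if n > 0 else 0.0 for d, n in zip(defaults, customers)]
    # R(N): number of reference dates with at least one customer.
    r_n = sum(1 for n in customers if n > 0)
    if r_n == 0:
        raise ValueError("n_max must be non-zero")
    return sum(rates) / r_n


defaults = [2, 0, 0, 1]        # hypothetical default counts
customers = [100, 0, 50, 50]   # hypothetical n_t, one empty reference date
lrdr = long_run_default_rate(defaults, customers)  # (0.02 + 0 + 0.02) / 3
pooled = sum(defaults) / sum(customers)            # 3 / 200
```

In this hypothetical example, the empty reference date contributes neither to the sum nor to R(N) = 3, and the long-run default rate (about 1.33%) differs from the pooled rate of all individual default states (1.5%).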

2.2. Formal Description of the Test

We now give an overview of the hypothesis test presented in this paper. The aim is to formulate a statistical test for comparing the realized long-run default rate

LRDR

with the estimated long-run default rate, which is called the long-run central tendency, denoted by

LRCT

, both on the level of individual rating grades and on the portfolio level. In other words,

LRDR

describes the default rate that has actually occurred, while

LRCT

represents the long-run prediction from the rating model. Both

LRDR

and

LRCT

are therefore non-random. We use the random variable Z with expected value

μ : = E (Z)

as the test statistic. Since

LRCT

is the estimated expected value of Z, i.e.,

LRCT = \hat{μ}

, we can formulate the following hypothesis test with

\begin{matrix} null hypothesis H_{0} : μ = LRCT and \\ alternative hypothesis H_{1} : μ \neq LRCT . \end{matrix}

Based on this, we determine values

k, K \in [0, 1]

and consider the test function

φ (z) : = \begin{cases} 0, & \text{if } z \in [k, K], \\ 1, & \text{otherwise}. \end{cases}

(3)

If

φ (LRDR) = 0

, the null hypothesis based on the available data is retained, whereas it is rejected in favor of the alternative hypothesis if

φ (LRDR) = 1

. We point out that the distribution of the test statistic Z does not only depend on

μ

but also on other parameters that are not determined by the null hypothesis. The main focus of this paper is to postulate a distributional assumption for Z such that its distribution is determined by the null hypothesis. In this context, we first show that Z is approximately normally distributed, i.e.,

Z \sim N (μ, σ^{2})

. In a second step, the focus lies on a conservative estimate for the variance

σ^{2}

based on the information given by the null hypothesis. A major challenge lies in the fact that the consideration of overlapping time windows leads to unavoidable correlation effects that rule out independence assumptions on the random variables

X_{1}, \dots, X_{N}

.

2.3. Distribution of the Long-Run Default Rate

As we have seen in the previous section, the long-run calibration test requires the formulation of a distribution assumption for the long-run default rate. However, the definition of the long-run default rate as the arithmetic mean of the one-year default rates leads to difficulties in the derivation of an analytical description of the distribution based on the Bernoulli-distributed default states of the individual obligors. The aim of this section is to show that, despite the correlation effects and the possibly varying number of obligors with different PDs at each reference date, the long-run default rate is still approximately normally distributed.

In order to simplify notation, we define

y_{t, j} : = x_{t, j}

if

j \in Λ_{t}

and

y_{t, j} : = 0

otherwise, for all

t = 1, \dots, N

. Then,

Z = \frac{1}{R (N)} \sum_{t = 1}^{N} \frac{1}{n_{t}} \sum_{j = 1}^{M} y_{t, j} = \sum_{j = 1}^{M} Y_{j}

with

Y_{j} = \sum_{t = 1}^{N} \frac{1}{R (N)} \frac{1}{n_{t}} y_{t, j} .

In the sequel, we will show that Z is approximately normally distributed. For this, we assume that default states of different obligors are independent of each other, from which the independence of the family

{(Y_{j})}_{j = 1, \dots, M}

follows. This assumption is standard and sufficiently conservative for the calibration of rating models.

In order to apply the Lindeberg–Feller central limit theorem (CLT), cf. (Klenke 2020, Theorem 15.44), which is a generalization of the classical CLT, we first show that the variances of the random variables

{(Y_{j})}_{j = 1, \dots, M}

cannot become arbitrarily small. In fact, by neglecting the covariance between the default states of an obligor at different reference dates,

\begin{matrix} σ^{2} (Y_{j}) & = \frac{1}{R {(N)}^{2}} \sum_{t = 1}^{N} (\frac{1}{n_{t}^{2}} σ^{2} (y_{t, j}) + 2 \sum_{s = t + 1}^{N} \frac{1}{n_{t} n_{s}} Cov (y_{t, j}, y_{s, j})) \\ & \geq \frac{1}{R {(N)}^{2}} \sum_{t = 1}^{N} \frac{1}{n_{t}^{2}} σ^{2} (y_{t, j}) for all j = 1, \dots, M . \end{matrix}

Note that, in the previous estimate, we have used the inequality

\sum_{s = t + 1}^{N} \frac{1}{n_{t} n_{s}} Cov (y_{t, j}, y_{s, j}) \geq 0,

which is not automatically satisfied but rather follows from Formula (5) below, which results from the additional assumptions and considerations on default states in Section 2.4. Moreover, for each obligor

j = 1, \dots, M

, there exists at least one index

t = 1, \dots, N

with

σ^{2} (y_{t, j}) > 0

since

⋃_{t = 1}^{N} Λ_{t} = {1, \dots, M}

. Hence, for each obligor

j = 1, \dots, M

,

σ^{2} (Y_{j}) \geq \frac{1}{R {(N)}^{2}} \frac{1}{n_{max}^{2}} min_{k = 1, \dots, m} \{{PD}_{k} (1 - {PD}_{k})\} = : v_{min} .

Note that

v_{min}

, as defined on the right-hand side of the previous estimate, is independent of the customer

j = 1, \dots, M

and

v_{min} > 0

. Since, up to now, we only consider finitely many customers, we extend the family

{(Y_{j})}_{j = 1, \dots, M}

by a sequence of independent random variables

Y_{M + 1}, Y_{M + 2}, \dots

with

σ^{2} (Y_{j}) \geq v_{min}

, e.g.,

Y_{j} \sim N (μ_{j}, v_{min})

with arbitrary drift

μ_{j} \in (0, 1)

for

j \geq M + 1

.

Now, the central limit theorem shall be applied to the family

{(Y_{j})}_{j \in N}

. In the classical version by Lindeberg and Lévy, the CLT states that a proper renormalization of the arithmetic mean is approximately normally distributed if the underlying sequence of random variables is independent and identically distributed. Its generalized version by Lindeberg and Feller also applies to random variables that are not identically distributed, as in our case. It states that a proper renormalization of the arithmetic mean of a family of independent random variables converges in distribution to a normally distributed random variable if this family satisfies the so-called Lindeberg condition, see (Klenke 2020, Definition 15.41). In our case, the Lindeberg condition is satisfied if, for all

ε > 0

,

lim_{k \to \infty} \frac{1}{s_{k}^{2}} \sum_{j = 1}^{k} E ({(Y_{j} - E (Y_{j}))}^{2} \cdot 1_{\{|Y_{j} - E (Y_{j})| > ε s_{k}\}}) = 0,

where

s_{k} = \sqrt{\sum_{j = 1}^{k} σ^{2} (Y_{j})} .

That is, the Lindeberg condition requires that the underlying sequence of random variables does not exhibit arbitrarily large deviations from the expected value.

In the following, we verify the Lindeberg condition in our setup. To that end, let

ε > 0

. Since, for all

j = 1, \dots, M

, the random variable

Y_{j}

only takes values in

[0, 1]

,

σ^{2} (Y_{j}) \geq v_{min} > 0, and s_{k} \to \infty as k \to \infty,

there exists some

k_{0} \in N

such that, for all

k \in N

with

k \geq k_{0}

,

j \in N

, and

ω \in Ω

,

| Y_{j} (ω) - E (Y_{j}) | \leq ε \cdot s_{k} .

Hence, for all

k \in N

with

k \geq k_{0}

,

\begin{matrix} \frac{1}{s_{k}^{2}} \sum_{j = 1}^{k} E & ({(Y_{j} - E (Y_{j}))}^{2} \cdot 1_{\{|Y_{j} - E (Y_{j})| > ε s_{k}\}}) \\ = \frac{1}{s_{k}^{2}} \sum_{j = 1}^{k_{0}} E ({(Y_{j} - E (Y_{j}))}^{2} \cdot 1_{\{|Y_{j} - E (Y_{j})| > ε s_{k}\}}) \\ \leq \frac{1}{s_{k}^{2}} \sum_{j = 1}^{k_{0}} E ({(Y_{j} - E (Y_{j}))}^{2}) \to 0 as k \to \infty, \end{matrix}

i.e., the Lindeberg condition is satisfied. By the central limit theorem, cf. (Klenke 2020, Theorem 15.44), we obtain that

lim_{k \to \infty} P (\frac{1}{s_{k}} \sum_{j = 1}^{k} (Y_{j} - E (Y_{j})) \leq z) = Φ (z) for all z \in R,

where

Φ

is the cumulative distribution function for the standard normal distribution. Since the number of customers M is large but fixed throughout our analysis, we may therefore assume that

Z \sim N (μ, σ^{2})

(4)

with

μ = \sum_{j = 1}^{M} E (Y_{j})

and

σ = s_{M}

, suppressing the dependence of

μ

and

σ

on the number of customers M.

The literature suggests that, in standard cases, a sample size of at least 30 is required in order to achieve useful results with the approximation of a normal distribution, see, for example, Hogg and Tanis (1977). In our setting, however, the situation is more complex due to the specific calculation method of the long-run default rate. As we will see in Section 4.3, the convergence depends both on the number of reference dates N and the number of customers

n_{t}

per reference date

{RD}_{t}

; hence, indirectly on M, but not solely on M. As a consequence, a large number of customers does not necessarily imply a good approximation, e.g., if we have

N = 2

reference dates and M = 1,000,000 customers, but only one customer at the second reference date, i.e.,

n_{2} = 1

. For more details, we refer to Section 4.3, where we focus on the convergence for different combinations of N and numbers of customers

n_{t}

per reference date

{RD}_{t}

.
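The example with N = 2 and a single customer at the second reference date can be quantified with a rough sketch. The following Python snippet rests on assumptions made only for this illustration: all obligors share a hypothetical PD of 1%, the single customer at the second reference date is new (so that X_1 and X_2 are independent), and the function name `tail_comparison` is hypothetical. It compares a lower bound for the exact probability P(Z ≥ 1/2) with the value suggested by the normal approximation.

```python
from math import sqrt
from statistics import NormalDist


def tail_comparison(p: float, n1: int) -> tuple[float, float]:
    """Compare P(Z >= 1/2) with its normal approximation for N = 2 and n_2 = 1.

    Z = (X_1 + X_2) / 2, where X_1 is the default rate of n1 obligors and
    X_2 in {0, 1} is the default state of one new obligor, all with PD p.
    """
    mu = p
    # Variance of Z for independent X_1 and X_2 (no persisting customers).
    var = (p * (1 - p) / n1 + p * (1 - p)) / 4
    normal_tail = 1 - NormalDist(mu, sqrt(var)).cdf(0.5)
    # The event {X_2 = 1} already implies Z >= 1/2, hence P(Z >= 1/2) >= p.
    exact_lower_bound = p
    return exact_lower_bound, normal_tail


exact_bound, normal_tail = tail_comparison(p=0.01, n1=999_999)
```

Under these assumptions, the exact probability of observing Z ≥ 1/2 is at least 1%, whereas the normal approximation assigns this event a numerically negligible probability, although the total number of customers is M = 1,000,000.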

2.4. Covariance between Default States

Pursuant to Paragraph 80 EBA-GL (EBA GL 2017), when calculating the long-run default rate, institutions may choose between overlapping time windows of one year and non-overlapping time windows. According to Paragraph 78 EBA-GL, the reference dates used should include at least all quarterly reference dates. The overlap of the time windows of the one-year default rates leads to correlations between them, which affects the variance of the long-run default rate and thus the acceptance ranges of the test.

In order to describe the distribution of the test statistic as precisely as possible, we focus on the analysis of the covariances between the default rates in Section 2.5 below, and start in this subsection by considering the covariance between two default states of an individual obligor for overlapping observation periods. To that end, we consider two reference dates

{RD}_{t}

and

{RD}_{s}

with

{RD}_{t} < {RD}_{s} < {RD}_{t} + 1

and an obligor who can be observed in the period

[{RD}_{t}, {RD}_{s} + 1)

. The default state in the period

[{RD}_{t}, {RD}_{t} + 1)

is described by a random variable

x_{t} \in \{0, 1\}

, the default state in the period

[{RD}_{s}, {RD}_{s} + 1)

by a random variable

x_{s} \in \{0, 1\}

, cf. Figure 1. Throughout, default events correspond to realizations of the value 1 for default states. That is, a realization of

x_{t}

or

x_{s}

answers the question whether or not the obligor under consideration defaulted in time period

[{RD}_{t}, {RD}_{t} + 1)

or

[{RD}_{s}, {RD}_{s} + 1)

, respectively.

In order to compute the covariance between the two default states

x_{t}

and

x_{s}

, a detailed knowledge of the time of default is necessary, where, in the case of more than one default event, we focus on the time of the temporally first default. The time of first default is described by a random variable

T_{1} \in [{RD}_{t}, \infty)

, which creates a link between default status and time of default, see, e.g., European Commission (2021). While

T_{1}

describes the timing of the first default starting at time

{RD}_{t}

, we model the timing of the first default after the default described by

T_{1}

and after

{RD}_{s}

using a random variable

T_{2}^{*} \in [{RD}_{s}, \infty)

, with

T_{2}^{*} > T_{1}

. We then define

T_{2} : = \begin{cases} T_{1}, & \text{if } T_{1} \geq {RD}_{s} \text{ and } T_{2}^{*} > {RD}_{s} + 1, \\ T_{2}^{*}, & \text{otherwise}. \end{cases}

If besides

T_{1}

more than one default is observed in the interval

[{RD}_{s}, {RD}_{s} + 1)

,

T_{2}

describes the time of the second default, otherwise

T_{2}

is used to model the time of the first default starting from

{RD}_{s}

. We now divide the considered time period into three disjoint intervals

I_{1} : = [{RD}_{t}, {RD}_{s}], I_{2} : = ({RD}_{s}, {RD}_{t} + 1], and I_{3} : = ({RD}_{t} + 1, {RD}_{s} + 1],

and look at the following disjoint events

E_{1} : = {T_{1} \in I_{1}}, E_{2} : = {T_{1} \in I_{2}}, and E_{3} : = {T_{1} \notin I_{1} \cup I_{2}} .

Moreover, we define

E_{4} : = {T_{2} \in I_{2}}, E_{5} : = {T_{2} \in I_{3}}, and E_{6} : = {T_{2} \notin I_{2} \cup I_{3}} .

In order to simplify notation, we set

p_{t} : = P (x_{t} = 1)

and

p_{s} : = P (x_{s} = 1)

and assume that

P (x_{t} = 1 | E_{5}) = p_{t} and P (x_{s} = 1 | E_{1}) = p_{s} .

In a first step, we focus on the probability

P (E_{1})

. For this, we assume that the probability of default in a certain time interval depends on the length of this interval and on the general creditworthiness of the obligor, i.e., we assume that

P (T_{1} \in I | x_{t} = 1) = f_{1} (| I |)

for every interval

I \subseteq [{RD}_{t}, {RD}_{t} + 1)

and a non-negative and non-decreasing function

f_{1}

with

f_{1} (0) = 0

and

f_{1} (1) = 1

. This ensures that the probability of observing a default in a given time interval increases if the length of the interval increases and that the probabilities of default are identical for intervals of the same length. We deduce

P (E_{1} | x_{t} = 1) = 1 - P (E_{2} | x_{t} = 1) = 1 - f_{1} (w_{t, s}),

and therefore, using Bayes’ theorem,

1 - f_{1} (w_{t, s}) = P (E_{1} | x_{t} = 1) = P (x_{t} = 1 | E_{1}) \cdot \frac{P (E_{1})}{P (x_{t} = 1)} = \frac{P (E_{1})}{p_{t}} .

Rearranging terms, we find that

P (E_{1}) = (1 - f_{1} (w_{t, s})) p_{t}

and, analogously,

P (E_{2}) = f_{1} (w_{t, s}) p_{t} .

Hence,

\begin{matrix} E (x_{t} \cdot x_{s}) & = P (x_{t} = 1, x_{s} = 1) = P (x_{t} = 1, x_{s} = 1 | E_{1}) \cdot P (E_{1}) \\ + P (x_{t} = 1, x_{s} = 1 | E_{2}) \cdot P (E_{2}) + P (x_{t} = 1, x_{s} = 1 | E_{3}) \cdot P (E_{3}) \\ = P (x_{s} = 1 | E_{1}) \cdot P (E_{1}) + P (E_{2}) = p_{s} (1 - f_{1} (w_{t, s})) p_{t} + f_{1} (w_{t, s}) p_{t} \\ = p_{t} p_{s} + f_{1} (w_{t, s}) p_{t} (1 - p_{s}) . \end{matrix}

Analogously, one obtains

P (E_{4}) = f_{2} (w_{t, s}) p_{s} and P (E_{5}) = (1 - f_{2} (w_{t, s})) p_{s},

for a non-negative and non-decreasing function

f_{2}

with

f_{2} (0) = 0

and

f_{2} (1) = 1

, and we find that

\begin{matrix} E (x_{t} \cdot x_{s}) & = P (x_{t} = 1, x_{s} = 1) = P (x_{t} = 1, x_{s} = 1 | E_{4}) \cdot P (E_{4}) \\ + P (x_{t} = 1, x_{s} = 1 | E_{5}) \cdot P (E_{5}) + P (x_{t} = 1, x_{s} = 1 | E_{6}) \cdot P (E_{6}) \\ = P (E_{4}) + P (x_{t} = 1 | E_{5}) \cdot P (E_{5}) = f_{2} (w_{t, s}) p_{s} + p_{t} (1 - f_{2} (w_{t, s})) p_{s} \\ = p_{t} p_{s} + f_{2} (w_{t, s}) p_{s} (1 - p_{t}) . \end{matrix}

Equating the previous expectations results in

f_{1} (w_{t, s}) = f_{2} (w_{t, s}) \frac{p_{s} (1 - p_{t})}{p_{t} (1 - p_{s})},

i.e.,

f_{1}

is a linear transform of

f_{2}

. To determine the covariance between

x_{t}

and

x_{s}

, knowledge of at least one of the functions

f_{1}

or

f_{2}

is required. We assume that the distribution of the time of default within an observation year is uniform. We postulate this assumption for

x_{s}

, since it gives preference to the more recent information at

{RD}_{s}

over the information at

{RD}_{t}

, which lies further in the past. We therefore set

f_{2} (w_{t, s}) = w_{t, s}

and end up with

Cov (x_{t}, x_{s}) = E (x_{t} \cdot x_{s}) - E (x_{t}) \cdot E (x_{s}) = w_{t, s} p_{s} (1 - p_{t}) .

(5)
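The derivation of Formula (5) can be checked numerically with a small sketch. In the following Python snippet, the function names and the example values (p_t = 2%, p_s = 1%, overlap 3/4) are hypothetical; the code sets f_2(w) = w as above, derives f_1 from the relation between f_1 and f_2, and verifies that both decompositions of E(x_t · x_s) agree. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def joint_default_probability(p_t: Fraction, p_s: Fraction, w: Fraction) -> Fraction:
    """E(x_t * x_s), computed via both decompositions from Section 2.4."""
    f2 = w  # uniform default time within the observation year starting at RD_s
    f1 = f2 * p_s * (1 - p_t) / (p_t * (1 - p_s))  # relation between f_1 and f_2
    via_first = p_t * p_s + f1 * p_t * (1 - p_s)   # decomposition along E_1, E_2, E_3
    via_second = p_t * p_s + f2 * p_s * (1 - p_t)  # decomposition along E_4, E_5, E_6
    if via_first != via_second:
        raise ArithmeticError("the two decompositions should coincide")
    return via_second


def state_covariance(p_t: Fraction, p_s: Fraction, w: Fraction) -> Fraction:
    """Cov(x_t, x_s) = E(x_t * x_s) - p_t * p_s, i.e., Formula (5)."""
    return joint_default_probability(p_t, p_s, w) - p_t * p_s


p_t, p_s, w = Fraction(2, 100), Fraction(1, 100), Fraction(3, 4)
cov = state_covariance(p_t, p_s, w)
```

For these values, the covariance equals w · p_s · (1 - p_t) = 147/20000, and it vanishes when the observation periods do not overlap, i.e., for w = 0.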

2.5. Covariance between Default Rates

Having previously examined the covariance between obligor default states, we extrapolate this result to the covariance of default rates. To do this, we consider a sample of debtors on reference dates

{RD}_{t}

and

{RD}_{s}

with

{RD}_{s} < {RD}_{t} + 1

and

t, s = 1, \dots, N

. During the transition from the first to the second reference date, debtors can be removed from monitoring due to defaults, terminated business relationships, or migrations to other rating systems, and new debtors can be added on the second reporting date due to new business or migrations to the rating system under consideration. In addition, there are debtors who can be observed on both reference dates. The number of these so-called persisting customers with respect to reference dates

{RD}_{t}

and

{RD}_{s}

is denoted by

|Λ_{t} \cap Λ_{s}| = : k_{t, s} \in N

. The one-year default rates on the two reference dates are then given by

X_{t} = \frac{1}{n_{t}} \sum_{j \in Λ_{t}} x_{t, j}

with

x_{t, j} \sim B (1, p_{t, j})

and

X_{s} = \frac{1}{n_{s}} \sum_{j \in Λ_{s}} x_{s, j}

with

x_{s, j} \sim B (1, p_{s, j})

. Hence,

Cov (X_{t}, X_{s}) = \frac{1}{n_{t} n_{s}} \sum_{i \in Λ_{t}} \sum_{j \in Λ_{s}} Cov (x_{t, i}, x_{s, j}) .

Again, assuming that default states for pairs of different customers are independent, using Formula (5), we find that

Cov (X_{t}, X_{s}) = \frac{1}{n_{t} n_{s}} \sum_{j \in Λ_{t} \cap Λ_{s}} Cov (x_{t, j}, x_{s, j}) = \frac{w_{t, s}}{n_{t} n_{s}} \sum_{j \in Λ_{t} \cap Λ_{s}} p_{s, j} (1 - p_{t, j}),

(6)

where

w_{t, s}

describes, as before, the size of the overlap between the observation periods starting from the reference dates

{RD}_{t}

and

{RD}_{s}

.
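Formula (6) translates directly into a few lines of code. The following Python sketch is an illustration under assumptions: the function name `default_rate_covariance`, the representation of a reference date as a dictionary mapping obligor numbers to PDs, and the example portfolio are all hypothetical.

```python
def default_rate_covariance(pd_t: dict[int, float], pd_s: dict[int, float], w: float) -> float:
    """Cov(X_t, X_s) according to Formula (6).

    pd_t and pd_s map the obligors in Lambda_t and Lambda_s to their PDs
    p_{t,j} and p_{s,j}; w is the overlap w_{t,s} of the observation periods.
    """
    if not pd_t or not pd_s:
        return 0.0  # empty reference date, convention 0/0 := 0
    persisting = pd_t.keys() & pd_s.keys()  # Lambda_t intersected with Lambda_s
    total = sum(pd_s[j] * (1 - pd_t[j]) for j in persisting)
    return w * total / (len(pd_t) * len(pd_s))


# Hypothetical portfolio: obligor 1 leaves, obligor 4 joins, obligor 3 migrates.
pd_t = {1: 0.01, 2: 0.02, 3: 0.02}
pd_s = {2: 0.02, 3: 0.01, 4: 0.05}
cov = default_rate_covariance(pd_t, pd_s, w=0.75)
```

Only the two persisting obligors contribute to the covariance in this example, and two reference dates without common obligors yield a covariance of zero regardless of the overlap.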

2.6. Variance of the Long-Run Default Rate

Based on the considerations for estimating the covariance of default rates in overlapping periods, we now discuss the variance of the long-run default rate. For the random variable

X_{t}

with

t = 1, \dots, N

, we define

E (X_{t}) = : μ_{t}

and

σ (X_{t}) = : σ_{t}

. For the long-run default rate Z, which is approximately normally distributed with

Z \sim N (μ, σ^{2})

, we obtain

μ = \frac{1}{R (N)} \cdot \sum_{t = 1}^{N} μ_{t} = \frac{1}{R (N)} \cdot \sum_{t = 1}^{N} \frac{1}{n_{t}} \sum_{j \in Λ_{t}} p_{t, j}

and

\begin{matrix} R {(N)}^{2} \cdot σ^{2} & = \sum_{t = 1}^{N} σ_{t}^{2} + 2 \sum_{t = 1}^{N - 1} \sum_{s = t + 1}^{N} Cov (X_{t}, X_{s}) \\ & = \sum_{t = 1}^{N} σ_{t}^{2} + 2 \sum_{i = 1}^{q - 1} \sum_{t = 1}^{N - i} Cov (X_{t}, X_{t + i}), \end{matrix}

(7)

where the second term contains the covariances caused by overlapping time periods. Using Equation (6), we end up with

\begin{matrix} R {(N)}^{2} \cdot σ^{2} = & \sum_{t = 1}^{N} \frac{1}{n_{t}^{2}} \sum_{j \in Λ_{t}} p_{t, j} (1 - p_{t, j}) \\ + \sum_{i = 1}^{q - 1} \frac{2 (q - i)}{q} \sum_{t = 1}^{N - i} \frac{1}{n_{t} \cdot n_{t + i}} \sum_{j \in Λ_{t} \cap Λ_{t + i}} p_{t + i, j} (1 - p_{t, j}) . \end{matrix}

(8)

For the final calibration test, it will be crucial to estimate the variance of the long-run default rate by a term that solely depends on the expected value

μ

, the number of debtors per reference date, and debtor-independent variables, see Section 3.2.
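To get a feeling for the size of the covariance terms in Equation (8), the formula can be evaluated for a stylized portfolio. The following Python sketch is an illustration under assumptions: the function name `long_run_variance`, the dictionary-based input format, and the example (100 persisting obligors with a hypothetical PD of 2% at five quarterly reference dates) are not taken from the paper, and the individual PDs are treated as known inputs.

```python
def long_run_variance(pds: list[dict[int, float]], q: int) -> float:
    """Variance sigma^2 of the long-run default rate according to Equation (8).

    pds[t] maps the obligors at reference date RD_{t+1} (0-based index) to
    their one-year PDs; q is the number of reference dates per year.
    """
    n = [len(d) for d in pds]
    r_n = sum(1 for k in n if k > 0)  # R(N)
    if r_n == 0:
        raise ValueError("n_max must be non-zero")
    total = 0.0
    # First term: variances of the one-year default rates.
    for t, d in enumerate(pds):
        if n[t] > 0:
            total += sum(p * (1 - p) for p in d.values()) / n[t] ** 2
    # Second term: covariances caused by overlapping time windows.
    for i in range(1, q):
        weight = 2 * (q - i) / q
        for t in range(len(pds) - i):
            if n[t] == 0 or n[t + i] == 0:
                continue
            common = pds[t].keys() & pds[t + i].keys()
            s = sum(pds[t + i][j] * (1 - pds[t][j]) for j in common)
            total += weight * s / (n[t] * n[t + i])
    return total / r_n ** 2


portfolio = [{j: 0.02 for j in range(100)} for _ in range(5)]
var_overlap = long_run_variance(portfolio, q=4)  # with covariance terms
var_naive = long_run_variance(portfolio, q=1)    # covariance terms switched off
```

In this stylized example, the variance including the covariance terms is three times as large as the variance obtained by ignoring the overlap, which is consistent with the drastic influence of the correlation effects described above.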

3. Hypothesis Test for Long-Term Calibration

The aim of this section is to formulate a statistical test for comparing the realized long-run default rate with the estimated long-run default rate.

3.1. Statistical Test per Rating Grade

We start with the simplest case and consider a portfolio consisting of only one rating grade with an associated probability of default, which we call

{PD}_{grade}

. That is, all considered obligors have the same unknown probability of default, which was estimated by

{PD}_{grade}

. We recall that the measured long-run default rate, denoted by

LRDR

, is the realization of a random variable Z for which approximately

Z \sim N (μ, σ^{2})

holds. We consider the hypothesis test, described in Section 2.2, with

\begin{matrix} null hypothesis H_{0} : μ = {PD}_{grade} and \\ alternative hypothesis H_{1} : μ \neq {PD}_{grade} . \end{matrix}

From Equation (8), we obtain

\begin{matrix} R {(N)}^{2} \cdot σ^{2} & = \sum_{t = 1}^{N} {σ_{t}}^{2} + \sum_{i = 1}^{q - 1} \frac{2 (q - i)}{q} \sum_{t = 1}^{N - i} \frac{1}{n_{t} \cdot n_{t + i}} \sum_{j \in Λ_{t} \cap Λ_{t + i}} μ (1 - μ) \\ = \sum_{t = 1}^{N} {σ_{t}}^{2} + μ (1 - μ) \sum_{i = 1}^{q - 1} \frac{2 (q - i)}{q} \sum_{t = 1}^{N - i} \frac{k_{t, t + i}}{n_{t} \cdot n_{t + i}} \\ = \sum_{t = 1}^{N} {σ_{t}}^{2} + μ (1 - μ) \sum_{i = 1}^{q - 1} λ_{i}, \end{matrix}

where

λ_{i} : = \frac{2 (q - i)}{q} \sum_{t = 1}^{N - i} \frac{k_{t, t + i}}{n_{t} \cdot n_{t + i}} .

Moreover,

\sum_{t = 1}^{N} {σ_{t}}^{2} = \sum_{t \in R_{N}} \frac{1}{n_{t}} \cdot μ (1 - μ) = μ (1 - μ) \sum_{t \in R_{N}} \frac{1}{n_{t}},

which implies that

σ^{2} = \frac{μ (1 - μ)}{R {(N)}^{2}} (\sum_{t \in R_{N}} \frac{1}{n_{t}} + \sum_{i = 1}^{q - 1} λ_{i}) .

Therefore, assuming the validity of the null hypothesis,

σ

is known. In order to define the limits of the acceptance range for a significance level

α \in (0, 1)

, we choose

k, K \in [0, 1]

in such a way that

E (φ (Z)) |_{H_{0}} \leq α,

where

φ

is given by (3). To that end, let

k = {PD}_{grade} + Φ^{- 1} (\frac{1}{2} α) \cdot \sqrt{\frac{{PD}_{grade} (1 - {PD}_{grade})}{R {(N)}^{2}} (\sum_{t \in R_{N}} \frac{1}{n_{t}} + \sum_{i = 1}^{q - 1} λ_{i})}

and

K = {PD}_{grade} + Φ^{- 1} (1 - \frac{1}{2} α) \cdot \sqrt{\frac{{PD}_{grade} (1 - {PD}_{grade})}{R {(N)}^{2}} (\sum_{t \in R_{N}} \frac{1}{n_{t}} + \sum_{i = 1}^{q - 1} λ_{i})} .

Then, the calibration test passes if

LRDR \in [k, K] .

To see the influence of persisting customers on the acceptance range and the density function of the long-run default rate, we refer to Section 4.1 and Section 4.2.
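The grade-level test can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `grade_acceptance_range`, the list `n` of customer counts, and the dictionary `persisting` (keyed by 0-based pairs of reference-date indices, missing pairs counting as zero) are hypothetical choices made here. All entries of `n` are assumed positive, so that R(N) = N, and the bounds are not clipped to [0, 1].

```python
from math import sqrt
from statistics import NormalDist


def grade_acceptance_range(pd_grade, n, persisting, q, alpha):
    """Return the bounds (k, K) of the acceptance range for one rating grade.

    n          : list with n[t] > 0 customers at reference date t
    persisting : dict mapping (t, t + i) -> k_{t,t+i}
    """
    N = len(n)
    inv_n_sum = sum(1 / n_t for n_t in n)
    lambda_sum = 0.0
    for i in range(1, q):
        inner = sum(persisting.get((t, t + i), 0) / (n[t] * n[t + i])
                    for t in range(N - i))
        lambda_sum += 2 * (q - i) / q * inner
    # Standard deviation of the long-run default rate under the null hypothesis.
    sigma = sqrt(pd_grade * (1 - pd_grade) * (inv_n_sum + lambda_sum)) / N
    # By symmetry of the normal distribution, the lower quantile is -z.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return pd_grade - z * sigma, pd_grade + z * sigma


# Usage with illustrative inputs: 60 reference dates, 50 customers, all persisting.
n = [50] * 60
all_persist = {(t, t + i): 50 for i in range(1, 4) for t in range(60 - i)}
lower, upper = grade_acceptance_range(0.02, n, all_persist, q=4, alpha=0.05)
```

The calibration test then passes if the realized long-run default rate lies between `lower` and `upper`.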

3.2. Statistical Test on Portfolio Level

We now consider a portfolio of heterogeneous obligors, i.e., each obligor has its own individual probability of default. For this purpose, we denote the estimated PD of customer j on reference date

{RD}_{t}

by

{\hat{p}}_{t, j}

, the mean PD based on the rating model examined on the reference date

{RD}_{t}

by

{\hat{PD}}_{t}

for

t = 1, \dots, N

, and the long-run central tendency by

LRCT = \frac{1}{N} \cdot \sum_{t = 1}^{N} {\hat{PD}}_{t} = \frac{1}{N} \cdot \sum_{t = 1}^{N} \frac{1}{n_{t}} \sum_{j \in Λ_{t}} {\hat{p}}_{t, j} .

At the level of individual rating grades, it is not unusual for the number of customers at a reference date

{RD}_{t}

to be zero, i.e.,

n_{t} = 0

. On a portfolio level, however, this rarely happens in practice. In this section, we therefore restrict our attention to the case where

n_{t} > 0

for all

t = 1, \dots, N

and therefore

R (N) = N

.

We consider the hypothesis test from Section 2.2 with

\begin{matrix} null hypothesis H_{0} : μ = LRCT and \\ alternative hypothesis H_{1} : μ \neq LRCT . \end{matrix}

As in the case of individual rating grades, the test statistic depends on

μ

and

σ

but, in this case, Equation (8) can no longer be reduced to a term depending only on

μ

. More precisely, if the correctness of the null hypothesis is assumed, the distribution parameters are only partially determined. If, however, we replace

σ

by an expression

σ_{min} (μ)

with

σ \geq σ_{min} (μ)

, the null hypothesis implies a concrete distribution for the underlying test statistic. This results in a narrower acceptance range and therefore a greater likelihood of committing a type I error. At the same time, the probability of a type II error is reduced, comparable to raising the significance level.

The key challenge is to derive an analytic expression for

σ_{min}

that depends only on PD-independent model parameters, such as N, M,

n_{t}

, etc., and satisfies

σ_{min} (μ) \leq σ

under the null hypothesis for all possible combinations of

p_{t, j} \in [{PD}_{min}, {PD}_{max}]

. Since we aim to minimize the distance between

σ_{min} (μ)

and

σ

, it is sensible to define

σ_{min}

via the solution to a minimization problem with cost functional given by the right-hand side of (8) under the side condition

μ = \frac{1}{N} \cdot \sum_{t = 1}^{N} \frac{1}{n_{t}} \sum_{j \in Λ_{t}} p_{t, j} and p_{t, j} \in [{PD}_{min}, {PD}_{max}] .

(9)

Observe that the right-hand side of (8) is, in general, neither convex nor concave with respect to the

p_{t, j}

. This, together with the high dimensionality of the problem, means the numerical or analytical computation of solutions is rather involved. In order to circumvent this issue, we therefore replace the right-hand side of (8) by a suitable linear cost functional, which turns the minimization into a linear program that can be solved analytically, up to a sort algorithm, and therefore also numerically with great efficiency.

First, note that mixed terms, i.e., products of PDs of the same customer at different reference dates, always appear in the sums

\sum_{j \in Λ_{t} \cap Λ_{t + i}} p_{t + i, j} (1 - p_{t, j})

caused by the covariances. Hence, in Formula (8), we want to substitute the subtrahend

p_{t, j}

by an affine linear function of the multiplier

p_{t + i, j}

in a conservative way. On a portfolio level, it is appropriate to assume that an obligor’s PD remains constant on average over a one-year time horizon, i.e., we assume

\sum_{j \in Λ_{t} \cap Λ_{t + i}} p_{t + i, j} (1 - p_{t, j}) \approx \sum_{j \in Λ_{t} \cap Λ_{t + i}} p_{t + i, j} (1 - p_{t + i, j}) .

(10)

For most portfolios, this is not only a plausible but also a fairly conservative assumption regarding the portfolio variance since, for a given vector

(x_{1}, \dots, x_{n})

with

0 < x_{i} < 1

for

i = 1, \dots, n

, the sum

\sum_{i = 1}^{n} x_{i} (1 - y_{i})

is minimized for

x_{i} = y_{i}

, under the assumption that, for each

i = 1, \dots, n

, there exists some

j = 1, \dots, n

with

x_{i} = y_{j}

. Certainly, even more conservative assumptions can be made at this point with

{PD}_{max}

being the most conservative subtrahend possible. The impact of such a choice is discussed in Section 4.6, where we indicate that the acceptance ranges hardly change when using

{PD}_{max}

instead of

p_{t + i, j}

. We point out that the subsequent discussion applies also to this choice without limitations.
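The minimality claim behind assumption (10) is an instance of the rearrangement inequality: since the sum of the x_i is fixed, minimizing the sum of x_i (1 - y_i) over all reorderings y of x amounts to maximizing the sum of x_i y_i, which happens when both vectors are ordered alike. The brute-force check below is a small illustration with made-up PD values; the names `mixed_sum`, `x`, and `best` are hypothetical.

```python
from itertools import permutations


def mixed_sum(x, y):
    """Sum of x_i * (1 - y_i), the structure of the covariance terms in (8)."""
    return sum(x_i * (1 - y_i) for x_i, y_i in zip(x, y))


# Illustrative PDs of four persisting customers (assumed values).
x = [0.002, 0.01, 0.05, 0.2]

# Try every assignment y that is a reordering of x and keep the smallest sum.
values = {perm: mixed_sum(x, perm) for perm in permutations(x)}
best = min(values, key=values.get)
```

For pairwise distinct entries, the minimizing reordering `best` is `x` itself, which is why keeping each obligor's PD unchanged over the year gives the smallest covariance contribution among all reassignments of the same PDs.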

From Equation (8), we obtain

\begin{matrix} N^{2} \cdot σ^{2} & = \sum_{t = 1}^{N} \frac{1}{n_{t}^{2}} \sum_{j \in Λ_{t}} p_{t, j} (1 - p_{t, j}) \\ + \sum_{i = 1}^{q - 1} \frac{2 (q - i)}{q} \sum_{t = 1}^{N - i} \frac{1}{n_{t} \cdot n_{t + i}} \sum_{j \in Λ_{t} \cap Λ_{t + i}} p_{t + i, j} (1 - p_{t + i, j}) \\ = \sum_{t = 1}^{N} \frac{1}{n_{t}^{2}} \sum_{j \in Λ_{t}} g_{0} (p_{t, j}) + 2 \sum_{i = 1}^{q - 1} \sum_{t = 1}^{N - i} \frac{1}{n_{t} \cdot n_{t + i}} \sum_{j \in Λ_{t} \cap Λ_{t + i}} g_{i} (p_{t + i, j}) \end{matrix}

with

g_{i} (x) : = \frac{q - i}{q} \cdot x (1 - x) for x \in (0, 1) and i = 0, \dots, q - 1 .

Now, we replace each of the functions

g_{i}

by an affine linear function

f_{i}

such that the functions

g_{i}

and

f_{i}

coincide on

{PD}_{max}

and

{PD}_{min}

. Since

g_{i}

is concave and

f_{i}

is affine linear, this implies that

g_{i} (x) \geq f_{i} (x)

for all

x \in [{PD}_{min}, {PD}_{max}]

. To be precise, the functions

f_{i}

are given by

f_{i} (x) : = α_{i} \cdot x + c_{i} for all x \in (0, 1)

with

\begin{matrix} α_{i} : = \frac{q - i}{q} (1 - {PD}_{max} - {PD}_{min}) and c_{i} = \frac{q - i}{q} {PD}_{min} {PD}_{max} \end{matrix}

for all

i = 0, \dots, q - 1

. We therefore end up with the estimate

\begin{matrix} N^{2} \cdot σ^{2} & = \sum_{t = 1}^{N} \frac{1}{n_{t}^{2}} \sum_{j \in Λ_{t}} g_{0} (p_{t, j}) + 2 \sum_{i = 1}^{q - 1} \sum_{t = 1}^{N - i} \frac{1}{n_{t} \cdot n_{t + i}} \sum_{j \in Λ_{t} \cap Λ_{t + i}} g_{i} (p_{t + i, j}) \\ \geq \sum_{t = 1}^{N} \frac{1}{n_{t}^{2}} \sum_{j \in Λ_{t}} f_{0} (p_{t, j}) + 2 \sum_{i = 1}^{q - 1} \sum_{t = 1}^{N - i} \frac{1}{n_{t} \cdot n_{t + i}} \sum_{j \in Λ_{t} \cap Λ_{t + i}} f_{i} (p_{t + i, j}) \\ = \sum_{t = 1}^{N} \frac{1}{n_{t}^{2}} \sum_{j \in Λ_{t}} (α_{0} \cdot p_{t, j} + c_{0}) + 2 \sum_{i = 1}^{q - 1} \sum_{t = 1}^{N - i} \frac{1}{n_{t} \cdot n_{t + i}} \sum_{j \in Λ_{t} \cap Λ_{t + i}} (α_{i} \cdot p_{t + i, j} + c_{i}) \\ = \sum_{t = 1}^{N} \sum_{j \in Λ_{t}} α_{t, j} p_{t, j} + C, \end{matrix}

(11)

where in the last step, we have isolated all terms that do not depend on

p_{t, j}

in the constant C by using

k_{t, t + i} = |Λ_{t} \cap Λ_{t + i}|

, i.e.,

\begin{matrix} C = c_{0} \cdot \sum_{t = 1}^{N} \frac{1}{n_{t}} + 2 \cdot \sum_{i = 1}^{q - 1} c_{i} \cdot \sum_{t = 1}^{N - i} \frac{k_{t, t + i}}{n_{t} \cdot n_{t + i}} \end{matrix}

and

\begin{matrix} α_{t, j} = \frac{1}{n_{t}^{2}} α_{0} 1_{Λ_{t}} (j) + \frac{2}{n_{t}} \cdot \sum_{i = 1}^{q - 1} \frac{α_{i}}{n_{t - i}} 1_{Λ_{t - i} \cap Λ_{t}} (j) \end{matrix}

with

Λ_{t - i} : = \emptyset

if

t - i < 1

for

t = 1, \dots, N

and

j = 1, \dots, M

. We have therefore reduced the original problem to

minimize \sum_{t = 1}^{N} \sum_{j \in Λ_{t}} α_{t, j} p_{t, j} under the side condition (9) .

(12)

As shown in Appendix A, the solution

p (μ)

to the minimization problem (12) can be easily computed analytically and numerically. We denote the minimal value by

m (μ)

, and define

σ_{min}^{2} (μ) = \frac{m (μ) + C}{N^{2}}

.
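The chord construction behind the estimate (11) can be checked numerically. For the coefficients given above, one finds g_i(x) - f_i(x) = (q - i)/q · (x - PD_min)(PD_max - x), which is non-negative on the interval between PD_min and PD_max and vanishes at both endpoints. The Python sketch below verifies this on a grid; the helper names `g` and `chord` are hypothetical, and the numerical values of q, PD_min, and PD_max are borrowed from the synthetic example in Section 4.6 purely for illustration.

```python
def g(x, i, q):
    """g_i(x) = (q - i) / q * x * (1 - x)."""
    return (q - i) / q * x * (1 - x)


def chord(i, q, pd_min, pd_max):
    """Return (alpha_i, c_i) of the affine function f_i(x) = alpha_i * x + c_i."""
    scale = (q - i) / q
    return scale * (1 - pd_max - pd_min), scale * pd_min * pd_max


q, pd_min, pd_max = 4, 0.0003, 0.2   # illustrative values
gaps = []
for i in range(q):
    a_i, c_i = chord(i, q, pd_min, pd_max)
    for step in range(101):
        x = pd_min + (pd_max - pd_min) * step / 100
        gaps.append(g(x, i, q) - (a_i * x + c_i))

# Up to floating-point rounding, no grid point lies below the chord.
smallest_gap = min(gaps)
```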

For the modified random variable

Z^{*} \sim N (μ, σ_{min}^{2} (μ))

, we are now able to define the limits

k, K \in [0, 1]

of the acceptance range for a significance level

α \in (0, 1)

in such a way that

E (φ (Z^{*})) |_{H_{0}} \leq α .

For this, we define

k : = LRCT + Φ^{- 1} (\frac{1}{2} α) \cdot σ_{min} (LRCT)

and

K : = LRCT + Φ^{- 1} (1 - \frac{1}{2} α) \cdot σ_{min} (LRCT) .

The calibration test passes if

LRDR \in [k, K] .

4. Discussion and Further Considerations

4.1. Effect of Persisting Customers on the Variance of Z

In this section, we aim to highlight the effects caused by persisting customers on the distribution of Z. We calculate the density function of the long-run default rate using a sample portfolio in a rating grade with

PD = 0.02

and set

q = 4

. The number of reference dates is

N = R (N) = 32

, i.e., we consider a portfolio with a relatively short history of eight years. The number of customers is set to be constant over time with

n = 50

. We furthermore assume that the ratio of persisting customers per time interval is constant over time. We have

k_{t, t + 1} = 45

for all

t \leq 31

,

k_{t, t + 2} = 40

for all

t \leq 30

, and

k_{t, t + 3} = 35

for all

t \leq 29

. Only about

70 %

of the customers are still in the considered rating grade after three quarters, which represents a relatively high fluctuation.

These specifications determine the density function of the long-run default rate (orange). In comparison, one can see the density functions of the long-run default rate with a maximum and minimum number of persisting customers in gray and blue, respectively. We have

σ_{orange}^{2} \approx 4.13 \cdot 10^{- 5}, σ_{blue}^{2} \approx 1.23 \cdot 10^{- 5}, and σ_{gray}^{2} \approx 4.71 \cdot 10^{- 5} .

Figure 2 shows that correlation effects caused by overlapping time windows should not be neglected since, otherwise, the acceptance range would be far too narrow, see also Section 4.2 below.
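The three variances can be recomputed from the closed-form expression for a single rating grade derived in Section 3.1. The sketch below assumes a constant number n of customers and time-constant numbers of persisting customers, as in this example; the function name `grade_variance` and its argument layout are hypothetical.

```python
def grade_variance(pd, N, n, q, k):
    """Variance of the long-run default rate for one grade with n_t = n.

    k[i - 1] is the number k_{t,t+i} of persisting customers, assumed
    not to depend on t.
    """
    # Sum of lambda_i with constant k_{t,t+i} and constant n.
    lam = sum(2 * (q - i) / q * (N - i) * k[i - 1] / n**2 for i in range(1, q))
    # The sum of 1 / n_t over all N reference dates equals N / n.
    return pd * (1 - pd) * (N / n + lam) / N**2


var_orange = grade_variance(0.02, 32, 50, 4, [45, 40, 35])  # example portfolio
var_blue = grade_variance(0.02, 32, 50, 4, [0, 0, 0])       # no persisting customers
var_gray = grade_variance(0.02, 32, 50, 4, [50, 50, 50])    # all customers persist
```

Up to rounding, these evaluate to the three values quoted above.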

4.2. Effect of Persisting Customers on Acceptance Range

We examine the influence of the number of persisting customers on the width of the acceptance range. Again, we choose

q = 4

and, for the sake of simplicity, we consider the case at the level of the individual grades choosing

{PD}_{grade} = 0.02

. The proportion of persisting customers is usually lower at the level of individual grades than in the overall portfolio, since a rating migration automatically means that the customer in question is no longer in the sample for the individual grade but is still in the overall portfolio.

The number of reference dates is

N = R (N) = 60

and the number of customers is constant, i.e.,

n_{t} : = n = 50

for

t = 1, \dots, N

. The extreme cases regarding persisting customers now represent the two scenarios in which, on the one hand, all customers remain the same over the entire history and, on the other hand, none of the customers in a specific quarter occurs in the previous quarter or in the following quarter.

The first case implies

k_{t, t + 1} = n

for

t \leq N - 1

,

k_{t, t + 2} = n

for

t \leq N - 2

and

k_{t, t + 3} = n

for

t \leq N - 3

. Thus, for two one-year default rates

X_{t}

and

X_{s}

where

1 \leq t < s \leq N

, we have

Cov (X_{t}, X_{s}) = \frac{1}{n} \cdot w_{t, s} (1 - {PD}_{grade}) \cdot {PD}_{grade} .

For the second case, we have

k_{t, t + 1} = 0

for

t \leq N - 1

,

k_{t, t + 2} = 0

for

t \leq N - 2

and

k_{t, t + 3} = 0

for

t \leq N - 3

, implying

Cov (X_{t}, X_{s}) = 0

for two one-year default rates

X_{t}

and

X_{s}

. Since the covariance between default rates is always 0 in this case, the distribution of the long-run default rate is the same as if correlation effects due to overlapping time windows were ignored completely. Note that such a scenario is extremely unlikely, especially for a large number of ratings, and is therefore purely hypothetical since customers would constantly switch rating grades, which would imply a high level of instability in the rating system.

In Figure 3, we can see that, for a given level

α

, the acceptance range for a portfolio consisting only of persisting customers is almost twice as wide as for a portfolio with no persisting customers with respect to a three-month time horizon. A similar picture emerges when comparing the density functions of the long-run default rates, cf. Figure 4.
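The factor of almost two can be recovered by hand from the variance formula of Section 3.1. Since both acceptance ranges use the same quantiles and the same value of PD_grade, the ratio of their widths equals the ratio of the standard deviations. The following worked calculation uses the setting of this section (N = 60, n = 50, q = 4) and writes "all" and "none" for the two extreme scenarios; these labels are introduced here for illustration only.

```latex
\sum_{t = 1}^{N} \frac{1}{n_{t}} = \frac{60}{50} = 1.2, \qquad
λ_{1} = \frac{2 \cdot 3}{4} \cdot \frac{59}{50} = 1.77, \quad
λ_{2} = \frac{2 \cdot 2}{4} \cdot \frac{58}{50} = 1.16, \quad
λ_{3} = \frac{2 \cdot 1}{4} \cdot \frac{57}{50} = 0.57,

\frac{K_{all} - k_{all}}{K_{none} - k_{none}}
  = \sqrt{\frac{1.2 + (1.77 + 1.16 + 0.57)}{1.2}}
  = \sqrt{\frac{4.7}{1.2}} \approx 1.98 .
```

Here, each summand of λ_i equals k_{t, t + i} / (n_t · n_{t + i}) = 50 / 2500 = 1 / 50 in the scenario where all customers persist, and all λ_i vanish in the scenario without persisting customers.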

4.3. Some Thoughts on the Rate of Convergence

In view of Section 2.3, we briefly discuss the rate of convergence and mention a few particularities that need to be taken into account concerning the asymptotic behavior of the long-run default rate Z. When using the classical central limit theorem, it is common to specify a minimum number of random variables ensuring a reasonable approximation of the normal distribution. In our case, additional factors need to be taken into account since, for example, a large number of customers, i.e., a large number M in

Z = \frac{1}{N} \sum_{t = 1}^{N} \frac{1}{n_{t}} \sum_{j = 1}^{M} y_{t, j} = \sum_{j = 1}^{M} Y_{j},

does not directly lead to a good approximation. This stems from the calculation logic of the long-run default rate. For a simple illustration, we assume that all customers have the same PD with

PD = 0.01

. For

N = 1

and

M = 1000

or, analogously,

N = 1000

and

n_{t} = 1

for

t = 1, \dots, 1000

, the normal approximation is considerably better than for

N = 2

with

n_{1} = 1

and

n_{2} = 999

since, in the first case,

P (Z \geq 0.5) \approx 0

while, in the second case,

P (Z \geq 0.5) > 0.01 .

Thus, the rate of convergence depends on the number of customers, the number of reference dates, and the number of customers per reference date. While the previously described scenarios are uncommon at portfolio level, at the level of individual rating grades, constellations can arise in which, on certain reference dates, there is only one customer in the corresponding rating grade.
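The two tail probabilities in this illustration can be computed exactly and compared with the normal approximation. The Python sketch below assumes independent defaults and non-overlapping reference dates; the helper `binom_tail` and the variable names are hypothetical. Probabilities below the smallest representable floating-point number are returned as zero.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist

PD = 0.01


def binom_tail(n, p, k_min):
    """P(B >= k_min) for B ~ Binomial(n, p), summed in log space to avoid overflow."""
    total = 0.0
    for k in range(k_min, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                   + k * log(p) + (n - k) * log(1 - p))
        total += exp(log_pmf)
    return total


# Case 1: N = 1 with 1000 customers, so Z >= 0.5 requires at least 500 defaults.
p_case1 = binom_tail(1000, PD, 500)

# Case 2: N = 2 with n_1 = 1 and n_2 = 999, so Z = (X_1 + X_2) / 2.
# Z >= 0.5 holds if the single customer defaults, or if all 999 others default.
p_case2 = PD + (1 - PD) * binom_tail(999, PD, 999)

# Normal approximation of the same tail probability in case 2.
sigma = sqrt(PD * (1 - PD) * (1 / 1 + 1 / 999)) / 2
p_normal = 1 - NormalDist(PD, sigma).cdf(0.5)
```

In the second case, the exact tail probability is dominated by the default of the single customer on the first reference date, whereas the normal approximation assigns practically no mass to that region.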

In the sequel, we elaborate more on this particularity and aim to provide a rule of thumb, i.e., conditions for N,

n_{max}

, and

n_{min}

that imply a satisfactory normal approximation. To that end, we consider a synthetic portfolio. For the sake of simplicity, we assume that

q = 1

and simulate a convolution of weighted binomial distributions. We study the difference between the distribution functions of the long-run default rate on the level of a rating grade with

PD = 0.01

when using a simulation on the one hand and using the normal approximation on the other hand in different scenarios. We assume the number of reference dates is

N = 56

. The number of customers

n_{t}

per reference date

{RD}_{t}

and thus the number of all customers M in the portfolio varies between the considered scenarios.

In Figure 5, there are eight reference dates where only one to five customers are in the portfolio. On the other reference dates, we assume the number of customers to be constant with

n_{t} = 100

. We observe large differences in the two distribution functions, especially in the tails, which are particularly relevant for the test. If we now increase the number of customers on the critical reference dates (those with

n_{t} \leq 5

) to 10, we observe, in Figure 6, that the distribution functions are almost the same. Hence, the poor convergence in Figure 5 is caused by the few reference dates with very few customers. Furthermore, we see that, in this special case,

\frac{n_{min}}{n_{max}} \geq \frac{1}{10}

seems to be an appropriate bound in order to obtain a satisfactory approximation.

For a second comparison, we now reduce the number of customers significantly to

1 \leq n_{t} \leq 10

. Again, the number of customers equals one only on the first eight reference dates. In Figure 7, we still see major differences in the distribution functions. Doubling the number of customers leads to Figure 8, where we already obtain useful results.

On the other hand, if the number of customers is very low, e.g.,

n_{t} = 2

for half of all reference dates, we still obtain a good approximation, cf. Figure 9.

From our examples and statements from the literature, we conclude that

N \geq 30

,

n_{min} \geq 2

, and

\frac{n_{min}}{n_{max}} \geq \frac{1}{10}

seem to be useful conditions to guarantee a satisfying normal approximation.
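The rule of thumb is easy to check before running the test. The helper below is a hypothetical convenience function, not part of the original method; it takes the list of customer counts per reference date and, following the definition of n_min in the list of symbols, ignores reference dates without customers when determining n_min.

```python
def satisfies_rule_of_thumb(n):
    """Check N >= 30, n_min >= 2 and n_min / n_max >= 1/10 for counts n[t]."""
    positive = [n_t for n_t in n if n_t > 0]
    if not positive:
        return False
    n_min, n_max = min(positive), max(positive)
    # Integer comparison avoids rounding issues in the ratio n_min / n_max.
    return len(n) >= 30 and n_min >= 2 and 10 * n_min >= n_max


# Illustrative portfolios with 56 reference dates (assumed counts):
sparse = [1, 2, 3, 4, 5, 1, 2, 3] + [100] * 48   # very few customers on eight dates
repaired = [10] * 8 + [100] * 48                 # at least ten customers everywhere
```

With these inputs, `satisfies_rule_of_thumb(sparse)` is false and `satisfies_rule_of_thumb(repaired)` is true, in line with the qualitative difference between the two scenarios described above.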

4.4. An Alternative Way to Bound the Variance

In Appendix B, we propose a different way to estimate the variance in an even more conservative way. If one accepts the associated stricter test as a conservative check of the calibration, one can use

σ_{alt}^{2} (μ) : = \frac{1}{N} (C + K_{1}) μ - \frac{K_{2}}{N^{2}}

as the variance of the test statistic with parameters

K_{1}

and

K_{2}

depending only on the underlying portfolio, see Appendix B for the details.

In this case, we do not have to solve an optimization problem, and can determine the variance directly using the portfolio-specific parameters. However, we additionally need conservative estimates for two parameters that are not directly determined by the null hypothesis. Again, we refer to Appendix B for the details. We briefly discuss the consequences of this alternative method at this point.

The determination of

σ_{alt}^{2} (μ)

is based on a successive estimation of the variance of the test statistic against the maximum conservative value in each single step. This procedure can therefore be understood as a maximally conservative solution to the test problem. In the case of a portfolio with approximately constant size, approximately constant distribution of default probabilities over time, and no other special features, the conservatism can often be tolerated. In case of build-up or run-down portfolios or portfolios with a focus on high investment grade obligors, the risk of a type I error may become disproportionately large and the test may therefore not be suitable. This can be easily illustrated by the following two points. The estimation of quadratic terms on the left-hand side of (A6) is performed using

{PD}_{max}

regardless of how many obligors belong to the associated rating grade. For portfolios with mainly obligors with good ratings, this estimate is certainly too conservative. Moreover, the parameters

K_{1}

and

K_{2}

depend on

min \{\frac{1}{n_{t}} | t = 1, \dots, N and n_{t} > 0\} and min \{k_{t, t + i} | t = 1, \dots, N - i\},

i.e., for portfolios where the number of customers or the number of persisting customers changes strongly over time, a lot of information is lost with this method. Thus, the estimate on the variance of the test statistic in Section 3.2 is much less conservative and suitable for significantly more portfolios.

4.5. Additional Conditions on the Rating Distribution

In this section, we give an outlook and offer suggestions as to which requirements can be placed on rating distributions. The implementation of some suggestions is briefly outlined. The rating distribution of

p (μ)

obtained by minimizing (12) tends to be U-shaped, cf. Figure 10, with all but at most one rating in the best or the worst rating grade. A U-shaped rating distribution, i.e., a distribution concentrated almost exclusively on the two extreme grades, is highly unusual, both in the case of a large number of ratings and in a portfolio that hardly has any customers in the lower rating grade.

Moreover, high values of

{PD}_{max}

lead to a narrowing of the acceptance range. Depending on the portfolio, this reduction can be disproportionately strong if, for example, the bad rating grades in low-default portfolios are heavily underrepresented. Hence, the question may be raised why the rating distribution, estimated by the rating model, is not used directly. As already mentioned, this would, however, require assumptions that go far beyond the null hypothesis, and specifying the rating distribution in that way would automatically determine the null hypothesis. However, it may be sensible to place additional yet conservative conditions on the rating distribution in (9).

For example, one could demand that ratings exhibit a conservative distribution, given by a function

φ_{1} : \{{PD}_{m - n + 1}, \dots, {PD}_{m}\} \to [0, 1]

across the n worst rating grades with

P (p_{t, j} = {PD}_{k} | p_{t, j} \geq {PD}_{m - n + 1}) = φ_{1} ({PD}_{k}) for k \geq m - n + 1 .

For most portfolios, a conservative choice of

φ_{1}

would be

φ_{1} ({PD}_{k}) = \frac{1}{n}

, cf. Figure 11.

Another approach would be to trust the distribution estimated by the rating model to some extent, and require that there is at least a fixed fraction

δ \in [0, 1]

of the number of ratings per grade given by this distribution with density

φ_{2} : \{{PD}_{1}, \dots, {PD}_{m}\} \to [0, 1]

. Then, in addition to (9), one might demand

n ({PD}_{k}) \geq n_{min} ({PD}_{k}) : = ⌊\sum_{t = 1}^{N} n_{t} \cdot φ_{2} ({PD}_{k}) \cdot δ⌋,

where

n ({PD}_{k})

is the number of the ratings in rating grade k, cf. Figure 12.

We conclude this section with a modified version of a previously described approach using a uniform distribution among the n worst rating grades. As before, we consider the n worst rating grades

\{{PD}_{m - n + 1}, \dots, {PD}_{m}\}

with

{PD}_{m - n + 1} > μ

. Our aim is to formulate the requirement of a uniform distribution in such a way that we can again use the method for minimizing the variance already presented. To that end, we define

\bar{PD} : = \frac{1}{n} \sum_{i = m - n + 1}^{m} {PD}_{i} .

We follow the idea from Section 3.2, and proceed exactly as in Appendix A with

\bar{PD}

instead of

{PD}_{max}

. The effect on the variance or, almost equivalently, the acceptance range is illustrated in Figure 13. On the one hand, this approach avoids unnecessarily high conservatism. On the other hand, it is still sufficiently conservative and simple to implement.

4.6. Impact of Simplification on the Acceptance Range

In this section, we discuss how different choices of affine linear functions for the simplification of the variance affect the acceptance range. We focus on the impact when choosing either the constant function

g = {PD}_{max}

or the identity function as in (10).

We start by computing the reduction in the acceptance range due to choosing the affine linear function as the constant

{PD}_{max}

using a synthetic portfolio. We choose the number of customers per reference date to be constant with

n = 1000

, let the number of reference dates be

N = 60

, and

q = 4

. We define

{PD}_{max} = 0.2

,

{PD}_{min} = 0.0003

, and set

μ = 0.01

. We recall that the components of the minimal solution

p (μ)

of (12) consist only of

{PD}_{max}

,

{PD}_{min}

, and

μ_{k_{0} + 1}

, where the latter is uniquely determined by the minimization problem.

We now choose

g = {PD}_{max}

, i.e., the most conservative function possible. Proceeding exactly as in Section 3.2 and in Appendix A, we simply obtain different values for

α_{i}

and

c_{i}

with

i = 0, \dots, q - 1

, namely,

\begin{matrix} α_{i} & = & \frac{q - i}{q} (1 - {PD}_{max}), \\ c_{i} & = & 0 . \end{matrix}

Estimating the variance as in Section 3.2, once with

g = {PD}_{max}

and once with

g = id

, we obtain

\frac{|K ({PD}_{max}) - k ({PD}_{max})|}{|K (id) - k (id)|} \approx 0.994,

(13)

where

K ({PD}_{max})

and

k ({PD}_{max})

and

K (id)

and

k (id)

denote the upper and lower bound of the acceptance range for

g = {PD}_{max}

and

g = id

, respectively. In this particular setting, we see that the choice of g does not have a substantial influence on the size of the acceptance range.

The difference in the size of the acceptance range is largely influenced by the contributions of the obligors with

PD = {PD}_{min}

. By Theorem A1, large values of

{PD}_{max}

lead to an increased number of pairs

(t, j)

with

p_{t, j} = {PD}_{min}

. Hence, the size of the quotient in (13) is reduced for very large values of

{PD}_{max}

and very small values of

μ

. For instance, by redefining

{PD}_{max} = 0.45

and

μ = 0.0006

we obtain

\frac{|K ({PD}_{max}) - k ({PD}_{max})|}{|K (id) - k (id)|} \approx 0.778 .

As a result, relevant differences in the acceptance ranges mostly occur in settings with very large values of

{PD}_{max}

and values of

μ

that are close to

{PD}_{min}

. In practice, the maximally conservative choice of

g = {PD}_{max}

is therefore not advisable since it assumes that, on each reference date, all persisting customers have been in the worst rating grade, regardless of their current rating. To conclude, it is recommended to choose

g = id

rather than

g = {PD}_{max}

.

5. Conclusions

We introduced a new statistical test for the long-term calibration in rating systems that can deal with correlation effects caused by overlapping time windows. Hypothesis tests that can deal with this type of correlation effect are necessary to implement regulatory requirements in the eurozone. While on the level of individual rating grades, the variance of the test statistic is already determined by the null hypothesis, on a portfolio level, we provided a conservative estimate for the variance by solving a suitable minimization problem.

In Section 2.4, we calculated the covariances under the assumption that the time of default is uniformly distributed over the observation period, which is a natural choice in the absence of additional information about the default behavior. Considering different distributions for the time of default is, in principle, possible but it increases the mathematical complexity disproportionately.

Moreover, assuming that the default behaviors of different customers are independent, we showed that the long-run default rate is approximately normally distributed. We are aware that, in practice, this assumption cannot always be justified, e.g., in sectors that heavily depend on economic cycles. Nevertheless, the assumption of independence ensures a smaller acceptance range and is therefore conservative. Abandoning the assumption of independence would require a completely different methodology, as the properties of the normal distribution are fundamental for the analysis performed in this paper.

All additional assumptions made in the course of the paper are of a conservative nature, in the sense that they ensure a lower estimate of the variance of the long-run default rate, e.g., in Formula (10). In Section 4.5, we discussed and showed that further assumptions on the rating distribution are also possible in specific scenarios. We also presented less conservative but, on a practical level, more plausible choices of rating distributions and discussed their effect on the acceptance range.

Author Contributions

Conceptualization, P.K., M.N. and J.S.; methodology, P.K., M.N. and J.S.; validation, P.K., M.N. and J.S.; formal analysis, P.K., M.N. and J.S.; investigation, P.K., M.N. and J.S.; resources, P.K., M.N. and J.S.; writing—original draft preparation, P.K. and J.S.; writing—review and editing, M.N.; visualization, P.K. and J.S.; supervision, M.N.; project administration, M.N.; funding acquisition, M.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)—SFB 1283/2 2021—317210226.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors thank Johannes Emmerling, Markus Klein, and Marco Ritter for helpful discussions related to this work. The first and the third author are grateful for the support of the Landesbank Baden-Württemberg related to this work.

Conflicts of Interest

The authors Patrick Kurth and Jan Streicher were employed by the company Landesbank Baden-Württemberg. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

List of Symbols and Abbreviations

$PD$	probability of default
LGD	loss given default
$B (1, p)$	Bernoulli distribution with probability p
${PD}_{k}$	default probability of rating grade k in the underlying master scale
${PD}_{min}$	equal to ${PD}_{1}$ , i.e., default probability of the best rating grade
${PD}_{max}$	equal to ${PD}_{m}$ , i.e., default probability of the worst rating grade
${RD}_{t}$	reference date number t
N	number of reference dates
$\mathbb{N}$	set of natural numbers
$n_{t}$	number of obligors on reference date ${RD}_{t}$
$\mathbb{N}_{0}$	set of natural numbers including zero
$n_{min}$	minimal number (larger than zero) of existing obligors at any reference date
$n_{max}$	maximal number of existing obligors at any reference date
$min A$	minimum of a set A
$max A$	maximum of a set A
q	number of reference dates within a one-year time horizon starting from an arbitrary reference date
$w_{t, s}$	size of the overlap of observation periods with reference dates ${RD}_{t}$ and ${RD}_{s}$
M	total number of all customers during the history
$Λ_{t}$	set of customers at reference date ${RD}_{t}$
$X_{t}$	one-year default rate at reference date ${RD}_{t}$
$x_{t}$	one-year default state of an unspecified customer at reference date ${RD}_{t}$
$x_{t, j}$	one-year default state of customer j at reference date ${RD}_{t}$
$p_{t, j}$	probability of default over a one-year time horizon of customer j at reference date ${RD}_{t}$
${DR}_{t}$	realized default rate on reference date ${RD}_{t}$
$R_{N}$	set of indices for reference dates, where the portfolio contains at least one customer
$R (N)$	cardinality of $R_{N}$
Z	long-run default rate
$LRDR$	realized long-run default rate
$LRCT$	estimated long-run default rate (long-run central tendency)
$μ$	expected value of the long-run default rate
$σ^{2}$	variance of the long-run default rate
$H_{0}$	null hypothesis
$H_{1}$	alternative hypothesis
$k, K$	lower and upper bound of the acceptance range of the hypothesis test
$Cov (X, Y)$	covariance of the random variables X and Y
$N (μ, σ^{2})$	Normal distribution with expected value $μ$ and variance $σ^{2}$
$E (X)$	expected value of a random variable X
$P$	probability measure
$R$	set of real numbers
$1_{A}$	indicator function of set A
$Φ$	cumulative distribution function for the standard normal distribution
$| I |$	length of an interval $I \subset R$ or cardinality of a finite set I
$k_{t, s}$	number of persisting customers with respect to reference dates ${RD}_{t}$ and ${RD}_{s}$
∅	empty set
$p (μ)$	solution of the minimization problem
$m (μ)$	minimal value of the minimization problem
$| x |$	absolute value of a real number x

Appendix A. Minimization Problem

Let $n \in \mathbb{N}$ and $α_{i} \geq 0$ for $i = 1, \dots, n$. In this section, we compute the minimum of the function

$$f : \mathbb{R}^{n} \to \mathbb{R}, \quad x = (x_{1}, \dots, x_{n}) \mapsto f (x) := \sum_{i = 1}^{n} α_{i} x_{i} \tag{A1}$$

under the side conditions

$$\sum_{i = 1}^{n} β_{i} x_{i} = m \quad \text{and} \quad c_{1} \leq x_{i} \leq c_{2} \quad \text{for } i = 1, \dots, n \tag{A2}$$

with given constants $β_{i} > 0$ for $i = 1, \dots, n$, $m > 0$, and $0 < c_{1} < c_{2}$. We sort the ratios ${(α_{i} / β_{i})}_{i = 1, \dots, n}$ in descending order and denote by $α_{(k)}$ and $β_{(k)}$ the numerator and denominator of the $k$-th largest of these ratios. In case of equal values among the ratios ${(α_{i} / β_{i})}_{i = 1, \dots, n}$, the one with the smaller index $i$ is listed first. Moreover, $x_{(k)}$ denotes the variable of $f$ with coefficient $α_{(k)}$. We define

$$k_{0} := \max \Big\{ k \,\Big|\, \sum_{j = 1}^{k} c_{1} \cdot β_{(j)} + \sum_{j = k + 1}^{n} c_{2} \cdot β_{(j)} \geq m \Big\} .$$

Theorem A1.

The cost function in (A1) is minimized under (A2) for

$$x_{(k)} = \begin{cases} c_{1}, & k \leq k_{0}, \\ μ_{k_{0} + 1}, & k = k_{0} + 1, \\ c_{2}, & k_{0} + 2 \leq k \leq n, \end{cases}$$

where the value $μ_{k_{0} + 1}$ is uniquely determined by (A2) through

$$μ_{k_{0} + 1} = \frac{m - \sum_{j = 1}^{k_{0}} c_{1} \cdot β_{(j)} - \sum_{j = k_{0} + 2}^{n} c_{2} \cdot β_{(j)}}{β_{(k_{0} + 1)}} .$$

Proof.

We show that $f (y) \geq f (x)$ holds for any $y \in \mathbb{R}^{n}$ with $y \neq x$ that fulfills (A2). By construction of $x$, there exists some $ε > 0$ such that either

$$\sum_{j = 1}^{k_{0}} y_{(j)} \cdot β_{(j)} = \sum_{j = 1}^{k_{0}} x_{(j)} \cdot β_{(j)} + ε \quad \text{and} \quad \sum_{j = k_{0} + 1}^{n} y_{(j)} \cdot β_{(j)} = \sum_{j = k_{0} + 1}^{n} x_{(j)} \cdot β_{(j)} - ε$$

or

$$\sum_{j = 1}^{k_{0} + 1} y_{(j)} \cdot β_{(j)} = \sum_{j = 1}^{k_{0} + 1} x_{(j)} \cdot β_{(j)} + ε \quad \text{and} \quad \sum_{j = k_{0} + 2}^{n} y_{(j)} \cdot β_{(j)} = \sum_{j = k_{0} + 2}^{n} x_{(j)} \cdot β_{(j)} - ε$$

holds. Without loss of generality, we consider the first case. Let $ε = \sum_{i = 1}^{k_{0}} ε_{i}$ such that

$$y_{(i)} = x_{(i)} + \frac{ε_{i}}{β_{(i)}} \quad \text{for all } i = 1, \dots, k_{0}$$

and $ε = \sum_{i = k_{0} + 1}^{n} ε_{i}$ such that

$$y_{(i)} = x_{(i)} - \frac{ε_{i}}{β_{(i)}} \quad \text{for all } i = k_{0} + 1, \dots, n .$$

Then,

$$\sum_{j = 1}^{k_{0}} y_{(j)} α_{(j)} = \sum_{j = 1}^{k_{0}} \Big(x_{(j)} + \frac{ε_{j}}{β_{(j)}}\Big) α_{(j)} \geq \sum_{j = 1}^{k_{0}} x_{(j)} α_{(j)} + \frac{α_{(k_{0})}}{β_{(k_{0})}} ε$$

and

$$\sum_{j = k_{0} + 1}^{n} y_{(j)} α_{(j)} = \sum_{j = k_{0} + 1}^{n} \Big(x_{(j)} - \frac{ε_{j}}{β_{(j)}}\Big) α_{(j)} \geq \sum_{j = k_{0} + 1}^{n} x_{(j)} α_{(j)} - \frac{α_{(k_{0} + 1)}}{β_{(k_{0} + 1)}} ε .$$

Hence,

$$\sum_{j = 1}^{n} y_{(j)} α_{(j)} \geq \sum_{j = 1}^{n} x_{(j)} α_{(j)} + \Big(\frac{α_{(k_{0})}}{β_{(k_{0})}} - \frac{α_{(k_{0} + 1)}}{β_{(k_{0} + 1)}}\Big) ε \geq \sum_{j = 1}^{n} x_{(j)} α_{(j)} .$$

For the second case, one proceeds in exactly the same way. □
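The construction in Theorem A1 can be read as a greedy procedure: start with every variable at $c_{2}$ and lower the variables to $c_{1}$ in order of descending ratio $α_{i} / β_{i}$ until the equality constraint in (A2) is met. The following is a minimal Python sketch of this reading, not the authors' implementation. It assumes that the problem is feasible, i.e., $c_{1} \sum_{i} β_{i} \leq m \leq c_{2} \sum_{i} β_{i}$, which is not stated explicitly above; the function name `minimize_linear` and its argument names are illustrative only.

```python
from typing import List, Sequence


def minimize_linear(alpha: Sequence[float], beta: Sequence[float],
                    m: float, c1: float, c2: float) -> List[float]:
    """Minimize sum(alpha[i] * x[i]) subject to sum(beta[i] * x[i]) == m
    and c1 <= x[i] <= c2, following the construction in Theorem A1."""
    n = len(alpha)
    total_beta = sum(beta)
    # Feasibility check (an assumption made explicit in this sketch).
    if not (c1 * total_beta <= m <= c2 * total_beta):
        raise ValueError("m cannot be attained with c1 <= x[i] <= c2")

    # Indices sorted by the ratio alpha/beta in descending order. Python's
    # sort is stable, so for equal ratios the smaller index comes first.
    order = sorted(range(n), key=lambda i: -alpha[i] / beta[i])

    # Start with all variables at c2; 'excess' is the amount by which
    # sum(beta[i] * x[i]) currently exceeds m.
    x = [c2] * n
    excess = c2 * total_beta - m
    for i in order:
        if excess <= 0.0:
            break
        max_drop = (c2 - c1) * beta[i]  # effect of lowering x[i] to c1
        if max_drop <= excess:
            x[i] = c1                   # positions k <= k_0
            excess -= max_drop
        else:
            x[i] = c2 - excess / beta[i]  # position k_0 + 1
            excess = 0.0
    return x
```

For instance, with `alpha = [3.0, 1.0, 2.0]`, `beta = [1.0, 1.0, 1.0]`, `m = 1.5`, `c1 = 0.25`, and `c2 = 0.75`, one has $k_{0} = 1$: the variable with the largest ratio is set to $c_{1}$, the next one takes the intermediate value $0.5$, and the last one stays at $c_{2}$.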

We now translate the previous theorem into the setting of Section 3.2. There, we aim to minimize the sum

$$\sum_{t = 1}^{N} \sum_{j \in Λ_{t}} α_{t, j} p_{t, j}$$

under the side condition

$$μ = \frac{1}{N} \cdot \sum_{t = 1}^{N} \frac{1}{n_{t}} \sum_{j \in Λ_{t}} p_{t, j} \quad \text{and} \quad p_{t, j} \in [{PD}_{min}, {PD}_{max}] .$$

We rewrite

$$μ = \sum_{t = 1}^{N} \sum_{j \in Λ_{t}} β_{t, j} p_{t, j}$$

with $β_{t, j} := \frac{1}{N} \cdot \frac{1}{n_{t}}$. We first sort the set of all possible tuples $(t, j)$ in ascending order, first with respect to $j$ and then with respect to $t$, i.e., we can identify any possible tuple $(t, j)$ with a natural number $i = 1, \dots, n := \sum_{t = 1}^{N} n_{t}$, which simplifies the cost function to

$$f : \mathbb{R}^{n} \to \mathbb{R}, \quad p = (p_{1}, \dots, p_{n}) \mapsto f (p) := \sum_{i = 1}^{n} α_{i} p_{i} \tag{A3}$$

with side conditions

$$\sum_{i = 1}^{n} β_{i} p_{i} = μ \quad \text{and} \quad {PD}_{min} \leq p_{i} \leq {PD}_{max} \quad \text{for } i = 1, \dots, n . \tag{A4}$$

Now, we are exactly in the setting of (A1) and (A2) with $m = μ$, $c_{1} = {PD}_{min}$, and $c_{2} = {PD}_{max}$, and can apply Theorem A1.
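The identification of tuples $(t, j)$ with a single index $i$ can be sketched in Python as follows. This is only an illustration under stated assumptions: the portfolio is represented as a list of customer identifier lists (one list per reference date, a hypothetical data layout), every reference date has at least one customer, and the enumeration order is chosen for convenience, since Theorem A1 re-sorts the variables by their ratios anyway. The helper name `flatten_weights` is not from the paper.

```python
from typing import List, Tuple


def flatten_weights(customers: List[List[str]]) -> Tuple[List[Tuple[int, str]], List[float]]:
    """Enumerate all tuples (t, j) with j in Lambda_t and return them together
    with the weights beta_i = 1 / (N * n_t) of the flattened problem (A4)."""
    N = len(customers)
    index: List[Tuple[int, str]] = []
    beta: List[float] = []
    for t, lambda_t in enumerate(customers, start=1):
        n_t = len(lambda_t)
        if n_t == 0:
            # Assumption of this sketch: n_t >= 1 at every reference date.
            raise ValueError(f"no customers at reference date {t}")
        for j in sorted(lambda_t):
            index.append((t, j))
            beta.append(1.0 / (N * n_t))
    return index, beta
```

Since the weights $β_{i}$ sum to one, $μ$ is a weighted average of the $p_{i}$, so the side conditions (A4) can only be satisfied if ${PD}_{min} \leq μ \leq {PD}_{max}$.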

Appendix B. Test on Portfolio Level without Solving the Minimization Problem

We start by estimating the covariance between two default rates on reference dates ${RD}_{t}$ and ${RD}_{s}$. Using Equation (5), we find that

$$\begin{aligned} \mathrm{Cov} (X_{t}, X_{s}) &= \frac{1}{n_{t} \cdot n_{s}} \sum_{j \in Λ_{t} \cap Λ_{s}} \mathrm{Cov} (x_{t, j}, x_{s, j}) \geq \frac{1}{n_{t} \cdot n_{s}} \sum_{j \in Λ_{t} \cap Λ_{s}} p_{s, j} w_{t, s} (1 - {PD}_{max}) \\ &= \frac{k_{t, s}}{n_{t} \cdot n_{s}} w_{t, s} (1 - {PD}_{max}) \frac{1}{k_{t, s}} \sum_{j \in Λ_{t} \cap Λ_{s}} p_{s, j} \\ &\geq \frac{k_{t, s}}{n_{t} \cdot n_{s}} w_{t, s} (1 - {PD}_{max}) γ \cdot E (X_{s}) . \end{aligned} \tag{A5}$$

The parameter $γ > 0$ is less than or equal to the ratio of the average default risk of persisting customers in the portfolio to the average default risk of the entire portfolio, i.e.,

$$E \Big(\frac{1}{k_{t, s}} \sum_{j \in Λ_{t} \cap Λ_{s}} x_{s, j}\Big) \geq γ \cdot E \Big(\frac{1}{n_{s}} \sum_{j \in Λ_{s}} x_{s, j}\Big) .$$

In view of the fact that persisting customers should be the majority in the portfolio, it is reasonable to assume that $γ \approx 1$. The value is to be estimated conservatively depending on the portfolio under consideration, i.e., in case of doubt, the value chosen should be relatively small. If a portfolio consists only of persisting customers, $γ$ is always 1, while the parameter for portfolios with a small proportion of persisting customers can be above or below 1. In the following, the parameter $γ$ is to be understood as an estimate that is independent of the reference date and conservative for the entire history. From Equation (8), we see that

$$σ^{2} \geq \frac{1}{N^{2}} \sum_{t = 1}^{N} \frac{1}{n_{t}^{2}} \sum_{j \in Λ_{t}} p_{t, j} (1 - p_{t, j}) \geq C \cdot E \Big(\frac{1}{N} \sum_{t = 1}^{N} X_{t}\Big) \tag{A6}$$

with $C = \frac{1 - {PD}_{max}}{n_{max}}$. Hence, using Equation (6),

$$\begin{aligned} N^{2} \cdot σ^{2} &= \sum_{t = 1}^{N} σ_{t}^{2} + 2 \sum_{i = 1}^{q - 1} \sum_{t = 1}^{N - i} \mathrm{Cov} (X_{t}, X_{t + i}) \\ &\geq \sum_{t = 1}^{N} σ_{t}^{2} + 2 \sum_{i = 1}^{q - 1} \sum_{t = 1}^{N - i} \frac{k_{t, t + i}}{n_{t} \cdot n_{t + i}} \cdot \frac{q - i}{q} (1 - {PD}_{max}) \cdot γ \cdot μ_{t + i} \\ &\geq \sum_{t = 1}^{N} σ_{t}^{2} + 2 \sum_{i = 1}^{q - 1} \min_{t = 1, \dots, N - i} \Big(\frac{k_{t, t + i}}{n_{t} \cdot n_{t + i}}\Big) \frac{q - i}{q} (1 - {PD}_{max}) \cdot γ \sum_{t = 1}^{N - i} μ_{t + i} \\ &\geq \sum_{t = 1}^{N} σ_{t}^{2} + K_{1} \sum_{t = 1}^{N} μ_{t} - K_{2} \end{aligned}$$

with

$$\begin{aligned} K_{1} &:= 2 \sum_{i = 1}^{q - 1} \min_{t = 1, \dots, N - i} \Big(\frac{k_{t, t + i}}{n_{t} \cdot n_{t + i}}\Big) \frac{q - i}{q} (1 - {PD}_{max}) \cdot γ \quad \text{and} \\ K_{2} &:= 2 \sum_{i = 1}^{q - 1} \min_{t = 1, \dots, N - i} \Big(\frac{k_{t, t + i}}{n_{t} \cdot n_{t + i}}\Big) \frac{q - i}{q} (1 - {PD}_{max}) \cdot γ \cdot i \cdot μ_{old}, \end{aligned}$$

where $μ_{old} \geq \max_{i = 1, \dots, q - 1} μ_{i}$. The term $μ_{old}$ cannot be determined by the null hypothesis; thus, a conservative value has to be chosen. For portfolios consisting of customers with good credit ratings, the choice $μ_{old} = {PD}_{max}$ is clearly not adequate. Using Equation (A6), it follows that

$$N^{2} \cdot σ^{2} \geq c \sum_{i = 1}^{N} μ_{i} + K_{1} \sum_{i = 1}^{N} μ_{i} - K_{2} = N (c + K_{1}) μ - K_{2}$$

with $c \geq \frac{1}{n_{max}} (1 - {PD}_{max})$ and

$$σ^{2} \geq σ_{alt}^{2} (μ) := \frac{1}{N} (c + K_{1}) μ - \frac{K_{2}}{N^{2}} . \tag{A7}$$
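The bound (A7) only needs the portfolio sizes $n_{t}$, the numbers of persisting customers $k_{t, t + i}$, and the parameters $q$, ${PD}_{max}$, $γ$, and $μ_{old}$. A minimal Python sketch of the computation is given below. It assumes 0-based reference dates, a nested list `k` with `k[t][s]` equal to $k_{t, s}$, and the choice $c = (1 - {PD}_{max}) / n_{max}$; the function and argument names are illustrative and not taken from the paper.

```python
from typing import Sequence


def sigma_alt_squared(mu: float, n: Sequence[int], k: Sequence[Sequence[int]],
                      q: int, pd_max: float, gamma: float, mu_old: float) -> float:
    """Evaluate the lower bound (A7) for the variance of the long-run default rate.

    n[t]    : number of customers at reference date t (0-based, all positive)
    k[t][s] : number of customers present at both reference dates t and s
    """
    N = len(n)
    # Assumption of this sketch: c is taken as (1 - PD_max) / n_max.
    c = (1.0 - pd_max) / max(n)
    K1 = 0.0
    K2 = 0.0
    for i in range(1, q):  # lags i = 1, ..., q - 1
        if i >= N:
            break  # no pair of reference dates with this lag exists
        # Minimum of k_{t,t+i} / (n_t * n_{t+i}) over all admissible t.
        ratio = min(k[t][t + i] / (n[t] * n[t + i]) for t in range(N - i))
        term = 2.0 * ratio * (q - i) / q * (1.0 - pd_max) * gamma
        K1 += term
        K2 += term * i * mu_old
    return (c + K1) * mu / N - K2 / N ** 2
```

A negative return value indicates that $μ_{old}$ was chosen too large for the bound to be usable.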

Moreover, for portfolios with a short history and low PDs, it is not always possible to choose $μ_{old} = {PD}_{max}$ since $\frac{1}{N} (c + K_{1}) μ - \frac{K_{2}}{N^{2}}$ could become smaller than 0. In order to rule out this case, which also appears to be of little interest for applications of the test, we set an upper bound for $μ_{old}$, i.e.,

$$μ_{old} \leq \frac{N \cdot (c + K_{1}) μ}{(q - 1) \cdot K_{1}} .$$

Considering the random variable $Z^{*} \sim N (LRCT, σ_{alt}^{2} (LRCT))$, we define the limits of the acceptance range as

$$k := LRCT + Φ^{- 1} \Big(\frac{1}{2} α\Big) σ_{alt} (LRCT)$$

and

$$K := LRCT + Φ^{- 1} \Big(1 - \frac{1}{2} α\Big) σ_{alt} (LRCT) .$$

The calibration test passes if

$$LRDR \in [k, K] .$$
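The acceptance range and the test decision can be evaluated with the Python standard library, as in the following sketch. It assumes that $σ_{alt}^{2} (LRCT)$ has already been computed and is positive; the function names `acceptance_range` and `calibration_test_passes` are illustrative only.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def acceptance_range(lrct: float, sigma_alt_sq: float, alpha: float) -> Tuple[float, float]:
    """Return the bounds (k, K) of the two-sided acceptance range."""
    if sigma_alt_sq <= 0.0:
        raise ValueError("sigma_alt^2(LRCT) must be positive")
    sigma = sqrt(sigma_alt_sq)
    std_normal = NormalDist()  # standard normal distribution; inv_cdf is its quantile function
    lower = lrct + std_normal.inv_cdf(alpha / 2.0) * sigma
    upper = lrct + std_normal.inv_cdf(1.0 - alpha / 2.0) * sigma
    return lower, upper


def calibration_test_passes(lrdr: float, lrct: float,
                            sigma_alt_sq: float, alpha: float) -> bool:
    """The test passes if the realized long-run default rate lies in [k, K]."""
    lower, upper = acceptance_range(lrct, sigma_alt_sq, alpha)
    return lower <= lrdr <= upper
```

Because the standard normal distribution is symmetric, the range is centered at $LRCT$, and a smaller variance bound $σ_{alt}^{2}$ yields a narrower acceptance range.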

References

Aussenegg, Wolfgang, Florian Resch, and Gerhard Winkler. 2011. Pitfalls and remedies in testing the calibration quality of rating systems. Journal of Banking and Finance 35: 698–708.
Blochwitz, Stefan, Stefan Hohl, Dirk Tasche, and Carsten S. Wehn. 2004. Validating Default Probabilities on Short Time Series. Chicago: Capital & Market Risk Insights, Federal Reserve Bank of Chicago.
Blochwitz, Stefan, Marcus R. W. Martin, and Carsten S. Wehn. 2006. Statistical Approaches to PD Validation. In The Basel II Risk Parameters: Estimation, Validation, and Stress Testing. Berlin and Heidelberg: Springer, pp. 289–306.
Blöchlinger, Andreas. 2012. Validation of default probabilities. Journal of Financial and Quantitative Analysis 47: 1089–123.
Blöchlinger, Andreas. 2017. Are the Probabilities Right? New Multiperiod Calibration Tests. The Journal of Fixed Income 26: 25–32.
Caprioli, Sergio, Emanuele Cagliero, and Riccardo Crupi. 2023. Quantifying Credit Portfolio sensitivity to asset correlations with interpretable generative neural networks. arXiv:2309.08652.
Caprioli, Sergio, Riccardo Cogo, and Raphael Cavallari. 2023. Back-Testing Credit Risk Parameters on Low Default Portfolios: A Bayesian Approach with an Application to Sovereign Risk. Preprint SSRN. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4408217 (accessed on 1 February 2023).
Council of European Union. 2013. Regulation (EU) No 575/2013 of the European Parliament and of the Council of 26 June 2013 on prudential requirements for credit institutions and investment firms and amending regulation (EU) No 648/2012. Official Journal of the European Union L 176: 1. Available online: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32013R0575 (accessed on 1 February 2023).
Coppens, Francois, Manuel Mayer, Laurent Millischer, Florian Resch, Stephan Sauer, and Klaas Schulze. 2016. Advances in Multivariate Back-Testing for Credit Risk Underestimation. Frankfurt am Main: European Central Bank.
Cucinelli, Doriana, Maria Luisa Di Battista, Malvina Marchese, and Laura Nieri. 2018. Credit risk in European banks: The bright side of the internal ratings based approach. Journal of Banking and Finance 93: 213–29.
Deutsche Bundesbank. 2003. Approaches to the Validation of Internal Rating Systems. Monatsbericht September. Frankfurt am Main: Deutsche Bundesbank, pp. 59–71.
European Banking Authority (EBA). 2016. Final Draft Regulatory Technical Standards (RTS) on the Specification of the Assessment Methodology for IRB. Available online: https://www.eba.europa.eu/activities/single-rulebook/regulatory-activities/credit-risk/regulatory-technical-standards-2 (accessed on 1 February 2023).
European Banking Authority (EBA). 2017. Guidelines on PD Estimation, LGD Estimation and Treatment of Defaulted Assets. Available online: https://www.eba.europa.eu/regulation-and-policy/model-validation/guidelines-on-pd-lgd-estimation-and-treatment-of-defaulted-assets (accessed on 1 February 2023).
European Central Bank (ECB). 2024. ECB Guide to Internal Models. Available online: https://www.bankingsupervision.europa.eu/ecb/pub/pdf/ssm.supervisory_guides202402_internalmodels.en.pdf (accessed on 25 June 2024).
European Commission. 2021. Commission Delegated Regulation (EU) 2022/439. Official Journal of the European Union L 90: 1–66.
Hogg, Robert V., and Elliot A. Tanis. 1977. Probability and Statistical Inference. New York: Macmillan Publishing Co., Inc. London: Collier Macmillan Publishers.
Jing, Zhang, Fanlin Zhu, and Joseph Lee. 2008. Asset correlation, realized default correlation and portfolio credit risk modeling methodology. Moody’s KMV, March.
Klenke, Achim. 2020. Probability Theory: A Comprehensive Course, 3rd ed. Universitext. Cham: Springer.
Li, Weiping. 2016. Probability of Default and Default Correlations. Journal of Risk and Financial Management 9: 7.
Pluto, Katja, and Dirk Tasche. 2011. Estimating probabilities of default for low default portfolios. In The Basel II Risk Parameters: Estimation, Validation, Stress Testing-with Applications to Loan Risk Management. Berlin and Heidelberg: Springer, pp. 75–101.
Tasche, Dirk. 2003. A Traffic Lights Approach to PD Validation. Preprint arXiv. Available online: https://arxiv.org/abs/cond-mat/0305038 (accessed on 1 February 2023).
Tasche, Dirk. 2008. Validation of internal rating systems and PD estimates. In The Analytics of Risk Model Validation. Amsterdam: Elsevier, pp. 169–96.
Tasche, Dirk. 2013. Bayesian estimation of probabilities of default for low default portfolios. Journal of Risk Management in Financial Institutions 6: 302–26.
Zhou, Chunsheng. 2001. An Analysis of Default Correlations and Multiple Defaults. The Review of Financial Studies 14: 555–76.

Figure 1. Default states per time period.

Figure 2. Density functions of long-run default rates.

Figure 3. Effect of proportion of persisting customers on acceptance range per confidence level $α$.

Figure 4. Comparison between density functions of long-run default rates.

Figure 5. Distribution functions: simulation and normal approximation with $n_{t} \leq 5$ for $t = 1, \dots, 8$ and $n_{t} = 100$ for $t = 9, \dots, 56$.

Figure 6. Distribution functions: simulation and normal approximation with $n_{t} \leq 10$ for $t = 1, \dots, 8$ and $n_{t} = 100$ for $t = 9, \dots, 56$.

Figure 7. Distribution functions: simulation and normal approximation with $n_{t} = 1$ for $t = 1, \dots, 8$ and $n_{t} = 10$ for $t = 9, \dots, 56$.

Figure 8. Distribution functions: simulation and normal approximation with $n_{t} = 2$ for $t = 1, \dots, 8$ and $n_{t} = 20$ for $t = 9, \dots, 56$.

Figure 9. Distribution functions: simulation and normal approximation with $n_{t} = 2$ for $t = 1, \dots, 28$ and $n_{t} = 20$ for $t = 29, \dots, 56$.

Figure 10. Rating distribution of rating model and rating distribution after minimization without additional assumptions on the distribution.

Figure 11. Rating distribution of rating model and rating distribution after minimization with assumption of uniform distribution for bad rating grades.

Figure 12. Rating distribution of rating model and rating distribution after minimization with assumption of trusting the distribution of the model to a certain degree.

Figure 13. Dependence of sigma on ${PD}_{max}$.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
