Article

Modelling of Loan Non-Payments with Count Distributions Arising from Non-Exponential Inter-Arrival Times

1 Department of Computing and Information Systems, Sunway University, Petaling Jaya 47500, Malaysia
2 Institute of Actuarial Science and Data Analytics, UCSI University, Kuala Lumpur 56000, Malaysia
3 Institute of Mathematical Sciences, University of Malaya, Kuala Lumpur 50603, Malaysia
* Author to whom correspondence should be addressed.
J. Risk Financial Manag. 2023, 16(3), 150; https://doi.org/10.3390/jrfm16030150
Submission received: 31 January 2023 / Revised: 15 February 2023 / Accepted: 20 February 2023 / Published: 23 February 2023
(This article belongs to the Special Issue Financial Data Analytics and Statistical Learning)

Abstract

The number of non-payments is an indicator of delinquent behaviour in credit scoring, hence its estimation and prediction are of interest. The modelling of the number of non-payments, as count data, can be examined as a renewal process. In a renewal process, the number of events (such as non-payments) which has occurred up to a fixed time t is intimately connected with the inter-arrival times between the events. In the context of non-payments, the inter-arrival times correspond to the time between two subsequent non-payments. The probability mass function and the renewal function of the count distribution are often complicated, with terms involving factorial and gamma functions, and thus their computation may encounter numerical difficulties. In this paper, with the motivation of modelling the number of non-payments through a renewal process, a general method for computing the probabilities and the renewal function based on numerical Laplace transform inversion is discussed. This method is applied to some count distributions which are derived given the distributions of the inter-arrival times. Parameter estimation with maximum likelihood estimation is considered, with an application to a data set on number of non-payments from the literature.

1. Introduction

In credit scoring, default probabilities are often of interest for identifying and managing the risk of bad loans. However, evaluating default probabilities alone is insufficient to assess the risk and returns of bank funding (Dionne et al. 1996). Before an accepted loan is classified as a bad loan, there will have been several non-payments, each of which incurs costs from reminders, collection, and other administrative charges. Therefore, instead of classifying a loan as either good or bad, a flexible alternative approach to risk evaluation is to model the number of non-payments (Karlis and Rahmouni 2007). The number of non-payments, which is a primary indicator of delinquent behaviour, is a count variable, and modelling it is useful for estimating the probability of default. The basic model for count data is the well-known Poisson model, which exhibits equi-dispersion: its mean is equal to its variance. As such, the Poisson model is often found to be inadequate in the presence of over- or under-dispersion. Various approaches have been proposed to extend or generalize the Poisson distribution. Examples include mixture models for heterogeneity (Gupta and Ong 2005), such as the negative binomial (NB) (Greenwood and Yule 1920) and Poisson-inverse Gaussian (P-iG) (Holla 1967; Sankaran 1968) distributions; the Lagrange expansion generalization of the Poisson distribution (Consul and Jain 1973); and count distributions from renewal processes in which the times between events follow non-exponential distributions (Winkelmann 1995). In the context of modelling the number of non-payments, truncated count models (Dionne et al. 1996), Poisson finite mixtures (Karlis and Rahmouni 2007) and non-parametric models (Mestiri and Farhat 2021) have been investigated in the literature.
It is well known that, in a renewal process, if the waiting times are exponential and independent, we obtain the Poisson distribution for the event counts. In the context of loan non-payments, the inter-arrival times refer to the durations between two subsequent non-payments. Thomas et al. (2016) used Markov chains to model payment patterns in order to estimate recovery rates. The renewal process approach to deriving count distributions has been considered by several researchers. Winkelmann (1995) derived the count distribution when the inter-arrival time has an Erlang distribution. Other distributions which have been considered by various authors to model the inter-arrival times are the gamma distribution (Winkelmann 1995); the Weibull distribution (McShane et al. 2008), which is very popular in the field of reliability studies; the Mittag-Leffler distribution (Jose and Abraham 2011); the Gumbel Type II distribution (Jose and Abraham 2013); and the generalized Weibull distribution (Ong et al. 2015); see Table 1. The count distributions were mostly obtained using extensive numerical and analytical methods. For example, McShane et al. (2008) and Jose and Abraham (2013) used the polynomial expansion method to derive the count distribution for Weibull and Gumbel inter-arrival times, respectively. A different approach by From (2004) is to use a family of generalized Poisson distributions to approximate the renewal counting processes with Weibull, truncated normal and exponentiated Weibull inter-arrival times. Baker and Kharrat (2017) proposed the use of repeated convolutions of the discretized distributions with Richardson extrapolation, as well as an adaptation of De Pril’s method, to compute probabilities in event count distributions from renewal processes. Nadarajah and Chan (2018) derived count distributions arising from 13 different inter-arrival time distributions and studied their fit to football home goals data using the algebraic manipulation package Maple.
A similar perspective in the modelling of non-life insurance claims data was discussed by Maciak et al. (2021), through infinitely stochastic processes, and by Lindholm and Zakrisson (2022).
The objectives of this paper are to propose the modelling of the number of loan non-payments through the renewal process approach and to examine the computation of the probability mass function (pmf). Because the computation of the probabilities in the approaches mentioned above is rather involved, a simple, general and efficient method of computing the probabilities of count distributions arising from non-exponential inter-arrival time distributions of renewal processes is discussed to facilitate the statistical modelling. We consider the generalized Weibull, inverse Gaussian and convolution of two gamma distributions due to their greater generality, as they include, among others, the Weibull and gamma distributions as special cases. These inter-arrival time distributions have flexible hazard functions, so that the corresponding count distributions are able to cater for under-, equi- and over-dispersion. This relationship between the hazard function of the inter-arrival times and the dispersion of the corresponding count distribution has been proven by Winkelmann (1995). We propose an easily implemented and efficient method to compute the probabilities of the counts and, subsequently, the renewal function (expected number of renewals), given the Laplace transform of the density function of the inter-arrival times. The computation of the renewal function has been extensively studied by various authors; for the Weibull renewal function, for example, see Smith and Leadbetter (1963) and Constantine and Robinson (1997).
In Section 2, we briefly describe the relationship between the distribution of the inter-arrival times and the count distribution, as well as some existing count distributions. We focus on the case when the sequence of inter-arrival times is independent and identically distributed, which gives rise to the renewal process. Count distributions arising from inverse Gaussian and convolution of two gamma distributions as inter-arrival times are considered. In these sections, we assume that the inter-arrival time Xi is independent and identically distributed and we drop the index i from the notation, and thus X denotes the inter-arrival time. The proposed method for the computation of the count probabilities and its renewal function is discussed in Section 3. Section 4 details the application of the distributions on a data set on number of non-payments from the literature. We perform parameter estimation using maximum likelihood estimation. Finally, a concluding discussion is given in Section 5.

2. Modelling of Loan Non-Payment Counts

2.1. Count Distribution and Inter-Arrival Times Distribution

A counting process is a stochastic point process $\{N(t), t \ge 0\}$, where $N(t)$ represents the total number of events that have occurred by time $t$. In this paper, the number of events corresponds to the number of non-payments. Let $S_n$ denote the waiting time to (or arrival time of) the $n$th non-payment, and let $X_n$ denote the time between the $(n-1)$st and the $n$th non-payment of this process, i.e., between two subsequent non-payments. In the rest of this paper, $X_n$ will be referred to as the inter-arrival time. Therefore, $S_0 = 0$ and $S_n = \sum_{i=1}^{n} X_i$, $n \ge 1$. If the inter-arrival times $\{X_1, X_2, \ldots\}$ are independent and identically distributed with density $f(x)$ and cumulative distribution function (cdf) $F(x)$, the counting process $\{N(t), t \ge 0\}$ is known as a renewal process. In a renewal process, the distribution function of $S_n$ is the $n$-fold convolution $F_n(x)$ of the distribution of $X_i$, with $F_0(t) = 1$. In this case, the renewal function, or expected number of non-payments $E[N(t)]$, and the distribution of $N(t)$ can be obtained from the relationship $N(t) \ge n \iff S_n \le t$. As such, the probability mass function (pmf) of the count distribution is
$$\Pr\{N(t) = n\} = \Pr\{S_n \le t\} - \Pr\{S_{n+1} \le t\} = F_n(t) - F_{n+1}(t), \tag{1}$$
where $n = 0, 1, \ldots$, and $F_n(x)$ is the cdf of $S_n$. The renewal function is defined as
$$H(t) = E[N(t)] = \sum_{i=1}^{\infty} F_i(t). \tag{2}$$
For example, when the inter-arrival times are exponentially distributed, the counting process is a Poisson process with intensity $\lambda(t) = \lambda$ and pmf
$$\Pr\{N(t) = n\} = \frac{e^{-\lambda t}(\lambda t)^n}{n!}, \quad n = 0, 1, 2, \ldots.$$
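The relationship (1) can be checked numerically in the exponential case, where the cdf of $S_n$ is an Erlang cdf and the difference $F_n(t) - F_{n+1}(t)$ should reproduce the Poisson pmf. The following is a minimal sketch; the function names and the parameter values in the usage note are illustrative assumptions, not part of the original study.

```python
import math

def erlang_cdf(n, lam, t):
    """cdf F_n(t) of S_n, the sum of n i.i.d. exponential(lam) times; F_0(t) = 1."""
    if n == 0:
        return 1.0
    # P(S_n > t) equals the probability of fewer than n Poisson events by time t
    tail = sum(math.exp(-lam * t) * (lam * t) ** j / math.factorial(j) for j in range(n))
    return 1.0 - tail

def count_pmf_from_cdfs(n, lam, t):
    """Pr{N(t) = n} = F_n(t) - F_{n+1}(t), as in Equation (1)."""
    return erlang_cdf(n, lam, t) - erlang_cdf(n + 1, lam, t)

def poisson_pmf(n, lam, t):
    """Poisson pmf with mean lam * t."""
    return math.exp(-lam * t) * (lam * t) ** n / math.factorial(n)
```

For instance, `count_pmf_from_cdfs(3, 2.0, 1.5)` and `poisson_pmf(3, 2.0, 1.5)` should agree up to floating-point rounding.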
The Laplace transform $\varphi(s)$ of a function $f(x)$ is defined as $\varphi(s) = \int_0^{\infty} e^{-sx} f(x)\,dx$, where $s$ is a complex number. The Laplace transform exists for a function $f(x)$ defined over $(0, \infty)$ whenever the integral converges. Since the inter-arrival times $X_i$ are independent and identically distributed, the density of the arrival time $S_n = \sum_{i=1}^{n} X_i$ is the $n$-fold convolution of $f$, so its Laplace transform is simply the $n$th power $(\varphi(s))^n$ of the Laplace transform of $X_i$, and the Laplace transform of the cdf $F_n(t)$ is $(\varphi(s))^n / s$. Consequently, the Laplace transform of the count distribution is derived as
$$\varphi_n(s) = \mathcal{L}(\Pr\{N(t) = n\}) = \mathcal{L}(F_n(t) - F_{n+1}(t)) = \frac{1 - \varphi(s)}{s}\,(\varphi(s))^n, \tag{3}$$
where $\varphi(s)$ is the Laplace transform of the inter-arrival time’s probability density function (pdf) $f(x)$. On the other hand, the Laplace transform of (2) is $\mathcal{L}(E[N(t)]) = \frac{\varphi(s)}{s}\,\frac{1}{1 - \varphi(s)}$, $|\varphi(s)| < 1$.
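As a consistency check of the two transforms above, consider exponential inter-arrival times with rate $\lambda$, for which the count distribution is known to be Poisson. The short worked derivation below is an illustrative sketch added for clarity; it uses only the standard transform pairs $\mathcal{L}^{-1}\{(\lambda+s)^{-(n+1)}\} = t^n e^{-\lambda t}/n!$ and $\mathcal{L}^{-1}\{s^{-2}\} = t$.

```latex
\begin{align*}
\varphi(s) &= \frac{\lambda}{\lambda+s}, \qquad 1-\varphi(s) = \frac{s}{\lambda+s}, \\
\varphi_n(s) &= \frac{1-\varphi(s)}{s}\,(\varphi(s))^n = \frac{\lambda^n}{(\lambda+s)^{n+1}}
  \;\Longrightarrow\; \Pr\{N(t)=n\} = \frac{e^{-\lambda t}(\lambda t)^n}{n!}, \\
\mathcal{L}(E[N(t)]) &= \frac{\varphi(s)}{s\,(1-\varphi(s))} = \frac{\lambda}{s^2}
  \;\Longrightarrow\; E[N(t)] = \lambda t .
\end{align*}
```

Both results agree with the Poisson process, whose mean count over $(0, t]$ is $\lambda t$.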
In the existing literature, Poisson distribution and negative binomial distribution have been proposed for modelling non-payments (Dionne et al. 1996). In the following sections, we present alternative count distributions for modelling of non-payments examined from the perspective of their inter-arrival times.

2.1.1. Count Distribution for Generalized Weibull Duration

The pdf of a generalized Weibull distribution is given as
$$f(x; a, \alpha, \lambda) = a\alpha x^{\alpha-1}\left(1 - a x^{\alpha}/\lambda\right)^{\lambda-1}, \tag{4}$$
for $a, \alpha > 0$, with $x > 0$ if $\lambda \le 0$ and $0 < x < (\lambda/a)^{1/\alpha}$ if $\lambda > 0$ (Mudholkar et al. 1996). An important limiting case is the Weibull distribution when $\lambda \to \infty$, with pdf $f(x; a, \alpha) = a\alpha x^{\alpha-1} e^{-a x^{\alpha}}$. We shall re-write the Weibull pdf as $f(x; \alpha, \lambda) = \frac{\lambda}{\alpha}\left(\frac{x}{\alpha}\right)^{\lambda-1} e^{-(x/\alpha)^{\lambda}}$. The generalized Weibull distribution has a flexible and closed-form hazard function.
Ong et al. (2015) applied the Laplace transform technique and a formal Taylor expansion to derive the count distribution for generalized Weibull duration. The count distribution has pmf given by
$$\Pr\{N(t) = n\} = (a\alpha)^n \sum_{p=0}^{\infty} \frac{(-a/\lambda)^p}{\Gamma(\alpha(p+n)+1)}\, t^{\alpha(p+n)}\, c_n(p), \tag{5}$$
where $c_n(p) = \sum_{q=0}^{p} \binom{\lambda-1}{q}\, \Gamma(\alpha(q+1))\, c_{n-1}(p-q)$, $n \ge 1$, and $c_0(p) = \binom{\lambda}{p}\, \Gamma(\alpha p + 1)$. When $n = 0$, $\Pr\{N(t) = 0\} = (1 - a t^{\alpha}/\lambda)^{\lambda}$. This count model is able to model under-, equi- and over-dispersion, since the generalized Weibull hazard function can be increasing, constant or decreasing. Special cases are as follows:
  • When λ < 0 and α = 1, we obtain the count distribution with Lomax duration. Its pmf is given by Ong et al. (2015) as
    $$\Pr\{N(t) = n\} = (at)^n \sum_{p=0}^{\infty} \frac{(-a/\lambda)^p}{\Gamma(p+n+1)}\, t^{p}\, c_n(p). \tag{6}$$
  • When λ→∞, we obtain the Weibull count distribution and Ong et al. (2015) gives its pmf as
    $$\Pr\{N(t) = n\} = (a\alpha)^n \sum_{p=0}^{\infty} \frac{(-a)^p}{\Gamma(\alpha(p+n)+1)}\, t^{\alpha(p+n)}\, c_n(p), \tag{7}$$
    where $c_n(p) = \sum_{q=0}^{p} \frac{\Gamma(\alpha(q+1))}{\Gamma(q+1)}\, c_{n-1}(p-q)$, $n \ge 1$, and $c_0(p) = \frac{\Gamma(\alpha p + 1)}{\Gamma(p+1)}$. When $n = 0$, $\Pr\{N(t) = 0\} = e^{-a t^{\alpha}}$. McShane et al. (2008) applied a Taylor series approximation in the derivation of the Weibull count pmf, which they found to be computationally feasible.
  • Furthermore, when α = 1, (5) reduces to the Poisson pmf.
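The recursion for $c_n(p)$ in the Weibull special case can be sketched directly. The code below is a hedged illustration of a truncated version of the series for the Weibull count pmf, assuming the inter-arrival pdf $a\alpha x^{\alpha-1}e^{-ax^{\alpha}}$; the function name and the truncation length `terms` are assumptions made for this example. Because the series is alternating and involves gamma functions of growing arguments, it is only expected to behave well for moderate values of $a t^{\alpha}$ and $\alpha$.

```python
import math

def weibull_count_pmf(n, t, a, alpha, terms=40):
    """Truncated series for Pr{N(t) = n} with Weibull inter-arrival times.

    Assumes the inter-arrival pdf a*alpha*x**(alpha-1)*exp(-a*x**alpha).
    `terms` (number of series terms kept) is an illustrative choice.
    """
    # c_0(p) = Gamma(alpha*p + 1) / Gamma(p + 1)
    c = [math.gamma(alpha * p + 1) / math.gamma(p + 1) for p in range(terms)]
    # weights Gamma(alpha*(q + 1)) / Gamma(q + 1) used in the recursion
    w = [math.gamma(alpha * (q + 1)) / math.gamma(q + 1) for q in range(terms)]
    for _ in range(n):
        # c_n(p) = sum_{q=0}^{p} w[q] * c_{n-1}(p - q)
        c = [sum(w[q] * c[p - q] for q in range(p + 1)) for p in range(terms)]
    total = 0.0
    for p in range(terms):
        total += ((-a) ** p * t ** (alpha * (p + n)) * c[p]
                  / math.gamma(alpha * (p + n) + 1))
    return (a * alpha) ** n * total
```

With `alpha = 1` the result should match the Poisson pmf with mean `a * t`, and with `n = 0` it should match $e^{-a t^{\alpha}}$, which gives two simple sanity checks.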

2.1.2. Count Distribution for Gamma Duration

Let X have a gamma distribution with pdf given by
$$f(x; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x} \tag{8}$$
for $x > 0$ and $\alpha, \beta > 0$. It has mean $E(X) = \alpha/\beta$ and variance $\mathrm{Var}(X) = \alpha/\beta^2$. The hazard function of the gamma distribution is not available in closed form but its behaviour is well known: it is monotonically increasing (α > 1), decreasing (α < 1) or constant (α = 1). When α = 1, we obtain the exponential distribution. The Laplace transform of the gamma distribution is given as $\varphi(s) = \left(\frac{\beta}{\beta+s}\right)^{\alpha}$. The gamma distribution has the advantage of having a reproductive property, hence the arrival time $S_n$ is also gamma distributed.
Winkelmann (1995) has studied the count process with gamma inter-arrival times and gives its pmf as
$$\Pr\{N(t) = n\} = G(\alpha n, \beta t) - G(\alpha n + \alpha, \beta t), \tag{9}$$
where $G(\alpha n, \beta t) = \frac{1}{\Gamma(n\alpha)} \int_0^{\beta t} u^{n\alpha-1} e^{-u}\,du$ for $n \ge 1$, in which the integral is the lower incomplete gamma function, and $G(0, \beta t) = 1$, consistent with $F_0(t) = 1$. Since the pmf is not available in closed form, Winkelmann (1995) suggested using numerical methods for its computation. The gamma count distribution inherits the properties of the gamma distribution’s hazard function; thus it is able to model over-dispersion (α < 1) and under-dispersion (α > 1). Its expected value is given by $E[N(t)] = \sum_{i=1}^{\infty} G(\alpha i, \beta t)$. Special cases are as follows:
  • When α = 1, the count distribution simplifies to the Poisson distribution.
  • For integer values of α, Winkelmann (1995) has derived the Erlangian count distribution with pmf given as
$$\Pr\{N(t) = n\} = e^{-\beta t} \sum_{i=0}^{\alpha-1} \frac{(\beta t)^{\alpha n + i}}{(\alpha n + i)!}, \quad n = 0, 1, 2, \ldots. \tag{10}$$
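The gamma count pmf (9) and its Erlangian special case (10) can be compared numerically. The sketch below evaluates the regularized lower incomplete gamma function with its standard power series, using only the Python standard library; the function names and the series length `terms` are assumptions for illustration, and the series is only intended for moderate values of $\beta t$.

```python
import math

def reg_lower_gamma(a, x, terms=200):
    """Regularized lower incomplete gamma G(a, x) for x > 0, with G(0, x) = 1."""
    if a == 0:
        return 1.0
    # G(a, x) = x**a * exp(-x) * sum_{k>=0} x**k / Gamma(a + k + 1)
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1))
    total = 0.0
    for k in range(terms):
        total += term
        term *= x / (a + k + 1)
    return total

def gamma_count_pmf(n, t, alpha, beta):
    """Pr{N(t) = n} for gamma(alpha, rate beta) inter-arrival times, Equation (9)."""
    return reg_lower_gamma(alpha * n, beta * t) - reg_lower_gamma(alpha * (n + 1), beta * t)

def erlang_count_pmf(n, t, alpha, beta):
    """Closed-form pmf for integer alpha, Equation (10)."""
    x = beta * t
    return math.exp(-x) * sum(x ** (alpha * n + i) / math.factorial(alpha * n + i)
                              for i in range(alpha))
```

For integer `alpha` the two functions should agree up to rounding, and for `alpha = 1` both reduce to the Poisson pmf with mean `beta * t`.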

2.1.3. Count Distribution for Convolution of Two Gamma Durations

If we represent the inter-arrival time X as a sum of two independent gamma random variables, then X has a convolution of two gamma distributions. Its density function has been studied by various authors; see Johnson et al. (2005) for a brief overview. We shall adapt the density function given by Moschopoulos (1985) for the sum of n independent gamma random variables, which is derived from the n-convolutions of the moment generating function. Let X = X 1 + X 2 , where X i , i = 1, 2, are distributed as gamma with parameters α i and β i respectively. We obtain the density function of X as
$$f(x; \rho, \beta_1) = \left(\frac{\beta_1}{\beta_2}\right)^{\alpha_2} \sum_{k=0}^{\infty} \frac{\delta_k\, x^{\rho+k-1} \exp(-x/\beta_1)}{\Gamma(\rho+k)\, \beta_1^{\rho+k}} \tag{11}$$
for $x > 0$, $\alpha_i > 0$, $\beta_i > 0$, where $\beta_1 = \min(\beta_1, \beta_2)$, $\rho = \alpha_1 + \alpha_2$, $\delta_0 = 1$, $\delta_{k+1} = \frac{1}{k+1} \sum_{i=1}^{k+1} i\, \gamma_i\, \delta_{k+1-i}$ for $k = 0, 1, 2, \ldots$, and $\gamma_k = \alpha_2 \left(1 - \frac{\beta_1}{\beta_2}\right)^{k} / k$. The convolution of two gamma distributions has an increasing hazard function when its two component distributions have an increasing hazard function, but convolutions of two distributions, both with decreasing hazard function, may give rise to a distribution with increasing hazard function. Therefore, we expect the count distribution to be more flexible in modelling over-dispersed and under-dispersed count data. As a special case, when $\alpha_1 = \alpha_2 = 1$, we obtain the convolution of two exponential distributions, which has an increasing hazard function.
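The series density (11) and the recursion for $\delta_k$ can be sketched as follows. This is a minimal illustration that assumes, as in Moschopoulos's (1985) form of the density, that $\beta_1$ and $\beta_2$ act as scale parameters in (11); the function name and the truncation length `terms` are assumptions for this example. Logarithms are used for the individual terms to avoid overflow in the gamma function.

```python
import math

def conv_two_gamma_pdf(x, a1, b1, a2, b2, terms=80):
    """Truncated series density of X1 + X2 with X_i ~ gamma(shape a_i, scale b_i).

    Assumption: b1 and b2 are SCALE parameters, following Moschopoulos (1985).
    """
    if b1 > b2:                      # relabel so that b1 is the smaller scale
        a1, b1, a2, b2 = a2, b2, a1, b1
    rho = a1 + a2
    r = 1.0 - b1 / b2
    # gamma_k = a2 * r**k / k for k >= 1 (index 0 is unused)
    gam = [0.0] + [a2 * r ** k / k for k in range(1, terms + 1)]
    delta = [1.0]                    # delta_0 = 1
    for k in range(terms - 1):
        delta.append(sum(i * gam[i] * delta[k + 1 - i] for i in range(1, k + 2)) / (k + 1))
    log_x, log_b1 = math.log(x), math.log(b1)
    total = 0.0
    for k in range(terms):
        log_term = ((rho + k - 1) * log_x - x / b1
                    - math.lgamma(rho + k) - (rho + k) * log_b1)
        total += delta[k] * math.exp(log_term)
    return (b1 / b2) ** a2 * total
```

When both shapes equal 1, the result can be compared with the closed-form density of the sum of two exponentials with scales $b_1 \ne b_2$, namely $(e^{-x/b_2} - e^{-x/b_1})/(b_2 - b_1)$.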
Proposition 1.
If the inter-arrival time (duration) has a convolution of two gamma distributions with pdf (11), the count distribution has pmf given by
$$\Pr\{N(t) = n\} = C_n(t, \alpha_1, \alpha_2, \beta_1, \beta_2) - C_{n+1}(t, \alpha_1, \alpha_2, \beta_1, \beta_2), \tag{12}$$
where $C_n(t, \alpha_1, \alpha_2, \beta_1, \beta_2) = \frac{(\beta_1^{\alpha_1} \beta_2^{\alpha_2})^n\, t^{n(\alpha_1+\alpha_2)}}{\Gamma(1 + n(\alpha_1+\alpha_2))}\, \Phi_2(n\alpha_1, n\alpha_2; 1 + n(\alpha_1+\alpha_2); -\beta_1 t, -\beta_2 t)$ and $\Phi_2(b, b'; c; w, z) = \sum_{k,l=0}^{\infty} \frac{(b)_k\, (b')_l}{(c)_{k+l}} \frac{w^k z^l}{k!\, l!}$.

2.1.4. Count Distribution for Inverse Gaussian Duration

The inverse Gaussian (IG) distribution is also known as the first passage time distribution of Brownian motion with positive drift. Let X have an IG distribution with pdf given by
$$f(x; \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}}\, \exp\left\{-\frac{\lambda (x-\mu)^2}{2\mu^2 x}\right\}, \tag{13}$$
for x > 0, where μ, λ > 0 (Johnson et al. 2005, p. 261). It is a unimodal distribution and has applications in modelling survival period, service time, equipment lives, hospital stay duration, employee service times and duration of strikes. Chhikara and Folks (1977) have discussed the application of the inverse Gaussian distribution in reliability and showed that the distribution has a non-monotonic hazard function with an almost increasing failure rate. There are several parameterizations of the IG distribution, but we adopt this particular one because it is expressed in terms of its mean E(X) = μ and λ is the scale parameter. The shape of the distribution is determined by the ratio λ/μ and the pdf is highly skewed for moderate values of this ratio. The Laplace transform is derived by Seshadri (1999) as
$$\varphi(s) = \exp\left\{\frac{\lambda}{\mu}\left(1 - \sqrt{1 + \frac{2 s \mu^2}{\lambda}}\right)\right\}, \quad s \ge 0. \tag{14}$$
When μ→∞, we obtain a one-parameter limiting form of the IG distribution, known as the distribution of the first passage time of drift-free Brownian motion. Its pdf is given as $f(x; \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}}\, \exp\left(-\frac{\lambda}{2x}\right)$ with x > 0, where λ > 0 (Johnson et al. 2005). The expected value and variance of this distribution are infinite. On the other hand, when μ = 1, the distribution is also known as the Wald distribution.
The count distribution with inverse Gaussian inter-arrival times has also been proposed (Nadarajah and Chan 2018) with the probability mass function given in terms of the convolution of inter-arrival distributions Fn(x), involving the standard normal cumulative distribution function. We derive an explicit expression for the inverse Gaussian count distribution, given in the following proposition.
Proposition 2.
If the inter-arrival time has an inverse Gaussian distribution with pdf (13), the count distribution has pmf given by
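$$\Pr\{N(t) = n\} = -\sum_{k=0}^{\infty} \sum_{l=0}^{k} \frac{n^{k-l}}{(l+1)!\,(k-l)!} \left(\frac{\lambda}{\mu}\right)^{k+1} c_k(m), \tag{15}$$
where $c_k(m) = \sum_{m=0}^{k+1} \binom{k+1}{m} (-1)^m \sum_{\nu=0}^{\infty} \binom{m/2}{\nu} \frac{1}{\Gamma(1-\nu)} \left(\frac{2\mu^2}{\lambda t}\right)^{\nu}$.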

2.2. Computation of the Probabilities of Count Distribution

The computation of the probabilities for most of the count distributions, such as the generalized Weibull count distribution (5), involves an infinite series and/or gamma functions Γ(x), which tend to overflow numerically very quickly. As such, we propose a computational method whereby the probability function of the counts can be recovered by numerically inverting the Laplace transform (3). Using this method, given the inter-arrival time distribution and its Laplace transform, we will be able to compute the corresponding count probabilities.
For some common functions, the inverse Laplace transforms f(x) are readily available from existing tables (Erdelyi et al. 1953). Otherwise, there are explicit formulae for inverting a Laplace transform φ(s), such as the Bromwich inversion integral formula and the Post-Widder inversion formula. In most cases, it is difficult to find an analytical expression for the inverse Laplace transform using these formulae and, therefore, a numerical inversion is necessary. There are numerous methods for numerical inversion of Laplace transforms in the existing literature; for a comprehensive review, see (Abate and Valkó 2004; Dubner and Abate 1968). In our study, we use a numerical inversion algorithm which is based on the Bromwich inversion integral and gives good results for smooth functions. The algorithm was originally proposed by Dubner and Abate (1968), improved by Abate and Whitt (1992) and discussed by Abate and Whitt (1995) and Abate et al. (2000) for the numerical inversion of Laplace transforms of probability distributions. The Bromwich inversion integral formula is given as
$$f(x) = \mathcal{L}^{-1}(\varphi(s)) = \lim_{R \to \infty} \frac{1}{2\pi i} \int_{a - iR}^{a + iR} \varphi(s)\, e^{sx}\, ds, \tag{16}$$
where $a$ is a real number such that $a > s_0$, so that the contour lies to the right of all singularities of $\varphi(s)$, and $i = \sqrt{-1}$. The numerical inversion algorithm is developed by first applying the trapezoidal rule to the integral in (16), and subsequently using a Fourier-series method for approximation. Based on the algorithm, we obtain the following formula for computing the count probabilities at time $t$:
$$\Pr\{N(t) = n\} = \frac{e^{A/2}}{2t}\, \mathrm{Re}\, \varphi_n\!\left(\frac{A}{2t}\right) + \frac{e^{A/2}}{t} \sum_{k=1}^{\infty} (-1)^k\, \mathrm{Re}\, \varphi_n\!\left(\frac{A + 2k\pi i}{2t}\right), \tag{17}$$
where $\varphi_n(\cdot)$ is as defined in (3).
The convergence of the infinite sum in (17) can be accelerated by applying the well-known Euler’s algorithm for alternating series. Therefore, the count probabilities are approximated using the following formula
$$\Pr\{N(t) = n\} \approx \sum_{k=0}^{m} \binom{m}{k} 2^{-m}\, s_{p+k}(t), \tag{18}$$
where $s_p(t)$ is the $p$th partial sum
$$s_p(t) = \frac{e^{A/2}}{2t}\, \mathrm{Re}\, \varphi_n\!\left(\frac{A}{2t}\right) + \frac{e^{A/2}}{t} \sum_{k=1}^{p} (-1)^k\, \mathrm{Re}\, \varphi_n\!\left(\frac{A + 2k\pi i}{2t}\right). \tag{19}$$
The choice of A affects the discretization error which results from using the trapezoidal rule. We use Abate and Whitt’s (1995) suggestion to set A = 18.4, p = 38 and m = 11. The value of p may be increased when necessary. The algorithm can be implemented in programming languages which provide for complex number computation, such as MATLAB.
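The Euler-accelerated inversion described above can be sketched in a few lines. The code below is an illustrative Python version (the study itself mentions MATLAB), using the stated settings A = 18.4, p = 38 and m = 11 as defaults and the transform of the count probability in the form $(1 - \varphi(s))(\varphi(s))^n / s$; the function names are assumptions made for this example, not the authors' implementation.

```python
import cmath
import math

def count_pmf(phi, n, t, A=18.4, p=38, m=11):
    """Approximate Pr{N(t) = n} by Euler-accelerated Fourier-series inversion.

    phi is a callable giving the Laplace transform of the inter-arrival pdf
    at a complex argument s.
    """
    def phi_n(s):
        f = phi(s)
        return (1.0 - f) * f ** n / s        # transform of Pr{N(t) = n}

    scale = math.exp(A / 2.0) / t
    running = 0.5 * scale * phi_n(complex(A / (2.0 * t), 0.0)).real
    partial_sums = []                         # s_p, s_{p+1}, ..., s_{p+m}
    for k in range(1, p + m + 1):
        s = complex(A, 2.0 * k * math.pi) / (2.0 * t)
        running += scale * (-1) ** k * phi_n(s).real
        if k >= p:
            partial_sums.append(running)
    # Euler (binomial) averaging of the last m + 1 partial sums
    return sum(math.comb(m, k) * partial_sums[k] for k in range(m + 1)) / 2.0 ** m

def exponential_transform(lam):
    """Laplace transform of the exponential(rate lam) pdf."""
    return lambda s: lam / (lam + s)

def gamma_transform(alpha, beta):
    """Laplace transform of the gamma(shape alpha, rate beta) pdf."""
    return lambda s: (beta / (beta + s)) ** alpha

def inverse_gaussian_transform(mu, lam):
    """Laplace transform of the inverse Gaussian(mu, lam) pdf."""
    return lambda s: cmath.exp((lam / mu) * (1.0 - cmath.sqrt(1.0 + 2.0 * s * mu ** 2 / lam)))
```

As a usage example, `count_pmf(exponential_transform(2.0), 3, 1.0)` should be close to the Poisson probability $e^{-2} 2^3 / 3!$, and `count_pmf(inverse_gaussian_transform(1.0, 1.0), n, 2.0)` gives the inverse Gaussian count probabilities without evaluating any series in closed form. Only the transform of the inter-arrival density has to be supplied, which is what makes the approach generic.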

2.3. Renewal Function

There are many studies on the approximation of the renewal function. Using a generalized cubic splining algorithm which provides piecewise polynomial approximations to recursively defined convolution integrals, Baxter et al. (1982) tabulated the renewal function and variance function for renewal processes with gamma, inverse Gaussian, lognormal, truncated normal and Weibull inter-arrival times. However, they noted that the convergence of the algorithm is slow for some of the parameter values. Chaudhry et al. (2013) took a slightly different approach: they obtained the distribution function, mean and variance of N(t) using the method of roots for numerically inverting the Laplace transform when it can be expressed as a rational function. When the Laplace transform is not a rational function, as in the case of the gamma and inverse Gaussian distributions, they first applied the Padé approximation method to obtain an approximate rational function and then used the roots method.

3. Numerical Results

3.1. Count Probabilities

To illustrate the accuracy of this numerical Laplace transform inversion method, we apply it in calculating the count probabilities for generalized Weibull duration and Erlangian duration and compare the values to those obtained using Formulas (5) and (10), respectively. The formula in Equation (10) is in closed form and simple enough to compute, hence there is no need to use the method which we propose here, but it serves as a good example for this comparison. Since the Laplace transform of the generalized Weibull density function is not available in closed form, we can approximate it using Gaussian quadrature. The computed probabilities are presented in Table 2. The count probabilities for generalized Weibull duration are computed when a = 1, α = 1 and λ = −2, t = 0.25 and t = 1. For the Erlangian count distribution, we compute the probabilities when α = 2, β = 0.8, t = 0.25 and t = 1. In all cases, we find that our approximation is accurate up to at least seven decimal places. To illustrate the issue of overflowing which might occur, we present the count probabilities for generalized Weibull duration when a = 2, α = 1 and λ = −2 and t = 1 in Table 3. It is clear that, in this case, there is a numerical error in the computation of the probabilities with Formula (5) when n = 1, 2 due to instability caused by the presence of an infinite sum in Equation (5) and truncation error.
Using this proposed method, the count probabilities for the convolution of two gamma and inverse Gaussian inter-arrival distributions proposed in Sections 2.1.3 and 2.1.4 can be easily computed. Chaudhry et al. (2013) used the roots method and a Padé approximation method for computing the count probabilities for several inter-arrival time distributions. In Table 4, we compare the probability function of gamma, inverse Gaussian and Weibull count distributions with those obtained by Chaudhry et al. (2013). We note that the difference in the probabilities is at most two decimal places. In the case of the Weibull count distribution, we include only the results when t = 0.25, because the algorithm could not converge for t = 0.60 and t = 1 when λ = 3, which are the other two values included by Chaudhry et al. (2013). Convergence issues with the Weibull renewal function were also discussed by Constantine and Robinson (1997), who developed a convergent damped exponential series by residue calculations of the Laplace transform of the renewal integral equation for the Weibull renewal function when λ > 1.
We compare the pmf of the two count distributions proposed in Section 2.1.3 and Section 2.1.4 with the Poisson distribution. For comparison purposes, the mean for all of the distributions is set to 2, i.e., E(N) = 2. Figure 1 compares the probability functions of the inverse Gaussian count distribution with a Poisson distribution.
Figure 2 compares the probability functions of the convolution of two gamma count distribution with a Poisson distribution. The convolution of two gamma count model can model both over-dispersion and under-dispersion relative to the Poisson distribution.
The convolution of two gamma distributions nests the special case of the convolution of two exponential distributions, that is, when $\alpha_1 = \alpha_2 = 1$. This two-component hypoexponential count distribution with parameters $\beta_1$ and $\beta_2$ can model under-dispersion, and Figure 3 compares its probability function with a Poisson distribution.

3.2. Renewal Function and Variance

Using the probability of the counts computed using our proposed method, we also computed the renewal function and variance function for comparison with those obtained by Chaudhry et al. (2013) and Baxter et al. (1982). The details are presented in Table 5. In most cases, the values computed using our proposed method are closer to those of Baxter et al. (1982). We note that Baxter et al. (1982) verified the accuracy of their extended cubic splining algorithm through comparisons with previous tabulations for the Weibull count distribution in the literature (see Baxter et al. 1982 for details) and a direct evaluation of the incomplete gamma integral for the gamma count distribution.
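The step from count probabilities to the renewal function and variance amounts to taking the first two moments of the pmf. The sketch below illustrates this with the closed-form Erlangian pmf (10) standing in for probabilities obtained by numerical inversion; the function names and the truncation point `n_max` are assumptions for this example. For Erlang inter-arrival times with α = 2 the renewal function has the known closed form $H(t) = \beta t/2 - (1 - e^{-2\beta t})/4$, which provides a check.

```python
import math

def erlang_count_pmf(n, t, alpha, beta):
    """Erlangian count pmf, Equation (10), for integer alpha."""
    x = beta * t
    return math.exp(-x) * sum(x ** (alpha * n + i) / math.factorial(alpha * n + i)
                              for i in range(alpha))

def renewal_mean_and_variance(pmf, n_max=50):
    """Mean E[N(t)] and variance Var[N(t)] from pmf(n), truncated at n_max."""
    probs = [pmf(n) for n in range(n_max + 1)]
    mean = sum(n * p for n, p in enumerate(probs))
    second_moment = sum(n * n * p for n, p in enumerate(probs))
    return mean, second_moment - mean ** 2

# Example: alpha = 2, beta = 2, t = 2 (illustrative values)
mean, variance = renewal_mean_and_variance(lambda n: erlang_count_pmf(n, 2.0, 2, 2.0))
```

In this example the variance is smaller than the mean, in line with the under-dispersion expected for gamma inter-arrival times with α > 1.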

4. Real Data Analysis

Table 6 gives the distribution of the number of monthly non-payments for personal loans in a sample of 2446 clients of a Spanish bank (Dionne et al. 1996). In personal loans, small amounts of money are lent with a relatively short repayment or loan period. The repayment schedule is typically monthly, with a constant amount. The empirical data have a sample mean of 1.109 and a variance of 4.860, indicating the presence of over-dispersion; hence a simple Poisson process may not be sufficient to model the counts. The majority (68.1%) of the counts are zeroes, which correspond to clients who never missed a payment, followed by 11.1% who missed one payment and a cumulative percentage of 11.4% who missed two to four payments. The count distributions are applied to fit this data set. For the simple Poisson count process, observations with expected frequencies which are less than 1.0 are grouped in one class. We also include the log-likelihood function and Akaike information criterion (AIC) values for each fitted model in the tables.
The pmf of the count distributions is evaluated using the numerical inverse Laplace transform method discussed in Section 2.2. The maximum likelihood (ML) estimates of the parameters are obtained with numerical global optimization using the simulated annealing algorithm (Goffe et al. 1994). For numerical stability, we transform the parameters for the generalized Weibull count distributions to their corresponding reciprocals prior to performing ML estimation. The ML estimates are given in Table 7.
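The likelihood and AIC calculations for frequency data can be sketched for the simplest case, the Poisson model. The code below uses a small made-up frequency table, not the data of Table 6, and relies on the closed-form Poisson ML estimate (the sample mean) rather than the simulated annealing used in the study; all names are assumptions for illustration. For the other count models, the Poisson pmf would be replaced by the numerically inverted pmf and the maximization carried out numerically.

```python
import math

# Hypothetical frequency table {count: number of clients}; NOT the data of Table 6.
toy_freq = {0: 60, 1: 20, 2: 10, 3: 6, 4: 4}

def sample_mean_and_variance(freq):
    """Sample mean and (population) variance of counts given as a frequency table."""
    total = sum(freq.values())
    mean = sum(n * f for n, f in freq.items()) / total
    var = sum(f * (n - mean) ** 2 for n, f in freq.items()) / total
    return mean, var

def poisson_loglik(lam, freq):
    """Log-likelihood of a Poisson(lam) model for frequency data."""
    return sum(f * (-lam + n * math.log(lam) - math.lgamma(n + 1)) for n, f in freq.items())

def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik

lam_hat, var_hat = sample_mean_and_variance(toy_freq)   # Poisson MLE is the sample mean
loglik_hat = poisson_loglik(lam_hat, toy_freq)
aic_hat = aic(loglik_hat, 1)
```

In this toy table the variance exceeds the mean, which is the kind of over-dispersion that motivates the alternative count models considered here.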
The count distribution with generalized Weibull as the distribution for inter-arrival times gives the best fit for the data presented in Table 6. Since the generalized Weibull distribution does not have a closed-form Laplace transform, the model fitting takes significantly longer. Among the distributions with a closed-form Laplace transform, the convolution of two gamma count distribution gives the best fit. We also verify that the convolution of two exponentials count distribution gives the same fit as the simple Poisson distribution, implying that this distribution is not suitable for over-dispersed count data. The inverse Gaussian distribution also gives a poor fit to this data set. This is consistent with the behaviour of inter-arrival time distributions that have an increasing hazard function, which give rise to under-dispersed counts, whereas the data are over-dispersed.

5. Discussion and Conclusions

This article examines the modelling of count data commonly encountered in finance and risk management with count distributions arising from non-exponential inter-arrival time distributions in a renewal process. A specific application example on the modelling of loan non-payments is presented. Since the number of non-payments and the time elapsed between non-payments reflect a borrower’s payment behaviour, models which account for these data can assist in the development of further diagnostic techniques such as loan default prediction and tools for early warning detection. Because the calculations are complicated, the computation of the probabilities arising from these distributions is investigated and discussed in this paper. The inversion of the Laplace transform is proposed as a generic method of computation, since the transforms have relatively simple forms compared to the probabilities. The proposed method is compared with some existing techniques in the literature.
When the Laplace transform of the inter-arrival time distribution is not available in closed form, other methods to approximate the Laplace transform for numerical inversion can be explored, such as the infinite series, Gaussian quadrature, Laguerre method and the continued fractions technique. This will be considered elsewhere.

Author Contributions

Conceptualization, S.-H.O.; methodology, Y.-C.L.; software, Y.-C.L.; validation, S.-H.O.; formal analysis, Y.-C.L.; investigation, S.-H.O. and Y.-C.L.; resources, S.-H.O. and Y.-C.L.; data curation, Y.-C.L.; writing—original draft preparation, Y.-C.L.; writing—review and editing, S.-H.O.; visualization, Y.-C.L.; supervision, S.-H.O.; project administration, Y.-C.L.; funding acquisition, S.-H.O. and Y.-C.L. All authors have read and agreed to the published version of the manuscript.

Funding

The authors are supported by the Malaysia Ministry of Higher Education grant FRGS/1/2020/STG06/SYUC/02/1; S.-H.O. is supported by UCSI University grant REIG-FBM-2022/050.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors wish to thank the reviewers for their insightful comments which have greatly improved the paper.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Abate, Joseph, and Peter P. Valkó. 2004. Multi-precision Laplace transform inversion. International Journal for Numerical Methods in Engineering 60: 979–93.
  2. Abate, Joseph, and Ward Whitt. 1992. The Fourier-series method for inverting transforms of probability distributions. Queueing Systems 10: 5–87.
  3. Abate, Joseph, and Ward Whitt. 1995. Numerical Inversion of Laplace Transforms of Probability Distributions. ORSA Journal on Computing 7: 36–43.
  4. Abate, Joseph, Gagan L. Choudhury, and Ward Whitt. 2000. An Introduction to Numerical Transform Inversion and Its Application to Probability Models. In International Series in Operations Research & Management Science. Boston: Springer, pp. 257–323.
  5. Baker, Rose, and Tarak Kharrat. 2017. Event count distributions from renewal processes: Fast computation of probabilities. IMA Journal of Management Mathematics 29: 415–33.
  6. Baxter, Laurence A., Ernest M. Scheuer, Denis J. McConalogue, and Wallace R. Blischke. 1982. On the Tabulation of the Renewal Function. Technometrics 24: 151.
  7. Chaudhry, Mohan L., Xiaofeng Yang, and Boon Ong. 2013. Computing the Distribution Function of the Number of Renewals. American Journal of Operations Research 3: 380–86.
  8. Chhikara, Raj S., and J. Leroy Folks. 1977. The Inverse Gaussian Distribution as a Lifetime Model. Technometrics 19: 461–68.
  9. Constantine, A. Graham, and Neville I. Robinson. 1997. The Weibull renewal function for moderate to large arguments. Computational Statistics & Data Analysis 24: 9–27.
  10. Consul, Prem C., and Gaurav C. Jain. 1973. A Generalization of the Poisson Distribution. Technometrics 15: 791–99.
  11. Dionne, Georges, Manuel Artís, and Montserrat Guillén. 1996. Count data models for a credit scoring system. Journal of Empirical Finance 3: 303–25.
  12. Dubner, Harvey, and Joseph Abate. 1968. Numerical inversion of Laplace transforms by relating them to the finite Fourier cosine transform. Journal of the ACM 15: 115–23.
  13. Erdelyi, Arthur M., Fritz Oberhettinger, and Francesco G. Tricomi. 1953. Higher Transcendental Functions. New York: McGraw-Hill.
  14. From, Steven G. 2004. Approximating the distribution of a renewal process using generalized Poisson distributions. Journal of Statistical Computation and Simulation 74: 667–81.
  15. Goffe, William L., Gary D. Ferrier, and John Rogers. 1994. Global optimization of statistical functions with simulated annealing. Journal of Econometrics 60: 65–99.
  16. Greenwood, Major, and G. Udny Yule. 1920. An Inquiry into the Nature of Frequency Distributions Representative of Multiple Happenings with Particular Reference to the Occurrence of Multiple Attacks of Disease or of Repeated Accidents. Journal of the Royal Statistical Society 83: 255.
  17. Gupta, Ramesh C., and Seng-Huat Ong. 2005. Analysis of Long-Tailed Count Data by Poisson Mixtures. Communications in Statistics–Theory and Methods 34: 557–73.
  18. Holla, M. S. 1967. On a Poisson-inverse Gaussian distribution. Metrika 11: 115–21.
  19. Johnson, Norman L., Adrienne W. Kemp, and Samuel Kotz. 2005. Univariate Discrete Distributions, 3rd ed. New York: John Wiley and Sons.
  20. Jose, K. Kanichukattu, and Bindu Abraham. 2011. A Count Model Based on Mittag-Leffler Interarrival Times. Statistica LXXI: 501–14.
  21. Jose, K. Kanichukattu, and Bindu Abraham. 2013. A Counting Process with Gumbel Inter-arrival Times for Modeling Climate Data. Journal of Environmental Statistics 4: 13.
  22. Karlis, Dimitris, and Mohieddine Rahmouni. 2007. Analysis of defaulters’ behaviour using the Poisson-mixture approach. IMA Journal of Management Mathematics 18: 297–311.
  23. Lindholm, Mathias, and Henning Zakrisson. 2022. A Collective Reserving Model with Claim Openness. ASTIN Bulletin: The Journal of the IAA 52: 117–43.
  24. Maciak, Matus, Ostap Okhrin, and Michal Pesta. 2021. Infinitely Stochastic Micro Reserving. Insurance: Mathematics and Economics 100: 30–58.
  25. McShane, Blake, Moshe Adrian, Eric T. Bradlow, and Peter S. Fader. 2008. Count Models Based on Weibull Interarrival Times. Journal of Business & Economic Statistics 26: 369–78.
  26. Mestiri, Sami, and Abdeljelil Farhat. 2021. Using Non-parametric Count Model for Credit Scoring. Journal of Quantitative Economics 19: 39–49.
  27. Moschopoulos, Peter G. 1985. The distribution of the sum of independent gamma random variables. Annals of the Institute of Statistical Mathematics 37: 541–44.
  28. Mudholkar, Govind S., Deo Kumar Srivastava, and Georgia D. Kollia. 1996. A Generalization of the Weibull Distribution with Application to the Analysis of Survival Data. Journal of the American Statistical Association 91: 1575–83.
  29. Nadarajah, Saralees, and Stephen Chan. 2018. Discrete distributions based on inter arrival times with application to football data. Communications in Statistics–Theory and Methods 47: 147–65.
  30. Ong, Seng-Huat, Atanu Biswas, Shelton Peiris, and Yeh-Ching Low. 2015. Count distribution for generalized Weibull duration with applications. Communications in Statistics–Theory and Methods 44: 4203–16.
  31. Sankaran, Munuswamy. 1968. Mixtures by the Inverse Gaussian Distribution. Sankhyā: The Indian Journal of Statistics, Series B 30: 455–58.
  32. Seshadri, Vanamamalai. 1999. The Inverse Gaussian Distribution. Lecture Notes in Statistics. New York: Springer.
  33. Smith, W. L., and M. Ross Leadbetter. 1963. On the Renewal Function for the Weibull Distribution. Technometrics 5: 393–96.
  34. Thomas, Lyn C., Anna Matuszyk, Mee Chi So, Christophe Mues, and Angela Moore. 2016. Modelling repayment patterns in the collections process for unsecured consumer debt: A case study. European Journal of Operational Research 249: 476–86.
  35. Winkelmann, Rainer. 1995. Duration Dependence and Dispersion in Count-Data Models. Journal of Business & Economic Statistics 13: 467.
Figure 1. Plots of Poisson and inverse Gaussian probabilities: (a) λ = 0.17, μ = 1 (over-dispersion); (b) λ = 1, μ = 0.438 (under-dispersion).
Figure 2. Plots of Poisson and convolution of two gamma probabilities: (a) α1 = 1.5, α2 = 1.9 (under-dispersion); (b) α1 = 0.2, α2 = 0.5 (over-dispersion).
Figure 3. Plot of Poisson and convolution of two exponentials probabilities: β1 = 4.2, β2 = 4.85 (under-dispersion).
Table 1. Some existing count distributions in renewal theory.

| Inter-Arrival Time Distribution | Probability Mass Function (pmf) of Corresponding Count Distribution |
|---|---|
| Gamma | $\Pr\{N(t)=n\} = G(\alpha n, \beta t) - G(\alpha n + \alpha, \beta t)$, where $G(\alpha n, \beta t) = \frac{1}{\Gamma(n\alpha)} \int_0^{\beta t} u^{n\alpha-1} e^{-u}\,du$ |
| Weibull | $\Pr\{N(t)=n\} = \sum_{j=n}^{\infty} \frac{(-1)^{j+n} (\lambda t^{c})^{j} \alpha_j^{n}}{\Gamma(cj+1)}$, where $\alpha_j^{0} = \frac{\Gamma(cj+1)}{\Gamma(j+1)}$, $j = 0, 1, 2, \ldots$, and $\alpha_j^{n+1} = \sum_{m=n}^{j-1} \alpha_m^{n} \frac{\Gamma(cj - cm + 1)}{\Gamma(j - m + 1)}$, $n = 0, 1, 2, \ldots$, $j = n+1, n+2, n+3, \ldots$ |
| Mittag-Leffler | $\Pr\{N(t)=n\} = \sum_{j=n}^{\infty} \binom{j}{n} (-1)^{j-n} \frac{t^{j\alpha}}{\Gamma(1 + j\alpha)}$ |
| Gumbel Type II | $\Pr\{N(t)=n\} = \sum_{j=n}^{\infty} \frac{(-1)^{j+n} (b t^{-a})^{j} \delta_j^{n}}{\Gamma(-aj+1)}$, $a < 0$, where $\delta_j^{0} = \frac{\Gamma(-aj+1)}{\Gamma(j+1)}$, $j = 0, 1, 2, \ldots$, and $\delta_j^{n+1} = \sum_{m=n}^{j-1} \delta_m^{n} \frac{\Gamma(-aj + am + 1)}{\Gamma(j - m + 1)}$, $n = 0, 1, 2, \ldots$, $j = n+1, n+2, n+3, \ldots$ |
| Generalized Weibull | $\Pr\{N(t)=n\} = (a\alpha)^{n} \sum_{p=0}^{\infty} \frac{(-a/\lambda)^{p}}{\Gamma(\alpha(p+n)+1)}\, t^{\alpha(p+n)}\, c_n(p)$, where $c_n(p) = \sum_{q=0}^{p} \binom{\lambda-1}{q} \Gamma(\alpha(q+1))\, c_{n-1}(p-q)$, $n \geq 1$, and $c_0(p) = \binom{\lambda}{p} \Gamma(\alpha p + 1)$ |
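To make the recursive structure of these series concrete, the following is a minimal sketch that evaluates the Weibull row of Table 1 by truncating the infinite sum. The function name `weibull_count_pmf`, the argument names and the truncation point `terms=40` are assumptions for illustration only. The sketch uses the fact that the n = 0 case of the series equals exp(−λt^c), the Weibull survival function, and that c = 1 recovers the Poisson distribution.

```python
import math


def weibull_count_pmf(n, t, lam, c, terms=40):
    """Truncated series for Pr{N(t) = n} with Weibull inter-arrival times.

    `terms` is an assumed truncation point; the series in Table 1 is infinite.
    """
    # alpha[k][j] stores alpha_j^k; entries with j < k are never used
    alpha = [[math.gamma(c * j + 1) / math.gamma(j + 1) for j in range(terms)]]
    for k in range(n):
        prev = alpha[-1]
        nxt = [0.0] * terms
        for j in range(k + 1, terms):
            # alpha_j^{k+1} = sum_{m=k}^{j-1} alpha_m^k * Gamma(cj - cm + 1) / Gamma(j - m + 1)
            nxt[j] = sum(prev[m] * math.gamma(c * (j - m) + 1) / math.gamma(j - m + 1)
                         for m in range(k, j))
        alpha.append(nxt)
    x = lam * t ** c
    return sum((-1) ** (j + n) * x ** j * alpha[n][j] / math.gamma(c * j + 1)
               for j in range(n, terms))


# With c = 1 the inter-arrival times are exponential, so the counts are Poisson
lam, t = 1.1092, 1.0
for k in range(4):
    poisson = math.exp(-lam * t) * (lam * t) ** k / math.factorial(k)
    print(k, weibull_count_pmf(k, t, lam, 1.0), poisson)
```

The alternating signs and gamma-function ratios in such series are a source of the numerical difficulties that motivate transform inversion, so a truncated sum of this kind should be checked for convergence before it is used for other parameter values.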
Table 2. Computation of probabilities Pr{N(t) = n} for (a) generalized Weibull, and (b) Erlangian count distributions using the proposed method and pmf formula. Differences are written as x (−k), meaning x × 10^−k.

(a) Generalized Weibull count distribution

| n | Proposed Method (t = 0.25) | Pmf Formula (t = 0.25) | Difference (t = 0.25) | Proposed Method (t = 1) | Pmf Formula (t = 1) | Difference (t = 1) |
|---|---|---|---|---|---|---|
| 0 | 0.790123462190233 | 0.790123456790123 | 5.4001 (−9) | 0.444444446077630 | 0.444444444444444 | 1.6331 (−9) |
| 1 | 0.185268558281666 | 0.185268554955749 | 3.3259 (−9) | 0.341447772405153 | 0.341447770099717 | 2.3054 (−9) |
| 2 | 0.022624019619715 | 0.022624018469588 | 1.1501 (−9) | 0.152421254574663 | 0.152421252253988 | 2.3207 (−9) |
| 3 | 0.001862447034136 | 0.001862446759278 | 2.7486 (−10) | 0.047632000079489 | 0.047631998279757 | 1.7997 (−9) |
| 4 | 0.000115528824677 | 0.000115528774610 | 5.0067 (−11) | 0.011418307350013 | 0.011418306220399 | 1.1296 (−9) |
| 5 | 0.000005746921940 | 0.000005746914580 | 7.3600 (−12) | 0.002217009636005 | 0.002217009042290 | 5.9371 (−10) |
| 6 | 0.000000238568216 | 0.000000238567310 | 9.0600 (−13) | 0.000361439244000 | 0.000361438976100 | 2.6790 (−10) |
| 7 | 0.000000008496400 | 0.000000008496304 | 9.0600 (−13) | 0.000050759289875 | 0.000050759184107 | 1.0577 (−10) |

(b) Erlangian count distribution

| n | Proposed Method (t = 0.25) | Pmf Formula (t = 0.25) | Difference (t = 0.25) | Proposed Method (t = 1) | Pmf Formula (t = 1) | Difference (t = 1) |
|---|---|---|---|---|---|---|
| 0 | 0.982476912658251 | 0.982476903693578 | 8.9647 (−9) | 0.808792138560495 | 0.808792135410999 | 3.1495 (−9) |
| 1 | 0.017466257275868 | 0.017466256065664 | 1.2102 (−9) | 0.182128011589934 | 0.182128006788847 | 4.8011 (−9) |
| 2 | 0.000056765366099 | 0.000056765332213 | 3.3886 (−11) | 0.008895517173780 | 0.008895515278950 | 1.8948 (−9) |
| 3 | 0.000000074855777 | 0.000000074855383 | 3.9400 (−13) | 0.000182292662905 | 0.000182292332810 | 3.3009 (−10) |
| 4 | 0.000000000053140 | 0.000000000053138 | 2.0000 (−15) | 0.000002035889418 | 0.000002035857392 | 3.2026 (−11) |
| 5 | 0.000000000000024 | 0.000000000000024 | 0.0000 | 0.000000014264304 | 0.000000014262333 | 1.9710 (−12) |
| 6 | 0.000000000000000 | 0.000000000000000 | 0.0000 | 0.000000000068513 | 0.000000000068429 | 8.4000 (−14) |
| 7 | 0.000000000000000 | 0.000000000000000 | 0.0000 | 0.000000000000241 | 0.000000000000239 | 1.9999 (−15) |
Table 3. Count probabilities Pr{N(t) = n} for the generalized Weibull count distribution when a = 2, α = 1, λ = −2 and t = 1.

| n | Formula | Proposed Inverse Laplace Transform Method |
|---|---|---|
| 0 | 0.2500 | 0.2500 |
| 1 | 63.5982 | 0.2971 |
| 2 | 2.3327 | 0.2305 |
| 3 | 0.1839 | 0.1317 |
| 4 | 0.0604 | 0.0593 |
| 5 | 0.0220 | 0.0220 |
| 6 | 0.0069 | 0.0069 |
| 7 | 0.0019 | 0.0019 |
Table 4. Computation of probabilities for (a) gamma, (b) inverse Gaussian, and (c) Weibull count distributions for selected values of t using (i) proposed method, (ii) method of Chaudhry et al. (2013).

(a) Gamma count distribution

| t | Pr(N(t) = 0) (i) | Pr(N(t) = 0) (ii) | Pr(N(t) = 1) (i) | Pr(N(t) = 1) (ii) | Pr(N(t) = 2) (i) | Pr(N(t) = 2) (ii) | Pr(N(t) = 3) (i) | Pr(N(t) = 3) (ii) | Pr(N(t) = 4) (i) | Pr(N(t) = 4) (ii) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 0.6938 | 0.6871 | 0.2341 | 0.2385 | 0.0579 | 0.0602 | 0.0117 | 0.0119 | 0.0021 | 0.0019 |
| 0.4 | 0.4061 | 0.4071 | 0.3092 | 0.3088 | 0.1683 | 0.1677 | 0.0744 | 0.0743 | 0.0283 | 0.0284 |
| 1.25 | 0.1291 | 0.1291 | 0.1952 | 0.1951 | 0.2050 | 0.2050 | 0.1730 | 0.1730 | 0.1249 | 0.1249 |

(b) Inverse Gaussian count distribution

| t | Pr(N(t) = 0) (i) | Pr(N(t) = 0) (ii) | Pr(N(t) = 1) (i) | Pr(N(t) = 1) (ii) | Pr(N(t) = 2) (i) | Pr(N(t) = 2) (ii) | Pr(N(t) = 3) (i) | Pr(N(t) = 3) (ii) | Pr(N(t) = 4) (i) | Pr(N(t) = 4) (ii) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.25 | 0.7394 | 0.7445 | 0.2497 | 0.2442 | 0.0108 | 0.0112 | 0.0001 | 0.0001 | 0.0000 | 0.0000 |
| 0.7 | 0.3377 | 0.3390 | 0.4070 | 0.4042 | 0.2044 | 0.2062 | 0.0460 | 0.0457 | 0.0047 | 0.0046 |
| 1.0 | 0.1623 | 0.1623 | 0.2865 | 0.2869 | 0.2871 | 0.2867 | 0.1763 | 0.1762 | 0.0681 | 0.0683 |

(c) Weibull count distribution

| t | Pr(N(t) = 0) (i) | Pr(N(t) = 0) (ii) | Pr(N(t) = 1) (i) | Pr(N(t) = 1) (ii) | Pr(N(t) = 2) (i) | Pr(N(t) = 2) (ii) | Pr(N(t) = 3) (i) | Pr(N(t) = 3) (ii) | Pr(N(t) = 4) (i) | Pr(N(t) = 4) (ii) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.25 | 0.9845 | 0.9841 | 0.0155 | 0.0159 | 0.0000 | 0.0000 | 0.0000 | - | 0.0000 | - |
Table 5. Computation of renewal and variance functions for (a) gamma, (b) inverse Gaussian, and (c) Weibull count distributions for selected values of t using (i) proposed method, (ii) method of Baxter et al. (1982), and (iii) method of Chaudhry et al. (2013).

(a) Gamma count distribution

| t | Renewal Function (i) | Renewal Function (ii) | Renewal Function (iii) | Variance Function (i) | Variance Function (ii) | Variance Function (iii) |
|---|---|---|---|---|---|---|
| 0.1 | 0.3953 | 0.3933 | 0.4040 | 0.4580 | 0.4485 | 0.4623 |
| 0.4 | 1.0560 | 1.0550 | 1.0545 | 1.3954 | 1.3901 | 1.3970 |
| 1.25 | 2.6662 | 2.6653 | 2.6663 | 4.0491 | 4.0441 | 4.0487 |

(b) Inverse Gaussian count distribution

| t | Renewal Function (i) | Renewal Function (ii) | Renewal Function (iii) | Variance Function (i) | Variance Function (ii) | Variance Function (iii) |
|---|---|---|---|---|---|---|
| 0.25 | 0.2716 | 0.2715 | 0.2669 | 0.2198 | 0.2200 | 0.2188 |
| 0.7 | 0.9739 | 0.9739 | 0.9736 | 0.7717 | 0.7718 | 0.7732 |
| 1.0 | 1.7636 | 1.7638 | 1.7635 | 1.5290 | 1.5293 | 1.5294 |

(c) Weibull count distribution

| t | Renewal Function (i) | Renewal Function (ii) | Renewal Function (iii) | Variance Function (i) | Variance Function (ii) | Variance Function (iii) |
|---|---|---|---|---|---|---|
| 0.25 | 0.0155 | 0.0156 | 0.0159 | 0.0153 | 0.0154 | 0.0156 |
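Table 6. Number of monthly non-payments for personal loan (Dionne et al. 1996). All columns other than Count and Observed give expected frequencies under the named inter-arrival time distribution.

| Count | Observed | Exponential | Gamma | Convolution of Two Exponentials | Convolution of Two Gamma | Inverse Gaussian | Weibull | Generalized Weibull |
|---|---|---|---|---|---|---|---|---|
| 0 | 1665 | 806.78 | 1159.28 | 806.78 | 1159.18 | 703.13 | 1156.51 | 1172.12 |
| 1 | 271 | 894.85 | 610.04 | 894.85 | 609.94 | 614.81 | 607.38 | 599.05 |
| 2 | 101 | 496.26 | 320.92 | 496.26 | 320.89 | 470.06 | 319.98 | 309.55 |
| 3 | 73 | 183.48 | 168.77 | 183.48 | 168.79 | 314.25 | 169.15 | 162.84 |
| 4 | 106 | 50.88 | 88.73 | 50.88 | 88.78 | 183.69 | 89.74 | 87.75 |
| 5 | 72 | 11.29 | 46.64 | 11.29 | 46.68 | 93.88 | 47.80 | 48.58 |
| 6 | 43 | 2.09 | 24.51 | 2.09 | 24.55 | 41.96 | 25.56 | 27.60 |
| 7 | 31 | 0.38 | 12.87 | 0.38 | 12.90 | 16.39 | 13.72 | 16.00 |
| 8 | 31 | | 6.76 | | 6.78 | 5.60 | 7.39 | 9.39 |
| 9 | 25 | | 3.55 | | 3.56 | 1.67 | 4.00 | 5.53 |
| 10 | 19 | | 1.86 | | 1.87 | 0.44 | 2.17 | 3.25 |
| 11 | 9 | | 0.98 | | 0.98 | 0.10 | 1.18 | 1.89 |
| 12 or more | 0 | | 1.08 | | 1.09 | 0.02 | 1.42 | 2.44 |
| Total | | 2446.00 | 2446.00 | 2446.00 | 2446.00 | 2446.00 | 2446.00 | 2446.00 |
| χ² | | 37,242.91 | 1111.77 | 37,242.91 | 1108.75 | 4057.66 | 1032.59 | 838.51 |
| Log-likelihood | | −4954.79 | −3569.93 | −4954.79 | −3569.49 | −4231.06 | −3558.13 | −3511.39 |
| AIC | | 9911.57 | 7143.85 | 9913.57 | 7146.99 | 8466.11 | 7118.27 | 7028.77 |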
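The Exponential column of Table 6 can be reproduced directly from the observed frequencies, because exponential inter-arrival times give Poisson counts and the Poisson maximum likelihood estimate is the sample mean. The following is a minimal sketch of that check. The variable names are illustrative, and the "12 or more" cell is treated as the value 12, which has no effect here because its observed frequency is zero.

```python
import math

# Observed frequencies from Table 6 for counts 0, 1, ..., 11 and "12 or more"
observed = [1665, 271, 101, 73, 106, 72, 43, 31, 31, 25, 19, 9, 0]

total = sum(observed)
mean = sum(k * f for k, f in enumerate(observed)) / total
variance = sum(f * (k - mean) ** 2 for k, f in enumerate(observed)) / total

# Poisson expected frequencies with rate equal to the sample mean
expected = [total * math.exp(-mean) * mean ** k / math.factorial(k) for k in range(8)]

print("sample mean (Poisson MLE):", round(mean, 4))
print("variance-to-mean ratio:", round(variance / mean, 2))
print("expected frequencies:", [round(e, 2) for e in expected])
```

The variance-to-mean ratio of the observed counts is well above 1, which is the over-dispersion that the Poisson model, and hence the exponential inter-arrival assumption, cannot capture.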
Table 7. ML estimates of the fitted distributions.

| Inter-Arrival Distribution | ML Estimates of Parameters |
|---|---|
| Exponential | $\hat{\lambda} = 1.1092$ |
| Gamma | $\hat{\alpha} = 0.0136$, $\hat{\beta} = 0.0000$ |
| Convolution of two exponentials | $\beta_1 = 1.1092$, $\beta_2$ |
| Convolution of two gamma | $\hat{\alpha}_1 = 0.0097$, $\hat{\beta}_1 = 0.0000$, $\hat{\alpha}_2 = 0.0000$, $\hat{\beta}_2 = 4.5611$ |
| Inverse Gaussian | $\hat{\lambda} = 0.1358$, $\hat{\mu}$ |
| Weibull | $\hat{\alpha} = 18.2613$, $\hat{\lambda} = 3.0684$ |
| Generalized Weibull | $\hat{a} = 40.6405$; $\hat{\alpha} = 1.0000$, $\hat{\lambda} = 0.2044$ |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Low, Y.-C.; Ong, S.-H. Modelling of Loan Non-Payments with Count Distributions Arising from Non-Exponential Inter-Arrival Times. J. Risk Financial Manag. 2023, 16, 150. https://doi.org/10.3390/jrfm16030150
