1. Introduction
Over the last century, many significant distributions have been introduced to serve as models in applied sciences. Among them, the so-called generalized beta distribution introduced by [
1] is at the top of the list in terms of usefulness. The main feature of the generalized beta distribution is to be very rich; It includes a plethora of named distributions (to our knowledge, over thirty). In this paper, we focus our attention on one of the most attractive of these distributions, known as the inverse Lomax (IL) distribution. Mathematically, it also corresponds to the distribution of
, where
X is a random variable following the famous Lomax distribution (see [
2]). Thus, the cumulative distribution function (cdf) of the IL distribution is given by
where
is a positive scale parameter and
is a positive shape parameter, and
for
. The reasons for the interest of the IL distribution are the following ones. It has proved itself as a statistical model in various applications, including economics and actuarial sciences (see [
3]) and geophysics (see [
4]). Also, the mathematical and inferential aspects of the IL distribution are well developed. See, for instance, Reference [
5] for the Lorenz ordering of order statistics, Reference [
6] for the parameters estimation in a Bayesian setting, Reference [
7] for the parameters estimation from hybrid censored samples, Reference [
8] for the study of the reliability estimator under type II censoring and Reference [
9] for the Bayesian estimation of two-component mixture of the IL distribution under type I censoring. Despite an interesting compromise between simplicity and accuracy, the IL model suffers from a certain rigidity in the peak (punctual and roundness) and tail properties. This motivates the developments of diverse parametric extensions, such as the inverse power Lomax distribution introduced by [
10], the Weibull IL distribution studied by [
11] and the Marshall-Olkin IL distribution studied by [
12].
In this paper, we introduce and discuss a new extension of the IL distribution based on the half logistic generated family of distributions studied by [
13]. This general family is defined by the following cdf:
where
is a positive shape parameter and
denotes a cdf of a continuous univariate distribution. Here,
represents a vector of parameter(s) related to the corresponding parental distribution. The main motivations behind the half logistic generated family are as follows. First, it is proved in [
13] that the combined effect of the considered ratio and power transformations can significantly enrich the parental distribution, improving the flexibility of the mode, median, skewness and tail, with a positive impact on the analyses of practical data sets. The normal, Weibull and Fréchet distributions have been considered as parents in [
13]. Recent studies have also pointed out these facts in other practical settings. See, for instance, References [
14,
15,
16] which consider the generalized Weibull, Topp-Leone and power Lomax distributions as parental distributions, respectively. However, the half logistic generated family remains underexploited and, based on the previous promising studies, needs more thorough attention. In this paper, we contribute in this direction of work, with the consideration of the IL distribution as a parent.
Thus, we introduce a new attractive lifetime distribution with three parameters called the half logistic IL (HLIL) distribution. The cdf of the HLIL distribution with parameter vector
is obtained by inserting (
1) into (
2) as
where
is a positive scale parameter, and
and
are two positive shape parameters, and
for
. With various theoretical and practical developments, we show how the functionalities of the half logistic generated family confer new application perspectives to the IL distribution. In particular, the preference of the HLIL distribution over the IL distribution is motivated by the following findings: (i) the decay rate of convergence of the probability density function (pdf) of the HLIL distribution can be modulated by adjusting
contrary to the fixed one of the IL distribution, (ii) the mode of the HLIL distribution is positively affected by
on the numerical point of view, (iii) the almost constant hazard rate shape is reachable for the HLIL distribution, this is not the case of the IL distribution, (iv) the skewness of the HLIL distribution is more flexible to the IL distribution, with the observation of an ‘almost symmetric shape’ for some parameter values, and (v) the kurtosis of the HLIL distribution varies between the mesokurtic and leptokurtic cases, with a specific ‘roundness shape property’ for some parameter values. Thus, from the modeling point of view, in the category of heavy right-tailed distributions, the HLIL distribution is very flexible on the peak and tails, and its pdf allows the ‘almost symmetric shape and/or round shape fitting’ of the histogram of the data, which is not possible with the IL distribution, for instance. In this study, after developing the mathematical features of the HLIL distribution, we emphasize its inferential aspects, which are of comparable complexity with those of the IL distribution. In particular, from the related HLIL model, we develop the parameters estimation by six various well-established methods. Also, a simulation study on the estimation of the Rényi entropy is proposed. Then, three applications in environment, engineering and insurance are given; Three data sets in these fields are analysed in the rules of the art, revealing the superiority of the HLIL model over some known competitors, including the reference IL model.
The remainder of this article is arranged as follows. The important functions related to the HLIL distribution are presented in
Section 2.
Section 3 investigates some properties of the HLIL distribution such as stochastic orders, moments, Rényi and
q-entropies, and order statistics.
Section 4 studies parameter estimates for the HLIL model based on six different methods, namely: maximum likelihood (ML), least squares (LS), weighted least squares (WLS), percentile (PC), Cramér-von Mises (CV) and maximum product of spacing (MPS) methods, with practical guarantees via a simulation study. Two real life data sets are presented and analyzed to show the benefit of the HLIL model compared with some other known models in
Section 5. The article ends with some concluding remarks in
Section 6.
2. Important Functions
This section is devoted to some important functions of the HLIL distribution, with applications.
2.1. The Probability Density Function
Upon differentiation of
with respect to
x, the pdf of the HLIL distribution is given by
An analytical study of this function is proposed below. First, the following equivalence holds. As , we have , and, as , we have . Thus, for , we have , for , we have and for , we have . Furthermore, without special constraints on the parameters, we have with a converge rate depending on the parameter making the difference with the former IL distribution. The obtained equivalence results are also crucial to determine the existence of moment measures and functions (raw moments, moment generating function, entropy…)
A change point of
, say
, satisfies
or, equivalently,
. After some developments, we arrive at the following equation:
Clearly, from this equation, we can not derive an analytical formula for . It is however clear that it is modulated by the parameter , offering a certain flexibility in this regard. Also, the presence of several change points is not excluded. However, for a given , it can be determined numerically by using any standard mathematical software, as Mathematica, Matlab or R.
From the theoretical point of view, the fine shape properties of
are given by the study of the signs of the first and second order derivatives of
with respect to
x. However, the expressions obtained are very complex and do not make it possible to determine the exact range of values of the parameters characterizing the different shapes. This obstacle motivates a graphic study.
Figure 1 displays some plots of
for some different values of the parameters.
From
Figure 1, we notice that the pdf of the HLIL distribution can be reversed J, right-skewed and almost symmetrical shaped. Based on several empirical observations, we conjecture that the J shapes are mainly observed for
. The various forms of the curves also indicate a certain versatility in mode, skewness and kurtosis with is clearly a plus in terms of modeling; the pdf allows the ‘almost symmetric shape and/or round shape fitting’ of the histogram of heavy-tailed data. This specificity is not observed for the IL distribution, as attested by
Figure 2.
2.2. Reliability Functions
Here, we present the main reliability functions of the HLIL distribution, which are of importance for various applications. The survival function (sf) and cumulative hazard rate function (chrf) of the HLIL distribution are, respectively, given by
and
for
, and
and
for
. Also, the hazard rate function (hrf) is obtained as
and
for
. The analytical study of this hrf is important to understand the modeling capacity of the HLIL model. First, the following equivalences are true. As
, we have
and, as
, we have
. We deduce the following asymptotic properties, which are similar to those for
. For
, we have
, for
, we have
and for
, we have
. Furthermore, we always have
. As an alpha remark, based on the obtained asymptote as
, for
close to 1, constant shapes for the
can be observed for small values of
x.
On the other side, a change point of
, say
, satisfies
. Some algebraic manipulations give the following equation:
For a given
, the critical points for
can be determined numerically by using a mathematical software. Also, the fine shape analysis for
requires its first and second order derivatives with respect to
x. The complexity of these functions is an obstacle to determine the exact range of values of the parameters characterizing the different shapes. A graphic approach is thus preferred. Thats is, in order to illustrate its flexibility,
Figure 3 displays some plots of
for some different values of the parameters.
We notice that the hrf shows decreasing, ‘concave increasing’ and upside down bathtub shapes, as well as almost constant shapes. It is worth mentioning that the ‘concave increasing’ and almost constant shapes are not clearly observed for the IL distribution, as illustrated in
Figure 4.
2.3. Quantile Function with Applications
The quantile function (qf) of the HLIL distribution, say
, is given by the following non-linear equation:
with
. After some algebra, we get
The quartiles are given by
,
and
. In particular, the median of the HLIL distribution is given by
Skewness and kurtosis measures can be defined from the qf such as the Bowley skewness described by [
17] and the Moors kurtosis introduced by [
18]. They are, respectively, defined by
and
The sign of is informative on the ‘direction’ of the skewness (left skewed if , right-skewed if and almost symmetrical if ). As a complementary indicator, the value of indicates the ‘tailedness’ of the distribution. The advantages of these measures are that they always exist and that they are easy to calculate. Also, they are not influenced by possible extreme tails of the distribution, contrary to the skewness and kurtosis measures defined via the moments.
4. Different Methods of Estimation and Simulation
In this section, we turn out the HLIL distribution as a model, and we investigate the estimation of , and by different notorious methods.
For the purposes of this study, denotes a random sample of size n from the HLIL distribution. Thus, represent the data, and their ascending ordering values are denoted by .
4.1. Maximum Likelihood Estimates (MLEs)
For its theoretical properties, and its practical properties as well, the most popular estimation method remains the maximum likelihood method. The maximum likelihood estimates (MLEs) of
are given by
, where
denotes the likelihood function defined by
or, equivalently,
, where
denotes the log-likelihood function defined by
The differentiability of the above function can provide us the requested estimates of the parameter vector
, through the (simultaneously) solution of the three non-linear equations corresponding to
, with respect to
. On the basis of the second partial derivative of
, we determine the inverse observed information matrix whose diagonal elements correspond to the square of the standard errors (SEs) of the MLEs. The general formula can be found in [
23].
4.2. Ordinary and Weighted Least Squares Estimates (LSEs) and (WLSEs)
The (ordinary) least squares estimates (LSEs) of
are now investigated. They are given by
, where
They can be obtained by solving simultaneously the three non-linear equations summarized as , with respect to .
In a similar fashion, the weighted least squares estimates (WLSEs) of
are defined by
, where
In a similar manner to the LSEs, the WLEs can be obtained by solving simultaneously the three non-linear equations summarized as , with respect to .
We refer to [
24] for further detail on least squares methods in full generality.
4.3. Percentile Estimates (PCEs)
The percentile estimates (PCEs) was pioneered by [
25] (for the Weibull distribution). Here, the PCEs of
are defined by
, where
where
. They can be obtained by solving simultaneously the three non-linear equations corresponding to
, with respect to
.
4.4. Cramér-Von Mises Minimum Distance Estimates (CVEs)
Following the approach described by [
26], the Cramér-von Mises minimum distance estimates (CVEs) of
are defined by
, where
They can be obtained by solving simultaneously the three non-linear equations summarized as , with respect to .
4.5. Maximum Product of Spacings Estimates (MPSEs)
The method of maximum product of spacings estimation was introduced by [
27]. We now apply it in the context of the HLIL model. Let us set
for
, where, by convention,
and
. The maximum product spacings estimates (MPSEs) of
are defined by
, where
or, equivalently,
, where
They can be obtained by solving simultaneously the three non-linear equations summarized as , with respect to .
4.6. Numerical Results
We now investigate the behavior of the ML, LS, WLS, PC, CV and MPS estimates presented above, as well as entropy estimates. The software Mathcad(14) is used. We generate 1000 random samples
of sizes
, 200, 300, 1000 and 2000 from the HLIL distribution via the so-called inverse transform sampling based on the qf given by (
6) (see [
28], for instance). The two following sets of parameters are chosen as: Set1:
and Set2:
.
Then, the mean ML, LS, WLS, PC, CV and MPS estimates (Estim) of
,
and
are calculated, as well as their standard mean square errors (MSEs). Numerical results are shown in
Table 5 and
Table 6 for Set1 and Set2, respectively.
For all estimates, as expected, the MSEs decrease as sample sizes increase.
The MSEs of the MPSEs of take the lowest value among the corresponding MSEs for the other methods in almost all cases. Other details are given below.
- –
For the MLEs, when increases, the MSEs for and decrease but the MSE for is increasing.
- –
For the LSEs, when increases, the MSEs for , and are increasing.
- –
For the WLSEs, when increases, the MSEs for and are decreasing but the MSE for is increasing.
- –
For the CVEs, when increases, the MSEs for and are decreasing but the MSE for is increasing
- –
For the the PCEs, when increases, the MSEs for , and are increasing except for at where they decrease.
- –
For the MPSEs, when increases, the MSEs for and are decreasing but the MSE for is increasing.
We complete this part by performing a simulation study on the estimation of the Rényi entropy
defined by (
11) for various values of
:
,
and
. We consider the same framework as above, the ML estimates of
,
and
, as well as the plugging principle. Then, we determine the exact values of the entropy, estimates and the relative biases (RBs) defined by the generic formula: RB = mean of (Estimate − Exact value)/Exact value. The numerical results are collected in
Table 7 and
Table 8 for Set1 and Set2, respectively.
As expected, from
Table 7 and
Table 8, we notice that, when the sample sizes increase, the RBs of the Rényi entropy decrease. Also
when increases, the RBs of the Rényi entropy decrease.
when increases, the RBs of the Rényi entropy decrease.
5. Applications
In this section, we show the interest of the HLIL distribution using three real data sets in the field of environment, engineering and insurance. The data sets are presented below.
The first data set is obtained from [
29]. It consists of 30 successive values of March precipitation in Minneapolis. The measurements are in inches. The data are given by: 0.77, 1.74, 0.81, 1.20, 1.95, 1.20, 0.47, 1.43, 3.37, 2.20, 3.00, 3.09, 1.51, 2.10, 0.52, 1.62, 1.31, 0.32, 0.59, 0.81, 2.81, 1.87, 1.18, 1.35, 4.75, 2.48, 0.96, 1.89, 0.90, 2.05.
The second data set is the Aircraft Windshield data set by [
30]. The data consist of the failure times of 84 Aircraft Windshield. The unit for measurement is 1000 h.The data are given by: 0.040, 1.866, 2.385, 3.443, 0.301, 1.876, 2.481, 3.467, 0.309, 1.899, 2.610, 3.478, 0.557, 1.911, 2.625, 3.578, 0.943, 1.912, 2.632, 3.595, 1.070, 1.914, 2.646, 3.699, 1.124, 1.981, 2.661, 3.779, 1.248, 2.010, 2.688, 3.924, 1.281, 2.038, 2.823, 4.035, 1.281, 2.085, 2.890, 4.121, 1.303, 2.089, 2.902, 4.167, 1.432, 2.097, 2.934, 4.240, 1.480, 2.135, 2.962, 4.255, 1.505, 2.154, 2.964, 4.278, 1.506, 2.190, 3.000, 4.305, 1.568, 2.194, 3.103, 4.376, 1.615, 2.223, 3.114, 4.449, 1.619, 2.224, 3.117, 4.485, 1.652, 2.229, 3.166, 4.570, 1.652, 2.300, 3.344, 4.602, 1.757, 2.324, 3.376, 4.663. The third data set is a heavy-tailed real data set from the insurance industry. This data set represents monthly unemployment insurance measures from July 2008 to April 2013 including 58 sightings, and it is reported by the ministry of Labor, Licensing and Regulation (LLR), State of Maryland, USA. The data consist of 21 variables and we analyze in particular the twelfth variable (number 12). The data are available at:
https://catalog.data.gov/dataset/unemploymentinsurance-data-july-2008-to-april-2013 (accessed on the 3 February 2021).
Then, the fits of the HLIL model are compared with those of some competitive models such as inverse Lomax (IL), Topp-Leone inverse Lomax (TIL) by [
31], Weibull inverse Lomax (WIL) by [
11], Kumaraswamy Lomax (KL) by [
32], beta Lomax (BL) by [
33] and exponentiated power Lomax (EPL) by [
34]. The maximum likelihood method is considered, mainly thanks to its attractive theoretical and computational properties. We compute the well-known statistical measures, namely Akaike information criterion (AIC), Bayesian information criterion (BIC), Anderson-Darling (AD) and Cramér-von Mises (CVM). As usual, the model with a lower value for AIC, BIC, CVM and AD should be the preferred model. The results, described below, are obtained by using the R software.
Table 9,
Table 10 and
Table 11 present the MLEs and their SEs (in parentheses) for the first, second and third data sets, respectively.
Table 12,
Table 13 and
Table 14 inform on the values of the AIC, BIC, AD and CVM for the first, second and third data sets, respectively.
Since it has the smallest values for all these criteria, one can consider that the HLIL model is the best. In particular, for all the considered criteria, and the three data sets, we notice that the HLIL model significantly dominates the former IL model. On the basis of these criteria, the superiority of the HLIL model over the competitors is particularly obvious for the fit of the third data set.
Figure 5,
Figure 6 and
Figure 7 illustrate the estimated pdfs with superposition over the related histograms for the first, second and third data sets, respectively.
Visually, it is clear that the fit of the HLIL distribution is more adequate. The high flexibility of the peak and tails of the pdf of the HLIL distribution is clearly the key to success in terms of adaptability, far beyond the modeling ability of the former IL distribution. Moreover, for the third data set, as illustrated by
Figure 7, the adaptive power of the HLIL is particularly dominant.