1. Introduction
In recent years, there has been dramatic growth in the number of generalizations of well-known probability distributions. The most notable generalizations are achieved by (i) introducing power parameters into well-established parent distributions, (ii) extending classical distributions by modifying their functions, (iii) introducing special functions such as W[K(x)] as generators, and (iv) compounding distributions. This surge of generalized families is due to their flexibility in modelling phenomena arising in contemporary scientific fields, including demography, actuarial science, survival analysis, biology, ecology, communication theory, epidemiology and environmental science. However, a clear understanding of the applicability of these models in most applied areas is necessary if one is to gain insights into systems that can be modeled as random processes. The models thus obtained yield improved empirical fits to real data that are collected adaptively.
Although many functions can act as generators to produce flexible classes of distributions, in this paper we emphasize generalizations in which a ratio involving the survival function (sf) is used in some form, commonly known as the odds ratio. In the reference [
1], a proportional odds family,
viz. the Marshall-Olkin-G (MO-G) family, was generalized via the sf
, where
is the cumulative distribution function (cdf) of the parent distribution, with the introduction of a tilt parameter. Gleaton and Lynch, in the reference [
2], used the odds function as a generator when they defined the odd log-logistic family (OLL-G). In the reference [
3], the odd Weibull family was defined as an asymptotically equivalent log-logistic model for larger values of
, the scale parameter. The reference [
4] used the Transformed-Transformer (T-X) family, due to the reference [
5], to define the odd Weibull-G family of distributions. Since then, a myriad of distributions have been generalized using the odds function. Some of the important families include [
6–31], among others.
Turning to the origins and motivations of our proposed scheme, the authors in [
32] proposed the odd generalized exponential family (which we refer to as OGE-G) as a better alternative to the generalized exponential (GE) family using Lehmann Alternative-I (LA-I). The cdf of the two-parameter OGE-G family is given below:
In the reference [
33], an odd family of the GE distribution was proposed, the so-called generalized odd generalized exponential family (which we refer to as OGE1-G). The cdf of the OGE1-G family is given by:
Because of its capacity to model the various hazard rate function (hrf) shapes of all traditional types in lifetime data analysis, we believe OGE-G offers a sensible combination of simplicity and flexibility. However, the relevance of OGE1-G to lifetime modelling in domains such as reliability, actuarial science, informatics, telecommunications, and computational social sciences (to name just a few) is still debatable. According to the reference [
34], the Lehmann Alternative-II (LA2) approach has received less attention. This motivated us to use the LA2 approach to develop the exponentiated odd generalized exponential (OGE2-G) family, in the same vein as OGE-G and OGE1-G. Adhering to the framework defined in the reference [
5], if
T is a GE random variable (rv), then the cdf of the OGE2-G family is given by:
where
and
are shape parameters and
is the vector of baseline parameters.
The following points emphasise the model’s distinctiveness: (i) To the best of our knowledge, the proposed model in its current form has not been studied in the literature, (ii) From an analytical standpoint, the OGE2-G family has a significantly better configuration and practicality than OGE-G and OGE1-G for inverted models, with minimal chance of encountering non-identifiability issues, (iii) The OGE2-G family has several interesting connections to other families. When
approaches 0,
tends to GE with
, when
tends to OGE-G, if
and
then
tends to the odd exponential (OE) family, (iv) This new dimension allowed us to explore models which are naturally constituted by LA2. The generalizations thus attained produce skewed distributions with much heavier tails, which makes them practical in risk evaluation theory, (v) The successful application of the OGE2-G family motivates future research, as it outperforms nine well-established existing models, (vi) We present a physical explanation for
X when
and
are integers. Consider a parallel system consisting of
independent and identically distributed components. Suppose that the lifetime of a rv
Y with a specific
with
components in a series system such that the risk of failing at time
x is represented by the odd function as
. Consider that the randomness of this risk is represented by the rv
X, then we can assume the following relation holds
explicitly given in Equation (
1). The OGE2-G family is proposed and explored in this research, emphasising its diversity and scope for application to real-life phenomena. The major features of the OGE2-G family, including the pdf, hrf, qf, and ten unique models from the OGE2-G family presented in
Table 1, are provided in the first half. Then, certain mathematical properties of the OGE2-G family, such as the series expansion of the exponentiated pdf, moments, parameter estimation, order statistics, Rényi entropy, stress-strength analysis and stochastic dominance results, are investigated. Furthermore, the Fréchet distribution is specified as the baseline model, yielding the OGE2-Fréchet distribution (denoted OGE2Fr), and the maximum likelihood (ML) technique is then used to construct statistical applications of this special model. We choose to study OGE2Fr specifically because its nested models include the inverse Rayleigh (IR) and inverse exponential (IE) cases, which supports its suitability over these sub-models as well. It is applied to fit two sets of premium data from the actuarial field. Using key performance indicators, we show that OGE2Fr outperforms nine competing models. A portion devoted to specific risk measures, with an emphasis on the value at risk (VaR) and the expected shortfall (ES), is presented. Finally, the estimation of risk measures for the examined data sets is discussed, with the proposed methodology yielding satisfactory results. Equation (
1) can be useful in modelling real life survival data with different shapes of hrf.
Table 1 lists
and the corresponding parameters for some special distributions which are considered to be the potential sub-models of OGE2-G family.
The paper is organized as follows. In
Section 2, we introduce the new family, its basic properties and ten potential baseline models which can become members of the OGE2-G family.
Section 3 presents the mathematical properties of the OGE2-G family.
Section 4 takes the Fréchet (Fr) distribution as the baseline model to propose OGE2Fr and derives its statistical and inferential properties.
Section 5 presents two applications to actuarial data sets with emphasis on risk evaluation (premium returns), in which the proposed model’s validity is established. Furthermore, the model is applied to compute some actuarial measures.
Section 6 is the final section, with some concluding remarks and useful insights.
4. OGE2-Fréchet Distribution
In this section, we study the first special model defined in
Section 2.2, the OGE2-Fréchet (OGE2Fr), in view of its practical application.
The OGE2Fr model can be defined from (
1) by taking
and
, as cdf and pdf of the baseline Fréchet distribution with
, respectively. The cdf and pdf of OGE2Fr distribution are, respectively, given by
and
where
and
b are shape parameters while
a is a scale parameter.
The hrf and qf of the OGE2Fr distribution are obtained as
Figure 1 plots the pdf and hrf of the OGE2Fr distribution, exhibiting the range of shapes that these functions can take for arbitrarily chosen parameter values. The pdf of the OGE2Fr distribution can be gradually decreasing, unimodal, and right-skewed, with varied curvature, tail, and asymmetry, as shown in
Figure 1. The hrf, on the other hand, offers an extensive range of increasing, decreasing, unimodal, and increasing-decreasing-increasing (IDI) forms. Given this wide variety of hrf shapes, the OGE2Fr distribution can be a useful tool for modelling unpredictable time-to-event phenomena.
4.1. Linear Representation and Related Properties
An expansion of the cdf of the OGE2Fr distribution is straightforward to obtain by using the result defined in Equation (
6) as
where
ℓ is the power parameter and noting that
is unity.
For simplicity, we can rewrite the above result as
By differentiating the last term, we can express the density of OGE2Fr model as follows
where
represents the Fréchet density function with power parameter
ℓ. Equation (
22) shows that the OGE2Fr density is a linear combination of Fréchet densities. Thus, we can derive various mathematical properties from those of the Fréchet distribution.
Moments are central to any statistical analysis. They can be used to evaluate the most essential characteristics of a distribution, such as its mean, variance, skewness, and kurtosis. We now present the mathematical expressions for the moments of the OGE2Fr model. Let
be a random variable with density
. Then, core properties of
X can follow from those of
. First, the
pth ordinary moment of
X can be written as
Second, the cumulants (
) of
X can be determined recursively from (
23) as
, respectively, where
. The skewness
and kurtosis
of
X can be calculated from the third and fourth standardized cumulants. Plots of mean, variance, skewness and kurtosis of the OGE2Fr distribution are displayed in
Figure 2. These plots show the significant role of the parameters
and
in modeling the behaviors of
X.
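As an illustration of how cumulants, skewness and kurtosis follow from ordinary moments, the sketch below uses the standard recursion kappa_p = mu'_p - sum_{k=1}^{p-1} C(p-1, k-1) kappa_k mu'_{p-k}. Because the mixture weights of the OGE2Fr linear representation are not reproduced here, the raw moments of the baseline Fréchet distribution, mu'_p = a^p Gamma(1 - p/b) for b > p, are used as a stand-in. The function names and parameter values are illustrative assumptions, not the authors' implementation.

```python
import math

def cumulants_from_moments(raw):
    """Cumulants from ordinary moments via the standard recursion.

    raw[p-1] holds the p-th ordinary moment; returns [kappa_1, ..., kappa_P].
    """
    kappa = []
    for p in range(1, len(raw) + 1):
        correction = sum(
            math.comb(p - 1, k - 1) * kappa[k - 1] * raw[p - k - 1]
            for k in range(1, p)
        )
        kappa.append(raw[p - 1] - correction)
    return kappa

def skewness_kurtosis(raw):
    """Third and fourth standardized cumulants (skewness, excess kurtosis)."""
    _, k2, k3, k4 = cumulants_from_moments(raw[:4])
    return k3 / k2 ** 1.5, k4 / k2 ** 2

def frechet_raw_moment(p, a, b):
    """p-th ordinary moment of a Frechet(a, b) baseline (assumed form
    G(x) = exp(-(a/x)^b)); finite only when b > p."""
    if b <= p:
        raise ValueError("the p-th moment requires b > p")
    return a ** p * math.gamma(1.0 - p / b)

# Illustrative parameter values (hypothetical, not taken from the paper).
moments = [frechet_raw_moment(p, a=1.0, b=5.0) for p in range(1, 5)]
skew, excess_kurt = skewness_kurtosis(moments)
```

For the raw moments 1, 2, 6, 24 of the unit exponential distribution, the recursion returns the cumulants 1, 1, 2, 6, which is a convenient sanity check before plugging in the OGE2Fr moments.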
Third, the
pth incomplete moment of
X, denoted by
, is easily found by a change of variables, using the lower incomplete gamma function
when calculating the corresponding moment of
. Then, we obtain
Fourth, the first incomplete moment
is used to construct the Bonferroni and Lorenz curves as discussed in
Section 3.2.
Figure 3 provides the income inequality curves (Bonferroni & Lorenz) of the proposed distribution which can easily be derived from (
24), respectively, where
is the qf of
X derived from Equation (
20).
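For readers who wish to reproduce inequality curves from data, the sketch below computes the empirical counterparts of the Lorenz curve L(p) and the Bonferroni curve B(p) = L(p)/p from a sample. It relies only on the usual definitions (cumulative share of the total held by the smallest 100p% of observations) and not on the closed-form OGE2Fr expressions, which are not reproduced here. The function name is an illustrative assumption.

```python
def lorenz_bonferroni(sample):
    """Empirical Lorenz and Bonferroni curves of a non-negative sample.

    Returns three lists evaluated at p_k = k/n, k = 1, ..., n:
    the grid p, the Lorenz ordinates L(p_k) and the Bonferroni
    ordinates B(p_k) = L(p_k) / p_k.
    """
    xs = sorted(sample)
    if not xs or xs[0] < 0:
        raise ValueError("sample must be non-empty and non-negative")
    total = sum(xs)
    if total <= 0:
        raise ValueError("sample total must be positive")
    n = len(xs)
    grid, lorenz, bonferroni = [], [], []
    running = 0.0
    for k, x in enumerate(xs, start=1):
        running += x              # partial sum of the k smallest values
        p_k = k / n
        l_k = running / total     # share of the total held by them
        grid.append(p_k)
        lorenz.append(l_k)
        bonferroni.append(l_k / p_k)
    return grid, lorenz, bonferroni

p, L, B = lorenz_bonferroni([4.0, 1.0, 3.0, 2.0])
```

Applying the function to observations simulated from a fitted model gives a quick numerical check on curves obtained from the first incomplete moment.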
4.2. Parameter Estimation
Let
be a sample of size
n from the OGE2Fr distribution given in Equation (
18). The log-likelihood function
for the vector of parameters
is
The components of the score vector U(
) are
4.3. Order Statistics
Consider a random sample
taken from the OGE2Fr distribution with
as the
ith order statistic. For
, the pdf corresponding to
can be expressed as
where
and
are the pdf and cdf of OGE2Fr distribution, respectively. Inserting Equations (
17) and (
18), and using the result defined in
Section 3, we have
where
and
is the probability density function of the OGE2Fr distribution with parameters and b.
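The general order-statistic density used above, f_{i:n}(x) = n! / ((i-1)! (n-i)!) f(x) F(x)^{i-1} [1 - F(x)]^{n-i}, can be checked numerically. The sketch below is a minimal illustration that, as an assumption, plugs in the baseline Fréchet cdf G(x) = exp(-(a/x)^b) and its pdf in place of the OGE2Fr functions, whose expressions are not reproduced here. All function names and parameter values are hypothetical.

```python
import math

def frechet_cdf(x, a, b):
    """Baseline Frechet cdf, assumed form G(x) = exp(-(a/x)^b) for x > 0."""
    return math.exp(-((a / x) ** b)) if x > 0 else 0.0

def frechet_pdf(x, a, b):
    """Derivative of frechet_cdf with respect to x."""
    if x <= 0:
        return 0.0
    z = (a / x) ** b
    return (b / x) * z * math.exp(-z)

def order_stat_pdf(x, i, n, pdf, cdf):
    """Density of the i-th order statistic in a sample of size n."""
    const = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    big_f = cdf(x)
    return const * pdf(x) * big_f ** (i - 1) * (1.0 - big_f) ** (n - i)

def density_2_of_5(x):
    """Second order statistic out of five, with illustrative a = 1, b = 3."""
    return order_stat_pdf(
        x, 2, 5,
        lambda t: frechet_pdf(t, 1.0, 3.0),
        lambda t: frechet_cdf(t, 1.0, 3.0),
    )

# Midpoint-rule check that the density integrates to roughly one on (0, 50).
steps, upper = 20000, 50.0
h = upper / steps
total_mass = sum(density_2_of_5((j + 0.5) * h) for j in range(steps)) * h
```

Passing the OGE2Fr pdf and cdf as the `pdf` and `cdf` arguments would give the corresponding OGE2Fr order-statistic density without changing `order_stat_pdf`.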
4.4. Stochastic Ordering
In several areas of probability and statistics, stochastic orderings and inequalities are being used at an accelerating rate. Examples include comparing investment returns with random cash flows, comparing the life distributions of gadgets with the same function that two manufacturers make using distinct technologies, and comparing the strength of dependence structures. Here, we use the term stochastic ordering to refer to any ordering relation on a space of probability measures in a wide sense. Let
X and
Y be two rvs from OGE2Fr distributions, with assumptions previously mentioned in
Section 3. Given that
, and for
,
shall be decreasing in
x if and only if the following result holds
4.5. Simulation Study
By using the result defined in Equation (
25), we evaluate the performance of the MLEs of the OGE2Fr distribution parameters using the Monte Carlo simulation technique. The simulation study is conducted for sample sizes
and parameter combinations, denoted by
, are:
: , , and ;
: , , and ;
: , , and .
We use Equation (
20) to generate the random observations. For each
, the empirical bias and MSE values are the average of the values from
simulated samples for given sample size
n. The formulas used to evaluate the mean squared error (MSE) and the average bias (Bias) of each parameter are given below
We report the average estimates (AEs), Biases and MSEs for the parameters
,
,
a and
b in
Table 2. The MSE of the estimators increases when the assumed model deviates from the true model, as anticipated. When the sample size grows larger and the symmetry degrades, the MSE shrinks. Generally speaking, the MSE decreases when the kurtosis grows. Similarly, when the asymmetry rises, the bias grows, and vice versa. As the kurtosis grows, the bias becomes smaller. In conclusion, it is apparent that the MSEs and Biases decrease when the sample size
n increases. Thus, we can say that the MLEs perform satisfactorily well in estimating the parameters of the OGE2Fr distribution.
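The simulation design described above (inverse-transform sampling from the qf, repeated estimation, then averaging the errors) can be sketched as follows. This is a deliberately simplified illustration under stated assumptions: it samples from the baseline Fréchet distribution with quantile function Q(u) = a (-ln u)^(-1/b) rather than from OGE2Fr, treats the shape b as known so that the MLE of the scale a has the closed form (n / sum x_i^(-b))^(1/b), and uses Bias = mean(estimate - true value) and MSE = mean((estimate - true value)^2) over the replications. Function names, the seed and the parameter values are hypothetical.

```python
import math
import random

def frechet_sample(n, a, b, rng):
    """Inverse-transform sampling: x = a * (-ln u)^(-1/b), u ~ U(0, 1)."""
    out = []
    while len(out) < n:
        u = rng.random()
        if 0.0 < u < 1.0:          # skip the endpoints to keep log() finite
            out.append(a * (-math.log(u)) ** (-1.0 / b))
    return out

def scale_mle_known_shape(xs, b):
    """MLE of the Frechet scale a when the shape b is known (simplification)."""
    return (len(xs) / sum(x ** (-b) for x in xs)) ** (1.0 / b)

def bias_and_mse(n, a, b, n_rep, seed=0):
    """Average bias and mean squared error of the scale MLE over n_rep samples."""
    rng = random.Random(seed)
    estimates = [
        scale_mle_known_shape(frechet_sample(n, a, b, rng), b)
        for _ in range(n_rep)
    ]
    bias = sum(est - a for est in estimates) / n_rep
    mse = sum((est - a) ** 2 for est in estimates) / n_rep
    return bias, mse

# Illustrative settings (not the combinations used in the paper).
for size in (20, 50, 200):
    bias, mse = bias_and_mse(size, a=1.0, b=2.0, n_rep=300, seed=1)
    print(f"n={size:4d}  bias={bias:+.4f}  mse={mse:.5f}")
```

For the full four-parameter OGE2Fr model the closed-form estimator would be replaced by a numerical maximization of the log-likelihood, and the sampler by the OGE2Fr quantile function; the bias and MSE bookkeeping stays the same.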
5. Application of OGE2Fr to Premium Data
Most skewed distributions are suitable for evaluating the risk measures associated with actuarial data. The risks involve credit, portfolio, capital, premium losses, and stock prices, among others. We focus our attention on the risks based on premiums. Premiums are the payments for insurance that customers make to the company with which they are insured. In this section, we apply the OGE2Fr lifetime model to the statistical analysis of two real-life data sets, both of which include premium losses. Our aim is to compare the fit of the OGE2Fr model with those of other well-known generalizations of the Fréchet (Fr) model given in
Table 3.
The first premium data set, designated as PD1, is derived from complaints upheld against vehicle insurance firms as a proportion of their overall business over a two-year period. The study was conducted by DFR (Darla Fry Ross) insurance and investment company (2009–2016), registered in New York state. The most common complaints are over delays in the settlement of no-fault claims and non-renewal of insurance. At the top of the list are insurers with the fewest upheld complaints per million USD of premiums. The companies with the greatest complaint ratios are at the bottom of the list. The data under study are from the year 2016. The second premium data set, denoted by PD2, represents the net premiums written (in billions of USD) to insurers which, under Article 41 of the New York Insurance Law, are required to meet minimum financial security requirements.
Table 4: Descriptive statistics of PD1 and PD2.
The OGE2Fr model is validated through the discrimination criteria (DCs) that we considered for each data set. These include the negative log-likelihood (
) of the model evaluated at the corresponding MLEs, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the Anderson-Darling (AD), Cramér–von Mises (CvM), and Kolmogorov-Smirnov (KS) statistics, as well as the
p-value (P-KS) of the related KS test. We use the method of maximum likelihood estimation to estimate the unknown parameters as presented in
Section 4.2. For each criterion (except
p-value (P-KS), for which the highest value is preferred), the smallest value is attained by the OGE2Fr model, indicating the best fit among the competing models.
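To make the comparison criteria concrete, the following sketch computes the AIC, BIC and KS statistic from a fitted model, using the standard definitions AIC = 2k + 2(-loglik) and BIC = k ln(n) + 2(-loglik), where k is the number of estimated parameters and n the sample size, and the KS statistic as the largest gap between the empirical and fitted cdfs. The function names and the numbers in the usage lines are hypothetical and are not taken from Table 6.

```python
import math

def aic(neg_loglik, k):
    """Akaike Information Criterion from the negative log-likelihood."""
    return 2.0 * k + 2.0 * neg_loglik

def bic(neg_loglik, k, n):
    """Bayesian Information Criterion from the negative log-likelihood."""
    return k * math.log(n) + 2.0 * neg_loglik

def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the empirical cdf and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for j, x in enumerate(xs, start=1):
        fitted = cdf(x)
        # The empirical cdf jumps from (j-1)/n to j/n at x; check both sides.
        d = max(d, j / n - fitted, fitted - (j - 1) / n)
    return d

# Hypothetical values: a 4-parameter model with -loglik = 100 on n = 50 points.
example_aic = aic(100.0, 4)
example_bic = bic(100.0, 4, 50)
example_ks = ks_statistic([0.25, 0.5, 0.75], lambda x: min(max(x, 0.0), 1.0))
```

Smaller values of all three quantities favour a model, which is the sense in which the criteria above are compared across the competing distributions.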
Some descriptive statistics related to these data are given in
Table 4. The skewness and kurtosis are indicative of exponentially tailed data (reversed-J shape). The TTT plots for both data sets are given in
Figure 4. In particular, the TTT plots show a largely decreasing hrf, which supports fitting the OGE2Fr model to these data sets. The estimated hrf in
Figure 5 matches
Figure 4. In
Table 5, we present the estimates (MLEs) along with their respective standard errors (SEs), while the DCs are listed in
Table 6 for PD1 & PD2, respectively. For a more visual view, the estimated pdf, cdf, sf and Q-Q plots of the OGE2Fr model for two data sets are displayed in
Figure 6 and
Figure 7. Furthermore, the PP-plots of OGE2Fr and its three other competitive 4-parameter models for PD1 and PD2 are displayed in
Figure 8 and
Figure 9. The log-likelihood function profiles for PD1 and PD2, respectively, are provided in
Figure 10 and
Figure 11 to highlight the universality of the MLEs of
vector. The graphical visualizations are indicative of nice fits for the OGE2Fr model.
5.1. Actuarial Measures
One of the most important duties of actuarial science organizations is to assess market risk in a portfolio of instruments, which originates from changes in underlying factors such as equity prices, interest rates, or currency rates. We compute several important risk measures for the suggested distribution in this section, such as Value at Risk (VaR) and Expected Shortfall (ES), which are important in portfolio optimization under uncertainty.
5.1.1. Value at Risk
The quantile premium principle of the distribution of aggregate losses, commonly known as Value at Risk (VaR), is the most widely used measure to evaluate exposure to risk in finance. The VaR of a rv is the
pth quantile of its cdf. If
OGE2Fr denotes a random variable with cdf (
17), then its VaR is
5.1.2. Expected Shortfall
Artzner et al. [
47,
48] recommended the use of conditional VaR instead of VaR, famously called Expected Shortfall (ES). The ES is a metric that quantifies the average loss in situations where the VaR level is exceeded. It is defined by the following expression
The ES of OGE2Fr is given by
Figure 12 illustrates VaR and ES for some random parameter combinations of OGE2Fr.
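A simulation-based sketch of the two risk measures may help fix ideas. Following the verbal description above, VaR at level p is taken as the p-th quantile of the loss distribution, and ES as the average of the losses that exceed that VaR; this upper-tail convention is an assumption made for the illustration, and the paper's own ES expression is not reproduced here. Losses are simulated from the baseline Fréchet distribution (assumed form Q(p) = a (-ln p)^(-1/b), with b > 1 so that the mean is finite) as a stand-in for OGE2Fr. All names and parameter values are hypothetical.

```python
import math
import random

def frechet_quantile(p, a, b):
    """Quantile function of the baseline Frechet distribution, 0 < p < 1."""
    return a * (-math.log(p)) ** (-1.0 / b)

def simulate_losses(n, a, b, seed=0):
    """Inverse-transform sample of n losses from the Frechet baseline."""
    rng = random.Random(seed)
    losses = []
    while len(losses) < n:
        u = rng.random()
        if 0.0 < u < 1.0:          # skip the endpoints to keep log() finite
            losses.append(frechet_quantile(u, a, b))
    return losses

def empirical_var_es(losses, p):
    """Empirical VaR (p-th sample quantile) and ES (mean loss beyond VaR)."""
    xs = sorted(losses)
    idx = max(0, min(len(xs) - 1, math.ceil(p * len(xs)) - 1))
    var_p = xs[idx]
    tail = [x for x in xs if x > var_p]
    es_p = sum(tail) / len(tail) if tail else var_p
    return var_p, es_p

# Illustrative run with hypothetical parameters a = 1, b = 3 at level 0.95.
sample = simulate_losses(50000, a=1.0, b=3.0, seed=7)
var_95, es_95 = empirical_var_es(sample, 0.95)
exact_var_95 = frechet_quantile(0.95, a=1.0, b=3.0)
```

Since the ES averages only losses beyond the VaR, it is always at least as large as the VaR at the same level, and the gap between the two widens as the tail becomes heavier.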
5.1.3. Numerical Calculation of VaR and ES
The results of OGE2Fr presented in
Section 4 allowed us to further explore its application to these risk measures. From
Table 5, we take the values of the MLEs for PD1 and PD2, respectively, to measure the volatility associated with these measures. Higher values of these risk measures signify heavier tails, while lower values indicate much lighter tail behavior of the model. It is worth mentioning that the OGE2Fr model produced substantially higher values of these measures than its counterparts, indicating that the model has a heavier tail. In
Table 7, we show the numerical results of VaRs and ESs of PD1 and PD2, respectively, of the proposed model. For the convenience of the reader,
Figure 13 shows the results graphically.