Abstract
This study introduces the DUS Topp–Leone family of distributions, a novel extension of the Topp–Leone distribution enhanced by the DUS transformer. We derive the cumulative distribution function (CDF) and probability density function (PDF), demonstrating the distribution’s flexibility in modeling various lifetime phenomena. The DUS-TL exponential distribution was studied as a sub-model, with analytical and graphical evidence revealing that it exhibits a unique unimodal shape, along with fat-tail characteristics, making it suitable for time-to-event data analysis. We evaluate parameter estimation methods, revealing that non-Bayesian approaches, particularly Maximum Likelihood and Least Squares, outperform Bayesian techniques in terms of bias and root mean square error. Additionally, the distribution effectively models datasets with varying skewness and kurtosis values, as illustrated by its application to total factor productivity data across African countries and the mortality rate of people who injected drugs. Overall, the DUS Topp–Leone family represents a significant advancement in statistical modeling, offering robust tools for researchers in diverse fields.
1. Introduction
In applied statistics and probability theory, the development of new probability distributions plays an essential role in effectively modeling complex, real-world phenomena. Traditional distributions, while powerful in many contexts, can sometimes fall short in capturing the full range of variability and specific patterns inherent in certain types of data. This is especially true in fields where the observed data exhibit unique skewness, kurtosis, or heavy tails that require a more flexible model to yield accurate insights. One of the key areas where advanced probability distributions are particularly valuable is in the analysis of lifetime events. These events, such as the time until failure of a machine part or the duration until recovery from a disease, often show complex patterns that cannot be fully represented by conventional distributions like the exponential or normal. In this study, we introduce a new family of probability distributions and an extended form of the exponential distribution specifically designed to capture the nuances of lifetime data. This new distribution has potential applications across various domains, offering a fresh approach to model events over time and providing more nuanced interpretations of underlying trends. One significant application of advanced probability distributions is in economics, particularly in assessing total factor productivity (TFP) across countries. TFP is a critical indicator of economic efficiency and is integral to understanding the growth trajectories of nations. By modeling the variability in TFP using a tailored probability distribution, researchers can uncover factors that drive productivity, which are not always visible in standard economic analyses. Such insights can illuminate the role of technological progress, policy changes, and labor productivity, providing policymakers with valuable information to shape strategies for sustainable development. 
For instance, examining the distributional characteristics of TFP may reveal how certain economies are prone to stagnation, while others exhibit robust growth resilience, despite similar inputs. This type of analysis has profound implications for development economists, who strive to understand and enhance the determinants of economic progress on a global scale. In public health, accurately modeling mortality trends is paramount, especially in the context of chronic conditions like HIV/AIDS. The global impact of HIV/AIDS on mortality rates remains substantial, affecting millions of lives and posing a persistent public health challenge. Using advanced probability distributions to model HIV/AIDS mortality data enables public health officials and researchers to gain deeper insights into trends over time, variations across demographic groups, and potential factors that influence mortality. For example, applying the proposed probability distribution can help to identify shifts in mortality patterns due to advancements in antiretroviral therapy (ART) or changes in socio-economic conditions. By accurately capturing these patterns, the model serves as a powerful tool for assessing the effectiveness of public health interventions, thereby providing a robust foundation for policy recommendations and future research in epidemiology.
The introduction of this new family of probability distributions extends beyond economics and public health, with applicability in engineering, environmental science, and social sciences where complex, real-world processes require more adaptable statistical models. In engineering, for instance, modeling failure times of components with this new distribution can lead to improved reliability assessments and maintenance strategies, ultimately enhancing system efficiency and safety. In environmental science, analyzing patterns in climate variables or biodiversity measures using flexible probability models can yield critical insights into ecological stability and risks posed by environmental change. This study contributes a valuable new tool to the field of applied statistics, addressing the need for more adaptable probability models to capture the subtleties of real-world data. The new family of probability distributions holds promise for improving the accuracy and interpretability of statistical models across disciplines, enriching the understanding of complex phenomena in a way that benefits both academic research and practical decision-making. Several techniques have been utilized in the development of new families of distributions to enhance existing distributions. A number of these techniques are presented in Table 1 below.
Table 1.
Some existing families of distributions.
Regarding fitting probability distributions to medical and related data, this study is not in isolation. Several authors have undertaken studies in this direction, leading to a rich literature base covering applications of certain probability distributions to medical and biomedical events. For instance, Al-Noor et al. [] presented the Marshall–Olkin–Marshall–Weibull distribution. This distribution was designed to fit into two datasets for the early detection of breast tumors in Egypt and fatigue fracture of Kevlar 373 (epoxy) subjected to a 90% stress level. This distribution was found to have performed better than the competing models based on the two datasets. Feroze et al. [] introduced the modified Weibull extension and applied it to a medical dataset with a Bayesian perspective under different loss functions and informative priors. The Topp–Leone modified Weibull distribution proposed by Alyami et al. [] was used to model datasets related to medical data and the results were impressive. The weighted inverse Weibull distribution with application introduced by AbaOud and Almuqrin [] exhibited heavy tail characteristics and was used to model medical data. In addition, the bounded exponentiated Weibull distribution studied by Bashir et al. [] considered COVID-19 survival rates in Spain as an illustration. Other studies devoted to applications to medical data include the works of Mazucheli et al. [], Qura et al. [], Tolba et al. [], and El-Morshedy et al. [].
The development of the new family of distributions in this study is motivated by the need for models that are parsimonious in parameters, that apply more widely than the parent distribution, and that offer improved goodness of fit, estimation adaptability and mathematical flexibility. The new family formulated in Section 2 adds no parameters to the Topp–Leone family of distributions; rather, its functional form is altered, with the aim of obtaining a more flexible, adaptable and tractable Topp–Leone family with a better goodness of fit. The organization of the remainder of this study is outlined in Figure 1.
Figure 1.
Outline of the remaining sections of this study.
2. The DUS Topp–Leone Family of Distributions
Rezaei et al. [] introduced the Topp–Leone generated family of distributions, with cumulative distribution function (CDF) and probability density function (PDF) given, respectively, by
and
Another interesting generalized generator of distributions is the Dinesh–Umesh–Sanjay (DUS), which was formulated by Kumar et al. [] with CDF and PDF denoted as follows, respectively:
and
If we substitute Equations (1) and (2) into Equations (3) and (4), we obtain a new generalized family of distributions, named the DUS Topp–Leone family, with CDF and PDF given, respectively, by
and
is a vector of the parameters of the baseline distribution whose CDF and PDF are and , respectively. The linear form of the PDF is obtained by introducing the power and binomial series expansion identities as follows:
where .
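For orientation, the construction described in this section can be written out under one common set of conventions. The block below is a sketch and not a transcription of Equations (1) to (6): it assumes the one-parameter Topp–Leone-G form, and the symbols \alpha (Topp–Leone shape), G and g (baseline CDF and PDF), \xi (baseline parameter vector) and H, h (a generic CDF and PDF fed to the DUS transformation) are names introduced here for illustration.

```latex
% Assumed notation (hypothetical symbol names): G, g = baseline CDF/PDF with
% parameter vector \xi; \alpha > 0 = Topp--Leone shape parameter.

% One-parameter Topp--Leone-G family:
\[ F_{TL}(x) = \bigl[1 - (1 - G(x;\xi))^{2}\bigr]^{\alpha}, \qquad
   f_{TL}(x) = 2\alpha\, g(x;\xi)\,(1 - G(x;\xi))\,\bigl[1 - (1 - G(x;\xi))^{2}\bigr]^{\alpha - 1} \]

% DUS transformation of a CDF H with PDF h:
\[ F_{DUS}(x) = \frac{e^{H(x)} - 1}{e - 1}, \qquad
   f_{DUS}(x) = \frac{h(x)\, e^{H(x)}}{e - 1} \]

% Composition (H = F_{TL}) gives the DUS Topp--Leone family:
\[ F(x) = \frac{\exp\bigl\{[1 - (1 - G(x;\xi))^{2}]^{\alpha}\bigr\} - 1}{e - 1} \]
\[ f(x) = \frac{2\alpha\, g(x;\xi)\,(1 - G(x;\xi))}{e - 1}\,
          \bigl[1 - (1 - G(x;\xi))^{2}\bigr]^{\alpha - 1}
          \exp\bigl\{[1 - (1 - G(x;\xi))^{2}]^{\alpha}\bigr\} \]

% Linear form: expand e^{z} = \sum_{k \ge 0} z^{k}/k!, then apply the
% generalized binomial series (1 - z)^{c} = \sum_{j \ge 0} (-1)^{j}\binom{c}{j} z^{j}, |z| < 1:
\[ f(x) = \frac{2\alpha\, g(x;\xi)}{e - 1} \sum_{k=0}^{\infty}\sum_{j=0}^{\infty}
          \frac{(-1)^{j}}{k!}\binom{\alpha(k+1) - 1}{j}\,(1 - G(x;\xi))^{2j + 1} \]
```

Under these assumptions no parameter is added beyond those of the Topp–Leone-G family, which matches the parsimony argument made in the Introduction.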
3. Baseline Extension: DUS Topp–Leone Exponential (DUS-TLE) Distribution
Suffice it to say that the exponential distribution is the most popular non-negative light-tailed standard distribution, with CDF and PDF given as and . Utilizing the exponential distribution by making appropriate replacements in Equations (5) and (6), we obtain the DUS-TLE distribution with CDF and PDF given as
and
The DUS-TLE distribution has two parameters: the additional shape parameter from the DUS-TL family, which enhances any baseline distribution, and the scale parameter of the exponential distribution. It is noteworthy that embedding the DUS transformer into the TL family only alters the functional form of the TL family and adds no parameter. This suggests, although it does not by itself guarantee, that the resulting distribution is versatile in modeling lifetime events across varied contexts, from economic productivity to health outcomes; the applications in Section 7 examine this empirically.
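As a concrete illustration, the CDF and PDF can be coded directly. This is a minimal sketch that assumes the one-parameter Topp–Leone-G form composed with the DUS transformation and an exponential baseline; the parameter names `alpha` (shape) and `lam` (scale of the exponential) and both function names are hypothetical and introduced here, not taken from Equations (7) and (8).

```python
import math

E_MINUS_1 = math.e - 1.0  # normalising constant of the DUS transformation


def dus_tle_cdf(x, alpha, lam):
    """Assumed DUS-TLE CDF: (exp(T**alpha) - 1) / (e - 1), T = 1 - exp(-2*lam*x)."""
    if x <= 0.0:
        return 0.0
    t = -math.expm1(-2.0 * lam * x)  # T = 1 - exp(-2*lam*x), accurate for small x
    return math.expm1(t ** alpha) / E_MINUS_1


def dus_tle_pdf(x, alpha, lam):
    """Derivative of the assumed CDF with respect to x."""
    if x <= 0.0:
        return 0.0
    t = -math.expm1(-2.0 * lam * x)
    return (2.0 * alpha * lam * math.exp(-2.0 * lam * x)
            * t ** (alpha - 1.0) * math.exp(t ** alpha) / E_MINUS_1)


if __name__ == "__main__":
    # Tabulate both functions at a few points for one illustrative parameter pair.
    for x in (0.1, 0.5, 1.0, 2.0):
        print(x, dus_tle_cdf(x, 2.0, 1.0), dus_tle_pdf(x, 2.0, 1.0))
```

A quick consistency check is to compare a central finite difference of `dus_tle_cdf` with `dus_tle_pdf` at the same point; the two should agree to several decimal places.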
The hazard rate function (HRF), denoted as and derived from the given PDF and CDF, can be written as
Figure 2 shows the PDF of the proposed DUS-TLE distribution, which exhibits a unimodal (reversed-bathtub) shape and a fat tail.
Figure 2.
PDF plots of DUS-TLE .
3.1. Extreme Behavior of the DUS-TLE Distribution
The PDF of the DUS-TLE distribution exhibits interesting behavior as x approaches both 0 and ∞. As x approaches 0, the exponential term in the density tends to 1, and consequently its complement approaches 0. The PDF therefore tends to 0, because the power factor in the density vanishes whenever the shape parameter exceeds 1.
Similarly, as x approaches ∞, the exponential term decays to 0 and its complement tends to 1. In this limit the PDF also tends to 0, because the leading exponential factor dominates. The scale parameter governs the rate at which the PDF decays to zero for large x, with higher values resulting in faster decay.
From these observations, when the shape parameter exceeds 1 the PDF vanishes at both extremes, so it must attain a maximum at some intermediate value of x. For typical parameter values the PDF is "hump"-shaped: it rises from 0, reaches a peak, and then decays back to 0 as x increases. The exact location and height of this peak depend on both parameters. The shape parameter makes the peak sharper or broader, while the scale parameter affects the rate of decay as x increases.
The PDF vanishes at the origin only when the shape parameter is greater than 1; otherwise the power factor does not vanish as x approaches 0, which alters the behavior near the origin. Additionally, a positive scale parameter ensures the exponential decay necessary for the PDF to converge to 0 as x approaches ∞.
Overall, the PDF converges to 0 at both extremes and has a peak at intermediate values of x. Both parameters play a critical role in determining the shape and rate of decay of the PDF, making the distribution potentially useful for modeling phenomena where intensity rises and falls, such as life distributions or time-to-event models.
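The limiting behavior described above can be made explicit under an assumed form of the density. The sketch below uses the one-parameter Topp–Leone-G form with an exponential baseline; the symbols \alpha (shape) and \lambda (scale) are names assumed here rather than the notation of Equation (8).

```latex
% Assumed density (hypothetical symbols \alpha, \lambda > 0):
\[ f(x) = \frac{2\alpha\lambda}{e - 1}\, e^{-2\lambda x}\,
          \bigl(1 - e^{-2\lambda x}\bigr)^{\alpha - 1}
          \exp\bigl\{\bigl(1 - e^{-2\lambda x}\bigr)^{\alpha}\bigr\}, \qquad x > 0 \]

% Near the origin, 1 - e^{-2\lambda x} \sim 2\lambda x, so
\[ f(x) \sim \frac{2\alpha\lambda}{e - 1}\,(2\lambda x)^{\alpha - 1}
   \quad (x \to 0^{+}), \qquad
   \lim_{x \to 0^{+}} f(x) =
   \begin{cases}
     0, & \alpha > 1, \\
     2\lambda/(e - 1), & \alpha = 1, \\
     \infty, & 0 < \alpha < 1.
   \end{cases} \]

% In the right tail, 1 - e^{-2\lambda x} \to 1, so
\[ f(x) \sim \frac{2\alpha\lambda\, e}{e - 1}\, e^{-2\lambda x} \to 0
   \quad (x \to \infty). \]
```

Under these assumptions the condition that the shape parameter exceed 1 is needed only for the density to vanish at the origin; the function remains a valid density for any positive shape parameter.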
Figure 3a–f show that the hazard function takes a variety of shapes: some are right-tailed, while others are increasing, left-tailed, bathtub-shaped or bump-like. This variety is evidence of the versatility of the model in fitting lifetime data of various kinds.
Figure 3.
h(x) of DUS-TLE.
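Hazard curves such as those in Figure 3 can be explored numerically as the ratio of the density to the survival function. The sketch below assumes the one-parameter Topp–Leone-G form with an exponential baseline; `alpha`, `lam` and all function names are hypothetical names introduced for illustration.

```python
import math

E_MINUS_1 = math.e - 1.0


def _t(x, lam):
    """T = 1 - exp(-2*lam*x), computed stably for small x."""
    return -math.expm1(-2.0 * lam * x)


def dus_tle_pdf(x, alpha, lam):
    """Assumed DUS-TLE density for x > 0."""
    t = _t(x, lam)
    return (2.0 * alpha * lam * math.exp(-2.0 * lam * x)
            * t ** (alpha - 1.0) * math.exp(t ** alpha) / E_MINUS_1)


def dus_tle_survival(x, alpha, lam):
    """Survival function 1 - F(x) = (e - exp(T**alpha)) / (e - 1)."""
    return (math.e - math.exp(_t(x, lam) ** alpha)) / E_MINUS_1


def dus_tle_hazard(x, alpha, lam):
    """Hazard rate h(x) = f(x) / (1 - F(x)) for x > 0.

    For very large x the survival function underflows to zero and the
    division fails; this sketch does not guard against that case.
    """
    return dus_tle_pdf(x, alpha, lam) / dus_tle_survival(x, alpha, lam)


if __name__ == "__main__":
    # Trace the hazard on a small grid for two illustrative shape values.
    for alpha in (0.5, 2.0):
        print(alpha, [round(dus_tle_hazard(x, alpha, 1.0), 4)
                      for x in (0.1, 0.5, 1.0, 2.0, 4.0)])
```

Plotting `dus_tle_hazard` over a grid for several parameter pairs is one way to reproduce the qualitative shapes discussed for Figure 3.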
3.2. Mixture Representation
The PDF of the DUS-TLE distribution in Equation (8) can be reduced further using the power series expansion and the binomial series expansion for . Therefore, we rewrite the PDF as
4. Characteristics
In this section, some properties of the DUS-TLE distribution are discussed, including its moments and related measures, moment-generating function, quantile function, entropy and order statistics.
4.1. Moment
The s-th moment of a random variable which assumes the DUS-TLE distribution can be expressed as
The first moment, which is the mean, is obtained by setting s = 1 in Equation (11); hence
Similarly, the second, third and fourth moments are, respectively,
and
Figure 4a,b show the mean and variance in 3D graphics. A close observation reveals that for increasing values of and , the mean becomes large. The peak is at and . Values of cause the mean to decrease. Similarly, the variance returns to zero for values around 6.
Figure 4.
(a) Mean of DUS-TLE. (b) Variance of DUS-TLE.
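The series expressions for the moments can be cross-checked numerically. The sketch below integrates powers of the quantile function over (0, 1) with a midpoint rule, which is a substitute technique and not the derivation used in this section. It assumes the one-parameter Topp–Leone-G form with an exponential baseline; `alpha`, `lam` and the function names are hypothetical.

```python
import math


def dus_tle_quantile(u, alpha, lam):
    """Inverse of the assumed CDF for 0 < u < 1."""
    w = math.log1p(u * (math.e - 1.0))              # equals T**alpha, in (0, 1)
    t = min(w ** (1.0 / alpha), 1.0 - 2.0 ** -53)   # keep T strictly below 1
    return -math.log1p(-t) / (2.0 * lam)


def raw_moment(s, alpha, lam, n_steps=100_000):
    """Approximate E[X**s] as the integral of Q(u)**s over (0, 1), midpoint rule."""
    h = 1.0 / n_steps
    return h * sum(dus_tle_quantile((i + 0.5) * h, alpha, lam) ** s
                   for i in range(n_steps))


def dus_tle_summary(alpha, lam, n_steps=100_000):
    """Return (mean, variance, skewness, kurtosis) from the first four raw moments."""
    h = 1.0 / n_steps
    qs = [dus_tle_quantile((i + 0.5) * h, alpha, lam) for i in range(n_steps)]
    m1, m2, m3, m4 = (h * sum(q ** s for q in qs) for s in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4) / var ** 2
    return m1, var, skew, kurt


if __name__ == "__main__":
    print(dus_tle_summary(2.0, 1.0))
```

The quantile function grows logarithmically as u approaches 1, so the midpoint rule is only approximate for the higher moments; increasing `n_steps` tightens the result. Evaluating `dus_tle_summary` on a grid of parameter pairs gives surfaces comparable in spirit to Figures 4 and 5.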
4.2. Quantile Function
The quantile function enables simulation by inverse transform sampling: a random variate is first drawn from the uniform distribution on (0, 1) and then passed through the inverse of the CDF of the target distribution. Inverting the CDF of the DUS-TLE distribution in Equation (7), the quantile function is given as
Evaluating the quantile function at 0.5 gives the median lifetime of the distribution.
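Inverse transform sampling with the quantile function can be sketched as follows. The closed form used here is obtained by inverting an assumed CDF (one-parameter Topp–Leone-G form with exponential baseline); `alpha`, `lam` and the function names are hypothetical and not taken from the equations of this section.

```python
import math
import random


def dus_tle_cdf(x, alpha, lam):
    """Assumed DUS-TLE CDF, used here only to check the inversion."""
    if x <= 0.0:
        return 0.0
    t = -math.expm1(-2.0 * lam * x)
    return math.expm1(t ** alpha) / (math.e - 1.0)


def dus_tle_quantile(u, alpha, lam):
    """Solve F(x) = u for x, with 0 < u < 1."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    w = math.log1p(u * (math.e - 1.0))              # equals T**alpha
    t = min(w ** (1.0 / alpha), 1.0 - 2.0 ** -53)   # guard against T rounding to 1
    return -math.log1p(-t) / (2.0 * lam)


def dus_tle_sample(n, alpha, lam, seed=None):
    """Draw n variates by passing uniform draws through the quantile function."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:          # random() can return 0.0, which is not invertible
            u = rng.random()
        sample.append(dus_tle_quantile(u, alpha, lam))
    return sample


if __name__ == "__main__":
    print("median:", dus_tle_quantile(0.5, 2.0, 1.5))
    print(dus_tle_sample(5, 2.0, 1.5, seed=1))
```

Applying `dus_tle_cdf` to the output of `dus_tle_quantile` should return the input probability up to rounding, which is a convenient check on the inversion.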
Figure 5a,b show the skewness and kurtosis of the DUS-TLE distribution. The skewness plot suggests that the model is more often negatively skewed than positively skewed. However, some values of the measure are positive, so the distribution is not restricted to negative skewness. There is therefore visual evidence that the distribution can model both negatively and positively skewed data. From Figure 5b, as increases and reduces in value, the kurtosis reduces. Conversely, as reduces and increases, the kurtosis increases.
Figure 5.
(a) Skewness of DUS-TLE. (b) Kurtosis of DUS-TLE.
Figure 6.
Parametric and non-parametric plots for Data-I.
4.3. Moment-Generating Function
The moment-generating function (MGF) is a mathematical tool used to characterize the distribution of a random variable X. It is defined as the expected value of the exponential function of tX, i.e., the expectation of e^{tX}. The MGF encapsulates all the moments of the distribution (e.g., mean, variance) and is useful in deriving properties of the random variable. For a DUS-TLE random variable, the MGF is given as
where and are parameters of the distribution.
4.4. Entropy
The Rényi entropy, introduced by Rényi [], is one of the popular measures of information loss. It is mathematically given as
where and .
4.5. Order Statistic
Given a random sample of size n from a population with CDF and PDF , the k-th order statistic, , represents the k-th smallest value in this sample. The PDF of the k-th order statistic from a sample of size n is generally given by
By substituting and from Equations (7) and (8), the PDF of the k-th order statistic, , for the DUS-TLE distribution is obtained as follows:
Using the following general binomial and power series expansion identities,
we can further decompose the expression in Equation (12) to become
5. Point Estimation
In this section, estimation of the parameters of the DUS-TLE distribution is investigated using several non-Bayesian and Bayesian approaches. Several methods are considered so that their performance can be compared across sample sizes.
5.1. Maximum Likelihood Estimation
Suppose we have a randomly selected group of n observations, represented as , each drawn independently from the DUS-TLE distribution with unknown parameters and . The likelihood and log-likelihood functions for the parameters and take the following forms:
The log-likelihood is obtained as
Take the first partial derivative of Equation (13) with respect to :
Take the first partial derivative of Equation (13) with respect to :
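Because the score equations have no closed-form solution, the log-likelihood is maximized numerically. The sketch below is one way to do this with the standard library only: it uses a simple compass (pattern) search on the log-parameters, which is a substitute technique and not the optimizer used in this study. The density is the assumed one-parameter Topp–Leone-G form with exponential baseline, and `alpha`, `lam` and all function names are hypothetical.

```python
import math
import random


def dus_tle_loglik(alpha, lam, data):
    """Log-likelihood of positive observations under the assumed DUS-TLE density."""
    if alpha <= 0.0 or lam <= 0.0:
        return -math.inf
    total = len(data) * (math.log(2.0 * alpha * lam) - math.log(math.e - 1.0))
    for x in data:
        t = -math.expm1(-2.0 * lam * x)              # T = 1 - exp(-2*lam*x)
        total += -2.0 * lam * x + (alpha - 1.0) * math.log(t) + t ** alpha
    return total


def simulate_dus_tle(n, alpha, lam, seed=0):
    """Inverse-transform sampler, used only to create demonstration data."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:
            u = rng.random()
        w = math.log1p(u * (math.e - 1.0))
        t = min(w ** (1.0 / alpha), 1.0 - 2.0 ** -53)
        sample.append(-math.log1p(-t) / (2.0 * lam))
    return sample


def fit_mle(data, start=(1.0, 1.0), step=0.5, tol=1e-6):
    """Maximise the log-likelihood by compass search over (log alpha, log lam)."""
    theta = [math.log(start[0]), math.log(start[1])]

    def objective(th):
        return dus_tle_loglik(math.exp(th[0]), math.exp(th[1]), data)

    best = objective(theta)
    while step > tol:
        improved = False
        for i in (0, 1):
            for delta in (step, -step):
                cand = theta[:]
                cand[i] += delta
                value = objective(cand)
                if value > best:                     # accept any improving move
                    theta, best, improved = cand, value, True
        if not improved:
            step *= 0.5                              # shrink the pattern
    return math.exp(theta[0]), math.exp(theta[1]), best


if __name__ == "__main__":
    demo = simulate_dus_tle(400, 2.0, 1.5, seed=1)
    print(fit_mle(demo))
```

Working on the log scale keeps both parameters positive without explicit constraints. A gradient-based optimizer would normally converge faster; the compass search is used here only because it needs no derivatives and no external libraries.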
5.2. Least Squares Estimation (LSE)
The method of Least Squares Estimation was introduced by Swain et al. [] for estimating the parameters of the Beta distribution. It minimizes the sum of the squared differences between the theoretical CDF evaluated at the ordered observations and the corresponding empirical values. We apply the same principle to the DUS-TLE distribution.
The parameters and are estimated by the Least Squares method, yielding the estimates and , which are found by minimizing the function with respect to both and .
The estimates are derived by solving the following set of non-linear equations:
where
and
The results in Equations (19) and (20) are derived by taking the first partial derivatives of the CDF of the DUS-TLE distribution, presented in Equation (7), with respect to and , respectively.
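The least squares criterion can be sketched in code as follows. The plotting positions i/(n + 1) are the usual choice in the least squares method of Swain et al. and are assumed here; the DUS-TLE CDF is likewise an assumed form (one-parameter Topp–Leone-G with exponential baseline), and `alpha`, `lam` and the function names are hypothetical.

```python
import math


def dus_tle_cdf(x, alpha, lam):
    """Assumed DUS-TLE CDF."""
    if x <= 0.0:
        return 0.0
    t = -math.expm1(-2.0 * lam * x)
    return math.expm1(t ** alpha) / (math.e - 1.0)


def ls_objective(data, cdf):
    """Sum of squared gaps between cdf(x_(i)) and the plotting position i/(n+1)."""
    xs = sorted(data)
    n = len(xs)
    return sum((cdf(x) - i / (n + 1)) ** 2 for i, x in enumerate(xs, start=1))


if __name__ == "__main__":
    demo = [0.21, 0.35, 0.48, 0.62, 0.90]   # made-up values for illustration
    for alpha, lam in ((2.0, 1.5), (1.0, 1.0)):
        print(alpha, lam,
              ls_objective(demo, lambda x: dus_tle_cdf(x, alpha, lam)))
```

Passing the CDF as a callable keeps the criterion separate from the distribution, so the same function can be reused with any candidate parameter pair inside a numerical minimizer.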
5.3. Weighted Least Squares Estimation (WLSE)
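A commonly used weighted least squares criterion weights each squared deviation by the reciprocal of the variance of the corresponding uniform order statistic, (n + 1)^2 (n + 2) / (i (n - i + 1)). The sketch below assumes this standard weighting; the DUS-TLE CDF is an assumed form, and `alpha`, `lam` and the function names are hypothetical.

```python
import math


def dus_tle_cdf(x, alpha, lam):
    """Assumed DUS-TLE CDF."""
    if x <= 0.0:
        return 0.0
    t = -math.expm1(-2.0 * lam * x)
    return math.expm1(t ** alpha) / (math.e - 1.0)


def wls_objective(data, cdf):
    """Weighted sum of squared gaps between cdf(x_(i)) and i/(n+1)."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        weight = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))  # 1 / Var of i-th uniform order statistic
        total += weight * (cdf(x) - i / (n + 1)) ** 2
    return total


if __name__ == "__main__":
    demo = [0.21, 0.35, 0.48, 0.62, 0.90]   # made-up values for illustration
    print(wls_objective(demo, lambda x: dus_tle_cdf(x, 2.0, 1.5)))
```

The weights are largest for the smallest and largest observations, where the empirical CDF is least variable, so deviations in the tails are penalized more heavily than in the unweighted criterion.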
5.4. Maximum Product of Spacing Estimation (MPS)
Cheng and Amin [] first introduced this method as an alternative to Maximum Likelihood estimation. The maximum product spacing technique offers a different approach, serving as an approximation to the Kullback–Leibler information criterion, rather than relying on the Maximum Likelihood method. Assuming the data are arranged in ascending order, we can proceed with this approach.
where = - ; . In a similar way, it is possible to opt for maximizing the function.
By differentiating the function with respect to and , and solving the resulting system of non-linear equations, and , we can determine the parameter estimates.
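The spacing-based criterion can be sketched as the mean log spacing between consecutive CDF values of the ordered data, with the conventions that the CDF equals 0 before the first observation and 1 after the last. The DUS-TLE CDF below is an assumed form, and `alpha`, `lam` and the function names are hypothetical.

```python
import math


def dus_tle_cdf(x, alpha, lam):
    """Assumed DUS-TLE CDF."""
    if x <= 0.0:
        return 0.0
    t = -math.expm1(-2.0 * lam * x)
    return math.expm1(t ** alpha) / (math.e - 1.0)


def mps_objective(data, cdf):
    """Mean of log spacings D_i = cdf(x_(i)) - cdf(x_(i-1)), i = 1, ..., n+1."""
    xs = sorted(data)
    n = len(xs)
    probs = [0.0] + [cdf(x) for x in xs] + [1.0]
    spacings = [probs[i] - probs[i - 1] for i in range(1, n + 2)]
    if min(spacings) <= 0.0:
        return -math.inf          # tied observations give a zero spacing
    return sum(math.log(d) for d in spacings) / (n + 1)


if __name__ == "__main__":
    demo = [0.21, 0.35, 0.48, 0.62, 0.90]   # made-up values for illustration
    for alpha, lam in ((2.0, 1.5), (1.0, 1.0)):
        print(alpha, lam,
              mps_objective(demo, lambda x: dus_tle_cdf(x, alpha, lam)))
```

The criterion is maximized, not minimized, and its largest possible value, log(1/(n + 1)), is attained when all spacings are equal. Tied observations need special handling in practice; this sketch simply returns negative infinity.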
5.5. Cramér–von Mises Estimation (CVME)
The parameters and of the DUS-TLE distribution, denoted as the Cramér–von Mises estimates and , are derived by minimizing the function with respect to and .
The estimates are derived by solving the following set of non-linear equations:
where and are defined as outlined in Equations (19) and (20), respectively.
5.6. Anderson–Darling Estimation (ADE)
The Anderson–Darling estimators and for the parameters and of the DUS-TLE distribution are derived by minimizing the function with respect to both and .
The estimates are derived by solving the following set of non-linear equations:
where and are defined as outlined in Equations (19) and (20), respectively.
5.7. Right-Tailed Anderson–Darling Estimation (RTADE)
The estimates and for the parameters and of the DUS-TLE distribution, based on the Right-Tailed Anderson–Darling method, are determined by minimizing the function with respect to both and .
The estimates are derived by solving the following set of non-linear equations:
where and are defined as outlined in Equations (19) and (20), respectively. The estimates provided in Equations (14), (15), (17), (18), (22), (23), (25), (27), (28), (30), (31), (33) and (34) were obtained numerically using the optim() function in R, a general-purpose optimizer whose default method is Nelder–Mead and which also offers quasi-Newton options such as BFGS.
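The three distance criteria of Sections 5.5 to 5.7 can be written compactly in code. The sketch below uses the standard textbook forms of the Cramér–von Mises, Anderson–Darling and right-tailed Anderson–Darling criteria, which are assumed to match the functions minimized here; the CDF is passed in as a callable, and all function names are hypothetical.

```python
import math


def _sorted_cdf(data, cdf):
    """CDF values at the ordered observations."""
    return [cdf(x) for x in sorted(data)]


def cvm_objective(data, cdf):
    """Cramér–von Mises: 1/(12n) + sum (F_i - (2i-1)/(2n))**2."""
    f = _sorted_cdf(data, cdf)
    n = len(f)
    return 1.0 / (12.0 * n) + sum((f[i - 1] - (2 * i - 1) / (2.0 * n)) ** 2
                                  for i in range(1, n + 1))


def ad_objective(data, cdf):
    """Anderson–Darling: -n - (1/n) sum (2i-1) [log F_i + log(1 - F_(n+1-i))]."""
    f = _sorted_cdf(data, cdf)
    n = len(f)
    total = sum((2 * i - 1) * (math.log(f[i - 1]) + math.log1p(-f[n - i]))
                for i in range(1, n + 1))
    return -n - total / n


def rtad_objective(data, cdf):
    """Right-tailed AD: n/2 - 2 sum F_i - (1/n) sum (2i-1) log(1 - F_(n+1-i))."""
    f = _sorted_cdf(data, cdf)
    n = len(f)
    tail = sum((2 * i - 1) * math.log1p(-f[n - i]) for i in range(1, n + 1))
    return n / 2.0 - 2.0 * sum(f) - tail / n


if __name__ == "__main__":
    demo = [0.12, 0.31, 0.55, 0.78, 0.93]   # made-up values on (0, 1)
    uniform = lambda x: x                   # stand-in CDF for illustration
    print(cvm_objective(demo, uniform),
          ad_objective(demo, uniform),
          rtad_objective(demo, uniform))
```

All three functions are minimized over the parameters of the candidate CDF. The logarithms require CDF values strictly between 0 and 1, so observations for which the fitted CDF rounds to 0 or 1 would need clipping in practice.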
5.8. Bayesian Estimation
This section focuses on determining the Bayesian estimates (BEs) of the unknown parameters of the DUS-TLE distribution. In Bayesian parameter estimation, various loss functions can be utilized, such as the squared error, LINEX, and general entropy loss functions. To estimate the parameters, we assume independent gamma priors for and , with probability density functions (PDFs) given by
The hyper-parameters and , for , are chosen based on prior knowledge regarding the unknown parameters. The joint prior distribution for is defined as follows:
The posterior probability distribution, conditioned on the observed data , can be expressed as follows:
This suggests that the posterior density function is
For any function and under the squared error loss (SEL) criterion, the Bayes estimator is expressed as follows:
The SEL penalizes underestimation and overestimation equally because it is a symmetric loss function. In numerous real-world scenarios, however, underestimation and overestimation have unequal consequences. In such cases, the LINEX loss function can be used as an alternative to the SEL, as described by
Here, serves as a shape parameter. When , it indicates that overestimations are considered more significant than underestimations, while suggests the opposite. As approaches zero, the loss function converges to the standard squared error (SE) loss. For more information on this, one can consult Varian [] and Doostparast et al. []. The Bayes estimator (BE) of under this loss is derived as follows:
Furthermore, we incorporate the general entropy loss (GEL) function, as proposed by Calabria and Pulcini []. This function is characterized by the following definition.
The shape parameter indicates a deviation from symmetry. When , overestimations are considered more critical than underestimations, while for , the reverse holds true. In this context, the Bayes estimator with respect to the general entropy loss (GEL) function is
The estimates produced by Equations (38), (39), and (40) cannot be simplified into closed-form solutions. Consequently, to generate posterior samples and obtain the corresponding Bayes estimates, the Markov chain Monte Carlo (MCMC) technique is utilized (see Brooks [] & Van Ravenzwaaij et al. [] for more details on the MCMC technique). In MCMC, an initial subset of random samples of size M, drawn from the posterior distribution, may be discarded as “burn-in.” The remaining samples are then employed to compute the Bayes estimates. Using MCMC within the framework of the SEL, LINEX, and GEL loss functions, the Bayes estimates (BEs) of are calculated as follows:
and
where denotes the quantity of burn-in samples.
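Given a chain of posterior draws for one parameter, the three Bayes estimates reduce to simple averages over the retained draws. The sketch below uses the standard forms of the estimators under the SEL, LINEX and GEL loss functions; the names `c` (LINEX shape), `q` (GEL shape), `burn_in` and the function name are hypothetical, and the draws in the demonstration are made-up numbers, not MCMC output.

```python
import math


def bayes_estimates(draws, burn_in, c=1.0, q=1.0):
    """Return (SEL, LINEX, GEL) Bayes estimates from positive posterior draws.

    SEL:   posterior mean.
    LINEX: -(1/c) * log(mean(exp(-c * theta))).
    GEL:   (mean(theta ** -q)) ** (-1/q).
    """
    if c == 0.0 or q == 0.0:
        raise ValueError("c and q must be non-zero")
    kept = draws[burn_in:]                      # discard the burn-in segment
    if not kept:
        raise ValueError("no draws left after burn-in")
    m = len(kept)
    sel = sum(kept) / m
    linex = -math.log(sum(math.exp(-c * v) for v in kept) / m) / c
    gel = (sum(v ** (-q) for v in kept) / m) ** (-1.0 / q)
    return sel, linex, gel


if __name__ == "__main__":
    chain = [5.0, 4.2, 2.1, 1.9, 2.3, 2.0, 1.8, 2.2]   # made-up draws
    print(bayes_estimates(chain, burn_in=2, c=0.5, q=0.5))
```

With positive `c` or `q` the LINEX and GEL estimates lie below the posterior mean, which reflects the heavier penalty on overestimation; negative values push the estimates above the mean.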
6. Simulation
In this simulation study, several non-Bayesian methods were compared: Maximum Likelihood (ML), Maximum Product of Spacings (MPS), Least Squares (LS), Weighted Least Squares (WLS), Cramér–von Mises (CVM), Anderson–Darling (AD), and Right-Tailed Anderson–Darling (RTAD). For each of these methods, parameter estimates were computed for 10,000 bootstrap samples at each of the four sample sizes. For Bayesian estimation, the Markov chain Monte Carlo (MCMC) technique was employed to generate posterior samples, because closed-form solutions cannot be derived from Equations (37), (39), and (40). A bootstrap approach was used in conjunction with MCMC: 10,000 bootstrap samples were generated for each sample size and, for each bootstrap sample, the MCMC algorithm was applied to estimate the parameters, with an initial portion of the draws discarded as “burn-in” to allow the Markov chain to converge. Using the remaining draws, Bayes estimates were computed under the squared error loss (SEL), LINEX loss, and general entropy loss (GEL) functions. To evaluate the performance of both Bayesian and non-Bayesian estimation methods, the average bias and root mean squared error (RMSE) were calculated for each sample size across the 10,000 bootstrap iterations. These metrics show how the Bayesian estimators obtained via MCMC compare with the non-Bayesian methods in terms of bias and RMSE as the sample size increases.
By comparing the average bias and RMSE obtained across different sample sizes, the simulation study offers insight into the efficiency of the non-Bayesian and Bayesian estimation methods in various sample scenarios.
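The two performance metrics can be computed from a list of replicate estimates as sketched below. Bias is taken here as the signed mean deviation from the true value, which is an assumption about the definition used in the tables; the function name and the numbers in the demonstration are hypothetical.

```python
import math


def bias_and_rmse(estimates, true_value):
    """Return (average bias, RMSE) of replicate estimates of one parameter."""
    m = len(estimates)
    bias = sum(e - true_value for e in estimates) / m
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / m)
    return bias, rmse


if __name__ == "__main__":
    replicates = [1.92, 2.10, 2.05, 1.88, 2.15]   # made-up estimates of a true value 2.0
    print(bias_and_rmse(replicates, 2.0))
```

In a simulation such as the one described here, `estimates` would hold one estimate per replicate for a given method and sample size, and the function would be called once per parameter.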
Tables 2–5 present the simulation results, which compare the performance of the non-Bayesian and Bayesian estimation methods in estimating the parameters of the DUS-TLE distribution across various sample sizes. For the non-Bayesian methods, namely Maximum Likelihood (ML), Maximum Product Spacings (MPS), Least Squares (LS), Weighted Least Squares (WLS), Cramér–von Mises (CVM), Anderson–Darling (AD), and Right-Tailed Anderson–Darling (RTAD), the bias and RMSE values decrease as the sample size increases. This suggests that these methods provide more accurate estimates as the sample size grows, with the ML method consistently yielding lower bias and RMSE values, indicating its robustness across different sample sizes.
Table 2.
Average estimated bias and RMSE of non-Bayesian and Bayesian estimation methods, with DUS-TLE parameters at , and .
Table 3.
Average estimated bias and RMSE of non-Bayesian and Bayesian estimation methods, with DUS-TLE parameters at , and .
Table 4.
Average estimated bias and RMSE of non-Bayesian and Bayesian estimation methods, with DUS-TLE parameters at , and .
Table 5.
Average estimated bias and RMSE of non-Bayesian and Bayesian estimation methods, with DUS-TLE parameters at , and .
On the other hand, the Bayesian estimation (BE) methods, such as BE with squared error loss (SEL), BE with Linex loss (Linex1 and Linex2), and BE with Generalized Entropy loss (GEL1 and GEL2), exhibit a different pattern. These methods generally show higher bias and RMSE compared to the non-Bayesian approaches, particularly at smaller sample sizes. Even though the Bayesian methods perform slightly better as the sample size increases, they do not match the efficiency of the non-Bayesian methods in terms of bias and RMSE reduction. Notably, the method demonstrates the largest bias and RMSE across all sample sizes, suggesting that it may not be as reliable for the parameter estimation of DUS-TLE.
For both non-Bayesian and Bayesian methods, the estimation of tends to have lower bias and RMSE compared to , implying that the parameter is easier to estimate across all methods and sample sizes. Among all the methods, ML and LS perform well with lower bias and RMSE, especially as the sample size grows, while the Bayesian methods, despite improvements with larger sample sizes, remain less competitive overall. The general trend observed is that as the sample size increases, both bias and RMSE decrease for all methods. This highlights the importance of larger sample sizes in obtaining more reliable and accurate parameter estimates. However, even with larger sample sizes, non-Bayesian methods like ML and WLS continue to outperform the Bayesian approaches, particularly in terms of lower bias and RMSE for both and .
Overall, the results indicate that non-Bayesian methods, particularly ML, consistently outperform Bayesian methods in terms of bias and RMSE when estimating the parameters and , with performance improving as the sample size increases.
7. Applications
The first dataset is the total factor productivity for thirty-seven African countries, extracted from https://documents.worldbank.org/en/publication/documents-reports/documentdetail/646931468157519398/total-factor-productivity-across-the-developing-world, accessed on 22 October 2024. The data are reported in Table 6. In Table 7, summary statistics providing preliminary insights into the data are presented. The skewness measure is 0.2, suggesting that the data are mildly skewed to the right, and the theoretical visual evidence in Figure 5a indicates that the DUS-TLE distribution can model positively skewed data. The kurtosis measure is 3.539668, which indicates leptokurtic data, and Figure 5b shows that our distribution can model peaked data. In Table 8, the p-value of the DUS-TLE distribution is 0.9331, so the goodness-of-fit test gives no evidence against the fitted model. Compared with the p-values of common standard distributions such as gamma, log-normal, and Lomax, the proposed DUS-TLE fits these data best and hence is a candidate for modeling related economic data. In Figure 6, we present the empirical density on a histogram of Data-I, the CDF plot, and the P-P and TTT plots. These again provide supporting visual evidence that our model fits the TFP data well. The parameter estimates of the Lomax distribution are unusual: they are too large to be plausible for the TFP data. The DUS-TLE model also has the lowest values of the model performance indicators, namely the log-likelihood (LL), Akaike Information Criterion (AIC), Consistent AIC (CAIC), Bayesian Information Criterion (BIC), Hannan–Quinn Information Criterion (HQIC), Cramér–von Mises (W), and Anderson–Darling (A) statistics.
Table 6.
Total factor productivity (TFP) for thirty-seven African countries from 2001 to 2010 (Data-I).
Table 7.
Summary statistics for Data-I.
Table 8.
MLEs, model performance, and goodness-of-fit measures using Data-I.
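Several of the performance indicators reported in Table 8 follow directly from the maximized log-likelihood. The sketch below computes the AIC, BIC and HQIC in their standard forms, together with the Kolmogorov–Smirnov distance for a fitted CDF; CAIC is omitted because more than one definition is in use. The function names and the numbers in the demonstration are hypothetical.

```python
import math


def information_criteria(loglik, k, n):
    """Return (AIC, BIC, HQIC) for a model with k parameters fitted to n observations."""
    aic = 2.0 * k - 2.0 * loglik
    bic = k * math.log(n) - 2.0 * loglik
    hqic = 2.0 * k * math.log(math.log(n)) - 2.0 * loglik
    return aic, bic, hqic


def ks_statistic(data, cdf):
    """Kolmogorov–Smirnov distance between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(max(i / n - cdf(x), cdf(x) - (i - 1) / n)
               for i, x in enumerate(xs, start=1))


if __name__ == "__main__":
    # Made-up log-likelihood for a two-parameter model fitted to 37 observations.
    print(information_criteria(-12.3, k=2, n=37))
    print(ks_statistic([0.2, 0.4, 0.7], lambda x: x))   # uniform CDF as a stand-in
```

Smaller values of each criterion and of the KS distance indicate a better fit, so models are ranked by sorting on these quantities.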
The second application uses the overall mortality rates among people who injected drugs, studied by Mathers et al. [] and presented in Table 9. The summary statistics in Table 10 reveal that the skewness measure is 0.02, indicating nearly symmetric data with a very slight right skew, which is consistent with the theoretical evidence in Figure 5a that the proposed distribution can fit such data. The kurtosis measure here is 3.689323, which implies leptokurtic data. In Table 11, the goodness-of-fit measures for the distribution and its competitors are tabulated. The Kolmogorov–Smirnov (KS) p-value for the DUS-TLE distribution is the highest, indicating a better fit to Data-II compared to the competing models. We further show some parametric and non-parametric plots in Figure 7.
Table 9.
Overall mortality rates among people who injected drugs (Data-II).
Table 10.
Summary statistics for Data-II.
Table 11.
MLEs, model performance, and goodness-of-fit measures using Data-II.
Figure 7.
Parametric and non-parametric plots for Data-II.
8. Final Remarks
In this study, we introduced the DUS Topp–Leone family of distributions, whose cumulative distribution function (CDF) and probability density function (PDF) are given in Equations (5) and (6). The family extends the traditional Topp–Leone framework by applying the DUS transformer, yielding a flexible tool for modeling a range of lifetime behaviors. The DUS-TLE distribution, with CDF and PDF given in Equations (7) and (8), is a sub-model that illustrates this flexibility in contexts ranging from health outcomes to economic productivity.
We examined the behavior of the DUS-TLE distribution analytically and graphically. It has a unimodal, reversed-bathtub shape with fat-tail characteristics, and the hazard rate function (HRF) in Equation (9) supports its use for time-to-event data. The plots also show that the distribution is bounded at both extremes, which makes it useful where the intensity of a phenomenon rises and then falls.
We estimated the parameters with both non-Bayesian and Bayesian approaches. The non-Bayesian methods, particularly Maximum Likelihood and Least Squares, consistently outperformed the Bayesian techniques in terms of bias and root mean square error (RMSE), especially as the sample size increased. This comparison indicates how reliably the DUS-TLE parameters can be recovered in practical applications.
The distribution also fits datasets with differing skewness and kurtosis, as shown by its application to total factor productivity data across African countries and to the mortality rate of people who injected drugs. For these data, the high p-value of the DUS-TLE distribution indicates a better fit than the competing traditional distributions, supporting its use as a candidate model in applied statistical analysis. Overall, the DUS Topp–Leone family represents a significant advancement in statistical modeling, enabling researchers to better understand and interpret complex data patterns in various fields.
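As an illustration of how the bias and RMSE criteria mentioned above are typically computed in a Monte Carlo comparison of estimators, the following Python sketch may be helpful. It is not the code used in this study: the plain exponential distribution and the maximum likelihood estimator of its rate stand in for the DUS-TLE model, and the function names `bias_and_rmse` and `simulate_exponential_mle`, the true rate, the sample sizes, and the number of replications are all assumptions chosen for illustration.

```python
import numpy as np


def bias_and_rmse(estimates, true_value):
    """Return (bias, RMSE) of repeated estimates around the true parameter."""
    errors = np.asarray(estimates, dtype=float) - true_value
    bias = errors.mean()                  # average signed error
    rmse = np.sqrt(np.mean(errors ** 2))  # root of the average squared error
    return bias, rmse


def simulate_exponential_mle(rate, n, replications, seed=0):
    """Draw `replications` samples of size `n` and return the rate MLE of each.

    Stand-in model (assumption): exponential with the given rate, whose
    maximum likelihood estimator is 1 / sample mean.
    """
    rng = np.random.default_rng(seed)
    samples = rng.exponential(scale=1.0 / rate, size=(replications, n))
    return 1.0 / samples.mean(axis=1)


if __name__ == "__main__":
    true_rate = 1.5  # hypothetical true parameter value
    for n in (25, 100, 400):
        estimates = simulate_exponential_mle(true_rate, n, replications=2000)
        bias, rmse = bias_and_rmse(estimates, true_rate)
        print(f"n={n:4d}  bias={bias:+.4f}  RMSE={rmse:.4f}")
```

To compare several estimation methods in this way, each method would be applied to the same simulated samples, and the resulting bias and RMSE values tabulated by method and sample size.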
Author Contributions
Conceptualization: O.J.O.; methodology: D.-F.N.E. and O.J.O.; software: O.J.O.; validation: D.-F.N.E., K.E.A., M.K., O.S.B. and O.J.O.; formal analysis: O.J.O.; investigation: D.-F.N.E., K.E.A., M.K., O.S.B. and O.J.O.; resources: D.-F.N.E., K.E.A., M.K., O.S.B. and O.J.O.; data curation: D.-F.N.E., M.K., O.S.B. and O.J.O.; writing—original draft preparation: D.-F.N.E., K.E.A. and O.J.O.; writing—review and editing: D.-F.N.E., K.E.A., M.K., O.S.B. and O.J.O.; visualization: O.J.O.; supervision: D.-F.N.E., K.E.A., and O.J.O.; project administration: D.-F.N.E., K.E.A., M.K., O.S.B. and O.J.O.; funding: M.K. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Researchers Supporting Project, number RSP2024R392, King Saud University, Riyadh, Saudi Arabia.
Data Availability Statement
This study utilized publicly available data, which are included in this paper.
Conflicts of Interest
The authors declare that there are no competing interests.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).