Abstract
The Weibull is a popular distribution for modeling data with a monotone failure rate. In this work, we introduce the four-parameter Weibull extended Weibull distribution, which offers greater flexibility and can model data with bathtub-shaped and unimodal failure rates. Some of its mathematical properties, such as the quantile function, linear representation and moments, are provided. Maximum likelihood estimation is adopted to estimate its parameters, and the log-Weibull extended Weibull regression model is presented. In addition, some simulations are carried out to assess the consistency of the estimators. We illustrate the greater flexibility and performance of this distribution and the regression model through applications to influenza and hepatitis data. The new models perform much better than some of their competitors.
1. Introduction
The Weibull is a traditional distribution for positive real data. However, it does not accommodate data with unimodal or bathtub-shaped hazard functions. Several modifications of the Weibull have been proposed to model non-monotone hazard rates, including the extended Weibull (EW) model []. There are also many references on extensions that seek hazard functions that are unimodal or bathtub shaped (see [,,], which provide a survey of the modified Weibull distributions). Most recently, refs. [,] defined the Maxwell-Weibull and the alpha power Kumaraswamy Weibull, respectively.
Two papers on the EW distribution [,] have been particularly influential, since they pioneered the development of distributions for bathtub-shaped hazard rates. Since the publication of these papers, many distributions, and in particular other generalizations of the two-parameter Weibull distribution, have been proposed, each allowing for non-monotone and bathtub-shaped hazard rates. It has been shown in the literature that the EW distribution provides significantly better fits than traditional models based on the exponential, gamma, Weibull and lognormal distributions. This is the main reason for choosing it as the baseline model in this article.
The Weibull-G (W-G) class [] is still little explored when compared to other competitors. Some recently proposed distributions within this class are: Weibull–Dagum [], Weibull–Kumaraswamy [], Weibull Birnbaum–Saunders [], Weibull inverse Lomax [] and Weibull–Power Lomax []. Recently, ref. [] addressed the Weibull–Beta Prime distribution.
Influenza data have been analyzed from a non-parametric point of view [], with logistic regression [], and with functional data analysis []. On the other hand, spatial regression [], machine learning models [], Markov chains [], and epidemiological models involving the fractal–fractional Caputo category [] have been used in studies with hepatitis data. Our applications to real data aim to show the flexibility of the new distribution, which adds one more parameter to the EW distribution, and of the new log-Weibull extended Weibull (LWEW) regression model. As an example of the application of the distribution, we use the times (in days) from the date of hospitalization until cure of influenza patients. To apply the LWEW regression model, we use a data set from the literature on a study of hepatitis patients, in which the variable of interest is the time until death from hepatitis. The response “time until the occurrence of an event of interest” is the variable of interest in survival analysis studies, and one of the main characteristics of this type of study is censoring, i.e., the partial observation of the response. Furthermore, with a regression structure, we can analyze how characteristics of the individuals in the sample may influence the response variable.
The three-parameter EW probability density function (pdf) of the random variable X is
where and are the shapes, and is the scale. The support of the EW distribution is , and its rth ordinary moment becomes
where and are the gamma and beta functions, respectively.
For lifetime models, it is of interest to know the rth incomplete moment of X, say , which has the form
where is the hypergeometric function defined by
and is the incomplete gamma function.
We define the Weibull extended Weibull (WEW) distribution in Section 2. The quantile function (qf) and linear representation are reported in Section 3. Estimation by the maximum likelihood method is discussed in Section 4. A simulation and a misspecification study are presented in Section 5. We define the log-Weibull extended Weibull (LWEW) regression in Section 6 and perform a simulation study for this model. Applications to influenza and hepatitis data are reported in Section 7. Some conclusions are summarized in Section 8.
2. The WEW Distribution
Consider the W-G class of distributions [] with scale and shape . By taking the pdf (1) for the baseline in this class, the cumulative distribution function (cdf) and pdf of the WEW distribution become (for )
respectively.
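As a hedged illustration of this construction, the sketch below builds the cdf, pdf and quantile function of a W-G model from an arbitrary baseline, assuming the usual Weibull-G form F(x) = 1 − exp{−a [G(x)/(1 − G(x))]^b}. The argument names `a` and `b`, the Weibull baseline (the WW special case) and all numerical values are illustrative assumptions; the EW baseline of this paper is not reproduced here.

```python
import math

def wg_cdf(x, a, b, G):
    """W-G cdf: 1 - exp(-a * odds^b), with odds = G(x) / (1 - G(x))."""
    Gx = G(x)
    odds = Gx / (1.0 - Gx)
    return 1.0 - math.exp(-a * odds ** b)

def wg_pdf(x, a, b, G, g):
    """W-G pdf, obtained by differentiating wg_cdf with respect to x."""
    Gx = G(x)
    odds = Gx / (1.0 - Gx)
    return (a * b * g(x) * Gx ** (b - 1) / (1.0 - Gx) ** (b + 1)
            * math.exp(-a * odds ** b))

def wg_quantile(u, a, b, Q):
    """Invert the cdf: odds = (-log(1 - u) / a)^(1/b), then apply the baseline qf Q."""
    t = (-math.log(1.0 - u) / a) ** (1.0 / b)
    return Q(t / (1.0 + t))

# Illustrative Weibull baseline (hypothetical shape and scale values).
shape, scale = 1.5, 2.0
G = lambda x: 1.0 - math.exp(-(x / scale) ** shape)
g = lambda x: (shape / scale) * (x / scale) ** (shape - 1) * math.exp(-(x / scale) ** shape)
Q = lambda p: scale * (-math.log(1.0 - p)) ** (1.0 / shape)

x_med = wg_quantile(0.5, a=0.8, b=1.2, Q=Q)
print(x_med, wg_cdf(x_med, 0.8, 1.2, G))  # the second value should be close to 0.5
```

Replacing `G`, `g` and `Q` with the EW cdf, pdf and qf would give the corresponding WEW functions under the same assumption about the generator.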
Henceforth, we change the notation and let have pdf (5). The WEW distribution has some special cases: the EW when , W-Weibull (WW) when , W-exponential (WE) when and .
Figure 1 and Figure 2 report the densities and hazard rate functions (hrfs) for fixed parameters, respectively. The WEW hrf can be inverted-bathtub shaped, bathtub shaped, monotonically increasing, or monotonically decreasing.
Figure 1. The pdf of X.

Figure 2. The hrf of X.
3. Properties
3.1. Quantile Function
Plots of the Bowley skewness (B) [] and Moors kurtosis (M) [] of X (both based on quantiles) are displayed in Figure 3 and Figure 4, respectively.
Figure 3. Skewness of X.
Figure 4. Kurtosis of X.
In Figure 3a, the skewness B decreases (for fixed ) when b grows. In Figure 3b, B increases to when increases, but for larger values of fixed, it tends to become constant. In Figure 3c, B decreases (for any ) when grows.
In Figure 4a, the kurtosis M decreases for if b grows. For high values of (fixed), M drops drastically when b grows, and after that, this curvature will be reversed, and then, M increases when b grows. In Figure 4b, as the parameter increases, M is increasing for , and , and it tends to become constant for . In Figure 4c, M decreases for any when grows.
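The two quantile-based measures discussed above can be computed from any quantile function. The following minimal sketch uses the standard definitions of the Bowley skewness and Moors kurtosis; the unit exponential quantile function is only a stand-in, since the WEW qf is not reproduced here.

```python
import math

def bowley_skewness(Q):
    """B = [Q(3/4) + Q(1/4) - 2 Q(1/2)] / [Q(3/4) - Q(1/4)]."""
    return (Q(0.75) + Q(0.25) - 2.0 * Q(0.5)) / (Q(0.75) - Q(0.25))

def moors_kurtosis(Q):
    """M = [Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)] / [Q(6/8) - Q(2/8)]."""
    return (Q(0.875) - Q(0.625) + Q(0.375) - Q(0.125)) / (Q(0.75) - Q(0.25))

# Stand-in quantile function (an assumption): unit exponential, Q(u) = -log(1 - u).
exp_q = lambda u: -math.log(1.0 - u)

B = bowley_skewness(exp_q)
M = moors_kurtosis(exp_q)
print(B, M)
```

Both measures depend only on a few quantiles, so they exist even when the moments of the distribution do not, and they are less sensitive to outliers than moment-based skewness and kurtosis.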
3.2. Linear Representation
In conclusion, this representation is important since complete and incomplete moments, generating function, mean deviations, and reliability of X can be determined from those of the EW distribution.
3.3. Moments
We can study some important characteristics of the distribution through moments.
We can obtain the mean deviations and Lorenz and Bonferroni curves from the first incomplete moment.
4. Estimation
Equation (10) for gives the log-likelihood for the WW distribution. The maximum likelihood estimates (MLEs) can be found by maximizing (10) with the AdequacyModel library [] of the R software. Other options are the maxLik function of the maxLik library, which provides a convenient interface for computing the MLEs [], and the optim function, which lets the user select an optimization method (for example, BFGS, CG, or SANN) and can also return the Hessian matrix. We can also maximize (10) numerically using SAS (PROC NLMIXED) or the Ox program (sub-routine MaxBFGS), among others. The score components in (for ) are reported in Appendix A.
5. Simulation Study
5.1. Simulations for the WEW Distribution
A Monte Carlo simulation study was conducted, using the BFGS algorithm in R software, to examine the accuracy of the MLEs of the parameters of X. Here, 1000 replications (for , 100 and 300) were generated from Equation (6), where . The scenarios under study were: , , and (Setup 1—decreasing density curve; see Figure 1a); , , and (Setup 2—unimodal density curve; see Figure 1d); and , , and (Setup 3—platykurtic unimodal density curve; see Figure 1c).
The average estimates, biases and mean squared errors (MSEs) are reported in Table 1. The biases and MSEs decrease when n grows, which agrees with the consistency of the estimators.
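The quantities summarized in such a Monte Carlo table can be computed as in the sketch below. To keep the example self-contained, it uses a toy model with a closed-form MLE (the exponential rate) instead of the WEW distribution; the sample sizes, true rate and seed are illustrative assumptions.

```python
import random
import statistics

def monte_carlo(n, true_rate=2.0, reps=1000, seed=123):
    """Return (average estimate, bias, MSE) of the exponential-rate MLE."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        estimates.append(1.0 / statistics.fmean(sample))  # closed-form MLE
    ae = statistics.fmean(estimates)
    bias = ae - true_rate
    mse = statistics.fmean([(e - true_rate) ** 2 for e in estimates])
    return ae, bias, mse

# Hypothetical sample sizes; each replication draws a fresh sample of size n.
results = {n: monte_carlo(n) for n in (20, 100, 300)}
for n, (ae, bias, mse) in results.items():
    print(f"n={n:4d}  AE={ae:.4f}  bias={bias:.4f}  MSE={mse:.4f}")
```

For the WEW distribution, the closed-form estimate would be replaced by a numerical maximization of the log-likelihood in each replication, with samples drawn through the quantile function.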
Table 1. Monte Carlo results from the WEW distribution.
5.2. Misspecification Study
We investigated the behavior of the MLEs of the parameters of the WEW distribution under misspecification by carrying out Monte Carlo simulations based on 1000 replications (for n = 100). The observations were simulated by taking , , and . We used the maxLik library with the SANN method for each generated data set. In Table 2, the observed values are generated from the Gamma Extended Weibull (GEW) distribution [] by taking , , , and . In Table 3, the observed values are generated from the EW distribution by setting , , and . Further, in Table 4, the observed values are generated from the WW distribution with , , and .
Table 2. Simulation results for the GEW distribution when , , , and .
Table 3. Simulation results for the EW distribution when , , and .
Table 4. Simulation results for the WW distribution when , , and .
In addition to the average estimates (AEs), the relative biases (RB), and MSEs, we present the mean measures of global deviance (GD), say , where is the maximized log-likelihood function (10), AIC and BIC. They indicate that there are small sample biases in the parameter estimation. The average measures of GD, AIC and BIC for the estimated WEW distribution are very close to those values obtained from the true distributions used in the generation of the observed values. Hence, the WEW distribution provides consistent MLEs even when the data are generated from different distributions.
Clearly, the goodness-of-fit measures (GD, AIC, and BIC) are lower for the distribution from which the data are generated.
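The three measures compared above are simple functions of the maximized log-likelihood. The sketch below assumes the usual definitions (GD as minus twice the maximized log-likelihood, AIC and BIC as penalized versions of it); the log-likelihood value, number of parameters and sample size are hypothetical.

```python
import math

def fit_measures(loglik, k, n):
    """GD, AIC and BIC from a maximized log-likelihood.

    loglik: maximized log-likelihood, k: number of parameters, n: sample size.
    """
    gd = -2.0 * loglik            # global deviance
    aic = gd + 2.0 * k            # Akaike information criterion
    bic = gd + k * math.log(n)    # Bayesian information criterion
    return {"GD": gd, "AIC": aic, "BIC": bic}

# Hypothetical values for a four-parameter model fitted to n = 100 observations.
measures = fit_measures(loglik=-250.0, k=4, n=100)
print(measures)
```

Lower values indicate a better trade-off between fit and complexity, which is why the generating distribution is expected to attain the smallest values on average.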
6. The LWEW Regression Model
If X has the WEW pdf (5), then has the log-Weibull extended Weibull (LWEW) pdf (with real support) reparameterized in terms of and , which can be expressed as (for )
where and . For , we obtain the log-Weibull Weibull (LWW) model, where is a location and is a scale.
The survival function of Y has the form
The density of (for ) can be expressed as
We construct a regression based on the LWEW distribution
where has pdf (13), is the vector of coefficients, and is the vector of covariates for the ith response , which models the location parameter .
Let F and C denote the sets of individuals whose responses are observed (failures) and censored, respectively. The log-likelihood for can be found from (13) and (14) as
where q is the number of failures, and . The MLE of can be found by maximizing (15).
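The structure of this log-likelihood (log-density contributions for failures, log-survival contributions for censored observations) can be sketched for a generic location-scale regression. Since the LWEW density and survival function are not reproduced here, the example assumes standard smallest extreme value errors as a stand-in; the data, coefficients and function names are hypothetical.

```python
import math

def censored_loglik(y, delta, mu, sigma, log_pdf, log_surv):
    """Sum of log f for failures (delta = 1) and log S for censored cases (delta = 0)."""
    total = 0.0
    for yi, di, mi in zip(y, delta, mu):
        z = (yi - mi) / sigma                      # standardized response
        if di == 1:
            total += log_pdf(z) - math.log(sigma)  # density of y, including the Jacobian
        else:
            total += log_surv(z)
    return total

# Stand-in error distribution (an assumption): standard smallest extreme value.
sev_log_pdf = lambda z: z - math.exp(z)
sev_log_surv = lambda z: -math.exp(z)

# Hypothetical log-times, censoring indicators and one covariate.
y = [1.2, 0.7, 2.0, 1.5]
delta = [1, 0, 1, 1]
x = [1, 1, 2, 2]
beta0, beta1, sigma = 1.0, 0.3, 0.8
mu = [beta0 + beta1 * xi for xi in x]              # linear predictor for the location

q = sum(delta)                                     # number of failures
print(q, censored_loglik(y, delta, mu, sigma, sev_log_pdf, sev_log_surv))
```

Passing the LWEW log-density and log-survival function in place of the stand-ins would give the log-likelihood to be maximized numerically.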
Regression Simulation Study
A simulation study was conducted using the BFGS algorithm in R to examine the accuracy of the MLEs of the LWEW regression model with parameters: , , , and . We considered 1000 Monte Carlo replications for , 50, and 100, and censoring percentages 0%, 10%, 30%, and 66% generated using the inverse transformation method. Occurrences of the Bernoulli distribution with success probability are generated to obtain the censored observations, where p is the percentage of censoring. The location parameter is , where .
The AEs, biases, and MSEs are reported in Table 5. The biases and MSEs usually decrease when n grows. By increasing the percentage of censoring for a fixed sample size, the biases and MSEs decrease for most AEs. Thus, an improvement in the accuracy of the estimators occurs.
Table 5. Simulation from the LWEW regression.
The same behavior is not observed for b. A probable explanation is that the estimators are naturally biased, since under censoring the likelihood function includes contributions from the survival function.
7. Applications
7.1. The WEW Distribution
Consider a data set from the City of São Paulo (Brazil) obtained from the Severe Acute Respiratory Syndrome database on the platform of the Ministry of Health (BD-SRAG at https://opendatasus.saude.gov.br/dataset/srag-2021-a-2023, accessed on: 26 May 2022), which comprises events from 31 December 2021 to March 2022. The data set was filtered to obtain the 162 times (measured in days) of influenza patients from the date of admission to the hospital until cure.
In this application, we fit the WEW distribution and compared it with some special cases and competitive four-parameter distributions: GEW [], KwW [], BW [], Beta Exponentiated Exponential (BEE) [], Kumaraswamy–Gamma (KwGa) [], and Beta–Gamma (BGa) [].
The MLEs and their standard errors (SEs in parentheses) found via the SANN method (with the AdequacyModel, GenSA and MASS libraries of the R software) are reported in Table 6. We adopted the well-known , and KS statistics (with abbreviations in place of full names) to compare the WEW distribution with some competitive distributions. We used AIC, CAIC, and BIC to compare the new distribution with some special cases. The findings are reported in Table 7. Further, the likelihood ratio (LR) statistics in Table 6 support the superiority of the WEW distribution for these data.
Table 6. Estimation results.
Table 7. GoF statistics for some fitted models.
Further, we compared the proposed distribution with the previous models via the generalized likelihood ratio (GLR) test []. The results in Table 8 indicate that the WEW distribution is the most suitable model. The histogram and the best four fitted pdfs are displayed in Figure 5a. Figure 5b reports the empirical and estimated cdfs. They also reveal the superiority of the WEW distribution.
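For reference, the statistic of a Vuong-type GLR test for two non-nested models can be computed from the pointwise log-densities of the fitted models, as in the hedged sketch below. The function name and the input values are hypothetical, and the sketch does not reproduce the computations behind Table 8.

```python
import math

def vuong_statistic(logf, logg):
    """GLR statistic: sum of log-density differences, scaled by sqrt(n) * omega_hat."""
    n = len(logf)
    m = [lf - lg for lf, lg in zip(logf, logg)]      # pointwise log-likelihood ratios
    mean_m = sum(m) / n
    omega2 = sum(v * v for v in m) / n - mean_m ** 2  # estimated variance of the ratios
    return sum(m) / (math.sqrt(n) * math.sqrt(omega2))

# Hypothetical pointwise log-densities of two fitted models at the same observations.
logf = [-1.10, -0.80, -1.50, -0.95, -1.30]
logg = [-1.40, -0.90, -1.45, -1.20, -1.60]
T = vuong_statistic(logf, logg)
print(T)
```

Under the null hypothesis that the two models are equally close to the true distribution, the statistic is approximately standard normal: a large positive value favors the first model, a large negative value favors the second, and values inside the critical region boundaries do not distinguish them.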
Table 8. GLR tests for some fitted models.
Figure 5. (a) Estimated densities. (b) Empirical and estimated cdfs.
7.2. The LWEW Regression Model
We used a data set from a randomized clinical trial carried out to investigate the effect of steroid therapy in the treatment of acute viral hepatitis []. Twenty-nine patients with this disease were randomized to receive either a placebo (lactose) or the steroid (Methylprednisolone) treatment. Each patient was followed for 16 weeks or until death (event of interest) or loss to follow-up (censoring). The observed survival times, in weeks, for the two groups are reported in Table 9. The explanatory variable in this work is taken as: (): treatment (placebo = 1, steroid = 2).
Table 9. Survival times in weeks from hepatitis study.
We fit the LWEW regression model
where has pdf (13).
Some competing models for the regression modeling are: log-gamma extended Weibull (LGEW) [], log-beta Weibull (LBW) [], and log-Kumaraswamy–Weibull or Kumaraswamy Gumbel (KwGu) [].
Table 10 provides the MLEs for the fitted LWEW, LEW, LWW, LGEW, LBW and KwGu regressions obtained via the maxLik function with the BFGS method in R software. The code is available at https://github.com/elisangelacbiazatti/WEW (accessed on 28 April 2023). This table indicates that the LWEW is the best model. The LR statistics support the superiority of the LWEW model over both of its sub-models at the 1% level of significance. Further, the control treatment and the steroid treatment are statistically different. Thus, patients who received control treatment had a shorter time to death than patients who received steroids, since the estimate of the coefficient of the treatment variable () is negative.
Table 10. Estimation results [p values] and adequacy measures for some regressions.
The plots of the Kaplan–Meier and estimated survival functions in Figure 6 support that the LWEW model is the best among the fitted models.
Figure 6. Kaplan–Meier and estimated survival functions.
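The empirical curve in such a comparison is the Kaplan–Meier (product-limit) estimate. A minimal sketch of the estimator follows; the times and censoring indicators are hypothetical and are not the hepatitis data of Table 9.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct failure time (events: 1 = failure, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:   # group ties at time t
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk          # product-limit update
            curve.append((t, surv))
        at_risk -= removed                          # failures and censored cases leave the risk set
    return curve

# Hypothetical survival times in weeks, with follow-up ending at week 16.
times = [1, 3, 3, 6, 8, 12, 16, 16]
events = [1, 1, 0, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 4))
```

Plotting this step function together with the fitted survival functions of the competing regressions, separately for each treatment group, gives a visual check of model adequacy.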
The index plot of the deviance residuals, which are randomly scattered around zero, is reported in Figure 7a. A normal plot with an envelope is shown in Figure 7b. The model fits the data reasonably well.
Figure 7. Deviance residuals. (a) Index plot. (b) Normal plot with envelope.
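The residuals are not defined explicitly in this section. A common choice in survival regression, assumed here, is the martingale-type deviance residual, which only needs the censoring indicator and the fitted survival probability of each observation. The sketch below uses hypothetical inputs and requires fitted survival values strictly between 0 and 1.

```python
import math

def deviance_residual(delta, surv):
    """Martingale-type deviance residual.

    delta: 1 for an observed failure, 0 for a censored observation.
    surv: fitted survival probability at the observed response, with 0 < surv < 1.
    """
    r_m = delta + math.log(surv)                       # martingale residual
    inner = r_m + (math.log(delta - r_m) if delta == 1 else 0.0)
    # inner is non-positive in theory; max() guards against rounding error.
    return math.copysign(math.sqrt(max(0.0, -2.0 * inner)), r_m)

# Hypothetical (delta, fitted survival) pairs.
pairs = [(1, 0.50), (0, 0.50), (1, 0.10), (0, 0.90)]
for d, s in pairs:
    print(d, s, round(deviance_residual(d, s), 4))
```

If the model is adequate, these residuals should scatter randomly around zero in an index plot and lie inside the simulated envelope of the normal plot.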
8. Conclusions
We introduced the Weibull extended Weibull density and provided some of its properties. The consistency of the maximum likelihood estimators is supported by a simulation study. An application to real influenza data revealed its flexibility. We constructed the log-Weibull extended Weibull regression model and performed some simulations to study the behavior of the estimators in small and large samples. We compared its fit to acute viral hepatitis data with those of other existing models and performed a residual analysis for the final model. Overall, the two applications showed the utility of the new models for symmetric and asymmetric data, censored or uncensored. In future work, we can, for example, select other systematic components for the regression model and, as an alternative method, estimate the model parameters with a Bayesian approach.
Author Contributions
Conceptualization, G.M.C. and E.C.B.; methodology, G.M.C., E.C.B. and L.H.d.S.; software, E.C.B.; validation, G.M.C. and E.C.B.; formal analysis, G.M.C. and E.C.B.; investigation, G.M.C. and E.C.B.; data curation, G.M.C. and E.C.B.; writing—original draft preparation, G.M.C., E.C.B. and L.H.d.S.; writing—review and editing, G.M.C., E.C.B. and L.H.d.S. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Stated in the text.
Conflicts of Interest
The authors declare no conflict of interest.
Appendix A. Linear Representation
By the exponential power series, we have
Substituting (A1) in the W-G pdf (with ) []
The generalized binomial theorem holds (for any real c)
where and , . Thus,
For ,
we have
where and .
Thus,
where
Similarly, for , we obtain
Appendix A. Score Vector
The score components in (for ) are
References
- Mudholkar, G.S.; Srivastava, D.K.; Kollia, G.D. A generalization of the Weibull distribution with application to the analysis of survival data. J. Am. Stat. Assoc. 1996, 91, 1575–1583.
- Wienke, A. Frailty Models in Survival Analysis; Chapman and Hall/CRC: Boca Raton, FL, USA, 2011.
- Murthy, D.P.; Xie, M.; Jiang, R. Weibull Models; John Wiley & Sons: Hoboken, NJ, USA, 2004.
- Almalki, S.J.; Nadarajah, S. Modifications of the Weibull distribution: A review. Reliab. Eng. Syst. Saf. 2014, 124, 32–55.
- Ishaq, A.I.; Abiodun, A.A. The Maxwell–Weibull distribution in modeling lifetime datasets. Ann. Data Sci. 2020, 7, 639–662.
- Klakattawi, H.S. Survival analysis of cancer patients using a new extended Weibull distribution. PLoS ONE 2022, 17, e0264229.
- Mudholkar, G.S.; Srivastava, D.K. Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Trans. Reliab. 1993, 42, 299–302.
- Mudholkar, G.S.; Srivastava, D.K.; Freimer, M. The exponentiated Weibull family: A reanalysis of the bus-motor-failure data. Technometrics 1995, 37, 436–445.
- Bourguignon, M.; Silva, R.B.; Cordeiro, G.M. The Weibull-G family of probability distributions. J. Data Sci. 2014, 12, 53–68.
- Tahir, M.H.; Cordeiro, G.M.; Mansoor, M.; Zubair, M.; Alizadeh, M. The Weibull–Dagum Distribution: Properties and Applications. Commun. Stat. Theory Methods 2016, 45, 7376–7398.
- Ishaq, A.I.; Usman, A.; Tasi’u, M.; Aliyu, Y.; Idris, F.A. A new Weibull-Kumaraswamy Distribution: Theory and Applications. Niger. J. Sci. Res. 2017, 16, 158–166.
- Benkhelifa, L. The Weibull Birnbaum-Saunders distribution and its applications. Stat. Optim. Inf. Comput. 2021, 9, 61–81.
- Hassan, A.S.; Mohamed, R.E. Weibull inverse Lomax distribution. Pak. J. Stat. Oper. Res. 2019, 15, 587–603.
- Hussain, N.A.; Doguwa, S.I.S.; Yahaya, A. The Weibull-Power Lomax distribution: Properties and application. Commun. Phys. Sci. 2020, 6, 869–881.
- Biazatti, E.C.; Cordeiro, G.M.; Rodrigues, G.M.; Ortega, E.M.; de Santana, L.H. A Weibull-Beta Prime Distribution to Model COVID-19 Data with the Presence of Covariates and Censored Data. Stats 2022, 5, 1159–1173.
- Chow, E.J.; Rolfes, M.A.; O’Halloran, A.; Alden, N.B.; Anderson, E.J.; Bennett, N.M.; Billing, L.; Dufort, E.; Kirley, P.D.; George, A.; et al. Respiratory and nonrespiratory diagnoses associated with influenza in hospitalized adults. JAMA Netw. Open 2020, 3, e201323.
- Chow, E.J.; Rolfes, M.A.; O’Halloran, A.; Anderson, E.J.; Bennett, N.M.; Billing, L.; Chai, S.; Dufort, E.; Herlihy, R.; Kim, S.; et al. Acute cardiovascular events associated with influenza in hospitalized adults: A cross-sectional study. Ann. Intern. Med. 2020, 173, 605–613.
- Rahman, A.; Jiang, D. Regional and temporal patterns of influenza: Application of functional data analysis. Infect. Dis. Model. 2021, 6, 1061–1072.
- Kauhl, B.; Heil, J.; Hoebe, C.J.P.A.; Schweikart, J.; Krafft, T.; Dukers-Muijrers, N.H.T.M. The Spatial Distribution of Hepatitis C Virus Infections and Associated Determinants—An Application of a Geographically Weighted Poisson Regression for Evidence-Based Screening Interventions in Hotspots. PLoS ONE 2015, 10, e0135656.
- Konerman, M.A.; Beste, L.A.; Van, T.; Liu, B.; Zhang, X.; Zhu, J.; Sameer, D.; Saini, S.D.; Su, G.L.; Nallamothu, B.K.; et al. Machine learning models to predict disease progression among veterans with hepatitis C virus. PLoS ONE 2019, 14, e0208141.
- Razavi, H.; Gonzalez, Y.S.; Yuen, C.; Cornberg, M. Global timing of hepatitis C virus elimination in high-income countries. Liver Int. 2020, 40, 522–529.
- Uçar, S. Analysis of hepatitis B disease with fractal–fractional Caputo derivative using real data from Turkey. J. Comput. Appl. Math. 2023, 419, e114692.
- Kenney, J.F.; Keeping, E.S. Mathematics of Statistics; D. Van Nostrand Company: Toronto, ON, Canada, 1961; Volume 1, p. 429.
- Moors, J. A Quantile Alternative for Kurtosis. J. R. Stat. Soc. Ser. D (Stat.) 1988, 37, 25–32.
- Marinho, P.R.D.; Silva, R.B.; Bourguignon, M.; Cordeiro, G.M.; Nadarajah, S. AdequacyModel: An R package for probability distributions and general purpose optimization. PLoS ONE 2019, 14, e0221487.
- Henningsen, A.; Toomet, O. maxLik: A package for maximum likelihood estimation in R. Comput. Stat. 2011, 26, 443–458.
- Cordeiro, G.M.; Lima, M.D.C.; Gomes, A.E.; da-Silva, C.Q.; Ortega, E.M.M. The Gamma Extended Weibull distribution. J. Stat. Distrib. Appl. 2016, 3, 1–19.
- Cordeiro, G.M.; Ortega, E.M.M.; Nadarajah, S. The Kumaraswamy Weibull distribution with application to failure data. J. Frankl. Inst. 2010, 347, 1399–1429.
- Lee, C.; Famoye, F.; Olumolade, O. Beta-Weibull Distribution: Some Properties and Applications to Censored Data. J. Mod. Appl. Stat. Methods 2007, 6, 173–186.
- Barreto-Souza, W.; Santos, A.H.S.; Cordeiro, G.M. The Beta Generalized Exponential distribution. J. Stat. Comput. Simul. 2010, 80, 159–172.
- Cordeiro, G.M.; de Castro, M. A New Family of Generalized Distributions. J. Stat. Comput. Simul. 2011, 81, 883–898.
- Kong, L.; Lee, C.; Sepanski, J. On the Properties of Beta-Gamma Distribution. J. Mod. Appl. Stat. Methods 2007, 6, 187–211.
- Vuong, Q.H. Likelihood Ratio Tests for Model Selection and Non-Nested Hypotheses. Econom. J. Econom. Soc. 1989, 57, 307–333.
- Gregory, P.B.; Knauer, C.M.; Kempson, R.L.; Miller, R. Steroid Therapy in Severe Viral Hepatitis. N. Engl. J. Med. 1976, 294, 681–687.
- Ortega, E.M.; Cordeiro, G.M.; Kattan, M.W. The Log-Beta Weibull Regression Model with Application to Predict Recurrence of Prostate Cancer. Stat. Pap. 2013, 54, 113–132.
- Cordeiro, G.M.; Nadarajah, S.; Ortega, E.M.M. The Kumaraswamy Gumbel Distribution. Stat. Methods Appl. 2012, 21, 139–168.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).