Article

A New Truncated Lindley-Generated Family of Distributions: Properties, Regression Analysis, and Applications

1 Department of Mathematics and Computer Science, Alexandria University, Alexandria 21544, Egypt
2 Department of Business Administration, College of Business, King Khalid University, Abha 61421, Saudi Arabia
3 Department of Exact Sciences, University of São Paulo, Piracicaba 13418-900, Brazil
4 Department of Statistics, University of Brasilia, Brasilia 70910-900, Brazil
* Author to whom correspondence should be addressed.
Entropy 2023, 25(9), 1359; https://doi.org/10.3390/e25091359
Submission received: 4 August 2023 / Revised: 7 September 2023 / Accepted: 16 September 2023 / Published: 20 September 2023

Abstract

We present the truncated Lindley-G (TLG) model, a novel class of probability distributions with an additional shape parameter, obtained by composing a unit distribution called the truncated Lindley distribution with a parent distribution function $G(x)$. The proposed model's characteristics, including critical points, moments, generating function, quantile function, mean deviations, and entropy, are discussed. We also introduce a regression model based on the truncated Lindley–Weibull distribution considering two systematic components. The model parameters are estimated using the maximum likelihood method. To investigate the behavior of the estimators, simulations are run for various parameter settings, censoring percentages, and sample sizes. Four real datasets are used to demonstrate the new model's potential.
MSC:
62E10; 62F10; 60E05; 62P10; 62J02

1. Introduction

Let $G$ be a cumulative distribution function (cdf) defined on the real line. Several papers have proposed composing a unit distribution with $G$ (the parent cdf) to produce a new cdf. Eugene et al. (2002) [1] combined the cdf of the beta distribution with $G$ to create the Beta-G model with cdf
$$F(x) = I_{G(x)}(a, b),$$
where $I_x(a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt \,/\, B(a, b)$ is the regularized incomplete beta function. Alexander et al. (2012) [2] and Nadarajah et al. (2014b) [3] generalized the Beta-G to the generalized-Beta-G and the modified-Beta-G. Cordeiro and Castro (2011) [4] developed the Kumaraswamy-G model by combining the Kumaraswamy cdf $F(x) = 1 - (1 - x^a)^b$, $x \in [0, 1]$, with the parent cdf $G$.
Given a valid cdf $F(x)$, $x \in \mathbb{R}$, of any continuous distribution, we can construct a unit distribution as a truncated version of $F(x)$ with a cdf (monotonically increasing with $\lim_{x \to 0} F_{UT}(x) = 0$ and $\lim_{x \to 1} F_{UT}(x) = 1$) given by
$$F_{UT}(x) = \frac{F(x)}{F(1)}, \quad x \in [0, 1].$$
The truncated-G (TG) model is constructed by composing this truncated version of the cdf (or its associated survival function $\bar{F}(x)$) with a parent cdf $G$ (or its associated survival function $\bar{G}(x)$) to give the parent distribution additional modeling ability and produce a new family of univariate distributions with cdfs (monotonically increasing with $\lim_{x \to -\infty} F_j(x) = 0$ and $\lim_{x \to \infty} F_j(x) = 1$, $j = 1, 2$) given by
$$\begin{aligned} F_1(x) &= F_{UT}(G(x)), & x \in \mathbb{R}, \\ F_2(x) &= 1 - F_{UT}(1 - G(x)), & x \in \mathbb{R}. \end{aligned} \tag{2}$$
A list of TG models is given in Table 1.
In this paper, we generate a new family of continuous distributions using a truncated version of the Lindley distribution.
A new distribution is useful because it offers an additional option for failure time analysis. Although numerous distributions are already available for this purpose, they do not always capture the behavior of the data being analyzed: different distributions carry different assumptions and properties, and no single distribution fits all scenarios. A new distribution is therefore beneficial when none of the existing options provides a good fit to the data. It may also offer advantages in interpretability, flexibility, or computational efficiency, which can improve the accuracy and reliability of failure time analysis and lets researchers choose the most appropriate model for their data and research objectives.
In several research areas (medicine, engineering, biology, agronomy, etc.), failure times are affected by explanatory variables. In this paper, we propose a regression model with censored observations, based on the truncated Lindley–Weibull distribution, which is a feasible alternative for modeling failure time data. Different simulation studies are also presented to study the behavior of maximum likelihood estimation (MLE), as well as the residual analysis of the proposed regression model. The paper is structured as follows: Section 2 describes the unit truncated Lindley distribution, which is the main component of the proposed new model. We discuss its properties, including moments, mode, quantile function (qf), mean deviations, and generating function. Section 3 discusses the proposed TLG model (linear representation, properties, shapes of the TLG, stochastic representation, truncated Lindley–Weibull (TLW) submodel, and estimation of the parameters using the maximum likelihood method). In Section 4, we propose a regression model based on the TLW distribution and estimate its parameters using maximum likelihood. We also perform simulation studies for the TLW regression model under different sample sizes and censoring proportions. The TLW regression model application is illustrated by examining four real datasets in Section 5. Finally, Section 6 summarizes the results and presents the conclusions.

2. The Unit Truncated Lindley Model

Lindley (1958) [16] first described the Lindley distribution as a lifetime distribution with one parameter. The probability density function (pdf) and the cdf are given by
$$f_L(x; \theta) = \frac{\theta^2}{\theta + 1}(1 + x)e^{-\theta x}, \quad x > 0,\ \theta > 0, \qquad \text{and} \qquad F_L(x; \theta) = 1 - \left(1 + \frac{\theta x}{\theta + 1}\right)e^{-\theta x}, \quad x > 0,\ \theta > 0,$$
respectively. We suggest a new unit distribution, the unit truncated Lindley (UTL) distribution, based on the cdf of the Lindley distribution, which is a truncated form of $F_L(x)$ with the cdf and pdf given by
$$F_{UTL}(x) = C_\theta\left[1 + \theta - (1 + \theta + \theta x)e^{-\theta x}\right], \quad x \in [0, 1],\ \theta \neq 0, \tag{4}$$
$$f_{UTL}(x) = \theta^2 C_\theta (1 + x)e^{-\theta x}, \quad x \in [0, 1],\ \theta \neq 0,$$
where $C_\theta = 1/(1 + \theta - e^{-\theta} - 2\theta e^{-\theta}) > 0$.
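The UTL cdf and pdf can be checked numerically. The following Python sketch evaluates both functions and verifies that the pdf integrates to one for a positive and a negative value of $\theta$. The function names (`c_theta`, `utl_cdf`, `utl_pdf`, `midpoint_integral`) and the two parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def c_theta(theta):
    """Normalizing constant C_theta = 1 / (1 + theta - e^{-theta} - 2 theta e^{-theta})."""
    e = math.exp(-theta)
    return 1.0 / (1.0 + theta - e - 2.0 * theta * e)

def utl_cdf(x, theta):
    """UTL cdf on [0, 1] for theta != 0."""
    return c_theta(theta) * (1.0 + theta - (1.0 + theta + theta * x) * math.exp(-theta * x))

def utl_pdf(x, theta):
    """UTL pdf on [0, 1] for theta != 0."""
    return theta ** 2 * c_theta(theta) * (1.0 + x) * math.exp(-theta * x)

def midpoint_integral(f, a, b, n=20000):
    """Composite midpoint rule, sufficient for a smooth integrand on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

# Illustrative parameter values (assumed): one positive, one negative theta.
for theta in (1.5, -0.5):
    area = midpoint_integral(lambda x: utl_pdf(x, theta), 0.0, 1.0)
    print(theta, area, utl_cdf(0.0, theta), utl_cdf(1.0, theta))
```

The printed area should be close to one and the cdf should run from zero to one, which is a quick sanity check on the constant $C_\theta$.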
The properties of the UTL model are given in Appendix A.

3. The Truncated Lindley-G Model

The truncated Lindley-G (TLG) model is constructed by applying the TG composition scheme (2) to the cdf of the UTL model given in Equation (4), i.e.,
$$F_{TLG}(x) = F_{UTL}(G(x)).$$
That is, the cdf and pdf of the TLG model are given by
$$F_{TLG}(x) = C_\theta\left[1 + \theta - (1 + \theta + \theta G(x))e^{-\theta G(x)}\right], \quad x \in \mathbb{R},\ \theta \neq 0, \tag{6}$$
and
$$f_{TLG}(x) = \theta^2 C_\theta\, g(x)\,(1 + G(x))\,e^{-\theta G(x)}, \quad x \in \mathbb{R},\ \theta \neq 0, \tag{7}$$
where $C_\theta = 1/(1 + \theta - e^{-\theta} - 2\theta e^{-\theta})$.
The main reason for choosing the unit truncated form of the Lindley distribution is to add a new parameter to the parent distribution to generate a new distribution. The properties of the generated distribution will need further investigation, as they are, generally, different from those of the parent distribution.
Using the expansion $e^{-\theta G(x)} = \sum_{i=0}^{\infty} (-1)^i [\theta G(x)]^i / i!$, the TLG cdf (6) admits a linear representation in terms of exponentiated-G (EG) cdfs:
$$F_{TLG}(x) = C_\theta\left\{1 + \theta + \sum_{i=0}^{\infty} \nu_i\left[(\theta + 1)H_i(x) + \theta H_{i+1}(x)\right]\right\}, \tag{8}$$
where $H_j(x) = G^j(x)$ (for $j = i, i + 1$) is the EG cdf with power parameter $j$.
Differentiating (8) with respect to $x$, we obtain the linear representation of the TLG pdf as follows:
$$f_{TLG}(x) = C_\theta \sum_{i=0}^{\infty} \nu_i\left[(\theta + 1)h_i(x) + \theta h_{i+1}(x)\right], \tag{9}$$
where $\nu_i = (-1)^{i+1}\theta^i / i!$, and $h_i(x) = i\, g(x)\, G(x)^{i-1}$ and $h_{i+1}(x) = (i + 1)\, g(x)\, G(x)^i$ are the EG densities with power parameters $i$ and $i + 1$, respectively. On the basis of the linear representation (9), some properties of the TLG model follow from the EG properties reported in several references, such as AL-Hussaini and Ehsanullah (2015) [17]. Henceforth, $Y_i$ denotes an rv having an EG distribution with power parameter $i$ and density $h_i(x)$.
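As a numerical illustration of the linear representation, the sketch below compares a truncated version of the series for the cdf with the closed form of the TLG cdf, both written as functions of the value $G = G(x) \in [0, 1]$. The function names and the truncation at 40 terms are assumptions made for this example only.

```python
import math

def c_theta(theta):
    e = math.exp(-theta)
    return 1.0 / (1.0 + theta - e - 2.0 * theta * e)

def tlg_cdf_closed(G, theta):
    """Closed-form TLG cdf evaluated at G = G(x)."""
    return c_theta(theta) * (1.0 + theta - (1.0 + theta + theta * G) * math.exp(-theta * G))

def tlg_cdf_series(G, theta, terms=40):
    """Truncated linear representation with H_j(x) = G(x)^j and
    nu_i = (-1)^(i+1) theta^i / i!."""
    total = 1.0 + theta
    for i in range(terms):
        nu_i = (-1.0) ** (i + 1) * theta ** i / math.factorial(i)
        total += nu_i * ((theta + 1.0) * G ** i + theta * G ** (i + 1))
    return c_theta(theta) * total

# Illustrative comparison at a few values of G(x) (theta = 2 is assumed).
for G in (0.0, 0.3, 0.9, 1.0):
    print(G, tlg_cdf_closed(G, 2.0), tlg_cdf_series(G, 2.0))
```

Because the coefficients decay factorially, a few dozen terms already reproduce the closed form to machine precision for moderate $\theta$.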

3.1. Some Properties of the TLG Model

3.1.1. Critical Points

As $F_{TLG}(x) = F_{UTL}(G(x))$, we have $f_{TLG}(x) = g(x) f_{UTL}(G(x))$. Hence, the derivative of $f_{TLG}(x)$ is
$$f'_{TLG}(x) = g'(x) f_{UTL}(G(x)) + g^2(x) f'_{UTL}(G(x)).$$
Using the identities $f_{UTL}(y) = \theta^2 C_\theta (1 + y)e^{-\theta y}$ and $f'_{UTL}(y) = \theta^2 C_\theta [1 - \theta(1 + y)]e^{-\theta y}$, the above identity is written as
$$f'_{TLG}(x) = \theta^2 C_\theta e^{-\theta G(x)}\left\{g'(x)(1 + G(x)) + g^2(x)[1 - \theta(1 + G(x))]\right\}.$$
Then, all critical points $x_0$ of $f_{TLG}$ satisfy $f'_{TLG}(x_0) = 0$, or equivalently,
$$[g'(x_0) - \theta g^2(x_0)](1 + G(x_0)) + g^2(x_0) = 0. \tag{10}$$
Depending on the choice of the cdf G, the above equation can be reduced and its maximum (modes) and minimum points characterized. For an example where the function G is chosen to be the Weibull distribution, see Section 3.2.

3.1.2. Moments

Moments allow the examination of some of the distribution's most significant features and characteristics. The $k$th raw moment (for $k = 1, 2, \ldots$) of the TLG model is
$$\mu'_k = \int_{-\infty}^{\infty} x^k f_{TLG}(x)\,dx = \theta^2 C_\theta \int_{-\infty}^{\infty} x^k g(x)(1 + G(x))e^{-\theta G(x)}\,dx = \theta^2 C_\theta \int_0^1 [Q_G(y)]^k (1 + y)e^{-\theta y}\,dy,$$
where $Q_G$ is the qf associated with the parent cdf $G$.
Furthermore, the $k$th raw moment can be expressed from (9) using the moments of the EG distribution as
$$\mu'_k = C_\theta \sum_{i=0}^{\infty} \nu_i\left[(\theta + 1)E(Y_i^k) + \theta E(Y_{i+1}^k)\right].$$
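The integral form of the raw moment over $(0, 1)$ is convenient for numerical work because it only needs the parent quantile function. The sketch below approximates it with a midpoint rule for a Weibull parent; the function names, the grid size, and the parameter values are assumptions chosen for illustration.

```python
import math

def c_theta(theta):
    e = math.exp(-theta)
    return 1.0 / (1.0 + theta - e - 2.0 * theta * e)

def weibull_quantile(y, k, lam):
    """Parent quantile Q_G(y) = lam * (-log(1 - y))^(1/k)."""
    return lam * (-math.log1p(-y)) ** (1.0 / k)

def tlg_raw_moment(order, theta, quantile, n=100000):
    """Midpoint approximation of
    theta^2 C_theta * int_0^1 Q_G(y)^order (1 + y) exp(-theta y) dy."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        total += quantile(y) ** order * (1.0 + y) * math.exp(-theta * y)
    return theta ** 2 * c_theta(theta) * h * total

# Illustrative (assumed) parameter values.
theta, k, lam = 1.0, 2.0, 1.0
q = lambda y: weibull_quantile(y, k, lam)
mean = tlg_raw_moment(1, theta, q)
second = tlg_raw_moment(2, theta, q)
variance = second - mean ** 2
print(mean, variance)
```

Setting `order=0` returns the total probability, which should be one and serves as a check on the implementation.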

3.1.3. Quantile Function

The qf is a highly desirable property in statistical distributions and is especially helpful in the computation of several quantities in statistical modeling and inference. By inverting the cdf of the TLG distribution in (6), the qf of the TLG distribution can be expressed using the qf associated with the parent cdf $G$ as
$$Q_{TLG}(u) = Q_G\left(-1 - \frac{1}{\theta} - \frac{1}{\theta} W\left((u C_\theta^{-1} - \theta - 1)e^{-\theta - 1}\right)\right), \quad u \in (0, 1), \tag{11}$$
where $W$ denotes the Lambert W function. Therefore, $X = Q_{TLG}(U)$ follows the TLG distribution with pdf (7) if $U$ is a uniform variate on the unit interval.
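A minimal sketch of the quantile function follows, restricted to $\theta > 0$, where the argument of the Lambert W function lies in $(-1/e, 0)$ and the solution falls on the $W_{-1}$ branch. To stay within the Python standard library, the Lambert W value is obtained by bisection on $w e^w = a$ rather than by a library call; this substitution, the function names, and the Weibull parent are assumptions of the example.

```python
import math

def c_theta(theta):
    e = math.exp(-theta)
    return 1.0 / (1.0 + theta - e - 2.0 * theta * e)

def utl_cdf(y, theta):
    return c_theta(theta) * (1.0 + theta - (1.0 + theta + theta * y) * math.exp(-theta * y))

def lambert_w_on_interval(a, w_lo, w_hi, iters=200):
    """Solve w * exp(w) = a for w in [w_lo, w_hi] with w_hi <= -1.
    On this interval w * exp(w) is decreasing, so bisection applies."""
    for _ in range(iters):
        mid = 0.5 * (w_lo + w_hi)
        if mid * math.exp(mid) > a:
            w_lo = mid   # value too large: move right, where w * exp(w) is smaller
        else:
            w_hi = mid
    return 0.5 * (w_lo + w_hi)

def utl_quantile(u, theta):
    """UTL quantile for theta > 0 via the Lambert W expression."""
    a = (u / c_theta(theta) - theta - 1.0) * math.exp(-theta - 1.0)
    # For u in [0, 1] the solution lies between -(1 + 2 theta) and -(1 + theta).
    w = lambert_w_on_interval(a, -(1.0 + 2.0 * theta), -(1.0 + theta))
    return -1.0 - 1.0 / theta - w / theta

def tlg_quantile(u, theta, parent_quantile):
    """Q_TLG(u) = Q_G(Q_UTL(u))."""
    return parent_quantile(utl_quantile(u, theta))

# Illustrative Weibull parent with assumed k = 2 and lambda = 1.
weibull_q = lambda y: (-math.log1p(-y)) ** 0.5
median = tlg_quantile(0.5, 1.5, weibull_q)
print(median)
```

Applying `utl_cdf` to the output of `utl_quantile` should return the input probability, which is a convenient round-trip check.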

3.1.4. Mean Deviations

The mean deviations of $X$ about the mean $\mu = E(X)$ and about the median $M$ are given, respectively, by
$$\delta_1 = \int_{-\infty}^{\infty} |x - \mu|\, f_{TLG}(x)\,dx = 2\mu F_{TLG}(\mu) - 2C_\theta \sum_{i=0}^{\infty} \nu_i\left[(\theta + 1)I_i(\mu, 1) + \theta I_{i+1}(\mu, 1)\right],$$
and
$$\delta_2 = \int_{-\infty}^{\infty} |x - M|\, f_{TLG}(x)\,dx = \mu - 2C_\theta \sum_{i=0}^{\infty} \nu_i\left[(\theta + 1)I_i(M, 1) + \theta I_{i+1}(M, 1)\right],$$
where $I_j(t, k)$ is the $k$th incomplete moment of the rv $Y_j$ that has an EG distribution with power parameter $j$ (i.e., $Y_j \sim h_j(x)$).

3.1.5. Moment Generating Function

The mgf of $X \sim \text{TLG}$ can be expressed in an integral form as
$$M_X(t) = E(e^{tX}) = \int_{-\infty}^{\infty} e^{tx} f_{TLG}(x)\,dx = \theta^2 C_\theta \int_{-\infty}^{\infty} g(x)(1 + G(x))e^{tx - \theta G(x)}\,dx = \theta^2 C_\theta \int_0^1 (1 + y)e^{tQ_G(y) - \theta y}\,dy.$$
Furthermore, it can be expressed using the mgf of the EG distribution as
$$M_X(t) = C_\theta \sum_{i=0}^{\infty} \nu_i\left[(\theta + 1)M_i(t) + \theta M_{i+1}(t)\right],$$
where $M_j(t)$ is the mgf of an rv $Y_j$ that has an EG distribution with power parameter $j$ ($Y_j \sim h_j(x)$).

3.1.6. Entropy

Entropy measures the uncertainty of a random system. The Shannon and Rényi entropies are two well-known entropy measures, with larger values indicating greater uncertainty. In this section, we derive the continuous Rényi and Shannon entropies of the TLG distribution. The Rényi entropy $R(\tau)$, where $\tau > 0$ and $\tau \neq 1$, of the TLG distribution is given by
$$R(\tau) = \frac{1}{1 - \tau}\log\int_{-\infty}^{\infty} f_{TLG}^{\tau}(x)\,dx = \frac{1}{1 - \tau}\log\left\{\theta^{2\tau} C_\theta^{\tau} \int_{-\infty}^{\infty} g(x)^{\tau}(1 + G(x))^{\tau} e^{-\tau\theta G(x)}\,dx\right\}.$$
It follows from the expansions $(1 + G(x))^{\tau} = \sum_{j=0}^{\infty} \binom{\tau}{j} G(x)^j$ and $e^{-\tau\theta G(x)} = \sum_{i=0}^{\infty} \frac{(-1)^i}{i!}[\tau\theta G(x)]^i$ that
$$R(\tau) = \frac{1}{1 - \tau}\log\left\{\theta^{2\tau} C_\theta^{\tau} \sum_{j=0}^{\infty}\sum_{i=0}^{\infty} \binom{\tau}{j}\frac{(-1)^i (\tau\theta)^i}{i!} \int_{-\infty}^{\infty} g(x)^{\tau} G(x)^{i+j}\,dx\right\}.$$
The Shannon entropy of the TLG distribution is given by
$$\eta = E\left[-\log f_{TLG}(X)\right] = -\log(\theta^2 C_\theta) - E[\log g(X)] - E[\log(1 + G(X))] + \theta E[G(X)].$$
Using the expansion
$$\log(1 + G(x)) = \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} G^i(x),$$
we have
$$\eta = -\log(\theta^2 C_\theta) - E[\log g(X)] - \sum_{i=1}^{\infty} \frac{(-1)^{i+1}}{i} E[G^i(X)] + \theta E[G(X)],$$
where all expectations are taken with respect to the TLG distribution. Since $F_{TLG}(x) = F_{UTL}(G(x))$, the rv $G(X)$ follows the UTL distribution, so that
$$E[G^i(X)] = \theta^2 C_\theta \int_0^1 y^i (1 + y)e^{-\theta y}\,dy, \quad i = 1, 2, \ldots,$$
are the raw moments of the UTL model.

3.2. Truncated Lindley–Weibull (TLW) Model

Let the parent distribution be the Weibull distribution with shape parameter $k > 0$ and scale parameter $\lambda > 0$, whose cdf and pdf are given by
$$G(x) = G(x; k, \lambda) = 1 - e^{-(x/\lambda)^k} \quad \text{and} \quad g(x) = g(x; k, \lambda) = \frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^k}, \quad x > 0. \tag{12}$$
The cdf and pdf of the truncated Lindley–Weibull (TLW) model are given by
$$F_{TLW}(x) = C_\theta\left\{1 + \theta - \left[1 + \theta + \theta\left(1 - e^{-(x/\lambda)^k}\right)\right]e^{-\theta\left(1 - e^{-(x/\lambda)^k}\right)}\right\}, \quad x, k, \lambda > 0,\ \theta \neq 0,$$
and
$$f_{TLW}(x) = \frac{k\theta^2 C_\theta}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1}\left(2 - e^{-(x/\lambda)^k}\right)e^{-(x/\lambda)^k - \theta\left(1 - e^{-(x/\lambda)^k}\right)}, \quad x, k, \lambda > 0,\ \theta \neq 0, \tag{14}$$
respectively, where $C_\theta$ is as in Equation (7). Note that
$$\lim_{x \to 0^+} f_{TLW}(x) = \begin{cases} \infty, & k < 1, \\ \theta^2 C_\theta / \lambda, & k = 1, \\ 0, & k > 1, \end{cases} \qquad \text{and} \qquad \lim_{x \to \infty} f_{TLW}(x) = 0. \tag{15}$$
The TLW model's pdf is shown in Figure 1 for various values of $\theta$, $k$, and $\lambda$. Figure 1 illustrates how the TLW density is flexible and changes shape depending on the parameter values.
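The TLW cdf and pdf translate directly into code. The sketch below implements both and checks, by a central finite difference, that the pdf is the derivative of the cdf. The function names, the step size, and the parameter values are assumptions made for illustration.

```python
import math

def c_theta(theta):
    e = math.exp(-theta)
    return 1.0 / (1.0 + theta - e - 2.0 * theta * e)

def tlw_cdf(x, theta, k, lam):
    """TLW cdf: the UTL cdf evaluated at the Weibull cdf G(x)."""
    G = 1.0 - math.exp(-(x / lam) ** k)
    return c_theta(theta) * (1.0 + theta - (1.0 + theta + theta * G) * math.exp(-theta * G))

def tlw_pdf(x, theta, k, lam):
    """TLW pdf for x, k, lam > 0 and theta != 0."""
    z = (x / lam) ** k
    e = math.exp(-z)
    scale = k * theta ** 2 * c_theta(theta) / lam
    return scale * (x / lam) ** (k - 1.0) * (2.0 - e) * math.exp(-z - theta * (1.0 - e))

def central_difference(f, x, h=1e-5):
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Illustrative (assumed) parameter values: theta = 1.5, k = 2, lambda = 1.2.
slope = central_difference(lambda t: tlw_cdf(t, 1.5, 2.0, 1.2), 0.8)
print(slope, tlw_pdf(0.8, 1.5, 2.0, 1.2))
```

For $k = 1$, evaluating `tlw_pdf` very close to zero should return approximately $\theta^2 C_\theta / \lambda$, in line with the limit stated above.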

3.2.1. Shapes of the TLW pdf

Considering $G$ and $g$ as given in (12), Equation (10) for the critical points is written as
$$0 = [g'(x_0) - \theta g^2(x_0)](1 + G(x_0)) + g^2(x_0) = [g'(x_0) - \theta g^2(x_0)]\left(2 - e^{-(x_0/\lambda)^k}\right) + g^2(x_0).$$
As $g'(x) = -g(x)\left\{k[(x/\lambda)^k - 1] + 1\right\}/x$, the above identity becomes
$$0 = -g(x_0)\left\{\frac{k[(x_0/\lambda)^k - 1] + 1}{x_0} + \theta g(x_0)\right\}\left(2 - e^{-(x_0/\lambda)^k}\right) + g^2(x_0).$$
Since $g(x) = (k/\lambda)(x/\lambda)^{k-1} e^{-(x/\lambda)^k}$ and $g(x_0) > 0$ for each $x_0 > 0$, the above identity is equivalently written as
$$A(z_0) = B_{\theta,k}(z_0), \tag{16}$$
where, for $z_0 = (x_0/\lambda)^k$ and $\theta \neq 0$, we denote
$$A(z_0) \equiv -z_0 e^{-z_0}, \qquad B_{\theta,k}(z_0) \equiv \frac{2}{\theta}\,\frac{z_0(1 - e^{-z_0})}{2 - e^{-z_0}} + \tau^*, \qquad \text{and} \qquad \tau^* \equiv \frac{1}{\theta}\,\frac{1 - k}{k}.$$
A simple calculation shows that the function $z_0 \mapsto B_{\theta,k}(z_0)$ is increasing (respectively, decreasing) when $\theta > 0$ (respectively, $\theta < 0$). Furthermore, notice that the function $z_0 \mapsto A(z_0)$ reaches the minimum value $-1/e$ at $z_0 = 1$. Using the graphs of the functions $A$ and $B_{\theta,k}$, and varying the parameters $\theta$ and $\tau^*$, we can find the points of intersection of both graphs. Therefore, we can compactly classify the number of roots of Equation (16), as indicated in Table 2.
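The root of Equation (16) can be located numerically. The sketch below treats the case $\theta > 0$ and $k > 1$, where $A - B_{\theta,k}$ is positive near zero and negative for large $z_0$, and finds the sign change by bisection; the resulting critical point $x_0 = \lambda z_0^{1/k}$ is then compared with neighboring values of the TLW pdf. The function names, the bracketing interval, and the parameter values are assumptions of this example.

```python
import math

def A(z):
    return -z * math.exp(-z)

def B(z, theta, k):
    tau_star = (1.0 - k) / (theta * k)
    return (2.0 / theta) * z * (1.0 - math.exp(-z)) / (2.0 - math.exp(-z)) + tau_star

def tlw_mode(theta, k, lam, z_hi=50.0, iters=200):
    """Bisection on A(z) - B(z) for theta > 0 and k > 1."""
    z_lo = 1e-12
    for _ in range(iters):
        mid = 0.5 * (z_lo + z_hi)
        if A(mid) - B(mid, theta, k) > 0.0:
            z_lo = mid
        else:
            z_hi = mid
    z0 = 0.5 * (z_lo + z_hi)
    return lam * z0 ** (1.0 / k)

def tlw_pdf_unnormalized(x, theta, k, lam):
    """TLW pdf up to the constant k theta^2 C_theta / lam (enough to compare heights)."""
    z = (x / lam) ** k
    e = math.exp(-z)
    return (x / lam) ** (k - 1.0) * (2.0 - e) * math.exp(-z - theta * (1.0 - e))

# Illustrative (assumed) values: theta = 1, k = 2, lambda = 1, so tau* = -0.5.
x0 = tlw_mode(1.0, 2.0, 1.0)
print(x0)
```

Evaluating the density slightly to the left and right of `x0` should give smaller values, consistent with a single maximum in this parameter region.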
Based on Table 2, in what follows we divide our analysis into the following cases.
1. If $\theta > 0$
   (a) and $\tau^* \leq -1/e$, then $k > 1$ and, by Table 2, there is a single root, $z_0 = (x_0/\lambda)^k$, of Equation (16). That is, $x_0 = \lambda z_0^{1/k}$, with $k > 1$, is the single critical point of $f_{TLW}$. But, by (15), $\lim_{x \to 0^+} f_{TLW}(x) = \lim_{x \to \infty} f_{TLW}(x) = 0$ for $k > 1$. Consequently, $x_0$ is the single maximum point of the TLW pdf. Hence, for $\theta > 0$ and $\tau^* \leq -1/e$, the TLW pdf is unimodal with mode $x_0$.
   (b) and $-1/e < \tau^* < 0$, then $k > 1$. Following the same steps as in Item 1(a), we have that $f_{TLW}$ is unimodal.
   (c) and $\tau^* \geq 0$, then $k \leq 1$ and, by Table 2, there is no root of Equation (16), i.e., there is no critical point of $f_{TLW}$. But, by (15), $\lim_{x \to 0^+} f_{TLW}(x) = \infty$ for $k < 1$ (and $= \theta^2 C_\theta/\lambda$ for $k = 1$) and $\lim_{x \to \infty} f_{TLW}(x) = 0$. Consequently, for $\theta > 0$ and $\tau^* \geq 0$, the TLW pdf is decreasing.
2. If $\theta < 0$
   (a) and $\tau^* \leq -1/e$, then $k < 1$. Following the same steps as in Item 1(c), we have that $f_{TLW}$ is decreasing.
   (b) and $-1/e < \tau^* < 0$, then $k < 1$ and, by Table 2, there are two roots, $z_0 = (x_0/\lambda)^k$ and $z_1 = (x_1/\lambda)^k$, of Equation (16). In other words, $x_0 = \lambda z_0^{1/k}$ and $x_1 = \lambda z_1^{1/k}$, with $k < 1$, are two critical points of $f_{TLW}$. Without loss of generality, assume that $x_0 < x_1$. By (15), $\lim_{x \to 0^+} f_{TLW}(x) = \infty$ for $k < 1$, and $\lim_{x \to \infty} f_{TLW}(x) = 0$. Consequently, $f_{TLW}$ has a decreasing–increasing–decreasing shape with minimum point $x_0$ and maximum point $x_1$.
   (c) and $\tau^* \geq 0$, then $k \geq 1$. Following the same steps as in Item 1(a), we have that $f_{TLW}$ is unimodal.
Table 3 summarizes the shapes of f T L W obtained in Items 1 and 2 above.
Note that the values of $\theta$ and $\tau^*$ used in Figure 1 are consistent with the pdf shapes listed in Table 3.
Figure 2 illustrates the shapes of the TLW pdf listed in Table 3.

3.2.2. Stochastic Representation

Let $X$ and $Y$ be two random variables with TLW and UTL distributions, respectively. As $F_{TLW}(x) = F_{UTL}(G(x))$ with $G(x) = 1 - e^{-(x/\lambda)^k}$, we obtain
$$F_{TLW}(x) = P(X \leq x) = P(Y \leq G(x)) = P(G^{-1}(Y) \leq x) = P\left(\lambda[-\log(1 - Y)]^{1/k} \leq x\right), \quad \forall x.$$
Therefore, $X$ has the stochastic representation
$$X \overset{d}{=} \lambda[-\log(1 - Y)]^{1/k},$$
with $\overset{d}{=}$ denoting equality in distribution. In addition to generating random numbers, a stochastic representation is useful for determining moments, characteristic functions, quantiles, etc.
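The stochastic representation gives a direct sampling recipe: draw $Y$ from the UTL distribution and transform it. In the sketch below, $Y$ is drawn by inverse-transform sampling with a bisection inversion of the UTL cdf (an implementation choice assumed here for simplicity), and the empirical cdf of the resulting sample is compared with the TLW cdf at one point. The names, seed, sample size, and parameter values are illustrative.

```python
import math
import random

def c_theta(theta):
    e = math.exp(-theta)
    return 1.0 / (1.0 + theta - e - 2.0 * theta * e)

def utl_cdf(y, theta):
    return c_theta(theta) * (1.0 + theta - (1.0 + theta + theta * y) * math.exp(-theta * y))

def utl_sample(theta, rng, iters=50):
    """Draw Y ~ UTL by solving F_UTL(y) = u with bisection on [0, 1]."""
    u = rng.random()
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if utl_cdf(mid, theta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tlw_sample(theta, k, lam, rng):
    """X = lam * (-log(1 - Y))^(1/k) with Y ~ UTL(theta)."""
    y = utl_sample(theta, rng)
    return lam * (-math.log1p(-y)) ** (1.0 / k)

# Illustrative (assumed) settings.
rng = random.Random(0)
theta, k, lam = 1.5, 2.0, 1.0
draws = [tlw_sample(theta, k, lam, rng) for _ in range(10000)]
empirical = sum(x <= 0.7 for x in draws) / len(draws)
theoretical = utl_cdf(1.0 - math.exp(-(0.7 / lam) ** k), theta)
print(empirical, theoretical)
```

The empirical and theoretical probabilities should agree up to Monte Carlo error.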

3.3. Maximum Likelihood Estimation

Let $x_1, \ldots, x_n$ represent the observed values from the TLW model with the pdf given in (14). For the vector of parameters $\Theta = (\theta, k, \lambda)$, the log-likelihood function is given by
$$\ell = \ell(\Theta) = n\left[\log\theta^2 + \log C_\theta + \log k - k\log\lambda\right] + (k - 1)\sum_{i=1}^{n}\log x_i - \sum_{i=1}^{n}\left(\frac{x_i}{\lambda}\right)^k + \sum_{i=1}^{n}\log\left(2 - e^{-(x_i/\lambda)^k}\right) - \theta\sum_{i=1}^{n}\left(1 - e^{-(x_i/\lambda)^k}\right). \tag{17}$$
The elements of the score vector $U(\Theta)$ are
$$U_\theta = n\left[\frac{2}{\theta} - \frac{1 - e^{-\theta} + 2\theta e^{-\theta}}{1 + \theta - e^{-\theta} - 2\theta e^{-\theta}}\right] - \sum_{i=1}^{n}\left(1 - e^{-(x_i/\lambda)^k}\right),$$
$$U_k = \frac{n}{k} - n\log\lambda + \sum_{i=1}^{n}\log x_i - \frac{1}{\lambda^k}\sum_{i=1}^{n} x_i^k(\log x_i - \log\lambda) + \frac{1}{\lambda^k}\sum_{i=1}^{n}\frac{x_i^k(\log x_i - \log\lambda)\,e^{-(x_i/\lambda)^k}}{2 - e^{-(x_i/\lambda)^k}} - \frac{\theta}{\lambda^k}\sum_{i=1}^{n} x_i^k(\log x_i - \log\lambda)\,e^{-(x_i/\lambda)^k},$$
$$U_\lambda = -\frac{nk}{\lambda} + \frac{k}{\lambda^{k+1}}\sum_{i=1}^{n} x_i^k - \frac{k}{\lambda^{k+1}}\sum_{i=1}^{n}\frac{x_i^k\,e^{-(x_i/\lambda)^k}}{2 - e^{-(x_i/\lambda)^k}} + \frac{k\theta}{\lambda^{k+1}}\sum_{i=1}^{n} x_i^k\,e^{-(x_i/\lambda)^k}.$$
The MLEs of the three parameters can be calculated by setting the preceding equations to zero and solving them simultaneously. Since a closed-form estimator for $\Theta$ does not appear to exist, the maximum likelihood estimates of $\Theta = (\theta, k, \lambda)$ can instead be found by direct numerical maximization of (17), treated as a multidimensional nonlinear unconstrained function, using an optimization technique such as BFGS (a quasi-Newton method), SANN, Nelder–Mead, or CG.
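Before handing the log-likelihood to an optimizer, it is worth confirming that the analytic score agrees with numerical derivatives. The sketch below codes the TLW log-likelihood and its three score components and compares them with central finite differences. The data vector is made up for illustration, and the function names and step size are assumptions.

```python
import math

def tlw_loglik(data, theta, k, lam):
    """TLW log-likelihood for an uncensored sample."""
    n = len(data)
    e = math.exp(-theta)
    log_c = -math.log(1.0 + theta - e - 2.0 * theta * e)
    ll = n * (math.log(theta ** 2) + log_c + math.log(k) - k * math.log(lam))
    for x in data:
        z = (x / lam) ** k
        ez = math.exp(-z)
        ll += (k - 1.0) * math.log(x) - z + math.log(2.0 - ez) - theta * (1.0 - ez)
    return ll

def tlw_score(data, theta, k, lam):
    """Analytic score (U_theta, U_k, U_lambda)."""
    n = len(data)
    e = math.exp(-theta)
    D = 1.0 + theta - e - 2.0 * theta * e
    dD = 1.0 - e + 2.0 * theta * e
    u_theta = n * (2.0 / theta - dD / D)
    u_k = n / k - n * math.log(lam)
    u_lam = -float(n)
    for x in data:
        z = (x / lam) ** k
        ez = math.exp(-z)
        w = 1.0 - ez / (2.0 - ez) + theta * ez  # minus the derivative of the log-likelihood term in z
        u_theta -= 1.0 - ez
        u_k += math.log(x) - z * math.log(x / lam) * w
        u_lam += z * w
    return u_theta, u_k, (k / lam) * u_lam

# Hypothetical data and parameter values, for illustration only.
data = [0.5, 0.8, 1.1, 1.4, 2.0]
theta, k, lam, h = 1.2, 1.5, 1.3, 1e-6
fd = (
    (tlw_loglik(data, theta + h, k, lam) - tlw_loglik(data, theta - h, k, lam)) / (2 * h),
    (tlw_loglik(data, theta, k + h, lam) - tlw_loglik(data, theta, k - h, lam)) / (2 * h),
    (tlw_loglik(data, theta, k, lam + h) - tlw_loglik(data, theta, k, lam - h)) / (2 * h),
)
print(fd, tlw_score(data, theta, k, lam))
```

Agreement between the two triples indicates that the score components are consistent with the log-likelihood.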

3.4. Monte Carlo Simulation

By generating $n$ observations from the TLW distribution with varying parameter values, we conduct simulations to assess the performance of the MLEs of the TLW distribution parameters. The BFGS method in the R software is used to estimate the parameter values. The sample sizes considered are $n = 20$, 50, 100, 150, and 300, and the number of replicates is $N = 5000$. The simulation results are evaluated using the mean absolute bias (MAB), the mean square error (MSE), and the average estimates (AEs), where for $\Theta = (\theta, k, \lambda)$ we have
$$MAB(\hat{\Theta}) = \frac{1}{N}\sum_{i=1}^{N}|\hat{\Theta}_i - \Theta|, \qquad MSE(\hat{\Theta}) = \frac{1}{N}\sum_{i=1}^{N}(\hat{\Theta}_i - \Theta)^2, \qquad AE(\hat{\Theta}) = \frac{1}{N}\sum_{i=1}^{N}\hat{\Theta}_i. \tag{18}$$
The results in Table 4 and Table 5 show that the AEs tend to the true values and that the MABs and MSEs vanish as $n$ increases, which is consistent with the asymptotic consistency of the MLEs of the TLW parameters.
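The three summary measures are straightforward to compute from a list of replicate estimates of one parameter. The helper below is a minimal sketch; its name and the toy input are assumptions for illustration.

```python
def summarize_estimates(estimates, true_value):
    """Return the average estimate (AE), mean absolute bias (MAB), and
    mean square error (MSE) over the replicates of a single parameter."""
    n_rep = len(estimates)
    ae = sum(estimates) / n_rep
    mab = sum(abs(e - true_value) for e in estimates) / n_rep
    mse = sum((e - true_value) ** 2 for e in estimates) / n_rep
    return {"AE": ae, "MAB": mab, "MSE": mse}

# Toy replicate estimates of a parameter whose true value is 2 (made up).
summary = summarize_estimates([1.0, 2.0, 3.0], 2.0)
print(summary)
```

In a full study this function would be applied separately to the replicate estimates of $\theta$, $k$, and $\lambda$.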
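Using Equation (11), for the Weibull distribution we have $Q_W(u) = \lambda[-\log(1 - u)]^{1/k}$, implying that the qf of the TLW distribution is
$$Q_{TLW}(u) = \lambda\left\{-\log\left[2 + \frac{1}{\theta} + \frac{1}{\theta} W_{-1}\left((u C_\theta^{-1} - \theta - 1)e^{-\theta - 1}\right)\right]\right\}^{1/k}.$$
The data are generated from
$$X = \lambda\left\{-\log\left[2 + \frac{1}{\theta} + \frac{1}{\theta} W_{-1}\left((U C_\theta^{-1} - \theta - 1)e^{-\theta - 1}\right)\right]\right\}^{1/k}, \quad U \sim U(0, 1).$$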

4. The TLW Regression Model with Censored Data and Two Systematic Components

The statistical analysis of lifetimes is an important topic in areas such as medicine, biology, epidemiology, and engineering. Failure time refers to the time until the occurrence of an event of interest, such as death, the appearance of a tumor, the development of a disease, or the breakdown of an electronic component.
We relate the parameters $\lambda$ and $k$ to the covariates $v = (v_1, \ldots, v_p)^T$ by the logarithmic link functions
$$\lambda_i = \exp(v_i^T \beta_1) \quad \text{and} \quad k_i = \exp(v_i^T \beta_2), \quad i = 1, \ldots, n,$$
respectively, where $\beta_1 = (\beta_{11}, \ldots, \beta_{1p})^T$ and $\beta_2 = (\beta_{21}, \ldots, \beta_{2p})^T$ denote the vectors of regression coefficients and $v_i^T = (v_{i1}, \ldots, v_{ip})$.
The survival function of $X | v$ is given by
$$S(x | v) = 1 - C_\theta\left\{1 + \theta - [1 + \theta + \omega(x | v)]\exp[-\omega(x | v)]\right\}, \tag{19}$$
where
$$\omega(x | v) = \theta\left\{1 - \exp\left[-\left(\frac{x}{\exp(v^T\beta_1)}\right)^{\exp(v^T\beta_2)}\right]\right\}.$$
Equation (19) is referred to as the TLW parametric regression model. This regression model opens new possibilities for fitting many different types of data.
Consider a sample $(x_1, v_1), \ldots, (x_n, v_n)$ of $n$ independent observations, where each random response is defined by $x_i = \min\{x_i^*, c_i\}$, where $c_1, \ldots, c_n$ are the censoring times and $x_1^*, \ldots, x_n^*$ are the lifetimes. We assume non-informative censoring such that the lifetimes and censoring times are independent. Let $F$ and $C$ be the sets of individuals for which $x_i$ is the lifetime or the censoring time, respectively. The total log-likelihood function for $\tau = (\theta, \beta_1^T, \beta_2^T)^T$ reduces to
$$l(\tau) = r\log(\theta^2 C_\theta) + \sum_{i \in F}\log\left(\frac{k_i}{\lambda_i^{k_i}}\right) + \sum_{i \in F}(k_i - 1)\log(x_i) - \sum_{i \in F}\left(\frac{x_i}{\lambda_i}\right)^{k_i} + \sum_{i \in F}\log\left\{2 - \exp\left[-\left(\frac{x_i}{\lambda_i}\right)^{k_i}\right]\right\} - \sum_{i \in F} q(x_i | v_i) + \sum_{i \in C}\log\left(1 - C_\theta\left\{1 + \theta - [1 + \theta + q(x_i | v_i)]\exp[-q(x_i | v_i)]\right\}\right), \tag{20}$$
where $r$ is the number of uncensored observations (failures) and $q(x_i | v_i) = \theta\left\{1 - \exp\left[-(x_i/\lambda_i)^{k_i}\right]\right\}$. The MLE of the vector of unknown parameters is obtained by maximizing the log-likelihood (20). We use the R software to determine $\hat{\tau}$.
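The censored log-likelihood can be coded observation by observation: failures contribute the log-density and censored times contribute the log-survival. The sketch below follows that structure with log links for $\lambda_i$ and $k_i$; the function name, argument layout, and the small dataset are hypothetical and serve only to illustrate the formula.

```python
import math

def c_theta(theta):
    e = math.exp(-theta)
    return 1.0 / (1.0 + theta - e - 2.0 * theta * e)

def tlw_reg_loglik(theta, beta1, beta2, x, delta, V):
    """Censored TLW regression log-likelihood.
    delta[i] = 1 marks a failure, delta[i] = 0 a censored time;
    V[i] is the covariate vector of observation i (including the intercept)."""
    C = c_theta(theta)
    total = 0.0
    for xi, di, vi in zip(x, delta, V):
        lam = math.exp(sum(b * v for b, v in zip(beta1, vi)))
        k = math.exp(sum(b * v for b, v in zip(beta2, vi)))
        z = (xi / lam) ** k
        q = theta * (1.0 - math.exp(-z))
        if di == 1:
            total += (math.log(theta ** 2 * C) + math.log(k) - k * math.log(lam)
                      + (k - 1.0) * math.log(xi) - z
                      + math.log(2.0 - math.exp(-z)) - q)
        else:
            survival = 1.0 - C * (1.0 + theta - (1.0 + theta + q) * math.exp(-q))
            total += math.log(survival)
    return total

# Hypothetical data: intercept plus one binary covariate.
x = [0.9, 1.4, 2.1, 0.6]
delta = [1, 0, 1, 1]
V = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
ll = tlw_reg_loglik(0.6, [0.3, 0.4], [0.2, 0.5], x, delta, V)
print(ll)
```

A function of this form can be passed, with its sign flipped, to any general-purpose minimizer to obtain the estimates.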

4.1. Residual Analysis

For the TLW regression model with censored observations, we present two types of residuals to evaluate deviations from the error assumptions and detect outliers. The deviance residuals have been used more frequently in the literature because they take into account the information of censored times, and they can also be used with the TLW regression model. A reliable method for detecting atypical observations and confirming that the fitted model is adequate is to plot the deviance residuals against the observed times. The deviance residual can be expressed as
$$r_{D_i} = \text{sign}(r_{M_i})\left\{-2\left[r_{M_i} + \delta_i\log(\delta_i - r_{M_i})\right]\right\}^{1/2},$$
where
$$r_{M_i} = \begin{cases} 1 + \log\left(1 - C_{\hat{\theta}}\left\{1 + \hat{\theta} - [1 + \hat{\theta} + \hat{q}(x_i | v_i)]\exp[-\hat{q}(x_i | v_i)]\right\}\right), & \text{if } \delta_i = 1, \\ \log\left(1 - C_{\hat{\theta}}\left\{1 + \hat{\theta} - [1 + \hat{\theta} + \hat{q}(x_i | v_i)]\exp[-\hat{q}(x_i | v_i)]\right\}\right), & \text{if } \delta_i = 0, \end{cases}$$
is the martingale residual, $\delta_i = 1$ means that the observation is uncensored, $\delta_i = 0$ means that the observation is censored, and
$$\hat{q}(x_i | v_i) = \hat{\theta}\left\{1 - \exp\left[-\left(\frac{x_i}{\hat{\lambda}_i}\right)^{\hat{k}_i}\right]\right\}.$$
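Both residuals depend on the fitted model only through the estimated survival probability $\hat{S}(x_i | v_i)$, since the martingale residual equals $\delta_i + \log\hat{S}(x_i | v_i)$. The sketch below takes that probability as input; the function names and the example values are assumptions for illustration.

```python
import math

def martingale_residual(surv, delta):
    """r_M = delta + log(S_hat), with surv in (0, 1) and delta in {0, 1}."""
    return delta + math.log(surv)

def deviance_residual(surv, delta):
    """r_D = sign(r_M) * sqrt(-2 * [r_M + delta * log(delta - r_M)])."""
    r_m = martingale_residual(surv, delta)
    # For censored observations (delta = 0) the logarithmic term vanishes.
    inner = r_m + (math.log(delta - r_m) if delta == 1 else 0.0)
    sign = (r_m > 0) - (r_m < 0)
    # max(...) guards against tiny negative values caused by rounding.
    return sign * math.sqrt(max(0.0, -2.0 * inner))

# Illustrative fitted survival probabilities (made up).
print(deviance_residual(0.5, 0), deviance_residual(0.9, 1), deviance_residual(0.1, 1))
```

Censored observations always yield non-positive residuals, while failures can fall on either side of zero depending on how early they occur relative to the fitted survival curve.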

4.2. Simulation Study

To assess the accuracy of the MLEs of the TLW regression model, we carried out a simulation study for different censoring percentages and sample sizes $n = 100$, 300, and 500. For each sample size, we carried out $N = 1000$ replicates and considered the approximate censoring percentages 0%, 10%, and 30%. A covariate $v_1 \sim \text{binomial}(1, 0.5)$ enters the following systematic components:
$$\lambda_i = \exp(\beta_{10} + \beta_{11} v_{1i}) \quad \text{and} \quad k_i = \exp(\beta_{20} + \beta_{21} v_{1i}).$$
The inverse transformation method is used to obtain the lifetimes $x_1, \ldots, x_n$ from the $\text{TLW}(\lambda_i, k_i, \theta)$ distribution, and the censoring times $c_1, \ldots, c_n$ are drawn from a uniform distribution on $(0, \gamma)$, where $\gamma$ controls the censoring percentage. The true values used for generation are $\beta_{10} = 0.3$, $\beta_{11} = 0.4$, $\beta_{20} = 0.2$, $\beta_{21} = 0.5$, and $\theta = 0.6$.
The results are checked for $\hat{\tau} = (\hat{\beta}_{10}, \hat{\beta}_{11}, \hat{\beta}_{20}, \hat{\beta}_{21}, \hat{\theta})$ using the MABs, MSEs, and AEs given in (18), where here $\Theta = \tau$. The simulation process is as follows:
(i) Generate $v_{1i} \sim \text{binomial}(1, 0.5)$, $i = 1, \ldots, n$;
(ii) Calculate $\lambda_i = \exp(\beta_{10} + \beta_{11} v_{1i})$ and $k_i = \exp(\beta_{20} + \beta_{21} v_{1i})$;
(iii) Generate $x_i^* \sim \text{TLW}(\lambda_i, k_i, \theta)$;
(iv) Generate $c_i \sim \text{uniform}(0, \gamma)$;
(v) Calculate the survival times $x_i = \min(x_i^*, c_i)$;
(vi) If $x_i^* < c_i$, then $\delta_i = 1$; otherwise, $\delta_i = 0$, for $i = 1, \ldots, n$, where $\delta_i$ is the censoring indicator;
(vii) Calculate the AEs, MABs, and MSEs.
Table 6 displays these values. It is verified that for all scenarios the averages of the estimates approach the true values of the parameters and the MABs and MSEs decrease as the sample size increases. These results illustrate that the estimates are consistent, even at higher censoring percentages.
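Steps (i) to (vi) of the simulation process can be sketched as follows. The lifetimes are drawn by inverse-transform sampling, here with a bisection inversion of the UTL cdf followed by the Weibull quantile (an implementation choice assumed for this example). The value of $\gamma$, the seed, and all function names are hypothetical; the regression coefficients are the true values quoted in the text.

```python
import math
import random

def c_theta(theta):
    e = math.exp(-theta)
    return 1.0 / (1.0 + theta - e - 2.0 * theta * e)

def utl_cdf(y, theta):
    return c_theta(theta) * (1.0 + theta - (1.0 + theta + theta * y) * math.exp(-theta * y))

def utl_quantile(u, theta, iters=50):
    """Invert the UTL cdf on [0, 1] by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if utl_cdf(mid, theta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simulate_censored_tlw(n, beta1, beta2, theta, gamma, rng):
    """Return a list of (x_i, delta_i, v_1i) following steps (i)-(vi)."""
    rows = []
    for _ in range(n):
        v1 = 1 if rng.random() < 0.5 else 0               # (i) binary covariate
        lam = math.exp(beta1[0] + beta1[1] * v1)           # (ii) systematic components
        k = math.exp(beta2[0] + beta2[1] * v1)
        y = utl_quantile(rng.random(), theta)              # (iii) lifetime by inversion
        x_star = lam * (-math.log1p(-y)) ** (1.0 / k)
        c = rng.uniform(0.0, gamma)                        # (iv) censoring time
        x = min(x_star, c)                                 # (v) observed time
        delta = 1 if x_star < c else 0                     # (vi) censoring indicator
        rows.append((x, delta, v1))
    return rows

rng = random.Random(2023)
sample = simulate_censored_tlw(300, (0.3, 0.4), (0.2, 0.5), 0.6, gamma=5.0, rng=rng)
censoring_rate = 1.0 - sum(d for _, d, _ in sample) / len(sample)
print(censoring_rate)
```

In practice $\gamma$ would be tuned until the observed censoring rate is close to the target percentage, and step (vii) would be applied to the estimates obtained over the replicates.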

5. Data Analysis

To compare the new distribution with some other models, we use two real datasets originating from different fields. We compare the fits of the TLW model to those of the parent Weibull model (W), the Kumaraswamy–Weibull model (KW) from Cordeiro and Castro (2011) [4], the Weibull–Weibull model (WW) from Alzaatreh et al. [18], the Geometric–Poisson–Weibull model (GPW) from Nadarajah et al. (2013) [19], the Poisson–Weibull model (PW) from Ristic and Nadarajah (2013) [5], the beta-Weibull model (BW) from Eugene et al. (2002) [1], the Marshall–Olkin–Weibull model (MOW) from Marshall and Olkin (1997) [20], and the exponentiated generalized Weibull model (EGW) from Cordeiro et al. (2013) [21]. The cdfs of these models are provided in Appendix B. The parameter estimates are computed by maximizing (17) using the BFGS method available in the AdequacyModel package in the R software [22].
The considered models are compared according to a collection of statistics (AIC, CAIC, BIC, HQIC, and the minus maximized log-likelihood, $-\hat{\ell}$), which assess the relative degree of fit of these models to a dataset.
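The source does not state the formulas behind these statistics. The sketch below uses the commonly adopted definitions, which are an assumption here: AIC $= -2\hat{\ell} + 2p$, CAIC $= -2\hat{\ell} + 2pn/(n - p - 1)$ (the small-sample corrected AIC), BIC $= -2\hat{\ell} + p\log n$, and HQIC $= -2\hat{\ell} + 2p\log(\log n)$, where $p$ is the number of parameters and $n$ the sample size. The function name and the example inputs are hypothetical.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Information criteria from a maximized log-likelihood (assumed standard definitions)."""
    p, n = n_params, n_obs
    return {
        "AIC": -2.0 * loglik + 2.0 * p,
        "CAIC": -2.0 * loglik + 2.0 * p * n / (n - p - 1),
        "BIC": -2.0 * loglik + p * math.log(n),
        "HQIC": -2.0 * loglik + 2.0 * p * math.log(math.log(n)),
    }

# Made-up inputs: a three-parameter model fitted to 130 observations.
ic = information_criteria(loglik=-100.0, n_params=3, n_obs=130)
print(ic)
```

Smaller values indicate a better trade-off between fit and complexity, so the model with the lowest criteria is preferred.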
We also performed an application of the TLW regression model considering censored data. We compared different systematic components for the proposed new regression model and the Weibull regression model. In this part we use the RS algorithm in the gamlss package in the R software to maximize the log-likelihood function (20) and we use the AIC and global deviance (GD) statistics to select the most suitable models.
  • Dataset I: Temperature Dataset
This dataset, reported by Barakat et al. (2014) [23], contains the average July temperatures (°C) for Neuenburg, Switzerland, between 1864 and 1993. The observations are as follows.
19.0 20.1 18.4 17.4 19.7 21.0 21.4 19.2 19.9 20.4 20.9 17.2 20.2
17.8 18.1 15.6 19.4 21.7 16.2 16.4 19.0 20.6 19.0 20.7 15.8 17.7
16.8 17.1 18.1 18.4 18.7 18.7 18.4 19.2 18.0 18.7 20.7 19.4 19.2
17.4 22.0 21.4 19.3 16.8 18.2 16.2 15.9 22.1 17.5 15.3 16.5 17.4
17.0 18.3 18.3 15.3 18.2 21.5 17.0 21.6 18.2 18.1 17.6 18.2 22.6
19.9 17.1 17.2 17.3 19.4 20.1 20.1 17.0 19.4 17.5 16.8 17.0 19.9
18.2 19.2 18.5 20.8 19.5 21.1 15.8 21.3 21.2 18.8 22.3 18.6 16.8
18.2 17.2 18.4 18.7 21.1 16.3 17.4 18.0 19.5 21.2 16.8 17.4 20.7
18.4 19.8 18.7 20.5 18.3 18.2 18.2 19.2 20.2 18.2 17.4 19.2 16.3
17.4 20.3 23.4 19.2 20.2 19.3 19.0 18.8 20.3 19.7 20.7 19.6 18.1
The MLEs and 95% CIs for the model parameters are shown in Table 7. Table 8 reports the adequacy statistics of the considered models.
According to the adequacy statistics presented in Table 8, the TLW model fits the dataset with the lowest AIC, CAIC, BIC, HQIC, and minus log-likelihood among all considered models. Therefore, it may be a viable option for modeling these data. Figure 3 compares the empirical and fitted distributions of the data, displaying the histogram and fitted pdf, the fitted and empirical cdfs, the P–P plot, and the Q–Q plot, respectively, to show graphically the appropriateness of the TLW model for these data.
  • Dataset II: Breaking Stress of Carbon Fibers
This dataset contains the breaking stress of 64 single carbon fibers of gauge length 10 mm (Cheng and Traylor (1970) [24]). The observations are as follows.
1.901 2.132 2.203 2.228 2.257 2.35 2.361 2.396 2.397 2.4450 2.454
2.454 2.474 2.518 2.522 2.525 2.532 2.575 2.614 2.616 2.618 2.624
2.659 2.675 2.738 2.74 2.856 2.917 2.928 2.937 2.937 2.977 2.996
3.03 3.125 3.139 3.145 3.22 3.223 3.235 3.243 3.264 3.272 3.294
3.332 3.346 3.377 3.408 3.435 3.493 3.501 3.537 3.554 3.562 3.628
3.852 3.871 3.886 3.971 4.024 4.027 4.225 4.395 5.02
Table 9 displays the MLEs and 95% CIs for the model parameters. According to Table 10, the TLW model fits the dataset with the lowest AIC, CAIC, BIC, HQIC, and minus log-likelihood among all considered models. Therefore, it may be a viable option for modeling these data. Figure 4 compares the empirical and fitted distributions of the data, displaying the histogram and fitted pdf, the fitted and empirical cdfs, the P–P plot, and the Q–Q plot to show graphically the appropriateness of the TLW model for these data.
  • Dataset III: COVID-19
In this application we consider the regression model for censored data. This dataset refers to patients hospitalized with COVID-19. The disease is caused by the pathogen identified as a new coronavirus, denominated severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). The epidemiological data were tallied by the Health Information System of the Brazilian government, and are available at https://opendatasus.saude.gov.br/dataset/srag-2020 (accessed on 1 May 2023).
This study involved 195 patients hospitalized in the city of Campinas, state of São Paulo, in May 2020, with infection confirmed by RT-PCR and classified as SARS caused by COVID-19. The survival time consisted of the time in days from the date of first symptoms to the date of evolution of the case, either death (failure) or end of observation (censoring). The censoring percentage was 56.92% and the following variables were considered ($i = 1, \ldots, 195$):
  • $x_i$: observed time (in days);
  • $\text{cens}_i$: censoring indicator (0 = censored, 1 = observed lifetime);
  • $v_{i1}$: sex (1 = male, 0 = female);
  • $v_{i2}$: age (in years).
There were 110 male patients (56.41%), of whom 42 (38.18%) died, while of the 85 women (43.59%), there were 42 deaths (49.41%). Figure 5a presents the Kaplan–Meier survival curve broken down by sex. It can be seen that men had a higher risk of death. Figure 5b depicts the histogram of the ages, where the greatest frequency was in the category from 50 to 75 years old.
We compared the TLW regression model with the Weibull regression model based on the following systematic components:
  • $M_0$: $\log(\lambda_i) = \beta_{10}$ and $\log(k_i) = \beta_{20}$;
  • $M_1$: $\log(\lambda_i) = \beta_{10} + \beta_{11} v_{i1} + \beta_{12} v_{i2}$ and $\log(k_i) = \beta_{20}$;
  • $M_2$: $\log(\lambda_i) = \beta_{10}$ and $\log(k_i) = \beta_{20} + \beta_{21} v_{i1} + \beta_{22} v_{i2}$;
  • $M_3$: $\log(\lambda_i) = \beta_{10} + \beta_{11} v_{i1} + \beta_{12} v_{i2}$ and $\log(k_i) = \beta_{20} + \beta_{21} v_{i1} + \beta_{22} v_{i2}$.
Table 11 reports the values of the selection criteria of the models, in which the M 3 -TLW model was superior to the others. We also compared this model with the M 3 -Weibull model by means of the residuals in Figure 6. In turn, Figure 6a,c illustrate the residuals versus the index of the observations, showing that both models have residuals with random behavior around zero, and no point is outside the interval ( 3 , 3 ) . Nevertheless, Figure 6b,d indicate that the TLW model behaved better, with all the points within the simulated envelope, denoting its superiority. Finally, we illustrate the Kaplan–Meier curves and estimated survival curves in Figure 7 for the TLW model, showing that this model is able to capture the non-proportional curves of this dataset. The results of this model are shown in Table 12. Some conclusions can be obtained as follows.
Interpretations for $\lambda$:
  • A significant difference exists between men and women in relation to survival time (men have shorter survival). Various other studies have also indicated significant differences between the sexes (see [25,26]);
  • Survival time declines with advancing age. This result corroborates the findings of several studies indicating that older age is a predictor of higher mortality caused by COVID-19 (see [27,28,29]).
Interpretations for $k$:
  • A significant difference exists between men and women with regard to the variability in survival time;
  • The variability in survival time increases with the age of the patients.
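To make the signs of the coefficients concrete, the following minimal sketch plugs the Table 12 estimates into the two log-link systematic components of model $M_3$. The function name `tlw_params` and the example age of 60 years are illustrative choices of this sketch, not part of the original analysis.

```python
import math

# MLEs of the M3-TLW regression fitted to the COVID-19 data (Table 12).
BETA_LAMBDA = (7.8325, -0.419, -0.0467)   # beta_10, beta_11 (sex), beta_12 (age)
BETA_K = (-0.4240, 0.4605, 0.0096)        # beta_20, beta_21 (sex), beta_22 (age)

def tlw_params(sex, age):
    """Return (lambda_i, k_i) from the log links of model M3.

    sex: 1 = male, 0 = female; age in years.
    (Hypothetical helper written for this illustration.)
    """
    log_lam = BETA_LAMBDA[0] + BETA_LAMBDA[1] * sex + BETA_LAMBDA[2] * age
    log_k = BETA_K[0] + BETA_K[1] * sex + BETA_K[2] * age
    return math.exp(log_lam), math.exp(log_k)

lam_m, k_m = tlw_params(sex=1, age=60)
lam_f, k_f = tlw_params(sex=0, age=60)

# At equal age, the male/female ratio of lambda equals exp(beta_11).
print(lam_m / lam_f, math.exp(BETA_LAMBDA[1]))
# At equal age, the male/female ratio of k equals exp(beta_21).
print(k_m / k_f, math.exp(BETA_K[1]))
```

Under these estimates, $\lambda$ for a man is about 0.66 times that for a woman of the same age, since $\exp(-0.419) \approx 0.66$, which matches the direction of the first interpretation above.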
  • Dataset IV: Post-harvested
In this application, we consider the regression model for uncensored data. The data refer to the Musa acuminata banana species from a banana plantation in the Philippines. A total of $n = 194$ banana tiers were chosen randomly, and the numerical values of the RGB colors (red, green, and blue) were obtained from images of four banana classes (extra class, class I, class II, and reject), which contain 65, 49, 30, and 50 samples, respectively. The dataset is available in the repository https://data.mendeley.com/datasets/zk3tkxndjw/2 (accessed on 20 May 2023), and more details can be found in [30]. Each banana tier sample was photographed against a white background in six different views: front, back, left, right, top, and bottom. Here, we consider the values of B in the front view. Figure 8 displays a boxplot by class, in which differences between the colors according to the class can be observed.
The variables considered are ($i = 1, \ldots, 194$):
  • $x_i$: color value;
  • $v_{ij}$: banana class (factor with four levels, defined by three dummy variables, $j = 1, 2, 3$).
We verified the relationship between colors and classes from the TLW and Weibull models according to the following systematic components:
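  • $M_0$: $\log(\lambda_i) = \beta_{10}$ and $\log(k_i) = \beta_{20}$;
  • $M_1$: $\log(\lambda_i) = \beta_{10} + \sum_{j=1}^{3} \beta_{1j} v_{ij}$ and $\log(k_i) = \beta_{20}$;
  • $M_2$: $\log(\lambda_i) = \beta_{10}$ and $\log(k_i) = \beta_{20} + \sum_{j=1}^{3} \beta_{2j} v_{ij}$;
  • $M_3$: $\log(\lambda_i) = \beta_{10} + \sum_{j=1}^{3} \beta_{1j} v_{ij}$ and $\log(k_i) = \beta_{20} + \sum_{j=1}^{3} \beta_{2j} v_{ij}$.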
Table 13 displays the AIC and GD values for these fitted models. The $M_3$-TLW model has the lowest values and can therefore be chosen as the best model. In addition, we compare the $M_3$-TLW and $M_3$-Weibull models by means of the quantile residuals (Figure 9). These plots agree with the results of Table 13: for the Weibull model, a high percentage of points fall outside the confidence band (Figure 9e), and there are also many deviations from the confidence band of the worm plot (Figure 9f).
Finally, Table 14 presents the MLEs, SEs, and p-values of the $M_3$-TLW model, in which class I, class II, and the extra class are compared with the reject class. The following conclusions can be drawn. There is a significant difference between the color of class I and that of the reject class; the effect is positive, that is, class I presented higher color values. Class II and the extra class do not differ significantly from the reject class. The colors of the extra class and class I affect the shape of the distribution compared to the color of the reject class.

6. Conclusions

In this study, we propose a new class of distributions called the truncated Lindley-G (TLG) family, with application to the three-parameter truncated Lindley–Weibull (TLW) distribution. Several structural properties of the TLG family are discussed, including an expansion of the density function, critical points, explicit expressions for the ordinary and incomplete moments, mean deviations, generating function, entropy, and quantile function. The parameters of the model are estimated using the maximum likelihood method. We fitted the TLW model to two datasets to demonstrate the effectiveness of the proposed distribution. In comparison to the Kumaraswamy–Weibull, Weibull–Weibull, Geometric–Poisson–Weibull, Poisson–Weibull, beta-Weibull, Marshall–Olkin–Weibull, and exponentiated generalized Weibull distributions, the proposed model provided a better fit on both datasets. However, its goodness-of-fit measures were not drastically better than those of the comparison models currently used in statistical analyses. Based on this new distribution, we propose a TLW regression model with two systematic components, which is suitable for modeling censored and uncensored data. Several simulation studies are performed for different parameter settings, sample sizes, and censoring percentages. We anticipate further applications of the proposed model in areas such as engineering, survival and lifetime analysis, and economics.

Author Contributions

Conceptualization, M.H., G.M.R., E.M.M.O., R.V. and H.E.; methodology, M.H., G.M.R., E.M.M.O., R.V. and H.E.; software, M.H., G.M.R., E.M.M.O., R.V. and H.E.; investigation, M.H., G.M.R., E.M.M.O., R.V. and H.E.; writing—original draft preparation, M.H. and H.E.; writing—review and editing, M.H. and E.M.M.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received funding from Deanship of Scientific Research at King Khalid University through General Research Project under grant number GRP/206/44, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior–Brasil (CAPES) and Conselho Nacional de Desenvolvimento Científico e Tecnológico—Brasil (CNPq).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Stated in the text.

Acknowledgments

The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through a General Research Project under grant number GRP/206/44. Also, this study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brasil (CAPES) and Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasil (CNPq).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The UTL model has the following properties:
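(1) Moments
The $k$th raw moment ($k = 1, 2, \ldots$) of the UTL distribution is given by
$$\mu'_k = \frac{\theta^2}{C_\theta}\left[-\frac{e^{-\theta}}{\theta} + \left(1 + \frac{k+1}{\theta}\right) d_k\right],$$
where $d_k = \int_0^1 x^k e^{-\theta x}\,dx$. Using integration by parts, $d_k$ can be calculated recursively by
$$d_k = \frac{1}{\theta}\left(k\, d_{k-1} - e^{-\theta}\right), \quad k = 1, 2, \ldots,$$
and
$$d_0 = \frac{1}{\theta}\left(1 - e^{-\theta}\right).$$
The first three moments are
$$\mu'_1 = \frac{2 + \theta - 2e^{-\theta} - 3\theta e^{-\theta} - 2\theta^2 e^{-\theta}}{C_\theta\,\theta},$$
$$\mu'_2 = \frac{6 + 2\theta - 6e^{-\theta} - 8\theta e^{-\theta} - 5\theta^2 e^{-\theta} - 2\theta^3 e^{-\theta}}{C_\theta\,\theta^2},$$
and
$$\mu'_3 = \frac{24 + 6\theta - 24e^{-\theta} - 30\theta e^{-\theta} - 18\theta^2 e^{-\theta} - 7\theta^3 e^{-\theta} - 2\theta^4 e^{-\theta}}{C_\theta\,\theta^3}.$$
The $k$th incomplete moment of $X$ is given by
$$I_X(t; k) = E\left(X^k\,\mathbb{1}_{\{X \le t\}}\right) = \int_0^t x^k f_{UTL}(x)\,dx = \frac{\theta^2}{C_\theta}\left[-\frac{t^{k+1} e^{-\theta t}}{\theta} + \left(1 + \frac{k+1}{\theta}\right) d_{t,k}\right],$$
where $d_{t,k} = \int_0^t x^k e^{-\theta x}\,dx$. Using integration by parts, $d_{t,k}$ can be calculated recursively by
$$d_{t,k} = \frac{1}{\theta}\left(k\, d_{t,k-1} - t^k e^{-\theta t}\right), \quad k = 1, 2, \ldots,$$
and
$$d_{t,0} = \frac{1}{\theta}\left(1 - e^{-\theta t}\right).$$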
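The recursion for $d_k$ lends itself to a direct numerical check. The sketch below is a minimal illustration, assuming the UTL pdf has the form $\theta^2 (1+x) e^{-\theta x} / C_\theta$ on $(0,1)$ with $C_\theta = 1 + \theta - (1 + 2\theta) e^{-\theta}$; this form is inferred here as the one consistent with the moment expressions above, since the definition itself is not restated in this appendix. All function names are illustrative.

```python
import math

def c_theta(theta):
    # ASSUMPTION: normalizing constant C_theta = 1 + theta - (1 + 2*theta) * exp(-theta),
    # the value that makes theta^2 (1 + x) exp(-theta x) / C_theta integrate to 1 on (0, 1).
    return 1.0 + theta - (1.0 + 2.0 * theta) * math.exp(-theta)

def d(k, theta):
    """d_k = integral_0^1 x^k exp(-theta x) dx, computed by the recursion."""
    value = (1.0 - math.exp(-theta)) / theta              # d_0
    for j in range(1, k + 1):
        value = (j * value - math.exp(-theta)) / theta    # d_j from d_{j-1}
    return value

def raw_moment(k, theta):
    """k-th raw moment from the closed form in terms of d_k."""
    e = math.exp(-theta)
    return theta**2 / c_theta(theta) * (-e / theta + (1.0 + (k + 1) / theta) * d(k, theta))

def raw_moment_numeric(k, theta, n=2000):
    """Composite Simpson approximation of integral_0^1 x^k f(x) dx (n must be even)."""
    h = 1.0 / n
    f = lambda x: x**k * theta**2 * (1.0 + x) * math.exp(-theta * x) / c_theta(theta)
    s = f(0.0) + f(1.0)
    s += 4.0 * sum(f((2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2.0 * sum(f(2 * i * h) for i in range(1, n // 2))
    return s * h / 3.0

theta = 2.0
for k in (1, 2, 3):
    print(k, raw_moment(k, theta), raw_moment_numeric(k, theta))
```

The forward recursion is convenient for the first few moments; for large $k$ it may lose accuracy, so a numerical integral is a useful cross-check.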
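(2) Mode
The mode of the UTL distribution is
$$\text{Mode} = \begin{cases} \dfrac{1-\theta}{\theta} & \text{if } 0.5 \le \theta \le 1, \\ 0 & \text{if } \theta > 1, \\ 1 & \text{if } \theta < 0.5. \end{cases}$$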
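The three cases can be traced back to a one-line derivative. The following short derivation is a sketch that assumes the UTL pdf is proportional to $(1+x)e^{-\theta x}$ on $(0,1)$, which is the form consistent with the moment expressions in this appendix:

```latex
\frac{d}{dx}\log f_{UTL}(x) = \frac{1}{1+x} - \theta = 0
\quad\Longrightarrow\quad
x^{*} = \frac{1-\theta}{\theta},
\qquad
0 \le x^{*} \le 1 \iff \tfrac{1}{2} \le \theta \le 1 .
```

Because $1/(1+x) - \theta$ is decreasing in $x$, the stationary point $x^{*}$ is a maximum when it lies in $[0,1]$. For $\theta > 1$ the derivative is negative on $(0,1)$, so the density is decreasing and the mode is 0; for $\theta < 0.5$ it is positive on $(0,1)$, so the density is increasing and the mode is 1.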
(3) Quantile Function
The quantile function (qf) of the UTL distribution is
$$Q_{UTL}(u) = -1 - \frac{1}{\theta} - \frac{1}{\theta}\, W\!\left(\left(u\,C_\theta - \theta - 1\right) e^{-\theta - 1}\right), \quad u \in (0, 1),$$
where $W(x)$ is the Lambert function satisfying $W(x)\,e^{W(x)} = x$ for $x \in [-1/e, \infty)$ (see Corless et al. [31] for the definition and properties of the Lambert function).
Therefore, the median of the UTL distribution is simply $M = Q_{UTL}(0.5)$, that is,
$$M = -1 - \frac{1}{\theta} - \frac{1}{\theta}\, W\!\left(\left(0.5\,C_\theta - \theta - 1\right) e^{-\theta - 1}\right).$$
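As a numerical companion to the Lambert form, the sketch below inverts the cdf by bisection, which is a swapped-in technique rather than the closed form itself, and then checks that the result satisfies the defining Lambert relation $w e^{w} = z$. It assumes the UTL pdf $\theta^2(1+x)e^{-\theta x}/C_\theta$ with $C_\theta = 1 + \theta - (1+2\theta)e^{-\theta}$ (inferred from the moment expressions in this appendix) and $\theta > 0$; the function names are illustrative.

```python
import math

def c_theta(theta):
    # ASSUMPTION: C_theta = 1 + theta - (1 + 2*theta) * exp(-theta).
    return 1.0 + theta - (1.0 + 2.0 * theta) * math.exp(-theta)

def cdf(x, theta):
    """cdf implied by the assumed pdf theta^2 (1 + x) exp(-theta x) / C_theta."""
    return (1.0 + theta - (1.0 + theta + theta * x) * math.exp(-theta * x)) / c_theta(theta)

def quantile(u, theta, tol=1e-12):
    """Invert the cdf on (0, 1) by bisection (numerical stand-in for the Lambert form)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid, theta) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta, u = 2.0, 0.5
x = quantile(u, theta)

# Lambert check: Q(u) = -1 - 1/theta - W(z)/theta means w = -(1 + theta + theta*x)
# should satisfy w * exp(w) = z with z = (u*C_theta - theta - 1) * exp(-theta - 1).
w = -(1.0 + theta + theta * x)
z = (u * c_theta(theta) - theta - 1.0) * math.exp(-theta - 1.0)
print(x, w * math.exp(w), z)
```

In this setting with $\theta > 0$, the value $w = -(1 + \theta + \theta x)$ is below $-1$, so an implementation based on a Lambert W routine would need the lower real branch rather than the principal one.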
(4) Mean Deviations
The mean deviation of the UTL distribution about the mean $\mu = E(X)$ is given by
$$\delta_1 = \int_0^1 |x - \mu|\, f_{UTL}(x)\,dx = \int_0^\mu (\mu - x) f_{UTL}(x)\,dx + \int_\mu^1 (x - \mu) f_{UTL}(x)\,dx = 2\mu F_{UTL}(\mu) - 2\int_0^\mu x f_{UTL}(x)\,dx = 2\left[\mu F_{UTL}(\mu) - I_X(\mu; 1)\right],$$
and the mean deviation about the median $M$ is
$$\delta_2 = \int_0^1 |x - M|\, f_{UTL}(x)\,dx = \mu - 2 I_X(M; 1),$$
where $I_X(t; k)$ is the $k$th incomplete moment.
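Both identities can be checked against direct numerical integration. The sketch below is illustrative only: it assumes the UTL pdf $\theta^2(1+x)e^{-\theta x}/C_\theta$ with $C_\theta = 1 + \theta - (1+2\theta)e^{-\theta}$ (inferred from the moment expressions in this appendix), computes the median by bisection instead of the Lambert form, and uses helper names that are not from the paper.

```python
import math

def c_theta(theta):
    # ASSUMPTION: C_theta = 1 + theta - (1 + 2*theta) * exp(-theta).
    return 1.0 + theta - (1.0 + 2.0 * theta) * math.exp(-theta)

def pdf(x, theta):
    return theta**2 * (1.0 + x) * math.exp(-theta * x) / c_theta(theta)

def cdf(x, theta):
    return (1.0 + theta - (1.0 + theta + theta * x) * math.exp(-theta * x)) / c_theta(theta)

def incomplete_mean(t, theta):
    """First incomplete moment I_X(t; 1), using d_{t,0} and d_{t,1}."""
    e = math.exp(-theta * t)
    d_t0 = (1.0 - e) / theta
    d_t1 = (d_t0 - t * e) / theta
    return theta**2 / c_theta(theta) * (-(t**2) * e / theta + (1.0 + 2.0 / theta) * d_t1)

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + (2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2.0 * sum(f(a + 2 * i * h) for i in range(1, n // 2))
    return s * h / 3.0

def median(theta, tol=1e-13):
    """Solve F(M) = 0.5 by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cdf(mid, theta) < 0.5 else (lo, mid)
    return 0.5 * (lo + hi)

def abs_dev_numeric(center, theta):
    """Integrate |x - center| f(x) over (0, 1), split at the kink."""
    left = simpson(lambda x: (center - x) * pdf(x, theta), 0.0, center)
    right = simpson(lambda x: (x - center) * pdf(x, theta), center, 1.0)
    return left + right

theta = 2.0
mu = incomplete_mean(1.0, theta)          # the mean is I_X(1; 1)
m = median(theta)
delta1 = 2.0 * (mu * cdf(mu, theta) - incomplete_mean(mu, theta))
delta2 = mu - 2.0 * incomplete_mean(m, theta)
print(delta1, abs_dev_numeric(mu, theta))
print(delta2, abs_dev_numeric(m, theta))
```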
(5) Moment Generating Function
The moment generating function (mgf) of the UTL distribution can be expressed as
$$M(t) = \int_0^1 e^{t x} f_{UTL}(x)\,dx = \frac{\theta^2}{C_\theta} \cdot \frac{2t\,e^{-(\theta - t)} - 2\theta\,e^{-(\theta - t)} - e^{-(\theta - t)} - t + \theta + 1}{t^2 - 2\theta t + \theta^2}.$$
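A quick sanity check of the mgf expression is that $M(0) = 1$ and that it agrees with numerical integration for $t \neq \theta$. The sketch below assumes the UTL pdf $\theta^2(1+x)e^{-\theta x}/C_\theta$ with $C_\theta = 1 + \theta - (1+2\theta)e^{-\theta}$, a form inferred from the moment expressions in this appendix; the function names are illustrative.

```python
import math

def c_theta(theta):
    # ASSUMPTION: C_theta = 1 + theta - (1 + 2*theta) * exp(-theta).
    return 1.0 + theta - (1.0 + 2.0 * theta) * math.exp(-theta)

def mgf(t, theta):
    """Closed-form mgf; valid for t != theta (the denominator is (t - theta)^2)."""
    e = math.exp(-(theta - t))
    num = 2.0 * t * e - 2.0 * theta * e - e - t + theta + 1.0
    den = t * t - 2.0 * theta * t + theta * theta
    return theta**2 / c_theta(theta) * num / den

def mgf_numeric(t, theta, n=2000):
    """Composite Simpson approximation of integral_0^1 exp(t x) f(x) dx (n even)."""
    h = 1.0 / n
    f = lambda x: math.exp(t * x) * theta**2 * (1.0 + x) * math.exp(-theta * x) / c_theta(theta)
    s = f(0.0) + f(1.0)
    s += 4.0 * sum(f((2 * i - 1) * h) for i in range(1, n // 2 + 1))
    s += 2.0 * sum(f(2 * i * h) for i in range(1, n // 2))
    return s * h / 3.0

theta = 2.0
for t in (0.0, 0.7, -1.3):
    print(t, mgf(t, theta), mgf_numeric(t, theta))
```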

Appendix B

  • The cdf of the Kumaraswamy-G model is given by
$$F(x) = 1 - \left[1 - G^{a}(x)\right]^{b}, \quad a, b > 0.$$
  • The cdf of the Weibull-G model is given by
$$F(x) = 1 - \exp\left\{-\left[\frac{-\log\left(1 - G(x)\right)}{b}\right]^{a}\right\}, \quad a, b > 0.$$
  • The cdf of the Geometric-Poisson-G model is given by
$$F(x) = \frac{\exp\left[-a + a\,G(x)\right] - \exp(-a)}{1 - \exp(-a) - b + b \exp\left[-a + a\,G(x)\right]}, \quad a > 0,\ 0 < b < 1.$$
  • The cdf of the Poisson-G model is given by
$$F(x) = \frac{1 - \exp\left[-a\,G^{b}(x)\right]}{1 - \exp(-a)}, \quad a, b > 0.$$
  • The cdf of the Beta-G model is given by
$$F(x) = I_{G(x)}(a, b),$$
where $I_x(a, b) = \int_0^x t^{a-1}(1-t)^{b-1}\,dt \,/\, B(a, b)$ is the regularized incomplete beta function, and $B(a, b) = \int_0^1 t^{a-1}(1-t)^{b-1}\,dt$ is the beta function.
  • The cdf of the Marshall–Olkin-G model is given by
$$F(x) = \frac{G(x)}{a + (1 - a)\,G(x)}, \quad a > 0.$$
  • The cdf of the exponentiated generalized-G model is given by
$$F(x) = \left\{1 - \left[1 - G(x)\right]^{a}\right\}^{b}, \quad a, b > 0.$$
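Each of these generators is a map from $G(x) \in [0,1]$ to $F(x) \in [0,1]$, so it can be coded once and composed with any parent cdf. The sketch below illustrates this for five of the families with a Weibull parent. The Weibull parametrization (shape `k`, scale `lam`), the parameter values, and the function names are assumptions of this example; the Weibull-G and Beta-G families are left out to keep it short.

```python
import math

def weibull_cdf(x, k, lam):
    # Parent cdf G(x); the shape/scale parametrization is an assumption of this sketch.
    return 1.0 - math.exp(-((x / lam) ** k)) if x > 0 else 0.0

def kumaraswamy_g(g, a, b):
    return 1.0 - (1.0 - g**a) ** b

def geometric_poisson_g(g, a, b):
    num = math.exp(-a + a * g) - math.exp(-a)
    den = 1.0 - math.exp(-a) - b + b * math.exp(-a + a * g)
    return num / den

def poisson_g(g, a, b):
    return (1.0 - math.exp(-a * g**b)) / (1.0 - math.exp(-a))

def marshall_olkin_g(g, a):
    return g / (a + (1.0 - a) * g)

def exp_generalized_g(g, a, b):
    return (1.0 - (1.0 - g) ** a) ** b

# Generators with illustrative parameter values, each a function of g = G(x) only.
FAMILIES = {
    "Kumaraswamy-G": lambda g: kumaraswamy_g(g, 1.5, 0.8),
    "Geometric-Poisson-G": lambda g: geometric_poisson_g(g, 1.5, 0.8),
    "Poisson-G": lambda g: poisson_g(g, 1.5, 0.8),
    "Marshall-Olkin-G": lambda g: marshall_olkin_g(g, 1.5),
    "Exponentiated generalized-G": lambda g: exp_generalized_g(g, 1.5, 0.8),
}

x = 1.2
g = weibull_cdf(x, k=2.0, lam=1.0)
for name, family in FAMILIES.items():
    print(name, family(g))
```

Keeping the generator separate from the parent cdf mirrors the way the competing models in Tables 8 and 10 all share the Weibull parent and differ only in the generator.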

References

  1. Eugene, N.; Lee, C.; Famoye, F. Beta-Normal Distribution and Its Applications. Commun. Stat.-Theory Methods 2002, 31, 497–512.
  2. Alexander, C.; Cordeiro, G.M.; Ortega, E.M.M.; Sarabia, J.M. Generalized Beta-Generated Distributions. Comput. Stat. Data Anal. 2012, 56, 1880–1897.
  3. Nadarajah, S.; Teimouri, M.; Shih, S.H. Modified Beta Distributions. Sankhya B 2014, 76, 19–48.
  4. Cordeiro, G.M.; de Castro, M. A New Family of Generalized Distributions. J. Stat. Comput. Simul. 2011, 81, 883–898.
  5. Ristic, M.M.; Nadarajah, S. A New Lifetime Distribution. J. Stat. Comput. Simul. 2013, 84, 135–150.
  6. Nadarajah, S.; Nassiri, V.; Mohammadpour, A. Truncated-Exponential Skew-Symmetric Distributions. Stat. A J. Theor. Appl. Stat. 2014, 48, 872–895.
  7. Abid, A.H.; Abdulrazak, R.K. [0,1] Truncated Frèchet-G Generator of Distributions. Appl. Math. 2017, 7, 51–66.
  8. Bantan, R.A.; Jamal, F.; Chesneau, C.; Elgarhy, M. Truncated Inverted Kumaraswamy Generated Family of Distributions with Applications. Entropy 2019, 21, 1089.
  9. Aldahlan, M.A. Type II Truncated Fréchet Generated Family of Distributions. Int. J. Math. Its Appl. 2019, 7, 221–228. Available online: https://ijmaa.in/index.php/ijmaa/article/view/285 (accessed on 15 September 2022).
  10. Almarashi, A.M.; Elgarhy, M.; Jamal, F.; Chesneau, C. The Exponentiated Truncated Inverse Weibull Generated Family of Distributions with Applications. Symmetry 2020, 12, 650.
  11. Jamal, F.; Bakouch, H.; Nasir, M.A. A Truncated General-G class of Distributions with Application to Truncated Burr G family. REVSTAT-Stat. J. 2021, 19, 513–530.
  12. Almarashi, A.M.; Jamal, F.; Chesneau, C.; Elgarhy, M. A New Truncated Muth Generated Family of Distributions with Applications. Complexity 2021, 21, 1–4.
  13. ZeinEldin, R.A.; Chesneau, C.; Jamal, F.; Elgarhy, M.; Almarashi, A.M.; Al-Marzouki, S. Generalized Truncated Fréchet Generated Family Distributions and their Applications. Comput. Model. Eng. Sci. 2021, 126, 791–819.
  14. Algarni, A.; Almarashi, A.M.; Jamal, F.; Chesneau, C.; Elgarhy, M. Truncated Inverse Lomax Generated Family of Distributions with Applications to Biomedical Data. J. Med. Imaging Health Inform. 2021, 11, 2425–2439.
  15. Bantan, R.A.; Chesneau, C.; Jamal, F.; Elbatal, I.; Elgarhy, M. The Truncated Burr X-G Family of Distributions: Properties and Applications to Actuarial and Financial Data. Entropy 2021, 23, 1088.
  16. Lindley, D.V. Fiducial Distributions and Bayes’ Theorem. J. R. Stat. Soc. 1958, 20, 102–107. Available online: https://www.jstor.org/stable/2983909 (accessed on 12 August 2020).
  17. AL-Hussaini, E.K.; Ahsanullah, M. Exponentiated Distributions “Part of the book series: Atlantis Studies in Probability and Statistics”; ATLANTISSPS; Atlantis Press: Paris, France, 2015.
  18. Alzaatreh, A.; Lee, C.; Famoye, F. A New Method for Generating Families of Continuous Distributions. Metron 2013, 71, 63–79.
  19. Nadarajah, S.; Cancho, V.G.; Ortega, E.M.M. The Geometric Exponential Poisson Distribution. Stat. Methods Appl. 2013, 22, 355–380.
  20. Marshall, A.W.; Olkin, I. A New Method for Adding a Parameter to a Family of Distributions with Application to the Exponential and Weibull Families. Biometrika 1997, 84, 641–652. Available online: https://www.jstor.org/stable/2337585 (accessed on 3 April 2010).
  21. Cordeiro, G.M.; Ortega, E.M.M.; da Cunha, D.C.C. The Exponentiated Generalized Class of Distributions. J. Data Sci. 2013, 11, 1–27.
  22. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2022. Available online: https://www.r-project.org/ (accessed on 24 June 2022).
  23. Barakat, H.; Nigm, E.; Aldallal, R. Exact Prediction Intervals for Future Current Records and Record Range from any Continuous Distribution. SORT-Stat. Oper. Res. Trans. 2014, 38, 251–270. Available online: https://raco.cat/index.php/SORT/article/view/284044 (accessed on 5 March 2017).
  24. Cheng, R.C.; Traylor, L. Characterization of Material Strength Properties Using Probabilistic Mixture Models. WIT Trans. Model. Simul. 1970, 31, 553–560.
  25. Albitar, O.; Ballouze, R.; Ooi, J.P.; Ghadzi, S.M.S. Risk factors for mortality among COVID-19 patients. Diabetes Res. Clin. Pract. 2020, 166, 1–5.
  26. Liu, Y.; Du, X.; Chen, J.; Jin, Y.; Peng, L.; Wang, H.H.; Zhao, Y. Neutrophil-to-lymphocyte ratio as an independent risk factor for mortality in hospitalized patients with COVID-19. J. Infect. 2020, 81, 6–12.
  27. Giacomelli, A.; Ridolfo, A.L.; Milazzo, L.; Oreni, L.; Bernacchia, D.; Siano, M.; Galli, M. 30-day mortality in patients hospitalized with COVID-19 during the first wave of the Italian epidemic: A prospective cohort study. Pharmacol. Res. 2020, 158, 104931.
  28. Atlam, M.; Torkey, H.; El-Fishawy, N.; Salem, H. Coronavirus disease 2019 (COVID-19): Survival analysis using deep learning and Cox regression model. Pattern Anal. Appl. 2021, 24, 993–1005.
  29. Rodrigues, G.M.; Ortega, E.M.; Cordeiro, G.M.; Vila, R. An extended Weibull regression for censored data: Application for COVID-19 in Campinas, Brazil. Mathematics 2022, 10, 3644.
  30. Piedad, E.; Caladcad, J.A. Post-harvested Musa acuminata Banana Tiers Dataset. Data Brief 2023, 46, 108856.
  31. Corless, R.M.; Gonnet, G.H.; Hare, D.E.G.; Jeffrey, D.J.; Knuth, D.E. On the Lambert W function. Adv. Comput. Math. 1996, 5, 329–359.
Figure 1. The pdf of the TLW model.
Figure 2. Regions of the Cartesian plane $\theta\tau^{*}$ where different forms of the TLW pdf occur.
Figure 3. Histogram and fitted pdf, empirical and fitted cdfs, and P–P and Q–Q plots of the TLW model fitted to dataset I.
Figure 4. Histogram and fitted pdf, empirical and fitted cdfs, and P–P and Q–Q plots of the TLW model fitted to dataset II.
Figure 5. (a) Kaplan–Meier survival curve for the sex variable (1 = male, 0 = female); (b) histogram of the age variable.
Figure 6. Index plot and normal probability plot with envelope of the deviance residuals from the regression models fitted to the COVID-19 data. (a,b): $M_3$-TLW; (c,d): $M_3$-Weibull.
Figure 7. Kaplan–Meier survival curve and estimated survival functions from the $M_3$-TLW model by sex.
Figure 8. Boxplot of colors by class for the Post-harvested dataset.
Figure 9. Index plot, normal probability plot with envelope, and worm plot of the quantile residuals from the regression models fitted to the Post-harvested dataset: (a–c): $M_3$-TLW; (d–f): $M_3$-Weibull.
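Table 1. Previous work on TG models.

| Model | Author(s) | cdf |
|---|---|---|
| Poisson-G | Ristic and Nadarajah (2013) [5] | $\dfrac{1 - e^{-a G^{b}(x)}}{1 - e^{-a}}$ |
| Truncated-exponential skew-symmetric-G | Nadarajah et al. (2014a) [6] | $\dfrac{1 - e^{-a G(x)}}{1 - e^{-a}}$ |
| Truncated-Fréchet-G | Abid and Abdulrazak (2017) [7] | $e^{a\left[1 - G(x)^{-b}\right]}$ |
| Truncated inverted Kumaraswamy-G | Bantan et al. (2019) [8] | $\dfrac{\left[1 - \left(1 + G(x)\right)^{-a}\right]^{b}}{\left(1 - 2^{-a}\right)^{b}}$ |
| Type II truncated Fréchet-G (truncated inverse Weibull-G) | Aldahlan et al. (2019) [9] | $1 - e^{1 - \left[1 - G(x)\right]^{-a}}$ |
| Exponentiated truncated inverse Weibull-G | Almarashi et al. (2020) [10] | $\left[1 - e^{1 - \left[1 - G(x)\right]^{-a}}\right]^{b}$ |
| Truncated Burr-G | Jamal et al. (2020) [11] | $\dfrac{1 - \left[1 + G^{c}(x)\right]^{-k}}{1 - 2^{-k}}$ |
| Truncated Muth-G | Almarashi et al. (2021) [12] | $\dfrac{1 - e^{\alpha G(x) - \left(e^{\alpha G(x)} - 1\right)/\alpha}}{1 - e^{\alpha - \left(e^{\alpha} - 1\right)/\alpha}}$ |
| Truncated generalized Fréchet-G | ZeinEldin et al. (2021) [13] | $\dfrac{1 - \left[1 - e^{-\alpha / G(x)}\right]^{b}}{1 - \left(1 - e^{-\alpha}\right)^{b}}$ |
| Truncated inverse Lomax-G | Algarni et al. (2021) [14] | $1 - 2^{\alpha}\left[1 + \left(1 - G(x)\right)^{-1}\right]^{-\alpha}$ |
| Truncated Burr X-G | Bantan et al. (2021) [15] | $\dfrac{\left[1 - e^{-\alpha^{2} G^{2}(x)}\right]^{\theta}}{\left(1 - e^{-\alpha^{2}}\right)^{\theta}}$ |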
Table 2. Number of roots of equation $A(z_0) = B_{\theta,k}(z_0)$ in (16) when varying the parameters $\theta$ and $\tau^{*}$.

| | $\tau^{*} \le -1/e$ | $-1/e < \tau^{*} < 0$ | $\tau^{*} \ge 0$ |
|---|---|---|---|
| $\theta > 0$ | single root | single root | no root |
| $\theta < 0$ | no root | two roots | single root |
Table 3. Shapes of TLW pdf when varying the parameters $\theta$ and $\tau^{*}$.

| | $\tau^{*} \le -1/e$ | $-1/e < \tau^{*} < 0$ | $\tau^{*} \ge 0$ |
|---|---|---|---|
| $\theta > 0$ | Unimodality | Unimodality | Decreasing |
| $\theta < 0$ | Decreasing | Decreasing–increasing–decreasing | Unimodality |
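Table 4. Average estimates from simulations of the TLW distribution.

| $\theta$ | $k$ | $\lambda$ | $n$ | ME of $\hat{\theta}$ | ME of $\hat{k}$ | ME of $\hat{\lambda}$ |
|---|---|---|---|---|---|---|
| 0.5 | 0.5 | 0.5 | 20 | 0.3927 | 0.5982 | 0.5915 |
| | | | 50 | 0.3959 | 0.5104 | 0.5143 |
| | | | 100 | 0.5818 | 0.5086 | 0.4582 |
| | | | 150 | 0.5782 | 0.5078 | 0.4737 |
| | | | 300 | 0.5052 | 0.5003 | 0.5012 |
| 0.5 | 2 | 2 | 20 | 0.3821 | 2.1788 | 2.4017 |
| | | | 50 | 0.3828 | 2.1724 | 2.3466 |
| | | | 100 | 0.5195 | 2.1554 | 2.1472 |
| | | | 150 | 0.4945 | 2.1159 | 2.0617 |
| | | | 300 | 0.4984 | 2.0195 | 2.0324 |
| 2 | 2 | 0.5 | 20 | 2.8780 | 2.8245 | 0.6463 |
| | | | 50 | 2.6245 | 2.2245 | 0.4403 |
| | | | 100 | 2.1419 | 2.0419 | 0.5388 |
| | | | 150 | 2.0545 | 1.9545 | 0.5036 |
| | | | 300 | 2.0044 | 1.9994 | 0.5004 |
| 3 | 0.5 | 3 | 20 | 2.4702 | 0.6231 | 2.2937 |
| | | | 50 | 2.6202 | 0.3798 | 3.2610 |
| | | | 100 | 2.8369 | 0.4631 | 3.2424 |
| | | | 150 | 2.8535 | 0.5146 | 3.1024 |
| | | | 300 | 2.9823 | 0.5018 | 3.0635 |
| 2 | 5 | 2 | 20 | 2.7795 | 5.9724 | 1.6501 |
| | | | 50 | 2.3405 | 5.4046 | 2.2949 |
| | | | 100 | 1.8551 | 5.1855 | 1.7808 |
| | | | 150 | 2.0733 | 5.0733 | 2.0985 |
| | | | 300 | 1.9930 | 4.9790 | 2.0104 |
| 5 | 3 | 3 | 20 | 6.1274 | 3.2987 | 2.6354 |
| | | | 50 | 5.2781 | 3.1288 | 2.6674 |
| | | | 100 | 4.8956 | 3.1146 | 2.8674 |
| | | | 150 | 4.9895 | 3.0985 | 2.9631 |
| | | | 300 | 5.0013 | 3.0043 | 2.9985 |
| 5 | 4 | 2 | 20 | 4.4533 | 4.5847 | 2.8655 |
| | | | 50 | 5.2474 | 3.8812 | 2.4652 |
| | | | 100 | 4.9521 | 3.8932 | 2.4245 |
| | | | 150 | 5.1124 | 3.9958 | 2.1135 |
| | | | 300 | 4.9821 | 4.0024 | 2.0075 |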
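Table 5. MABs and MSEs from simulations of the TLW distribution.

| $\theta$ | $k$ | $\lambda$ | $n$ | MAB of $\hat{\theta}$ | MAB of $\hat{k}$ | MAB of $\hat{\lambda}$ | MSE of $\hat{\theta}$ | MSE of $\hat{k}$ | MSE of $\hat{\lambda}$ |
|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 0.5 | 0.5 | 20 | 0.1073 | 0.0982 | 0.0915 | 0.4927 | 0.3251 | 0.2520 |
| | | | 50 | 0.1041 | 0.0104 | 0.0143 | 0.1538 | 0.1607 | 0.2497 |
| | | | 100 | 0.0818 | 0.0086 | 0.0418 | 0.1353 | 0.0656 | 0.1960 |
| | | | 150 | 0.0782 | 0.0078 | 0.0263 | 0.0230 | 0.0421 | 0.0540 |
| | | | 300 | 0.0052 | 0.0003 | 0.0012 | 0.0110 | 0.0215 | 0.0301 |
| 0.5 | 2 | 2 | 20 | 0.1179 | 0.1788 | 0.4018 | 0.3210 | 0.4573 | 0.4200 |
| | | | 50 | 0.1172 | 0.1724 | 0.3466 | 0.1420 | 0.2923 | 0.2584 |
| | | | 100 | 0.0195 | 0.1554 | 0.1472 | 0.0732 | 0.0832 | 0.1453 |
| | | | 150 | 0.0055 | 0.1159 | 0.0617 | 0.0612 | 0.0549 | 0.1087 |
| | | | 300 | 0.0016 | 0.0195 | 0.0324 | 0.0139 | 0.0490 | 0.0359 |
| 2 | 2 | 0.5 | 20 | 0.8780 | 0.8245 | 0.1463 | 0.7810 | 0.5427 | 0.7147 |
| | | | 50 | 0.6245 | 0.2245 | 0.0597 | 0.6531 | 0.4417 | 0.6984 |
| | | | 100 | 0.1419 | 0.0490 | 0.0388 | 0.1456 | 0.2542 | 0.1825 |
| | | | 150 | 0.0545 | 0.0455 | 0.0036 | 0.0574 | 0.0088 | 0.0821 |
| | | | 300 | 0.0044 | 0.0006 | 0.0004 | 0.0035 | 0.0015 | 0.0674 |
| 3 | 0.5 | 3 | 20 | 0.5298 | 0.1231 | 0.7063 | 1.0745 | 0.8945 | 0.7984 |
| | | | 50 | 0.3798 | 0.1202 | 0.2610 | 0.6870 | 0.3017 | 0.5203 |
| | | | 100 | 0.1631 | 0.0369 | 0.2424 | 0.2153 | 0.1465 | 0.2257 |
| | | | 150 | 0.1465 | 0.0146 | 0.1024 | 0.1040 | 0.0896 | 0.0357 |
| | | | 300 | 0.0177 | 0.0018 | 0.0635 | 0.0862 | 0.0651 | 0.0089 |
| 2 | 5 | 2 | 20 | 0.7795 | 0.9724 | 0.3499 | 0.8691 | 1.2143 | 1.1401 |
| | | | 50 | 0.3405 | 0.4046 | 0.2949 | 0.4041 | 0.9674 | 0.5189 |
| | | | 100 | 0.1449 | 0.1855 | 0.2192 | 0.3540 | 0.6307 | 0.5021 |
| | | | 150 | 0.0733 | 0.0733 | 0.0985 | 0.0957 | 0.0390 | 0.1008 |
| | | | 300 | 0.0070 | 0.0210 | 0.0104 | 0.0068 | 0.0107 | 0.0096 |
| 5 | 3 | 3 | 20 | 1.1274 | 0.2987 | 0.3646 | 1.8752 | 2.0145 | 1.4571 |
| | | | 50 | 0.2781 | 0.1288 | 0.3326 | 1.0587 | 1.5124 | 0.6501 |
| | | | 100 | 0.1044 | 0.1146 | 0.1326 | 0.6321 | 0.8210 | 0.0893 |
| | | | 150 | 0.0105 | 0.0985 | 0.0369 | 0.2480 | 0.6347 | 0.0101 |
| | | | 300 | 0.0013 | 0.0043 | 0.0015 | 0.0472 | 0.0086 | 0.0054 |
| 5 | 4 | 2 | 20 | 0.5467 | 0.5847 | 0.8655 | 2.1768 | 1.7456 | 1.9087 |
| | | | 50 | 0.2474 | 0.1188 | 0.4652 | 0.8740 | 1.0157 | 0.9889 |
| | | | 100 | 0.0479 | 0.1068 | 0.4245 | 0.6531 | 0.8751 | 0.2350 |
| | | | 150 | 0.1124 | 0.0042 | 0.1135 | 0.0478 | 0.1450 | 0.0842 |
| | | | 300 | 0.0179 | 0.0024 | 0.0075 | 0.0023 | 0.0541 | 0.0357 |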
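Tables 4 and 5 can be read together: for the rows checked below, the MAB in Table 5 equals the absolute difference between the average estimate (ME) in Table 4 and the true parameter value. This reading is an inference from the reported numbers, not a definition quoted from the paper. The sketch verifies it for the first parameter setting ($\theta = k = \lambda = 0.5$); the helper name `abs_bias` is illustrative.

```python
# Each triple is (true value, ME from Table 4, MAB from Table 5),
# listed for theta, k, and lambda under the setting theta = k = lambda = 0.5.
ROWS = {
    20:  [(0.5, 0.3927, 0.1073), (0.5, 0.5982, 0.0982), (0.5, 0.5915, 0.0915)],
    100: [(0.5, 0.5818, 0.0818), (0.5, 0.5086, 0.0086), (0.5, 0.4582, 0.0418)],
    300: [(0.5, 0.5052, 0.0052), (0.5, 0.5003, 0.0003), (0.5, 0.5012, 0.0012)],
}

def abs_bias(mean_estimate, true_value):
    """Absolute deviation of the average estimate from the true value."""
    return abs(mean_estimate - true_value)

for n, triples in ROWS.items():
    for true_value, me, mab in triples:
        # Agreement up to the four-decimal rounding used in the tables.
        print(n, round(abs_bias(me, true_value), 4), mab)
```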
Table 6. Simulation results of TLW regression models for different censoring percentages (%) with true values: $\beta_{10} = 0.3$, $\beta_{11} = 0.4$, $\beta_{20} = 0.2$, $\beta_{21} = 0.5$, and $\theta = 0.6$.

| % | Parameter | AEs ($n = 100$) | MABs ($n = 100$) | MSEs ($n = 100$) | AEs ($n = 300$) | MABs ($n = 300$) | MSEs ($n = 300$) | AEs ($n = 500$) | MABs ($n = 500$) | MSEs ($n = 500$) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0% | $\beta_{10}$ | 0.2949 | −0.0051 | 0.0239 | 0.3018 | 0.0018 | 0.0091 | 0.3056 | 0.0056 | 0.0054 |
| | $\beta_{11}$ | 0.4077 | 0.0077 | 0.0232 | 0.3994 | −0.0006 | 0.0071 | 0.3960 | −0.0040 | 0.0044 |
| | $\beta_{20}$ | 0.2234 | 0.0234 | 0.0137 | 0.2073 | 0.0073 | 0.0046 | 0.2087 | 0.0087 | 0.0026 |
| | $\beta_{21}$ | 0.5014 | 0.0014 | 0.0260 | 0.5019 | 0.0019 | 0.0086 | 0.4971 | −0.0029 | 0.0049 |
| | $\theta$ | 0.6323 | 0.0323 | 0.2064 | 0.6241 | 0.0241 | 0.0984 | 0.6286 | 0.0286 | 0.0569 |
| 10% | $\beta_{10}$ | 0.2933 | −0.0067 | 0.0219 | 0.3012 | 0.0012 | 0.0090 | 0.3018 | 0.0018 | 0.0050 |
| | $\beta_{11}$ | 0.4062 | 0.0062 | 0.0228 | 0.3990 | −0.0010 | 0.0075 | 0.3991 | −0.0009 | 0.0041 |
| | $\beta_{20}$ | 0.2192 | 0.0192 | 0.0138 | 0.2100 | 0.0100 | 0.0051 | 0.2054 | 0.0054 | 0.0030 |
| | $\beta_{21}$ | 0.5064 | 0.0064 | 0.0283 | 0.4983 | −0.0017 | 0.0089 | 0.5028 | 0.0028 | 0.0054 |
| | $\theta$ | 0.6188 | 0.0188 | 0.1765 | 0.6225 | 0.0225 | 0.0908 | 0.6144 | 0.0144 | 0.0491 |
| 30% | $\beta_{10}$ | 0.2902 | −0.0098 | 0.0253 | 0.2997 | −0.0003 | 0.0101 | 0.3033 | 0.0033 | 0.0057 |
| | $\beta_{11}$ | 0.4114 | 0.0114 | 0.0266 | 0.3987 | −0.0013 | 0.0088 | 0.3969 | −0.0031 | 0.0055 |
| | $\beta_{20}$ | 0.2313 | 0.0313 | 0.0208 | 0.2093 | 0.0093 | 0.0060 | 0.2072 | 0.0072 | 0.0033 |
| | $\beta_{21}$ | 0.5005 | 0.0005 | 0.0404 | 0.5013 | 0.0013 | 0.0111 | 0.4980 | −0.0020 | 0.0065 |
| | $\theta$ | 0.6306 | 0.0306 | 0.1611 | 0.6125 | 0.0125 | 0.0960 | 0.6138 | 0.0138 | 0.0518 |
Table 7. Estimates of TLW parameters for dataset I.

| Parameter | MLE | Std. Err | Inf. 95% CI | Sup. 95% CI |
|---|---|---|---|---|
| $\theta$ | −28.44948 | 28.085548 | −32.53708 | −25.44727 |
| $k$ | 3.454494 | 0.9554519 | 1.581842 | 5.327145 |
| $\lambda$ | 12.75564 | 2.3338198 | 8.181439 | 17.32985 |
Table 8. Competence of the models for dataset I.

| Distribution | No. of Estimated Parameters | AIC | CAIC | BIC | HQIC | $-\hat{\ell}$ |
|---|---|---|---|---|---|---|
| TLW | 3 | 507.124 | 507.314 | 515.726 | 510.619 | 250.562 |
| W | 2 | 524.667 | 524.762 | 530.402 | 526.998 | 260.334 |
| KW | 4 | 510.612 | 510.932 | 522.082 | 515.273 | 251.306 |
| WW | 4 | 528.667 | 528.987 | 540.137 | 533.328 | 260.334 |
| GPW | 4 | 512.800 | 513.120 | 524.270 | 517.460 | 252.400 |
| PW | 4 | 513.232 | 513.552 | 524.702 | 517.893 | 252.616 |
| BW | 4 | 511.523 | 511.843 | 522.993 | 516.184 | 251.762 |
| MOW | 3 | 513.522 | 513.713 | 522.125 | 517.018 | 253.761 |
| EGW | 4 | 512.706 | 513.026 | 524.176 | 517.367 | 252.353 |
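The criteria in Table 8 are tied together by the negative log-likelihood in the last column. The sketch below recomputes them from that column using the standard definitions of AIC, corrected AIC, BIC, and HQIC. Two points are assumptions of this example rather than statements from the paper: that the last column is the negative maximized log-likelihood, and that the sample size is $n = 130$, a value recovered here from the reported BIC because it is not stated in this section.

```python
import math

def criteria(nll, p, n):
    """Return (AIC, CAIC, BIC, HQIC) from the negative log-likelihood nll,
    the number of parameters p, and the sample size n (standard definitions)."""
    aic = 2.0 * nll + 2.0 * p
    caic = aic + 2.0 * p * (p + 1) / (n - p - 1)
    bic = 2.0 * nll + p * math.log(n)
    hqic = 2.0 * nll + 2.0 * p * math.log(math.log(n))
    return aic, caic, bic, hqic

# Two rows of Table 8: name -> (p, AIC, CAIC, BIC, HQIC, negative log-likelihood).
TABLE8 = {
    "TLW": (3, 507.124, 507.314, 515.726, 510.619, 250.562),
    "W":   (2, 524.667, 524.762, 530.402, 526.998, 260.334),
}

# ASSUMPTION: n is not given in this section, so it is inferred from the TLW row via
# BIC = 2*nll + p*log(n), i.e. n = exp((BIC - 2*nll) / p), rounded to an integer.
p_tlw, _, _, bic_tlw, _, nll_tlw = TABLE8["TLW"]
n = round(math.exp((bic_tlw - 2.0 * nll_tlw) / p_tlw))

for name, (p, *reported, nll) in TABLE8.items():
    computed = criteria(nll, p, n)
    print(name, [round(c, 3) for c in computed], reported)
```

Agreement to within rounding for a second row (here the Weibull model) gives some support to the inferred sample size.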
Table 9. Estimates of TLW parameters for dataset II.

| Parameter | MLE | Std. Err | Inf. 95% CI | Sup. 95% CI |
|---|---|---|---|---|
| $\theta$ | −49.54747 | 3.8976505 | −57.18672 | −43.9367 |
| $k$ | 1.374038 | 0.4973775 | 0.399197 | 2.348880 |
| $\lambda$ | 1.029519 | 0.6568050 | −0.257795 | 2.316833 |
Table 10. Competence of the models for dataset II.

| Distribution | No. of Estimated Parameters | AIC | CAIC | BIC | HQIC | $-\hat{\ell}$ |
|---|---|---|---|---|---|---|
| TLW | 3 | 118.197 | 118.597 | 124.673 | 120.748 | 56.098 |
| W | 2 | 129.933 | 130.130 | 134.251 | 131.634 | 62.967 |
| KW | 4 | 121.642 | 122.320 | 130.278 | 125.044 | 56.821 |
| WW | 4 | 133.933 | 134.611 | 142.569 | 137.335 | 62.967 |
| GPW | 4 | 122.118 | 122.796 | 130.754 | 125.520 | 57.059 |
| PW | 4 | 123.742 | 124.420 | 132.377 | 127.144 | 57.871 |
| BW | 4 | 121.285 | 121.963 | 129.921 | 124.687 | 56.643 |
| MOW | 3 | 122.570 | 122.970 | 129.047 | 125.122 | 58.285 |
| EGW | 4 | 121.883 | 122.561 | 130.519 | 125.285 | 56.942 |
Table 11. AIC and GD values for TLW and Weibull regression models with different structures for COVID-19 data.

| Criterion | TLW $M_0$ | TLW $M_1$ | TLW $M_2$ | TLW $M_3$ | Weibull $M_0$ | Weibull $M_1$ | Weibull $M_2$ | Weibull $M_3$ |
|---|---|---|---|---|---|---|---|---|
| AIC | 854.947 | 821.162 | 828.469 | 814.707 | 855.651 | 823.412 | 848.348 | 817.815 |
| GD | 848.947 | 811.162 | 818.469 | 800.707 | 851.651 | 815.412 | 840.348 | 805.815 |
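The two rows of Table 11 differ only by a penalty for model size. The sketch below checks that AIC = GD + 2p for every fitted structure and then picks the model with the smallest AIC. It assumes that GD denotes the global deviance (minus twice the maximized log-likelihood) and counts the parameters from the systematic components, with one extra parameter $\theta$ for the TLW models; these parameter counts are derived here, not quoted from the paper.

```python
# Global deviance (GD) values from Table 11, keyed by (family, structure).
GD = {
    ("TLW", "M0"): 848.947, ("TLW", "M1"): 811.162,
    ("TLW", "M2"): 818.469, ("TLW", "M3"): 800.707,
    ("Weibull", "M0"): 851.651, ("Weibull", "M1"): 815.412,
    ("Weibull", "M2"): 840.348, ("Weibull", "M3"): 805.815,
}
# AIC values reported in Table 11.
AIC_REPORTED = {
    ("TLW", "M0"): 854.947, ("TLW", "M1"): 821.162,
    ("TLW", "M2"): 828.469, ("TLW", "M3"): 814.707,
    ("Weibull", "M0"): 855.651, ("Weibull", "M1"): 823.412,
    ("Weibull", "M2"): 848.348, ("Weibull", "M3"): 817.815,
}
# Regression coefficients per structure (two covariates: sex and age).
N_BETAS = {"M0": 2, "M1": 4, "M2": 4, "M3": 6}

def n_params(family, structure):
    """Number of estimated parameters; TLW has the extra parameter theta."""
    return N_BETAS[structure] + (1 if family == "TLW" else 0)

def aic(family, structure):
    return GD[(family, structure)] + 2 * n_params(family, structure)

best = min(GD, key=lambda key: aic(*key))
for key in GD:
    print(key, round(aic(*key), 3), AIC_REPORTED[key])
print("lowest AIC:", best)
```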
Table 12. MLEs, SEs, and p-values for the $M_3$-TLW regression fitted to COVID-19 data.

| Parameter | MLEs | SEs | p-Values |
|---|---|---|---|
| $\beta_{10}$ | 7.8325 | 0.2995 | <0.01 |
| $\beta_{11}$ | −0.419 | 0.1432 | <0.01 |
| $\beta_{12}$ | −0.0467 | 0.0040 | <0.01 |
| $\beta_{20}$ | −0.4240 | 0.1690 | 0.01 |
| $\beta_{21}$ | 0.4605 | 0.0939 | <0.01 |
| $\beta_{22}$ | 0.0096 | 0.0027 | <0.01 |
| $\theta$ | 3.6408 | 0.3289 | <0.01 |
Table 13. AIC and GD values for TLW and Weibull regression models with different structures for the Post-harvested dataset.

| Criterion | TLW $M_0$ | TLW $M_1$ | TLW $M_2$ | TLW $M_3$ | Weibull $M_0$ | Weibull $M_1$ | Weibull $M_2$ | Weibull $M_3$ |
|---|---|---|---|---|---|---|---|---|
| AIC | 1520.226 | 1491.587 | 1487.429 | 1482.161 | 1519.985 | 1495.154 | 1514.562 | 1486.809 |
| GD | 1514.226 | 1479.587 | 1475.429 | 1464.161 | 1515.985 | 1485.154 | 1504.562 | 1470.809 |
Table 14. MLEs, SEs, and p-values for the $M_3$-TLW regression fitted to the Post-harvested dataset.

| Parameter | MLEs | SEs | p-Values |
|---|---|---|---|
| $\beta_{10}$ | 4.1417 | 0.0380 | <0.01 |
| $\beta_{11}$ | 0.1469 | 0.0471 | <0.01 |
| $\beta_{12}$ | −0.0800 | 0.0482 | 0.0987 |
| $\beta_{13}$ | 0.0354 | 0.0424 | 0.4044 |
| $\beta_{20}$ | 1.5394 | 0.1087 | <0.01 |
| $\beta_{21}$ | 0.4443 | 0.1825 | 0.0159 |
| $\beta_{22}$ | 0.1994 | 0.1532 | 0.1949 |
| $\beta_{23}$ | 0.5335 | 0.1407 | <0.01 |
| $\theta$ | 3.2938 | 0.2847 | |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hussein, M.; Rodrigues, G.M.; Ortega, E.M.M.; Vila, R.; Elsayed, H. A New Truncated Lindley-Generated Family of Distributions: Properties, Regression Analysis, and Applications. Entropy 2023, 25, 1359. https://doi.org/10.3390/e25091359
