Article

New Regression Models Based on the Unit-Sinh-Normal Distribution: Properties, Inference, and Applications

by Guillermo Martínez-Flórez and Roger Tovar-Falón *,†
Departamento de Matemáticas y Estadística, Facultad de Ciencias Básicas, Universidad de Córdoba, Montería 230027, Colombia
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Mathematics 2021, 9(11), 1231; https://doi.org/10.3390/math9111231
Submission received: 25 March 2021 / Revised: 13 May 2021 / Accepted: 17 May 2021 / Published: 28 May 2021

Abstract

In this paper, two new distributions are introduced to model unimodal and/or bimodal data. The first distribution, obtained by applying a simple transformation to a unit-Birnbaum–Saunders random variable, is useful for modeling data with positive support, while the second is appropriate for fitting data on the (0,1) interval. Extensions to regression models are also studied, and statistical inference is performed from a classical perspective by using the maximum likelihood method. A small simulation study is presented to evaluate the performance of the maximum likelihood estimators of the parameters. Finally, two applications to real data sets are reported to illustrate the developed methodology.

1. Introduction

The Birnbaum–Saunders (BS) distribution has been used principally for modeling the lifetime of certain structures under dynamic load, and it was introduced by Birnbaum and Saunders [1]. The probability density function (pdf) of the BS distribution is given by:
$$f_T(t) = \frac{t^{-3/2}(t+\beta)}{2\alpha\sqrt{\beta}}\,\phi(a_t), \quad t > 0, \tag{1}$$
where $\phi(\cdot)$ is the pdf of the standard normal distribution and $a_t = \frac{1}{\alpha}\left(\sqrt{t/\beta} - \sqrt{\beta/t}\right)$, where $\alpha > 0$ is a shape parameter and $\beta > 0$ is a scale parameter. We use the notation $T \sim \mathrm{BS}(\alpha, \beta)$. The BS model has been extended to a large number of families of distributions. Castillo et al. [2], for example, introduced the epsilon Birnbaum–Saunders family of distributions based on the epsilon-skew-symmetric distribution, while Vilca-Labra and Leiva-Sánchez [3] proposed a new fatigue model from the skew-elliptical family of distributions. The new proposal is called the doubly generalized Birnbaum–Saunders distribution, and among its main properties, it is highlighted that the incorporation of the elliptical aspect allows the kurtosis to be flexible and that the skewness makes the asymmetry flexible. Martínez-Flórez et al. [4] introduced the asymmetric alpha-power extension of the BS model. A generalization referred to as the proportional hazard Birnbaum–Saunders distribution was studied by Moreno-Arenas et al. [5], which includes a new parameter that provides more flexibility in terms of skewness and kurtosis. The BS model has also been used in the study of linear regression models, as in Rieck and Nedelman [6], where it was supposed that $Y_i = \log(T_i)$ with $T_i \sim \mathrm{BS}(\alpha, \beta)$ for $i = 1, 2, \ldots, n$ and the errors in the linear model have a sinh-normal (SHN) distribution with parameter vector $(\alpha, 0, 2)$. Santos and Cribari-Neto [7] numerically evaluated the finite sample performances of the likelihood ratio, score, and Wald tests in the log-Birnbaum–Saunders regression model and introduced a RESET-like misspecification test for the model proposed by Rieck and Nedelman [6]. Furthermore, Balakrishnan and Zhu [8] discussed the maximum likelihood estimation of the model parameters under a log-linear link function for the BS lifetime regression model with equal and unequal shape parameters.
The pdf of the SHN model is given by:
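$$f_{\mathrm{SHN}}(y; \alpha, \gamma, \sigma) = \frac{2}{\alpha\sigma}\cosh\left(\frac{y-\gamma}{\sigma}\right)\phi\left(\frac{2}{\alpha}\sinh\left(\frac{y-\gamma}{\sigma}\right)\right), \quad y \in \mathbb{R}, \tag{2}$$
where $\alpha > 0$ is a shape parameter, $\gamma \in \mathbb{R}$ is a location parameter, and $\sigma > 0$ is a scale parameter. A random variable $Y$ following the model in (2) is denoted by $Y \sim \mathrm{SHN}(\alpha, \gamma, \sigma)$.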
The SHN model was extended by Barros et al. [9] by considering a Student t distribution for the errors. This proposed Student t log-BS regression model allows attenuating the influence of the outlying observations. Other extensions of the SHN model were considered by Leiva et al. [10] and Santana et al. [11].
Generalizations of the BS distribution to model data with support in the interval $(0, 1)$ have also been considered by several authors. Mazucheli et al. [12] presented a type of BS distribution with support in the interval $(0, 1)$, which became a new alternative to the beta and Kumaraswamy distributions. This new proposal is called the unit-Birnbaum–Saunders (UBS) model and has the pdf given by:
$$f_{\mathrm{UBS}}(x; \alpha, \beta) = \frac{1}{2x\alpha\beta\sqrt{2\pi}}\left[\left(-\frac{\beta}{\log(x)}\right)^{1/2} + \left(-\frac{\beta}{\log(x)}\right)^{3/2}\right] \exp\left[\frac{1}{2\alpha^{2}}\left(\frac{\log(x)}{\beta} + \frac{\beta}{\log(x)} + 2\right)\right], \tag{3}$$
where $x \in (0, 1)$, $\alpha > 0$ is a shape parameter and $\beta > 0$ is a scale parameter.
To explain response variables between zero and one, such as proportions or rates, alternative statistical models to the beta regression model were studied by Martínez-Flórez et al. [13]. The beta regression model is useful to study relations between variables where the response corresponds to rates, proportions, or indexes. Among the several studies related to the issue, we have Ospina et al. [14], Simas et al. [15], Rocha and Simas [16] and Cribari-Neto and Souza [17], among others. Recent applications of the beta regression model can be found in Ghosh [18], who developed the robust minimum density power divergence estimator and a class of robust Wald-type tests for the beta regression model. For the applications, the author considered data on health measurements of several athletes collected at the Australian Institute of Sport (HIV data) and data on anxiety, depression, and stress in non-clinical women in Australia (stress-anxiety data). On the other hand, Kim et al. [19] proposed control charts of mean and variance by using a copula Markov statistical process control (SPC) and a conditional distribution with diverse copula functions. The authors used beta regression to explain the behavior of the average run lengths of the control charts of conditional variance with data on Major League Baseball (MLB) batting average (BA) and earned run average (ERA) data from the 1998 to 2016 seasons. The main objective of this work is to introduce new families of distributions capable of modeling bimodal data with positive support or on the unit interval. The extension to the case of regression models is also studied.
The rest of this paper is organized as follows: Section 2 introduces the non-negative sinh-normal distribution, and its main statistical properties are studied in detail. In Section 3, the log-sinh-normal regression model is introduced, and its main properties are discussed. Section 4 presents the unit-sinh-normal distribution, and its extension to the case of regression models is studied. In Section 5, a small Monte Carlo simulation study is presented. Finally, in Section 6, two real data applications are reported and compared with several rival models.

2. Non-Negative Sinh-Normal Distribution

In this section, a new non-negative distribution is introduced, which is obtained as an extension of the UBS model. Let $X$ be a random variable following a UBS distribution. If $Y = -\log(X)$, then the distribution of $Y$ has positive support and is referred to as a non-negative sinh-normal (SHN) distribution. The pdf of the non-negative SHN model is given by:
$$f(y; \alpha, \beta) = \frac{1}{\alpha y}\cosh\left(\frac{\log(y) - \log(\beta)}{2}\right)\phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(y) - \log(\beta)}{2}\right)\right), \tag{4}$$
where $\phi(\cdot)$ is the pdf of the standard normal distribution. The distribution in (4) can also be called log-unit-Birnbaum–Saunders (LUBS). One can see that a more general form of the non-negative SHN model is given by the pdf:
$$f_{\mathrm{LSHN}}(y; \alpha, \gamma, \sigma) = \frac{2}{\alpha\sigma y}\cosh\left(\frac{\log(y) - \gamma}{\sigma}\right)\phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(y) - \gamma}{\sigma}\right)\right), \tag{5}$$
where $y > 0$, and $\alpha$, $\gamma$, and $\sigma$ are the shape, location, and scale parameters, respectively. This model is denoted by $\mathrm{LSHN}(\alpha, \gamma, \sigma)$, and we refer to it as the log-sinh-normal model.
The density function in Equation (5) integrates to one, and the proof of this can be seen in Appendix A. Figure 1 displays some forms of the pdf of the LSHN distribution for selected values of $\alpha$, $\gamma$, and $\sigma$. One can see in Figure 1a that the LSHN density is unimodal for $\alpha \le 2$, whereas for $\alpha > 2$, the LSHN density is bimodal (see Figure 1b). This is a useful result, since it provides a distribution for positive bimodal data.
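As an informal numerical check of the density in Equation (5), the following sketch evaluates the LSHN pdf, integrates it with the trapezoidal rule on the scale $z = (\log(y) - \gamma)/\sigma$, and counts local maxima on a grid. The function names and the parameter values used below ($\sigma = 0.1$ in particular) are illustrative assumptions and are not taken from Figure 1; the mode counts are reported for these values only.

```python
import math

def std_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lshn_pdf(y, alpha, gamma, sigma):
    """LSHN(alpha, gamma, sigma) density of Equation (5), for y > 0."""
    z = (math.log(y) - gamma) / sigma
    return 2.0 / (alpha * sigma * y) * math.cosh(z) * std_normal_pdf(2.0 / alpha * math.sinh(z))

def z_grid(alpha, n=4001, normal_range=10.0):
    """Uniform grid in z covering |(2/alpha) sinh(z)| <= normal_range."""
    z_max = math.asinh(alpha * normal_range / 2.0)
    step = 2.0 * z_max / (n - 1)
    return [-z_max + i * step for i in range(n)], step

def lshn_total_mass(alpha, gamma, sigma):
    """Trapezoidal approximation of the integral of the pdf over y > 0."""
    zs, step = z_grid(alpha)
    total = 0.0
    for i, z in enumerate(zs):
        y = math.exp(gamma + sigma * z)
        weight = 0.5 if i in (0, len(zs) - 1) else 1.0
        total += weight * lshn_pdf(y, alpha, gamma, sigma) * y * sigma  # dy = sigma * y * dz
    return total * step

def lshn_mode_count(alpha, gamma, sigma):
    """Number of local maxima of the pdf on the grid (a rough diagnostic only)."""
    zs, _ = z_grid(alpha)
    f = [lshn_pdf(math.exp(gamma + sigma * z), alpha, gamma, sigma) for z in zs]
    return sum(1 for i in range(1, len(f) - 1) if f[i - 1] < f[i] >= f[i + 1])

print(lshn_total_mass(1.0, 0.0, 1.0))
print(lshn_mode_count(1.0, 0.0, 0.1), lshn_mode_count(3.0, 0.0, 0.1))
```

The grid is built on the $z$ scale because the integrand then decays like a normal density at both ends, which keeps the trapezoidal rule accurate with a modest number of points.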

2.1. Distribution Function, Survival Function, and Hazard Function of the LSHN Model

The cumulative distribution function (cdf) of the LSHN model is given by:
$$F_{\mathrm{LSHN}}(y) = F_{\mathrm{SHN}}(\log(y)) = \Phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(y) - \gamma}{\sigma}\right)\right), \tag{6}$$
where $F_{\mathrm{SHN}}(\cdot)$ is the cdf of the SHN distribution. It follows from (5) and (6) that the survival and hazard functions are given, respectively, by:
$$S_{\mathrm{LSHN}}(t) = 1 - F_{\mathrm{SHN}}(\log(t)) = 1 - \Phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(t) - \gamma}{\sigma}\right)\right) = S_{\mathrm{SHN}}(\log(t))$$
and:
$$r_{\mathrm{LSHN}}(t) = \frac{f_{\mathrm{LSHN}}(t)}{1 - \Phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(t) - \gamma}{\sigma}\right)\right)} = \frac{1}{t}\, r_{\mathrm{SHN}}(\log(t)),$$
where $S_{\mathrm{SHN}}(\cdot)$ and $r_{\mathrm{SHN}}(\cdot)$ are the survival and hazard functions of the SHN model. The graphs in Figure 2 show the form of the hazard function for some selected values of the parameters. The plots reveal that the LSHN hazard function increases up to a certain value and then decreases to zero.
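The cdf, survival, and hazard functions above translate directly into code. The sketch below is a minimal illustration that relies only on the Python standard library; the function names are assumptions made for this example, and the survival function is computed as $\Phi(-x)$ rather than $1 - \Phi(x)$ to limit cancellation in the upper tail.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def _xi2(y, alpha, gamma, sigma):
    """Argument of Phi in Equation (6)."""
    return 2.0 / alpha * math.sinh((math.log(y) - gamma) / sigma)

def lshn_pdf(y, alpha, gamma, sigma):
    """LSHN density of Equation (5)."""
    z = (math.log(y) - gamma) / sigma
    return 2.0 / (alpha * sigma * y) * math.cosh(z) * STD_NORMAL.pdf(2.0 / alpha * math.sinh(z))

def lshn_cdf(y, alpha, gamma, sigma):
    """LSHN cdf of Equation (6)."""
    return STD_NORMAL.cdf(_xi2(y, alpha, gamma, sigma))

def lshn_survival(t, alpha, gamma, sigma):
    """Survival function 1 - F(t), written as Phi(-x)."""
    return STD_NORMAL.cdf(-_xi2(t, alpha, gamma, sigma))

def lshn_hazard(t, alpha, gamma, sigma):
    """Hazard function f(t) / S(t)."""
    return lshn_pdf(t, alpha, gamma, sigma) / lshn_survival(t, alpha, gamma, sigma)

# The median of LSHN(alpha, gamma, sigma) is exp(gamma), since sinh(0) = 0.
print(lshn_cdf(math.exp(0.4), 1.5, 0.4, 0.8))
```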

2.2. Moments of the LSHN Model

It can be shown that the $r$-th moment of the random variable $Y$ following a $\mathrm{LSHN}(\alpha, \gamma, \sigma)$ distribution is given by:
$$E(Y^{r}) = M_Z(r),$$
where $M_Z(r)$ is the moment-generating function (mgf) of the random variable $Z \sim \mathrm{SHN}(\alpha, \gamma, \sigma)$. Following some results found by Rieck [20], we have that:
$$E(Y^{r}) = e^{r\gamma}\,\frac{K_a(\alpha^{-2}) + K_b(\alpha^{-2})}{2K_{1/2}(\alpha^{-2})},$$
where $a = (r\sigma + 1)/2$, $b = (r\sigma - 1)/2$, and $K_\lambda(\cdot)$ is the modified Bessel function of the third kind, defined by:
$$K_\lambda(v) = \frac{1}{2}\left(\frac{v}{2}\right)^{\lambda}\int_0^{\infty} u^{-\lambda-1}\exp\left(-u - \frac{v^{2}}{4u}\right)du. \tag{7}$$
For the special case of $\sigma = 2$ (the LUBS model), one can prove that:
$$E(Y) = e^{\gamma}\,\frac{2 + \alpha^{2}}{2}, \qquad E(Y^{2}) = e^{2\gamma}\,\frac{2 + 4\alpha^{2} + 3\alpha^{4}}{2}$$
and:
$$\mathrm{Var}(Y) = e^{2\gamma}\,\frac{\alpha^{2}(5\alpha^{2} + 4)}{4}.$$
From the above results, it can be concluded that the LSHN distribution can be obtained by applying the transformation $Y = e^{Z}$ to a random variable $Z \sim \mathrm{SHN}(\alpha, \gamma, \sigma)$.
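The closed forms for $\sigma = 2$ can be compared with a direct numerical evaluation of $E(Y^{r})$. The sketch below assumes the representation $Y = \exp\{\gamma + \sigma\sinh^{-1}(\alpha Z/2)\}$ with $Z \sim N(0,1)$ and integrates against the standard normal density with the trapezoidal rule; the function names and the test values $\alpha = 0.7$, $\gamma = 0.3$ are illustrative choices, not values from the paper.

```python
import math

def lshn_moment_numeric(r, alpha, gamma, sigma, z_max=12.0, n=4801):
    """E(Y^r) by the trapezoidal rule over Z ~ N(0, 1), truncated to [-z_max, z_max]."""
    step = 2.0 * z_max / (n - 1)
    total = 0.0
    for i in range(n):
        z = -z_max + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        y_r = math.exp(r * (gamma + sigma * math.asinh(alpha * z / 2.0)))
        total += weight * y_r * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2.0 * math.pi)

def lubs_mean(alpha, gamma):
    """Closed-form E(Y) for sigma = 2."""
    return math.exp(gamma) * (2.0 + alpha ** 2) / 2.0

def lubs_second_moment(alpha, gamma):
    """Closed-form E(Y^2) for sigma = 2."""
    return math.exp(2.0 * gamma) * (2.0 + 4.0 * alpha ** 2 + 3.0 * alpha ** 4) / 2.0

def lubs_variance(alpha, gamma):
    """Closed-form Var(Y) for sigma = 2."""
    return math.exp(2.0 * gamma) * alpha ** 2 * (5.0 * alpha ** 2 + 4.0) / 4.0

alpha, gamma = 0.7, 0.3  # illustrative values
print(lshn_moment_numeric(1, alpha, gamma, 2.0), lubs_mean(alpha, gamma))
print(lshn_moment_numeric(2, alpha, gamma, 2.0), lubs_second_moment(alpha, gamma))
```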

2.3. Cumulant-Generating Function and Mode

Let $Y = -\log(X)$ with $X \sim \mathrm{UBS}(\alpha, \beta)$, then the random variable $Y$ has an LSHN distribution. It follows that:
$$M_Y(t) = E(e^{tY}) = E\left(e^{-t\log(X)}\right) = E(X^{-t}).$$
Letting $r = -t$, for $t < 0$, and following Mazucheli et al. [21], we have:
$$M_Y(t) = E(X^{r}) = \frac{2r\alpha^{2}\beta + \sqrt{2r\alpha^{2}\beta + 1} + 1}{2(2r\alpha^{2}\beta + 1)}\exp\left(-\frac{\sqrt{2r\alpha^{2}\beta + 1} - 1}{\alpha^{2}}\right).$$
Now, the cumulant-generating function (cgf) is given by:
$$K_Y(t) = \sum_{j=1}^{\infty} K_j(Y)\frac{t^{j}}{j!} = \log M_Y(t),$$
where $K_j(Y)$ is the $j$-th cumulant of the random variable $Y$. We have that,
$$K_Y(t) = -\log\left[2(2r\alpha^{2}\beta + 1)\right] - \frac{\sqrt{2r\alpha^{2}\beta + 1} - 1}{\alpha^{2}} + \log\left(2r\alpha^{2}\beta + \sqrt{2r\alpha^{2}\beta + 1} + 1\right).$$
The modes of the LSHN distribution can be obtained by maximizing the logarithm of the pdf. Thus, let $\xi_1 = \frac{2}{\alpha}\cosh\left(\frac{\log(y) - \gamma}{\sigma}\right)$ and $\xi_2 = \frac{2}{\alpha}\sinh\left(\frac{\log(y) - \gamma}{\sigma}\right)$ in the logarithm of the pdf of the LSHN model; taking the derivative with respect to $y$ and setting it equal to zero, it is obtained that the mode (or modes) of the pdf of the LSHN distribution is (are) the solution(s) of the non-linear equation:
$$\xi_2 - \xi_1^{2}\xi_2 - \sigma\xi_1 = 0.$$
Solving this non-linear equation, the mode(s) of the LSHN distribution is (are) found.

2.4. Asymptotic Distribution

If $Y \sim \mathrm{LSHN}(\alpha, \gamma, \sigma)$, it can be proven that the random variable $(\log(Y) - \gamma)/(\alpha\sigma/2)$ converges to a normal distribution when $\alpha \to 0$, that is, the random variable $Y$ converges to a log-normal (LN) distribution when $\alpha \to 0$. Moreover, it follows from (6) that, if $Y \sim \mathrm{LSHN}(\alpha, \gamma, \sigma)$, then:
$$Z = \frac{2}{\alpha}\sinh\left(\frac{\log(Y) - \gamma}{\sigma}\right) \sim N(0, 1).$$
Thus, if $Z \sim N(0, 1)$, then:
$$Y = \exp\left\{\gamma + \sigma\sinh^{-1}\left(\frac{\alpha Z}{2}\right)\right\} \sim \mathrm{LSHN}(\alpha, \gamma, \sigma),$$
where $\sinh^{-1}(\cdot)$ is the inverse function of the $\sinh(\cdot)$ function.
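The stochastic representation above suggests a simple way to draw LSHN variates from standard normal draws. The following sketch is one possible implementation with the Python standard library; the function names, the seed, and the parameter values are assumptions made for illustration.

```python
import math
import random

def lshn_from_normal(z, alpha, gamma, sigma):
    """Map a standard normal value z to an LSHN(alpha, gamma, sigma) value."""
    return math.exp(gamma + sigma * math.asinh(alpha * z / 2.0))

def normal_from_lshn(y, alpha, gamma, sigma):
    """Inverse map: Z = (2/alpha) sinh((log y - gamma) / sigma)."""
    return 2.0 / alpha * math.sinh((math.log(y) - gamma) / sigma)

def rlshn(n, alpha, gamma, sigma, seed=None):
    """Draw n pseudo-random LSHN(alpha, gamma, sigma) values."""
    rng = random.Random(seed)
    return [lshn_from_normal(rng.gauss(0.0, 1.0), alpha, gamma, sigma) for _ in range(n)]

sample = rlshn(20000, alpha=2.5, gamma=0.4, sigma=0.5, seed=123)
# About half of the draws should fall below the median exp(gamma).
print(sum(y < math.exp(0.4) for y in sample) / len(sample))
```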

3. The LSHN Regression Model

Regression models are a statistical technique widely used in many areas of knowledge to explain the behavior of a response variable, say $Y$, as a function of other variables called explanatory variables, say $X_1, \ldots, X_p$, and a vector of unknown parameters called regression coefficients, which is denoted by $\theta$. In this section, the LSHN linear regression model is introduced by considering a random sample of variables $Y_i$, such that:
$$Y_i \sim \mathrm{LSHN}(\alpha, x_i^{\top}\theta, \sigma) \tag{10}$$
for $i = 1, \ldots, n$, with $x_i = (X_{i1}, \ldots, X_{ip})^{\top}$ and $\theta = (\theta_1, \ldots, \theta_p)^{\top}$. In this case, we suppose the functional relationship:
$$\log(Y_i) = x_i^{\top}\theta + \varepsilon_i, \tag{11}$$
where the random variables $\varepsilon_i \sim \mathrm{SHN}(\alpha, 0, \sigma)$ for $i = 1, \ldots, n$.
The functional relationship in Equation (11) is justified below from Theorem 1.
Theorem 1.
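Let $X \sim \mathrm{UBS}(\alpha, \beta)$, then for $c > 0$, $Y = X^{1/c}$ has a $\mathrm{UBS}(\alpha, \beta/c)$ distribution.
Proof. 
Consider $X \sim \mathrm{UBS}(\alpha, \beta)$, and let $Y = X^{1/c}$, then $X = Y^{c}$ and $dX/dY = cY^{c-1}$; thus:
$$\begin{aligned} f(y) &= \frac{1}{2y^{c}\alpha\beta\sqrt{2\pi}}\left[\left(-\frac{\beta}{\log(y^{c})}\right)^{1/2} + \left(-\frac{\beta}{\log(y^{c})}\right)^{3/2}\right]\exp\left[\frac{1}{2\alpha^{2}}\left(\frac{\log(y^{c})}{\beta} + \frac{\beta}{\log(y^{c})} + 2\right)\right] c\,y^{c-1} \\ &= \frac{1}{2y\alpha(\beta/c)\sqrt{2\pi}}\left[\left(-\frac{\beta/c}{\log(y)}\right)^{1/2} + \left(-\frac{\beta/c}{\log(y)}\right)^{3/2}\right]\exp\left[\frac{1}{2\alpha^{2}}\left(\frac{\log(y)}{\beta/c} + \frac{\beta/c}{\log(y)} + 2\right)\right]. \end{aligned}$$
That is, $Y \sim \mathrm{UBS}(\alpha, \beta/c)$. □
To construct the model, we considered a random sample $X_1, X_2, \ldots, X_n$, such that $X_i \sim \mathrm{UBS}(\alpha_i, \beta_i)$ for $i = 1, 2, \ldots, n$; and we supposed that $X_i = f(Z_1, Z_2, \ldots, Z_p)$ and $\beta_i = \exp(z_i^{\top}\theta)$ for $i = 1, 2, \ldots, n$, where $z_i = (Z_1, Z_2, \ldots, Z_p)^{\top}$. Letting $\alpha_i = \alpha$ and since $X_i^{1/c} \sim \mathrm{UBS}(\alpha, \beta_i/c)$ (this follows from Theorem 1), taking $X_i = \delta_i^{\exp(z_i^{\top}\theta)}$ where $\delta_i \sim \mathrm{UBS}(\alpha, 1)$, we have that $X_i \sim \mathrm{UBS}\left(\alpha, 1/(1/\exp(z_i^{\top}\theta))\right)$, that is, $X_i \sim \mathrm{UBS}(\alpha, \exp(z_i^{\top}\theta))$.
Thus, for $Y_i = -\log(X_i) = -\log\left(\delta_i^{\exp(z_i^{\top}\theta)}\right)$, we have that,
$$Y_i = \exp(z_i^{\top}\theta) \times \left(-\log(\delta_i)\right) \;\Longrightarrow\; \log(Y_i) = z_i^{\top}\theta + \log\left(-\log(\delta_i)\right) = z_i^{\top}\theta + \log(\varepsilon_i^{*}) = z_i^{\top}\theta + \varepsilon_i.$$
Now, for $\varepsilon_i^{*} = -\log(\delta_i)$, it follows that $\varepsilon_i^{*} \sim \mathrm{LSHN}(\alpha, \log(\beta_i), 2)$, then $\varepsilon_i^{*} \sim \mathrm{LSHN}(\alpha, 0, 2)$, and then:
$$f(\varepsilon_i^{*}) = \frac{1}{\alpha\varepsilon_i^{*}}\cosh\left(\frac{\log(\varepsilon_i^{*})}{2}\right)\phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(\varepsilon_i^{*})}{2}\right)\right) = \frac{1}{\alpha\varepsilon_i^{*}}\cosh\left(\frac{\log(Y_i) - z_i^{\top}\theta}{2}\right)\phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(Y_i) - z_i^{\top}\theta}{2}\right)\right).$$
It can be seen from the previous result that the regression model given in (10) generalizes the model obtained from Theorem 1.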

3.1. Maximum Likelihood Estimation in the LSHN Regression Model

To get the estimates of the parameters in the LSHN regression model, we considered the maximum likelihood method. Thus, given a random sample of size $n$, say $Y = (Y_1, \ldots, Y_n)^{\top}$, where $Y_i \sim \mathrm{LSHN}(\alpha, x_i^{\top}\theta, \sigma)$, the log-likelihood function for the parameter vector $\varphi = (\theta^{\top}, \alpha, \sigma)^{\top}$ can be written as follows:
$$\ell(\varphi; Y) \propto -n\log(\sigma) - \sum_{i=1}^{n}\log(Y_i) + \sum_{i=1}^{n}\log(\xi_{i1}) - \frac{1}{2}\sum_{i=1}^{n}\xi_{i2}^{2}, \tag{12}$$
where $\xi_{i1} = \frac{2}{\alpha}\cosh\left(\frac{\log(Y_i) - x_i^{\top}\theta}{\sigma}\right)$ and $\xi_{i2} = \frac{2}{\alpha}\sinh\left(\frac{\log(Y_i) - x_i^{\top}\theta}{\sigma}\right)$ for $i = 1, \ldots, n$.
After taking partial derivatives of the log-likelihood function (12) with respect to the parameters of interest, we obtain the following score functions:
$$U(\theta_j) = \frac{1}{\sigma}\sum_{i=1}^{n} x_{ij}\left(\xi_{i1}\xi_{i2} - \frac{\xi_{i2}}{\xi_{i1}}\right), \quad j = 1, \ldots, p,$$
$$U(\alpha) = -\frac{n}{\alpha} + \frac{1}{\alpha}\sum_{i=1}^{n}\xi_{i2}^{2},$$
$$U(\sigma) = -\frac{n}{\sigma} - \frac{1}{\sigma}\sum_{i=1}^{n} z_i\tanh(z_i) + \frac{1}{\sigma}\sum_{i=1}^{n} z_i\xi_{i1}\xi_{i2},$$
where $z_i = (\log(Y_i) - x_i^{\top}\theta)/\sigma$, for $i = 1, \ldots, n$. The maximum likelihood estimators for $\theta_1, \ldots, \theta_p$, $\alpha$, and $\sigma$ are the solutions to the equations $U(\theta_j) = 0$ $(j = 1, \ldots, p)$, $U(\alpha) = 0$, and $U(\sigma) = 0$, which must be solved with a numerical method, such as Newton–Raphson or a quasi-Newton algorithm.
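Before passing the log-likelihood and score functions to a numerical optimizer, it can be helpful to check one against the other. The sketch below codes both (with the additive constant of the log-likelihood dropped) and compares the analytic scores with central finite differences. The function names and the small data set are hypothetical and serve only to exercise the formulas.

```python
import math

def lshn_loglik(theta, alpha, sigma, X, y):
    """Log-likelihood of the LSHN regression model, up to an additive constant."""
    total = 0.0
    for xi, yi in zip(X, y):
        z = (math.log(yi) - sum(a * b for a, b in zip(xi, theta))) / sigma
        xi1 = 2.0 / alpha * math.cosh(z)
        xi2 = 2.0 / alpha * math.sinh(z)
        total += -math.log(sigma) - math.log(yi) + math.log(xi1) - 0.5 * xi2 ** 2
    return total

def lshn_score(theta, alpha, sigma, X, y):
    """Score functions U(theta_j), U(alpha), and U(sigma)."""
    u_theta = [0.0] * len(theta)
    u_alpha = 0.0
    u_sigma = 0.0
    for xi, yi in zip(X, y):
        z = (math.log(yi) - sum(a * b for a, b in zip(xi, theta))) / sigma
        xi1 = 2.0 / alpha * math.cosh(z)
        xi2 = 2.0 / alpha * math.sinh(z)
        g = xi1 * xi2 - xi2 / xi1  # note that xi2 / xi1 = tanh(z)
        for j, xij in enumerate(xi):
            u_theta[j] += xij * g / sigma
        u_alpha += (xi2 ** 2 - 1.0) / alpha
        u_sigma += (z * g - 1.0) / sigma
    return u_theta, u_alpha, u_sigma

# Hypothetical data: an intercept plus one covariate, four observations.
X = [[1.0, 0.1], [1.0, 0.5], [1.0, 0.9], [1.0, 0.3]]
y = [1.2, 0.8, 2.5, 1.7]
theta, alpha, sigma, h = [0.2, 0.5], 0.8, 1.3, 1e-6

u_theta, u_alpha, u_sigma = lshn_score(theta, alpha, sigma, X, y)
fd_alpha = (lshn_loglik(theta, alpha + h, sigma, X, y)
            - lshn_loglik(theta, alpha - h, sigma, X, y)) / (2.0 * h)
print(u_alpha, fd_alpha)
```

Any general-purpose optimizer can then be applied to the negative log-likelihood, with the analytic scores supplied as the gradient.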

3.2. Observed and Expected Information Matrix

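The elements of the observed information matrix $J(\varphi)$ for the parameter vector $\varphi = (\theta^{\top}, \alpha, \sigma)^{\top}$, which are denoted by $j_{\varphi_j\varphi_k}$ with $\varphi_j \in (\theta_1, \ldots, \theta_p, \alpha, \sigma)$, can be obtained by calculating the second partial derivatives of the log-likelihood function (12), i.e., $j_{\varphi_j\varphi_k} = -\partial^{2}\ell(\varphi; Y)/\partial\varphi_j\partial\varphi_k$. These elements are given by:
$$\begin{aligned} j_{\theta_j\theta_k} &= \frac{1}{\sigma^{2}}\sum_{i=1}^{n} x_{ij}x_{ik}\left\{2\xi_{i2}^{2} + \frac{4}{\alpha^{2}} - 1 + \frac{\xi_{i2}^{2}}{\xi_{i2}^{2} + 4/\alpha^{2}}\right\}, \\ j_{\alpha\theta_j} &= \frac{2}{\sigma\alpha}\sum_{i=1}^{n} x_{ij}\xi_{i1}\xi_{i2}, \\ j_{\alpha\alpha} &= -\frac{n}{\alpha^{2}} + \frac{3}{\alpha^{2}}\sum_{i=1}^{n}\xi_{i2}^{2}, \\ j_{\sigma\theta_j} &= \frac{1}{\sigma^{2}}\sum_{i=1}^{n} x_{ij}\left\{z_i\left(\xi_{i1}^{2} + \xi_{i2}^{2} - \operatorname{sech}^{2} z_i\right) + \xi_{i1}\xi_{i2} - \frac{\xi_{i2}}{\xi_{i1}}\right\}, \\ j_{\sigma\alpha} &= \frac{2}{\sigma\alpha}\sum_{i=1}^{n} z_i\xi_{i1}\xi_{i2}, \\ j_{\sigma\sigma} &= -\frac{n}{\sigma^{2}} + \frac{2}{\sigma^{2}}\sum_{i=1}^{n} z_i\left(\xi_{i1}\xi_{i2} - \frac{\xi_{i2}}{\xi_{i1}}\right) + \frac{1}{\sigma^{2}}\sum_{i=1}^{n} z_i^{2}\left\{2\xi_{i2}^{2} + \frac{4}{\alpha^{2}} - 1 + \frac{\xi_{i2}^{2}}{\xi_{i2}^{2} + 4/\alpha^{2}}\right\}. \end{aligned}$$
The previous results are similar to those obtained by Rieck and Nedelman [6]. The elements of the expected information matrix, $I(\varphi)$, defined as $n^{-1}$ times the expected values of the elements of the observed information matrix, are denoted by $i_{\theta\theta}$, $i_{\alpha\theta_j}$, $i_{\alpha\alpha}$, $i_{\sigma\theta}$, $i_{\sigma\alpha}$, and $i_{\sigma\sigma}$. Following Rieck and Nedelman [6], we define:
$$a_k(\varphi) = E\left[z^{k}\left\{2\xi_{i2}^{2} + \frac{4}{\alpha^{2}} - 1 + \frac{\xi_{i2}^{2}}{\xi_{i2}^{2} + 4/\alpha^{2}}\right\}\right], \qquad b_k(\varphi) = E\left(z^{k}\xi_{i1}\xi_{i2}\right) \quad \text{and}$$
$$d_k(\varphi) = E\left(z^{k}\,\frac{\xi_{i2}}{\xi_{i1}}\right).$$
Then, the following elements of the $I(\varphi)$ matrix are obtained:
$$\begin{aligned} i_{\theta\theta} &= \frac{1}{n\sigma^{2}}C(\alpha)X^{\top}X, \qquad i_{\alpha\theta_j} = 0, \quad j = 1, \ldots, p, \qquad i_{\alpha\alpha} = \frac{2}{\alpha^{2}}, \\ i_{\sigma\theta} &= \frac{1}{\sigma^{2}}\left[a_1(\varphi) + b_0(\varphi) - d_0(\varphi)\right]\bar{X}, \qquad i_{\sigma\alpha} = \frac{2b_1(\varphi)}{\sigma\alpha}, \\ i_{\sigma\sigma} &= -\frac{1}{\sigma^{2}} + \frac{a_2(\varphi)}{\sigma^{2}} + \frac{2\left[b_1(\varphi) - d_1(\varphi)\right]}{\sigma^{2}}, \end{aligned}$$
where:
$$C(\alpha) = 2 + \frac{4}{\alpha^{2}} - \left(\frac{2\pi}{\alpha^{2}}\right)^{1/2}\left\{1 - \operatorname{erf}\left[(2/\alpha^{2})^{1/2}\right]\right\}\exp(2/\alpha^{2})$$
and $\operatorname{erf}(x)$ is the error function given by:
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^{x} e^{-z^{2}}\,dz.$$
One can show that $\det(I(\varphi)) \neq 0$, that is, the information matrix is non-singular, which guarantees the existence of the covariance matrix of the maximum likelihood estimators. The asymptotic covariance matrix of $\sqrt{n}\,\hat{\varphi}$ is given by $I^{-1}(\varphi)$, the inverse of the Fisher information matrix. The existence of $I^{-1}(\varphi)$ also guarantees that the vector of maximum likelihood estimators has the asymptotic distribution:
$$\sqrt{n}\left[(\hat{\theta}^{\top}, \hat{\alpha}, \hat{\sigma})^{\top} - (\theta^{\top}, \alpha, \sigma)^{\top}\right] \xrightarrow{d} N_{p+2}\left(0, I^{-1}(\varphi)\right),$$
that is, the maximum likelihood estimators of the model parameters are consistent and asymptotically follow a normal distribution with the covariance matrix being the inverse of the Fisher information matrix. The approximation $N_{p+2}\left(\varphi, n^{-1}I^{-1}(\varphi)\right)$ can be used to construct confidence intervals for the parameters $\varphi_j$. These confidence intervals are given by:
$$\hat{\varphi}_j \mp z_{1-\alpha/2} \times \mathrm{se}(\hat{\varphi}_j),$$
where $\mathrm{se}(\hat{\varphi}_j)$ corresponds to the square root of the $j$-th diagonal element of the matrix $n^{-1}I^{-1}(\hat{\varphi})$ and $z_{1-\alpha/2}$ denotes the $100(1 - \alpha/2)\%$ quantile of the standard normal distribution (here, $\alpha$ in the subscript denotes the significance level, not the shape parameter).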

4. Unit-Sinh-Normal Distribution

Now, we introduce the SHN model with support on the interval $(0, 1)$, which we call the unit-sinh-normal model and denote by $Y \sim \mathrm{USHN}(\alpha, \gamma, \sigma)$. The pdf is given by:
$$f_{\mathrm{USHN}}(y; \alpha, \gamma, \sigma) = \frac{1}{(1-y)\left[-\log(1-y)\right]}\,\frac{2}{\sigma\alpha}\cosh\left(\frac{\log\left(-\log(1-y)\right) - \gamma}{\sigma}\right) \phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log\left(-\log(1-y)\right) - \gamma}{\sigma}\right)\right), \tag{16}$$
where $y \in (0, 1)$, $\alpha > 0$ is a shape parameter, $\gamma \in \mathbb{R}$ is a location parameter, and $\sigma > 0$ is a scale parameter. It can be seen in the argument of the sinh and cosh functions that the density function in (16) is defined on the complementary log-log transformation, which is widely used in generalized linear models. Although the density (16) could be defined from the log-log link function, we used the complementary log-log link function. Note that, if $y \in (0, 1)$, then $(1 - y) \in (0, 1)$, and the simple transformation $Z = 1 - Y$ leads to the model with the log-log link function. Figure 3 displays some plots of the pdf of the USHN distribution for some selected values of the parameters. The plots reveal that the USHN density is unimodal for $\alpha \le 2$ (see Figure 3a), and the density function is bimodal for $\alpha > 2$ (see Figure 3b). One of the advantages of the USHN distribution is that it can be used for modeling data sets of proportions and rates with bimodal behavior.

4.1. Distribution Function, Survival Function, and Hazard Function of the USHN Model

It is easy to see that the corresponding cdf of the random variable $Y \sim \mathrm{USHN}(\alpha, \gamma, \sigma)$ is given by:
$$F_{\mathrm{USHN}}(y) = F_{\mathrm{SHN}}\left(\log(-\log(1-y))\right) = \Phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(-\log(1-y)) - \gamma}{\sigma}\right)\right), \tag{17}$$
where $F_{\mathrm{SHN}}(\cdot)$ is the cdf of the $\mathrm{SHN}(\alpha, \gamma, \sigma)$ distribution. The survival function $S_{\mathrm{USHN}}(t)$ and hazard function $r_{\mathrm{USHN}}(t)$ are given by:
$$S_{\mathrm{USHN}}(t) = 1 - F_{\mathrm{SHN}}\left(\log(-\log(1-t))\right) = 1 - \Phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(-\log(1-t)) - \gamma}{\sigma}\right)\right) = S_{\mathrm{SHN}}\left(\log(-\log(1-t))\right)$$
and:
$$r_{\mathrm{USHN}}(t) = \frac{f_{\mathrm{USHN}}(t)}{1 - \Phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(-\log(1-t)) - \gamma}{\sigma}\right)\right)} = \frac{r_{\mathrm{SHN}}\left(\log(-\log(1-t))\right)}{(1-t)\left[-\log(1-t)\right]},$$
respectively, where $S_{\mathrm{SHN}}(\cdot)$ and $r_{\mathrm{SHN}}(\cdot)$ are the survival function and hazard function of the SHN model, respectively. From (17), it is concluded that:
$$Z = \frac{2}{\alpha}\sinh\left(\frac{\log(-\log(1-Y)) - \gamma}{\sigma}\right) \sim N(0, 1),$$
which implies that $\log(-\log(1-Y)) \sim \mathrm{SHN}(\alpha, \gamma, \sigma)$. Figure 4 shows the behavior of the hazard function of a USHN random variable for some selected values of the parameters. The graphs reveal that the hazard function increases up to a certain value and then decreases to zero.
One can see that a random variable $Y$ following a $\mathrm{USHN}(\alpha, \gamma, \sigma)$ distribution can be generated by using the expression:
$$Y = 1 - \exp\left\{-\exp\left[\gamma + \sigma\sinh^{-1}\left(\frac{\alpha}{2}\Phi^{-1}(U)\right)\right]\right\},$$
where $U \sim U(0, 1)$ denotes the uniform distribution on the $(0, 1)$ interval and $\Phi^{-1}(\cdot)$ refers to the inverse of the cdf of the standard normal distribution.
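The generation formula above is an inverse-transform sampler, and a minimal sketch of it follows. The function names, the seed, and the parameter values are assumptions for illustration; `expm1` and `log1p` are used only to keep the evaluation of $1 - e^{-x}$ and $\log(1 - y)$ accurate near the boundaries.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def ushn_quantile(u, alpha, gamma, sigma):
    """Quantile function: maps u in (0, 1) to a USHN(alpha, gamma, sigma) value."""
    w = gamma + sigma * math.asinh(alpha * STD_NORMAL.inv_cdf(u) / 2.0)
    return -math.expm1(-math.exp(w))  # equals 1 - exp(-exp(w))

def ushn_cdf(y, alpha, gamma, sigma):
    """USHN cdf of Equation (17)."""
    w = math.log(-math.log1p(-y))  # log(-log(1 - y))
    return STD_NORMAL.cdf(2.0 / alpha * math.sinh((w - gamma) / sigma))

def rushn(n, alpha, gamma, sigma, seed=None):
    """Draw n pseudo-random USHN values by inverse transform."""
    rng = random.Random(seed)
    draws = []
    while len(draws) < n:
        u = rng.random()
        if u > 0.0:  # inv_cdf requires 0 < u < 1
            draws.append(ushn_quantile(u, alpha, gamma, sigma))
    return draws

sample = rushn(1000, alpha=1.5, gamma=0.0, sigma=1.0, seed=1)
print(min(sample), max(sample))
```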

4.2. Moments of the USHN Model

The $r$-th moment of a random variable $Y$ following a $\mathrm{USHN}(\alpha, \gamma, \sigma)$ distribution is given by:
$$E(Y^{r}) = E\left[\left(1 - e^{-X}\right)^{r}\right] = \sum_{j=0}^{r}\binom{r}{j}(-1)^{j}E\left(e^{-jX}\right),$$
where $X \sim \mathrm{LSHN}(\alpha, \gamma, \sigma)$. Using the Taylor expansion for $e^{-jX}$ and the moments of the $\mathrm{LSHN}(\alpha, \gamma, \sigma)$ distribution, it follows that:
$$E(Y^{r}) = \sum_{j=0}^{r}\sum_{l=0}^{\infty}\binom{r}{j}(-1)^{j+l}\frac{(je^{\gamma})^{l}}{l!}\,\frac{K_{a_1}(\alpha^{-2}) + K_{b_1}(\alpha^{-2})}{2K_{1/2}(\alpha^{-2})},$$
where $a_1 = (l\sigma + 1)/2$ and $b_1 = (l\sigma - 1)/2$, with $K_\lambda(\cdot)$ the modified Bessel function of the third kind defined in (7).

4.3. Cumulant-Generating Function and Mode

The mgf of the USHN model can be obtained by using:
$$M_Y(t) = E(e^{tY}) = E\left(e^{t(1 - e^{-X})}\right) = e^{t}E\left(e^{-te^{-X}}\right) = e^{-r}E(e^{rZ}) = e^{-r}M_Z(r),$$
where $Z \sim \mathrm{UBS}(\alpha, \beta)$ and $r = -t$, with $M_Z(r)$ being the mgf of the UBS distribution. Thus,
$$K_Y(r) = -r + K_Z(r), \quad r > 0,$$
where $K_Z(r)$ is the cgf of the UBS distribution. To find the mode of the USHN distribution, we reasoned in the same way as in the LSHN model. Then, let $\xi_1^{*} = \frac{2}{\alpha}\cosh\left(\frac{\log(-\log(1-y)) - \gamma}{\sigma}\right)$ and $\xi_2^{*} = \frac{2}{\alpha}\sinh\left(\frac{\log(-\log(1-y)) - \gamma}{\sigma}\right)$. Differentiating the logarithm of the pdf of the USHN distribution, substituting $\xi_1^{*}$ and $\xi_2^{*}$, and setting the result equal to zero, we obtain the non-linear equation:
$$\xi_1^{*}\left\{\xi_1^{*}\xi_2^{*} + \sigma\left[\log(1-y) + 1\right]\right\} = \xi_2^{*}.$$
Solving this non-linear equation, the mode(s) of the USHN distribution is (are) found.

4.4. Asymptotic Distribution

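Let $Y \sim \mathrm{USHN}(\alpha, \gamma, \sigma)$. One can prove that the random variable $\left[\log(-\log(1-Y)) - \gamma\right]/(\alpha\sigma/2)$ converges to a normal distribution when $\alpha$ tends to zero. Moreover, if $Y \sim \mathrm{USHN}(\alpha, \gamma, \sigma)$, then:
$$Z = \frac{2}{\alpha}\sinh\left(\frac{\log(-\log(1-Y)) - \gamma}{\sigma}\right) \sim N(0, 1).$$
It follows from the result above that:
$$Y = 1 - \exp\left\{-\exp\left[\gamma + \sigma\sinh^{-1}\left(\frac{\alpha Z}{2}\right)\right]\right\} \sim \mathrm{USHN}(\alpha, \gamma, \sigma).$$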

4.5. The LUSHN Regression Model

Now, we introduce the LUSHN linear regression model. We considered a set of $p$ explanatory variables, which are denoted by $x_i = (x_{i1}, \ldots, x_{ip})^{\top}$, and a $p$-dimensional vector of unknown parameters $\theta = (\theta_1, \ldots, \theta_p)^{\top}$, such that, for $i = 1, \ldots, n$, the following functional relationship holds:
$$\log\left(-\log(1 - Y_i)\right) = x_i^{\top}\theta + \varepsilon_i, \tag{18}$$
where $\varepsilon_i \sim \mathrm{SHN}(\alpha, 0, \sigma)$. From (18), we have that,
$$Z_i = \log\left(-\log(1 - Y_i)\right) \sim \mathrm{SHN}(\alpha, x_i^{\top}\theta, \sigma);$$
hence, $E(Z_i) = x_i^{\top}\theta$. Thus, $\hat{Z}_i = x_i^{\top}\hat{\theta}$, and therefore,
$$\hat{Y}_i = 1 - \exp\left\{-\exp(\hat{Z}_i)\right\} = 1 - \exp\left\{-\exp(x_i^{\top}\hat{\theta})\right\}.$$
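The fitted values are obtained by applying the inverse complementary log-log transformation to the linear predictor. A short sketch of this step is given below; the function names, the coefficient values, and the design rows are hypothetical and are not estimates from this paper.

```python
import math

def cloglog(y):
    """Complementary log-log transformation log(-log(1 - y)), for 0 < y < 1."""
    return math.log(-math.log1p(-y))

def inv_cloglog(eta):
    """Inverse transformation 1 - exp(-exp(eta))."""
    return -math.expm1(-math.exp(eta))

def fitted_values(X, theta_hat):
    """Fitted responses for the rows of the design matrix X."""
    return [inv_cloglog(sum(a * b for a, b in zip(row, theta_hat))) for row in X]

# Hypothetical coefficients and design rows (intercept plus one covariate).
theta_hat = [-0.5, 1.2]
X = [[1.0, 0.0], [1.0, 0.5], [1.0, 1.0]]
print(fitted_values(X, theta_hat))
```

Because the inverse transformation is increasing in the linear predictor, a positive coefficient corresponds to a fitted response that increases with the covariate.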
To obtain the estimates of the model parameters, we considered the maximum likelihood method as in the LSHN regression model. Thus, given a random sample of size $n$, say $Y = (Y_1, \ldots, Y_n)^{\top}$, the log-likelihood function for the parameter vector $\rho = (\theta^{\top}, \alpha, \sigma)^{\top}$ is given, up to an additive constant, by:
$$\ell(\rho; Y) = -n\log(\sigma) - \sum_{i=1}^{n}\log(1 - Y_i) - \sum_{i=1}^{n}\log\left(-\log(1 - Y_i)\right) + \sum_{i=1}^{n}\log(\xi_{i1}) - \frac{1}{2}\sum_{i=1}^{n}\xi_{i2}^{2},$$
where:
$$\xi_{i1} = \frac{2}{\alpha}\cosh\left(\frac{\log(-\log(1 - Y_i)) - x_i^{\top}\theta}{\sigma}\right), \quad \text{and} \quad \xi_{i2} = \frac{2}{\alpha}\sinh\left(\frac{\log(-\log(1 - Y_i)) - x_i^{\top}\theta}{\sigma}\right),$$
for $i = 1, \ldots, n$.
The score function and the observed information matrix of the LUSHN regression model have the same form as the respective expressions of the LSHN regression model, after replacing $\log(Y_i)$ with $\log(-\log(1 - Y_i))$ and defining:
$$z_i = \frac{\log(-\log(1 - Y_i)) - x_i^{\top}\theta}{\sigma}$$
for $i = 1, \ldots, n$. The MLEs for $\theta$, $\alpha$, and $\sigma$ are the solutions to the equations $U(\theta_j) = 0$ $(j = 1, \ldots, p)$, $U(\alpha) = 0$, and $U(\sigma) = 0$, which must be solved with a numerical method, such as Newton–Raphson or a quasi-Newton algorithm.

5. Simulation Study

To analyze the behavior of the estimators of the parameters in the LUSHN regression model, we carried out a small Monte Carlo simulation study. To generate USHN random variables, we applied the algorithm described in Section 4.1. In this simulation study, we analyzed the behavior of the estimators of the parameters of the model:
$$\log\left(-\log(1 - Y_i)\right) = \theta_0 + \theta_1 X_i + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$
where $\varepsilon_i \sim \mathrm{SHN}(\alpha, 0, \sigma)$. The values of the explanatory variable $X$ were taken from a uniform random variable on the (0,1) interval, that is, $X_i \sim U(0, 1)$. Without loss of generality, we took the value of the scale parameter equal to $\sigma = 1.0$; however, the following results can be obtained for any value of the scale parameter from the simple transformation $\varepsilon_i = \sigma\nu_i$ with $\nu_i \sim \mathrm{SHN}(\alpha, 0, 1)$. The values of the shape parameter were taken as $\alpha$ = 0.50, 0.75, 1.25, 1.75, 2.25, 2.75 to take into account different configurations in the form of the pdf of the random variable $\varepsilon_i$. On the other hand, since the coefficients $\theta_i$, $i = 0, 1$, in the model (18) can be any real numbers and there are no restrictions on the values that can be assumed, we took the particular values $\theta_0 = 0.75$ and $\theta_1 = 0.25$. To analyze some statistical measures of the maximum likelihood estimator (MLE), we considered small, moderate, and large sample sizes, $n = 10, 25, 50, 75, 100, 200, 500$, and 5000 iterations were performed for each scenario. The studied characteristics were: the relative bias (RB), the root of the mean squared error (RMSE), and the ratio between the standard deviation (SD) of the estimate and the average SD (RSD). Finally, we examined the coverage probability (CP) of the 95% confidence interval based on the asymptotic normality of the ML estimators.
Table 1 and Table 2 present the results of the simulation study. It can be observed that the RB and RMSE of the MLEs tend to decrease when the sample size increases, which is in agreement with the asymptotic unbiasedness and consistency of the MLE. It is also observed that, for small sample sizes, important biases are obtained in the estimates of $\alpha$ and $\theta_1$. Another interesting aspect to take into account is that, for values of the $\alpha$ parameter less than one, the bias of $\sigma$ is quite considerable for small sample sizes; however, this bias is negligible for values of $\alpha$ above one.
Regarding the coverage rates of the confidence intervals (CP), the simulation results showed that these were higher than 95% for the parameter $\alpha$ in all of the considered sample sizes. For the scale parameter $\sigma$, the CPs were low for small sample sizes (less than 50), and they tended to increase to around 90% when the sample size increased. It was also observed that the CPs for the coefficients $\theta_0$ and $\theta_1$ were close to 95% for moderate and large sample sizes (greater than 75).

6. Applications

To illustrate the potential of the proposed distributions, we considered two real data sets taken from the literature. The first data set is an example of positive data, namely fatigue data in hardened steel. The second data set corresponds to body fat data of athletes of the Australian Institute of Sport (AIS), which is an example of observations on the $(0, 1)$ interval.

6.1. Fatigue Data

This data set consisted of failure times ($T$) in rolling contact fatigue of ten hardened steel specimens tested at each of four values of contact stress, $X$. The data were obtained using a four-ball rolling contact test rig at the Princeton Laboratories of Mobil Research and Development Co. This data set was analyzed by Chan et al. [22] by considering the regression model:
$$\log(T_i) = \theta_1 + \theta_2\log(X_i) + \varepsilon_i, \quad i = 1, \ldots, 40.$$
We considered that the positive response variable $T$ followed the distribution:
$$T_i \sim \mathrm{LSHN}\left(\alpha, \theta_1 + \theta_2\log(X_i), \sigma\right).$$
For this data set, we fit the log-BS (LBS) model, the log-skewed BS (LSBS) of Lemonte [23], and the proposed LSHN distribution. The MLEs of the parameters of the fitted models are given in Table 3.
To compare the fitted models, we used the AIC and BIC criteria, which are given by:
$$\mathrm{AIC} = -2\ell(\hat{\theta}) + 2p \quad \text{and} \quad \mathrm{BIC} = -2\ell(\hat{\theta}) + p\log(n),$$
where $p$ is the number of parameters of the model in question and $n$ is the sample size. The best model is the one with the smallest AIC or BIC. According to the AIC and BIC criteria in Table 3, we can see that the asymmetric models LSBS and LSHN fit better than the LBS model, that is, the data present a larger degree of asymmetry than allowed by the BS model. We can conclude that the regression model with the LSHN error distribution provides a better fit than the regression model with the LSBS error distribution.
The significance of the variable $\log(X_i)$ on the response variable $T_i$ can be tested through the Wald statistic, $\hat{\theta}_2/\mathrm{se}(\hat{\theta}_2)$, which gives the value $12.618/1.371 = 9.203501$ with the respective $p$-value $< 0.0001$, so we conclude that the logarithm of the contact stress significantly affects the failure time of the hardened steel.
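The arithmetic of the Wald test can be reproduced in a few lines. The sketch below uses the estimate and standard error exactly as printed in the text and computes a two-sided $p$-value from the standard normal reference distribution; the function name is an assumption, and `math.erfc` is used so that very small tail probabilities are not rounded to zero.

```python
import math

def wald_test(estimate, std_error):
    """Wald statistic and two-sided p-value under a standard normal reference."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value

z, p_value = wald_test(12.618, 1.371)
print(round(z, 6), p_value < 1e-4)
```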
We recall that if Y i LSHN ( α , x i θ , σ ) , then:
$$Z_i = \frac{2}{\alpha}\sinh\left(\frac{\log(Y_i) - x_i^{\top}\theta}{\sigma}\right) \sim N(0, 1)$$
for $i = 1, \ldots, n$. Figure 5b plots the envelope of the random variable $Z_i$. The plot reveals that the LSHN regression model presents a good fit to the fatigue data. The plot in Figure 5a depicts the envelope for the log-BS regression model.

6.2. Body Fat Data

We considered the data set included in the sn package of R [24], available for download at http://azzalini.stat.unipd.it/SN/index.html (accessed on 20 March 2021). We considered only the data of the 37 rowing athletes in the AIS data set. We were interested in the prediction of the body fat percentage (Bfat) of each athlete by considering their lean body mass (lbm). For the analysis, we considered the random variable:
$$Y_i \sim \mathrm{LUSHN}(\alpha, \theta_1 + \theta_2 X_i, \sigma),$$
where $Y_i$ is the body fat percentage of the $i$-th athlete for $i = 1, \ldots, 37$. We also fit the beta regression model with logit link and the natural logarithm link to model the dispersion parameter. The MLEs of the parameters and their corresponding standard errors (in parentheses) were: for the beta regression model, $\hat{\theta}_1 = 0.3262\ (0.259)$, $\hat{\theta}_2 = 0.0313\ (0.003)$, and $\hat{\sigma} = 4.7027\ (0.232)$ with $\mathrm{AIC} = -136.56$ and $\mathrm{BIC} = -131.73$; for the LUSHN regression model, $\hat{\alpha} = 0.0576\ (0.025)$, $\hat{\theta}_1 = 0.2569\ (0.224)$, $\hat{\theta}_2 = 0.0311\ (0.003)$, and $\hat{\sigma} = 8.3982\ (3.651)$, with $\mathrm{AIC} = -138.994$ and $\mathrm{BIC} = -132.550$. According to the AIC and BIC criteria, the better model was the LUSHN regression model. Figure 6 plots the envelope of the variable $Z_i = \frac{2}{\alpha}\sinh\left(\frac{\log(-\log(1 - Y_i)) - \theta_1 - \theta_2 X_i}{\sigma}\right) \sim N(0, 1)$, in which we can see that the LUSHN regression model presents a good fit to the body fat data.
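The information criteria defined in Section 6.1 can be cross-checked against each other, since $\mathrm{BIC} = \mathrm{AIC} + p(\log(n) - 2)$. The sketch below applies this identity to the reported values, which are read here as negative numbers (for $n = 37$, the BIC must exceed the AIC), with $p = 3$ parameters for the beta regression model and $p = 4$ for the LUSHN regression model, counted from the estimates listed above. The function names are assumptions made for this example.

```python
import math

def aic(loglik, p):
    """Akaike information criterion."""
    return -2.0 * loglik + 2.0 * p

def bic(loglik, p, n):
    """Bayesian information criterion."""
    return -2.0 * loglik + p * math.log(n)

def bic_from_aic(aic_value, p, n):
    """BIC implied by a given AIC, using BIC = AIC + p * (log(n) - 2)."""
    return aic_value + p * (math.log(n) - 2.0)

n = 37
print(round(bic_from_aic(-136.56, 3, n), 2))    # beta regression, p = 3
print(round(bic_from_aic(-138.994, 4, n), 3))   # LUSHN regression, p = 4
```

Both implied BIC values agree with the reported ones to the printed precision, which supports the reading of the criteria as negative values.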

7. Conclusions

In this paper, two new families of bimodal distributions were introduced. The new families were generated by applying transformations to the unit-Birnbaum–Saunders distribution and are useful alternatives for modeling data bounded on the interval (0,1) or with positive support, due to their flexibility to fit data with a high degree of asymmetry and/or kurtosis. The main statistical properties of the families and the problem of parameter estimation were studied in detail by using the maximum likelihood method. The observed and expected information matrices were also derived. A small Monte Carlo simulation was carried out, showing that the maximum likelihood estimators had good asymptotic properties for moderate and large sample sizes. Extensions to regression models based on the new families of distributions were also presented. Furthermore, we showed that such families of distributions can provide a better fit to real data sets, especially when explanatory variables are used to explain the response variable in a regression model.

Author Contributions

All authors contributed equally to this work. All authors read and agreed to the published version of the manuscript.

Funding

The research of R.T.-F. and G.M.-F. was supported by the project "Resolución de Problemas de Situaciones Reales Usando Análisis Estadístico a través del Modelamiento Multidimensional de Tasas y Proporciones; Esquemas de Monitoreamiento para Datos Asimétricos no Normales y una Estrategia Didáctica para el Desarrollo del Pensamiento Lógico-Matemático", Universidad de Córdoba, Colombia, Code FCB-05-19.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Details about the available data are given in Section 6.

Acknowledgments

G.M.-F. and R.T.-F. acknowledge the support given by Universidad de Córdoba, Montería, Colombia.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Related Theorems

Theorem A1.
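Let $T \sim \text{BS}(\alpha, \beta)$. Then, $Y = \log(T) \sim \text{SHN}(\alpha, \gamma, \sigma = 2)$, where $\gamma = \log(\beta)$.
Proof. 
The density of a Birnbaum–Saunders distribution is:
$$f_T(t) = \frac{t^{-3/2}(t+\beta)}{2\alpha\sqrt{\beta}}\,\phi(a_t) = \frac{\exp(\alpha^{-2})}{2\alpha(2\pi\beta)^{1/2}}\,t^{-3/2}(t+\beta)\exp\left\{-\frac{1}{2\alpha^{2}}\left(\frac{t}{\beta}+\frac{\beta}{t}\right)\right\},$$
where $\phi(\cdot)$ is the pdf of the normal distribution and $a_t = \frac{1}{\alpha}\left(\sqrt{t/\beta}-\sqrt{\beta/t}\right)$.
Letting $Y = \log(T)$, then $T = e^{Y} = k^{-1}(y)$, and by applying the theorem for the transformation of random variables, it follows that:
$$
\begin{aligned}
f_Y(y) &= f_T\left(k^{-1}(y)\right)\left|\frac{d}{dy}k^{-1}(y)\right| \\
&= \frac{\exp(\alpha^{-2})}{2\alpha(2\pi\beta)^{1/2}}\left(\exp(y)\right)^{-3/2}\left(\exp(y)+\beta\right)\exp\left\{-\frac{1}{2\alpha^{2}}\left(\frac{\exp(y)}{\beta}+\frac{\beta}{\exp(y)}\right)\right\}\exp(y) \\
&= \left(2\alpha\sqrt{2\pi}\right)^{-1}\left[\beta^{-1}\exp(-y)\right]^{1/2}\left(\exp(y)+\beta\right)\exp\left\{-\frac{\alpha^{-2}}{2}\left[\exp(y-\log\beta)+\exp(-(y-\log\beta))\right]\right\}\exp(\alpha^{-2}) \\
&= \left(2\alpha\sqrt{2\pi}\right)^{-1}\left\{\left[\beta^{-1}\exp(y)\right]^{1/2}+\left[\beta\left(\exp(y)\right)^{-1}\right]^{1/2}\right\}\exp\left\{-2\alpha^{-2}\left[\frac{\exp(y-\log\beta)}{4}+\frac{\exp(-(y-\log\beta))}{4}\right]+\alpha^{-2}\right\} \\
&= \left(2\alpha\sqrt{2\pi}\right)^{-1}\left\{\left[\exp(y-\log\beta)\right]^{1/2}+\left[\exp(-(y-\log\beta))\right]^{1/2}\right\}\exp\left\{-2\alpha^{-2}\left[\frac{\exp(y-\log\beta)}{4}+\frac{\exp(-(y-\log\beta))}{4}-\frac{1}{2}\right]\right\} \\
&= \left(2\alpha\sqrt{2\pi}\right)^{-1}\left\{\exp\left(\frac{y-\log\beta}{2}\right)+\exp\left(-\frac{y-\log\beta}{2}\right)\right\}\exp\left\{-2\alpha^{-2}\left[\frac{\left(\exp(y-\log\beta)\right)^{1/2}}{2}-\frac{\left(\exp(-(y-\log\beta))\right)^{1/2}}{2}\right]^{2}\right\} \\
&= \left(2\alpha\sqrt{2\pi}\right)^{-1}\,2\cosh\left(\frac{y-\log\beta}{2}\right)\exp\left\{-2\alpha^{-2}\left[\frac{\exp\left(\frac{y-\log\beta}{2}\right)-\exp\left(-\frac{y-\log\beta}{2}\right)}{2}\right]^{2}\right\} \\
&= 2\left(2\alpha\sqrt{2\pi}\right)^{-1}\cosh\left(\frac{y-\log\beta}{2}\right)\exp\left\{-2\alpha^{-2}\sinh^{2}\left(\frac{y-\log\beta}{2}\right)\right\}.
\end{aligned}
$$
Thus, $Y \sim \text{SHN}(\alpha, \gamma = \log\beta, \sigma = 2)$. □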
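As a numerical sanity check of Theorem A1, the following Python sketch compares the density of $\log(T)$, obtained from the BS density by the change of variable, with the SHN density for $\sigma = 2$ and $\gamma = \log\beta$. The function names and the parameter values used in the loop are illustrative assumptions, not part of the original work.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def bs_pdf(t, alpha, beta):
    """Birnbaum-Saunders pdf: t^(-3/2) (t + beta) / (2 alpha sqrt(beta)) * phi(a_t)."""
    a_t = (math.sqrt(t / beta) - math.sqrt(beta / t)) / alpha
    return t ** -1.5 * (t + beta) / (2.0 * alpha * math.sqrt(beta)) * phi(a_t)

def shn_pdf(y, alpha, gamma, sigma):
    """Sinh-normal pdf: 2/(alpha sigma) cosh(w) phi((2/alpha) sinh(w)), w = (y - gamma)/sigma."""
    w = (y - gamma) / sigma
    return 2.0 / (alpha * sigma) * math.cosh(w) * phi(2.0 / alpha * math.sinh(w))

def log_bs_pdf(y, alpha, beta):
    """Density of Y = log(T): f_T(e^y) multiplied by the Jacobian e^y."""
    t = math.exp(y)
    return bs_pdf(t, alpha, beta) * t

# Illustrative parameter values (assumed for the check only).
alpha, beta = 0.75, 2.0
for y in (-2.0, -0.5, 0.0, 0.7, 2.5):
    lhs = log_bs_pdf(y, alpha, beta)
    rhs = shn_pdf(y, alpha, math.log(beta), 2.0)
    print(f"y={y:5.2f}  log-BS={lhs:.12f}  SHN={rhs:.12f}")
```

The two columns should agree up to floating-point rounding for any $\alpha > 0$ and $\beta > 0$.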
Proposition A1.
The density function in Equation (5) integrates to one.
Proof. 
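Consider the density function $f(y)$ given by:
$$f(y) = \frac{2}{\alpha\sigma y}\cosh\left(\frac{\log(y)-\gamma}{\sigma}\right)\phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(y)-\gamma}{\sigma}\right)\right), \quad y > 0,$$
and let:
$$u = \frac{2}{\alpha}\sinh\left(\frac{\log(y)-\gamma}{\sigma}\right) \;\Longrightarrow\; du = \frac{2}{\alpha\sigma y}\cosh\left(\frac{\log(y)-\gamma}{\sigma}\right)dy;$$
then,
$$\int_{0}^{\infty} f(y)\,dy = \int_{0}^{\infty}\frac{2}{\alpha\sigma y}\cosh\left(\frac{\log(y)-\gamma}{\sigma}\right)\phi\left(\frac{2}{\alpha}\sinh\left(\frac{\log(y)-\gamma}{\sigma}\right)\right)dy = \int_{-\infty}^{\infty}\phi(u)\,du = 1.$$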
 □
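Proposition A1 can also be checked numerically. The sketch below integrates the density with a plain trapezoidal rule after the substitution $t = \log(y)$; the function names, the grid settings, and the parameter values (taken from the Figure 1 caption) are choices made for illustration only.

```python
import math

def lshn_pdf(y, alpha, gamma, sigma):
    """Density f(y) = 2/(alpha sigma y) cosh(w) phi((2/alpha) sinh(w)), w = (log y - gamma)/sigma."""
    w = (math.log(y) - gamma) / sigma
    u = 2.0 / alpha * math.sinh(w)
    normal_pdf = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return 2.0 / (alpha * sigma * y) * math.cosh(w) * normal_pdf

def integrate_lshn(alpha, gamma, sigma, half_width=8.0, n=20000):
    """Trapezoidal rule for the integral of f(e^t) e^t over t in gamma +/- half_width*sigma."""
    lo = gamma - half_width * sigma
    hi = gamma + half_width * sigma
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0  # end points get half weight
        y = math.exp(t)
        total += weight * lshn_pdf(y, alpha, gamma, sigma) * y  # e^t is the Jacobian
    return total * h

for alpha in (0.75, 2.0, 6.5):
    print(alpha, integrate_lshn(alpha, gamma=0.5, sigma=0.25))
```

Each printed integral should be numerically indistinguishable from one, for both the unimodal and the bimodal shapes of the density.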

References

  1. Birnbaum, Z.W.; Saunders, S.C. A new family of life distributions. J. Appl. Prob. 1969, 6, 319–327.
  2. Castillo, N.O.; Gómez, H.W.; Bolfarine, H. Epsilon Birnbaum–Saunders distribution family: Properties and inference. Stat. Pap. 2011, 52, 871–883.
  3. Vilca-Labra, F.; Leiva-Sánchez, V. A new fatigue life model based on the family of skew-elliptical distributions. Commun. Stat. Theory Methods 2006, 35, 229–244.
  4. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. An alpha-power extension for the Birnbaum–Saunders distribution. Stat. Am. J. Theor. Appl. Stat. 2014, 48, 896–912.
  5. Moreno-Arenas, G.; Martínez-Flórez, G.; Barrera-Causil, C. Proportional Hazard Birnbaum–Saunders distribution with application to the survival data analysis. Rev. Colomb. Estad. 2016, 39, 129–147.
  6. Rieck, J.R.; Nedelman, J.R. A log-linear model for the Birnbaum–Saunders distribution. Technometrics 1991, 33, 51–60.
  7. Santos, J.; Cribari-Neto, F. Hypothesis testing in log-Birnbaum–Saunders regressions. Commun. Stat. Simul. Comput. 2017, 46, 3990–4003.
  8. Balakrishnan, N.; Zhu, X. Inference for the Birnbaum–Saunders Lifetime Regression Model with Applications. Commun. Stat. Simul. Comput. 2015, 48, 2073–2100.
  9. Barros, M.; Paula, G.A.; Leiva, V. A new class of survival regression models with heavy-tailed errors: Robustness and diagnostics. Lifetime Data Anal. 2008, 14, 316–332.
  10. Leiva, V.; Vilca-Labra, F.; Balakrishnan, N.; Sanhueza, A. A skewed sinh-normal distribution and its properties and application to air pollution. Commun. Stat. Theory Methods 2010, 39, 426–443.
  11. Santana, L.; Vilca, F.; Leiva, V. Influence analysis in skew-Birnbaum–Saunders regression models and applications. J. Appl. Stat. 2011, 38, 1633–1649.
  12. Mazucheli, J.; Menezes, A.; Dey, S. The unit-Birnbaum–Saunders distribution with applications. Chil. J. Stat. 2018, 9, 47–57.
  13. Martínez-Flórez, G.; Bolfarine, H.; Gómez, H.W. Power-models for proportions with zero/one excess. Appl. Math. Inf. Sci. 2018, 12, 293–303.
  14. Ospina, R.; Cribari-Neto, F.; Vasconcellos, K.L.P. Improved point and interval estimation for a beta regression model. Comput. Stat. Data Anal. 2006, 51, 960–981.
  15. Simas, A.B.; Barreto-Souza, W.; Rocha, A.V. Improved estimators for a general class of beta regression models. Comput. Statist. Data Anal. 2010, 54, 348–366.
  16. Rocha, A.V.; Simas, A.B. Influence diagnostics in a general class of beta regression models. Test 2011, 20, 95–119.
  17. Cribari-Neto, F.; Souza, T.C. Testing inference in variable dispersion beta regressions. J. Stat. Comput. Sim. 2012, 82, 1827–1843.
  18. Ghosh, A. Robust inference under the beta regression model with application to health care studies. Stat. Methods Med. Res. 2019, 28, 871–888.
  19. Kim, J.M.; Baik, J.; Reller, M. Control charts of mean and variance using copula Markov SPC and conditional distribution by copula. Commun. Stat. Simul. Comput. 2021, 50, 85–102.
  20. Rieck, J.R. Statistical Analysis for the Birnbaum–Saunders Fatigue Life Distribution. Ph.D. Thesis, Department of Mathematical Sciences, Clemson University, Clemson, SC, USA, 1989.
  21. Mazucheli, J.; Leiva, V.; Alves, B.; Menezes, A.F.B. A new quantile Regression for modeling bounded data under a unit Birnbaum–Saunders distribution with applications in medicine and politics. Symmetry 2021, 13, 682.
  22. Chan, P.S.; Ng, H.K.T.; Balakrishnan, N.; Zhou, Q. Point and interval estimation for extreme-value regression model under Type-II censoring. Comput. Stat. Data Anal. 2012, 52, 4040–4058.
  23. Lemonte, A.J. A log-Birnbaum–Saunders regression model with asymmetric errors. J. Stat. Comput. Simul. 2012, 82, 1775–1787.
  24. R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018; Available online: http://www.R-project.org (accessed on 22 February 2021).
Figure 1. Probability density function of the $\text{LSHN}(\alpha, 0.5, 0.25)$ distribution for: (a) $\alpha = 2$ (solid line), $\alpha = 1.0$ (dashed line), and $\alpha = 0.75$ (dotted line); (b) $\alpha = 6.5$ (solid line), $\alpha = 4.5$ (dashed line), and $\alpha = 2.5$ (dotted line).
Figure 2. Hazard function of the $\text{LSHN}(\alpha, 0.5, 0.25)$ distribution for: (a) $\alpha = 2$ (solid line), $\alpha = 1.0$ (dashed line), and $\alpha = 0.75$ (dotted line); (b) $\alpha = 6.5$ (solid line), $\alpha = 4.5$ (dashed line), and $\alpha = 2.5$ (dotted line).
Figure 3. Probability density function of the $\text{USHN}(\alpha, 0.15, 0.5)$ distribution for: (a) $\alpha = 2$ (solid line), $\alpha = 1.25$ (dashed line), and $\alpha = 0.75$ (dotted line); (b) $\alpha = 4.5$ (solid line), $\alpha = 3.5$ (dashed line), and $\alpha = 2.5$ (dotted line).
Figure 4. Hazard function of the $\text{USHN}(\alpha, 0.15, 0.5)$ distribution for: (a) $\alpha = 2$ (solid line), $\alpha = 1.25$ (dashed line), and $\alpha = 0.75$ (dotted line); (b) $\alpha = 4.5$ (solid line), $\alpha = 3.5$ (dashed line), and $\alpha = 2.5$ (dotted line).
Figure 5. Envelopes of the residuals for: (a) the LBS distribution and (b) the LSHN distribution.
Figure 6. Envelope of the residuals for the LUSHN regression model.
Table 1. Empirical relative bias (RB), root of the mean squared error (RMSE), ratio between the standard deviation of the estimate and the average standard deviation (RSD), and coverage probability (CP) of the 95% confidence interval for the MLEs of $\alpha$ and $\sigma$ in the LUBS model.

| $\alpha$ | $n$ | RB ($\hat{\alpha}$) | RMSE ($\hat{\alpha}$) | RSD ($\hat{\alpha}$) | CP ($\hat{\alpha}$) | RB ($\hat{\sigma}$) | RMSE ($\hat{\sigma}$) | RSD ($\hat{\sigma}$) | CP ($\hat{\sigma}$) |
|---|---|---|---|---|---|---|---|---|---|
| 0.50 | 10 | 6.956 | 5.910 | 0.989 | 99.98 | −0.552 | 0.695 | 0.192 | 40.88 |
| | 25 | 2.789 | 2.052 | 0.904 | 100.0 | −0.496 | 0.614 | 0.356 | 46.26 |
| | 50 | 1.446 | 1.038 | 0.783 | 97.16 | −0.391 | 0.551 | 0.441 | 55.46 |
| | 75 | 1.063 | 0.766 | 0.756 | 94.54 | −0.334 | 0.492 | 0.447 | 60.00 |
| | 100 | 0.868 | 0.632 | 0.748 | 93.66 | −0.301 | 0.456 | 0.466 | 63.42 |
| | 200 | 0.513 | 0.400 | 0.724 | 92.76 | −0.198 | 0.402 | 0.531 | 70.58 |
| | 500 | 0.239 | 0.240 | 0.759 | 93.70 | −0.086 | 0.330 | 0.606 | 79.22 |
| 0.75 | 10 | 7.166 | 8.224 | 0.935 | 99.96 | −0.608 | 0.716 | 0.436 | 27.78 |
| | 25 | 2.312 | 2.546 | 0.992 | 100.0 | −0.444 | 0.575 | 0.487 | 43.60 |
| | 50 | 1.045 | 1.193 | 0.911 | 97.34 | −0.288 | 0.489 | 0.528 | 57.74 |
| | 75 | 0.679 | 0.814 | 0.868 | 96.06 | −0.211 | 0.443 | 0.553 | 65.68 |
| | 100 | 0.479 | 0.623 | 0.841 | 95.48 | −0.148 | 0.415 | 0.569 | 71.48 |
| | 200 | 0.237 | 0.384 | 0.841 | 95.26 | −0.058 | 0.368 | 0.671 | 79.38 |
| | 500 | 0.080 | 0.231 | 0.896 | 95.68 | 0.006 | 0.291 | 0.812 | 86.68 |
| 1.25 | 10 | 5.736 | 10.877 | 0.900 | 100.0 | −0.585 | 0.655 | 0.684 | 24.16 |
| | 25 | 1.493 | 2.854 | 1.009 | 100.0 | −0.326 | 0.495 | 0.662 | 50.20 |
| | 50 | 0.581 | 1.263 | 0.976 | 99.42 | −0.147 | 0.423 | 0.705 | 67.20 |
| | 75 | 0.316 | 0.833 | 0.965 | 98.06 | −0.067 | 0.375 | 0.745 | 76.40 |
| | 100 | 0.226 | 0.668 | 0.970 | 97.74 | −0.037 | 0.352 | 0.813 | 80.24 |
| | 200 | 0.090 | 0.415 | 0.989 | 97.14 | 0.003 | 0.280 | 0.957 | 86.68 |
| | 500 | 0.032 | 0.249 | 1.011 | 95.24 | 0.002 | 0.164 | 1.023 | 90.60 |
| 1.75 | 10 | 4.809 | 13.687 | 0.967 | 100.0 | −0.526 | 0.598 | 0.781 | 26.72 |
| | 25 | 1.160 | 3.313 | 1.064 | 100.0 | −0.255 | 0.436 | 0.784 | 56.12 |
| | 50 | 0.407 | 1.380 | 1.007 | 99.92 | −0.089 | 0.386 | 0.873 | 73.86 |
| | 75 | 0.230 | 0.964 | 1.033 | 99.04 | −0.038 | 0.330 | 0.944 | 80.50 |
| | 100 | 0.169 | 0.782 | 1.039 | 98.28 | −0.030 | 0.285 | 1.005 | 83.04 |
| | 200 | 0.076 | 0.477 | 1.018 | 95.98 | −0.014 | 0.187 | 1.037 | 88.46 |
| | 500 | 0.028 | 0.276 | 1.003 | 95.80 | −0.006 | 0.109 | 1.011 | 92.36 |
| 2.25 | 10 | 4.371 | 15.744 | 0.905 | 100.0 | −0.484 | 0.562 | 0.858 | 29.96 |
| | 25 | 1.004 | 3.813 | 1.082 | 100.0 | −0.206 | 0.422 | 0.887 | 61.10 |
| | 50 | 0.361 | 1.663 | 1.071 | 99.94 | −0.074 | 0.342 | 0.998 | 76.34 |
| | 75 | 0.208 | 1.122 | 1.048 | 98.20 | −0.046 | 0.272 | 1.053 | 82.56 |
| | 100 | 0.130 | 0.858 | 1.025 | 96.94 | −0.025 | 0.226 | 1.049 | 86.28 |
| | 200 | 0.065 | 0.539 | 1.004 | 96.24 | −0.018 | 0.143 | 1.012 | 89.76 |
| | 500 | 0.026 | 0.330 | 1.04 | 94.88 | −0.007 | 0.090 | 1.038 | 92.28 |
| 2.75 | 10 | 4.028 | 18.459 | 0.934 | 100.0 | −0.444 | 0.531 | 0.891 | 34.52 |
| | 25 | 0.900 | 4.205 | 1.051 | 100.0 | −0.177 | 0.398 | 0.970 | 64.02 |
| | 50 | 0.327 | 1.882 | 1.072 | 98.60 | −0.075 | 0.277 | 1.031 | 78.30 |
| | 75 | 0.187 | 1.273 | 1.046 | 97.28 | −0.042 | 0.225 | 1.055 | 84.56 |
| | 100 | 0.137 | 1.009 | 1.025 | 96.60 | −0.035 | 0.187 | 1.052 | 87.18 |
| | 200 | 0.066 | 0.632 | 1.013 | 96.36 | −0.021 | 0.122 | 1.009 | 91.08 |
| | 500 | 0.027 | 0.368 | 0.999 | 95.98 | −0.009 | 0.076 | 1.006 | 93.40 |
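The four summaries reported in the simulation tables can be computed from Monte Carlo replicates along the following lines. This is a minimal sketch under assumptions that the text does not spell out: RSD is read as the empirical standard deviation of the estimates divided by the average estimated standard error, the 95% interval is taken to be a Wald-type interval (estimate ± 1.96 standard errors), and the function name `simulation_summary` is hypothetical.

```python
import math
from statistics import mean, stdev

def simulation_summary(estimates, std_errors, true_value, z=1.959964):
    """Return RB, RMSE, RSD and CP (in percent) for one parameter.

    estimates  : MLEs obtained in each Monte Carlo replicate
    std_errors : estimated standard errors, one per replicate
    true_value : parameter value used to generate the data (must be non-zero for RB)
    """
    rb = (mean(estimates) - true_value) / true_value
    rmse = math.sqrt(mean([(e - true_value) ** 2 for e in estimates]))
    # Assumed direction of the ratio: empirical SD over average estimated SE.
    rsd = stdev(estimates) / mean(std_errors)
    # Assumed Wald-type interval: estimate +/- z * standard error.
    covered = [abs(e - true_value) <= z * s for e, s in zip(estimates, std_errors)]
    cp = 100.0 * sum(covered) / len(covered)
    return {"RB": rb, "RMSE": rmse, "RSD": rsd, "CP": cp}

# Toy replicates, for illustration only (not data from the study).
print(simulation_summary([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], true_value=2.0))
```

Under this reading, an RSD close to one indicates that the estimated standard errors track the sampling variability of the estimator, and a CP close to 95 indicates well-calibrated intervals.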
Table 2. Empirical relative bias (RB), root of the mean squared error (RMSE), ratio between the standard deviation of the estimate and the average standard deviation (RSD), and coverage probability (CP) of the 95% confidence interval for the MLEs of $\theta_0$ and $\theta_1$ in the LUBS model.

| $\alpha$ | $n$ | RB ($\hat{\theta}_0$) | RMSE ($\hat{\theta}_0$) | RSD ($\hat{\theta}_0$) | CP ($\hat{\theta}_0$) | RB ($\hat{\theta}_1$) | RMSE ($\hat{\theta}_1$) | RSD ($\hat{\theta}_1$) | CP ($\hat{\theta}_1$) |
|---|---|---|---|---|---|---|---|---|---|
| 0.50 | 10 | −0.022 | 0.186 | 1.232 | 81.92 | 0.201 | 0.331 | 1.234 | 81.74 |
| | 25 | −0.011 | 0.106 | 1.121 | 89.08 | 0.065 | 0.184 | 1.107 | 89.16 |
| | 50 | −0.006 | 0.073 | 1.069 | 91.80 | 0.029 | 0.126 | 1.063 | 92.26 |
| | 75 | −0.001 | 0.058 | 1.048 | 93.38 | 0.002 | 0.101 | 1.050 | 93.20 |
| | 100 | −0.002 | 0.050 | 1.043 | 93.34 | 0.007 | 0.086 | 1.026 | 94.10 |
| | 200 | −0.001 | 0.035 | 1.018 | 94.14 | 0.000 | 0.060 | 1.016 | 94.50 |
| | 500 | 0.000 | 0.022 | 1.008 | 94.48 | 0.001 | 0.038 | 1.016 | 94.64 |
| 0.75 | 10 | −0.002 | 0.266 | 1.444 | 75.54 | 0.113 | 0.473 | 1.432 | 77.38 |
| | 25 | −0.006 | 0.156 | 1.222 | 84.92 | 0.036 | 0.272 | 1.207 | 85.72 |
| | 50 | 0.000 | 0.106 | 1.109 | 90.56 | 0.000 | 0.183 | 1.106 | 90.74 |
| | 75 | −0.002 | 0.085 | 1.076 | 92.18 | 0.011 | 0.149 | 1.087 | 91.88 |
| | 100 | −0.002 | 0.071 | 1.033 | 93.44 | 0.003 | 0.124 | 1.037 | 93.70 |
| | 200 | 0.001 | 0.050 | 1.011 | 94.80 | −0.004 | 0.087 | 1.020 | 94.56 |
| | 500 | 0.000 | 0.031 | 1.004 | 94.86 | −0.001 | 0.054 | 0.999 | 94.88 |
| 1.25 | 10 | −0.011 | 0.405 | 1.717 | 68.84 | 0.018 | 0.712 | 1.655 | 70.64 |
| | 25 | 0.003 | 0.241 | 1.299 | 83.14 | −0.019 | 0.421 | 1.291 | 83.76 |
| | 50 | 0.002 | 0.158 | 1.131 | 89.62 | −0.011 | 0.276 | 1.135 | 89.92 |
| | 75 | 0.002 | 0.126 | 1.080 | 91.74 | −0.015 | 0.219 | 1.082 | 92.06 |
| | 100 | 0.001 | 0.107 | 1.050 | 93.22 | −0.006 | 0.185 | 1.046 | 93.30 |
| | 200 | 0.000 | 0.073 | 1.012 | 94.60 | −0.004 | 0.128 | 1.017 | 94.38 |
| | 500 | 0.001 | 0.047 | 1.014 | 94.58 | −0.011 | 0.081 | 1.011 | 94.46 |
| 1.75 | 10 | 0.012 | 0.519 | 1.805 | 66.92 | −0.008 | 0.925 | 1.752 | 67.98 |
| | 25 | 0.005 | 0.298 | 1.321 | 83.64 | −0.063 | 0.518 | 1.304 | 83.42 |
| | 50 | 0.001 | 0.190 | 1.124 | 90.26 | 0.000 | 0.335 | 1.133 | 90.14 |
| | 75 | 0.005 | 0.153 | 1.091 | 91.64 | −0.02 | 0.264 | 1.077 | 92.06 |
| | 100 | 0.000 | 0.130 | 1.062 | 92.66 | 0.000 | 0.225 | 1.057 | 92.76 |
| | 200 | 0.001 | 0.091 | 1.040 | 93.62 | −0.003 | 0.156 | 1.030 | 94.10 |
| | 500 | 0.000 | 0.056 | 1.006 | 94.90 | 0.001 | 0.096 | 1.006 | 95.00 |
| 2.25 | 10 | 0.005 | 0.597 | 1.867 | 66.88 | −0.019 | 1.044 | 1.791 | 68.74 |
| | 25 | 0.000 | 0.327 | 1.313 | 82.96 | 0.004 | 0.568 | 1.284 | 84.32 |
| | 50 | −0.007 | 0.210 | 1.124 | 90.60 | 0.038 | 0.366 | 1.118 | 90.50 |
| | 75 | −0.004 | 0.168 | 1.089 | 91.80 | 0.031 | 0.293 | 1.089 | 92.00 |
| | 100 | 0.000 | 0.144 | 1.070 | 92.52 | −0.006 | 0.249 | 1.063 | 93.06 |
| | 200 | −0.002 | 0.100 | 1.047 | 93.78 | 0.012 | 0.173 | 1.044 | 93.28 |
| | 500 | 0.000 | 0.062 | 1.017 | 94.94 | 0.000 | 0.106 | 1.008 | 94.58 |
| 2.75 | 10 | 0.002 | 0.654 | 1.879 | 67.68 | −0.017 | 1.152 | 1.800 | 69.18 |
| | 25 | 0.000 | 0.338 | 1.257 | 85.32 | −0.031 | 0.601 | 1.265 | 85.52 |
| | 50 | −0.005 | 0.225 | 1.133 | 90.16 | 0.037 | 0.392 | 1.125 | 90.48 |
| | 75 | −0.003 | 0.179 | 1.092 | 92.34 | 0.025 | 0.314 | 1.100 | 92.24 |
| | 100 | 0.003 | 0.150 | 1.059 | 93.16 | −0.023 | 0.258 | 1.049 | 93.20 |
| | 200 | 0.000 | 0.101 | 1.010 | 94.44 | −0.009 | 0.176 | 1.010 | 94.86 |
| | 500 | 0.000 | 0.064 | 1.013 | 94.78 | 0.004 | 0.112 | 1.014 | 94.60 |
Table 3. MLE (standard error) for the LBS, LSBS, and LSHN models.

| Parameters | LBS | LSBS | LSHN |
|---|---|---|---|
| $\alpha$ | 1.279 (0.143) | 2.011 (0.313) | 0.228 (0.076) |
| $\theta_1$ | 0.097 (0.170) | −0.961 (0.166) | 0.296 (0.159) |
| $\theta_2$ | −14.116 (1.571) | −13.870 (1.602) | −12.618 (1.371) |
| $\lambda / \sigma$ | – | −0.932 (0.174) | 8.675 (2.933) |
| AIC | 129.235 | 125.360 | 120.099 |
| BIC | 134.296 | 132.115 | 126.854 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
