A New Family of Modified Slash Distributions with Applications

Reyes, Jimmy; Iriarte, Yuri A.

doi:10.3390/math11133018


by Jimmy Reyes ^† and Yuri A. Iriarte ^*,†

Departamento de Estadística y Ciencia de Datos, Facultad de Ciencias Básicas, Universidad de Antofagasta, Antofagasta 1240000, Chile

^* Author to whom correspondence should be addressed.

^† These authors contributed equally to this work.

Mathematics 2023, 11(13), 3018; https://doi.org/10.3390/math11133018

Submission received: 30 May 2023 / Revised: 26 June 2023 / Accepted: 5 July 2023 / Published: 7 July 2023


Abstract:

This article presents a new family of symmetric heavy-tailed distributions. This model is based on the ratio of two independent random variables; one with a normal distribution in the numerator and another with a Birnbaum–Saunders distribution in the denominator. The result is a new slash-like distribution capable of modeling high levels of kurtosis, so it can be considered as a viable alternative to other heavy-tailed distributions in the literature. Fundamental properties such as density and raw moments are derived. Parameter estimation is performed using the moment and maximum likelihood methods. A simulation study to evaluate the behavior of the estimators is carried out. Finally, the utility of the new distribution is illustrated by fitting two real datasets.

Keywords:

Birnbaum–Saunders distribution; kurtosis; maximum likelihood; modified slash distribution; moments; slash distribution

MSC:

62E10; 62F10

1. Introduction

The slash distribution has a symmetrical bell shape similar to the normal, but with the distinctive feature of having heavier tails. As a consequence, the slash distribution may perform better when modeling data that exhibits high kurtosis levels. Specifically, the random variable X follows the slash distribution with kurtosis parameter

q > 0

, denoted as

X \sim S (q)

, if it is represented as

X = \frac{Z}{U^{1 / q}},

(1)

where

Z \sim normal (0, 1)

and

U \sim uniform (0, 1)

are independent.

From Equation (1), it can be verified that

X \to Z

as

q \to \infty

, that is, the standard normal distribution is a limit case of the slash distribution. If

q = 1

, then X follows the canonic slash distribution proposed by Rogers and Tukey [1].
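As a small illustration of Equation (1), the following R sketch draws from the S(q) distribution by simulating the ratio directly. The function name `rslash` is our own and does not come from the article or from any package. For q > 2 the variance of X is q/(q − 2), which follows from E[Z^2] E[U^{-2/q}], so the sample variance gives a rough sanity check.

```r
# Sketch: draw n values from the slash distribution S(q) via Equation (1).
# The name rslash is hypothetical (not from the article or a package).
rslash <- function(n, q) {
  stopifnot(q > 0)
  z <- rnorm(n)          # Z ~ normal(0, 1)
  u <- runif(n)          # U ~ uniform(0, 1), independent of Z
  z / u^(1 / q)          # X = Z / U^(1/q)
}

set.seed(1)
x_heavy <- rslash(1e5, q = 10)    # theoretical variance q / (q - 2) = 1.25
x_light <- rslash(1e5, q = 1e6)   # nearly standard normal for very large q
c(var(x_heavy), var(x_light))
```

Smaller values of q give heavier tails; for q ≤ 2 the variance does not exist, so the sample variance is no longer a meaningful check.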

The literature referring to slash distribution is extensive. Several properties and extensions of this distribution can be found in Mosteller and Tukey [2], Johnson et al. [3], Kafadar [4], Wang and Genton [5], Gómez et al. [6], Arslan [7], Genç [8], Arslan and Genç [9], and Genç [10], among others.

In recent years, several authors have proposed distributions with even heavier tails than the slash distribution, obtained by modifying the distribution of the denominator in Equation (1). For example, Reyes et al. [11] introduced the modified-slash distribution by assuming that U in Equation (1) has the exponential distribution with mean 1/2. The authors illustrate that the modified-slash distribution, which has the same number of parameters as the slash distribution, can perform better when fitting data that exhibit a high kurtosis level. Similarly, Rojas et al. [12] introduced the extended-slash distribution by replacing

U^{1 / q}

by W in Equation (1), where W has a beta distribution with mean

q {(q + q_{2})}^{- 1}

and variance

q q_{2} {(q + q_{2})}^{- 2} {(q + q_{2} + 1)}^{- 1}

, with

q, q_{2} > 0

. Here, the authors show that the extended-slash distribution—by presenting two kurtosis parameters—can perform better than the slash distribution when fitting high kurtosis data. In this same line, Reyes et al. [13] proposed the generalized modified-slash distribution by replacing

U^{1 / q}

by W in Equation (1), where W has a gamma distribution with mean 2 and variance

2 / β

, with

β > 0

. Here, the authors show that the generalized modified-slash distribution can perform better than the slash and modified-slash distributions in fitting high kurtosis data.

The aim of this paper is to improve the modelling of high kurtosis data by introducing a new heavy-tailed modification of the slash distribution. This modification has such heavy tails that it can outperform even the modified-slash, extended-slash, and generalized modified-slash distributions. To achieve this end, we introduce a type II modified-slash distribution by replacing

U^{1 / q}

by V in Equation (1), where V has a Birnbaum–Saunders distribution.

The Birnbaum–Saunders distribution [14,15], originally derived to model the time to failure due to material fatigue, has played an important role in reliability studies. A considerable number of studies on properties, applications, and generalizations of this distribution can be found in the literature; see, for example, Díaz-García and Leiva-Sánchez [16], Sanhueza et al. [17], and Gómez et al. [18], to name a few.

Specifically, the random variable V has a Birnbaum–Saunders distribution with a shape parameter

α > 0

and scale parameter

β > 0

, denoted as

V \sim BS (α, β)

, if it can be represented as

\begin{matrix} V = β {\{\frac{α Z}{2} + {[{(\frac{α Z}{2})}^{2} + 1]}^{\frac{1}{2}}\}}^{2}, \end{matrix}

(2)

where Z has a standard normal distribution.

The pdf of V is

\begin{matrix} f (v; α, β) = \frac{v + β}{2 α \sqrt{β v^{3}}} ϕ (\frac{1}{α} [\sqrt{\frac{v}{β}} - \sqrt{\frac{β}{v}}]), v > 0, \end{matrix}

(3)

where

ϕ (\cdot)

represents the pdf of standard normal distribution.
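Equations (2) and (3) can be checked numerically with a short R sketch. The helper names `rBS` and `dBS` are hypothetical and only mirror the two equations. The comparison value β(1 + α²/2) is the well-known mean of the BS(α, β) distribution, used here as an external sanity check rather than something stated in this article.

```r
# Sketch of Equations (2) and (3); rBS and dBS are hypothetical helper names.
rBS <- function(n, alpha, beta) {
  z <- rnorm(n)                                # Z ~ normal(0, 1)
  a <- alpha * z / 2
  beta * (a + sqrt(a^2 + 1))^2                 # Equation (2)
}

dBS <- function(v, alpha, beta) {
  (v + beta) / (2 * alpha * sqrt(beta * v^3)) *
    dnorm((sqrt(v / beta) - sqrt(beta / v)) / alpha)   # Equation (3)
}

set.seed(1)
v <- rBS(1e5, alpha = 0.5, beta = 2)
c(mean(v), 2 * (1 + 0.5^2 / 2))                # sample mean vs. beta * (1 + alpha^2 / 2)
integrate(dBS, lower = 0, upper = Inf, alpha = 0.5, beta = 2)$value  # should be close to 1
```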

We provide evidence that by considering a Birnbaum–Saunders distribution—with representation and pdf given by Equations (2) and (3)—in the denominator of Equation (1), a new extremely heavy-tailed distribution is defined, which can perform better than the slash, extended-slash, modified-slash, and generalized modified-slash distributions when fitting data that show high levels of kurtosis.

The remainder of this paper is organized as follows. In Section 2, we propose the new distribution and study some of its fundamental properties, such as stochastic representation, density, and raw moments. Section 3 discusses the problem of parameter estimation via the moment and maximum likelihood methods. In addition, a simulation study is carried out to evaluate the behavior of the estimators. In Section 4, two application examples aimed at evaluating the comparative performance of the proposed distribution are considered. Final comments are given in Section 5.

2. Type II Modified Slash Distribution

This section proposes the new distribution and derives some of its fundamental properties.

2.1. Representation and Density

Definition 1.

A random variable Y has a type II modified-slash distribution with location parameter

μ \in R

, scale parameter

σ > 0

, and kurtosis parameter

α > 0

, denoted as

Y \sim T 2 MS (μ, σ, α)

, if it can be represented as

\begin{matrix} Y = μ + σ \frac{Z}{V}, \end{matrix}

(4)

where

Z \sim normal (0, 1)

and

V \sim BS (2 α, 1)

are independent.

Remark 1.

From the stochastic representation of

V \sim BS (2 α, 1)

, it can be seen that

V \to 1

as

α ↓ 0

. Thus, from Equation (4), it follows that

Y \to (μ + σ Z)

as

α ↓ 0

, which means that the T2MS

(μ, σ, α)

distribution converges to the normal(μ,

σ^{2}

) distribution as α decreases to 0.

Proposition 1.

Let

Y \sim T 2 MS (μ, σ, α)

. Then, the pdf of Y is given by

f_{Y} (y; μ, σ, α) = \frac{1}{4 σ α} \int_{0}^{\infty} \frac{t + 1}{\sqrt{t}} ϕ (\frac{1}{2 α} [\sqrt{t} - \frac{1}{\sqrt{t}}]) ϕ (z t) d t,

(5)

where

y \in R

,

z = \frac{y - μ}{σ}

,

μ \in R

,

σ > 0

, and

ϕ (\cdot)

represents the pdf of the standard normal distribution.

Proof.

From Equation (4), taking into account that

T = V \sim BS (2 α, 1)

and applying the Jacobian technique (see Ross [19], Equation (7.1)), we obtain that the joint pdf of

(Y, T)

is

f_{Y, T} (y, t; μ, σ, α) = \frac{t + 1}{4 σ α \sqrt{t}} ϕ ((\frac{y - μ}{σ}) t) ϕ (\frac{1}{2 α} [\sqrt{t} - \frac{1}{\sqrt{t}}]), y \in R, t > 0,

and marginalizing with respect to T, the result in Equation (5) is obtained. □
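For readers who want the intermediate step, the change of variables behind the joint pdf can be written out as follows. This is our own expansion of the argument, using only Equation (3) with parameters (2α, 1) and the representation in Equation (4).

```latex
\begin{aligned}
&\text{Inverse map: } Z = \frac{(Y-\mu)\,T}{\sigma},\quad V = T,
\qquad |J| = \left|\frac{\partial(Z,V)}{\partial(Y,T)}\right| = \frac{t}{\sigma},\\[4pt]
&f_{V}(t;2\alpha,1) = \frac{t+1}{4\alpha\sqrt{t^{3}}}\,
\phi\!\left(\frac{1}{2\alpha}\left[\sqrt{t}-\frac{1}{\sqrt{t}}\right]\right),\\[4pt]
&f_{Y,T}(y,t) = \phi\!\left(\frac{(y-\mu)\,t}{\sigma}\right) f_{V}(t;2\alpha,1)\,\frac{t}{\sigma}
= \frac{t+1}{4\sigma\alpha\sqrt{t}}\,
\phi\!\left(\Big(\frac{y-\mu}{\sigma}\Big)t\right)
\phi\!\left(\frac{1}{2\alpha}\left[\sqrt{t}-\frac{1}{\sqrt{t}}\right]\right).
\end{aligned}
```

The factor t/σ from the Jacobian cancels one power of t in the denominator of the Birnbaum–Saunders pdf, which is why the integrand in Equation (5) contains \sqrt{t} rather than \sqrt{t^{3}}.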

Corollary 1.

If

α = 1

, then Y follows the canonic type II modified slash (CT2MS) distribution with pdf

\begin{matrix} f_{Y} (y; μ, σ) = \frac{1}{4 σ} \int_{0}^{\infty} \frac{t + 1}{\sqrt{t}} ϕ (\frac{\sqrt{t} - \sqrt{t^{- 1}}}{2}) ϕ (z t) d t . \end{matrix}

We compute the pdf of the T2MS

(μ, σ, α)

distribution using the stats::integrate( ) function of the R programming language [20]. The R code used is provided in Appendix A.1.

Figure 1 shows the behavior of the T2MS pdf with

μ = 0

,

σ = 1

and

α = 0.3

, 0.4, and 0.5, respectively. In the figure, it can be seen that the pdf of the T2MS distribution has a symmetrical bell shape. Note that as

α

increases, the weight of the tails and the density value associated with the mode also increase, which means that the kurtosis level of the distribution is increased.

2.2. Moments

In this section, we derive the raw moments of the T2MS

(μ, σ, α)

distribution, which are used to calculate some associated measures such as the mean, variance, Fisher’s skewness, and kurtosis coefficients.

Proposition 2.

Let

X \sim T 2 MS (0, 1, α)

. Then, the kth raw moment of X is given by

\begin{matrix} E (X^{k}) = \{\begin{matrix} 0, & if k = 2 j + 1, j = 0, 1, 2, \dots \\ \frac{(2 j)!}{2^{j} j!} m_{2 j} (α), & if k = 2 j, j = 1, 2, 3, \dots \end{matrix} \end{matrix}

(6)

where

\begin{matrix} m_{2 j} (α) & = & \sum_{y = 0}^{2 j} (\binom{4 j}{2 y}) \sum_{s = 0}^{y} (\binom{y}{s}) α^{2 (2 j + s - y)} \frac{[2 (2 j + s - y)]!}{2^{2 j + s - y} (2 j + s - y)!} . \end{matrix}

(7)

Proof.

From Definition 1 with

μ = 0

and

σ = 1

, by the condition of independence of

Z \sim normal (0, 1)

and

V \sim BS (2 α, 1)

, it follows that

\begin{matrix} E [X^{2 j}] = E [{(\frac{Z}{V})}^{2 j}] = E [Z^{2 j}] E [V^{- 2 j}] . \end{matrix}

So, we observe that:

For an odd k, $k = 2 j + 1$ with $j = 0, 1, 2, \dots$ , $E [Z^{2 j + 1}] = 0$ since $Z \sim normal (0, 1)$ . Thus, the kth raw moment of X is equal to 0.
For an even k, $k = 2 j$ with $j = 1, 2, \dots$ , the result is obtained by noting that $E [Z^{2 j}] = (2 j)! / (2^{j} j!)$ and $E [V^{- 2 j}] = E [V^{2 j}] = m_{2 j} (α)$ , where $m_{2 j} (\cdot)$ is as in Equation (7), which finally leads to the result in Equation (6).

□

From Proposition 2, the following corollaries are immediate:

Corollary 2.

Let

Y \sim T 2 MS (μ, σ, α)

. Then, the rth raw moment of Y can be written as

E [Y^{r}] = μ^{r} + \sum_{k = 1}^{r} (\binom{r}{k}) μ^{r - k} σ^{k} E (X^{k}), r = 1, 2, 3, \dots

where

E (X^{k})

is as in Equation (6).

Corollary 3.

Let

Y \sim T 2 MS (μ, σ, α)

. Then, the expectation and the variance of Y can be written as

\begin{matrix} E [Y] = μ and V (Y) = σ^{2} (24 α^{4} + 8 α^{2} + 1) . \end{matrix}

Corollary 4.

Let

Y \sim T 2 MS (μ, σ, α)

. Then, Fisher’s skewness (S) and kurtosis (K) coefficients of Y are given by

S = 0 and K = \frac{40320 α^{8} + 11520 α^{6} + 1440 α^{4} + 96 α^{2} + 3}{{(24 α^{4} + 8 α^{2} + 1)}^{2}} .

From Corollary 4, it is easy to see that the Fisher’s kurtosis coefficient of the T2MS

(μ, σ, α)

distribution takes values in the interval

(3, 70)

depending on the assumed value for

α

. Consequently, the T2MS

(μ, σ, α)

distribution presents heavier tails than the normal distribution. Figure 2 describes the behavior of the Fisher’s kurtosis coefficient of the T2MS

(μ, σ, α)

distribution. Here, it is seen that the kurtosis level of the T2MS

(μ, σ, α)

distribution increases as

α

increases.
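The double sum in Equation (7) is easy to evaluate directly, which gives a quick way to confirm the closed forms in Corollaries 3 and 4 and the kurtosis range (3, 70). The sketch below is ours; the function name `m2j` is hypothetical.

```r
# Sketch: evaluate m_{2j}(alpha) from Equation (7) and compare with
# Corollaries 3 and 4. The function name m2j is hypothetical.
m2j <- function(j, alpha) {
  total <- 0
  for (y in 0:(2 * j)) {
    for (s in 0:y) {
      k <- 2 * j + s - y                       # shared index in Equation (7)
      total <- total + choose(4 * j, 2 * y) * choose(y, s) *
        alpha^(2 * k) * factorial(2 * k) / (2^k * factorial(k))
    }
  }
  total
}

alpha <- 0.4
var_std  <- m2j(1, alpha)                      # E[X^2] = E[Z^2] * m_2(alpha), E[Z^2] = 1
kurt_std <- 3 * m2j(2, alpha) / var_std^2      # E[X^4] / E[X^2]^2, with E[Z^4] = 3

c(var_std, 24 * alpha^4 + 8 * alpha^2 + 1)     # the two entries should agree
kurt_std                                       # Fisher's kurtosis for alpha = 0.4
3 * m2j(2, 1e-4) / m2j(1, 1e-4)^2              # close to 3 (normal limit)
3 * m2j(2, 1e4) / m2j(1, 1e4)^2                # close to 70 = 40320 / 24^2
```

The upper bound 70 is the ratio of the leading coefficients of the numerator and denominator in Corollary 4, 40320/576.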

3. Parameter Estimation

Initially, this section discusses the parameter estimation for the T2MS

(μ, σ, α)

distribution via the moment and maximum likelihood methods. Secondly, a simulation study is carried out in order to evaluate the behavior of the provided estimators.

3.1. Moment Estimation

Proposition 3.

Let

y_{1}, \dots, y_{n}

be an observed random sample for the random variable

Y \sim T 2 MS (μ, σ, α)

. Then, the moment estimators

{\hat{μ}}_{M}

,

{\hat{σ}}_{M}

, and

{\hat{α}}_{M}

for μ, σ, and α satisfy the following equations:

\begin{matrix} k_{y} {(24 {\hat{α}}_{M}^{4} + 8 {\hat{α}}_{M}^{2} + 1)}^{2} & = & 40320 {\hat{α}}_{M}^{8} + 11520 {\hat{α}}_{M}^{6} + 1440 {\hat{α}}_{M}^{4} + 96 {\hat{α}}_{M}^{2} + 3, \end{matrix}

(8)

\begin{matrix} {\hat{σ}}_{M} & = & \sqrt{\frac{s_{y}^{2}}{24 {\hat{α}}_{M}^{4} + 8 {\hat{α}}_{M}^{2} + 1}}, \end{matrix}

(9)

\begin{matrix} {\hat{μ}}_{M} & = & \bar{y}, \end{matrix}

(10)

where

\bar{y}

is the sample mean,

s_{y}^{2}

is the sample variance, and

k_{y}

is the sample Fisher’s kurtosis coefficient.

Proof.

Equations (8)–(10) are direct consequences of equating the mean, variance, and Fisher’s kurtosis coefficient of the T2MS distribution, given in Corollaries 3 and 4, with the corresponding mean, variance, and Fisher’s kurtosis coefficient of the sample. □
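A minimal base-R sketch of the moment estimators is given below. It uses `stats::uniroot()` as a stand-in for a root finder and computes the sample kurtosis by hand with divisor n; both choices, and the function name `moment_T2MS`, are our assumptions. Because the theoretical kurtosis lies in (3, 70), Equation (8) has no solution when the sample kurtosis falls outside that range, and the sketch returns `NA` in that case.

```r
# Sketch of the moment estimators in Equations (8)-(10) using base R only.
# moment_T2MS is a hypothetical name; uniroot() is our choice of root finder.
kurtT2MS <- function(a) {
  (40320 * a^8 + 11520 * a^6 + 1440 * a^4 + 96 * a^2 + 3) /
    (24 * a^4 + 8 * a^2 + 1)^2
}

moment_T2MS <- function(y, upper = 50) {
  ybar <- mean(y)
  s2   <- mean((y - ybar)^2)                   # sample variance (divisor n)
  k    <- mean((y - ybar)^4) / s2^2            # sample Fisher kurtosis
  # Equation (8) is solvable here only if k lies in (3, kurtT2MS(upper)).
  if (k <= 3 || k >= kurtT2MS(upper)) {
    return(c(mu = ybar, sigma = NA_real_, alpha = NA_real_))
  }
  alpha <- uniroot(function(a) kurtT2MS(a) - k,
                   lower = 1e-8, upper = upper, tol = 1e-10)$root
  sigma <- sqrt(s2 / (24 * alpha^4 + 8 * alpha^2 + 1))   # Equation (9)
  c(mu = ybar, sigma = sigma, alpha = alpha)             # Equation (10) gives mu
}

# Small artificial sample with a few extreme values (illustration only)
y <- c(rep(-1, 48), rep(1, 48), rep(-5, 2), rep(5, 2))
moment_T2MS(y)
```

Returning `NA` for light-tailed samples makes the failure explicit; in practice one might instead start the likelihood optimization from a small default value of α.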

3.2. Maximum Likelihood Estimation

Let

y_{1}, \dots, y_{n}

be an observed random sample of size n on

Y \sim T 2 MS (μ, σ, α)

. The log-likelihood function for

θ = {(μ, σ, α)}^{'}

can be written as

\begin{matrix} ℓ (θ; y_{i}) = - n log (4) - n log (α) - n log (σ) + \sum_{i = 1}^{n} log \{\int_{0}^{\infty} g_{α} (t, z_{i}) d t\}, \end{matrix}

(11)

where

\begin{matrix} g_{α} (t, z_{i}) = \frac{(t + 1)}{\sqrt{t}} ϕ (\frac{\sqrt{t} - \sqrt{t^{- 1}}}{2 α}) ϕ (z_{i} t), \end{matrix}

and the components of the score vector

U (θ)

can be written as

\begin{matrix} U_{μ} (θ) & = & \frac{\partial ℓ (θ)}{\partial μ} = \frac{1}{σ} \sum_{i = 1}^{n} \frac{z_{i} r_{α} (z_{i})}{h_{α} (z_{i})}, \end{matrix}

(12)

\begin{matrix} U_{σ} (θ) & = & \frac{\partial ℓ (θ)}{\partial σ} = - \frac{n}{σ} + \frac{1}{σ} \sum_{i = 1}^{n} \frac{z_{i}^{2} r_{α} (z_{i})}{h_{α} (z_{i})}, \end{matrix}

(13)

\begin{matrix} U_{α} (θ) & = & \frac{\partial ℓ (θ)}{\partial α} = - \frac{n}{α} + \frac{1}{4 α^{3}} \sum_{i = 1}^{n} \frac{s_{α} (z_{i})}{h_{α} (z_{i})}, \end{matrix}

(14)

where

h_{α} (z_{i}) = \int_{0}^{\infty} g_{α} (t, z_{i}) d t, r_{α} (z_{i}) = \int_{0}^{\infty} t^{2} g_{α} (t, z_{i}) d t,

and s_{α} (z_{i}) = \int_{0}^{\infty} {(\sqrt{t} - \sqrt{t^{- 1}})}^{2} g_{α} (t, z_{i}) d t .

Thus, the maximum likelihood estimator

{\hat{θ}}_{M L} = {({\hat{μ}}_{M L}, {\hat{σ}}_{M L}, {\hat{α}}_{M L})}^{'}

of

θ = {(μ, σ, α)}^{'}

can be obtained by solving the system of equations

U (θ) = 0

. However, it is not possible to obtain a closed form for

\hat{θ}

, so the maximum likelihood estimates must be obtained by solving the system using numerical procedures.

Based on the approximation of the asymptotic variance of the maximum likelihood estimator, the interval estimation and the hypothesis tests for

μ

,

σ

, and

α

can be performed by computing the observed information matrix

J_{n} (θ)

. This matrix is given by

\begin{matrix} J_{n} (θ) = & (\begin{matrix} J_{μ μ} & J_{μ σ} & J_{μ α} \\ J_{μ σ} & J_{σ σ} & J_{σ α} \\ J_{μ α} & J_{σ α} & J_{α α} \end{matrix}), \\ J_{θ_{r} θ_{p}} = & - \frac{\partial^{2} ℓ (θ; y_{i})}{\partial θ_{r} \partial θ_{p}} |_{θ = {\hat{θ}}_{ML}}, r, p = 1, 2, 3, \end{matrix}

where

θ_{1} = μ

,

θ_{2} = σ

,

θ_{3} = α

,

ℓ (θ; y_{i})

is as in Equation (11), and the second partial derivatives are presented in Appendix B.

So, the observed covariance matrix is the inverse of

J_{n} (θ)

,

J_{n}^{- 1} (θ)

, and the diagonal elements of

J_{n}^{- 1} (\hat{θ})

are the variances of

\hat{μ}

,

\hat{σ}

, and

\hat{α}

, which we denote by

\hat{var} (\hat{μ})

,

\hat{var} (\hat{σ})

, and

\hat{var} (\hat{α})

, respectively. Then, the asymptotic

(1 - δ) 100 %

confidence intervals for

μ

,

σ

, and

α

are

\hat{μ} \pm z_{δ / 2} \sqrt{\hat{var} (\hat{μ})}

,

\hat{σ} \pm z_{δ / 2} \sqrt{\hat{var} (\hat{σ})}

, and

\hat{α} \pm z_{δ / 2} \sqrt{\hat{var} (\hat{α})}

, respectively, where

z_{δ / 2}

stands for the upper percentile

δ / 2

of the standard normal distribution.
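The interval construction above reduces to a few lines of linear algebra once an observed information matrix is available. The following R sketch shows that step in isolation; the helper name `wald_ci` and the numbers in the usage example are purely illustrative and are not results from the article.

```r
# Sketch: asymptotic Wald intervals from an observed information matrix.
# wald_ci is a hypothetical helper; `info` is the observed information matrix
# (the negative Hessian of the log-likelihood) evaluated at the ML estimates.
wald_ci <- function(est, info, level = 0.95) {
  se <- sqrt(diag(solve(info)))                # standard errors from the inverse matrix
  zq <- qnorm(1 - (1 - level) / 2)             # upper delta/2 quantile, delta = 1 - level
  cbind(estimate = est, lower = est - zq * se, upper = est + zq * se)
}

# Illustrative numbers only (not results from the article):
est  <- c(mu = 5.0, sigma = 1.0, alpha = 0.5)
info <- diag(c(400, 250, 100))
wald_ci(est, info)
```

When the negative log-likelihood is minimized with `stats::optim(..., hessian = TRUE)`, the returned `hessian` is a numerical approximation of this matrix and could be passed as `info`. Intervals for σ and α computed this way are not guaranteed to stay positive in small samples.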

3.3. Practical Considerations

Regarding the parameter estimate via the moment method, we calculate the root of Equation (8) using the rootSolve::uniroot.all( ) function [21] in the R programming language. Once the estimate

{\hat{α}}_{M}

is obtained, we use it to obtain the estimate

{\hat{σ}}_{M}

of

σ

from the computation of Equation (9).

Regarding the maximum likelihood estimation of the parameters of the T2MS

(μ, σ, α)

distribution, since the system of score equations does not lead to closed analytical expressions of the estimators, it is necessary to use a computational routine to obtain the root of this system. For this, we suggest the use of the rootSolve::multiroot( ) function [21] in the R programming language. This function implements the Newton–Raphson method to obtain an approximation of the root of the system of nonlinear equations to be solved.

In this case, one may alternatively prefer to address the optimization problem

{\max}_{θ} ℓ (θ; y_{i})

, subject to

μ \in R

,

σ > 0

and

α > 0

, where

ℓ (\cdot; \cdot)

is as in Equation (11). Here, we suggest using the stats::optim( ) function in the R programming language. In particular, we suggest the L-BFGS-B method [22], which allows the parameter space to be specified by box constraints. We use the moment estimates discussed in Section 3.1 as the values to initialize the iterative process.

The R codes used to obtain the estimates are provided in Appendix A.2.

3.4. Simulation Study

In this section, we consider a simulation study aimed at evaluating the behavior of the moment and maximum likelihood estimators of the T2MS

(μ, σ, α)

distribution parameters. We generate 1000 random samples from the T2MS

(μ, σ, α)

distribution under scenarios A

(μ = - 5, σ = 1, α = 0.5)

and B

(μ = 5, σ = 1, α = 0.2)

, and for each of the sample sizes n = 25, 50, 100, 200, and 400. The samples were generated considering the following steps, which were formulated from the stochastic representation of the T2MS

(μ, σ, α)

random variable:

1. Choose values for $μ$ , $σ$ , $α$ , and n.
2. Generate $w \sim normal (0, 1)$ .
3. Compute $v = {[α w + \sqrt{{(α w)}^{2} + 1}]}^{2}$ .
4. Generate $z \sim normal (0, 1)$ .
5. Compute $y = μ + σ \frac{z}{v}$ .
6. Repeat steps 2 to 5 n times.
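The generation steps above can be sketched as a single vectorized R function. The name `rT2MS` is hypothetical. The check at the end compares the sample mean and variance with E[Y] = μ and V(Y) = σ²(24α⁴ + 8α² + 1) from Corollary 3.

```r
# Sketch of the generation steps, vectorised; rT2MS is a hypothetical name.
rT2MS <- function(n, mu, sigma, alpha) {
  stopifnot(sigma > 0, alpha > 0)
  w <- rnorm(n)                                   # step 2
  v <- (alpha * w + sqrt((alpha * w)^2 + 1))^2    # step 3: V ~ BS(2 * alpha, 1)
  z <- rnorm(n)                                   # step 4
  mu + sigma * z / v                              # step 5
}

set.seed(123)
y <- rT2MS(1e5, mu = 5, sigma = 1, alpha = 0.2)   # scenario B of the study
c(mean(y), var(y))                                # compare with the line below
c(5, 1^2 * (24 * 0.2^4 + 8 * 0.2^2 + 1))          # theoretical mean and variance
```

Drawing `w` and `z` as two separate vectors keeps the numerator and denominator independent, as the stochastic representation requires.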

For each generated sample, we calculate the moment (M) and maximum likelihood (ML) estimates under the practical considerations of Section 3.3. Table 1 reports the average estimate (AE) and standard deviation (SD) of the 1000 moment and maximum likelihood estimates obtained in scenarios A and B for each sample size considered. The table supports the consistency of the estimators provided by both methods: as the sample size increases, the AEs approach the true parameter values and the SDs decrease toward zero. However, the maximum likelihood estimators are more efficient and less biased: their bias is smaller (especially in small samples) and their SDs are smaller.

4. Illustrations

In this section, we present two applications to real data that illustrate the usefulness of the type II modified slash (T2MS) distribution in fitting high kurtosis data. In each application, we compare the performance of the T2MS distribution with that of other heavy-tailed distributions, such as the slash (S), extended-slash (ES), modified-slash (MS), and generalized modified-slash (GMS) distributions. Below are the pdfs of these distributions:

1. The S pdf:

$\begin{matrix} f (y; μ, σ, q) = \frac{q}{σ} \int_{0}^{1} t^{q} ϕ (\frac{y - μ}{σ} t) d t, \end{matrix}$

where $y \in R$ , $μ \in R$ is a location parameter, $σ > 0$ is the scale parameter, $q > 0$ is a kurtosis parameter, and $ϕ (\cdot)$ is the pdf of the standard normal distribution.
2. The ES pdf [12]:

$\begin{matrix} f (y; μ, σ, q, q_{2}) = \frac{1}{σ B (q, q_{2})} \int_{0}^{1} ϕ (\frac{y - μ}{σ} t) t^{q} {(1 - t)}^{q_{2} - 1} d t, \end{matrix}$

where $y \in R$ , $μ \in R$ is a location parameter, $σ > 0$ is the scale parameter, $q, q_{2} > 0$ are kurtosis parameters, and $ϕ (\cdot)$ is the pdf of the standard normal distribution.
3. The MS pdf [11]:

$\begin{matrix} f (y; μ, σ, q) = \frac{2 q}{\sqrt{2 π} σ} \int_{0}^{\infty} t^{q} exp \{- \frac{1}{2} [{(\frac{y - μ}{σ})}^{2} t^{2} + 4 t^{q}]\} d t, \end{matrix}$

where $y \in R$ , $μ \in R$ is a location parameter, $σ > 0$ is the scale parameter, and $q > 0$ is a kurtosis parameter.
4. The GMS pdf [13]:

$\begin{matrix} f (y; μ, σ, q) = \frac{{(2 q)}^{q}}{\sqrt{2 π} σ Γ (q)} \int_{0}^{\infty} t^{q} exp \{- \frac{t^{2}}{2} {(\frac{y - μ}{σ})}^{2} - 2 q t\} d t, \end{matrix}$

where $y \in R$ , $μ \in R$ is a location parameter, $σ > 0$ is the scale parameter, and $q > 0$ is a kurtosis parameter.

In addition, we include the normal (N) distribution in the analysis because it is a limiting case of most of the aforementioned distributions.
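The competing densities listed above are all one-dimensional integrals, so they can be evaluated in the same way as the T2MS pdf. As an example, the R sketch below evaluates the S pdf with `stats::integrate()`; the function name `dS` is hypothetical. For q = 1 the integral has the closed form (φ(0) − φ(z))/z² for z ≠ 0 and φ(0)/2 at z = 0, which we use only as a check on the numerical result.

```r
# Sketch: the S pdf evaluated by numerical integration; dS is a hypothetical name.
dS <- function(y, mu, sigma, q) {
  z <- (y - mu) / sigma
  vapply(z, function(zi) {
    integrate(function(t) t^q * dnorm(zi * t), lower = 0, upper = 1)$value
  }, numeric(1)) * q / sigma
}

z <- c(-2, 2)
cbind(numerical   = dS(z, mu = 0, sigma = 1, q = 1),
      closed_form = (dnorm(0) - dnorm(z)) / z^2)   # closed form valid for q = 1, z != 0
```

The ES, MS, and GMS densities could be coded in the same pattern by changing the integrand and the integration limits.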

In each application, we used the Anderson–Darling (AD) test to assess the quality of fit of the T2MS distribution. This test is computed using the goftest::ad.test( ) function [23] of the R programming language. The comparative performance of the fitted distributions is evaluated using the Akaike Information Criterion (AIC) [24] and the Bayesian Information Criterion (BIC) [25].
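Both criteria are simple functions of the maximized log-likelihood: AIC = 2k − 2ℓ and BIC = k log(n) − 2ℓ, where k is the number of estimated parameters and n is the sample size. The R sketch below computes them from a minimized negative log-likelihood, such as the `value` returned by `stats::optim()`. The helper name `info_criteria` and the input numbers are illustrative assumptions, not values from the article.

```r
# Sketch: information criteria from a minimised negative log-likelihood.
# info_criteria is a hypothetical helper; nll is the minimised value of -log L,
# k the number of estimated parameters, and n the sample size.
info_criteria <- function(nll, k, n) {
  c(AIC = 2 * nll + 2 * k,
    BIC = 2 * nll + k * log(n))
}

# Illustrative numbers only (not results from the article):
info_criteria(nll = 1000, k = 3, n = 730)
```

Smaller values indicate a better trade-off between fit and complexity; BIC penalizes each extra parameter more heavily than AIC whenever log(n) > 2.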

4.1. Ant Movement Direction Data

In this section, we consider a set of observations on the initial direction of movement of 730 ants subjected to a visual stimulus. These data were originally presented in Jander [26] and subsequently analyzed in Batschelet [27], Sengupta and Pal [28], and Jones and Pewsey [29]. Figure 3 shows the boxplot of these data and Table 2 shows the statistical summary. The data show a very slight negative skewness and a high level of kurtosis, explained by the presence of atypical observations. Given these features, we expect the T2MS distribution to fit this dataset appropriately.

Table 3 reports the maximum likelihood estimates and the AIC and BIC values for the distributions fitted to the ants data. In this table, it is observed that the T2MS distribution has the lowest AIC and BIC values, suggesting that it should be selected for fitting the ant data. For the T2MS distribution, we obtain an observed statistic equal to 2.328 and a p-value equal to 0.798 in the AD test, suggesting that this distribution performs well in fitting the data. Figure 4 presents the histogram for the ants data and the pdfs fitted via the maximum likelihood method. In the figure, it can be seen that the pdf of the T2MS distribution is closest to the empirical frequencies both in the center and in the extremes of the histogram.

4.2. DEM/GBP Exchange Rate Returns Data

In this section, we consider a set of 1974 observations on the percentage returns of Deutsche mark/British pound (DEM/GBP) exchange rates from 1984 through 1991. This dataset can be found under the name MarkPound in the AER statistical package [30] of the R programming language. Table 4 shows some descriptive statistics for these data and Figure 5 presents the boxplot. From these, it can be seen that the data show slight skewness and substantial kurtosis, explained by the presence of several atypical observations.

Table 5 reports the maximum likelihood estimates and the AIC and BIC values for the distributions fitted to the returns data. In this table, it is observed that the T2MS distribution has the lowest AIC and BIC values, suggesting that it should be selected for fitting the returns data. For the T2MS distribution, we obtain an observed statistic equal to 3.169 and a p-value equal to 0.635 in the AD test, suggesting that this distribution performs well in fitting the data. Figure 6 presents the histogram for the returns data and the pdfs fitted via the ML method. In the figure, it can be seen that the pdf of the T2MS distribution is closest to the empirical frequencies both in the center and in the extremes of the histogram.

5. Final Comments

In this article, we propose an alternative distribution for modeling high kurtosis data. The new distribution can be understood as a modified version of the slash distribution that, like other slash distributions in the literature, arises as a quotient of independent random variables. The novelty here is to consider a random variable with a Birnbaum–Saunders distribution in the denominator, something that we believe has not been previously explored. We observe that this modification of the representation of a slash random variable leads to a new distribution with extremely heavy tails, which can outperform other distributions in the analysis of high kurtosis data.

The fundamental properties of the new distribution are derived, among them the stochastic representation, the density function, and the raw moments with associated measures. Parameters in the proposed distribution are estimated using the moment and maximum likelihood methods. Through Monte Carlo simulation experiments, it is observed that both estimation methods provide consistent estimators. However, it could be observed that the maximum likelihood estimators are more efficient. Two applications to real data are considered. In each application, it is illustrated that the proposed distribution performs well in modeling high kurtosis data, even better than other heavy-tailed distributions in the literature.

Author Contributions

Conceptualization, J.R.; methodology, J.R. and Y.A.I.; software, J.R. and Y.A.I.; validation, J.R. and Y.A.I.; formal analysis, J.R. and Y.A.I.; investigation, J.R. and Y.A.I.; supervision, J.R. and Y.A.I. All authors contributed significantly to this research article. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by SEMILLERO UA-2023.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. R Codes

In this section, we present R codes for pdf computation, pseudorandom number generation, and the parameter ML estimation of the T2MS distribution.

Appendix A.1. Code for the Computation of the T2MS Pdf

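dT2MS <- function(y, m, s, a) {
  # y: evaluation points; m: location; s: scale (> 0); a: kurtosis parameter (> 0)
  n <- length(y)
  f <- rep(0, n)
  for (i in seq_len(n)) {
    # Integral in Equation (5) for the i-th evaluation point
    f[i] <- stats::integrate(function(t, y, m, s, a) {
      (t + 1) / sqrt(t) * dnorm((sqrt(t) - sqrt(1 / t)) / (2 * a)) * dnorm(((y - m) / s) * t)
    }, lower = 0, upper = Inf, y = y[i], m = m, s = s, a = a)$value
  }
  return(f / (4 * s * a))
}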

Appendix A.2. Code to Obtain the Moment and Maximum Likelihood Estimates

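# Theoretical Fisher kurtosis of the T2MS distribution (Corollary 4)
kurtT2MS <- function(u) {
  (40320 * u^8 + 11520 * u^6 + 1440 * u^4 + 96 * u^2 + 3) / (24 * u^4 + 8 * u^2 + 1)^2
}

# Negative log-likelihood; p = c(mu, sigma, alpha); requires dT2MS() from Appendix A.1
loglik <- function(p, y) {
  -sum(log(dT2MS(y, p[1], p[2], p[3])))
}

# Simulate a sample of size n from T2MS(m, s, a) via the stochastic representation
n <- 500
m <- 5
s <- 1
a <- 0.5
w <- rnorm(n)
v <- (a * w + sqrt((a * w)^2 + 1))^2
z <- rnorm(n)
y <- m + s * z / v

# Moment estimates, Equations (8)-(10); requires the rootSolve and moments packages
estM_alpha <- rootSolve::uniroot.all(function(u, w) kurtT2MS(u) - moments::kurtosis(w),
                                     lower = 1e-12, upper = 50, tol = 1e-12, w = y)
estM_sigma <- sqrt((mean(y^2) - mean(y)^2) / (24 * estM_alpha^4 + 8 * estM_alpha^2 + 1))
estM_mu <- mean(y)

# Maximum likelihood estimates, started at the moment estimates
estML <- stats::optim(par = c(estM_mu, estM_sigma, estM_alpha), fn = loglik,
                      method = "L-BFGS-B", lower = c(-Inf, 1e-15, 1e-15),
                      upper = c(Inf, Inf, Inf), y = y)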

Appendix B. Elements of the Observed Information Matrix

If

y_{1}, \dots, y_{n}

is a random sample from a T2MS

(μ, σ, α)

population and

z_{i} = \frac{y_{i} - μ}{σ}

, the second partial derivatives of the log-likelihood function with respect to all the parameters are given by

\begin{matrix} \frac{\partial^{2} ℓ (θ)}{\partial μ^{2}} & = & \frac{1}{σ^{2}} \sum_{i = 1}^{n} \frac{z_{i}^{2} r_{α}^{2} (z_{i})}{h_{α}^{2} (z_{i})} - \frac{1}{σ^{2}} \sum_{i = 1}^{n} \frac{r_{α} (z_{i})}{h_{α} (z_{i})} + \frac{1}{σ^{2}} \sum_{i = 1}^{n} \frac{z_{i}^{2} u_{α} (z_{i})}{h_{α} (z_{i})}, \\ \frac{\partial^{2} ℓ (θ)}{\partial μ \partial σ} & = & \frac{1}{σ^{2}} \sum_{i = 1}^{n} \frac{z_{i}^{3} r_{α}^{2} (z_{i})}{h_{α}^{2} (z_{i})} - \frac{2}{σ^{2}} \sum_{i = 1}^{n} \frac{z_{i} r_{α} (z_{i})}{h_{α} (z_{i})} + \frac{1}{σ^{2}} \sum_{i = 1}^{n} \frac{z_{i} u_{α} (z_{i})}{h_{α} (z_{i})}, \\ \frac{\partial^{2} ℓ (θ)}{\partial μ \partial α} & = & \frac{1}{4 σ α^{3}} \sum_{i = 1}^{n} \frac{z_{i} s_{α} (z_{i}) r_{α} (z_{i})}{h_{α}^{2} (z_{i})} + \frac{1}{4 σ α^{3}} \sum_{i = 1}^{n} \frac{z_{i} v_{α} (z_{i})}{h_{α} (z_{i})}, \\ \frac{\partial^{2} ℓ (θ)}{\partial σ^{2}} & = & \frac{n}{σ^{2}} - \frac{1}{σ^{2}} \sum_{i = 1}^{n} \frac{z_{i}^{4} r_{α}^{2} (z_{i})}{h_{α}^{2} (z_{i})} - \frac{3}{σ^{2}} \sum_{i = 1}^{n} \frac{z_{i}^{2} r_{α} (z_{i})}{h_{α} (z_{i})} + \frac{1}{σ^{2}} \sum_{i = 1}^{n} \frac{z_{i}^{4} u_{α} (z_{i})}{h_{α} (z_{i})}, \\ \frac{\partial^{2} ℓ (θ)}{\partial σ α} & = & - \frac{1}{4 σ α^{3}} \frac{z_{i}^{2} s_{α} (z_{i}) r_{α} (z_{i})}{h_{α}^{2} (z_{i})} + \frac{1}{4 σ^{2} α^{3}} \sum_{i = 1}^{n} \frac{z_{i}^{2} u_{α} (z_{i})}{h_{α} (z_{i})}, \\ \frac{\partial^{2} ℓ (θ)}{\partial α^{2}} & = & \frac{n}{α^{2}} - \frac{1}{16 α^{6}} \sum_{i = 1}^{n} \frac{s_{α} (z_{i})}{h_{α}^{2} (z_{i})} - \frac{3}{4 α^{4}} \sum_{i = 1}^{n} \frac{s_{α} (z_{i})}{h_{α} (z_{i})} + \frac{1}{16 α^{6}} \sum_{i = 1}^{n} \frac{w_{α} (z_{i})}{h_{α} (z_{i})}, \end{matrix}

where

θ = {(μ, σ, α)}^{'}

,

h_{α} (z_{i})

,

r_{α} (z_{i})

and

s_{α} (z_{i})

are as in Equations (12)–(14), and

u_{α} (z_{i}) = \int_{0}^{\infty} t^{4} g_{α} (t, z_{i}) d t, v_{α} (z_{i}) = \int_{0}^{\infty} {(\sqrt{t} - \sqrt{t^{- 1}})}^{2} t^{2} g_{α} (t, z_{i}) d t

and w_{α} (z_{i}) = \int_{0}^{\infty} {(\sqrt{t} - \sqrt{t^{- 1}})}^{4} g_{α} (t, z_{i}) d t .

Abbreviations

The following abbreviations are used in this manuscript:

S	Slash
MS	Modified-slash
GMS	Generalized modified-slash
ES	Extended-slash
T2MS	Type II modified-slash
AE	Average estimate
SD	Standard deviation
AIC	Akaike information criterion
BIC	Bayesian information criterion
AD	Anderson–Darling
pdf	Probability density function
cdf	Cumulative distribution function

Figure 1. Plot of the pdf of the T2MS distribution with $μ = 0$ and $σ = 1$.

Figure 2. Plot of Fisher's kurtosis coefficient for the T2MS$(μ, σ, α)$ distribution (red line) and the normal distribution (black circle).

Figure 3. Boxplot of the ants data.

Figure 4. Left: Histogram of the ants data with the densities fitted via the maximum likelihood method. Right: Zoom on the tails of the histogram.

Figure 5. Boxplot of the returns data.

Figure 6. Left: Histogram of the returns data with the densities fitted via the maximum likelihood method. Right: Zoom on the tails of the histogram.

Table 1. Average estimates (AE) and standard deviations (SD) of the 1000 moment (M) and maximum likelihood (ML) estimates obtained in scenarios A ($μ = −5$, $σ = 1$, $α = 0.5$) and B ($μ = 5$, $σ = 1$, $α = 0.2$) for each sample size considered in the study.

Scenario	n	AE($\hat{μ}_{M}$)	AE($\hat{μ}_{ML}$)	AE($\hat{σ}_{M}$)	AE($\hat{σ}_{ML}$)	AE($\hat{α}_{M}$)	AE($\hat{α}_{ML}$)
A	25	−5.023	−5.002	1.581	0.966	0.227	0.528
	50	−5.010	−5.001	1.463	0.984	0.289	0.507
	100	−5.006	−5.001	1.371	0.998	0.343	0.502
	200	−5.004	−5.000	1.280	1.000	0.386	0.501
	400	−5.000	−5.000	1.200	1.000	0.431	0.500
B	25	5.007	5.004	1.063	0.922	0.146	0.268
	50	5.005	5.004	1.055	0.976	0.156	0.221
	100	5.004	5.003	1.041	0.997	0.165	0.202
	200	5.001	5.001	1.020	0.999	0.178	0.201
	400	5.000	5.000	1.016	1.000	0.183	0.200
Scenario	n	SD($\hat{μ}_{M}$)	SD($\hat{μ}_{ML}$)	SD($\hat{σ}_{M}$)	SD($\hat{σ}_{ML}$)	SD($\hat{α}_{M}$)	SD($\hat{α}_{ML}$)
A	25	0.421	0.181	0.499	0.287	0.187	0.183
	50	0.304	0.120	0.324	0.197	0.151	0.122
	100	0.217	0.083	0.280	0.142	0.141	0.080
	200	0.146	0.054	0.262	0.100	0.133	0.055
	400	0.105	0.038	0.255	0.069	0.124	0.040
B	25	0.240	0.214	0.207	0.206	0.087	0.086
	50	0.165	0.153	0.143	0.141	0.075	0.071
	100	0.118	0.105	0.107	0.103	0.064	0.062
	200	0.079	0.070	0.087	0.070	0.058	0.044
	400	0.059	0.052	0.068	0.050	0.047	0.032
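
As a reading aid for Table 1, the following R sketch derives the empirical bias and an approximate root mean squared error (RMSE) of the two estimators of $α$ in scenario A. Neither quantity is reported in the table; both are computed here from the tabulated AE and SD values, assuming RMSE ≈ sqrt(bias² + SD²). The variable names are illustrative and do not come from the article's code.

```r
# Values transcribed from Table 1, scenario A (true alpha = 0.5).
n      <- c(25, 50, 100, 200, 400)
alpha0 <- 0.5

ae_mom <- c(0.227, 0.289, 0.343, 0.386, 0.431)  # AE of the moment estimates
ae_ml  <- c(0.528, 0.507, 0.502, 0.501, 0.500)  # AE of the ML estimates
sd_mom <- c(0.187, 0.151, 0.141, 0.133, 0.124)  # SD of the moment estimates
sd_ml  <- c(0.183, 0.122, 0.080, 0.055, 0.040)  # SD of the ML estimates

# Empirical bias: average estimate minus the true parameter value.
bias_mom <- ae_mom - alpha0
bias_ml  <- ae_ml - alpha0

# Approximate RMSE (assumption: MSE is taken as bias^2 + SD^2).
rmse_mom <- sqrt(bias_mom^2 + sd_mom^2)
rmse_ml  <- sqrt(bias_ml^2 + sd_ml^2)

print(round(data.frame(n, bias_mom, rmse_mom, bias_ml, rmse_ml), 3))
```

On these values, the ML estimator of $α$ shows a smaller absolute bias and a smaller approximate RMSE than the moment estimator at every sample size.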

Table 2. Descriptive statistics for the ants data.

Size	Average	Standard Deviation	Skewness	Kurtosis
730	176.438	62.643	−0.205	4.587

Table 3. Maximum likelihood estimates (standard errors in parentheses) and values of the information criteria (AIC and BIC) for the distributions fitted to the ants data. For the T2MS distribution, the row labeled $q$ reports the estimate of $α$.

Parameter	N	S	ES	MS	GMS	T2MS
$μ$	176.487	181.526	181.739	181.738	181.724	181.791
	(2.317)	(1.270)	(1.231)	(1.224)	(1.229)	(0.050)
$σ$	62.606	16.819	1.275	16.762	14.722	36.980
	(1.638)	(1.246)	(1.066)	(1.245)	(0.884)	(1.832)
$q$	-	1.172	1.920	1.505	1.978	0.464
		(0.085)	(0.195)	(0.095)	(0.201)	(0.024)
$q_{2}$	-	-	42.544	-	-	-
			(37.623)
AIC	8115.474	7950.532	7914.972	7921.978	7911.628	7882.058
BIC	8124.660	7964.311	7933.344	7935.757	7925.407	7895.837
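
The information criteria in Table 3 can be checked against each other. The R sketch below assumes the standard definitions AIC = 2k − 2ℓ and BIC = k log(n) − 2ℓ, where ℓ is the maximized log-likelihood, takes n = 730 from Table 2, and reads the number of parameters k of each model off the rows of Table 3 (an inference from the table, not a statement in the text). The variable names are illustrative.

```r
# Sample size of the ants data (Table 2).
n <- 730

# Number of estimated parameters per model, read off the rows of Table 3.
k <- c(N = 2, S = 3, ES = 4, MS = 3, GMS = 3, T2MS = 3)

# AIC and BIC values transcribed from Table 3.
aic <- c(N = 8115.474, S = 7950.532, ES = 7914.972,
         MS = 7921.978, GMS = 7911.628, T2MS = 7882.058)
bic <- c(N = 8124.660, S = 7964.311, ES = 7933.344,
         MS = 7935.757, GMS = 7925.407, T2MS = 7895.837)

# Maximized log-likelihood implied by AIC = 2k - 2 * loglik.
loglik <- (2 * k - aic) / 2

# BIC recomputed from the implied log-likelihood.
bic_check <- k * log(n) - 2 * loglik

print(round(cbind(loglik, bic, bic_check), 3))
print(names(which.min(aic)))  # model with the smallest AIC
```

With these inputs, the recomputed BIC values agree with the tabulated ones to within rounding, and the T2MS fit has the smallest AIC.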

Table 4. Descriptive statistics for the returns data.

Size	Average	Standard Deviation	Skewness	Kurtosis
1974	−0.016	0.470	−0.248	6.621

Table 5. Maximum likelihood estimates (standard errors in parentheses) and values of the information criteria (AIC and BIC) for the distributions fitted to the returns data. For the T2MS distribution, the row labeled $q$ reports the estimate of $α$.

Parameter	N	S	ES	MS	GMS	T2MS
$μ$	−0.016	0.003	0.003	0.004	0.003	0.003
	(0.010)	(0.008)	(0.008)	(0.008)	(0.008)	(0.008)
$σ$	0.470	0.238	0.034	0.225	0.159	0.354
	(0.007)	(0.009)	(0.003)	(0.005)	(0.003)	(0.008)
$q$	-	2.223	4.063	2.615	4.321	0.286
		(0.146)	(0.336)	(0.049)	(0.333)	(0.015)
$q_{2}$	-	-	33.750	-	-	-
			(2.521)
AIC	2626.192	2333.100	2300.656	2311.604	2296.674	2286.606
BIC	2637.368	2349.863	2323.007	2328.367	2313.437	2303.369

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
